Biodiversity underpins ecosystem functioning and the provision of ecosystem services essential for human well-being. It contributes to local livelihoods and economic development, and is essential for achieving the Millennium Development Goals (2015), including poverty reduction. Achieving the 2011-2020 Strategic Plan and its Aichi Biodiversity Targets is also fundamental to addressing the grand global biodiversity challenges.
With its Strategic Advisory Board and synergies, EUBrazilOpenBio has defined future cooperation priorities to address biodiversity challenges that are common to Brazil and Europe. The EUBrazilOpenBio Joint Action Plan draws on policy strategies, analyses current progress in contributing to international targets and defines actions for future collaborative research. The Joint Action Plan defines common actions for Europe and Brazil with the aim of contributing to relevant Aichi Targets in the years ahead, building on the current collaborative success story.
EUBrazilOpenBio and its Strategic Advisory Board propose potential future scenarios to address the global challenges through joint collaboration. These action-lines highlight co-operation opportunities not only for multi-disciplinary research but also for building stronger ties between the business communities, and supporting sustainable development through the green economy. Most of the proposed scenarios require relatively small-scale investments.
EUBrazilOpenBio action lines for future cooperation:
§ User-friendly approach is essential for the successful uptake by the user community
§ New knowledge creation together with the integration of foundational technologies
§ Creation of an EU-Brazil Joint Biodiversity Platform to promote the development of publicly and commercially exploitable assets
§ Support e-skills development for large-scale and long-term information processes
§ Create conditions favourable for the generation of a market of open science and open data services
This report was publicly presented at BIH2013 - Biodiversity Informatics Conference in Rome on 3 September 2013. The final release was delivered in November 2013 and is now available for download.
The Global Biodiversity Informatics Outlook (GBIO) offers a framework for reaching a much deeper understanding of the world’s biodiversity, and through that understanding the means to conserve it better and to use it more sustainably.
The GBIO identifies four major focal areas, each with a number of core components, to help coordinate efforts and funding:
According to GBIO, EUBrazilOpenBio is one of the projects that will contribute to the "Comprehensive knowledge access" component within the "Evidence" area, which focuses on making all published biodiversity knowledge linked and accessible through the rich indexing of biodiversity literature, data, multimedia and other resources, including presentation of the information as species pages and via web services.
From the report:
"Many national and thematic activities –[...] organize and deliver web content for particular species, either as online databases or as species pages. The Encyclopedia of Life (EOL), now represents an international partnership of institutions and agencies committed to providing open access to authoritative species information, including multimedia, while EUBrazilOpenBio aims to combine open access resources including data, tools and services in a single ‘e-infrastructure’."
The Report is available at http://www.biodiversityinformatics.org/
The final project newsletter presents the final press release showcasing the main results of two years of collaborative work, namely: the innovative, web-based working environment designed to serve biodiversity scenarios; the new version of the Catalogue of Life cross-mapping tool developed in the i4Life project; the provision of the Ecological Niche Modelling tool as a service through the openModeller extended web service, and its application in collaboration with BioVeL; and the EUBrazilOpenBio Joint Action Plan.
This feature also highlights:
§ EUBrazilOpenBio Joint Action Plan
§ EGI federated use case on ecology
§ EUBrazilOpenBio results
§ A vision from the Experts
"The Crossmapper itself is a great tool, and an ideal way to identify errors (whether nomenclatural, as in incorrect authors, taxonomic or database artifacts) and updates (new names, changes in taxonomy), differences in taxonomic opinion between datasets (not always resolvable, but useful to know). The results reflect a significant taxonomic intelligence and impressive pragmatism in the tool creators. It is a really impressive, and well-thought-out approach. Also of note is the incredible speed with which the crossmaps are created. Given how complicated the background computing must be, this is a great achievement. This is a fantastically useful and clever tool. At the same time it is intricate, and because the results are so interlinked, complicated to get sensible outputs for updating the compared databases. It is worth investing more time and money in making this truly useable".
Dr Christina Flann, Species 2000 and Global Compositae Checklist Editor
Assessment of usability for regional GSD comparisons. Conclusions, p. 14.
L. Candela, D. Castelli, G. Coro, P. Pagano, F. Sinibaldi
Concurrency and Computation: Practice and Experience. doi: 10.1002/cpe.3030. July 2013, John Wiley & Sons, Ltd. http://onlinelibrary.wiley.com/doi/10.1002/cpe.3030/abstract
Species distribution modeling is a process aiming at computationally predicting the distribution of species in geographic areas on the basis of environmental parameters including climate data. Such a quantitative approach has a lot of potentialities in many areas that include setting up conservation priorities, testing biogeographic hypotheses, and assessing the impact of accelerated land use. To further promote the diffusion of such an approach, it is fundamental to develop a flexible, comprehensive, and robust environment capable of enabling practitioners and communities of practice to produce species distribution models more efficiently. A promising way to build such an environment is offered by modern infrastructures promoting the sharing of resources, including hardware, software, data, and services. This paper describes an approach to species distribution modeling based on a Hybrid Data Infrastructure that can offer a rich array of data and data management services by leveraging other infrastructures (including Cloud). It discusses the whole set of services needed to support the phases of such a complex process including access to occurrence records and environmental parameters and the processing of such information to predict the probability of a species’ occurrence in given areas. Copyright © 2013 John Wiley & Sons, Ltd.
species distribution modeling;
Hybrid Data Infrastructure;
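The processing step described in the abstract above, combining occurrence records with environmental parameters to score candidate areas, can be illustrated with a minimal sketch. It uses a simple rectangular environmental envelope as a stand-in technique; it is not the set of algorithms or services described in the paper, the score is a crude suitability value rather than a calibrated probability, and the variable names (`temp`, `rain`) and all values are hypothetical.

```python
from typing import Dict, List, Tuple

# Environmental values at one location, e.g. {"temp": 24.0, "rain": 1800.0}.
# The variable names are hypothetical placeholders.
Env = Dict[str, float]


def fit_envelope(occurrences: List[Env]) -> Dict[str, Tuple[float, float]]:
    """Record the min and max of each variable over sites where the species was observed."""
    if not occurrences:
        raise ValueError("at least one occurrence record is required")
    variables = occurrences[0].keys()
    return {
        v: (min(o[v] for o in occurrences), max(o[v] for o in occurrences))
        for v in variables
    }


def suitability(envelope: Dict[str, Tuple[float, float]], site: Env) -> float:
    """Fraction of variables whose value at `site` lies inside the envelope (0.0 to 1.0)."""
    inside = sum(1 for v, (low, high) in envelope.items() if low <= site[v] <= high)
    return inside / len(envelope)


# Hypothetical occurrence records and candidate sites.
records = [{"temp": 22.0, "rain": 1500.0}, {"temp": 26.0, "rain": 2100.0}]
envelope = fit_envelope(records)
score_inside = suitability(envelope, {"temp": 24.0, "rain": 1800.0})
score_partial = suitability(envelope, {"temp": 30.0, "rain": 1800.0})
```

A real workflow would replace the envelope with one of the modelling algorithms available in the infrastructure and would read occurrence and environmental layers from its data services.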
Cloud computing is a computing model where hardware, platforms and software are seen as services; viz. Infrastructure as a Service, Platform as a Service, and Software as a Service, respectively. Data as a Service (DaaS) is based on the concept that the product, data in this case, can be provided on demand to the user, regardless of geographic or organizational separation between provider and consumer. DaaS applications are for the most part based on excessive data replication in order to guarantee data availability, which means excessive costs in hardware investments. This white paper presents the specification, implementation and evaluation of a system called USTO.RE which aims to be an effective and low-cost alternative for storing data, thereby mitigating the problem of excessive data replication and thus allows itself to be considered a reliable platform from the perspective of data availability. Evaluation scenarios and the results achieved in our experiments to evaluate the system as well as possible lines for future development will be presented.
F. Durao, R. Assad, A. Fonseca, J. Fernando, V. Garcia and F. Trinta
Web Engineering Lecture Notes in Computer Science
Volume 7977, 2013 Springer Berlin Heidelberg Germany 2013, pp 452-466
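The trade-off between replication and availability mentioned in the abstract above can be made concrete with a small sketch. It assumes that replicas fail independently and that each replica is reachable with the same probability `p`; this model and the numbers used are illustrative assumptions, not the design or evaluation method of USTO.RE.

```python
def availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies is reachable."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - p) ** replicas


def replicas_needed(p: float, target: float) -> int:
    """Smallest number of replicas whose combined availability reaches `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while availability(p, n) < target:
        n += 1
    return n


# Hypothetical example: peers that are online 80% of the time, 99% target.
needed = replicas_needed(0.8, 0.99)
```

Under these assumptions, each extra replica adds storage cost while yielding a smaller availability gain than the previous one, which is the cost pressure the abstract refers to.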
Nuno Ferreira, EGI User Community Support, writes about this pioneering use case of the EGI Federated Cloud and its scientific applications.
The use case is presented as the results of a "trinity": BioVeL, providing the use case, EUBrazilOpenBio, providing the ENM as a service through the openModeller extended web service, and EGI Federated Cloud, providing the computing resources.
The article was published in the October issue of the Inspired Magazine.
The paper by Hardisty and Roberts (BMC Ecology, 2013) summarizes the opinion of a large part of the biodiversity informatics community with respect to their views on future developments in terms of systems, information needs and global sustainability. They conclude that biodiversity informatics plays a central enabling role in deploying the research community's know-how to help address scientific conservation and sustainability issues. Enormous progress was made establishing a framework for sharing collection and observation data (i.e. GBIF) or on species level (EoL) or with respect to genetic information (CBOL). An integrated systems approach, which moves significantly beyond taxonomy and species observations, is needed to truly understand biodiversity.
A global effort instrumental in connecting the thousands of resources and making them interoperable in a meaningful way is the Catalogue of Life (CoL: see www.catalogueoflife.org). This authoritative synonymic index is a true core product of the taxonomic community, unifying expert views and opinions in a single database resource. The CoL offers key information on species concepts based on over 140 peer-reviewed taxonomic databases representing the opinions of thousands of specialists, and lists valid names, synonyms and common names. A new ICT infrastructure for producing the CoL in a partly automated fashion was put into place under the EC-funded 4D4Life project, resulting in monthly updates of the CoL being released on the web. The EC-funded i4Life project delivered robust CoL web services, cross-mapper tools for automated comparison of the CoL with other systems holding taxonomic names (i.e. GBIF, EoL, IUCN, CBOL, LifeWatch), and piping tools to funnel back unlisted names for processing by experts at the providing databases responsible for specific sectors in the CoL system. This mechanism facilitates a virtual taxonomic community and underpins an iterative cycle of matching the CoL against other sources with taxonomic components, which will result in an enriched CoL (more names) plus a set of ‘unplaceable names’ that at some point need to be removed from circulation. The EUBrazilOpenBio project provided a more versatile cross-mapper web service working in combination with a partly automated piping tool funnelling data back to the GSD sources. Species 2000 will include the cross-mapping tools and piping mechanism in the regular CoL services.
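The comparison cycle described above, matching names from another source against a reference checklist and funnelling back the names that are not listed, can be sketched as follows. This is a minimal illustration based on exact matching of normalised name strings; it is not the actual cross-mapper or piping tool, and the function names, data layout and taxon names are hypothetical.

```python
from typing import Dict, List, Set


def normalise(name: str) -> str:
    """Lower-case a name and collapse repeated whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def cross_map(reference: Dict[str, List[str]], other: Set[str]) -> Dict[str, list]:
    """Classify each name in `other` against a reference checklist.

    `reference` maps each accepted name to its list of synonyms.
    Returns names matched as accepted, matched as synonyms, and unlisted names.
    """
    lookup = {}
    for accepted, synonyms in reference.items():
        lookup[normalise(accepted)] = (accepted, "accepted")
    for accepted, synonyms in reference.items():
        for synonym in synonyms:
            # An accepted name takes precedence over a synonym with the same spelling.
            lookup.setdefault(normalise(synonym), (accepted, "synonym"))

    result = {"accepted": [], "synonym": [], "unlisted": []}
    for name in sorted(other):
        hit = lookup.get(normalise(name))
        if hit is None:
            # Candidates to be funnelled back to the experts for review.
            result["unlisted"].append(name)
        else:
            accepted, kind = hit
            result[kind].append((name, accepted))
    return result


# Hypothetical checklists with placeholder taxon names.
reference = {"Genus alpha": ["Genus alfa"], "Genus beta": []}
other = {"genus  alpha", "Genus alfa", "Genus gamma"}
report = cross_map(reference, other)
```

A production tool would also need to compare authorship, rank and higher classification, and to handle homonyms and fuzzy spelling variants, which exact string matching does not cover.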
So what are the main challenges for the future with respect to the Catalogue of Life and further developments in biodiversity informatics?
The following list may be used for discussions.
1. Over the years there has been a proliferation of digital sources for biodiversity data on the web, many holding a taxonomic component. A growing number of (thematic) aggregator sites harvest from these, resulting in a complex environment in which users struggle to locate properly validated, up-to-date taxonomic information. The community may consider a more collaborative approach, integrating services and reducing the number of portals and sites.
2. The number of unvalidated web sources containing old names, misspellings, misapplied names etc. is increasing; this hampers interoperability and the reliable recombination of information sources and further confuses the taxonomic realm. Collaborative action in the biodiversity community is essential to address a necessary ‘cleaning up process’ i.e. as facilitated by the recent CoL initiatives.
3. The funding basis in the taxonomic community for the oncoming job of cleaning up name data and providing quality stamps on biodiversity information services is very vulnerable. Many taxonomic sectors are covered by single specialists who run databases and information processing on a shoestring budget. An integrated approach to a shared ICT infrastructure to store and treat the data, and a more robust funding model for data processing is needed.
4. Further automation of data processing in the taxonomic realm with respect to providing reliable, up-to-date biodiversity information services is needed. The CoL is the most complete taxonomic index currently available (covering about 75% of the known species) but it still has gaps. Including fossil taxa in the CoL system is highly recommended; a process that has just started this year. Similarly, stable identifiers are urgently needed. These issues need to be addressed with priority to further enhance the CoL’s usefulness in the biodiversity community.
5. IPR issues are still a potential stumbling block to opening up the CoL services. Moving towards Creative Commons licenses underpinning data provision and use is recommended. The community as a whole should contemplate a fully open data model, as this will enhance use and stimulate feedback.
6. Sustainability of biodiversity information services is an issue of great concern. Many sites and data services are based upon ‘soft money’ generated in the context of subsidized projects and grants, not core funding or part of the academic mission and output of the bigger players: taxonomic institutions and international initiatives. The larger taxonomic facilities may contemplate adopting these services and integrating them in their core business so that continuity is ensured. It is important to offer such services as part of the GBIF mission on sharing data.
7. Further integration of basic taxonomic services into (automated) workflows and research environments of other disciplines may be rewarding from the financial (cost efficient use of resources, reduction of duplication of efforts) and academic point of view.
This mini-course was given at the Simposio Brasileiro de Sistemas de Informação (SBSI 13). The paper presents the results of the effort spent on USTO.RE development to provide a reliable p2p storage platform in the context of the EUBrazilOpenBio project.
The attached documents were presented at SBSI '13.
SILVA, A. F., MACHADO, M. A. S., SOARES, P. F. A., GARCIA, V. C., ASSAD, R. E., SILVA, T. J. E., VIEIRA, T. B. P., DURAO, F. A.
Date: April 2013
The European Commission and its Brazilian partners will organise a high-level workshop in Brasília, on 11 November, to showcase the strategic importance of ICT research cooperation, highlighting its huge potential for stimulating new ICT developments and galvanising research communities around various initiatives and topics. With two coordinated calls already completed, resulting in a number of projects, preparations are now underway for a third call.
The workshop will provide a snapshot of the valuable networking and sharing opportunities generated thanks to these coordinated ICT research and funding activities under FP7. This will include results from on-going Call 1 projects, with a focus on their expected impact in the EU and Brazil. Success stories from this 2011 call include:
PodiTrodi – An autonomous microfluidic system is being developed with highly sensitive, integrated biosensors for detecting Chagas disease. Chagas is a major cause of death in tropical regions. The solution could be further adapted to test for leishmaniasis, dengue, malaria or HIV.
BEMO-COFRA – Mobile sensors and modern ICTs introduced into a Brazilian car production plant are set to improve monitoring and control, as well as overall productivity, reliability, safety and quality assurance.
EUBrazilOpenBio – An open-access platform integrating existing EU and Brazilian infrastructures and resources covering biodiversity species (mainly linking the Brazilian Flora catalogue with the EU-hosted ‘Catalogue of Life’).
Responses to Call 2 were successfully evaluated in Brasília in March 2013, the first time an assessment of this nature has been performed outside Europe. Four projects, one per topic, were selected for funding, out of 60 proposals. The new projects will be formally launched at the upcoming workshop, and a few will use the opportunity to organise parallel kick-off meetings in Brazil.
Mariane Silveira Sousa-Baena, Letícia Couto Garcia, Andrew Townsend Peterson
Article first published online: 24 OCT 2013
flora, inventory completeness, online biodiversity primary data, priorities for survey and conservation, speciesLink, taxonomic knowledge
information is the focus of major initiatives aimed at assembling large-scale primary-data documentation (‘digital accessible knowledge’) of the distribution of life on Earth. Recent efforts within Brazil have assembled a massive amount of such documentation for Brazilian plants, which we analyse in this study. Our aim is to identify areas representing gaps in current knowledge; these gaps can guide future botanical exploration and discovery in Brazil.
assessed angiosperm inventories across Brazil at diverse spatial scales using statistics that summarize inventory completeness. In particular, we assess the completeness of geographical knowledge of Brazilian floras as measured in terms of geographical distance and climatic difference from well-documented sites.
of Brazilian angiosperms is very unevenly distributed: well-known sites are concentrated in eastern and southern regions, whereas the remainder of the country remains poorly documented. Worse still, in many regions, areas lacking detailed botanical documentation coincide with areas of intense habitat destruction, such that many such sites will never be
illustrates how biodiversity survey and inventory efforts can be guided by existing knowledge. That is, to the extent that existing biodiversity knowledge is made digital and openly available, and to the extent that information is sufficiently comprehensive and informative, spatial summaries of completeness such as that presented here offer clear and strategic directions for maximizing the yield of new knowledge from any de novo field efforts.
Methods for evaluating risk of biodiversity loss are linked closely to decisions about species’ conservation status, which in turn depend on data documenting species’ distributions, population status, and natural history. In Brazil, the scientific community and government have differing points of view regarding which plant species have insufficient data to be accorded a formal threat category, with the official list of threatened flora published by the Brazilian Ministry of Environment listing many fewer species as Data Deficient than a broader list prepared by a large, knowledgeable group of taxonomists. This paper aims to evaluate, using diverse analyses, whether “Digital Accessible Knowledge” is genuinely lacking or insufficient for basic characterization of distributions for 934 angiosperm species classified as Data Deficient on Brazil’s official list. Analyses were based on large-scale databases of information associated with herbarium specimens, as part of the speciesLink network. Evaluating these species in terms of completeness of geographic range knowledge accumulated through time, our results show that at least 40.9% of species listed as Data Deficient do not appear genuinely to be particularly lacking in data, but rather may be knowledge-deficient: data exist that can provide rich information about the species, but such data remain unanalyzed and dormant for conservation decision-making. Such approaches may be useful in identifying cases in which data are genuinely lacking regarding conservation status of species, as well as in moving species out of Data Deficient categories and into appropriate threat status classifications.
Mariane S. Sousa-Baena, Letícia Couto Garcia, A. Townsend Peterson,
behind conservation status decisions: Data basis for “Data Deficient” Brazilian plant species
Biological Conservation, Available online 29 July 2013, ISSN 0006-3207,
Keywords: Data Deficient species; Conservation status; Red List; Angiosperms; Digital Accessible Knowledge; Primary biodiversity data
isgtw Magazine recently published a spotlight on Horizon2020 biodiversity targets, and in particular on the EU’s actions against the loss of biodiversity and ecosystem services.
EUBrazilOpenBio is mentioned here "to demonstrate that new and creative approaches to scientific discovery will be made possible by mastering the main technical and data-related challenges that lie ahead. Brazil and Europe have much to contribute to the creation of better services for researchers at all levels. By embracing the diversity of talent that exists in informatics, as well as all other fields, future international co-operation can help make collaborative research more efficient, more open, and multidisciplinary".
The strategic value of the synergy with EGI lies in its efforts to build on its current user base (21,000 users), mainly through federated cloud services targeting the ‘long tail of science’ and leveraging its extensive outreach activities. Specifically, the synergy with the EGI Federated Clouds Task Force is part of a drive to create an open platform designed to enable other resource centres, technology providers and user communities to join, build on the current portfolio of use cases and foster innovation.
Over the last two years, BioVeL and EUBrazilOpenBio have joined forces to make openModeller ready for cloud deployment. Work within the EGI Federated Cloud Task Force has led to considerable success in enabling the openModeller service on the EGI Federated Cloud, becoming one of the earliest use cases along with examples from biomedical research, linguistics, literature studies and physics. This synergy has led to improvements for optimising the openModeller software within the context of the EGI federated cloud use cases to ensure researchers conduct their work more effectively and maximise the usage of resources.
The outcomes of this work were presented at:
§ BIH2013 (3-6 September, Rome) Session 5: The Technical Environment, 5 September 2013. EUBrazilOpenBio efforts were presented by Daniele Lezzi, BSC, in his talk “Future computing platforms for biodiversity science”; and EUBrazilOpenBio Training session, 6 September 2013;
§ EGI Technical Forum (16-20 September), OGF39 (16-18 September) and IBERGRID Conference (19-20 September), Madrid: The session on "Cloud computing lightning talks: Technology Providers, Resource Providers and User Communities" at the EGI Technical Forum brought together current and potential stakeholders of the EGI cloud infrastructure platform. BSC presented its COMPSs framework, highlighting the role of COMPSs in the interoperable execution of the use cases, and showcased the implementation of ecological niche modelling leveraged by the EUBrazilOpenBio and BioVeL projects.
Abstract. Ecological niche models are essential instruments in the development of strategies and policies in various application domains. The development of ever better models requires that scientists and practitioners are provided with (i) both the data and the processing capacities they need on demand and (ii) innovative mechanisms supporting the effective sharing of such products at scale.
Authors: Ignacio Blanquer1, Leonardo Candela2, Pasquale Pagano2 and Erik Torres1
1 Institute of Instrumentation for Molecular Imaging (I3M), University of Valencia
2 Istituto di Scienza e Tecnologie dell'Informazione (ISTI), Consiglio Nazionale delle Ricerche CNR
The paper has been included in the Proceedings of the 7th Iberian Grid Infrastructure Conference (IBERGRID), 19-20 September 2013, Madrid, Spain, ISBN 978-84-9048-110-3, pages 175-188, published by the Editorial of the UPV.
The IBERGRID conference series has been organized since 2007 by ES-NGI, the Spanish National Grid Initiative, in the context of the bilateral agreements between Portugal and Spain in matters of grid computing, supercomputing and scientific data repositories.
This year's edition took place in Madrid, co-located with the EGI Technical Forum 2013. EUBrazilOpenBio took part in the e-Infrastructures & Biodiversity Workshop during the IBERGRID Parallel Session on Thursday 19 September.
The EUBrazilOpenBio consortium presented the paper "Leveraging Hybrid Data Infrastructure for Ecological Niche Modeling: The EUBrazilOpenBio Experience" (Ignacio Blanquer, Leonardo Candela, Pasquale Pagano and Erik Torres), which has been published in the conference proceedings.
Abstract. Recently cloud services have been evaluated by scientific communities as a viable solution to satisfy their computing needs, reducing the cost of ownership and operation to the minimum. The analysis on the adoption of the cloud computing model for eScience has identified several areas of improvement as federation management and interoperability between providers. Portability between cloud vendors is a widely recognized feature to avoid the risk of lock-in of users in proprietary systems, a stopper to the complete adoption of clouds.
In this paper we present a programming framework that allows the coordination of applications on federated clouds used to provide flexibility to traditional research infrastructures as clusters and grids. This approach allows researchers to program applications abstracting the underlying infrastructure and providing scaling and elasticity features through the dynamic provision of virtualized resources. The adoption of standard interfaces is a basic feature in the development of connectors for different middlewares ensuring the portability of the code between different providers.
Authors: Daniele Lezzi1, Francesc Lordan1, Roger Rafanell1, and Rosa M. Badia1,2
1 Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS)
2 Artificial Intelligence Research Institute (IIIA), Spanish Council for Scientific Research (CSIC)
Open science enables the sharing of methods, software and other types of virtual infrastructure. It is key to making research data more reproducible, increasing transparency and easing collaboration on existing research data. It also helps demonstrate return on investments for funding from the public purse.
EUBrazilOpenBio is demonstrating the benefits of Open Access in all its core aspects, by focusing on the integration of open data, software and services. It promotes the concept of openness for scientific research by leveraging the achievements, components and infrastructures developed in other projects, so that both Brazil and Europe can capitalise on earlier investments and bring to the table experiences on user-centric approaches. It has also demonstrated the benefits of small-scale funding to enable this open integration, overcoming different approaches by focusing attention on smarter approaches.
Rather than creating a new monolithic infrastructure, EUBrazilOpenBio has focused on demonstrating the benefits of an infrastructure model that is capable of harnessing existing disparate resources to provide a coherent research environment for scientists. The EUBrazilOpenBio infrastructure is the result of the collaborative aggregation of data, tools, services and computational resources into a coherent and integrated research environment for the benefit of the biodiversity community. EUBrazilOpenBio supports the open access movement, promoting the concept of openness for scientific research by leveraging the achievements, components and infrastructures developed in other projects, so that both regions can capitalise on earlier investments and bring to the table experiences on user-centric approaches.
Euro-Par is an annual series of international conferences dedicated to the promotion and advancement of all aspects of parallel and distributed computing. The conference is jointly organized by the German Research School for Simulation Sciences, Forschungszentrum Jülich, and RWTH Aachen University in the framework of the Jülich Aachen Research Alliance.
Euro-Par covers a wide spectrum of topics from algorithms and theory to software technology and hardware-related issues, with application areas ranging from scientific to mobile and cloud computing. The main audience of Euro-Par consists of researchers in academic institutions, government laboratories and industrial organisations.
As a wide-spectrum conference, Euro-Par fosters the synergy of different topics in parallel and distributed computing. Of special interest are applications which demonstrate the effectiveness of the main Euro-Par topics.
The EUBrazilOpenBio consortium successfully submitted the paper "Execution of scientific workflows on federated multi-cloud infrastructures", which was presented at the FedICI workshop.
CloudCaps is one of the EGI-funded mini-projects, whose aim is to help the community create scientific applications that are cloud-ready. Its first report highlighted that the BioVeL community is one of those communities that could use the EGI Federated Cloud to its fullest potential. Another one is the community of niche modelling scientists, as the EUBrazilOpenBio case demonstrated. For the last two years the Brazilian organisation CRIA (Centro de Referência em Informação Ambiental) has been working with the EUBrazilOpenBio project and BioVeL to get openModeller, the ecological niche modelling framework that helps biodiversity researchers to predict and understand the distribution of plant and animal species across the world under different environmental conditions, ready for cloud deployment.
“Ecological niche modelling is being increasingly used by the scientific community and we believe that this field can greatly benefit from cloud computing technologies, especially in complex experiments involving multiple species, algorithms and environmental scenarios,” explains Renato De Giovanni from CRIA. “I’m excited to be part of this effort and it’s good to see that it will also benefit the wider EGI community.”
Welcome to the EUBrazilOpenBio training area, where you can access the training materials made available by the project and designed for researchers in biodiversity, life science, climate change and related fields, as well as application developers, regulatory authorities and policy decision-makers.
The materials published in this area are organized in the following structure, which we recommend you follow.
1. Introducing EUBrazilOpenBio
2. Biodiversity research with EUBrazilOpenBio
- 2.1. Ecological Niche Modelling (ENM) with EUBrazilOpenBio
- 2.2. Cross Mapping taxonomies with EUBrazilOpenBio
3. Understanding the EUBrazilOpenBio Platform technologies
- 3.1. Insight of the Technological details of EUBrazilOpenBio
“Our analysis of cloud computing needs in Brazil and the present maturing market of cloud offering in Europe has highlighted the many different opportunities that exist for joint co-operation between Europe and Brazil. It is therefore important to build on the successful exercise of the EU-BrazilOpenBio project and ensure that the wider research community benefits from easy to access virtual computing infrastructures such as cloud computing to tackle the rising tide of large scientific data. These opportunities range from new multidisciplinary approaches to biodiversity, including internationally recognised post-graduate qualifications and joint facilities, to small and medium-sized businesses creating value-add services around open data. We expect that the many fruitful EU-Brazil discussions will continue in the near future to help realise the vision we are now shaping thanks to the pioneering approach of EUBrazilOpenBio.”
“I think that having been one of the original incubators of the project I understand the background and I have a vested interest in seeing this project develop and succeed.
I am also interested in fostering stronger links between European science and LATAM. In the past I incubated projects such as EELA and collaborated with most of the NRENs in that region. I still have good connections to CLARA and RNP senior technical management. I sit on several EU and other scientific organisation advisory boards, including on biodiversity. For all these reasons I believe that I could help this EU-Brazil cooperation to succeed.”
Dr. Fabrizio GAGLIARDI has been Director, EMEA, Microsoft Research Connections (External Research), Microsoft Research, since July 2008, based in Geneva with main office at the MSR research centre in Cambridge, UK. He is now an independent Consultant & Chair of ACM Europe.
He is responsible for engaging with the scientific community in EMEA and, as part of this job, also supports and contributes to the MSR Cloud computing strategy in Europe, including the incubation of a major EU project (www.venus-c.eu) and three direct engagements with major national funding agencies (EPSRC in the UK, INRIA in France and HLRS in Germany). He is responsible for the strategic planning of the Cloud Computing team at EMIC in Germany.
He joined Microsoft in November 2005, when, after the last EGEE conference in his home town of Pisa, he took responsibility for the company's Technical Computing Initiative in Europe, the Middle East, Africa and Latin America.
Before then, starting in the late 1990s, he was among the pioneers in developing and introducing Grid computing in Europe. This led to projects like EU-DataGrid and EGEE, of which he was Principal Investigator and Director from 2000 to 2005.
In 2004-2005 while still Director of EGEE (www.eu-egee.org), he contributed to the incubation and launch of more than 10 other Grid EU projects all inspired and supported by the EU EGEE flag-ship initiative.
This unit is intended for computer scientists who could become application developers or infrastructure providers for the EUBrazilOpenBio platform. It is not intended to be a “user guide” to the technologies, but an insight into the tools and services developed in EUBrazilOpenBio, including further references.
Application developers will understand the benefits and feasibility of adapting their own applications to the platform, and they will find references for delving deeper into the technological details. This training material concentrates on the new layers and components developed within the EUBrazilOpenBio project, which ease access to the general-purpose services.
The trainee should have previously read the modules describing the general concepts available in the training section of the EUBrazilOpenBio channel. It is also advisable to go through the demo videos and the use case descriptions to increase the understanding of the services.
This module will not provide all the skills needed to develop new applications, but it will provide the knowledge needed to judge the suitability of the tools and services for different problems. Further information can be accessed through the links.
"EU-BrazilOpenBio has successfully supported co-operation between Brazil and Europe on the key theme of biodiversity. Its focus has been on demonstrating the benefits of integrating and sharing data, tools and services in an open way to facilitate biodiversity research communities across borders rather than on technological innovation. Part of its lasting legacy will be the cultural and environmental bridges between the two regions that it has successfully built."
"My contribution to the SAB will be based on more than 40 years of computing systems and applications research, including periods in industry, ten years playing leading roles in e-Science and many projects concerned with data-intensive science. I have been involved in EU projects since FP2, and so may be able to contribute advice on how to get the best out of their collaborative opportunities, and have some experience of global efforts to establish and adopt relevant standards."
Malcolm Atkinson is Director of the e-Science Institute and National e-Science Centre and a professor of e-Science at the School of Informatics, University of Edinburgh.
Malcolm Atkinson has 43 years of academic service, having worked in seven universities. He has three years’ experience in industry, plus many years of experience consulting to industry on data management and training. He became a (full) professor in 1983 and led the Department of Computing Science at the University of Glasgow to the top levels of teaching and research assessment. He has worked on databases and programming languages, large-scale and long-lived applications, since 1970. Those applications of data include: horse racing administration, whiskey bottling, a wide variety of computer-aided design, healthcare and scientific research. He has taught systems, distributed systems, databases, programming languages and introductory courses. He has taught at all levels from introductory first-year courses to advanced post-doctoral summer schools. He was leader of the International Summer School in Grid Computing for three years, a member of the British Computer Society accreditation committee for fourteen years, a member of the group that drew up the UK’s first software engineering curriculum (for BCS and IEE (now IET) accreditation) and initiated the education and training group at the Open Grid Forum. He is currently professor of e-Science in the School of Informatics, University of Edinburgh, Director of the e-Science Institute and UK e-Science Envoy. He was until recently a member of the JISC Board and the Open Grid Forum Board, as well as many advisory boards. He has taken a leading role in five books, all edited collections.
Malcolm Atkinson has led research projects continuously since 1978 on data management, programming languages, computational systems and e-Science. He directed the National e-Science Centre and e-Science Institute from 2001. He led the design of OGSA-DAI, and was data-area chair when standards for data-access and integration were developed at the Grid Forum. He was involved in six European Commission funded projects since 2001, and was most recently chief architect on the ADMIRE project, which supported the effort of developing this book. He is involved in two new EU projects in data infrastructure for research until August 2015, and currently has projects funded by three of the UK’s research councils.
The concept of Virtual Research Environments (VRE) is used in EUBrazilOpenBio to provide the users with a single entry point to processing and data resources. The main services offered by this VRE are:
§ Species Products Discovery
This service enables users to discover and manage species products (occurrence data and taxa names) from a number of heterogeneous providers in a seamless way. The service supports different features for discovery and management of taxa names and occurrence points, and it allows objects to be stored in the user workspace for future use.
§ Ecological Niche Modelling
This service enables users to define and manage ecological niche modelling tasks. These tasks consist of complex and computationally intensive model creation, testing, and projection activities. Users are provided with a feature-rich environment allowing them to characterise such tasks by specifying the data to be exploited (occurrence records and environmental parameters), the algorithms to use and the parameters to be considered during the testing phase. Tasks can be monitored after being submitted. Moreover, the results of each task can be easily accessed, visualised and saved into the user workspace for future use.
§ Cross-mapping
This service enables users to compare different taxonomic checklists. Checklists can be either produced by relying on the Species Products Discovery or owned by the user. Users are provided with a feature-rich environment enabling them to import checklists to be compared, to inspect the content of every checklist, to compare (a.k.a. cross-map) two checklists by using diverse comparison mechanisms, to visualise the results of a comparison and to save it for future use.
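As an illustration of why niche modelling tasks become computationally intensive, the sketch below expands an experiment into one unit of work per species, algorithm and environmental scenario. It is a minimal sketch, not the platform's actual data model: the `ModellingTask` class, the `plan_experiment` function and the algorithm and scenario labels are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class ModellingTask:
    """One unit of work: a single species, algorithm and environmental scenario.

    Hypothetical structure, used only to illustrate how an experiment grows.
    """
    species: str
    algorithm: str
    scenario: str


def plan_experiment(species: Sequence[str],
                    algorithms: Sequence[str],
                    scenarios: Sequence[str]) -> List[ModellingTask]:
    """Expand an experiment into every species/algorithm/scenario combination."""
    return [ModellingTask(s, a, e)
            for s, a, e in product(species, algorithms, scenarios)]


if __name__ == "__main__":
    tasks = plan_experiment(
        species=["Adenocalymma dichilum", "Fridericia elegans"],
        algorithms=["algorithm_a", "algorithm_b"],  # placeholder labels
        scenarios=["present_conditions"],           # placeholder label
    )
    # The number of tasks is the product of the three list lengths.
    print(len(tasks), "independent tasks to create, test and project")
```

Because each combination is independent of the others, the tasks can in principle be distributed across on-demand computing resources, which is the kind of workload the service is described as targeting.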
Cross-Mapping Taxonomic Checklists in the Passifloraceae family: Learn how to use EUBrazilOpenBio cross-mapping tool
This exercise focuses on obtaining the list of taxa present in the family Passifloraceae in the List of Species of the Brazilian Flora (Brazilian Flora; LSBF) that are not found in the Catalogue of Life (CoL). The exercise takes you through the various steps: getting the Darwin Core files with taxonomic checklists, running the cross-map algorithm, and checking and exporting the results.
In this issue:
Training: Exercise n.2: Creating distribution models for Adenocalymma dichilum
Event: EUBrazilOpenBio cross-mapping e-service at INTECOL 2013
Paper: "EUBrazilOpenBio cross-mapping e-service: comparative analysis of data from the List of Species of the Brazilian Flora versus the Catalogue of Life"
The objective of this exercise is to study Adenocalymma dichilum A.H.Gentry. This species is interesting because it is endemic to Brazil and its conservation status is “Data Deficient” according to the Official List of Threatened Brazilian Plant Species. However, there are 20 points with distinct geographic coordinates for this species available from speciesLink, all of them occurring in a well-delimited region in Brazil, suggesting that this species' distribution is in fact restricted. The available occurrence points may be sufficient to create a reliable distribution model, and thus it would be possible to get an idea of whether the distribution is indeed poorly known or actually restricted.
Such information could be used to reassess its conservation status.
The exercise will go through the following steps: Retrieval of occurrence points and creation of the models; analyses of the results of the models; projection of the models under existing conditions and analyses of the projections.
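The exercises count occurrence points "with distinct geographic coordinates". A minimal sketch of that filtering step is shown below, assuming occurrence records are available as (latitude, longitude) pairs. The `distinct_points` function, the rounding precision and the sample coordinates are illustrative assumptions, not part of the EUBrazilOpenBio tools or real specimen data.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def distinct_points(points: Iterable[Point], decimals: int = 4) -> List[Point]:
    """Return the points with distinct coordinates, keeping first-seen order.

    Coordinates are rounded to `decimals` places before comparison; the
    default precision is an arbitrary choice made for this sketch.
    """
    seen = set()
    unique: List[Point] = []
    for lat, lon in points:
        key = (round(lat, decimals), round(lon, decimals))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    # Made-up coordinates: the first two records share the same location.
    records = [(-10.5, -40.25), (-10.5, -40.25), (-11.0, -41.0)]
    print(len(distinct_points(records)), "distinct points")
```

Repeated collections at the same locality add no new spatial information to a distribution model, which is why the exercises report the number of distinct points rather than the number of records.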
Register or log in to download the exercises and
The document contains 2 additional exercises that you can perform through the EUBrazilOpenBio eInfrastructure:
Study on Fridericia elegans
The objective of this second exercise is to study Fridericia elegans (Vell.) L.G.Lohmann. This species is also endemic to Brazil, and it was selected because it has only 5 points with distinct coordinates, all of which occur in a very restricted area, suggesting that this plant is endemic to a small region of the Atlantic Forest in Rio de Janeiro. In this case, the distribution model generated with 5 points would be considered exploratory, but it could still be used to plan new collections of this species to evaluate whether its distribution is indeed restricted or whether the species has not been well sampled across its distribution. Fieldwork planned according to the potential distribution map would have a higher probability of success and require far fewer resources.
The exercise will go through the following steps: Identification of all the synonyms for the species; retrieval of occurrence points and creation of the model; analysis of the results of the model; projection of the model under existing conditions; and analysis of the projections.
The objective of this last exercise is to study Tanaecium neobrasiliense (Baill.) L.G.Lohmann. Although this species is apparently not well collected, with only 8 points with distinct coordinates, it is widely distributed and has no problem regarding its conservation status (it is not in the Official List of Threatened Brazilian Plant Species). Unlike the other 2 cases, this species has a broad distribution, although with many sampling gaps. In this case, the potential distribution model could be useful to guide new collections mainly in these gaps and help to evaluate if the species is well-represented in Protected Areas in Brazil.
The exercise will go through the following steps: Identification of all the synonyms for the species; retrieval of occurrence points and creation of the model; analysis of the results of the model; projection of the model under existing conditions; and analysis of the projections. This case requires the same procedure as the previous case, although the conclusions are different.
The trainee should already have gone through the following modules:
- OpenBio e-Infrastructure essentials. At least the first two modules of the three blocks of this section:
- “All about EUBrazilOpenBio: training module”
- “Introduction for Biodiversity experts”
- “Introduction for Computer Scientists”
- Introduction to Biodiversity experts. At least the first blocks of this section:
- “Introduction for Biodiversity experts. EUBrazilOpenBio Essentials”
- “Ecological Niche Modelling with EUBrazilOpenBio”
- “EUBrazilOpenBio tools for Taxa cross mapping and Niche modelling – demo”
The Bignoniaceae, the bignonias, are a family of flowering plants in the order Lamiales. The family has a nearly cosmopolitan distribution, but is mostly tropical, with a few species native to the temperate zones. Its greatest diversity is found in northern South America; it includes a large tribe, Bignonieae (383 species), which represents the most diverse and abundant clade of lianas in the Neotropics.
In an EUBrazilOpenBio case study that tested the CoL Cross-mapping tool and transformed the results obtained into a proto-Global Species Database of Bignoniaceae to be included in the Catalogue of Life, researchers discovered that of the 393 species of Bignoniaceae present in the List of Species of the Brazilian Flora, 368 are not in the Catalogue of Life. Besides revealing this large difference, the cross-map results provided more specific information on species relationships and are another step towards the improvement of the CoL dataset regarding this family.
The EUBrazilOpenBio case study will be presented as a poster and an oral presentation at the 11th "INTECOL Congress: Ecology - Into the next 100 years", which will be held in London from 18-23 August 2013.
This paper was written by the EUBrazilOpenBio consortium and accepted for the 5th International Workshop on Science Gateways, IWSG 2013, which will take place from 3-5 June 2013 in Zurich, Switzerland.
EUBrazilOpenBio is a collaborative initiative addressing strategic barriers in biodiversity research by integrating open access data and user-friendly tools widely available in Brazil and Europe. The project deploys the EU-Brazil cloud-based e-infrastructure that allows the sharing of hardware, software and data on-demand. This e-Infrastructure provides access to several integrated services and resources to seamlessly aggregate taxonomic, biodiversity and climate data, used by processing services implementing checklist cross-mapping and ecological niche modelling. The concept of Virtual Research Environments is used to provide the users with a single entry point to processing and data resources. This article describes the architecture, demonstration use cases and initial experimental
Keywords—Biodiversity, Data Infrastructure, Virtual Research Environments, Cloud, Taxonomy, Ecological niche modelling
AUTHORS: Rafael Amaral, Rosa Badia, Ignacio Blanquer, Leonardo Candela, Donatella Castelli, Renato Giovanni, Alex Gray, Andrew Jones, Daniele Lezzi, Pasquale Pagano, Vanderlei Perez-Canhos, Francisco Quevedo, Roger Rafanell, Vinod Rebello and Erik Torres
EUBrazilOpenBio eTraining - Programme Introduction
Addressing biodiversity challenges
Biodiversity boosts ecosystem productivity, where each species, no matter how small, has an important role to play. Improving research, enabling access to open resources and sharing results are fundamental steps in understanding the impact of climate change and human exploitation of natural resources. EUBrazilOpenBio is focused on tackling the complexity of biodiversity science, such as the diversity of multidisciplinary datasets spanning from climatology to earth sciences, by integrating advanced computing resources with data sources across Europe and Brazil.
"EUBrazilOpenBio is addressing a very large user community, which has very little knowledge about the EUBrazilOpenBio infrastructure. It is therefore important to have material that can help new users to understand the potential of this system.
Our two main use cases deal with the Integration between Regional & Global Taxonomies, and with Data usability and the use of ecological niche modeling. For these two Use Cases we have two applications available in the EUBrazilOpenBio webportal and several services to be exploited by the biodiversity user communities. Thus this training material will help biodiversity users to understand and use these applications, and see the benefits they can gain".
Ignacio Blanquer, UPV
EUBrazilOpenBio Training activities coordinator
Who is the eTraining for?
Educating and enabling current and potential users of EUBrazilOpenBio is key to unlocking new knowledge and shaping effective policy on biodiversity challenges. The EUBrazilOpenBio anytime, anywhere eTraining tools are designed for:
- Researchers: Biodiversity, Life science, Climate Change
- Application Developers: technical advice and guidance on adapting applications to the infrastructure
- Regulatory authorities and policy decision-makers
How is the training structured?
The eTraining tools comprise step-by-step modules that can be used anytime, anywhere. The first module is an Introduction to EUBrazilOpenBio recommended to all trainees to understand the value-add and basic concepts of this international initiative. The other modules are specifically targeted at application developers with specific technical content or to researchers with a focus on the use cases and benefits.
To access the training material, we simply ask you to register for free on the EUBrazilOpenBio web channel.
You will then be able to access not only the applications, but also the general services for species data discovery (occurrences and taxonomies) and the geo server included in the training.
This paper was submitted to the 27th International Conference on Advanced Information Networking and Applications Workshops, AINA 2013, 25-28 March 2013 in Barcelona, Spain.
Abstract - In recent decades, biologists have relied on their own resources and tools to run experiments and store the results of their analyses. However, the explosion of big data and the growing availability of computational methods are held back by the lack of computational and storage resources.
Cloud computing platforms are emerging as a potential solution to overcome these limitations, but adapting applications so that scientific users can benefit from resources acquired on demand is a complex process requiring multidisciplinary expertise.
The EUBrazilOpenBio initiative is implementing an e-Infrastructure that provides the biodiversity community with a rich set of computational and data resources exploiting existing cloud technologies from the EU and Brazil. This paper presents the implementation of one of the two selected use cases, environmental niche modelling, by implementing the workflow through the COMPSs framework and deploying it on the EUBrazilOpenBio platform. The proposed approach has been evaluated on a Cloud testbed managed by the VENUS-C middleware.
- Daniele Lezzi, Roger Rafanell, Rosa M. Badia, Department of Computer Sciences Barcelona Supercomputing Center, Barcelona, Spain
- Erik Torres, Ignacio Blanquer, Artificial Intelligence Research Institute (IIIA), Spanish National Research Council (CSIC), Instituto de Instrumentacion para Imagen Molecular (I3M), Centro mixto CSIC, Universitat Politecnica de Valencia - CIEMAT, Valencia, Spain
- Renato De Giovanni, Centro de Referência em Informação Ambiental, Campinas, SP, Brasil
EUBrazilOpenBio features in the EGI Demo Booth 3 at the EGI Community Forum 8-12 April 2013 in Manchester, UK. The Demo Booth showcases the project's support of biodiversity using cloud computing. EUBrazilOpenBio is focused on federating and integrating existing multidisciplinary data sets, European and Brazilian infrastructures and resources to enable biodiversity scientists. Benefits include ease of use, faster time to get results and cost efficiency. The European Grid Infrastructure (EGI), which is leading the transition through virtualisation to a cloud-based e-infrastructure, co-hosts the Community Forum with the UK NGI, a partnership between GridPP and the National e-Infrastructure Service (NES).
The EUBrazilOpenBio Infrastructure will not be built from scratch. It will leverage previous efforts and activities to deliver an infrastructure aggregating resources (data, tools, services) in the biodiversity domain from existing EU and Brazilian infrastructures and services.
In order to process the data and to generate results that are of interest to the biodiversity community, several components are required and have been identified for integration into the EUBrazilOpenBio eInfrastructure. One of these is the cross-reference tool of i4Life, which performs cross-mapping tasks between different checklists.
A “cross-map” relates the lists of species and other taxa in one species information system to those in another. It is a fundamental element in making it possible to perform data aggregation and complex analyses that require data from multiple, diverse species information systems.
One of the objectives for the EUBrazilOpenBio cross-mapping tool is to improve the performance (response time) of this facility by using the computing resources available in the EUBrazilOpenBio infrastructure. The application consists of 2 components: a portlet realising the facility front end and a web service realising the cross-mapping business logic.
A new version of the i4Life cross-mapping tool
EUBrazilOpenBio developed a new version of the i4Life cross-mapping tool to compare regional and global taxonomies, such as the list of species of Brazilian Flora, containing over 43,000 species plus around 30,000 synonyms, and the global Species2000/ITIS Catalogue of Life (CoL), indexing about 250,000 plant species and 300,000 synonyms.
The EUBrazilOpenBio cross-mapping tool enables taxonomists and data curators to find relationships between lists of species and higher taxa in two different species information systems. Examples of relationships are: “not_found_in”, “corresponds”, “includes”, “included_by”, and “overlaps”. This tool makes it easier for scientists to work with diverse taxonomic data from multiple sources.
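The relationship names listed above can be illustrated with a small sketch. It assumes, purely for illustration, that each taxon is represented by the set of names it covers (its accepted name plus synonyms) and that relationships follow from set comparison. This is not the algorithm used by the i4Life or EUBrazilOpenBio tool; the `relate` and `cross_map` functions and the taxon names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# A checklist maps a taxon identifier to the set of names it covers
# (accepted name plus synonyms). Assumed representation for this sketch.
Checklist = Dict[str, FrozenSet[str]]


def relate(names_a: FrozenSet[str], names_b: FrozenSet[str]) -> Optional[str]:
    """Classify the relationship of taxon A to taxon B by comparing name sets."""
    if names_a == names_b:
        return "corresponds"
    if names_a > names_b:   # A covers everything in B and more
        return "includes"
    if names_a < names_b:   # B covers everything in A and more
        return "included_by"
    if names_a & names_b:   # partial overlap only
        return "overlaps"
    return None             # no names in common


def cross_map(left: Checklist,
              right: Checklist) -> List[Tuple[str, str, Optional[str]]]:
    """Relate every taxon in `left` to the taxa in `right`."""
    results: List[Tuple[str, str, Optional[str]]] = []
    for left_id, left_names in left.items():
        matched = False
        for right_id, right_names in right.items():
            relation = relate(left_names, right_names)
            if relation is not None:
                results.append((left_id, relation, right_id))
                matched = True
        if not matched:
            results.append((left_id, "not_found_in", None))
    return results


if __name__ == "__main__":
    # Made-up names, not real taxa.
    regional = {
        "L1": frozenset({"Aus bus"}),
        "L2": frozenset({"Aus cus", "Aus dus"}),
        "L3": frozenset({"Aus eus"}),
        "L4": frozenset({"Aus fus", "Aus gus"}),
    }
    global_list = {
        "R1": frozenset({"Aus bus"}),
        "R2": frozenset({"Aus cus"}),
        "R3": frozenset({"Aus fus", "Aus hus"}),
    }
    for row in cross_map(regional, global_list):
        print(row)
```

A real cross-map must also handle spelling variants, authorship and rank, so exact set comparison is only a simplified stand-in for those comparison mechanisms.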
The 7th Framework Programme promoted a number of important developments
to shape the European Research Area. Amongst these are the developments
of research infrastructures at the European scale (ESFRI and IA
projects) and of a number of large Joint Programming Initiatives. These
developments are often lacking mutual interaction and coordination.
The Research Infrastructures Unit of DG RTD and the infrastructure project LifeWatch have invited around 40 FP7 initiatives working in biodiversity and ecosystem research to a workshop to develop synergies between ESFRI research infrastructures (RI) and existing research infrastructures.
The agenda will address the synergies between the biodiversity components of different initiatives, also in view of the supporting role of the European research infrastructures in this area; and a strategy for the development of biodiversity research infrastructures in the next ten years in view of emerging scientific and technical challenges.
The EUBrazilOpenBio scientific poster highlights the key elements of the project in terms of technologies, the computing and storage resources involved, the data repositories EUBrazilOpenBio leverages, the Gateway as the access point to the project applications and, last but not least, the 2 use cases that are guiding the development of the eInfrastructure.
The poster will be showcased for the first time at CloudscapeV, 27-28 February 2013, in Brussels.
The international weekly online publication iSGTW, covering distributed computing and the research it enables, dedicated its 23 January main feature to EUBrazilOpenBio and its effort to help biodiversity scientists solve taxonomic problems, such as the cross-referencing of regional and global taxonomic data sets and the complex differences that exist in taxonomic classification.
On 20 November the seminar “Connecting Brazil and Europe”, which took place in Brussels, brought together high-level Brazilian and European officials, practitioners, academics and industry representatives.
EUBrazilOpenBio was among the initiatives taking part in the session dedicated to EU-Brazil cooperation for research and development.
Rosa Badia, EUBrazilOpenBio European coordinator, presented the project results achieved in the first year of activities.
Press Release - December 2012
The EUBrazilOpenBio project is making user-friendly tools more widely available to specialists in Brazil and Europe.
Leveraging the “taxonomic intelligence” of the cross-mapping tool developed by Cardiff University, it supports a pilot study that analyses the regional List of Species of Brazilian Flora, containing around 10,000 species, against the global index of plants within the Species2000/ITIS Catalogue of Life (CoL), indexing about 150,000 plant species.
In addition, EUBrazilOpenBio is building on the on-going efforts of the Brazilian National Institute of Science & Technology – Virtual Herbarium of Flora and Fungi. Using a standard procedure for plant species native to Brazil, the aim is to generate species distribution (or ecological niche) models based on specimen data.
The cross-mapping tool and the ecological niche modelling tool are available from the EUBrazilOpenBio Gateway, the project's open-access platform where existing European and Brazilian infrastructures and resources are integrated.
EUBrazilOpenBio arrives at a time when “systems” engineers face challenging tasks, such as rapidly evolving requirements, large-scale distribution of resources and players, and heterogeneity of data. Ad-hoc solutions too often prove unsustainable.
At the same time, e-Infrastructures are becoming increasingly important tools for scientific discovery, enabling researchers across the world to share access to unique or distributed scientific facilities through user-friendly interfaces.
To respond to these challenges, EUBrazilOpenBio adopts what is called a Hybrid Data Infrastructure approach: a facility where research resources can be shared and exploited on demand, built on existing systems, infrastructures and repositories, conceived to supplement but not supplant “systems” mandates and arrangements, and supporting an innovative application delivery model.
This video gives an introduction to EUBrazilOpenBio eInfrastructure essentials and it is the first module of the training programme that will accompany all EUBrazilOpenBio target users through a path towards the complete understanding, use and re-use of EUBrazilOpenBio tools and applications.
In line with its implementation roadmap, the EUBrazilOpenBio eInfrastructure has evolved from its testbed architecture to the actual technological environment hosting the initial versions of the EUBrazilOpenBio services. This so-called "Production e-Infrastructure" will allow users to provide the team of developers with feedback and requests for new features, improvements and, if necessary, support.
The first Production version of the EUBrazilOpenBio e-Infrastructure was officially presented to the scientific community in Recife during the EUBrazilOpenBio - INCT Joint Workshop in September 2012, when the Biodiversity Virtual Research Environment with the two Use Case services was showcased to an audience of experts from the Brazilian Virtual Herbaria. It has since been improved to offer the 2 project use case tools, i.e. ecological niche modelling and cross-mapping, as services to the biodiversity scientific community.
In particular, the current release of the EUBrazilOpenBio eInfrastructure has:
* an enhanced version of the Species Products Discovery tool that allows users:
(a) to discover and access Occurrence Points from GBIF and speciesLink;
(b) to discover and access taxonomic names from Catalogue of Life and Flora do Brazil;
(c) to produce Checklists in DarwinCore-Archive format and store them in the Workspace;
(d) to save occurrence points in openModeller format in the Workspace to make these data usable by the Ecological Niche Modeling;
(e) to visualize occurrence points on a map;
* an enhanced version of the Ecological Niche Modeling application, deployed in the EUBrazilOpenBio Gateway, that enables users:
(a) to define and execute complex experiments consisting of models creation, testing and projection on multiple species;
(b) to rely on large scale computational facilities offered by COMPSs;
(c) to monitor the execution of an experiment;
(d) to interact with the workspace to store and load occurrence points, experiments and experiment results;
(e) to publish and visualise species distribution maps by relying on GIS technologies;
* an enhanced version of the Cross Mapping application, deployed in the EUBrazilOpenBio Gateway. It allows users:
(a) to perform a cross mapping of checklists produced via the Species Products Discovery as well as any other checklist in DWC-A format;
(b) to visualise and inspect the results of a cross-mapping in a user-friendly way, e.g. providing a user with a motivation justifying the suggested relation between two taxa;
(c) to interact with the workspace to store and load checklists and cross-mapping results;
* the 47 gCube Hosting Nodes have been upgraded to version 3.5.0 (part of gCube 2.11.0);
* gCube Services have been upgraded to the gCube 2.11.0 release.
Finally, a GeoExplorer application has been deployed in the EUBrazilOpenBio Gateway to make it possible to display maps produced via the Ecological Niche Modeling.
Rosa Badia, EUBrazilOpenBio European Coordinator, was among the speakers of the seminar "Connecting the EU and Brazil", which took place in Brussels on 20th November organized by CEPS, the Centre for European Policy Studies. The seminar saw the presence of representatives of the EUBrasil Association and of Mario Campolargo, Director for "Net Futures", DG CONNECT, European Commission.
"ICT is a very important priority for the EU-Brazil partnership, which must be based on mutual trust, co-operation and a “win-win” mentality. In 2011, both parts discussed issues such as cloud computing, sustainable technologies, and smart services and applications, but this year “we decided to go further” and organise workshops".
Mario Campolargo, Responsible for Research and Innovation DG Connect.
On 20th November the seminar “Connecting Brazil and Europe”, which took place in Brussels, brought together high-level Brazilian and EU officials, practitioners, academics and industry representatives from both sides of the Atlantic.
EUBrazilOpenBio was among the initiatives taking part in the session dedicated to EU-Brazil cooperation for Research and Development. Rosa Badia, EUBrazilOpenBio EU coordinator, presented the project results achieved in the first year of activities.
A recent article published by NewEurope - The European Political Newspaper - highlights the results of this meeting presenting EUBrazilOpenBio as one of the projects aimed at strengthening the relation between both parts through new technologies and services.
Over 130 participants attended the EUBrazilOpenBio and INCT joint workshop "Advancing Biodiversity e-science innovation through Global Cooperation", on 19 and 20 September 2012 in Recife, Brazil. Participants came from the Brazilian Virtual Herbaria, the Brazilian Government and Brazilian and European Institutions.
The whole EUBrazilOpenBio consortium took part in the event.
Addressing biodiversity challenges
Biodiversity boosts ecosystem productivity, where each species, no matter how small,
has an important role to play. Improving research, enabling access
to open resources and sharing results are fundamental steps in
understanding the impact of climate change and human exploitation of
natural resources. EUBrazilOpenBio is focused on tackling the complexity
of biodiversity science such as the diversity of multidisciplinary
datasets spanning from climatology to earth sciences by integrating
advanced computing resources with data sources across Europe and Brazil.
"EUBrazilOpenBio is addressing a very large user community, which has very little knowledge about the infrastructure. It is therefore important to have material that can help new users to understand the potential of this system. For this purpose, we adopt the following approach. First of all, some basic material will be produced to explain the use cases EUBrazilOpenBio is focusing on. Our 2 main use cases deal with the Integration between Regional & Global Taxonomies, and with Data usability and the use of ecological niche modeling. For these 2 use cases we have 2 applications available in the EUBrazilOpenBio web portal and several services to be exploited by the biodiversity user communities. Thus this training material will help biodiversity users to understand and use these applications, and see the benefits they can gain. Along with this material, we want to capture new users and new sub-communities in the biodiversity field. Material will be available for training new developers who could bring new applications to the EUBrazilOpenBio platform. This technical material will help them in using the different components of the technology we have. Finally, we have to frame this with other training activities for the benefit of higher education institutions and universities, with training materials combining our technology with the basic principles of taxonomy and ecological niche modelling."
Ignacio Blanquer, UPV
EUBrazilOpenBio Training activities coordinator
EUBrazilOpenBio training plan
Who is the eTraining for?
Training and enabling current and potential users of EUBrazilOpenBio is key to
unlocking new knowledge and shaping effective policy on biodiversity
challenges. The EUBrazilOpenBio anytime, anywhere eTraining tools are aimed at:
- Researchers: Biodiversity, Life science, Climate Change
- Application Developers: technical advice and guidance on adapting applications to the infrastructure
- Regulatory authorities and policy decision-makers
How is the training structured?
eTraining tools comprise step-by-step modules that can be used anytime,
anywhere. The first module is an Introduction to EUBrazilOpenBio
recommended to all trainees to understand the value-add and basic
concepts of this international initiative. The other modules are
specifically targeted at application developers with specific technical
content or at researchers with a focus on the use cases and benefits.
To access the training material, we simply ask you to register for free on the EUBrazilOpenBio web channel.
You will then be able to access not only the applications, but also the general services for species data discovery (occurrences and taxonomies) and the geo server included in the training.
The workshop “Advancing Biodiversity e-science innovation through global cooperation”, jointly organized in Recife on 19 and 20 September 2012 by the Brazilian Virtual Herbarium & EUBrazilOpenBio, addressed collaborative research between biodiversity and computer scientists over two days of intense dialogue.
The following are some of the main conclusions retained after the workshop. Register and download them all.
1. Make it easy
EUBrazilOpenBio is providing the biodiversity community with an infrastructure that, by combining the characteristics of data and cloud infrastructures, is capable of supporting their application scenarios in an innovative way...
2. Go cloud
One of the promises of the cloud is ease of use and flexibility. What about EUBrazilOpenBio applications?
3. Biodiversity as a start
The EUBrazilOpenBio infrastructure should be viewed as not only supporting biodiversity scientific research....
4. Support research policy at regional level
"Although some problems addressed are global, key decisions need to be made at a local level"
5. Start local to then go global
The EUBrazilOpenBio infrastructure will be a valuable part of the larger biodiversity infrastructure
6. Open Access for open-minded citizens
Provision of open data to citizens is also a powerful way of raising awareness of environmental issues
7. One size network does not fit all
Network requirements need to be met and adequate network performance ensured to be able to transfer and manage data across different disciplines
8. Think big
How can EUBrazilOpenBio contribute to Brazil and Europe's economy?
With species from online records being served by SpeciesLink and Brazil’s Virtual Herbarium representing 91.4% of all Brazilian Angiosperm species, collaborative research opens up considerable opportunities for the international biodiversity research community. Science is becoming increasingly data driven and multidisciplinary, turning researchers from many fields of inquiry into data scientists. For their part, biodiversity researchers need complex mathematical models to analyse large data sets from different sources, and easier access to that data. However, they typically lack advanced IT skills. One of the promises of the cloud is ease of use and flexibility. Access to large data and new services like the cloud will enable researchers from many domains to make discoveries they could not make before. Together, the data deluge, cloud computing and open access are creating a revolution in scientific research. [...]
Register and download the full press release
Today, the systematic utility of Ecological Niche Modelling (ENM) is well known, and the models created with this technique are used in several disciplines as a standard tool to predict the distribution of species in geographic space.
ENM algorithms include a wide range of methods for modelling species' distributions that vary in how they analyse the data and construct the models. While some of the methods are based on relatively simple and understandable mathematical principles (e.g. Bioclim, or Environmental Distance), several more recent algorithms have foundations in complex statistical research that is opaque to end users (e.g. GARP, or Maxent). Often, biodiversity scientists need to delve into these black boxes, uncovering their inner workings in order to achieve better models for noisy data or rare species. Also, conventional methods have often scaled poorly, while the new methods take less time to complete and have a greater capacity for dealing with high-resolution datasets.
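To make the "relatively simple" end of this spectrum concrete, the sketch below builds a bare-bones environmental envelope in the spirit of Bioclim: it records the range of each environmental variable at known occurrence points and predicts a location as suitable only if all of its values fall inside those ranges. This is a deliberate simplification for illustration, not the openModeller implementation; the function names, variable names and sample values are all hypothetical.

```python
from typing import Dict, List, Tuple

Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(occurrences: List[Dict[str, float]]) -> Envelope:
    """Record the min and max of each environmental variable at the occurrence points."""
    variables = occurrences[0].keys()
    return {
        var: (min(p[var] for p in occurrences), max(p[var] for p in occurrences))
        for var in variables
    }


def is_suitable(envelope: Envelope, cell: Dict[str, float]) -> bool:
    """A cell is predicted suitable only if every variable lies inside the envelope."""
    return all(low <= cell[var] <= high for var, (low, high) in envelope.items())


# Hypothetical environmental values sampled at three occurrence points.
occurrences = [
    {"mean_temp_c": 22.0, "annual_rain_mm": 1400.0},
    {"mean_temp_c": 25.5, "annual_rain_mm": 1800.0},
    {"mean_temp_c": 24.0, "annual_rain_mm": 2100.0},
]
envelope = fit_envelope(occurrences)

inside = is_suitable(envelope, {"mean_temp_c": 23.0, "annual_rain_mm": 1600.0})
too_hot = is_suitable(envelope, {"mean_temp_c": 30.0, "annual_rain_mm": 1600.0})
print(inside, too_hot)  # True False
```

Because every step is a comparison against a range, a user can see exactly why a location was accepted or rejected, which is the transparency that the more complex statistical algorithms lack.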
What is it and what is its purpose?
The ENM algorithm profile manager is a software artefact that is being developed in the context of EUBrazilOpenBio to facilitate the use and customization of ENM algorithms within the project's infrastructure. It provides biodiversity experts with the necessary information about the algorithms available to the ENM facilities, which the infrastructure provides together with the required computing and storage resources. It also provides the means for editing the parameters of the algorithms and saving them as customized profiles that are available to the users for creating new models.
ENM algorithm profile manager users
Although biodiversity scientists are the primary users of the ENM algorithm profiles, other users within the biodiversity community can benefit from the manager. In particular, developers of new ENM algorithms and species distribution predictors can use the manager to create and handle large sets of parametrised algorithm profiles that are used to test and validate the new methods under development. Also, database compilers can find the ENM algorithm profile manager useful for maintaining their own collections of models and model projections created from the latest algorithm profiles developed by biodiversity researchers.
The ENM algorithm profile manager is accessible to users through the EUBrazilOpenBio Graphical User Interface (GUI) for ENM. It shows the information on the algorithms in a simple and practical manner, resembling the openModeller Desktop GUI, a front-end to the widely used openModeller library. The manager supports two different kinds of algorithm profiles: 1) non-editable system profiles, which are available to all users and are likely to be the state-of-the-art of ENM practices, and 2) editable user profiles, which are private, their application being linked to the particular needs of their creators.
The integration of the ENM algorithm profile manager in the EUBrazilOpenBio GUI for ENM allows biodiversity experts to use ENM algorithm profiles as part of their experiments. For example, they can customize the algorithm profiles to create models in their experiments from datasets of large spatial extent and high resolution. Also, they can use several algorithm profiles simultaneously to create and project models in bulk for a high number of species (e.g. create models for all the species in the List of Species of the Brazilian Flora).
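The distinction between non-editable system profiles and private, editable user profiles can be sketched as a small data structure. The class below is only an illustration of that rule and not the manager's actual code; the class name, the method names and the "cutoff" parameter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AlgorithmProfile:
    name: str
    algorithm: str
    parameters: Dict[str, float] = field(default_factory=dict)
    owner: Optional[str] = None  # None marks a system profile (assumption of this sketch)

    @property
    def is_system(self) -> bool:
        return self.owner is None

    def customise(self, user: str, new_name: str, **overrides: float) -> "AlgorithmProfile":
        """Return a private, editable copy with some parameters overridden."""
        merged = {**self.parameters, **overrides}
        return AlgorithmProfile(new_name, self.algorithm, merged, owner=user)

    def set_parameter(self, user: str, key: str, value: float) -> None:
        """Only the owner of a user profile may edit it; system profiles are read-only."""
        if self.is_system or self.owner != user:
            raise PermissionError(f"{user} cannot edit profile {self.name!r}")
        self.parameters[key] = value


# A system profile shared by everyone, and a user's private variant of it.
system_profile = AlgorithmProfile("Default envelope", "envelope", {"cutoff": 0.75})
my_profile = system_profile.customise("alice", "Tight envelope", cutoff=0.5)
my_profile.set_parameter("alice", "cutoff", 0.25)

try:
    system_profile.set_parameter("alice", "cutoff", 0.25)
    edit_allowed = True
except PermissionError:
    edit_allowed = False
print(my_profile.parameters, edit_allowed)
```

Copying a system profile before editing keeps the shared defaults intact, so one profile can safely be reused across many species or experiments.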
Collaboration and sharing of data across geographical boundaries and disciplines is becoming increasingly important and beneficial to researchers worldwide. gCube is an open-source distributed system specifically designed to operate data e-infrastructures for the benefit of different research communities. gCube allows distributed and dynamic communities (called Virtual Organisations) to efficiently collaborate and share resources through common Virtual Research Environments (VREs), configured as applications on the top of different e-Infrastructures.
VRE applications are dynamic and interactive aggregations of both data collections and data management services with interfaces for a variety of actors, from end-users to administrators, in a variety of life science and biodiversity domains. Virtual Research Environments are managed by Virtual Organisations that define scopes and resources of each VRE. A new VRE can be dynamically set up at the time and for the time the community needs it, satisfying the requirements of both long and short term projects. By offering mechanisms that concurrently exploit networked resources and data in a seamless fashion, gCube enables scientific communities to share and operate storage and computational resources within a common framework through a personalized interface, regardless of the geographical location of their research facilities. Virtual Organisations can deploy Virtual Research Environments on demand, dynamically managing the configuration and the lifetime of their services autonomously.
gCube currently counts more than 500 software packages to cover different needs. gCube has also been used to serve the needs of AquaMaps, a third-party service widely used by the biodiversity scientific community. A dedicated VRE supporting the generation of species distribution maps through sophisticated analyses of data integrated from a variety of relevant sources is currently operated by the D4Science e-infrastructure.
How do gCube VREs work?
Once it has joined the D4Science e-Infrastructure, each community (i.e. Virtual Organisation, VO) is provided with a set of resources offered by the infrastructure, e.g. databases, storage areas and worker nodes. It can also register its own resources under its domain and authorize its users, assigning special administrative roles to some of them. Through the Information System facility, the VO administrator manages all the community's resources, whether provided by the community or assigned to it by D4Science. This makes it possible to monitor the resources (e.g. to observe the load, the status and the available free memory), to manage them (e.g. to start, clean and restart each remote resource) and to observe the accounting information (e.g. the operations requested by each user, the frequency of logins and the services that are exploited).
Then, VREs are created to serve the needs of the community’s members and users invited to join one or more VREs.
Each user logs in to the VO’s personalised portal and accesses the VREs he/she has been authorized to use. From there, the user will search, elaborate and store shared and personal information and will access all the services included in that VRE, for example data management services for indexing, accessing, searching, transforming, describing, and annotating data.
Different facilities help users to perform their research activities. The workspace is a storage area for user-related files and objects. It has been designed to serve the needs of a user that triggers the storage of objects and files in the workspace. Files can be kept private until the user explicitly asks to make them available to all other users. The result of any operation, e.g. data analysis, clustering or mining, can be saved in the personal workspace and, when needed, either downloaded or shared with other users.
A VRE can be modified by its administrators during its lifetime to adapt to changing requirements.
The EUBrazilOpenBio gCube applications so far
The Biodiversity VRE
Many data owners in the Biodiversity domain have difficulty in gaining consistent access to biodiversity and environmental data in enough detail and with relevant metadata. The management of those data covers the observations of species occurrences, their distribution mapping, and the visual and statistical analysis of both observations (occurrence points) and distributions (areas). It requires the import of structured data in various formats, in particular those compliant with Darwin Core as xml or csv datasets. There are concerns about the sheer number of datasets that have to be maintained, with multiple data streams and formats. Moreover, most species information, including occurrence data, is hooked to a scientific species name that is coined to designate a taxon at a specific level. However, both taxonomy (the way to split the diversity of living organisms into well-defined and identifiable species in a hierarchical classification) and nomenclature (the proper way to assign unique names to the different taxa) are constantly work in progress. This leads to the difficult situation where a species may be designated by several names (synonymies), or a name may designate several species (homonymies), over time or simultaneously according to different authors.
Compared to other initiatives, the Biodiversity VRE, available from the EUBrazilOpenBio Gateway, already offers the basic components to load, share, publish and analyze those data regardless of their format. The Biodiversity VRE offers unique facilities to biodiversity users to integrate data from various sources by combining in an innovative way knowledge about taxonomy, nomenclature, synonymies, and homonymies.
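The naming problem described above can be illustrated with a small index over checklist rows: grouping names by taxon exposes synonyms, and grouping taxa by name exposes homonyms. This is a minimal sketch rather than the VRE's reconciliation service; the row layout, the taxon identifiers and the placeholder names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical checklist rows: (scientific name as published, accepted taxon identifier).
checklist: List[Tuple[str, str]] = [
    ("Aus bus", "taxon-1"),
    ("Aus cus", "taxon-1"),  # synonym: a second name for taxon-1
    ("Dus eus", "taxon-2"),
    ("Dus eus", "taxon-3"),  # homonym: the same name used for a different taxon
]


def find_conflicts(rows: List[Tuple[str, str]]) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Return (synonyms by taxon, homonyms by name) found in the rows."""
    names_by_taxon = defaultdict(set)
    taxa_by_name = defaultdict(set)
    for name, taxon in rows:
        names_by_taxon[taxon].add(name)
        taxa_by_name[name].add(taxon)
    synonyms = {t: sorted(n) for t, n in names_by_taxon.items() if len(n) > 1}
    homonyms = {n: sorted(t) for n, t in taxa_by_name.items() if len(t) > 1}
    return synonyms, homonyms


synonyms, homonyms = find_conflicts(checklist)
print(synonyms)  # {'taxon-1': ['Aus bus', 'Aus cus']}
print(homonyms)  # {'Dus eus': ['taxon-2', 'taxon-3']}
```

Occurrence records filed under any name in a synonym group can then be pooled for the same taxon, while records filed under a homonym need extra context (such as the author or the source checklist) before they can be assigned.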
The Biodiversity Research Environment is designed to provide biodiversity users with a number of facilities for accessing and managing Biodiversity data. In particular, the VRE is equipped with a service for discovering species products (including occurrence points) from various data providers, including SpeciesLink, GBIF and Catalogue of Life.
The EUBrazilOpenBio gCube future applications
The EUBrazilOpenBio framework will comprise a number of gCube applications, such as the Species Occurrence Data Reconciliation and Enrichment Service for managing sets of occurrence points (each set from a particular species), the Environmental Data Access Service for managing environmental data, the Regional and Global Taxonomies Integration Service for managing species checklists (EUBrazilOpenBio Use Case 1), and the Ecological Niche Modeling Service for managing training, testing, and projection of species distribution models (EUBrazilOpenBio Use Case 2).
The workshop saw the coming together of the Brazilian Virtual Herbarium community to identify the requirements of the Biodiversity user community and to match these to the current eInfrastructure and applications being developed by the EUBrazilOpenBio initiative.
With keynote presentations from the Brazilian Ministry of Science, the workshop also examined the social impact potential of e-infrastructures that leverage cloud and grid technologies. This impact will also come from building stronger ties between Europe and Brazil to address global biodiversity challenges.
Below you can download all the presentations from the workshop.
DAY 1 - WEDNESDAY 19 SEPTEMBER
09:00 – 09:30 Welcome
- Anísio Brasileiro, Chancellor of the Federal University of Pernambuco (UFPE)
- Leonor Costa Maia, Coordinator of the Virtual Herbarium of Plants and Fungi of Brazil (BVH) and professor at UFPE
- Vanderlei Canhos, Coordinator of the EUBrazilOpenBio project and Director of CRIA, the Brazilian Reference Center on Environmental Information
09:30 – 10:30 SESSION 1: S&T INNOVATION IN EUROPE & BRAZIL – ACHIEVEMENTS AND PROSPECTS
A high-level view on progress, achievements and expectations for the future of biodiversity e-Infrastructures related to open access to data, e-infrastructures and cloud computing and applications in Europe and Brazil.
Chair: Nelson Simões, Executive Director, Brazilian Education and Research Network (RNP), Brazil
- 09:30 – 09:50 Representative from the Brazilian Ministry of Science, Technology, and Innovation (MCTI)
11:00 – 12:30 SESSION 2: SCIENCE & TECHNOLOGY CHALLENGES & USER COMMUNITIES
Chair: Silvana Muscella, Director, Trust IT Services & EUBrazilOpenBio partner
- 11:20 - 11:40 The Brazilian e-Infrastructure.
Jose Luiz Ribeiro Filho, Director for Services and Solutions, Brazilian Education and Research Network (RNP), Brazil
- 12:00 – 12:30 – Roundtable Discussion – Addressing user-community needs and requirements
14:00 – 15:30 SESSION 3: SCIENCE & TECHNOLOGY FOR BIODIVERSITY. The Brazilian Virtual Herbarium: data, tools, and services
Chair: Leonor Maia, Brazilian Virtual Herbarium Coordinator
- 15:20 – 15:30 Final Considerations & Questions
16:00 – 17:30 SESSION 4: EUBRAZILOPENBIO: SERVICES AND USE CASES
Chair: Donatella Castelli, CNR-ISTI & EUBrazilOpenBio European Scientific Director
Co-Chair: Vinod Rebello, UFF & EUBrazilOpenBio Brazilian Scientific Director
- 17:20 – 17:30 Final Considerations & Questions
17:30 – 18:30 SESSION 5: EUBRAZILOPENBIO & BVH DEMONSTRATIONS
Chair: Vinod Rebello, UFF & EUBrazilOpenBio Brazilian Scientific Director
Demonstrations will be available of the following applications:
• EUBrazilOpenBio – tools for ecological niche modelling and cross mapping of taxonomies
> Video: Ecological Niche Modelling demo
• Brazilian Virtual Herbarium: tools and services to assess data quality and gap analysis
DAY 2 - THURSDAY 20 SEPTEMBER
09:00 – 10:30 SESSION 6: LOOKING TO THE FUTURE
Chair: Mercedes Bustamante, SEPED, Ministry of Science, Technology and Innovation
- 10:00 – 10:30 – Roundtable discussion - Future directions for biodiversity e-Infrastructures
11:00 – 12:30 SESSION 7: ENABLING E-INFRASTRUCTURES FOR MULTI-DISCIPLINARY COLLABORATION
Chair: Fabrizio Gagliardi, EEMEA Director, Microsoft & EUBrazilOpenBio Strategic Advisory Board member
- 12:00 – 12:30 Roundtable discussion – Meeting the needs of multi-disciplinary collaboration
14:00 – 16:00 SESSION 8: ADVANCING BIODIVERSITY E-SCIENCE THROUGH GLOBAL COLLABORATION
15:00 – 16:00 CONCLUDING ROUNDTABLE on FURTHERING E-SCIENCE TO SUPPORT THE EU-BRAZIL POLICY DIALOGUE
with representatives from EU and Brazilian Agencies, Session 8 speakers plus a representative from the AMERICAS project
19 & 20 September 2012
Park Hotel, Rua dos Navegantes, 9, Recife, Brazil
With 70% of the world’s catalogued animal and plant species found in Brazil, the country's botanists and biodiversity researchers can benefit massively from specialised large data e-Infrastructures which can speed up and facilitate scientific research. However, research is becoming increasingly data-driven and complex mathematical models are needed to analyse increasingly abundant data sets.
This workshop is the perfect platform for biodiversity researchers to learn how the tools and services of the EUBrazilOpenBio infrastructure can facilitate and accelerate their work with more accurate results. With special focus on the Brazilian Virtual Herbarium community, experts will present step-by-step guides to EUBrazilOpenBio’s innovative niche modeling tool and the comparative analysis of the Brazilian and the Global taxonomic plants catalogue.
With keynote presentations from the Brazilian Ministry of Science and the European Commission, the workshop will also examine the social impact potential of e-infrastructures that leverage cloud and grid technologies. This impact will also come from building stronger ties between Europe and Brazil to address global biodiversity challenges.
For more information & participation contact firstname.lastname@example.org or email@example.com
The Virtual Herbarium of Plants and Fungi of Brazil (INCT)
The mission of the INCT Virtual Herbarium of Plants and Fungi
is to provide a high quality, open botanical collections data
infrastructure accessible to the public. This will integrate the data
held in Brazil’s herbaria and will repatriate data held in foreign
herbaria. The principal activities of the Institute focus on two basic
lines of research: (1) Biological diversity and taxonomy of plants and
fungi, and (2) Use of data on species distributions for the formulation
of public policy on plant and fungal diversity.
EUBrazilOpenBio project: Open Data and Cloud Computing e-Infrastructure for Biodiversity across Europe and Brazil
EUBrazilOpenBio is a collaborative research project co-funded by the European Commission and the Brazilian Ministry of Science, Technology and Innovation (MCTI - CNPq). Its objective is to deploy an e-Infrastructure of Open Access resources serving the needs of the biodiversity scientific community. This will be done by integrating and interoperating EU and Brazilian data, cloud, and grid infrastructures and resources across biodiversity & taxonomy, fostering cooperation between the two ICT and biodiversity scientific communities.
What are the financial implications of using ‘the Cloud’ for research computing?
For computing services charged by the hour, as cloud computing services may be, cost is very directly linked to performance. The authors of this study believe that researchers are the best judges of performance and of what is right for them. The economic and cost aspects of cloud computing are far from the only factors that should influence institutions’ or researchers’ decisions about cloud computing; others include security and data protection, lock-in and service level management. Nonetheless, this study should help researchers to understand the cost implications of the performance they are purchasing.
Curtis+Cartwright Consulting, with support from Dr Lee Gillam of the University of Surrey, undertook this ‘Cost analysis of cloud computing for research’ on behalf of the Engineering and Physical Sciences Research Council (EPSRC) and the Joint Information Systems Committee (JISC).
COMPSs is a programming framework developed by the Grid Computing and Clusters team at BSC whose main objective is to ease the development of applications for distributed environments. It is composed of a programming model and an execution runtime which supports it.
What does COMPSs do?
The COMPSs programming model allows programmers to create a sequential application and specify which methods of the application code will be executed remotely irrespective of the execution environment and parallelization details.
At the same time, the COMPSs runtime optimizes the performance of the application by exploiting its inherent concurrency. The runtime intercepts any call to a selected method, creates a task representing it and finds its data dependencies on all previous tasks, which must be respected throughout the application run. By monitoring the workload of the application, the runtime determines whether there is an excess or a lack of resources and turns to cloud providers to manage the resource pool dynamically.
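The idea of writing a sequential program while a runtime turns selected calls into tasks and tracks their data dependencies can be imitated in a few lines of standard Python. The sketch below is not the COMPSs API: it uses the standard `concurrent.futures` thread pool as a stand-in runtime, and the `task` decorator and the `create_model` and `project_model` functions are hypothetical names chosen for the example.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

_pool = ThreadPoolExecutor(max_workers=4)


def task(func):
    """Turn a plain function into an asynchronous task (toy stand-in for a task runtime)."""
    @wraps(func)
    def submit(*args):
        def run():
            # A Future argument is a data dependency on an earlier task:
            # wait for its value before running this one.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)
        return _pool.submit(run)
    return submit


@task
def create_model(species):
    return f"model({species})"


@task
def project_model(model, scenario):
    return f"{model} projected onto {scenario}"


# The main program reads sequentially; independent calls may run concurrently.
models = [create_model(s) for s in ["species A", "species B"]]
maps = [project_model(m, "present climate") for m in models]
results = [m.result() for m in maps]
print(results)
_pool.shutdown()
```

Each projection waits only for its own model, so the two species are processed independently. A real runtime would additionally place tasks on remote resources and grow or shrink the resource pool, which this sketch does not attempt.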
COMPSs and EUBrazilOpenBio
EUBrazilOpenBio will leverage the VENUS-C COMPSs framework to enhance the porting and execution of the “Ecological Niche Modelling” scenario in the Cloud infrastructure in a way that is seamless with regard to the specific provider.
The COMPSs programming model will enable users to program complex workflows without the use of any API, leaving to the COMPSs runtime the responsibility of efficiently scheduling the parts of this workflow on the available resources and of interfacing with the data storage in order to retrieve the input data and publish the results.
An online service for Botany Niche Modelling activities
Within the Ecological Niche Modelling use case, openModeller will implement COMPSs workflows and users will benefit from the automation of several operations used for producing, testing and projecting the models. Such composite applications will be offered as a service through their deployment in the OpenModeller Web Service. Users will be able, on the one hand, to enhance the offered functionalities for model generation and, on the other hand, to outsource their execution to the VENUS-C Platform. Through a Programming Model Enactment Service, a bridge will be created to the VENUS-C platform, allowing the execution of applications through the COMPSs programming model.
Job and data management functionalities will be integrated in existing clients such as portals and science gateways, including the EUBrazilOpenBio Gateway.
On 26 and 27 April 2012 EUBrazilOpenBio partners met for a technical meeting in Pisa, at CNR-ISTI premises.
The cloud computing paradigm allows Information Technology (IT) to be delivered as a service acquired on demand. Among the many benefits of this new paradigm, elasticity, which lets customers increase or decrease the capacity of their IT infrastructure at no additional cost, is one of the most important. This characteristic shifts the burden of the costs and risks associated with capacity planning for the IT infrastructure from the customer to the service provider.
The state of practice in infrastructure-as-a-service (IaaS) provision imposes a limit on this elasticity, so that sufficiently high service availability can be guaranteed while operational costs are kept at an acceptable level. This restricts the scope of applications that could benefit from the cloud computing paradigm.
In this project we will investigate an alternative architecture for building IaaS providers, in which providers incur ownership costs only when the resources used to provide their infrastructure are demanded by their customers, allowing the limit that must be imposed on customers to be raised by several orders of magnitude. In addition to issues related to IaaS provision, we will study issues related to coupling with the other levels of cloud computing.
Abstract. The increasing popularity of peer-to-peer networks is related to decentralized resource sharing, where peers can exchange messages among themselves. Through aggregation and replication techniques, this model offers more robustness and performance than the client-server model. In this context, where the peer-to-peer paradigm was selected to develop U-Store, a data cloud solution project, this article presents an architectural proposal for creating distributed and scalable services, aiming to solve known issues such as latency, performance and load balancing.
CESAR: Anderson Fonseca e Silva, Rodrigo Elia Assad
UFPE: Marco André Santos Machado, Paulo Fernando A. Soares, Francisco M. Soares-Neto, Vinicius Cardoso Garcia
Acknowledgements: the work performed and described in this paper has been co-financed by the EUBrazilOpenBio project, EU-Brazil Open Data and Cloud Computing e-Infrastructure for Biodiversity (288754), funded under the Objective FP7-ICT-2011-EU-Brazil Research and Development cooperation and by the Brazilian Ministry of Science, Technology and Innovation (MCTI) - National Council for Scientific and Technological Development (CNPq).
This paper provides the motivation for the JiT - Just In Time Cloud idea. An analysis is performed on why current public cloud providers impose quite restrictive limits on the number of instances that can be simultaneously acquired by any individual client. This is not a limitation for most commercial applications, but it can be a severe limitation for eScience applications, especially those that can be parallelized as "bag-of-tasks".
This paper has been accepted for publication at the 2nd International Workshop on Cloud Computing and Scientific Applications (CCSA) 2012 (co-located with CCGrid 2012).
Abstract. Bag-of-tasks (BoT) is an important class of scientific applications. These applications typically comprise a very large number of tasks that can be executed in parallel in an independent way. Due to its cost associativity property, a public cloud computing environment is, theoretically, the ideal platform to execute BoT applications, since it could allow them to be executed as fast as possible, yet without incurring any extra cost for the rapid turnaround achieved. Unfortunately, current public cloud computing providers do impose strict limits on the amount of resources that a single user can simultaneously acquire, substantially increasing the response time of large BoT applications. In this paper we analyze the reasons why traditional providers need to impose such a limit. We show that increases in the limit imposed have a severe impact on the profit achieved by the providers. This leads to the conclusion that new approaches to deploy cloud computing services are required to properly serve BoT applications.
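A little arithmetic shows why the instance limit matters so much for bag-of-tasks workloads. The sketch below assumes an idealised setting that is not taken from the paper: all tasks take the same time, billing is per instance-hour and there is no start-up overhead. The task count, price and limits are hypothetical figures chosen only to make the effect visible.

```python
import math


def makespan_hours(n_tasks: int, task_hours: float, instance_limit: int) -> float:
    """Idealised turnaround: tasks run in waves of at most instance_limit at a time."""
    waves = math.ceil(n_tasks / instance_limit)
    return waves * task_hours


def total_cost(n_tasks: int, task_hours: float, price_per_instance_hour: float) -> float:
    """Pay-per-use cost depends on total instance-hours, not on how many run at once."""
    return n_tasks * task_hours * price_per_instance_hour


n_tasks, task_hours, price = 10_000, 1.0, 0.25  # hypothetical workload and price
for limit in (20, 1_000, 10_000):
    print(limit, makespan_hours(n_tasks, task_hours, limit), total_cost(n_tasks, task_hours, price))
```

Under these assumptions the bill is the same in every case, which is the cost associativity property, while the turnaround shrinks from 500 hours with a limit of 20 instances to a single hour when all 10,000 tasks can run at once.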
The overall architectural design of the EUBrazilOpenBio e-Infrastructure follows a number of incremental architectural milestones planned for the duration of the project. Each release will integrate a variety of data and IT resources for the coordination of jobs and the orchestration of data, in principle, to be provided by the gCube suite of services.
The objective of this initial release, as part of Milestone MS4 of the testbed infrastructure, is to integrate remote compute resources from partner sites, create EUBrazilOpenBio within the D4Science Infrastructure, and exercise core gCube services such as monitoring.
EUBrazilOpenBio e-Infrastructure release 1.0 achievements:
- The installation and configuration of gCube core services (namely Information Service, Resources Manager and Resource Broker) at CNR.
- Deployment of gCube Hosting Nodes (gHNs) on partners' resources at UPVLC, BSC (Spain), CNR (Italy) and UFF (Brazil) to host EUBrazilOpenBio services.
- Installation of OpenModeller (at this point to be executed sequentially on a single host) for profiling and analysis. A EUBrazilOpenBio OpenModeller Installation Guide will be produced and made available.
- Deployment of a species mediator service to access data from multiple possible sources, initially the Biodiversity Heritage Library and Bioline International.
- Implementation of a user interface/portal for available services.
This paper was published in ERCIM NEWS issue n.89, April 2012.
Abstract. Long-established technological platforms are no longer able to address the data and processing requirements of the emerging data-intensive scientific paradigm. At the same time, modern distributed computational platforms are not yet capable of addressing the global, elastic, and networked needs of the scientific communities producing and exploiting huge quantities and varieties of data. A novel approach, the Hybrid Data Infrastructure, integrates several technologies, including Grid and Cloud, and promises to offer the necessary management and usage capabilities required to implement the ‘Big Data’ enabled scientific paradigm.
EUBrazilOpenBio has launched The EUBrazilOpenBio Gateway, the portal conceived to provide the project users with a thin client for accessing EUBrazilOpenBio eInfrastructure facilities.
This gateway is an access point to a number of services developed and operated in the context of the EUBrazilOpenBio project. It serves the needs of the biodiversity scientific community first and in particular biodiversity scientists and information specialists involved in the development and alignment of species taxonomies (use case I) and biodiversity scientists involved in the production, comparison and projection of species distribution models to predict and to understand the distribution of species (use case II).
The gateway provides EUBrazilOpenBio community members with different applications enabling production of models on species distribution; seamless access to species data from multiple providers including specimen data and occurrence points; and management of taxonomies.
The gateway offering is implemented through the EUBrazilOpenBio eInfrastructure. This gateway will evolve during the project lifetime as a consequence of infrastructure evolution and services
Not a EUBrazilOpenBio member?
If you are not a EUBrazilOpenBio community member you can request access to one or more gCube Apps VREs. gCube Apps VREs offer a free-to-use environment made of storage and computational capabilities. The data you upload and generate are kept private to your workspace until you decide to share with other members of any gCube Apps VRE.
EUBrazilOpenBio Gateway is part of the EUBrazilOpenBio support to European and Brazilian biodiversity communities. Stay up-to-date on project developments or make a contribution to it by registering on the EUBrazilOpenBio Channel and joining our international community.
openModeller, developed through a FAPESP (São Paulo State Research Foundation) funded project, is one of the first and most relevant examples of a software framework that makes open access tools available in the biodiversity field. It offers an open-source, flexible, user-friendly environment where the complete process of conducting a fundamental niche modelling experiment can be carried out. The software includes facilities for reading species occurrence and environmental data, selecting the environmental layers on which the model should be based, creating a fundamental niche model and projecting the model into an environmental scenario.
openModeller and EUBrazilOpenBio
The new software platform resulting from the integration of openModeller with Cloud computing will be deployed as part of the EUBrazilOpenBio enabling technology. This platform will provide the support for building larger and more complex applications that leverage the available information to synthesise new knowledge. By integrating openModeller in EUBrazilOpenBio, it will become possible to use the rich library of niche modelling facilities, as well as the variety of algorithms for modelling distribution patterns, that are included in this software. Developing these services from scratch is not trivial, and it is very unlikely that EUBrazilOpenBio would be able to do so in a reasonable time without the experience gained from the significant effort made by the openModeller development team.
In the same way, openModeller will benefit from the integration in EUBrazilOpenBio, along with an extremely rich set of open access resources for biodiversity scientists, by harnessing new openly available EU and Brazilian computing and data e-Infrastructures. If applied on a large enough scale, openModeller will provide species distribution models from which future movements, expansions or extinctions of species can be predicted. However, as more data becomes available and new modelling strategies are incorporated, the openModeller routines may soon require special computational resources. Integrating openModeller in EUBrazilOpenBio will increase its ability to deal with this unprecedented workload by dynamically providing resources through the Cloud provision facilities of the EUBrazilOpenBio Software Platform, and will even speed up the process through the concurrent execution of different experiments. openModeller will also gain seamless access to relevant data available in EUBrazilOpenBio, such as taxonomies like the Catalogue of Life (CoL) or the List of Species of the Brazilian Flora, without the need to download and reformat data locally.
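The concurrent execution of independent experiments mentioned above can be illustrated with a minimal Python sketch. It is not the EUBrazilOpenBio or openModeller implementation: `run_experiment` is a hypothetical stand-in for one modelling run, the species and algorithm names are placeholders, and a local thread pool merely takes the place of dynamically provisioned Cloud resources.

```python
from concurrent.futures import ThreadPoolExecutor


def run_experiment(species, algorithm):
    """Hypothetical stand-in for a single niche modelling experiment."""
    return f"{species}/{algorithm}: done"


def run_all(species_list, algorithms, max_workers=4):
    """Run every (species, algorithm) combination concurrently.

    The experiments are independent of one another, so they can be
    dispatched to a pool of workers; results come back in job order.
    """
    jobs = [(s, a) for s in species_list for a in algorithms]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: run_experiment(*job), jobs))


# Placeholder names, used only for illustration.
results = run_all(["species_1", "species_2"], ["algorithm_a", "algorithm_b"])
for line in results:
    print(line)
```

Because each experiment is independent, the same pattern applies whether the workers are local threads or remote compute nodes; only the dispatch mechanism changes.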
A desired side-effect of deploying openModeller as a service in the EUBrazilOpenBio Software Platform is promoting its use through its involvement in a wider community.
openModeller and Ecological Niche Modelling
openModeller is the main software application of Use Case II, Ecological Niche Modelling (ENM). It computes the models, validates them and uses them to generate new projections about the probability of the occurrence of species under specific conditions.
openModeller is a software framework that offers a rich library of niche modelling facilities, providing a uniform method for modelling distribution patterns using a variety of algorithms. It offers a flexible, user-friendly, cross-platform environment where the entire process of conducting a fundamental niche modelling experiment can be carried out. The software includes facilities for reading species occurrence and environmental data, selecting the environmental layers on which the model should be based, creating a fundamental niche model and projecting the model into an environmental scenario.
If applied on a large enough scale, openModeller will provide species distribution models from which future movements, expansions or extinctions of species can be predicted. However, the openModeller routines require heavy computing processing and are yet to be widely applied to the world’s species distribution data.
ENM is used for many different research activities, beyond the specific validation of Use Case II. For example, AquaMaps (http://www.aquamaps.org/) uses this approach to create predicted global distribution maps for marine species.
The path to openModeller integration
Before integrating openModeller in EUBrazilOpenBio e-Infrastructure, it is necessary to understand its behaviour, requirements and bottlenecks. For this purpose, the following actions were performed:
- Software architecture analysis to identify the different stages, current implementation and usage models of the tool.
- Definition of a representative sample case for validation and performance measuring, covering different resolution levels and therefore different computing and data requirements.
- Execution profiling, analysing the computational cost of the different stages involved, with the objective of understanding the benefits of the inner parallelisation of the software.
- Data profiling, analysing the size of the data (and the nature) exchanged and generated in each step, as well as the source for this data.
Through this analysis, the suitability of the different software tools available in the EUBrazilOpenBio e-Infrastructure was assessed.
Scientific cooperation in cloud computing between Europe and Brazil can be very fruitful in terms of exchanging results, cross-testing and sharing lessons learnt. Euro-Brazilian cooperation was already in place in domains such as eGov, eHealth and disaster recovery before cloud computing emerged. Now the time has come to see how Europe and Brazil can work together so that cloud computing benefits these specific applications, and what the challenges are for both regions.
Lisandro Zambenedetti Granville, Associate Professor at the Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics, Brazil, gives his opinion in this interview recorded at Cloudscape IV.
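The execution profiling step listed above can be sketched in a few lines of Python. This is only an illustration of the idea of timing each stage and comparing its share of the total cost; the stage names and the `time.sleep` placeholders are assumptions and do not correspond to actual openModeller stages or measurements.

```python
import time


def profile_stages(stages):
    """Run each (name, callable) stage in order and record its wall-clock time."""
    timings = {}
    for name, stage in stages:
        start = time.perf_counter()
        stage()
        timings[name] = time.perf_counter() - start
    return timings


def share_of_total(timings):
    """Express each stage's time as a fraction of the total run time."""
    total = sum(timings.values())
    if total == 0:
        return {name: 0.0 for name in timings}
    return {name: seconds / total for name, seconds in timings.items()}


# Placeholder stages: the sleeps stand in for real work.
example_stages = [
    ("read input data", lambda: time.sleep(0.01)),
    ("create model", lambda: time.sleep(0.03)),
    ("project model", lambda: time.sleep(0.02)),
]

timings = profile_stages(example_stages)
shares = share_of_total(timings)
for name, fraction in shares.items():
    print(f"{name}: {fraction:.0%} of total time")
```

Stages that dominate the total are the natural candidates for parallelisation, which is the kind of question such profiling is meant to answer.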
How can scientific research best make use of the work coming out of the existing cloud projects and related activities? What new projects along these lines should be pursued?
To help the OGF community respond to this question, the EUBrazilOpenBio project will be presented at the "Science Applications and Infrastructure in Clouds and Grids" workshop (SAICG), which will be held in conjunction with OGF34, OGF's first event of 2012, hosted by the Oxford e-Research Centre of the University of Oxford in the UK.
The purpose of this workshop is to investigate cloud and grid framework software efforts and applications in greater detail. Science in general continues to make increasing use of advanced computing methods to process and visualize data, to perform simulations for comparison with expensive or difficult experiments, to extend the reach of theory beyond accessible experimental ranges, and to mine results from large collections of complex data.
Vassil Alexandrov, BSC and EUBrazilOpenBio European Coordinator, will present the project and give an overview of the optimization of the openModeller use case implemented by the project.
Abstract. The EU-Brazil Open Data and Cloud Computing e-Infrastructure for Biodiversity is one of the EC-funded EU-Brazil FP7 projects.
This e-Infrastructure will provide the biodiversity community with access to biodiversity data, accompanying transformations and analysis tools that are currently being developed by the consortium. It combines characteristics of both Cloud and Grid infrastructures and is built on top of existing field-tested facilities, namely the VENUS-C Cloud Platform and the gCube Grid infrastructure. We will outline our approach using the example of one use case: how the corresponding biodiversity data from different databases will be processed via the openModeller niche modelling library. We will also outline how we plan to enhance the computation by interfacing with the COMP Superscalar programming framework.
EUBrazilOpenBio is driving a new collaborative framework that is underpinned by multi-faceted open e-science aimed at empowering the biodiversity scientific community by placing their needs centre stage.
The value-add of the VENUS-C (Virtual Multidisciplinary Environments Using Clouds) project lies in its user-centric approach to cloud computing as well as the new components and infrastructures that have been developed to support 27 use cases for science and small businesses. Leveraging the achievements, components and infrastructures developed in other projects means that both regions can capitalise on earlier investments and bring to the table experiences on user-centric approaches to the cloud.
Specifically, VENUS-C is bringing to EUBrazilOpenBio assets like the COMPSs (COMP Superscalar) adaptation for cloud computing infrastructures developed by the Barcelona Supercomputing Centre (BSC). Another asset is the gCube model for scientific data which acts as a bridge for VENUS-C processing kernels. gCube and Rainy Clouds have both been developed by the Italian National Research Council.
But it doesn’t stop here. The synergy with VENUS-C is also aimed at educating new communities on cloud computing based on the benefits derived in real-world settings as the cloud gains traction in Brazil.
EUBrazilOpenBio four o'clock flower seeds will be distributed at major events in Europe and Brazil. Register as a EUBrazilOpenBio community member and stay tuned!
Mirabilis jalapa (the four o'clock flower or marvel of Peru) is the most commonly grown ornamental species of Mirabilis, and is available in a range of colours. Mirabilis jalapa is said to have been exported from the Peruvian Andes in 1540. The flowers usually open from late afternoon onwards, producing a strong, sweet-smelling fragrance, hence the first of its common names.
Description: A garden annual with pretty red flowers. Mutants with large yellow or white flowers are known. The faintly fragrant flowers open in the late afternoon. The leaves and roots are said to be medicinal. A curious aspect of this plant is that flowers of different colors can be found simultaneously on the same plant.
Habitat and cultivation: M. jalapa hails from tropical South America, but has become naturalised throughout tropical and warm temperate regions. In cooler temperate regions, it will die back with the first frosts, regrowing in the following spring from the tuberous roots. The plant does best in full sun.
Distribution: A native of South America, widely cultivated and found as an escape in many tropical areas.
Uses: Also suitable for vases. The leaves and roots are said to be medicinal. The seeds are considered poisonous.
Exposure: Full Sun
Germination: 7-14 days
Max plant height: 70 cm
Germination ability: 75% - Seeds purity: 97%
Open Access does not only refer to scientific papers made accessible through the net; it can be much, much more.
The EUBrazilOpenBio project tries to address a new paradigm of openness of science.
Donatella Castelli, EUBrazilOpenBio European Scientific Director, explains how.
The Cloudscape series has become an annual date for funding agencies, service providers, end-users, IT analysts and information security experts to debate the Cloud computing landscape, covering benefits and challenges for research, enterprise, and government with practical use cases, success stories from Europe’s R&D landscape and ample networking opportunities.
Organized and promoted by the SIENA Initiative, Cloudscape IV will focus strongly on interoperability issues with the involvement of key Standard Development Organizations (SDOs); on highlighting how the assets of Distributed Computing Infrastructures (DCIs) can be taken up by enterprise and eGovernment; and on the established collaboration with NIST, the US National Institute of Standards and Technology, and SIENA's recent contribution to the NIST Cloud Computing Roadmap.
Cloudscape IV will be held on 23 & 24 February 2012. On 24 February, a session on "Global Developments in Cloud Computing for Science & Future Collaborations with Europe" has been scheduled in the agenda, where visibility will be provided to Brazilian projects working on cloud computing.
Cloud computing in North and South America
Follow the blog by GridCast
A total of €61 million is designated for Brazil in the EC’s Brazil Country Strategy Paper 2007-2013, with two focal areas: enhancing bilateral relations (through sectoral dialogues, scholarship programmes and a European Studies Institute) and the environment.
Brazil and the EU are committed to building the people-centred, non-discriminatory and development-oriented Information Society envisaged by the World Summit on the Information Society (WSIS) outcomes, as well as to establishing multilateral, transparent and democratic multi-stakeholder mechanisms for the governance of the global Internet. The European Union and Brazil share the understanding that Information and Communication Technologies (ICT) are essential to foster innovation, competitiveness and economic growth, to create jobs and to increase the efficiency of the public sector. Moreover, ICT have a fundamental role in promoting digital inclusion and improving social cohesion, increasing the quality of life and reducing poverty.
In this context, Brazil and Europe agree to:
- Work in close co-ordination in all relevant international fora in order to facilitate the full implementation of all WSIS outcomes;
- Expand the bilateral dialogue and cooperation on ICT matters, encompassing policy, regulatory and research issues. This collaboration will contribute to ensure a stable regulatory framework in this sector, which will set the conditions to take full advantage of ICT in support of public policies and social welfare;
- Develop cooperation in relevant scientific and technological ICT areas of common interest in the context of the implementation of the Brazil-EU Agreement for Scientific and Technological Cooperation, in particular by enhancing collaboration within the 7th Framework Programme for Research and Technological Development, and by raising awareness through workshops, seminars and joint activities;
- Promote exchanges on e-infrastructures for networking and access to the electronic services between research libraries and data archives.
Biodiversity data and resources coming from different scientific communities around the world are too often not integrated, and sometimes not even available in a digital format.
By enhancing existing infrastructures of open access resources, EUBrazilOpenBio will exploit new opportunities to perform data integration on biodiversity across Europe and Brazil, to develop new scenarios and to facilitate the decision-making process.
How important is taxonomy for biodiversity? How do scientists predict the geographical distribution of species?
The EUBrazilOpenBio collaborative project supports the openness of research findings and resources to help a wider variety of scientists in addressing grand global biodiversity challenges. This flier highlights the project's mandate and the two use cases that will be developed by the EUBrazilOpenBio consortium.
Taxonomic information is essential for reliable environmental science, for monitoring changes in biodiversity and for proper management of the global problems related to environmental change, nature conservation and the sustainable use of biological resources.
Adequate access to taxonomic information on the Internet to support electronic use of biodiversity information is still scarce, both in terms of availability (lacking completeness) and connectedness (lacking effective integration). One major problem is how to integrate regional taxonomies, created locally for regional floras and faunas, with global taxonomies linked to global monographs and global species databases.
The Species2000/ITIS Catalogue of Life (CoL) is an index of the world's known species of animals, plants, fungi and micro-organisms created primarily using global taxonomies. As such it represents a view at the global level of what lives on planet Earth. However, global species databases do not cover all groups of organisms. There are gaps, and even when a group is covered the coverage may not be globally complete. Especially in the ‘megadiverse’ countries like Brazil, there is a potential for the CoL to harvest ‘missing species’ records from regional taxonomy, together with additional distributional and common names data that may be listed for a species. Conversely, there is information in the global databases that can assist the regional taxonomy – common names and distributions from other parts of the world, detail for domesticated and introduced species, and updates for groups whose centre of diversity and study is in other parts of the world.
However, the ability to cross-supplement between regional and global taxonomic data sets is often masked by complex differences in the taxonomic classification used. A species may, for instance, appear to be missing from one catalogue because it is subsumed within another species in the other catalogue. This is the case, for instance, for the users of a regional taxonomy wanting to locate their species in the CoL, or wanting to gather comparable information for the same species from other parts of the world, as might be needed in a niche modelling study (Use Case 2).
The CoL Cross-mapping Tool, developed by Cardiff University, is a tool that helps to manage differences between catalogues by detecting, analysing and reporting not only differences between two checklists of species, but also differences in their taxonomic treatment.
Backed by many years of research, the CoL Cross-mapping Tool is being developed in the i4Life project to support comparisons between CoL and other taxonomies; for example, those of GBIF, EMBL ENA, etc. However, it is equally relevant for comparisons (cross-maps) between CoL and regional taxonomies for resolving the difficulties illustrated, for example, by the Moure Catalogue and the Flora of Brasil. The Brazilian-created Moure Catalogue of Bees (actually covering all of Latin America) differs significantly from the global ITIS Bees database used by the CoL. The Flora of Brasil differs from the CoL in many details and documents a large number of extra species occurring in Brazil that are not even listed in the CoL.
In the EUBrazilOpenBio project the goal is thus to use the ‘taxonomic intelligence’ of the CoL Cross-mapping Tool to help taxonomic specialists to carry out a pilot study to cross-map and analyse the regional plant catalogue of Brazil with the global index of plants within the CoL. The Flora of Brasil, served by CRIA, contains c.10,000 species while the corresponding part of the CoL, provided by Species 2000, contains c.150,000 species of plants. Although the current algorithms of the Cross-mapping Tool operate sufficiently fast when comparing even very large data sets (such as two different editions of the CoL), this will change in the future. Algorithms are becoming more sensitive and thus more complex. Partial re-computation will be required as taxonomic specialists manually refine their cross-maps, and cross-maps will have to be revised repeatedly as data sets are revised and updated. Combining the CoL Cross-mapping Tool with the power of Cloud Computing facilities accessible in the project thus represents a smart evolutionary path for the capability.
A demonstration of the CoL Cross-mapping Tool is available under the Community & Training Section of the EUBrazilOpenBio Channel. Follow this link to view the demo.
How will a taxonomic specialist make use of the CoL Cross-mapping Tool?
A Brazilian flora specialist wishes to compare (cross-map) a checklist of taxa (Checklist Brazil) extracted from the Flora of Brasil catalogue with a checklist of taxa (Checklist CoL) extracted from the currently published edition of the Species 2000 / ITIS Catalogue of Life.
The specialist wishes to explore and analyse disparities between taxa in Checklist Brazil and taxa in Checklist CoL, identifying taxa that are absent from Checklist CoL. This analysis takes into account that: i) a taxon that occurs in Checklist Brazil may be known by a different name in Checklist CoL; ii) that it may be a subset of a taxon that appears in Checklist CoL (or vice versa); or iii) that it only partially overlaps with a taxon that appears in Checklist CoL (or vice versa).
The specialist sends a list of those taxa ‘missing’ from Checklist CoL to the Species 2000 Secretariat. The Species 2000 Secretariat forwards the ‘missing taxa’ information to relevant experts around the world for consideration as additions to be made to the appropriate Global Species Databases (GSD) making up the Catalogue of Life.
The relevant experts scrutinise the missing taxa information, making decisions about inclusion or rejection of specific taxon information into the GSD for which they are responsible. They annotate the information provided to them as feedback to the Brazilian specialist. In Brazil the specialist is able to check on the progress of these considerations and to review the feedback.
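The three kinds of disparity considered in the scenario above (a different name, a subset, a partial overlap) can be illustrated with a small Python sketch. It is not the algorithm of the CoL Cross-mapping Tool: it assumes, purely for illustration, that each taxon is represented by the set of names it covers (accepted name plus synonyms), and all species names below are invented placeholders.

```python
from enum import Enum


class Relation(Enum):
    EQUIVALENT = "same taxon, possibly under a different accepted name"
    INCLUDED_IN = "subset of the other taxon"
    INCLUDES = "superset of the other taxon"
    OVERLAPS = "partial overlap with the other taxon"
    DISJOINT = "no overlap"


def relate(names_a, names_b):
    """Classify how taxon A relates to taxon B from their sets of names."""
    a, b = set(names_a), set(names_b)
    if not a & b:
        return Relation.DISJOINT
    if a == b:
        return Relation.EQUIVALENT
    if a < b:
        return Relation.INCLUDED_IN
    if a > b:
        return Relation.INCLUDES
    return Relation.OVERLAPS


def missing_taxa(checklist_a, checklist_b):
    """Accepted names in checklist A that overlap with no taxon in checklist B."""
    return sorted(
        name
        for name, names in checklist_a.items()
        if all(relate(names, other) is Relation.DISJOINT
               for other in checklist_b.values())
    )


# Invented checklists: accepted name -> set of names covered by the taxon.
checklist_brazil = {
    "Planta alba": {"Planta alba", "Planta candida"},
    "Planta rubra": {"Planta rubra"},
    "Planta nova": {"Planta nova"},
}
checklist_col = {
    "Planta candida": {"Planta candida", "Planta alba"},
    "Planta magna": {"Planta magna", "Planta rubra", "Planta rosea"},
}

print(relate(checklist_brazil["Planta alba"], checklist_col["Planta candida"]))
print(relate(checklist_brazil["Planta rubra"], checklist_col["Planta magna"]))
print(missing_taxa(checklist_brazil, checklist_col))  # ['Planta nova']
```

In this toy data only "Planta nova" would be reported as missing, since the other two taxa are found in the second checklist under a different accepted name or inside a broader taxon.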
Watch the video
Recife, September 2012
What will EUBrazilOpenBio do for taxonomy at regional and global level?
The Species2000/ITIS Catalogue of Life is unique in its breadth of coverage, the depth of its validation, and its widespread global take-up. Providing a sound baseline of species information for more than 1.3 million species (about 70% of all those known to science) for biologists around the world, the Catalogue is a global partnership and a key partner to the major global programmes that inform our understanding of global biodiversity. By providing a platform for validating and sharing species information, the Catalogue strengthens these programmes individually and collectively and provides a conduit for filling and closing the gaps between taxonomies. A planned “Global Multi-Hub Network” will create a framework of linkages between the CoL and regional hubs, initially in China, New Zealand, Australia, Brazil and North America. CRIA represents the Brazilian Hub, ‘Catálogo da Vida Brasil’.
By providing capabilities for cross-mapping taxonomies, as explained above, EUBrazilOpenBio aims to help users make smarter use of the information they obtain by helping them to better understand the differences of taxonomic treatment in information retrieved from different sources. Over time, greater understanding of these differences and the exchange of information in both directions (from regional taxonomy to CoL and vice versa) will help to fill and close the gaps between different taxonomies, leading to a more complete and integrated view overall. Focussing initially on the taxonomy of plants, this will eventually extend to include other biological groups (kingdoms) as well.
Need more information on EUBrazilOpenBio Cross-mapping services?
Please contact us at firstname.lastname@example.org
Ecological Niche Modelling (ENM) is a widely used approach to predict and understand the distribution of species on our planet.
An ecological niche can be understood as the set of ecological requirements for a certain species to survive and maintain viable populations over time. In most cases, niche models are generated by combining species occurrence data with environmental data to find a representation of the conditions that are suitable for the species. Such models can be projected into different geographical regions under different environmental scenarios to predict the impact of climate change on biodiversity, prevent the spread of invasive species, identify geographical and ecological factors regarding disease transmission, facilitate conservation planning and guide field surveys, among many other use cases.
ENM challenges are associated with intensive computational requirements involving a large number of species, complex modelling approaches and high-resolution environmental data.
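To make the idea of combining occurrence data with environmental data more concrete, here is a minimal Python sketch of one of the simplest possible approaches, a rectangular environmental envelope. It is not the openModeller API: the function names, grid-cell identifiers and climate values are all invented for illustration, and real studies use far richer algorithms and data.

```python
def fit_envelope(occurrences, env_layers):
    """Fit a rectangular envelope from occurrence cells.

    occurrences: list of grid-cell ids where the species was recorded.
    env_layers: {layer name: {cell id: value}}.
    Returns {layer name: (min, max)} over the occurrence cells.
    """
    envelope = {}
    for layer, values in env_layers.items():
        observed = [values[cell] for cell in occurrences]
        envelope[layer] = (min(observed), max(observed))
    return envelope


def project(envelope, scenario):
    """Return the cells of a scenario whose values fall inside the envelope."""
    cells = set.intersection(*(set(values) for values in scenario.values()))
    return {
        cell
        for cell in cells
        if all(low <= scenario[layer][cell] <= high
               for layer, (low, high) in envelope.items())
    }


# Invented data: four grid cells, two environmental layers.
current = {
    "temperature": {"c1": 20, "c2": 24, "c3": 30, "c4": 12},
    "rainfall": {"c1": 1200, "c2": 1500, "c3": 400, "c4": 1300},
}
occurrences = ["c1", "c2"]

envelope = fit_envelope(occurrences, current)

# A hypothetical warmer scenario: every cell is 3 degrees warmer.
warmer = {
    "temperature": {cell: t + 3 for cell, t in current["temperature"].items()},
    "rainfall": current["rainfall"],
}

print(sorted(project(envelope, current)))  # ['c1', 'c2']
print(sorted(project(envelope, warmer)))   # ['c1']
```

The second projection shows how the same model, applied to a different environmental scenario, yields a different predicted distribution.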
This use case is aimed at investigating and proposing efficient ways of generating a large number of ecological niche models so that they can be retrieved and used by different applications.
The case study will be built upon the ongoing effort of the Brazilian National Institute of Science & Technology – Virtual Herbarium of Flora and Fungi, which will carry out a standard procedure for plant species that are native to Brazil to generate ecological niche models based on specimen data. This will require interaction with the List of Species of the Brazilian Flora, containing ~40000 plant species in its 2010 version, and with the speciesLink network, currently serving almost 5 million species occurrence records from hundreds of institutions. The modelling procedure will use different algorithms and separate steps for model assessment and projection into different environmental scenarios.
In this way EUBrazilOpenBio will set up an infrastructure to allow applications such as the Virtual Herbarium to perform these tasks more efficiently by exploiting computational resources from both Europe and Brazil. Data from at least 10 different families of plant species will be used to perform tests.
Watch the video:
Recife, September 2012
Allow a large number of ecological niche models to be efficiently generated by using the EUBrazilOpenBio infrastructure, so that other applications or end users can benefit from them.
Open Data and Cloud Computing e-Infrastructure for Biodiversity across Europe and Brazil
EUBrazilOpenBio - Open Data and Cloud Computing e-Infrastructure for Biodiversity (2011-2013), funded under the Objective FP7-ICT-2011-EU-Brazil Research and Development cooperation and by the Brazilian Ministry of Science, Technology and Innovation (MCTI) - National Council for Scientific and Technological Development (CNPq), will deploy an e-Infrastructure of open access resources supporting the needs of the biodiversity scientific community.
Tackling the complexity of Biodiversity Science requires dealing with multiple multidisciplinary datasets, spanning from climatology to earth sciences, all of key importance. To overcome fragmentation, the focus is on uniting existing European and Brazilian data sources to provide scientists with an even greater knowledge base, achieved through the integration and shared use of appropriate computing resources.
In parallel, EUBrazilOpenBio supports the Open Access Movement, promoting the concept of openness for scientific research, aligned with the OpenAIRE initiative launched in 2010 to establish an infrastructure for EC-funded researchers to publish their OA work. EUBrazilOpenBio supports these critical initial steps towards greater openness in the advancement of research and scholarship, through both a policy mandate for open access and a provision of infrastructure to support that policy.
The breadth and depth of the resulting data infrastructure and the openness of its resources will enable a large variety of new cost-effective, cross-disciplinary virtual research environment applications thus opening the way to its widespread adoption and exploitation by the biodiversity scientific community.
EUBrazilOpenBio has the ambitious aim of combining the two key themes above to deploy an e-Infrastructure of open access resources (data, tools and services) that will make significant strides towards fully supporting the needs and requirements of the biodiversity scientific community. This data e-Infrastructure will result from the federation and integration of existing EU and Brazilian developed infrastructures and resources, namely Catalogue of Life, D4Science-II, openModeller and VENUS-C.
Specifically EUBrazilOpenBio has three key objectives:
- Drive forward the interoperation of existing Brazilian and European e-Infrastructures in the distributed computing, scientific data and portals & platform layers
- Provide greater focus to the integration of data software platforms running through all of the infrastructures
- Identify further future EU-Brazil collaboration in support of the biodiversity area in all types of infrastructures