geoshare-cimsans - overview and lessons from the pilot project paul hendley (phasera ltd. for the...
TRANSCRIPT
GEOSHARE-CIMSANS - Overview and Lessons from the Pilot Project
Paul Hendley (Phasera Ltd. for the ILSI Research Foundation) and Tom Hertel/Nelson Villoria (Purdue) .
Center for Integrated Modeling of Sustainable Agriculture & Nutrition Security (CIMSANS)
• International Life Sciences Institute (ILSI) Research Foundation initiative• Collaborating and building tri-partite relationships
– Between industries, academics and government– Use scientific data and approaches to address emerging challenges and help inform policy through
productive dialogue
•Especially via improved Open Source Ag Data & improved modeling approaches• Goal - conduct Sustainable Nutrition Security (SNS) impact assessment
– Requires Scoping of SNS Landscape & metrics– Conceptual Model & manuscript– Requires some new assumptions, new or improved models & support tools– Requires unique combination of data – private industry may be only global source
•Data Translation and Reprocessing tools – to assist data sharing and recovery•Model Friendly, Multi-scale Global Data Portal – Funding GEOSHARE
CIMSANS – GEOSHARE Pilot Project
• PURPOSE “…..to generate necessary infra-structure and levels of user participation and interest for collecting and curating public and private data while ensuring data quality and enhancing inter-operability. “ via examples– Cloud based Spatial Production Allocation Model (SPAM-IFPRI)
•Ghana and India – scalability & data density/availability/richness – HUBZero suitability for CIMSANS-relevant modeling frameworks
• Effective curation mechanism providing lasting added local/global value• Could GEOSHARE provide framework for unified access to baseline
and added value data for economic/physical modeling to support global efforts to assure secure/sustainable nutrition for all– Complementary with well established initiatives to avoid duplication
CIMSANS – GEOSHARE Pilot – Lessons so far
• Not stamp collecting – no incentives to contribute/maintain data• Adding lasting value to data contributors/users via WORKFLOWS essential
– New or better workflows with automated metadata, replicability advantages key• Or enhanced accessibility (format, visualization, avoid pre-processing) – especially cross-discipline or cross-expertise
level
• HUBZero offers tools that can make workflows readily publically accessible• Certain key replicable “best-available” fused data not generally available
– Slavish “copying” of existing tools into new cyber-infrastructure will not meet tomorrow’s needs• Conceptual models defining scope of new tools need to be (RE)designed to fit anticipated needs/data
– Communities of practice to input to design/endorse tools -“FIT FOR PARTICULAR PURPOSES”– Spatial allocation approaches a good example
• Making endorsed “Best Available” added-value data readily available to all is an important goal for improving overall quality/acceptance of food and nutrition security modeling
Where does GEOSHARE-CIMSANS go next??
• What we are looking for from the next two days is stakeholder input through the breakouts– Examples and Q&A should give everyone a good view on scope/capabilities of HUBZero
and GEOSHARE concept– Questions targeted to explore how you see
• GEOSHARE & HUBZero per se• Necessary next steps in
– Data Fusion – Aggregation of intermediate data at disparate scales/sophistication levels– Cross-sector unification/endorsement of best-available Climate/Environ/Economic data
» CREDIBILITY by CONSENSUS where assumptions are needed– Assistance for making important global data available in usable formats
» Improved inter-operability between data and model I/O modules
On to the workshop!………………
HUBZero and GEOSHARE Breakout 1430-1530
• Q1: What specific needs should cyber-infrastructure for the geospatial community address? Do these needs vary across data/model users and data/model providers?
• Q2: What are the work flows implied by these needs? How much of the workflow should be on-line? Is the hub a place to ensure reproducible research?
• Q3: What incentives can be put in place to ensure these work flows are utilized? • Q4: In what ways will the use of HUBZero contribute to transparency, credibility, and
acceptance of the underlying data and work flows available within GEOSHARE?
Group 1: Carol Song*, Paul Hendley**, Group 2: Tom Hertel*, Jingyu Song**Group 3: Nelson Villoria*, Ulrike Wood-Sichra**, Group 4: Paul Preckel*, Dave Gustafson**
Institutional Design Breakout – 1420 - 1710
• Q1: How can a public-goods project like GEOSHARE be sustained over the long run?
• Q2: What are the appropriate roles for the Advisory Board?• Q3: What are the incentives for participation in GEOSHARE, as viewed from
the perspective of: node leaders, government data generators, private industry, academics, other data consortia and user communities?
• Q4: How are data and work flow priorities defined?
• Group 1: Tom Hertel*, James Jones**, Group 2: Dave Gustafson*, Hermann Lotze-Campen**
• Group 3: Nelson Villoria*, Kate Schneider**, Group 4: Paul Hendley*, Ron Sands**
Data Validation/Endorsement Breakout – 1000-1100
• Q1: What is the value of GEOSHARE-endorsed geospatial data products for agriculture, food security and the environment? Who will benefit from such endorsement? Are there downsides from such endorsement?
• Q2: Different data sets and workflows may be suited for different purposes: Which of these is most pressing for GEOSHARE to achieve, and how might this be accomplished?
• Q3: Mechanics of endorsement: Who does the endorsement? How frequently, and in what form?
• Q4: Is broad stakeholder input needed? If so, how will this collected and integrated as part of the endorsement process?
• Group 1: Niven Winchester*, Paul Hendley**, Group 2: Petr Havlik*, Mark Imhoff**• Group 3: Hermann Lotze-Campen*, Jawoo Koo**
Data Fusion/Reconciliation Breakout – 1000-1100
• Q1: There is a trade-off between transparency (simplicity) and sophistication (complexity) in data fusion. Where should GEOSHARE aim on this spectrum?
• Q2: What is the role for prior information in this process? • Q3: How can GEOSHARE harness private- and public-sector knowledge and expertise in the process of data fusion?
• Q4: What is the role of ground-truthing and crowdsourcing in complementing data fusion based on more census and remote-sensing sources?
• Group 1: Navin Ramankutty*, Sherman Robinson**, Group 2: Paul Preckel*, Stefan Siebert**, Group 3: Andrew Nelson*, Paul Hendley**
What does this Holy Grail look like anyway? • …generating unified high quality spatial datasets that are a readily
accessible to and accepted as best-available science by all potential stakeholders involved in the production and forecasting of a secure and nutritious global food supply…..
• “UNIFIED home” for curating open source data from all sources – open source– Mechanism for alerting potential users to new and existing data
•..and associated strengths and weaknesses via solid metadata• Scientific recognition for data generators – “data journal” citable with DOI• Cross-sector unification/endorsement of best-available
Climate/Environ/Economic data– Collaboration between intermediate data “aggregation” processes– CREDIBILITY by CONSENSUS where assumptions have to be made
• Assistance for making important global data available in usable formats– Improved inter-operability between data and model I/O modules