the crop ontology - harmonizing semantics for agricultural field data, by elizabeth arnaud
TRANSCRIPT
The Crop Ontology: harmonizing semantics for agricultural field data
www.cropontology.org
Elizabeth Arnaud (Bioversity International)
Co-authors: Leo Valette, Marie Angelique Laporte (Bioversity), Julian Pietragalla (Integrated Breeding Platform), Medha Devare (CGIAR)
And all crop curators and breeders
IGAD pre-meeting to 6th Research Data Alliance Conference, 21-22 September 2015
A common structured language for multidisciplinary agricultural research
– Molecular geneticists, breeders, agronomists, physiologists, high-throughput phenotyping specialists, and crop modelers
– Enabling farmers to access information and exchange their preferences
– Calls for
• A common terminology for annotating data
• A mediation language that supports data interpretation
• An ontology for knowledge inference
Photos : courtesy of IRRI
Semantic Barriers to data interpretation
• No naming convention for variables and methods of measurement, which are heterogeneous
• Confusion between traits and variables
• No semantic coherence
– The same trait is given different names or abbreviations
– One trait is named the same way for various species but refers to different plant structures
• Definitions and measurements are different between farmers, breeders, agronomists, modelers
• No ontology on methods of measurement for formal description
The Integrated Breeding Platform
www.integratedbreeding.net
The Crop Ontology provides the most frequently measured traits and their standard variables for the breeding fieldbook and for data annotation in the crop databases
• Crop Traits (agronomy, morphology, phenology, physiology, quality, stress)
• Experimental design, trial management
• Environmental factors
Crop Ontology www.cropontology.org
• Banana
• Barley
• Cassava
• Chickpea
• Common bean
• Cowpea
• Groundnut
• Lentil
• Maize
• Oat (Global Triticeae)
• Pearl millet
• Pigeon pea
• Potato
• Soybean (USDA & IITA)
• Sweet potato
• Rice
• Sorghum
• Vitis (INRA)
• Wheat
• Yam
Extracting standard variables: Trait Dictionary Template 5.0
Trait = Entity + Quality, e.g. (Flower) + (colour)
Trait ID: CO_341:0000090
Trait: Flower color
Entity: Flower
Attribute: Colour
Trait synonyms: Flower pigmentation
Trait abbreviation: FCL
Trait abbreviation synonyms: FlwCol
Trait description: Color of the flower
Trait class: Morphological traits
Trait status: Recommended
Trait Xref: TO:0000537
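As a rough illustration of the Trait Dictionary entry above, the record can be sketched as a small Python data structure. The class name `Trait`, its field names, and the `composed_label` helper are assumptions made for this sketch; they are not part of the Trait Dictionary Template itself. The values are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Trait:
    """One row of a trait dictionary (field names are hypothetical)."""
    trait_id: str
    name: str
    entity: str
    attribute: str
    synonyms: Tuple[str, ...] = ()
    abbreviation: str = ""
    abbreviation_synonyms: Tuple[str, ...] = ()
    description: str = ""
    trait_class: str = ""
    status: str = ""
    xref: str = ""

    def composed_label(self) -> str:
        # Trait = Entity + Quality: join the entity and the attribute.
        return f"{self.entity} {self.attribute.lower()}"


# Values copied from the flower color example on the slide.
flower_color = Trait(
    trait_id="CO_341:0000090",
    name="Flower color",
    entity="Flower",
    attribute="Colour",
    synonyms=("Flower pigmentation",),
    abbreviation="FCL",
    abbreviation_synonyms=("FlwCol",),
    description="Color of the flower",
    trait_class="Morphological traits",
    status="Recommended",
    xref="TO:0000537",
)

print(flower_color.composed_label())
```

Keeping the entity and the attribute as separate fields is what later allows a trait to be aligned with reference ontologies term by term.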
• A Trait can group several variables
Grain Weight
– Weight of 100 grains expressed in g
– Average weight of a grain, expressed in g
– Weight of 100 grains expressed on a categorical scale: 1=low (50-100g), 2=medium (100-150g), 3=high (150-200g)
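The three Grain Weight variables above describe the same trait through different methods and scales. The sketch below shows how they could relate to one another. The function names are hypothetical. The slide does not say which class a boundary value such as 100 g belongs to, so the half-open intervals used here are an assumption.

```python
def average_grain_weight_g(weight_100_grains_g: float) -> float:
    """Average weight of one grain (g), computed from the 100-grain weight."""
    return weight_100_grains_g / 100.0


def grain_weight_category(weight_100_grains_g: float) -> int:
    """Score the 100-grain weight on the 1-3 categorical scale from the slide.

    Assumption: a boundary value belongs to the higher class, except 200 g,
    which closes the 'high' class.
    """
    w = weight_100_grains_g
    if 50 <= w < 100:
        return 1  # low
    if 100 <= w < 150:
        return 2  # medium
    if 150 <= w <= 200:
        return 3  # high
    raise ValueError(f"{w} g is outside the 50-200 g range of the scale")


# One observation expressed through all three variables.
observed = 120.0
print(observed, average_grain_weight_g(observed), grain_weight_category(observed))
```

The same plant material yields three values that cannot be compared directly, which is why the method and the scale have to travel with the data.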
Julian Pietragalla, IBP, Agronomist - Based in CIMMYT
Léo Valette, Bioversity, Agronomist
Standard Variable
Methods and scales are important information to capture for data comparison and interpretation (e.g. by crop models). Current ontologies sometimes provide brief information on methods, but only as free text in the attribute information
A Variable is described by the assembly: Property (Trait) + Method + Scales/units
– Unique name
– Annotates the real value of the measurement (for fieldbook, for databases)
Proposed naming convention for a standard variable: P_M_S (Property_Method_Scale)
Method types:
• Measurement
• Counting
• Estimation
• Computation
Scales/Units:
• Nominal
• Ordinal
• Numerical
• Time
• Duration
• Text
• Code
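The assembly Property (Trait) + Method + Scale and the P_M_S naming convention can be sketched as follows. The method and scale types are the ones listed on the slide. The class names and every abbreviation (`GW`, `Meas100`, `CompAvg`, `g`, `1to3`) are hypothetical and chosen only for illustration; the example reuses the three Grain Weight variables shown earlier in the slides.

```python
from dataclasses import dataclass
from enum import Enum


class MethodType(Enum):
    MEASUREMENT = "Measurement"
    COUNTING = "Counting"
    ESTIMATION = "Estimation"
    COMPUTATION = "Computation"


class ScaleType(Enum):
    NOMINAL = "Nominal"
    ORDINAL = "Ordinal"
    NUMERICAL = "Numerical"
    TIME = "Time"
    DURATION = "Duration"
    TEXT = "Text"
    CODE = "Code"


@dataclass(frozen=True)
class StandardVariable:
    property_abbr: str       # P: abbreviation of the property (trait)
    method_abbr: str         # M: abbreviation of the method
    scale_abbr: str          # S: abbreviation of the scale or unit
    method_type: MethodType
    scale_type: ScaleType

    @property
    def name(self) -> str:
        # P_M_S: join the three abbreviations with underscores.
        return "_".join((self.property_abbr, self.method_abbr, self.scale_abbr))


# Three variables grouped under the same trait (abbreviations are made up).
grain_weight_variables = [
    StandardVariable("GW", "Meas100", "g", MethodType.MEASUREMENT, ScaleType.NUMERICAL),
    StandardVariable("GW", "CompAvg", "g", MethodType.COMPUTATION, ScaleType.NUMERICAL),
    StandardVariable("GW", "Meas100", "1to3", MethodType.MEASUREMENT, ScaleType.ORDINAL),
]

names = [variable.name for variable in grain_weight_variables]
print(names)
```

Because the name is derived from all three parts, two variables that share a trait but differ in method or scale still receive distinct names.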
Online visualization: Traits, Methods, Scales & Standard Variables
Work in progress
Naming convention for standard variables
Property (Trait) Method of measurement Scale or Unit
Applicable to any type of measurement & indicators for survey, monitoring
Google Cloud & API
EU-SOL - Solanaceae Breeding DB, Wageningen
International cassava DB – Boyce Thompson Institute/IITA
USERS
Global Repository of Evaluation Trials – Agtrials
1,410 agronomic variables are mapped to Crop Ontology traits for 29,633 trials out of 34,329 trial descriptions
Phenomics Ontology Driven DB (PODD)
Luca Matteis, Web developer
Breeding Management System
Annotation of breeders data
Agronomy Ontology
• Agronomic trial data are often collected, described and/or formatted in inconsistent ways
• An Agronomy ontology will support the integration of pre-breeding, breeding and agronomy data
• Combining results on field management practices x crop trait measurements leads to a full understanding of how factors vary within a cropping system
• First step: Aligning with the International Consortium for Agricultural Systems Applications (ICASA) (http://research.agmip.org/display/dev/ICASA+Master+Variable+List) - 600 standard variables – used by Crop Models of AGMIP and Crop Research Ontology
©Cimmyt
CROP - PLANTING • SEED TREATMENT • IRRIGATION • FERTILIZER • PESTICIDE • SOIL • BIOTIC STRESS • ABIOTIC STRESS • HARVEST-YIELD
Medha Devare, Data and Knowledge Manager, CGIAR Consortium Office
Work in progress
ICASA Variables for Crop Models
Common Reference Ontologies for Plants (cROP) and Tools for Integrative Plant Genomics
Planteome pilot project
• Centralized platform for reference ontologies for plants
• Online informatics portal for ontology-based, annotated data for plant germplasm, gene expression, and non-model genomes
• Data query, analysis, visualization and community-based annotation and curation tools
• Plant Ontology (PO)
• Plant Trait Ontology (TO)
• Plant Stress Ontology (PSO)
• Plant Experimental Conditions Ontology (PECO/EO)
• Gene Ontology (plants)
• Phenotypic Qualities Ontology (PATO)
• Cell Type Ontology (CL)
• Chemicals (ChEBI)
• Protein Ontology (PRO)
Common Reference Ontologies for Plants and Tools for Integrative Plant Genomics
• Lead PI: Pankaj Jaiswal
• Sinisa Todorovic, Eugene Zhang, Oregon State University, USA
• Dennis W. Stevenson, New York Botanical Garden, NY, USA
• Elizabeth Arnaud, Bioversity International, Montpellier, France
• Christopher Mungall, Lawrence Berkeley National Laboratory, Berkeley CA, USA
• Georgios V. Gkoutos, John Doonan, University of Aberystwyth, UK
• Barry Smith, University of Buffalo, NY, USA
[Figure: compound mappings and inference between a Crop Ontology term and the reference ontologies]
CO_321:0000056 Spike Length
TO:0000271 Inflorescence Length
PO:0009049 Inflorescence (narrow synonym: spike)
PATO:0000122 Length
Mapping Crop Ontology terms across species and to the Reference Ontologies
• Mappings performed by Marie Angélique Laporte
• Automatic mapping generated by the AML tool developed by Catia Pesquita and Daniela Oliveira from LaSIGE - Large-Scale Informatics Systems Laboratory (https://github.com/AgreementMakerLight/AML-Project)
• This mapping tool is used for thesaurus alignment of FAO, CABI, NAL in the Global Agricultural Concept Server (GACS) project
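The Spike Length example can be read as a compound mapping: the crop-specific term is split into an entity and a quality, each part is matched to a reference ontology, and the matching Trait Ontology term is inferred from that pair. The sketch below is one reading of the figure, not the algorithm used by AML. The function names and the assumption that TO:0000271 decomposes into PO:0009049 + PATO:0000122 are illustrative. The identifiers and labels are those shown on the slide.

```python
from typing import Optional

# Reference terms copied from the slide.
PO_ENTITIES = {
    "PO:0009049": {"label": "inflorescence", "narrow_synonyms": {"spike"}},
}
PATO_QUALITIES = {
    "PATO:0000122": "length",
}
# Assumption: each TO trait is described as an (entity, quality) pair.
TO_TRAITS = {
    "TO:0000271": ("PO:0009049", "PATO:0000122"),  # Inflorescence Length
}


def find_entity(word: str) -> Optional[str]:
    """Match a word against PO labels and narrow synonyms."""
    for po_id, term in PO_ENTITIES.items():
        if word == term["label"] or word in term["narrow_synonyms"]:
            return po_id
    return None


def find_quality(word: str) -> Optional[str]:
    """Match a word against PATO quality labels."""
    for pato_id, label in PATO_QUALITIES.items():
        if word == label:
            return pato_id
    return None


def infer_to_mapping(co_label: str) -> Optional[str]:
    """Infer the TO term whose entity and quality match the CO label."""
    entity_id = None
    quality_id = None
    for word in co_label.lower().split():
        entity_id = entity_id or find_entity(word)
        quality_id = quality_id or find_quality(word)
    for to_id, decomposition in TO_TRAITS.items():
        if decomposition == (entity_id, quality_id):
            return to_id
    return None


# CO_321:0000056 "Spike Length" resolves through the narrow synonym "spike".
print(infer_to_mapping("Spike Length"))
```

A label with no match in the lookup tables returns `None`, which in a real workflow would be the cases left for a curator to review.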
Future Activities
• Content Expansion:
– Farmers’ preferences from Participatory Variety Selection (PVS)
– Functional traits for Agroecology and Ecosystem Services restoration
– Hosting the Agricultural and Nutrition Technology ontology (ANT) of IFPRI
– Aligning with AGROVOC, CABI, NAL thesauri for literature mining
• Community Use Expansion:
– Through the Planteome and DivSeek initiatives
– International Wheat Initiative
– Collaborative, Open Plant Omics (COPO): a community-driven bioinformatics platform for plant science (BBSRC)
– The ISA-Tools group at Oxford, to test their Statistical Method Ontology: http://www.stato-ontology.org
April 2016: Workshop on Crop Ontology for scientists to discuss their data and the definitions of traits, and to present our project results – Hands-on sessions; VoCamp
Sponsors, session conveners?
CGIAR Crop Lead Centers and partners Since 2008
Community workshop in 2014, Montpellier : http://tiny.cc/rw51ax