Mental Functioning and Semantic Search in the Neuroscience Information Framework


Post on 24-Feb-2016


Page 1: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Mental Functioning and Semantic Search in the Neuroscience Information Framework

Maryann Martone, Fahim Imam

Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA

Neuroscience Information Framework – http://neuinfo.org

Page 2: Mental Functioning and Semantic Search in the Neuroscience Information Framework

The Neuroscience Information Framework: Discovery and utilization of web-based resources for neuroscience

• A portal for finding and using neuroscience resources
– A consistent framework for describing resources
– Provides simultaneous search of multiple types of information, organized by category
– Supported by an expansive ontology for neuroscience
– Utilizes advanced technologies to search the “hidden web”

http://neuinfo.org

UCSD, Yale, Caltech, George Mason, Washington Univ

Supported by NIH Blueprint

Literature

Database Federation

Registry

Page 3: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIF takes a global view of resources

• NIF’s goal: Discover and use resources
– Data
– Databases
– Tools
– Materials
– Services
• Federated approach: Resources are developed and maintained by the community
– >150 data sources; 350M records
• Agile approach: the NIF system is designed to be populated quickly and allow for incremental improvements to representation and search
– Contract specifies 25 sources/year

NIF’s Rules for using digital resources

#1: YOU HAVE TO FIND THEM!!!!!!!

#2: You have to access/open them

#3: You have to understand them

Neuroscience is inherently interdisciplinary; no one technique reveals all

Page 4: Mental Functioning and Semantic Search in the Neuroscience Information Framework

What do you mean by data? Databases come in many shapes and sizes

• Primary data:
– Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL)
• Secondary data:
– Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD); gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS)
• Tertiary data:
– Claims and assertions about the meaning of data, e.g., gene upregulation/downregulation, brain activation as a function of task

• Registries:
– Metadata
– Pointers to data sets or materials stored elsewhere
• Data aggregators:
– Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede
• Single source:
– Data acquired within a single context, e.g., Allen Brain Atlas

Page 5: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIFSTD Ontologies

Bill Bug et al.

• Set of modular ontologies
– 86,000+ distinct concepts + synonyms
• Expressed in OWL-DL language
– Supported by common DL reasoners
– Currently supports OWL 2
• Closely follows OBO community best practices
• Avoids duplication of efforts
– Standardized to the same upper level ontologies, e.g., Basic Formal Ontology (BFO), OBO Relations Ontology (OBO-RO)
– Relies on existing community ontologies, e.g., CHEBI, GO, PRO, DOID, OBI, etc.
• Modules cover orthogonal domains, e.g., Brain Regions, Cells, Molecules, Subcellular parts, Diseases, Nervous system functions, etc.


Page 6: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Importing into NIFSTD

• NIF converts to OWL and aligns to BFO, if not already
– Facilitates ingestion, but can have negative consequences for search if model adds computational complexity
– Data sources do not make careful distinctions but use what is customary for the domain
• Modularity: NIF seeks to have single coverage of a sub-domain
– We are not UMLS or Bioportal
• NIF uses MIREOT to import individual classes or branches of classes from large ontologies
– NIF retains identifier of source
• NIF uses IDs for names, not text strings
– Avoids collision
– Allows retiring of class without retiring the string

NIFSTD has evolved as the ontologies have evolved; had to make many compromises based on ontologies and tools available
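The point about using IDs rather than text strings can be illustrated with a small sketch. This is not NIF's implementation; the `OntologyClass` and `ClassRegistry` names, the `EX_` identifiers, and the source names are all made up for illustration. It only shows why two classes can share a label without colliding, and why one can be retired while the string stays usable.

```python
from dataclasses import dataclass


@dataclass
class OntologyClass:
    class_id: str            # stable identifier; the ID scheme here is made up
    label: str               # human-readable text string, may be non-unique
    source: str = "local"    # hypothetical name of the ontology it came from
    retired: bool = False


class ClassRegistry:
    """Toy registry keyed by identifier rather than by label."""

    def __init__(self):
        self._by_id = {}

    def add(self, cls):
        # Identifiers must be unique; labels are allowed to repeat.
        if cls.class_id in self._by_id:
            raise ValueError(f"identifier already in use: {cls.class_id}")
        self._by_id[cls.class_id] = cls

    def retire(self, class_id):
        # Retiring a class leaves its label free for other classes.
        self._by_id[class_id].retired = True

    def find_by_label(self, label):
        wanted = label.lower()
        return [c.class_id for c in self._by_id.values()
                if c.label.lower() == wanted and not c.retired]


registry = ClassRegistry()
registry.add(OntologyClass("EX_0001", "nucleus", source="cell-parts"))
registry.add(OntologyClass("EX_0002", "nucleus", source="brain-regions"))
registry.retire("EX_0001")
active = registry.find_by_label("Nucleus")
```

After the retirement, `active` holds only `EX_0002`: the class is gone but the string "nucleus" still resolves.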

Page 7: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIFSTD Modules and Sources

| NIFSTD Modules | External Source | Import/Adapt |
|---|---|---|
| Organismal taxonomy | NCBI Taxonomy, GBIF, ITIS, IMSR, Jackson Labs mouse catalog; the model organisms in common use by neuroscientists are extracted from NCBI Taxonomy and kept in a separate module with mappings | Adapt |
| Molecules, Chemicals | IUPHAR ion channels and receptors, Sequence Ontology (SO); NIDA drug lists from ChEBI, and imported Protein Ontology (PRO) | Adapt/Import |
| Sub-cellular anatomy | Sub-cellular Anatomy Ontology (SAO). Extracted cell parts and subcellular structures from SAO-CORE. Imported GO Cellular Component with mapping. | Adapt/Import |
| Cell | CCDB, NeuronDB, NeuroMorpho.org terminologies; OBO Cell Ontology was not considered as it did not contain region-specific cell types | Adapt |
| Gross Anatomy | NeuroNames extended by including terms from BIRNLex, SumsDB, BrainMap.org, etc.; multi-scale representation of nervous system, macroscopic anatomy | Adapt |
| Nervous system function | BIRN, BrainMap.org, MeSH, and UMLS, GO Biological functions | Adapt |
| Nervous system dysfunction | Nervous system disease from MeSH, NINDS terminology; imported Disease Ontology (DO) with mapping | Adapt/Import |
| Phenotypic qualities | Phenotypic Quality Ontology (PATO); imported as part of the OBO Foundry core | Import |
| Investigation: reagents | Overlaps with molecules above from ChEBI, SO, and PRO | Adapt/Import |
| Investigation: instruments, protocols, plans | CogPo, BIRNLex | Adapt |
| Investigation: resource type | NIF, OBI, NITRC, Biomedical Resource Ontology (BRO) | Adapt |
| Biological Process | Gene Ontology (GO) biological process | Import |


Page 8: Mental Functioning and Semantic Search in the Neuroscience Information Framework

What are the connections of the hippocampus?

Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”

• Query expansion: synonyms and related concepts
• Boolean queries
• Data sources categorized by “data type” and level of nervous system
• Common views across multiple sources
• Tutorials for using full resource when getting there from NIF
• Link back to record in original source
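The synonym expansion on this slide can be sketched in a few lines. This is a minimal illustration, not NIF's search code: the `SYNONYMS` table and the `expand_query` function are hypothetical, and only the hippocampus synonyms come from the slide.

```python
# Hypothetical synonym table; the hippocampus entries are taken from the slide.
SYNONYMS = {
    "hippocampus": ["Hippocampus", "Cornu Ammonis", "Ammon's horn"],
}


def quote(term):
    # Multi-word terms are quoted so they are searched as phrases.
    return f'"{term}"' if " " in term else term


def expand_query(term):
    # Unknown terms fall through unchanged.
    variants = SYNONYMS.get(term.lower(), [term])
    return " OR ".join(quote(v) for v in variants)


expanded = expand_query("Hippocampus")
unchanged = expand_query("amygdala")
```

Here `expanded` is the Boolean string `Hippocampus OR "Cornu Ammonis" OR "Ammon's horn"`, while a term with no entry, such as "amygdala", is passed through as typed.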

Page 9: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Entity mapping

BIRNLex_435 Brodmann.3

Explicit mapping of database content helps disambiguate non-unique and custom terminology
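One way to picture explicit entity mapping is a lookup from (source, local term) to an ontology identifier. The sketch below is an assumption-laden illustration: the pairing of `Brodmann.3` with `BIRNLex_435` is read from the labels on this slide, while the source name `example_atlas_db` and the function `to_ontology_id` are invented.

```python
# (source database, local term) -> ontology identifier.
# "example_atlas_db" is a hypothetical source name; the Brodmann.3 /
# BIRNLex_435 pairing is assumed from the slide's figure labels.
ENTITY_MAP = {
    ("example_atlas_db", "Brodmann.3"): "BIRNLex_435",
}


def to_ontology_id(source, local_term):
    # Returns None when the source's term has not been mapped yet.
    return ENTITY_MAP.get((source, local_term))


mapped = to_ontology_id("example_atlas_db", "Brodmann.3")
unmapped = to_ontology_id("another_db", "Brodmann.3")
```

Keying on the source as well as the term is the design choice that handles custom terminology: the same string in a different database is not assumed to mean the same thing.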

Page 10: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIF Concept-Based Search

• Search Google: GABAergic neuron
• Search NIF: GABAergic neuron
– NIF automatically searches for types of GABAergic neurons
– Defined by OWL axioms

Types of GABAergic neurons
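Concept-based search over a defined class can be approximated as follows. This sketch stands in for what an OWL reasoner would infer; it is not how NIF or a DL reasoner is implemented. The neuron list is a small illustrative sample, not NIFSTD content, and `DEFINED_CLASSES` and `expand_concept` are hypothetical names.

```python
# Illustrative sample of neuron types with one asserted property each.
NEURONS = {
    "Cerebellum Purkinje cell": {"neurotransmitter": "GABA"},
    "Neostriatum medium spiny neuron": {"neurotransmitter": "GABA"},
    "Hippocampus CA1 pyramidal cell": {"neurotransmitter": "glutamate"},
}

# A defined class: membership follows from a property value, so members
# are inferred rather than listed by hand (a stand-in for an OWL axiom).
DEFINED_CLASSES = {
    "GABAergic neuron": ("neurotransmitter", "GABA"),
}


def expand_concept(term):
    if term not in DEFINED_CLASSES:
        return [term]
    prop, value = DEFINED_CLASSES[term]
    members = sorted(name for name, props in NEURONS.items()
                     if props.get(prop) == value)
    return [term] + members


terms = expand_concept("GABAergic neuron")
```

With this sample data, `terms` contains the query term followed by the Purkinje cell and the medium spiny neuron, but not the pyramidal cell.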


Page 11: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Ontological Query Expansion through OntoQuest

| Example Query Type | Ontological Expansion |
|---|---|
| A single term query for Hippocampus and its synonyms | synonyms(Hippocampus); expands to Hippocampus OR "Cornu ammonis" OR "Ammon's horn" OR "hippocampus proper". |
| A conjunctive query with 3 terms | transcription AND gene AND pathway |
| A 6-term AND/OR query with one term expanded into synonyms | (gene) AND (pathway) AND (regulation OR "biological regulation") AND (transcription) AND (recombinant) |
| A conjunctive query with 2 terms, where a user chooses to select the subclasses of the 2nd term | synonyms(zebrafish AND descendants(promoter,subclassOf)); zebrafish gets expanded by synonym search and the second term transitively expands to all subclasses of promoter as well as their synonyms. |
| A single term query for an anatomical structure where a user chooses to select all of the anatomical parts of the term along with synonyms | synonyms(descendants(Hippocampus,partOf)); expands to all parts of hippocampus and all their synonyms through the ontology. All parts are joined as an “OR” operation. |
| A conjunctive query with 2 terms, where a user chooses to select all the equivalent terms for the 2nd term | synonyms(Hippocampus) AND equivalent(synonyms(memory)); the second term uses the ontology to find all terms that are equivalent to the term memory by ontological assertion, along with synonyms. |
| A conjunctive query with 2 terms, where a user is interested in specific subclasses for both of the terms | synonyms(x:descendants(neuron,subclassOf) where x.neurotransmitter='GABA') AND synonyms(gene where gene.name='IGF'); x is an internal variable. |
| A query to seek all subclasses of neuron whose soma location is in any transitive part of the hippocampus | synonyms(x:descendants(neuron,subclassOf) where x.soma.location = descendants(Hippocampus,partOf)) |
| A query to seek a conceptual term that is semantically equivalent to a collection of terms rather than a single term | 'GABAergic neuron' AND Equivalent('GABAergic neuron'); the term is recognized as ontologically equivalent to any neuron that has GABA as a neurotransmitter and therefore expands to a list of inferred neuron types. |

OntoQuest – NIF’s ontology management system for NIFSTD ontologies
• Implements various graph search algorithms for ontological graphs
• Automated query expansion for NIFSTD terms, including the ones with defined logical restrictions

Gupta et al., 2010
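The `synonyms(descendants(Hippocampus,partOf))` pattern from the table can be sketched as a breadth-first walk over a relation followed by synonym lookup. This is only a toy reading of the expression, not OntoQuest's algorithm; the graph, the synonym entry, and the function bodies are illustrative assumptions, and `descendants` is assumed here to exclude the starting term.

```python
from collections import deque

# relation -> {child: parent}; a tiny illustrative graph, not NIFSTD data.
EDGES = {
    "partOf": {
        "CA1": "Hippocampus",
        "CA3": "Hippocampus",
        "CA1 stratum oriens": "CA1",
    },
}

# Illustrative synonym entry (an assumption for this sketch).
SYNONYMS = {"CA1": ["CA1 field"]}


def descendants(term, relation):
    """Transitive closure below `term`, excluding `term` itself."""
    children = {}
    for child, parent in EDGES[relation].items():
        children.setdefault(parent, []).append(child)
    found, queue = [], deque([term])
    while queue:
        node = queue.popleft()
        for child in sorted(children.get(node, [])):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


def synonyms(terms):
    # Each term is followed by its synonyms, if any.
    out = []
    for term in terms:
        out.append(term)
        out.extend(SYNONYMS.get(term, []))
    return out


def to_query(terms):
    # All expansions are joined as an OR; phrases are quoted.
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


parts = descendants("Hippocampus", "partOf")
query = to_query(synonyms(parts))
```

The walk picks up "CA1 stratum oriens" even though it is only a part of a part, which is the transitive behavior the table describes.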

Page 12: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIF information space

NIF developed a tiered system
• Domain knowledge
– What you would teach someone coming into your domain
– NIFSTD/Ontoquest
– All upper level BFO categories are suppressed
• Claims based on data
– Bridge files across domains (constructed by NIF), databases, triple stores
– Text
• Data
– Relational databases
– Spreadsheets

Concepts

Data

Knowledge Base

Concepts, Entities + data summaries

Scientists search via the terms they use, not what we would like them to use; NIF needs a broad net to find relevant resources

Page 13: Mental Functioning and Semantic Search in the Neuroscience Information Framework

When searching across broad information sources, need to search for what people are looking for

What genes are upregulated by drugs of abuse in the adult mouse?

Gene upregulated mice illegal drug

Page 14: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIF “translates” common concepts through ontology and annotation standards

• What genes are upregulated by drugs of abuse in the adult mouse?

Morphine

Increased expression

Adult Mouse

Arbitrary but defensible

Page 15: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NIFSTD AND NEUROLEX WIKI

• Semantic wiki platform
• Provides simple forms for structured knowledge
• People can add concepts, properties, and annotations
• Generate hierarchies without having to learn complicated ontology tools
• Community can contribute
– Relax rules for NIFSTD so dedicated domain scientists can contribute their knowledge and review other contributions
– Teaches structuring of knowledge via red links/blue links
– Process is tracked and exposed
– Implemented versioning

Larson et al.

Readily indexed by Google; queries to NIF data via NIF navigator

Page 16: Mental Functioning and Semantic Search in the Neuroscience Information Framework

NeuroLex Content Structure

Stephen D. Larson et al.

Neurolex is becoming a significant knowledge base

Page 17: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Top Down vs. Bottom Up

Top-down ontology construction
• A select few authors have write privileges
• Maximizes consistency of terms with each other
• Making changes requires approval and re-publishing
• Works best when domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges
• Works best with participants who are: expert catalogers, coordinated users, expert users, people with authoritative source of judgment

Bottom-up ontology construction
• Multiple participants can edit the ontology instantly
• Semantics are limited to what is convenient for the domain
• Not a replacement for top-down construction; sometimes necessary to increase flexibility
• Necessary when domain has: large corpus, no formal categories, no clear edges
• Necessary when participants are: uncoordinated users, amateur users, naïve catalogers
• Neuroscience is a domain that is less formal and neuroscientists are more uncoordinated

Larson et al.

NIFSTD

NEUROLEX


Page 18: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Engaging domain scientists

Memory

Mental Process

Cognitive process

Recall

Retrieval

Encoding

Disposition Planned process Continuant

Episodic Non-declarative

Mental state

? ? ?

Page 19: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Mental functioning is difficult to define and dissect

• Very few behaviors are “pure”
– Operationally defined through experiments
• What is a mental function?
– Activity, state, function, process
• Subtypes are rarely disjoint
– Episodic memory
– Semantic memory
– Procedural memory
– Declarative memory
• Distinctions among paradigms, assessments, tests, rating scales, tasks are often subtle

Early work done in BIRN; later terms added by students and curators

Page 20: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Neurolex does not adhere strictly to BFO

Concepts and things happily co-exist; content gets reconciled over time

Page 21: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Nevertheless...
• We do not allow duplicates
• We do not allow multiple inheritance
– Use “role” to shortcut many relations
• We do try to re-factor contributions so as to avoid collisions across our domains
• But... once they are in the wiki, they will move about and be added to as necessary

Neuinfo.org/neurolex/wiki/COGPO_00123
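The two rules on this slide (no duplicates, no multiple inheritance) lend themselves to a simple automated check. The sketch below is hypothetical: it does not describe how NeuroLex enforces these rules, and the entry format, the `EX_` page identifiers, and `check_entries` are invented for illustration.

```python
def check_entries(entries):
    """entries: list of (page_id, label, parent_ids) tuples (assumed format)."""
    problems = []
    seen = {}
    for page_id, label, parents in entries:
        key = label.strip().lower()
        # Rule 1: no two pages may carry the same label.
        if key in seen:
            problems.append(
                f"duplicate label {label!r}: {seen[key]} and {page_id}")
        else:
            seen[key] = page_id
        # Rule 2: each page may have at most one parent.
        if len(parents) > 1:
            problems.append(
                f"{page_id} has multiple parents: {', '.join(parents)}")
    return problems


# Hypothetical wiki entries with made-up identifiers.
entries = [
    ("EX_1", "Memory", ["EX_0"]),
    ("EX_2", "memory", ["EX_0"]),
    ("EX_3", "Recall", ["EX_1", "EX_9"]),
]
problems = check_entries(entries)
```

For these three entries the check reports one duplicate label ("memory" on `EX_1` and `EX_2`) and one page with two parents (`EX_3`).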

Page 22: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Cognitive-related searches through NIF

• fear prefrontal arousal
• Attention and distraction
• Passive viewing
• stroop effect
• sequence learning
• studies done on the cognitive-behavioral model of addiction
• memory recall
• self-administration
• Visual oddball paradigm
• Sexual Orientation
• Face recognition
• neurophysiology of language
• Olfaction
• Consciousness
• Gustatory

Scientists tend to focus on tests and general concepts rather than deep considerations of cognitive processes

Page 23: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Mental Functioning: What NIF needs

• Computable taxonomies of test (assessments, paradigms, tasks) types
– Test types should be related to the function they purport to measure but will only be an approximation
– Not just human!!!
• Computable operational definitions of cognitive concepts
– Translates tests into concepts used in search
– Dementia rating scale scores = Dementia
– Smoking assessment scores = smoker
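A computable operational definition could be as simple as a table that translates test names into the concepts people search for. The sketch below uses the two examples from the slide; the table name, the function `concepts_for`, and the field-matching rule (exact match on a lowercased field name) are assumptions, not a description of NIF's indexing.

```python
# Test or assessment name -> concept used in search (examples from the slide).
OPERATIONAL_DEFINITIONS = {
    "dementia rating scale": "Dementia",
    "smoking assessment": "smoker",
}


def concepts_for(record_fields):
    """Return the search concepts implied by the tests named in a record."""
    found = []
    for field in record_fields:
        concept = OPERATIONAL_DEFINITIONS.get(field.strip().lower())
        if concept and concept not in found:
            found.append(concept)
    return found


# A hypothetical data record described only by its column names.
tags = concepts_for(["Subject ID", "Dementia Rating Scale", "Age"])
```

A record that reports dementia rating scale scores is tagged with "Dementia", so a search on the concept can find data that never uses the word.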

Page 24: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Concluding Remarks

• NIFSTD is utilized to provide a semantic index to heterogeneous data sources
– BFO allows us to promote broad semantic interoperability between biomedical ontologies
– The modularity principle allows us to limit the complexity of the base ontologies
• NIF defines a process for adding complex semantics to neuroscience concepts through the NIFSTD and NeuroLex collaborative environment
– NIF encourages the use of community ontologies
• Moving towards building a rich knowledge base for neuroscience that integrates with larger life science communities


Page 25: Mental Functioning and Semantic Search in the Neuroscience Information Framework

Points of Discussion

CogPO/CogAT/NEMO/MHO Harmonization?
• What kind of interplay are we looking at?
• Is it about re-use of ontological vocabularies?
• What should be the best practice for reuse?
– Re-using URI vs. creating new class and mapping
– Non-semantic reuse of classes as entities (e.g., MIREOT)
• Is it about building new relationships between the entities covered in all these four ontologies?
– What do we achieve through doing this?
• Are we trying to connect all the curated/annotated experimental data sets to a common semantic layer?
• All of the above?

What should be NIF's role?
• How can we help to expose your experiments and results to a broader audience through our interface?
• What kind of involvement can people have in terms of re-using your ontological content or contributing to your content?
• We want to be the 'host' of all the NS concepts and entities, but not necessarily the 'maintainer'.

Page 26: Mental Functioning and Semantic Search in the Neuroscience Information Framework

What ontology isn’t (or shouldn’t be)

• A rigid top-down fixed hierarchy for limiting expression in the neurosciences
– Not about restricting expression but how to express meaning clearly and in a machine readable form
• A bottomless resource-eating pit that consumes dollars and returns nothing
• A cure-all for all our problems
• A completely solved area
– Applied vs. theoretical
• Easy to understand

Mike Bergman