Olivier Bodenreider, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA. Biomedical Knowledge Visualization. Bethesda, MD, July 6, 2004. 7th International Protégé Conference, 2nd Workshop on Visualizing Information in Knowledge Engineering (VIKE’04).


  • Olivier Bodenreider

    Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA

    Biomedical Knowledge Visualization

    Bethesda, MD, July 6, 2004

    7th International Protégé Conference, 2nd Workshop on Visualizing Information in Knowledge Engineering (VIKE'04)

  • UMLS Semantic Navigator (SemNav)

    http://umlsks.nlm.nih.gov*

    SN Resources Semantic Navigator (* free UMLS registration required)

  • Unified Medical Language System (UMLS)

    Developed at NLM since 1990

    15th edition in 2004

    Integrates some 60 terminological resources

      Clinical vocabularies (including specialties)

      Core terminologies (anatomy, drugs, med. devices)

      Administrative terminologies, standards

    Integration

      Synonymous terms are clustered in a concept

      Hierarchies (trees) are combined in a graph structure
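The two integration steps named on this slide can be illustrated with a small Python sketch. It is not the UMLS build process: the concept identifier `C_DMD`, the source labels `SRC1` and `SRC2`, and the assignment of edges to sources are assumptions made up for the example. The terms and concept names are borrowed from the Duchenne and adrenal examples elsewhere in this deck.

```python
from collections import defaultdict

# (term, source, concept id); "C_DMD" is a made-up placeholder identifier.
TERMS = [
    ("Duchenne muscular dystrophy", "MeSH", "C_DMD"),
    ("pseudohypertrophic muscular dystrophy", "MeSH", "C_DMD"),
    ("X-linked recessive muscular dystrophy", "Jablonski", "C_DMD"),
    ("Duchenne's muscular dystrophy", "COSTAR", "C_DMD"),
]

# Each source contributes a tree as (child, parent) pairs. Which source asserts
# which edge is invented here; "SRC1" and "SRC2" are placeholder source names.
HIERARCHIES = {
    "SRC1": [("Addison's Disease", "Adrenal Cortex Diseases"),
             ("Adrenal Cortex Diseases", "Adrenal Gland Diseases")],
    "SRC2": [("Addison's Disease", "Adrenal Gland Hypofunction"),
             ("Adrenal Gland Hypofunction", "Adrenal Gland Diseases")],
}


def cluster_terms(terms):
    """Group synonymous terms under their concept, keeping each term's sources."""
    concepts = defaultdict(lambda: defaultdict(set))
    for term, source, concept_id in terms:
        concepts[concept_id][term].add(source)
    return concepts


def merge_hierarchies(hierarchies):
    """Combine per-source trees into one graph: child -> {parent: set of sources}."""
    graph = defaultdict(lambda: defaultdict(set))
    for source, edges in hierarchies.items():
        for child, parent in edges:
            graph[child][parent].add(source)
    return graph


concepts = cluster_terms(TERMS)
graph = merge_hierarchies(HIERARCHIES)
print(sorted(concepts["C_DMD"]))            # synonymous terms of one concept
print(sorted(graph["Addison's Disease"]))   # parents coming from different sources
```

Once the trees are merged, a concept can have several parents, which is why the result is a graph rather than a tree.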

  • Terminology integration: Terms

    Duchenne muscular dystrophy: MeSH, SNOMED, CTV3, Jablonski, CRISP, DxPlain, MedDRA, LOINC

    pseudohypertrophic muscular dystrophy: MeSH, CTV3, SNOMED

    X-linked recessive muscular dystrophy: Jablonski

    Duchenne de Boulogne muscular dystrophy: Jablonski

    Duchenne's muscular dystrophy: COSTAR

    severe generalized familial muscular dystrophy: SNOMED

    Duchenne type progressive muscular dystrophy: SNOMED

  • Terminology integration: Relationships

    [Figure: UMLS graph over the concepts Adrenal Gland Diseases, Adrenal Cortex Diseases, Adrenal Gland Hypofunction, Hypoadrenalism, Adrenal cortical hypofunction, and Addison's Disease; sources shown: SNOMED, MeSH, AOD, Read Codes]

  • UMLS

    Two-level structure

    Semantic Network

      135 Semantic Types (STs)

      54 types of relationships among STs

    Metathesaurus

      >1M concepts

      ~12M inter-concept relationships

    Link = categorization

    [Figure: a Concept (Metathesaurus) linked to a Semantic Type (Semantic Network) by categorization]
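The two-level structure can be pictured as two small tables joined by the categorization link. The sketch below is only an illustration in Python: the excerpt of semantic types, the isa links between them, and the concept assignments are assumptions chosen for the example (the names come from the Heart example in this deck), not an extract of the actual UMLS files.

```python
# Semantic Network level: a few semantic types and isa links between them.
# Illustrative excerpt only, far from the full set of 135 types.
ST_ISA = {
    "Body Part, Organ, or Organ Component": "Fully Formed Anatomical Structure",
    "Fully Formed Anatomical Structure": "Anatomical Structure",
    "Embryonic Structure": "Anatomical Structure",
}

# Metathesaurus level: concepts, each linked to semantic types by categorization.
# These assignments are assumptions made for the example.
CATEGORIZATION = {
    "Heart": ["Body Part, Organ, or Organ Component"],
    "Fetal Heart": ["Embryonic Structure"],
    "Angina Pectoris": ["Disease or Syndrome"],
}


def type_ancestors(semantic_type):
    """Walk isa links upward from a semantic type."""
    chain = []
    while semantic_type in ST_ISA:
        semantic_type = ST_ISA[semantic_type]
        chain.append(semantic_type)
    return chain


def categories(concept):
    """Semantic types that apply to a concept, directly or through isa links."""
    result = []
    for semantic_type in CATEGORIZATION.get(concept, []):
        result.append(semantic_type)
        result.extend(type_ancestors(semantic_type))
    return result


print(categories("Heart"))
```

Keeping the two levels in separate structures mirrors the slide: the Semantic Network stays small, while the Metathesaurus side can grow to millions of concepts.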

  • [Figure: the Metathesaurus concept Heart and related concepts (Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors), categorized by Semantic Types in the Semantic Network (Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Pharmacologic Substance; Disease or Syndrome; Population Group). Numeric labels in the figure: 22, 225, 97, 4, 12, 9, 31]

  • MeSH Browser

  • SemNav: Visualization options

  • SemNav: Relationships

    [Figure: the concepts Dystrophin and Muscular Dystrophy, Duchenne with their Semantic Types (Amino Acid, Peptide or Protein; Disease or Syndrome; Biologically Active Substance); the figure carries the numeric label 55]

  • Gene Ontology browser

    http://mor.nlm.nih.gov/perl/gennav.pl

  • Gene Ontology

    Developed by the GO Consortium

    Several components (GO database)

      Ontology (~17,000 concepts)

        Molecular functions

        Cellular components

        Biological processes

      Gene products (~1.6M)

      Associations between gene products and GO concepts (~6.8M)

  • Material and Methods

  • Technical details

  • Technical details

    Simple web/CGI technology (Apache, Perl)

    dot (GraphViz)

      PNG file (-Tpng)

      Client-side map (-Tcmap)

    Precompute the transitive closure of hierarchical relations so that closure queries are fast at request time

    Remove cycles (UMLS)
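The last two bullets describe graph preprocessing. A minimal Python sketch follows, assuming hierarchical relations are stored as (child, parent) pairs. The function names and the toy edges (including the artificial cycle) are assumptions for illustration, and dropping depth-first-search back edges is just one simple way to break cycles; the slides do not say how the original Perl tools do it.

```python
from collections import defaultdict


def remove_cycles(edges):
    """Return the edges minus any edge that would close a cycle (DFS back edges)."""
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
    state = {}   # node -> "open" (on the DFS stack) or "done"
    kept = []

    def visit(node):
        state[node] = "open"
        for nxt in successors[node]:
            if state.get(nxt) == "open":
                continue              # back edge: it would close a cycle, drop it
            kept.append((node, nxt))
            if nxt not in state:
                visit(nxt)
        state[node] = "done"

    for node in list(successors):
        if node not in state:
            visit(node)
    return kept


def transitive_closure(edges):
    """Map each node to the set of all its ancestors (input must be acyclic)."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    closure = {}

    def ancestors(node):
        if node not in closure:
            closure[node] = set()
            for parent in parents.get(node, ()):
                closure[node] |= {parent} | ancestors(parent)
        return closure[node]

    for node in list(parents):
        ancestors(node)
    return closure


edges = [
    ("Addison's Disease", "Adrenal Gland Hypofunction"),
    ("Adrenal Gland Hypofunction", "Adrenal Gland Diseases"),
    ("Adrenal Gland Diseases", "Addison's Disease"),   # artificial cycle
]
dag = remove_cycles(edges)
closure = transitive_closure(dag)
print(sorted(closure["Addison's Disease"]))
```

With the closure stored ahead of time, an "is X an ancestor of Y" question becomes a set lookup instead of a graph traversal. The recursive functions are fine for small graphs; a graph with very deep hierarchies would need an iterative version.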

  • Discussion: Issues and Challenges

  • Issues

    Size

      Large number of concepts (>1 million)

    Complexity

      Polyhierarchical structures

      Multiple information sources

      Multiple properties

    Lack of formality

      Redundant relations

      Hierarchies vs. hierarchical relations

  • Challenges

    Restrict information space

      To selected information sources (SemNav)

      To selected organisms (GenNav)

    Reduce complexity (SemNav)

      Group concepts by semantic groups

      Transitive reduction on hierarchical relations

      Select co-occurring concepts

    Reduce the cognitive burden on the user

      Use graph-based rather than tree-based representations
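Restricting the information space to selected sources amounts to filtering relations before anything is drawn. The Python sketch below assumes relations are kept as (child, parent, source) triples; the source labels `SRC1` and `SRC2` and the edge assignments are placeholders invented for the example.

```python
# (child, parent, source) triples; the source labels are placeholders.
RELATIONS = [
    ("Addison's Disease", "Adrenal Cortex Diseases", "SRC1"),
    ("Addison's Disease", "Adrenal Gland Hypofunction", "SRC2"),
    ("Adrenal Cortex Diseases", "Adrenal Gland Diseases", "SRC1"),
    ("Adrenal Gland Hypofunction", "Adrenal Gland Diseases", "SRC2"),
]


def restrict_to_sources(relations, selected_sources):
    """Keep only the relations asserted by one of the selected sources."""
    selected = set(selected_sources)
    return [(child, parent) for child, parent, source in relations
            if source in selected]


def neighborhood(edges, concept):
    """Direct parents and children of a concept in the restricted graph."""
    parents = sorted(p for c, p in edges if c == concept)
    children = sorted(c for c, p in edges if p == concept)
    return parents, children


restricted = restrict_to_sources(RELATIONS, ["SRC1"])
print(neighborhood(restricted, "Adrenal Cortex Diseases"))
```

The same filter shape would apply to restricting gene product annotations to selected organisms, with an organism field in place of the source field.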

  • SemNav: Semantic groups
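Grouping concepts by semantic group can be sketched as a lookup from semantic type to a coarser group. In the Python sketch below, the type-to-group excerpt and the concept assignments are assumptions chosen for illustration, not a copy of the UMLS semantic groups file, and `"Ungrouped"` is a made-up fallback label.

```python
from collections import defaultdict

# Illustrative excerpt of a semantic type -> semantic group mapping.
TYPE_TO_GROUP = {
    "Body Part, Organ, or Organ Component": "Anatomy",
    "Disease or Syndrome": "Disorders",
    "Pharmacologic Substance": "Chemicals & Drugs",
    "Amino Acid, Peptide, or Protein": "Chemicals & Drugs",
}

# Concept -> semantic types; assignments assumed for the example.
CONCEPT_TYPES = {
    "Heart": ["Body Part, Organ, or Organ Component"],
    "Angina Pectoris": ["Disease or Syndrome"],
    "Cardiotonic Agents": ["Pharmacologic Substance"],
    "Dystrophin": ["Amino Acid, Peptide, or Protein"],
}


def group_concepts(concept_types, type_to_group):
    """Bucket concepts by the semantic group of each of their semantic types."""
    groups = defaultdict(set)
    for concept, semantic_types in concept_types.items():
        for semantic_type in semantic_types:
            groups[type_to_group.get(semantic_type, "Ungrouped")].add(concept)
    return groups


groups = group_concepts(CONCEPT_TYPES, TYPE_TO_GROUP)
for name in sorted(groups):
    print(name, sorted(groups[name]))
```

A display can then draw one cluster per group instead of one node per semantic type, which is the complexity reduction the slide refers to.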

  • SemNav: Transitive reduction
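Transitive reduction removes an edge when its target is still reachable through another path, so the drawing keeps only the non-redundant links. The Python sketch below assumes an acyclic graph of (child, parent) pairs; the function names and the toy edges are assumptions, and the quadratic reachability check is meant for small subgraphs, not for the whole Metathesaurus.

```python
from collections import defaultdict


def reachable(start, successors):
    """All nodes reachable from start by following one or more edges."""
    seen, stack = set(), list(successors[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(successors[node])
    return seen


def transitive_reduction(edges):
    """Drop edge (u, v) when v is reachable from u through another successor.

    Assumes the input graph is acyclic.
    """
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)
    reduced = []
    for u, v in dict.fromkeys(edges):          # preserves order, drops duplicates
        others = successors[u] - {v}
        if not any(v in reachable(w, successors) for w in others):
            reduced.append((u, v))
    return reduced


def to_dot(edges):
    """Serialize edges as DOT source (node names must not contain double quotes)."""
    lines = ["digraph G {"]
    lines += [f'  "{u}" -> "{v}";' for u, v in edges]
    lines.append("}")
    return "\n".join(lines)


edges = [
    ("Addison's Disease", "Adrenal Gland Hypofunction"),
    ("Adrenal Gland Hypofunction", "Adrenal Gland Diseases"),
    ("Addison's Disease", "Adrenal Gland Diseases"),   # redundant shortcut
]
reduced = transitive_reduction(edges)
print(to_dot(reduced))
```

Saved as `graph.dot`, the output could then be rendered with `dot -Tpng graph.dot -o graph.png` and `dot -Tcmap graph.dot -o graph.map`, the two output formats named on the technical details slide.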

  • Medical Ontology Research

    Olivier Bodenreider

    Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA

    Contact: [email protected]

    Web: mor.nlm.nih.gov