Consistency between Metathesaurus and Semantic Network

Workshop on The Future of the UMLS Semantic Network

NLM, April 8, 2005

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

Overview

Defining consistency
What does inconsistency mean?
Testing consistency
  Comparing Metathesaurus relations to SN relations
  Aligning Metathesaurus concepts and semantic types
  Semantic type distribution of sets of descendants of Metathesaurus concepts
Suggestions
  Enforcement mechanism
  Ontology of relationships
  CVF

Two levels in the UMLS

The UMLS: a two-level structure

[Figure: two-level diagram. Metathesaurus level: Concept 1, Concept 2. Semantic Network level: Semantic Type a, Semantic Type b, Semantic Type c.]

[Figure: Metathesaurus concepts related to Heart (Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors) and, in the Semantic Network, the semantic types Anatomical Structure, Fully Formed Anatomical Structure, Embryonic Structure, Body Part, Organ or Organ Component, Pharmacologic Substance, Disease or Syndrome, Population Group.]

Relationships can inherit semantics

[Figure: in the Semantic Network, Disease or Syndrome isa Pathologic Function isa Biologic Function, and Body Part, Organ, or Organ Component isa Fully Formed Anatomical Structure, with a "location of" link between the two hierarchies; in the Metathesaurus, a "location of" link between Adrenal Cortex and Adrenal Cortical hypofunction.]

Defining consistency

The consistency “square”

[Figure: the consistency square. Metathesaurus level: Concept 1, Concept 2. Semantic Network level: Semantic Type a, Semantic Type b.]

The categorization link

[Figure: categorization links between the two levels. Metathesaurus: Salmonella, American Medical Association. Semantic Network: Organism, Bacterium, Professional Society. Link labels: isa, "is an instance of".]

Semantic Network relations

54 types of relationships
558 asserted relations (SRSTR)
6703 fully expanded relations (SRSTRE*)
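The gap between asserted and fully expanded relations comes from inheritance along isa links. The following is a minimal sketch of that expansion on a toy fragment, not the procedure used to build SRSTRE. The type names come from the slides; the assumption that "location of" is asserted between the two most general types, and the helper names `descendants_or_self` and `expand`, are illustrative only.

```python
from itertools import product

# Toy isa hierarchy (child -> parent); type names come from the slides.
ISA = {
    "Disease or Syndrome": "Pathologic Function",
    "Pathologic Function": "Biologic Function",
    "Body Part, Organ, or Organ Component": "Fully Formed Anatomical Structure",
}

# Assumption: one relation asserted between the two most general types.
ASSERTED = {("Fully Formed Anatomical Structure", "location_of", "Biologic Function")}


def descendants_or_self(st):
    """Return st plus every type that reaches st by following isa links."""
    found = {st}
    changed = True
    while changed:
        changed = False
        for child, parent in ISA.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def expand(asserted):
    """Propagate each asserted relation to all descendants of both arguments."""
    expanded = set()
    for source, rel, target in asserted:
        for s, t in product(descendants_or_self(source), descendants_or_self(target)):
            expanded.add((s, rel, t))
    return expanded


EXPANDED = expand(ASSERTED)
```

In this toy fragment one asserted relation expands to six, including the one between Body Part, Organ, or Organ Component and Disease or Syndrome. This sketch does not model any exceptions to inheritance.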

[Figure: Semantic Network fragment. Disease or Syndrome isa Pathologic Function isa Biologic Function; Body Part, Organ, or Organ Component isa Fully Formed Anatomical Structure; a "location of" link connects the two hierarchies.]

Metathesaurus relations

REL vs. RELA
  Not always labeled

106 additional types of relationships
~7 M symbolic relations

[Figure: Metathesaurus concepts related to Heart: Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors.]

Metathesaurus relations

Recorded
  at the term level: from source vocabularies
  at the concept level: from Metathesaurus editors

Aggregated at the concept level

[Figure: a has_finding_site relation between the terms Oat cell carcinoma of lung / Carcinoma, Small Cell / SCLC and the terms Lung structure / Lung / Pulmonary, shown twice.]

Not all relationships in hierarchies are isa (1)

[Figure: hierarchy fragment with Autoimmune Diseases, Addison's disease, Addison's disease due to autoimmunity, and Tuberculous Addison's disease, with a link labeled "is generally a".]

Not all relationships in hierarchies are isa (2)

Environment and Public Health [G03]

Public Health [G03.850]

Accidents [G03.850.110]

Accident Prevention [G03.850.110.060] +

Accidental Falls [G03.850.110.085]

Accidents, Aviation [G03.850.110.185]

[…]

Drowning [G03.850.110.500] +

Defining consistency

SN rel. and Meta rel. must have the same direction

SN rel. and Meta rel. must be of the same type (both hierarchical or associative)

Meta rel. must be the same as SN rel. or one of its descendants
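The three conditions can be read as a single predicate over a Metathesaurus relation and the semantic types of its two concepts. The sketch below is one possible reading, not the implementation behind the studies cited in this talk. The toy data is hypothetical: the relation hierarchy `treats` under `affects` and the contents of `SN_RELATIONS` are assumptions made for illustration, as are the function names.

```python
# Toy data. Relation and type names come from the slides; how they are wired
# together here is an assumption made for illustration.
REL_PARENT = {"treats": "affects"}   # assumed relation hierarchy (child -> parent)
HIERARCHICAL = {"isa"}               # any other relation counts as associative

# SN relations as (source type, relation, target type).
SN_RELATIONS = {
    ("Disease or Syndrome", "has_location", "Body Part, Organ, or Organ Component"),
    ("Semantic Type a", "affects", "Semantic Type b"),
}


def with_ancestors(rel):
    """Return rel followed by its ancestors in the relation hierarchy."""
    chain = [rel]
    while chain[-1] in REL_PARENT:
        chain.append(REL_PARENT[chain[-1]])
    return chain


def is_consistent(meta_rel, st1, st2):
    """Check a Meta relation from a concept typed st1 to a concept typed st2."""
    for source, sn_rel, target in SN_RELATIONS:
        same_direction = (source, target) == (st1, st2)
        same_kind = (sn_rel in HIERARCHICAL) == (meta_rel in HIERARCHICAL)
        same_or_descendant = sn_rel in with_ancestors(meta_rel)
        if same_direction and same_kind and same_or_descendant:
            return True
    return False
```

With this toy data, a `has_location` relation from a Disease or Syndrome concept to a Body Part, Organ, or Organ Component concept passes, the same relation in the reverse direction fails, and `treats` passes wherever `affects` is available between the two types.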

[Figure: the consistency square. Metathesaurus level: Concept 1, Concept 2. Semantic Network level: Semantic Type a, Semantic Type b.]

Examples of consistent relations

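[Figure: consistency square with Pneumonia (Disease or Syndrome) and Lung (Body Part, Organ, or Organ Component); has_location appears at both the Metathesaurus level and the Semantic Network level.]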

Examples of consistent relations

[Figure: consistency square with Concept 1, Concept 2, Semantic Type a and Semantic Type b; the two relation labels shown are affects and treats.]

What does inconsistency mean?

The consistency “square” revisited

[Figure: two copies of the consistency square (Concept 1, Concept 2, Semantic Type a, Semantic Type b), with question marks on some of the links.]

What does inconsistency mean?

Inaccurate/missing Semantic Network relation

Inaccurate (/missing?) categorization

Inaccurate Metathesaurus relation

Testing consistency

(A) Consistency of associative relations

[McCray & Bodenreider, 2002]

Results

6894 pairs of related concepts

4496 (65%): a SN relation can be inferred unambiguously
  Validity confirmed in 1981 cases
  2515 not labeled in the Metathesaurus

1491 (22%): multiple possible SN relationships
  multiple possible Metathesaurus relationships

907 (13%): inconsistency SN/Meta relationships
  372: no SN relationship between the STs
  415: inconsistent SN/Meta relationship type (REL)
  120: inconsistent SN/Meta relationship attribute (RELA)
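As a quick arithmetic check, the counts reported on this slide can be tallied to confirm that the three outcomes partition the 6894 pairs and that the percentages follow from the counts. The numbers are taken from the slide; the dictionary keys and the helper `percent` are illustrative names.

```python
TOTAL_PAIRS = 6894

OUTCOMES = {
    "SN relation inferred unambiguously": 4496,
    "multiple possible relationships": 1491,
    "inconsistent SN/Meta relationships": 907,
}

INCONSISTENCY_BREAKDOWN = {
    "no SN relationship between the STs": 372,
    "inconsistent relationship type (REL)": 415,
    "inconsistent relationship attribute (RELA)": 120,
}


def percent(count, total=TOTAL_PAIRS):
    """Share of the total, rounded to the nearest whole percent."""
    return round(100 * count / total)


# The three outcomes partition the pairs, and the breakdown partitions the 907.
assert sum(OUTCOMES.values()) == TOTAL_PAIRS
assert sum(INCONSISTENCY_BREAKDOWN.values()) == OUTCOMES["inconsistent SN/Meta relationships"]

for label, count in OUTCOMES.items():
    print(f"{count:>5} ({percent(count)}%)  {label}")
```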

(B) Consistency of hierarchical relations

Relations used
  SN: isa
  Categorization: isa
  Metathesaurus: PAR/CHD + RB/RN

Hypothesis
  For a pair of (ST, C), the concepts categorized by ST (and its descendants) correspond to the descendants of the concept C
  In the set of descendants of C, expected STs are the ST of C (and its descendants)

ST-based classes vs. descendants

Semantic type
  List of all concepts having this semantic type

Concept
  List of all descendants

Comparing the 2 sets
  Intersection of the 2 sets

[Bodenreider & Burgun, 2004]
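The comparison amounts to building two sets and looking at their intersection and the two differences. The following is a minimal sketch on made-up data, not the method of the cited paper: the concept identifiers C1 to C6, the `CHILDREN` table (standing in for the hierarchical Metathesaurus relations), and the function names are all hypothetical.

```python
# Hypothetical toy data: concept identifiers C1..C6 are made up.
CHILDREN = {"C1": ["C2", "C3"], "C2": ["C4"], "C3": ["C5"]}
SEMANTIC_TYPES = {
    "C2": {"Amphibian"},
    "C3": {"Amphibian"},
    "C4": {"Amphibian"},
    "C5": {"Pharmacologic Substance"},
    "C6": {"Amphibian"},  # typed Amphibian but not placed under C1
}


def descendants(concept):
    """Collect every concept reachable from `concept` through child links."""
    seen = set()
    stack = list(CHILDREN.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CHILDREN.get(current, []))
    return seen


def compare(semantic_type, concept):
    """Return (in common, typed but not a descendant, descendant but not typed)."""
    by_type = {c for c, sts in SEMANTIC_TYPES.items() if semantic_type in sts}
    by_hierarchy = descendants(concept)
    return by_type & by_hierarchy, by_type - by_hierarchy, by_hierarchy - by_type


common, only_typed, only_descendant = compare("Amphibian", "C1")
```

The two difference sets are the candidates for review: a concept that has the type but is not a descendant, or a descendant that lacks the type.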

Analyzing inconsistencies

[Figure: the semantic type Amphibian (1135 concepts) compared with the concept Amphibia (1126 descendants); 1124 in common. Labels on the non-overlapping parts: Tadpole, Invertebrate, Toad licking, Pharmacologic Substance, Rana unclassified, Class Reptilia, Amphibians and Reptiles. Causes named: miscategorization (one instance marked "?"), wrong hierarchical relation, missing hierarchical relation.]

Semantic types of descendants

Concept
  Set of all descendants

Distribution of semantic types in the set

Allowable STs: ST of C and its descendants (strict) or ST from the same semantic group (loose)

[Mougin & Bodenreider, 2005]
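The strict and loose criteria differ only in which semantic types count as allowable. The sketch below illustrates that difference on a toy fragment and is not the code used in the cited study. The isa links and semantic group codes are assumptions stated for the example, and `compatibility` is a hypothetical helper.

```python
from collections import Counter

# Toy fragment. The isa links and group codes below are assumptions made for
# this sketch, not an extract of the Semantic Network files.
ST_PARENT = {
    "Neoplastic Process": "Disease or Syndrome",
    "Disease or Syndrome": "Pathologic Function",
}
ST_GROUP = {
    "Neoplastic Process": "DISO",
    "Disease or Syndrome": "DISO",
    "Pathologic Function": "DISO",
    "Mental Process": "PHYS",
}


def is_descendant_or_self(st, ancestor):
    """Walk up the isa chain from st and report whether ancestor is met."""
    while st is not None:
        if st == ancestor:
            return True
        st = ST_PARENT.get(st)
    return False


def compatibility(root_st, descendant_sts, mode="strict"):
    """Count compatible (comp) and incompatible (incp) types among descendants."""
    counts = Counter()
    for st in descendant_sts:
        if mode == "strict":
            ok = is_descendant_or_self(st, root_st)
        else:  # loose: same semantic group as the root concept's type
            ok = ST_GROUP[st] == ST_GROUP[root_st]
        counts["comp" if ok else "incp"] += 1
    return counts


SAMPLE = ["Disease or Syndrome", "Neoplastic Process", "Pathologic Function", "Mental Process"]
STRICT = compatibility("Disease or Syndrome", SAMPLE, mode="strict")
LOOSE = compatibility("Disease or Syndrome", SAMPLE, mode="loose")
```

In this sample, Pathologic Function is rejected under the strict criterion because it is an ancestor rather than a descendant of the root type, but accepted under the loose one because it belongs to the same group.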

Analyzing inconsistencies

26,584 concepts studied
59% of their descendants have a semantic type incompatible with that of the original concept

[Figure: example with the concepts Reaction belligerent and Hostility and the semantic types Finding and Mental Process.]

Analyzing inconsistencies

# ------------------------------------------------------------
# C0597249 Neoplasm of placenta (disorder) (neop)
# * B: 190

C0597249|ST|acab| 5.50|incp
C0597249|ST|anab| 1.50|incp
C0597249|ST|cgab| 76.50|incp
C0597249|ST|dsyn| 27.50|incp
C0597249|ST|inpo| 1.00|incp
C0597249|ST|neop| 76.50|comp
C0597249|ST|patf| 1.50|incp

C0597249|SG|DISO| 190.00|comp
# ------------------------------------------------------------
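The pipe-delimited report for C0597249 can be summarized with a few lines of parsing. This sketch rests on an assumed reading of the fields: concept identifier, level (ST for semantic type, SG for semantic group), type or group abbreviation, a weighted count, and a compatibility flag (comp or incp). The function names are hypothetical.

```python
REPORT = """\
# C0597249 Neoplasm of placenta (disorder) (neop)
C0597249|ST|acab| 5.50|incp
C0597249|ST|anab| 1.50|incp
C0597249|ST|cgab| 76.50|incp
C0597249|ST|dsyn| 27.50|incp
C0597249|ST|inpo| 1.00|incp
C0597249|ST|neop| 76.50|comp
C0597249|ST|patf| 1.50|incp

C0597249|SG|DISO| 190.00|comp
"""


def parse(report):
    """Turn data lines into (cui, level, code, weight, flag) tuples."""
    rows = []
    for line in report.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cui, level, code, weight, flag = (field.strip() for field in line.split("|"))
        rows.append((cui, level, code, float(weight), flag))
    return rows


def compatible_share(rows, level):
    """Fraction of the weight at the given level that is flagged comp."""
    selected = [row for row in rows if row[1] == level]
    total = sum(row[3] for row in selected)
    compatible = sum(row[3] for row in selected if row[4] == "comp")
    return compatible / total


ROWS = parse(REPORT)
STRICT_SHARE = compatible_share(ROWS, "ST")   # 76.5 out of 190
LOOSE_SHARE = compatible_share(ROWS, "SG")    # 190 out of 190
```

Under this reading, the semantic type weights add up to the 190 shown in the header, and every descendant falls in the DISO group even though most carry a semantic type other than neop.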

Suggestions

Aligning SN and Meta relationships

54 types of SN relationships
106 additional types of Metathesaurus relationships

Some are simply synonymous (caused_by / due_to; follows / temporally_follows)
Some are specialized relationships (manifestation_of / definitional_manifestation_of)
Many types of mapping relationships, not in SN
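One way to picture the alignment is a lookup that sends each Metathesaurus relationship label to an SN relationship, recording whether it is a synonym or a specialization. This is a sketch under an assumption the slide does not state: that in each pair the first name is the SN relationship and the second the Metathesaurus label. The table and function names are hypothetical.

```python
# Assumed direction of each mapping: Metathesaurus label -> SN relationship.
SYNONYM_OF = {
    "due_to": "caused_by",
    "temporally_follows": "follows",
}
SPECIALIZES = {
    "definitional_manifestation_of": "manifestation_of",
}


def to_sn_relation(rela):
    """Return (SN relationship or None, kind of alignment)."""
    if rela in SYNONYM_OF:
        return SYNONYM_OF[rela], "synonym"
    if rela in SPECIALIZES:
        return SPECIALIZES[rela], "specialization"
    return None, "unmapped"
```

Labels with no SN counterpart, such as the mapping relationships mentioned above, come back as unmapped and would need a separate decision.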

Add classification information to SN

Explicit classificatory principles (in addition to textual definition and examples)
Abandon economy principle and return to JEPD (jointly exhaustive/pairwise disjoint) approach

Metathesaurus editing environment

Use SN/Meta relation consistency as a constraint for assigning semantic types

Use SN relations to suggest labels for unspecified Meta relations
Use SN/Meta relation consistency to guide the review by the Metathesaurus editors
  Inaccurate categorization?
  Inaccurate Metathesaurus relation?
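In an editing environment, these suggestions could take the form of a small triage step for unlabeled relations: propose a label when exactly one SN relation exists between the two semantic types, list the candidates when there are several, and send the pair to review when there is none. The sketch below is illustrative only; the entries in `SN_RELATIONS` are assumptions rather than an extract of the Semantic Network, and `review_unlabeled` is a hypothetical helper.

```python
# Toy lookup of SN relations between ordered pairs of semantic types.
# The entries are illustrative assumptions, not an extract of SRSTRE.
SN_RELATIONS = {
    ("Body Part, Organ, or Organ Component", "Disease or Syndrome"): {"location_of"},
    ("Pharmacologic Substance", "Disease or Syndrome"): {"treats", "causes"},
}


def review_unlabeled(st1, st2):
    """Triage an unlabeled Meta relation between concepts typed st1 and st2."""
    candidates = sorted(SN_RELATIONS.get((st1, st2), set()))
    if len(candidates) == 1:
        return "suggest", candidates    # a single label can be proposed
    if candidates:
        return "ambiguous", candidates  # the editor picks among several
    return "review", candidates         # categorization or relation may be wrong
```

The last case is where an editor would ask the two questions above: is the categorization inaccurate, or is the Metathesaurus relation?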

Conclusions

Conclusions

Simultaneously
  Improve SN
  Improve categorization

ST assignment can be automated in part

Some references

McCray AT, Bodenreider O. A conceptual framework for the biomedical domain. In: Green R, Bean CA, Myaeng SH, editors. The semantics of relationships: an interdisciplinary perspective. Boston: Kluwer Academic Publishers; 2002. p. 181-198.

Bodenreider O, Burgun A. Aligning knowledge sources in the UMLS: Methods, quantitative results, and applications. Medinfo 2004:327-331.

Some references

Burgun A, Bodenreider O. Aspects of the taxonomic relation in the biomedical domain. In: Welty C, Smith B, editors. Collected papers from the Second International Conference "Formal Ontology in Information Systems": ACM Press; 2001. p. 222-233.

Mougin F, Bodenreider O. Approaches to eliminating cycles in the UMLS Metathesaurus: Naive vs. formal. Proceedings of AMIA Annual Symposium 2005:(submitted).

Medical Ontology Research

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

Contact: olivier@nlm.nih.gov
Web: mor.nlm.nih.gov
