ontology mapping needs context

Ontology mapping needs context

Frank van HarmelenVrije Universiteit Amsterdam

introductory comments onontologies & contexts

3

Ontologieswe know what they are“consensual, formalised models of a domain”we know how to make and maintain them (methods, tools, experience)we know how to deploy them(search, personalisation, data-integration, …)

Main remaining open questionsAutomatic construction (learning)Automatic mapping (integration)

4

ContextsI don’t really know what they are…Quote from CfP: “Earlier workshops were mostly focused on what contexts and ontologies are”. At least (?) two views:

context as “module”, ist(p,c)context as “relevant knowledge”, “contextual meaning”

I will use 2nd meaning

context-specific nature of knowledge

6

Opinion pollleft

meaning of a sentence is only determinedby the sentence itself,and not influenced bythe surroundingsentences, and not by the situationin which the sentence is used

meaning of a sentence is only determinedby the sentence itself,and not influenced bythe surroundingsentences, and not by the situationin which the sentence is used

meaning of sentence is not only determinedby the sentence itself,but is also influenced byby the surroundingsentences,and also by the situationin which the sentenceis used

meaning of sentence is not only determinedby the sentence itself,but is also influenced byby the surroundingsentences,and also by the situationin which the sentenceis used

right

7

Opinion pollrightleft

don’t you see what I mean?

8

Agenda for talkDoes this “context dependency” also hold for ontology mapping?

Intuitively: yes, obviouslyMore precisely:

can context compensate for lack of structure in source and target?is more context knowledge better?is richer context knowledge better?

This work withZharko Aleksovski &

Michel Klein

Does context knowledge help mapping?

11

The general idea

source target

background knowledge

anchoring anchoring

mapping

inference

12

source and target vocab’sOLVG “problem-list”:

around 3000 problems in a flat listbased on ICD9 + “classificatie van verrichtingen”contains general and specific categories

• implicit hierarchy• e.g.6 types of Diabetes Mellitus, many fractures

some redundancy because of spelling mistakesused to keep track of the problems of patients during the whole stay at the ICU

OLVG-1400:the subset used in the first 24 hour of stay since 2000 (contains data about 3602 patients)

AMC: similar list, but from different hospital

13

Context ontology used

DICE:2500 concepts (5000 terms), 4500 linksFormalised in DLfive main categories:• tractus (e.g. nervous_system,

respiratory_system)• aetiology (e.g. virus, poising)• abnormality (e.g. fracture, tumor)• action (e.g. biopsy, observation, removal)• anatomic_location (e.g. lungs, skin)

14

Baseline: Linguistic methods

Combine lexical analysis with hierarchical structureFirst round

compare with complete DICE313 suggested matches, around 70 % correct

Second round:only compare with “reasons for admission” subtree209 suggested matches, around 90 % correct

High precision, low recall (“the easy cases”)

15

The general idea

source target

background knowledge

anchoring anchoring

mapping

inference

16

Example found with context knowledge (beyond lexical)

17

Example 2

18

Anchoring strengthAnchoring = substring + trivial morphology

anchored on N aspects OLVG AMC

N=5N=4N=3N=2N=1

004

144401

2198711285208

total nr. of anchored termstotal nr. of anchorings

5491298

39% 14045816

96%

19

Experimental resultsSource & target = flat lists of ±1400 ICU terms eachBackground = DICE (2300 concepts in DL)Manual Gold Standard (n=200)

20

So…The OLVG & AMC terms get their meaning from the context in which they are being used. Different background knowledge would have resulted in different mappingsTheir semantics is not context-freeSee also: S-MATCH by Trento

Does more context knowledge help?

22

Adding more context Only lexicalDICE (2300 concepts)MeSH (22000 concepts) ICD-10 (11000 concepts)

Anchoring strength:DICE MeSH ICD10

4 aspects 0 8 03 aspects 0 89 02 aspects 135 201 01 aspect 413 694 80total 548 992 80

23

Results with multiple ontologies

Monotonic improvement Independent of orderLinear increase of cost

0

10

20

30

40

50

60

70

80

90

100

1 2 3 4

Separate Lexical ICD-10 DICE MeSHRecallPrecision

64%95%

64%95%

76%94%

88%89%

Joint

does structured context knowledge help?

25

Exploiting structureCRISP: 700 concepts, broader-thanMeSH: 1475 concepts, broader-thanFMA: 75.000 concepts, 160 relation-types(we used: is-a & part-of)

26

Direct vs. inferred matchesusing only:source-target lexical matchesrelations inside source or target:

e.g: (S <d T) & (T < T’) → (S <d T’)e.g: CRISP:brain =d MESH:brain

MESH:brain > MESH:temp_lobe→ CRISP:brain >d MESH:temp_lobe

27

Direct vs. inferred matches

Using:Lexical anchorings with backgroundRelations inside background knowledge

Matches inferred via anchorings:(S <a B) & (B < B’) & (B’ <a T) → (S <i T)

28

Using the structure or not(S <a B) & (B < B’) & (B’ <a T) → (S <i T)

Only stated is-a & part-ofTransitive chains of is-a, andtransitive chains of part-ofTransitive chains of is-a and part-ofOne chain of part-of before one chain of is-a

29

Examples

30

Examples

31

Anchoring strength

Anchoring concepts

= · ≥ Anchored concepts

CRISP to FMAMeSH to FMA

7381475

483 1042

6071545

14742227

7301462

32

Matching results (CRISP to MeSH)

Recall = · ≥ total incr.Exp.1:DirectExp.2:Indir. is-a + part-ofExp.3:Indir. separate closuresExp.4:Indir. mixed closuresExp.5:Indir. part-of before is-a

448395395395395

417516933

1511972

156405

140222281800

10211316273041433167

-29%

167%306%210%

Precision = · ≥ total correctExp.1:DirectExp.4:Indir. mixed closuresExp.5:Indir. part-of before is-a

171414

183937

35950

38112101

100%94%

100%

(Golden Standard n=30)

wrapping up

34

Related workContext knowledge for mapping is mostly linguistic (WordNet)Notable exception is S-Match using UMLS,but: we have shown source/target structure is not needed

35

ConclusionsStructured investigation on:

The role of source/target structure:we can even do without, given good contextThe role of context structure(it helps, but be careful with its semantics)The amound of context knowledge(surprisingly robust monotonic improvements)

ontology mapping needs context

Documents