Controlled Language for Ontology Editing

Adam Funk, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham, Brian Davis, Siegfried Handschuh

University of Sheffield NLP

Purpose

• To provide a controlled language for basic ontology-editing (and later, querying) functions:
  • easy to learn from examples and simple rules
  • relatively easy to deploy (Java, GATE)
  • unambiguous
  • compact (e.g., create many classes or instances with one sentence)
  • natural but grammatically lax

Implementation

• Developed and tested in the GATE GUI, but deployable as a service

• GATE application using text as input to modify an ontology

• Based partly on standard NLP components and modified IE components, with manipulation of the GATE ontology API

Syntax

• Quoted chunks: words in pairs of single or double quotes

• Keyphrases: identified and tagged by the gazetteer (is, are: Copula; is a: InstanceOf; forget: Negate)

• Prepositions and determiners: POS-tagged

• Chunks: everything else

• ChunkLists: one or more chunks separated by "and" or commas
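
The token categories above can be illustrated with a small Python sketch. It is not the GATE pipeline: the keyphrase table holds only the gazetteer entries named on this slide, and the preposition and determiner word lists are hypothetical stand-ins for the POS tagger.

```python
import re

# Assumed stand-ins: the slide lists only these gazetteer entries, and the real
# system POS-tags prepositions and determiners instead of using word lists.
KEYPHRASES = {("is", "a"): "InstanceOf", ("forget",): "Negate",
              ("is",): "Copula", ("are",): "Copula"}
PREPOSITIONS = {"of", "in", "with", "for"}
DETERMINERS = {"a", "an", "the"}

# Quoted chunks first, then commas and full stops, then plain words.
TOKEN_RE = re.compile(r"""'[^']*'|"[^"]*"|[,.]|[^\s,.'"]+""")


def tag(sentence):
    """Return (label, text) pairs for one controlled-language sentence."""
    words = TOKEN_RE.findall(sentence)
    tagged, chunk = [], []

    def flush():
        # Consecutive untagged words form one Chunk.
        if chunk:
            tagged.append(("Chunk", " ".join(chunk)))
            chunk.clear()

    i = 0
    while i < len(words):
        word = words[i]
        lower = word.lower()
        pair = tuple(w.lower() for w in words[i:i + 2])
        if word[0] in "'\"":
            flush()
            tagged.append(("QuotedChunk", word[1:-1]))
        elif len(pair) == 2 and pair in KEYPHRASES:  # longest keyphrase wins
            flush()
            tagged.append((KEYPHRASES[pair], " ".join(words[i:i + 2])))
            i += 1
        elif (lower,) in KEYPHRASES:
            flush()
            tagged.append((KEYPHRASES[(lower,)], word))
        elif lower in PREPOSITIONS:
            flush()
            tagged.append(("Prep", word))
        elif lower in DETERMINERS:
            flush()
            tagged.append(("Det", word))
        elif lower == "and" or word == ",":
            flush()
            tagged.append(("Separator", word))
        elif word == ".":
            flush()
            tagged.append(("Period", word))
        else:
            chunk.append(word)
        i += 1
    flush()
    return tagged


print(tag("'University of Sheffield' is a university."))
print(tag("Persons are authors of documents."))
```

A ChunkList is then any run of Chunk or QuotedChunk tokens joined by Separator tokens.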

Syntax and semantics

• 10 syntactic rules

• Some have up to three semantic rules; CLOnE refers to the ontology to select one deterministically

• Create and delete classes, subclass relations and instances

• Create and instantiate datatype and object properties

Syntax and semantics

• Rule: ChunkList0 InstanceOf Chunk1 “.”

• Example: Alice Jones and Bob Smith are persons.

• Semantics: If Chunk1 names a class, create instances of it. Otherwise return an error message.
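
A minimal Python sketch of this rule's semantics, using a plain dictionary as a toy ontology rather than the GATE ontology API. The `normalize` helper (lowercasing and dropping a plural "s") is an assumption made so that "persons" can match a class named "person"; the slides do not say how names are matched.

```python
def normalize(name):
    """Naive name matching (an assumption): lowercase, drop a plural 's'."""
    name = name.strip().lower()
    return name[:-1] if len(name) > 1 and name.endswith("s") else name


def instance_of(classes, chunk_list, chunk1):
    """Apply 'ChunkList0 InstanceOf Chunk1 .' to a toy ontology.

    classes maps a normalized class name to the set of its instance names.
    Returns one message per created instance, or a single error message.
    """
    key = normalize(chunk1)
    if key not in classes:
        return [f"Error: '{chunk1}' does not name a class."]
    messages = []
    for name in chunk_list:
        classes[key].add(name)
        messages.append(f"Created instance '{name}' of class '{key}'.")
    return messages


ontology = {"person": set()}
print(instance_of(ontology, ["Alice Jones", "Bob Smith"], "persons"))
print(instance_of(ontology, ["Rex"], "dogs"))  # no such class: error message
```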

Syntax and semantics

• Rule: ChunkList0 Copula Chunk Prep ChunkList1 “.”

• Examples: Persons are authors of documents. Carl Pollard and Ivan Sag are authors of 'Head-Driven Phrase-Structure Grammar'.

• Flexible semantics:
  • Create a property between two classes.
  • Instantiate a suitable property between two instances.
  • Return an error message (mixed classes and instances, or a chunk that can't be dereferenced).
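
The three-way choice can be sketched in Python as a lookup against a toy ontology. Everything here is illustrative: the dictionary layout and the `normalize` helper are hypothetical, and "suitable property" is assumed to mean a property with the same name whose domain and range are the classes of the two instances.

```python
def normalize(name):
    """Naive name matching (an assumption): lowercase, drop a plural 's'."""
    name = name.strip().lower()
    return name[:-1] if len(name) > 1 and name.endswith("s") else name


def copula(onto, subjects, prop, objects):
    """Apply 'ChunkList0 Copula Chunk Prep ChunkList1 .' to a toy ontology."""
    prop = normalize(prop)
    names = subjects + objects
    # Case 1: every chunk names a class, so create the property.
    if all(normalize(n) in onto["classes"] for n in names):
        for s in subjects:
            for o in objects:
                onto["properties"].add((normalize(s), prop, normalize(o)))
        return "property created"
    # Case 2: every chunk names an instance, so instantiate a matching property.
    if all(n in onto["instances"] for n in names):
        wanted = [(s, o, (onto["instances"][s], prop, onto["instances"][o]))
                  for s in subjects for o in objects]
        if all(signature in onto["properties"] for _, _, signature in wanted):
            onto["facts"].update((s, prop, o) for s, o, _ in wanted)
            return "property instantiated"
        return "error: no suitable property"
    # Case 3: mixed classes and instances, or a name that cannot be resolved.
    return "error: mixed classes and instances, or an unknown name"


onto = {
    "classes": {"person", "document"},
    "instances": {"Carl Pollard": "person", "Ivan Sag": "person",
                  "Head-Driven Phrase-Structure Grammar": "document"},
    "properties": set(),
    "facts": set(),
}
print(copula(onto, ["Persons"], "authors", ["documents"]))
print(copula(onto, ["Carl Pollard", "Ivan Sag"], "authors",
             ["Head-Driven Phrase-Structure Grammar"]))
print(copula(onto, ["Persons"], "authors", ["Ivan Sag"]))
```

The same sentence pattern yields different ontology changes depending on what the chunks already denote, which is the sense in which the ontology is consulted to select one semantic rule.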

Syntax and semantics

• Rule: Negate ChunkList “.”

• Example: Forget projects, journals and 'Department of Computer Science'.

• Semantics: Delete each class or instance in the list.
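
A short Python sketch of the deletion rule on a toy ontology. The data layout and `normalize` helper are hypothetical, and the sketch does not cascade (deleting a class leaves its instances alone), because the slide does not say what happens to them.

```python
def normalize(name):
    """Naive name matching (an assumption): lowercase, drop a plural 's'."""
    name = name.strip().lower()
    return name[:-1] if len(name) > 1 and name.endswith("s") else name


def forget(classes, instances, chunk_list):
    """Apply 'Negate ChunkList .': delete each named class or instance.

    classes is a set of normalized class names; instances maps an instance
    name to its class. Returns one message per chunk.
    """
    messages = []
    for name in chunk_list:
        key = normalize(name)
        if key in classes:
            classes.remove(key)
            messages.append(f"Deleted class '{key}'.")
        elif name in instances:
            del instances[name]
            messages.append(f"Deleted instance '{name}'.")
        else:
            messages.append(f"Error: '{name}' cannot be dereferenced.")
    return messages


classes = {"project", "journal", "department"}
instances = {"Department of Computer Science": "department"}
for line in forget(classes, instances,
                   ["projects", "journals", "Department of Computer Science"]):
    print(line)
```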

Evaluation

• Pre-test questionnaire to let users rate their own knowledge of ontologies and CLs

• Short manual on ontologies and both tools

• Two progressive lists of 6 simple tasks, A & B
  • CLOnE with task list A, then Protégé with B; or Protégé with A, then CLOnE with B

• SUS and SUS-based questionnaires
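
For readers unfamiliar with SUS (the System Usability Scale), the standard scoring can be sketched in Python as follows. This shows the general SUS procedure, not the authors' scripts, and it does not cover the SUS-based comparative questionnaire, whose scoring the slides do not describe.

```python
def sus_score(ratings):
    """Standard SUS score (0 to 100) from ten ratings on a 1-5 scale.

    Odd-numbered items are positively worded and contribute (rating - 1);
    even-numbered items are negatively worded and contribute (5 - rating).
    The sum of contributions (0 to 40) is multiplied by 2.5.
    """
    if len(ratings) != 10 or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for item, rating in enumerate(ratings, start=1):
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


print(sus_score([3] * 10))     # neutral answers throughout
print(sus_score([5, 1] * 5))   # most favourable answers throughout
```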

Evaluation

• “Repeated-measures, task-based” evaluation of CLOnE in comparison with Protégé

• Sample size = 15 (sufficient for SUS)

• Evenly split by task-tool association and tool order

Evaluation

• 95% confidence intervals of SUS scores (SUS baseline is 65 to 70%)

Evaluation: correlations

• Pre-test score has no correlation with task times or SUS results.

• Correlations between C/P, CLOnE SUS and Protégé SUS show coherence of the set of questionnaires.

Evaluation: correlations

• Task times for both tools are moderately correlated with each other, but not with SUS values.
  • Both tools are technically suitable for both tasks.
  • We do not claim that CLOnE is faster for simple tasks, just that users prefer it.

Evaluation: sample quality

• Sample is sufficient for SUS evaluation

• Sample quality according to task-tool association, tool order, and subject type?

Evaluation: sample quality

• SUS values for both tools were slightly lower for task list B: waning interest as the evaluation progressed

• Similar task times for A & B: similar effort required (in any case, the task-tool association was almost evenly split)

• Consistent SUS and C/P values between groups G and NG

Continuing work

• Bugfixes, technical improvements

• Better error messages

• Support for distinct string, date and numeric datatypes

• Development of CLOnE-QL query language

• Implementation of a web-service for question-answering from an ontology

Acknowledgements

• KnowledgeWeb (EU Network of Excellence IST-2004-507482)

• TAO (EU FP6 project IST-2004-026460)

• SEKT (EU FP6 project IST IP-2003-506826)

• Líon (Science Foundation Ireland project SFI/02/CE1/1131)

• NEPOMUK (EU project FP6-027705)

Evaluation summary

Questionnaire CIs

A data sample’s 95% confidence interval is a range 95% likely to contain the mean of the whole population that the sample represents.
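
As an illustration, a 95% confidence interval for a sample mean can be computed as mean ± t × s / √n, where s is the sample standard deviation. The Python sketch below assumes a Student's t interval and uses made-up scores; the slides do not state which procedure was used or give the raw data.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 14 degrees of freedom,
# i.e. for a sample of 15; other sample sizes need a different value.
T_975_DF14 = 2.145


def ci95(sample, t_crit=T_975_DF14):
    """Return (low, high) bounds of the 95% confidence interval of the mean."""
    centre = mean(sample)
    half_width = t_crit * stdev(sample) / sqrt(len(sample))
    return centre - half_width, centre + half_width


# Hypothetical questionnaire scores for 15 subjects (not the study's data).
scores = [72.5, 80.0, 65.0, 90.0, 77.5, 85.0, 70.0, 82.5,
          60.0, 75.0, 87.5, 67.5, 92.5, 78.0, 73.0]
low, high = ci95(scores)
print(f"mean = {mean(scores):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

A narrower interval means the sample pins down the population mean more tightly; two tools whose intervals do not overlap differ by more than sampling noise would readily explain.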

Correlation coefficients

• +1 = perfect correlation (equivalent to a straight ascending line on a scatter plot)

• +0.7 = strong correlation

• 0 = no correlation (random scatter plot)

• -0.7 = strong negative correlation

• -1 = perfect negative correlation

Correlation coefficients

• Pearson's formula assumes that the two variables are linearly meaningful; especially suitable for physical measurements

• Spearman's formula assumes only that they are ordinally meaningful (ranking); suitable for subjective measures such as many in social sciences
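
The difference between the two coefficients can be shown with a small pure-Python sketch: Spearman's coefficient is Pearson's coefficient computed on ranks. The data are invented for illustration and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's r: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson(ranks(xs), ranks(ys))


# A monotonic but non-linear relationship (made-up data).
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
print(round(pearson(xs, ys), 3), round(spearman(xs, ys), 3))
```

Because the ordering of the two variables agrees exactly, Spearman's coefficient reaches 1 here while Pearson's stays below 1, which is why the rank-based measure suits questionnaire scores whose intervals are not guaranteed to be equal.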

Sample quality

Subsequent improvements

• Better handling of punctuation inside quoted chunks

• A catch-all syntactic rule that produces an error message for unparseable sentences

• Support for different datatypes: string, date, numeric

• Better unit-testing

• Embedded in web-service

Subsequent improvements

• Use the features of the new GATE ontology API for more efficient dereferencing of names and RDF-friendly handling of synonyms

• Web-application using CLOnE-QL for question answering

• Better documentation of the input language