Benchmarking in Knowledge Web


Page 1: Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez <rgarcia,[email protected]>
Jérôme Euzenat <[email protected]>

September 10th, 2004

Page 2: Research and industrial benchmarking

Industrial Benchmarking (WP 1.2, from T.A. page 26):
• Point of view: tool recommendation
• Criteria: utility
• Tools: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; merging and alignment tools

Research Benchmarking (WP 2.1, from T.A. page 41):
• Point of view: research progress
• Criteria: scalability, robustness, interoperability
• Tools: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Service technology

Page 3: Index

• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web

Page 4: Benchmarking activities in KW

Overview of the benchmarking activities:

• Progress

• What to expect from them

• What are their relationships/dependencies

• What could be shared/reused between them

Page 5: Benchmarking timeline

[Timeline chart: project months 0 to 48; each deliverable is marked Finished, Started, or Not started]

WP 1.2 (Roberta Cuel):
• D1.2.1: Utility of ontology development tools
• Utility of merging, alignment, annotation
• Performance of querying, reasoning

WP 1.3 (Luigi Lancieri):
• D1.3.1: Best practices and guidelines for industry
• Best practices and guidelines for business cases

WP 2.1 (Raúl García):
• D2.1.1: Benchmarking SoA
• D2.1.4: Benchmarking methodology, criteria, test suites
• D2.1.6: Benchmarking building tools
• Benchmarking querying, reasoning, annotation
• Benchmarking web service technology

WP 2.2 (Jérôme Euzenat):
• D2.2.2: Benchmarking methodology for alignment
• D2.2.4: Benchmarking alignment results

Page 6: Benchmarking relationships

[Dependency diagram relating the tasks below over project months 6 to 24, with shared artifacts (benchmarking overview, SoA of ontology technology evaluation, benchmarking methodology, benchmark suites, benchmarking methodology and benchmark suite for alignment, best practices) flowing between them]

• T 2.1.1: SoA on the technology of the scalability WP
• T 2.1.4: Definition of a methodology, general criteria for benchmarking
• T 2.1.6: Benchmarking of ontology building tools
• T 2.2.2: Design of a benchmark suite for alignment
• T 2.2.4: Research on alignment techniques and implementations
• T 1.2.1: Utility of ontology-based tools
• T 1.3.1: Best Practices and Guidelines

Page 7: Index

• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web

Page 8: Benchmarking in WP 2.1

[Timeline: project months 0 to 48]

T 2.1.1 State of the Art:
• Overview of benchmarking, experimentation, and measurement
• SoA of ontology technology evaluation

T 2.1.4 Definition of a methodology, general criteria for ontology tools benchmarking:
• Benchmarking methodology
• Types of tools to be benchmarked: ontology building tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Services technology
• General evaluation criteria: interoperability, scalability, robustness
• Test suites for each type of tools (a possible test descriptor is sketched below)
• Benchmarking supporting tools

T 2.1.6 Benchmarking of ontology building tools, and T 2.1.x benchmarking of querying, reasoning, annotation, and web service technology:
• Specific evaluation criteria: interoperability, scalability, robustness
• Test suites for ontology building tools
• Benchmarking supporting tools

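To make the relationship between criteria, tool types, and test suites concrete, here is a minimal sketch in Python. It is purely illustrative: the class and field names are assumptions, not a Knowledge Web specification. The idea is that one harness could run interoperability, scalability, and robustness tests uniformly across tool types.

```python
# Hypothetical descriptor for one test in a WP 2.1 test suite.
# All names here are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class BenchmarkTest:
    test_id: str                # e.g. "rdfs-import-01"
    criterion: str              # "Interoperability", "Scalability", or "Robustness"
    tool_type: str              # e.g. "Ontology building tools"
    run: Callable[[str], bool]  # executes the test against a named tool

def run_suite(suite: list[BenchmarkTest], tool: str) -> dict[str, bool]:
    """Run every test in a suite against one tool; the per-test results
    feed the comparative analysis across the partners' tools."""
    return {t.test_id: t.run(tool) for t in suite}
```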

Page 9: T 2.1.1: Benchmarking Ontology Technology

T 2.1.1: Benchmarking Ontology Technology, in D 2.1.1 Survey of Scalability Techniques for Reasoning with Ontologies:
• Overview of benchmarking, experimentation, and measurement
• State of the Art of Ontology-based Technology Evaluation

[Diagram: evaluation, measurement, and experimentation over ontology technology/methods yield desired attributes, weaknesses, and comparative analyses; benchmarking goes further, yielding recommendations, best practices, and continuous improvement]

Page 10: T 2.1.4: Benchmarking methodology, criteria, and test suites

Methodology (see the sketch below):

Plan:
1. Goals identification
2. Subject identification
3. Management involvement
4. Participant identification
5. Planning and resource allocation
6. Partner selection

Experiment:
7. Experiment definition
8. Experiment execution
9. Experiment results analysis

Improve:
10. Report writing
11. Findings communication
12. Findings implementation
13. Recalibration

General evaluation criteria:
• Interoperability
• Scalability
• Robustness

Benchmark suites for:
• Ontology building tools
• Annotation tools
• Querying and reasoning services
• Semantic Web Services technology

Benchmarking supporting tools:
• Workload generators
• Test generators
• Statistical packages
• ...
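As a concrete companion to the 13 steps above, here is a minimal sketch (illustrative Python, not project code) that encodes the methodology as data, so that supporting tools could, for instance, report how far a benchmarking activity has progressed. The step names come from the slide; everything else is a hypothetical illustration.

```python
# The three phases and 13 steps of the benchmarking methodology,
# encoded as data (step names from the slide; the code is illustrative).
METHODOLOGY = {
    "Plan": ["Goals identification", "Subject identification",
             "Management involvement", "Participant identification",
             "Planning and resource allocation", "Partner selection"],
    "Experiment": ["Experiment definition", "Experiment execution",
                   "Experiment results analysis"],
    "Improve": ["Report writing", "Findings communication",
                "Findings implementation", "Recalibration"],
}

def current_step(completed: int) -> str:
    """Name of the next pending step, given how many are completed (0-13)."""
    steps = [s for phase in METHODOLOGY.values() for s in phase]
    return steps[completed] if completed < len(steps) else "Done"

# Example: after the six Plan steps, the next step is "Experiment definition".
assert current_step(6) == "Experiment definition"
```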

Page 11: T 2.1.6: Benchmarking of ontology building tools

General setup:
• Partners/Tools: UPM, ...
• Benchmark suites: interoperability (x tests), scalability (y tests), robustness (z tests)
• Benchmarking results: comparative analysis, weaknesses, (best) practices, recommendations

Interoperability example:
• Benchmark suites: RDF(S) import capability; OWL import capability; RDF(S) export capability; OWL export capability
• Experiments: import/export RDF(S) ontologies; import/export OWL ontologies; check for knowledge loss (see the sketch below); ...
• Experiment results per test (e.g. test 1: NO; test 2: OK; test 3: OK; ...)
• Benchmarking results: comparative analysis, weaknesses, (best) practices

Interoperability questions:
• Do the tools import/export from/to RDF(S)/OWL?
• Are the imported/exported ontologies the same?
• Is there any knowledge loss during import/export?
• ...
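For the knowledge-loss question above, a hedged sketch of one such experiment, assuming rdflib; the library and the file-based setup are illustrative assumptions, and real test suites would drive each tool's own import/export functions rather than rdflib.

```python
# Check one import/export experiment for knowledge loss: the ontology a
# tool exported should be graph-isomorphic to the one it imported.
from rdflib import Graph
from rdflib.compare import isomorphic

def round_trip_ok(original_path: str, exported_path: str) -> bool:
    """True if no triples were lost or altered during import/export.
    Graph isomorphism (rather than plain triple-set equality) also
    accepts ontologies that differ only in blank-node identifiers."""
    original = Graph().parse(original_path)   # format guessed from extension
    exported = Graph().parse(exported_path)
    return isomorphic(original, exported)
```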

Page 12: Index

• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web

Page 13: T 2.2.2: Design of a benchmark suite for alignment

Why evaluate?
• Comparing the possible solutions;
• Detecting the best methods;
• Finding out where we are bad.

Two goals:
• For the developer: improving the solutions;
• For the user: choosing the best tools;
• For both: testing compliance with a norm.

How to evaluate?
• Take a real-life case and set the deadline;
• Take several cases, normalizing them;
• Take simple cases, identifying what they highlight (benchmark suite);
• Build a challenge (MUC, TREC).

Results:
• Benchmarking methodology for alignment techniques;
• Benchmark suite for alignment;
• First evaluation campaign;
• Greater benchmarking effort.

Page 14: T 2.2.2: What has been done?

The Information Interpretation and Integration Conference (I3CON), held at the NIST Performance Metrics for Intelligent Systems (PerMIS) Workshop, focuses on "real-life" test cases and compares the global performance of algorithms.

Facts:
• 7 ontology pairs;
• 5 participants;
• Undisclosed target alignments (independently made);
• Alignments requested in a normalized format;
• Evaluation on the F-measure (see the sketch below).

Results:
• Difficult to find pairs in the wild (they have been created);
• No dominating algorithm, no most difficult case for all;
• 5 participants was the targeted number; we must have more next time!
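The F-measure used for the evaluation can be stated compactly. A minimal sketch follows, assuming alignments are represented as sets of (entity, entity) correspondence pairs; this representation is an assumption for illustration, not the contest's actual format.

```python
# Precision, recall, and balanced F-measure of a found alignment
# against the target alignment, each a set of correspondence pairs.
def f_measure(found: set, target: set) -> float:
    correct = len(found & target)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(target) if target else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 3 of 4 found correspondences are correct, out of 5 expected.
found = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "bX")}
target = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4"), ("a5", "b5")}
print(round(f_measure(found, target), 3))  # 0.667: precision 0.75, recall 0.6
```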

The Ontology Alignment Contest at the 3rd Evaluation of Ontology-based Tools (EON) Workshop, to be held at the International Semantic Web Conference (ISWC), aims at defining a proper set of benchmark tests for assessing feature-related behavior.

Facts:
• 1 ontology and 20 variations (15 hand-crafted on some particular aspects; see the sketch below);
• Target alignment (made on purpose) published;
• A paper is requested, with comments on the tests and on the achieved results (as well as the results in a normalized format).
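As an illustration of how such a hand-crafted variation could be produced mechanically, here is a hypothetical sketch assuming rdflib. The specific variation, dropping all labels, is one plausible "particular aspect" to isolate, not necessarily one of the actual 20 tests.

```python
# Derive a test variation that removes every rdfs:label, isolating how
# much an alignment algorithm depends on labels. Purely illustrative.
from rdflib import Graph
from rdflib.namespace import RDFS

def variation_without_labels(source_path: str) -> Graph:
    """Copy of the ontology with all rdfs:label triples removed."""
    g = Graph().parse(source_path)
    g.remove((None, RDFS.label, None))  # wildcard pattern: any subject/object
    return g
```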

Results:

We are currently benchmarking the tools!

See you at the EON Workshop, ISWC 2004, Hiroshima, JP, November …

Page 15: T 2.2.2: What's next?

• More consensus on what’s to be done?

• Learn more

• Take advantage of the remarks

• Make a more complete combination: real-world + benchmark suite + challenge?

• Provide automated procedures

Page 16: Index

• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web

Page 17: Benchmarking information repository

Web pages inside the Knowledge Web portal with:
• General benchmarking information (methodology, criteria, test suites, references, ...)
• Information about the different benchmarking activities in Knowledge Web
• Benchmarking results and lessons learned
• ...

Objectives:
• Inform
• Coordinate
• Share/reuse
• ...

Proposal for a benchmarking working group in the SDK cluster.

Page 18: Index

• Benchmarking activities in Knowledge Web
• Benchmarking in WP 2.1
• Benchmarking in WP 2.2
• Benchmarking information repository
• Benchmarking in Knowledge Web

Page 19: What is benchmarking in Knowledge Web?

In Knowledge Web:
• Benchmarking is performed over products/methods (not processes).
• Benchmarking is not a continuous process: it ends with findings communication; there is no findings implementation or recalibration.
• Benchmarking technology involves evaluating technology.
• Benchmarking technology is NOT just evaluating technology: we must extract practices and best practices.
• Benchmarking results: comparative analyses, weaknesses, (best) practices, recommendations, (continuous) improvement.
• Benchmarking results are needed, both in industry and research!
• ...

Page 20: How much do we share?

Benchmarking methodology, criteria, and test suites:
• Is the view about benchmarking from industry "similar" to the view from research?
• Is it viable to have a common methodology? Will anyone use it?
• Can the test suites be reused between industry and research?
• Would a common way of presenting test suites be useful?
• ...

Benchmarking results:
• Can research benchmarking results be (re)used by industry, and vice versa?
• Would a common way of presenting results be useful?
• ...

Page 21: Next steps

Provide the benchmarking methodology to industry:
• First draft after the Manchester Research meeting: 1st October.
• Feedback from WP 1.2: end of October.
• (Almost) final version by mid-November.

Set up web pages with benchmarking information in the portal:
• Benchmarking activities
• Methodology
• Criteria
• Test suites

Discuss in a mailing list and agree on a definition of “best practice”.

Next meeting? To be decided (around November) (with O2I)

Page 22: Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez <rgarcia,[email protected]>
Jérôme Euzenat <[email protected]>

September 10th, 2004