LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Post on 01-Nov-2014

DESCRIPTION

State of Play presentation at the LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion by Robert Isele of FUB.

TRANSCRIPT

Page 1: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

LOD2 Plenary Meeting Vienna – 2012/03/21 – Page 1 http://lod2.eu

Creating Knowledge out of Interlinked Data

Freie Universität Berlin

Robert Isele

WP4: Reuse, Interlinking and Knowledge Fusion

LOD2 Plenary Meeting 2012, Vienna

Page 2: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

WP4 Goals

Translate heterogeneous data from the Web of Linked Data into a clean local target representation
Provide open-source software components for:
– Link Generation
– Vocabulary Mapping
– Linked Data Quality Assessment
– Linked Data Fusion

Page 3: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

WP4 in the LOD Stack

Page 4: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.1: Semi-Automatic Data Interlinking

Partners: ULEI, NUIG, FUB, KAIST
Goals:
– Develop a Linking Assist, which guides the knowledge engineer through the linking process (FUB, ULEI)
– (New) Provide a platform for automatic linking with Korean, Chinese, and Japanese RDF resources (KAIST)

Page 5: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.1: Progress

First Linking Assist/Silk Workbench (D4.1.1) was delivered in February 2012
– Define data sources (e.g. SPARQL endpoint, RDF dump)
– Specify the types of resources which should be interlinked
– Build linkage rules supported by machine learning
– Evaluate the quality of linkage rules
Preliminary work on the Korean Resource Linking Assist
– Transformed test datasets into RDF
– This data will be an input to the Korean resource linking module
– Finished preliminary design of the Korean resource linking module
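
The four Workbench steps above (data sources, resource types, linkage rule, evaluation) can be sketched in a few lines of Python. This is an illustration only, not Silk's API or rule language: the `ex:` URIs, the single label-similarity rule, the 0.9 threshold, and the helper names are all assumptions made for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def generate_links(source, target, threshold=0.9):
    """Hypothetical linkage rule: link pairs whose labels are similar enough."""
    return {
        (s_uri, t_uri)
        for s_uri, s_label in source.items()
        for t_uri, t_label in target.items()
        if similarity(s_label, t_label) >= threshold
    }

def evaluate(links, positive, negative):
    """Precision and recall against hand-labelled reference links."""
    tp = len(links & positive)
    fp = len(links & negative)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / len(positive) if positive else 0.0
    return precision, recall

# Made-up data sources: URI -> label of the resources to interlink.
source = {"ex:a1": "Berlin", "ex:a2": "Vienna"}
target = {"ex:b1": "berlin", "ex:b2": "Leipzig"}

links = generate_links(source, target)
precision, recall = evaluate(
    links,
    positive={("ex:a1", "ex:b1")},   # reference links known to be correct
    negative={("ex:a2", "ex:b2")},   # reference links known to be wrong
)
```

Comparing every source resource with every target resource is quadratic; it is used here only to keep the sketch short.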

Page 6: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.1: Improving Silk Workbench (1/2)

Use active learning to reduce the manual effort and required expertise to interlink data sources
– Automating the generation of a linkage rule
– The user only confirms or declines a set of example links
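
A minimal sketch of the confirm/decline loop, under strong simplifying assumptions: the "linkage rule" is reduced to a single similarity threshold, correct links are assumed to score higher than incorrect ones, and the next question is chosen by plain uncertainty sampling. The slides do not describe the Workbench's actual learner, so none of this should be read as its algorithm; `learn_threshold`, `most_uncertain`, and `oracle` are hypothetical names.

```python
def most_uncertain(candidates, threshold):
    """Pick the unlabelled candidate whose score is closest to the threshold."""
    return min(candidates, key=lambda c: abs(c[1] - threshold))

def learn_threshold(candidates, oracle, rounds=5):
    """Refine a similarity threshold from a few confirm/decline answers.

    candidates: list of (link, score) pairs.
    oracle(link): True if the user confirms the link, False if declined.
    """
    low, high = 0.0, 1.0              # the threshold lies between low and high
    pool = list(candidates)
    for _ in range(rounds):
        if not pool:
            break
        threshold = (low + high) / 2
        pick = most_uncertain(pool, threshold)
        pool.remove(pick)
        link, score = pick
        if oracle(link):
            high = min(high, score)   # confirmed: the rule must accept this score
        else:
            low = max(low, score)     # declined: the rule must reject this score
    return (low + high) / 2

# Made-up candidate links with similarity scores 0.1 .. 0.9.
candidates = [(f"link{i}", i / 10) for i in range(1, 10)]
# Simulated user: links scoring 0.6 or more are the correct ones.
user = lambda link: int(link[4:]) >= 6

learned = learn_threshold(candidates, user)
```

Each answer narrows the interval that can contain the threshold, so the user labels a handful of borderline links rather than the whole candidate set.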

Page 7: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.1: Improving Silk Workbench (2/2)

Improving the usability based on user feedback
First results for the Y2 review meeting
Final deliverable D4.1.2 (Second Linking Assist Release) in February 2013

Page 8: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.2: Data Interlinking Environment

Partners: NUIG
Goals:
– To research and develop LATC well beyond 2012 into 2014
– Interlinking recommendations
– Interaction with data linkage validator from WP3
Progress:
– First version of Data Interlinking Environment (D4.2.1) submitted in December 2011
– Combines Analytics Graph produced from Sindice data sources and the Silk Link Discovery Framework

Page 9: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.2: Silk Workbench Extension

New Sindice data source for the linking of datasets.
Dataset suggestion based on keywords, classes, and datasets.
Autocompletion for data types when executing linking tasks.
A retrieval method for entity properties to also aid in the execution of linking tasks.

Dataset suggestion

Page 10: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.3: Linked Data Quality Assessment

Partners: FUB, NUIG, ULEI, SWCG
Goals:
– Research into recent advances in quality assessment of Linked Data
– Develop design metrics for quality assessment
– Release a Linked Data Quality Assessment Component

Page 11: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.3 Progress

Survey on the State of the Art in Mapping, Quality Assessment and Data Fusion (D4.3.1) finished in February 2011
Conceptual Design and Implementation of Metrics (D4.3.2) finished in February 2012
Released first prototype of Sieve, a Linked Data Quality Assessment and Fusion framework
– Allows Web data to be filtered according to different data quality assessment policies
– Provides for fusing Web data according to different conflict resolution methods
– http://sieve.wbsg.de
– D4.3.2: Release of the data quality assessment tool (August 2012)
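
The two ideas named above, filtering by a quality assessment policy and fusing by a conflict resolution method, can be illustrated with a small Python sketch. It does not use Sieve's configuration format or API. The recency-based score, the "keep the value from the best-scored source" resolution, and all `ex:`/`src:` names and values are assumptions chosen for the example.

```python
from datetime import date

def recency_score(last_updated, today, max_age_days=365):
    """Quality score in [0, 1]: 1.0 if updated today, 0.0 at max_age_days or older."""
    age = (today - last_updated).days
    return max(0.0, 1.0 - age / max_age_days)

def fuse(claims, scores, min_score=0.0):
    """Keep, per (subject, property), the value from the best-scored source.

    claims: iterable of (subject, property, value, source).
    Sources scoring below min_score are filtered out before fusion.
    """
    best = {}
    for subject, prop, value, source in claims:
        score = scores[source]
        if score < min_score:
            continue  # rejected by the quality policy
        key = (subject, prop)
        if key not in best or score > best[key][1]:
            best[key] = (value, score)
    return {key: value for key, (value, _) in best.items()}

today = date(2012, 3, 21)
scores = {
    "src:a": recency_score(date(2012, 3, 1), today),   # recently updated
    "src:b": recency_score(date(2010, 1, 1), today),   # stale
}
# Made-up conflicting claims about the same property.
claims = [
    ("ex:City1", "ex:population", 100, "src:a"),
    ("ex:City1", "ex:population", 90, "src:b"),
]
fused = fuse(claims, scores)
```

Other resolution methods, such as averaging or majority vote, would replace only the comparison inside `fuse`.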

Page 12: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.4: Schema Mapping Publication and Discovery

Partners: FUB, ULEI, OGL, SWCG, UEP
Goals:
– Specification of the vocabulary mapping publication and discovery language
– Implementation of the Vocabulary Mapping Component

Page 13: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.4 Progress

Specification of the Mapping Publication and Discovery Language (D4.4.1) finished in June 2011
Implementation of the Mapping Publication and Discovery Framework (D4.4.2) finished in February 2012
– Adapted the R2R Framework based on the use cases in LOD2
– Conducted various experiments to demonstrate the performance and scaling behavior for translating data sets (http://www.assembla.com/spaces/ldif/wiki/Benchmark)
– Implementation published under the terms of the Apache License
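
What "translating data sets" between vocabularies means can be shown with a toy Python example. It is not R2R mapping syntax and does not call the R2R Framework. The `translate` helper, the `src:`/`tgt:` property names, and the unit conversion are hypothetical.

```python
def translate(triples, mappings):
    """Rewrite triples from source vocabulary terms into target vocabulary terms.

    mappings: {source_property: (target_property, value_transform)}
    Triples whose property has no mapping are dropped.
    """
    out = []
    for subject, prop, value in triples:
        if prop in mappings:
            target_prop, transform = mappings[prop]
            out.append((subject, target_prop, transform(value)))
    return out

# Hypothetical mappings: rename one property, convert units for another.
mappings = {
    "src:label": ("tgt:name", str.strip),
    "src:heightCm": ("tgt:heightM", lambda v: float(v) / 100),
}
triples = [
    ("ex:p1", "src:label", " Alice "),
    ("ex:p1", "src:heightCm", "180"),
    ("ex:p1", "src:internalId", "42"),   # no mapping: not translated
]
translated = translate(triples, mappings)
```

Keeping each mapping as a small independent unit is what makes mappings publishable and discoverable on their own, which is the goal stated for Task 4.4.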

Page 14: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.4: Future Work

Integration of the Mapping Publication and Discovery Framework into the LOD2 stack (D4.4.3)
Deadline: February 2013

Page 15: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.4a: Schema Mapping Robust to Modeling Style

Partners: UEP
Goal: Extend the methods and tools of schema matching discovery (from the original Task 4.4) by ontology transformation methods implemented within the (enhanced) PatOMat framework
Start: March 2012
First deliverable in December 2012

Page 16: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.5: Linked Data Fusion

Partners: FUB, ULEI
Goal:
– Build a Data Fusion Component which fuses data from multiple sources
– Fuse multiple entities representing the same real-world object into a single, consistent and clean representation
First deliverable:
– Initial release of Data Fusion Component (D4.5.1)
– Deadline: 31.08.12
– Integrating the data quality assessment module (Sieve) developed in Task 4.3 with a data fusion module
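
Before values can be fused, the URIs that denote the same real-world object have to be grouped. One common way to do this, shown here as an assumption rather than as the Data Fusion Component's method, is to treat identity links (for example owl:sameAs) as edges and compute connected components with union-find. The URIs below are made up.

```python
def cluster_same_as(links):
    """Group URIs into clusters of co-referent entities using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The lexicographically smallest URI becomes the canonical one.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return clusters

# Made-up identity links between three datasets a:, b:, c:.
links = [
    ("a:Berlin", "b:Berlin"),
    ("b:Berlin", "c:Berlin"),
    ("a:Vienna", "b:Vienna"),
]
clusters = cluster_same_as(links)
```

Each cluster can then be rewritten to its canonical URI, so that conflicting property values from different sources end up on one entity and can be resolved by a fusion step.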

Page 17: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.5a: Multilingual Linked Data Fusion

Involved: KAIST, ULEI
Goal: Fusion of multilingual datasets
– DBpedia dataset as the pivot multilingual dataset, since it is extracted for many different languages
– First step: Bilingual fusion between the Korean DBpedia and the English DBpedia
– Next: Include other languages such as Chinese and Japanese
First deliverable in February 2013: Korean Data Fusion Assistant
– The component will support Korean data fusion into English LOD by combining Deliverable 4.5.1 with the fused dataset of English and Korean DBpedia

Page 18: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Task 4.6: Tools for Cleansing Entity Data and Crowdsourcing of Cleansing

Involved: Zemanta
Goals:
– Adapt Google Refine for Linked Open Data based on the existing DERI plugin
– Integrate crowdsourcing services such as Amazon Mechanical Turk for LOD data cleansing
Progress:
– D4.6.1 (M18) Release of an LOD-Enabled Version of Google Refine submitted
Next deliverable:
– D4.6.2 (M30) Release of Documentation and Software Infrastructure for Using GR along with Amazon Mechanical Turk

Page 19: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

WP 4 Summary (M12 - M18)

5 deliverables submitted in the last 6 months:
– ULEI and FUB submitted the First Linking Assist (D4.1.1)
– NUIG submitted the first version of the Data Linking Environment Release (D4.2.1)
– FUB finished the Conceptual Design and Implementation of Quality Assessment Metrics (D4.3.2)
– FUB finished the Implementation of the Mapping Publication and Discovery Framework (D4.4.2)
– Zemanta submitted the first release of the LOD-enabled version of Google Refine for review (D4.6.1)

Page 20: LOD2 Plenary Vienna 2012: WP4 - Reuse, Interlinking and Knowledge Fusion

Contact

Address

Freie Universität Berlin
School of Business & Economics
Web-based Systems Group
Garystr. 21
14195 Berlin
Germany

Presenter

Robert Isele
[email protected]