LOD2 Webinar Series FOX

Post on 12-Jan-2015

Page 1: LOD2 Webinar Series FOX

LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu

Creating Knowledge out of Interlinked Data

Page 2: LOD2 Webinar Series FOX

LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. The partners come from 12 countries and are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany.

LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project demonstrates the benefits in three scenarios: Media and Publishing, Corporate Data Intranets, and eGovernment.

Page 3: LOD2 Webinar Series FOX

Once per month, the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data Life Cycle.

Stay with us and learn more about acquisition, editing, composing, connected applications – and finally publishing Linked Open Data.

Page 4: LOD2 Webinar Series FOX

Federated Knowledge Extraction Framework

Axel Ngonga

Page 5: LOD2 Webinar Series FOX

Creating Knowledge out of Interlinked Data

Axel Ngonga – Federated Knowledge Extraction 30.01.2014 Page 5 http://lod2.eu

Motivation

• Steady growth but incomplete
• Structured data
  • Triplify, Sparqlify
• Semi-structured data
  • DBpedia
• Unstructured data
  • Make up 80% of the Web
  • Diverse solutions, yet low F-score even on non-noisy data
• Solution: FOX

Page 6: LOD2 Webinar Series FOX

Insight

Dictionary-based approaches
Pattern-based approaches
Conditional Random Fields
Support Vector Machines

Page 7: LOD2 Webinar Series FOX

Insight

• Diversity of solutions to one problem
  • NER, KE, RE
• Each solution has its strengths and weaknesses
• Apply ensemble learning to
  • Combine the tools at hand
  • Compute better results
• In our case, decision trees (v2)
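The ensemble idea on this slide can be illustrated with a small sketch: treat the label each NER tool assigns to a token as a categorical feature, and learn a decision tree that maps those votes to the gold label. The code below is a minimal ID3-style tree in plain Python, not FOX's actual implementation. The tool names (`tool_a`, `tool_b`, `tool_c`) and the training rows are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def build_tree(rows, labels, features):
    """Learn an ID3-style tree over categorical features (tool votes)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf node

    def gain(feature):
        remainder = 0.0
        for value in set(row[feature] for row in rows):
            subset = [l for r, l in zip(rows, labels) if r[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(features, key=gain)
    remaining = [f for f in features if f != best]
    branches = {}
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree([r for r, _ in pairs],
                                     [l for _, l in pairs], remaining)
    return {"feature": best, "branches": branches, "default": majority}

def predict(tree, row):
    """Follow the tree; fall back to the node's majority label on unseen votes."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree

# Hypothetical per-token votes of three NER tools, with gold labels.
TOOLS = ["tool_a", "tool_b", "tool_c"]
votes = [
    ("PER", "O",   "PER"), ("PER", "ORG", "O"),
    ("O",   "ORG", "ORG"), ("LOC", "ORG", "O"),
    ("O",   "O",   "LOC"), ("O",   "O",   "O"),
    ("LOC", "LOC", "LOC"), ("O",   "LOC", "LOC"),
]
gold = ["PER", "PER", "ORG", "ORG", "O", "O", "LOC", "LOC"]
rows = [dict(zip(TOOLS, v)) for v in votes]

tree = build_tree(rows, gold, TOOLS)
unseen = {"tool_a": "O", "tool_b": "ORG", "tool_c": "O"}
print(predict(tree, unseen))
```

In this toy data no single tool is right on every row, but the learned tree reproduces all gold labels by consulting `tool_b` first and `tool_a` second, which is the kind of complementary strength the slide refers to.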

Page 8: LOD2 Webinar Series FOX

Architecture

[Figure: FOX architecture diagram. Labels: Learning, Prediction, Orchestration; NER, KE, RE, NED.]

Page 9: LOD2 Webinar Series FOX

Named Entity Disambiguation

• Use AGDISTIS Framework
  http://aksw.org/projects/AGDISTIS

Page 10: LOD2 Webinar Series FOX

Implementation

• Input
  • Text
  • HTML
  • URL
• Output
  • JSON-LD
  • RDF/XML
  • N3
  • …
• Execution
  • Single tools (light)
  • FOX Full
• Access
  • REST

Page 11: LOD2 Webinar Series FOX

Evaluation (FOX)

MUC-7 Corpus
• 6013 locations
• 11093 organizations
• 5882 persons
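The webinar reports its results as F-scores. As a reminder of how such a score is commonly computed for entity annotations, the sketch below derives precision, recall, and F1 from sets of predicted and gold annotations. Matching on exact `(begin, end, type)` triples is an assumption made here for simplicity, and the spans are invented toy data, not results from the evaluation.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted annotations against gold ones by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy annotations as (begin, end, type); values are illustrative only.
gold_spans = {(0, 7, "LOCATION"), (22, 43, "ORGANIZATION"),
              (60, 72, "PERSON"), (90, 97, "LOCATION")}
predicted_spans = {(0, 7, "LOCATION"), (22, 43, "ORGANIZATION"),
                   (60, 72, "ORGANIZATION"), (90, 97, "LOCATION")}

precision, recall, f1 = precision_recall_f1(predicted_spans, gold_spans)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Here three of four predictions match the gold set exactly, so precision, recall, and F1 all come out at 0.75; the mistyped span counts as both a false positive and a false negative.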

Page 12: LOD2 Webinar Series FOX

Evaluation (AGDISTIS)

Page 13: LOD2 Webinar Series FOX

Demo

http://fox.aksw.org

Page 14: LOD2 Webinar Series FOX

FOX API Parameters

input : text or a URL
type : { text | url }
task : { NER }
output : { JSONLD | N3 | N-TRIPLE | RDF/{ JSON | XML | XML-ABBREV } | TURTLE }
returnHtml : { true | false }
foxlight : an implemented INER class name (e.g. `org.aksw.fox.nertools.NEROpenNLP`) or `OFF`.
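A client might assemble and validate these parameters before sending them. The sketch below builds only the form-encoded request body and performs no network call, since the service URI is not given here. The helper name `build_fox_form` and its default values are hypothetical, and the three `RDF/...` spellings are an assumed expansion of the slide's `RDF/{ JSON | XML | XML-ABBREV }` shorthand.

```python
from urllib.parse import urlencode, parse_qs

# Allowed values as listed on the parameter slide. The "RDF/..." entries are
# an assumed reading of the slide's brace shorthand.
ALLOWED = {
    "type": {"text", "url"},
    "task": {"NER"},
    "output": {"JSONLD", "N3", "N-TRIPLE", "RDF/JSON", "RDF/XML",
               "RDF/XML-ABBREV", "TURTLE"},
    "returnHtml": {"true", "false"},
}

def build_fox_form(input_value, **overrides):
    """Return an application/x-www-form-urlencoded body for a FOX request.

    Defaults below are illustrative choices, not documented service defaults.
    """
    params = {"input": input_value, "type": "text", "task": "NER",
              "output": "JSONLD", "returnHtml": "false", "foxlight": "OFF"}
    unknown = set(overrides) - set(params)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    params.update(overrides)
    for name, allowed in ALLOWED.items():
        if params[name] not in allowed:
            raise ValueError(
                f"{name}={params[name]!r} is not one of {sorted(allowed)}")
    return urlencode(params)

body = build_fox_form("Leipzig is a city in Germany.", output="TURTLE")
print(body)
```

The resulting string can be used as the POST body with the header `Content-Type: application/x-www-form-urlencoded`.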

Page 15: LOD2 Webinar Series FOX

FOX API Parameters

curl -d type=text -d task=NER -d output=JSONLD \
  --data-urlencode "input=The foundation of the University of Leipzig in 1409 initiated the city's development into a centre of German law and the publishing industry, and towards being a location of the Reichsgericht (High Court), and the German National Library (founded in 1912). The philosopher and mathematician Gottfried Leibniz was born in Leipzig in 1646, and attended the university from 1661-1666." \
  -H "Content-Type: application/x-www-form-urlencoded" \
  <SERVICE_URI>

Page 16: LOD2 Webinar Series FOX

FOX Response

{
  "@id" : "_:t1",
  "http://www.w3.org/2000/10/annotation-ns#body" : [ { "@value" : "University of Leipzig" } ],
  "http://ns.aksw.org/scms/source" : [ { "@id" : "http://ns.aksw.org/scms/tools/fox" } ],
  "http://ns.aksw.org/scms/means" : [ { "@id" : "http://dbpedia.org/resource/Leipzig_University" } ],
  "http://ns.aksw.org/scms/endIndex" : [ { "@value" : "43", "@type" : "http://www.w3.org/2001/XMLSchema#int" } ],
  "http://ns.aksw.org/scms/beginIndex" : [ { "@value" : "22", "@type" : "http://www.w3.org/2001/XMLSchema#int" } ],
  "@type" : [
    "http://ns.aksw.org/scms/annotations/ORGANIZATION",
    "http://www.w3.org/2000/10/annotation-ns#Annotation"
  ]
}
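A consumer of this JSON-LD response typically needs the mention, its entity type, the linked resource, and the character offsets. The sketch below parses the annotation shown on this slide with the standard `json` module and checks the offsets against the start of the input text from the curl example. The helper name `summarize_annotation` is hypothetical, and the code assumes every property holds exactly one value, as in this example.

```python
import json

SCMS = "http://ns.aksw.org/scms/"
ANN = "http://www.w3.org/2000/10/annotation-ns#"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

# The annotation node from the slide, reproduced as a Python structure.
node_source = json.dumps({
    "@id": "_:t1",
    ANN + "body": [{"@value": "University of Leipzig"}],
    SCMS + "source": [{"@id": SCMS + "tools/fox"}],
    SCMS + "means": [{"@id": "http://dbpedia.org/resource/Leipzig_University"}],
    SCMS + "endIndex": [{"@value": "43", "@type": XSD_INT}],
    SCMS + "beginIndex": [{"@value": "22", "@type": XSD_INT}],
    "@type": [SCMS + "annotations/ORGANIZATION", ANN + "Annotation"],
})

def summarize_annotation(node):
    """Flatten one annotation node into a small dict (single values assumed)."""
    def first(prop, key):
        return node[prop][0][key]
    entity_types = [t.rsplit("/", 1)[-1] for t in node["@type"]
                    if t.startswith(SCMS + "annotations/")]
    return {
        "mention": first(ANN + "body", "@value"),
        "uri": first(SCMS + "means", "@id"),
        "begin": int(first(SCMS + "beginIndex", "@value")),
        "end": int(first(SCMS + "endIndex", "@value")),
        "types": entity_types,
    }

summary = summarize_annotation(json.loads(node_source))

# Start of the input text used in the curl example.
text = "The foundation of the University of Leipzig in 1409 initiated"
covered = text[summary["begin"]:summary["end"]]
print(summary["types"], summary["uri"], repr(covered))
```

In this example `beginIndex` and `endIndex` behave like a half-open character range over the input: slicing the text with them yields exactly the mention string.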

Page 17: LOD2 Webinar Series FOX

FOX Response

[
  a scmsann:ORGANIZATION , ann:Annotation ;
  scms:beginIndex "22"^^xsd:int ;
  scms:endIndex "43"^^xsd:int ;
  scms:means <http://dbpedia.org/resource/Leipzig_University> ;
  scms:source <http://ns.aksw.org/scms/tools/fox> ;
  ann:body "University of Leipzig"^^xsd:string
] .

Page 18: LOD2 Webinar Series FOX

AGDISTIS API

curl --data-urlencode "text='The <entity>University of Leipzig</entity> was visited by <entity>Barack Obama</entity>.'" -d type='agdistis' <SERVICEURL>

[{"namedEntity":"Barack Obama","start":42,"disambiguatedURL":"http://dbpedia.org/resource/Barack_Obama","offset":12},
{"namedEntity":"University of Leipzig","start":5,"disambiguatedURL":"http://dbpedia.org/resource/Leipzig_University","offset":21}]
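The request and response on this slide can be related to each other with a short check. The sketch below strips the `<entity>` tags from the submitted text and confirms that, for this example, `start` is a character position in the tag-free text (including the leading quote character) and `offset` is the mention length. This reading is inferred from the example values only and is not taken from AGDISTIS documentation.

```python
import json
import re

# Text exactly as submitted in the curl call, including the single quotes.
annotated = ("'The <entity>University of Leipzig</entity> was visited by "
             "<entity>Barack Obama</entity>.'")

# Response from the slide, with the line-broken URLs rejoined.
response = json.loads(
    '[{"namedEntity":"Barack Obama","start":42,'
    '"disambiguatedURL":"http://dbpedia.org/resource/Barack_Obama","offset":12},'
    '{"namedEntity":"University of Leipzig","start":5,'
    '"disambiguatedURL":"http://dbpedia.org/resource/Leipzig_University","offset":21}]'
)

# Remove the markup so positions refer to plain text.
plain = re.sub(r"</?entity>", "", annotated)

links = {}
for entity in sorted(response, key=lambda e: e["start"]):
    begin = entity["start"]
    end = begin + entity["offset"]  # "offset" acts as a length in this example
    mention = plain[begin:end]
    links[mention] = entity["disambiguatedURL"]
    print(f"{begin:>3}-{end:<3} {mention!r} -> {entity['disambiguatedURL']}")
```

Sorting by `start` restores document order, since the response on the slide lists the later mention first.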

Page 19: LOD2 Webinar Series FOX

Conclusion and Future Work

• > 90% F-score
• Can be extended to cover other KE tasks (RE, POS, …)
• Easy integration into semantic applications
• More info at http://fox.aksw.org and http://aksw.org/projects/agdistis

Page 20: LOD2 Webinar Series FOX

Thank you for your attention!

Axel Ngonga
http://aksw.org/AxelNgonga | http://fox.aksw.org | http://lod2.org