Principles for knowledge engineering on the Web

Post on 30-Nov-2014

DESCRIPTION

Keynote ICK3 conference, Paris, 2011

TRANSCRIPT

Principles for knowledge engineering on the Web

Guus Schreiber

VU University Amsterdam

Computer Science, Web & Media

Overview of this talk

• Semantic Web: the digital heritage case

• Knowledge-engineering principles

• Challenges for Web KE

My journey: knowledge engineering

• design patterns for problem solving

• methodology for knowledge systems

• models of domain knowledge

• ontology engineering

My journey: access to digital heritage

My journey: Web standards

• Web metadata: RDF

• OWL Web Ontology Language

• SKOS model for publishing vocabularies on the Web

SEMANTIC WEB: THE DIGITAL-HERITAGE CASE

The Web: resources and links

[Figure: two resources, each identified by a URL, connected by a Web link]

The Semantic Web: typed resources and links

[Figure: the same two URLs and Web link, now typed. The painting “Woman with hat” (SFMOMA) is linked through the Dublin Core property “creator” to the ULAN entry for Henri Matisse]
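The idea of typed resources and typed links can be sketched as a small set of subject-predicate-object triples. This is only an illustration, not something shown in the talk: the `example.org` URIs for the painting, the artist and the two classes are hypothetical placeholders, while the Dublin Core and RDF property URIs are the standard ones.

```python
# Minimal sketch: typed resources and typed links as (subject, predicate, object).
# The example.org URIs are hypothetical placeholders, not real identifiers.
PAINTING = "http://example.org/sfmoma/woman-with-hat"
MATISSE = "http://example.org/ulan/henri-matisse"
PAINTING_CLASS = "http://example.org/types/Painting"
ARTIST_CLASS = "http://example.org/types/Artist"

# Standard property URIs from Dublin Core and RDF.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

triples = [
    (PAINTING, RDF_TYPE, PAINTING_CLASS),  # typed resource
    (MATISSE, RDF_TYPE, ARTIST_CLASS),     # typed resource
    (PAINTING, DC_CREATOR, MATISSE),       # typed link
]


def objects(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


creators = objects(triples, PAINTING, DC_CREATOR)
```

Because the link carries a type, a program can ask specifically for the creator of the painting rather than for "any page this page links to".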

Vocabulary interoperability: SKOS

Vocabulary representations

• SKOS has been a major success

• Easy to understand and create

• The publication of LCSH set an important example

The myth of a unified vocabulary

• In large virtual collections there are always multiple vocabularies
  – In multiple languages

• Every vocabulary has its own perspective
  – You can’t just merge them

• But you can use vocabularies jointly by defining a limited set of links
  – “Vocabulary alignment”

• It is surprising what you can do with just a few links

Example use of vocabulary alignment

“Tokugawa”

SVCN period Edo

SVCN is a local in-house ethnology thesaurus

AAT style/period Edo (Japanese period) Tokugawa

AAT is Getty’s Art & Architecture Thesaurus
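A rough sketch of how a single alignment link makes the “Tokugawa” query work is given below. The concept identifiers, the object annotations and the `search` helper are all hypothetical; only the labels are taken from the slide.

```python
# Hypothetical, simplified data: labels follow the slide, identifiers do not.
AAT_LABELS = {"aat:edo": ["Edo (Japanese period)", "Tokugawa"]}
SVCN_LABELS = {"svcn:edo": ["Edo"], "svcn:meiji": ["Meiji"]}

# One alignment link between the two vocabularies.
ALIGNMENT = {("aat:edo", "svcn:edo")}

# Museum objects are annotated with the in-house (SVCN) concepts only.
ANNOTATIONS = {"object-1": "svcn:edo", "object-2": "svcn:meiji"}


def matching_concepts(term):
    """Concepts in either vocabulary that carry `term` as a label."""
    term = term.lower()
    found = set()
    for table in (AAT_LABELS, SVCN_LABELS):
        for concept, labels in table.items():
            if any(term == label.lower() for label in labels):
                found.add(concept)
    return found


def expand(concepts):
    """Add concepts reachable over one alignment link, in either direction."""
    expanded = set(concepts)
    for a, b in ALIGNMENT:
        if a in concepts:
            expanded.add(b)
        if b in concepts:
            expanded.add(a)
    return expanded


def search(term, use_alignment=True):
    """Return the objects annotated with a concept matching `term`."""
    concepts = matching_concepts(term)
    if use_alignment:
        concepts = expand(concepts)
    return sorted(o for o, c in ANNOTATIONS.items() if c in concepts)


with_link = search("Tokugawa")
without_link = search("Tokugawa", use_alignment=False)
```

In this toy setup the query finds `object-1` only when the alignment link is used, because “Tokugawa” is a label in AAT but not in the in-house thesaurus.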

Enriching metadata with concepts

Learning vocabulary alignments

• Example: learning relations between art styles and artists through NLP of art historic texts
  – “Who are Impressionist painters?”

Semantic search: result clustering based on retrieval path
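One possible reading of "clustering based on retrieval path" is sketched below: starting from the concept that matched the query, follow links backwards to artworks and group the artworks by the sequence of properties that connects them to the concept. The graph, the property names and the `work:` prefix convention are hypothetical, and this is not claimed to be the algorithm used in the system shown in the talk.

```python
from collections import defaultdict, deque

# Hypothetical graph: (subject, property, object).
EDGES = [
    ("work:1", "creator", "artist:a"),
    ("work:2", "creator", "artist:a"),
    ("work:3", "depicts", "artist:a"),
    ("work:4", "creator", "artist:b"),
    ("artist:b", "studentOf", "artist:a"),
]


def cluster_by_path(edges, target, max_depth=2):
    """Group artworks by the property path that links them to `target`.

    Breadth-first search over incoming links; each artwork is reported
    once, under the first (shortest) path by which it is reached.
    """
    incoming = defaultdict(list)
    for s, p, o in edges:
        incoming[o].append((s, p))

    clusters = defaultdict(list)
    queue = deque([(target, ())])
    seen = {target}
    while queue:
        node, path = queue.popleft()
        if node.startswith("work:"):
            clusters[path].append(node)
            continue
        if len(path) == max_depth:
            continue
        for source, prop in incoming[node]:
            if source not in seen:
                seen.add(source)
                queue.append((source, (prop,) + path))
    return {path: sorted(works) for path, works in clusters.items()}


clusters = cluster_by_path(EDGES, "artist:a")
```

Each key of `clusters` is a path such as `("creator",)` or `("creator", "studentOf")`, which can be shown to the user as the heading of a result group.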

Research issues

• Information retrieval as graph search
  – more semantics => more paths
  – finding optimal graph patterns

• Vocabulary alignment

• Information extraction
  – recognizing people, locations, …
  – identity resolution

• Multi-lingual resources

Personalized Rijksmuseum

• Interactive user modeling

• Recommendations of artworks and art topics

Mobile museum tour

KNOWLEDGE ENGINEERING PRINCIPLES

Lessons I learned

Principle 1: Be modest!

• Ontology engineers should refrain from developing their own idiosyncratic ontologies

• Instead, they should make the available rich vocabularies, thesauri and databases available in an interoperable (web) format

• Initially, only add the originally intended semantics

Principle 2: Think large!

"Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing."

Doug Lenat

Principle 3: Develop and use patterns!

• Don’t try to be (too) creative

• Ontology engineering should not be an art but a discipline

• Patterns play a key role in methodology for ontology engineering

• See for example patterns developed by the W3C Semantic Web Best Practices group

http://www.w3.org/2001/sw/BestPractices/

• SKOS can also be considered a pattern

Principle 4: Don’t recreate, but enrich and align

• Techniques:
  – Learning ontology relations/mappings
  – Semantic analysis, e.g. OntoClean
  – Processing of scope notes in thesauri

Principle 5: Beware of ontological over-commitment!

Principle 6: Writing in an ontology language doesn’t make it an ontology!

• An ontology is a vehicle for sharing

• Papers about your own idiosyncratic “university ontology” should be rejected at conferences

• The quality of an ontology does not depend on the number of, for example, OWL constructs used

Principle 7: Required level of formal semantics depends on the domain!

• In our semantic search we use three OWL constructs:
  – owl:sameAs, owl:TransitiveProperty, owl:SymmetricProperty

• But cultural heritage is very different from medicine and bioinformatics
  – Don’t over-generalize on requirements for e.g. OWL
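To give a feel for how little machinery these three constructs need, here is a toy forward-chaining sketch. It only approximates the OWL semantics (it is not a reasoner and not the implementation behind the talk), and the identifiers and the properties `partOf`, `relatedTo` and `creator` are hypothetical.

```python
SAME_AS = "owl:sameAs"


def infer(triples, transitive=(), symmetric=()):
    """Naive fixpoint over three rules: symmetry, transitivity, sameAs.

    owl:sameAs is itself treated as symmetric and transitive, and
    statements about one identifier are copied to its equivalents.
    """
    transitive = set(transitive) | {SAME_AS}
    symmetric = set(symmetric) | {SAME_AS}
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
            if p == SAME_AS:
                for s2, p2, o2 in facts:
                    if s2 == s:
                        new.add((o, p2, o2))  # substitute in subject position
                    if o2 == s:
                        new.add((s2, p2, o))  # substitute in object position
        if new <= facts:
            return facts
        facts |= new


# Hypothetical input data.
facts = infer(
    [
        ("ulan:matisse", SAME_AS, "rkd:matisse"),
        ("work:1", "creator", "rkd:matisse"),
        ("place:amsterdam", "partOf", "place:netherlands"),
        ("place:netherlands", "partOf", "place:europe"),
        ("artist:a", "relatedTo", "artist:b"),
    ],
    transitive={"partOf"},
    symmetric={"relatedTo"},
)
```

After inference the work is also linked to `ulan:matisse`, Amsterdam is part of Europe, and `relatedTo` holds in both directions, which is the kind of extra path a graph-based search can exploit.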

CHALLENGES FOR WEB KE

Challenge: Linked Open Data

Availability of government data: http://data.gov.uk

The fight for “standard” semantics: Schema.org

Challenge: vocabulary alignment methodology

• Multitude of alignment techniques available
  – Direct syntactic match
  – Lexical manipulation
  – Structured, …

• Precision and recall vary

• Large evaluation initiative
  – OAEI http://oaei.ontologymatching.org/
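The first two techniques in the list, and the way precision and recall shift between them, can be illustrated with a small sketch. The two vocabularies, the reference alignment and the normalization rules are hypothetical; real matchers and the OAEI evaluations are considerably more elaborate.

```python
import re

# Hypothetical vocabularies; b:4 is meant to be the metal, not the planet.
SOURCE = {
    "a:1": "Edo (Japanese period)",
    "a:2": "oil paintings",
    "a:3": "chairs",
    "a:4": "Mercury (planet)",
}
TARGET = {"b:1": "Edo", "b:2": "Paintings, oil", "b:3": "chairs", "b:4": "Mercury"}

# Hand-made reference alignment used for evaluation.
REFERENCE = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3")}


def normalize(label):
    """Lexical manipulation: lowercase, drop qualifiers, ignore word order."""
    label = label.lower()
    label = re.sub(r"\(.*?\)", " ", label)     # drop "(...)" qualifiers
    label = re.sub(r"[^a-z0-9]+", " ", label)  # punctuation to spaces
    return " ".join(sorted(label.split()))


def align(source, target, key):
    """Link concepts whose labels are equal after applying `key`."""
    index = {}
    for concept, label in target.items():
        index.setdefault(key(label), []).append(concept)
    return {(s, t) for s, label in source.items() for t in index.get(key(label), [])}


def precision_recall(found, reference):
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


exact = align(SOURCE, TARGET, key=lambda label: label)  # direct syntactic match
lexical = align(SOURCE, TARGET, key=normalize)          # lexical manipulation

exact_scores = precision_recall(exact, REFERENCE)
lexical_scores = precision_recall(lexical, REFERENCE)
```

On this toy data the direct match is precise but finds only one of the three reference links, while the lexical variant finds all three at the cost of one wrong link (the two senses of “Mercury”).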

Limitations of categorical thinking

• The set theory on which ontology languages are built is inadequate for modelling how people think about categories (Lakoff)
  – Category boundaries are not hard: cf. art styles
  – People think of prototypes; some examples are very prototypical, others less

• We also need to make meta-distinctions explicit
  – organizing class: “furniture”
  – base-level class: “chair”
  – domain-specific: “Windsor chair”

Challenge: new types of search exploiting semantics

Relation search: Picasso, Matisse & Braque

Challenge: combining professional annotations with public “tags”

Challenge: data trust issues

• How can a museum trust annotations of outsiders?

• Need to adapt techniques from closed world to open world

• Ongoing case studies examine reputation assessment, use of probability theories, …

Challenge: event-centred approach => people like narratives

Extracting piracy events from piracy reports & Web sources

Visualising piracy events

Large-scale experimentation!

TOWARDS WEB SCIENCE

We need to study the Web as a phenomenon

• Web dynamics

• Collective intelligence

• Privacy, trust and security

• Linked open data

• Universal access

Web for Social Development

Acknowledgements

• Long list of people

• Projects: MIA, MultimediaN E-Culture, CHOICE, MunCH, CHIP, Agora, PrestoPrime, NoTube, EuropeanaConnect, Poseidon
