Semantic Search: different meanings

Upload: morris-holmes

Post on 29-Dec-2015


Page 1: Semantic Search: different meanings. Semantic search: different meanings Definition 1: Semantic search as the problem of searching documents beyond the

Semantic Search: different meanings

Page 2

Semantic search: different meanings

• Definition 1: Semantic search as the problem of searching documents beyond the syntactic level of matching keywords
– Hakia, Powerset, SearchMonkey

• Definition 2: Semantic search as the problem of searching large semantic web datasets
– Watson, PowerAqua, Swoogle, Sindice, SWSE

Page 3

Facing keyword-based search problems

• Relations between search terms:
– “books about recommender systems” vs. “systems that recommend books”
• Polysemy
– “mouth” as part of the body vs. “mouth” as part of a stream
• Synonymy
– “movies” vs. “films”
• Documents about individuals where query keywords do not appear:
– “English banks”, individual “Abbey”
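Two of the problems above, synonymy and documents about individuals, can be illustrated with a minimal sketch. The corpus, the synonym table and the instance table are all made up for illustration; the sketch does not address polysemy or relations between search terms.

```python
# Toy corpus; documents and background knowledge are invented for illustration.
DOCS = {
    "d1": "Abbey raises mortgage rates",
    "d2": "New films released this week",
    "d3": "English weather report",
}

# Hypothetical background knowledge: synonyms and known instances of a concept.
SYNONYMS = {"movies": {"films"}, "films": {"movies"}}
INSTANCES = {"english banks": {"abbey"}}


def tokens(text):
    """Lowercase a text and split it into a set of words."""
    return set(text.lower().split())


def keyword_search(query):
    """Return ids of documents that contain every query keyword."""
    terms = tokens(query)
    return sorted(d for d, text in DOCS.items() if terms <= tokens(text))


def expanded_search(query):
    """Also match synonyms of each term and known instances of the queried concept."""
    q = query.lower()
    hits = set()
    for doc_id, text in DOCS.items():
        words = tokens(text)
        # Instance expansion: the document mentions an individual of the concept.
        if INSTANCES.get(q, set()) & words:
            hits.add(doc_id)
        # Synonym expansion: each term, or one of its synonyms, must occur.
        if all(({t} | SYNONYMS.get(t, set())) & words for t in tokens(q)):
            hits.add(doc_id)
    return sorted(hits)


print(keyword_search("English banks"))   # []
print(expanded_search("English banks"))  # ['d1']
print(expanded_search("movies"))         # ['d2']
```

Plain keyword matching finds nothing for “English banks” because no document contains both words, while the expanded search reaches the document about “Abbey” through the assumed instance table.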

Page 4

Several attempts from the IR community

• Early 80s: elaboration of conceptual frameworks and their introduction in IR models
– Taxonomies (categories + hierarchical relations), e.g., the ODP (Open Directory Project)
– Thesauri (categories + fixed hierarchical & associative relations), e.g., WordNet (used by linguistic approaches)
– Algebraic methods such as LSA
• Limitations: the level of conceptualization is often shallow (especially at the level of relations)

Page 5

The emergence of the SW

• Late 90s: introduction of ontologies as a conceptual framework (classes + instances (KBs) + arbitrary semantic relations + rules)
– Semantic search: exploiting ontologies as richer conceptualizations and formal languages to enhance traditional keyword-based document retrieval
– Semantic search: the need to search this emergent and continuously growing structured information space (the Web of Data)
• DBLP, Geonames, DBpedia, BBC Music, ... (http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets)

Page 6

The Web of Data: 2007, 2008, 2009

Extracted from: Linked Data Tutorial (Florianópolis) http://www.slideshare.net/ocorcho/linked-data-tutorial-florianpolis

Page 7

LOD cloud May 2007

Figure from [4]

Facts:
• Focal points:
– DBpedia: RDFized version of Wikipedia; many ingoing and outgoing links
– Music-related datasets
• Big datasets include FOAF, US Census data
• Size approx. 1 billion triples, 250k links

Extracted from: Linked Data Tutorial (Florianópolis) http://www.slideshare.net/ocorcho/linked-data-tutorial-florianpolis

Page 8

LOD cloud September 2008

Facts:
• More than 35 datasets interlinked
• Commercial players joined the cloud, e.g., BBC
• Companies began to publish and host datasets, e.g., OpenLink, Talis, or Garlik
• Size approx. 2 billion triples, 3 million links

Extracted from: Linked Data Tutorial (Florianópolis) http://www.slideshare.net/ocorcho/linked-data-tutorial-florianpolis

Page 9

LOD cloud March 2009

Facts:
• A large part comes from the Linking Open Drug cloud and the BIO2RDF project
• Notable new datasets: Freebase, OpenCalais, ACM/IEEE
• Size > 10 billion triples

Extracted from: Linked Data Tutorial (Florianópolis) http://www.slideshare.net/ocorcho/linked-data-tutorial-florianpolis

Page 10

The LOD clouds

Extracted from: Linked Data Tutorial (Florianópolis) http://www.slideshare.net/ocorcho/linked-data-tutorial-florianpolis

Page 11

Commercial interest by publishers

Page 12

Commercial interest by search engines

• 2007 Yahoo! presents SearchMonkey

Page 13

Commercial interest by search engines

• July-2008 Microsoft buys Powerset

Page 14

Commercial interest by search engines

• April 2010 Facebook announced the use of the Open Graph protocol

Page 15

Commercial interest by search engines

• May-2009 Google announces Rich Snippets and its official use of RDFa and Microformats

Page 16

Commercial interest by search engines

• July-2010 Google buys Metaweb (the company behind Freebase)

Page 17

Commercial interest by search engines

• November-2010 Google announced the support of the GoodRelations vocabulary for Google Rich Snippets.

Page 18

Challenges

• Exploiting this new information space for semantic search purposes opens new research challenges:
– Scalability
– Heterogeneity
– Uncertainty

Page 19

Scalability

Effective exploitation of the linked data requires infrastructure that scales to a large and ever growing collection of interlinked data!

Page 20

Heterogeneity

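[Figure: the same entities and classes appear under different identifiers in different sources. Schema level (align): Dblp:~ley/db/../author, SW:Person, Dbpedia:Professor. Data level (reconcile, combine): Dbpedia:Rudi_Studer, Dblp:Studer:Rudi.html, SW:/en/rudi_studer.]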

Effective exploitation of the data web requires an effective mechanism for
• finding the relevant data sources
• integrating data sources
• combining elements from different data sources
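Data-level reconciliation can be sketched as grouping identifiers that are asserted to denote the same thing. The sketch below uses a small union-find over identity links between the three identifiers shown on this slide; the links themselves are an assumption made for illustration, not something the slides state.

```python
# Assumed identity links (in the spirit of owl:sameAs) between the
# identifiers from the slide; the pairs are hypothetical.
SAME_AS = [
    ("Dbpedia:Rudi_Studer", "Dblp:Studer:Rudi.html"),
    ("Dblp:Studer:Rudi.html", "SW:/en/rudi_studer"),
]


def reconcile(links):
    """Group identifiers connected by identity links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


for group in reconcile(SAME_AS):
    print(sorted(group))
```

Once identifiers are grouped, facts attached to any member of a group can be combined into one description of the entity. Schema-level alignment (for example between SW:Person and Dbpedia:Professor) is a separate step that this sketch does not cover.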

Page 21

Uncertainty

• Incomplete representation of the user’s needs and of content meanings
– The user cannot completely specify the need
– The semantic information in the search space is incomplete

Effective exploitation requires the ability to
• match the user’s needs to data in an imprecise way
• rank the results
• be flexible enough to adjust to changes in constraints

“Find action films directed by some Hong Kong film director and starring Chinese martial actors”
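One way to picture imprecise matching and ranking for a query like the one above is to treat it as a set of constraints and score each candidate by how many it satisfies. This is only a sketch: the film records, the attribute names and the half-credit for unknown values are all assumptions made for illustration.

```python
# Hypothetical, deliberately incomplete film descriptions.
FILMS = {
    "film_a": {"genre": "action", "director_from": "Hong Kong", "star_is_martial_artist": True},
    "film_b": {"genre": "action", "director_from": "Hong Kong"},  # cast information missing
    "film_c": {"genre": "drama", "director_from": "France", "star_is_martial_artist": False},
}

# The example query expressed as attribute constraints (attribute names are made up).
CONSTRAINTS = {"genre": "action", "director_from": "Hong Kong", "star_is_martial_artist": True}


def score(record, constraints):
    """Fraction of constraints satisfied; an unknown attribute counts as half a match."""
    total = 0.0
    for key, wanted in constraints.items():
        if key not in record:
            total += 0.5  # the data is incomplete, so neither confirm nor reject
        elif record[key] == wanted:
            total += 1.0
    return total / len(constraints)


def rank(records, constraints, threshold=0.5):
    """Return names whose score reaches the threshold, best first (ties by name)."""
    scored = [(score(rec, constraints), name) for name, rec in records.items()]
    kept = [(s, name) for s, name in scored if s >= threshold]
    return [name for s, name in sorted(kept, key=lambda pair: (-pair[0], pair[1]))]


print(rank(FILMS, CONSTRAINTS))  # ['film_a', 'film_b']
```

Because constraints are plain dictionary entries, adding or dropping one only changes the scores, which is one simple reading of being flexible to changes in constraints.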

Page 22

The Search Space: different representations

Page 23

The search space: different representations

• Unstructured search space
– The Web of documents (textual and multimedia content)
• Structured search space
– The Web of data (ontologies + Knowledge Bases)
• Hybrid search space
– Unstructured content is enriched with metadata
  • Embedded annotations
  • Not embedded annotations

Page 24

The unstructured search space

• The Web of human-understandable content
• The Web of documents and links
– <a href="http://creativecommons.org/licenses/by/3.0/">CC License</a>

[Diagram: the search space consists of documents]

Page 25

Search engines

Page 26

The structured search space

• The Web of machine-understandable content
• The Web of objects and relations
– <a rel="license" href="http://creativecommons.org/licenses/by/3.0/"> Creative Commons License </a>

[Diagram: the search space consists of objects]
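The difference between an untyped hyperlink and a link annotated with rel="license" can be made concrete with a small parser. The sketch below uses Python's standard html.parser; the page URI and the class name LinkExtractor are hypothetical, and treating each rel value as the relation of a (page, relation, target) triple is a simplification.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect plain links and typed links (those with a rel attribute) from <a> tags."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.links = []    # untyped hyperlinks: (page, target)
        self.triples = []  # typed links: (page, relation, target)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        rel = attributes.get("rel")
        if rel:
            # A rel attribute may hold several space-separated relation names.
            for relation in rel.split():
                self.triples.append((self.page_uri, relation, href))
        else:
            self.links.append((self.page_uri, href))


PAGE = "http://example.org/page"  # hypothetical page URI

parser = LinkExtractor(PAGE)
parser.feed('<a href="http://creativecommons.org/licenses/by/3.0/">CC License</a>')
parser.feed('<a rel="license" href="http://creativecommons.org/licenses/by/3.0/"> Creative Commons License </a>')
print(parser.links)
print(parser.triples)
```

The first anchor only tells a machine that the page points somewhere; the second one also says what the target is to the page, which is what makes it usable as a statement about an object.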

Page 27

Search engines

Page 28

The hybrid search space

• Enriching documents with metadata

[Diagram: the search space contains both objects and documents]

How to interlink documents and data?

Page 29

Two ways of interlinking metadata and documents

• Information extraction
• By relying on Web publishers
– More in the section “Data on the (Semantic) Web”
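The information extraction route can be sketched in its simplest form as a gazetteer lookup: known labels are searched for in the text and each match is linked to an identifier. The gazetteer entries below are assumptions for illustration ("ex:Abbey" is a made-up identifier), and real extraction systems do far more, for example disambiguation.

```python
import re

# Hypothetical gazetteer mapping surface forms to identifiers.
GAZETTEER = {
    "Rudi Studer": "Dbpedia:Rudi_Studer",
    "Abbey": "ex:Abbey",  # made-up identifier for illustration
}


def annotate(text, gazetteer):
    """Return (start, end, identifier) for every whole-word gazetteer match in text."""
    annotations = []
    for label, identifier in gazetteer.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text):
            annotations.append((match.start(), match.end(), identifier))
    return sorted(annotations)


text = "Rudi Studer opened an account at Abbey."
for start, end, identifier in annotate(text, GAZETTEER):
    print(text[start:end], "->", identifier)
```

The resulting annotations are kept outside the document (not embedded), in contrast to markup added by the publisher inside the page itself.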

Page 30

Search engines