concept based semantic search

© Semantic Web Company – http://www.semantic-web.at/ 1 Concept-based, semantic search Andreas Blumauer Semantic Web Company www.semantic-web.at

Upload: semantic-web-company

Post on 27-Jan-2015


Page 1: Concept based semantic search


Concept-based, semantic search

Andreas Blumauer, Semantic Web Company, www.semantic-web.at

Page 2: Concept based semantic search


Content/agenda

1. What does „concept-based“ mean?
2. Concept-tagging
3. Semantic search
   • Faceted search
   • Similarity search
4. Semantics as a means for ‚interpretation‘
5. Topic pages
6. Three levels of semantic search

Page 3: Concept based semantic search

What is a concept? The semiotic triangle

[Figure: semiotic triangle with the corners concept, object and label. Labels: „A-Class“, „A-Klasse“, „W 176“. Concept: mental model of „A-Class“. Also shown: another object and another mental model of „A-Class“.]

Page 4: Concept based semantic search

Concept-based enterprise vocabulary

Each concept has a unique URI and can have various multilingual labels. Additionally, it can have various types of semantic relations with other concepts. The W3C's SKOS standard describes a predefined set of semantic relations designed especially for controlled vocabularies.

[Figure: graph of concepts connected by broader, narrower and related links. Nodes and labels shown:]
• http://voc.org.com/core/176: prefLabel (en) „A-Class“, prefLabel (de) „A-Klasse“, altLabel „W 176“
• http://voc.org.com/core/44: prefLabel „A 250 Sport“
• http://voc.org.com/core/77: prefLabel „AMG“, altLabel „Mercedes-AMG“
• prefLabel „Daimler AG“, hiddenLabel „Daimler-Benz“
• prefLabel „Vehicle manufacturing company“
• prefLabel „compact car“
• Further URIs shown: http://voc.org.com/core/54, http://voc.org.com/core/355, http://voc.org.com/core/97
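The vocabulary on this slide can be sketched as a small data structure. This is a minimal illustration in plain Python, not a SKOS implementation: the `Concept` class and the `find_by_label` helper are hypothetical names, the language tag of „A 250 Sport“ is assumed, and the broader link from „A 250 Sport“ to „A-Class“ follows the example used on the later slides.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A SKOS-style concept: one URI, several labels, links to other concepts."""
    uri: str
    pref_labels: dict[str, str]                       # language code -> preferred label
    alt_labels: set[str] = field(default_factory=set)
    hidden_labels: set[str] = field(default_factory=set)
    broader: set[str] = field(default_factory=set)    # URIs of broader concepts
    related: set[str] = field(default_factory=set)    # URIs of related concepts

    def all_labels(self) -> set[str]:
        """Every label under which the concept can be found."""
        return set(self.pref_labels.values()) | self.alt_labels | self.hidden_labels


a_class = Concept(
    uri="http://voc.org.com/core/176",
    pref_labels={"en": "A-Class", "de": "A-Klasse"},
    alt_labels={"W 176"},
)
a_250_sport = Concept(
    uri="http://voc.org.com/core/44",
    pref_labels={"en": "A 250 Sport"},   # language tag assumed for illustration
    broader={a_class.uri},
)
vocabulary = {c.uri: c for c in (a_class, a_250_sport)}


def find_by_label(label: str) -> list[Concept]:
    """Return every concept that carries the given label (case-insensitive)."""
    wanted = label.lower()
    return [c for c in vocabulary.values()
            if wanted in {l.lower() for l in c.all_labels()}]
```

Because every label points to the same URI, a lookup for „W 176“ or „A-Klasse“ ends at the same concept as a lookup for „A-Class“.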

Page 5: Concept based semantic search

Concept-tagging vs. Term-tagging

[Figure: content from the CMS is tagged against the enterprise vocabulary in two ways, concept tagging and term tagging. ‚Term-tags‘ become a ‚concept‘ as part of the enterprise vocabulary.]

Concept-tagging assigns concepts that are already part of the enterprise vocabulary and are therefore contextualised and linked to other concepts.

Term-tagging extracts tags from text (automatically, via text mining) that are not yet part of the controlled vocabulary.

Term-tags can be inserted into the enterprise vocabulary, which extends and refines the vocabulary over time.
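The difference between the two kinds of tagging can be sketched in a few lines. This is a toy illustration under stated assumptions: the label table, the `promote` helper and the URI `http://voc.org.com/core/900` are hypothetical, and the capitalised-word heuristic only stands in for real text mining.

```python
import re

# Hypothetical mini vocabulary: lower-cased label -> concept URI.
LABEL_TO_CONCEPT = {
    "a-class": "http://voc.org.com/core/176",
    "w 176": "http://voc.org.com/core/176",
    "a 250 sport": "http://voc.org.com/core/44",
}


def concept_tags(text: str) -> set[str]:
    """Concept-tagging: URIs of vocabulary concepts whose labels occur in the text."""
    lowered = text.lower()
    return {uri for label, uri in LABEL_TO_CONCEPT.items() if label in lowered}


def term_tags(text: str) -> set[str]:
    """Term-tagging stand-in: capitalised words that match no vocabulary label."""
    candidates = set(re.findall(r"\b[A-Z][a-z]{3,}\b", text))
    known_words = {word for label in LABEL_TO_CONCEPT for word in label.split()}
    return {c for c in candidates if c.lower() not in known_words}


def promote(term: str, uri: str) -> None:
    """Insert a term-tag into the vocabulary so that it becomes a concept label."""
    LABEL_TO_CONCEPT[term.lower()] = uri


text = "The W 176 is sold with a Turbocharger option."
```

Here `concept_tags(text)` finds the A-Class concept via „W 176“, while `term_tags(text)` proposes „Turbocharger“. After `promote("Turbocharger", "http://voc.org.com/core/900")` the same word is tagged as a concept and no longer shows up as a free term.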

Page 6: Concept based semantic search

Concept-tagging: pre-condition for semantic search

[Figure: a search for „A-Class“ finds a document that contains „W 176“ and „A 250 Sport“. In the vocabulary, „A-Class“ is the prefLabel and „W 176“ an altLabel of the same concept; „A 250 Sport“ is the prefLabel of a narrower concept.]

Page 7: Concept based semantic search

Traditional search methods vs. semantic search

[Figure: the same search for „A-Class“ against documents that contain „W 176“ and „A 250 Sport“, with the vocabulary fragment from the previous slide (prefLabel „A-Class“, altLabel „W 176“, narrower concept „A 250 Sport“).]

Traditional: Can the search phrase be found literally in the document?

Semantic: Can the search phrase be found analogously?
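The two questions can be contrasted in code. The following is a minimal sketch, assuming a tiny hand-written thesaurus fragment; the function names are hypothetical, and the expansion follows only one level of narrower concepts.

```python
# Hypothetical thesaurus fragment, following the A-Class example on the slides.
LABELS = {
    "http://voc.org.com/core/176": ["A-Class", "A-Klasse", "W 176"],
    "http://voc.org.com/core/44": ["A 250 Sport"],
}
NARROWER = {"http://voc.org.com/core/176": ["http://voc.org.com/core/44"]}


def literal_match(query: str, document: str) -> bool:
    """Traditional search: the phrase itself must occur in the document."""
    return query.lower() in document.lower()


def expand(query: str) -> set[str]:
    """Collect all labels of the queried concept and of its direct narrower concepts."""
    phrases = {query}
    for uri, labels in LABELS.items():
        if query.lower() in (label.lower() for label in labels):
            phrases.update(labels)
            for child in NARROWER.get(uri, []):
                phrases.update(LABELS[child])
    return phrases


def semantic_match(query: str, document: str) -> bool:
    """Semantic search: any label of the concept or its narrower concepts may occur."""
    return any(literal_match(phrase, document) for phrase in expand(query))


doc = "The new W 176 is also available as A 250 Sport."
```

For this document, `literal_match("A-Class", doc)` fails because the phrase never appears, while `semantic_match("A-Class", doc)` succeeds via the altLabel „W 176“.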

Page 8: Concept based semantic search

Semantics as a means for interpretation

Semantics helps to make different language levels or perspectives comparable.

Example: Vendors and their customers often speak different languages. The customers then have to do the ‚translation‘ and interpretation themselves, which is time-consuming and error-prone.

Example: Employees' levels of knowledge can differ widely. Semantics as a search assistant is especially helpful for less experienced colleagues.

[Figure: the search for „A-Class“ again, matching a document that contains „W 176“ and „A 250 Sport“ via the altLabel and the narrower concept.]

Page 9: Concept based semantic search

Concept-based high-precision facet classification

Concept-/thesaurus-based facet classification of documents is as precise as the classification scheme of the enterprise thesaurus itself. Because it takes all labels of a concept and its transitive hierarchical relations into account, it can classify more precisely than traditional term-based methods.

[Figure: two documents, #1 mentioning „Daimler-Benz“ and #2 mentioning „AMG“, next to the facet COMPANY with the counts Vehicle manufacturer (2), Daimler AG (2), AMG (1).]

Synonyms and hidden labels: #1 is also classified as ‚Daimler AG‘ because ‚Daimler-Benz‘ is an (old) name for ‚Daimler AG‘.

Transitivity: #2 is categorised as ‚vehicle manufacturer‘ too, because in our thesaurus ‚AMG‘ is narrower than (is part of) ‚Daimler‘, which is a ‚vehicle manufacturer‘.
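The facet counts in this example can be reproduced with a short sketch. It assumes a hand-written thesaurus fragment with a single broader concept per entry and no cycles; the two document texts are invented, and the names `facets`, `LABEL_TO_CONCEPT` and `BROADER` are hypothetical.

```python
from collections import Counter

# Hypothetical thesaurus fragment mirroring the example on the slide.
LABEL_TO_CONCEPT = {
    "daimler ag": "Daimler AG",
    "daimler-benz": "Daimler AG",        # hidden label (old company name)
    "amg": "AMG",
    "mercedes-amg": "AMG",
    "vehicle manufacturer": "Vehicle manufacturer",
}
# One broader concept per entry is a simplification; a thesaurus may allow several.
BROADER = {"AMG": "Daimler AG", "Daimler AG": "Vehicle manufacturer"}


def facets(text: str) -> set[str]:
    """Concepts found via any label, plus all their transitive broader concepts."""
    lowered = text.lower()
    found = {c for label, c in LABEL_TO_CONCEPT.items() if label in lowered}
    result = set()
    for concept in found:
        while concept is not None:       # walk up the hierarchy
            result.add(concept)
            concept = BROADER.get(concept)
    return result


docs = [
    "Report on Daimler-Benz and its history.",   # document #1 (invented text)
    "AMG presents a new engine.",                # document #2 (invented text)
]
counts = Counter()
for doc in docs:
    counts.update(facets(doc))
```

The resulting counts match the slide: both documents fall under Vehicle manufacturer and Daimler AG, and one under AMG, although neither text contains the words „Daimler AG“ or „vehicle manufacturer“.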

Page 10: Concept based semantic search


Similarity search: efficient re-use of existing information

Content authors as well as end users can benefit from similarity search (content recommendation), e.g. through ‚skim reading‘ or by avoiding duplicated work. Even if two documents have no words in common, a concept-based text analysis can classify them as similar.

[Figure: documents mentioning „Mercedes-AMG“, „W 176“, „AMG“ and „A 250 Sport“, linked to the concepts http://voc.org.com/core/77 (prefLabel „AMG“, altLabel „Mercedes-AMG“), http://voc.org.com/core/176 („A-Class“, „W 176“) and http://voc.org.com/core/44 („A 250 Sport“, narrower).]
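One way to make this concrete is to compare documents by the concepts they mention rather than by their words. The sketch below uses Jaccard overlap of concept sets, which is an assumption chosen for illustration; the slides do not name a similarity measure. The document texts are invented and the helper names are hypothetical.

```python
# Hypothetical label table and hierarchy, using the URIs shown in the figure.
LABEL_TO_CONCEPT = {
    "amg": "http://voc.org.com/core/77",
    "mercedes-amg": "http://voc.org.com/core/77",
    "a-class": "http://voc.org.com/core/176",
    "w 176": "http://voc.org.com/core/176",
    "a 250 sport": "http://voc.org.com/core/44",
}
BROADER = {"http://voc.org.com/core/44": "http://voc.org.com/core/176"}


def concepts(text: str) -> set[str]:
    """Concepts mentioned in the text, plus their direct broader concepts."""
    lowered = text.lower()
    found = {uri for label, uri in LABEL_TO_CONCEPT.items() if label in lowered}
    return found | {BROADER[uri] for uri in found if uri in BROADER}


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of concepts that both sets have in common (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


doc_a = "Mercedes-AMG tunes W 176"      # invented example text
doc_b = "AMG shows A 250 Sport"         # invented example text
similarity = jaccard(concepts(doc_a), concepts(doc_b))
```

The two texts share no word, yet they share the AMG concept and, through the broader link of „A 250 Sport“, the A-Class concept, so their concept-based similarity is well above zero.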

Page 11: Concept based semantic search


Topic Pages: Mashups for a fast 360° view

[Figure: a topic page composed of a short description, related concepts, geo search and articles. Articles (Twitter, videos etc.) can be retrieved from various content sources: API, http://, CMS.]

Page 12: Concept based semantic search

Linked Data: complex queries on top of standard technologies

Example: Find industry news which mentions countries or regions in which our export volume increased by more than 10% over the last 5 years, and which mentions either one of our products and/or a competitor.

[Figure: (federated) SPARQL queries combine two sources, industry news and export statistics.]
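The logic of the example query can be sketched in plain Python. This is only a stand-in for the join that a (federated) SPARQL query would express across the two sources; every record, the competitor name and the function names are invented for illustration.

```python
# All records below are invented for illustration only.
EXPORT_VOLUME = {                 # region -> (volume five years ago, volume now)
    "Austria": (100.0, 125.0),
    "Germany": (200.0, 205.0),
}
OUR_PRODUCTS = {"A-Class"}
COMPETITORS = {"ExampleMotors"}   # hypothetical competitor name

NEWS = [
    {"title": "A-Class sales in Austria", "mentions": {"Austria", "A-Class"}},
    {"title": "ExampleMotors expands in Germany", "mentions": {"Germany", "ExampleMotors"}},
    {"title": "Weather in Austria", "mentions": {"Austria"}},
]


def growth_regions(min_growth: float = 0.10) -> set[str]:
    """Regions whose export volume grew by more than min_growth over the period."""
    return {region for region, (old, new) in EXPORT_VOLUME.items()
            if old > 0 and (new - old) / old > min_growth}


def relevant_news() -> list[str]:
    """News that mentions a growth region and a product or a competitor."""
    regions = growth_regions()
    targets = OUR_PRODUCTS | COMPETITORS
    return [item["title"] for item in NEWS
            if item["mentions"] & regions and item["mentions"] & targets]
```

Only the first news item qualifies: Austria grew by 25%, and the item mentions a product. In a Linked Data setting the export figures and the news annotations would live in separate stores and be joined by the query engine rather than in application code.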

Page 13: Concept based semantic search

Conclusion 1: The three levels of semantic search

• Term-based search (2005, no standards): Semantics is calculated by text analysis. Example: Because „Dieter Zetsche“ frequently occurs together with „Daimler AG“ in a text, the algorithm assumes that the two phrases are somehow related. Term-based methods are less precise than the other two levels.
• Concept-based search (2011): Semantics is explicitly available through controlled vocabularies and thesauri. Thesauri are the basis for precise text analysis and for building a semantic index. Building knowledge models is especially cost-efficient for larger organisations, since a more precise search can be provided.
• Linked Data based search (2014): Semantics is explicitly available via linked knowledge models. Content from various sources and departments can be linked and mashed up on top of an explicit metadata layer. Complex queries which use data from many sources can be made with the standard query language SPARQL.

Year: the year in which the underlying technology will be/has been rolled out.
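The term-based level can be illustrated with a simple co-occurrence count. This is a minimal sketch, assuming the phrases to track are already known and using invented sentences; the `cooccurrence` function is a hypothetical name, and real systems use far more elaborate statistics.

```python
from collections import Counter
from itertools import combinations

PHRASES = ["Dieter Zetsche", "Daimler AG", "A-Class"]   # phrases to track (assumed given)

SENTENCES = [                                           # invented example sentences
    "Dieter Zetsche spoke for Daimler AG.",
    "Daimler AG confirmed that Dieter Zetsche will attend.",
    "The A-Class was presented.",
]


def cooccurrence(sentences: list[str], phrases: list[str]) -> Counter:
    """Count how often each pair of phrases appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        present = sorted(p for p in phrases if p in sentence)
        counts.update(combinations(present, 2))   # pairs in alphabetical order
    return counts


pair_counts = cooccurrence(SENTENCES, PHRASES)
```

A high count for the pair („Daimler AG“, „Dieter Zetsche“) suggests that the two are related, but it does not say how. That missing explicit relation is what the concept-based and Linked Data based levels add.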

Page 14: Concept based semantic search

Conclusion 2: Explicit metadata layer

Metadata:
• Stored and processed separately from data
• Metadata management is part of the enterprise information management strategy

[Figure: the departments HR, Marketing/Sales, Research and Production, each with its own data.]

Page 15: Concept based semantic search

Semantic Web Company GmbH
Mariahilfer Strasse 70/8
1070 Vienna
Austria

http://www.semantic-web.at/
http://poolparty.biz
http://twitter.com/semwebcompany

Andreas Blumauer, Managing [email protected]

“Thank you for your time and please forward any comments or questions to me to get more information on our product or linked data & vocabularies!”