concept based semantic search
© Semantic Web Company – http://www.semantic-web.at/ 1
Concept-based, semantic search
Andreas Blumauer, Semantic Web Company, www.semantic-web.at
Content/agenda
1. What does "concept-based" mean?
2. Concept-tagging
3. Semantic search
• Faceted search
• Similarity search
4. Semantics as a means for 'interpretation'
5. Topic pages
6. Three levels of semantic search
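What is a concept? The semiotic triangle
[Diagram: the semiotic triangle with the corners concept, object and label. The labels "A-Class", "A-Klasse" and "W 176" refer to a mental model of "A-Class" (the concept); another object corresponds to another mental model of "A-Class".]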
Concept-based enterprise vocabulary
Each concept has a unique URI and can have various multilingual labels. Additionally, it can have various types of semantic relations with other concepts. W3C's SKOS standard describes a predefined set of semantic relations especially for controlled vocabularies.
[Diagram: a vocabulary graph. Concept URIs shown: http://voc.org.com/core/54, http://voc.org.com/core/176, http://voc.org.com/core/44, http://voc.org.com/core/77, http://voc.org.com/core/355, http://voc.org.com/core/97. Labels shown: A-Class (prefLabel, en), A-Klasse (prefLabel, de), W 176 (altLabel), A 250 Sport (prefLabel), compact car, Daimler AG (prefLabel), Daimler-Benz (hiddenLabel), Vehicle manufacturing company (prefLabel), AMG (prefLabel), Mercedes-AMG (altLabel). Relation types shown: broader, narrower, related.]
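The concept model described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Concept`, the helper `narrower_of` and the `example.org` URIs are hypothetical and are not taken from the slide or from any SKOS library.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A SKOS-style concept: one URI, several labels, typed relations."""
    uri: str
    pref_labels: dict[str, str] = field(default_factory=dict)  # language -> label
    alt_labels: set[str] = field(default_factory=set)
    hidden_labels: set[str] = field(default_factory=set)
    broader: set[str] = field(default_factory=set)  # URIs of broader concepts
    related: set[str] = field(default_factory=set)  # URIs of related concepts

    def all_labels(self) -> set[str]:
        """Every string that may refer to this concept in a text."""
        return set(self.pref_labels.values()) | self.alt_labels | self.hidden_labels


# Hypothetical URIs in the example.org namespace, not the ones on the slide.
a_class = Concept(
    uri="http://example.org/voc/a-class",
    pref_labels={"en": "A-Class", "de": "A-Klasse"},
    alt_labels={"W 176"},
)
a250 = Concept(
    uri="http://example.org/voc/a-250-sport",
    pref_labels={"en": "A 250 Sport"},
    broader={a_class.uri},
)

vocabulary = {c.uri: c for c in (a_class, a250)}


def narrower_of(uri: str) -> set[str]:
    """Narrower is the inverse of broader, so it is derived, not stored."""
    return {c.uri for c in vocabulary.values() if uri in c.broader}
```

Storing only the broader direction and deriving narrower keeps the two relations from drifting apart; a real thesaurus tool may well store both.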
Concept-tagging vs. Term-tagging
[Diagram: content from the CMS is tagged in two ways. Concept tagging draws on the enterprise vocabulary; term tagging works on the text itself. 'Term-tags' become 'concepts' as part of the enterprise vocabulary.]
Concept-tagging uses concepts that are already part of the enterprise vocabulary and are therefore contextualised and linked to other concepts.
Term-tagging means that tags which are not yet part of the controlled vocabulary are extracted from the text automatically via text mining.
Term-tags can be inserted into the enterprise vocabulary, which extends and refines the vocabulary over time.
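The difference between the two kinds of tagging can be sketched as follows. The vocabulary, the function names and the capitalised-phrase heuristic are assumptions made for illustration; real text mining is far more elaborate than the regular expression used here.

```python
import re

# Hypothetical mini vocabulary: lower-cased label -> concept name.
VOCABULARY = {
    "a-class": "A-Class",
    "w 176": "A-Class",
    "a 250 sport": "A 250 Sport",
}


def concept_tags(text: str) -> set[str]:
    """Concepts whose labels occur in the text as whole phrases."""
    lowered = text.lower()
    return {
        concept
        for label, concept in VOCABULARY.items()
        if re.search(r"(?<!\w)" + re.escape(label) + r"(?!\w)", lowered)
    }


def term_tags(text: str) -> set[str]:
    """Crude stand-in for text mining: multi-word capitalised phrases
    that are not yet labels in the vocabulary."""
    candidates = re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+", text)
    return {c for c in candidates if c.lower() not in VOCABULARY}


def promote(term: str) -> None:
    """Insert a term-tag into the vocabulary as a new concept."""
    VOCABULARY[term.lower()] = term


text = "The W 176 was presented by Dieter Zetsche."
before = (concept_tags(text), term_tags(text))
promote("Dieter Zetsche")
after = (concept_tags(text), term_tags(text))
```

After `promote`, the former term-tag is found by `concept_tags`, which mirrors how the vocabulary is extended and refined.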
Concept-tagging: pre-condition for semantic search
[Diagram: a document containing "W 176" and "A 250 Sport", a search box, and the concept A-Class (prefLabel) with altLabel W 176 and narrower concept A 250 Sport (prefLabel).]
Traditional search methods vs. semantic search
[Diagram: two documents, one containing "W 176" and "A 250 Sport", the other containing "A 250 Sport", a search box, and the concept A-Class (prefLabel) with altLabel W 176 and narrower concept A 250 Sport (prefLabel).]
Traditional: Can the search phrase be found literally in the document?
Semantic: Can the search phrase be found analogously?
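The two questions can be contrasted in a short sketch. The thesaurus fragment follows the A-Class example from the slides; the function names and the query-expansion approach are assumptions chosen for illustration, not a description of a particular search engine.

```python
# Hypothetical thesaurus fragment based on the A-Class example.
LABELS = {
    "A-Class": {"A-Class", "A-Klasse", "W 176"},
    "A 250 Sport": {"A 250 Sport"},
}
NARROWER = {"A-Class": {"A 250 Sport"}}


def literal_match(query: str, document: str) -> bool:
    """Traditional search: is the phrase literally in the document?"""
    return query.lower() in document.lower()


def expand(query: str) -> set[str]:
    """All labels of the queried concept and of its narrower concepts.
    Falls back to the query itself if no concept carries that label."""
    wanted = query.lower()
    for concept, labels in LABELS.items():
        if wanted in {label.lower() for label in labels}:
            terms, seen, stack = set(), set(), [concept]
            while stack:
                current = stack.pop()
                if current in seen:
                    continue
                seen.add(current)
                terms |= LABELS.get(current, set())
                stack.extend(NARROWER.get(current, ()))
            return terms
    return {query}


def semantic_match(query: str, document: str) -> bool:
    """Semantic search: does any label of the concept family occur?"""
    return any(literal_match(term, document) for term in expand(query))


document = "The W 176 is now available as A 250 Sport."
```

Here `literal_match("A-Class", document)` fails while `semantic_match("A-Class", document)` succeeds, because the document only uses the alternative label and the narrower concept.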
Semantics as a means for interpretation
Semantics helps to make different levels of language and different perspectives comparable.
Example: Vendors and their customers quite often speak different languages. Error-prone and sometimes time-consuming 'translations' and interpretations then have to be done by the customers themselves.
Example: Employees' levels of knowledge can differ widely. Semantics as a search assistant can help less experienced colleagues in particular.
[Diagram: a document containing "A 250 Sport", a search box, and the concept A-Class (prefLabel) with altLabel W 176 and narrower concept A 250 Sport (prefLabel).]
Concept-based high-precision facet classification
Concept-/thesaurus-based facet classification of documents is as precise as the classification scheme used by the enterprise thesaurus itself. By taking into account all labels of each concept and the transitive hierarchical relations between concepts, a more precise facet classification can be achieved than with traditional term-based methods.
[Diagram: document #1 mentions "Daimler-Benz"; document #2 mentions "AMG".]
COMPANY
• Vehicle manufacturer (2)
• Daimler AG (2)
• AMG (1)
Synonyms and hidden labels: #1 is also classified as 'Daimler AG' because 'Daimler-Benz' is an (old) name for 'Daimler AG'.
Transitivity: #2 is also categorised as 'vehicle manufacturer', because in our thesaurus 'AMG' is narrower than (is part of) 'Daimler', which is a 'vehicle manufacturer'.
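The facet counts on this slide can be reproduced with a small sketch that combines label matching with a transitive walk along broader relations. The two document texts, the dictionary layout and the function names are hypothetical; plain substring matching is used for brevity and would need word-boundary handling in practice.

```python
from collections import Counter

# Labels per concept; "Daimler-Benz" plays the role of the hidden label.
LABELS = {
    "Daimler AG": {"Daimler AG", "Daimler-Benz"},
    "AMG": {"AMG", "Mercedes-AMG"},
    "Vehicle manufacturer": {"Vehicle manufacturer"},
}
# Direct broader relations; the transitive closure is computed below.
BROADER = {"AMG": {"Daimler AG"}, "Daimler AG": {"Vehicle manufacturer"}}


def ancestors(concept: str) -> set[str]:
    """All concepts reachable via broader relations (transitive)."""
    seen, stack = set(), [concept]
    while stack:
        for parent in BROADER.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def facets(document: str) -> set[str]:
    """Concepts mentioned via any label, plus all their ancestors."""
    lowered = document.lower()
    direct = {
        concept
        for concept, labels in LABELS.items()
        if any(label.lower() in lowered for label in labels)
    }
    result = set(direct)
    for concept in direct:
        result |= ancestors(concept)
    return result


# Hypothetical document texts standing in for #1 and #2.
documents = {
    "#1": "An old press release by Daimler-Benz.",
    "#2": "AMG presents a new engine.",
}
facet_counts = Counter(f for text in documents.values() for f in facets(text))
```

With these inputs the counts match the facet list above: two documents for "Vehicle manufacturer" and "Daimler AG", one for "AMG".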
Similarity search: efficient re-use of existing information
Content authors as well as end users can benefit from similarity search (content recommendation), e.g. through 'skim reading' or by avoiding duplicated work. Even if two documents have no words in common, they can be classified as similar when using concept-based text analysis.
[Diagram: documents mentioning "Mercedes-AMG", "W 176", "AMG" and "A 250 Sport", linked through the vocabulary: AMG (prefLabel) with altLabel Mercedes-AMG, and A-Class with label W 176 and narrower concept A 250 Sport. Concept URIs shown: http://voc.org.com/core/77, http://voc.org.com/core/176, http://voc.org.com/core/44.]
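A minimal sketch of concept-based similarity, assuming that each document is reduced to the set of concepts it mentions and that the sets are compared with the Jaccard coefficient. The Jaccard measure, the two sample texts and all names are assumptions for illustration; the slide does not say which similarity measure is used.

```python
# Hypothetical label index: lower-cased label -> concept.
LABEL_TO_CONCEPT = {
    "amg": "AMG",
    "mercedes-amg": "AMG",
    "a-class": "A-Class",
    "w 176": "A-Class",
    "a 250 sport": "A 250 Sport",
}
BROADER = {"A 250 Sport": "A-Class"}


def concepts_in(text: str) -> set[str]:
    """Concepts mentioned in the text, plus their direct broader concept."""
    lowered = text.lower()
    found = {c for label, c in LABEL_TO_CONCEPT.items() if label in lowered}
    return found | {BROADER[c] for c in found if c in BROADER}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


doc_a = "Mercedes-AMG tunes the W 176"
doc_b = "AMG announces A 250 Sport"

shared_words = set(doc_a.lower().split()) & set(doc_b.lower().split())
similarity = jaccard(concepts_in(doc_a), concepts_in(doc_b))
```

The two sample texts share no words, yet they share the concepts AMG and A-Class, so the concept-based score is well above zero.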
Topic pages: mashups for a fast 360° view
• Short description
• Related concepts
• Geo search
Articles (Twitter, videos etc.) can be retrieved from various content sources.
[Diagram: content sources labelled API, http:// and CMS.]
Linked Data: complex queries on top of standard technologies
Example: Find industry news that mentions countries or regions in which our export volume increased by more than 10% over the last 5 years, and that mentions one of our products and/or a competitor.
[Diagram: (federated) SPARQL queries across two data sources, industry news and export statistics.]
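The logic of the example query can be sketched in plain Python as a join between two data sources. In a Linked Data setting this would be expressed as a (federated) SPARQL query; the in-memory data, the figures, the names and the product and competitor lists below are all made up for illustration.

```python
# Hypothetical export statistics: region -> {year: export volume}.
EXPORTS = {
    "Austria": {2009: 100.0, 2014: 125.0},
    "Italy": {2009: 100.0, 2014: 104.0},
}

# Hypothetical, already concept-tagged industry news.
NEWS = [
    {"title": "Compact cars gain ground in Austria",
     "regions": {"Austria"}, "mentions": {"A-Class"}},
    {"title": "Italian market update",
     "regions": {"Italy"}, "mentions": {"Competitor X"}},
    {"title": "Austrian tourism report",
     "regions": {"Austria"}, "mentions": set()},
]

OUR_PRODUCTS = {"A-Class"}
COMPETITORS = {"Competitor X"}


def growth(series: dict[int, float]) -> float:
    """Relative change between the first and the last year in the series."""
    first, last = min(series), max(series)
    return (series[last] - series[first]) / series[first]


def growing_regions(threshold: float = 0.10) -> set[str]:
    """Regions whose export volume grew by more than the threshold."""
    return {region for region, s in EXPORTS.items() if growth(s) > threshold}


def relevant_news() -> list[str]:
    """News that mention a growing region and a product or competitor."""
    regions = growing_regions()
    watched = OUR_PRODUCTS | COMPETITORS
    return [
        item["title"]
        for item in NEWS
        if item["regions"] & regions and item["mentions"] & watched
    ]
```

The join only works because both sources refer to regions and products by shared identifiers, which is what the explicit metadata layer provides.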
Conclusion 1: The three levels of semantic search
• Term-based search (2005, no standards): Semantics is calculated by text analysis. Example: Because "Dieter Zetsche" frequently occurs together with "Daimler AG" in a text, the algorithm assumes that the two phrases are somehow related. Term-based methods are less precise than the other two levels.
• Concept-based search (2011): Semantics is explicitly available through controlled vocabularies and thesauri. Thesauri are the basis for precise text analysis and for building a semantic index. Building knowledge models is especially cost-efficient for larger organisations, since a more precise search can be provided.
• Linked Data based search (2014): Semantics is explicitly available via linked knowledge models. Content from various sources and departments can be linked and mashed up on top of an explicit metadata layer. Complex queries that use data from many sources can be made with the standard query language SPARQL.
Each year is the year in which the underlying technology will be/has been rolled out.
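The term-based level can be illustrated with a simple co-occurrence count: phrases that often appear in the same sentence are assumed to be related. The phrase list, the sample sentences and the function name are hypothetical, and real systems use statistical weighting rather than raw counts.

```python
from collections import Counter
from itertools import combinations

# Phrases to track; in practice these would come from text mining.
PHRASES = ["Dieter Zetsche", "Daimler AG", "AMG"]


def cooccurrence(sentences: list[str]) -> Counter:
    """Count how often each pair of phrases shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        present = sorted(p for p in PHRASES if p in sentence)
        counts.update(combinations(present, 2))
    return counts


# Made-up sample sentences for illustration only.
sentences = [
    "Dieter Zetsche spoke at the Daimler AG press conference.",
    "Daimler AG confirmed that Dieter Zetsche will attend.",
    "AMG unveiled a new model.",
]
pairs = cooccurrence(sentences)
```

The pair ("Daimler AG", "Dieter Zetsche") is counted twice here, but nothing in the count says how the two are related, which is why this level is less precise than the other two.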
Conclusion 2: Explicit metadata layer
Metadata:
• Stored and processed separately from data
• Metadata management is part of the enterprise information management strategy
[Diagram: a metadata layer on top of the data of four departments: HR, Marketing/Sales, Research, Production.]
Semantic Web Company GmbH, Mariahilfer Strasse 70/8, 1070 Vienna, Austria
http://www.semantic-web.at/ http://poolparty.biz
http://twitter.com/semwebcompany
Andreas Blumauer, Managing [email protected]
“Thank you for your time and please forward any comments or questions to me to get more information on our product or linked data & vocabularies!”