WP2 2nd Review
![Page 1: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/1.jpg)
Combining Human and Computational Intelligence
Ilya Zaihrayeu, Pierre Andrews, Juan Pane
![Page 2: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/2.jpg)
Semantic annotation lifecycle
User
free text annotations
What if the users could use semantic annotations instead to leverage semantic technology services?
Semantic annotation = structure and/or meaning
Reasoning, semantic search, …
Context
Problem 1: help the user find and understand the meaning of semantic annotations
Problem 2: extract (semantic) annotations from the context of the user resource at publishing
Problem 3: QoS of semantics-enabled services
Problem 4: semi-automatic semantification of existing annotations
4/14/2011
![Page 3: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/3.jpg)
Index: meaning summarization
User
Reasoning, semantic search, …
Problem 1: help the user find and understand the meaning of semantic annotations
![Page 4: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/4.jpg)
Meaning summarization: why?
• The right meaning of the words used for an annotation is in the mind of the people using them
• E.g., Java:
  – an island in Indonesia south of Borneo; one of the world's most densely populated regions
  – a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee"
  – a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine
• Descriptions are too long for the user to grasp the meaning immediately: too high a barrier to start generating semantic annotations
island
beverage
programming language
![Page 5: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/5.jpg)
Meaning summarization: an example
One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms, and hypernym terms
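A minimal sketch of how a short summary could be picked for each sense, assuming a toy knowledge base in which every sense lists its synonyms, hypernyms, and definition. The `SENSES` data, the function names, and the preference order (hypernym first, then synonym, then a definition word) are illustrative assumptions, not the algorithm evaluated in these slides.

```python
# Toy knowledge base: entries and field names are illustrative assumptions.
SENSES = {
    "java": [
        {"synonyms": [], "hypernyms": ["island"],
         "definition": "an island in Indonesia south of Borneo"},
        {"synonyms": ["coffee"], "hypernyms": ["beverage"],
         "definition": "a beverage consisting of an infusion of ground coffee beans"},
        {"synonyms": [], "hypernyms": ["programming language"],
         "definition": "a simple platform-independent object-oriented programming language"},
    ]
}


def summarize(term, sense, used):
    """Pick a short label for one sense, skipping labels already taken."""
    candidates = sense["hypernyms"] + sense["synonyms"] + sense["definition"].split()
    for candidate in candidates:
        if candidate != term and candidate not in used:
            return candidate
    return term  # nothing better found, fall back to the term itself


def summaries(term):
    """Return one distinct label per sense of `term`, in sense order."""
    used = []
    for sense in SENSES.get(term, []):
        used.append(summarize(term, sense, used))
    return used


print(summaries("java"))  # ['island', 'beverage', 'programming language']
```

Tracking the labels already used keeps the summaries of different senses distinct, which is what lets a user tell the senses apart at a glance.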
![Page 6: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/6.jpg)
Meaning summarization: evaluation results
Best precision: 63%
Discriminating power: 76.4%
If we talk about java, does the word coffee mean the same as island?
![Page 7: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/7.jpg)
Index: gold standard dataset
User
Reasoning, semantic search, …
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations
In order to evaluate the performance of the algorithms, a gold standard dataset is needed
![Page 8: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/8.jpg)
Proposed Approach
Create a gold standard of a folksonomy with senses
Tag → Tokens → Senses (steps: preprocessing, disambiguation)
Tag: javaisland
Tokens: "Java island", "Java is land", …
Senses:
  Java – an island in Indonesia to the south of Borneo
  Island – a land mass that is surrounded by water
Reported accuracies: 80% and 59%
# of annotations: 4,296
Unique tags: 857
Unique URLs: 644
Unique users: 1,194
Annotator agreement: 81%
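The preprocessing step, which turns a concatenated tag such as `javaisland` into candidate token sequences, can be sketched as a dictionary-based segmentation. This is only an illustration: the `split_tag` function and the four-word vocabulary are assumptions, and the slides do not say how the actual preprocessing is implemented.

```python
def split_tag(tag, vocabulary):
    """Return every way to segment `tag` into words found in `vocabulary`."""
    tag = tag.lower()
    if not tag:
        return [[]]  # one way to split the empty string: no tokens
    splits = []
    for end in range(1, len(tag) + 1):
        head = tag[:end]
        if head in vocabulary:
            # Keep this prefix and segment the remainder recursively.
            for rest in split_tag(tag[end:], vocabulary):
                splits.append([head] + rest)
    return splits


# Hypothetical vocabulary; a real system would use a full lexicon.
VOCABULARY = {"java", "island", "is", "land"}

print(split_tag("javaisland", VOCABULARY))
# [['java', 'is', 'land'], ['java', 'island']]
```

Each candidate split then has to be ranked or validated, since only one of them reflects what the tagger meant.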
![Page 9: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/9.jpg)
A Platform for Gold Standards of Semantic Annotation Systems
• Manual validation
• RDF export
• Evaluation of
  – Preprocessing
  – WSD
  – BoW Search
  – Convergence
• Open source: http://sourceforge.net/projects/tags2con/
7 modules, 25K lines of code, 26% of comments
![Page 10: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/10.jpg)
Delicious RDF Dataset @ LOD cloud
Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
# triples: 85,908
Outlinks to LOD cloud (WN synsets): 651
![Page 11: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/11.jpg)
Index: QoS for semantic search
User
Reasoning, semantic search, …
Problem 3: QoS of semantics-enabled services?
![Page 12: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/12.jpg)
Semantic search: why?
• With free text search, the following problems may reduce precision and recall:
  – synonymy problem: searching for “images” should return resources annotated with “picture”
  – polysemy problem: searching for “java” (island) should not return resources annotated with “java” (coffee beverage)
  – specificity gap problem: searching for “animals” should also return resources annotated with “dogs”
• Semantic, meaning-based search can address the problems listed above
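The three problems above can be illustrated with a small sketch in which resources are annotated with concepts rather than strings. The concept identifiers, the synonym sets, the is-a link, and the resource names are all made up for the example; this is not the search engine evaluated in the slides.

```python
# Toy vocabulary: concept ids, synonym sets, and is-a links are assumptions.
SYNSETS = {
    "image#1": {"image", "picture"},
    "java#island": {"java"},
    "java#coffee": {"java"},
    "animal#1": {"animal"},
    "dog#1": {"dog"},
}
IS_A = {"dog#1": "animal#1"}  # child concept -> parent concept

# Hypothetical resources annotated with concepts.
ANNOTATIONS = {
    "photo.jpg": {"image#1"},
    "bali-trip.html": {"java#island"},
    "espresso.html": {"java#coffee"},
    "puppies.html": {"dog#1"},
}


def concepts_for(word):
    """Concepts a word may denote: several for a polysemous word."""
    return sorted(c for c, words in SYNSETS.items() if word in words)


def is_subsumed(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def semantic_search(query_concept):
    """Resources annotated with the query concept or a more specific one."""
    return sorted(
        resource
        for resource, concepts in ANNOTATIONS.items()
        if any(is_subsumed(c, query_concept) for c in concepts)
    )


print(concepts_for("picture"))         # synonymy: ['image#1']
print(semantic_search("java#island"))  # polysemy: ['bali-trip.html']
print(semantic_search("animal#1"))     # specificity gap: ['puppies.html']
```

Synonyms collapse onto one concept, a polysemous word yields several concepts to choose from, and the is-a walk retrieves resources annotated with more specific concepts.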
![Page 13: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/13.jpg)
Semantics vs Folksonomy
Specificity Gap (SG)
Semantic search: complete and correct results (the baseline)
Recall goes down as the specificity gap increases
[Figure: a user submits a query; results are resources linked to the query through their annotations. Example concepts: vehicle, car, taxi, with specificity gaps SG=1 and SG=2. Labels: "correct and complete", "Specificity Gap (SG)".]
javaisland: used to build “raw” queries
java island: used to build BoW queries
Java(island) island(land): used to build semantic queries
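One way to read the specificity gap is as the number of is-a steps between the query concept and the annotation concept. The sketch below uses that reading, together with a three-concept hierarchy and three hypothetical resources, to show why string matching loses recall relative to the semantic baseline; both the definition and the data are assumptions made for illustration.

```python
# Toy hierarchy, child -> parent (assumed for the example).
IS_A = {"taxi": "car", "car": "vehicle"}

# Hypothetical resources, each annotated with one concept.
ANNOTATIONS = {"r1": "vehicle", "r2": "car", "r3": "taxi"}


def specificity_gap(query, annotation):
    """Number of is-a steps from `annotation` up to `query`, or None if unrelated."""
    steps, concept = 0, annotation
    while concept is not None:
        if concept == query:
            return steps
        concept = IS_A.get(concept)
        steps += 1
    return None


def recall_of_exact_match(query):
    """Recall of string matching, taking the semantic results as the baseline."""
    relevant = {r for r, a in ANNOTATIONS.items()
                if specificity_gap(query, a) is not None}
    found = {r for r, a in ANNOTATIONS.items() if a == query}
    return len(found) / len(relevant) if relevant else 0.0


print(specificity_gap("vehicle", "car"))   # 1
print(specificity_gap("vehicle", "taxi"))  # 2
print(recall_of_exact_match("car"))        # 0.5
```

Exact matching only finds annotations at gap 0, so every relevant resource annotated at a larger gap counts against its recall.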
![Page 14: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/14.jpg)
Index: semantic convergence
User
Reasoning, semantic search, …
Problem 4: semi-automatic semantification of existing annotations
![Page 15: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/15.jpg)
Semantic convergence: Why?
“General” domains: cooking, travel, education
  With a WN sense: 71%
  Missing sense: 15%
  Cannot decide: 5%
  I don't know: 4%
  Other: 3%
  Abbreviation: 2%
Random: programming and web domain
  With a WN sense: 49%
  Missing sense: 35%
  Cannot decide: 6%
  Abbreviation: 5%
  I don't know: 3%
  Other: 1%
Ajax, Mac, Apple, CSS, …
![Page 16: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/16.jpg)
Semantic convergence: proposed solution
• Find new senses of terms
  – Find different senses of the same term (word sense)
  – Find synonyms of a term (synonym sets, or synsets)
• Place the new synset in the vocabulary is-a hierarchy
• What we improve
  – Better use of Machine Learning techniques
  – The polysemy issue is not considered in the state of the art
  – Missing or “subjective” evaluations in the state of the art
• Evaluation using the Delicious dataset
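Finding different senses of the same term can be sketched as clustering the bookmarks that carry the term by the other tags they share. The greedy Jaccard clustering, the threshold, and the bookmark data below are stand-ins chosen for brevity; the slides do not describe the clustering method that was actually used.

```python
def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def induce_senses(term, bookmarks, threshold=0.3):
    """Greedily cluster bookmarks tagged with `term` by their co-occurring tags."""
    clusters = []  # each cluster: {"tags": set of tags, "bookmarks": list of names}
    for name, tags in bookmarks.items():
        if term not in tags:
            continue
        context = tags - {term}
        for cluster in clusters:
            if jaccard(context, cluster["tags"]) >= threshold:
                cluster["tags"] |= context
                cluster["bookmarks"].append(name)
                break
        else:
            # No existing cluster is similar enough: start a new candidate sense.
            clusters.append({"tags": set(context), "bookmarks": [name]})
    return [c["bookmarks"] for c in clusters]


# Hypothetical bookmarks, each with the set of tags users attached to it.
BOOKMARKS = {
    "B1": {"java", "programming", "code"},
    "B2": {"java", "programming", "tutorial"},
    "B3": {"java", "travel", "indonesia"},
    "B4": {"java", "indonesia", "island"},
}

print(induce_senses("java", BOOKMARKS))  # [['B1', 'B2'], ['B3', 'B4']]
```

Each resulting cluster is a candidate sense of the term, which would still need to be matched against existing senses or placed in the is-a hierarchy.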
![Page 17: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/17.jpg)
Convergence Evaluation: Finding Senses
Tag Collocation, User Collocation
[Figure: two graphs over bookmarks B1 to B4, tags t1 to t5, and users U1 and U2]
Precision: 56%, Recall: 73%
Random Baseline
Precision: 42%, Recall: 29%
Precision: 57%, Recall: 68%
![Page 18: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/18.jpg)
Semantic annotation lifecycle
User
free text annotations
What if the users could use semantic annotations instead to leverage semantic technology services?
Semantic annotation = structure and/or meaning
Reasoning, semantic search, …
Context
Problem 1: help the user understand the meaning of semantic annotations?
Problem 2: extract (semantic) annotations from the context of the user resource at publishing?
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations
Combining human and computational intelligence
Conclusions
![Page 19: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/19.jpg)
Conclusions
• We developed and evaluated a meaning summarization algorithm
• We developed a “semantic folksonomy” evaluation platform
• We studied the effect of semantics on social tagging systems:
  – how much can semantics help?
  – how much does the user need to be involved?
  – how can human and computer intelligence be combined in the generation and consumption of semantic annotations?
• We developed and evaluated a knowledge base enrichment algorithm
• We built and used a gold standard dataset for evaluating:
  – Word Sense Disambiguation
  – Tag Preprocessing
  – Semantic Search
  – Semantic Convergence
![Page 20: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/20.jpg)
Integration with the use cases
![Page 21: WP2 2nd Review](https://reader036.vdocuments.site/reader036/viewer/2022062319/55503e28b4c905b2788b473c/html5/thumbnails/21.jpg)
Publications
• Semantic Disambiguation in Folksonomy: a Case Study. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; Advanced Language Technologies for Digital Libraries, Springer's LNCS
• Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; ESWC 2011
• A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; Semantic Web Journal, second review phase
• Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; IJCAI-LHD 2011, under review
• Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane; in preparation