extracting semantics from crowds

58
Knowledge Management Institute Extracting Semantics from Crowds Extracting Semantics from Crowds International Summer School on Semantic Computing (SSSC2011) International Summer School on Semantic Computing (SSSC 2011) August 8 - 12, 2011, Berkeley, California Markus Strohmaier Markus Strohmaier Assistant Professor Graz University of Technology, Austria Graz University of Technology, Austria Visiting Scientist (XEROX) PARC, USA with collaborators D. Helic, C. Körner, J. Pöschko, R. Kern, C. Trattner, C. Wagner (Graz University of 1 Markus Strohmaier 2011 Technology, Austria), D. Benz, G. Stumme (U. of Kassel, Germany), A. Hotho (U. of Würzburg, Germany), S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)

Upload: markus-strohmaier

Post on 27-Jan-2015

125 views

Category:

Education


2 download

DESCRIPTION

Talk at the IEEE International Summer School on Semantic Computing at UC Berkeley 2011

TRANSCRIPT

Page 1: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from CrowdsExtracting Semantics from Crowds

International Summer School on Semantic Computing (SSSC’2011)International Summer School on Semantic Computing (SSSC 2011)August 8 - 12, 2011, Berkeley, California

Markus StrohmaierMarkus Strohmaier

Assistant ProfessorGraz University of Technology, AustriaGraz University of Technology, Austria

Visiting Scientist(XEROX) PARC, USA

with collaborators D. Helic, C. Körner, J. Pöschko, R. Kern, C. Trattner, C. Wagner (Graz University of

1

Markus Strohmaier 2011

Technology, Austria), D. Benz, G. Stumme (U. of Kassel, Germany), A. Hotho (U. of Würzburg, Germany), S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)

Page 2: Extracting semantics from crowds

Knowledge Management Institute

Egypt 2011Egypt 2011Twitter Trends place

event

person

2

Markus Strohmaier 2011

Page 3: Extracting semantics from crowds

Knowledge Management Institute

Semantic Structures: The DMOZ Project

3

Markus Strohmaier 2011

Page 4: Extracting semantics from crowds

Knowledge Management Institute

Classification SystemsClassification Systems in Information and Library Sciences

Usually produced and maintained by few(e g dozens of) domain experts(e.g. dozens of) domain experts.

but: used by many (potentially millions).

Can a very large group (a crowd) of userscontribute to ontology engineering efforts?contribute to ontology engineering efforts?

4

Markus Strohmaier 2011

Page 5: Extracting semantics from crowds

Knowledge Management Institute

ObjectivesP id ( ) t th f ll i tiProvide (some) answers to the following questions:

• What is the difference between extracting semantics from textWhat is the difference between extracting semantics from text vs. extracting semantics from crowds?

• Why should we study crowds and crowd behavior from a ti ti ti ?semantic computing perspective?

• How can semantics be extracted from online crowd behavior, such as

• …from Social Labeling• …from Social Tagging• …from Social Navigationo Soc a a ga o

• What are the implications for semantic computing research?

5

Markus Strohmaier 2011

Page 6: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from …

Motivation

Social Labeling• Hashtag Semantics

Social Tagging• Tag Relatedness

Tag Generality

Extraction

• Tag Generality• Tag Hierarchies

Social NavigationSocial Navigation• Navigational Knowledge Engineering

Interaction

6

Markus Strohmaier 2011

g g

Page 7: Extracting semantics from crowds

Knowledge Management Institute

Ontology engineering vs. learning

O t l i i• Ontology engineering• Expert-driven, small-scale• Knowledge and concept identificationKnowledge and concept identification• Preliminary informal representation• Formalization (RDF, OWL, etc)

E l ti d M i t• Evaluation and Maintenance

• Ontology learning• Data driven large scale• Data-driven, large-scale• Source selection and data sampling• Data exploration and probing• Concept and link learning• Evaluation and Updating

7

Markus Strohmaier 2011

Page 8: Extracting semantics from crowds

Knowledge Management Institute

Distrib tional SemanticsDistributional Semantics[Hovy 2011]

I t l i NLP d IR h t t d• In recent years, people in NLP and IR have started using a different representation for all kinds of problems: word distributionsproblems: word distributions

[Harris 1954]: “wordsthat occur in thesame contexts tendto have similarto have similarmeanings“

• Statistical NLP operates at word level, frequency distributions are used as the (de facto) semantics of a word( )

• Not actual semantics, but captures something of contents• Not compositional: how to ‘add’ two distributions?

8

Markus Strohmaier 2011

• No explicit theory of semantics

[Hovy 2011] Based on slides by Ed Hovy, USC, Reasoning with Text Workshop 2011 (link)

Page 9: Extracting semantics from crowds

Knowledge Management Institute

Wh apple is similiar to pearWhy apple is similiar to pear[based on slides by Hovy 2011 / Pantel 02]

9

Markus Strohmaier 2011Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining(KDD '02). ACM, New York, NY, USA, 613-619.

Page 10: Extracting semantics from crowds

Knowledge Management Institute

Wh apple is not similiar to toothbr shWhy apple is not similiar to toothbrush[based on slides by Hovy 2011 / Pantel 02]

10

Markus Strohmaier 2011Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining(KDD '02). ACM, New York, NY, USA, 613-619.

Page 11: Extracting semantics from crowds

Knowledge Management Institute

Distrib tional SemanticsDistributional Semantics[based on slides by Hovy 2011 / Pantel 02]

11

Markus Strohmaier 2011Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining(KDD '02). ACM, New York, NY, USA, 613-619.

Page 12: Extracting semantics from crowds

Knowledge Management Institute

H t P ttHearst PatternsM. Hearst, Automatic Acquisition of Hyponyms from Large Text Corpora 1992

(S1) The bow lute, such as the Bambara ndang, is plucked and has an individualcurved neck for each string.

12

Markus Strohmaier 2011

Page 13: Extracting semantics from crowds

Knowledge Management Institute

Ontology Learning From Text

13

Markus Strohmaier 2011

Page 14: Extracting semantics from crowds

Knowledge Management Institute

Evaluation

14

Markus Strohmaier 2011

A SURVEY OF ONTOLOGY EVALUATION TECHNIQUES, Janez Brank, Marko Grobelnik, Dunja Mladenić, Proceedings of the Conference on Data Mining and Data Warehouses (SiKDD 2005), 2005

Page 15: Extracting semantics from crowds

Knowledge Management Institute

Limitations of Knowledge Extraction from TextLimitations of Knowledge Extraction from Text(or why we want to study semantics in crowds)

• DelayTi it t k t it (hi h lit ) t t t i•Time it takes to write (high quality) text on a topic

• Population bias and dependence• Authors´ demographicsAuthors demographics• study language of target groups (e.g. software developers)

• Styley•Internal monologue vs. conversation with others

• Topical bias• General Fiction, Mystery, Science Fiction, Romance, Humor, ..• Books, newspapers, magazines [Brown Corpus]

15

Markus Strohmaier 2011Brown Corpus: http://icame.uib.no/brown/bcm.html

Page 16: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from CrowdsExtracting Semantics from Crowds (i.e. Social Media)

How can we tap into users interaction with data and with each other for the extraction of semantic structures?

16

Markus Strohmaier 2011

Page 17: Extracting semantics from crowds

Knowledge Management Institute

Crowdsourcing

“Crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (andemployees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.” p

[J. Howe 2006][J. Howe 2006]

17

Markus Strohmaier 2011

Page 18: Extracting semantics from crowds

Knowledge Management Institute

Crowdsourcing Semantics:Crowdsourcing Semantics:An Experiment (2008)

L C t d 3000 h tLoC posted 3000 photos to flickr:

24 hours after launch:24 hours after launch:• over 4,000 unique tags • about 19,000 tags added

Output? noisy, unstructured, unqualified, weak semantics

How can this data contribute to the induction of semantic structures?

18

Markus Strohmaier 2011

weak semantics

Page 19: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from Crowds

Vi iVision: Utilizing online behavior of crowds for the construction maintenance and enrichmentthe construction, maintenance and enrichment of large-scale semantic structures.

Mission:• to model behavior of large numbers (millions) of users online• to develop techniques and algorithms that acquire semantic structures

from users‘ interaction with data and with each other• to influence user behavior and emerging semantics• to influence user behavior and emerging semantics• to evaluate results

19

Markus Strohmaier 2011

Page 20: Extracting semantics from crowds

Knowledge Management Institute

Activities of Crowds OnlineActivities of Crowds Online

Users engage in…

Labeling Tagging Navigation

Can we tap into the outcome of these activities to extract and evaluate semantic structures?

20

Markus Strohmaier 2011

to extract and evaluate semantic structures?

Page 21: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from Crowds

Motivation

Social Labeling• Hashtag Semantics

Social Tagging• Tag Relatedness

Tag Generality• Tag Generality• Tag Hierarchies

Social NavigationSocial Navigation• Navigational Knowledge Engineering

21

Markus Strohmaier 2011

g g

Page 22: Extracting semantics from crowds

Knowledge Management Institute

Social LabelingSocial LabelingExample: Twitter

users label short messages with concepts (hashtags)

Whether hashtags behave as strong identifiers, and if they could be mapped to concept identifiers in the Semantic Web (URIs)?

[Laniado and Mika 2010][Laniado and Mika 2010]

Craig‘s talk this morning: Background knowledge for URLs to explore other URLs

22

Markus Strohmaier 2011

Craig s talk this morning: Background knowledge for URLs, to explore other URLs

Page 23: Extracting semantics from crowds

Knowledge Management Institutedeveloped by

eam

s,

010)

,

ConceptNet

ocia

l Aw

aren

ess

Stre

Con

fere

nce

(WW

W20

al S

truc

ture

s fr

om S

oon

al W

orld

Wid

e W

eb C

ing

Late

nt C

once

ptua

with

the

19th

Inte

rnat

iow

eeto

nom

ies:

Acq

uiri

2010

), in

con

junc

tion

wer

, The

Wis

dom

in T

wW

orks

hop

(Sem

Sea

rch2

26-3

0, A

CM

, 201

0.

Wag

ner,

M. S

trohm

aie

man

tic S

earc

h 20

10 W

eigh

, NC

, US

A, A

pril

2

23

Markus Strohmaier 2011

C. W

Sem

Ral

e

with J. Pöschko and C.Wagner

Page 24: Extracting semantics from crowds

Knowledge Management Institutedeveloped by

eam

s,

010)

,

ConceptNet

ocia

l Aw

aren

ess

Stre

Con

fere

nce

(WW

W20

al S

truc

ture

s fr

om S

oon

al W

orld

Wid

e W

eb C

Ev-ent

Ev-ent

ing

Late

nt C

once

ptua

with

the

19th

Inte

rnat

iow

eeto

nom

ies:

Acq

uiri

2010

), in

con

junc

tion

w

Can we qualify semantic associations in such a network?

er, T

he W

isdo

m in

Tw

Wor

ksho

p (S

emS

earc

h226

-30,

AC

M, 2

010.

e.g. X is more general than Y, X is a subconcept of Y W

agne

r, M

. Stro

hmai

em

antic

Sea

rch

2010

Wei

gh, N

C, U

SA

, Apr

il 2

24

Markus Strohmaier 2011

p

C. W

Sem

Ral

e

with J. Pöschko and C.Wagner

Page 25: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from Crowds

Motivation

Social Labeling• Hashtag Semantics

Social Tagging• Tag Relatedness

Tag Generality• Tag Generality• Tag Hierarchies

Social NavigationSocial Navigation• Navigational Knowledge Engineering

25

Markus Strohmaier 2011

g g

Page 26: Extracting semantics from crowds

Knowledge Management Institute

Social TaggingSocial TaggingExample: Delicious

ResourcesUsers label and categorize

resources with concepts (tags)

U

Tags

is a tuple F:= (U, T, R, Y) whereth th di j i t fi it t U T R d t user 1

Users

• the three disjoint, finite sets U, T, R correspond to– a set of persons or users u ∈ U – a set of tags t ∈ T and

user 1

– a set of resources or objects r ∈ R

• Y ⊆ U ×T ×R, called set of tag assignmentstag 1 res. 1

26

Markus Strohmaier 2011

Page 27: Extracting semantics from crowds

Knowledge Management Institute

Tag Relatedness

t1 t2 t3

ResCont

r1 r2

Different tag similiarity

measures applied to del.icio.us

27

Markus Strohmaier 2011C. Cattuto, D. Benz, A. Hotho, G. Stumme, Semantic Grounding of Tag Relatedness in Social Bookmarking Systems, 7th International Semantic Web Conference ISWC2008, LNCS 5318, 615-631 (2008).

Page 28: Extracting semantics from crowds

Knowledge Management Institute

Intuition:Intuition:Latent Hierachical Structures

high centrality:more abstract

[Strohmaier et al 2011][ ]

[Heyman and Garcia-Molina 2006]

low centrality:more specific

Note: Betweeness centrality is usually difficult to calculate. Calculating all shortest path is usually O(n2). For all nodes O(n3). We use an approximation.

28

Markus Strohmaier 2011

p y ( ) ( ) pp

M. Strohmaier, D. Helic, D. Benz. Evaluation of Folksonomy Induction Algorithms. Submitted to the ACM Transactions on Intelligent Systems and Technology, (2011).

Page 29: Extracting semantics from crowds

Knowledge Management Institute

Tag Abstractness

frequency

Del.icio.us

with D. Benz et al.

29

Markus Strohmaier 2011D. Benz, C. Körner, A. Hotho, G. Stumme, M. Strohmaier, One Tag to bind them all: Measuring Term Abstractness in Social Metadata, Submitted to ESWC 2011.

Page 30: Extracting semantics from crowds

Knowledge Management Institute

Emergent semanticsEmergent semantics through hierarchical clustering

Approaches:• k-means• Affinity propagation• Tag generality

Applications:i ti• user navigation

• ontology learning• disambiguation

Evaluation:• Semantic grounding to Golden Standards (e.g. W dN t)

on a delicious dataset

WordNet)

Semantic and Pragmatic Quality i tl

Study of different tagging systems:BibSonomy, CiteULike ,Delicious, Flickr, LastFM

30

Markus Strohmaier 2011

varying greatly.

Anon Plangprasopchok, Kristina Lerman, and Lise Getoor (2010), Integrating Structured Metadata via Relational Affinity Propagation, in Proceedings of AAAI Workshop on Statistical Relational AI

Page 31: Extracting semantics from crowds

Knowledge Management Institute

Semantic Validation of FolksonomiesSemantic Validation of Folksonomies

Semantic Networks Semantic Grounding(Emergent)via e.g. hierarchical clustering

g(Golden Standard)WordNet: a lexical DB for English

computers

programmingProgramming

Map-ping

Synset Hierarchy

distance d = 1 distance

Design Python

languages

distance d1 = 1 distance d2 = 2

gpatterns

g g

java python

Semanticgrounding

abs. difference |d1 - d2| a simple proxy for the quality

31

Markus Strohmaier 2011

j pythonp p y q yof emergent semantics

Based on slides by A. Hotho

Page 32: Extracting semantics from crowds

Knowledge Management Institute

Semantically EvaluatingSemantically Evaluating Tag Hierarchies

TP..Taxonomic TR..Taxonomic TF..Taxonomic TO…Taxonomic

[Dellschaft, Staab 2006]precision recall F measure overlap

Different folksonomy algorithmsHolds with other knowledge bases (Yago, Wikitaxonomy) and datasets (bibsonomy, citeulike, lastfm less so)

32

Markus Strohmaier 2011M. Strohmaier, D. Helic, D. Benz. Evaluation of Folksonomy Induction Algorithms. Submitted to the ACM Transactions on Intelligent Systems and Technology, (2011).

[Dellschaft, Staab 2006] On How to Perform a Gold Standard Based Evaluation of Ontology Learning (2006), by Klaas Dellschaft , Steffen Staab, In Proceedings of the 5th International Semantic Web Conference (ISWC’06)

Page 33: Extracting semantics from crowds

Knowledge Management Institute

P ti i fl ti

Limitations and Opportunities

Pragmatics influence semantics:

1. Tagging behavior effects preciseness of emergingsemantics [Körner et al. 2010]

2 S i l t k i fl lti ti t k2. Social networks influence resulting semantic networks[Wang and Groth 2010]

3 Stream types effect stream semantics3. Stream types effect stream semantics[Wagner and Strohmaier 2010]

33

Markus Strohmaier 2011

Page 34: Extracting semantics from crowds

Knowledge Management Institute

Different motivations for tagging

User BUser A

This TheseThisseems to

be a category!

These seem to

bekeywords!category! keywords!

34

Markus Strohmaier 2011

Page 35: Extracting semantics from crowds

Knowledge Management Institute

How do Semantics Emerge?How do Semantics Emerge?Are they Influenced by the Pragmatics of Tagging?

Diff t t l f t iDifferent styles of tagging:To categorize or to describe resources [Strohmaier et al. 2010]

Categorizer (C) Describer (D)Categorizer (C) Describer (D)

Goal later browsing later searchChange of vocabulary costly cheapSi f b l li it d OSize of vocabulary limited OpenTags subjective objective

Example tag clouds

Semantic Assumption: Categorizers produce more precise emergent semantics than Describers.

35

Markus Strohmaier 2011[Strohmaier et al. 2010] M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.

Page 36: Extracting semantics from crowds

Knowledge Management Institute

Example 1pDescribers outperform categorizers on precision of

emergent tag semantics

Categorizers perform worse than random

Describers perform better than random

Random users

Random users

(D)(C)

36

Markus Strohmaier 2011C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.

Page 37: Extracting semantics from crowds

Knowledge Management Institute

Example 2pCategorizers outperform describers on social classification

accuracy

Describers perform worse than Categorizersthan Categorizers

Supervised multi-class SVM

37

Markus Strohmaier 2011A. Zubiaga, C. Koerner, and M. Strohmaier. Tags vs. shelves: From social tagging to social classification. In 22nd ACM SIGWEB Conference on Hypertext and Hypermedia (HT 2011), Eindhoven, Netherlands, ACM, 2011.

Page 38: Extracting semantics from crowds

Knowledge Management Institute

Example 3Example 3Type of stream aggregation effects semantics

… the type of aggregation:Hashtag stream aggregations are more robust against external disturbances than user list streams

Hashtag Stream User List Stream

38

Markus Strohmaier 2011

OR(RUa)S(Rh) OR(RUa)S(RUL)

Page 39: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from Crowds

Motivation

Social Labeling• Hashtag Semantics

Social Tagging• Tag Relatedness

Tag Generality• Tag Generality• Tag Hierarchies

Social NavigationSocial Navigation• Navigational Knowledge Engineering

39

Markus Strohmaier 2011

g g

Page 40: Extracting semantics from crowds

Knowledge Management Institute

Social NavigationUsers search for and navigateUsers search for and navigate

to certain sets of resources

Can we produce ontological constructs as a b d t f i ti l ti iti f ?byproduct of navigational activities of users ?

Navigational Knowledge Engineering:A light-weight methodology for low-cost

knowledge engineering by a massive user base.

40

Markus Strohmaier 2011

Page 41: Extracting semantics from crowds

Knowledge Management Institute

Prototype: HANNE

HANNE H li ti A li ti f N i ti lHANNE: Holistic Application for Navigational Knowledge Engineering

http://aksw org/Projects/NKEhttp://aksw.org/Projects/NKE

41

Markus Strohmaier 2011

with S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, University of Leipzig

Page 42: Extracting semantics from crowds

Knowledge Management Institute

Navigational Knowledge Engineeringhttp://aksw.org/Projects/NKE

Example: Extending DBPedia with NKE

42

Markus Strohmaier 2011S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, (in preparation)

Page 43: Extracting semantics from crowds

Knowledge Management Institute

Navigational Knowledge Engineering

Choose initial positive andChoose initial positive andnegative examples from thesearch result.Here we are looking forFootball Clubs in Saxony, aregion in Germanregion in Germany.

43

Markus Strohmaier 2011 http://aksw.org/Projects/NKE

Page 44: Extracting semantics from crowds

Knowledge Management Institute

The Problem of Inductive Concept Learning

U i i l t f bj tU is a universal set of objectsC is a concept: a subset of objects in U

To learn a concept C means to learn to recognize bj t i C t b bl t t ll h thobjects in C, to be able to tell whether

Example: fU may be the set of all patients in a register, and

The set of all patients having a particular disease

44

Markus Strohmaier 2011

Page 45: Extracting semantics from crowds

Knowledge Management Institute

Inductive Concept Learning

To test the coverage, the function

Returns the value true if e is covered by H, and false otherwise.

45

Markus Strohmaier 2011Nada Lavrac and Saso Dzeroski, Inductive Logic Programming: Techniques and Applications, 1994

Page 46: Extracting semantics from crowds

Knowledge Management Institute

HANNE &HANNE &The Problem of Inductive Concept Learning

given:• Background knowledge (OWL/DL knowledge base)• positive and negative examples of a concept

(i t )(instances)

fi dfind:• A hypothesis (expressed as OWL class descriptions)

that covers all positive and no negative examplesthat covers all positive and no negative examples

46

Markus Strohmaier 2011

Page 47: Extracting semantics from crowds

Knowledge Management Institute

Based on the extension, ICL searches forsearches for suitable hypotheses.hypotheses.

47

Markus Strohmaier 2011S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, (in preparation)

Page 48: Extracting semantics from crowds

Knowledge Management Institute

Concepts Learned

C O S• The Learned Concept is shown in Manchester OWL Syntax• The user can retain the concept for later retrieval. • Saved concepts are displayed as social navigation

suggestions. Can be used to enrich existing knowledge base

48

Markus Strohmaier 2011

base.

S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, (in preparation)

Page 49: Extracting semantics from crowds

Knowledge Management Institute

ResultUseful properties:Useful properties:• Biased towards high recall

• Scales well: Number of training• Scales well: Number of training examples is more important than the size of the background knowledge

With only 2 positives and 4 negatives, it is possible to find 13 more i t hi h f tb ll l binstances, which are football clubs situated close to Saxony, Germany

49

Markus Strohmaier 2011S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen, M. Strohmaier, Navigational Knowledge Engineering, (in preparation)

Page 50: Extracting semantics from crowds

Knowledge Management Institute http://aksw.org/Projects/NKE

Mock up (1)

50

Markus Strohmaier 2011

Page 51: Extracting semantics from crowds

Knowledge Management Institute http://aksw.org/Projects/NKE

Mock up (2)

51

Markus Strohmaier 2011

Page 52: Extracting semantics from crowds

Knowledge Management Institute

Extracting Semantics from …

Motivation

Social Labeling• Hashtag Semantics

Social Tagging• Tag Relatedness

Tag Generality• Tag Generality• Tag Hierarchies

Social NavigationSocial Navigation• Navigational Knowledge Engineering Conclusions

52

Markus Strohmaier 2011

g g Conclusions

Page 53: Extracting semantics from crowds

Knowledge Management Institute

ObjectivesP id ( ) t th f ll i tiProvide (some) answers to the following questions:

• What is the difference between extracting semantics from textWhat is the difference between extracting semantics from text vs. extracting semantics from crowds?

• Why should we study crowds and crowd behavior from a ti ti ti ?semantic computing perspective?

• How can semantics be extracted from online crowd behavior, such as

• …from Social Labeling• …from Social Tagging• …from Social Navigationo Soc a a ga o

• What are the implications for semantic computing research?

53

Markus Strohmaier 2011

Page 54: Extracting semantics from crowds

Knowledge Management Institute

Social Computation for the Web of Data

Crowds provide some advantages over semantic extraction from text.

It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, th t ti t tthat semantic structures emerge.

In order to extract semantics from crowds understanding users’In order to extract semantics from crowds, understanding users behavior and its impact on emerging semantic structures is important.

Shaping or classifying certain users’ behavior are ways of increasing preciseness of emerging semantic structures.

54

Markus Strohmaier 2011

g p g g

Page 55: Extracting semantics from crowds

Knowledge Management Institute

C d S tiCrowd Semanticsbased on [Hovy 2011]

55

Markus Strohmaier 2011[Hovy 2011] Based on slides by Ed Hovy, USC, Reasoning with Text Workshop 2011 (link)

Page 56: Extracting semantics from crowds

Knowledge Management Institute

Summary

W b th tWe can observe that:• semantic structures can be obtained as a byproduct

of online crowd behavior (l b li t i i ti )of online crowd behavior (labeling, tagging, navigation)

• these structures can approximate structures in reference knowledge bases (DBPedia WordNet etc)reference knowledge bases (DBPedia, WordNet, etc)

but:• pragmatics influences resulting semantics• pragmatics influences resulting semantics• semantic preciseness remains a challenge

56

Markus Strohmaier 2011

Page 57: Extracting semantics from crowds

Knowledge Management Institute

What‘s the knowledge of SAS?An Agenda for Semantic Computing R hWhat s the knowledge of SAS? Research

0 A

ugus

t, 20

1110

Web ResourcesNatural Language Constructs

57

Real-World HappeningsUtilizing online behavior of crowds for the construction, maintenance and enrichment of large-scale semantic structures

57

Markus Strohmaier 2011http://www.flickr.com/photos/matthewfield/2306001896/

http://www.flickr.com/photos/waldoj/722508166/

Page 58: Extracting semantics from crowds

Knowledge Management Institute

Thank You.

Acknowledgements

co-authors and collaborators

H.P. Grahsl, D. Helic, C. Körner, R. Kern, C. Trattner (TU Graz)D. Benz, G. Stumme (U. Kassel)

A. Hotho (U. Würzburg)S H ll J L h C St dl J U b h (U f L i i G )S. Hellmann, J. Lehmann, C. Stadler, J. Unbehauen (U. of Leipzig, Germany)

58

Markus Strohmaier 2011