Learning Semantic Relationships between Entities in Twitter

Delft University of Technology. Learning Semantic Relationships between Entities in Twitter. ICWE, Cyprus, June 22, 2011. Ilknur Celik, Fabian Abel, Geert-Jan Houben. Web Information Systems, TU Delft.

Upload: web-information-systems-tu-delft

Post on 11-May-2015

DESCRIPTION

Slides presented at ICWE 2011: Learning Semantic Relationships between Entities in Twitter. Supporting web site: http://wis.ewi.tudelft.nl/icwe2011/relation-learning/

TRANSCRIPT

Page 1: Learning Semantic Relationships between Entities in Twitter

Delft University of Technology

Learning Semantic Relationships between Entities in Twitter

ICWE, Cyprus, June 22, 2011

Ilknur Celik, Fabian Abel, Geert-Jan Houben
Web Information Systems, TU Delft

Page 2: Learning Semantic Relationships between Entities in Twitter

Personalized Recommendations

Personalized Search

Adaptive Systems

What we do: Science and Engineering for the Personal Web

Social Web

Analysis and User Modeling

user/usage data

Semantic Enrichment, Linkage and Alignment

domains: news, social media, cultural heritage, public data, e-learning

Page 3: Learning Semantic Relationships between Entities in Twitter

60,000,000: number of tweets published per day

Page 4: Learning Semantic Relationships between Entities in Twitter

1: number of tweets per day that are interesting for me

Page 5: Learning Semantic Relationships between Entities in Twitter

Searching on Twitter

Page 6: Learning Semantic Relationships between Entities in Twitter

Issues with Multiple Keywords Search

Page 7: Learning Semantic Relationships between Entities in Twitter

Let’s try to search with One Keyword

Page 8: Learning Semantic Relationships between Entities in Twitter

Search results: page 1

Page 9: Learning Semantic Relationships between Entities in Twitter

Search results: page 2

Page 10: Learning Semantic Relationships between Entities in Twitter

Search results: page 3

Page 11: Learning Semantic Relationships between Entities in Twitter

Search results: page 60!!

tweet I was looking for

Next Saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my hometwon Eindhoven. #realliveshit #iwillspinrecords
about 9 hours ago via Blackberry

Music Artist

Locations

Page 12: Learning Semantic Relationships between Entities in Twitter

Is there an easier way? Faceted Search can help

Current Query: Eindhoven, Music

Expand Query:

Locations: more...

Events: more...

Music Artists:
+ Guilty Simpson
+ Bryan Adams
+ Elton John
+ Golden Earring
+ Rihanna
+ The eagles
+ 3 Doors Down
more...

Results:
1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit #iwillspinrecords2
2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL
3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Intents

Page 13: Learning Semantic Relationships between Entities in Twitter

Semantic relationships between entities are essential to realize such applications.

Music Artist: Guilty Simpson

Location: Eindhoven

Location: Area51

Page 14: Learning Semantic Relationships between Entities in Twitter

Relation Discovery Framework

Input: news articles, microblog posts

1. Entity extraction & semantic enrichment
Example entities: Person A, Location A, Location B, Event A, Group A

2. Relation discovery, configured by:
- source selection
- weighting scheme
- relation type
- temporal constraints

Output: typed relations among Person A, Location A and Group A (relation labels shown: isLocatedIn, involvedIn)

Applications:
- Browsing support
- Query suggestions
- Schema enrichment

Page 15: Learning Semantic Relationships between Entities in Twitter

Entity Extraction and Semantic Enrichment

@bob: Julian Assange got arrested http://bit.ly/5d4r2t

Tweet-based enrichment: Julian Assange

Linked news article: "Julian Assange arrested". Julian Assange, the founder of WikiLeaks, is under arrest in London…

News-based enrichment: Julian Assange, London, WikiLeaks

Entities attached to the tweet after enrichment: Julian Assange, London, WikiLeaks
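The enrichment step above can be sketched as a set union: entities found in the tweet text itself, plus entities found in any news article the tweet links to. This is only a minimal sketch under stated assumptions. The transcript does not preserve which entity recognition service the slides used, so `KNOWN_ENTITIES` is a hypothetical toy gazetteer (its type labels are assumptions too), and `linked_articles` is assumed to hold article texts that were already fetched.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy gazetteer standing in for a real entity recognizer.
# The entity types are illustrative assumptions, not taken from the slides.
KNOWN_ENTITIES = {
    "Julian Assange": "Person",
    "WikiLeaks": "Organization",
    "London": "Location",
}

URL_PATTERN = re.compile(r"https?://\S+")


def extract_entities(text):
    """Return the known entity names that literally occur in the text."""
    return {name for name in KNOWN_ENTITIES if name in text}


@dataclass
class EnrichedTweet:
    text: str
    tweet_entities: set = field(default_factory=set)
    news_entities: set = field(default_factory=set)

    @property
    def entities(self):
        # Union of tweet-based and news-based enrichment.
        return self.tweet_entities | self.news_entities


def enrich(tweet_text, linked_articles):
    """linked_articles maps URL -> article text (assumed already fetched)."""
    enriched = EnrichedTweet(text=tweet_text)
    # Tweet-based enrichment: look only at the tweet text without its URLs.
    enriched.tweet_entities = extract_entities(URL_PATTERN.sub("", tweet_text))
    # News-based enrichment: follow each link to its article text.
    for url in URL_PATTERN.findall(tweet_text):
        article = linked_articles.get(url)
        if article:
            enriched.news_entities |= extract_entities(article)
    return enriched


tweet = "@bob: Julian Assange got arrested http://bit.ly/5d4r2t"
articles = {
    "http://bit.ly/5d4r2t": "Julian Assange, the founder of WikiLeaks, "
                            "is under arrest in London"
}
result = enrich(tweet, articles)
print(sorted(result.tweet_entities))  # ['Julian Assange']
print(sorted(result.entities))        # ['Julian Assange', 'London', 'WikiLeaks']
```

Keeping the two entity sets separate makes it possible to compare tweet-based and news-based sources later on.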

Page 16: Learning Semantic Relationships between Entities in Twitter

Relation Learning Strategies

• Relation: relation(e1, e2, type, tstart, tend, weight)
  - e1, e2: entities
  - type: type/label of relation
  - tstart, tend: time period
  - weight: relatedness

• Relation learning strategy:
  • Input: entities e1 and e2, time period (tstart, tend)
  • Challenge: infer weight and type of the relation for the given entities and time period

• Weighting according to co-occurrence frequency:
  a) Tweet-based: count co-occurrence in tweets
  b) News-based: count co-occurrence in news
  c) Tweet-News-based: count co-occurrence in both tweets and news
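The relation tuple and the three co-occurrence strategies can be sketched as follows. This is a sketch under assumptions, not the authors' implementation: the `Document` structure, the strategy keys and the placeholder relation type `"relatedTo"` are hypothetical, the weight is the raw co-occurrence count without any normalization, and inferring the relation type is not covered.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Document:
    source: str          # "tweet" or "news"
    published: date
    entities: frozenset  # entity names attached after enrichment


@dataclass(frozen=True)
class Relation:
    e1: str
    e2: str
    type: str
    tstart: date
    tend: date
    weight: float


# Which document sources each strategy counts co-occurrences in.
STRATEGY_SOURCES = {
    "tweet": {"tweet"},
    "news": {"news"},
    "tweet-news": {"tweet", "news"},
}


def learn_relations(documents, strategy, tstart, tend, rel_type="relatedTo"):
    """Weight entity pairs by how often they co-occur in the selected
    sources within [tstart, tend]; strongest relations come first."""
    sources = STRATEGY_SOURCES[strategy]
    counts = Counter()
    for doc in documents:
        if doc.source in sources and tstart <= doc.published <= tend:
            # Sorting gives every unordered pair one canonical key.
            for pair in combinations(sorted(doc.entities), 2):
                counts[pair] += 1
    relations = [
        Relation(e1, e2, rel_type, tstart, tend, float(n))
        for (e1, e2), n in counts.items()
    ]
    return sorted(relations, key=lambda r: (-r.weight, r.e1, r.e2))
```

Because the time period is an explicit argument, the same function can be called with a short window to look for relations that hold only for a specific period.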

Page 17: Learning Semantic Relationships between Entities in Twitter

Research Questions

1. Which strategy performs best in detecting relationships between entities?

2. Does the accuracy depend on the type of entities which are involved in a relation?

3. How do the strategies perform for discovering relationships which have temporal constraints (trending relationships)?

Page 18: Learning Semantic Relationships between Entities in Twitter

Dataset

Time span: Nov 15 to Jan 15 (2 months)

More than:
- 20,000 Twitter users
- 10,000,000 tweets
- 75,000 news articles

Example news: WikiLeaks founder, Julian Assange, under arrest in London

Page 19: Learning Semantic Relationships between Entities in Twitter

Dataset Characteristics

Page 20: Learning Semantic Relationships between Entities in Twitter

Tweets and news articles per day

50,000-400,000 tweets per day

100-1000 news articles per day

Page 21: Learning Semantic Relationships between Entities in Twitter

Entities referenced per day

10,000-100,000 entity references in tweets per day

5,000-20,000 entity references in news per day

~40% of tweets do not mention any (recognizable) entity

99.3% of the news articles mention at least one (recognizable) entity

72.6% of the top 1000 mentioned entities in Twitter are also mentioned in the mainstream news media

Page 22: Learning Semantic Relationships between Entities in Twitter

Number of Distinct Entities per Entity Type

39 types of entities

Page 23: Learning Semantic Relationships between Entities in Twitter

Performance of Relation Learning Strategies

Page 24: Learning Semantic Relationships between Entities in Twitter

Our Ground Truth of true relations

1. Based on DBpedia:
• We mapped entities to their corresponding DBpedia resources
• No appropriate DBpedia URIs for more than 35% of the entities
• We analyzed whether there is a direct relation between two entities

2. Based on user study:
• Participants judged whether two entities are really:
a) related (62.6% were rated as related)
b) related in the given time period (57.3% were rated as related)
• Overall: 2588 judgments

Page 25: Learning Semantic Relationships between Entities in Twitter

1. Which strategy performs best in detecting relationships between entities?

Page 26: Learning Semantic Relationships between Entities in Twitter

Accuracy of relation discovery

[Two charts: based on user study; based on DBpedia]

Combining both tweet-based and news-based strategies allows for highest accuracy

Page 27: Learning Semantic Relationships between Entities in Twitter

F-Measure@k

Tweet-based strategy saturates quickly

Combined strategy (and news-based) increase in performance.
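The slides do not spell out how F-Measure@k is computed. A common reading, assumed here, is the harmonic mean of precision and recall over the k highest-weighted relations. The function name and the representation of the ground truth as a set of unordered pairs are illustrative choices.

```python
def f_measure_at_k(ranked_pairs, true_pairs, k):
    """F1 over the top-k ranked entity pairs.

    ranked_pairs: list of (e1, e2) tuples, strongest relation first.
    true_pairs:   set of frozensets, each holding two related entities.
    """
    if k <= 0 or not true_pairs:
        return 0.0
    top_k = ranked_pairs[:k]
    # frozenset makes the comparison independent of entity order.
    hits = sum(1 for e1, e2 in top_k if frozenset((e1, e2)) in true_pairs)
    if hits == 0:
        return 0.0
    precision = hits / k                # standard precision@k
    recall = hits / len(true_pairs)     # share of all true relations found
    return 2 * precision * recall / (precision + recall)
```

Under this definition a strategy that discovers only a small set of relations stops improving once k exceeds the size of that set, because recall can no longer grow while precision@k keeps falling.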

Page 28: Learning Semantic Relationships between Entities in Twitter

2. Does the accuracy depend on the type of entities which are involved in a relation?

Page 29: Learning Semantic Relationships between Entities in Twitter

Does the accuracy depend on the type of entities?

Relationships which involve events can be discovered with high precision

92%

87% precision

23%

26% precision

Page 30: Learning Semantic Relationships between Entities in Twitter

Does the accuracy depend on the type of entities? (cont.)

Relationships between persons/groups are difficult to detect.

Relationships between events can be detected with highest precision.

Page 31: Learning Semantic Relationships between Entities in Twitter

3. How do the strategies perform for discovering relationships which have temporal constraints?

Page 32: Learning Semantic Relationships between Entities in Twitter

Relationships with temporal constraints

Tweet-based strategy performs better in discovering relationships that are valid only for a specific period in time

Page 33: Learning Semantic Relationships between Entities in Twitter

Where do relationships emerge faster?

[Chart: time difference (in days) of first occurrence of relationship, ranging from "News is faster" to "Twitter is faster"]

Speed of strategies is domain-dependent
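The time difference of first occurrence can be computed per entity pair as sketched below. The document representation as `(source, published, entities)` tuples and the sign convention (positive means Twitter is faster) are assumptions made for this example.

```python
from datetime import date


def first_occurrence(documents, e1, e2, source):
    """Earliest date on which e1 and e2 co-occur in the given source,
    or None if they never co-occur there."""
    dates = [
        published
        for src, published, entities in documents
        if src == source and e1 in entities and e2 in entities
    ]
    return min(dates) if dates else None


def emergence_lead_days(documents, e1, e2):
    """Days by which Twitter precedes news for the pair (e1, e2).

    Positive: the relationship emerged on Twitter first.
    Negative: it emerged in the news first.
    None: the pair is missing from at least one of the two sources.
    """
    first_tweet = first_occurrence(documents, e1, e2, "tweet")
    first_news = first_occurrence(documents, e1, e2, "news")
    if first_tweet is None or first_news is None:
        return None
    return (first_news - first_tweet).days
```

Grouping these differences by entity type or topic is one way to inspect whether the speed depends on the domain.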

Page 34: Learning Semantic Relationships between Entities in Twitter

Conclusions and Future Work

What we did: relation discovery framework based on Twitter

Findings:
1. Strategy that considers both tweets and (linked) news articles allows for highest accuracy
2. Performance varies for different domains (e.g. event relationships can be detected with highest precision)
3. Tweet-based strategy allows for detecting relationships, which have a restricted temporal validity, with high precision (and fast)

Ongoing work: Adaptive Faceted Search on Twitter
http://wis.ewi.tudelft.nl/tweetum/

Page 35: Learning Semantic Relationships between Entities in Twitter

Relation Discovery for Adaptive Faceted Search

Current Query: Eindhoven, Music

Expand Query:

Locations: more...

Events: more...

Music Artists:
+ Guilty Simpson
+ Bryan Adams
+ Elton John
+ Golden Earring
+ Rihanna
+ The eagles
+ 3 Doors Down
more...

Results:
1. Yskiddd: Next saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit #iwillspinrecords2
2. Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL
3. sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Intents

1. Analyze (temporal) relationships of entities of the “current query” to adapt facet ranking.

2. Analyze (temporal) relationships of entities that appear in the user profile to adapt facet ranking.
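One way to read the two steps above is to score each candidate facet value by the weight of its relations to the entities of the current query, and to add a discounted contribution from entities in the user profile. This is a hypothetical sketch rather than the ranking used by the authors: `rank_facets`, the `profile_boost` factor and the weight lookup keyed by unordered entity pairs are all assumptions.

```python
def rank_facets(candidates, query_entities, relation_weights,
                profile_entities=(), profile_boost=0.5):
    """Order candidate facet values by relatedness to the query and profile.

    relation_weights: dict mapping frozenset({e1, e2}) -> relation weight.
    profile_boost:    assumed discount for profile entities versus query ones.
    """
    context = [(entity, 1.0) for entity in query_entities]
    context += [(entity, profile_boost) for entity in profile_entities]

    def score(candidate):
        return sum(
            factor * relation_weights.get(frozenset((candidate, entity)), 0.0)
            for entity, factor in context
        )

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(candidates, key=lambda c: (-score(c), c))
```

Passing weights learned for a recent time window would make the ranking follow temporal relationships rather than long-standing ones.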

user

Page 36: Learning Semantic Relationships between Entities in Twitter

Thank you!

Ilknur Celik, Fabian Abel, Geert-Jan Houben

Twitter: @persweb
http://wis.ewi.tudelft.nl/tweetum/

Page 37: Learning Semantic Relationships between Entities in Twitter

The Social Web

Help me to tackle the information overload!

Recommend me news articles that now interest me!

Help me to find interesting (social) media!

Do not bother me with advertisements that are not interesting for me!

Give me personalized support when I do my online training!

Who is this? What are his personal demands? How can we make him happy?

Personalize my Web experience!