+ User-induced Links in Collaborative Tagging Systems Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09 Speaker: Nonhlanhla Shongwe 18 January 2009




Preview

Introduction
Collaborative tagging
User-induced hyperlinks
  Similarity of assigned tags
  Association rule mining
Analysis of user-induced links
Tag prediction
Discussion
Conclusion


Introduction

Hyperlinks
  Make navigation through the Web possible
  The author decides which documents to link to

The limited set of links that authors provide has led to user-contributed content on the Web.

In social bookmarking sites, e.g. Delicious
  Users can maintain a collection of documents
  URLs are identified by the tags users choose for them


Collaborative tagging (1/2)

Popular tagging systems, e.g. Delicious and LibraryThing
  Allow users to describe their favorite online resources using their own words
  E.g. http://www.cnn.com is tagged news, tv, sports, weather, travel

Advantages over traditional methods
  Flexibility and freedom offered by these systems
  Systems are quick to adapt to changes in the vocabulary among the users


Collaborative tagging (2/2)

The collaborative tagging activities of participating users result in a scheme called a folksonomy

A folksonomy consists of three types of elements
  Users: assign tags to Web documents
  Tags: keywords chosen by users to describe and categorize a Web document
  Documents: objects tagged by the users
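As a rough illustration of the three element types, a folksonomy can be held as a set of (user, tag, document) triples. The sketch below is only an assumption about representation: the sample users, the second URL, and the helper names `tags_by_document` and `documents_by_user` are hypothetical and do not come from the slides.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments: (user, tag, document) triples.
assignments = [
    ("alice", "news", "http://www.cnn.com"),
    ("alice", "tv", "http://www.cnn.com"),
    ("bob", "news", "http://www.cnn.com"),
    ("bob", "travel", "http://www.example.org/trips"),
]

def tags_by_document(triples):
    """Count, per document, how many distinct users assigned each tag."""
    counts = defaultdict(Counter)
    for user, tag, doc in set(triples):  # set() drops repeated triples
        counts[doc][tag] += 1
    return counts

def documents_by_user(triples):
    """Collect the set of documents each user has tagged."""
    collections = defaultdict(set)
    for user, tag, doc in triples:
        collections[user].add(doc)
    return collections

doc_tags = tags_by_document(assignments)
user_docs = documents_by_user(assignments)
```

The per-document tag counts suit tag-based similarity, while the per-user document sets suit analysis of which users tagged which documents.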


User-Induced hyperlinks

Two types of hyperlinks
  For navigation
  For recommendation
    Direct users to other documents that contain related information

Two different approaches to discovering implicit relations in a folksonomy
  Calculating the similarity between the sets of tags assigned to the documents
  Analyzing the collective behavior of the users who have tagged the documents

User-induced links are implicit links in a folksonomy that result from the collaborative tagging activities of users


Similarity of Assigned Tags (1/4)

First approach to discovering user-induced links
  Calculate the pair-wise similarity between documents based on their tags
  Jaccard coefficient
  In IR, cosine similarity
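The Jaccard coefficient compares two tag sets by the ratio of shared tags to all distinct tags. A minimal sketch follows; the function name `jaccard` and the choice to return 0.0 for two empty sets are assumptions made here for illustration.

```python
def jaccard(tags_a, tags_b):
    """Jaccard coefficient of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # assumed convention for two untagged documents
    return len(a & b) / len(a | b)

# Two of four distinct tags are shared.
score = jaccard({"news", "tv", "sports"}, {"news", "sports", "travel"})
```

Because it works on sets, this measure ignores how many users applied each tag.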


Similarity of Assigned Tags (2/4)

Cosine Similarity
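The formula on this slide did not survive in the transcript. As a sketch of the standard definition, the cosine similarity of two documents can be computed over their tag-frequency vectors, i.e. the dot product divided by the product of the vector lengths. Representing each document as a dict from tag to count is an assumption made here.

```python
import math

def cosine_similarity(tags_a, tags_b):
    """Cosine of the angle between two tag-frequency vectors (dicts tag -> count)."""
    shared = tags_a.keys() & tags_b.keys()
    dot = sum(tags_a[t] * tags_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in tags_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tags_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for a document without tags
    return dot / (norm_a * norm_b)
```

Unlike a set-based measure, this takes into account how often each tag was assigned.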


Similarity of Assigned Tags (3/4)

Second similarity function
  The normalized discounted cumulative gain (NDCG), used to evaluate the ranking of documents according to their relevance scores
  Firstly, list the tags of the two documents
  Secondly, calculate the DCG at position p


Similarity of Assigned Tags (4/4)

Thirdly, calculate the ideal DCG (IDCG)

Finally, calculate the NDCG

Use a function
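The formulas for these steps are missing from the transcript, so the following is only a sketch under stated assumptions. It uses one common DCG formulation, the sum of rel_i / log2(i + 1) over positions i = 1..p, and normalizes by the DCG of the ideal (descending) ordering. How the slides derive relevance scores from tags is not recoverable; here the tags of one document are ranked by frequency and each tag's relevance is assumed to be its frequency in the other document. The name `ndcg_tag_similarity` is hypothetical.

```python
import math

def dcg(relevances, p=None):
    """Discounted cumulative gain over the first p relevance scores."""
    rels = relevances if p is None else relevances[:p]
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(rels, start=1))

def ndcg(relevances, p=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), p)
    if ideal == 0:
        return 0.0
    return dcg(relevances, p) / ideal

def ndcg_tag_similarity(tags_a, tags_b, p=None):
    """Assumed scheme: rank A's tags by frequency, score them by frequency in B."""
    ranked = sorted(tags_a, key=lambda t: (-tags_a[t], t))
    relevances = [tags_b.get(t, 0) for t in ranked]
    return ndcg(relevances, p)
```

Under this assumed scheme the score is not symmetric in the two documents, so a symmetric similarity would need to combine both directions in some way.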


Association Rule Mining

Second approach to discovering user-induced links
  Finds pairs of Web documents that have both been tagged by the same group of users
  Aims at identifying implicit patterns within a large database of transactions
  Two major concepts
    Support
    Confidence
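Support and confidence can be illustrated by treating each user's collection of documents as one transaction. In the sketch below, the support of a pair is the number of users who tagged both documents, and the confidence of a rule a -> b is that support divided by the number of users who tagged a. The function name `mine_pair_rules` and the restriction to document pairs are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations

def mine_pair_rules(user_docs, min_support, min_confidence):
    """Return (source, target, support, confidence) for document pairs.

    user_docs maps each user to the documents that user has tagged.
    """
    single = Counter()  # number of users who tagged each document
    pair = Counter()    # number of users who tagged both documents of a pair
    for docs in user_docs.values():
        docs = sorted(set(docs))
        single.update(docs)
        pair.update(combinations(docs, 2))

    rules = []
    for (a, b), support in pair.items():
        if support < min_support:
            continue
        for src, dst in ((a, b), (b, a)):
            confidence = support / single[src]
            if confidence >= min_confidence:
                rules.append((src, dst, support, confidence))
    return rules
```

Enumerating all pairs is quadratic in the size of each user's collection, so this direct approach is only practical for small collections.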


Analysis of User-Induced Links (1/3)

Using the two methods described
  Identify user-induced links in data collected from Delicious
  Compare them with existing hyperlinks in terms of several different aspects

Aspects to compare
  Whether they connect two documents from the same domain/website
  Similarity between the documents at the two ends of a link
  Whether users are equally interested in the linked documents


Analysis of User-Induced Links (2/3)

Data collection
  Data collected from Delicious
  Documents cover a wide range of topics
  Documents collected on a per-tag basis
    First, 130 popular tags were selected at random
    For each tag, Delicious was crawled to obtain a set of documents and the users who have tagged them


Analysis of User-Induced Links (3/3)

Results
  Identify user-induced links between the documents using the two methods
  For similarity, vary the similarity threshold to 0.5
  For association rules, set the minimum support to 100 and vary the minimum confidence level

Findings
  Very few user-induced links had a confidence of 0.5 or above


Results (1/8)


Results (2/8) on Same Domain

One important function of hyperlinks
  Allow users to navigate from one hypertext document to another

Links are more beneficial if they point to documents external to the current website

Check whether the documents at the two ends of a link are from the same domain
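One simple way to run such a check is to compare the host names of the two URLs. The sketch below uses Python's standard `urllib.parse`. Treating "same domain" as an exact host-name match is an assumption, so `www.cnn.com` and `edition.cnn.com` would count as different, and the slides do not say how the authors decided this.

```python
from urllib.parse import urlsplit

def same_domain(url_a, url_b):
    """True if both URLs have the same host name (already lower-cased by urlsplit)."""
    host_a = urlsplit(url_a).hostname
    host_b = urlsplit(url_b).hostname
    return host_a is not None and host_a == host_b

internal = same_domain("http://www.cnn.com/world", "http://www.cnn.com/sport")
external = same_domain("http://www.cnn.com/world", "http://www.bbc.co.uk/news")
```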


Results (3/8) on Same Domain


Results (4/8) on Coincidence between existing hyperlinks and user-induced links

See whether such links already exist between the documents

If user-induced links coincide with existing hyperlinks, users are satisfied with the existing hyperlinks

If user-induced links are mostly new, there are user interests and perspectives that existing hyperlinks have not captured
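The comparison amounts to measuring how much of the user-induced link set overlaps with the existing hyperlink set. A minimal sketch, assuming links are stored as directed (source, target) pairs, follows. The function name `coincidence_ratio` and the sample links are hypothetical.

```python
def coincidence_ratio(induced_links, hyperlinks):
    """Fraction of user-induced links that already exist as hyperlinks."""
    induced = set(induced_links)
    if not induced:
        return 0.0
    return len(induced & set(hyperlinks)) / len(induced)

# Hypothetical data: one of four induced links already exists as a hyperlink.
induced = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
existing = {("a", "b"), ("x", "y")}
ratio = coincidence_ratio(induced, existing)
```

A low ratio would correspond to the second case above, where most user-induced links are new.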


Results (5/8) on Coincidence between existing hyperlinks and user-induced links


Results (6/8) on similarity and user preferences

Documents connected by user-induced links include
  Blog posts on highly related topics
  News articles on the same topics
  Websites offering applications with similar functionality
  Q&A pages of some portal site

Two different approaches for generating user-induced links
  Association rule mining: a link is generated if enough users are interested in both documents, regardless of the similarity between them
  Similarity-based: links are generated based on the tags assigned, regardless of whether many users are interested in the documents


Results (7/8) on similarity and user preferences


Results (8/8) on similarity and user preferences


Tags Prediction (1/3)

The analysis of user-induced links shows that links generated by association rule mining of user collections usually connect documents that are highly related to each other, as judged by the similarity between their tags

To predict the tags of a document
  Identify the other documents that have a link to this document
  The set of documents that have a link to the document (dx)


Tags Prediction (2/3)

Firstly, consider a simple averaging method
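The formula for this method is not in the transcript. One plausible reading, offered only as an assumption, is to normalize the tag counts of each linked document so they sum to 1, average those vectors, and rank tags by the averaged score. The function name `predict_tags_by_averaging` and the sample data are hypothetical.

```python
from collections import Counter

def predict_tags_by_averaging(linked_tag_counts, top_n=5):
    """Average the normalized tag vectors of linked documents and rank the tags."""
    usable = [c for c in linked_tag_counts if sum(c.values()) > 0]
    scores = Counter()
    for counts in usable:
        total = sum(counts.values())
        for tag, n in counts.items():
            scores[tag] += n / total / len(usable)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Hypothetical tag counts of two documents linked to the target document.
linked = [{"news": 3, "tv": 1}, {"news": 1, "sports": 1}]
predicted = predict_tags_by_averaging(linked, top_n=2)
```

Normalizing first keeps a heavily tagged document from dominating the average.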


Tags Prediction (3/3)

Secondly, consider an aggregation method


Experiments (1/2)

Measure the performance of the predictions using
  NDCG
  Precision at the nth term

NDCG was used to investigate whether the predictions are accurate in terms of the ordering of the tags
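Precision at n is the fraction of the first n predicted tags that also appear among the tags actually assigned to the document. A small sketch follows; the function name `precision_at_n` and the sample tags are assumptions for illustration.

```python
def precision_at_n(predicted, actual, n):
    """Fraction of the first n predicted tags that are actual tags."""
    if n <= 0:
        raise ValueError("n must be positive")
    actual_set = set(actual)
    hits = sum(1 for tag in predicted[:n] if tag in actual_set)
    return hits / n

# Hypothetical prediction: two of the first three tags are correct.
p_at_3 = precision_at_n(["news", "sports", "tv", "travel"], {"news", "tv", "weather"}, 3)
```

Precision at n ignores the order within the first n tags, which is why an order-sensitive measure such as NDCG is reported alongside it.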


Experiments (2/2)


Discussion

Implicit relations between Web documents can be discovered by examining the user preferences and document similarity embedded in a folksonomy

User-induced links are different from hyperlinks
  The collaborative tagging environment shows the differences between the perspectives of Web authors and Web readers

It is worthwhile to consider an open hypermedia structure backed by a collaborative tagging system.


Conclusion

User-induced links are a form of implicit relation between documents

We used
  Tag similarity to generate many user-induced links
  Association rule mining to generate very high user-induced links


Thank you for your attention
