
Context-based Ranking in Folksonomies

Fabian Abel, IVS – Semantic Web Group, Leibniz University Hannover, Appelstr. 4, D-30167 Hannover, [email protected]

Matteo Baldoni, Dipartimento di Informatica, Università degli Studi di Torino, Corso Svizzera 185, I-10149 Torino, [email protected]

Cristina Baroglio, Dipartimento di Informatica, Università degli Studi di Torino, Corso Svizzera 185, I-10149 Torino, [email protected]

Nicola Henze, IVS – Semantic Web Group, Leibniz University Hannover, Appelstr. 4, D-30167 Hannover, [email protected]

Daniel Krause, IVS – Semantic Web Group, Leibniz University Hannover, Appelstr. 4, D-30167 Hannover, [email protected]

Viviana Patti, Dipartimento di Informatica, Università degli Studi di Torino, Corso Svizzera 185, I-10149 Torino, [email protected]

ABSTRACT
With the advent of Web 2.0, tagging became a popular feature. People tag diverse kinds of content, e.g. products at Amazon, music at Last.fm, images at Flickr, etc. Clicking on a tag enables users to explore related content. In this paper we investigate how such tag-based queries, initiated by the clicking activity, can be enhanced with automatically produced contextual information so that the search result better fits the actual aims of the user. We introduce the SocialHITS algorithm and present an experiment in which we compare different algorithms for ranking users, tags, and resources in a contextualized way.

Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval; H.4.m [Information Systems]: Miscellaneous

General Terms
Algorithms

Keywords
Social Media, Search, Ranking, Folksonomies, Context, Adaptation

1. INTRODUCTION
During the last decade, the tagging paradigm has attracted much attention in the Web community. More and more Web systems allow their users to annotate content with freely chosen keywords (tags). The tagging feature helps users to organize content for future retrieval [19].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. HT’09, June 29–July 1, 2009, Torino, Italy. Copyright 2009 ACM 978-1-60558-486-7/09/06 ...$5.00.

Resource sharing systems like Delicious, Flickr, or Last.fm would not work without the users who assign tags to the shared bookmarks, images, and music, respectively: tag assignments serve as an information source for diverse features such as recommendation, search, and exploration. For example, tag clouds, which depict the popularity of tags within the system, allow users to intuitively explore a repository of tag-annotated resources just by clicking on tags.

Besides search algorithms that simply retrieve resources directly annotated with the search tag, there exist more advanced algorithms that exploit the full structure of the folksonomy [20]. A folksonomy is essentially the collection of all tag assignments (user-tag-resource bindings) in the system. It can be modeled as a graph, which makes it possible to apply graph-based search and ranking algorithms following the paradigm of PageRank [24]. Ranking algorithms like FolkRank [13], which is based on PageRank and applicable to folksonomies, can rank not only resources but also tags and users. This capability expands the scope of applications to tag recommendation, user/expert search, etc.

Hence, ranking algorithms play a central role in a multitude of applications; however, all ranking algorithms have to face the problem of ambiguity. For example, the tag “java” might be assigned to resources related to programming or to the Indonesian island. Another problem is caused by tags that are re-used on various occasions with different (though implicit) meanings. For instance, the tag “to-read” might be added by the same user at different times to scientific papers that are relevant for a research work, or to websites that explain what to see in some location the user would like to visit on holidays. If the tag “to-read” were used in a query, the ranking algorithm’s outcome would likely not satisfy the user, because such algorithms lack the means to contextualize the ranking. Similarly, for broad tags like “music” or “web”, which are assigned to a huge number of resources, it is difficult to compute a ranking that fits the actual desires of the user.

One could think that ambiguity can be reduced by adopting personalization strategies, so as to produce personalized rankings. The problem is that personalization techniques are currently limited by the time they need to build adequate user models: the user has, in fact, to register with the system and work with it long enough for the system to collect a sufficient amount of data to provide personalization.

We present a lightweight approach that tackles the problems mentioned above without requiring the systems to do extensive user modeling and without any prerequisites for the user. We do so by proposing general strategies that rely on contextual information in a way that is orthogonal to the ranking algorithm that is used. We model context by the notion of tag clouds (lists of weighted tags); for example, the context can be formed by the tag cloud of the resource from which the user initiates a search activity. So, if a user has navigated to an image in Flickr showing the Indonesian island Java and thereafter clicks on the tag “java” to explore further photos of the island, then it is beneficial to also consider the other tags of the image (e.g. “indonesia”, etc.) to adapt the outcome of the search to the user’s actual needs.

The interesting novelty of our proposal is that we do not restrict the context information to the profile of the user who initiates the query, but also present strategies that consider the actual context of the user, i.e. the content the user is currently browsing. Our main contributions can be summarized as follows.

• We introduce a lightweight approach that makes it possible to adapt rankings, produced by arbitrary ranking algorithms, to a given context.

• We propose to model context as tag clouds and present different context modeling strategies to construct such tag clouds without necessarily requiring any previous knowledge about the user.

• We present SocialHITS, a novel ranking algorithm, which adapts the HITS algorithm [16] to folksonomies.

• We evaluate SocialHITS and other ranking algorithms with respect to the different context modeling strategies and identify the best approach for contextualizing rankings as well as the best ranking algorithm. As our experiments show, there are situations in which our algorithms perform significantly better than existing ranking algorithms.

• We not only evaluate the quality of ranking resources and tags, but also—and this constitutes the originality of our experiment—measure the performance of ranking user entities.

The paper is organized as follows. In the next section we discuss previous work and provide further motivation for the work presented in this paper. In Section 3 we outline our approach to contextualizing rankings and define the different context models, which are evaluated in Section 5 together with the ranking algorithms (see Section 4). We end the paper with conclusions in Section 6.

2. PREVIOUS WORK
For research carried out in the field of tagging systems, the understanding of folksonomies [20], which evolve over time as users assign tags to resources, is of particular interest. Formal folksonomy models have been proposed in [11, 22]; they usually interpret a folksonomy as a collection of tag assignments, possibly enriched with context information like time [10] or characteristics of the setting in which a tag assignment was made [1]. Further, there exist models that try to incorporate the tagging behavior of users [8, 10].

Based on such formal models, there is a large body of research on exploiting the information encapsulated in folksonomies in order to design search algorithms [5, 13], compute recommendations [6, 14, 26], deduce semantics from tags [12, 25], or model users [9, 17, 21]. Here, ranking algorithms are indispensable, as they make it possible to order search results, recommendations, etc. A fundamental assumption of research in the field of folksonomy systems is that the tags users assign to resources describe the content of the resources very well. This assumption is supported by [17], where the authors compared the actual content of Web pages with the tags assigned to these pages in the Delicious system.

In [4] we compared different ranking algorithms (FolkRank [13], GFolkRank [4], GRank [3], SocialPageRank, and SocialSimRank [5]) regarding their performance in ranking resources and tags. We discovered that the algorithms which utilize the full folksonomy information (including context information attached to the tag assignments) performed best. These previous findings motivate the work presented in this paper. We propose strategies enabling the integration of context information independently of both the ranking algorithm used and the underlying folksonomy model. From a more technical perspective, our strategies are based on query expansion. Instead of applying co-occurrence-based techniques [15] or using thesauri such as WordNet, we follow the approach of [7] and utilize context information to expand queries and contextualize rankings.

Previous work related to ranking in folksonomies mainly focuses on ranking resources [4, 5, 13] or tags [2, 26]. In this paper we go beyond the state of the art and evaluate folksonomy-based algorithms also with respect to their performance in ranking user entities, so as to allow the identification of users with certain interests or expertise. This capability, though not yet sufficiently exploited in existing social networking services like Facebook or LinkedIn, is of tremendous interest for research on social networks and has many practical applications. While social networking systems require users to input their interests, competencies, or relations to other users explicitly, tagging systems capture such information automatically and make it possible to construct social networks implicitly [23]. However, the retrieval of user entities has not been studied sufficiently in the field of folksonomy systems. Hence, we evaluate only ranking algorithms that also allow ranking users. Based on Kleinberg’s HITS algorithm [16] and ideas presented in [27], we propose SocialHITS, as the notion of authorities and hubs appears particularly appropriate for user entities.

3. CONTEXT IN FOLKSONOMIES
To give some intuition for the notion of context in folksonomies, we first describe a characteristic scenario in the GroupMe! tagging system1, which we also used as test environment to conduct our experiments (see Section 5). GroupMe! [1] enables users to manage their bookmarks and share them with other users. GroupMe! allows users to organize bookmarks in groups, and both bookmarks and groups can be annotated with tags.

1 http://groupme.org


3.1 Scenario
Let us consider that Bob is planning to travel to the Hypertext 2009 conference. He therefore creates a GroupMe! group entitled “Trip to Hypertext ’09, Turin”, to which he adds bookmarks referring to the conference website or to some video showing sights of Turin. He also annotates his bookmarks with tags like “hypertext”, “2009”, or “conference” to facilitate future retrieval. Bob would appreciate tag suggestions that expedite the tagging process. Alice is browsing through the GroupMe! system and stumbles upon Bob’s group because she is interested in submitting a paper to that conference. However, via the bookmarked conference website, which is part of the group, she finds out that the deadline has already passed. She now clicks on the tag “conference”; when she does so, she is likely not interested in just any conference, but in conferences related to the same topics, to the year 2009, or to combinations of such features. Furthermore, she would be delighted to find expert users with whom she could discuss appropriate conferences and corresponding topics.

In this scenario, the consideration of context can help to improve the usability of the tagging system. When computing tag suggestions, Bob’s user profile as well as the tags that have already been assigned to other bookmarks in the “Trip to Hypertext ’09, Turin” group can be considered. Further, when Alice clicks on the tag “conference”, she neither wants to retrieve bookmarks related to conferences in the field of biology nor seeks information about past conferences; rather, she would like to obtain content relevant to computer science conferences in 2009. To adapt the search result to Alice’s needs, it would be appropriate to include the tags that occur on the Web page Alice was visiting when she clicked on “conference”. That way, adaptation would be possible even if Alice is not known to the tagging system and uses it only for a few seconds.

3.2 Constructing Context
Our approach is to model context in folksonomy systems as tag clouds, i.e. lists of weighted tags. A folksonomy itself is, according to [11], defined as follows.

Definition 1. A folksonomy is a quadruple F := (U, T, R, Y), where U, T, R are finite sets of instances of users, tags, and resources, and Y defines a relation, the tag assignment, between these sets, that is, Y ⊆ U × T × R.

A tag cloud can be computed for users, tags, and resources. For example, the tag cloud of a user u can be defined as TCU(u) = {(t, w(u, t)) | (u, t, r) ∈ Y, w(u, t) = |{r ∈ R : (u, t, r) ∈ Y}|}. Hence, the weight assigned to a tag simply corresponds to the usage frequency of that tag. We normalize the weights so that the sum of the weights assigned to the tags in a tag cloud is equal to 1. Furthermore, we use TCU@k(u), TCT@k(t), and TCR@k(r), respectively, to refer to the tag cloud that contains only the top k tags, i.e. those with the highest weights.
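For concreteness, the user tag cloud TCU@k(u) with frequency weights could be computed roughly as follows. This is a minimal sketch with our own function and variable names; in particular, normalizing after truncating to the top k (so that each TC@k again sums to 1) is our choice, not something the definition above fixes.

```python
from collections import Counter

def user_tag_cloud(Y, u, k=None):
    """Tag cloud of user u from tag assignments Y, a set of
    (user, tag, resource) triples. Weights are usage frequencies,
    normalized to sum to 1; with k given, only the top-k tags are kept."""
    counts = Counter(t for (user, t, r) in Y if user == u)
    if k is not None:
        counts = Counter(dict(counts.most_common(k)))
    total = sum(counts.values())
    return {t: w / total for t, w in counts.items()} if total else {}

# Toy folksonomy: alice used "java" twice and "web" once.
Y = {("alice", "java", "r1"), ("alice", "java", "r2"), ("alice", "web", "r3")}
print(user_tag_cloud(Y, "alice"))
print(user_tag_cloud(Y, "alice", k=1))
```

The tag clouds of tags and resources (TCT@k, TCR@k) would follow the same pattern with the filter applied to the tag or resource component of the triples.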

In our scenario, Alice and Bob act in the GroupMe! system. GroupMe! implies a folksonomy model that incorporates additional information indicating in which context a tag was assigned to a resource. In particular, such context is given by groups, which are finite sets of resources, and the corresponding group context folksonomy is defined as follows.

Definition 2. A group context folksonomy is a 5-tuple F := (U, T, R, G, Y), where U, T, R, G are finite sets that contain instances of users, tags, resources, and groups, respectively. R̂ = R ∪ G is the union of the set of resources and the set of groups, and Y defines a tag assignment having a group context: Y ⊆ U × T × R̂ × (G ∪ {ε}), where ε is a reserved symbol for the empty group context, i.e. for the case that no group context is available.

Group context folksonomies might evolve in systems like GroupMe!, which allows tagging bookmarks in the context of a group of related bookmarks, or Flickr, which enables users to create sets of images that they can tag. Tag clouds for users, tags, and resources are computed as for traditional folksonomies, whereas a group tag cloud TCG(g) is computed by unifying TCR(g) (groups are resources as well and can therefore be tagged) with the tag clouds of the resources contained in g.
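The unification step for TCG(g) might be sketched as a merge of weighted tag clouds followed by renormalization. The function name is ours, and summing the clouds with equal weight is an assumption; the paper does not fix the merge weights.

```python
def merge_clouds(*clouds):
    """Unify weighted tag clouds by summing per-tag weights,
    then renormalize so the result sums to 1."""
    merged = {}
    for cloud in clouds:
        for t, w in cloud.items():
            merged[t] = merged.get(t, 0.0) + w
    total = sum(merged.values())
    return {t: w / total for t, w in merged.items()} if total else {}

# Hypothetical group cloud TCR(g) plus two member resource clouds:
tc_g = {"conference": 1.0}
tc_r1 = {"hypertext": 0.5, "2009": 0.5}
tc_r2 = {"turin": 1.0}
print(merge_clouds(tc_g, tc_r1, tc_r2))
```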

In this paper, we compare three lightweight approaches for constructing context.

user The user context is the top-k tag cloud TCU@k(u) of the user who is acting and whose action should be contextualized, i.e. the tags he/she used most frequently.

resource If a user has navigated to a certain resource r, then the tag cloud of the resource, TCR@k(r), can be used as context to adapt to his/her next activities.

group Correspondingly, if the user is currently browsing a group g of resources, e.g. a GroupMe! group or a set of images in Flickr, then TCG@k(g) can model his/her context.

The user context corresponds to the naive user modeling strategy described in [21] and only works if the user is already known to the system by means of previously performed tagging activities. In our evaluation we use the user context strategy as a benchmark and investigate whether the resource and group context strategies, which do not require any previous knowledge about the user, can compete with it.

The context models are deliberately simple. More complex models can be constructed by combining the context models above or by memorizing the resource and group context for a user over a specific period of time. In our evaluation we set k = 20 and thus considered the top 20 tags of the tag clouds.

3.3 Contextualizing Rankings
Our approach targets topic-sensitive ranking algorithms, i.e. algorithms that rank entities (users, tags, and resources) with respect to some topic specified via a query. By contextualization of rankings we mean that the ranking respects the query as well as the context given by means of a tag cloud (see the previous section). In the scenario above, the query was given as a single tag: Alice clicked on a tag to retrieve both a ranked list of resources and a ranked list of users who are experts in Alice’s current area of interest. A query might, however, also consist of multiple tags and can then be interpreted as a tag cloud as well, where the tags are usually weighted equally.

Definition 3. The generic algorithm for computing contextualized rankings simply combines the ranking computed with respect to the query tag cloud with the one computed for the context tag cloud.

1. Input: query TCq, context TCc, folksonomy F, ranking algorithm a, context influence d

2. Compute a ranking Rq based on the query tag cloud, Rq ← a.rank(TCq, F), and a ranking Rc based on the context tag cloud, Rc ← a.rank(TCc, F). Rq and Rc are sets of weighted entities (ei, wi,q) and (ei, wi,c), respectively.

3. Compute the result ranking Rr by averaging Rq and Rc. Rr contains weighted entities (ei, wi,r), where wi,r = (1 − d) · wi,q + d · wi,c.

4. Output: Rr, the set of weighted entities (ei, wi,r), where wi,r denotes the weight (ranking score) assigned to the ith entity (user, tag, or resource).

A contextualized ranking is thus a weighted average of the query and context rankings. In the next section we present different ranking algorithms, applicable to folksonomies, that can be used as input to the contextualization algorithm specified in Def. 3.
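The combination step of Def. 3 is a simple linear blend; a minimal sketch follows. The function name is ours, and scoring an entity 0 in a ranking it does not appear in is our assumption.

```python
def contextualize(rank_q, rank_c, d=0.5):
    """Combine query and context rankings (Def. 3):
    w_r = (1 - d) * w_q + d * w_c, with d the context influence.
    Entities missing from one ranking contribute 0 there."""
    entities = set(rank_q) | set(rank_c)
    return {e: (1 - d) * rank_q.get(e, 0.0) + d * rank_c.get(e, 0.0)
            for e in entities}

# Toy rankings over three resources:
rank_q = {"r1": 0.8, "r2": 0.2}
rank_c = {"r2": 0.9, "r3": 0.1}
result = contextualize(rank_q, rank_c, d=0.5)
print(sorted(result.items(), key=lambda kv: -kv[1]))
```

Note how the context promotes r2 past r1 even though r1 dominates the pure query ranking.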

4. RANKING ALGORITHMS
In this section we first recapitulate ranking algorithms developed in previous work before we introduce a new algorithm: SocialHITS.

4.1 FolkRank
The FolkRank algorithm [13] operates on the folksonomy model specified in Definition 1. FolkRank transforms the hypergraph spanned by the tag assignments into a weighted tripartite graph GF, where an edge connects two entities (user, tag, or resource) if both entities occur together in a tag assignment within the folksonomy, and the weight of an edge corresponds to the number of such co-occurrences. For example, the weight of an edge connecting a tag t and a resource r is defined as w(t, r) = |{u ∈ U : (u, t, r) ∈ Y}| (cf. Definition 1) and thus corresponds to the number of users who have annotated r with t. The constructed graph GF serves as input for an adaptation of Personalized PageRank [24]: ~w ← dA_GF ~w + (1 − d)~p, where the adjacency matrix A_GF models the folksonomy graph GF, ~p makes it possible to specify preferences (e.g. for a tag), and d adjusts the influence of the preference vector. FolkRank applies the adapted PageRank twice, first with d = 1 and then with d < 1 (in our evaluation we set d = 0.7, as done in [13]). The final vector, ~w = ~w_{d<1} − ~w_{d=1}, contains the FolkRank of each folksonomy entity.
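The two PageRank runs and the final difference vector can be sketched as follows. This is a toy illustration on a hand-built 3-node graph; the matrix, preference vector, and iteration count are our own illustrative choices, not the paper's experimental setup.

```python
def adapted_pagerank(A, p, d, iterations=100):
    """Iterate w <- d * A * w + (1 - d) * p on a column-stochastic
    adjacency matrix A (A[i][j] is the weight from node j to node i)."""
    n = len(p)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [d * sum(A[i][j] * w[j] for j in range(n)) + (1 - d) * p[i]
             for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]  # keep the total mass at 1
    return w

def folkrank(A, p, d=0.7):
    """FolkRank: preference-biased run (d < 1) minus baseline run (d = 1)."""
    biased = adapted_pagerank(A, p, d)
    base = adapted_pagerank(A, p, 1.0)
    return [b - c for b, c in zip(biased, base)]

# Fully connected 3-entity toy graph, column-normalized:
adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
col = [sum(row[j] for row in adj) for j in range(3)]
A = [[adj[i][j] / col[j] for j in range(3)] for i in range(3)]
p = [1.0, 0.0, 0.0]  # preference on entity 0, e.g. the query tag
fr = folkrank(A, p)
print(fr)  # entity 0 ends up with a positive FolkRank, the others negative
```

The differential step is what makes the ranking topic-sensitive: entities that gain weight only through the preference bias stand out against the structural baseline.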

4.2 GFolkRank
GFolkRank [4] is a context-sensitive ranking algorithm based on FolkRank. It expects a group context folksonomy (see Def. 2) as input and adapts the process of transforming the hypergraph spanned by the folksonomy into the weighted folksonomy graph GF (cf. Section 4.1). It interprets groups as artificial tags and creates a new tag tg ∈ TG, with TG ∩ T = ∅, for each group g. These artificial tags are assigned to all resources contained in g, whereby the user who added a resource to the group is declared the tagger. The set of nodes is thus extended by TG: VB = VA ∪ TG. The edges added by the GFolkRank algorithm are: EB = EA ∪ {{u, tg}, {tg, r}, {u, r} | u ∈ U, tg ∈ TG, r ∈ R, u has added r to group g}. We use a constant value wc to weight these edges, because a resource is usually added to a certain group only once.
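The edge construction might be sketched as follows; the function name, the "g:" prefix for artificial group tags, and the concrete value of wc are our own illustrative choices.

```python
def gfolkrank_edges(Yg, wc=5.0):
    """Weighted edges for GFolkRank's graph from a group context folksonomy.

    Yg contains (user, tag, resource, group) assignments (group may be None).
    Co-occurrence edges accumulate counts as in FolkRank; each group g
    becomes an artificial tag 'g:<g>' whose edges get the constant weight wc,
    since a resource is usually added to a group only once."""
    edges = {}
    for (u, t, r, g) in Yg:
        for e in [(u, t), (t, r), (u, r)]:
            edges[e] = edges.get(e, 0.0) + 1.0
        if g is not None:
            tg = "g:" + str(g)          # artificial tag for group g
            edges[(u, tg)] = wc
            edges[(tg, r)] = wc
    return edges

Yg = {("bob", "hypertext", "r1", "trip"), ("bob", "2009", "r1", "trip")}
print(gfolkrank_edges(Yg))
```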

4.3 GRank
GRank [3] is a group-sensitive ranking algorithm, as is GFolkRank. However, GRank is not based on FolkRank, but exploits the group context folksonomy in a straightforward way. Given a query tag tq, the GRank algorithm detects the set of tag assignments (u, t, r, g) ∈ Yq whose resource r ∈ R is (a) directly annotated with tq, (b) contained in a group that is tagged with tq, (c) grouped together with a resource directly annotated with tq, or (d) a group which contains a resource directly annotated with tq. The entities (users, tags, and resources) are then weighted according to their occurrence frequency within the tag assignments of Yq. For more details on GRank we refer the reader to [3, 4].
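A rough sketch of the four relevance cases and the frequency weighting follows. This is our own simplification: `groups` maps group ids to their member resource sets, and group-level tag assignments are modeled as assignments whose resource component is the group id.

```python
from collections import Counter

def grank(Yg, groups, tq):
    """Simplified GRank: gather tag assignments whose resource is relevant
    to the query tag tq via cases (a)-(d), then score entities by how
    often they occur in the gathered assignments."""
    # (a) resources (or groups) directly annotated with tq:
    tagged = {r for (u, t, r, g) in Yg if t == tq}
    relevant = set(tagged)
    for g, members in groups.items():
        # (b) members of a group tagged with tq, (c) resources grouped with
        # a directly annotated resource, (d) the containing group itself:
        if g in tagged or members & tagged:
            relevant |= members | {g}
    scores = Counter()
    for (u, t, r, g) in Yg:
        if r in relevant:
            scores.update([u, t, r])
    return scores

Yg = [("bob", "conference", "site", "trip"),
      ("bob", "turin", "video", "trip")]
groups = {"trip": {"site", "video"}}
print(grank(Yg, groups, "conference").most_common(3))
```

Here "video" becomes relevant only through case (c): it shares the group "trip" with the directly annotated "site".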

4.4 SocialHITS
In [16], Kleinberg introduces the HITS algorithm, which detects hub and authority entities in hyperlinked network structures. A hub is an entity that links to many high-quality authority entities, and an authority is an entity that is linked to by many high-quality hub entities. Hence, the HITS algorithm is based on a mutually reinforcing relationship between hubs and authorities. The operations that update the authority weight x⟨p⟩ and hub weight y⟨p⟩ of an entity p are defined by the operations A and H [16]:

A: x⟨p⟩ ← Σ_{q : (q,p) ∈ E} y⟨q⟩   (1)

H: y⟨p⟩ ← Σ_{q : (p,q) ∈ E} x⟨q⟩   (2)

Here, E denotes the set of directed edges within the given graph G. The core algorithm of HITS, which detects the authorities and hubs in a given graph G, performs k iterations in order to update x⟨p⟩ and y⟨p⟩ for each entity (node) in G. The core iteration is defined as follows [16].

Definition 4. Core HITS iteration.

function iterate(G, k)
    G: a graph containing n linked entities
    Let x and y be vectors containing the authority and hub weights.
    Set x_0 and y_0 to (1/n, 1/n, ..., 1/n) ∈ R^n
    for i = 1, 2, ..., k do:
        x′_i ← apply A to (x_{i−1}, y_{i−1})
        y′_i ← apply H to (x′_i, y_{i−1})
        x_i ← x′_i / ‖x′_i‖_1
        y_i ← y′_i / ‖y′_i‖_1
    end
    return (x_k, y_k)
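For illustration, the iteration of Definition 4 can be written out in a few lines. This is our own sketch: edges are (source, target) pairs, and the 1-norm is used for the normalization step.

```python
def iterate(G, k):
    """Core HITS iteration. G is a set of directed edges (source, target);
    returns the authority vector x and hub vector y as dicts over nodes."""
    nodes = {n for e in G for n in e}
    x = {n: 1.0 / len(nodes) for n in nodes}
    y = dict(x)
    for _ in range(k):
        # Operation A: authority of p sums hub weights of its in-neighbors.
        x_new = {p: sum(y[q] for (q, q2) in G if q2 == p) for p in nodes}
        # Operation H: hub of p sums (updated) authority of its out-neighbors.
        y_new = {p: sum(x_new[q] for (p2, q) in G if p2 == p) for p in nodes}
        for v in (x_new, y_new):
            s = sum(v.values()) or 1.0
            for n in v:
                v[n] /= s               # normalize by the 1-norm
        x, y = x_new, y_new
    return x, y

# Toy graph in the style of [27]: two users point to a tag, the tag to a resource.
edges = {("u1", "t1"), ("u2", "t1"), ("t1", "r1")}
auth, hub = iterate(edges, k=10)
print(max(auth, key=auth.get))  # t1 collects authority from both users
```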

The graph G that is passed to the core iteration of HITS has to be a directed graph. In general, G is a partial Web graph consisting of linked resources that are possibly relevant to a certain topic (cf. [16]). The challenge of applying HITS to folksonomies is to transform a folksonomy into a directed graph, in contrast to the undirected graph (GF) used by the ranking algorithms in the previous sections; the tag assignments do not explicitly prescribe a direction. In [27] the authors propose the following strategy: if there is a tag assignment (u, t, r) ∈ Y, then the edges “u → t” and “t → r” are constructed. Hence, hubs are restricted to be users, while the authority role is bound to resources. In our evaluations we denote that strategy as naive HITS. Our approach does not limit the role of hubs and authorities to certain folksonomy entity types, but makes it possible to detect authority users as well.

The construction of the directed folksonomy graph has to consider the design of the folksonomy system and, in particular, its user interface. In the GroupMe! system, for example, a resource rh can be interpreted as a hub of a tag ta assigned to rh, because each resource displays its tag cloud, whereas in tagging systems that do not show the tags of resources it is not possible to draw that conclusion (cf. tagging support: “viewable” vs. “blind” in [18]).

Authorities:
  user: a high quality user annotates high quality resources before other users annotate them
  tag: is assigned by high quality users
  resource: (1) is tagged by high quality users with high quality tags; (2) is contained in high quality groups

Hubs:
  user: has annotated high quality resources and utilized high quality tags
  tag: is assigned to high quality resources
  resource: (1) is tagged with tags of high quality resources; (2) is contained in groups with high quality resources

Table 1: Overview of some characteristics of authority/hub users, tags, and resources.

Table 1 lists some characteristics of users, tags, and resources that indicate when they should be considered authorities or hubs, respectively. Some of these characteristics can be deduced from the traditional folksonomy model (see Definition 1), while others require additional context information. For example, regarding user entities, edges representing such characteristics can be constructed as follows.

hub users For all resources r a user u has annotated with a tag t, we can construct the edges “u → t” and “u → r”. The required information is thus contained in the tag assignments.

authority users According to Table 1, an authority user ua can also be characterized by the fact that other users have annotated resources after ua annotated them. Therefore, the timestamps of the tag assignments have to be evaluated, so that we can construct an edge “uh → ua” whenever a user uh annotates a resource that was already tagged by ua.
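Assuming timestamped assignments (u, t, r, time), these two edge-construction rules might be sketched as follows (the function and variable names are ours):

```python
def socialhits_edges(Yt):
    """Directed edges from timestamped tag assignments (user, tag, resource, time).

    Hub users: u -> t and u -> r for every assignment of u.
    Authority users: uh -> ua whenever uh annotates a resource
    that ua had already tagged at an earlier time."""
    edges = set()
    for (u, t, r, ts) in Yt:
        edges.add((u, t))
        edges.add((u, r))
        for (u2, t2, r2, ts2) in Yt:
            if r2 == r and ts2 < ts and u2 != u:
                edges.add((u, u2))   # the later tagger points to the earlier one
    return edges

Yt = [("anna", "java", "r1", 1), ("ben", "jvm", "r1", 2)]
print(("ben", "anna") in socialhits_edges(Yt))  # True: anna tagged r1 first
```

Because the authority edge points from the later tagger to the earlier one, users who repeatedly discover resources first accumulate in-links and thus authority.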

Given an appropriate strategy for constructing the directed folksonomy graph, which serves as input to the core HITS iteration (see Definition 4), SocialHITS can be defined as follows.

Definition 5. The SocialHITS algorithm computes huband authority values for arbitrary folksonomy entities.

1. Input: folksonomy F, topic t, search strategy st, graphconstruction strategy sg, and the number of HITS it-erations k to perform

2. Ft ← apply st to F in order to search for entities and tag assignments relevant to t

3. GD ← apply sg to Ft

4. (xk, yk) ← iterate(GD, k)

5. Output: the vectors xk and yk containing the authority and hub values of the entities in Ft
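Step 4 is the standard HITS update of Kleinberg [16] on the constructed directed graph. A minimal sketch follows; the per-iteration L2 normalization is an assumption, since Definition 4 is not reproduced in this excerpt.

```python
import math

def iterate(edges, k):
    """Run k HITS iterations on a directed graph given as (src, dst) edges.
    Returns (authority, hub) score dictionaries over all nodes."""
    nodes = {n for edge in edges for n in edge}
    auth = dict.fromkeys(nodes, 1.0)
    hub = dict.fromkeys(nodes, 1.0)
    for _ in range(k):
        # Authority update: sum of hub scores over incoming edges.
        new_auth = {n: sum(hub[s] for s, d in edges if d == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: sum of authority scores over outgoing edges.
        new_hub = {n: sum(auth[d] for s, d in edges if s == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return auth, hub
```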

Figure 1: Tag usage in the GroupMe! data set on a logarithmic scale. Only a few distinct tags have been used frequently while most of the tags are only used once.

In our evaluations we applied a search strategy, which simply accumulated the set of entities delivered by FolkRank, GFolkRank, and GRank (without ranking the items), and utilized the sum of authority and hub score to rank.

5. EXPERIMENT

In Section 3.2 we proposed different ways to construct context by means of tag clouds that describe the actual setting of the user. Section 3.3 explained how rankings can be adapted to such context independently of the underlying ranking algorithm. Several applicable ranking algorithms were discussed in the previous section. In general, we now have a tool box that helps tagging systems to adapt rankings to the actual desires of the users. In this section we evaluate the tool box with respect to the following task.

Ranking Task. Given a keyword query (tag) and a context (set of weighted tags), the task of the ranking strategy is to compute a ranking of folksonomy entities so that entities that are most relevant to both the keyword query and the context appear at the top of the ranking.

In particular, we will answer the following questions.

1. How does the consideration of the different context types influence the performance of the algorithms in fulfilling the task above?

2. Which type of context (cf. Section 3.2) is most appro-priate?

3. Which algorithm (cf. Section 4) performs best?

We are furthermore interested in the strengths of the algorithms regarding the type of entity (user, tag, or resource) that should be ranked. Moreover, the ranking algorithms possibly prefer different types of context. Our goal is to clarify how each individual ranking algorithm can benefit from the knowledge about the context.

5.1 Data Set and Test Set

We ran our experiments on a data set of the GroupMe! tagging system (cf. Section 3.1). The data set comprised 450 users, who mainly come from the research community in Computer Science. Together they bookmarked 2189 Web



resources, created 550 groups to organize these bookmarks, and made 3190 tag assignments using 1699 different tags. Figure 1 illustrates that the tag usage resembles a power-law distribution: there are a lot of tags (72.04%) that were used only once and only a few tags that were applied frequently. For example, the tag "semantic web" was assigned 60 times and was therewith the most frequently used tag. Hence, regarding the tag usage distribution we observed characteristics similar to those that also occur in larger data sets (cf. [8, 10]).

For our experiments, we defined a test set of 19 search settings, where each setting was formed by a keyword query (tag) and a context consisting of (i) the user u who performs a search activity, (ii) the resource r the user u accessed before initiating the search activity, and (iii) the group that contains r. We thus simulated the scenario described in Section 3.1, where the user Alice first accessed a group of resources related to the "Hypertext '09 conference", then focused on a certain resource (the conference website), before she finally clicked on the tag "conference" to search for related content. For the search settings, we selected tags as queries that cover the different spectra of the tag usage distribution. In particular, we chose 6 tags that were used 1-10 times (e.g. "soa" and "james bond"), 9 tags that were used 11-20 times (e.g. "conference" and "beer"), and 4 tags that were used more than 20 times (e.g. "hannover" and "semantic web"). The topics of the different search settings represented the diversity of topics available in the GroupMe! data set. For each of the 19 search settings we also selected a resource and a corresponding group as context, where the resource context tag cloud (cf. TCR(r), Section 3.2) contained 3.21 tags on average and the group context tag cloud TCG(g) contained 13.58 tags. Further, for each search setting we defined a user as actor. Here, the condition was that the actor is also related to the topic of the setting, i.e. we only selected those users who already used the tags that occurred in the tag clouds of the corresponding resource (TCR(r)) and group (TCG(g)) of the setting. Thereby, we tried to give the user modeling strategy (TCU(r)) the same opportunities as the resource and group context strategies to fulfill the task defined above.

5.2 User Study

Given the different search settings, we conducted a user study with users of the GroupMe! system (10 PhD students and student assistants). We presented the participants of the study with a search setting together with a list of users, tags, and resources that were determined by accumulating the rankings of the different strategies for the given search setting. For each entity (user, tag, or resource) the participants judged the relevance of the entity with respect to the (i) query, (ii) group context, (iii) resource context, and (iv) user (actor) context. Therefore, they were enabled to easily gather information on which they could base their judgements, e.g. all involved entities were clickable and the participants were able to see an entity while judging it. In particular, the participants had to answer whether an entity is relevant or not on a five-point scale: yes, rather yes, rather no, no, and don't know. Thereby, we obtained a set of 8593 user-generated judgements, in particular 1550 yes, 1549 rather yes, 1097 rather no, 4242 no, and 155 don't know judgements. Figure 2 overviews the 8593 user judgements and the overall judging behavior of the participants with

Figure 2: Characteristics of the judgment behavior in the user study with respect to the types of rated entities (user, tag, or resource) and the type of judgment basis (query, group context, resource context, or user context)

respect to the type of entity (user, tag, and resource) that was judged on the basis of its relevance to the query and the different parts of the context (group, resource, and user context). The average judgement is given as a number, where 0 means don't know, 1 means no, 2 means rather no, etc. The standard deviation σ is averaged across the deviations of judgments where different participants evaluated the same entity with respect to the same query/context.

Overall, the standard deviation indicates that the judgments of the participants were very homogeneous. Rating the relevance of entities with respect to the user context was probably the most difficult task for the participants, because for this they had to browse the profile of the corresponding user, i.e. the groups he/she created, the resources he/she bookmarked, and the tags he/she used in the past. Hence, the standard deviation for that judgement task is higher than for the others. Judging tags was the most intuitive task and also yielded the most homogeneous judgements. On average, the resources were rated better than tags and users. This can be explained by the number of possibly relevant entities listed in the user study. For example, there were probably less than 5 of 22 users but more than 20 of 43 resources relevant to the query "james bond". However, even if there were a slightly different judging behavior regarding the different types of entities (users, tags, and resources), this would not influence our results as all algorithms were confronted with the same settings.

In general, the characteristics of the data set of judgements carried out during the user study enable us to gain statistically well-grounded results.

5.3 Method and Metrics

According to the ranking task, which we defined at the beginning of the section, the different strategies had to rank users, tags, and resources with respect to a given search setting consisting of a query and context as described in Section 5.1. We combined the ranking algorithms presented in Section 4 with the different context models presented in Section 3.2 and then passed them to the algorithm for contextualizing rankings (Definition 3 in Section 3.3). Thereby we obtained 12 strategies, e.g. FolkRank(user), which denotes the strategy that applies the FolkRank algorithm together with the user context, or GRank(resource), which is the strategy that contextualizes the ranking produced by GRank with the resource context. Each ranking strategy then had to compute a user, tag, and resource ranking for each of the 19 search settings, which consist of a query and the (user, group, and resource) context. Thus, each strategy had to compute 57 rankings.

To measure the quality of the rankings we used the following metrics (cf. [26]):

MRR The MRR (Mean Reciprocal Rank) indicates at which rank the first relevant entity occurs on average.

S@k The Success at rank k (S@k) stands for the mean probability that a relevant entity occurs within the top k of the ranking.

P@k Precision at rank k (P@k) represents the average proportion of relevant entities within the top k.
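Per-ranking versions of the three metrics can be sketched as follows; the reported MRR, S@k, and P@k values are these quantities averaged over all test runs, which happens outside these helpers.

```python
def reciprocal_rank(ranking, relevant):
    """1/rank of the first relevant entity, 0 if none occurs (MRR averages this)."""
    for i, entity in enumerate(ranking, start=1):
        if entity in relevant:
            return 1.0 / i
    return 0.0

def success_at_k(ranking, relevant, k):
    """1.0 if a relevant entity occurs within the top k, else 0.0 (S@k averages this)."""
    return 1.0 if any(e in relevant for e in ranking[:k]) else 0.0

def precision_at_k(ranking, relevant, k):
    """Proportion of relevant entities within the top k (P@k averages this)."""
    return sum(1 for e in ranking[:k] if e in relevant) / k
```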

For our experiment we considered an entity as relevant iff the average user judgement is at least "rather yes" (rating score ≥ 3.0). For example, given three "rather yes" judgments (rating score = 3) and two "rather no" judgments (rating score = 2) for the same entity with respect to some setting, the entity was treated as not relevant, because the average rating score is 2.6 and therewith smaller than 3.0 ("rather yes"). Judgements where the participant stated "don't know" were treated as "no".
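The judgement aggregation just described can be sketched as follows, using the numeric scale from Section 5.2 (1 = no, ..., 4 = yes) and mapping "don't know" to the score of "no" before averaging, as stated above.

```python
# Numeric scale from the user study; "don't know" falls through to "no".
SCORES = {"yes": 4, "rather yes": 3, "rather no": 2, "no": 1}

def is_relevant(judgements):
    """An entity counts as relevant iff the average rating is at least
    'rather yes' (3.0); 'don't know' is treated as 'no'."""
    scores = [SCORES.get(j, SCORES["no"]) for j in judgements]
    return sum(scores) / len(scores) >= 3.0
```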

5.4 Results

We present the results according to the following structure. We first evaluate the performance of the newly introduced SocialHITS algorithm, independently of the context strategy used. Afterwards we give an overview of our core results, which allow us to answer the questions raised at the beginning of this section. In Subsection 5.4.3 we analyze the performance of the strategies when they have to rank (a) user and (b) resource entities. We will particularly investigate the ability of the algorithms to rank users, because this has not been studied extensively in previous work yet. Our result analysis finishes with a summary regarding the performance of the different context models, which are used to adapt the rankings to the actual context of a user.

We tested the statistical significance of all following results with a two-tailed t-test and a significance level of α = 0.05. The null hypothesis H0 is that some strategy s1 is as good as another strategy s2, while H1 states that s1 is better than s2.
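Such a test can be sketched over paired per-setting scores of two strategies; this is an illustrative assumption about the pairing, and only the t statistic is computed here. The two-tailed p-value would then be looked up in a t-distribution with n-1 degrees of freedom, e.g. via `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(scores_s1, scores_s2):
    """t statistic of a paired t-test comparing two strategies, where
    scores_s1[i] and scores_s2[i] are measured on the same search setting."""
    diffs = [a - b for a, b in zip(scores_s1, scores_s2)]
    # t = mean(d) / (sd(d) / sqrt(n)); requires non-constant differences.
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```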

5.4.1 SocialHITS vs. naive HITS

The SocialHITS algorithm, which we introduced in Definition 5, expects a graph construction strategy as input, which creates a directed graph from the given folksonomy. A naive approach to construct such a graph is presented in [27]. Figure 3 compares this straightforward application of HITS with SocialHITS, a more complex approach that results in a graph with higher compactness. The results are based on 171 test runs, where the algorithms had to rank users, tags, or resources regarding the different search settings described above. Entities were considered as relevant iff they were, according to the user judgments, relevant to both the query and the context. SocialHITS outperforms the naive HITS algorithm significantly with respect to all metrics. For example, the mean reciprocal rank (MRR),

Figure 3: SocialHITS vs. naive HITS strategy (ordered by MRR(both)).

which indicates the average rank of the first relevant entity, is more than 50% better when using SocialHITS instead of the naive approach. The same holds for S@1. In particular, the probability that a relevant entity appears at the first rank is 47.4% when using SocialHITS in contrast to 28.7% when the naive approach is applied. Further, the precision within the top 10 is significantly higher for the SocialHITS algorithm.

The performance differences were obvious for every single ranking result. The naive HITS algorithm performed worst when it had to rank user entities. This can be explained by the underlying graph construction strategy, which implies an authority score of zero for user entities.

As SocialHITS outperforms the naive HITS approach, we consider only SocialHITS for our comparisons with the other ranking algorithms presented in Section 4.

5.4.2 Result Overview

Figure 4 overviews the core results of our experiment. It shows the quality of the ranking algorithms (Section 4) in combination with the different context models (Section 3.2) when using the contextualization strategy defined in Section 3.3. The metrics MRR(context), S@1(context), and P@10(context) determine the relevance of a particular entity with respect to the context, which is formed by the actor of a search setting as well as the resource and group context. For MRR(both), S@1(both), and P@10(both), relevance is given iff the entity is relevant to both the query and the context of a search setting.

The GRank algorithm in combination with the resource context (GRank(resource)) is the most successful strategy for computing folksonomy entity rankings that should be adapted to a given search setting. GRank(resource) performs significantly better than all other strategies except for GRank(group) and GFolkRank(resource). Overall, Figure 4 reveals two main results: (1) the GRank algorithm is the best performing algorithm and (2) independently of the algorithm used, the resource and group context models produce better results than the user context strategy.

It is interesting to see that the precisions P@10(context) and P@10(both) do not differ significantly, which means that the items that are included in the top 10 rankings because of their relevance to the context are also relevant to the query. This gives supplemental motivation for the work presented in this paper, as it indicates that the consideration of context does not reduce the precision of the result rankings within the top 10. Similarly, this motivation can be deduced from the S@1 metrics, as there is no



Figure 4: Performance of the different strategies with respect to the task of ranking folksonomy entities (ordered by MRR(both)).

significant difference between S@1(context) and S@1(both) for the strategies that make use of the resource or group context. However, the consideration of user context causes imprecision regarding query relevance at the very top of the ranking. For example, the probability of retrieving an item that is relevant to the context of a search setting is 75.4% when GRank(user) is applied, whereas the probability that this item is relevant to the query as well is just 59.6%.

Between FolkRank and GFolkRank, the group-sensitive extension of FolkRank, there is no significant difference in general, but GFolkRank performs better than FolkRank for all the different context models. The SocialHITS algorithm tends to be outperformed by the other algorithms. The performance of SocialHITS depends on the type of entity that should be ranked, while the performance of the other algorithms is rather constant in this regard. SocialHITS performs significantly worse when it has to rank tags instead of users or resources. Hence, the role of tags in the model of SocialHITS (cf. Table 1) should possibly be revised in future work to make SocialHITS also applicable to the ranking of tags.

5.4.3 Ranking Users and Resources

The task of ranking resources is possibly the most prominent ranking application, because it is, for example, applied to put search results into an appropriate order. Figure 5 overviews the performance of the different algorithms for that task, averaged across the test runs targeting the different search settings while considering either the user, group, or resource context. The metrics MRR, S@1, and P@10 are measured based on the relevance of a resource to both

Figure 5: Performance of the different algorithms with respect to the task of ranking resources (ordered by MRR).

the query and the context of the corresponding search setting. GRank is significantly the best algorithm to rank resources, followed by GFolkRank. Both algorithms exploit group structures in group context folksonomies (see Definition 2). Such folksonomies arise in tagging systems such as Flickr or GroupMe!, which allow their users to group and tag resources. In folksonomy systems that do not offer the notion of groups these algorithms would not work properly. In these systems SocialHITS would be the preferred choice because it shows better results than the FolkRank algorithm.

The results of the experiment focussing on ranking users are of particular interest because so far there exist, to the best of our knowledge, no studies that analyze the quality of folksonomy-based ranking algorithms in this regard. A set



Figure 6: Performance of the different algorithms with respect to the task of ranking users (ordered by MRR).

Figure 7: Performance depending on the used context type (ordered by MRR(both)).

of exciting applications can be realized with the aid of a user ranking functionality. For example, it can be applied to find experts on a certain topic or to recommend users to each other who have, based on their tagging behavior, similar interests.

The qualification of the algorithms to rank user entities can be derived from the results shown in Figure 6. Overall, the outcomes are, regarding P@10, worse than those of the resource ranking experiment depicted in Figure 5. This can be explained by the absolute number of users possibly relevant to a search setting, which is lower than the number of possibly relevant resources. GRank is again the best performing algorithm. For example, the probability that a user who is relevant to the query and context appears in the first position of the ranking is 94.7%. SocialHITS is the second best strategy, with an S@1 score that is 20% higher than that of GFolkRank and FolkRank. Further, the mean reciprocal rank (MRR) of SocialHITS is more than 10% better than that of GFolkRank and FolkRank, which do not differ significantly in their performance. Hence, SocialHITS is again the first choice for applications that cannot make use of GRank and that especially require a high precision at the very top of the rankings, e.g. applications that try to predict links between users.

5.4.4 Synopsis

From the results presented in the previous subsections we can identify GRank, which we introduced in [3], as the best performing algorithm for ranking entities in group context folksonomies. When it comes to ranking users or resources, SocialHITS, which performs significantly better than the naive HITS approach, is the best algorithm operating on the traditional folksonomy model (cf. Definition 1).

Figure 7 abstracts from the underlying ranking algorithms and summarizes the results listed in Figure 4 from the perspective of the context type that was considered by the algorithms to adapt the rankings to a particular search setting. According to the results shown in Figure 7, we can clearly put the strategies into an order: (1) the resource context yields significantly better results than the group and user context, (2) the group context strategy produces significantly better results than the user context strategy, while (3) the user context strategy performs worst. As described in Section 3.2, context is formed by the tag cloud of a resource, group, or user respectively. The size of the different tag cloud types differed: resource tag clouds contained on average 3.21 tags, group tag clouds 13.58, and user tag clouds were limited to 20 tags. However, the pure size of the context tag clouds alone does not explain the outcomes of the experiment. For example, for some settings group context tag clouds containing more than 15 tags delivered better results than smaller tag clouds, while for other settings it was the other way round. Hence, rather the homogeneity of a tag cloud used as context seems to influence the quality of contextualizing a ranking. The user context, i.e. the top tags of the user who performs a search activity, is thematically multi-faceted, which explains that the mean reciprocal rank measured with respect to the context (MRR(context)) is higher than the MRR measured regarding the relevance to the query (MRR(query)).

Overall, the excellent results of the resource and group context strategies are impressive, because they do not require any previous knowledge about the user, but just capture the current context of a user. The user modeling strategy, on the contrary, requires such knowledge. Our results therewith have a direct impact on the end users of a tagging system, as they can benefit from the adaptation of result rankings to their current needs even if they are not known to the system.

6. CONCLUSIONS

Ranking in folksonomies is currently an important research topic. In this paper we proposed an approach that makes it possible to adapt rankings to the actual context of a user independently of the underlying ranking algorithm. We presented different strategies that are able to form such context by the notion of tag clouds even if no previous knowledge about the user is available. Furthermore, we introduced SocialHITS, a new ranking algorithm for folksonomy systems, and showed that it significantly improves the HITS-based approach proposed in [27]. We analyzed the performance of SocialHITS and other folksonomy-based ranking algorithms for the task of contextualizing rankings while considering different types of context and revealed that those strategies which do not require any previous knowledge about the user perform significantly better than tag-based user modeling. For example, by considering the tag cloud of a resource the user has just visited, we are able to adapt the ranking of a subsequent search activity to the user's current context. A remarkable feature of our evaluation was that we also measured the ranking performance with respect to the task of



ranking users, which is new in the field of research on folksonomies and further promises high impact on the future of social networking. Here, we identified SocialHITS as one of the most promising ranking algorithms.

Acknowledgments. We thank all the participants of the user study. This research has been partially funded by DAAD and by Ateneo Italo-Tedesco through the Vigoni German and Italian researchers exchange Program 2007-2008.

7. REFERENCES

[1] F. Abel, M. Frank, N. Henze, D. Krause, D. Plappert, and P. Siehndel. GroupMe! – Where Semantic Web meets Web 2.0. In Int. Semantic Web Conference (ISWC 2007), November 2007.

[2] F. Abel, N. Henze, and D. Krause. Exploiting additional Context for Graph-based Tag Recommendations in Folksonomy Systems. In Int. Conf. on Web Intelligence and Intelligent Agent Technology. ACM Press, December 2008.

[3] F. Abel, N. Henze, and D. Krause. A Novel Approach to Social Tagging: GroupMe! In 4th Int. Conf. on Web Information Systems and Technologies (WEBIST), May 2008.

[4] F. Abel, N. Henze, D. Krause, and M. Kriesell. On the effect of group structures on ranking strategies in folksonomies. In R. Baeza-Yates and I. King, editors, Weaving Services, Location, and People on the WWW. Springer, to appear.

[5] S. Bao, G. Xue, X. Wu, Y. Yu, B. Fei, and Z. Su. Optimizing Web Search using Social Annotations. In Proc. of 16th Int. World Wide Web Conference (WWW '07), pages 501–510. ACM Press, 2007.

[6] A. Byde, H. Wan, and S. Cayzer. Personalized tag recommendations via tagging and content-based similarity metrics. In Proc. of the Int. Conf. on Weblogs and Social Media (ICWSM), March 2007.

[7] P.-A. Chirita, C. Firan, and W. Nejdl. Personalized Query Expansion for the Web. In Proc. of the 30th Int. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07), pages 7–14, New York, NY, USA, 2007. ACM.

[8] K. Dellschaft and S. Staab. An epistemic dynamic model for tagging systems. In P. Brusilovsky and H. C. Davis, editors, Hypertext, pages 71–80. ACM, 2008.

[9] C. S. Firan, W. Nejdl, and R. Paiu. The Benefit of Using Tag-based Profiles. In Proc. of the 2007 Latin American Web Conference (LA-WEB 2007), pages 32–41, Washington, DC, USA, 2007. IEEE Computer Society.

[10] H. Halpin, V. Robu, and H. Shepherd. The Complex Dynamics of Collaborative Tagging. In Proc. of 16th Int. World Wide Web Conference (WWW '07), pages 211–220, New York, NY, USA, 2007. ACM Press.

[11] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. BibSonomy: A Social Bookmark and Publication Sharing System. In Proc. First Conceptual Structures Tool Interoperability Workshop, pages 87–102, Aalborg, 2006.

[12] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Emergent Semantics in BibSonomy. In C. Hochberger and R. Liskowsky, editors, Informatik 2006: Informatik für Menschen, volume 94(2) of LNI, Bonn, October 2006. GI.

[13] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. FolkRank: A Ranking Algorithm for Folksonomies. In Proc. of Workshop on Information Retrieval (FGIR), Germany, 2006.

[14] R. Jäschke, L. B. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag recommendations in folksonomies. In Proc. 11th Europ. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD), pages 506–514, 2007.

[15] M.-C. Kim and K.-S. Choi. A Comparison of Collocation-based Similarity Measures in Query Expansion. Information Processing and Management, 35(1):19–30, 1999.

[16] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604–632, 1999.

[17] X. Li, L. Guo, and Y. E. Zhao. Tag-based social interest discovery. In Proc. of the 17th Int. World Wide Web Conference (WWW '08), pages 675–684. ACM Press, 2008.

[18] C. Marlow, M. Naaman, D. Boyd, and M. Davis. HT06, tagging paper, taxonomy, flickr, academic article, to read. In Proc. of the 17th Conf. on Hypertext and Hypermedia, pages 31–40. ACM Press, 2006.

[19] C. Marlow, M. Naaman, D. Boyd, and M. Davis. Position Paper, Tagging, Taxonomy, Flickr, Article, ToRead. In Collaborative Web Tagging Workshop at WWW '06, May 2006.

[20] T. Vander Wal. Folksonomy. http://vanderwal.net/folksonomy.html, July 2007.

[21] E. Michlmayr and S. Cayzer. Learning User Profiles from Tagging Data and Leveraging them for Personal(ized) Information Access. In Proc. of the Workshop on Tagging and Metadata for Social Information Organization, 16th Int. World Wide Web Conference (WWW '07), May 2007.

[22] P. Mika. Ontologies Are Us: A unified model of social networks and semantics. In Proc. Int. Semantic Web Conf. (ISWC 2005), pages 522–536, November 2005.

[23] A. Nauerz and G. Groh. Implicit social network construction and expert user determination in web portals. In AAAI Spring Symposium on Social Information Processing, pages 60–65. Stanford University, AAAI Press, 2008.

[24] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical report, Stanford Digital Library Technologies Project, 1998.

[25] T. Rattenbury, N. Good, and M. Naaman. Towards automatic extraction of event and place semantics from flickr tags. In Proc. of the 30th Int. ACM SIGIR Conf. on Information Retrieval (SIGIR '07), pages 103–110, New York, NY, USA, 2007. ACM Press.

[26] B. Sigurbjörnsson and R. van Zwol. Flickr tag recommendation based on collective knowledge. In Proc. of 17th Int. World Wide Web Conference (WWW '08), pages 327–336. ACM Press, 2008.

[27] H. Wu, M. Zubair, and K. Maly. Harvesting social knowledge from folksonomies. In Proc. of the 17th Conf. on Hypertext and Hypermedia (HT '06), pages 111–114, New York, NY, USA, 2006. ACM Press.
