Why We Twitter: Understanding Microblogging Usage and Communities

Akshay Java
University of Maryland Baltimore County
1000 Hilltop Circle
Baltimore, MD 21250, USA
aks1@cs.umbc.edu

Xiaodan Song
NEC Laboratories America
10080 N. Wolfe Road, SW3-350
Cupertino, CA 95014, USA
xiaodan@sv.nec-labs.com

Tim Finin
University of Maryland Baltimore County
1000 Hilltop Circle
Baltimore, MD 21250, USA
finin@cs.umbc.edu

Belle Tseng
NEC Laboratories America
10080 N. Wolfe Road, SW3-350
Cupertino, CA 95014, USA
belle@sv.nec-labs.com

ABSTRACT
Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. Twitter, a popular microblogging tool, has seen a lot of growth since it launched in October 2006. In this paper, we present our observations of the microblogging phenomenon by studying the topological and geographical properties of Twitter's social network. We find that people use microblogging to talk about their daily activities and to seek or share information. Finally, we analyze the user intentions associated at a community level and show how users with similar intentions connect with each other.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information Search and Retrieval - Information Filtering; J.4 [Computer Applications]: Social and Behavioral Sciences - Economics

General Terms
Social Network Analysis, User Intent, Microblogging, Social Media

1. INTRODUCTION
Microblogging is a relatively new phenomenon defined as "a form of blogging that lets you write brief text updates (usually less than 200 characters) about your life on the go and send them to friends and interested observers via text messaging, instant messaging (IM), email or the web."¹ It is provided by several services including Twitter², Jaiku³ and more recently Pownce⁴. These tools provide a lightweight, easy form of communication that enables users to broadcast and share information about their activities, opinions and status. One of the popular microblogging platforms is Twitter [29]. According to ComScore, within eight months of its launch, Twitter had about 94,000 users as of April 2007 [9]. Figure 1 shows a snapshot of the first author's Twitter homepage. Updates or posts are made by succinctly describing one's current status within a limit of 140 characters. Topics range from daily life to current events, news stories, and other interests. IM tools including Gtalk, Yahoo and MSN have features that allow users to share their current status with friends on their buddy lists. Microblogging tools make it easy to share status messages either publicly or within a social network.

¹ http://en.wikipedia.org/wiki/Micro-blogging

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Joint 9th WEBKDD and 1st SNA-KDD Workshop '07, August 12, 2007, San Jose, California, USA. Copyright 2007 ACM 1-59593-444-8...$5.00.

Figure 1: An example Twitter homepage with updates talking about daily experiences and personal interests.

² http://www.twitter.com
³ http://www.jaiku.com
⁴ http://www.pownce.com

Compared to regular blogging, microblogging fulfills a need for an even faster mode of communication. By encouraging shorter posts, it lowers the time and thought that users must invest in generating content. This is one of its main differentiating factors from blogging in general. The second important difference is the frequency of updates. On average, a prolific blogger may update her blog once every few days; a microblogger, on the other hand, may post several updates in a single day.

With the recent popularity of Twitter and similar microblogging systems, it is important to understand why and how people use these tools. Understanding this will help us evolve the microblogging idea and improve both microblogging client and infrastructure software. We tackle this problem by studying the microblogging phenomenon and analyzing different types of user intentions in such systems.

Much of the research in user intention detection has focused on understanding the intent of search queries. According to Broder [5], the three main categories of search queries are navigational, informational and transactional. Understanding the intention behind a search query is very different from understanding user intention in content creation. In a survey of bloggers, Nardi et al. [26] describe different motivations for "why we blog". Their findings indicate that blogs are used as a tool to share daily experiences, opinions and commentary. Based on their interviews, they also describe how bloggers form communities online that may support different social groups in the real world. Lento et al. [21] examined the importance of social relationships in determining whether users would remain active in a blogging tool called Wallop. A user's retention and interest in blogging could be predicted by the comments received and by continued relationships with other active members of the community. Users who are invited by people with whom they share pre-existing social relationships tend to stay longer and remain active in the network. Moreover, certain communities were found to have a greater retention rate due to the existence of such relationships. Mutual awareness in a social network has been found effective in discovering communities [23].

In computational linguistics, researchers have studied the problem of recognizing the communicative intentions that underlie utterances in dialog systems and spoken language interfaces. The foundations of this work go back to Austin [2], Strawson [32] and Grice [14]. Grosz [15] and Allen [1] carried out classic studies analyzing the dialogues between people, and between people and computers, in cooperative task-oriented environments. More recently, Matsubara [24] has applied intention recognition to improve the performance of an automobile-based spoken dialog system. While this work focuses on the analysis of ongoing dialogs between two agents in a fairly well-defined domain, studying user intention in Web-based systems requires looking at both the content and the link structure.

In this paper, we describe how users have adopted a specific microblogging platform, Twitter. Microblogging is relatively nascent, and to the best of our knowledge, no large-scale studies have been done on this form of communication and information sharing. We study the topological and geographical structure of Twitter's social network and attempt to understand the user intentions and community structure in microblogging. From our analysis, we find that the main types of user intentions are: daily chatter, conversations, sharing information and reporting news. Furthermore, users play different roles of information source, friend or information seeker in different communities.

The paper is organized as follows: in Section 2, we describe the dataset and some of the properties of the underlying social network of Twitter users. Section 3 provides an analysis of Twitter's social network and its spread across geographies. Next, in Section 4 we describe aggregate user behavior and community-level user intentions. Section 5 provides a taxonomy of user intentions. Finally, we summarize our findings and conclude with Section 6.

2. DATASET DESCRIPTION
Twitter is currently one of the most popular microblogging platforms. Users interact with this system through a Web interface, an IM agent or SMS updates. Members may choose to make their updates public or available only to friends. If a user's profile is made public, her updates appear in a public timeline of recent updates. The dataset used in this study was created by monitoring this public timeline for a period of two months, from April 1, 2007 to May 30, 2007. A set of recent updates was fetched once every 30 seconds. There are a total of 1,348,543 posts from 76,177 distinct users in this collection.
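Because the public timeline is polled repeatedly, consecutive fetches may return some of the same updates. The following is a minimal sketch of how such snapshots could be merged into a collection of distinct posts and distinct users. It is an illustration under stated assumptions, not the collection code used in this study: the record fields `id` and `user` and the function names are hypothetical, and no real Twitter API call is shown.

```python
def merge_snapshots(snapshots):
    """Merge polled timeline snapshots, keeping one record per post id.

    `snapshots` is an iterable of lists of post records. Each record is
    assumed (hypothetically) to be a dict with "id" and "user" keys.
    """
    posts = {}
    for snapshot in snapshots:
        for post in snapshot:
            # Keep the first copy seen; later duplicates are ignored.
            posts.setdefault(post["id"], post)
    return posts


def summarize(posts):
    """Return (number of distinct posts, number of distinct users)."""
    users = {post["user"] for post in posts.values()}
    return len(posts), len(users)


if __name__ == "__main__":
    # Two overlapping fetches of a toy timeline.
    fetch_1 = [{"id": 1, "user": "a"}, {"id": 2, "user": "b"}]
    fetch_2 = [{"id": 2, "user": "b"}, {"id": 3, "user": "a"}]
    print(summarize(merge_snapshots([fetch_1, fetch_2])))
```

Keying on the post identifier makes the merge independent of how much two consecutive fetches overlap.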

Twitter allows a user, A, to follow updates from other members who are added as friends. An individual who is not a friend of user A but follows her updates is known as a follower. Thus friendships can either be reciprocated or one-way. Using the Twitter developer API⁵, we fetched the social network of all users. We construct a directed graph G(V, E), where V represents the set of users and E represents the set of friend relations. A directed edge e exists from user u to user v if u declares v as a friend. There are a total of 87,897 distinct nodes with 829,053 friend relations between them. This graph has more nodes than the post collection has users because some users discovered through the link structure did not post during the period in which the data was collected. For each user, we also obtained their profile information and mapped their location to a geographic coordinate, details of which are provided in the following section.
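The graph construction described above can be sketched as follows. This is a small illustration rather than the code used in this study: the function names and the toy user names are hypothetical, and the input is assumed to be a list of (u, v) pairs meaning that u declares v as a friend.

```python
def build_friend_graph(declarations):
    """Build G(V, E) from (u, v) pairs, where u declares v as a friend."""
    nodes = set()
    edges = set()
    for u, v in declarations:
        nodes.update((u, v))
        edges.add((u, v))  # directed edge from u to v
    return nodes, edges


def split_by_reciprocity(edges):
    """Separate directed edges into reciprocated and one-way friendships."""
    reciprocated = {(u, v) for (u, v) in edges if (v, u) in edges}
    one_way = edges - reciprocated
    return reciprocated, one_way


if __name__ == "__main__":
    declarations = [("alice", "bob"), ("bob", "alice"), ("carol", "alice")]
    nodes, edges = build_friend_graph(declarations)
    reciprocated, one_way = split_by_reciprocity(edges)
    print(len(nodes), len(edges))   # 3 nodes, 3 directed edges
    print(sorted(one_way))          # [('carol', 'alice')]
```

In this toy example, carol is a follower of alice in the sense used above, since the edge from carol to alice is not reciprocated. Note that a node such as carol enters V through the link structure even if she has no posts in the collection.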

3. MICROBLOGGING IN TWITTER
This section describes some of the characteristic properties of Twitter's social network, including its network topology and geographical distribution.

3.1 Growth of Twitter
Since Twitter provides a sequential user and post identifier, we can estimate the growth rate of Twitter. Figure 2 shows the growth rate for users and Figure 3 shows the growth rate for posts in this collection. Since we do not have access to historical data, we can only observe its growth over a two-month period. For each day we identify the maximum value for the user identifier and post identifier a