tools and tips for analyzing social media data
![Page 1: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/1.jpg)
Analyzing Social Media Systems
Shelly Farnham, Emre Kiciman
FUSE Labs & Internet Services Research Center, Microsoft Research
CHI Course 2013
![Page 2: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/2.jpg)
Agenda
Introductions
Overview
Lesson scenarios with real data
  Usage analysis: Predictors of coming back
  Social network analysis: Finding who you like
  Content analysis: Relationships, cliques, and their conversation
Focus on tools/tips and special considerations when examining social data
![Page 3: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/3.jpg)
MAKING MEANING OUT OF THE MESS
![Page 4: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/4.jpg)
SHELLY FARNHAM: INDUSTRY RESEARCH
Specialize in social technologies
  Social networks, community, identity, mobile social
Early stage innovation
  Extremely rapid R&D cycle: study, brainstorm, design, prototype, deploy, evaluate (repeat)
  Convergent evaluation methodologies: usage analysis, interviews, questionnaires
Career
  PhD in Social Psychology from UW
  7 years Microsoft Research
    Virtual Worlds, Social Computing, Community Technologies
  4 years startup world
    Waggle Labs (consulting), Pathable
  2 years Yahoo!
  FUSE Labs, Microsoft Research
Personal Map
![Page 5: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/5.jpg)
EMRE KICIMAN
Specialize in social data analytics
  Social media, social networks, search
Methods
  Machine learning
  Information extraction, entity recognition from social data
  Prototyping
Career
  Ph.D. and M.S. in computer science from Stanford University
  B.S. in Electrical Engineering and Computer
  Currently at Internet Services Research Center, Microsoft Research
![Page 6: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/6.jpg)
ANALYSIS THROUGHOUT R&D CYCLE
[Chart: Importance of Information in selecting chat partner; axis 0 to 7; labels: Rank, Rating, Similarity, Interacts with friends, Ratings by friends]
![Page 7: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/7.jpg)
USER STUDIES
![Page 8: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/8.jpg)
PROTOTYPING
![Page 9: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/9.jpg)
USAGE ANALYSIS
Do social responses matter in driving engagement?
![Page 10: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/10.jpg)
SOCIAL MEDIA ANALYSIS
Common types
  Usage analysis: behaviors, interactions
  Network analysis: patterns in networks (sets of pair-wise connections)
  Content analysis: semantics, sentiment of conversational content
Common steps
  Step 1. Getting started: defining questions
  Step 2. Processing data: extraction, cleaning, summarization
  Step 3. High level analysis: inference
![Page 11: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/11.jpg)
CASE STUDY: USAGE ANALYSIS
So.cl is an experimental web site that allows people to connect around their interests by integrating search tools with social networking.
How important are social interactions in encouraging users to become engaged with an interest network?
The So.cl usage analysis serves as the case study scenario; the lessons learned apply to other forms of social media and other forms of analysis.
![Page 12: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/12.jpg)
search + sharing + networking = informal discovery and learning
SO.CL: reimagining search as social from the ground up
History:
  Oct 2011: Pre-release deployment study
  Dec 2011: Private, invitation-only beta
  May 2012: Removed invitation restrictions
  Nov 2012: Over 300K registered users, 13K active per month
Try it now! http://www.so.cl
![Page 13: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/13.jpg)
INTEREST NETWORK GOALS
Find others around common interests
Be inspired by new interests
Learn from each other through these shared interests
![Page 14: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/14.jpg)
HOW IT WORKS
Feed
Search & Post
Feed Filters
People
Try it now! http://www.so.cl – use facsumm tag
![Page 15: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/15.jpg)
POST BUILDING
Results
Search (Bing)
Filter Results
Post Builder
Experience:
  Step 1: Perform search
  Step 2: Click on items in results to add to post
  Step 3: Add a message
  Step 4: Tag
Try it now! http://www.so.cl – use facsumm tag
![Page 16: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/16.jpg)
USAGE ANALYSIS
![Page 17: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/17.jpg)
STEP 1: GETTING STARTED
![Page 18: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/18.jpg)
DEFINING RESEARCH QUESTION
Amount of data is overwhelming: the more defined your question, the easier the analysis
What real world problem are you trying to explore?
Avoid pitfall of technology for technology’s sake
What argument do you want to be able to make?
State your problem as a hypothesis
![Page 19: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/19.jpg)
CASE SCENARIO:
Real world problem: Help people learn online
Argument want to make: People are more motivated to explore new interests via social media than via search alone because of the opportunity to connect with others.
Hypothesis: If people receive a social response when they first join So.cl they are more likely to become engaged.
![Page 20: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/20.jpg)
OPERATIONALIZING CONSTRUCTS
Operationalize = to make measurable
Always review related literature for best practices
How do you measure…
  Friendship? Similarity? Interest? Trend?
  Conversation? Community? Engagement?
Can you operationalize with existing data, or do you need to generate more?
![Page 21: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/21.jpg)
CASE SCENARIO:
Hypothesis: If people receive a social response when they first join So.cl they are more likely to become engaged.
Measuring social/behavioral constructs:
  When first join: first session = time of first action to time of last action prior to an hour of inactivity
  Social responses: follows user, likes user’s post(s), comments on user’s post(s)
  Engagement = coming back: a second session = any action that occurs 60 minutes or more after the first session
Restating hypothesis: If people receive follows, likes, and comments in their first session they are more likely to come back for a second session
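The session definitions above can be sketched in code. This is a minimal illustration, assuming each user's actions are available as a list of timestamps; the function names and the sample users are hypothetical, not part of the So.cl instrumentation.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=60)  # inactivity gap from the definition above


def split_sessions(timestamps, gap=SESSION_GAP):
    """Group one user's action times into sessions separated by >= gap of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)   # still inside the current session
        else:
            sessions.append([ts])     # gap reached (or first action): new session
    return sessions


def came_back(timestamps, gap=SESSION_GAP):
    """Engagement as operationalized above: the user has a second session."""
    return len(split_sessions(timestamps, gap)) >= 2


def responses_in_first_session(action_times, response_times, gap=SESSION_GAP):
    """Count responses received during the first session (needs at least one action)."""
    first = split_sessions(action_times, gap)[0]
    return sum(first[0] <= r <= first[-1] for r in response_times)


# Hypothetical users: Alice returns 30 hours later, Bob never returns.
t0 = datetime(2012, 10, 1, 9, 0)
alice = [t0, t0 + timedelta(minutes=5), t0 + timedelta(hours=30)]
bob = [t0, t0 + timedelta(minutes=20)]
alice_responses = [t0 + timedelta(minutes=2), t0 + timedelta(hours=3)]

print(came_back(alice), came_back(bob))
print(responses_in_first_session(alice, alice_responses))
```

Keeping the gap as a parameter makes it easy to check how sensitive the results are to the 60 minute choice.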
![Page 22: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/22.jpg)
STEP 2. PROCESSING DATA
![Page 23: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/23.jpg)
COLLECTING DATA
Existing tools
  APIs (Twitter, Foursquare, Yelp)
  Web analytics (Google Analytics)
Write crawlers
Writing your own instrumentation system
  e.g. log each call to server, query parameters
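As a rough sketch of what "log each call to server, query parameters" might look like, the helper below appends one JSON line per call. The field names and the `log_call` helper are assumptions made for illustration, not the instrumentation used in the case study.

```python
import io
import json
import time


def log_call(log_file, user_id, endpoint, params):
    """Append one JSON line per server call: when, who, what, and the query parameters."""
    record = {
        "ts": time.time(),       # time the call was logged (seconds since epoch)
        "user_id": user_id,
        "endpoint": endpoint,
        "params": params,
    }
    log_file.write(json.dumps(record, sort_keys=True) + "\n")
    return record


# Usage with an in-memory buffer standing in for a real log file.
buf = io.StringIO()
log_call(buf, "u42", "/search", {"q": "donuts"})
print(buf.getvalue())
```

One line per call keeps the raw log easy to inspect by eye, which matters for the "always look at your raw data" advice later in the deck.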
![Page 24: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/24.jpg)
RAW INSTRUMENTATION
Tendency to collect everything
incomprehensible, incoherent mess
Prone towards bugs
![Page 25: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/25.jpg)
![Page 26: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/26.jpg)
INSTRUMENTATION Convert to human readable
![Page 27: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/27.jpg)
Always look at your raw data: play with it, ask yourself if it makes sense, test!
![Page 28: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/28.jpg)
COMMON INSTRUMENTATION SCHEMA
Users table: one row per user
![Page 29: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/29.jpg)
COMMON INSTRUMENTATION SCHEMA
Actions table: one row per meaningful action
  Filter out non-meaningful, non-user generated actions
![Page 30: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/30.jpg)
COMMON INSTRUMENTATION SCHEMA
Content table(s): One row per content item, with text, URL, etc. of that item
e.g. messages, pictures shared, likes, tags
![Page 31: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/31.jpg)
COMMON INSTRUMENTATION SCHEMA
Across tables, with social systems:
  Instrument social target (PersonA responds to PersonB)
  Instrument parent item (e.g., Comment A, Comment B, Comment C are responses to parent item PostB)
In other words, instrument who is interacting with whom, and in what context
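The users, actions, and content tables, together with the social target and parent item, could be laid out roughly as follows. This is a sketch using SQLite; every table and column name here is an assumption chosen for illustration, not the schema shown on the slides.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    joined_at  TEXT
);
CREATE TABLE content (
    item_id        INTEGER PRIMARY KEY,
    author_id      INTEGER REFERENCES users(user_id),
    parent_item_id INTEGER REFERENCES content(item_id),  -- NULL for a top-level post
    kind           TEXT,                                 -- e.g. post, comment
    body           TEXT
);
CREATE TABLE actions (
    action_id      INTEGER PRIMARY KEY,
    actor_id       INTEGER REFERENCES users(user_id),
    target_user_id INTEGER REFERENCES users(user_id),    -- social target of the action
    item_id        INTEGER REFERENCES content(item_id),  -- item the action created or touched
    action_type    TEXT,                                 -- e.g. post, comment, like, follow
    occurred_at    TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "2012-10-01"), (2, "2012-10-02")])
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?, ?, ?)",
    [(10, 2, None, "post", "PostB"), (11, 1, 10, "comment", "Comment A")],
)
conn.execute(
    "INSERT INTO actions VALUES (?, ?, ?, ?, ?, ?)",
    (100, 1, 2, 11, "comment", "2012-10-02T09:00"),
)

# Who responded to whom, and under which parent item?
rows = conn.execute(
    """
    SELECT a.actor_id, a.target_user_id, c.parent_item_id
    FROM actions AS a JOIN content AS c ON c.item_id = a.item_id
    """
).fetchall()
print(rows)
```

Storing the target user and parent item on every row means "who interacted with whom, in what context" is a single join rather than a reconstruction from raw logs.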
![Page 32: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/32.jpg)
REDUCING LARGE DATA
Filters
  Time span, type of person, type of actions
Sampling
  Random selection
  Snowballing, so you get a complete picture of a person’s social experience
Consider your research questions and how you want to generalize
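The two sampling strategies can be contrasted in a few lines. This is a sketch under the assumption that connections are available as a dictionary from each user to the users they interact with; the graph and names are made up.

```python
import random
from collections import deque


def random_sample(user_ids, k, seed=0):
    """Uniform random selection of k users (seeded so the sample is reproducible)."""
    return random.Random(seed).sample(sorted(user_ids), k)


def snowball_sample(seeds, neighbors, max_hops=1):
    """Start from seed users and add everyone within max_hops connections."""
    sampled = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        user, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbors.get(user, ()):
            if other not in sampled:
                sampled.add(other)
                frontier.append((other, hops + 1))
    return sampled


# Hypothetical interaction graph: user -> users they interact with.
graph = {"ann": ["bo", "cy"], "bo": ["dee"], "cy": [], "dee": ["eve"]}
one_hop = snowball_sample(["ann"], graph, max_hops=1)
two_hops = snowball_sample(["ann"], graph, max_hops=2)
picked = random_sample(graph, 2)
print(sorted(one_hop), sorted(two_hops), picked)
```

Random selection supports generalizing to the whole population; snowballing gives up that property in exchange for seeing each sampled person's full social neighborhood.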
![Page 33: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/33.jpg)
FILTERING & SAMPLING
Filtered out administrators/community managers
New users only
Date range: Sept 28 to Oct 13
100% sample for that time span: 2462 people
![Page 34: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/34.jpg)
SYSTEMATIC BIASES IN SOCIAL SYSTEMS
If you want to understand your “typical” users, keep in mind you generally find:
  A large percent never become active or return: “lookiloos” can unduly bias averages
Common reporting format:
  X% performed Y behavior, of those averaged Z times each
  e.g. 5% commented on a post their first session, averaging 5 times each
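A small sketch of the reporting format, using made-up counts, shows why it is preferred over a plain average when most users never perform the behavior. The function name and data are hypothetical.

```python
def participation_report(counts_by_user, behavior):
    """Return (percent who did it, mean count among those who did, summary sentence)."""
    total = len(counts_by_user)
    doers = [c for c in counts_by_user.values() if c > 0]
    pct = 100.0 * len(doers) / total if total else 0.0
    mean_among_doers = sum(doers) / len(doers) if doers else 0.0
    sentence = f"{pct:.0f}% {behavior}, averaging {mean_among_doers:.1f} times each"
    return pct, mean_among_doers, sentence


# Hypothetical first-session comment counts for 20 users; most never comment.
comments = {f"user{i}": 0 for i in range(18)}
comments.update({"user18": 4, "user19": 6})

pct, mean_doers, sentence = participation_report(comments, "commented on a post")
naive_mean = sum(comments.values()) / len(comments)
print(sentence)
print("naive mean over all users:", naive_mean)
```

The naive mean of 0.5 comments per user describes nobody in this sample, while the two-part report separates how many people commented from how much they commented.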
![Page 35: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/35.jpg)
OUTLIERS
Filtered out 13 outliers: people with z > 4 in number of actions (among those who did more than sign in)
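The z > 4 rule can be sketched as below. It assumes the input already contains only users who did more than sign in, and it uses the population standard deviation; the slides do not say which variant was used, so treat both as assumptions.

```python
from statistics import mean, pstdev


def drop_outliers(action_counts, z_max=4.0):
    """Drop users whose action count is more than z_max standard deviations above the mean."""
    values = list(action_counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return dict(action_counts)  # everyone is identical, nothing to drop
    return {u: c for u, c in action_counts.items() if (c - mu) / sigma <= z_max}


# Hypothetical data: 30 ordinary users and one hyper-active account.
counts = {f"user{i}": 5 for i in range(30)}
counts["spammer"] = 500

kept = drop_outliers(counts)
print(len(counts), "users before,", len(kept), "after")
```

Note that a z-score cutoff only works with enough users: in a very small sample a single extreme value inflates the standard deviation so much that it can never exceed the threshold.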
![Page 36: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/36.jpg)
SYSTEMATIC BIASES IN SOCIAL SYSTEMS
A small percent are “hyper-active” users (avid users, spammers, trolls, administrators) who can unduly bias averages
  Remove outliers
A substantial percent are consumers but not producers (“lurkers”); often there is no signal for lurkers
  Consult literature, related work for estimates (So.cl: about 75% lurkers)
  Custom instrumentation, logging sign-ins
  Web analytics for clicks
![Page 37: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/37.jpg)
PLAYING WITH YOUR DATA
Very important to spend time examining data
  Descriptives, frequencies, correlations, graphs
  Use a tool that easily generates graphs, correlations
  Does it make sense? If not, really chase it down. Often a bug or misinterpretation of data.
![Page 38: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/38.jpg)
AGGREGATIONS
Aggregation: merging down for summarization
What is your level of analysis?
  Person, group, network
  Content types
If person is the unit of analysis, aggregate measures to the person level
  E.g. in SPSS: one line per person
  Very important to have the appropriate unit of analysis, to avoid bias in statistics
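Aggregating to the person level means collapsing a one-row-per-action log into one row per person. The sketch below does this in plain Python with made-up data; it illustrates the idea and is not a translation of the SPSS syntax from the course.

```python
from collections import Counter, defaultdict

# Hypothetical action log: one row per action (user_id, action_type).
actions = [
    ("u1", "post"), ("u1", "like"), ("u1", "like"),
    ("u2", "comment"),
    ("u3", "post"), ("u3", "comment"), ("u3", "comment"),
]


def aggregate_to_person(rows):
    """Collapse one-row-per-action data into one record per person."""
    per_person = defaultdict(Counter)
    for user_id, action_type in rows:
        per_person[user_id][action_type] += 1
    action_types = sorted({t for _, t in rows})
    table = {}
    for user, counts in per_person.items():
        row = {t: counts[t] for t in action_types}  # 0 for actions never taken
        row["total"] = sum(counts.values())
        table[user] = row
    return table


person_table = aggregate_to_person(actions)
for user, row in sorted(person_table.items()):
    print(user, row)
```

Running statistics on the raw action rows instead would let the most active people contribute many observations each, which is the bias the slide warns about.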
![Page 39: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/39.jpg)
AGGREGATIONS SPSS Syntax:
![Page 40: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/40.jpg)
DESCRIPTIVES OF ACTIVE SESSIONS
Active session = a time of (public) activity, with a 60 minute gap of no activity before or after
91% of users: only one active session
On average, 34.6 hours apart
First session, 1.6 minutes
![Page 41: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/41.jpg)
DESCRIPTIVES OF ACTIONS
8% created a post their first session, of those averaged 1.5 times each
[Chart: Actions in First Session]
![Page 42: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/42.jpg)
DESCRIPTIVES OF COMING BACK
9.1% came back for another active session (~25% including inactive)
On average, 35 hours later
![Page 43: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/43.jpg)
IN THE FIRST SESSION
How often is the user the target of social behavior?
23% received some response up to 2nd session
  3% if did not create a post, 37% if did create a post
[Charts: Response *During* First Session; Response *in Between* 1st and 2nd Sessions]
![Page 44: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/44.jpg)
STEP 3. HIGH LEVEL ANALYSIS
![Page 45: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/45.jpg)
PRELIMINARY CORRELATIONS
Always ask, does this pattern make sense?
![Page 46: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/46.jpg)
PREDICTORS OF COMING BACK
Social responses inspire people to return to site, especially if occurring during first session
Social responses to user: following, commenting on post, liking post, liking comment, riffing
[Chart group sizes: N = 2273, N = 179, N = 1942, N = 510]
![Page 47: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/47.jpg)
WHICH RESPONSE MATTERS
Logistic Regression, Which Predicts Coming Back

| Predictor | B | Sig. |
|---|---|---|
| Created post first session | .95 | .000 |
| Followed | .92 | .003 |
| Commented On | .38 | ns |
| Post Liked | .87 | .02 |
| Comment Liked | -.09 | ns |
| Messaged | -.09 | ns |
| Riffed | .00 | ns |

Logistic Regression, Any Response Predicts Coming Back

| Predictor | B | S.E. | Sig. |
|---|---|---|---|
| Created post first session | .71 | .20 | .000 |
| Response1: during first session | 1.12 | .21 | .000 |
| Response2: after first session | .60 | .17 | .000 |
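Logistic regression coefficients are on the log-odds scale, so exponentiating B gives an odds ratio that is easier to read. The sketch below applies that standard conversion to the B and S.E. values reported for the "Any Response" model; the 1.96 multiplier for an approximate 95% interval is a textbook convention, not something stated on the slides.

```python
import math

# (B, S.E.) as reported for the "Any Response Predicts Coming Back" model.
coefficients = {
    "Created post first session": (0.71, 0.20),
    "Response1: during first session": (1.12, 0.21),
    "Response2: after first session": (0.60, 0.17),
}


def odds_ratio(b, se, z=1.96):
    """Convert a logit coefficient to an odds ratio with an approximate 95% interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


for name, (b, se) in coefficients.items():
    ratio, low, high = odds_ratio(b, se)
    print(f"{name}: OR = {ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Read this way, a response during the first session roughly triples the odds of coming back, holding the other predictors constant.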
![Page 48: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/48.jpg)
IDENTIFYING SUBGROUPS
Factors about equally predict if user comes back
Factor Analysis for Associated Behaviors: Three types of usage: creating, socializing, browsing
Principal components, varimax rotation [meaning forced to be orthogonal]

Component Matrix

| Behavior | Creators | Socialites | Browsers |
|---|---|---|---|
| % Variance | 32% | 12% | 9% |
| Created post | .86 | .17 | .10 |
| Invited | .01 | -.16 | .63 |
| Followed | -.03 | .10 | .37 |
| Added item to post | .83 | .08 | -.06 |
| Searched | .81 | .03 | .17 |
| Commented | .36 | .64 | .09 |
| Liked post | .15 | .58 | .32 |
| Liked comment | .13 | .80 | .06 |
| Messaged | -.09 | .50 | -.08 |
| Viewed person | .22 | .47 | .48 |
| Navigated to All | .51 | .37 | .53 |
| Joined party | .17 | .09 | .68 |

Browsing stronger predictor of overall activity level

Regression Coefficients

| Factor | Beta | t | Sig |
|---|---|---|---|
| Creating | 0.20 | 7.89 | 0.00 |
| Socializing | 0.17 | 6.58 | 0.00 |
| Browsing | 0.29 | 9.07 | 0.00 |

Regression Coefficients

| Factor | Beta | t | Sig |
|---|---|---|---|
| Creating | .14 | 5.28 | .000 |
| Socializing | .07 | 2.61 | .000 |
| Browsing | .19 | 7.20 | .000 |
![Page 49: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/49.jpg)
NETWORK ANALYSIS
![Page 50: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/50.jpg)
Case Scenario 2: illustrating network analysis
Real world problem: help people find and learn from others who share their interests online
Argument want to make: people do not just care about content around their interests, they want to develop friendships with others who share their interests
Hypothesis: The more tags people have in common, the more they will interact with each other
Design implication: Recommendations based on common overlapping tags
![Page 51: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/51.jpg)
PROCESSING NETWORK DATA
Common format:
  EntityA EntityB measure
  EntityB EntityC measure
  EntityB EntityD measure
  EntityF EntityG measure
Units of analysis:
  Edges
  Nodes/vertices
  Clusters, networks
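An edge list in this format maps directly onto the three units of analysis. The sketch below parses it into an adjacency structure (edges), computes degree (nodes), and finds connected components (clusters). The numeric weights are placeholders, and treating ties as undirected is an assumption.

```python
from collections import defaultdict

# Edge list in the "EntityA EntityB measure" layout; weights are placeholders.
edge_lines = """\
EntityA EntityB 3
EntityB EntityC 1
EntityB EntityD 2
EntityF EntityG 5
"""


def load_edges(text):
    """Parse whitespace-separated 'source target weight' lines into an adjacency dict."""
    graph = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue
        a, b, w = line.split()
        graph[a][b] = float(w)
        graph[b][a] = float(w)  # treat ties as undirected for this sketch
    return graph


def components(graph):
    """Connected components, i.e. the separate clusters in the network."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(graph[node])
        seen |= cluster
        clusters.append(cluster)
    return clusters


graph = load_edges(edge_lines)
degree = {node: len(nbrs) for node, nbrs in graph.items()}
clusters = components(graph)
print(degree)
print(clusters)
```

For asymmetric relationships such as follows, the second assignment in `load_edges` would be dropped so that direction is preserved.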
![Page 52: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/52.jpg)
OPERATIONALIZING CONNECTION
How would you measure…
  Similar interests?
  Friendship?
  Information flow?
  Asymmetrical?
Often some form of co-occurrence
http://www.touchgraph.com/assets/navigator/help2/module_3_3.html
![Page 53: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/53.jpg)
NORMALIZATION
Adjusting values measured on different scales to a notionally common scale
Allows the comparison of corresponding normalized values for different datasets in a way that eliminates the effects of certain gross influences
• Mary has 400 friends
• Jim has 200 friends
• Bob has 50 friends
• Mary and Jim have 100 overlapping friends
• Mary and Bob have 50 overlapping friends
• How similar are they?
• Who’s more similar?
[Diagram labels: Mary, Jim, Bob]
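The Mary, Jim, and Bob question can be worked through with two common normalizations. The choice of Jaccard similarity and the overlap coefficient here is an illustration, not necessarily the measures used in the course, and it reads the example as Bob having 50 friends.

```python
def jaccard(size_a, size_b, overlap):
    """Overlap divided by the size of the union of the two friend lists."""
    return overlap / (size_a + size_b - overlap)


def overlap_coefficient(size_a, size_b, overlap):
    """Overlap divided by the size of the smaller friend list."""
    return overlap / min(size_a, size_b)


mary, jim, bob = 400, 200, 50

mary_jim = (jaccard(mary, jim, 100), overlap_coefficient(mary, jim, 100))
mary_bob = (jaccard(mary, bob, 50), overlap_coefficient(mary, bob, 50))

print("Mary-Jim:", mary_jim)   # (0.2, 0.5)
print("Mary-Bob:", mary_bob)   # (0.125, 1.0)
```

By raw overlap and by Jaccard, Mary is closer to Jim; by the overlap coefficient she is closer to Bob, because all of Bob's friends are also Mary's. Which answer is right depends on the question, which is why the normalization needs to be chosen deliberately.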
![Page 54: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/54.jpg)
CASE STUDY:
Real world problem: Help people find people like them online
Argument want to make: Interests you share and tag online are good indicator of what you are like
Hypothesis: People are more interested in receiving recommendations of whom to befriend based on overlapping tags than recommendations of random others in the system
![Page 55: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/55.jpg)
CONNECTION VIA OVERLAPPING TAGS
![Page 56: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/56.jpg)
![Page 57: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/57.jpg)
NETWORK ANALYSIS (NODEXL)
Playing with data, learned:
  Not all tagging is a good indicator of what you are like: the tags on your posts are, whether or not you add them
  Most common tags are not very meaningful, unique overlapping tags are: importance of normalization
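One way to make rare shared tags count more than common ones is an inverse-frequency weight, borrowed from text retrieval. This is a swapped-in illustration with made-up users and tags, not the normalization performed in the NodeXL analysis.

```python
import math
from collections import Counter

# Hypothetical users and the tags on their posts.
tags_by_user = {
    "ann": {"news", "funny", "knitting"},
    "bo": {"news", "funny", "knitting"},
    "cy": {"news", "funny", "cars"},
    "dee": {"news", "funny", "chess"},
}


def tag_weights(tags_by_user):
    """Inverse-frequency weight per tag: a tag used by every user gets weight 0."""
    n_users = len(tags_by_user)
    usage = Counter(tag for tags in tags_by_user.values() for tag in tags)
    return {tag: math.log(n_users / count) for tag, count in usage.items()}


def weighted_overlap(a, b, tags_by_user, weights):
    """Sum the weights of the tags two users share."""
    return sum(weights[t] for t in tags_by_user[a] & tags_by_user[b])


weights = tag_weights(tags_by_user)
raw_ann_bo = len(tags_by_user["ann"] & tags_by_user["bo"])   # 3 shared tags
raw_ann_cy = len(tags_by_user["ann"] & tags_by_user["cy"])   # 2 shared tags
score_ann_bo = weighted_overlap("ann", "bo", tags_by_user, weights)
score_ann_cy = weighted_overlap("ann", "cy", tags_by_user, weights)
print(raw_ann_bo, raw_ann_cy, score_ann_bo, score_ann_cy)
```

On raw counts ann and cy look fairly similar (2 shared tags against 3), but the weighted score drops to zero because the tags they share are the ones everybody uses.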
![Page 58: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/58.jpg)
CONTENT ANALYSIS
![Page 60: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/60.jpg)
Outline
What’s in social media? (donuts)
Extracting relationships and their context
Using context with higher-level analyses
![Page 61: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/61.jpg)
Do people really talk about donuts?
1 week of tweets mentioning “donut” or “doughnuts”
  Week of Feb 6-12, 2012. Matched ~180k messages
Train entity tagger for food and for restaurants (no disambiguation or canonicalization)
Let’s see what we find…
![Page 62: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/62.jpg)
Where do people get donuts?
![Page 63: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/63.jpg)
What do people drink with donuts?
![Page 64: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/64.jpg)
What kind of donuts do people eat?
![Page 65: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/65.jpg)
Beyond donuts…
Drugs, diseases, and contagions
  Paul and Dredze 2011; Sadilek, Kautz and Silenzio 2012.
Crises, disasters, and wars
  Starbird et al. 2010; Al-Ani, Mark & Semaan 2010; Monroy-Hernandez et al. 2012
Public sentiment
  Political and election indices, market insights
Everyday life
![Page 66: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/66.jpg)
Relationships in Context
![Page 67: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/67.jpg)
Stage 1: Feature extraction
“I had fun hiking Tiger Mountain last weekend” – Alice said on Monday, at 10am
Location Tiger Mountain
Mood Happy
Activity Hiking
Name Alice
Gender Female
Post Time Mon 10am
Activity Time {Sat-Sun}
![Page 68: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/68.jpg)
Stage 2(A) Build a hyper-graph representation
“I had fun hiking Tiger Mountain last weekend” – Alice said on Monday, at 10am
Name: Alice
Location: Tiger Mountain
Gender: Female
Mood: Happy
Post Time: Mon 10am
Activity Time: {Sat-Sun}
Activity: Hiking
![Page 69: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/69.jpg)
Name: Alice
Location: Tiger Mountain
Gender: Female
Mood: Happy
Post Time: Mon 10am
Activity Time: {Sat-Sun}
Activity: Hiking
Name: Bob
Gender: Male
Post Time: Fri 3pm
![Page 70: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/70.jpg)
Stage 2(B) Projection
Location: Tiger Mountain
Activity: Hiking
• Reduce graph to key domains
• Statistical distributions of other domains provide key context
![Page 71: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/71.jpg)
Location: Tiger Mountain
Activity: Hiking
Gender: Female
Gender: Male
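The hyper-graph and projection stages can be sketched with plain dictionaries: each message is one hyper-edge linking its extracted features, and projecting keeps a few key domains while tallying another domain as context. Alice's features come from the example message; Bob's location and activity are assumed here so that the two messages share a relationship, and the `project` helper is hypothetical.

```python
from collections import Counter, defaultdict

# Each message is one hyper-edge: a set of (domain, value) features.
messages = [
    {"Name": "Alice", "Gender": "Female", "Mood": "Happy",
     "Location": "Tiger Mountain", "Activity": "Hiking"},
    {"Name": "Bob", "Gender": "Male",
     "Location": "Tiger Mountain", "Activity": "Hiking"},  # location/activity assumed
]


def project(messages, key_domains, context_domain):
    """Reduce to key_domains and tally context_domain values for each relationship."""
    context = defaultdict(Counter)
    for features in messages:
        if not all(d in features for d in key_domains):
            continue  # message does not mention every key domain
        relationship = tuple(features[d] for d in key_domains)
        if context_domain in features:
            context[relationship][features[context_domain]] += 1
    return context


by_gender = project(messages, ("Location", "Activity"), "Gender")
for relationship, distribution in by_gender.items():
    print(relationship, dict(distribution))
```

The same projection can be repeated with Mood or Post Time as the context domain, giving a distribution of each around the Location and Activity relationship.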
![Page 72: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/72.jpg)
Demo to show example relationships & contexts from several domains
![Page 73: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/73.jpg)
Using context with high-level analyses
Current
  Clustering
  Neighborhood discovery
  Network centrality
Context of discussion provides
![Page 74: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/74.jpg)
Demo to show example contexts for pseudo-cliques and network centrality
![Page 75: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/75.jpg)
CONCLUSIONS
Define research questions early to help focus analysis
Many special considerations with social media data
  Operationalizing social constructs
  Attention to lookiloos, hyperactives, lurkers who bias outcomes
  Different types of users = different behaviors
  Different contexts meaningfully impact conversation
Processing data = simplification, getting meaningful measures summarized at appropriate level of analysis
Format your data and plug it into an appropriate tool to enable you to play with your data a *lot*
  Important for debugging, finding patterns
Great tools available for leveraging social media to describe, predict behaviors
![Page 76: Tools and Tips for Analyzing Social Media Data](https://reader030.vdocuments.site/reader030/viewer/2022032420/55a5a4661a28abf50f8b459c/html5/thumbnails/76.jpg)
CONTACT INFO
Shelly Farnham, Researcher
Emre Kiciman, Researcher
(@shellyshelly; [email protected])
QUESTIONS