TRANSCRIPT
Modeling Information Seeking Behavior in Social Media
Eugene Agichtein, Emory University
Intelligent Information Access Lab (IRLab)
Qi Guo (2nd-year PhD)
Yandong Liu (2nd-year PhD)
Ablimit Aji (1st-year PhD)
• Text and data mining
• Modeling information seeking behavior
• Web search and social media search
• Tools for medical informatics and public health
Supported by:
External collaborators:
- Beth Buffalo (Neurology)
- Charlie Clarke (Waterloo)
- Ernie Garcia (Radiology)
- Phil Wolff (Psychology)
- Hongyuan Zha (GaTech)
Information sharing: blogs, forums, discussions
Search logs: queries, clicks
Client-side behavior: Gaze tracking, mouse movement, scrolling
Online Behavior and Interactions
Research Overview
Social media
Health Informatics
Cognitive Diagnostics
Intelligent search
Discover Models of Behavior
(machine learning/data mining)
Applications that Affect Millions
• Search: ranking, evaluation, advertising, search interfaces, medical search (clinicians, patients)
• Collaboratively generated content: searcher intent, success, expertise, content quality
• Health informatics: self reporting of drug side effects, co-morbidity, outreach/education
• Automatic cognitive diagnostics: stress, frustration, Alzheimer’s, Parkinson's, ADHD, ….
(Text) Social Media Today
Published: 4 GB/day; Social Media: 10 GB/day
Technorati + BlogPulse: 120M blogs, 2M posts/day
Twitter (since 11/07): 2M users, 3M msgs/day
Facebook/MySpace: 200-300M users, avg. 19 min/day
Yahoo! Answers: 90M users, 20M questions, 400M answers
[Data from Andrew Tomkins, SSM 2008 Keynote]
Yes, we could read your blog. Or, you could tell us about your day
Total time: 7-10 minutes, active “work”
Someone must know this…
+1 minute
+7 hours: perfect answer
Update (2/15/2009)
http://answers.yahoo.com/question/index;_ylt=3?qid=20071008115118AAh1HdO
Finding Information Online (Revisited)
Next generation of search: algorithmically-mediated information exchange
CQA (collaborative question answering):
• Realistic information exchange
• Searching archives
• Training NLP, IR, QA systems
• Studying social behavior, norms
Content quality, asker satisfaction
Current and future work
(Some) Related Work
• Adamic et al., WWW 2007, WWW 2008: expertise sharing, network structure
• Elsas et al., SIGIR 2008: blog search
• Glance et al.: BlogPulse, popularity, information sharing
• Harper et al., CHI 2008, 2009: answer quality across multiple CQA sites
• Kraut et al.: community participation
• Kumar et al., WWW 2004, KDD 2008, …: information diffusion in blogspace, network evolution
SIGIR 2009 Workshop on Searching Social Media
http://ir.mathcs.emory.edu/SSM2009/
Finding High Quality Content in SM
• Well-written
• Interesting
• Relevant (answer)
• Factually correct
• Popular?
• Provocative?
• Useful?
As judged by professional editors
E. Agichtein, C. Castillo, D. Donato, A. Gionis, and G. Mishne, Finding High Quality Content in Social Media, in WSDM 2008
Social Media Content Quality
How do Question and Answer Quality relate?
Community
Link Analysis for Authority Estimation
[Diagram: bipartite graph linking questions and answers to the users who asked and answered them]
Hub (asker):          H(i) = Σ_{j = 0..K} A(j)
Authority (answerer): A(j) = Σ_{i = 0..M} H(i)
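The HITS-style recursion on this slide (hub scores for askers, authority scores for answerers) can be sketched as a small power iteration over the asker-answerer graph. A minimal sketch; the edge list below is illustrative, not from the actual dataset:

```python
# Minimal HITS sketch over an asker -> answerer graph.
# Edge (u, v): asker u received an answer from answerer v (illustrative data).
edges = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u4", "u2")]

nodes = {n for e in edges for n in e}
hub = {n: 1.0 for n in nodes}    # hub score: asks questions that draw good answerers
auth = {n: 1.0 for n in nodes}   # authority score: contributes good answers

for _ in range(50):  # power iteration until (near) convergence
    auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
    hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
    # L2-normalize to keep scores bounded
    za = sum(a * a for a in auth.values()) ** 0.5 or 1.0
    zh = sum(h * h for h in hub.values()) ** 0.5 or 1.0
    auth = {n: a / za for n, a in auth.items()}
    hub = {n: h / zh for n, h in hub.items()}
```

Users who only ask accumulate hub mass; users who only answer accumulate authority mass, matching the hub/authority split above.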
Qualitative Observations
HITS effective
HITS ineffective
Random forest classifier
Result 1: Identifying High Quality Questions
Top Features for Question Classification
• Asker popularity (“stars”)
• Punctuation density
• Question category
• Page views
• KL divergence from reference LM
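The KL-divergence feature compares a unigram language model of the question text with a reference language model. A minimal sketch with add-one smoothing; the example texts and the smoothing choice are illustrative assumptions, not the exact implementation:

```python
import math
from collections import Counter

def kl_divergence(text, reference):
    """KL(P_text || P_ref) over unigrams, with add-one smoothing."""
    p, q = Counter(text.lower().split()), Counter(reference.lower().split())
    vocab = set(p) | set(q)
    np_, nq = sum(p.values()) + len(vocab), sum(q.values()) + len(vocab)
    kl = 0.0
    for w in vocab:
        pw = (p[w] + 1) / np_   # smoothed probability under the question LM
        qw = (q[w] + 1) / nq    # smoothed probability under the reference LM
        kl += pw * math.log(pw / qw)
    return kl

# A question close to the reference LM diverges less than off-topic text.
ref = "how do i fix my computer it will not start"
on_topic = kl_divergence("my computer will not start", ref)
off_topic = kl_divergence("best pizza topping ever", ref)
```

A large divergence from the category's reference LM flags questions whose wording is atypical for that category.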
Identifying High Quality Answers
32
Top Features for Answer Classification
• Answer length
• Community ratings
• Answerer reputation
• Word overlap
• Kincaid readability score
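The Kincaid readability feature can be approximated with the standard Flesch-Kincaid grade-level formula, 0.39·(words/sentences) + 11.8·(syllables/words) − 15.59. The vowel-group syllable counter below is a rough heuristic, not the exact implementation used:

```python
import re

def count_syllables(word):
    """Crude heuristic: count maximal runs of vowels as syllables."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def kincaid_grade(text):
    """Flesch-Kincaid grade level of a text snippet."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / sentences)
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Short sentences of monosyllabic words score near (or below) grade 0; dense multisyllabic prose scores much higher, so the feature separates terse from elaborate answers.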
Finding Information Online (Revisited)
34
• Next generation of search: human-machine-human
• CQA: a case study in complex IR
• Content quality
• Asker satisfaction
• Understanding the interactions
Dimensions of “Quality”
• Well-written
• Interesting
• Relevant (answer)
• Factually correct
• Popular?
• Timely?
• Provocative?
• Useful?
As judged by the asker (or community)
Are Editor Labels “Meaningful” for CGC?
• Information seeking process: want to find useful information about a topic with incomplete knowledge
– N. Belkin: “Anomalous states of knowledge”
• Want to model directly if user found satisfactory information
• Specific (amenable) case: CQA
Yahoo! Answers: The Good News
• Active community of millions of users in many countries and languages
• Effective for subjective information needs– Great forum for socialization/chat
• Can be invaluable for hard-to-find information not available on the web
Yahoo! Answers: The Bad News
• May have to wait a long time to get a satisfactory answer
• May never obtain a satisfying answer
[Figure: time to close a question (hours), by category]
1. FIFA World Cup
2. Optical
3. Poetry
4. Football (American)
5. Soccer
6. Medicine
7. Winter Sports
8. Special Education
9. General Health Care
10. Outdoor Recreation
Predicting Asker Satisfaction
Given a question submitted by an asker in CQA, predict whether the asker will be satisfied with the answers contributed by the community.
– “Satisfied”:
• The asker has closed the question, AND
• selected the best answer, AND
• rated the best answer >= 3 “stars” (exact # not important)
– Else, “Unsatisfied”
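The labeling rule above can be written directly as a predicate. A minimal sketch; the dictionary field names are hypothetical, not the dataset's actual schema:

```python
def is_satisfied(question):
    """Label per the definition: question closed, best answer selected,
    and best answer rated at least 3 stars. Field names are illustrative."""
    return (question.get("closed", False)
            and question.get("best_answer_selected", False)
            and question.get("best_answer_rating", 0) >= 3)
```

Every labeled example falls into exactly one of the two classes, which is what makes the satisfaction task a clean binary prediction problem.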
Yandong Liu, Jiang Bian
Y. Liu, J. Bian, and E. Agichtein, in SIGIR 2008
ASP: Asker Satisfaction Prediction
asker is satisfied
asker is not satisfied
TextCategory
Answerer History
Asker History
Answer
Question
Wikipedia
News
Classifier
Experimental Setup: Data
Crawled from Yahoo! Answers in early 2008
Questions | Answers   | Askers  | Categories | % Satisfied
216,170   | 1,963,615 | 158,515 | 100        | 50.7%
“Anonymized” dataset available at: http://ir.mathcs.emory.edu/shared/
1/2009: Yahoo! Webscope : “Comprehensive” Answers dataset: ~5M questions & answers.
Satisfaction by Topic
Topic               | Questions | Answers | A per Q | Satisfied | Asker rating | Time to close by asker
2006 FIFA World Cup | 1,194     | 35,659  | 29.86   | 55.4%     | 2.63         | 47 minutes
Mental Health       | 151       | 1,159   | 7.68    | 70.9%     | 4.30         | 1.5 days
Mathematics         | 651       | 2,329   | 3.58    | 44.5%     | 4.48         | 33 minutes
Diet & Fitness      | 450       | 2,436   | 5.41    | 68.4%     | 4.30         | 1.5 days
Satisfaction Prediction: Human Judges
• Truth: asker’s rating
• A random sample of 130 questions
• Researchers
– Agreement: 0.82, F1: 0.45   (F1 = 2PR / (P + R))
• Amazon Mechanical Turk
– Five workers per question
– Agreement: 0.9, F1: 0.61
– Best when at least 4 out of 5 raters agree
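The F1 numbers above use the standard harmonic mean of precision and recall on the "satisfied" class, F1 = 2PR / (P + R). A minimal sketch:

```python
def f1_satisfied(truth, predicted):
    """F1 on the 'satisfied' (positive) class: F1 = 2PR / (P + R)."""
    tp = sum(t and p for t, p in zip(truth, predicted))          # true positives
    fp = sum((not t) and p for t, p in zip(truth, predicted))    # false positives
    fn = sum(t and (not p) for t, p in zip(truth, predicted))    # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because F1 is computed on one class, a judge can have high raw agreement with the asker yet a low F1 if the errors concentrate on the "satisfied" questions, which is why agreement (0.82, 0.9) and F1 (0.45, 0.61) diverge above.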
Performance: ASP vs. Humans (F1, Satisfied)
Classifier        | With Text | Without Text | Selected Features
ASP_SVM           | 0.69      | 0.72         | 0.62
ASP_C4.5          | 0.75      | 0.76         | 0.77
ASP_RandomForest  | 0.70      | 0.74         | 0.68
ASP_Boosting      | 0.67      | 0.67         | 0.67
ASP_NB            | 0.61      | 0.65         | 0.58
Best human perf.  | 0.61      |              |
Baseline (random) | 0.66      |              |

ASP is significantly more effective than humans.
Human F1 is lower than the random baseline!
Top Features by Information Gain
• 0.14  Q: Asker’s previous rating
• 0.14  Q: Average past rating by asker
• 0.10  UH: Member since (interval)
• 0.05  UH: Average # answers for past Q
• 0.05  UH: Previous Q resolved for the asker
• 0.04  CA: Average asker rating for category
• 0.04  UH: Total number of answers received
…
“Offline” vs. “Online” Prediction
• Offline prediction (AFTER answers arrive)
– All features (question, answer, asker & category)
– F1: 0.77
• Online prediction (BEFORE question is posted)
– No answer features
– Only asker history and question features (stars, # comments, sum of votes, …)
– F1: 0.74
Personalized Prediction of Satisfaction
Same information != same usefulness for different searchers!
Personalization vs. “Groupization”?
Y. Liu and E. Agichtein, You've Got Answers: Personalized Models for Predicting Success in Community Question Answering, ACL 2008
Example Personalized Models
Outline
• Next generation of search: algorithmically mediated information exchange
• CQA: a case study in complex IR
• Content quality
• Asker satisfaction
Current Work (in Progress)
• Partially supervised models of expertise (Bian et al., WWW 2009)
• Real-time CQA
• Sentiment, temporal sensitivity analysis
• Understanding Social Media dynamics
Answer Arrival
[Figure: histogram of answer arrival times in 5-minute bins over the first hour; 573,086 answers arrive within the first 5 minutes, decaying rapidly thereafter. The first hour accounts for 69% of answers. X-axis: time in minutes; Y-axis: number of answers arrived in < T.]
Exponential Decay Model [Lerman 2007]
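The exponential decay model N(t) ≈ N0·e^(−λt) can be fit to binned arrival counts by log-linear least squares. A minimal sketch; the bin counts below are illustrative numbers in the spirit of the histogram, not the exact dataset:

```python
import math

# Answers arriving in successive 5-minute bins (illustrative counts).
bins = [573086, 378227, 146845, 72260, 46364, 34573, 27322, 23194]
t = [5 * (i + 1) for i in range(len(bins))]   # bin end times in minutes

# Least-squares fit of log N(t) = log N0 - lam * t  (exponential decay model)
ys = [math.log(c) for c in bins]
n = len(t)
mx, my = sum(t) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in t)
sxy = sum((x - mx) * (y - my) for x, y in zip(t, ys))
lam = -sxy / sxx               # decay rate per minute (slope is negative)
n0 = math.exp(my + lam * mx)   # extrapolated initial rate N(0)
```

Fitting in log space turns the decay into a straight line, so the rate λ falls out of an ordinary linear regression; the fitted λ then parameterizes the arrival model used for real-time prediction.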
Factors Influencing Dynamics
Example: Answer Arrival | Category
Subjectivity
Answer, Rating Arrival
Preliminary Results: Modeling SM Dynamics for Real-Time Classification
• Adapt SM dynamics models to classification, e.g., predicting the value of a ratings feature
Outline
• Next generation of search: algorithmically mediated information exchange
• CQA: a case study in complex IR
• Content quality
• Asker satisfaction
• Understanding social media dynamics
Goal: Query Processing over Web and Social Systems
Takeaways
Robust machine learning over behavior data → system improvements, insights into behavior
Contextualized models for NLP and text mining → system improvements, insights into interactions
Mining social media: potential for transformative impact for IR, sociology, psychology, medical informatics, public health, …
References
• Modeling web search behavior [SIGIR 2006, 2007]
• Estimating content quality [WSDM 2008]
• Estimating contributor authority [CIKM 2007]
• Searching CQA archives [WWW 2008, WWW 2009]
• Inferring asker intent [EMNLP 2008]
• Predicting satisfaction [SIGIR 2008, ACL 2008, TKDE]
• Coping with spam [AIRWeb 2008]
More information, datasets, papers, slides:
http://www.mathcs.emory.edu/~eugene/