Real-World Behavior Analysis through a Social Media Lens

Data Mining and Machine Learning Lab. Real-World Behavior Analysis through a Social Media Lens. Mohammad-Ali Abbasi, Huan Liu (Computer Science and Engineering, Arizona State University); Sun-Ki Chai, Kiran Sagoo (Department of Sociology, University of Hawai`i)

Upload: mohammad-ali-abbasi

Post on 27-Jan-2015

DESCRIPTION

In this paper, using a large amount of data collected from Twitter, the blogosphere, social networks, and news sources, we perform preliminary research to investigate whether human behavior in the real world can be understood by analyzing social media data. The goal of this research is twofold: (1) determining the relative effectiveness of a social media lens in analyzing and predicting real-world collective behavior, and (2) exploring the domains and situations under which social media can be a predictor of real-world behavior. We develop a four-step model: community selection, data collection, online behavior analysis, and behavior prediction. The results of this study show that in most cases social media is a good tool for estimating attitudes, but further research is needed for predicting social behavior.

TRANSCRIPT

Page 1: Real-World Behavior Analysis through a Social Media Lens

Data Mining and Machine Learning Lab

Real-World Behavior Analysis through a Social Media Lens

Mohammad-Ali Abbasi, Huan Liu
Computer Science and Engineering, Arizona State University

Sun-Ki Chai, Kiran Sagoo
Department of Sociology, University of Hawai`i

Page 2: Real-World Behavior Analysis through a Social Media Lens

Real world Events/Behavior

Page 3: Real-World Behavior Analysis through a Social Media Lens

Page 4: Real-World Behavior Analysis through a Social Media Lens

Page 5: Real-World Behavior Analysis through a Social Media Lens

Page 6: Real-World Behavior Analysis through a Social Media Lens

Any correlation between social media numbers and election results?

[Chart: social media numbers per candidate; values listed in slide order]

Ron Paul, Newt Gingrich, Mitt Romney, Rick Santorum: 1,520,000; 370,000; 295,000; 1,447,000; 173,000; 160,000; 900,000; 260,000

Barack Obama: 25,500,000; 12,920,000

Number of States carried?

http://en.wikipedia.org/wiki/Republican_Party_presidential_primaries,_2012

Do we observe the same difference in the votes?

Page 7: Real-World Behavior Analysis through a Social Media Lens

Objectives of the research

• Studying the correlation between real-world collective behavior and social media data

• Determining the relative effectiveness of a social media lens in analyzing and predicting real-world collective behavior

• Exploring the domains and situations under which social media can be a predictor of real-world behavior

Page 8: Real-World Behavior Analysis through a Social Media Lens

Data collection

Active methods

• Experiments

• Surveys

• Field Study

Passive methods

(By observing and analyzing)

• Behavior

• Belongings

• Documents, …

• Expensive

• Time consuming

• May be dangerous

Social Media

• People leave many clues about themselves

• Their interactions reveal much about people

• We can passively observe people’s activities

Page 9: Real-World Behavior Analysis through a Social Media Lens

Snooping

Experimental psychology suggests that a person may be understood by what happens around him

• Does what's on your desk reveal what's on your mind?

• Do those pictures on your walls tell true tales about your character?

Page 10: Real-World Behavior Analysis through a Social Media Lens

Using online data for opinion polling

• From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series

• O'Connor et al. analyzed the sentiment polarity of tweets and found a correlation of 80% with results from public opinion polls
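
A minimal sketch of the kind of comparison described above, assuming daily counts of positive and negative tweets and a daily poll figure are already available. The slide does not describe the authors' pipeline, so the ratio-based sentiment score, the 3-day smoothing window, and every number below are illustrative assumptions, not data from O'Connor et al.

```python
from statistics import mean


def moving_average(values, window):
    """Trailing moving average; early entries average over the days available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(mean(chunk))
    return smoothed


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical daily data (made up for illustration only).
positive_tweets = [120, 130, 90, 150, 160, 170, 140]
negative_tweets = [100, 100, 110, 100, 90, 85, 100]
poll_approval = [48.0, 48.5, 47.0, 49.0, 50.5, 51.0, 49.5]

# Assumed sentiment score: ratio of positive to negative tweets per day.
sentiment_ratio = [p / n for p, n in zip(positive_tweets, negative_tweets)]
smoothed_sentiment = moving_average(sentiment_ratio, window=3)

r = pearson(smoothed_sentiment, poll_approval)
print(f"correlation between smoothed sentiment and polls: {r:.2f}")
```

Smoothing matters here because day-to-day tweet sentiment is noisy, while poll series usually move slowly; the window length is a tuning choice, not something given on the slide.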

Page 11: Real-World Behavior Analysis through a Social Media Lens

Some Existing Work

• Stock market prediction using data collected from Twitter

• Box-office revenues prediction for movies

• Analyzing Arab-Spring using social media

Most of the work in the field can be classified into two categories:

• Behavior Analysis and finding a correlation

• Behavior prediction

Page 12: Real-World Behavior Analysis through a Social Media Lens

Our approach: A four-step model

Find equivalent groups in Real-World & Social Media

Collect Related Online Data from Social Media

Analyze Online Data (Behavior)

Analyze the Real-World Behavior & find correlation

Page 13: Real-World Behavior Analysis through a Social Media Lens

Experimental settings

Find a Group in the Real World and Social Media

• Select based on more stable characteristics: race, religion, primary language, and country/region of origin

• Arab Spring movement

Collect Related Online Data from Social Media

• Twitter: 35 million tweets related to the Arab Spring

• More than 1 million blog posts

• 135,000 popular Facebook pages, for data on posts, comments, and like behavior

• Data on real-world events collected from Reuters.com

Analyze Online Data (Behavior)

• Information retrieval techniques

• Sentiment polarity analysis

• Statistical methods

Analyze the Real-World Behavior

• Correlational analysis

• Multivariate regression analysis
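
As a rough illustration of the multivariate regression analysis named above, the sketch below fits an ordinary least squares model with an intercept using only the standard library. The feature names (daily tweet volume, mean sentiment), the target (a count of reported events), and the data are hypothetical; the slides do not say which variables or tools the authors used.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: list of feature tuples; targets: list of numbers.
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Synthetic data: (tweet_volume_thousands, mean_sentiment) per day, hypothetical.
features = [(1, 2), (2, 1), (3, 5), (4, 3), (0, 1)]
# Targets generated from a known rule so the fit can be checked by eye.
reported_events = [2 + 3 * volume - 1 * sentiment for volume, sentiment in features]

coefficients = fit_linear_regression(features, reported_events)
print("intercept, volume coef, sentiment coef:", coefficients)
```

With real data the fit will not be exact, and a statistics package that reports standard errors would be the more practical choice; the hand-rolled solver only shows what the method computes.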

Page 14: Real-World Behavior Analysis through a Social Media Lens

Correlation between online and real events

[Figure annotation: time at which the real-world event happened]

Page 15: Real-World Behavior Analysis through a Social Media Lens

Observations

[Figure annotation: time at which the real-world event happened]

Page 16: Real-World Behavior Analysis through a Social Media Lens

Observations

• There could be correlations between real-world events and online discussions. However,
  – Correlation does not amount to prediction
  – Poor results for small events

• Many real-world events are left uncovered
  – Influence and cascade effects cause too much non-relevant discussion in social media

• What we have experimented with
  – Finding influential people
  – Analyzing mood over the network
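
One way to probe the caveat that correlation does not amount to prediction is to check whether online activity leads or merely follows real-world events. The sketch below computes a lagged correlation between two daily series; the data are invented so that online activity reacts one day after each event, and the lag convention is an assumption of this example, not something specified on the slides.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def lagged_correlation(online, events, lag):
    """Correlate online[t] with events[t + lag].

    lag > 0: online activity precedes events (potentially predictive).
    lag < 0: online activity follows events (reactive).
    """
    if lag >= 0:
        xs, ys = online[: len(online) - lag], events[lag:]
    else:
        xs, ys = online[-lag:], events[: len(events) + lag]
    return pearson(xs, ys)


# Hypothetical daily series: event intensity, and tweet volume one day later.
events = [0, 0, 5, 0, 0, 0, 8, 0, 0, 0]
online = [0, 0, 0, 5, 0, 0, 0, 8, 0, 0]

lags = range(-3, 4)
best_lag = max(lags, key=lambda lag: lagged_correlation(online, events, lag))
best_r = lagged_correlation(online, events, best_lag)
print(f"strongest correlation at lag {best_lag}: r = {best_r:.2f}")
```

A strong correlation that peaks at a negative lag, as in this made-up series, indicates discussion that reacts to events; only a peak at a positive lag would hint at predictive value, and even then it would need validation on held-out events.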

Page 17: Real-World Behavior Analysis through a Social Media Lens

What are people concerned about

Page 18: Real-World Behavior Analysis through a Social Media Lens

Challenges

• Finding Relevant Communities
  – Analysis of Arab Spring tweets shows that 75 percent of the 1 million clicks on Libya-related tweets and 89 percent of the 3 million clicks on Egypt-related tweets came from outside the Arab world [1]
  – The fallacy of millions of followers

[1] http://www.stripes.com/blogs/stripes-central/stripes-central-1.8040/researchers-skeptical-dod-can-use-social-media-to-predict-future-conflict-1.15529
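
Working through the figures quoted above shows how small the in-region audience is. The snippet treats the click totals and percentages as the rounded values given on the slide, so the results are approximate.

```python
# Click totals and outside-region shares as quoted on the slide (rounded figures).
clicks = {"Libya": (1_000_000, 75), "Egypt": (3_000_000, 89)}

# Clicks that came from inside the Arab world, using integer arithmetic.
inside_region = {
    topic: total * (100 - outside_pct) // 100
    for topic, (total, outside_pct) in clicks.items()
}

for topic, count in inside_region.items():
    print(f"{topic}: about {count:,} clicks from inside the Arab world")
```

That leaves roughly 250,000 in-region clicks for Libya and 330,000 for Egypt, which is why raw engagement counts can overstate how well online activity represents the community being studied.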

Page 19: Real-World Behavior Analysis through a Social Media Lens

Challenges

• Data Collection
  – Sufficient coverage of the data
  – Source of data is unknown
  – Spam
  – Paid social media content

• Online Behavior Analysis
  – Unstructured, noisy text data
  – Language ambiguity

Page 20: Real-World Behavior Analysis through a Social Media Lens

Observations

• Real-World Behavior Prediction
  – Stark difference between clicking online and taking real risks in the street

Page 21: Real-World Behavior Analysis through a Social Media Lens

Conclusions

• Social media helps us understand real-world events, but it is not the sole source

• More research and development are needed to make social media a reliable source for behavior analysis

• Social event prediction using social media remains an open problem. More interdisciplinary research should be promoted.

Page 22: Real-World Behavior Analysis through a Social Media Lens

Thanks!

Mohammad-Ali Abbasi

Acknowledgments: This work is, in part, sponsored by ONR and AFOSR grants.

We are grateful for the comments from anonymous reviewers and members of DMML lab at ASU