social multimedia as sensors

1 Social Multimedia as Sensors Jiebo Luo Department of Computer Science University of Rochester

Upload: goergen-institute-for-data-science

Post on 20-Feb-2017


Social Multimedia as Sensors

Jiebo Luo

Department of Computer Science

University of Rochester


Big Data

Machine Learning

Computer Vision

Data Mining

Surveillance video analytics

Surgery video analysis

3D scene model

Augmented photography

Image geolocation

Visual recognition using weakly labeled big image data (people, object, action, event, activity)

Social media summarization

Media-driven recommendation

Cultural influence on social media

Crowd-sourced learning

Data analytics for healthcare

Nowcasting and forecasting

Multimodal sentiment/affect analysis

Deep user profiling & demographics

Individual or group behaviors

Wisdom of social multimedia

Non-contact sensing

Make Computers See

Let Data Speak

Medical image analysis

Changing Times


It’s not just young people


Sensor Networks: Then and Now

• A sensor network consists of spatially distributed autonomous sensors that monitor physical or environmental conditions, such as temperature, sound, light, and pressure, and cooperatively pass their data through the network to a main location.

autonomous. intelligent. active. mobile. no maintenance. multimodal. social. sentiment.

Why Images? Why Multimedia?

• Photography is the only “language” understood in all parts of the world

• “A picture is worth 1000 words”


More and More Multimedia in Social Media

A brief summary of recent influential multimedia platforms

Time          Social Network & Apps   Support of multimedia
2011, June    Twitter                 Release of integrated photo-sharing service
2012, Jan     Pinterest               Fastest site ever to break through the 10 million unique visitor mark
2012, April   Instagram               Online photo-sharing service (cost Facebook 1 billion dollars)
2012, Nov     Snapchat                One billion photos have been shared on Snapchat
2013, Jan     Twitter (Vine)          Twitter released an app for 6-second looping videos
2013, May     Tumblr                  Supports text, images, videos, quotes, or links (Yahoo! spent $1.1 billion on the acquisition)
2013, June    Facebook                Introduced its own short-video service (in Instagram)

More and more multimedia content will come in social media …

What Happens Online Every 60 Seconds

Highlight of multimedia content

216,000 images shared on Instagram

104,000 images on Snapchat

20 million photos viewed on Flickr

20,000 photos uploaded to Tumblr

72 hours of video uploaded to YouTube

SRC: http://www.wamda.com/2013/07/what-happens-online-every-60-seconds-infographic


What is Social Multimedia?

Social media: “A group of Internet‐based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user‐generated content”

Andreas M. Kaplan, Michael Haenlein (2010). "Users of the world, unite! The challenges and opportunities of Social Media". Business Horizons 53 (1): 59–68.

Social Multimedia = multimedia generated for social interactions


Heterogeneous Data Types

Link

Text

Spatial‐temporal Data

Image/Video Data


A Heterogeneous Network Model

LikeMiner: The Power of Social Interaction

Jin, Wang, Luo, Han, KDD 2011

Diversified Tourist Trajectory Patterns

londoneye, trafalgarsquare, britishmuseum, bigben, towerbridge, piccaillycircus, buckinghampalace, tatemodern, …

(1) Cluster locations with the mean-shift algorithm (27,974 photos in London)

(2) Form sequences

ID   User    Date       Sequence
1    Alice   04/26/11   londoneye -> bigben -> downingstreet -> trafalgarsquare
2    Alice   04/27/11   londoneye -> tatemodern -> towerbridge
3    Bob     04/26/11   londoneye -> bigben -> tatemodern
…    …       …          …

Yin, Cao, Han, Luo, Huang, SDM 2011
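The sequence-forming step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each photo has already been assigned a location label by the clustering step, and the photo timestamps and the `form_sequences` helper are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def form_sequences(photos):
    """Group labeled photos into one location sequence per (user, date).

    photos: iterable of (user, date, time, location_label) tuples, where
    location_label is assumed to come from an earlier clustering step.
    """
    by_trip = defaultdict(list)
    for user, date, time, label in photos:
        by_trip[(user, date)].append((time, label))

    sequences = {}
    for key, visits in by_trip.items():
        visits.sort()  # chronological order within the day ("HH:MM" strings)
        # Collapse consecutive photos at the same location into one visit.
        sequences[key] = [loc for loc, _ in groupby(label for _, label in visits)]
    return sequences


# Hypothetical photo records (timestamps invented for illustration).
photos = [
    ("Alice", "04/26/11", "09:10", "londoneye"),
    ("Alice", "04/26/11", "09:25", "londoneye"),
    ("Alice", "04/26/11", "10:40", "bigben"),
    ("Bob", "04/26/11", "11:00", "londoneye"),
    ("Bob", "04/26/11", "12:30", "bigben"),
    ("Bob", "04/26/11", "15:00", "tatemodern"),
]
sequences = form_sequences(photos)
for (user, date), seq in sequences.items():
    print(user, date, " -> ".join(seq))
```

Collapsing repeated labels keeps a burst of photos at one landmark from inflating the sequence, which is one plausible design choice among several.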


Taking it Further

• To predict other human geography metrics, such as GDP, wealth by region, unemployment rate, stock market sentiment

• To monitor health/disease, environment, ecology, social unrest, …


Twitter Health

• You can explore health patterns with the web application at GermTracker


Adam Sadilek, Henry Kautz, and Vincent Silenzio, Predicting Disease Transmission from Geo-Tagged Micro-Blog Data, AAAI 2012


Two Most Important Social Signals (IMO)

• User

• Sentiment

Events


Fine‐Grained User Profiling from Multiple Social Multimedia Platforms

Main Contributors: Quanzeng You, Sumit Bhatia*, Tong Sun* and Jiebo Luo

User Expression & Behavior

(demographics & interests)


User Demographics & Interests


Pinterest Board Recommendation for Twitter Users


User-Curated Image Collections: Modeling and Recommendation


Social Multimedia Sentiment


Affect Sensing


Sentiment Analysis in Social Media

• Sentiment is arguably the most important signal from social media

– User connections

– User preference

• Most existing methods are based on textual information only

– Comments, reviews, textual tweets, and status updates

• Twitter

– Easy? Only a limited number of words per tweet

– Difficult? Lots of noise and little information

• Questions

– Do users express themselves only using text?

– Can the emerging multimedia content provide additional useful signals?

Textual Sentiment Analysis

• Dictionary-based approach

– A lexicon contains a large number of words with sentiment labels

– Emoticons are widely used in online social networks

– Simple and effective in most cases, but fails to capture rich contextual information

• Semantic analysis

– Uses NLP-related techniques to build more robust features

– It is difficult to develop a method that works for all languages

We use sentiment140 for textual sentiment analysis

• Uses emoticons as auxiliary information

• Open API available

Source: http://www.sentiment140.com
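A dictionary-based scorer with emoticon support can be sketched as follows. This is only an illustration of the general approach and not the sentiment140 service: the tiny `LEXICON` and `EMOTICONS` tables are invented for the example, and a real lexicon would be far larger.

```python
import re

# Toy lexicon and emoticon table: illustrative assumptions, not a real resource.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}


def lexicon_sentiment(text):
    """Sum per-token polarity scores and map the total to a label."""
    score = 0
    for token in text.split():
        if token in EMOTICONS:
            score += EMOTICONS[token]
        else:
            # Lowercase and strip punctuation before the lexicon lookup.
            word = re.sub(r"[^a-z']", "", token.lower())
            score += LEXICON.get(word, 0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("I love this :)"))
print(lexicon_sentiment("awful traffic :("))
print(lexicon_sentiment("not bad at all"))
```

The last call shows the weakness noted above: the scorer sees only "bad" and ignores the negation, so context is lost.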


Image Tweets

• Image tweets: tweets that contain images

• Different users may prefer different types of tweets

Observation: Users who prefer image tweets tend to have more positive tweets

Results: Advantage over Low-level Methods

• Low-level visual feature based method [S. Siersdorfer et al.]

– SIFT, Global Color Histogram, Local Color Histogram as features

– Linear SVM as classifier

• Mid-level results are easy to interpret and amenable to modular learning

– Positive / Negative

Representative attributes:

- Aged/worn

- Constructing

- Smoke

- Stressful

Representative attributes:

- Glossy

- Playing

- Socializing

- Positive Facial Emotion

S. Siersdorfer, J. Hare, E. Minack, and F. Deng. Analyzing and predicting sentiment of images on the social web. In ACM Multimedia 2010, pages 715–718. ACM, October 2010.

Low-Level and Mid-Level Features

• Low-level visual features

– HOG: object and human recognition

– GIST: scene recognition

– SSIM: invariant scene layout

– GEO-COLOR-HIST: robust histogram features

• Mid-level visual features (selected attributes)

Mid-level Attribute Classification

• Dataset: SUN dataset (MIT)

– 102 mid-level attributes [Patterson & Hays]

– Each attribute is labeled by 3 individuals, giving a vote count from 0 to 3

– Images with more than 1 vote count as positive samples under Soft Decision (SD); images with more than 2 votes count as positive under Hard Decision (HD)

Genevieve Patterson, James Hays. SUN Attribute Database: Discovering, Annotating, and Recognizing Scene Attributes. CVPR 2012.
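The two labeling rules amount to two thresholds on the vote count. The sketch below reads "more than 1" and "more than 2" literally as strict inequalities, which is an assumption about the slide's wording; the function name is hypothetical.

```python
def attribute_labels(votes):
    """Return (soft, hard) positive flags for one image/attribute pair.

    votes: how many of the 3 annotators marked the attribute as present (0-3).
    """
    soft = votes > 1  # Soft Decision (SD): at least 2 of 3 annotators agree
    hard = votes > 2  # Hard Decision (HD): all 3 annotators agree
    return soft, hard


labels = {v: attribute_labels(v) for v in range(4)}
for votes, (soft, hard) in labels.items():
    print(votes, "SD positive" if soft else "SD negative",
          "HD positive" if hard else "HD negative")
```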


Linking Attributes to Sentiments

• Mutual Information (MI) analysis

• Attributes with the 10 highest MI values for SD and HD

Top 10   Soft Decision     Hard Decision
1        congregating      railing
2        flowers           hiking
3        aged/worn         gaming
4        vinyl/linoleum    competing
5        still water       trees
6        natural light     metal
7        glossy            tiles
8        open area         direct sun/sunny
9        glass             aged/worn
10       ice               constructing
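Mutual information between a binary attribute and a binary sentiment label can be computed directly from co-occurrence counts. The sketch below is a generic MI estimate on toy data, offered as an illustration of the ranking idea; the attribute vectors and sentiment labels are invented, and the original analysis may have used a different estimator.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Toy data: 1 = attribute present / positive sentiment, 0 = absent / negative.
sentiment = [1, 1, 1, 0, 0, 0, 1, 0]
attributes = {
    "flowers": [1, 1, 1, 0, 0, 0, 1, 0],  # perfectly aligned with sentiment
    "metal": [1, 0, 1, 0, 1, 0, 0, 1],    # unrelated to sentiment
}
ranking = sorted(attributes,
                 key=lambda a: mutual_information(attributes[a], sentiment),
                 reverse=True)
print(ranking)
```

Attributes at the top of such a ranking are the ones whose presence tells the most about the sentiment label, which is the role the top-10 lists play here.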

Results: Sentiment Classification

• Comparison between the low-level visual feature based algorithm and the mid-level attribute based algorithm

• Comparison between the mid-level visual content based algorithm and the textual content based algorithm [Wilson et al.]

T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 347-354. Association for Computational Linguistics, 2005.

Deep Learning for Image Sentiment Analysis

Convolutional Neural Network for Image Sentiment Analysis

• Domain‐transfer learning

• Boosted learning using noisy labels

Users who like to post many image tweets are more likely to have positive sentiments.

Main Contributors: Quanzeng You, Hailin Jin*, Jianchao Yang*, Jianbo Yuan and Jiebo Luo, AAAI‐2015

Weakly labelled Images

CNN for Visual Sentiment Analysis

Implemented CNN architecture (using Caffe)

[Figure: CNN architecture diagram]

Exploiting Noisy Training Data

• Expensive to manually label a large amount of training data

• Possible to gather weakly labeled images

• Progressively selecting the training set

• After each training generation, select training instance i with a probability that depends on s_i, the predicted sentiment score
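The selection formula itself does not appear in this transcript, so the sketch below only illustrates the general idea of probabilistic sample selection. The rule in `keep_probability` is a hypothetical stand-in (confident predictions are kept more often than undecided ones) and should not be read as the rule used in the paper.

```python
import random


def keep_probability(score):
    """Hypothetical rule: score is the predicted probability of the positive
    class in [0, 1]; undecided predictions (0.5) are never kept, fully
    confident ones (0.0 or 1.0) are always kept."""
    return abs(2.0 * score - 1.0)


def select_training_subset(scores, rng):
    """Sample the indices to keep for the next training generation."""
    return [i for i, s in enumerate(scores) if rng.random() < keep_probability(s)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
scores = [0.5, 0.99, 0.02, 0.55, 1.0]
kept = select_training_subset(scores, rng)
print(kept)
```

Sampling rather than hard thresholding keeps some borderline instances in play, which is one way to avoid discarding weakly labeled images too aggressively.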

Progressively Trained CNN

Progressively fine‐tune the neural network


What PCNN Learned

Experiments

• Half a million weakly labeled Flickr images from Visual Sentiment Ontology

• Two GTX Titan GPUs and 32 GB RAM

• Statistics of the data set

• Performance of CNN and PCNN on the testing images


Twitter Testing Data Set

• Employ Amazon Mechanical Turk to manually label selected Twitter images

• Assign 5 workers for each image

• Statistics of the labeling results from AMT workers on 1269 images
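With five workers per image and two possible labels, one natural way to summarize the labeling results is the majority label plus the number of workers who agreed with it. The sketch below assumes binary "pos"/"neg" labels and invented votes; the `aggregate` helper is hypothetical.

```python
from collections import Counter


def aggregate(votes):
    """Return (majority_label, agreement_count) for one image's worker votes."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count


# Hypothetical votes from 5 workers for three images.
image_votes = {
    "img1": ["pos", "pos", "pos", "pos", "pos"],
    "img2": ["pos", "neg", "pos", "neg", "pos"],
    "img3": ["neg", "neg", "neg", "neg", "pos"],
}
results = {name: aggregate(v) for name, v in image_votes.items()}

# Keep only images on which at least four workers agree.
high_agreement = [name for name, (_, n) in results.items() if n >= 4]
print(results)
print(high_agreement)
```

Because the number of workers is odd and the labels are binary, a tie cannot occur in this setup.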

Performance on the Twitter Testing Data Set

• Trained models of CNN and PCNN on the Flickr images

• With transfer of knowledge (using 5‐fold cross validation)
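The 5-fold protocol can be sketched with the standard library alone. The snippet only generates the index splits over the 1269 labeled Twitter images; how each training part is used (presumably to fine-tune the Flickr-trained model before testing on the held-out part) is an assumption, and `k_fold_indices` is a hypothetical helper.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


splits = list(k_fold_indices(1269))
for train, test in splits:
    print(len(train), len(test))
```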


Examples of Top Ranked Images

• Positive examples and negative examples

– left to right: PCNN, CNN, Sentribute, Sentibank, GCH, LCH, GCH+BoW, LCH+BoW

Joint Visual-Textual Sentiment Analysis (WSDM 2016)

• Cross‐modality Consistent Regression

Building a Large Scale Dataset for Image Emotion Recognition

• We started from 3+ million weakly labeled images of different emotions and ended up with an AMT manually labeled data set that is 30 times as large as the current largest publicly available visual emotion data set. 

• We also performed extensive benchmarking analyses on this large data set using state-of-the-art methods, including CNNs, and established a nontrivial baseline for further research by the community.

Main Contributors: Quanzeng You, Hailin Jin*, Jianchao Yang*, Jianbo Yuan and Jiebo Luo, AAAI 2016

What the Language You Tweet Says About Your Occupation 

Tianran Hu, Haoyuan Xiao, Jiebo Luo, and Thuy‐vy Thi Nguyen

ICWSM 2016


Visualization method described in (Schwartz et al. 2013) 


How to find one’s occupation


Skill list of one’s LinkedIn profile

User‐skill matrix

      DM   ML   C++   C    SQL   …   SEO   MKTG
U1    32   89   4     3    12    …   0     0      …
U2    0    0    32    42   12    …   0     0      …
U3    2    0    0     0    10    …   17    23

Apply LDA on user‐skill matrix

A word = a skill

Word count = endorsements of skills

A document = a user’s skill list

A topic = linear combination of  skills
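The mapping above can be made concrete by turning each row of the user-skill matrix into a bag-of-words "document" in which every skill is repeated once per endorsement. The sketch uses only the visible columns of the example matrix (the elided "…" columns are left out), and the helper name is hypothetical; the resulting corpus is the kind of input an off-the-shelf LDA implementation would expect.

```python
# Visible columns of the example user-skill matrix; elided columns are omitted.
skills = ["DM", "ML", "C++", "C", "SQL", "SEO", "MKTG"]
matrix = {
    "U1": [32, 89, 4, 3, 12, 0, 0],
    "U2": [0, 0, 32, 42, 12, 0, 0],
    "U3": [2, 0, 0, 0, 10, 17, 23],
}


def to_document(endorsements):
    """Repeat each skill once per endorsement, giving a bag-of-skills document."""
    document = []
    for skill, count in zip(skills, endorsements):
        document.extend([skill] * count)
    return document


corpus = {user: to_document(row) for user, row in matrix.items()}
for user, document in corpus.items():
    print(user, len(document), sorted(set(document)))
```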


about.me is a platform that connects the same user’s multiple social media accounts.

Language patterns are extracted from tweets

Occupation is extracted from LinkedIn profile

Open Vocabulary Approach 

1) Collect the tweets of people

2) Count the number of words, terms, and topics for each person

3) Compute the correlation between a person’s count of each word/term/topic and weight on each
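Step 3 can be illustrated with a plain Pearson correlation. The quantity being correlated against is cut off in the transcript; the sketch assumes it is each person's weight on an occupation topic, and all numbers below are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-user values: relative frequency of the word "code" in a
# user's tweets, and the same user's weight on a "software engineer" topic.
word_freq = [0.010, 0.002, 0.008, 0.000]
topic_weight = [0.9, 0.2, 0.7, 0.1]
r = pearson(word_freq, topic_weight)
print(round(r, 3))
```

Words with strongly positive correlations would show up as "liked" by an occupation and strongly negative ones as "disliked".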

Open Vocabulary Approach: Interesting Findings

Administrators

Like: “Make a difference”, “courage”, “honor”, “we need to”, “we must”, “we are”

Dislike: “blessing”, “pray”, “god’s”, and “can’t”

Start‐up

Like: “founders”, “investors”, “growth”, “valuation”, “companies”, and “silicon”

Most dislike: “I can’t” followed by “I don’t know”

Software Engineer

Like: “web”, “UI”, “code”, and “plugin”

Dislike: “love this!”, “so excited”, “Sunday”, “girl”, “her”, and “relationship”

Office Clerk

Like: “my life”, and other phrases related to daily life such as “woke up”, “fall asleep”

Dislike: “interesting”, “creating”, “great”

(Schwartz et al. 2013)

Personality features of different occupations

O: Openness   N: Neuroticism   A: Agreeableness   E: Extraversion   C: Conscientiousness

IBM Watson Personality Insights service API is applied to compute the Big Five 

Occupation Prediction

1) Using the words a person tweets, we can predict the occupation with reasonable accuracy (data is assumed to be balanced)

2) Software Engineer, Designer, and Editor & Writer are relatively easy to predict because they usually use specific words (especially engineers)

Sensing from a Distance

[John is holding a gun to his head]

Terminator: You cannot self‐terminate.

John Connor: No, you can't. I can do anything I want. I'm a human being, not some god‐damn robot.

Terminator: [correcting him] Cybernetic organism.

John Connor: Whatever! Either we go, and save her Dad, or so much for the Great John Connor. Because your future, my destiny, I want no part in it, I never did.

Terminator: Based on your pupil dilation, skin temperature, and motor functions, I calculate an 83% probability that you will not pull the trigger.

Tackling Mental Health

• Motivation

– Mental health is a significant problem on the rise with reports of anxiety, stress, depression, suicide, and violence

– Mental illness has been and remains a major cause of disability, dysfunction, and even violence and crime

• Challenges

– Traditional methods of monitoring mental health are expensive, intrusive, and often geared toward serious mental disorders

– These methods do not scale to a large population of varying demographics, and are not particularly designed for those in the early stages of developing mental health problems

• Opportunities

– Advances in computer vision and machine learning, coupled with the widespread use of the Internet and adoption of social media, are opening doors for a new approach to tackling mental health using physically noninvasive, low-cost multimodal sensors already in people’s daily lives

Tackling Mental Health Via Multimodal Sensing

Dawei Zhou, Jiebo Luo, Vincent Silenzio*, Yun Zhou, Jile Hu, Glenn Currier*, Henry Kautz, AAAI‐2015

Innovation

• Extracting fine-grained psycho-behavioral signals that reflect the mental state of the subject from imagery unobtrusively captured by the webcams built in most mobile devices (laptops, tablets, and smartphones). We develop robust computer vision algorithms to monitor real-time psycho-behavioral signals including the heart rate, eye blink rate, pupil variations, head movements, and facial expressions of the users.

• Analyzing effects from personal social media stream data, which may reveal the mood and sentiment of its users. We measure the mood and emotion of the subject from the social media posted by the subject as a prelude to assessing the effects of social contacts and context within such media.

• Establishing the connection between mental health and multimodal signals extracted unobtrusively from social media and webcams using machine learning methods.


Multimodal (Weak) Signals

[Figure: head movement rate over time (min) for positive, neutral, and negative moods]

[Figure: pupil diameter over time (min) for positive, neutral, and negative moods]

[Figure: mouse wheel, mouse moving distance, mouse click, and key stroke activity for positive, neutral, and negative moods]

Pattern Classification and Mining


Experiments

• Experiment I

– 27 participants (16 females and 11 males), including undergraduate students, PhD students, and faculty, with different backgrounds in terms of education, income, and disciplines. The age of the participants ranges from 19 to 33, consistent with the age of the primary users of social media.

• Experiment II

– 5 depression patients (2 severe/suicidal and 3 moderate) and a control group of five normal users. FaceTime with doctors.

• Approaching Terminator T-600 (83%)

Table 3. Leave-One-Subject-Out Test for Experiment 1.

           TP     FP     Prec.   Rec.   F-1    AUC
Negative   0.89   0.08   0.82    0.89   0.84   0.95
Neutral    0.56   0.13   0.67    0.56   0.59   0.79
Positive   0.78   0.17   0.76    0.78   0.75   0.91

Table 4. Leave-One-Subject-Out Test for Experiment 2.

            Patients vs. Control   Patients vs. Control   Patients vs. Control
            in positive mood       in negative mood       in neutral mood
precision   0.814                  0.817                  0.813
recall      0.674                  0.738                  0.717

Deployment

• Physically non-invasive

– Detecting emotional information using both online social media and passive sensors

– No specialized wired or wireless invasive sensors

– Can potentially enhance the effectiveness and quality of new services delivered online or via mobile devices in current depression patient care

– Can incorporate other sensors

• Mobile app

– Self-awareness

– Self-management

– Informed intervention

Understanding the Pulse of Our Society

• Social interactions and social activities

• Public health surveillance

• Web sentiment analysis and trend prediction

• Cyber terrorism, extremism, and activism

• Fads and infectious ideas

• Marketing intelligence analytics

• Traffic and human mobility patterns

• Human and environment

• Social unrest, protest and riot

Social Multimedia‐based Prediction of Elections

Prediction for the swing states  in 2012 US Presidential Election

Social images can act like a prism to reveal split public opinions

Main Contributors: Quanzeng You, Liangliang Cao*, Junhuan Zhu, John R. Smith* and Jiebo Luo

Competitive Vector Auto Regression

Textual and Visual Sentiment

Negative Campaign


Fine‐Grained Analysis of the 2016 Election

America Tweets China: Analysis of State and Individual Characteristics Regarding Attitudes towards China

Main Contributors: Yu Wang and Jiebo Luo, IEEE Big Data Conference, 2015


Using Social Multimedia to Solve Social Problems

Main Contributors: Ran Pang, Jiebo Luo, and Henry Kautz

Drinking Levels among Youth

The CDC 2011 Youth Risk Behavior Survey found that among high school students, during the past 30 days:

• 39% drank some amount of alcohol.

• 22% binge drank.

• 8% drove after drinking alcohol.

• 24% rode with a driver who had been drinking alcohol.

Consequences of Underage Drinking

• School problems, such as higher absence and poor or failing grades.

• Social problems, such as fighting, physical and sexual assault.

• Legal problems, such as arrest for driving or physically hurting someone while drunk.

• Physical problems, such as hangovers or illnesses.

• Unwanted, unplanned, and unprotected sexual activity.

• Higher risk for suicide and homicide.

• Alcohol‐related car crashes and other unintentional injuries.

• Abuse of other drugs.

Human Behavior

• Distributed Sensing 

• Integrated Mining

Time Pattern of Underage Alcohol Use

Main Contributors: Ran Pang, Jiebo Luo, and Henry Kautz

NYC

ALL


Brand Influence in Underage Alcohol Use

Main Contributors: Ran Pang, Jiebo Luo, and Henry Kautz

               Vodka 1   Vodka 2   Champagne   Beer 1   Beer 2
Young Male     6.43%     6.79%     6.10%       13.21%   10.95%
Adult Male     29.69%    42.16%    24.27%      52.41%   51.91%
Young Female   19.76%    15.12%    19.49%      11.58%   12.17%
Adult Female   44.12%    35.93%    50.14%      22.79%   24.97%

EXPERIMENTS (3): Youth Exposure to Alcohol Media

ONGOING DIRECTIONS

• Mining deeper level patterns in terms of factors such as family income, rural vs. urban, coastal vs. heartland regions, as well as social influence by peers in the social networks

• Combining the proposed approach with surveys, which can be used to verify the findings from social media data mining.

• Applying this methodology to other social problems that involve youth, such as tobacco, drugs, teen pregnancy, unsafe sex, unsafe driving, obesity, stress, and depression.

Drug Image Classification

• Fine‐tuned CNN

• Starting with the pre‐trained VGG Net

• Fine‐tuned CNN features + SVM

• Using noisy data downloaded from Google

• Fine‐tuned data statistics

• Instagram photos

label   pills   bottle   weed   total   Non‐drug
#       2421    1233     675    4329    12253

Main Contributors: Xitong Yang, Meredith McCarron, Lacey Kelly, Jiebo Luo

Drug Use Patterns from Instagram

Main Contributors: Yiheng Zhou, Numair Sani, Jiebo Luo

Big cities vs. Small cities

1. Different mobility patterns?

2. Exciting vs. Routine?

3. Stressful vs. Relaxed?

4. Fast vs. Slow?

Geo-tagged social media makes it possible to understand various lifestyles in different cities at scale

Data‐Driven Lifestyle Patterns

Human Mobility and Human‐Environment Interaction

Main Contributors: Yuncheng Li, Jifei Huang, and Jiebo Luo (ICIMCS 2015)

Geotagged tweets

Morning and evening rush hours

Haze

Dehazed

Experiments: Metrics

• Spearman correlation coefficients

– rank correlation

• Haze level:

– ordinal data

– sign is irrelevant

• The metric:

– absolute Spearman coefficients
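The metric can be sketched without external libraries: rank both variables (averaging ranks for ties, which matter for ordinal haze levels), take the Pearson correlation of the ranks, and drop the sign. The haze levels and the paired signal below are invented values for illustration, and the helper names are hypothetical.

```python
from math import sqrt


def ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def abs_spearman(xs, ys):
    """Absolute Spearman rank correlation: Pearson on ranks, sign dropped."""
    return abs(pearson(ranks(xs), ranks(ys)))


haze_level = [0, 2, 2, 3, 1]            # hypothetical ordinal haze levels
signal = [0.9, 0.4, 0.5, 0.1, 0.7]      # hypothetical measurement per day
print(round(abs_spearman(haze_level, signal), 3))
```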


When Do Luxury Cars Hit the Road?

From Catwalk to Main Street

Main Contributors: Kezhen Chen, Kuan‐Ting Chen*, Peizhong Cong, Winston Hsu, Jiebo Luo (MM ‘15 Grand Challenge)

Motivations

• In modern times, a growing number of people pay attention to fashion, and the masses tend to emulate what large-city residents and celebrities wear

• Investigating fashion trends is of great interest to the industry and academia because of the potential for boosting many emerging applications, such as clothing recommendation, advertising by clothing brand association, etc.

Approach

1. Constructing a large dataset from the New York Fashion Shows and New York street chic in order to understand the likely clothing fashion trends in New York

2. Utilizing a learning‐based approach to discover fashion attributes as the representative characteristics of fashion trends, and

3. Comparing the analysis results from the New York Fashion Shows and street‐chic images to verify whether the fashion shows have actual influence on the public in New York City.

From Catwalk to Main Street

Main Contributors: Kezhen Chen, Kuan‐Ting Chen*, Peizhong Cong, Jiebo Luo

Where to go for dinner tonight?

1. Different occasions

2. Different ambience

3. Formal, casual, trendy, fun?

4. Friends, couple, family?


RECAP: Organic Sensor Networks

• 52% of adults use online social networks (teenagers?)

– Smartphone access (>1 billion)

• Real time (“In the moment”)

• Location aware

• Detailed measurements at a population scale

– No active user participation

• Fine granularity

• Timely

• Multimodal sensing from multimedia data

– Advantages

• autonomous. intelligent. active. mobile. no maintenance. multimodal. social. sentiment.

– Inference & Prediction

Trends and Open Issues

• Heterogeneous multimedia

• Heterogeneous information networks

• Intelligence from visual data is increasingly crucial

• From information fusion to information integration

• Separating wheat from chaff

• Geospatial analysis

• Mobile platform/context

• (Data) Rich gets richer!

• Tighter collaboration between industry and academia

• Continuing search for killer apps