Updated Indian elections forecast


Upload: sentimetrix

Post on 11-Aug-2014


DESCRIPTION

SentiMetrix updated its forecast for the Indian election, first made on March 6, 2014 at the Sentiment Analysis Symposium in New York. This update is based on the data collected after the SAS14

TRANSCRIPT

Page 1: Updated Indian elections forecast


SentElect™: Forecasting Elections based on Sentiments in Social Media

V.S. Subrahmanian
SentiMetrix, Inc. & University of Maryland

@vssubrah
[email protected]

Apr 19 2014

© Sentimetrix, Inc. All rights reserved 2014

This work was performed for Sentimetrix, Inc.

Page 2: Updated Indian elections forecast

SentElect™ Election Application

On May 8 2013, Sentimetrix predicted the outcome of the upcoming Pakistan election in front of 100+ people in V.S. Subrahmanian’s keynote at the Sentiment Analysis Symposium in New York City.

On May 9, the BBC said the election was too close to call, in “Pakistan Elections: Five Reasons why the vote is unpredictable”.

Sentimetrix was correct!

Page 3: Updated Indian elections forecast

SentElect™

• Currently tracks Twitter feeds on virtually any topic
  – Politicians
  – Political parties
  – Issues (in progress, expected completion April 2014)
• Identifies intensity of sentiment on each topic in each tweet.
• Forecasts trends in terms of expected number of supporters/opponents on Twitter.
• Identifies individuals who are most influential in shaping an opinion/trend.
• Provides a single dashboard to cover all of this.

Page 4: Updated Indian elections forecast

SentElect™ Functionalities and Business Use

• Functionality: Identify sentiment and changes in sentiment on any given topic.
  – Business use: Track sentiment on both your political campaign and your competitor’s.
• Functionality: Learn a model on “big data” showing how support/opposition to a topic spreads.
  – Business use: Understand how your campaign (and your opponent’s) is doing with voters and why.
• Functionality: Forecast the expected number of people who will support/oppose a topic.
  – Business use: Forecast how many people support/oppose your campaign and/or your opponent’s.
• Functionality: Identify the most important individuals responsible for shaping/spreading opinion on a topic.
  – Business use: Identify those shaping positive/negative opinion about you and see if you can get them to work on your behalf. Engage with influential Twitter users.

Page 5: Updated Indian elections forecast

SentElect™ Case Study

• Upcoming Indian election
• Identified 31 entities to track.
• Learned diffusion models from July 15 2013 – Jan 25 2014.
• Tested models on Jan 25-Feb 20 2014 data (~26 days).
• Forecast trends on all 31 entities from Feb 20 2014 to May 15 2014.
• Diffusion forecasts tested on the Jan 25-Feb 20 2014 data gave Pearson correlation coefficients consistently over 0.8, usually over 0.9.
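The accuracy figure above is a Pearson correlation between forecast and observed counts over the test window. A minimal sketch of that calculation follows. The function name `pearson` and the sample series are assumptions made for illustration; the slides do not show the actual evaluation code or the underlying daily data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily supporter counts, for illustration only (not study data).
forecast = [100, 120, 145, 160, 190]
observed = [98, 125, 140, 165, 185]
print(round(pearson(forecast, observed), 3))
```

A high coefficient only says the forecast and observed curves move together; it does not by itself measure the size of the error in absolute counts.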

SUMMARY STATISTICS

• Study reported here uses data from July 2013 to Feb 20 2014.
• Forecasts made till May 15 2014.
• 19.5M tweets studied in all
• 16M distinct Twitter accounts
• 40M edge network

Twitter collection done using Twitter ontology and semantic database developed by Rensselaer Polytechnic Institute. [@jahendler]

Page 6: Updated Indian elections forecast

BJP Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Positive support for BJP is growing at a faster rate than negatives.

• Outlook is good but more or less the same as in the March 6 forecast.

Page 7: Updated Indian elections forecast

Narendra Modi Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Positive support for Modi is growing at a much faster rate than negatives.

• Outlook is very good and has improved since our March 6 forecast.

Page 8: Updated Indian elections forecast

UPA Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Opposition to UPA exceeds support. It is also growing at a slightly faster rate.
• Outlook for the UPA is not good and has worsened slightly since the March 6 forecast.
• Number of people tweeting about UPA is way smaller

Page 9: Updated Indian elections forecast

Congress Party Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Congress has more supporters than opponents.
• Growth in support is larger than growth in opposition.
• But number of supporters is small compared to BJP.

Page 10: Updated Indian elections forecast

Rahul Gandhi Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Sentiment on Rahul Gandhi is strong and growth in supporters outweighs growth in opponents.
• But in raw numbers, he has 1/3 the supporters that Modi has.
• Outlook is good but not great.

Page 11: Updated Indian elections forecast

Arvind Kejriwal Forecast

[Chart: time axis marked at July 15 2013, Feb 24 2014, Mar 24 2014 and May 15 2014]

OUTLOOK

• Kejriwal will have more opponents than supporters by early May.
• Steep increase in both supporters and opponents around mid-December 2013.

Page 12: Updated Indian elections forecast

SentElect Summary Statistics

                        | BJP    | Narendra Modi | UPA   | Congress Party | Rahul Gandhi | Arvind Kejriwal
#Supporters Mar 24 2014 | 294848 | 96376         | 59880 | 9324           | 102541       | 54777
#Opponents Mar 24 2014  | 211002 | 43217         | 71514 | 5839           | 59958        | 42367
#Supporters May 15 2014 | 385819 | 102669        | 68926 | 11289          | 147989       | 64371
#Opponents May 15 2014  | 257902 | 48002         | 81436 | 7948           | 65820        | 71717
Accuracy (PCC*) Pos.    | 0.999  | 0.998         | 0.998 | 0.977          | 0.995        | 0.979
Accuracy (PCC*) Neg.    | 0.988  | 0.998         | 0.998 | 0.970          | 0.996        | 0.971

* Pearson Correlation Coefficient
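The figures in this table can be turned into per-entity indicators with a few lines of arithmetic. The sketch below copies the counts from the table; the helper names `support_ratio` and `growth` are assumptions for illustration and are not part of SentElect.

```python
# (supporters, opponents) per entity, copied from the summary statistics table.
MAR_24 = {
    "BJP": (294848, 211002),
    "Narendra Modi": (96376, 43217),
    "UPA": (59880, 71514),
    "Congress Party": (9324, 5839),
    "Rahul Gandhi": (102541, 59958),
    "Arvind Kejriwal": (54777, 42367),
}
MAY_15 = {
    "BJP": (385819, 257902),
    "Narendra Modi": (102669, 48002),
    "UPA": (68926, 81436),
    "Congress Party": (11289, 7948),
    "Rahul Gandhi": (147989, 65820),
    "Arvind Kejriwal": (64371, 71717),
}


def support_ratio(supporters, opponents):
    """Supporters per opponent; values above 1 mean net support."""
    return supporters / opponents


def growth(before, after):
    """Relative change between two counts, e.g. 0.25 for +25%."""
    return (after - before) / before


for name in MAR_24:
    sup_now, opp_now = MAR_24[name]
    sup_fc, opp_fc = MAY_15[name]
    print(
        f"{name}: ratio {support_ratio(sup_now, opp_now):.2f} -> "
        f"{support_ratio(sup_fc, opp_fc):.2f}, "
        f"supporter growth {growth(sup_now, sup_fc):+.0%}, "
        f"opponent growth {growth(opp_now, opp_fc):+.0%}"
    )
```

These are counts of Twitter accounts, so the ratios describe the tracked Twitter population only, not the electorate.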

Page 13: Updated Indian elections forecast


Head to Head: BJP vs. UPA/Congress

• Mar 24 2014:
  – BJP shows almost 5 times as many supporters as Congress/UPA, up in ratio from a month back.
  – BJP opponents are less than 3 times as many as Congress/UPA opponents.
  – So BJP is doing well.
• Forecast for May 15 2014:
  – BJP will have almost 3x as many supporters as opponents.
  – Congress/UPA has about 10% more opponents than supporters.
• BJP’s outlook in terms of positives and negatives shows a combined growth.
• But UPA/Congress combined negatives exceed positives.
• And support for UPA/Congress is tepid, raising the question of whether Congress/UPA supporters will show up to vote.
• In general, till May 15 2014, BJP seems to garner more support than Congress/UPA.
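A head-to-head comparison like this one reduces to ratios of supporter and opponent counts. The sketch below assumes that "Congress/UPA" means the sum of the UPA and Congress Party columns of the summary statistics; the slides do not say how the two were combined, so treat that as a guess. The helper names are hypothetical.

```python
# (supporters, opponents), copied from the SentElect summary statistics slide.
MAR_24 = {"BJP": (294848, 211002), "UPA": (59880, 71514), "Congress": (9324, 5839)}
MAY_15 = {"BJP": (385819, 257902), "UPA": (68926, 81436), "Congress": (11289, 7948)}


def combine(snapshot, names):
    """Sum supporters and opponents over several tracked entities (assumed method)."""
    supporters = sum(snapshot[n][0] for n in names)
    opponents = sum(snapshot[n][1] for n in names)
    return supporters, opponents


def head_to_head(a, b):
    """Return (supporter ratio, opponent ratio) of side a relative to side b."""
    return a[0] / b[0], a[1] / b[1]


def opponent_excess(side):
    """Fraction by which opponents exceed supporters (negative if fewer)."""
    supporters, opponents = side
    return (opponents - supporters) / supporters


alliance_mar = combine(MAR_24, ["UPA", "Congress"])
alliance_may = combine(MAY_15, ["UPA", "Congress"])
sup_ratio, opp_ratio = head_to_head(MAR_24["BJP"], alliance_mar)
print(f"Mar 24: BJP has {sup_ratio:.1f}x the supporters, {opp_ratio:.1f}x the opponents")
print(f"May 15: Congress/UPA opponents exceed supporters by {opponent_excess(alliance_may):.0%}")
```

Under the summing assumption, the May 15 figures come to roughly 11% more opponents than supporters for Congress/UPA, in line with the "about 10%" quoted above. Other ratios depend on how the two entities are combined.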

[Bar chart: Support and Opposition for BJP and UPA/Congress on 3/24 and 5/15; axis from 0 to 1000000]

Page 14: Updated Indian elections forecast


Head to Head: Narendra Modi vs. Rahul Gandhi

• Mar 24 2014:
  – Mr. Gandhi has about 5% more supporters than Mr. Modi.
  – But Mr. Gandhi has 1.4x as many opponents as Mr. Modi.
• May 15 2014:
  – In terms of supporters, Mr. Gandhi is pulling ahead of Mr. Modi, with 1.5x as many supporters as Mr. Modi.
  – But on opponents, Mr. Gandhi has 1.3x the opponents Mr. Modi has.
• This reverses a trend seen in our Mar 6 2014 forecast.
• Head-to-head, Mr. Gandhi has improved his showing between Feb 20 and Mar 24.

[Bar chart: Support and Opposition for Modi and Gandhi on 3/24 and 5/15; axis from 0 to 300000]

Page 15: Updated Indian elections forecast


Head to Head: Rahul Gandhi vs. Arvind Kejriwal

• Mar 24 2014:
  – Mr. Gandhi has 2x the supporters of Mr. Kejriwal.
  – But he has 1.4x the opponents of Mr. Kejriwal (down from 2x in our Mar 6 forecast).
• May 15 2014:
  – Mr. Gandhi will have over 2x the supporters that Mr. Kejriwal has [an about-turn from our Mar 6 forecast!]
  – Mr. Kejriwal will have 1.2x the opponents of Mr. Gandhi, a significant reduction of the ratio from last month.
• In short, Mr. Gandhi has made an about-turn in the race in terms of positives.
• Congress/UPA should outperform AAP/Mr. Kejriwal.

[Bar chart: Support and Opposition for Kejriwal and Gandhi on 3/24 and 5/15; axis from 0 to 300000]

Page 16: Updated Indian elections forecast


Head to Head: Narendra Modi vs. Arvind Kejriwal

• Mar 24 2014:
  – Mr. Modi has 1.9x as many supporters as Mr. Kejriwal.
  – But on opponents, he is more or less even with Mr. Kejriwal (a sharp reduction from our Mar 6 talk).
• May 15 2014:
  – Mr. Modi will have about 1.6x the supporters of Mr. Kejriwal.
  – Mr. Kejriwal will have about 1.5x as many opponents as Mr. Modi.
• Overall, the situation in the Modi vs. Kejriwal race has not changed much.
• Though support for Mr. Kejriwal is growing, opposition is growing at a much faster rate.
• We expect BJP to handily outperform AAP/Mr. Kejriwal.

[Bar chart: Support and Opposition for Kejriwal and Modi on 3/24 and 5/15; axis from 0 to 200000]

Page 17: Updated Indian elections forecast


Forecast Summary


Forecast #1

• Narendra Modi will be India’s next Prime Minister.

Forecast #2

• BJP (by itself) will fall short of a majority in Parliament, securing fewer than 272 seats.

Forecast #3

• Next Indian government will be a BJP-led coalition

Page 18: Updated Indian elections forecast


Forecast Risks

• Our forecast can go wrong.
  – Risk #1: Forecasting based on unsupervised learning is difficult at best. No training data connecting votes on the ground in India to number of supporters/opponents on Twitter. Selection bias.
  – Risk #2: Forecast is based on publicly available Twitter data, not on entire Twitter fire-hose.
  – Risk #3: Twitter-based and technology-based risks: geo-location issues, bots/sybils/fake accounts.
  – Risk #4: Changing situation on the ground with new allegations (e.g. corruption) emerging frequently.
  – Risk #5: External events we can’t control for (e.g. terrorist attacks) can dramatically change the electoral landscape.

Page 19: Updated Indian elections forecast


One Sybil’s strategy: @IsabellaObregom

1. Take tweet from a reputable account:
  – @AapKaJawab, an Aam Aadmi Party enthusiast, retweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in – http://t.co/bVCHPte60k”
2. Follow link, rewrap in new shortened URL
  – @AapKaJawab’s link leads to an Indian news article
  – @IsabellaObregom shrinks URL with Adf.ly, tweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in http://t.co/81cq9eyrNh”
3. @IsabellaObregom now paid per click through Adf.ly!

(In early 2014, Adf.ly and Twitter suspended account – original owner tweeted only in Spanish)

Page 20: Updated Indian elections forecast


A larger Sybil network in our dataset

• We found many Sybil/bot accounts
• @Marie____Taylor and @Amy____Jones tweet identically, except different shortened links.
  – Overlapping network of followers
  – 100K+ tweets
  – Many “smaller” inactive followers, each following 30-40 random people, with 30-40 bot followers.
  – Related: @Lea___Smith, @Megan__Martinez, etc…
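The pattern described above (accounts whose tweets are identical once the shortened link is ignored) can be flagged with a simple grouping step. The sketch below is one possible heuristic, not the method used in the study: the function names, the URL pattern, and the sample tweets and account names are all hypothetical.

```python
import re
from collections import defaultdict

# Matches http/https links up to the next whitespace (assumed good enough here).
URL_RE = re.compile(r"https?://\S+")


def normalize(text):
    """Drop URLs, collapse whitespace and lowercase, so only the wording remains."""
    without_urls = URL_RE.sub("", text)
    return " ".join(without_urls.split()).lower()


def suspicious_groups(tweets, min_accounts=2):
    """Group accounts that posted the same text apart from the link.

    tweets: iterable of (account, text) pairs.
    Returns {normalized_text: sorted list of accounts} for every text posted
    by at least min_accounts distinct accounts.
    """
    accounts_by_text = defaultdict(set)
    for account, text in tweets:
        key = normalize(text)
        if key:  # skip tweets that consisted of a link only
            accounts_by_text[key].add(account)
    return {
        text: sorted(accounts)
        for text, accounts in accounts_by_text.items()
        if len(accounts) >= min_accounts
    }


# Made-up sample data for illustration only.
sample = [
    ("@account_a", "Example headline about the election http://example.com/a1"),
    ("@account_b", "Example headline about the election http://example.com/b7"),
    ("@account_c", "A different tweet with no link"),
]
print(suspicious_groups(sample))
```

Ordinary retweets and shared headlines would also match, so a grouping like this only produces candidates; follower-network overlap of the kind listed above would be needed to confirm a Sybil cluster.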


Page 21: Updated Indian elections forecast


Page 22: Updated Indian elections forecast


SentiMetrix Contact Information

• Address: 6017 Southport Drive, 20814 Bethesda MD, USA
• E-mail: [email protected]
• www.sentimetrix.com
• Telephone: +1 240 479 9286

• V.S. Subrahmanian
• Twitter: @vssubrah
• Email: [email protected]
• www.cs.umd.edu/~vs/
• Telephone: +1 301 405 6724