election polling & forecasting 2004
DESCRIPTION
A statistics presentation I gave at work using simulation to predict election results from state-by-state polling data.
TRANSCRIPT
But first…..
Something of topical interest …
Psephology
The M&M graph…
Poll of polls for October 8th, 2004 - Posted October 15th, 2004
Dead Heat
Before the second and third debates, Bush's lead has dropped to less than 0.1%...
http://tis.goringe.net/pop/pollofpolls.html
Reweighting by Party ID?
Shifting Party Affiliation
[Chart: y-axis 0% to 100%; x-axis Year, '52 through '00 in four-year steps. Series: Strong Republican, Weak Republican, Independent Republican, Independent Independent, Independent Democrat, Weak Democrat, Strong Democrat.]
“Dewey Defeats Truman”
Chicago Tribune, Nov 3, 1948
Sources of bias:
• “distribution of telephones favored wealthy Dewey voters rather than poor Truman supporters”
• “the pollsters stopped polling two weeks or more before the election.”
• Convenience Sampling
Golden Age of Polling?
• >95% of households have a telephone
  Easy to get a random sample
• Until recently: good response
• Now: caller ID, answering machines, etc.
  30% response rate
Cell Phones: No landline → Younger, Democratic
Online Polling?
Demographics
• "As people do better, they start voting like Republicans--unless they have too much education and vote Democratic, which proves there can be too much of a good thing." --Karl Rove
Electoral College Map
Electoral College Approach
KERRY ELECTORAL VOTE PROJECTIONS from State-Level Poll Data
14-Oct-2004
polls used in estimate: beginning 10/7/04, ending 10/12/04

               number of states   Total Electoral Votes
Bush > 55%           15                 110
Bush 50-55%          12                 146              Bush total: 256
Kerry 50-55%         15                 187
Kerry > 55%           9                  95              Kerry total: 282
                     51                 538

state   EV   Estimated Kerry Vote Share   stdev   estimated probability of Kerry win if election held now*
AL       9   44.3%                        0.7%     1.0%
AK       3   37.0%                        1.2%     1.0%
AZ      10   46.1%                        3.2%     1.0%
AR       6   53.1%                        3.1%    97.8%
CA      55   54.6%                        2.8%    99.0%
KERRY ELECTORAL VOTE PROJECTIONS from State-Level Poll Data
18-Oct-2004
polls used in estimate: beginning 10/7/2004, ending 10/12/2004

               number of states   Total Electoral Votes
Bush > 55%           15                 110
Bush 50-55%          12                 146              Bush total: 256
Kerry 50-55%         15                 187
Kerry > 55%           9                  95              Kerry total: 282
                     51                 538
Basis for Simulation
State   EV   K.share   SD
AL       9   44.3%     0.7%
AK       3   37.0%     1.2%
AZ      10   46.1%     3.2%
AR       6   53.1%     3.1%
CA      55   54.6%     2.8%
CO       9   46.8%     2.2%
CT       7   55.1%     2.7%
DE       3   54.5%     2.4%
DC       3   89.5%     0.7%
Simulate Elections
1 Random                                     Electoral Votes
State   Rand#   SD      Mean     Normal      Kerry   Bush
AL      0.396   0.73%   44.35%    -5.84%               9
AK      0.281   1.20%   37.03%   -13.66%               3
AZ      0.314   3.15%   46.08%    -5.44%              10
AR      0.244   3.12%   53.15%     0.98%       6
CA      0.105   2.78%   54.63%     1.14%      55
CO      0.441   2.19%   46.76%    -3.56%               9
CT      0.113   2.70%   55.15%     1.87%       7
DE      0.44    2.40%   54.50%     4.13%       3
DC      0.474   0.73%   89.49%    39.44%       3
Elect    22-Jul      4-Aug       30-Aug      2-Sep       7-Sep
Wins     76.0%       73.0%       56.4%       37.0%       39.2%
Loses    23.6%       26.1%       42.2%       61.2%       59.7%
Ties      0.4%        0.9%        1.4%        1.8%        1.1%
Odds     3.22 to 1   2.80 to 1   1.34 to 1   0.60 to 1   0.66 to 1
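The simulation tables above can be reproduced in outline with a few lines of Python. This is a minimal sketch, not the spreadsheet behind the slides: it assumes the "Normal" column is the inverse normal CDF of the uniform random number (using that state's mean and SD) minus 50%, which matches the rows shown, and it uses only the nine states listed on the slide, so the win rates it prints say nothing about the real 51-state race. The function names and the seed are my own.

```python
import random
from statistics import NormalDist

# (state, electoral votes, estimated Kerry share, standard deviation)
# Only the nine states visible on the slide; a real run needs all 51.
STATES = [
    ("AL", 9, 0.443, 0.007), ("AK", 3, 0.370, 0.012), ("AZ", 10, 0.461, 0.032),
    ("AR", 6, 0.531, 0.031), ("CA", 55, 0.546, 0.028), ("CO", 9, 0.468, 0.022),
    ("CT", 7, 0.551, 0.027), ("DE", 3, 0.545, 0.024), ("DC", 3, 0.895, 0.007),
]


def simulated_margin(u, mean, sd):
    """Turn a uniform random number u in (0, 1) into a Kerry margin over 50%."""
    return NormalDist(mean, sd).inv_cdf(u) - 0.5


def simulate_once(rng):
    """Simulate one election; return (Kerry EV, Bush EV) for the listed states."""
    kerry = bush = 0
    for _name, ev, mean, sd in STATES:
        u = max(rng.random(), 1e-12)  # inv_cdf needs 0 < u < 1
        if simulated_margin(u, mean, sd) > 0:
            kerry += ev
        else:  # an exact 50% draw is effectively impossible; count it for Bush
            bush += ev
    return kerry, bush


def simulate_many(n_trials, seed=2004):
    """Repeat the simulation and summarize Kerry wins, losses, ties and odds."""
    rng = random.Random(seed)
    wins = loses = ties = 0
    for _ in range(n_trials):
        kerry, bush = simulate_once(rng)
        if kerry > bush:
            wins += 1
        elif kerry < bush:
            loses += 1
        else:
            ties += 1
    odds = wins / loses if loses else float("inf")
    return {"wins": wins / n_trials, "loses": loses / n_trials,
            "ties": ties / n_trials, "odds": odds}


print(simulate_many(2000))
```

The odds row in the table is simply wins divided by losses (for example 76.0% / 23.6% = 3.22 to 1), which is what the `odds` entry computes.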
Estimating States (1)
• Historical data: Compare state outcomes vs national
• Current data: Adjust national poll by historical difference
• Combine with recent state polls
Example: http://www.mydd.com/story/2004/5/28/115448/006
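One way to read the three steps above, as a rough sketch only: shift the current national poll by the state's average historical gap from the national result, then blend that with a recent state poll. The equal blending weight and all numbers in the usage lines are hypothetical; the slide does not say how the two pieces were combined.

```python
from statistics import mean


def adjusted_state_estimate(national_poll, state_history, national_history):
    """National poll shifted by the state's mean historical difference."""
    diffs = [s - n for s, n in zip(state_history, national_history)]
    return national_poll + mean(diffs)


def combine(adjusted, state_poll, state_poll_weight=0.5):
    """Weighted blend of the adjusted estimate and a recent state poll.

    The default weight of 0.5 is an assumption, not taken from the slides.
    """
    return state_poll_weight * state_poll + (1 - state_poll_weight) * adjusted


# Hypothetical numbers: the state ran 2 and 4 points below the national vote.
adjusted = adjusted_state_estimate(49.0, [44.0, 46.0], [46.0, 50.0])
print(adjusted, combine(adjusted, 48.0))
```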
Estimating States (2)
Election data: 1980-2000 (6 elections) state vote counts
• For each pair of states (2550 pairs): regress each state's vote on every other state's vote
• For each available poll: predict votes in all other states
• For each state: use the median of the predicted votes
www.pollkatz.homestead.com/files/kerryEVproj.htm
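The pairwise-regression idea can be sketched as follows. This is my reading of the steps on the slide, not the code behind the linked page: for every (polled state, target state) pair, fit a straight line on the historical results, push the current poll through it, and take the median of the predictions for each target. The state names and vote shares below are made up for illustration.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)  # zero if xs never varies
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def estimate_states(history, polls):
    """Median of the predictions made from every other polled state.

    history: {state: [vote share in each past election]}
    polls:   {state: current polled vote share}
    """
    estimates = {}
    for target, target_hist in history.items():
        predictions = []
        for source, polled in polls.items():
            if source == target:
                continue  # a state is only predicted from *other* states
            a, b = fit_line(history[source], target_hist)
            predictions.append(a + b * polled)
        if predictions:
            estimates[target] = median(predictions)
    return estimates


# Hypothetical six-election histories for three made-up states.
history = {
    "X": [40, 42, 44, 46, 48, 50],
    "Y": [45, 47, 49, 51, 53, 55],
    "Z": [40, 44, 48, 52, 56, 60],
}
print(estimate_states(history, {"X": 45, "Y": 50}))
```

With 51 states there are 51 × 50 = 2550 ordered pairs, which is where the count on the slide comes from.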
Forecasting Models
Predict Vote from: Economic conditions, social climate, incumbency, “party fatigue”, etc.
Limited by: Too little data
Result: perfect predictions of past elections
Explanatory, not predictive
Bread & Peace Equation
$$Vote_t = \alpha + \beta_1 \sum_{j=0}^{14} \lambda^{j} \, \Delta \ln R_{t-j} + \beta_2 \, CUM\,KIA_t$$
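To make the structure of the equation concrete, here is a small sketch that evaluates it for given inputs. It assumes the equation is the incumbent vote share as a constant, plus a coefficient times a λ-weighted sum of income growth over 15 periods (j = 0 to 14, most recent first), plus a coefficient times cumulative casualties. All coefficient values in the usage line are placeholders, not estimates from the model.

```python
def bread_and_peace(alpha, beta1, beta2, lam, income_growth, cum_kia):
    """Evaluate alpha + beta1 * sum_j lam**j * growth[t-j] + beta2 * cum_kia.

    income_growth[0] is the most recent period (j = 0); 15 values are expected
    because the sum on the slide runs from j = 0 to j = 14.
    """
    if len(income_growth) != 15:
        raise ValueError("expected 15 periods of income growth (j = 0..14)")
    weighted_growth = sum(lam ** j * g for j, g in enumerate(income_growth))
    return alpha + beta1 * weighted_growth + beta2 * cum_kia


# Placeholder inputs only: constant growth of 0.5 per period, no casualties.
print(bread_and_peace(46.0, 1.0, 0.0, 1.0, [0.5] * 15, 0.0))
```

With λ below 1 the older periods count for less, so growth close to the election dominates the prediction.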
[Figure: Real Income Growth and the Two-Party Vote Share of the Incumbent Party's Presidential Candidate. x-axis: weighted-average growth of real disposable personal income per capita during the presidential term (%), -1 to 4.5; y-axis: two-party vote share (%), 40 to 65. Points are labeled by election year (1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996), with annotations for Vietnam and Korea.]
Votes as a function of:
• Income growth
• War Casualties
2004: Predicts ~53% Bush
Explanatory, not predictive!
(Whose income?)
Incumbent Advantage or Disadvantage?
• Forecasting models: Incumbent advantage
• Final Polls vs. Final Votes: Incumbent disadvantage
Apparently, undecideds lean toward challenger (at about 2:1)
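A quick worked example of what a 2:1 break toward the challenger does to a final poll. The poll numbers are hypothetical, and the sketch ignores third-party candidates.

```python
def allocate_undecided(incumbent, challenger, challenger_ratio=2.0):
    """Split the undecided share challenger_ratio : 1 in the challenger's favor.

    Shares are fractions of 1; whatever is left after the two candidates is
    treated as undecided (third parties are ignored in this sketch).
    """
    undecided = 1.0 - incumbent - challenger
    to_challenger = undecided * challenger_ratio / (challenger_ratio + 1.0)
    to_incumbent = undecided - to_challenger
    return incumbent + to_incumbent, challenger + to_challenger


# Hypothetical final poll: incumbent 48%, challenger 46%, undecided 6%.
print(allocate_undecided(0.48, 0.46))
```

In this made-up case a two-point incumbent lead in the final poll turns into an even split once the undecideds are allocated, which is the sense in which the final polls show an incumbent disadvantage.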
My Favorite Forecasting Model
Electability = 4P - V - S + R + 9G + 95DCI + 95GEN + 95NUC
• Elections since 1932
• Predicts all elections since 1932
• Developed using stochastic trials (i.e., guessing until something worked)
• Source: Annals of Improbable Research
http://members.bellatlantic.net/~vze3fs8i/air/pres2004.html
2004: Bush, 70; Kerry, -20
Another Favorite
Washington Redskins, last home game prior to election
Redskins Win → Incumbent wins
Redskins Lose → Incumbent loses
• True for entire history of Washington Redskins (15 elections)
  (1932 & earlier: Boston Braves, no predictive power)
• October 31, 2004: vs Green Bay
Not anymore: 28 to 14 defeat, favors Kerry
University of Iowa's Electronic Market
http://128.255.244.60/graphs/graph_Pres04_WTA.cfm
Obligatory Bayesian Methods
A Bayesian Truth Serum for Subjective Data. Prelec, Drazen. Science, Vol 306, Issue 5695, 462-466, 15 October 2004
• Reward-based system
• Respondents “compete”
• Counterbalances tendency to agree with perceived majority
• Best: large samples, rational participants
Bayesian Truth Serum
Example:
Q1: Do you prefer painting A or B?
Q2: Which would others prefer?
• Compute information score + prediction score using sums of logarithms for each respondent, and so on
• Truth telling is a Bayesian Nash equilibrium (i.e., reduced payoffs for anything else)
• Does it work? When? For whom? For only Bayesians? What about frequentists? Under what conditions? Cost? Etc...
• A work in progress
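For the curious, the scoring can be sketched in a few lines. This follows my understanding of the rule in Prelec's paper: a respondent's information score is the log of how common their answer actually is relative to the geometric mean of everyone's predictions for it, and the prediction score penalizes predicted frequencies that diverge from the actual ones. It is a simplified sketch (it uses the whole sample for every respondent and requires strictly positive predictions), and the four respondents below are invented.

```python
import math


def bts_scores(answers, predictions, alpha=1.0):
    """Return (information, prediction, total) scores for each respondent.

    answers:     chosen option index per respondent (e.g. 0 = A, 1 = B)
    predictions: per respondent, predicted share choosing each option;
                 every entry must be > 0 and each row should sum to 1
    """
    n = len(answers)
    m = len(predictions[0])
    # Actual share of respondents choosing each option.
    x_bar = [sum(1 for a in answers if a == k) / n for k in range(m)]
    # Log of the geometric mean of the predicted shares for each option.
    log_y_bar = [sum(math.log(p[k]) for p in predictions) / n for k in range(m)]
    scores = []
    for a, p in zip(answers, predictions):
        info = math.log(x_bar[a]) - log_y_bar[a]
        pred = sum(x_bar[k] * math.log(p[k] / x_bar[k])
                   for k in range(m) if x_bar[k] > 0)
        scores.append((info, pred, info + alpha * pred))
    return scores


# Invented data: three respondents prefer painting A (0), one prefers B (1).
answers = [0, 0, 0, 1]
predictions = [[0.6, 0.4], [0.7, 0.3], [0.5, 0.5], [0.4, 0.6]]
for info, pred, total in bts_scores(answers, predictions):
    print(f"info={info:+.3f}  prediction={pred:+.3f}  total={total:+.3f}")
```

Answers that turn out to be more common than people collectively predicted earn a positive information score, which is what offsets the pull toward the perceived majority.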