bayesian analysis using mcmc on survey data
TRANSCRIPT
-
8/17/2019 Bayesian Analysis Using MCMC on Survey Data
1/35
IMPROVING FORECASTING OF POLITICAL POLLING OUTCOMES
IMPROVING FORECASTING
OF POLITICAL POLLING OUTCOMES
By
Lancelot Muwayi and Sanoj Kumar
Supervisor: Dr. Ming-Long Lam
A Capstone Project
Submitted to the University of Chicago
in partial fulfillment of the requirements for the degree of
Master of Science in Analytics
Graham School of Continuing Liberal and Professional Studies
(August, 2015)
The Capstone Project committee for Lancelot Muwayi and Sanoj Kumar
certifies that this is the approved version of the following capstone project report:
IMPROVING FORECASTING OF POLITICAL POLLING OUTCOMES
APPROVED BY SUPERVISING COMMITTEE:
Dr. Ming-Long Lam ____________________________________
Dr. Sema Barlas _____________________________________
Abstract
This research explores the extent to which Bayesian estimators improve the
forecasting of political polling outcomes. A Bayesian model is built using a two-step
process. The first step applies decision trees to select significant demographic
variables; the second step uses the Markov Chain Monte Carlo (MCMC) method to
estimate the Bayesian model. The Bayesian model takes into account sampling
variation and ensures the stability of parameter estimates by weighting them with
the prior.
Key Words
Bayesian estimator, forecasting of polling outcomes, MCMC, decision tree
Executive Summary
Incomplete and noisy data from disparate sources call for non-conventional
statistical tools for correct analysis. Bayesian methods are beneficial for obtaining
robust estimates and combining information from disparate sources. This research
examines the suitability of Bayesian methods in estimating the parameters of a model
that predicts election outcomes on the basis of polling data in which parameters of
earlier models could be used as priors. A two-stage process, CART and CHAID
decision trees for variable selection and the Markov Chain Monte Carlo (MCMC) method,
is used to predict approval of the Obama presidency based on race, gender, education,
and region of respondents. Results show that black respondents from the U.S. southern
region approve of Obama
Table of Contents
Introduction
Problem Statement
Research Purpose
Research Question
Background
Methodology
Exploratory Data analysis
Building Classification Model
Bayesian Logistic Regression Model
PROC MCMC with Bayesian Logistic Regression Model
Final Results
Exploratory Analysis Results
Factors Selection
Building Classification Model Results
Summary and Conclusions
Recommendations
Further Model Development
Appendix
Bibliography
List of Tables
Table 1: Aggregated Data for Model
Table 2: Interaction of Race and Region
Table 3: Interaction of Race and Sex
Table 4: Interaction of Race and Education
Table 5: Important Factors
Table 6: Output Parameters of Model
Table 7: Random Effects Parameters
Table 8: Posterior Summaries of Parameter Estimates
List of Diagrams
Diagram 1: Black Only as a factor
Diagram 2: South as a factor
Diagram 3: Diagnostics Plot for Beta1
Diagram 4: Diagnostics Plot for Beta2
Diagram 5: Diagnostics Plot for Beta3
Diagram 6: Diagnostics Plot for Beta4
Introduction
Problem Statement
Every four years, Americans elect their President for a new term. Since this is
an important decision for the nation, political groups have been conducting election
polls since the 1930s. Millions of Americans have been surveyed on their political
opinions, yielding a wealth of information that political scientists can utilize to their
full advantage (Caughey and Warshaw, 2003). State-level pre-election survey data
represent a rich new source of information for both forecasting election outcomes and
tracking the evolution of voter preferences during the campaign (Linzer, 2013).
However, major hurdles still exist that are affecting survey researchers,
including low response rates, rising costs to carry out the surveys, and the demand for
quicker turn-around times. These factors have increased the demand for new ways to
generate accurate and timely polling estimates of public opinion and social behaviors
by using incomplete or noisy data and combining insights from different surveys and
other sources. Researchers are faced with the task of doing more with disparate
datasets.
In the arena of political polling, for example, Linzer (2013) proposed a
Bayesian approach to improve forecasting election outcomes from state polls using
recent advances in Bayesian methodology (see also Jackman, 2005; Carlin and Louis,
2009; and Ntzoufras, 2009). Bayesian inference is normally used to update a
previously estimated probability given new information (Gelman et al., 2004).
This study uses polls conducted from January 2012 to March 2012, which
asked nearly 3,000 voters in each state about their approval of the Obama presidency
on a seven-point scale, ranging from strongly disapprove to strongly approve. The
channel through which data was collected is not known, and the collected data are
sparse, noisy, and come from unrepresentative samples.
The goal of this study is to build a statistically robust model to relate approval
ratings of the Obama presidency to characteristics of the populations using the survey
data provided by Ipsos.
Research Purpose
The primary purpose of this research is to build a model to predict approval
ratings of the Obama presidency and to identify significant predictors. The secondary
purpose is to examine the suitability of Bayesian methods. The last, but not the least
important, purpose is to develop custom SAS code that our Capstone sponsor can
reference.
Ipsos Public Affairs, our Capstone sponsor, maintains an active program of
research that integrates Bayesian and other methodologies into its polling and broader
research practices. This research assists and advises the company
also useful in combining information from disparate sources. Parameters of the models
from earlier polls can be used as priors for the parameters of the model for the new
poll.
The particular case study at the center of this project is U.S. state-level mid-term
elections, using historical data sources as well as partial or incomplete polling data and
other data (including simulated data). We worked closely with Ipsos methodologists
and polling experts to advance their forecasting and polling capabilities, as well as to
broaden their program of research on nonprobability sampling.
Research Question
We are tasked to develop a free-standing and flexible SAS program that will
implement the Bayesian model estimation, generate correct standard errors, and allow
for the eventual introduction of additional target/model variables. The model will
allow for integration of other specialized attitudinal measures, in addition to the
parameters used in this project. The project asks this question: to what extent are
Bayesian methods suitable for estimating parameters of a model that predicts election
outcomes on the basis of polling data, where parameters of earlier models can be used
as priors?
Background
The focus of this project is to analyze survey data related to the
presidency of Obama and to identify the significant predictors that affected his
approval rating across different states or regions. The polls used in our study were
conducted as random samples of registered or likely voters. Sample data is sparse and
noisy. In some states / regions, there are more respondents than in other states /
regions. In our data, the response variable is categorical in nature. We therefore had to
use a classification technique. We started looking for different statistical methods to fit
our needs.
Our data is factorial in nature: some factors are observed within other factors.
For instance, different levels of race are observed in different regions. Within each
race we have two genders, and so on. Our goal is to understand the interaction effects
among these factors. Therefore, we started by developing a SAS program for
multi-level regression of polling data.
To understand multi-level regression on survey data, we first analyzed case
studies conducted by a number of researchers. Gelman and Ghitza, in "Deep
Interactions with MRP: Election Turnout and Voting Patterns among Small Electoral
Subgroups", identified how multilevel regression derives the voting pattern and
turnout estimates based on small subgroups of the population. MRP stands for
multi-level regression and post-stratification. They analyzed how to fit a multilevel
regression model that includes group-level predictors as well as unexplained variation
at each of the levels of the factors and their interactions.
Ipsos wants to use its prior knowledge on the parameter estimates. Therefore,
an application of Bayesian analysis became inevitable in this project. To understand
Bayesian analysis on polling data, we studied Linzer
forecasting approach developed in political science and economics with the
poll-tracking capabilities made feasible by the recent upsurge in state-level opinion polls.
In the case study "Electoral Forecasting and Public Opinion Tracking in Latin
America: An Application to Chile", Bunker et al. argued that Bayesian inference is
well suited to estimate true public opinion. Bayesian inference is normally used to
update a previously estimated probability given new information (Gelman et al.,
2004).
Bayesian methods are derived from the application of Bayes' theorem. For events
A and B, Bayes' theorem is expressed as
$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}$$
in which the degree of belief in proposition A given evidence B is equal to the joint
probability of A and B divided by the probability of B. This theorem has been frequently
applied in electoral studies, including research that has been used in public opinion polls
to estimate electoral returns (Fernandez-I Marin, 2011; Lock and Gelman, 2010;
Jackman, 200[…]; Linzer, 2013; Strauss, 2007). As such, it has been found to increase the
overall accuracy of estimates (see Brooks and Gelman, 1998; Gelman and Rubin, 1992)
and (Jackman, 2000, 200[…]).
• Pr(A) is the prior degree of belief in A
• Pr(A|B) is the posterior degree of belief in A, that is, the belief in A
after looking at the evidence (B)
It can also be written as
$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B \mid A)\,\Pr(A) + \Pr(B \mid \bar{A})\,\Pr(\bar{A})}$$
in which $\bar{A}$ means not A. If A is the parameter $\theta$ and B is the data y, then we have
$$\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\,\Pr(\theta)}{\Pr(y)} = \frac{\Pr(y \mid \theta)\,\Pr(\theta)}{\Pr(y \mid \theta)\,\Pr(\theta) + \Pr(y \mid \bar{\theta})\,\Pr(\bar{\theta})}$$
The quantity Pr(y) is the marginal probability, and it serves as a normalizing constant to
ensure that the probabilities add up to unity. Because Pr(y) is a constant, we can ignore it
and write
$$\Pr(\theta \mid y) \propto \Pr(y \mid \theta)\,\Pr(\theta)$$
Thus, the prior $\Pr(\theta)$ is being updated with the likelihood $\Pr(y \mid \theta)$ to form the
posterior distribution $\Pr(\theta \mid y)$.
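As a small numerical illustration of the two-term form of Bayes' theorem above, the following Python sketch (Python is used here only for illustration; the report's own code is SAS) computes Pr(A|B) from a prior and two likelihoods. The function name `posterior` and all numbers are hypothetical and do not come from the survey data.

```python
def posterior(prior_a, lik_b_given_a, lik_b_given_not_a):
    """Pr(A|B) from Bayes' theorem, expanding Pr(B) over A and not-A."""
    numerator = lik_b_given_a * prior_a
    evidence = numerator + lik_b_given_not_a * (1.0 - prior_a)
    return numerator / evidence


# Hypothetical inputs, chosen only to make the arithmetic easy to follow:
# Pr(A) = 0.40, Pr(B|A) = 0.30, Pr(B|not A) = 0.20.
p = posterior(prior_a=0.40, lik_b_given_a=0.30, lik_b_given_not_a=0.20)
print(round(p, 4))
```

Here the numerator is 0.30 x 0.40 = 0.12 and the evidence is 0.12 + 0.20 x 0.60 = 0.24, so the prior belief of 0.40 is revised upward after observing B.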
In a nutshell, Bayesian analysis updates beliefs about the parameters by
accounting for additional data. We need to weight the likelihood for the data with the
prior distribution to produce the posterior distribution. If we want to estimate a parameter
$\theta$ from data $y = \{y_1, \ldots, y_n\}$ by using a statistical model described by the density
$p(y \mid \theta)$, Bayesian analysis says that we cannot determine $\theta$ exactly but we can describe
the uncertainty by using probability statements and distributions. We can formulate a prior
distribution $\pi(\theta)$ to express our beliefs about $\theta$. We then update these beliefs by
combining the information from the prior distribution and the data, described with the
statistical model $p(y \mid \theta)$, to generate the posterior distribution $p(\theta \mid y)$:
$$p(\theta \mid y) = \frac{p(\theta, y)}{p(y)} = \frac{p(y \mid \theta)\,\pi(\theta)}{p(y)} = \frac{p(y \mid \theta)\,\pi(\theta)}{\int p(y \mid \theta)\,\pi(\theta)\,d\theta}$$
In general, any prior distribution can be used, depending on the available prior
information. The choice can include an informative prior distribution if something is
known about the likely values of the unknown parameters, or "diffuse" or
"non-informative" priors if either little is known about the coefficient values or if one
wishes to see what the data themselves provide as inferences. Non-informative prior
distributions play a minimal role in the posterior distribution.
The Bayesian approach is attractive to researchers
because it provides a mechanism for combining a prior probability distribution for the
states of nature with sample information to provide a revised (posterior) probability
distribution about the states of nature. These posterior probabilities are then used to
make better decisions.
A previous posterior distribution can also be used as a prior when new
observations become available. Inference with sparse and noisy data proceeds in the same
manner as if one had a large sample. The approach provides interpretable answers, such as
"the true parameter θ has a probability of 0.95 of falling in a 95% credible interval." It
also provides a convenient setting for a wide range of models, such as hierarchical models
and missing data problems.
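The idea that an earlier posterior can serve as the prior for a new poll can be sketched with a conjugate Beta-binomial update. This is an illustrative Python sketch, not the report's SAS code; the Beta(1, 1) starting prior, the function name `update_beta`, and the poll counts are assumptions made for the example.

```python
def update_beta(a, b, successes, failures):
    """Posterior Beta(a', b') after observing binomial data under a Beta(a, b) prior."""
    return a + successes, b + failures


prior = (1.0, 1.0)  # hypothetical flat prior on the approval proportion

# Poll 1 (hypothetical): 45 approve, 55 disapprove.
post1 = update_beta(*prior, successes=45, failures=55)

# Poll 2 (hypothetical): the posterior from poll 1 is used as the prior.
post2 = update_beta(*post1, successes=30, failures=20)

# Analysing both polls at once from the original prior gives the same answer.
batch = update_beta(*prior, successes=75, failures=75)

posterior_mean = post2[0] / (post2[0] + post2[1])
print(post2, batch, posterior_mean)
```

Updating poll by poll and updating with the pooled counts lead to the same posterior, which is why parameters estimated from earlier polls can be carried forward as priors.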
In case the posterior distribution in Bayesian analysis does not have a closed
form, one can apply the Markov Chain Monte Carlo (MCMC) simulation methods for
any sample size and obtain accurate estimates of parameters.
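To make the simulation idea concrete, here is a minimal random-walk Metropolis sketch in Python. It is not the sampler used by PROC MCMC and not the report's model: it assumes a single approval proportion p with a uniform prior and hypothetical data (30 approvals out of 100), a case where the exact posterior, Beta(31, 71), is known and can be used as a check.

```python
import math
import random


def log_posterior(p, y, n):
    """Log of the unnormalised posterior: binomial likelihood times a uniform prior."""
    if not 0.0 < p < 1.0:
        return float("-inf")
    return y * math.log(p) + (n - y) * math.log(1.0 - p)


def metropolis(y, n, draws=20000, burn_in=2000, step=0.05, seed=1):
    """Random-walk Metropolis: propose p + noise, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, y, n)
    kept = []
    for i in range(draws + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, y, n)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:  # discard the burn-in draws
            kept.append(current)
    return kept


samples = metropolis(y=30, n=100)
posterior_mean = sum(samples) / len(samples)
exact_mean = 31 / 102  # mean of the exact Beta(31, 71) posterior

ordered = sorted(samples)
lower = ordered[int(0.025 * len(ordered))]
upper = ordered[int(0.975 * len(ordered))]
print(posterior_mean, exact_mean, (lower, upper))
```

The simulated mean should sit close to the exact value, and the 2.5% and 97.5% sample quantiles give an equal-tailed 95% credible interval of the kind described earlier.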
The MCMC procedure uses the Markov Chain Monte Carlo technique to
estimate the model parameters and to produce correct standard errors and confidence
limits. Markov chain Monte Carlo is a general computing algorithm that has been
widely used in many scientific disciplines, including statistics. The posterior
distribution often involves a high-dimensional integration. The function of Monte
Carlo
parameter of demographic information had multiple levels. Race consisted of five
levels: black only, white only, Hispanics, DK (Don
in these interactions. However, as a result of analyzing the three tables mentioned in
the Final Results section, we found that race emerged as a dominant factor and that,
among races, black only and white only were the most prominent. The interactions of
black only with south region and white only with no college education appeared significant.
To verify the above findings, we used CHAID and CART in SAS. Factors
selection in the Final Results section discusses how significant factors were chosen.
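How a CART-style tree ranks candidate splits can be sketched by comparing the reduction in Gini impurity that each split achieves. The Python sketch below is illustrative only: the counts are hypothetical rather than taken from the Ipsos data, the helper names are assumptions, and CHAID uses a chi-square criterion instead of Gini.

```python
def gini(approve, disapprove):
    """Gini impurity of a node with two classes."""
    total = approve + disapprove
    if total == 0:
        return 0.0
    p = approve / total
    return 2.0 * p * (1.0 - p)


def gini_gain(left, right):
    """Impurity reduction from splitting a parent node into left and right children.

    Each child is a tuple (approve_count, disapprove_count).
    """
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    parent = gini(left[0] + right[0], left[1] + right[1])
    children = (n_left / n) * gini(*left) + (n_right / n) * gini(*right)
    return parent - children


# Hypothetical counts: (approve, disapprove) on each side of a candidate split.
candidates = {
    "race == Black Only": ((90, 10), (310, 590)),
    "sex == Male": ((190, 290), (210, 310)),
}

gains = {name: gini_gain(*sides) for name, sides in candidates.items()}
best = max(gains, key=gains.get)
print(gains, best)
```

With these made-up counts the race split separates approvers from disapprovers far better than the sex split, so a tree would split on race first.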
Building Classification Model
Bayesian Logistic Regression Model
Our target variable, Obama approval, has two levels, namely Approve and
Disapprove. It can be represented in binary format. We therefore used a logistic
regression model. As discussed earlier, Bayesian analysis needs to be incorporated in
our model. Therefore, a Bayesian logistic regression model is the best option to use in
this scenario.
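The link between a linear predictor and a binary approval outcome can be sketched as follows. This Python sketch is for illustration only; the helper names are assumptions, and it shows the standard logistic (inverse-logit) function together with the binomial log-probability used for grouped counts.

```python
import math


def logistic(x):
    """Inverse-logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability; the inverse of logistic()."""
    return math.log(p / (1.0 - p))


def binomial_log_pmf(k, n, p):
    """Log-probability of k approvals among n respondents with approval probability p."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1.0 - p)


print(logistic(0.0))            # a linear predictor of 0 corresponds to even odds
print(logit(logistic(0.7)))     # logit undoes logistic
print(math.exp(binomial_log_pmf(3, 5, 0.3)))
```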
From Factors selection of the Final Results section, we chose four factors to build our
model, and those are: Black, White, No College among education, and South among
regions. First, we prepared our data for a logistic regression random-effects model. We
dummy coded the categorical variables. For example, we chose only two races (Black
and White) during factor selection. Therefore, in the Black
Finally, the Approval value is 1 when Obama is approved, and 0 otherwise, i.e.,
Disapproval. We tabulated the 24 (2 × 3 × 2 × 2) possible values of the target and the
above factors using the programming code in the appendix. Then for modeling purposes,
when Approval is 1, we calculated total counts for all categories discussed above, for which
we generated this data:
Table 1. Aggregated Data for Model
Group  Black  White  Non-College  Region South  Approval  Total
- 2 2 2 2 D 13D
0 2 2 2 3 24 3322
1 2 2 3 2 1@2 ?
2 2 2 3 3 31 ?D1
3 2 3 2 2 42 3D
4 2 3 2 3 132D @??
5 2 3 3 2 3?2@ @@?
6 2 3 3 3 43 1@@@
7 3 2 2 2 2? 4
-8 3 2 2 3 12 33
-- 3 2 3 2 31 32
-0 3 2 3 3 1 2
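The dummy coding and aggregation that produce a table of this shape can be sketched in Python (the report does this in a SAS data step; see the appendix). The respondent records below are hypothetical, and the field names `race`, `education`, `region`, and `approve` are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def dummy_code(row):
    """Return the indicators (black, white, non_college, south) for one respondent."""
    return (
        int(row["race"] == "Black Only"),
        int(row["race"] == "White Only"),
        int(row["education"] == "No college"),
        int(row["region"] == "South"),
    )


# Black and White cannot both be 1, so the four indicators define 3 x 2 x 2 = 12 groups.
cells = [c for c in product((0, 1), repeat=4) if not (c[0] and c[1])]

# Hypothetical respondents, for illustration only.
respondents = [
    {"race": "Black Only", "education": "College", "region": "South", "approve": 1},
    {"race": "Black Only", "education": "College", "region": "South", "approve": 1},
    {"race": "White Only", "education": "No college", "region": "South", "approve": 0},
    {"race": "White Only", "education": "No college", "region": "South", "approve": 1},
    {"race": "Hispanic", "education": "College", "region": "West", "approve": 1},
]

approvals, totals = Counter(), Counter()
for r in respondents:
    key = dummy_code(r)
    totals[key] += 1               # respondents in the group
    approvals[key] += r["approve"]  # approvals in the group

for cell in cells:
    if totals[cell]:
        print(cell, approvals[cell], totals[cell])
```

Each printed row corresponds to one group: its indicator pattern, the number of approvals, and the group total, which is the input format a grouped binomial model needs.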
From the above table, we deduce that when races are neither Black nor White,
education level is other than no college, and region is other than south, total
respondents who approved of Obama
We built the model on the above data using factors Black, White, education, and
region as fixed effects and interaction of Black with region and White with education
as random. Then we fit a Bayesian logistic model in PROC MCMC.
We modeled each respondent
beta2·white + beta3·edu + beta4·region + beta5·black·region + beta6·white·edu and
variance σ².
Both of the above models are equivalent. In the first model, the random effect
η_i is centered at 0 in the normal distribution, and in the other model, δ_i is centered at
the regression mean. This hierarchical centering improves mixing.
Based on the above logic, the PROC MCMC program statements are:
   proc mcmc data=last outpost=postout seed=332786 nmc=30000 thin=10;
      /* regression coefficients and random-effects variance, with initial values */
      parms beta0 0 beta1 0 beta2 0 beta3 0 beta4 0 beta5 0 beta6 0 s2 1;
      prior s2 ~ igamma(0.01, s=0.01);
      prior beta: ~ general(0);
      /* regression mean */
      w = beta0 + beta1*black + beta2*white + beta3*edu + beta4*region
          + beta5*black*region + beta6*white*edu;
      random delta ~ normal(w, var=s2) subject=groups;
      pi = logistic(delta);
      model count ~ binomial(n = n, p = pi);
   run;
The PROC MCMC statement specifies the input/output datasets, sets a seed for the
random number generator, requests 30,000 simulation iterations, and thins the Markov
chain by 10, so that 3,000 draws are retained. The PARMS statement declares the model
parameters, namely the regression coefficients and the variance s2, together with their
initial values. The PRIOR statements specify the prior distributions for beta and s2.
The symbol w calculates the regression mean, and the RANDOM statement
specifies the random effects, with a normal prior distribution centered at w with
variance s2. The SUBJECT option indicates the group index for the random effects.
The symbol pi is the logistic (inverse-logit) transformation of delta. The MODEL
statement specifies the response variable count as a binomial distribution with
parameters n and pi.
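Written out in symbols, the statements described above correspond to the following model. This is a reconstruction from the PROC MCMC statements, not a formula quoted from the report; the group subscript $i$ and the symbols $y_i$ and $n_i$ (for the variables count and n) are notational assumptions.

```latex
\begin{aligned}
w_i &= \beta_0 + \beta_1\,\mathrm{black}_i + \beta_2\,\mathrm{white}_i
       + \beta_3\,\mathrm{edu}_i + \beta_4\,\mathrm{region}_i
       + \beta_5\,\mathrm{black}_i\,\mathrm{region}_i
       + \beta_6\,\mathrm{white}_i\,\mathrm{edu}_i \\
\delta_i &\sim N(w_i,\ \sigma^2), \qquad
\pi_i = \frac{1}{1 + e^{-\delta_i}}, \qquad
y_i \sim \mathrm{Binomial}(n_i,\ \pi_i) \\
\sigma^2 &\sim \text{Inverse-Gamma}(\text{shape} = 0.01,\ \text{scale} = 0.01), \qquad
p(\beta_j) \propto 1, \quad j = 0, \dots, 6
\end{aligned}
```

Writing $\delta_i = w_i + \eta_i$ with $\eta_i \sim N(0, \sigma^2)$ gives the equivalent form in which the random effect is centered at 0 rather than at the regression mean.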
Final Results
Exploratory Analysis Results:
When we explored the interaction between race and region and presented that in
tabular format, we found that "Black only", as a race, heavily approved of Obama
                                   Obama Approval
Race        Region     Count    % Approve  % Disapprove
White Only  Midwest     7,618   32.23      67.77
            Northeast   5,270   36.07      63.93
            South      11,225   23.37      76.63
            West        6,692   30.65      69.35
All                    37,483   33.43      66.57
From the above table, it appears that "Black only" respondents from the South
approved of Obama
                                Obama Approval
Race        Sex     Count    % Approve  % Disapprove
White Only  Female  19,955   29.30      70.70
            Male    10,850   29.35      70.65
All                 37,483   33.43      66.57
When we explored interactions between race and education in Table 4, we found
an almost similar gap between college and no college among respondents of all races
for approval of Obama
                                    Obama Approval
Race        Education   Count    % Approve  % Disapprove
White Only  College     23,034   31.28      68.72
            No college   7,771   23.48      76.52
All                     37,483   33.43      66.57
It therefore appears that there are some interactions between black and south region
and between white and no college education. We needed to verify the above findings,
and for that we drew a decision tree.
The screenshots below, in Diagram 1, show the results from the SAS decision
tree splits for the demographic variables. As expected, race is the major predictor in
Obama approval ratings compared to all other factors. Of the […] survey
participants, over […]% of blacks approved of Obama
Diagram 2. South as a factor
From Diagram 2 above, South as a region stands out among all regions. The
south had the lowest approval ([…]%) of Obama. After black as a race in Diagram 1,
we can see that white only stands out from other races in Diagram 2. White only has
the maximum disapproval ([…]%) of Obama.
Table 5. Important Factors
Variable   Importance
Race       1.00
Region     […]
Education  0.06
Sex        0.05
Table 5 above shows the Variable Importance metric, which is a relative metric
with a value of one for the most important variable. Less important variables have
metrics less than one. We see that the most important variable is race, then region,
education and sex, in descending order.
Factors Selection
From the above tables and diagrams, it appears that among race black
respondents stand out clearly for their maximum approval of Obama
race category, south among regions, and no college among education. Interaction of
black in south region, and white with no college needs to be studied. We decided to
keep a minimum number of factors and their interactions to keep our model simple
and easy to interpret.
Building Classification Model Results
Table 6. Output Parameters of Model
Parameters
Block  Parameter  Sampling Method  Initial Value  Prior Distribution
1      s2         Conjugate        1.0000         igamma(0.01, s=0.01)
2      beta0      N-Metropolis     0              general(0)
       beta1                       0              general(0)
       beta2                       0              general(0)
       beta3                       0              general(0)
       beta4                       0              general(0)
       beta5                       0              general(0)
       beta6                       0              general(0)
The "Parameters" table lists the sampling information, the name of the model
parameters, sampling algorithms used, initial values, and their prior distributions. The
conjugate sampling algorithm is used to draw the posterior samples of s2 and random
walk Metropolis for the regression parameters.
Table 7. Random Effects Parameters
Random Effect Parameters
Parameter  Sampling Method  Subject  Number of Subjects  Subject Values              Prior Distribution
Delta      N-Metropolis     groups   12                  1 2 3 4 5 6 7 8 9 10 11 12  normal(w, var=s2)
The Random Effect Parameters table above lists the name of the random
effect, the subject variable, and the number of distinct levels in the subject variable.
The total number of random-effects parameters in this model is 12.
Table 8. Posterior Summaries of Parameter Estimates
Parameter  Label  N  Mean  Standard Deviation  95%
The mean regression coefficient estimate for beta1 (i.e., Race = Black) is approximately
1.16, with a standard deviation of 0.1631. It is interpreted as follows: when Race = Black,
the odds of approving of Obama are exp(1.16) ≈ 3.2 times the odds when Race is
otherwise, provided all other predictors remain the same.
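The odds-ratio arithmetic above can be checked with a short Python sketch. The coefficient 1.16 is the rounded value quoted in the text; the baseline approval probability of 0.30 is a hypothetical value chosen only to show how a change in odds translates into a change in probability.

```python
import math


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit change in the predictor."""
    return math.exp(beta)


def shifted_probability(base_probability, beta):
    """Probability after multiplying the baseline odds by exp(beta)."""
    odds = base_probability / (1.0 - base_probability) * math.exp(beta)
    return odds / (1.0 + odds)


ratio = odds_ratio(1.16)
new_probability = shifted_probability(0.30, 1.16)  # 0.30 is a hypothetical baseline
print(round(ratio, 2), round(new_probability, 3))
```

An odds ratio of about 3.2 does not mean the probability triples: with the assumed baseline of 0.30, the approval probability rises to roughly 0.58.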
We can conclude that there is a high approval for Obama across all the black
population. Similarly, a significantly high value for beta5 (interaction between race
black and region south), with a mean regression coefficient estimate of 0.6631 and a
standard deviation of 0.2108, shows that blacks in south regions approved of Obama
that the beta1 parameter converged. The posterior density has a single mode. In
addition, the posterior density is bell-shaped, which supports applying inferences
based on the normal distribution. Similarly, diagnostics plots for the other parameters
provide evidence that these parameters have converged too.
Diagram 3. Diagnostics Plot for Beta1
Summary and Conclusions
Overall, we achieved our research goal and were able to generate a program in
SAS for Bayesian logistic regression. We were able to develop PROC MCMC with
Bayesian analysis and dynamic prior values. We were able to generate the coefficients
using PROC MCMC that can be assigned a weight for post-stratification. Our primary
means of accomplishing this involved identifying the right PROC in SAS and developing
the macros to implement Gelman
Appendix
Programming code
This code was used to generate the data lines used by PROC MCMC.
   data cstone.Ipsos_August;
      set cstone.Ipsos_August;
      /* dummy code the categorical factors as 0/1 indicators */
      if race = 'Black Only' then black=1; else black=0;
      if race = 'White Only' then white=1; else white=0;
      if education2 = 'No college' then nedu=1; else nedu=0;
      if region = 'South' then nregion=1; else nregion=0;
      if SEX = 'Male' then nse
table. There we have mean regression coefficient estimates as negative. The diagram for
beta2 represents white respondents. The diagrams for beta3 and beta4 represent education
and region.
Diagram 4. Diagnostics Plot for Beta2
Diagram 5. Diagnostics Plot for Beta3
Diagram 6. Diagnostics Plot for Beta4
Bibliography
Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Oxford.
Carlin, B. P. and Louis, T. A. 2009. Bayesian Methods for Data Analysis. Third Edition.
CRC/Chapman and Hall.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B. 2013. Bayesian Data Analysis.
Third Edition. CRC/Chapman and Hall.
Gelman, A. and Hill, J. 2007. Data Analysis Using Regression and
Multilevel/Hierarchical Models. Cambridge.
Ghitza, Y. and Gelman, A. 2013. "Deep Interactions with MRP: Election Turnout and
Voting Patterns Among Small Electoral Subgroups." American Journal of Political
Science.
Hastie, T., Tibshirani, R., Friedman, J. 2009. The Elements of Statistical Learning: Data
Mining, Inference, and Prediction. Springer.
Jackman, S. 2005. "Pooling the Polls Over an Election Campaign." Australian Journal
of Political Science 40: 499-517.
Linzer, D. A. 2013. "Dynamic Bayesian Forecasting of Presidential Elections in the
States." Journal of the American Statistical Association 108: 124-134.
Ntzoufras, I. 2009. Bayesian Modeling Using WinBUGS. Wiley.
Park, D. K., Gelman, A., and Bafumi, J. 2004. "Bayesian Multilevel Estimation with
Poststratification: State Level Estimates from National Polls." Political Analysis 12:
375-385.
Wang, W., Rothschild, D., Goel, S., and Gelman, A. 2014. "Forecasting Elections with
Non-Representative Polls." International Journal of Forecasting. Forthcoming.