eliciting truthful answers to sensitive survey …eliciting truthful answers to sensitive survey...
TRANSCRIPT
Eliciting Truthful Answers to Sensitive SurveyQuestions: New Statistical Methods for
List and Endorsement Experiments
Kosuke Imai Will Bullock
Joint work with Graeme Blair and Jacob Shapiro
Princeton University
July 23, 2010
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 1 / 35
Project Reference
PAPERS:Imai, Kosuke. “Statistical Inference for the Item Count Technique.”Bullock, Will, Kosuke Imai, and Jacob Shapiro. “MeasuringPolitical Support and Issue Ownership Using EndorsementExperiments, with Application to the Militant Groups in Pakistan.”
CODE AND SOFTWARE:Code for endorsement experiments available at the DataverseBlair, Graeme, and Kosuke Imai. list: MultivariateStatistical Analysis for the Item CountTechnique. an R package
PROJECT WEBSITE:http://imai.princeton.edu/projects/sensitive.html
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 2 / 35
Motivation
Survey is used widely in social sciencesValidity of survey depends on the accuracy of self-reportsSensitive questions =⇒ social desirability, privacy concernse.g., racial prejudice, corruptionsLies and nonresponses
How can we elicit truthful answers to sensitive questions?Survey methodology: protect privacy through indirect questioningStatistical methodology: efficiently recover underlying responses
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 3 / 35
Survey Techniques for Sensitive Questions
Randomized Response Technique (Warner, 1965)Most extensively studied and commonly usedUse randomization to protect privacyDifficulties: logistics, lack of understanding among respondents
Item Count Technique (Miller, 1984)Also known as list experiment and unmatched count techniqueUse aggregation to protect privacyDevelop new estimators to enable multivariate regression analysisApplication: racial prejudice in the US
Endorsement ExperimentsUse randomized endorsements to measure support levelsDevelop a measurement model based on item response theoryApplication: Pakistanis’ support for Islamic militant groups
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 4 / 35
Item Count Technique: Example
The 1991 National Race and Politics SurveyRandomize the sample into the treatment and control groupsThe script for the control group
Now I’m going to read you three things that sometimesmake people angry or upset. After I read all three,just tell me HOW MANY of them upset you. (I don’twant to know which ones, just how many.)
(1) the federal government increasing the tax ongasoline;(2) professional athletes getting million-dollar-plussalaries;(3) large corporations polluting the environment.
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 5 / 35
Item Count Technique: Example
The 1991 National Race and Politics SurveyRandomize the sample into the treatment and control groupsThe script for the treatment group
Now I’m going to read you four things that sometimesmake people angry or upset. After I read all four,just tell me HOW MANY of them upset you. (I don’twant to know which ones, just how many.)
(1) the federal government increasing the tax ongasoline;(2) professional athletes getting million-dollar-plussalaries;(3) large corporations polluting the environment;(4) a black family moving next door to you.
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 6 / 35
Design Considerations and Standard Analysis
Privacy is protected unless respondents’ truthful answers are yesfor all sensitive and non-sensitive items (leads to underestimation)A larger number of non-sensitive items results in a higher varianceLess efficient than direct questioningNegative correlation across non-sensitive items is desirable
Standard difference-in-means estimator:
τ̂ = treatment group mean − control group mean
Unbiased for the population proportionStratification is possible on discrete covariates
a large sample size is requirednot desirable for continuous covariates
No existing method allows for multivariate regression analysis
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 7 / 35
Two-Step Nonlinear Least Squares (NLS) Estimator
Generalize the difference-in-means estimator to a multivariateregression estimator
The Model:Yi = f (Xi , γ) + Tig(Xi , δ) + εi
Yi : response variableTi : treatment variableXi : covariatesf (x , γ): model for non-sensitive items, e.g., J × logit−1(x>γ)g(x , δ): model for sensitive item, e.g., logit−1(x>δ)
Two-step estimation procedure:1 Fit the f (x , γ) model to the control group via NLS and obtain γ̂2 Fit the g(x , δ) model to the treatment group via NLS after
subtracting f (Xi , γ̂) from Yi and obtain δ̂
Standard errors via the method of momentsWhen no covariate, it reduces to the difference-in-means estimator
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 8 / 35
Extracting More Information from the Data
Define a “type” of each respondent by (Yi(0),Zi,J+1)
Yi (0): total number of yes for non-sensitive items ∈ {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item ∈ {0,1}
A total of (2× J) typesExample: two non-sensitive items (J = 2)
Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 (0,1) (1,0) (1,1) (1,0)0 (0,0) (0,1) (0,0)
Joint distribution is identified
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 9 / 35
Extracting More Information from the Data
Define a “type” of each respondent by (Yi(0),Zi,J+1)
Yi (0): total number of yes for non-sensitive items {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item {0,1}
A total of (2× J) typesExample: two non-sensitive items (J = 2)
Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 �
��(0,1) ���(1,0) (1,1) �
��(1,0)0 ���(0,0) ���(0,1) ���(0,0)
Joint distribution is identified:
Pr(type = (y ,1)) = Pr(Yi ≤ y | Ti = 0)− Pr(Yi ≤ y | Ti = 1)
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 10 / 35
Extracting More Information from the Data
Define a “type” of each respondent by (Yi(0),Zi,J+1)Yi (0): total number of yes for non-sensitive items {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item {0,1}
A total of (2× J) typesExample: two non-sensitive items (J = 2)
Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 ���(0,1) (1,0) (1,1) (1,0)0 �
��(0,0) ���(0,1) �
��(0,0)
Joint distribution is identified:
Pr(type = (y ,1)) = Pr(Yi ≤ y | Ti = 0)− Pr(Yi ≤ y | Ti = 1)
Pr(type = (y ,0)) = Pr(Yi ≤ y | Ti = 1)− Pr(Yi < y | Ti = 0)
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 11 / 35
The Likelihood Function
g(x , δ): model for sensitive item, e.g., logistic regressionhz(y ; x , ψz) = Pr(Yi(0) = y | Xi = x ,Zi,J+1 = z) :model for non-sensitive item given the response to sensitive item,e.g., binomial or beta-binomial regressionLikelihood:∏
i∈J (1,0)
(1− g(Xi , δ))h0(0; Xi , ψ0)∏
i∈J (1,J+1)
g(Xi , δ)h1(J; Xi , ψ1)
×J∏
y=1
∏i∈J (1,y)
{g(Xi , δ)h1(y − 1; Xi , ψ1) + (1− g(Xi , δ))h0(y ; Xi , ψ0)}
×J∏
y=0
∏i∈J (0,y)
{g(Xi , δ)h1(y ; Xi , ψ1) + (1− g(Xi , δ))h0(y ; Xi , ψ0)}
where J (t , y) represents a set of respondents with (Ti ,Yi) = (t , y)
It would be a nightmare to maximize this!Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 12 / 35
Missing Data Framework and the EM Algorithm
Consider Zi,J+1 as missing dataFor some respondents, Zi,J+1 is knownThe complete-data likelihood has a much simpler form:
N∏i=1
{g(Xi , δ)h1(Yi − 1; Xi , ψ1)Ti h1(Yi ; Xi , ψ1)1−Ti
}Zi,J+1
× {(1− g(Xi , δ))h0(Yi ; Xi , ψ0)}1−Zi,J+1
The EM algorithm: only separate optimization of g(x , δ) andhz(y ; x , ψz) is required
weighted logistic regressionweighted binomial or beta-binomial regression
Asymptotically unbiased and most efficient
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 13 / 35
Empirical Application: Racial Prejudice in the US
Kukulinski et al. (1997) analyzes the 1991 National Race andPolitics survey with the standard difference-in-means estimatorFinding: Southern whites are more prejudiced against blacks thannon-southern whites – no evidence for the “New South”
The limitation of the original analysis:“So far our discussion has implicitly assumed that the higher levelof prejudice among white southerners results from somethinguniquely “southern,” what many would call southern culture. Thisassumption could be wrong. If white southerners were older, lesseducated, and the like – characteristics normally associated withgreater prejudice – then demographics would explain the regionaldifference in racial attitudes, leaving culture as little more than asmall and relatively insignificant residual.”
Need for a multivariate regression analysis
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 14 / 35
Results of the Multivariate Analysis
Logistic regression model for sensitive itemBinomial regression model for non-sensitive item (not shown)Little over-dispersionLikelihood ratio test supports the constrained model
Nonlinear Least Maximum LikelihoodSquares Constrained Unconstrained
Variables est. s.e. est. s.e. est. s.e.Intercept −7.084 3.669 −5.508 1.021 −6.226 1.045South 2.490 1.268 1.675 0.559 1.379 0.820Age 0.026 0.031 0.064 0.016 0.065 0.021Male 3.096 2.828 0.846 0.494 1.366 0.612College 0.612 1.029 −0.315 0.474 −0.182 0.569
The original conclusion is supportedStandard errors are much smaller for ML estimator
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 15 / 35
Estimated Proportion of Prejudiced Whites
● ●
●
Est
imat
ed P
ropo
rtio
n
−0.1
0.0
0.1
0.2
0.3
0.4
0.5
Difference in Means Nonlinear Least Squares Maximum Likelihood
SouthSouth
South
Non−South Non−SouthNon−South
Regression adjustments and MLE yield more efficient estimates
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 16 / 35
Simulation Evidence
●
●●
1000 2000 3000 4000 5000
−0.
10.
00.
10.
20.
3
Sample Size
●
●
●
NLS
ML
●
●●
1000 2000 3000 4000 5000
−0.
10.
00.
10.
20.
3
Sample Size
●
●
●
●
● ●
1000 2000 3000 4000 5000
−0.
10.
00.
10.
20.
3
Sample Size
●
●
●
●● ●
1000 2000 3000 4000 5000
−0.
10.
00.
10.
20.
3
Sample Size
●
● ●
●
●
●
1000 2000 3000 4000 5000
0.0
0.5
1.0
1.5
2.0
Sample Size
●
●
●
NLS
ML
●●
●
1000 2000 3000 4000 5000
0.0
0.5
1.0
1.5
2.0
Sample Size
●
●
●
●
●
●
1000 2000 3000 4000 5000
0.0
0.5
1.0
1.5
2.0
Sample Size
●
●
●
●
●
●
1000 2000 3000 4000 5000
0.0
0.5
1.0
1.5
2.0
Sample Size
●
●
●
● ●●
1000 2000 3000 4000 5000
0.80
0.85
0.90
0.95
1.00
Sample Size
●
● ●
NLS
ML
● ● ●
1000 2000 3000 4000 5000
0.80
0.85
0.90
0.95
1.00
Sample Size
● ● ●
●● ●
1000 2000 3000 4000 5000
0.80
0.85
0.90
0.95
1.00
Sample Size
●
●
●
●
●●
1000 2000 3000 4000 5000
0.80
0.85
0.90
0.95
1.00
Sample Size
●
●●
Bias Root Mean Squared Error Coverage of 90% Confidence Intervals
Sou
thA
geM
ale
Col
lege
Sample Size Sample Size Sample Size
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 17 / 35
Endorsement Experiments
Measuring support for political actors (e.g., candidates, parties)when studying sensitive questionsAsk respondents to rate their support for a set of policiesendorsed by randomly assigned political actors
Experimental design:
1 Select policy questions
2 Randomly divide sample into control and treatment groups
3 Across respondents and questions, randomly assign political actorsfor endorsement (no endorsement for the control group)
4 Compare support level for each policy endorsed by different actors
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 18 / 35
The Pakistani Survey Experiment
6,000 person urban-rural sample
Four very different groups:Pakistani militants fighting in Kashmir (a.k.a. Kashmiri tanzeem)Militants fighting in Afghanistan (a.k.a. Afghan Taliban)Al-Qa’idaFirqavarana Tanzeems (a.k.a. sectarian militias)
Four policies:WHO plan to provide universal polio vaccination across PakistanCurriculum reform for religious schoolsReform of FCR to make Tribal areas equal to rest of the countryPeace jirgas to resolve disputes over Afghan border (Durand Line)
Response rate over 90%
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 19 / 35
Endorsement Experiment Questions: Example
The script for the control groupThe World Health Organization recently announceda plan to introduce universal Polio vaccinationacross Pakistan. How much do you support such aplan?
The script for the treatment groupThe World Health Organization recently announceda plan to introduce universal Polio vaccinationacross Pakistan, a policy that has receivedsupport from Al-Qa’ida. How much do you supportsuch a plan?
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 20 / 35
Distribution of ResponsesPolio Vaccinations Curriculum Reform FCR Reforms Durand Line
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Not At All
A LittleA Moderate Amount
A LotA Great Deal
Control Group
Pakistani militant groups in Kashmir
Afghan TalibanAl−Qaida
Firqavarana Tanzeems
Control Group
Pakistani militant groups in Kashmir
Afghan TalibanAl−Qaida
Firqavarana Tanzeems
Control Group
Pakistani militant groups in Kashmir
Afghan TalibanAl−Qaida
Firqavarana Tanzeems
Control Group
Pakistani militant groups in Kashmir
Afghan TalibanAl−Qaida
Firqavarana Tanzeems
Pun
jab
Sin
dhN
WF
PB
aloc
hist
an
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 21 / 35
Endorsement Experiments Framework
Data from an endorsement experiment:N respondentsJ policy questionsK political actorsYij ∈ {0,1}: response of respondent i to policy question jTij ∈ {0,1, . . . ,K}: political actor randomly assigned to endorsepolicy j for respondent iYij (t): potential response given the endorsement by actor tCovariates measured prior to the treatment
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 22 / 35
The Proposed Model
Quadratic random utility model:
Ui(ζj1, k) = −‖(xi + sijk )− ζj1‖2 + ηij ,
Ui(ζj0, k) = −‖(xi + sijk )− ζj0‖2 + νij ,
where xi is the ideal point and sijk is the support levelThe statistical model (item response theory):
Pr(Yij = 1 | Tij = k) = Pr(Yij(k) = 1) = Pr(Ui(ζj1, k) > Ui(ζj0, k))
= Pr(αj + βj(xi + sijk ) > εij)
Hierarchical modeling:
xiindep.∼ N (Z>i δ, σ
2x )
sijkindep.∼ N (Z>i λjk , ω
2jk )
λjki.i.d.∼ N (θk ,Φk )
“Noninformative” hyper prior on (αj , βj , δ, θk , ω2jk ,Φk )
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 23 / 35
Quantities of Interest and Model Fitting
Average support level for each militant group k
τjk (Zi) = Z>i λjk for each policy j
κk (Zi) = Z>i θk averaging over all policies
Standardize them by dividing the (posterior) standard deviation ofideal points
Bayesian Markov chain Monte Carlo algorithmMultiple chains to monitor convergenceImplementation via JAGS (Plummer)
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 24 / 35
Model for the Division Level Support
Ordered response with an intercept αjl varying across divisionsThe model specification:
xiindep.∼ N (δdivision[i],1)
sijkindep.∼ N (λk ,division[i], ω
2k )
δdivision[i]indep.∼ N (µprovince[i], σ
2province[i])
λk ,division[i]indep.∼ N (θk ,province[i],Φk ,province[i])
Averaging over policiesPartial pooling across divisions within each province
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 25 / 35
Estimated Division Level Support
−1.
0−
0.5
0.0
0.5
1.0
Sta
ndar
dize
d Le
vel o
f Sup
port
Pakistani militant groups in KashmirMilitants fighting in AfghanistanAl−QaidaFirqavarana Tanzeems
Bah
awal
pur
n=
118
Der
a G
hazi
Kha
n n
=0
Fais
alab
ad
n=31
3
Guj
ranw
ala
n=
403
Laho
re
n=57
9
Mul
tan
n=
495
Raw
alpi
ndi
n=
208
Sar
godh
a n
=13
1
Hyd
erab
ad
n=20
3
Kar
achi
n=
473
Lark
ana
n=
311
Mir
purk
has
n=
0
Suk
kur
n=
293
Ban
nu
n=0
Der
a Is
mai
l Kha
n n
=84
Haz
ara
n=
287
Koh
at
n=50
Mal
akan
d n
=0
Mar
dan
n=
215
Pes
haw
ar
n=28
8
Kal
at
n=10
3
Mak
ran
n=
0
Nas
iraba
d n
=21
0
Que
tta
n=32
0
Sib
i n
=67
Zho
b n
=61
Punjab Sindh NWFP Balochistan
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 26 / 35
Model with Individual Covariates
Ordered response with an intercept αjl varying across divisionsThe model specification:
xiindep.∼ N (δdivision[i] + Z>i δ
Z ,1)
sijkindep.∼ N (λk ,division[i] + Z>i λ
Zk , ω
2k )
δdivision[i]indep.∼ N (µprovince[i], σ
2province[i])
λk ,division[i]indep.∼ N (θk ,province[i],Φk ,province[i])
Expands upon the division level model to include individual levelcovariates:
gender, urban/rural, education, incomeIndividual level covariate effects after accounting for differencesacross divisionsPoststratification on these covariates using the census
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 27 / 35
Estimated Effects of Individual Covariates
−0.
2−
0.1
0.0
0.1
0.2
Pakistani militant groups in Kashmir
Sta
ndar
dize
d Le
vel o
f Sup
port
●
●
●
●
●
●●
● ● ●
Female Rural Income Education −0.
2−
0.1
0.0
0.1
0.2
Militants fighting in Afghanistan
Sta
ndar
dize
d Le
vel o
f Sup
port
●
●
●
● ●●
●
● ●
●
Female Rural Income Education
−0.
2−
0.1
0.0
0.1
0.2
Al−Qaida
Sta
ndar
dize
d Le
vel o
f Sup
port
●
●
●
●
●
● ●●
●
●
Female Rural Income Education −0.
2−
0.1
0.0
0.1
0.2
Firqavarana Tanzeems
Sta
ndar
dize
d Le
vel o
f Sup
port
●
●
●
●
● ●● ●
●
●
Female Rural Income Education
Demographics play a small role in explaining support for groupsImai and Bullock (Princeton) Sensitive Survey Questions PolMeth 28 / 35
Regional Clustering of the Support for Al-Qaida
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 29 / 35
Correlation between Support and Violence
●
●
●
●
● ●
●
●
●
●
●
●●●
●
●
● ●
●
●
●
●●
●
●●
−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2
0
100
200
300
400
Pakistani militant groups in Kashmir
Division−Level Estimated Support
Num
ber
of In
cide
nts
●
●
●
●
● ●
●
●
●
●
●
●●●
●
●
● ●
●
●
●
●●
●
●●
−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2
0
100
200
300
400
Militants fighting in Afghanistan
Division−Level Estimated Support
Num
ber
of In
cide
nts
●
●
●
●
● ●
●
●
●
●
●
●●●
●
●
● ●
●
●
●
●●
●
●●
−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2
0
100
200
300
400
Al−Qaida
Division−Level Estimated Support
Num
ber
of In
cide
nts
●
●
●
●
● ●
●
●
●
●
●
●●●
●
●
● ●
●
●
●
●●
●
●●
−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2
0
100
200
300
400
Firqavarana Tanzeems
Division−Level Estimated Support
Num
ber
of In
cide
nts
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 30 / 35
Simulation Studies
1 Based on the Pakistani DataSame 2 models plus province-level issue ownership modelTop-level parameters held constant across simulationsSample sizes and distribution same as beforeIdeal points, endorsements and responses follow IRT models
2 Varying sample sizesModel for division-level estimates with no covariatesModel for province-level estimates with no covariates but supportvarying across policiesN = 1000,1500,2000Again, top-level parameters held constant across simulations whileideal points, endorsements and responses follow IRT models
100 simulations under each scenario (3 chains, 60000 iterations)Frequentist evaluation of Bayesian estimators
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 31 / 35
Monte Carlo Evidence based on the Pakistani Data
Den
sity
−0.10 −0.05 0.00 0.05 0.10
0
10
20
30
40
Den
sity
0.80 0.85 0.90 0.95 1.00
0
5
10
15
20
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Effect Size
Pro
port
ion
Sta
tistic
ally
Sig
nific
ant
●
●●
●
●
●●
●
●
●●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
Den
sity
−0.10 −0.05 0.00 0.05 0.10
0
10
20
30
40
Den
sity
0.80 0.85 0.90 0.95 1.00
0
5
10
15
20
●●●●●
●●
●●●
●●
●
●
●
●●
●●●●
●
●
●
●
●
●
●
●
●
●●
●●
●●●●
●
●
●
●
●
●
●
●●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●
●
●●●
●●
●
● ●
●
●
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Effect Size
Pro
port
ion
Sta
tistic
ally
Sig
nific
ant
The
Div
isio
n M
odel
W
ith In
divi
dual
Cov
aria
tes
The
Div
isio
n M
odel
Bias Coverage Rate of the 90% Confidence Intervals
Statistical Power
α level = 0.10
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 32 / 35
Model for the Province Level Issue Ownership
The Model specification:
xiindep.∼ N (δprovince[i],1)
sijkindep.∼ N (λjk ,province[i], ω
2jk )
λjk ,province[i]indep.∼ N (θk ,province[i],Φk ,province[i])
Pooling across divisions within each provincePartial pooling across policies
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 33 / 35
Monte Carlo Evidence with Varying Sample Size
●
●●
●
N=1000 N=1500 N=2000
−0.2
−0.1
0.0
0.1
0.2
Bias
Bia
s
●
●
N=1000 N=1500 N=2000
0.75
0.80
0.85
0.90
0.95
1.00
1.05
Coverage Rate of the 90% Confidence Intervals
Cov
erag
e R
ate
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Statistical Power
Effect Size
Pro
port
ion
Sta
tistic
ally
Sig
nific
ant
●●●●●●●●●●●●●●●●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●●
●
●●●●●●
●●●
●●●●●●●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
N=2000N=1500N=1000
●
●
N=1000 N=1500 N=2000
−0.2
−0.1
0.0
0.1
0.2
Bia
s
●
●
●
●
●
●
N=1000 N=1500 N=2000
0.75
0.80
0.85
0.90
0.95
1.00
1.05
Cov
erag
e R
ate
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Effect Size
Pro
port
ion
Sta
tistic
ally
Sig
nific
ant
●● ●
●●●●
●●
●
●●
●●
●
●
● ● ●
●● ●
●●
● ●● ●
● ● ●
●●
●● ●●
●●●
●●
●
●●● ●
●
●
●●
●●
●● ● ● ● ● ● ● ●
●
●
N=2000N=1500N=1000
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
N=1000 N=1500 N=2000
−0.2
−0.1
0.0
0.1
0.2
Bia
s
●
● ●
N=1000 N=1500 N=2000
0.75
0.80
0.85
0.90
0.95
1.00
1.05
Cov
erag
e R
ate
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Effect Size
Pro
port
ion
Sta
tistic
ally
Sig
nific
ant
●●●●●
●●
●●●●●●●●
●
●●
●
●
●
●●
●
●
●●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●●
●●●●●●
●
●●●●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●
●
●●
●●●
●
●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●
●●
●
●
●
●
●
●●
●
●
N=2000N=1500N=1000
The
Div
isio
n M
odel
The
Issu
e P
rovi
nce
Mod
elT
he D
ivis
ion
Mod
el
With
Indi
vidu
al C
ovar
iate
s
α level = 0.10
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 34 / 35
Concluding Remarks
Viable alternatives to the randomized response technique
Item Count TechniqueEasy for researchers to implementEasy for respondents to understandWidely applicableNeed to carefully choose non-sensitive itemsAggregation =⇒ loss of efficiency
Endorsement ExperimentsMost indirect questioningApplicability limited to measuring supportNeed to carefully choose policy questionsMany groups =⇒ loss of efficiency
New statistical methods for efficient inferenceOpen-source software and code available
Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 35 / 35