eliciting truthful answers to sensitive survey …eliciting truthful answers to sensitive survey...

18
Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai Will Bullock Joint work with Graeme Blair and Jacob Shapiro Princeton University July 23, 2010 Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 1 / 35 Project Reference P APERS : Imai, Kosuke. “Statistical Inference for the Item Count Technique.” Bullock, Will, Kosuke Imai, and Jacob Shapiro. “Measuring Political Support and Issue Ownership Using Endorsement Experiments, with Application to the Militant Groups in Pakistan.” C ODE AND S OFTWARE : Code for endorsement experiments available at the Dataverse Blair, Graeme, and Kosuke Imai. list: Multivariate Statistical Analysis for the Item Count Technique. an R package P ROJECT WEBSITE : http://imai.princeton.edu/projects/sensitive.html Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 2 / 35

Upload: others

Post on 24-Jun-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Eliciting Truthful Answers to Sensitive SurveyQuestions: New Statistical Methods for

List and Endorsement Experiments

Kosuke Imai Will Bullock

Joint work with Graeme Blair and Jacob Shapiro

Princeton University

July 23, 2010

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 1 / 35

Project Reference

PAPERS:Imai, Kosuke. “Statistical Inference for the Item Count Technique.”Bullock, Will, Kosuke Imai, and Jacob Shapiro. “MeasuringPolitical Support and Issue Ownership Using EndorsementExperiments, with Application to the Militant Groups in Pakistan.”

CODE AND SOFTWARE:Code for endorsement experiments available at the DataverseBlair, Graeme, and Kosuke Imai. list: MultivariateStatistical Analysis for the Item CountTechnique. an R package

PROJECT WEBSITE:http://imai.princeton.edu/projects/sensitive.html

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 2 / 35

Page 2: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Motivation

Survey is used widely in social sciencesValidity of survey depends on the accuracy of self-reportsSensitive questions =⇒ social desirability, privacy concernse.g., racial prejudice, corruptionsLies and nonresponses

How can we elicit truthful answers to sensitive questions?Survey methodology: protect privacy through indirect questioningStatistical methodology: efficiently recover underlying responses

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 3 / 35

Survey Techniques for Sensitive Questions

Randomized Response Technique (Warner, 1965)Most extensively studied and commonly usedUse randomization to protect privacyDifficulties: logistics, lack of understanding among respondents

Item Count Technique (Miller, 1984)Also known as list experiment and unmatched count techniqueUse aggregation to protect privacyDevelop new estimators to enable multivariate regression analysisApplication: racial prejudice in the US

Endorsement ExperimentsUse randomized endorsements to measure support levelsDevelop a measurement model based on item response theoryApplication: Pakistanis’ support for Islamic militant groups

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 4 / 35

Page 3: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Item Count Technique: Example

The 1991 National Race and Politics SurveyRandomize the sample into the treatment and control groupsThe script for the control group

Now I’m going to read you three things that sometimesmake people angry or upset. After I read all three,just tell me HOW MANY of them upset you. (I don’twant to know which ones, just how many.)

(1) the federal government increasing the tax ongasoline;(2) professional athletes getting million-dollar-plussalaries;(3) large corporations polluting the environment.

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 5 / 35

Item Count Technique: Example

The 1991 National Race and Politics SurveyRandomize the sample into the treatment and control groupsThe script for the treatment group

Now I’m going to read you four things that sometimesmake people angry or upset. After I read all four,just tell me HOW MANY of them upset you. (I don’twant to know which ones, just how many.)

(1) the federal government increasing the tax ongasoline;(2) professional athletes getting million-dollar-plussalaries;(3) large corporations polluting the environment;(4) a black family moving next door to you.

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 6 / 35

Page 4: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Design Considerations and Standard Analysis

Privacy is protected unless respondents’ truthful answers are yesfor all sensitive and non-sensitive items (leads to underestimation)A larger number of non-sensitive items results in a higher varianceLess efficient than direct questioningNegative correlation across non-sensitive items is desirable

Standard difference-in-means estimator:

τ̂ = treatment group mean − control group mean

Unbiased for the population proportionStratification is possible on discrete covariates

a large sample size is requirednot desirable for continuous covariates

No existing method allows for multivariate regression analysis

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 7 / 35

Two-Step Nonlinear Least Squares (NLS) Estimator

Generalize the difference-in-means estimator to a multivariateregression estimator

The Model:Yi = f (Xi , γ) + Tig(Xi , δ) + εi

Yi : response variableTi : treatment variableXi : covariatesf (x , γ): model for non-sensitive items, e.g., J × logit−1(x>γ)g(x , δ): model for sensitive item, e.g., logit−1(x>δ)

Two-step estimation procedure:1 Fit the f (x , γ) model to the control group via NLS and obtain γ̂2 Fit the g(x , δ) model to the treatment group via NLS after

subtracting f (Xi , γ̂) from Yi and obtain δ̂

Standard errors via the method of momentsWhen no covariate, it reduces to the difference-in-means estimator

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 8 / 35

Page 5: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Extracting More Information from the Data

Define a “type” of each respondent by (Yi(0),Zi,J+1)

Yi (0): total number of yes for non-sensitive items ∈ {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item ∈ {0,1}

A total of (2× J) typesExample: two non-sensitive items (J = 2)

Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 (0,1) (1,0) (1,1) (1,0)0 (0,0) (0,1) (0,0)

Joint distribution is identified

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 9 / 35

Extracting More Information from the Data

Define a “type” of each respondent by (Yi(0),Zi,J+1)

Yi (0): total number of yes for non-sensitive items {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item {0,1}

A total of (2× J) typesExample: two non-sensitive items (J = 2)

Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 �

��(0,1) ���(1,0) (1,1) �

��(1,0)0 ���(0,0) ���(0,1) ���(0,0)

Joint distribution is identified:

Pr(type = (y ,1)) = Pr(Yi ≤ y | Ti = 0)− Pr(Yi ≤ y | Ti = 1)

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 10 / 35

Page 6: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Extracting More Information from the Data

Define a “type” of each respondent by (Yi(0),Zi,J+1)Yi (0): total number of yes for non-sensitive items {0,1, . . . , J}Zi,J+1: truthful answer to the sensitive item {0,1}

A total of (2× J) typesExample: two non-sensitive items (J = 2)

Yi Treatment group Control group3 (2,1)2 (1,1) (2,0) (2,1) (2,0)1 ���(0,1) (1,0) (1,1) (1,0)0 �

��(0,0) ���(0,1) �

��(0,0)

Joint distribution is identified:

Pr(type = (y ,1)) = Pr(Yi ≤ y | Ti = 0)− Pr(Yi ≤ y | Ti = 1)

Pr(type = (y ,0)) = Pr(Yi ≤ y | Ti = 1)− Pr(Yi < y | Ti = 0)

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 11 / 35

The Likelihood Function

g(x , δ): model for sensitive item, e.g., logistic regressionhz(y ; x , ψz) = Pr(Yi(0) = y | Xi = x ,Zi,J+1 = z) :model for non-sensitive item given the response to sensitive item,e.g., binomial or beta-binomial regressionLikelihood:∏

i∈J (1,0)

(1− g(Xi , δ))h0(0; Xi , ψ0)∏

i∈J (1,J+1)

g(Xi , δ)h1(J; Xi , ψ1)

×J∏

y=1

∏i∈J (1,y)

{g(Xi , δ)h1(y − 1; Xi , ψ1) + (1− g(Xi , δ))h0(y ; Xi , ψ0)}

×J∏

y=0

∏i∈J (0,y)

{g(Xi , δ)h1(y ; Xi , ψ1) + (1− g(Xi , δ))h0(y ; Xi , ψ0)}

where J (t , y) represents a set of respondents with (Ti ,Yi) = (t , y)

It would be a nightmare to maximize this!Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 12 / 35

Page 7: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Missing Data Framework and the EM Algorithm

Consider Zi,J+1 as missing dataFor some respondents, Zi,J+1 is knownThe complete-data likelihood has a much simpler form:

N∏i=1

{g(Xi , δ)h1(Yi − 1; Xi , ψ1)Ti h1(Yi ; Xi , ψ1)1−Ti

}Zi,J+1

× {(1− g(Xi , δ))h0(Yi ; Xi , ψ0)}1−Zi,J+1

The EM algorithm: only separate optimization of g(x , δ) andhz(y ; x , ψz) is required

weighted logistic regressionweighted binomial or beta-binomial regression

Asymptotically unbiased and most efficient

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 13 / 35

Empirical Application: Racial Prejudice in the US

Kukulinski et al. (1997) analyzes the 1991 National Race andPolitics survey with the standard difference-in-means estimatorFinding: Southern whites are more prejudiced against blacks thannon-southern whites – no evidence for the “New South”

The limitation of the original analysis:“So far our discussion has implicitly assumed that the higher levelof prejudice among white southerners results from somethinguniquely “southern,” what many would call southern culture. Thisassumption could be wrong. If white southerners were older, lesseducated, and the like – characteristics normally associated withgreater prejudice – then demographics would explain the regionaldifference in racial attitudes, leaving culture as little more than asmall and relatively insignificant residual.”

Need for a multivariate regression analysis

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 14 / 35

Page 8: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Results of the Multivariate Analysis

Logistic regression model for sensitive itemBinomial regression model for non-sensitive item (not shown)Little over-dispersionLikelihood ratio test supports the constrained model

Nonlinear Least Maximum LikelihoodSquares Constrained Unconstrained

Variables est. s.e. est. s.e. est. s.e.Intercept −7.084 3.669 −5.508 1.021 −6.226 1.045South 2.490 1.268 1.675 0.559 1.379 0.820Age 0.026 0.031 0.064 0.016 0.065 0.021Male 3.096 2.828 0.846 0.494 1.366 0.612College 0.612 1.029 −0.315 0.474 −0.182 0.569

The original conclusion is supportedStandard errors are much smaller for ML estimator

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 15 / 35

Estimated Proportion of Prejudiced Whites

● ●

Est

imat

ed P

ropo

rtio

n

−0.1

0.0

0.1

0.2

0.3

0.4

0.5

Difference in Means Nonlinear Least Squares Maximum Likelihood

SouthSouth

South

Non−South Non−SouthNon−South

Regression adjustments and MLE yield more efficient estimates

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 16 / 35

Page 9: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Simulation Evidence

●●

1000 2000 3000 4000 5000

−0.

10.

00.

10.

20.

3

Sample Size

NLS

ML

●●

1000 2000 3000 4000 5000

−0.

10.

00.

10.

20.

3

Sample Size

● ●

1000 2000 3000 4000 5000

−0.

10.

00.

10.

20.

3

Sample Size

●● ●

1000 2000 3000 4000 5000

−0.

10.

00.

10.

20.

3

Sample Size

● ●

1000 2000 3000 4000 5000

0.0

0.5

1.0

1.5

2.0

Sample Size

NLS

ML

●●

1000 2000 3000 4000 5000

0.0

0.5

1.0

1.5

2.0

Sample Size

1000 2000 3000 4000 5000

0.0

0.5

1.0

1.5

2.0

Sample Size

1000 2000 3000 4000 5000

0.0

0.5

1.0

1.5

2.0

Sample Size

● ●●

1000 2000 3000 4000 5000

0.80

0.85

0.90

0.95

1.00

Sample Size

● ●

NLS

ML

● ● ●

1000 2000 3000 4000 5000

0.80

0.85

0.90

0.95

1.00

Sample Size

● ● ●

●● ●

1000 2000 3000 4000 5000

0.80

0.85

0.90

0.95

1.00

Sample Size

●●

1000 2000 3000 4000 5000

0.80

0.85

0.90

0.95

1.00

Sample Size

●●

Bias Root Mean Squared Error Coverage of 90% Confidence Intervals

Sou

thA

geM

ale

Col

lege

Sample Size Sample Size Sample Size

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 17 / 35

Endorsement Experiments

Measuring support for political actors (e.g., candidates, parties)when studying sensitive questionsAsk respondents to rate their support for a set of policiesendorsed by randomly assigned political actors

Experimental design:

1 Select policy questions

2 Randomly divide sample into control and treatment groups

3 Across respondents and questions, randomly assign political actorsfor endorsement (no endorsement for the control group)

4 Compare support level for each policy endorsed by different actors

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 18 / 35

Page 10: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

The Pakistani Survey Experiment

6,000 person urban-rural sample

Four very different groups:Pakistani militants fighting in Kashmir (a.k.a. Kashmiri tanzeem)Militants fighting in Afghanistan (a.k.a. Afghan Taliban)Al-Qa’idaFirqavarana Tanzeems (a.k.a. sectarian militias)

Four policies:WHO plan to provide universal polio vaccination across PakistanCurriculum reform for religious schoolsReform of FCR to make Tribal areas equal to rest of the countryPeace jirgas to resolve disputes over Afghan border (Durand Line)

Response rate over 90%

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 19 / 35

Endorsement Experiment Questions: Example

The script for the control groupThe World Health Organization recently announceda plan to introduce universal Polio vaccinationacross Pakistan. How much do you support such aplan?

The script for the treatment groupThe World Health Organization recently announceda plan to introduce universal Polio vaccinationacross Pakistan, a policy that has receivedsupport from Al-Qa’ida. How much do you supportsuch a plan?

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 20 / 35

Page 11: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Distribution of ResponsesPolio Vaccinations Curriculum Reform FCR Reforms Durand Line

0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

Not At All

A LittleA Moderate Amount

A LotA Great Deal

Control Group

Pakistani militant groups in Kashmir

Afghan TalibanAl−Qaida

Firqavarana Tanzeems

Control Group

Pakistani militant groups in Kashmir

Afghan TalibanAl−Qaida

Firqavarana Tanzeems

Control Group

Pakistani militant groups in Kashmir

Afghan TalibanAl−Qaida

Firqavarana Tanzeems

Control Group

Pakistani militant groups in Kashmir

Afghan TalibanAl−Qaida

Firqavarana Tanzeems

Pun

jab

Sin

dhN

WF

PB

aloc

hist

an

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 21 / 35

Endorsement Experiments Framework

Data from an endorsement experiment:N respondentsJ policy questionsK political actorsYij ∈ {0,1}: response of respondent i to policy question jTij ∈ {0,1, . . . ,K}: political actor randomly assigned to endorsepolicy j for respondent iYij (t): potential response given the endorsement by actor tCovariates measured prior to the treatment

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 22 / 35

Page 12: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

The Proposed Model

Quadratic random utility model:

Ui(ζj1, k) = −‖(xi + sijk )− ζj1‖2 + ηij ,

Ui(ζj0, k) = −‖(xi + sijk )− ζj0‖2 + νij ,

where xi is the ideal point and sijk is the support levelThe statistical model (item response theory):

Pr(Yij = 1 | Tij = k) = Pr(Yij(k) = 1) = Pr(Ui(ζj1, k) > Ui(ζj0, k))

= Pr(αj + βj(xi + sijk ) > εij)

Hierarchical modeling:

xiindep.∼ N (Z>i δ, σ

2x )

sijkindep.∼ N (Z>i λjk , ω

2jk )

λjki.i.d.∼ N (θk ,Φk )

“Noninformative” hyper prior on (αj , βj , δ, θk , ω2jk ,Φk )

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 23 / 35

Quantities of Interest and Model Fitting

Average support level for each militant group k

τjk (Zi) = Z>i λjk for each policy j

κk (Zi) = Z>i θk averaging over all policies

Standardize them by dividing the (posterior) standard deviation ofideal points

Bayesian Markov chain Monte Carlo algorithmMultiple chains to monitor convergenceImplementation via JAGS (Plummer)

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 24 / 35

Page 13: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Model for the Division Level Support

Ordered response with an intercept αjl varying across divisionsThe model specification:

xiindep.∼ N (δdivision[i],1)

sijkindep.∼ N (λk ,division[i], ω

2k )

δdivision[i]indep.∼ N (µprovince[i], σ

2province[i])

λk ,division[i]indep.∼ N (θk ,province[i],Φk ,province[i])

Averaging over policiesPartial pooling across divisions within each province

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 25 / 35

Estimated Division Level Support

−1.

0−

0.5

0.0

0.5

1.0

Sta

ndar

dize

d Le

vel o

f Sup

port

Pakistani militant groups in KashmirMilitants fighting in AfghanistanAl−QaidaFirqavarana Tanzeems

Bah

awal

pur

n=

118

Der

a G

hazi

Kha

n n

=0

Fais

alab

ad

n=31

3

Guj

ranw

ala

n=

403

Laho

re

n=57

9

Mul

tan

n=

495

Raw

alpi

ndi

n=

208

Sar

godh

a n

=13

1

Hyd

erab

ad

n=20

3

Kar

achi

n=

473

Lark

ana

n=

311

Mir

purk

has

n=

0

Suk

kur

n=

293

Ban

nu

n=0

Der

a Is

mai

l Kha

n n

=84

Haz

ara

n=

287

Koh

at

n=50

Mal

akan

d n

=0

Mar

dan

n=

215

Pes

haw

ar

n=28

8

Kal

at

n=10

3

Mak

ran

n=

0

Nas

iraba

d n

=21

0

Que

tta

n=32

0

Sib

i n

=67

Zho

b n

=61

Punjab Sindh NWFP Balochistan

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 26 / 35

Page 14: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Model with Individual Covariates

Ordered response with an intercept αjl varying across divisionsThe model specification:

xiindep.∼ N (δdivision[i] + Z>i δ

Z ,1)

sijkindep.∼ N (λk ,division[i] + Z>i λ

Zk , ω

2k )

δdivision[i]indep.∼ N (µprovince[i], σ

2province[i])

λk ,division[i]indep.∼ N (θk ,province[i],Φk ,province[i])

Expands upon the division level model to include individual levelcovariates:

gender, urban/rural, education, incomeIndividual level covariate effects after accounting for differencesacross divisionsPoststratification on these covariates using the census

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 27 / 35

Estimated Effects of Individual Covariates

−0.

2−

0.1

0.0

0.1

0.2

Pakistani militant groups in Kashmir

Sta

ndar

dize

d Le

vel o

f Sup

port

●●

● ● ●

Female Rural Income Education −0.

2−

0.1

0.0

0.1

0.2

Militants fighting in Afghanistan

Sta

ndar

dize

d Le

vel o

f Sup

port

● ●●

● ●

Female Rural Income Education

−0.

2−

0.1

0.0

0.1

0.2

Al−Qaida

Sta

ndar

dize

d Le

vel o

f Sup

port

● ●●

Female Rural Income Education −0.

2−

0.1

0.0

0.1

0.2

Firqavarana Tanzeems

Sta

ndar

dize

d Le

vel o

f Sup

port

● ●● ●

Female Rural Income Education

Demographics play a small role in explaining support for groupsImai and Bullock (Princeton) Sensitive Survey Questions PolMeth 28 / 35

Page 15: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Regional Clustering of the Support for Al-Qaida

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 29 / 35

Correlation between Support and Violence

● ●

●●●

● ●

●●

●●

−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2

0

100

200

300

400

Pakistani militant groups in Kashmir

Division−Level Estimated Support

Num

ber

of In

cide

nts

● ●

●●●

● ●

●●

●●

−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2

0

100

200

300

400

Militants fighting in Afghanistan

Division−Level Estimated Support

Num

ber

of In

cide

nts

● ●

●●●

● ●

●●

●●

−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2

0

100

200

300

400

Al−Qaida

Division−Level Estimated Support

Num

ber

of In

cide

nts

● ●

●●●

● ●

●●

●●

−0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2

0

100

200

300

400

Firqavarana Tanzeems

Division−Level Estimated Support

Num

ber

of In

cide

nts

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 30 / 35

Page 16: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Simulation Studies

1 Based on the Pakistani DataSame 2 models plus province-level issue ownership modelTop-level parameters held constant across simulationsSample sizes and distribution same as beforeIdeal points, endorsements and responses follow IRT models

2 Varying sample sizesModel for division-level estimates with no covariatesModel for province-level estimates with no covariates but supportvarying across policiesN = 1000,1500,2000Again, top-level parameters held constant across simulations whileideal points, endorsements and responses follow IRT models

100 simulations under each scenario (3 chains, 60000 iterations)Frequentist evaluation of Bayesian estimators

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 31 / 35

Monte Carlo Evidence based on the Pakistani Data

Den

sity

−0.10 −0.05 0.00 0.05 0.10

0

10

20

30

40

Den

sity

0.80 0.85 0.90 0.95 1.00

0

5

10

15

20

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Effect Size

Pro

port

ion

Sta

tistic

ally

Sig

nific

ant

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●

Den

sity

−0.10 −0.05 0.00 0.05 0.10

0

10

20

30

40

Den

sity

0.80 0.85 0.90 0.95 1.00

0

5

10

15

20

●●●●●

●●

●●●

●●

●●

●●●●

●●

●●

●●●●

●●

● ●●

●●

●●

●●

●●●

●●

● ●

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Effect Size

Pro

port

ion

Sta

tistic

ally

Sig

nific

ant

The

Div

isio

n M

odel

W

ith In

divi

dual

Cov

aria

tes

The

Div

isio

n M

odel

Bias Coverage Rate of the 90% Confidence Intervals

Statistical Power

α level = 0.10

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 32 / 35

Page 17: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Model for the Province Level Issue Ownership

The Model specification:

xiindep.∼ N (δprovince[i],1)

sijkindep.∼ N (λjk ,province[i], ω

2jk )

λjk ,province[i]indep.∼ N (θk ,province[i],Φk ,province[i])

Pooling across divisions within each provincePartial pooling across policies

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 33 / 35

Monte Carlo Evidence with Varying Sample Size

●●

N=1000 N=1500 N=2000

−0.2

−0.1

0.0

0.1

0.2

Bias

Bia

s

N=1000 N=1500 N=2000

0.75

0.80

0.85

0.90

0.95

1.00

1.05

Coverage Rate of the 90% Confidence Intervals

Cov

erag

e R

ate

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Statistical Power

Effect Size

Pro

port

ion

Sta

tistic

ally

Sig

nific

ant

●●●●●●●●●●●●●●●●

●●

● ●

●●●

●●

●●

●●

●●

●●

●●●●●●

●●●

●●●●●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

N=2000N=1500N=1000

N=1000 N=1500 N=2000

−0.2

−0.1

0.0

0.1

0.2

Bia

s

N=1000 N=1500 N=2000

0.75

0.80

0.85

0.90

0.95

1.00

1.05

Cov

erag

e R

ate

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Effect Size

Pro

port

ion

Sta

tistic

ally

Sig

nific

ant

●● ●

●●●●

●●

●●

●●

● ● ●

●● ●

●●

● ●● ●

● ● ●

●●

●● ●●

●●●

●●

●●● ●

●●

●●

●● ● ● ● ● ● ● ●

N=2000N=1500N=1000

N=1000 N=1500 N=2000

−0.2

−0.1

0.0

0.1

0.2

Bia

s

● ●

N=1000 N=1500 N=2000

0.75

0.80

0.85

0.90

0.95

1.00

1.05

Cov

erag

e R

ate

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Effect Size

Pro

port

ion

Sta

tistic

ally

Sig

nific

ant

●●●●●

●●

●●●●●●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●

●●●●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

N=2000N=1500N=1000

The

Div

isio

n M

odel

The

Issu

e P

rovi

nce

Mod

elT

he D

ivis

ion

Mod

el

With

Indi

vidu

al C

ovar

iate

s

α level = 0.10

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 34 / 35

Page 18: Eliciting Truthful Answers to Sensitive Survey …Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai

Concluding Remarks

Viable alternatives to the randomized response technique

Item Count TechniqueEasy for researchers to implementEasy for respondents to understandWidely applicableNeed to carefully choose non-sensitive itemsAggregation =⇒ loss of efficiency

Endorsement ExperimentsMost indirect questioningApplicability limited to measuring supportNeed to carefully choose policy questionsMany groups =⇒ loss of efficiency

New statistical methods for efficient inferenceOpen-source software and code available

Imai and Bullock (Princeton) Sensitive Survey Questions PolMeth 35 / 35