
1

Field experiments for assessing question validity

Patrick Sturgis, Department of Sociology, University of Surrey, UK

Paper presented at the conference ‘Survey Measurement: Assessing the Reliability and Validity of Contemporary Questionnaire Items’, The Royal Statistical Society, 10 April 2008

2

Plan of Talk

• Standard validity assessment for survey questions

• Field experiments

• Example 1 – Political knowledge

• Example 2 – Social trust

• Concluding remarks

3

Standard validity assessment

• Nothing

• Face/process validity

• Correlation with criterion variables

• Multi-trait-multi-method (MTMM)

• Expert panels

• Behaviour coding

• Interviewer debrief

• Thinkaloud protocols/cognitive interview

4

Limitations

• Small n/purposive selection – do inferences generalize?

• Do different techniques/researchers identify the same ‘problems’?

• Do modifications increase validity?

• Paradoxical limitations for survey research!

5

Field Experiments

• Large n with randomization of alternate forms

• Clean and powerful inference

• Lack of criterion reference can be problematic

• But theory can help!
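The randomization of alternate forms described above could be sketched as follows. This is a minimal illustration only: the function name `assign_forms`, the form labels "A" and "B", and the respondent IDs are hypothetical and are not taken from the studies in this talk.

```python
import random

def assign_forms(respondent_ids, forms=("A", "B"), seed=2008):
    """Randomly assign each respondent to one question form.

    Shuffles the IDs, then deals them out to the forms in turn, so the
    group sizes differ by at most one. All names here are illustrative.
    """
    rng = random.Random(seed)   # fixed seed makes the allocation reproducible
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: forms[i % len(forms)] for i, rid in enumerate(ids)}

# Ten hypothetical respondents split across two forms.
allocation = assign_forms(range(1, 11))
```

Because assignment depends only on the random shuffle, differences between the form groups can be attributed to the question wording rather than to who received which form.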

6

Example 1 (with Nick Allum, Patten Smith)

Measuring Political Knowledge:

Guessing and partial knowledge

7

Standard approach

• MCQ format:

• “The Number of MPs in Parliament is about 100”

a. True
b. False
c. DK

• DKs ‘encouraged’

• Two key problems (Mondak 2001; 2002):

– Some say DK when they can answer correctly at p > 0.5 (partial knowledge)

– Some provide a substantive answer when they cannot answer correctly at p > 0.5 (guessing)

8

Personality Variance

• Variation in knowledge scores reflects more than just knowledge

• Men more likely to guess in absence of knowledge

• Women more likely to say DK with partial knowledge

• Thus, men ‘appear’ to know more about politics than women

9

The Solution?

• Force all respondents to provide an answer even if they genuinely DK (Mondak 2001)

• Randomly allocate residual DKs across substantive categories

• Removes personality variance by omitting option of guessing

• And of saying DK in presence of partial knowledge
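The random allocation of residual DKs could look something like the sketch below. It assumes a true/false item with responses stored as strings; the function name `allocate_dks` and the example answers are hypothetical, not part of the original design.

```python
import random

def allocate_dks(responses, categories=("true", "false"), seed=0):
    """Replace each 'DK' with a substantive category drawn uniformly at random.

    Substantive answers are returned unchanged.
    """
    rng = random.Random(seed)   # seeded so the allocation can be repeated
    return [rng.choice(categories) if r == "DK" else r for r in responses]

# Hypothetical responses to one true/false item.
answers = ["true", "DK", "false", "DK", "true"]
imputed = allocate_dks(answers)
```

On a two-category item, a randomly allocated DK has a 0.5 chance of being scored correct, which is the benchmark against which probed DK responses are compared later in the talk.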

10

Study 1 - Partial Knowledge

• BMRB CATI omnibus (quota sample)

• Interviewing 17-19 December 2004

• N = 1006

• Three true/false knowledge items:

– Britain's electoral system is based on proportional representation

– MPs from different parties are on parliamentary committees

– The Conservatives are opposed to the ratification of a constitution for the European Union

11

Design

• “For the next few questions, I am going to read out some statements, and for each one, please tell me if it is true or false. If you don't know, just say so and we will skip to the next one”

• If respondent answers DK:

• “You said earlier that you don't know whether the number of MPs is about 100. Could you please just give me your best guess?”

• Initial DK responses conceal partial knowledge if the proportion correct after the probe is > 0.5
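One way to check whether probed DK responses beat chance is an exact one-sided binomial test against 0.5. The sketch below is an illustration under that assumption, not the analysis used in the study (which reports a binary logit); the function name `binom_upper_tail` and the counts are hypothetical.

```python
from math import comb

def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): an exact one-sided test."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical counts, not the study's data: 60 of 100 probed DKs correct.
p_value = binom_upper_tail(60, 100)
```

A small p-value would suggest that respondents who initially said DK answer correctly more often than a coin flip, which is the signature of partial knowledge.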

12

Probed DK Responses

13

Results

Binary logit predicting correct answer (0,1)

Model-predicted probabilities of a correct answer:

• 71% for those giving an initial response

• 53% for probed DKs

• 50% for random allocation

• No gender difference

14

Study 2 - Guessing

• Ask standard format knowledge questions but where answer options are all wrong

• Respondents choosing any substantive alternative are ‘guessing’:

1. Who is the Secretary of State for Trade and Industry? Is it,
a. Geoff Hoon
b. Peter Hain, or
c. Do you not know?
(correct = Alan Johnson)

• BMRB omnibus n=2011, 4-6 November 2005 and 9-11 December 2005

15

Who is the Secretary of State for Trade and Industry? Is it,
a. Geoff Hoon
b. Alan Johnson, or
c. Do you not know?

[Bar chart: % giving each answer, by sex (axis 0 to 50). Alan Johnson: male 26, female 16. Alistair Darling: male 41, female 37. DK: male 32, female 47.]

16

Who is the Secretary of State for Trade and Industry? Is it,
a. Geoff Hoon
b. Peter Hain, or
c. Do you not know?

[Bar chart: % giving each type of response, by sex (axis 0 to 80). Correct: male 1, female 1. Guess: male 42, female 27. DK: male 57, female 72.]

17

Binary Logit Model

Dependent variable: guess=1, DK=0

18

Conclusions

• No evidence that DKs in survey knowledge items conceal partial knowledge

• Guessing, however, is common and differential (favouring men)

• Guessing also related to political knowledge

• Recommendation: use ‘standard’ format items

• For marginal comparisons, randomly allocate DKs to substantive categories

• For associational relationships use number right scoring (treat DK and incorrect as equivalent)
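Number-right scoring, as recommended above for associational analyses, can be sketched in a few lines. The function name `number_right`, the answer key, and the example respondent are all hypothetical and serve only to show that DK and incorrect answers both score zero.

```python
def number_right(responses, key):
    """Count correct answers; DK and incorrect answers both score zero."""
    return sum(1 for r, k in zip(responses, key) if r == k)

key = ["false", "true", "true"]          # hypothetical answer key
respondent = ["false", "DK", "false"]    # one correct, one DK, one incorrect
score = number_right(respondent, key)
```

Under this scheme the respondent above scores 1 out of 3, and would score the same had the DK been a wrong substantive answer.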

19

Example 2 (with Patten Smith)

Investigating Social Trust

Using thinkalouds

20

Conceptions of Trust

• Trust is a ‘good thing’

• Trusting citizens are good citizens (voting, volunteering, civic engagement)

• Trusting societies are good societies (more democratic, egalitarian, higher economic performance)

• Trust ‘lubricates’ social and economic transactions

• Reduces ‘monitoring costs’

21

‘Thick’ Trust

• Also ‘particularized’ or ‘strategic’ trust

• Between people who know one another

• Based on personal experience

• Encapsulated interests; ‘your interests are my interests’ (Hardin)

• I trust x to do y

22

‘Thin’ trust

• Also ‘social’ or ‘generalized’ trust

• Trust between people not personally known to one another

• More akin to a core social value or attitude

• “an evaluation of the moral standards of the society in which we live” (Newton)

• A ‘default position’ in transactions with unknown others

23

Does this matter?

• Primary social and individual returns are to thin/social trust

• Thick and thin trust may even be negatively correlated

• The less we trust people in general, the more we retreat to the safety of those we know

• So, empirically distinct measures are clearly essential

24

The standard trust question

• Generally speaking, would you say that most people can be trusted, or that you can't be too careful in dealing with people?

– Most people can be trusted

– Can’t be too careful

• Usually credited to Rosenberg (1959), the ‘Rosenberg Generalized Trust’ (RGT) item

25

The Local Area Trust item

• How much do you trust people in your local area?

– a lot

– a fair amount

– not very much

– not at all

• Reflects Putnam’s emphasis on trust being a property of local areas

26

Trust by Question type

• These items are both used more or less interchangeably as measures of generalized trust

• Yet, they yield very different estimates of trust at the national level, e.g.:

– Social Capital Community Benchmark survey: 47% most people can be trusted; 83% trust people in local area ‘some’ or ‘a lot’

– UK Taking Part survey: 44% most people can be trusted; 74% trust ‘many’ or ‘some’ of the people in their local area

• Why such a large discrepancy in generalized trust (trust in strangers)?

27

Research Design

• Ipsos-MORI general population omnibus survey

• Random selection of small areas, quota controlled selection of individuals

• n=989 (fieldwork, November 2007)

• Respondents randomly assigned to RGT or TLA item

• In answering the last question, who came to mind when you were thinking about ‘most people’/ ‘people in your local area’?

28

Distributions for trust questions

RGT item (n=508)                         TLA item (n=481)

Most people can be trusted  48% (229)    A lot          20% (100)
Can’t be too careful        52% (252)    A fair amount  60% (302)
                                         Not very much  17% (88)
                                         Not at all      3% (17)

29

Higher Order Codes (% mentioned), with their Primary Codes

• Known others (42%): 1. colleagues/ ex-colleagues; 2. family/ family member; 3. friends; 4. most people I know/ meet; 5. neighbours; 6. people from my church

• Unknown others (22%): 7. anyone/ all people; 8. everyone/ everybody; 9. foreigners/ ethnic minorities; 10. general public/ people in general; 11. children/ young people; 12. no-one in particular; 13. strangers

• Local community (5%): 14. people in this town/ village

• Named job/ profession (10%): 15. doctors; 16. officials/ authority figures/ professionals; 17. police; 18. politicians/ political parties; 19. salesmen/ sales people; 20. tradesmen

• Other (not relevant) (13%): 21. don't know these days; 22. identity theft; 23. you have to place trust in people; 24. people interested in themselves; 25. people mostly trustworthy; 26. trust people until they upset me; 27. trusting is naïve; 28. other answers

• Don’t know/ no answer (22%): 29. don't know/not stated
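Collapsing the primary codes into higher order codes could be sketched as below. The code ranges are inferred from the layout of the coding frame and should be read as an assumption; the function name `percent_mentioned` and the sample responses are hypothetical and are not study data.

```python
# Grouping of primary codes 1-29 into higher order codes (inferred, an assumption).
HIGHER_ORDER = {
    "known others": range(1, 7),
    "unknown others": range(7, 14),
    "local community": range(14, 15),
    "named job/profession": range(15, 21),
    "other (not relevant)": range(21, 29),
    "don't know/no answer": range(29, 30),
}
CODE_TO_GROUP = {c: g for g, codes in HIGHER_ORDER.items() for c in codes}

def percent_mentioned(respondent_codes):
    """% of respondents mentioning each higher order code at least once."""
    n = len(respondent_codes)
    counts = dict.fromkeys(HIGHER_ORDER, 0)
    for codes in respondent_codes:
        # A set ensures each respondent counts once per higher order code.
        for group in {CODE_TO_GROUP[c] for c in codes}:
            counts[group] += 1
    return {g: 100 * k / n for g, k in counts.items()}

# Hypothetical coded answers for four respondents.
sample = [[2, 3], [13], [2, 17], [29]]
shares = percent_mentioned(sample)
```

Because a respondent can mention more than one kind of person, the percentages can sum to more than 100, as they do in the coding frame above.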

30

Who comes to mind by RGT

[Bar chart: % mentioned (axis 0% to 80%) by code (known others, unknown others, named job/profession, people in local area, other, don't know/not stated), shown separately for ‘most people can be trusted’ and ‘can't be too careful’.]

31

Who comes to mind by TLA

[Bar chart: % mentioned (axis 0% to 80%) by code (including people in local area, other, don't know/not stated), shown separately for ‘a lot’, ‘a fair amount’ and ‘not at all/not very much’.]

32

Who came to mind – both questions

[Bar chart: % mentioned (axis 0% to 60%) by code (including people in local area, other, don't know/not stated), shown separately for RGT and TLA.]

33

Explanatory Models 1

RGT Item – Binary Logit Model

                                              Model 1a                   Model 2a
Covariates                                    Logit (S.E.)        O.R.   Logit (S.E.)        O.R.
Age (years)                                    0.028 (0.036)      1.03    0.013 (0.038)      1.01
Sex (male=1)                                   0.057 (0.197)      1.06    0.091 (0.207)      1.09
Social class (ABC1=1)                          0.817 (0.213)***   2.26    0.949 (0.227)***   2.58
Longstanding illness (yes=1)                   0.355 (0.335)      1.43    0.462 (0.349)      1.59
Highest qualification (ref=no qualifications)
  Degree                                       0.944 (0.337)**    2.60    1.029 (0.354)**    2.80
  GCSE or above                                0.108 (0.261)      1.11    0.142 (0.276)      1.15
Marital status (ref=single, never married)
  Divorced                                     0.236 (0.454)      1.27    0.508 (0.476)      1.66
  Married                                      0.176 (0.274)      1.19    0.413 (0.291)      1.51
  Widowed                                     -0.124 (0.516)      0.88    0.272 (0.540)      1.31
Who came to mind? (ref=2. unknown others)
  1. known others                              -                  -       1.535 (0.267)***   4.64
  3. people in local area                      -                  -       1.885 (0.763)**    6.60
  4. named job/ profession                     -                  -      -0.255 (0.373)      0.78
  5. other (not relevant)                      -                  -       0.257 (0.328)      1.29
  6. no-one/ don't know/ not stated            -                  -       1.043 (0.280)***   2.84
Constant                                      -1.178 (0.345)      0.31   -2.161 (0.410)      0.12
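The O.R. column is the exponentiated logit coefficient. The sketch below reproduces that conversion for one row; the Wald 95% interval it also returns is an added illustration that does not appear in the slides, and the function name `odds_ratio` is hypothetical.

```python
from math import exp

def odds_ratio(logit, se, z=1.96):
    """Return (OR, lower, upper): exp(coefficient) with a Wald 95% interval."""
    return exp(logit), exp(logit - z * se), exp(logit + z * se)

# Social class (ABC1=1) in Model 1a: logit 0.817, S.E. 0.213
or_, lower, upper = odds_ratio(0.817, 0.213)
```

Here `or_` rounds to 2.26, matching the tabulated odds ratio for social class in Model 1a.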

34

Explanatory Models 2

TLA Item – Ordered Logit Model

                                              Model 1b                   Model 2b
Covariates                                    Logit (S.E.)        O.R.   Logit (S.E.)        O.R.
Age (years)                                    0.097 (0.034)**    1.10    0.076 (0.034)*     1.08
Sex (male=1)                                  -0.393 (0.186)**    0.68   -0.255 (0.190)      0.77
Social class (ABC1=1)                          0.751 (0.204)***   2.12    0.771 (0.207)***   2.16
Longstanding illness (yes=1)                   0.230 (0.293)      1.26    0.297 (0.297)      1.35
Highest qualification (ref=no qualifications)
  Degree                                       0.605 (0.312)*     1.83    0.425 (0.320)      1.53
  GCSE or above                                0.218 (0.255)      1.24    0.075 (0.258)      1.08
Marital status (ref=single, never married)
  Divorced                                    -0.247 (0.409)      0.78   -0.206 (0.418)      0.81
  Married                                      0.323 (0.249)      1.38    0.275 (0.253)      1.32
  Widowed                                      0.516 (0.440)      1.68    0.447 (0.448)      1.56
Who came to mind? (ref=2. unknown others)
  1. known others                              -                  -       1.559 (0.305)***   4.75
  3. people in local area                      -                  -       0.953 (0.408)*     2.59
  4. named job/ profession                     -                  -       0.087 (0.305)      1.09
  5. other (not relevant)                      -                  -       0.383 (0.356)      1.47
  6. no-one/ don't know/ not stated            -                  -       0.579 (0.346)      1.78
Constant                                       -                  -       -                  -

35

Concluding Remarks

• Large-scale field experiments are a useful way of assessing validity of questions

• Random sample + random manipulation yields strong inferential power

• Under-utilized due to cost considerations

• But are they really so costly?

• A complement to rather than replacement for small n approaches

36

Papers

• Sturgis, P., Allum, N. & Smith, P. (2008) The Measurement of Political Knowledge in Surveys. Public Opinion Quarterly 72, 90-102.

• Sturgis, P. and Smith, P. (2008) Assessing the Validity of Generalized Trust Questions: What kind of trust are we measuring? Paper presented at the ‘Conference on Composite Scores’ ESADE, Barcelona, 14-15 February 2008.

• Sturgis, P. and Smith, P. (2007) Fictitious Issues Revisited: political knowledge, interest, and the generation of nonattitudes. (under review).

• Sturgis, P., Choo, M. & Smith, P. (2007) Response Order, Party Choice, and Evaluations of the National Economy: A Survey Experiment. Survey Research Methods (in press).
