design & analysis of social surveys

DESIGN & ANALYSIS OF SOCIAL SURVEYS COMPARATIVE PERSPECTIVES ACROSS TIME & SPACE MORE EFFICIENT CROSS-NATIONAL DATA ANALYSIS WITH MULTIPLE CORRESPONDENCE ANALYSIS Fifth International Conference John Kochevar Social Science Methodology 17 Monument Square Cologne, Germany October 2000 Boston, MA 02129


Post on 29-Jan-2018


Page 1: Design & Analysis Of Social Surveys

DESIGN & ANALYSIS OF SOCIAL SURVEYS

COMPARATIVE PERSPECTIVES ACROSS

TIME & SPACE

MORE EFFICIENT CROSS-NATIONAL DATA ANALYSIS

WITH

MULTIPLE CORRESPONDENCE ANALYSIS

Fifth International Conference, Social Science Methodology, Cologne, Germany, October 2000

John Kochevar, 17 Monument Square, Boston, MA 02129

Page 2: Design & Analysis Of Social Surveys

OUTLINE

I. Introduction

II. Commercial Cross-National Research

III. Theoretical and Practical Problems

IV. An Analysis Approach and Methodology

V. Examples

- Fear of Hypoglycemia in Four Nations

- Job Stress among Women in Five Nations

VI. Summary

EFFICIENT METHODS FOR MINING CROSS-NATIONAL DATA

Page 3: Design & Analysis Of Social Surveys

INTRODUCTION

Page 4: Design & Analysis Of Social Surveys

COMPARATIVE CROSS-NATIONAL RESEARCH IS VERY DIFFICULT

THREE LESSONS

• There are severe theoretical and practical constraints in comparative cross-national research.

• Exploratory data analysis is necessary for most cross-national research.

• Path analysis logic and multiple correspondence analysis can be used together for comprehensive, convenient and efficient data mining.

Page 5: Design & Analysis Of Social Surveys

COMMERCIAL CROSS-NATIONAL RESEARCH

A GROWING SEGMENT

Page 6: Design & Analysis Of Social Surveys

BOOKS ON CROSS-NATIONAL RESEARCH IN HARVARD LIBRARIES

• We used a keyword search on Harvard’s online catalog to estimate the number of books published under “cross-national,” “cross-cultural” and “comparative studies” over the last 40 years. We counted books that appeared to be quantitative in nature.

• The number of books on cross-national research peaked in the 1970’s and has leveled off since.

• This is not a precise count. Other books on cross-national research may appear under different keywords. We only counted English-language references.

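[Bar chart: Number of Books, 1960 to 1999, by five-year period (axis labels 1960-65, 1971-75, 1981-85, 1991-95); vertical axis from 0 to 30. Bar heights are not recoverable from the transcript.]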

Page 7: Design & Analysis Of Social Surveys

INCOME FROM INTERNATIONAL RESEARCH

TOP 50 US RESEARCH FIRMS

Note: Data collected by the Council of American Survey Research Organizations (CASRO). Source: Honomichl Top 50, Marketing News, June 5, 2000; May 28, 1990.

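[Chart: Total US$ Revenues and Percent International, 1990 and 2000. Values shown: total revenues of $2.6 billion and $6.8 billion; percent international of 31% and 39%. The pairing of values with years is not recoverable from the transcript.]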

Page 8: Design & Analysis Of Social Surveys

PROBLEMS

WEAK THEORY MESSY DATA

Page 9: Design & Analysis Of Social Surveys

CROSS-NATIONAL COMMERCIAL RESEARCH

THEORETICAL PROBLEMS

• Concept Equivalence

- Awareness and understanding of products vary strongly across nations.

- Beliefs, attitudes and values are seldom equivalent.

• Too many variables and too few theories

- Behavior is multivariate complex.

- Marketing “Theory” is a fashion industry.

Page 10: Design & Analysis Of Social Surveys

CROSS-NATIONAL COMMERCIAL RESEARCH

PRACTICAL PROBLEMS

• Rating scales are not used the same way across nations.

- Face validity is often weak.

- Reliability is weak. Intervals are not equal.

• Statistical assumptions are not justified.

- Normal distribution

- Homogeneity of variance

- Interaction effects

• Clients are not well-trained.

- Not interested in methodology

- Cannot read data displays

- Have the patience of angry children

• Data is often dirty.

- Sampling error/Non-response problems

- Interviewer error

- Problems in editing, coding, tabulating

Page 11: Design & Analysis Of Social Surveys

If you love surveys or sausages, you should not watch either being made . . .

Source: Otto von Bismarck: “Wenn Sie Gesetze und Würste mögen, dann sollten Sie niemals bei der Herstellung von beiden zuschauen.” (“If you like laws and sausages, you should never watch either being made.”)

Page 12: Design & Analysis Of Social Surveys

CONCEPTUAL APPROACH AND METHODOLOGY

COMPREHENSIVE ROBUST UNDERSTANDABLE

Page 13: Design & Analysis Of Social Surveys

ASSUME NOTHING / CHECK EVERYTHING

CONCEPTUAL

• Multivariate: Hierarchical effect models. Effects are universal. Nations are intervening.

• Blocked indicators: Dependent variables. Simon/Blalock approach.

METHODOLOGICAL

• Two-stage analysis: Check distributions. Eliminate outliers.

• Seek interaction effects: Multiple Correspondence Analysis.

• Practical: Display results in the simplest format.

Page 14: Design & Analysis Of Social Surveys

1. Conceptual Model. Decide on dependent variable. Organize similar independent variables into “blocks.” Organize blocks into a path model or hierarchical levels of influence.

2. Recode data. Categorical level of analysis.

3. Universal Analysis. Run separate analyses on each block of indicators with the dependent variable(s). Remove outliers and rerun analyses.

4. Nation Analysis. Introduce nations into each Universal analysis to determine extent of “universality.”

5. Most important predictors. Run the strongest predictors from each block of indicators. Introduce nations in a second analysis.

6. Check results. Examine the most important findings using crosstabs and other statistical techniques.

DATA MINING WITH MCA

STAGES OF ANALYSIS
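The recode and analysis stages above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the charts in this talk: it recodes records into a 0/1 indicator matrix and extracts only the first MCA axis by power iteration. The function names, the variable names ("fear", "alone", "sex") and the six records are hypothetical.

```python
from math import sqrt

def indicator_matrix(rows):
    """Recode categorical records into a 0/1 indicator matrix.

    rows: list of dicts mapping variable name -> category label.
    Returns (labels, Z), with one column per (variable, category) pair.
    """
    labels = sorted({(var, cat) for row in rows for var, cat in row.items()})
    index = {label: j for j, label in enumerate(labels)}
    Z = [[0.0] * len(labels) for _ in rows]
    for i, row in enumerate(rows):
        for var, cat in row.items():
            Z[i][index[(var, cat)]] = 1.0
    return labels, Z

def mca_first_axis(Z, iterations=200):
    """First-axis standard coordinates for the category columns of Z."""
    n, m = len(Z), len(Z[0])
    total = sum(map(sum, Z))
    r = [sum(row) / total for row in Z]                              # row masses
    c = [sum(Z[i][j] for i in range(n)) / total for j in range(m)]   # column masses
    # Standardized residuals: (P - r c^T) scaled by 1 / sqrt(r_i * c_j)
    S = [[(Z[i][j] / total - r[i] * c[j]) / sqrt(r[i] * c[j])
          for j in range(m)] for i in range(n)]
    # Power iteration on S^T S gives the leading right singular vector
    v = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        u = [sum(S[i][j] * v[j] for j in range(m)) for i in range(n)]
        w = [sum(S[i][j] * u[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("no association structure to extract")
        v = [x / norm for x in w]
    return [v[j] / sqrt(c[j]) for j in range(m)]

# Hypothetical records: "alone" coincides with high fear in every case
rows = [
    {"fear": "high", "alone": "yes", "sex": "F"},
    {"fear": "high", "alone": "yes", "sex": "M"},
    {"fear": "low", "alone": "no", "sex": "F"},
    {"fear": "low", "alone": "no", "sex": "M"},
    {"fear": "high", "alone": "yes", "sex": "F"},
    {"fear": "low", "alone": "no", "sex": "M"},
]
labels, Z = indicator_matrix(rows)
coords = dict(zip(labels, mca_first_axis(Z)))
for label, x in coords.items():
    print(label, round(x, 3))
```

Categories that always occur together receive the same coordinate, which is why strongly associated categories plot close to each other. For the nation stage, one would add a "nation" key to every record and rerun the same analysis.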

Page 15: Design & Analysis Of Social Surveys

EXAMPLE I

FEAR OF HYPOGLYCEMIA

Page 16: Design & Analysis Of Social Surveys

CROSS-NATIONAL COMMERCIAL SURVEYS

FEAR OF HYPOGLYCEMIA

• An American pharmaceutical manufacturer wanted to know the causes of fear of hypoglycemia (low blood sugar) among diabetics.

• Sample: Diabetics in four nations: United Kingdom, Germany, Spain, France.

• Questionnaires were self-completed under supervision.

• Approximately 50 respondents in each nation. N=195.

• Data quality varied. There was high variance in the main fear rating.

Page 17: Design & Analysis Of Social Surveys

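FACTORS CAUSING FEAR OF HYPOGLYCEMIA

[Path diagram: three blocks of predictors and the dependent variable, Fear Rating.]

• Demographics: Age, Sex, Education, Living Conditions

• Diabetes: Type of Diabetes, Years of Diabetes, Insulin Use, Type of Insulin

• Experiences: Type of Symptoms, Frequency of Symptoms, Frequency of Worry, Reasons for Worry

• Dependent variable: Fear Rating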

Page 18: Design & Analysis Of Social Surveys

FEAR OF HYPOGLYCEMIA AND NATIONALITY

[Two path diagrams, labeled “Independent” and “Intervening.” Boxes in both diagrams: Demographics, Diabetes, Experience, Symptoms, Worry, Nationality, Fear Rating. The first diagram also shows Frequency and Reasons. The arrows are not recoverable from the transcript.]

Page 19: Design & Analysis Of Social Surveys

SEVERE HYPOGLYCEMIA FEAR - DEMOGRAPHICS

UNIVERSAL

[MCA chart: category points 1 to 24 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Intensity of Worry: 1 Low; 2 High

Age: 3 LT 20; 4 20-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56-60; 12 61-65; 13 66-70; 14 71+

Sex: 15 Male; 16 Female

Education: 17 Middle school or less; 18 High school; 19 College; 20 Post-Grad

Living Conditions: 21 Live alone; 22 Live with children but no other adults; 23 Live with at least one other adult; 24 Live with children and at least one other adult

Page 20: Design & Analysis Of Social Surveys

SEVERE HYPOGLYCEMIA FEAR - DEMOGRAPHICS

NATION INFLUENCE

[MCA chart: category points 1 to 28 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Intensity of Worry: 1 Low; 2 High

Age: 3 LT 20; 4 20-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56-60; 12 61-65; 13 66-70; 14 71+

Sex: 15 Male; 16 Female

Education: 17 Middle school or less; 18 High school; 19 College; 20 Post-Grad

Living Conditions: 21 Live alone; 22 Live with children but no other adults; 23 Live with at least one other adult; 24 Live with children and at least one other adult

Nation: 25 UK; 26 Germany; 27 France; 28 Spain

Page 21: Design & Analysis Of Social Surveys

SEVERE HYPOGLYCEMIA FEAR - STRONG PREDICTORS

UNIVERSAL

[MCA chart: category points 1 to 29 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Intensity of Worry: 1 Low; 2 High

Age: 3 LT 20; 4 20-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56-60; 12 61-65; 13 66-70; 14 71+

Sex: 15 Male; 16 Female

Living Conditions: 17 Live alone; 18 Live with children but no other adults; 19 Live with at least one other adult; 20 Live with children and at least one other adult

Frequency of Worry: 21 Weekly; 22 Less Often; 23 Never

Reasons for Worry: 24 Fear of Associated Symptoms; 25 Difficult to Manage; 26 Fear of Coma; 27 Get it While Asleep; 28 Get it While Alone / No one to help; 29 Unpleasant

Page 22: Design & Analysis Of Social Surveys

SEVERE HYPOGLYCEMIA FEAR - STRONG PREDICTORS

NATION INFLUENCE

[MCA chart: category points 1 to 33 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Intensity of Worry: 1 Low; 2 High

Age: 3 LT 20; 4 20-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56-60; 12 61-65; 13 66-70; 14 71+

Sex: 15 Male; 16 Female

Living Conditions: 17 Live alone; 18 Live with children but no other adults; 19 Live with at least one other adult; 20 Live with children and at least one other adult

Frequency of Worry: 21 Weekly; 22 Less Often; 23 Never

Reasons for Worry: 24 Fear of Associated Symptoms; 25 Difficult to Manage; 26 Fear of Coma; 27 Get it While Asleep; 28 Get it While Alone / No one to help; 29 Unpleasant

Nation: 30 UK; 31 Germany; 32 France; 33 Spain

Page 23: Design & Analysis Of Social Surveys

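FEAR OF HYPOGLYCEMIA

BEHIND THE MCA CHARTS

[Table: percent with High and Low fear by “Get it while alone” (Yes, No, Total), shown for Spain and France and separately for respondents with “Fear of associated symptoms” and “No fear of associated symptoms.” The assignment of the cell groups below to panels is not recoverable from the transcript.]

Cell groups as extracted (three values per row, matching the Yes, No and Total columns; first row / second row):

• 100% 33% 50% / - 67 50

• 100% 92% 93% / - 8 8

• 86% 71% 74% / 14 29 26

• 88% 52% 60% / 13 48 40

χ² values as extracted: .09, 1.333, .64, 3.268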

• This table illustrates part of the data in the final “Top predictors” chart. The results seem to be contradictory and require close attention.

• The MCA chart shows that “Get it while alone” and “Fear of associated symptoms” interact with fear more strongly for Spanish diabetics. The data shown here indicate that the interactions are stronger for France.

• Overall, there was a stronger relationship between Spain and high fear. The MCA chart does not include the data point “No fear of associated symptoms.” The data point of France, 32, is pulled toward this invisible point and away from high fear. The display is correct, but incomplete. Take care. . .
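A crosstab check of this kind can be sketched as follows. The function computes the Pearson χ² statistic for a table of counts, without a continuity correction. The counts in the example are hypothetical, since the slide reports only percentages, and the function name is ours.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are High / Low fear,
# columns are "Get it while alone" Yes / No
example = [[12, 8],
           [3, 17]]
print(round(chi_square(example), 2))
```

With cells as small as those in a sample of about 50 per nation, the statistic is unstable, which is one more reason to read the chart and the table together.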

Page 24: Design & Analysis Of Social Surveys

1. Data Display. We show only two examples from the total analysis. In the full analysis we ran universal and national MCA’s for variables blocked as “Demographics,” “Diabetes,” and “Experiences.” Those variables highly associated with the dependent variable were run in the final “Top predictor” MCA charts. There were eight display charts.

2. Final Predictors and “Causality.” The distance between a predictor and the dependent variable is approximately their χ² distance (higher χ², closer distance) with other relationships taken into account. Logically, we need to know time order and if-and-only-if conditions to infer causality. Practically speaking, variables that are causally related show stronger relationships than those that are not. We use our judgement to pick top predictors that have the strongest relationships with the dependent variable.

3. Interpretation of Axes. Initially we interpreted the axes in presentations to clients. They disagreed with our interpretations and could not agree on their own. In general, MCA appeals to commercial clients because it summarizes important interactions and subgroups in their overall context. We seldom interpret latent variables implicit in axis loadings.

4. Severe Hypoglycemia and Fear. The final table shows that some factors, e.g. “Fear of Coma,” “Unpleasant,” “Get it while asleep,” were associated with fear of severe hypoglycemia, independent of other variables. The strongest relationships, however, were interactions of several variables, e.g. “Fear of associated symptoms” (e.g. “confusion”), “Get it while alone” and Spanish nationality. None of the strong relationships changed when nationality was introduced, so we conclude that nationality does not intervene.

So why are the Spanish experiencing fear so intensely? Subsequent research determined that the Spanish gave themselves more insulin injections and positively valued the initial symptoms of hypoglycemia because these otherwise unpleasant symptoms indicated their sugar was under control. They chose to risk hypoglycemia and experience its symptoms more than other nationalities. Germans, for example, gave themselves fewer injections, allowed their sugar to run high, and ultimately experienced more long term toxic consequences of their illness, e.g. diabetic foot amputations.

SEVERE HYPOGLYCEMIA FEAR

NOTES ON INTERPRETATION

Page 25: Design & Analysis Of Social Surveys

EXAMPLE II

WOMEN AND WORK STRESS

Page 26: Design & Analysis Of Social Surveys

CROSS-NATIONAL COMMERCIAL SURVEYS

WOMEN AND WORK STRESS

• An American women’s magazine wanted to know the causes of job stress among working women.

• Sample: Working women magazine readers in five nations: United States, Japan, Germany, Brazil and Australia.

• Questionnaires were in magazines, self-completed and returned by mail (N=22,500). We randomly sampled returns.

• Final sample N=4,500.

• Data quality varied.
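Drawing the analysis sample from the returns can be sketched as below. This assumes a simple random sample without replacement over all returns; the slide does not say whether the draw was stratified by nation. The function name, the seed and the return IDs are hypothetical.

```python
import random

def subsample_returns(return_ids, k, seed=2000):
    """Draw k returns at random, without replacement, reproducibly."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return rng.sample(return_ids, k)

# Hypothetical IDs for the 22,500 returned questionnaires
all_returns = list(range(1, 22501))
final_sample = subsample_returns(all_returns, 4500)
print(len(final_sample))
```

A fixed seed keeps the subsample identical across reruns, so the universal and nation analyses are run on the same respondents.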

Page 27: Design & Analysis Of Social Surveys

FACTORS ASSOCIATED WITH WORK STRESS

[Diagram: blocks of factors and the dependent variable, Work Stress.]

• Demographics: Age, Education, Income

• Personality: Perfectionism, Stress is Stimulating

• Work Motivations: Career Goals, Reasons for Working

• Home Factors: Home Problems, Children, Marital Status

• Work Factors: Environment, Control, Social support, Occupation

• Culture: Nationality

• Work Stress: Incidence, Duration, Severity

Page 28: Design & Analysis Of Social Surveys

WORK STRESS AND DEMOGRAPHICS

UNIVERSAL

[MCA chart: category points 1 to 22 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Stress: 1 Low; 2 Moderate; 3 High

Income - Quintiles: 4 Low; 5; 6 Medium; 7; 8 High

Education: 9 Less than HS; 10 High School; 11 Vocational/Trade; 12 Jr Coll/Assoc/Some College; 13 College; 14 Grad/Prof School

Age: 15 18-25; 16 26-30; 17 31-35; 18 36-40; 19 41-45; 20 46-50; 21 51-55; 22 56+

Page 29: Design & Analysis Of Social Surveys

WORK STRESS AND DEMOGRAPHICS

COUNTRY INFLUENCE

[MCA chart: category points 1 to 27 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Stress: 1 Low; 2 Moderate; 3 High

Income - Quintiles: 4 Low; 5; 6 Medium; 7; 8 High

Education: 9 Less than HS; 10 High School; 11 Vocational/Trade; 12 Jr Coll/Assoc/Some College; 13 College; 14 Grad/Prof School

Age: 15 18-25; 16 26-30; 17 31-35; 18 36-40; 19 41-45; 20 46-50; 21 51-55; 22 56+

Country: 23 US; 24 Japan; 25 Australia; 26 Germany; 27 Brazil

Page 30: Design & Analysis Of Social Surveys

WORK STRESS AND STRONG PREDICTORS

UNIVERSAL

[MCA chart: category points 1 to 59 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Stress: 1 Low; 2 Moderate; 3 High

Age: 4 18-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56+

Education: 12 Less than high school; 13 High School; 14 Voc/Trade; 15 Jr college/some educ; 16 College; 17 Graduate school

18 Work to support myself

Perfectionist - Describes: 19 Very well; 20 Somewhat; 21 Describes; 22 Somewhat does not; 23 Does not

Stimulated by stress/pressure: 24 Yes; 25 Pressure, not stress; 26 Seldom

27 Moved; 28 Changed jobs

Occupation: 29 Managers; 30 Professionals; 31 Craftsmen; 32 Technicians and Admin.; 33 Bureaucratized Service; 34 Commercialized Service; 35 Routinized workers; 36 Laborers; 37 Marginal Workers (Students)

38 No privacy; 39 Too many interruptions; 40 Too much work to do a good job

Can control pace - describes: 41 Very well; 42 Somewhat; 43 Describes; 44 Somewhat doesn’t; 45 Does not

Hours per week: 46 1-20; 47 21-34; 48 35-39; 49 40; 50 41-45; 51 46-59; 52 60+

Colleagues under stress: 53 Majority; 54 A few; 55 Don’t know

Female Work Friends: 56 None; 57 1-2; 58 3-5; 59 6+

Page 31: Design & Analysis Of Social Surveys

WORK STRESS AND STRONG PREDICTORS

NATION INFLUENCE

[MCA chart: category points 1 to 64 plotted in two dimensions. Point positions are not recoverable from the transcript.]

Stress: 1 Low; 2 Moderate; 3 High

Age: 4 18-25; 5 26-30; 6 31-35; 7 36-40; 8 41-45; 9 46-50; 10 51-55; 11 56+

Education: 12 Less than high school; 13 High School; 14 Voc/Trade; 15 Jr college/some educ; 16 College; 17 Graduate school

18 Work to support myself

Perfectionist - Describes: 19 Very well; 20 Somewhat; 21 Describes; 22 Somewhat does not; 23 Does not

Stimulated by stress/pressure: 24 Yes; 25 Pressure, not stress; 26 Seldom

27 Moved; 28 Changed jobs

Occupation: 29 Managers; 30 Professionals; 31 Craftsmen; 32 Technicians and Admin.; 33 Bureaucratized Service; 34 Commercialized Service; 35 Routinized workers; 36 Laborers; 37 Marginal Workers (Students)

38 No privacy; 39 Too many interruptions; 40 Too much work to do a good job

Can control pace - describes: 41 Very well; 42 Somewhat; 43 Describes; 44 Somewhat doesn’t; 45 Does not

Hours per week: 46 1-20; 47 21-34; 48 35-39; 49 40; 50 41-45; 51 46-59; 52 60+

Colleagues under stress: 53 Majority; 54 A few; 55 Don’t know

Female Work Friends: 56 None; 57 1-2; 58 3-5; 59 6+

Country: 60 US; 61 Japan; 62 Australia; 63 Germany; 64 Brazil

Page 32: Design & Analysis Of Social Surveys

DO CAUSES OF STRESS INTERACT?

TOTAL SAMPLE

• This table shows the cumulative impact of the major stress predictors.

• There is a small cumulative impact with the addition of each problem.

• A total of 130 respondents reported all four problems.

Those Reporting High Stress

Too Much Work to do a Good Job: 45%

+ Low Pace Control: 54%

+ Majority of Colleagues Under Stress: 59%

+ Too Many Interruptions: 60%
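A cumulative table of this kind can be produced as sketched below: for each successive problem, keep only respondents who report that problem and all earlier ones, then compute the share reporting high stress. The field names and the five respondents are hypothetical, not the survey data.

```python
def cumulative_high_stress(respondents, problems):
    """Share reporting high stress among those with the first k problems.

    Returns one (last problem added, group size, share) tuple per step.
    """
    results = []
    for k in range(1, len(problems) + 1):
        required = problems[:k]
        group = [r for r in respondents if all(r[p] for p in required)]
        share = sum(r["high_stress"] for r in group) / len(group) if group else None
        results.append((required[-1], len(group), share))
    return results

# Hypothetical respondents
respondents = [
    {"too_much_work": True, "low_pace_control": False, "high_stress": True},
    {"too_much_work": True, "low_pace_control": False, "high_stress": False},
    {"too_much_work": True, "low_pace_control": True, "high_stress": True},
    {"too_much_work": True, "low_pace_control": True, "high_stress": True},
    {"too_much_work": False, "low_pace_control": True, "high_stress": False},
]
table = cumulative_high_stress(respondents, ["too_much_work", "low_pace_control"])
for problem, size, share in table:
    print(problem, size, share)
```

Reporting the group size next to each share matters because the groups shrink at every step, as the 130 respondents with all four problems illustrate.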

Page 33: Design & Analysis Of Social Surveys

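IS GERMANY UNIQUE?

Percent reporting High Stress, All Countries Average

• Too Much Work to Do a Good Job? Yes 45%, No 23%. Strength of Relationship: 22%

• Cannot Control Pace? Yes 44%, No 28%. Strength of Relationship: 16%

Strength by Country (Too Much Work to Do a Good Job? / Cannot Control Pace?)

• Germany: 29% / 23%

• Australia: 19% / 16%

• US: 20% / 16%

• Japan: 10% / 15%

• Brazil: 9% / 7%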
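The “strength of relationship” figures on this slide are consistent with a simple percentage-point difference between the Yes and No groups (45% - 23% = 22%, 44% - 28% = 16%). The sketch below applies that reading to the “Too Much Work” panel. It assumes the country values appear in the same order as the country labels (Germany, Australia, US, Japan, Brazil); the function names are ours.

```python
def strength(pct_yes, pct_no):
    """Percentage-point gap in high stress between Yes and No groups."""
    return pct_yes - pct_no

def deviation_from_average(by_country, average_strength):
    """How far each country's strength sits above or below the average."""
    return {country: s - average_strength for country, s in by_country.items()}

# All-countries figures from the slide
too_much_work = strength(45, 23)
cannot_control_pace = strength(44, 28)

# Country strengths for "Too Much Work", order of labels assumed
by_country = {"Germany": 29, "Australia": 19, "US": 20, "Japan": 10, "Brazil": 9}
deviations = deviation_from_average(by_country, too_much_work)
print(too_much_work, cannot_control_pace, deviations)
```

Under that assumption, Germany is the only country whose gap exceeds the all-countries figure on this item.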

Page 34: Design & Analysis Of Social Surveys

1. Data Display. We show only four charts from a much larger analysis. In addition, we have not displayed some data points. For example, “No privacy” has two values, “Describes” and “Does not Describe.” We do not show “Does not Describe” on the chart. This makes it possible to fit many more variables on the chart, and it is consistent with our data mining approach. However, the absence of a cause (e.g. “Does not Describe”) can have unique interactions. We examine the initial computer results and plot such points only when they are important.

2. Occupation. We coded results to a standard used by job safety researchers except in Japan. At the time of the survey there was a controversy concerning stress-related worker deaths (karoshi) in Japan. Japanese workers were classified only as “part-time” or “full-time” under the orders of a unit manager (not a researcher) who was afraid the results might reflect badly on certain businesses. Unfortunately, almost all female Japanese workers are customarily classified as part-time workers, and all tables with occupation as a variable are distorted by the unique relationship between Japan and occupation.

3. Job stress. The Universal table supports the findings of earlier studies showing that immediate factors in the worker’s environment (“Control of Pace,” “Too much work”) are the most important causes of worker stress. This relationship held while controlling for a variety of other relationships and was even stronger in the presence of other work conditions, e.g. “Colleagues under stress.” German women tended to experience more stress in response to bad work conditions. Some of this can be explained by the higher proportion of factory workers in the German sample, but not all.

We are conducting additional analyses to determine if actions taken to alleviate stress, e.g. exercise, dancing, drinking, may account for the differences in experienced work stress across nations.

WORKING WOMEN AND WORK STRESS

NOTES ON INTERPRETATION

Page 35: Design & Analysis Of Social Surveys

SUMMARY

Page 36: Design & Analysis Of Social Surveys

• Theory is weak in cross-national commercial research. Exploratory data analysis is required.

• Data is messy. Error is very high. Use categorical level of measurement.

• Interaction effects are common. Systematically identify effects on single dependent variable. Use multiple correspondence analysis.

• Results are multivariate complex. Determine the most important predictors among blocks of similar predictors. Use causal path logic.

• Display all results. Show the comparative effects of nations for each block of indicators.

• Check results using alternative analyses.
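Recoding to a categorical level of measurement can be sketched as follows, using the age bands from the work stress charts (18-25 through 56+). The function name is hypothetical, and the sketch assumes ages are recorded as whole years.

```python
from bisect import bisect_left

# Upper bounds of every band except the open-ended last one
AGE_UPPER_BOUNDS = [25, 30, 35, 40, 45, 50, 55]
AGE_LABELS = ["18-25", "26-30", "31-35", "36-40",
              "41-45", "46-50", "51-55", "56+"]

def recode_age(age):
    """Map an age in whole years to its band label."""
    if age < 18:
        raise ValueError("age is below the youngest band")
    # bisect_left returns the index of the first bound that is >= age
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age)]

print([recode_age(a) for a in (18, 25, 26, 40, 55, 56, 72)])
```

Once every variable is banded this way, rating scales and continuous measures enter the analysis on the same footing, with no assumption of equal intervals.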

SUMMARY

Page 37: Design & Analysis Of Social Surveys

• Exploratory. Comprehensive.

• Logic is obvious.

• Few assumptions about data.

• Interactions become apparent.

• Efficient.

• Easy to read displays.

CORRESPONDENCE ANALYSIS FOR DATA MINING

ADVANTAGES

Page 38: Design & Analysis Of Social Surveys

REFERENCES

Problems of Cross-national Comparative Research

The problems have been well known for many years.

Armer, M., Grimshaw, A.D. (Eds.), Comparative Social Research: Methodological Problems and Strategies. New York: Wiley, 1973.

Kohn, M.L., Cross-National Research as an Analytic Strategy. American Sociological Review, 1987, Vol. 52, 713-731.

Van de Vijver, F., Leung, K. Methods and Data Analysis for Cross-Cultural Research. London: Sage, 1997.

Correspondence Analysis

The application of correspondence analysis for detecting interactions was noted by Hayashi.

Hayashi, C., Suzuki, T. Quantitative Approach to a Cross-Societal Research. Annals of the Institute of Statistical Mathematics, Vol. 27, 1975, 1-32.

Logic and Strategy

Hayashi, C., The Quantitative Study of National Character. In Sasaki, M. (Ed.) Values and Attitudes Across Nations, Boston: Brill, 1998, 91-114.

Jones, L. (Ed.) The Collected Works of John W. Tukey: Volume IV Philosophy and Principles of Data Analysis. Monterey, CA: Wadsworth & Brooks/Cole, 1986.

Simon, H. A. Spurious Correlation: A Causal Interpretation. Journal of the American Statistical Association. 1954, Vol. 49, 467-479.

Blalock, H.M. Multiple Indicators and the Causal Approach to Measurement Error. American Journal of Sociology, 1969, Vol. 75, 264-272.