Respondents as Measurement Items: A Calcutta Study
David Heise and Shibashis Mukherjee
November 7, 2014, Social Psychology, Health, and Life Course Workshop


TRANSCRIPT

Respondents as Measurement Items: A Calcutta Study
David Heise and Shibashis Mukherjee
November 7, 2014
Social Psychology, Health, and Life Course Workshop
Indiana University, Bloomington
This presentation may be downloaded at

Our title hints that you are in for an exotic Eastern experience today.
(Image: Ratha Yatra Festival in Puri, India)
That's partly true. You will have Calcuttan Shibashis telling you how we got our data. But a lot of this presentation is a retired prof talking about methodology.

The problem
We sociologists are oriented towards characterizing variable distributions.
But a lot of social life relates to norms, about which most people agree.
Most of what you know about assessing distributions is wrong for assessing norms.

To assess norms:
Use relatively small samples. Most of the people reporting a norm say the same thing. How many times do you have to hear people say that fathers are men?
Don't use random sampling. Seek the best enculturated individuals as respondents. For example, better respondents may be older people who lived their whole lives in the same culture.
Cull respondents. Evaluate informants with respect to their cultural authority, and eliminate those who do not help to delineate the norms of interest. That is the central topic of our talk today.

Provenance of the ideas
Sociologist Peter Rossi and colleagues began developing some of the key ideas about assessing norms during the 1980s. Reviews are available in:
Guillermina Jasso. "Factorial survey methods for studying beliefs and judgments." Sociological Methods & Research 34:
Lisa Wallander. "25 years of factorial surveys in sociology: A review." Social Science Research 38:
Anthropologist A. Kimball Romney began developing other ideas in the 1980s.
A. K. Romney, S. C. Weller, and W. H. Batchelder. "Culture as consensus: A theory of culture and informant accuracy." American Anthropologist 88:
Susan Weller. "Cultural consensus theory: Applications and frequently asked questions."
Field Methods 2007; 19(4): 339-368.
Overall review: D. Heise. Surveying Cultures: Discovering Shared Conceptions and Sentiments. Hoboken, NJ: Wiley Interscience.

Plan of this presentation
Data collection: We obtained data from forty Calcutta respondents, half female, who rated 1,469 stimuli on three affective dimensions.
Analyses: We conducted novel multivariate analyses aimed at examining respondent quality.
Issues: How do norm assessments change when only the best respondents are retained? What is the social psychology of respondents who are dropped?

Rating Scales
                    One Side            Other Side
Evaluation scale    Beautiful ( )       Ugly ( )
                    Lovely ( )          Repulsive ( )
                    Kind ( )            Cruel ( )
                    Superior ( )        Inferior ( )
Potency scale       Huge ( )            Minute ( )
                    Powerful ( )        Powerless ( )
                    Big ( )             Little ( )
                    Strong ( )          Weak ( )
Activity scale      Fast ( )            Slow ( )
                    Industrious ( )     Lazy ( )
                    Alive ( )           Dead ( )
                    Thin ( )            Thick ( )

Concepts
Calcutta respondents rated 1200 of the 1500 concepts rated by IU respondents.
New concepts were added (relevant to Bengali/Indian culture). Example: (A Communist), (a cricket stadium).
Final breakdown includes 502 identities, 480 behaviors, 283 modifiers, 195 settings, and 9 other kinds of concepts. Total = 1469 stimuli.
15 questionnaires were used, each with 98 concepts (except for one set of 97).

Data Collection
Questionnaire
  Part 1: Demographic questions (sex, age, geographic origin)
  Part 2: Interactive tutorial (Java applet, affective ratings)
  Part 3: Rating feelings for each stimulus on three scales (E-P-A)
Randomization of the questionnaire
  Order of stimuli
  Order of E-P-A scales
  Orientation of each scale (Beautiful, Lovely, Kind, Superior can be on the left or right)
The questionnaires were not randomized, unlike in ordinary Surveyor.
Each respondent did all 15 questionnaires, so they could choose the order.

Data Collection: Continued
Most college-educated Bengalis know English.
Key terms in the applet were used in English, e.g. the Skip button (Bengali lacks word-for-word translations).
A Study Information Sheet was used in both English and Bengali:
  You are invited to participate in a research study of sentiments about different kinds of people, behaviors, and places. You were selected as a possible subject because you are a native Bengali speaker who has not lived outside India and therefore has not been directly in contact with other cultures and you are 18 years of age or older. We ask that you read this form and ask any questions you may have before agreeing to be in the study.

Data Collection: Continued
All parts except the tutorial were in Bengali, using Bengali fonts.
At the end of the survey, an ID was asked for and then the Save button had to be pressed.
Data were saved on an Indiana University server.
Respondents' answers were collapsed into a single dataset using the IDs.

Time line and payment for data collection
Each respondent took around 30 to 45 minutes to complete a questionnaire.
Each respondent took 2-3 weeks to complete all 15 questionnaires.
Total length of data collection was 1 semester (Fall semester 2013).
Payment: Approximately 1000 Rupees ($20) was paid on completion.

Recruiting
An ad was posted to student groups on Facebook.
Students were recruited from Jadavpur University.
Later, snowball sampling was done using initial recruits.
My ID was given for contact.
77 individuals contacted, 40 completed the survey.

Problems with data collection
Security issues in Java: the Java applet did not run in Internet browsers.
26 respondents faced problems in running the applet and 21 dropped out after filling 1-2 questionnaires.
All those who completed three questionnaires completed the entire survey.
Data were discarded from respondents who did not complete all 15 questionnaires (as they were not paid).

Calcutta Dataset
(Table: one row per concept, 1,469 in all, e.g. B_abandon, B_abuse, B_accuse_incriminate, B_address, B_advise, B_agree_with, B_aid_help, B_amuse, B_analyze, B_annoy, B_answer_reply_to, B_apologize_to, B_appeal_to, B_applaud_compliment_praise, B_appoint, B_apprehend, B_approach, B_argue_with, B_arrest. Columns: word, e1, p1, a1, e2, p2, a2, e3, p3, a3, through a40. Missing ratings appear as NA.)

Pan-Respondent Correlation Matrix Based on Calcutta Data
(Table: correlation matrix with rows and columns e1, e2, e3, p1, p2, p3, a1, a2, a3.)

Clarification Through Component Analysis
(Table: rotated component loadings RC1, RC2, RC3 by respondent number, sex, and scale.)
Component vs. factor analysis of the correlation matrix.
Missing data were replaced by respondent means on each scale.
11 components were significant according to Horn's test of principal components.
The highest loadings on C1 involved the E-scale; A-scale loaded highest on C2; P-scale on C3. So the first three components corresponded to EPA. Other components clustered sets of respondents.
We extracted and rotated the first 3 components.
The matrix of rotated component loadings is three columns by 120 respondent/scale variables.
Next we re-organize this matrix into three groups corresponding to E, P, and A; and two sub-groups corresponding to sex of respondent; with variables sorted by loadings within each group.
(Table: rotated loadings on E, P, and A for each respondent-scale variable, males first, then females.)

Component Analysis of Pan-Respondent Correlations
All respondents use the evaluation scale similarly.
A third to half of the respondents use the Potency and Activity scales in a convergent way.
On the other hand, look at the instances where a scale is better at measuring a dimension other than the one it is supposed to measure (other > supposed & other >= .2).
To get reasonably valid measures of each dimension, include just those respondents whose ratings load 0.20 or higher on the focal dimension.
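The culling rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the talk: the function names (`impute_respondent_means`, `component_loadings`, `cull`) and the small loadings matrix are hypothetical, the loadings returned by `component_loadings` are unrotated (the rotation step is omitted), and Horn's test is not implemented.

```python
import numpy as np

def impute_respondent_means(ratings):
    """Fill NaN in each column with that column's mean.
    Rows are concepts; each column is one respondent-scale variable (e1, p1, a1, ...)."""
    ratings = np.asarray(ratings, dtype=float)
    col_means = np.nanmean(ratings, axis=0)
    return np.where(np.isnan(ratings), col_means, ratings)

def component_loadings(ratings, n_components=3):
    """Unrotated principal-component loadings of the column correlation matrix.
    Assumption: no rotation is applied here, unlike in the talk."""
    corr = np.corrcoef(ratings, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # largest components first
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def cull(loadings, target, threshold=0.20):
    """Apply the two rules from the slide to each respondent-scale variable.
    keep:    loading on the target component is >= threshold.
    misused: some other component's loading exceeds the target loading
             and is itself >= threshold (other > supposed & other >= .2)."""
    loadings = np.asarray(loadings, dtype=float)
    target = np.asarray(target)
    rows = np.arange(loadings.shape[0])
    own = loadings[rows, target]
    others = loadings.copy()
    others[rows, target] = -np.inf                   # mask out the target column
    best_other = others.max(axis=1)
    keep = own >= threshold
    misused = (best_other > own) & (best_other >= threshold)
    return keep, misused

# Hypothetical loadings (columns: E, P, A components) for four variables.
loadings = [[0.85, 0.05, 0.02],   # an E scale used as intended
            [0.60, 0.10, 0.05],   # a P scale that tracks Evaluation
            [0.10, 0.45, 0.15],   # a P scale used as intended
            [0.05, 0.30, 0.25]]   # an A scale that loads higher on P
target = [0, 1, 1, 2]             # component each variable is supposed to measure
keep, misused = cull(loadings, target)
print(keep.tolist())     # [True, False, True, True]
print(misused.tolist())  # [False, True, False, True]
```

The fourth variable shows why the two rules differ: it passes the 0.20 threshold on its own dimension, yet it measures another dimension better.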
Respondent 21, a male, was excluded from summations because his ratings on all three dimensions showed so little correlation with the ratings of others.

Normal Practice vs. Culling Respondents
Culling respondents: Include a respondent's ratings on a particular scale only if that respondent-scale combination loaded 0.20 or higher on the component it was supposed to measure.
Ns: Evaluation, 19 males and 20 females. Potency, 6 males and 11 females. Activity, 8 males and 9 females.
(Table: example EPA means for "Rapist" ( ), males and females, under normal practice and under culling.)

The Difference Between Normal Practice and Culling Respondents: Interrelations Among EPA
(Figure panels: Means From All Respondents; Means From Selected Respondents.)

Correlations Among EPA
(Table: correlations among mE, mP, mA, fE, fP, fA for Means From All Respondents and for Means From Selected Respondents.)
Conclusions
Culling increases independence of EPA dimensions.
Culling lowers gender correlations on Potency and Activity.

Comparison with U.S.A. data where more respondents rated fewer stimuli
Percents of U.S.A. respondents who gave valid responses on each EPA scale when rating 100 stimuli (sexes combined). Valid: loading >= .2 and highest on target dimension.
(Table: percents valid on Evaluation, Potency, and Activity by subset and number of respondents, with minimum and median rows.)
Percent valid among the 40 Calcutta male and female respondents: 100% on E, 42% on P, 42% on A.
But here valid only meant loading >= .2; not necessarily highest on target dimension.
Overall, more U.S.A. respondents than Calcutta respondents gave valid responses.

Recapping
Virtually all respondents use Evaluation scales to assess the goodness-badness of stimuli, and that is true for both Calcutta and IU respondents.
Some respondents use Potency scales to assess powerfulness-powerlessness and Activity scales to assess liveliness-quietness: around 40% of Calcutta respondents, and 60%-80% of IU respondents.
But many respondents use Potency / Activity scales as if they were assessing Evaluation instead of Potency / Activity.
A few respondents use Potency scales as if they were assessing Activity, or vice versa.
Culling respondents who use scales inappropriately before computing normative measures reduces interdependency of the EPA dimensions. And increases independence between male and female norms on Potency and Activity.
How can sentiments be normative if not representing all respondents?
So normative measures should be based only on best respondents. Right?

The Social Psychology of Culled Respondents
Seven Calcutta respondents had Potency ratings that correlated above .5 with the Evaluation dimension (and had lower correlations with Potency and Activity). One had Potency ratings that correlated above .5 with Activity. The rest correlated most with Potency, correlated with nothing, or correlated with multiple dimensions.
Another seven Calcutta respondents had Activity ratings that correlated above .5 with the Evaluation dimension (and had lower correlations with Activity and Potency). Three had Activity ratings that correlated above .5 with Potency. The rest correlated most with Activity, correlated with nothing, or correlated with multiple dimensions.
Focus just on the respondents whose Potency or Activity ratings correlated mainly with Evaluation.

What goes on with the focal respondents?
Their judgment of potency/activity reduces to evaluation because they do not apprehend P / A.
How do such individuals manage to generate proper interpersonal behavior without a sense of potency or a sense of activity? The production of behavior is dependent on all three dimensions according to affect control theory.

They interpret adjectives that supposedly connote potency or activity as indicating goodness.
(Diagram: Potency pairs Huge/Minute, Powerful/Powerless, Big/Little, Strong/Weak and Activity pairs Fast/Slow, Industrious/Lazy, Alive/Dead, Thin/Thick, flanked by the Evaluation poles Beautiful, Lovely, Kind, Superior and Ugly, Repulsive, Cruel, Inferior. Arrow label: Evokes.)
Nevertheless they have potency and activity associations for each concept, and presumably those associations might be assessed with other adjectives.

They convert evaluations into assessments of potency and activity.
(Diagram: the same adjective layout. Arrow label: Implies.)
Which then are processed in the potency and activity systems.

Like most emerging issues, this one is in a fog.
They evaluate only.
They turn P and A into E.
They turn E into P and A.

And that's where we leave you!
Should the worst respondents be culled when assessing norms?
What goes on with respondents who collapse dimensions?
And ... Your turn