Instructor: Vincent Duffy, Ph.D.
Associate Professor of IE
Lecture 11 – Questionnaire Design & Evaluation
Thurs. Feb. 22, 2007
IE 486 Work Analysis & Design II


Page 1:

Instructor: Vincent Duffy, Ph.D.
Associate Professor of IE
Lecture 11 – Questionnaire Design & Evaluation
Thurs. Feb. 22, 2007
IE 486 Work Analysis & Design II

Page 2:

Administrative

• Briefly review QOTD answer sheet from L10

• Questionnaire design & analysis – (in preparation for Lab 3 beginning Friday)

• Lab 2 due in class next Tuesday, Feb. 27

Page 3:

QOTD – Lecture 11

• Q.1. Briefly discuss what is meant by criteria, measures & dependability.

• Q.2 What methods are appropriate for evaluating questionnaire data?

• Q.3 Should the same analyses be used for nominal/categorical data and continuous data?

Page 4:

Questionnaire design

• QOTD – included in slides
• Methods in Work Analysis & Design
  – Dependability of measures
  – Evaluating questionnaire data
    • Factor analysis
    • Internal consistency
  – Questionnaires
    • Examples & scales

Page 5:

Q.1 Briefly discuss what is meant by criteria, measures & dependability

This is related to methods of data collection & evaluation

• A criterion is an evaluation standard that can be used as a surrogate or correlate of outcome measures…such as system effectiveness, human performance and attitudes.

– E.g., for driving performance we previously considered stopping distance and time to lane change (or lane deviations)

Page 6:

Q.1 Briefly discuss what is meant by criteria, measures & dependability

• Important aspects of criteria include:
  – Relative strengths/weaknesses of the data collection methods
  – Balance between costs of methods & available resources
    • (E.g., consider the motion capture test bed.)
  – Costs include money, time, personnel & expertise
  – Relevance

Page 7:

Q.1 Briefly discuss what is meant by criteria, measures & dependability

• Relative strengths/weaknesses of the data collection methods (dependability)
• Today: aspects of criteria & methods of evaluation
• Dependability is related to validity & reliability
• Brief intro to dependability and measures of validity & reliability – more in lab tomorrow
• Important aspects of criteria include:
  – Relevance, Linearity and Homogeneity

Page 8:

Q.1 Briefly discuss what is meant by criteria, measures & dependability

• Relevance
  – Evaluation criteria must contribute to the overall system mission
    • E.g., speed, response time, errors, comfort, acceptance
• Linearity – usually assumed
  – However, for industrial performance vs. intelligence, initially no relationship was shown
    • Additional analysis showed: up to an IQ of 90, r = 0.46; between 90 and 110, r = 0.04; for IQ > 110, r = -0.52 (piecewise linear; see graph, and the sketch below)
  – (Non-linear) IQ initially contributes to performance, then has no impact, then too high an IQ appears related to boredom
  – See also the Salvendy & Carayon (1997) supplementary reading for more detail

[Graph: industrial performance vs. IQ]
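The piecewise check above can be reproduced by computing the correlation separately within each IQ band. A minimal sketch in Python, assuming hypothetical iq and performance arrays (the band cut-points 90 and 110 come from the slide; the data here are placeholders):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data: IQ scores and an industrial performance measure.
rng = np.random.default_rng(0)
iq = rng.normal(100, 15, size=300)
performance = rng.normal(50, 10, size=300)  # replace with real measurements

# Correlation within each IQ band, mirroring the piecewise analysis on the slide.
bands = [("IQ <= 90", iq <= 90),
         ("90 < IQ <= 110", (iq > 90) & (iq <= 110)),
         ("IQ > 110", iq > 110)]

for label, mask in bands:
    r, p = pearsonr(iq[mask], performance[mask])
    print(f"{label:15s}  r = {r:+.2f}  (p = {p:.3f}, n = {mask.sum()})")
```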

Page 9:

Q.1 Briefly discuss what is meant by criteria, measures & dependability

• Homogeneity – performance vs. time of day
  – Note how performance changes with time
    • There appears to be warm-up & slow-down for the same operator
    • The lack of consistency in performance over time of day raises issues of how/when to best collect data

Page 10:

Methods – evaluating questionnaire data

• Q.2 Which statistical technique can be used to evaluate questionnaire data?

Page 11:

Methods – evaluating questionnaire data

• Q.2 Which statistical technique can be used to evaluate questionnaire data?
  – It will depend on the objective of the analysis...

Page 12:

Methods – evaluating questionnaire data

• Which statistical technique becomes especially appropriate when questionnaire responses are measured on a nominal scale?

• First, what is meant by nominal scale?
  – Quality categorized as high, medium, or low (discrete data)
  – Machine breakdowns due to mechanical failure, electrical failure, or operator misuse
  – For analysis of discrete data you can use a Chi-Square analysis

Page 13:

Methods – evaluating questionnaire data

• Q.3 Which statistical technique becomes especially appropriate when questionnaire responses are measured on a nominal scale?

Page 14:

Methods – evaluating questionnaire data

• Q.3 Which statistical technique becomes especially appropriate when questionnaire responses are measured on a nominal scale?
  – A Chi-Square analysis. For example, you can test whether color choice is related to gender:

                Female           Male             Total
  Green         70               40               110 chose green
                38.9% of all     22.2% of all     61.1% chose green
                63.6% of green   36.4% of green
  Blue          30               40               70 chose blue
                16.7% of all     22.2% of all     38.9% chose blue
                42.9% of blue    57.1% of blue
  Total         100              80               180

  Is there a significant difference in color choice depending on gender? The chi-square statistic (shown after analysis of the original data set in SAS) gives χ² = 7.5, p = 0.006 (p < 0.05). Hence, we would conclude yes (see the sketch below).
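The χ² value above can be reproduced from the 2×2 counts with SciPy; this is a sketch, not the original SAS run. correction=False requests the plain Pearson chi-square (no Yates continuity correction), which matches the χ² ≈ 7.5, p ≈ 0.006 reported on the slide:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the slide: rows = color choice, columns = gender.
#                     Female  Male
observed = np.array([[70,     40],   # Green
                     [30,     40]])  # Blue

# Plain Pearson chi-square (no continuity correction) on the 2x2 table.
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, dof = {dof}")
print("expected counts:\n", expected.round(1))
```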

Page 15:

Methods – evaluating questionnaire data

• Q.3 How might a response scale be designed to maximize the chance the responses (on a questionnaire) will be on a ratio measurement scale?
  – Use continuous, numerical, and anchored scales, and do pilot testing; the average of the questions/items can give a continuous measure.
  – A factor analysis will likely be done before an ANOVA (test of differences) – see the sketch below.
  – Factor analysis tries to find a factor (you can think of it as a new variable) that provides the highest set of correlations with the original variables; the sum of the squared correlations is the factor's eigenvalue, and the first factor has the largest eigenvalue.
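One way to carry out the two steps above in practice: average each respondent's items into a composite score, then test for group differences with a one-way ANOVA (the factor-analysis step itself is illustrated a few slides later). A minimal sketch with hypothetical 7-point ratings for two designs; the group labels and numbers are invented for illustration:

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical 7-point ratings: rows = respondents, columns = questionnaire items.
design_a = np.array([[6, 7, 5, 6],
                     [5, 6, 6, 7],
                     [7, 6, 6, 5],
                     [4, 5, 6, 5]])
design_b = np.array([[3, 4, 4, 2],
                     [5, 4, 3, 4],
                     [2, 3, 4, 3],
                     [4, 4, 5, 3]])

# Composite (continuous) measure: mean of the items for each respondent.
scores_a = design_a.mean(axis=1)
scores_b = design_b.mean(axis=1)

# One-way ANOVA on the composite scores (test of differences between designs).
F, p = f_oneway(scores_a, scores_b)
print(f"F = {F:.2f}, p = {p:.4f}")
```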

Page 16:

Examples of Scales

• E.g., range from Very important – Very unimportant
  – With Neither important nor unimportant in the middle; could have 'Moderately important' in between
• Or… Strongly agree, agree, undecided, disagree, strongly disagree
• Very much, much, fair, a little, not at all
• Much worse than usual, worse than usual, about the same, better than usual, much better than usual
• Excellent, very good, good, fair, poor
• All, most, a good bit, some, little, none
• Always, very often, fairly often, sometimes, almost never, never

Page 17:

The Hackman & Oldham Job Satisfaction Survey (1975)

How satisfied are you with this aspect of your job?

__1. The amount of job security I have.

__2. The amount of pay and fringe benefits I receive.

__3. The amount of personal growth and development I get in doing my job.

__4. The people I talk to and work with on my job.

__5. The degree of respect and fair treatment I receive from my boss.

__6. The feeling of worthwhile accomplishment I get from doing my job.

__7. The chance I get to know other people while on the job.

Response scale: 1 = Extremely Dissatisfied, 2 = Dissatisfied, 3 = Slightly Dissatisfied, 4 = Neutral, 5 = Slightly Satisfied, 6 = Satisfied, 7 = Extremely Satisfied

Page 18:

Methods – evaluating questionnaire data

- Check the eigenvalue of each factor before deciding how many factors (variables) to include/consider.
- Eigenvalues should be greater than 1 for each factor that you include.
- Check the percent of variance explained by each factor by taking the eigenvalue divided by the # of items.
  - For example, if the eigenvalue (from linear algebra) is 3.66 with 6 questions (max. 6 units in the eigenvalue), then 3.66/6 = 0.61, so 61% of the total variance in the whole questionnaire is explained by the first factor.
- You would expect the factors you have chosen to represent important variables to explain a sum of at least 50% of the variance. (A short sketch of this eigenvalue check follows below.)
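A minimal sketch of this eigenvalue check in Python, using the eigenvalues of the items' correlation matrix; the response matrix X here is hypothetical and would be replaced by real questionnaire data:

```python
import numpy as np

# Hypothetical responses: 50 people x 6 items on a 7-point scale.
rng = np.random.default_rng(1)
X = rng.integers(1, 8, size=(50, 6)).astype(float)

# Eigenvalues of the correlation matrix (principal components of the items).
R = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

n_items = X.shape[1]
for i, ev in enumerate(eigenvalues, start=1):
    pct = ev / n_items                    # e.g. 3.66 / 6 = 0.61 -> 61% of total variance
    keep = "keep" if ev > 1 else "drop"   # retain factors with eigenvalue > 1
    print(f"Factor {i}: eigenvalue = {ev:.2f}, {pct:.0%} of variance -> {keep}")

retained = eigenvalues[eigenvalues > 1]
print(f"Variance explained by retained factors: {retained.sum() / n_items:.0%}")
```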

Page 19:

Methods – evaluating questionnaire data

(Recap: keep factors with eigenvalues > 1; percent of variance explained = eigenvalue / # of items, e.g. 3.66/6 = 0.61, so 61% of the total variance in the whole questionnaire is explained by the first factor.)

• An illustration of a principal components factor analysis with varimax rotation – item coefficients are shown below… (look for coefficients > 0.4)
• The values shown below are after rotation, and are easier to interpret:

                                     Factor 1   Factor 2
  Ques. 1  Feel blue                  -0.898      0.047
  Ques. 2  People stare at me         -0.165      0.935
  Ques. 3  People follow me           -0.222      0.926
  Ques. 4  Basically happy             0.905     -0.279
  Ques. 5  People want to hurt me     -0.549      0.544
  Ques. 6  Enjoy going to parties      0.647     -0.302

Note: an item/question that loads on two different factors (such as Question 5) would likely be dropped from further consideration; these factor analyses tend to be more useful when n is large (e.g., n > 100 participants). A rotated factor analysis sketch follows below.
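A sketch of extracting and varimax-rotating two factors in Python. It uses scikit-learn's FactorAnalysis with rotation="varimax" (maximum-likelihood extraction, so the numbers will not exactly match a principal-components run like the one on the slide), and the six-item response matrix is simulated from two made-up latent traits purely for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulated responses to six items driven by two latent traits (mood, paranoia).
rng = np.random.default_rng(2)
mood = rng.normal(size=150)
paranoia = rng.normal(size=150)
X = np.column_stack([
    -0.9 * mood,                   # Q1 Feel blue
     0.9 * paranoia,               # Q2 People stare at me
     0.9 * paranoia,               # Q3 People follow me
     0.9 * mood,                   # Q4 Basically happy
    -0.5 * mood + 0.5 * paranoia,  # Q5 People want to hurt me (cross-loads)
     0.6 * mood,                   # Q6 Enjoy going to parties
]) + rng.normal(scale=0.4, size=(150, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize so loadings are comparable

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
loadings = fa.components_                  # shape: (2 factors, 6 items)

for j in range(X.shape[1]):
    f1, f2 = loadings[0, j], loadings[1, j]
    flag = "  <- loads on both factors, consider dropping" if min(abs(f1), abs(f2)) > 0.4 else ""
    print(f"Ques. {j + 1}:  {f1:+.3f}  {f2:+.3f}{flag}")
```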

Page 20:

Methods – Dependability – evaluating questionnaire data (cont.)

- Factor analysis tries to find a factor (you can think of it as a new variable) that will provide the highest set of correlations with the original variables (the sum of the squares of these correlations is the factor's eigenvalue), producing the largest eigenvalue.

Page 21:

Structured questionnaire

– Anchor
– Avoid ambiguity
– Avoid leading questions
– Measure component
– Measure whole
– Sensitivity of scale
– Halo effect
– Dependability

Page 22:

Internal Consistency – testing reliability of the measures

• Internal consistency is the extent to which tests or procedures assess the same construct.

• It is a measure of the precision between the observers or of the measuring instruments used in a study.

• Cronbach’s Alpha measures how well a set of items (or variables) measure a single latent construct.

• Cronbach’s Alpha can be used as a measure of internal consistency.

Cronbach’s Alpha as a measure of Internal Consistency

Page 23:

General steps toward analysis using a questionnaire

• Step 1: Conduct the survey using the structured questionnaire

• Step 2: Analyze the collected data

• Step 3: Make recommendations; report and presentation of results

Page 24:

Examples of inferences that can be drawn from questionnaire data

• 1) Which generic features (across manufacturers and models) are liked most, are disliked most, or cause the greatest difficulties in usage?
• 2) Which manufacturer, and which model of that manufacturer, is preferred with regard to each feature by the surveyed customers?
• 3) Which generic features (within manufacturers) are liked most, are disliked most, or cause the greatest difficulties in usage?

Page 25:

Can a measure be valid if it is not reliable?

• A measure cannot be valid if it is not reliable.
  – If we cannot measure it consistently, it is hard to imagine that it can be correct.
  – Shown quantitatively…
• More in lab on… Reliability of Predictors and Criterion

R0 = RT × √(Rp × Rc)

Page 26:

• Reliability of Predictors and Criterion

– R0 = Observed correlation (Validity) between the predictor and criterion

– RT = “True” correlation (Validity) between the predictor and criterion

• The “True” correlation is one previously reported – possibly in the literature.

– Rp = Reliability of predictor

– Rc = Reliability of criterion

Criterion-related validity

R0 = RT × √(Rp × Rc)
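A small sketch that simply evaluates this relationship; the function name and the example numbers are mine (illustrative, not from the slide):

```python
import math

def observed_validity(r_true: float, r_predictor: float, r_criterion: float) -> float:
    """R0 = RT * sqrt(Rp * Rc): observed validity attenuated by unreliability."""
    return r_true * math.sqrt(r_predictor * r_criterion)

# E.g., a true validity of 0.60 measured with predictor/criterion reliabilities
# of 0.80 and 0.70 can only show an observed correlation of about 0.45.
print(round(observed_validity(0.60, 0.80, 0.70), 2))
```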

Page 27:

– Suppose there are four tests in a battery with reliabilities of R1=0.60, R2=0.70, R3=0.78 and R4=0.92

• (E.g., these could be internal consistency reliabilities.)

– and three criteria are utilized with the following reliabilities: r1=0.45, r2=0.60, and r3=0.75.

• In other words, these reliabilities (r) are believed to be the relationships between two (or more) measures intended to be measuring the same criterion.

– Then: (part of Lab exercise tomorrow)

Impact of Reliability on Validity

V = √[(R1 + R2 + R3 + … + RN) / N] × √[(r1 + r2 + r3 + … + rn) / n]
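A sketch that evaluates the composite-validity expression above with the reliabilities listed on this slide, assuming the averages-under-square-roots form shown; the worked computation itself is part of the lab exercise:

```python
import math

# Reliabilities of the four tests in the battery and the three criteria (from the slide).
R = [0.60, 0.70, 0.78, 0.92]   # predictor (test) reliabilities
r = [0.45, 0.60, 0.75]         # criterion reliabilities

# V = sqrt(mean test reliability) * sqrt(mean criterion reliability)
V = math.sqrt(sum(R) / len(R)) * math.sqrt(sum(r) / len(r))
print(f"mean R = {sum(R) / len(R):.2f}, mean r = {sum(r) / len(r):.2f}, V = {V:.2f}")
```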

Page 28:

Cronbach’s Alpha as a measure of Internal Consistency

α = (k² · s²p) / (k² · s²p + k · s²res)   [equivalently, s²p / (s²p + s²res / k)]

where:
- k is the number of items in the group,
- s²res is the variance of the residual components, which cannot be controlled,
- s²p is the variance component for persons.

* The alpha coefficient is interpreted as the ratio of true-score variance to observed-score variance.

Page 29:

Cronbach’s Alpha as a measure of Internal Consistency

• For example, suppose 13 people were asked to rate a pair of questions on a 7-point scale.

• The pair of questions look different but they are testing the same item.

• For example:
  – How much do you like the weather today?
  – How do you feel about the weather today?

Page 30:

People         Observed score (Q1)   Observed score (Q2)   Total (Yi.)
p1                     7                     6                 13
p2                     3                     5                  8
p3                     3                     3                  6
p4                     4                     3                  7
p5                     5                     5                 10
p6                     5                     3                  8
p7                     6                     6                 12
p8                     5                     3                  8
p9                     5                     5                 10
p10                    1                     2                  3
p11                    7                     6                 13
p12                    2                     2                  4
p13                    4                     5                  9
Total (Y.j)           57                    54                111 (Y..)

Layout of Data Sheet – calculation as part of lab exercise tomorrow (a short alpha computation sketch follows below)
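A minimal sketch of that calculation, using the common item-variance form of Cronbach's alpha, α = k/(k-1) * (1 - Σ s²item / s²total), which is algebraically equivalent to the variance-components form on the earlier slide for complete data like this:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: rows = people, columns = items (questions)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)        # variance of each question
    total_variance = items.sum(axis=1).var(ddof=1)    # variance of each person's total score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# The 13 x 2 observed scores from the data sheet above (Q1, Q2 for each person).
scores = np.array([[7, 6], [3, 5], [3, 3], [4, 3], [5, 5], [5, 3], [6, 6],
                   [5, 3], [5, 5], [1, 2], [7, 6], [2, 2], [4, 5]], dtype=float)

print(f"alpha = {cronbach_alpha(scores):.2f}")   # roughly 0.86 for these ratings
```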

Page 31:

Questionnaire Design

• Supplemental material in R.W. Bailey, Human Performance Engineering, 3rd Ed., pp. 559-568 (Appendix).
  – And webpage as shown: http://www.ucc.ie/hfrg/resources/qfaq1.html