EXAMPACK 2016 PYC2606
LSLSS
Table of Contents
Learning model
Basic measurement and questionnaire design
Content domain: Identify the content domain for a questionnaire
Suitability of a questionnaire as measuring instrument
2 Design a questionnaire
Layout of the questionnaire
3 Write questionnaire items
4 Pilot test the questionnaire
5 Evaluate reliability and validity
Validity
6 Compile a manual
7 Evaluate a questionnaire
8 Evaluate a manual
Learning model
Learning outcome: the purpose of learning, achieved by producing specific outcome products.
Outcome product: the result of the learning activities a learner engages in during the learning process. The learner produces outcome products during learning.
Production method: how an outcome product is produced. A series of actions constitutes an activity, and a series of activities forms a method.
Learning opportunity: an opportunity to work towards achieving the required learning outcomes by producing particular outcome products. A learning opportunity has three elements:
(a) an outcome product,
(b) the method for producing the outcome product, and
(c) a reference to the resource required for producing the outcome product.
The outcome product is the most important element: each activity of a learning opportunity is geared towards the production of the outcome product. An outcome product has to fulfil certain standards. If a learner produces an outcome product that fulfils certain minimum criteria, the learner can be declared competent.
Basic measurement and questionnaire design
Outcome product
A questionnaire and its mini-manual
Production method
1 Identify a suitable content domain
2 Design a questionnaire
3 Write items for the questionnaire
4 Pilot-test the questionnaire
5 Evaluate the questionnaire's reliability and validity
6 Compile a mini-manual for users of the questionnaire
7 Evaluate the questionnaire
8 Evaluate the mini-manual
Content domain Identify the content domain for a questionnaire
Outcome product
1 Identification of a content domain
2 A discussion on the suitability of a questionnaire as measuring instrument
Method
Activity 1.1 Describe the relevant content domain
Activity 1.2 Evaluate the suitability of a questionnaire as a measure for this content domain
Resource reference
1 Content domain Identify the content domain for a questionnaire
2 Suitability of a questionnaire as measuring instrument
The function of the questionnaire is measurement.
1 Identifying the focus of a questionnaire
The purpose of the questionnaire refers to what it intends to measure and for whom it will be used.
The first step is to identify the general topic of interest. Select a problem area within that topic that you want to investigate, then reduce the general problem to more specific questions. Do they all relate to each other? Can you combine them into one question? Should you rather choose one question and leave the others for separate questionnaires?
The content domain therefore consists of the tasks, behaviours, attitudes, etcetera related to one or more of these questions.
2 Limiting the scope of the questionnaire
Decide on what is relevant. By limiting the scope in this way you can cover your topic adequately without asking irrelevant questions, and still have a questionnaire that is relevant and not too long.
Suitability of a questionnaire as measuring instrument
The main purposes of a questionnaire are to
(1) obtain accurate factual information,
(2) provide a standard format for recording facts, comments and attitudes, and
(3) facilitate data processing.
The measuring instrument and approach you use depend on the topic you have chosen and the purpose of your investigation.
The questionnaire is ideal for collecting opinions, preferences and facts for a specific purpose from a defined set of respondents. Typical uses are a population census or an opinion poll. Educational and psychological questionnaires measure knowledge, interests and other constructs.
Exercise: Indicate whether a questionnaire would be a suitable measuring instrument.
For each topic: yes or no, and the reason.
1 Support for political parties. Yes. You want to find out facts.
2 Preference for different types of beer. Yes.
3 Typing skills. No. You are looking at a practical ability.
4 Opinions about the parole system. Yes. You are measuring attitudes.
5 Parenting practices. Yes. To examine effectiveness, however, use observation.
6 Effect of personality on intelligence. Yes and No. You can use a questionnaire to measure aspects of personality, but you need a separate test to measure intelligence. To examine the relation between personality and intelligence you would need the right kind of research design.
2 Design a questionnaire
Outcome product
A questionnaire specification document
Method
Activity 2.1 Decide on item format and scaling method
Activity 2.2 Decide on the total number of items
Activity 2.3 Design the layout for the questionnaire
Resource reference
Item format
Layout of the questionnaire
Specification document for a questionnaire
Item format
1 Closed questions
A closed question offers respondents a limited choice of alternative replies, whereas an open question allows respondents to answer in any way they want to. Types of closed question:
yes/no type
true/false type
multiple choice type
rating scales
1.1 Inventories and checklists
These are also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform, which makes it easier to compare people's answers. Closed questions are quicker to answer, and sensitive issues are often better addressed this way. The main disadvantage is that they force the respondent to answer in terms of the alternatives offered and nothing else. This can lead to a loss of spontaneity, and to a loss of rapport if respondents become irritated. Offering an additional option such as "other" helps. Closed questions can also direct the respondents' thinking and may influence their answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions invariably elicit some irrelevant and repetitious information, and answering them requires a considerable degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs, attitudes and values. These are complex issues that have to do with states of mind and are therefore more difficult to measure. They are usually multifaceted. To measure non-factual topics the tendency is therefore to use rating scales, on which respondents indicate the extent to which they agree or disagree. Ratings may be influenced by a person's mood on the day or by political events in the country at the time.
The following guidelines can be followed when compiling a rating scale:
1 Define the dimension being rated. Each item or statement to be rated must refer to only one thing or dimension. If you ask respondents to "Rate friendliness and efficiency", you are confusing two different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. An uneven number gives a neutral category in the middle, but people may tend to choose the neutral one (error of central tendency).
4 Define the different rating categories. They must be mutually exclusive: each rating category should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect different attitudes toward the topic in question. Their main function is to classify people with respect to a certain attitude.
3.1 Likert scales
Also known as a summated scale. A summated attitude scale may be described as a rating scale in which a subject indicates the extent to which he or she agrees (or disagrees) with statements. These statements usually deal with a social or political issue. The respondent marks the point that best reflects his or her attitude. The scores are added up to obtain a total score (hence summated scale). Ensure that the scale is uni-dimensional: all the items should measure the same dimension or topic. It is important to have both favourable and unfavourable statements so that you do not influence the respondent. There is usually the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It is a seven-point rating scale on which the scale points at each end are defined by opposing adjectives:
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect, the tendency for respondents not to evaluate each item individually but for their responses to be influenced by their general feeling of like or dislike. It is important that your two descriptors define the same construct. The semantic differential is useful when you want to obtain an idea of people's endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a balance of different types of questions in order to maintain the respondents' interest as well as to collect all the relevant information.
Items and their types
1 Item: Do you have a valid driver's licence? Yes / No
Type: Closed question (limited choice of answers).
2 Item: Why do people need to have a valid driver's licence?
Type: Open question (respondents state their own opinions; allows for any kind of answer).
3 Item: People should have a driver's licence (choose one answer)
[ ] for identification purposes
[ ] to prove that they can drive
[ ] in case they have an accident
Type: Closed question, because there is a limited choice of answers (multiple choice type).
4 Item: Young people are good drivers. True / False
Type: Closed question (limited choice of answers).
5 Item: Good drivers are
alert - - - - - relaxed
cautious - - - - - fast reactors
older - - - - - younger
Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives).
6 Item: Mark the characteristics of good drivers from the list below.
[ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
Type: Closed question, because there is a limited choice of answers (checklist/inventory).
7 Item: Are you a good driver? Rate your abilities as follows:
self confidence: a great deal 1 2 3 4 5 very little
experience: a great deal 1 2 3 4 5 very little
knowledge of road rules: a great deal 1 2 3 4 5 very little
Type: Rating scale, Likert type.
8 Item: Good drivers are ...
Type: Open question, because it allows respondents to give any kind of answer.
9 Item: Should the age for driver's licences be increased to 21?
Type: This looks like an open question but is a closed question, because it calls for a yes or no answer. It would be an open question if you asked "What is your opinion about increasing the driving age to 21 years?"
Action: Link item format and scaling method to the purpose and content of your questionnaire. Decide what kind of items to use in order to get the information you want.
Information required:
age: under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
gender: closed question (check male or female)
socio-economic status
personal experience of crime: a closed question with a yes/no answer; for how much or how often they personally experienced crime, use a multiple choice item or a rating scale; for a general description, use an open-ended question
levels of stress associated with different crimes: rating scale
personal reactions to different crimes: a simple open-ended question, or you might try a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain: a specification document is really just a list of the required characteristics for your questionnaire, in terms of type of items, number of items, layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to follow, the kind of questions you will ask, the level of language you will use, how complex the questions are, and so on. In this way the purpose of the investigation, the kind of information you want and the characteristics of the respondents influence the questionnaire specifications. The detailed specification of measurement aims should be clearly related to the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the questionnaire is too long or boring. Identify the extent to which each content area (the information you need) needs to be covered, then consider the characteristics of your respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least one item on each of these content areas. In some cases one item is not enough. For example, if you want information on stress levels associated with different crimes, you might want to use a rating scale. Rating scales do not have a fixed number of items, but for the purposes of this assignment your rating scale should consist of at least twelve items. It is also useful to have more than one item dealing with the same aspect to serve as a control, so that you can see whether the respondent is answering questions consistently or not. For example, in addition to your rating scale you might also have an open-ended question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of respondents and the time available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be grouped into a rating scale containing approximately twelve items). We could break down the coverage of the content areas as follows: the first three items would be closed questions to collect biographical information; then a filter question (closed, yes/no type), followed by an open question on personal experience of crime; a rating scale (consisting of twelve items) on levels of stress associated with different crimes; a closed (multiple choice) question on personal reactions to crimes and an open question to serve as a control; an open question on perceptions of the effect of crime; and lastly an open question for any other comments the respondent may wish to add. You therefore have five closed items, four open items and a twelve-item rating scale (a total of 21 items). The questionnaire should not be too long or complicated.
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be completed. The introduction informs respondents about the purpose, convinces them that their participation is valued, motivates them to complete the questionnaire, reduces their fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire. Include:
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information: respondents have greater freedom to express themselves without fear that their responses will be used in a way that is not in their interests. This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The appropriate length depends on the topic and the degree of interest it holds for the respondent. Ideally the questionnaire takes 30 minutes to complete. Length also depends on the characteristics of the respondents. Specialists are more willing to complete a longer questionnaire. For people with low levels of literacy or education it is better to keep questionnaires short. Make sure that each question is directly relevant, but remember that you need thorough coverage of your topic to ensure reliability and validity. The aim is to strike a balance between a concise questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the twelve-item rating scale (moving from the general to the more specific). This is the funnel approach.
3 It is better to put personal data questions near the end, preceded by a short explanation such as "To help us classify your answers". Items on biographical information can go at the beginning if there are only a few, but if there are a lot of items they are better placed at the end.
4 You probably have groups of questions relating to particular aspects of your main topic. Decide on the order in which to present these groups of questions. There are two main considerations: the logic of the survey and the likely reactions of the respondents. Start off with 'awareness' questions relating to the topic in general, followed by 'factual' questions dealing with the respondents' own actions or behaviour. Then you might include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to avoid embarrassing or offending the respondents. A closed question and an open question serve as a sort of validity check for this content area.
6 Place one or more open-ended questions at the end to allow the respondents to express opinions or feelings that are related to the topic but have not been covered by the questions. Respondents are then more likely to feel satisfied that answering the questions was worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering irrelevant questions, for example "If the answer is no, skip the next few questions".
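The skip rule behind a filter question can be sketched in code. This is only an illustration and not part of the study material: the item wording and the function name `items_to_present` are hypothetical, and the sketch assumes a single yes/no filter that guards one fixed block of follow-up items.

```python
# Hypothetical items: a yes/no filter guards the follow-up block that only
# makes sense for respondents who answered "yes" to the filter question.
FOLLOW_UP_ITEMS = [
    "Describe what happened.",
    "How often has this happened to you?",
]
GENERAL_ITEMS = ["Any other comments?"]


def items_to_present(filter_answer: str) -> list[str]:
    """Return the items a respondent should see after the filter question."""
    if filter_answer.strip().lower() == "yes":
        return FOLLOW_UP_ITEMS + GENERAL_ITEMS
    # Any other answer skips the follow-up block entirely.
    return list(GENERAL_ITEMS)


print(items_to_present("No"))   # only the general item is shown
print(items_to_present("Yes"))  # follow-up block plus the general item
```

On a paper questionnaire the same rule is simply a printed instruction next to the filter item.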
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a researcher has found in existing theories, from a need to solve a practical problem, or just from personal curiosity and intuition. Good items are critical to the success of a research project. They produce reliable data and accurate information upon which valid conclusions can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to measure.
2 Constructing items is both a science and an art. As a science it requires an in-depth knowledge of one's topic and familiarity with the principles governing good item design. As an art it requires creativity in selecting or constructing items appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and validity have already been established. Even so, you must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your respondents.
If you are not sure whether items would be easily understood, do present them to a small group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and cultural backgrounds. Be sure that the questionnaire does not contain phrases that have different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways:
Visiting lecturers can help one feel less isolated.
(Does this mean that the lecturers do the visiting, or do the students?)
Don't ask questions with two inherent issues:
I am fully occupied and I don't feel lonely.
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult to understand and therefore more difficult to respond to:
It is believed by students that they will be given extension by lecturers.
The following is simpler:
Students believe lecturers will give them extension.
Do ask specific questions rather than general or vague questions. General items may not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer:
Discrimination in South Africa is horrific, isn't it?
Don't give examples unless it is really necessary:
Do you use any word processing packages such as ZZ?
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
The tendency to choose what one believes to be the most socially acceptable response. Deliberate faking occurs when respondents are fully aware of what is being measured and for what purpose, when their identity is disclosed, and when they are aware that their responses will affect them in some way. Respondents may also lie to protect their real feelings, to justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
The tendency to make a particular type of response: some respondents tend to choose extreme responses, while others repeatedly choose central responses. Design balanced questionnaires that include items that are positively stated and negatively scored, eg "People often let me down", as well as items that are positively stated and positively scored, eg "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Identify the shortcomings of the following questions.
1 What is your income?
Shortcoming: vague.
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and drinking? (Yes/No)
Shortcoming: a leading question which contains two inherent issues and a double negative.
3 We should be less passive about what is happening in the environment. (agree/uncertain/disagree)
Shortcoming: vague. "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
Shortcoming: two inherent issues. 'Depression' and 'often' may mean different things to different people.
5 How often do you take drugs? (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways.
6 Abortion should not be legalised. (agree/disagree)
Shortcoming: too global.
7 Most men are more emotionally stable than most women are. (agree/disagree)
Shortcoming: a leading question.
8 Suppose you are measuring Unisa students' level of motivation and one of your items is "How many hours do you spend studying each week?"
Shortcoming: does not necessarily relate to motivation.
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond to each item. In particular, you will use simple item analysis techniques to improve the 12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try it out on. The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you are doing. The answer to each of the questions should be YES.
Ethical checklist (answer Yes/No for each point and add notes)
Respondents understand why they are being asked to complete the questionnaire:
  It is to help you with your studies
  It is to help you improve the questionnaire
  They will not get their scores back
Respondents understand that they don't have to complete the questionnaire:
  You won't hold it against them if they decide not to do it
  You won't tell anybody else if they refused
Respondents understand that their responses will be confidential:
  Their names won't be on the questionnaire
  You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can make improvements. When you get the questionnaire back, quickly scan it to see that they have completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is to use what happened during the study. Look again at the notes you made while you were administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis consists of procedures to select the best items for inclusion. Commonly used criteria are item difficulty (also called item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item analysis involves discarding items that are too easy or too difficult. The difficulty index for an item is usually calculated by dividing the number of people who gave a correct response by the total number of people in the sample. The difficulty index should be between 0.25 and 0.75, and the average difficulty should be about 0.5.
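As a worked illustration of the difficulty index, the sketch below divides the number of correct responses by the sample size and checks the result against the 0.25 to 0.75 band. The function names and the sample counts are made up for the example.

```python
def difficulty_index(correct: int, sample_size: int) -> float:
    """Number of correct responses divided by the number of respondents."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return correct / sample_size


def is_acceptable(index: float, low: float = 0.25, high: float = 0.75) -> bool:
    """True if the index lies inside the recommended band."""
    return low <= index <= high


# Hypothetical pilot sample of 20 respondents: correct answers per item.
correct_counts = {"item 1": 19, "item 2": 12, "item 3": 3}

for name, correct in correct_counts.items():
    index = difficulty_index(correct, 20)
    verdict = "keep" if is_acceptable(index) else "too easy or too difficult"
    print(f"{name}: {index:.2f} ({verdict})")
```

With these made-up counts, item 1 (0.95) is too easy, item 3 (0.15) is too difficult, and only item 2 (0.60) falls inside the band.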
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according to whatever the measuring instrument as a whole is measuring. Items should only be selected if they measure the same characteristic, otherwise the instrument loses focus. The higher the correlation coefficient between the item score and the total score, the more discriminating the item. A minimum correlation of 0.2 is generally required. Items with negative or zero correlations are almost always excluded. A negative correlation could indicate that an item should have been reverse scored.
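The decision rule for item discrimination can be written out as a small sketch. It assumes the item-total correlations have already been calculated; the function name `judge_item` and the example values are hypothetical.

```python
def judge_item(item_total_r: float, minimum: float = 0.2) -> str:
    """Apply the rule of thumb: keep items that correlate at least
    `minimum` with the total score; treat negative values as a warning."""
    if item_total_r >= minimum:
        return "keep"
    if item_total_r < 0:
        return "exclude, or check whether the item should be reverse scored"
    return "exclude"


# Hypothetical item-total correlations for three items.
for name, r in {"item 1": 0.55, "item 2": 0.05, "item 3": -0.40}.items():
    print(name, "->", judge_item(r))
```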
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics helps test constructors to identify items that perform differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item in the questionnaire. Remove items with too little variance and remove items that don't discriminate. You only need to perform an item analysis on the rating scale part of your questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a number from 1 to 5.
Response options: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                                Option ticked
I like loud music                        5 Always
I prefer quiet places                    1 Never
I enjoy noisy environments               4 Most of the time
The ticked options are known as item responses. Item 2 in our example should be 'reverse-scored', because if somebody says she never likes quiet places she is in effect saying that she always likes noisy places, and she should therefore get a high score.
Statement                                Option ticked         Item score
I like loud music                        5 Always              5
I prefer quiet places (REVERSE SCORE)    1 Never               5
I enjoy noisy environments               4 Most of the time    4
The data sheet is divided into rows and columns: one row for each person in your sample, and one column for each item in your rating scale plus one for the total score. Take a questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest score anybody can have on the 12-item rating scale is 12 and the highest is 60.
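Reverse scoring and the total score can be sketched as follows. This is a minimal illustration for a 1 to 5 scale; the function names are hypothetical, and the three responses are the ones from the loud music example.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(response: int) -> int:
    """Mirror a response on the 1 to 5 scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


def total_score(responses: list[int], reversed_positions: set[int]) -> int:
    """Add up item scores, reverse scoring the items at the given 0-based positions."""
    return sum(
        reverse_score(r) if i in reversed_positions else r
        for i, r in enumerate(responses)
    )


# Loud music: Always (5); quiet places: Never (1); noisy environments: Most of the time (4).
raw = [5, 1, 4]
print(total_score(raw, reversed_positions={1}))  # 5 + 5 + 4 = 14

# Range of the total on a 12-item scale: 12 * 1 = 12 up to 12 * 5 = 60.
print(12 * SCALE_MIN, 12 * SCALE_MAX)
```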
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
Action: Start with the actual item analysis of your rating scale. Find items with too little variance, that is, items where almost everybody in the sample gets the same item score. You want your scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they show.
Run your eye down each of the columns on your data sheet and look for items that may not have sufficient variance. If a column contains mainly one number, the item doesn't show much variance. If a column contains a good spread of numbers, the item shows lots of variance.
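The same column-by-column check can be done mechanically. The sketch below flags items where one score dominates the column. The 80% cut-off is an assumption chosen for illustration (the notes give no numeric threshold), and the data sheet is made up.

```python
from collections import Counter


def dominant_share(scores) -> float:
    """Proportion of respondents who gave the most common score for an item."""
    most_common_count = Counter(scores).most_common(1)[0][1]
    return most_common_count / len(scores)


def low_variance_items(data_sheet, threshold: float = 0.8) -> list[int]:
    """Return 1-based item numbers whose most common score reaches the threshold.

    data_sheet holds one row per respondent and one column per item.
    """
    columns = zip(*data_sheet)  # turn rows of respondents into columns of items
    return [
        number
        for number, column in enumerate(columns, start=1)
        if dominant_share(column) >= threshold
    ]


# Hypothetical data sheet: 5 respondents, 3 items.
sheet = [
    [5, 3, 1],
    [5, 4, 2],
    [5, 2, 5],
    [5, 3, 4],
    [4, 1, 3],
]
print(low_variance_items(sheet))  # item 1 is flagged: four of five respondents scored 5
```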
It is not always possible to explain why most people end up answering an item in the same way. The item may have been too extremely worded or too vague, or there may be a strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain observable behaviour, eg anxiety, intelligence, stress, independence, etcetera. The correlation coefficient describes the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0 represents no correlation. The numerical size of a correlation coefficient indicates the strength of the relationship, while the sign (positive or negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation. If there is a perfect positive relation between two constructs (a correlation coefficient of +1), the dots form a perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots form a perfectly straight line with a downward slope. No relation between two constructs (a correlation coefficient of 0) results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but even if the items show lots of variance the scale may not measure anything in particular. Ensure that each item in the scale measures more or less the same thing and that the items are not too divergent. You want an item to discriminate between high and low scorers because it shows that the item measures more or less the same thing as the other items in the scale.
Case    Item 3    Total score
1       4         53
2       1         12
3       5         49
4       4         40
5       2         16
(Scores are shown for item 3 only; the full data sheet has a column for each of items 1 to 12.)
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
Figure: Relation between the scores on item 3 and the scale total (scatter plot)
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong positive correlation
between the item score and the total score - the item discriminates well. If the dots don't
seem to form a pattern at all, then there is no correlation; if the line seems to go from top
left to bottom right, then there is a negative correlation (the item does discriminate, but
the wrong way round, so it is no good). You will have to draw 12 scatter plots (one for each
of the 12 items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale should be more coherent and have a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the scores on the second occasion were identical
to those on the first; it indicates that each person's position relative to the others in the
group stayed the same. The time interval should be at least several days to reduce the
possibility of effects such as familiarity with the type of items or respondents remembering
their answers.
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to the possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the two halves tends to underestimate the reliability of the whole
questionnaire. The estimated reliability of the whole questionnaire is called the split-half
reliability, and it measures the degree of equivalence between the two halves, that is, the
extent to which they measure the same attribute. It reflects the consistency and indicates
the degree of relatedness of the items. This type of reliability is therefore also regarded as
a measure of internal consistency, and the closer the split-half reliability is to 1, the
higher the internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items included
in each of the two halves.
Action: Look at these eight items and re-number them from 1 to 8. Then divide the rating
scale into two halves by grouping the odd items and the even items together. For each
person calculate the total score for the odd items and for the even items.
Person    Items 1 3 5 7    Total score (odd)    Items 2 4 6 8    Total score (even)
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
that follows the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that is,
the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them. At the end of the course you could obtain the students'
examination marks. You will then determine how effective scores on your questionnaire are in
predicting the students' examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of a
single validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this
construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - the extent to which one can draw conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration and the scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country the questionnaire has been developed, and for which age group if age
is relevant to the subject matter of the questionnaire. A brief description of the design of
the questionnaire should be provided. The type of items should also be indicated, for example
multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those characteristics
that define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents can complete it on their own or supervision is needed, the material
needed, and how to deal with a person asking for an explanation. Provide instructions for
scoring. A correct answer could score 1 and an incorrect answer 0. For rating scales, for
example a five-point rating scale, 1 = do not agree at all and 5 = strong agreement with the
statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
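The scoring instructions for a five-point rating scale, including reverse scoring, could be written out as in the sketch below. The answers and the choice of which items are reverse-scored are invented for illustration. The reversal rule (1 becomes 5, 2 becomes 4, and so on) is the usual convention and is assumed here, as the notes do not spell it out.

```python
SCALE_MIN = 1  # 1 = do not agree at all
SCALE_MAX = 5  # 5 = strong agreement


def reverse_score(answer):
    """Mirror an answer on the scale: 1 becomes 5, 2 becomes 4, 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - answer


def total_score(answers, reversed_items):
    """Add up the answers, reversing the items whose numbers are in reversed_items.

    answers: ratings in item order (item 1 first).
    reversed_items: set of item numbers (starting at 1) that are reverse-scored.
    """
    total = 0
    for number, answer in enumerate(answers, start=1):
        if not SCALE_MIN <= answer <= SCALE_MAX:
            raise ValueError(f"item {number}: answer {answer} is outside the scale")
        total += reverse_score(answer) if number in reversed_items else answer
    return total


# Hypothetical respondent; items 2 and 4 are assumed to be negatively worded.
example_answers = [5, 1, 4, 2]
example_total = total_score(example_answers, reversed_items={2, 4})
print("Total score:", example_total)
```

Reversing negatively worded items before adding them up ensures that a high total always points in the same direction.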
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire (that
is, how the questions should be tackled). The instructions can be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 purpose explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 purpose explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 confidentiality not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but respondents are
required to provide their name
Rate 3 absolutely confidential and identity will not be disclosed, and respondents are not
required to provide their name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding, sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses on the topic
Rate 2 focusses on the topic but the topic is not covered in full
Rate 3 focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two of these problems
Rate 3 if the sequence seems to cause any one of these problems
Rate 4 if the sequence does not seem to cause any of the problems
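The rating rule for the item sequence amounts to counting how many of the three problems are present. A minimal sketch of that rule follows. The function name and the short labels for the three problems are invented for illustration.

```python
# Short labels (invented) for the three sequence problems listed above.
SEQUENCE_PROBLEMS = {"response_bias", "inefficient", "uncomfortable"}


def rate_item_sequence(has_relevant_items, problems):
    """Return the rating (0 to 4) for the questionnaire item sequence.

    has_relevant_items: False if there are no items or all items are irrelevant.
    problems: set of labels from SEQUENCE_PROBLEMS that the sequence seems to cause.
    """
    unknown = set(problems) - SEQUENCE_PROBLEMS
    if unknown:
        raise ValueError(f"unknown problem labels: {sorted(unknown)}")
    if not has_relevant_items:
        return 0
    # Three problems give 1, two give 2, one gives 3, none gives 4.
    return 4 - len(set(problems))


rating_one_problem = rate_item_sequence(True, {"inefficient"})
print("Rating with one problem:", rating_one_problem)
```

A sequence that only forces respondents to answer inefficiently, for example, is rated 3.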
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c)
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
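The rating for the logical sequence depends on two things: the rating already given for the purpose areas covered, and whether the areas appear in the logical order. The rule above can be sketched as a small function. The function and parameter names are invented for illustration.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Return the rating (0 to 4) for the logical sequence of the presentation.

    purpose_area_rating: rating (0 to 3) given for the purpose areas covered.
    in_logical_sequence: True if the covered areas follow the order (a), (b), (c).
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


example_rating = rate_logical_sequence(3, in_logical_sequence=False)
print("Rating:", example_rating)
```

A manual that covers all three purpose areas but presents them out of order is therefore rated 3, not 4.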
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 content topics of one purpose area are kept in their logical group
Rate 3 content topics of two purpose areas are kept in their logical groups
Rate 4 content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a), (b) and (c)
Rate 3 any one of (a), (b) and (c)
Rate 4 none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but the characteristics of the sample group differ from
those of the target population
Rate 2 describes the sample used and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
The questionnaire has 50 multiple-choice items of the form: If <description of work
situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but
that the remaining 82 items did not meet the criteria to be included. Thirteen were excluded,
thus 50 items were retained. The validity coefficient of 0.91 is high.
Learning model
° Learning outcome: the purpose of learning, achieved by producing specific outcome
products
° Outcome product: the result of the learning activities a learner engages in during the
learning process. The learner produces outcome products during learning.
° Production method: how an outcome product is produced. A series of actions constitute an
activity and a series of activities form a method.
° Learning opportunity: work towards achieving the required learning outcomes by
producing particular outcome products. A learning opportunity has three elements:
(a) an outcome product
(b) the method for producing the outcome product and
(c) a reference to the resource required for producing the outcome product
The outcome product is the most important element: each activity of a learning opportunity
is geared towards the production of the outcome product. An outcome product has to fulfil
certain standards. If a learner produces an outcome product that fulfils certain minimum
criteria, the learner can be declared competent.
Basic measurement and questionnaire design
Outcome product
A questionnaire and its mini-manual
Production method
1 Identify a suitable content domain
2 Design a questionnaire
3 Write items for the questionnaire
4 Pilot-test the questionnaire
5 Evaluate the questionnaire’s reliability and validity
6 Compile a mini-manual for users of the questionnaire
7 Evaluate the questionnaire
8 Evaluate the mini-manual
Content domain: Identify the content domain for a questionnaire
Outcome product
1 Identification of a content domain
2 A discussion on the suitability of a questionnaire as measuring instrument
Method
Activity 1.1: Describe the relevant content domain
Activity 1.2: Evaluate the suitability of a questionnaire as a measure for this content
domain
Resource reference
1 Content domain: Identify the content domain for a questionnaire
2 Suitability of a questionnaire as measuring instrument
The function of the questionnaire is measurement.
1 Identifying the focus of a questionnaire
The purpose of the questionnaire refers to what it intends to measure and for whom it will
be used.
The first step is to identify the general topic of interest, select a problem area within that
topic that you want to investigate, and reduce the general problem to more specific questions.
Do they all relate to each other? Can you combine them into one question? Should you
rather choose one question and leave the others for separate questionnaires?
The content domain therefore consists of the tasks, behaviours, attitudes, etcetera related
to one or more of these questions.
2 Limiting the scope of the questionnaire
Decide on what is relevant. By limiting the scope in this way you can cover your topic
adequately without asking irrelevant questions, and still have a questionnaire that is relevant
and not too long.
Suitability of a questionnaire as measuring instrument
The main purposes of a questionnaire are to
(1) obtain accurate factual information,
(2) provide a standard format for recording facts, comments and attitudes, and
(3) facilitate data processing.
The measuring instrument and approach you use depend on the topic you have chosen
and the purpose of your investigation.
The questionnaire is ideal for collecting opinions, preferences and facts for a specific
purpose from a defined set of respondents; a typical use is a population census or an opinion
poll. Educational and psychological questionnaires measure knowledge, interests and other
constructs.
Exercise: Indicate whether a questionnaire would be a suitable measuring instrument
TOPIC | YES OR NO | REASON
1 support for political parties | Yes | You want to find out facts
2 preference for different types of beer | Yes |
3 typing skills | No | looking at a practical ability
4 opinions about the parole system | Yes | attitudes
5 parenting practices | Yes | to examine effectiveness, use observation
6 effect of personality on intelligence | Yes and No | use a questionnaire to measure aspects of personality; you need a separate test to measure intelligence; to examine the relation between personality and intelligence you would need the right kind of research design
2 Design a questionnaire
Outcome product
A questionnaire specification document
Method
Activity 2.1: Decide on item format and scaling method
Activity 2.2: Decide on the total number of items
Activity 2.3: Design the layout for the questionnaire
Resource reference
Item format
Layout of the questionnaire
Specification document for a questionnaire
Item format
1 Closed questions
A closed question offers respondents a limited choice of alternative replies, whereas an open
question is one that allows the respondents to answer in any way they want to.
Types of closed question:
yes/no type
true/false type
multiple choice type
rating scales
1.1 Inventories and checklists
Also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform, which makes it easier to compare people’s
answers. Closed questions are quicker to answer, and sensitive issues are often better
addressed. The main disadvantage is that they force the respondent to answer in terms of the
alternatives offered and nothing else. This can mean a loss of spontaneity, and a loss of
rapport if respondents become irritated. To counter this, offer an additional option such as
“other”. Closed questions can also direct the respondents’ thinking and may influence their
answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions
invariably elicit some irrelevant and repetitious information, and answering them requires a
considerable degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs,
attitudes and values. These are complex issues that have to do with states of mind and are
therefore more difficult to measure. They are usually multifaceted. Therefore, to measure
non-factual topics the tendency is to use rating scales, on which respondents indicate the
extent to which they agree or disagree. Ratings may be influenced by a person’s mood on the
day or by political events in the country at the time.
The following guidelines can be followed when compiling a rating scale:
1 Define the dimension being rated. Each item or statement to be rated must refer to only
one thing or dimension. If you ask respondents to “Rate friendliness and efficiency”, you are
confusing two different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. An uneven number gives you
a neutral category in the middle, but people may tend to choose the neutral one (error
of central tendency).
4 Define the different rating categories. They must be mutually exclusive - each rating
category should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect
different attitudes toward the topic in question. Their main function is to classify people
with respect to a certain attitude.
3.1 Likert scales
Also known as a summated scale. “A summated attitude scale may be described as a rating
scale in which a subject indicates the extent to which he or she agrees (or disagrees) with
statements.” These statements usually deal with a social or political issue. The respondent
marks the point that best reflects his or her attitude. The scores are added up to obtain a
total score (summated scale). Ensure that the scale is uni-dimensional - all the items
measure the same dimension or topic. It is important to have both favourable and
unfavourable statements so that you do not influence the respondent. Respondents usually
have the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It is a seven-point rating scale on which
the scale points at each end are defined by opposing adjectives:
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people’s endorsement of certain attributes.
Activity 2.1: Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents’ interest as well
as to collect all the relevant information.
Items and their types
1 Do you have a valid driver’s licence? Yes / No
Type: Closed question - limited choice of answers
2 Why do people need to have a valid driver’s licence?
Type: Open question - respondents state their own opinions and any kind of answer is allowed
3 People should have a driver’s licence (choose one answer)
[ ] for identification purposes
[ ] to prove that they can drive
[ ] in case they have an accident
Type: Closed question because there is a limited choice of answers (multiple choice type)
4 Young people are good drivers. True / False
Type: Closed question - limited choice of answers
5 Good drivers are
alert - - - - - relaxed
cautious - - - - - fast reactors
older - - - - - younger
Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
6 Mark the characteristics of good drivers from the list below
[ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
Type: Closed question because there is a limited choice of answers (checklist/inventory)
7 Are you a good driver? Rate your abilities as follows (a great deal - very little: 1 2 3 4 5)
self confidence
experience
knowledge of road rules
Type: Rating scale, Likert type
8 Good drivers are
Type: Open question because it allows respondents to give any kind of answer
9 Should the age for driver’s licences be increased to 21?
Type: This looks like an open question but is a closed question - it calls for a yes or no answer. It would be an open question if you asked “What is your opinion about increasing the driving age to 21 years?”
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
age - under 18 years / 18 - 22 years / 23 - 35 years / 36 - 50 years
gender - closed (check male or female)
socio-economic status
personal experience of crime - closed question with a yes/no answer
how much or how often they personally experienced crime - use a multiple choice item or a
rating scale
general description - use an open ended question
levels of stress associated with different crimes - rating scale
personal reactions to different crimes - simple open ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain: A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you will use, how complex the
questions are, and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2: Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open ended question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed, yes/no type)
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime, and lastly an open question for any
other comments the respondent may wish to add. You would therefore have five closed items,
four open items and a twelve-item rating scale (a total of 21 items). The questionnaire should
not be too long or complicated.
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire:
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information: respondents have greater freedom to express
themselves without fear that their responses will be used in a way that is not in their
interests. This is important in surveys that involve ‘sensitive’ topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent. Ideally
the questionnaire takes 30 minutes to complete. The length also depends on the
characteristics of the respondents. Specialists are more willing to complete a longer
questionnaire. For people with low levels of literacy or education it is better to keep
questionnaires short. Make sure that each question is directly relevant, but you also need
thorough coverage of your topic to ensure reliability and validity. The aim is to strike a
balance between a concise questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents’ minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve-item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as “To help us classify your answers”. Items on biographical information: if there are
only a few items they can go at the beginning, but if there are a lot of items they are better
at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with ‘awareness’ questions relating to the topic in general, followed by ‘factual’
questions dealing with the respondents’ own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the questions.
Respondents are then more likely to feel satisfied that answering the questions was worth the
effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions: if the answer is no, the respondent skips the next few questions.
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1: Apply criteria for writing questionnaire items
Activity 3.2: Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is both a science and an art. As a science it requires an in-depth
knowledge of one’s topic and familiarity with the principles governing good item design. As
an art it requires creativity in selecting or constructing items appropriate to the particular
context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid items
that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. You must still critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do’s and Don’ts
Do read each item and ask yourself if the item relates to your topic.
Don’t be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don’t use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents’ perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways:
Visiting lecturers can help one feel less isolated.
(Does this mean that the lecturers do the visiting - or do the students?)
Don’t ask questions with two inherent issues:
I am fully occupied and I don’t feel lonely.
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions ‘and’ or ‘or’ to see if they contain
more than one possible issue.
Wherever possible, don’t use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to:
It is believed by students that they will be given extension by lecturers.
The following is simpler:
Students believe lecturers will give them extension.
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct ‘closed’ items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don’t write items that encourage respondents to give a particular answer:
Discrimination in South Africa is horrific, isn’t it?
Don’t give examples unless it is really necessary:
Do you use any word processing packages such as ZZ?
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
The tendency to choose what one believes to be the most socially acceptable response.
Deliberate faking occurs when respondents are fully aware of what is being measured and for
what purpose, when their identity is disclosed, and when they are aware that their responses
will affect them in some way. Respondents may also lie to protect their real feelings or
justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
The tendency to make a particular type of response: some respondents tend to choose
extreme responses, while others repeatedly choose central responses. Design balanced
questionnaires that include items that are positively stated and negatively scored, eg
“People often let me down”, as well as items that are positively stated and positively
scored, eg “I trust people”.
Activity 3.1: Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions:
1 What is your income?
Shortcoming: vague
2 Don’t you disagree with yesterday’s Parliamentary decision regarding smoking and
drinking? (Yes/No)
Shortcoming: leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
Shortcoming: vague - “Who is ‘we’?”, “Less passive than what?” and “What environment?”
4 I feel depressed and sad. (never/sometimes/often/all the time)
Shortcoming: two inherent issues; ‘depression’ and ‘often’ may mean different things to
different people
5 How often do you take drugs? (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways
6 Abortion should not be legalised. (agree/disagree)
Shortcoming: too global
7 Most men are more emotionally stable than most women are. (agree/disagree)
Shortcoming: is a leading question
8 Suppose you are measuring Unisa students’ level of motivation and one of your items is
“How many hours do you spend studying each week?”
Shortcoming: does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1: Administer and revise the questionnaire
Activity 4.2: Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1: Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire out
on. The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one’s scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the ethical checklist should be YES.
Ethical checklist (for each point record Yes/No and any notes)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don’t have to complete the questionnaire
- You won’t hold it against them if they decide not to do it
- You won’t tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won’t be on the questionnaire
- You won’t show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis refers to procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
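The difficulty index described above can be sketched as follows. This is only an illustration: it assumes each response is coded 1 for correct and 0 for incorrect, and the function names and the sample data are hypothetical, not part of the study material.

```python
def difficulty_index(item_responses):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(item_responses) / len(item_responses)


def within_difficulty_range(index, low=0.25, high=0.75):
    """True if the index falls inside the 0.25 to 0.75 band given in the text."""
    return low <= index <= high


# Hypothetical responses from ten people to three items.
responses = {
    "item_a": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # nearly everybody correct: too easy
    "item_b": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # about half correct
    "item_c": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # nearly nobody correct: too difficult
}

indices = {name: difficulty_index(scores) for name, scores in responses.items()}
retained = [name for name, p in indices.items() if within_difficulty_range(p)]
print(indices)
print(retained)
```

Only item_b, with an index of 0.5, falls inside the band; the other two would be candidates for discarding.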
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected
if they measure the same characteristic - else the instrument loses focus. The higher the
correlation coefficient, the more discriminating the item. A minimum correlation of 0.2 is
generally required. Items with negative or zero correlations are almost always excluded. A
negative correlation could indicate that an item should have been reverse scored.
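The selection rule above could be applied along the following lines. The sketch assumes that the correlation meant here is the one between each item's scores and the total score, and that it has already been calculated; the function name and the four coefficients are made up for illustration.

```python
def classify_item(item_total_r, minimum=0.2):
    """Classify an item by its correlation with the total score, using the rules of thumb in the text."""
    if item_total_r >= minimum:
        return "keep"
    if item_total_r < 0:
        return "exclude (check whether it should have been reverse scored)"
    return "exclude"


# Hypothetical correlation coefficients for four items.
correlations = {"item_1": 0.62, "item_2": 0.15, "item_3": -0.48, "item_4": 0.0}
decisions = {name: classify_item(r) for name, r in correlations.items()}

for name, decision in decisions.items():
    print(name, decision)
```

Treating 0.2 as a hard cut-off is a simplification; the text only says that a minimum of 0.2 is generally required.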
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics helps test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2: Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don’t
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always
I like loud music | | | | | √
I prefer quiet places | √ | | | |
I enjoy noisy environments | | | | √ |
The ticked options are known as item responses. Item 2 in our example should be
‘reverse-scored’ because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Statement | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always | Item score
I like loud music | | | | | √ | 5
I prefer quiet places (REVERSE SCORE) | √ | | | | | 5
I enjoy noisy environments | | | | √ | | 4
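Reverse scoring on a 1 to 5 scale can be sketched as below. The sketch assumes the ticks in the example fall on Always, Never and Most of the time, which is what the item scores of 5, 5 and 4 imply; the function and variable names are hypothetical.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, and so on."""
    return scale_min + scale_max - response


# Ticked responses in the example: Always (5), Never (1), Most of the time (4).
responses = {
    "I like loud music": 5,
    "I prefer quiet places": 1,
    "I enjoy noisy environments": 4,
}
reverse_scored_items = {"I prefer quiet places"}

item_scores = {
    statement: reverse_score(r) if statement in reverse_scored_items else r
    for statement, r in responses.items()
}
print(item_scores)
```

After reverse scoring, a high item score always points in the same direction (a liking for noise), so the scores can sensibly be added together.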
The data sheet is divided into rows and columns - one row for each person in your sample,
and one column for each item in your rating scale plus one for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent’s total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases | Items: 1 2 3 4 5 6 7 8 9 10 11 12 | Total score
1 | 5 5 4 |
2 | 3 3 |
3 | 3 2 |
4 | 3 4 |
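Once the data sheet is complete, the total scores can be worked out as in the sketch below. The three rows of item scores are invented for illustration (the data sheet in the notes is only partly filled in), and the names are hypothetical.

```python
NUM_ITEMS = 12
SCALE_MIN, SCALE_MAX = 1, 5


def total_score(item_scores):
    """Sum one respondent's item scores after checking the row is complete and in range."""
    if len(item_scores) != NUM_ITEMS:
        raise ValueError(f"expected {NUM_ITEMS} item scores, got {len(item_scores)}")
    if any(not SCALE_MIN <= s <= SCALE_MAX for s in item_scores):
        raise ValueError("item scores must lie between 1 and 5")
    return sum(item_scores)


# Hypothetical data sheet: one row of twelve item scores per respondent.
data_sheet = [
    [5, 5, 4, 4, 5, 3, 4, 5, 4, 5, 4, 5],
    [3, 3, 2, 3, 1, 2, 3, 2, 3, 2, 2, 3],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
totals = [total_score(row) for row in data_sheet]
print(totals)
```

The third respondent chose 1 throughout and so has the lowest possible total of 12 (12 x 1); the highest possible total is 12 x 5 = 60.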
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn’t
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
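Instead of judging the spread by eye, the variance of each column can also be calculated. The sketch below uses an invented data sheet with three items, and the cut-off of 0.5 is an arbitrary value chosen for the illustration, not one given in the notes.

```python
from statistics import pvariance

# Hypothetical data sheet: rows are respondents, columns are items.
data_sheet = [
    [4, 1, 5],
    [4, 3, 2],
    [4, 5, 4],
    [4, 2, 1],
    [5, 4, 3],
]

# Transpose so that each tuple holds one item's column of scores.
columns = list(zip(*data_sheet))
variances = [pvariance(column) for column in columns]

# The 0.5 cut-off is an arbitrary illustration, not a value from the text.
low_variance_items = [i + 1 for i, v in enumerate(variances) if v < 0.5]
print(variances)
print(low_variance_items)
```

Item 1, where four of the five respondents gave the same answer, is flagged; items 2 and 3 show a good spread of numbers.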
It is not always possible to explain why most people end up answering an item in the same
way - the item may have been too extremely worded or too vague, or there may be a
strong ‘socially desirable’ way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, eg anxiety, intelligence, stress, independence, etcetera. The
correlation coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. A value close to 0
indicates a weak relationship, while 0 represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
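The notes do not give a formula for the correlation coefficient. One widely used index is the Pearson product-moment correlation coefficient, shown here on the assumption that it is the kind of coefficient meant. The symbols are not from the notes: x_i and y_i are the two scores of person i, the barred symbols are the means, and n is the number of people.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when high scores on one variable go with high scores on the other, which gives the sign of r; the denominator scales the result so that it always lies between -1 and +1.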
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases | Item 3 (of items 1 to 12) | Total score
1 | 4 | 53
2 | 1 | 12
3 | 5 | 49
4 | 4 | 40
5 | 2 | 16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
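As a minimal sketch of that calculation, the snippet below computes the item-total correlation for the five cases tabulated above, assuming the single item score shown per case is the score on item 3. The Pearson formula and the helper name pearson_r are assumptions for illustration; the source does not say which coefficient or tool to use.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Scores of cases 1 to 5 (assumed to be item 3) and their total scores.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r_item_3 = pearson_r(item_3, totals)
print(f"item-total correlation for item 3: {r_item_3:.2f}")
```

For these five cases the coefficient comes out at about 0.94, which agrees with the impression that item 3 discriminates well. Note that the total score here still includes the item itself.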
[Figure: Relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the one list for items that don't show much variance and
the other list for items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
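The selection step can be sketched in code. The example below computes, for each item, its variance and its item-total correlation, then ranks the items by correlation. The response data, the function names and the decision to rank by correlation alone are all assumptions for illustration; the source only says to consult both lists. A tiny 4-item scale stands in for the 12-item one to keep the data readable.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        return 0.0
    return cov / sqrt(ss_x * ss_y)


def rank_items(responses):
    """Return one report row per item, best discriminating item first."""
    totals = [sum(person) for person in responses]
    report = []
    for j in range(len(responses[0])):
        scores = [person[j] for person in responses]
        report.append({
            "item": j + 1,
            "variance": pvariance(scores),
            "item_total_r": pearson_r(scores, totals),
        })
    return sorted(report, key=lambda row: row["item_total_r"], reverse=True)


# Hypothetical ratings: one row per person, one column per item.
responses = [
    [5, 4, 3, 2],
    [1, 2, 3, 4],
    [4, 5, 3, 1],
    [2, 1, 3, 5],
    [3, 3, 3, 3],
]

ranked = rank_items(responses)
kept = sorted(row["item"] for row in ranked[:2])  # keep the 2 best of 4
print("items kept:", kept)
```

In this made-up data item 3 shows no variance and item 4 discriminates the wrong way round, so both fall to the bottom of the ranking. For the real scale the same idea applies with 12 items and the best 8 kept.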
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates the stability or consistency of scores
over time.
A perfect correlation does not indicate that the scores on the second occasion were identical
to those on the first; it indicates that each person's position relative to the others in the
group stayed the same. The time interval should be at least several days, to reduce the
possibility of effects such as familiarity with the type of items or respondents remembering
their answers.
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to the possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
reliability of a half underestimates the reliability of the whole questionnaire. The
reliability of the whole questionnaire is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items and re-number them from 1 to 8. Divide the rating scale
into two halves by grouping the odd items together and the even items together. For each
person, calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Person   Odd items: 1  3  5  7   Total score   Even items: 2  4  6  8   Total score
1
2
3
4
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer
to 1 indicate a more reliable rating scale.
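The odd-even procedure can also be carried out numerically rather than by eye. The sketch below uses hypothetical ratings for five people on an 8-item scale. The Pearson formula and the Spearman-Brown step-up from half-scale to full-scale reliability are assumptions: the source describes the idea of estimating the reliability of the whole questionnaire but does not name a formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half(responses):
    """Return odd totals, even totals, half-scale r and stepped-up r."""
    odd_totals = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, 7
    even_totals = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, 8
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown correction for doubling the length (assumed, see lead-in).
    r_full = 2 * r_half / (1 + r_half)
    return odd_totals, even_totals, r_half, r_full


# Hypothetical ratings: one row per person, items 1 to 8 in order.
responses = [
    [4, 5, 4, 4, 5, 4, 5, 4],
    [2, 1, 2, 2, 1, 2, 1, 2],
    [3, 3, 4, 3, 3, 4, 3, 2],
    [5, 4, 5, 5, 4, 5, 4, 5],
    [1, 2, 1, 2, 2, 1, 2, 3],
]

odd_totals, even_totals, r_half, r_full = split_half(responses)
print("odd totals: ", odd_totals)
print("even totals:", even_totals)
print(f"half-scale r = {r_half:.2f}, stepped-up estimate = {r_full:.2f}")
```

The two total-score lists are exactly what the data sheet asks for, so they can also be plotted by hand as described above.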
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that is,
the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At
the same time, you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - the extent to which one can make conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration and scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the
characteristics of the target population that are relevant to the aim of the questionnaire.
Where relevant to the subject matter of the questionnaire, it is important to state the
country and the age group for which the questionnaire has been developed. A brief
description of the design of the questionnaire should be provided. The type of items should
also be indicated, for example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate to
what extent they are representative in terms of those characteristics that define the target
population. Also mention when and under which circumstances the questionnaire was
administered to them. Describe each technique used for item analysis, and indicate which
criteria were used to justify the inclusion or exclusion of items in the item selection
process. It is important for the user to know how reliable or consistent the questionnaire
is. Give a brief description of the method used to determine reliability and justify why this
method was used. The estimated reliability coefficient is then evaluated in terms of an
acceptable level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or to individuals),
whether respondents can complete it on their own or whether supervision is needed, the
material needed, and how to deal with a person asking for an explanation. Provide
instructions for scoring. For items with right and wrong answers, a correct answer could
score a 1 and an incorrect answer a 0. For rating scales, for example on a five-point rating
scale, 1 = do not agree at all and 5 = strong agreement with the statement. The total score
indicates the person's attitude towards the topic under investigation. Reverse scoring might
be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
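A scoring instruction of this kind can be made concrete with a small sketch. It assumes a five-point rating scale and that some items are worded in the opposite direction and therefore need reverse scoring. The function name, the choice of which items to reverse and the example ratings are hypothetical.

```python
def total_score(ratings, reverse_items=(), low=1, high=5):
    """Sum the ratings, reversing the items whose numbers are listed.

    ratings: the ratings for items 1, 2, 3, ... in order.
    reverse_items: item numbers (starting at 1) that are reverse scored.
    """
    total = 0
    for number, rating in enumerate(ratings, start=1):
        if not low <= rating <= high:
            raise ValueError(f"item {number}: rating {rating} is off the scale")
        if number in reverse_items:
            rating = low + high - rating  # 1 becomes 5, 2 becomes 4, and so on
        total += rating
    return total


# Hypothetical respondent; items 2 and 4 are negatively worded.
score = total_score([5, 1, 4, 2], reverse_items={2, 4})
print("total score:", score)
```

Reversing with low + high - rating keeps the midpoint of the scale unchanged, so a high total always points in the same attitude direction.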
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire, that
is, how the questions should be tackled. They may be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: purpose explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: purpose explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: confidentiality not mentioned
Rate 1: information is absolutely confidential or identity will not be disclosed
Rate 2: information is absolutely confidential and identity will not be disclosed, but
respondents are required to provide their names
Rate 3: information is absolutely confidential and identity will not be disclosed, and
respondents are not required to provide their names
1.3 Instructions for how to handle questions
Rate 0: it is not explained how the questionnaire should be completed
Rate 1: it is explained, but the explanation does not hold for all items
Rate 2: it is explained and the explanation holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
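3.1 The scope of the questionnaire
Rate 0: the questionnaire does not focus on the topic that it is supposed to cover
Rate 1: part of the questionnaire focuses on the topic
Rate 2: the questionnaire focuses on the topic, but the topic is not covered in full
Rate 3: the questionnaire focuses on the topic and the topic is covered in full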
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of the problems
Rate 3: if the sequence seems to cause any one of the problems
Rate 4: if the sequence does not seem to cause any of the problems
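The item sequence rating above is a simple counting rule, which the following sketch makes explicit. The function and parameter names are hypothetical; the judgement of whether each problem is present still has to be made by the evaluator.

```python
def rate_item_sequence(has_relevant_items, causes_bias,
                       forces_inefficiency, causes_discomfort):
    """Return the 0-4 rating for the questionnaire item sequence."""
    if not has_relevant_items:
        return 0  # no items, or all items irrelevant
    problems = sum([causes_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems  # three problems -> 1, none -> 4


# A sequence that only seems to make respondents uncomfortable.
rating = rate_item_sequence(True, False, False, True)
print("item sequence rating:", rating)
```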
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (the kinds of items and their sequence) support the questionnaire's
functionality (i.e. what the questionnaire could be used for, what it is capable of), given
its declared purpose (i.e. the kind of information it is expected to deliver)? In other
words, the structure and functionality of a questionnaire are a function of its declared
purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three of the following
functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he or she assigns are absolutely correct. The QWAN is the quality we all strive
for but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of
items for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not on whether
it has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: the manual provides no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: the manual provides information about one of the purpose areas
Rate 2: the manual provides information about any two of the purpose areas
Rate 3: the manual provides information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
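Because this rating depends on the earlier purpose-area rating, it can help to see the dependency written out. The sketch below is one way to express the rule; the function and parameter names are hypothetical, and whether the areas appear in the logical sequence remains the evaluator's judgement.

```python
def rate_logical_sequence(purpose_rating, in_logical_sequence):
    """Return the 0-4 sequence rating from the purpose-area rating (0-3)."""
    if purpose_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_rating must be 0, 1, 2 or 3")
    if purpose_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    base = 1 if purpose_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)


# A manual covering all three purpose areas, but out of order.
rating = rate_logical_sequence(3, in_logical_sequence=False)
print("logical sequence rating:", rating)
```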
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
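1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if the content topics of none of the three purpose areas are kept in their logical groups
Rate 2: if the content topics of one purpose area are kept in their logical group
Rate 3: if the content topics of two purpose areas are kept in their logical groups
Rate 4: if the content topics of all three purpose areas are kept in their logical groups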
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains all three of the following
(a) difficult language
(b) ambiguous statements
(c) grammar and spelling mistakes
Rate 2: if the manual contains any two of (a), (b) and (c)
Rate 3: if the manual contains any one of (a), (b) and (c)
Rate 4: if the manual contains none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: the manual describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes the target population, but it is not appropriate
Rate 2: the manual describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: the manual describes any two of (1), (2) and (3)
Rate 3: the manual describes all three of (1), (2) and (3)
2.4 The sample used to test the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes the sample used, but the characteristics of the sample group
differ from those of the target population
Rate 2: the manual describes the sample used and the characteristics of the sample group
correspond to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: the manual does not provide sufficient information about the analysis and selection
of questionnaire items
Rate 1: the manual describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: the manual describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: the manual does any two of (1), (2) and (3)
Rate 3: the manual does all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: the manual does any two of (1), (2) and (3)
Rate 3: the manual does all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: the manual does not provide instructions
Rate 1: the instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: the instructions include three or four of the above
Rate 3: the instructions include all five of the above
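The banding in this rating (none, one or two, three or four, all five) can be sketched as a lookup on the number of elements covered. The short labels for the five elements and the function name are hypothetical, chosen only for this illustration.

```python
# Short labels (hypothetical) for the five elements listed above.
ADMIN_ELEMENTS = {"administrator", "situations", "material",
                  "enquiries", "introduction"}


def rate_administration_instructions(covered):
    """Return the 0-3 rating from the set of elements the manual covers."""
    covered = set(covered)
    unknown = covered - ADMIN_ELEMENTS
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    count = len(covered)
    if count == 0:
        return 0
    if count <= 2:
        return 1
    if count <= 4:
        return 2
    return 3


# A manual that says who may administer it, where, and with what material.
rating = rate_administration_instructions(
    ["administrator", "situations", "material"])
print("administration instructions rating:", rating)
```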
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: the manual does not provide instructions
Rate 1: the manual does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: the manual does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If [description of work situation] I prefer to
(a) [description of 'drive' action]
(b) [description of 'interaction' action]
(c) [description of 'management' action]
(d) [description of 'regulation' action]
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
2
Learning model
• Learning outcome: the purpose of learning, achieved by producing specific outcome
products
• Outcome product: the result of the learning activities a learner engages in during the
learning process. The learner produces outcome products during learning.
• Production method: how an outcome product is produced. A series of actions constitute an
activity and a series of activities form a method.
• Learning opportunity: work towards achieving the required learning outcomes by
producing particular outcome products. A learning opportunity has three elements:
(a) an outcome product,
(b) the method for producing the outcome product, and
(c) a reference to the resource required for producing the outcome product.
The outcome product is the most important element: each activity of a learning opportunity
is geared towards the production of the outcome product. An outcome product has to fulfil
certain standards. If a learner produces an outcome product that fulfils certain minimum
criteria, the learner can be declared competent.
Basic measurement and questionnaire design
Outcome product
A questionnaire and its mini-manual
Production method
1 Identify a suitable content domain
2 Design a questionnaire
3 Write items for the questionnaire
4 Pilot-test the questionnaire
5 Evaluate the questionnaire's reliability and validity
6 Compile a mini-manual for users of the questionnaire
7 Evaluate the questionnaire
8 Evaluate the mini-manual
Content domain Identify the content domain for a questionnaire
Outcome product
1 Identification of a content domain
2 A discussion on the suitability of a questionnaire as measuring instrument
Method
Activity 1.1 Describe the relevant content domain
Activity 1.2 Evaluate the suitability of a questionnaire as a measure for this content
domain
Resource reference
1 Content domain Identify the content domain for a questionnaire
2 Suitability of a questionnaire as measuring instrument
The function of the questionnaire is measurement.
1 Identifying the focus of a questionnaire
The purpose of the questionnaire refers to what it intends to measure and for whom it will
be used.
The first step is to identify the general topic of interest. Select a problem area within that
topic that you want to investigate, then reduce the general problem to more specific questions.
Do they all relate to each other? Can you combine them into one question? Should you
rather choose one question and leave the others for separate questionnaires?
The content domain therefore consists of the tasks, behaviours, attitudes, etcetera related
to one or more of these questions.
2 Limiting the scope of the questionnaire
Decide on what is relevant. By limiting the scope in this way you can cover your topic
adequately without asking irrelevant questions, and still have a questionnaire that is relevant
and not too long.
Suitability of a questionnaire as measuring instrument
The main purposes of a questionnaire are to
(1) obtain accurate factual information,
(2) provide a standard format for recording facts, comments and attitudes, and
(3) facilitate data processing.
The measuring instrument and approach you use depend on the topic you have chosen
and the purpose of your investigation.
The questionnaire is ideal for collecting opinions, preferences and facts for a specific
purpose from a defined set of respondents. A typical use is a population census or an opinion
poll. Educational and psychological questionnaires measure knowledge, interests and other
constructs.
Exercise: Indicate whether a questionnaire would be a suitable measuring instrument.
TOPIC - YES OR NO - REASON
1 support for political parties - Yes - You want to find out facts
2 preference for different types of beer - Yes
3 typing skills - No - You are looking at a practical ability
4 opinions about the parole system - Yes - You are measuring attitudes
5 parenting practices - Yes - To examine effectiveness, use observation
6 effect of personality on intelligence - Yes and No - You could use a questionnaire to measure
aspects of personality, but you would need a separate test to measure intelligence. To examine
the relation between personality and intelligence you would need the right kind of research design
2 Design a questionnaire
Outcome product
A questionnaire specification document
Method
Activity 2.1 Decide on item format and scaling method
Activity 2.2 Decide on the total number of items
Activity 2.3 Design the layout for the questionnaire
Resource reference
Item format
Layout of the questionnaire
Specification document for a questionnaire
Item format
1 Closed questions
A closed question offers respondents a limited choice of alternative replies, whereas an open
question is one that allows the respondents to answer in any way they want to. Types of closed
question:
yes/no type
true/false type
multiple choice type
rating scales
1.1 Inventories and checklists
These are also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform and therefore makes it easier to compare people's
answers. Closed questions are quicker to answer, and sensitive issues are often better addressed.
The main disadvantage is that they force the respondent to answer in terms of the alternatives
offered and nothing else. This can lead to a loss of spontaneity, and to a loss of rapport if
respondents become irritated. To counter this, offer an additional option such as "other". Closed
questions can direct the respondents' thinking and may also influence their answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions
invariably elicit some irrelevant and repetitious information, and answering them requires a
considerable degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs,
attitudes and values. These are complex issues that have to do with states of mind and are
therefore more difficult to measure. They are usually multifaceted. Therefore, to measure
non-factual topics the tendency is to use rating scales, on which respondents indicate the extent
to which they agree or disagree. Ratings may be influenced by a person's mood on the day or by
political events in the country at the time.
The following guidelines can be followed when compiling a rating scale:
1 Define the dimension being rated. Each item or statement to be rated must refer to only
one thing or dimension. If you ask respondents to "Rate friendliness and efficiency", you are
confusing two different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. Use an uneven number in order to
have a neutral category in the middle, but bear in mind that people may tend to choose the
neutral one (error of central tendency).
4 Define the different rating categories. They must be mutually exclusive - each rating category
should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect
different attitudes toward the topic in question. Their main function is to classify people
with respect to a certain attitude.
3.1 Likert scales
Also known as a summated scale. "A summated attitude scale may be described as a rating
scale in which a subject indicates the extent to which he or she agrees (or disagrees) with
statements." These statements usually deal with a social or political issue. The respondent
marks the point that best reflects his or her attitude. The scores are added up to obtain a
total score (summated scale). Ensure that the scale is uni-dimensional - all the items
measure the same dimension or topic. It is important to have both favourable and
unfavourable statements so that you do not influence the respondent. Likert scales usually have
the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. A seven-point rating scale is used, and the
scale points on each end are defined by opposing adjectives:
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people's endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents' interest as well
as to collect all the relevant information.
Item and type
1 Do you have a valid driver's licence? Yes / No
Type: Closed question - limited choice of answers
2 Why do people need to have a valid driver's licence?
Type: Open question - respondents state their own opinions; allows for any kind of answer
3 People should have a driver's licence (choose one answer)
[ ] for identification purposes
[ ] to prove that they can drive
[ ] in case they have an accident
Type: Closed question because there is a limited choice of answers (multiple choice type)
4 Young people are good drivers. True / False
Type: Closed question - limited choice of answers
5 Good drivers are
alert - - - - - relaxed
cautious - - - - - fast reactors
older - - - - - younger
Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
6 Mark the characteristics of good drivers from the list below
[ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
Type: Closed question because there is a limited choice of answers (checklist/inventory)
7 Are you a good driver? Rate your abilities as follows (from a great deal to very little): 1 2 3 4 5
self confidence
experience
knowledge of road rules
Type: Rating scale, Likert type
8 Good drivers are
Type: Open question because it allows respondents to give any kind of answer
9 Should the age for driver's licences be increased to 21?
Type: This looks like an open question but is a closed question - it calls for a yes or no answer.
It would be an open question if you asked "What is your opinion about increasing the driving age
to 21 years?"
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
age - under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
gender - closed (check male or female)
socio-economic status
personal experience of crime - closed question with a yes/no answer
how much or how often they personally experienced crime - use a multiple choice item or a
rating scale
general description - use an open ended question
levels of stress associated with different crimes - rating scale
personal reactions to different crimes - simple open ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain: A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you use, how complex the
questions are and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your respondents
and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes, you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open ended question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed yes/no type)
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime, and lastly an open question for any
other comments the respondent may wish to add. You would therefore have five closed items, four
open items and a twelve item rating scale (a total of 21 items). The questionnaire should not
be too long or complicated.
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire:
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information: respondents have greater freedom to express
themselves without fear that their responses would be used in a way that is not in their interests.
This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent. Ideally
it should take 30 minutes to complete. It also depends on the characteristics of the respondents.
Specialists are more willing to complete a longer questionnaire. For people with low levels of
literacy or education it is better to keep questionnaires short. Make sure that each question is
directly relevant, but you also need to have thorough coverage of your topic to ensure reliability
and validity. The aim is to strike a balance between a concise questionnaire and one that is
inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers". Items on biographical information: only
a few items can go at the beginning, but if there are a lot of items they are better at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the questions.
Respondents are then more likely to feel satisfied that answering the questions was worth the
effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions. If the answer is no, the respondent skips the next few questions.
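The skip logic of a filter question can be sketched in Python. This is only an illustration: the item numbers, the choice of item 4 as the filter question and the function name `items_to_answer` are hypothetical and are not taken from any particular questionnaire.

```python
# Hypothetical questionnaire: item 4 is a filter question (closed, yes/no).
# Items 5 and 6 are only relevant to respondents who answer "yes" to item 4.
ALL_ITEMS = [1, 2, 3, 4, 5, 6, 7, 8]
DEPENDENT_ITEMS = {5, 6}


def items_to_answer(filter_answer):
    """Return the item numbers a respondent should complete."""
    if filter_answer.strip().lower() == "yes":
        return list(ALL_ITEMS)
    # Any other answer is treated as "no": skip the dependent items.
    return [item for item in ALL_ITEMS if item not in DEPENDENT_ITEMS]


print(items_to_answer("yes"))
print(items_to_answer("no"))
```

On a paper questionnaire the same rule would appear as an instruction next to the filter question, telling respondents who answer no which item to go to next.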
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions can be
based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is a science - it requires an in-depth knowledge of one's topic and
familiarity with the principles governing good item design. It is also an art - it requires
creativity in selecting or constructing items appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid items
that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. You must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways. For example:
Visiting lecturers can help one feel less isolated.
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues. For example:
I am fully occupied and I don't feel lonely.
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to. For example:
It is believed by students that they will be given extension by lecturers.
The following is simpler:
Students believe lecturers will give them extension.
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer. For example:
Discrimination in South Africa is horrific, isn't it?
Don't give examples unless it is really necessary. For example:
Do you use any word processing packages such as ZZ?
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Deliberate faking occurs when respondents are fully aware of what is being measured and for what
purpose, when their identity is disclosed, and when they are aware that their responses will
affect them in some way. Respondents may also lie to protect their real feelings or justify
their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
This is the tendency to make a particular type of response: some respondents tend to choose
extreme responses, while others repeatedly choose central responses. Design balanced
questionnaires. Positively stated and negatively scored, eg "People often let me down".
Positively stated and positively scored, eg "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. The shortcomings of the following
questions are given below each item.
1 What is your income?
Shortcoming: vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
Shortcoming: leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
Shortcoming: vague. "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad (never/sometimes/often/all the time)
Shortcoming: two inherent issues. 'Depression' and 'often' may mean different things to
different people
5 How often do you take drugs? (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
Shortcoming: too global
7 Most men are more emotionally stable than most women are (agree/disagree)
Shortcoming: is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
Shortcoming: does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire out on.
The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES.
Ethical checklist
For each point record Yes/No and add notes where needed.
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get the questionnaire back, quickly scan it to see that they have
completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is to
use what happened during the study. Look again at the notes you made while you were administering
the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis consists of procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
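The difficulty index calculation described above can be sketched in Python. The pilot data, the sample size of 40 and the function names are hypothetical; only the formula (correct responses divided by sample size) and the 0.25 to 0.75 range come from the text.

```python
def difficulty_index(num_correct, sample_size):
    """Proportion of the sample that answered the item correctly."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return num_correct / sample_size


def is_acceptable(index, low=0.25, high=0.75):
    """True if the index falls inside the range given in the text."""
    return low <= index <= high


# Hypothetical pilot data: number of correct responses per item, sample of 40.
sample_size = 40
correct_counts = {"item 1": 38, "item 2": 21, "item 3": 6, "item 4": 18}

indices = {item: difficulty_index(n, sample_size) for item, n in correct_counts.items()}
retained = [item for item, p in indices.items() if is_acceptable(p)]
average_retained = sum(indices[item] for item in retained) / len(retained)

for item, p in indices.items():
    print(item, round(p, 3), "keep" if is_acceptable(p) else "discard")
print("average difficulty of retained items:", round(average_retained, 4))
```

In this made-up sample item 1 is too easy and item 3 too difficult, so only items 2 and 4 are kept, and their average difficulty is close to 0.5.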
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according to
whatever the measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristics - else they lose focus. Discrimination is assessed by
correlating the item score with the total score: the higher the correlation
coefficient, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored.
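The selection rules above can be written out as a small Python sketch. The correlation values and the function name `classify_item` are hypothetical; the 0.2 minimum and the treatment of zero and negative correlations follow the text.

```python
def classify_item(item_total_r, minimum=0.2):
    """Apply the selection rules to one item-total correlation."""
    if item_total_r < 0:
        # A negative correlation may mean the item was not reverse scored.
        return "exclude (check reverse scoring)"
    if item_total_r < minimum:
        return "exclude"
    return "retain"


# Hypothetical item-total correlations from a pilot study.
correlations = {
    "item 1": 0.55,
    "item 2": 0.12,
    "item 3": -0.40,
    "item 4": 0.20,
    "item 5": 0.0,
}

decisions = {item: classify_item(r) for item, r in correlations.items()}
for item, decision in decisions.items():
    print(item, correlations[item], decision)
```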
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response options: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement (the respondent ticks one option for each statement)
I like loud music √
I prefer quiet places √
I enjoy noisy environments √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect saying
that she always likes noisy places and she should therefore get a high score.
Response options: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement - item score
I like loud music √ - 5
I prefer quiet places (REVERSE SCORE) √ - 5
I enjoy noisy environments √ - 4
The data sheet is divided into rows and columns - one row for each person in your sample,
and one column for each item in your rating scale plus one for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the rating scale
is 12 and the highest is 60.
Data sheet (partly completed)
Columns: Cases, Items 1 to 12, Total Score
Case 1: 5 5 4
Case 2: 3 3
Case 3: 3 2
Case 4: 3 4
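Reverse scoring and the total score can be sketched in Python using the three example statements above. The ticked options (5, 1 and 4) are inferred from the item scores shown in the example, and the function and variable names are illustrative only.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # 1 = Never ... 5 = Always


def score_item(response, reverse=False):
    """Return the item score, reverse-scoring when required."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError("response must be between 1 and 5")
    # Reverse scoring maps 1 to 5, 2 to 4, 3 to 3, 4 to 2 and 5 to 1.
    return SCALE_MIN + SCALE_MAX - response if reverse else response


# (statement, ticked option, reverse scored?)
responses = [
    ("I like loud music", 5, False),
    ("I prefer quiet places", 1, True),
    ("I enjoy noisy environments", 4, False),
]

item_scores = [score_item(ticked, rev) for _, ticked, rev in responses]
total_score = sum(item_scores)
print(item_scores, total_score)

# Possible range of the total score on a 12-item scale.
NUM_ITEMS = 12
lowest_possible = NUM_ITEMS * SCALE_MIN
highest_possible = NUM_ITEMS * SCALE_MAX
print(lowest_possible, highest_possible)
```

The three item scores match the first row of the data sheet (5, 5, 4), and the possible range for twelve items is 12 to 60, as stated in the text.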
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly only one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way - the item may have been too extremely worded, it may be too vague, or there may be a
strong 'socially desirable' way of responding.
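Instead of judging the columns by eye, the spread of each item can be summarised in a few lines of Python. This is a sketch under stated assumptions: the data sheet is hypothetical, and the cut-off of 0.8 for "mainly only one number" is an assumed value, not one given in the text.

```python
from collections import Counter
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item.
data_sheet = [
    [5, 3, 4],
    [5, 1, 4],
    [5, 4, 2],
    [5, 2, 5],
    [4, 5, 1],
]

MAX_MODAL_SHARE = 0.8  # assumed cut-off for "mainly only one number"


def modal_share(scores):
    """Proportion of respondents who gave the most common score."""
    return Counter(scores).most_common(1)[0][1] / len(scores)


summary = []
for index in range(len(data_sheet[0])):
    scores = [row[index] for row in data_sheet]
    share = modal_share(scores)
    summary.append({
        "item": index + 1,
        "variance": pvariance(scores),
        "modal_share": share,
        "low_variance": share >= MAX_MODAL_SHARE,
    })

for row in summary:
    print(row)
```

In this made-up sample item 1 is flagged because four of the five respondents gave the same score, while items 2 and 3 show a good spread.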
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, eg anxiety, intelligence, stress, independence, etcetera. The correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0
indicates a weak relationship, while 0 represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the scores
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Data sheet (only the scores for item 3 and the total score are shown)
Case   Item 3   Total Score
1      4        53
2      1        12
3      5        49
4      4        40
5      2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Looking at whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Figure: scatter plot of the relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is the cause, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the one list for items that don't show much variance and the
other list for items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable; that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid; that is, the
questionnaire should measure what it claims to measure.
Reliability
How consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To know how consistent the results of a questionnaire are over different occasions, administer the
same questionnaire to the same group on two consecutive occasions. The two sets of scores are correlated,
and the correlation coefficient represents the degree of test-retest reliability. The closer the
correlation coefficient is to 1, the more consistent the results. Test-retest reliability thus indicates
stability or consistency of scores over time.
A perfect correlation does not indicate that the scores on the second occasion were identical to those
on the first; it indicates that each person's relative position to that of the others in the group stays
the same. The time interval should be at least several days to reduce the possibility of effects such as
familiarity with the type of items or respondents remembering their answers.
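The point that a perfect test-retest correlation does not require identical scores can be illustrated with a small sketch. The scores below are hypothetical, and `pearson` is an illustrative helper (Pearson's r is assumed as the correlation coefficient): every person scores 3 points higher on the second occasion, yet the relative positions stay the same.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical total scores of five people on two occasions.
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

r_test_retest = pearson(first_occasion, second_occasion)
print(round(r_test_retest, 2))  # the scores differ, but the ranking is unchanged
```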
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. Because a shorter questionnaire is generally less reliable, the
correlation between the two halves tends to underestimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire is called the split-half reliability; it
measures the degree of equivalence between the two halves, that is, the extent to which they
measure the same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of the internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items. Re-number your items from 1 to 8 and divide the rating
scale into two halves by grouping the odd items and the even items together. For each
person, calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Person    Odd items (1 3 5 7) Total Score    Even items (2 4 6 8) Total Score
1
2
3
4
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet you make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
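The odd-even procedure can also be sketched numerically instead of graphically. The responses below are hypothetical, and `pearson` and `spearman_brown` are illustrative helper names. The Spearman-Brown step is a standard correction for estimating the reliability of the full-length scale from the half-scale correlation; it is an assumption added here and is not part of the activity as described.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Estimate full-scale reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)

# Hypothetical answers of five people to the 8 items (five-point rating scale).
responses = [
    [5, 4, 5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 2, 1, 1],
    [4, 4, 3, 4, 4, 3, 4, 4],
    [3, 2, 3, 3, 2, 3, 3, 2],
    [1, 2, 1, 1, 2, 1, 2, 1],
]

# Items 1, 3, 5, 7 sit at list positions 0, 2, 4, 6; items 2, 4, 6, 8 at 1, 3, 5, 7.
odd_totals = [sum(person[0::2]) for person in responses]
even_totals = [sum(person[1::2]) for person in responses]

r_half = pearson(odd_totals, even_totals)  # reliability of either half
r_full = spearman_brown(r_half)            # estimated reliability of all 8 items
print(round(r_half, 2), round(r_full, 2))
```

A different way of splitting the items (for example first four versus last four) would generally give a somewhat different coefficient, which is the limitation of the method noted above.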
Validity
The extent to which the questionnaire measures what it claims to measure, and the extent to
which the scores can be used for the intended purpose. There are three categories of validity
evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At the
same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
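A minimal sketch of a validity coefficient for the course-selection example follows. Both lists are hypothetical data made up for illustration, and `pearson` is an illustrative helper; Pearson's r is assumed as the correlation coefficient.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: questionnaire scores at application time and
# examination marks (the criterion) obtained at the end of the course.
questionnaire_scores = [32, 25, 40, 18, 29, 36]
examination_marks = [61, 55, 78, 40, 58, 70]

validity_coefficient = pearson(questionnaire_scores, examination_marks)
print(round(validity_coefficient, 2))
```

For concurrent validity the calculation is the same; only the timing of the criterion measure differs.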
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity as the extent to which the questionnaire indeed measures the theoretical
construct it aims to measure. Construct validity cannot be expressed in terms of a single validity
coefficient. You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - how well the questionnaire allows conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire, for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group this questionnaire has been developed.
A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated, for example multiple-choice items or
rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those characteristics
that define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed, and
how to deal with a person asking for an explanation. Provide instructions for scoring. A correct
answer could score 1 and an incorrect answer 0. Rating scales - on a five-point rating scale,
1 = do not agree at all and 5 = strong agreement with the statement. The total score indicates the
person's attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
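Scoring instructions for a rating scale can be made concrete with a short sketch. It assumes a five-point scale on which some items are worded in the opposite direction, so that their scores are reversed (1 becomes 5, 2 becomes 4, and so on) before the total is calculated. The function name `total_score`, the choice of reversed items and the answers are all hypothetical.

```python
def total_score(responses, reverse_items, points=5):
    """Sum the item scores, reversing the items listed in reverse_items.

    responses: item scores in item order (item 1 first).
    reverse_items: set of 1-based item numbers that are reverse scored.
    points: number of points on the rating scale (5 for a five-point scale).
    """
    total = 0
    for number, score in enumerate(responses, start=1):
        if not 1 <= score <= points:
            raise ValueError(f"item {number}: score {score} is outside 1..{points}")
        # On a five-point scale a reversed score is 6 minus the original score.
        total += (points + 1 - score) if number in reverse_items else score
    return total

# Hypothetical answers of one person to an 8-item scale; the even items are reversed.
answers = [5, 1, 4, 2, 5, 1, 4, 2]
print(total_score(answers, reverse_items={2, 4, 6, 8}))
```

A manual would state explicitly which items, if any, are reverse scored, so that every scorer arrives at the same total.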
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of the
information provided in the questionnaire, and how to complete the questionnaire (that is, they
should explain how the questions should be tackled). The instructions are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing open-ended
questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of the problems
Rate 3: if the sequence seems to cause any one of the problems
Rate 4: if the sequence does not seem to cause any of the problems
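The item-sequence rule can be read as a simple calculation: unless the questionnaire has no relevant items, the rating is 4 minus the number of problems the sequence seems to cause. A minimal sketch under that reading follows; the function and parameter names are hypothetical.

```python
def rate_item_sequence(has_relevant_items, causes_bias, is_inefficient, causes_discomfort):
    """Return the item-sequence rating (0 to 4) for a questionnaire.

    has_relevant_items: False if there are no items or all items are irrelevant.
    The other three flags say whether the sequence seems to cause that problem.
    """
    if not has_relevant_items:
        return 0
    problems = sum([bool(causes_bias), bool(is_inefficient), bool(causes_discomfort)])
    return 4 - problems  # three problems -> 1, two -> 2, one -> 3, none -> 4

# A sequence that only seems to make respondents emotionally uncomfortable.
print(rate_item_sequence(True, False, False, True))
```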
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
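Because the rating for logical sequence depends on the earlier rating for purpose areas covered, it can help to see the rule written out as a small decision function. This is only a sketch of the rubric as listed; the function and parameter names are hypothetical.

```python
def rate_logical_sequence(purpose_areas_rating, in_logical_sequence):
    """Return the logical-sequence rating (0 to 4) of a manual.

    purpose_areas_rating: the rating (0 to 3) already given for the
        purpose areas covered in the manual.
    in_logical_sequence: True if the covered purpose areas appear in the
        order (a), (b), (c).
    """
    if purpose_areas_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose-areas rating must be 0, 1, 2 or 3")
    if purpose_areas_rating <= 1:
        return 0  # with fewer than two purpose areas there is no sequence to judge
    base = 1 if purpose_areas_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)

# A manual covering all three purpose areas, presented out of order.
print(rate_logical_sequence(3, in_logical_sequence=False))
```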
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: one kept
Rate 3: two kept
Rate 4: all kept
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of
questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 to be really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded; thus 50
items were retained. The validity coefficient of 0.91 is high.
1 Identification of a content domain
2 A discussion on the suitability of a questionnaire as measuring instrument
Method
Activity 1.1 Describe the relevant content domain
Activity 1.2 Evaluate the suitability of a questionnaire as a measure for this content
domain
Resource reference
1 Content domain Identify the content domain for a questionnaire
2 Suitability of a questionnaire as measuring instrument
The function of the questionnaire is measurement.
1 Identifying the focus of a questionnaire
The purpose of the questionnaire refers to what it intends to measure and for whom it will
be used.
The first step is to identify the general topic of interest. Select a problem area within that
topic that you want to investigate, and reduce the general problem to more specific questions.
Do they all relate to each other? Can you combine them into one question? Should you
rather choose one question and leave the others for separate questionnaires?
The content domain therefore consists of the tasks, behaviours, attitudes, etcetera related
to one or more of these questions.
2 Limiting the scope of the questionnaire
Decide on what is relevant. By limiting the scope in this way, you can cover your topic
adequately without asking irrelevant questions and still have a questionnaire that is relevant
and not too long.
Suitability of a questionnaire as measuring instrument
The main purposes of a questionnaire are to
(1) obtain accurate factual information
(2) provide a standard format for recording facts, comments and attitudes and
(3) facilitate data processing
The measuring instrument and approach you use depend on the topic you have chosen
and the purpose of your investigation.
The questionnaire is ideal for collecting opinions, preferences and facts for a specific
purpose from a defined set of respondents. A typical use is a population census or an opinion
poll. Educational and psychological questionnaires measure knowledge, interests and other
constructs.
Exercise: Indicate whether a questionnaire would be a suitable measuring instrument
TOPIC - YES OR NO - REASON
1 support for political parties - Yes - You want to find out facts
2 preference for different types of beer - Yes
3 typing skills - No - looking at a practical ability
4 opinions about the parole system - Yes - attitudes
5 parenting practices - Yes - to examine effectiveness use observation
6 effect of personality on intelligence - Yes and No - questionnaire to measure aspects of
personality; need a separate test to measure intelligence; for the relation between
personality and intelligence you would need the right kind of research design
2 Design a questionnaire
Outcome product
A questionnaire specification document
Method
Activity 2.1 Decide on item format and scaling method
Activity 2.2 Decide on the total number of items
Activity 2.3 Design the layout for the questionnaire
Resource reference
Item format
Layout of the questionnaire
Specification document for a questionnaire
Item format
1 Closed questions
A closed question offers respondents a limited choice of alternative replies, whereas an open
question is one that allows the respondents to answer in any way they want to.
yes/no type
true/false type
multiple choice type
rating scales
1.1 Inventories and checklists
These are also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform and therefore makes it easier to compare people's
answers. Closed questions are quicker to answer, and sensitive issues are often better
addressed. The main disadvantage is that they force the respondent to answer in terms of the
alternatives offered and nothing else, with a loss of spontaneity and a loss of rapport if
respondents become irritated. Offer an additional option such as "other". Closed questions can
direct the respondents' thinking and may also influence their answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions
invariably elicit some irrelevant and repetitious information and also require a considerable
degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs,
attitudes and values. These are complex issues that have to do with states of mind and are
therefore more difficult to measure. They are usually multifaceted. Therefore, to measure
non-factual topics the tendency is to use rating scales, on which respondents indicate the
extent to which they agree or disagree.
Ratings may be influenced by a person's mood on the day or by political events in the
country at the time.
The following guidelines can be followed when compiling a rating scale:
1 Define the dimension being rated. Each item or statement to be rated must refer to only
one thing or dimension. With "Rate friendliness and efficiency" you are confusing two
different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. An uneven number gives you
a neutral category in the middle, but people may tend to choose the neutral one (error
of central tendency).
4 Define the different rating categories. They must be mutually exclusive - each rating
category should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect
different attitudes toward the topic in question. Their main function is to classify people
with respect to a certain attitude.
3.1 Likert scales
Also known as a summated scale. A summated attitude scale may be described as a rating
scale in which a subject indicates the extent to which he or she agrees (or disagrees) with
statements. These statements usually deal with a social or political issue. The respondent
marks the point that best reflects his or her attitude. The scores are added up to obtain a
total score (summated scale). Ensure that the scale is uni-dimensional - all the items
measure the same dimension or topic. It is important to have both favourable and
unfavourable statements so that you do not influence the respondent. There is usually an
option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It is a seven-point rating scale, and the
scale points on each end are defined by opposing adjectives:
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people's endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents' interest as well
as to collect all the relevant information.
Item and type
1 Do you have a valid driver's licence? Yes / No
Type: Closed question - limited choice of answers
2 Why do people need to have a valid driver's licence?
Type: Open question - respondents state their own opinions and any kind of answer is allowed
3 People should have a driver's licence (choose one answer)
[ ] for identification purposes
[ ] to prove that they can drive
[ ] in case they have an accident
Type: Closed question because there is a limited choice of answers (multiple choice type)
4 Young people are good drivers. True / False
Type: Closed question - limited choice of answers
5 Good drivers are
alert - - - - - relaxed
cautious - - - - - fast reactors
older - - - - - younger
Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
6 Mark the characteristics of good drivers from the list below
[ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
Type: Closed question because there is a limited choice of answers (checklist/inventory)
7 Are you a good driver? Rate your abilities as follows (a great deal - very little: 1 2 3 4 5)
self confidence, experience, knowledge of road rules
Type: Rating scale, Likert type
8 Good drivers are
Type: Open question because it allows respondents to give any kind of answer
9 Should the age for driver's licences be increased to 21?
Type: This looks like an open question but is a closed question - a yes or no answer. It would
be an open question if you asked "What is your opinion about increasing the driving age to 21
years?"
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
age - under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
gender - closed (check male or female)
socio-economic status
personal experience of crime - closed question with a yes/no answer; for how much or how
often they personally experienced crime, use a multiple choice item or a rating
scale; for a general description, use an open ended question
levels of stress associated with different crimes - rating scale
personal reactions to different crimes - simple open ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain: A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you use, how complex the
questions are, and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes, you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open ended question that deals with the same content area.
Action: You should evaluate the impact of characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed yes/no type)
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime and lastly an open question for any
other comments the respondent may wish to add. You would therefore have five closed items, four
open items and a twelve item rating scale (a total of 21 items). The questionnaire should not
be too long or complicated.
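The breakdown above can be checked with a quick tally. The sketch below is only an illustration: the section labels and the `spec` structure are assumptions made for the example, and only the item counts come from the text.

```python
# Planned sections in presentation order: (label, item type, number of items).
# The labels are illustrative; the counts follow the breakdown in the text.
spec = [
    ("biographical information", "closed", 3),
    ("filter question (yes/no)", "closed", 1),
    ("personal experience of crime", "open", 1),
    ("stress levels for different crimes", "rating", 12),
    ("personal reactions to crimes (multiple choice)", "closed", 1),
    ("control question", "open", 1),
    ("perceptions of the effect of crime", "open", 1),
    ("any other comments", "open", 1),
]


def tally(sections):
    """Return the number of items per item type plus the overall total."""
    counts = {}
    for _label, item_type, n in sections:
        counts[item_type] = counts.get(item_type, 0) + n
    counts["total"] = sum(n for _label, _item_type, n in sections)
    return counts


counts = tally(spec)
print(counts)
```

Keeping the specification as a list like this makes it easy to see the effect on the total length when a section is added or dropped.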
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed The introduction informs respondents about the purpose convinces them that
their participation is valued motivates them to complete the questionnaire reduces their
fears regarding time and inconvenience and assures them of confidentiality and safety
Guidelines for an introduction to a questionnaire
1 the name of the person or organisation conducting the study to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information because respondents have greater freedom to
express themselves without fear that their responses would be used in a way that is not in
their interests. It is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent. Ideally
the questionnaire takes 30 minutes to complete. It also depends on the characteristics of the
respondents. Specialists are more willing to complete a longer questionnaire. For people with
low levels of literacy or education it is better to keep questionnaires short. Make sure that
each question is directly relevant, but you need to have thorough coverage of your topic to
ensure reliability and validity. The aim is to strike a balance between a concise
questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers". Items on biographical information can go at the
beginning if there are only a few, but if there are a lot of items they are better at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open ended questions at the end to allow the respondents to express
opinions or feelings that are related but have not been covered by the questions. Respondents
are then more likely to feel satisfied that answering the questions was worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions: if the answer is no, the respondent skips the next few questions.
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
personal curiosity and intuition. Good items are critical to the success of a research project.
They produce reliable data and accurate information upon which valid conclusions can be
based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is a science - it requires an in-depth knowledge of one's topic and
familiarity with the principles governing good item design - and an art - it requires
creativity in selecting or constructing items appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. You must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or misinterpreted, your
results will be useless.
Do avoid ambiguity - items that can be interpreted in a number of ways, for example:
"Visiting lecturers can help one feel less isolated."
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues, for example:
"I am fully occupied and I don't feel lonely."
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to, for example:
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer, for example:
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary, as in:
"Do you use any word processing packages such as ZZ?"
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Deliberate faking tends to occur when respondents are fully aware of what is being measured
and for what purpose, when their identity is disclosed and when they are aware that their
responses will affect them in some way. Respondents may also lie to protect their real
feelings or justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
This is the tendency to make a particular type of response: respondents tend to choose extreme
responses or repeatedly choose central responses. Design balanced questionnaires, with some
items positively stated and negatively scored, eg "People often let me down", and others
positively stated and positively scored, eg "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions:
1 What is your income?
Shortcoming: vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
Shortcoming: leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
Shortcoming: vague - "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
Shortcoming: two inherent items; 'depression' and 'often' may mean different things to
different people
5 How often do you take drugs? (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways
6 Abortion should not be legalised. (agree/disagree)
Shortcoming: too global
7 Most men are more emotionally stable than most women are. (agree/disagree)
Shortcoming: is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
Shortcoming: does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample - get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES.
Ethical checklist (record Yes/No and notes for each point)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get the questionnaire back, quickly scan it to see that they have
completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis consists of procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
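As an illustration, the difficulty index can be worked out as below. This is a minimal sketch: the pilot data and the item names are made up for the example, and only the formula and the 0.25 to 0.75 range come from the text.

```python
def difficulty_index(marks):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(marks) / len(marks)


def acceptable(index, low=0.25, high=0.75):
    """True if the difficulty index falls inside the recommended range."""
    return low <= index <= high


# Hypothetical pilot data: one list of 0/1 marks per item, one mark per respondent.
pilot = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # nearly everybody correct: too easy
    "item_2": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # about half correct
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # nearly everybody wrong: too difficult
}

indices = {name: difficulty_index(marks) for name, marks in pilot.items()}
kept = [name for name, p in indices.items() if acceptable(p)]
print(indices, kept)
```

Only the item that about half the sample gets right falls inside the range and would be kept.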
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected
if they measure the same characteristics - else they lose focus. The higher the correlation
coefficient between the item score and the total score, the more discriminating the item. A
minimum correlation of 0.2 is generally required. Items with negative or zero correlations are
almost always excluded. A negative correlation could be indicative that an item should have
been reverse scored.
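The rule of thumb above can be written out as a small decision function. This is only a sketch: the function name, the wording of the decisions and the example correlations are assumptions, and only the 0.2 minimum and the treatment of zero and negative correlations come from the text.

```python
def classify_item(item_total_r, minimum=0.2):
    """Apply the rule of thumb to one item-total correlation."""
    if item_total_r >= minimum:
        return "keep"
    if item_total_r < 0:
        return "exclude (check whether the item should have been reverse scored)"
    return "exclude"


# Hypothetical item-total correlations for four items.
correlations = {"item_1": 0.62, "item_2": 0.15, "item_3": -0.40, "item_4": 0.0}
decisions = {name: classify_item(r) for name, r in correlations.items()}

for name, decision in decisions.items():
    print(name, correlations[name], decision)
```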
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement                     1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always
I like loud music                                                                           √
I prefer quiet places         √
I enjoy noisy environments                                             √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect saying
that she always likes noisy places and she should therefore get a high score.
Statement                               1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always   Score
I like loud music                                                                                     √          5
I prefer quiet places (REVERSE SCORE)   √                                                                        5
I enjoy noisy environments                                                       √                               4
The data sheet is divided into rows and columns - one row for each person in your sample
and one column for each item in your rating scale, plus a column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest score anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
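Reverse scoring and totalling a row of the data sheet can be sketched as follows. The function names are made up for the example; the 1 to 5 scale, the three example statements and the limits of 12 and 60 for a 12-item scale come from the text.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(response):
    """Mirror a response on the 1-5 scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


def score_row(responses, reversed_items):
    """Turn ticked responses (item number -> 1..5) into item scores and a total score."""
    scores = {
        item: reverse_score(r) if item in reversed_items else r
        for item, r in responses.items()
    }
    return scores, sum(scores.values())


# The three example statements: loud music (Always), quiet places (Never),
# noisy environments (Most of the time). Item 2 is reverse scored.
ticked = {1: 5, 2: 1, 3: 4}
scores, total = score_row(ticked, reversed_items={2})
print(scores, total)

# Range of possible totals on a 12-item scale.
lowest, highest = 12 * SCALE_MIN, 12 * SCALE_MAX
print(lowest, highest)
```

Reverse scoring has to be done before the totals are added up, otherwise a respondent who consistently likes noise would get a lower total than she should.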
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly only one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way - the item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
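Instead of judging the spread by eye, the variance of each column can be calculated. The sketch below uses a made-up data sheet, and the cut-off of 0.5 is an arbitrary assumption for the example, not a rule from the text.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item (scores 1-5).
data_sheet = [
    [5, 4, 1],
    [5, 2, 3],
    [5, 5, 2],
    [4, 1, 5],
    [5, 3, 4],
]


def item_variances(rows):
    """Population variance of each item column."""
    return [pvariance(column) for column in zip(*rows)]


variances = item_variances(data_sheet)

# Flag items whose variance falls below an assumed cut-off of 0.5.
low_variance_items = [i + 1 for i, v in enumerate(variances) if v < 0.5]
print(variances, low_variance_items)
```

Item 1, where almost everybody scored 5, is flagged, while items 2 and 3 show a good spread of scores.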
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, eg anxiety, intelligence, stress, independence etcetera. The correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction
of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Item 3 (of items 1 to 12)   Total Score
1       4                           53
2       1                           12
3       5                           49
4       4                           40
5       2                           16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Looking at whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
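For the five cases shown above, the item-total correlation for item 3 can be calculated as follows. This is a minimal sketch that assumes the Pearson correlation coefficient is the index used; the function name is made up for the example and the scores are the ones on the data sheet.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


item_3 = [4, 1, 5, 4, 2]       # item 3 scores for cases 1 to 5
totals = [53, 12, 49, 40, 16]  # total scores for the same cases

r = pearson_r(item_3, totals)
print(round(r, 2))
```

The value is well above the minimum of 0.2, which agrees with the impression that item 3 discriminates well. With only five cases this is no more than an illustration.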
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually because they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8 item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability refers to how consistently the questionnaire measures that which it is supposed
to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, eg the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
Test-retest reliability shows how consistent the results of a questionnaire are over different
occasions. Administer the same questionnaire to the same group on two consecutive occasions.
The two sets of scores are correlated, and the correlation coefficient represents the degree
of test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the scores on the second occasion were identical
to those on the first. It indicates that each person's position relative to that of the others
in the group stays the same. The time interval should be at least several days to reduce the
possibility of effects such as familiarity with the type of items or respondents remembering
their answers.
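The point about relative position can be illustrated with a minimal sketch. The scores below are
invented for illustration, and the helper name pearson is an assumption of this sketch, not part
of the study material. Every respondent scores three points higher on the second occasion, so no
score is identical, yet the correlation stays at (or within rounding of) 1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores of five respondents on the first occasion.
first_occasion = [12, 18, 25, 30, 34]
# Everyone scores 3 points higher on the second occasion, so the
# relative positions in the group are unchanged.
second_occasion = [score + 3 for score in first_occasion]

test_retest_reliability = pearson(first_occasion, second_occasion)
print(round(test_retest_reliability, 2))
```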
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable than a longer
one, so the correlation between the two halves is used to estimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire is called the split-half reliability,
and it measures the degree of equivalence between the two halves, that is, the extent to which
they measure the same attribute. It reflects the consistency and indicates the degree of
relatedness of the items. This type of reliability is therefore also regarded as a measure of
the internal consistency, and the closer the split-half reliability is to 1, the higher the
internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items included
in each of the two halves.
Action: Look at these eight items. Re-number your items from 1 to 8 and divide the rating
scale into two halves by grouping the odd items and the even items together. For each
person calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Person   1   3   5   7   Total score   2   4   6   8   Total score
1
2
3
4
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
that follows the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
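As a rough numerical counterpart to the scatterplot, the odd-even procedure can be sketched as
follows. The ratings are invented, and the helper name pearson is an assumption of this sketch.
The last step uses the Spearman-Brown formula to step the half-scale correlation up to an
estimate for the full eight-item scale. That formula is a standard technique but is not named in
this study material, so treat it as an optional extra.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings (1 to 5) of four respondents on items 1 to 8.
ratings = [
    [4, 5, 4, 4, 5, 4, 5, 5],
    [2, 1, 2, 2, 1, 2, 1, 2],
    [3, 3, 4, 3, 3, 3, 4, 3],
    [5, 4, 5, 5, 4, 5, 5, 4],
]

# Items 1, 3, 5, 7 sit at indices 0, 2, 4, 6; items 2, 4, 6, 8 at indices 1, 3, 5, 7.
odd_totals = [sum(person[0::2]) for person in ratings]
even_totals = [sum(person[1::2]) for person in ratings]

# Correlation between the two halves: the reliability of either half.
half_correlation = pearson(odd_totals, even_totals)

# Assumed extra step (Spearman-Brown): estimate for the whole scale.
whole_scale_estimate = 2 * half_correlation / (1 + half_correlation)

print(odd_totals, even_totals)
print(round(half_correlation, 2), round(whole_scale_estimate, 2))
```

With only four respondents the estimate is very unstable. The sketch shows the arithmetic only
and is not a realistic sample size.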
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure,
that is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of a
single validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this
construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - how far one can make conclusions about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state, for example, the country for which the questionnaire has been developed and the age
group, where these are relevant to the subject matter of the questionnaire. A brief
description of the design of the questionnaire should be provided. The type of items should
also be indicated, for example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent they are representative in terms of those characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis, and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents can complete it on their own or supervision is needed, the material
needed, and how to deal with a person asking for an explanation.
Provide instructions for scoring. For items with a correct answer, a correct answer could
score 1 and an incorrect answer 0. For rating scales, on a five-point rating scale 1 = do not
at all agree and 5 = strong agreement with the statement. The total score indicates the
person's attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
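Scoring instructions of this kind can be made concrete with a small sketch. It assumes a
five-point scale and assumes that the even-numbered items are the unfavourable statements that
need reverse scoring. The function name total_score, the parameter reversed_items and the
answers are all hypothetical.

```python
SCALE_MAX = 5  # five-point scale: 1 = do not at all agree, 5 = strong agreement


def total_score(ratings, reversed_items):
    """Sum one respondent's ratings after reverse-scoring selected items.

    ratings        -- list of ratings, item 1 first
    reversed_items -- set of 1-based item numbers that are reverse scored
    """
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if not 1 <= rating <= SCALE_MAX:
            raise ValueError(f"item {item_number}: rating {rating} is off the scale")
        if item_number in reversed_items:
            # Reverse scoring: 1 becomes 5, 2 becomes 4, 3 stays 3, and so on.
            rating = SCALE_MAX + 1 - rating
        total += rating
    return total


# Hypothetical answers of one respondent to an 8-item scale in which
# items 2, 4, 6 and 8 are assumed to be unfavourable statements.
answers = [5, 1, 4, 2, 5, 1, 4, 2]
score = total_score(answers, reversed_items={2, 4, 6, 8})
print(score)
```

Without reverse scoring, strong disagreement with an unfavourable statement would lower the
total even though it expresses a favourable attitude.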
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire, that is,
how the questionnaire questions should be tackled. They are usually provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to or be interpreted
differently by people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focuses on the topic
Rate 2: focuses on the topic but the topic is not covered in full
Rate 3: focuses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
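The item-sequence rating is a counting rule, and a minimal sketch may make it easier to apply
consistently. The function name rate_item_sequence and its parameter names are hypothetical.
The judgements themselves (whether a problem is present) still have to be made by the evaluator.

```python
def rate_item_sequence(has_relevant_items, introduces_bias,
                       is_inefficient, causes_discomfort):
    """Return the rating (0 to 4) for questionnaire item sequence.

    has_relevant_items -- False if there are no items or all items are irrelevant
    introduces_bias    -- problem (1): response style or bias
    is_inefficient     -- problem (2): inefficient responding
    causes_discomfort  -- problem (3): emotional discomfort
    """
    if not has_relevant_items:
        return 0
    # Booleans add up as 0 or 1, so this counts the problems present.
    problems = sum([introduces_bias, is_inefficient, causes_discomfort])
    # Three problems rate 1, two rate 2, one rates 3, none rates 4.
    return 4 - problems


print(rate_item_sequence(True, introduces_bias=True,
                         is_inefficient=False, causes_discomfort=False))
```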
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two functions
Rate 2: the structure limits its functionality with regard to one function
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
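Because the rating for logical sequence depends on the rating for purpose areas, the two rules
can be sketched together. This is only an illustration. The labels "nature", "functionality" and
"instructions" and both function names are assumptions of the sketch, and the evaluator still
decides which areas a manual actually covers.

```python
# Purpose areas in their logical sequence (a), (b), (c).
LOGICAL_ORDER = ["nature", "functionality", "instructions"]


def rate_purpose_areas(areas_in_manual):
    """Rating 1.1: the number of distinct purpose areas covered (0 to 3)."""
    return len(set(areas_in_manual) & set(LOGICAL_ORDER))


def rate_logical_sequence(areas_in_manual):
    """Rating 1.2: based on rating 1.1 and the order in which areas appear."""
    covered = rate_purpose_areas(areas_in_manual)
    if covered <= 1:
        return 0
    # Keep the first appearance of each known purpose area.
    seen = []
    for area in areas_in_manual:
        if area in LOGICAL_ORDER and area not in seen:
            seen.append(area)
    in_order = seen == sorted(seen, key=LOGICAL_ORDER.index)
    if covered == 2:
        return 2 if in_order else 1
    return 4 if in_order else 3


# A hypothetical manual that covers all three areas but opens with the instructions.
print(rate_logical_sequence(["instructions", "nature", "functionality"]))
```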
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
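1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups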
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a) (b) and (c)
Rate 3: any one of (a) (b) and (c)
Rate 4: none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of
questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1) (2) and (3)
Rate 3: all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1) (2) and (3)
Rate 3: all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
The MPQ consists of 50 multiple-choice items of the following form:
If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 to be really good but that
the remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus
50 items were retained. The validity coefficient of 0.91 is high.
Exercise: Indicate whether a questionnaire would be a suitable measuring instrument
TOPIC - YES OR NO - REASON
1 Support for political parties - Yes - You want to find out facts
2 Preference for different types of beer - Yes
3 Typing skills - No - Looking at a practical ability
4 Opinions about the parole system - Yes - Attitudes
5 Parenting practices - Yes - To examine effectiveness use observation
6 Effect of personality on intelligence - Yes and No - Questionnaire to measure aspects of
personality, but you need a separate test to measure intelligence. For the relation between
personality and intelligence you would need the right kind of research design
2 Design a questionnaire
Outcome product
A questionnaire specification document
Method
Activity 2.1 Decide on item format and scaling method
Activity 2.2 Decide on the total number of items
Activity 2.3 Design the layout for the questionnaire
Resource reference
Item format
Layout of the questionnaire
Specification document for a questionnaire
Item format
1 Closed questions
A closed question offers respondents a limited choice of alternative replies, whereas an open
question is one that allows the respondents to answer in any way they want to. Types of
closed question:
yes/no type
true/false type
multiple-choice type
rating scales
1.1 Inventories and checklists
Also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform and therefore makes it easier to compare people's
answers. Closed questions are quicker to answer, and sensitive issues are often better
addressed. The main disadvantage is that they force the respondent to answer in terms of the
alternatives offered and nothing else. This can lead to a loss of spontaneity and a loss of
rapport if respondents become irritated. To counter this, offer an additional option such as
"other". Closed questions can direct the respondents' thinking and may also influence their
answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions
invariably elicit some irrelevant and repetitious information, and answering them requires a
considerable degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs,
attitudes and values. These are complex issues that have to do with states of mind and are
therefore more difficult to measure. They are usually multifaceted. Therefore, to measure
non-factual topics the tendency is to use rating scales, on which respondents indicate the
extent to which they agree or disagree. Ratings may be influenced by a person's mood on the
day or by political events in the country at the time.
The following guidelines can be followed when compiling a rating scale:
1 Define the dimension being rated. Each item or statement to be rated must refer to only
one thing or dimension. With "Rate friendliness and efficiency" you are confusing two
different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. Use an uneven number in order to
have a neutral category in the middle, but note that people may tend to choose the neutral one
(error of central tendency).
4 Define the different rating categories. They must be mutually exclusive - each rating
category should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect
different attitudes toward the topic in question. Their main function is to classify people
with respect to a certain attitude.
3.1 Likert scales
Also known as a summated scale. "A summated attitude scale may be described as a rating
scale in which a subject indicates the extent to which he or she agrees (or disagrees) with
statements." These statements usually deal with a social or political issue. The respondent
marks the point that best reflects his or her attitude. The scores are added up to obtain a
total score (summated scale). Ensure that the scale is uni-dimensional - all the items
measure the same dimension or topic. It is important to have both favourable and
unfavourable statements so that you do not influence the respondent. Likert scales usually
have the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It is a seven-point rating scale, and the
scale points on each end are defined by opposing adjectives:
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people’s endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents’ interest as well
as to collect all the relevant information.
Item and type
1 Do you have a valid driver’s licence? Yes / No
  Type: Closed question - limited choice of answers
2 Why do people need to have a valid driver’s licence?
  Type: Open question - respondents state their own opinions; allows for any kind of answer
3 People should have a driver’s licence (choose one answer)
  [ ] for identification purposes
  [ ] to prove that they can drive
  [ ] in case they have an accident
  Type: Closed question because there is a limited choice of answers (multiple choice type)
4 Young people are good drivers. True / False
  Type: Closed question - limited choice of answers
5 Good drivers are
  alert - - - - - relaxed
  cautious - - - - - fast reactors
  older - - - - - younger
  Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
6 Mark the characteristics of good drivers from the list below
  [ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
  Type: Closed question because there is a limited choice of answers (checklist/inventory)
7 Are you a good driver? Rate your abilities as follows (from “a great deal” to “very little”, 1 2 3 4 5)
  self confidence, experience, knowledge of road rules
  Type: Rating scale, Likert type
8 Good drivers are
  Type: Open question because it allows respondents to give any kind of answer
9 Should the age for driver’s licences be increased to 21?
  Type: This looks like an open question but is a closed question - it calls for a yes or no
  answer. It would be an open question if you asked “What is your opinion about increasing the
  driving age to 21 years?”
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
age - under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
gender - closed (check male or female)
socio-economic status
personal experience of crime - closed question with a yes/no answer; for how much or how
often they personally experienced crime, use a multiple choice item or a rating scale;
for a general description, use an open-ended question
levels of stress associated with different crimes - rating scale
personal reactions to different crimes - simple open-ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain. A specification document is really just a list of the
required characteristics for your questionnaire, in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you will use, how complex
the questions are and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes, you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open-ended question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (twelve of which are
grouped into a rating scale). We could break down the coverage of the content areas as
follows: the first three items would be closed questions to collect biographical
information; then a filter question (closed, yes/no type) followed by an open question on
personal experience of crime; a rating scale (consisting of twelve items) on levels of
stress associated with different crimes; a closed (multiple choice) question on personal
reactions to crimes and an open question to serve as a control; an open question on
perceptions of the effect of crime; and lastly an open question for any other comments the
respondent may wish to add. Therefore you have five closed items, four open items and a
twelve-item rating scale (a total of 21 items). The questionnaire should not be too long or
complicated.
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information: respondents have greater freedom to express
themselves without fear that their responses will be used in a way that is not in their
interests. This is important in surveys that involve ‘sensitive’ topics.
3 Length of the questionnaire
Depends on the topic and the degree of interest it holds for the respondent. Ideally it
should take 30 minutes to complete. Length also depends on the characteristics of the
respondents. Specialists are more willing to complete a longer questionnaire. For people
with low levels of literacy or education it is better to keep questionnaires short. Make
sure that each question is directly relevant, but you also need thorough coverage of your
topic to ensure reliability and validity. The aim is to strike a balance between a concise
questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents’ minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve-item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as “To help us classify your answers ...”. Items on biographical information can go at
the beginning if there are only a few, but if there are a lot of items they are better
placed at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with ‘awareness’ questions relating to the topic in general, followed by ‘factual’
questions dealing with the respondents’ own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question on the same content area serve as a sort of validity check for that area.
6 Place one or more open-ended questions at the end to allow the respondents to express
opinions or feelings that are related but have not been covered by the questions.
Respondents are then more likely to feel satisfied that answering the questions was worth
the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions: if the answer is no, the respondent skips the next few questions.
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is a science - it requires an in-depth knowledge of one’s topic and
familiarity with the principles governing good item design. It is also an art - it requires
creativity in selecting or constructing items appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability
and validity have already been established. Even so, you must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do’s and Don’ts
Do read each item and ask yourself if the item relates to your topic.
Don’t be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don’t use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax.
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents’ perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways:
“Visiting lecturers can help one feel less isolated”
(Does this mean that the lecturers do the visiting - or do the students?)
Don’t ask questions with two inherent issues:
“I am fully occupied and I don’t feel lonely”
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions ‘and’ or ‘or’ to see if they contain
more than one possible issue.
Wherever possible, don’t use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to:
“It is believed by students that they will be given extension by lecturers”
The following is simpler:
“Students believe lecturers will give them extension”
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct ‘closed’ items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Those that influence respondents to give a particular answer.
Don’t write items that encourage respondents to give a particular answer:
“Discrimination in South Africa is horrific, isn’t it?”
Don’t give examples unless it is really necessary:
“Do you use any word processing packages such as ZZ?”
4 Problems relating to response bias or response style
Tendencies to choose a particular type of answer.
4.1 Social desirability response bias
Tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake answers when they are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware
that their responses will affect them in some way. Respondents may also lie to protect
their real feelings, to justify their behaviour, or because they do not want to admit
their ignorance.
4.2 Response styles
Tendency to make a particular type of response: some respondents tend to choose extreme
responses, while others repeatedly choose central responses. Design balanced
questionnaires that mix both kinds of item:
Positively stated and negatively scored, e.g. “People often let me down”
Positively stated and positively scored, e.g. “I trust people”
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions:
1 What is your income?
- vague
2 Don’t you disagree with yesterday’s Parliamentary decision regarding smoking and
drinking? (Yes/No)
- leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
- vague: “Who is ‘we’?”, “Less passive than what?” and “What environment?”
4 I feel depressed and sad (never/sometimes/often/all the time)
- two inherent items; ‘depression’ and ‘often’ may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
- imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
- too global
7 Most men are more emotionally stable than most women are (agree/disagree)
- is a leading question
8 Suppose you are measuring Unisa students’ level of motivation and one of your items is
“How many hours do you spend studying each week?”
- does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample - get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one’s scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES.
Ethical checklist (answer Yes/No for each point and add notes where needed)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don’t have to complete the questionnaire
- You won’t hold it against them if they decide not to do it
- You won’t tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won’t be on the questionnaire
- You won’t show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you
can make improvements. When you get the questionnaire back, quickly scan it to see that
they have completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first
is to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Procedures to select the best items for inclusion. Commonly used criteria are item
difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
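The calculation described above can be sketched in Python. This is only an illustration: the function names `difficulty_index` and `is_acceptable` and the sample responses are hypothetical, and responses are assumed to be coded 1 (correct) or 0 (incorrect).

```python
def difficulty_index(responses):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def is_acceptable(index, low=0.25, high=0.75):
    """Apply the 0.25 to 0.75 guideline for the difficulty index."""
    return low <= index <= high


# Hypothetical pilot data: one list of 0/1 responses per item
items = {
    "item_1": [1, 1, 0, 1, 0, 1, 0, 0],  # 4 of 8 correct
    "item_2": [1, 1, 1, 1, 1, 1, 1, 0],  # 7 of 8 correct, probably too easy
}

indices = {name: difficulty_index(r) for name, r in items.items()}
for name, p in indices.items():
    verdict = "keep" if is_acceptable(p) else "review or discard"
    print(f"{name}: difficulty index = {p:.2f} -> {verdict}")
```

With this made-up data, item_1 has an index of 0.50 and falls inside the guideline, while item_2 has an index of about 0.88 and would be a candidate for removal.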
2 Item discrimination
The ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic - otherwise the instrument loses focus. The higher the
correlation between the item and the total score, the more discriminating the item. A
minimum correlation of 0.2 is generally required. Items with negative or zero correlations
are almost always excluded. A negative correlation could indicate that an item should have
been reverse scored.
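As a rough illustration of these rules of thumb, the sketch below sorts items by their item-total correlation. The function name `screen_item` and the correlation values are hypothetical; the correlations are assumed to have been calculated already.

```python
def screen_item(item_total_r, minimum=0.2):
    """Classify an item by its item-total correlation using the guidelines in the notes."""
    if item_total_r < 0:
        # A negative correlation may mean the item should have been reverse scored
        return "check reverse scoring, otherwise exclude"
    if item_total_r < minimum:
        return "exclude"
    return "keep"


# Hypothetical item-total correlations
correlations = {"item_1": 0.55, "item_2": 0.10, "item_3": -0.40}
decisions = {name: screen_item(r) for name, r in correlations.items()}
for name, decision in decisions.items():
    print(name, correlations[name], decision)
```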
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don’t
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement                     1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always
I like loud music                                                                           √
I prefer quiet places         √
I enjoy noisy environments                                             √
The ticked options are known as item responses. Item 2 in our example should be
‘reverse-scored’ because if somebody says she never likes quiet places, she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Statement                             1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always   Score
I like loud music                                                                                   √          5
I prefer quiet places REVERSE SCORE   √                                                                        5
I enjoy noisy environments                                                     √                               4
The data sheet is divided into rows and columns - one row for each person in your sample,
and one column for each item in your rating scale plus one for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent’s total score. The lowest score anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases   Items                                   Total Score
        1  2  3  4  5  6  7  8  9  10  11  12
1       5  5  4
2       3  3
3       3  2
4       3  4
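The scoring steps can be sketched in Python as follows. The function names are hypothetical, and the raw response of 1 for “I prefer quiet places” is assumed from the reverse-scoring explanation above. On a 1 to 5 scale, reverse scoring turns 1 into 5, 2 into 4 and so on.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(response):
    """Mirror a rating on the 1 to 5 scale: 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1."""
    return SCALE_MIN + SCALE_MAX - response


def total_score(responses, reversed_items=()):
    """Sum the item scores; reversed_items holds the 1-based numbers of reverse-scored items."""
    total = 0
    for number, response in enumerate(responses, start=1):
        total += reverse_score(response) if number in reversed_items else response
    return total


# The three example statements: raw responses 5, 1 and 4, with item 2 reverse scored
example = [5, 1, 4]
print(total_score(example, reversed_items={2}))  # 5 + 5 + 4 = 14

# Range check for a 12-item scale
print(total_score([1] * 12), total_score([5] * 12))  # 12 60
```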
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn’t
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
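The same check can be done numerically instead of by eye. The sketch below is only an illustration: the data sheet is made up (five respondents, three items), and the notes give no cut-off for “too little” variance, so the items are simply ranked.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item
data_sheet = [
    [5, 3, 4],
    [1, 3, 4],
    [4, 3, 5],
    [2, 3, 4],
    [3, 4, 4],
]


def item_variances(rows):
    """Return the variance of each column (item) of the data sheet."""
    return [pvariance(column) for column in zip(*rows)]


variances = item_variances(data_sheet)

# List the items from least to most variance; the first ones deserve a closer look
ranked = sorted(enumerate(variances, start=1), key=lambda pair: pair[1])
for item_number, variance in ranked:
    print(f"item {item_number}: variance = {float(variance):.2f}")
```

Here item 1 has a good spread of scores, while items 2 and 3 consist almost entirely of one number and so show little variance.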
It is not always possible to explain why most people end up answering an item in the same
way - the item may have been too extremely worded, it may be too vague, or there may be a
strong ‘socially desirable’ way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence etcetera. The
correlation coefficient describes the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The
correlation coefficient ranges from -1 to +1. A value close to 0 indicates a weak
relationship, while 0 represents no correlation. The numerical size of a correlation
coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation. If there is a perfect positive
relation between two constructs (a correlation coefficient of +1), the dots form a perfectly
straight line with an upward slope. For a correlation coefficient of -1 the dots form a
perfectly straight line with a downward slope. No relation (a correlation coefficient of 0)
between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Item 3   Total Score
1       4        53
2       1        12
3       5        49
4       4        40
5       2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total
correlation. Professional questionnaire constructors usually calculate a correlation
coefficient (an index of how strongly two variables are related) to establish how strong
each item-total correlation is.
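As a sketch of how such a coefficient could be calculated, the Python below computes a Pearson correlation for the five cases shown for item 3. The helper name `pearson_r` is hypothetical and Pearson's formula is an assumption, since the notes do not name the coefficient. The total score here includes the item itself.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


# Scores of the five cases on item 3 and on the whole scale
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson_r(item_3, totals)
print(f"item-total correlation for item 3: {r:.2f}")  # roughly 0.94
```

A value this close to +1 agrees with the impression that item 3 discriminates well between high and low scorers.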
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a
diagonal line from bottom left to top right. This means that there is a strong correlation
between the item score and the total score - the item discriminates well. If the dots don’t
seem to form a pattern at all, then there is no correlation; if the line seems to go from
top left to bottom right, then there is a negative correlation (the item does discriminate,
but the wrong way round, so it is no good). You will have to draw 12 scatter plots (one for
each of the 12 items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don’t discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don’t show much variance and the
list of items that don’t discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
How consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
The reliability coefficient is a statistical index of reliability; its values range between
0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more
consistent the results. Test-retest reliability thus indicates stability or consistency of
scores over time.
A perfect correlation does not indicate that the second scores were identical to the first;
it indicates that each person’s position relative to that of the others in the group stays
the same. The time interval should be at least several days to reduce the possibility of
effects such as familiarity with the type of items or respondents remembering their answers.
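The point that a perfect correlation does not require identical scores can be checked with a small Python sketch. The scores are made up and `pearson_r` is a hypothetical helper. Every respondent scores 3 points higher on the second occasion, so the relative positions are unchanged.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


# Hypothetical total scores of five respondents on two occasions
first_occasion = [20, 25, 30, 35, 40]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

test_retest_r = pearson_r(first_occasion, second_occasion)
print(first_occasion != second_occasion)  # the scores differ
print(round(test_retest_r, 6))            # yet the correlation is still 1.0
```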
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1,
the greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to the possible
memory effects experienced with test-retest reliability. A disadvantage of alternate-forms
reliability is that it is expensive and time-consuming, and that it is difficult to produce
truly parallel forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the halves tends to underestimate the reliability of the whole
questionnaire. The estimated reliability of the whole questionnaire is called the
split-half reliability; it measures the degree of equivalence between the two halves, that
is, the extent to which they measure the same attribute. It reflects the consistency and
indicates the degree of relatedness of the items. This type of reliability is therefore
also regarded as a measure of internal consistency: the closer the split-half reliability
is to 1, the higher the internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability
coefficient should be above 0.90. A reliability coefficient of 0.70 can be useful if the
results are used in combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items
included in each of the two halves.
Action: Re-number your eight remaining items from 1 to 8, then divide the rating scale into
two halves by grouping the odd items together and the even items together. For each
person calculate the total score for the odd items and for the even items.
Cases   Odd items (1 3 5 7) Total Score   Even items (2 4 6 8) Total Score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action In this context, the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
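The split-half procedure described in this activity can also be worked through numerically instead of graphically. The sketch below is only an illustration: the ratings are invented (four people, eight items, five-point scale), the function names are hypothetical, and the Pearson correlation is assumed as the correlation coefficient because the notes do not name a specific formula.

```python
from math import sqrt


def split_half_totals(responses):
    """Return (odd_totals, even_totals) for a list of per-person item ratings.

    Items are numbered from 1, so list positions 0, 2, 4, ... hold the odd items.
    """
    odd_totals = [sum(person[0::2]) for person in responses]
    even_totals = [sum(person[1::2]) for person in responses]
    return odd_totals, even_totals


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one half has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings: one row per person, one column per item (items 1 to 8).
responses = [
    [4, 5, 4, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 3, 3, 4],
    [5, 4, 5, 5, 4, 4, 5, 5],
]

odd, even = split_half_totals(responses)
r = pearson_r(odd, even)
print("Odd-item totals: ", odd)
print("Even-item totals:", even)
print("Split-half reliability estimate:", round(r, 2))
```

With real data the coefficient is read as described in this activity: the closer it is to 1, the more reliable the rating scale.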
Validity
Validity is the extent to which a questionnaire measures what it claims to measure, and the
extent to which the scores can be used for the intended purpose. There are three categories
of validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time, you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course
and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - whether conclusions can be made about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration and scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group this questionnaire has been developed. A brief
description of the design of the questionnaire should be provided. The type of items should
also be indicated, for example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent they are representative in terms of those characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis, and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or need supervision, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring.
A correct answer could score a 1 and an incorrect answer a 0. For rating scales, on a
five-point rating scale 1 = do not agree at all and 5 = strong agreement with the statement.
The total score indicates the person's attitude towards the topic under investigation.
Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
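The scoring instructions for a rating scale, including reverse scoring, can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the function name, the four invented ratings, and the choice of items 2 and 4 as the unfavourably worded items that need reversing. On a five-point scale a reversed rating is taken as 6 minus the original rating.

```python
def score_rating_scale(ratings, reverse_items=(), points=5):
    """Return the total (summated) score for one respondent.

    ratings       -- list of ratings, one per item, in item order (item 1 first)
    reverse_items -- item numbers (starting at 1) that must be reverse scored
    points        -- number of points on the rating scale (5 by default)
    """
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if not 1 <= rating <= points:
            raise ValueError(f"item {item_number}: rating {rating} is off the scale")
        if item_number in reverse_items:
            # Reverse scoring: 1 becomes 5, 2 becomes 4, and so on.
            rating = points + 1 - rating
        total += rating
    return total


# Hypothetical respondent: items 2 and 4 are unfavourably worded statements.
ratings = [5, 1, 4, 2]
total = score_rating_scale(ratings, reverse_items={2, 4})
print("Total score:", total)
```

A manual would state which items are reverse scored so that every administrator arrives at the same total.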
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire
(that is, how the questions should be tackled). The instructions can be provided in a
cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3 absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items are presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing open
ended questions at the end.
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
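The item-sequence rubric is a simple counting rule: start from 4 and lose one point for each of the three sequence problems, with 0 reserved for a questionnaire that has no (relevant) items. The sketch below expresses that rule; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def rate_item_sequence(has_relevant_items, introduces_bias,
                       forces_inefficiency, causes_discomfort):
    """Return the item-sequence rating (0 to 4) for a questionnaire.

    has_relevant_items  -- False if there are no items or all items are irrelevant
    introduces_bias     -- sequence seems to introduce a response style or bias
    forces_inefficiency -- sequence forces respondents to respond inefficiently
    causes_discomfort   -- sequence makes respondents emotionally uncomfortable
    """
    if not has_relevant_items:
        return 0
    # Each True counts as one problem; three problems give 1, none gives 4.
    problems = sum([introduces_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems


# Hypothetical evaluation: only the bias problem was observed.
rating = rate_item_sequence(True, True, False, False)
print("Item sequence rating:", rating)
```

The judgement about whether each problem is present still has to be made by the evaluator; the function only combines those judgements into the rating.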
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
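Because the logical-sequence rating depends on the purpose-area rating, it can help to see the two rules side by side. The following sketch is one way of writing them down; the function names are hypothetical, and the judgement of whether the areas appear in the order (a), (b), (c) is passed in as a plain True or False.

```python
def rate_purpose_areas(covers_nature, covers_functionality, covers_instructions):
    """Purpose-area rating: one point for each purpose area the manual covers (0 to 3)."""
    return sum([covers_nature, covers_functionality, covers_instructions])


def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Logical-sequence rating (0 to 4), derived from the purpose-area rating."""
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose-area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


# Hypothetical manual: covers all three purpose areas, but out of order.
areas = rate_purpose_areas(True, True, True)
sequence = rate_logical_sequence(areas, in_logical_sequence=False)
print("Purpose areas:", areas, "Logical sequence:", sequence)
```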
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a), (b) and (c)
Rate 3 any one of (a), (b) and (c)
Rate 4 none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but the characteristics of the sample group differ from
the target population
Rate 2 describes the sample used and the characteristics of the sample group correspond
to the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of
questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
1.1 Inventories and checklists
Also a form of closed question, used to obtain straightforward information.
1.2 Advantages and disadvantages of closed questions
The set of alternative answers is uniform, which makes it easier to compare people's
answers. Closed questions are quicker to answer, and sensitive issues are often better
addressed. The main disadvantage is that they force the respondent to answer in terms of
the alternatives offered and nothing else. This can lead to a loss of spontaneity, and to a
loss of rapport if respondents become irritated. Offer an additional option such as "other".
Closed questions can direct the respondents' thinking and may also influence their answers.
2 Open questions
Phrase the question carefully if you want more than just a yes or no answer. Open questions
invariably elicit some irrelevant and repetitious information, and they also require a
considerable degree of language proficiency and communication skills.
3 Rating scales
Rating scales are used to measure complex or non-factual topics such as opinions, beliefs,
attitudes and values. These are complex issues that have to do with states of mind and are
therefore more difficult to measure. They are usually multifaceted. Therefore, to measure
non-factual topics, the tendency is to use rating scales on which respondents indicate the
extent to which they agree or disagree. Ratings may be influenced by a person's mood on
the day or by political events in the country at the time.
The following guidelines can be followed when compiling a rating scale
1 Define the dimension being rated. Each item or statement to be rated must refer to only
one thing or dimension. If you ask respondents to "Rate friendliness and efficiency", you are
confusing two different dimensions.
2 Decide on the number of ratings for the scale.
3 Decide whether to use an even or uneven number of ratings. An uneven number provides
a neutral category in the middle, but people may tend to choose the neutral one (error
of central tendency).
4 Define the different rating categories. They must be mutually exclusive - each rating
category should mean something different.
Attitude scales are rating scales that consist of a group of items designed to reflect
different attitudes toward the topic in question Their main function is to classify people
with respect to a certain attitude
3.1 Likert scales
A Likert scale is also known as a summated scale. A summated attitude scale may be
described as a rating scale in which a subject indicates the extent to which he or she agrees
(or disagrees) with statements. These statements usually deal with a social or political
issue. The respondent marks the point that best reflects his or her attitude. The scores are
added up to obtain a total score (summated scale). Ensure that the scale is uni-dimensional -
all the items measure the same dimension or topic. It is important to have both favourable
and unfavourable statements so that you do not influence the respondent. Likert scales
usually have the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It uses a seven point rating scale on which
the scale points on each end are defined by opposing adjectives.
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people's endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents' interest as well
as to collect all the relevant information.
Item 1: Do you have a valid driver's licence? Yes / No
Type: Closed question - limited choice of answers
Item 2: Why do people need to have a valid driver's licence?
Type: Open question - respondents state their own opinions and any kind of answer is allowed
Item 3: People should have a driver's licence (choose one answer)
[ ] for identification purposes
[ ] to prove that they can drive
[ ] in case they have an accident
Type: Closed question because there is a limited choice of answers (multiple choice type)
Item 4: Young people are good drivers. True / False
Type: Closed question - limited choice of answers
Item 5: Good drivers are
alert - - - - - relaxed
cautious - - - - - fast reactors
older - - - - - younger
Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
Item 6: Mark the characteristics of good drivers from the list below
[ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
Type: Closed question because there is a limited choice of answers (checklist/inventory)
Item 7: Are you a good driver? Rate your abilities as follows (from a great deal to very little, rated 1 2 3 4 5)
self confidence, experience, knowledge of road rules
Type: Rating scale, Likert type
Item 8: Good drivers are
Type: Open question because it allows respondents to give any kind of answer
Item 9: Should the age for driver's licences be increased to 21?
Type: This looks like an open question but is a closed question - a yes or no answer. It would be an open question if you asked "What is your opinion about increasing the driving age to 21 years?"
Action Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
- age - under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
- gender - closed (check male or female)
- socio-economic status
- personal experience of crime - closed question with a yes/no answer
- how much or how often they personally experienced crime - use a multiple choice item or a
rating scale
- general description - use an open ended question
- levels of stress associated with different crimes - rating scale
- personal reactions to different crimes - simple open ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain
A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you will use, how complex
the questions are, and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes, you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least twelve items.
It is also useful to have more than one item dealing with the same aspect to serve as a
control, so that you can see whether the respondent is answering questions consistently or
not. For example, in addition to your rating scale you might also have an open-ended
question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information; then a filter question (closed, yes/no type),
followed by an open question on personal experience of crime; a rating scale (consisting of
twelve items) on levels of stress associated with different crimes; a closed (multiple-choice)
question on personal reactions to crimes and an open question to serve as a control; an
open question on perceptions of the effect of crime; and lastly an open question for any
other comments the respondent may wish to add. You therefore have five closed items, four
open items and a twelve-item rating scale (a total of 21 items). The questionnaire should
not be too long or complicated.
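The item breakdown above can be checked with a small tally. This is only a sketch: the content-area labels are paraphrased from the example, and the names SPEC and count_by_type are hypothetical.

```python
# Hypothetical specification for the 21-item example questionnaire.
# Each entry: (content area, item type, number of items).
SPEC = [
    ("biographical information", "closed", 3),
    ("filter question (yes/no)", "closed", 1),
    ("personal experience of crime", "open", 1),
    ("stress associated with different crimes", "rating", 12),
    ("personal reactions to crimes (multiple choice)", "closed", 1),
    ("control question", "open", 1),
    ("perceptions of the effect of crime", "open", 1),
    ("other comments", "open", 1),
]


def count_by_type(spec):
    """Return a dict mapping each item type to its total number of items."""
    counts = {}
    for _area, item_type, n_items in spec:
        counts[item_type] = counts.get(item_type, 0) + n_items
    return counts


counts = count_by_type(SPEC)
total_items = sum(counts.values())
print(counts, total_items)
```

Keeping the specification as data makes it easy to see the effect on the total when a content area gains or loses an item.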
Layout of the questionnaire
1 Introduction and covering letter
A well-designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire. It should give:
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that the respondents' participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information because respondents have greater freedom to
express themselves without fear that their responses will be used in a way that is not in
their interests. This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The appropriate length depends on the topic and the degree of interest it holds for the
respondent. Ideally it should take about 30 minutes to complete. Length also depends on the
characteristics of the respondents: specialists are more willing to complete a longer
questionnaire, whereas for people with low levels of literacy or education it is better to
keep questionnaires short. Make sure that each question is directly relevant, but you also
need thorough coverage of your topic to ensure reliability and validity. The aim is to
strike a balance between a concise questionnaire and one that is inclusive enough to ensure
validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve-item rating scale (moving from the general to the more specific). This is the
funnel approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers ...". If there are only a few items on
biographical information they can be placed at the beginning, but if there are a lot of
items they are better placed at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open-ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the
questions. Respondents are then more likely to feel satisfied that answering the questions
was worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses)
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions. If the answer is no, skip the next few questions.
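The skip rule for a filter question can be sketched as follows. The question labels and the function name questions_to_ask are hypothetical and serve only to illustrate the routing.

```python
def questions_to_ask(filter_answer_is_yes, all_questions, skip_if_no):
    """Return the questions a respondent should answer.

    all_questions: ordered list of question labels.
    skip_if_no: labels that are skipped when the filter answer is 'no'.
    """
    if filter_answer_is_yes:
        return list(all_questions)
    return [q for q in all_questions if q not in skip_if_no]


# Hypothetical labels for the questions that follow the filter question
QUESTIONS = ["Q5 personal experience", "Q6 stress rating scale", "Q7 other comments"]
SKIP = {"Q5 personal experience"}

asked_yes = questions_to_ask(True, QUESTIONS, SKIP)
asked_no = questions_to_ask(False, QUESTIONS, SKIP)
print(asked_yes)
print(asked_no)
```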
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is both a science and an art. As a science it requires an in-depth
knowledge of one's topic and familiarity with the principles governing good item design.
As an art it requires creativity in selecting or constructing items appropriate to the
particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability
and validity have already been established. You must still critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate or you may have to eliminate a number of unsuitable items
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, where an item can be interpreted in a number of ways:
"Visiting lecturers can help one feel less isolated."
(Does this mean that the lecturers do the visiting, or do the students?)
Don't ask questions with two inherent issues:
"I am fully occupied and I don't feel lonely."
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to:
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer:
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary:
"Do you use any word processing packages such as ZZ?"
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable
response. Deliberate faking may occur when respondents are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware
that their responses will affect them in some way. Respondents may also lie to protect
their real feelings, to justify their behaviour, or because they do not want to admit their
ignorance.
4.2 Response styles
This is the tendency to make a particular type of response: some respondents tend to choose
extreme responses, while others repeatedly choose central responses. Design balanced
questionnaires that include negatively scored items, e.g. "People often let me down", as
well as items that are positively stated and positively scored, e.g. "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. The shortcomings of the
following questions are noted below each one.
1 What is your income?
Vague.
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
A leading question which contains two inherent issues and a double negative.
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
Vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
Two inherent issues. 'Depression' and 'often' may mean different things to different
people.
5 How often do you take drugs? (never/sometimes/often/all the time)
Imprecise and may be interpreted in various ways.
6 Abortion should not be legalised. (agree/disagree)
Too global.
7 Most men are more emotionally stable than most women are. (agree/disagree)
A leading question.
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
The item does not necessarily relate to motivation.
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire
out on. The sample should represent the population to which you hope to generalise your
findings. One needs at least one more person than there are items in one's scale. Finding a
sample is a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES.
Ethical checklist (record Yes/No and any notes for each point)
Respondents understand why they are being asked to complete the questionnaire
  It is to help you with your studies
  It is to help you improve the questionnaire
  They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
  You won't hold it against them if they decide not to do it
  You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
  Their names won't be on the questionnaire
  You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you
can make improvements. When you get the questionnaire back, quickly scan it to see that
the respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first
is to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis consists of procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right.
Item analysis involves discarding items that are too easy or too difficult. The difficulty
index for an item is usually calculated by dividing the number of people who gave a correct
response by the total number of people in the sample. The difficulty index should be
between 0.25 and 0.75, and the average difficulty should be about 0.5.
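The difficulty index described above can be sketched in a few lines. The response data below are invented for illustration, and the function names are hypothetical.

```python
def difficulty_index(responses):
    """Proportion of respondents who answered correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def is_acceptable(index, low=0.25, high=0.75):
    """Apply the 0.25 to 0.75 guideline for the difficulty index."""
    return low <= index <= high


# Invented data: one list of 0/1 responses per item, one value per respondent
items = {
    "item_a": [1, 1, 1, 1, 1, 1, 1, 0],  # almost everyone correct: too easy
    "item_b": [1, 0, 1, 0, 1, 0, 1, 0],  # half correct
    "item_c": [0, 0, 0, 0, 0, 0, 0, 1],  # almost nobody correct: too difficult
}

indices = {name: difficulty_index(r) for name, r in items.items()}
kept = [name for name, p in indices.items() if is_acceptable(p)]
print(indices)
print(kept)
```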
2 Item discrimination
This is the ability of an item to discriminate between respondents according to whatever
the measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic, otherwise the scale loses focus. The higher the correlation
coefficient between the item and the total score, the more discriminating the item. A
minimum correlation of 0.2 is generally required. Items with negative or zero correlations
are almost always excluded. A negative correlation could indicate that an item should have
been reverse scored.
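The selection rule above can be written as a simple decision function. This is a sketch only: the correlations are invented, and the wording of the decisions is an assumption about how one might record them.

```python
def review_item(item_total_r, minimum=0.2):
    """Suggest what to do with an item, given its item-total correlation."""
    if item_total_r < 0:
        # A negative correlation may mean the item should have been reverse scored
        return "check reverse scoring, otherwise exclude"
    if item_total_r < minimum:
        return "exclude"
    return "keep"


# Invented item-total correlations
correlations = {"item_1": 0.55, "item_2": 0.05, "item_3": -0.40, "item_4": 0.20}
decisions = {name: review_item(r) for name, r in correlations.items()}
print(decisions)
```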
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics helps test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement                     1       2        3           4          5
                              Never   Almost   Sometimes   Most of    Always
                                      never                the time
I like loud music                                                     √
I prefer quiet places         √
I enjoy noisy environments                                 √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored', because if somebody says she never likes quiet places she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Statement                     1       2        3           4          5        Item
                              Never   Almost   Sometimes   Most of    Always   score
                                      never                the time
I like loud music                                                     √        5
I prefer quiet places         √                                                5
(REVERSE SCORE)
I enjoy noisy environments                                 √                   4
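Reverse scoring on a 1 to 5 scale simply mirrors the response around the midpoint. The sketch below reproduces the three item scores from the example; the function name item_score is hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def item_score(response, reverse=False):
    """Convert a ticked response (1 to 5) into an item score."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError("response must be between 1 and 5")
    # Reverse scoring maps 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1
    return SCALE_MIN + SCALE_MAX - response if reverse else response


# (statement, ticked response, reverse scored?) as in the example
ticked = [
    ("I like loud music", 5, False),
    ("I prefer quiet places", 1, True),
    ("I enjoy noisy environments", 4, False),
]
scores = [item_score(response, reverse) for _statement, response, reverse in ticked]
print(scores)
```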
The data sheet is divided into rows and columns: one row for each person in your sample,
one column for each item in your rating scale, and one column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases   Items                                               Total score
        1   2   3   4   5   6   7   8   9   10  11  12
1       5   5   4
2       3   3
3       3   2
4       3   4
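Once every row of the data sheet is complete, the total score is the sum of the 12 item scores. The following sketch uses an invented row of scores; the name total_score is hypothetical.

```python
N_ITEMS = 12


def total_score(item_scores):
    """Sum the item scores for one respondent on the 12-item rating scale."""
    if len(item_scores) != N_ITEMS:
        raise ValueError("expected one score for each of the 12 items")
    if any(not 1 <= score <= 5 for score in item_scores):
        raise ValueError("item scores must be between 1 and 5")
    return sum(item_scores)


# Invented row: the first three values match case 1 in the example, the rest are made up
case_1 = [5, 5, 4, 3, 4, 5, 2, 4, 3, 5, 4, 4]
case_1_total = total_score(case_1)
print(case_1_total)
```

The lowest and highest possible totals follow directly: twelve scores of 1 give 12 and twelve scores of 5 give 60.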
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
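Scanning the columns by eye can be backed up with a quick calculation. In this sketch the data are invented and the 0.8 cut-off for "mainly one number" is an assumption, not a rule from the text.

```python
from collections import Counter
from statistics import pvariance


def most_common_share(column):
    """Share of respondents who gave the single most frequent item score."""
    _value, count = Counter(column).most_common(1)[0]
    return count / len(column)


# Invented columns from a data sheet: one list of item scores per item
columns = {
    "item_1": [3, 3, 3, 3, 3, 3, 3, 4],  # mainly one number
    "item_2": [1, 2, 3, 4, 5, 2, 4, 3],  # good spread of numbers
}

# Assumed cut-off: flag an item when 80% or more of respondents give the same score
low_variance = [name for name, col in columns.items() if most_common_share(col) >= 0.8]
variances = {name: pvariance(col) for name, col in columns.items()}
print(low_variance)
print(variances)
```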
It is not always possible to explain why most people end up answering an item in the same
way. The item may have been too extremely worded, it may be too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. A
correlation coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The
correlation coefficient ranges from -1 to +1. A value close to 0 indicates a weak
relationship, while 0 represents no correlation. The numerical size of a correlation
coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
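The notes do not give a formula, but the most commonly used correlation coefficient is Pearson's r. For n people with scores x_i on one measure and y_i on the other, and with means \bar{x} and \bar{y}, it can be written as follows (the symbols are introduced here for illustration only):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when high scores on one measure go with high scores on the other, and the denominator scales the result so that it always lies between -1 and +1.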
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a
perfect positive relation between two constructs (a correlation coefficient of +1), the
dots form a perfectly straight line with an upward slope. For a correlation coefficient
of -1 the dots form a perfectly straight line with a downward slope. No relation (a
correlation coefficient of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Items                                               Total score
        1   2   3   4   5   6   7   8   9   10  11  12
1               4                                           53
2               1                                           12
3               5                                           49
4               4                                           40
5               2                                           16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Looking at whether item scores correspond with the total score is called item-total
correlation. Professional questionnaire constructors usually calculate a correlation
coefficient (an index of how strongly two variables are related) to establish how strong
each item-total correlation is.
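For the five cases in the data sheet above, the item-total correlation for item 3 can be calculated directly. This sketch assumes Pearson's correlation coefficient, which the notes do not name explicitly; the function name pearson_r is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Scores on item 3 and total scores for cases 1 to 5, taken from the data sheet
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r_item_3 = pearson_r(item_3, totals)
print(round(r_item_3, 2))
```

The result is well above the 0.2 minimum, which agrees with the impression that item 3 discriminates between high and low scorers.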
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a
diagonal line from bottom left to top right. This means that there is a strong correlation
between the item score and the total score, so the item discriminates well. If the dots
don't seem to form a pattern at all, then there is no correlation. If the line seems to go
from top left to bottom right, then there is a negative correlation (the item does
discriminate, but the wrong way round, so it is no good). You will have to draw 12 scatter
plots (one for each of the 12 items).
Identify what appear to be the worst items in your scale in terms of failure to
discriminate. The reason why items don't discriminate is usually that they measure
something different from the other items in the scale. Sometimes the wording of the item is
the cause, but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to
measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0
and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To establish how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over
time.
A perfect correlation does not indicate that the second set of scores was identical to the
first; it indicates that each person's position relative to that of the others in the group
stays the same. The time interval should be at least several days to reduce the possibility
of effects such as familiarity with the type of items or respondents remembering their
answers.
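The point that a perfect correlation does not require identical scores can be illustrated with invented data in which everybody scores three points higher on the second occasion. Pearson's correlation is assumed here, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Invented total scores for five people on two occasions
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

test_retest_r = pearson_r(first_occasion, second_occasion)
print(round(test_retest_r, 6))
```

The scores differ on the two occasions, yet the coefficient is 1 because each person keeps the same position relative to the others.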
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability.
Both forms are administered to the same group on two consecutive occasions and the scores
are correlated. The closer the correlation coefficient (or reliability coefficient) is to 1,
the greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to the possible
memory effects experienced with test-retest reliability. A disadvantage of alternate-forms
reliability is that it is expensive and time-consuming, and that it is difficult to produce
truly parallel forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into
two parts regarded as two parallel halves. Each person has a total score on the one half
and a total score on the second half, giving two sets of scores that are then correlated.
The correlation is an estimate of the reliability of either of the two halves and is thus a
measure of equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable than a
longer one, so the correlation between the halves tends to underestimate the reliability of
the whole questionnaire. The estimated reliability of the whole questionnaire is called the
split-half reliability, and it measures the degree of equivalence between the two halves,
that is, the extent to which they measure the same attribute. It reflects the consistency
of the items and indicates their degree of relatedness. This type of reliability is
therefore also regarded as a measure of internal consistency: the closer the split-half
reliability is to 1, the higher the internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability
coefficient should be above 0.90. A reliability coefficient of 0.70 can be useful if the
results are used in combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The
purpose of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the
same thing. Obtain an estimate of the split-half reliability of this rating scale, that is,
the degree of equivalence between two halves of the rating scale. A limitation of this
method is that the reliability coefficient that one obtains depends to some extent on the
items included in each of the two halves.
Action: Look at these eight items and re-number them from 1 to 8. Then divide the rating
scale into two halves by grouping the odd items together and the even items together. For
each person calculate the total score for the odd items and for the even items.
Cases   Odd items            Total score   Even items           Total score
        1    3    5    7                   2    4    6    8
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet make a dot on the graph. Draw a straight line
resembling the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values
closer to 1 indicate a more reliable rating scale.
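The odd-even split can also be calculated rather than estimated from a graph. In the sketch below the item scores are invented, Pearson's correlation is assumed, and the step from the half-scale correlation to the whole scale uses the Spearman-Brown formula, which is a standard correction that the notes themselves do not name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Invented item scores for five people on the re-numbered 8-item scale
data_sheet = [
    [5, 4, 5, 5, 4, 4, 5, 5],
    [2, 1, 2, 2, 1, 2, 1, 1],
    [3, 3, 4, 3, 3, 3, 4, 3],
    [4, 5, 4, 4, 5, 4, 3, 4],
    [1, 2, 1, 2, 2, 1, 2, 2],
]

# Items 1, 3, 5, 7 sit at list positions 0, 2, 4, 6; items 2, 4, 6, 8 at 1, 3, 5, 7
odd_totals = [sum(row[0::2]) for row in data_sheet]
even_totals = [sum(row[1::2]) for row in data_sheet]

r_half = pearson_r(odd_totals, even_totals)
# Spearman-Brown step-up (an assumption, not taken from the notes)
r_whole = 2 * r_half / (1 + r_half)
print(odd_totals, even_totals)
print(round(r_half, 2), round(r_whole, 2))
```

A different way of splitting the items would give a somewhat different coefficient, which is the limitation of the method mentioned in the activity.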
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that
is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated
and even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity
are concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, suppose you want to classify psychiatric patients according to their
disturbances. Take a representative group of psychiatric patients and administer your
questionnaire to them. At the same time you would ask psychiatrists or clinical
psychologists to classify these patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future.
It is then determined to what extent the scores accurately predict an individual's scores
on the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them. At the end of the course you
could obtain the students' examination marks. You will then determine how effective scores
on your questionnaire are in predicting the students' examination marks.
To determine criterion-related validity, calculate the correlation between the results and
the measures on the criterion. The resulting correlation coefficient is known as the
validity coefficient.
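A validity coefficient is calculated in the same way as any other correlation. The sketch below uses invented questionnaire scores and examination marks for six students and assumes Pearson's correlation; all names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Invented data: questionnaire scores at application, exam marks at the end of the course
questionnaire_scores = [30, 42, 25, 38, 45, 33]
exam_marks = [52, 68, 47, 60, 75, 58]

validity_coefficient = pearson_r(questionnaire_scores, exam_marks)
print(round(validity_coefficient, 2))
```

For concurrent validity the same calculation applies, except that the criterion measures are collected at about the same time as the questionnaire scores.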
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms
of a single validity coefficient. You would expect groups who are supposed to differ in
terms of a construct to also obtain significantly different scores on a questionnaire
measuring this construct. Another way to determine construct validity is to look at the
correlation coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect
a high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire is how well the
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity concerns whether you can draw conclusions about a theoretical construct
that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire, for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country the questionnaire has been developed, and the age group where age
is relevant to the subject matter of the questionnaire. A brief description of the design of
the questionnaire should be provided. The type of items should also be indicated -
multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent they are representative in terms of those characteristics that define the
target population. Also mention when and under which circumstances the questionnaire
was administered to them. Describe each technique used for item analysis and indicate
which criteria were used to justify the inclusion or exclusion of items in the item selection
process. It is important for the user to know how reliable or consistent the questionnaire
is. Give a brief description of the method used to determine reliability and justify why this
method was used. The estimated reliability coefficient is then evaluated in terms of an
acceptable level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring.
A correct answer could score a 1 and an incorrect answer a 0. Rating scales - on a
five-point rating scale, 1 = do not at all agree and 5 = strong agreement with this
statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
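The scoring instructions for a rating scale, including reverse scoring, can be illustrated with a short Python sketch. The function name score_rating_scale, the item numbers and the answers are hypothetical; the sketch assumes that unfavourably worded statements are the ones that need reversing, so that a high total always points in the same direction.

```python
def score_rating_scale(responses, reverse_items=(), points=5):
    """Sum ratings into a total score, reversing the items that need it.

    responses     -- dict mapping item number to the rating given (1..points)
    reverse_items -- item numbers whose ratings are reversed before summing
    points        -- number of scale points (5 for a five-point scale)
    """
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= points:
            raise ValueError(f"item {item}: rating {rating} is outside 1..{points}")
        if item in reverse_items:
            # On a five-point scale 1 becomes 5, 2 becomes 4, and so on.
            rating = points + 1 - rating
        total += rating
    return total

# Hypothetical respondent: items 3 and 4 are unfavourably worded statements.
answers = {1: 5, 2: 4, 3: 1, 4: 2}
total = score_rating_scale(answers, reverse_items={3, 4})
print(total)  # 5 + 4 + 5 + 4 = 18
```

A manual would state explicitly which items are reverse scored, since the person scoring the questionnaire cannot be expected to infer this from the wording.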
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire and the confidentiality
of the information provided in the questionnaire, and should explain how the questionnaire
should be completed, that is, how its questions should be tackled. They are usually provided
in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding, sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing open
ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic but the topic is not covered in full
Rate 3: focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias, and
(2) force respondents to respond in an inefficient manner, and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of these problems
Rate 3: if the sequence seems to cause any one of these problems
Rate 4: if the sequence does not seem to cause any of the problems
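The item sequence rating is a simple counting rule, which the following Python sketch makes explicit. The function and parameter names are hypothetical; the judgement of whether each problem is present still has to be made by the evaluator.

```python
def rate_item_sequence(has_relevant_items, introduces_bias,
                       forces_inefficiency, causes_discomfort):
    """Return the item sequence rating (0 to 4) from the evaluator's judgements.

    has_relevant_items  -- False if there are no items or all items are irrelevant
    introduces_bias     -- sequence introduces a response style or bias
    forces_inefficiency -- sequence forces respondents to respond inefficiently
    causes_discomfort   -- sequence makes respondents emotionally uncomfortable
    """
    if not has_relevant_items:
        return 0
    problems = sum([bool(introduces_bias), bool(forces_inefficiency),
                    bool(causes_discomfort)])
    # Three problems -> 1, two -> 2, one -> 3, none -> 4.
    return 4 - problems

print(rate_item_sequence(True, True, False, False))  # one problem
```

The same pattern (start from the top rating and subtract one per problem found) also fits the functionality rating, where the three functions take the place of the three problems.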
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
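Because the logical sequence rating depends on the purpose areas rating, it can help to see the two rules side by side. The Python sketch below is only an illustration of the rating rules as listed; the function names are hypothetical.

```python
def rate_purpose_areas(nature, functionality, instructions):
    """Purpose areas rating: the number of purpose areas covered (0 to 3)."""
    return sum([bool(nature), bool(functionality), bool(instructions)])

def rate_logical_sequence(purpose_areas_rating, in_logical_sequence):
    """Logical sequence rating (0 to 4), derived from the purpose areas rating."""
    if purpose_areas_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose areas rating must be 0, 1, 2 or 3")
    if purpose_areas_rating <= 1:
        return 0  # with fewer than two areas there is no sequence to judge
    if purpose_areas_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3

# Hypothetical manual: all three areas are covered, but instructions come first.
areas = rate_purpose_areas(nature=True, functionality=True, instructions=True)
print(areas, rate_logical_sequence(areas, in_logical_sequence=False))  # 3 3
```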
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept
Rate 3: content topics of two purpose areas are kept
Rate 4: content topics of all three purpose areas are kept
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four
Rate 3: instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but it does not provide instructions
Rate 1: if both scoring and decoding are required and it provides instructions for scoring but
not for decoding
Rate 2: if both scoring and decoding are required and it provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but
that the remaining 82 items did not meet the criteria to be included. Thirteen were excluded,
thus 50 items were retained. The validity coefficient of 0.91 is high.
statements. These statements usually deal with a social or political issue. The respondent
marks the point that best reflects his or her attitude. The scores are added up to obtain a
total score (summated scale). Ensure that the scale is uni-dimensional - all the items
measure the same dimension or topic. It is important to have both favourable and
unfavourable statements so that you do not influence the respondent. Usually respondents
have the option of 5 or 7 ratings.
3.2 Semantic differential
Used particularly in the measurement of attitudes. It uses a seven-point rating scale, and the
scale points on each end are defined by opposing adjectives.
Powerful _ _ _ _ _ _ _ Powerless
The location of positive and negative poles should be random to counteract any halo effect -
the tendency for respondents not to evaluate each item individually but for their responses
to be influenced by their general feeling of like or dislike. It is important that your two
descriptors define the same construct. The semantic differential is useful when you want to
obtain an idea of people's endorsement of certain attributes.
Activity 2.1 Decide on item format and scaling method
Action: Identify different types of items and scaling methods. It is important to have a
balance of different types of questions in order to maintain the respondents' interest as well
as to collect all the relevant information.
Item and type
1 Item: Do you have a valid driver's licence? Yes / No
  Type: Closed question - limited choice of answers
2 Item: Why do people need to have a valid driver's licence?
  Type: Open question - respondents state their own opinions and any kind of answer is allowed
3 Item: People should have a driver's licence (choose one answer)
  [ ] for identification purposes
  [ ] to prove that they can drive
  [ ] in case they have an accident
  Type: Closed question because there is a limited choice of answers (multiple choice type)
4 Item: Young people are good drivers. True / False
  Type: Closed question - limited choice of answers
5 Item: Good drivers are
  alert - - - - - relaxed
  cautious - - - - - fast reactors
  older - - - - - younger
  Type: Rating scale, semantic differential type (extreme scale points are opposing adjectives)
6 Item: Mark the characteristics of good drivers from the list below
  [ ] male [ ] female [ ] even tempered [ ] fast reactions [ ] slow and steady
  Type: Closed question because there is a limited choice of answers (checklist/inventory)
7 Item: Are you a good driver? Rate your abilities as follows (a great deal - very little: 1 2 3 4 5)
  self confidence
  experience
  knowledge of road rules
  Type: Rating scale, Likert type
8 Item: Good drivers are ...
  Type: Open question because it allows respondents to give any kind of answer
9 Item: Should the age for driver's licences be increased to 21?
  Type: This looks like an open question but is a closed question - a yes or no answer. It would
  be an open question if you asked "What is your opinion about increasing the driving age to 21
  years?"
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want.
Information required
- age - under 18 years / 18 - 22 years / 23 - 35 years / 36 - 50 years
- gender - closed (check male or female)
- socio-economic status
- personal experience of crime - closed question with a yes/no answer; how much or how
  often they personally experienced crime - use a multiple choice item or a rating
  scale; general description - use an open ended question
- levels of stress associated with different crimes - rating scale
- personal reactions to different crimes - simple open ended question, or you might try
  a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain. A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you use, how complex the
questions are and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open ended question that deals with the same content area.
Action: You should evaluate the impact of characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed yes/no type)
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime, and lastly an open question for any
other comments the respondent may wish to add. Therefore have five closed items, four
open items and a twelve item rating scale (total of 21 items). The questionnaire should not
be too long or complicated.
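The breakdown above can be checked by tallying the planned items per type. The short Python sketch below does this; the content area labels are paraphrased from the breakdown and the variable names are illustrative only.

```python
# Each entry: (content area, item type, number of items), following the breakdown above.
plan = [
    ("biographical information", "closed", 3),
    ("filter question (yes/no)", "closed", 1),
    ("personal experience of crime", "open", 1),
    ("stress associated with different crimes", "rating", 12),
    ("personal reactions to crimes (multiple choice)", "closed", 1),
    ("control question on personal reactions", "open", 1),
    ("perceptions of the effect of crime", "open", 1),
    ("any other comments", "open", 1),
]

counts = {}
for _area, item_type, number in plan:
    counts[item_type] = counts.get(item_type, 0) + number

total_items = sum(counts.values())
print(counts, total_items)  # five closed, four open, twelve rating scale items; 21 in total
```

Keeping such a tally in the specification document makes it easy to see the effect on total length when a content area gains or loses items.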
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire
1 the name of the person or organisation conducting the study, to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information: respondents have greater freedom to express
themselves without fear that their responses would be used in a way that is not in their
interests. This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
Depends on the topic and the degree of interest it holds for the respondent. Ideally it takes
30 minutes to complete. It also depends on the characteristics of the respondents. Specialists
are more willing to complete a longer questionnaire. For people with low levels of literacy or
education it is better to keep questionnaires short. Make sure that each question is directly
relevant, but you need to have thorough coverage of your topic to ensure reliability and
validity. The aim is to strike a balance between a concise questionnaire and one that is
inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers ...". Items on biographical information - only
a few items can go at the beginning, but if there are a lot of items they are better at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open ended questions at the end to allow the respondents to express
opinions or feelings that are related but have not been covered by the questions. Respondents
are more likely to feel satisfied that answering the questions was worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses).
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions. If the answer is no, skip the next few questions.
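For a questionnaire administered electronically, the skip rule behind a filter question can be sketched as follows. The question texts, the identifiers Q4 to Q7 and the only_if field are hypothetical; on a paper questionnaire the same rule is simply printed as an instruction next to the filter question.

```python
# Hypothetical questions; "only_if" names the filter question and the answer
# a respondent must have given for the follow-up question to apply.
QUESTIONS = [
    {"id": "Q4", "text": "Have you personally experienced crime?"},  # filter question
    {"id": "Q5", "text": "Describe what happened.", "only_if": ("Q4", "yes")},
    {"id": "Q6", "text": "How often has this happened?", "only_if": ("Q4", "yes")},
    {"id": "Q7", "text": "How stressful do you find the following crimes?"},
]

def applicable_questions(questions, answers):
    """Return the ids of the questions this respondent should be asked."""
    shown = []
    for question in questions:
        condition = question.get("only_if")
        if condition is not None:
            filter_id, required_answer = condition
            if answers.get(filter_id) != required_answer:
                continue  # respondent skips this question
        shown.append(question["id"])
    return shown

print(applicable_questions(QUESTIONS, {"Q4": "no"}))   # ['Q4', 'Q7']
print(applicable_questions(QUESTIONS, {"Q4": "yes"}))  # ['Q4', 'Q5', 'Q6', 'Q7']
```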
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is a science - it requires an in-depth knowledge of one's topic and
familiarity with the principles governing good item design. It is also an art - it requires
creativity in selecting or constructing items appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability
and validity have already been established. You must still critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or misinterpreted, your
results will be useless.
Do avoid ambiguity - items that can be interpreted in a number of ways, for example:
Visiting lecturers can help one feel less isolated.
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues, for example:
I am fully occupied and I don't feel lonely.
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to, e.g.
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer, e.g.
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary, e.g.
"Do you use any word processing packages, such as ZZ?"
4 Problems relating to response bias or response style
Response bias and response style are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake their answers when they are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware that
their responses will affect them in some way. Respondents may also lie to protect their real
feelings, to justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
A response style is the tendency to make a particular type of response: some respondents tend
to choose extreme responses, while others repeatedly choose central responses. Design balanced
questionnaires that contain both of the following kinds of item:
Positively stated and negatively scored, e.g. "People often let me down."
Positively stated and positively scored, e.g. "I trust people."
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. The shortcomings of each of the
following questions are listed directly below the question.
1 What is your income?
vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
two inherent issues; 'depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
imprecise and may be interpreted in various ways
6 Abortion should not be legalised. (agree/disagree)
too global
7 Most men are more emotionally stable than most women are. (agree/disagree)
is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, i.e. get hold of people to try the questionnaire out on.
The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the ethical checklist should be YES.
Ethical checklist (record Yes/No and any notes for each point)
Respondents understand why they are being asked to complete the questionnaire
   It is to help you with your studies
   It is to help you improve the questionnaire
   They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
   You won't hold it against them if they decide not to do it
   You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
   Their names won't be on the questionnaire
   You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes you need to make.
Item analysis
Item analysis refers to procedures for selecting the best items for inclusion. Commonly used
criteria are item difficulty (also called item facility or item variance) and item
discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
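The difficulty index described above can be sketched in a few lines of Python. This is only an illustration: the item names and the response data are made up, and the function names are hypothetical.

```python
def difficulty_index(correct_flags):
    """Number of correct responses divided by the number of people in the sample."""
    return sum(correct_flags) / len(correct_flags)


def is_acceptable(index, low=0.25, high=0.75):
    """True if the difficulty index falls inside the recommended 0.25 to 0.75 band."""
    return low <= index <= high


# Hypothetical pilot data: 1 = correct, 0 = incorrect, one list per item.
responses = {
    "item_a": [1, 1, 0, 1, 0, 1, 0, 0],  # 4 of 8 people correct
    "item_b": [1, 1, 1, 1, 1, 1, 1, 0],  # 7 of 8 correct: too easy
    "item_c": [0, 0, 0, 1, 0, 0, 0, 0],  # 1 of 8 correct: too difficult
}

indices = {name: difficulty_index(flags) for name, flags in responses.items()}
kept = [name for name, p in indices.items() if is_acceptable(p)]

print(indices)
print(kept)
```

In this made-up sample only item_a would be kept, because the other two items fall outside the 0.25 to 0.75 band.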
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected if
they measure the same characteristic, otherwise the instrument loses focus. Discrimination is
assessed by correlating the item score with the total score: the higher the correlation
coefficient, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored.
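The selection rule above can be written as a small decision function. This is a sketch only: the correlations in the example are invented, the function name is hypothetical, and treating a correlation of exactly 0.2 as acceptable is an assumption.

```python
def classify_item(item_total_r, minimum=0.2):
    """Apply the selection rule to one item-total correlation."""
    if item_total_r < 0:
        # A negative correlation may mean the item should have been reverse scored.
        return "exclude (check reverse scoring)"
    if item_total_r < minimum:
        return "exclude"
    return "keep"


# Hypothetical item-total correlations from a pilot study.
correlations = {"item_1": 0.55, "item_2": 0.05, "item_3": -0.40, "item_4": 0.20}
decisions = {name: classify_item(r) for name, r in correlations.items()}

for name, decision in decisions.items():
    print(name, decision)
```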
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics helps test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement                    1 Never  2 Almost never  3 Sometimes  4 Most of the time  5 Always
I like loud music                                                                         √
I prefer quiet places           √
I enjoy noisy environments                                                 √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored', because if somebody says she never likes quiet places she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Statement                              1 Never  2 Almost never  3 Sometimes  4 Most of the time  5 Always  Item score
I like loud music                                                                                   √          5
I prefer quiet places (REVERSE SCORE)     √                                                                    5
I enjoy noisy environments                                                           √                         4
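Reverse scoring on a 1 to 5 scale simply mirrors the response around the midpoint (1 becomes 5, 2 becomes 4, and so on). A minimal Python sketch, using the three ticked responses from the example above; the function and variable names are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def score_item(response, reverse=False):
    """Return the item score, mirroring the response if the item is reverse scored."""
    if reverse:
        return SCALE_MIN + SCALE_MAX - response  # 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1
    return response


# Ticked responses for the three example statements; only item 2 is reverse scored.
ticked = [5, 1, 4]
reverse_flags = [False, True, False]

item_scores = [score_item(r, rev) for r, rev in zip(ticked, reverse_flags)]
print(item_scores)  # [5, 5, 4]
```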
The data sheet is divided into rows and columns: one row for each person in your sample,
one column for each item in your rating scale, and a final column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 (12 x 1) and the highest is 60 (12 x 5).
Case  Items                                            Total score
      1   2   3   4   5   6   7   8   9   10  11  12
1     5   5   4
2     3   3
3     3   2
4     3   4
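A data sheet like the one above can be represented as a list of rows, one per respondent. The sketch below uses made-up, fully completed rows (the example sheet above is only partly filled in) and a hypothetical helper that checks each row before adding it up.

```python
N_ITEMS = 12


def total_score(item_scores):
    """Sum the 12 item scores after checking that the row is complete and in range."""
    if len(item_scores) != N_ITEMS:
        raise ValueError("each row needs one score per item")
    if any(not 1 <= score <= 5 for score in item_scores):
        raise ValueError("item scores must lie between 1 and 5")
    return sum(item_scores)


# Hypothetical data sheet: one row per respondent, already reverse scored where needed.
data_sheet = [
    [5, 5, 4, 3, 4, 5, 4, 4, 5, 3, 4, 5],
    [3, 3, 2, 3, 2, 3, 3, 2, 3, 3, 2, 3],
]

totals = [total_score(row) for row in data_sheet]
print(totals)  # [51, 32]
```

Any valid total must lie between 12 and 60, which is a quick check for transcription errors.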
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, i.e. items where almost everybody in the sample gets the same item score. You want
your scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
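Instead of judging the columns by eye, the variance of each column can also be calculated. The sketch below uses a made-up data sheet with three items, and the cut-off of 0.25 is an arbitrary assumption for illustration, not a value from the study material.

```python
from statistics import pvariance

# Hypothetical data sheet: rows are respondents, columns are items 1 to 3.
data_sheet = [
    [5, 3, 4],
    [1, 3, 4],
    [4, 3, 4],
    [2, 3, 5],
    [3, 3, 4],
]

# Turn the rows into columns so that each tuple holds all scores for one item.
columns = list(zip(*data_sheet))
variances = [pvariance(column) for column in columns]

# Assumed cut-off for "too little variance"; choose your own after inspecting the data.
CUT_OFF = 0.25
low_variance_items = [
    number for number, variance in enumerate(variances, start=1) if variance < CUT_OFF
]

print(variances)
print(low_variance_items)
```

Here item 1 has a good spread of scores, item 2 has no variance at all (everybody scored 3) and item 3 has very little.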
It is not always possible to explain why most people end up answering an item in the same
way. The item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. The
correlation coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction
of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers,
because it shows that the item measures more or less the same thing as the other items in
the scale.
Case  Items                                            Total score
      1   2   3   4   5   6   7   8   9   10  11  12
1             4                                        53
2             1                                        12
3             5                                        49
4             4                                        40
5             2                                        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
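As an illustration, the item-total correlation for item 3 can be calculated from the five cases shown in the data sheet above. The sketch assumes the Pearson product-moment correlation is the coefficient meant here (the notes do not name a specific formula), and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


# Scores on item 3 and total scores for cases 1 to 5 from the data sheet.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson_r(item_3, totals)
print(round(r, 2))  # 0.94
```

A value this close to +1 agrees with the impression that item 3 discriminates well between high and low scorers.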
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is the cause,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the one list for items that don't show much variance and
the other list for items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire: reliability coefficient close to 0
Reliable questionnaire: reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
Test-retest reliability indicates how consistent the results of a questionnaire are over
different occasions. Administer the same questionnaire to the same group on two consecutive
occasions. The two sets of scores are correlated, and the correlation coefficient represents
the degree of test-retest reliability. The closer the correlation coefficient is to 1, the
more consistent the results. Test-retest reliability thus indicates stability or consistency
of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first,
only that each person's position relative to that of the others in the group stays the same.
The time interval should be at least several days, to reduce the possibility of effects such
as familiarity with the type of items or respondents remembering their answers.
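The point that a perfect correlation does not require identical scores can be shown with a small made-up example: if everybody scores exactly three points higher on the second occasion, nobody's relative position changes and the correlation is still +1. The scores are invented, and the Pearson coefficient is assumed as the correlation measure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


# Hypothetical total scores of five people on the first and second occasion.
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

test_retest_r = pearson_r(first_occasion, second_occasion)
print(first_occasion != second_occasion)  # the scores themselves differ
print(round(test_retest_r, 6))            # yet the correlation is perfect
```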
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half, giving two sets of scores that are then correlated. The
correlation is an estimate of the reliability of either of the two halves and is thus a
measure of equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
reliability of a half is lower than that of the whole questionnaire. The reliability of the
whole questionnaire is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
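The odd-even procedure can be sketched as follows for an 8-item rating scale. The data are invented. The step from the half-test correlation to the reliability of the whole questionnaire uses the Spearman-Brown formula, r_full = 2r / (1 + r); that formula is a standard psychometric correction but is not given in these notes, so treat it as an assumption here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(ss_x * ss_y)


# Hypothetical data sheet: one row per person, items 1 to 8 in order.
data_sheet = [
    [5, 4, 5, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [4, 4, 3, 4, 4, 3, 4, 4],
    [3, 2, 3, 3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1, 1, 2],
]

odd_totals = [sum(row[0::2]) for row in data_sheet]   # items 1, 3, 5, 7
even_totals = [sum(row[1::2]) for row in data_sheet]  # items 2, 4, 6, 8

r_half = pearson_r(odd_totals, even_totals)  # reliability of either half
r_full = 2 * r_half / (1 + r_half)           # Spearman-Brown estimate (assumption)

print(odd_totals, even_totals)
print(round(r_half, 2), round(r_full, 2))
```

For a positive half-test correlation the corrected value is always higher than the correlation between the halves, which reflects the point that the longer questionnaire is more reliable than either half.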
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, i.e. the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at your eight items and re-number them from 1 to 8. Divide the rating scale into
two halves by grouping the odd items and the even items together. For each
person, calculate the total score for the odd items and for the even items.
Case  Odd items                      Even items
      1   3   5   7   Total score    2   4   6   8   Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
that follows the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer
to 1 indicate a more reliable rating scale.
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure,
i.e. the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
Content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, suppose you want to classify psychiatric patients according to their disturbances.
Take a representative group of psychiatric patients and administer your questionnaire to them.
At the same time, you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of a
single validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity: if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity: if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire refers to how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity makes it possible to draw conclusions about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
   Aim
   Target population
   Design of the questionnaire
Properties of the questionnaire
   Item analysis and item selection
   Reliability
   Validity
Procedures for administration, scoring and interpretation
   Instructions for administration
   Instructions for scoring
   Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country for which this questionnaire has been developed, as well as the age of the
target population, where these are relevant to the subject matter of the questionnaire. A
brief description of the design of the questionnaire should be provided. The type of items
should also be indicated, e.g. multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
you should indicate to what extent they are representative in terms of those characteristics
that define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis, and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring. For
items with a correct answer, a response could score a 1 (correct) or a 0 (incorrect). For
rating scales, e.g. a five-point rating scale, 1 = do not agree at all and 5 = strong agreement
with the statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire (i.e.
explain how the questionnaire questions should be tackled). They are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but respondents are
required to provide their name
Rate 3: absolutely confidential and identity will not be disclosed, and respondents are not
required to provide their name
1.3 Instructions for how to handle questions
Rate 0: it is not explained how the questionnaire should be completed
Rate 1: it is explained, but the explanation does not hold for all items
Rate 2: it is explained and the explanation holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
respondents from different cultural backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
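The item sequence rating amounts to counting how many of the three problems the sequence seems to cause. A minimal Python sketch of that rule, with hypothetical function and parameter names:

```python
def sequence_rating(has_items, all_irrelevant,
                    introduces_bias, forces_inefficiency, causes_discomfort):
    """Rate the item sequence of a questionnaire on the 0 to 4 scale."""
    if not has_items or all_irrelevant:
        return 0
    # Booleans count as 0 or 1, so the sum is the number of problems observed.
    problems = sum([introduces_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems  # 3 problems -> 1, 2 -> 2, 1 -> 3, 0 -> 4


# Example: the sequence only seems to make respondents feel uncomfortable.
rating = sequence_rating(
    has_items=True,
    all_irrelevant=False,
    introduces_bias=False,
    forces_inefficiency=False,
    causes_discomfort=True,
)
print(rating)  # 3
```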
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
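The logical-sequence rating depends on two inputs: the purpose-area rating already assigned, and whether the areas that are present follow the order (a), (b), (c). As a rough sketch of that dependency (the function and argument names are hypothetical):

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Return the 0-4 logical-sequence rating.

    purpose_area_rating: the 0-3 rating given for the purpose areas covered.
    in_logical_sequence: True if the covered areas appear in the logical order.
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_area_rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


# Example: all three purpose areas are present but in the wrong order.
print(rate_logical_sequence(3, in_logical_sequence=False))
```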
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 content topics of one purpose area are kept in their logical group
Rate 3 content topics of two purpose areas are kept in their logical groups
Rate 4 content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a), (b) and (c)
Rate 3 any one of (a), (b) and (c)
Rate 4 none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but the characteristics of the sample group differ from
those of the target population
Rate 2 describes the sample used and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1), (2) and (3)
Rate 3 all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1), (2) and (3)
Rate 3 all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
self confidence, experience, knowledge of road rules
8 Good drivers are
8 Open question, because it allows respondents to give any kind of answer
9 Should the age for driver's licences be increased to 21?
9 This looks like an open question but is a closed question - it calls for a yes or no answer. It would be an open question if you asked "What is your opinion about increasing the driving age to 21
years?"
Action: Link item format and scaling method to the purpose and content of your
questionnaire - decide what kind of items to use in order to get the information you want
Information required
- age - under 18 years, 18 - 22 years, 23 - 35 years, 36 - 50 years
- gender - closed (check male or female)
- socio-economic status
- personal experience of crime - closed question with a yes/no answer
- how much or how often they personally experienced crime - use a multiple choice item or a
rating scale
- general description - use an open-ended question
- levels of stress associated with different crimes - rating scale
- personal reactions to different crimes - simple open-ended question, or you might try
a rating scale like a semantic differential
Specification document for a questionnaire
What a questionnaire should contain. A specification document is really just a list of the
required characteristics for your questionnaire in terms of type of items, number of items,
layout and so on, in order for the questionnaire to do what it is supposed to do.
Before compiling a questionnaire, have a rough idea of the line of enquiry you wish to
follow, the kind of questions you will ask, the level of language you use, how complex the
questions are and so on. In this way the purpose of the investigation, the kind of
information you want and the characteristics of the respondents influence the questionnaire
specifications. The detailed specification of measurement aims should be clearly related to
the purpose of the research.
Activity 2.2 Decide on the total number of items
Ensure that you get the information you want, but do not lose respondents because the
questionnaire is too long or boring. Identify the extent to which each content area (the
information you need) needs to be covered, then consider the characteristics of your
respondents and the time available for testing.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least twelve items.
It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open-ended question that deals with the same content area.
Action: You should evaluate the impact of characteristics of respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed yes/no type)
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime, and lastly an open question for any
other comments the respondent may wish to add. Therefore you would have five closed items, four
open items and a twelve item rating scale (total of 21 items). The questionnaire should not
be too long or complicated.
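The breakdown above can be checked with a small tally. The sketch below lists the planned blocks of items in order and counts them by type. The short labels are paraphrases of the content areas and the variable names are made up for the illustration.

```python
from collections import Counter

# (content area, item type, number of items), in questionnaire order.
item_plan = [
    ("biographical information", "closed", 3),
    ("filter question", "closed", 1),
    ("personal experience of crime", "open", 1),
    ("stress associated with different crimes", "rating scale", 12),
    ("personal reactions to crimes", "closed", 1),
    ("control question", "open", 1),
    ("perceptions of the effect of crime", "open", 1),
    ("any other comments", "open", 1),
]

counts_by_type = Counter()
for _area, item_type, number_of_items in item_plan:
    counts_by_type[item_type] += number_of_items

total_items = sum(counts_by_type.values())
print(dict(counts_by_type), total_items)
```

Keeping the plan in one list like this makes it easy to see the effect on the total when a content area needs more coverage.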
Layout of the questionnaire
1 Introduction and covering letter
A well designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire
1 the name of the person or organisation conducting the study to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information because respondents have greater freedom to
express themselves without fear that their responses would be used in a way that is not in
their interests. This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent. Ideally
it should take 30 minutes to complete. It also depends on the characteristics of the
respondents. Specialists are more willing to complete a longer questionnaire. For people with
low levels of literacy or education it is better to keep questionnaires short. Make sure that
each question is directly relevant, but you also need to have thorough coverage of your topic
to ensure reliability and validity. The aim is to strike a balance between a concise
questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers ...". Items on biographical information - only
a few items can go at the beginning, but if there are a lot of items it is better to put them
at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open-ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the
questions. Respondents are more likely to feel satisfied that answering the questions was
worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses)
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions, e.g. "If the answer is no, skip the next few questions".
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is both a science and an art. It is a science because it requires an
in-depth knowledge of one's topic and familiarity with the principles governing good item
design. It is an art because it requires creativity in selecting or constructing items
appropriate to the particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. You must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity - items that can be interpreted in a number of ways, e.g.
Visiting lecturers can help one feel less isolated
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues, e.g.
I am fully occupied and I don't feel lonely
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to, e.g.
It is believed by students that they will be given extension by lecturers
The following is simpler
Students believe lecturers will give them extension
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer, e.g.
Discrimination in South Africa is horrific, isn't it?
Don't give examples unless it is really necessary, e.g.
Do you use any word processing packages such as ZZ?
4 Problems relating to response bias or response style
Tendencies to choose a particular type of answer.
4.1 Social desirability response bias
The tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake responses when they are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware that
their responses will affect them in some way. Respondents may also lie to protect their real
feelings or justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
The tendency to make a particular type of response: some respondents tend to choose extreme
responses or repeatedly choose central responses. Design balanced questionnaires that mix
items such as the following. Positively stated and negatively scored, e.g. "People often let
me down". Positively stated and positively scored, e.g. "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions
1 What is your income?
vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad (never/sometimes/often/all the time)
two inherent issues; 'Depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
too global
7 Most men are more emotionally stable than most women are (agree/disagree)
is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample - get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES.
Ethical checklist (for each point, record Yes/No and any notes)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get the questionnaire back, quickly scan it to see that they have
completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Procedures to select the best items for inclusion. Commonly used criteria are item
difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
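As a rough illustration of the difficulty index, the sketch below divides the number of correct responses by the sample size and keeps only the items that fall inside the 0.25 to 0.75 band. The pilot answers are invented for the example, and the function names are hypothetical.

```python
def difficulty_index(correct_flags):
    """Proportion of respondents who answered the item correctly."""
    if not correct_flags:
        raise ValueError("at least one response is needed")
    return sum(correct_flags) / len(correct_flags)


def has_acceptable_difficulty(index, lower=0.25, upper=0.75):
    """True if the index lies in the recommended band."""
    return lower <= index <= upper


# Invented pilot data: True means the respondent got the item right.
pilot_answers = {
    "item_1": [True] * 9 + [False],      # almost everybody correct: too easy
    "item_2": [True] * 5 + [False] * 5,  # half correct: ideal
    "item_3": [True] + [False] * 9,      # almost nobody correct: too difficult
}

indices = {name: difficulty_index(flags) for name, flags in pilot_answers.items()}
kept_items = [name for name, p in indices.items() if has_acceptable_difficulty(p)]
print(indices, kept_items)
```

This form of the index only applies to items that have a correct answer. For rating scale items the corresponding check is the amount of variance an item shows.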
2 Item discrimination
The ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristics - else they lose focus. The higher the correlation
coefficient between the item score and the total score, the more discriminating the item. A
minimum correlation of 0.2 is generally required. Items with negative or zero correlations are
almost always excluded. A negative correlation could indicate that an item should have been
reverse scored.
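The selection rule can be sketched in code. The example below assumes that the correlation meant is Pearson's product-moment coefficient between each item and the scale total (the notes do not name a specific coefficient), and it uses invented scores for six respondents on three items. All names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cross / math.sqrt(ss_x * ss_y)


def classify_item(r, minimum=0.2):
    """Apply the 0.2 rule of thumb to an item-total correlation."""
    if r < 0:
        return "negative: check reverse scoring"
    if r < minimum:
        return "discard"
    return "keep"


# Invented item scores, one list per item, one position per respondent.
item_scores = {
    "item_a": [5, 4, 4, 2, 1, 1],
    "item_b": [4, 5, 3, 2, 2, 1],
    "item_c": [1, 2, 1, 3, 4, 5],
}
totals = [sum(scores) for scores in zip(*item_scores.values())]
decisions = {name: classify_item(pearson_r(scores, totals))
             for name, scores in item_scores.items()}
print(decisions)
```

Here the total includes the item being tested, which is the simple form of item-total correlation. Leaving the item out of the total is a stricter variant.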
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response scale: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                      Response ticked (√)
I like loud music              5 Always
I prefer quiet places          1 Never
I enjoy noisy environments     4 Most of the time
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect saying
that she always likes noisy places and she should therefore get a high score.
Response scale: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                                Response ticked (√)    Item score
I like loud music                        5 Always               5
I prefer quiet places (REVERSE SCORE)    1 Never                5
I enjoy noisy environments               4 Most of the time     4
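Reverse scoring on a 1 to 5 scale simply mirrors the response around the midpoint, so 1 becomes 5, 2 becomes 4 and so on. A minimal sketch using the three example statements (the function and variable names are hypothetical):

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Mirror a response on the scale: 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return scale_max + scale_min - response


# Ticked responses for one respondent, as in the example above.
responses = {
    "I like loud music": 5,
    "I prefer quiet places": 1,
    "I enjoy noisy environments": 4,
}
reverse_scored_items = {"I prefer quiet places"}

item_scores = {
    statement: reverse_score(value) if statement in reverse_scored_items else value
    for statement, value in responses.items()
}
print(item_scores)
```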
The data sheet is divided into rows and columns - one row for each person in your sample,
and one column for each item in your rating scale and one for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest score anybody can have on the rating
scale is 12 and the highest is 60.
Cases   Items                                             Total score
        1   2   3   4   5   6   7   8   9   10  11  12
1       5   5   4
2       3   3
3       3   2
4       3   4
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
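The same visual check can be approximated with two simple statistics per column: the share of respondents who gave the most common score, and the variance of the scores. The sketch below uses invented data, and the 0.8 cut-off is an assumption made for illustration only, since the notes give no numerical threshold.

```python
from collections import Counter
from statistics import pvariance


def modal_share(scores):
    """Fraction of respondents who gave the most common score."""
    most_common_count = Counter(scores).most_common(1)[0][1]
    return most_common_count / len(scores)


# Invented data sheet columns: one list of item scores per item.
data_sheet = {
    "item_1": [3, 3, 3, 3, 4, 3],  # mainly one number
    "item_2": [1, 2, 3, 4, 5, 3],  # good spread of numbers
}

MAX_MODAL_SHARE = 0.8  # assumed cut-off, not taken from the notes

variances = {name: pvariance(scores) for name, scores in data_sheet.items()}
low_variance_items = [name for name, scores in data_sheet.items()
                      if modal_share(scores) >= MAX_MODAL_SHARE]
print(variances, low_variance_items)
```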
It is not always possible to explain why most people end up answering an item in the same
way - the item may be too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. A correlation
coefficient describes the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. A correlation
coefficient ranges from -1 to +1. A value close to +1 or -1 indicates a strong relationship, a
value close to 0 indicates a weak relationship, while 0 represents no correlation. The numerical
size of a correlation coefficient indicates the strength of the relationship, while the sign
(positive or negative) indicates the direction of the relationship.
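The notes do not say which coefficient is meant. A common choice, assumed here, is Pearson's product-moment coefficient, where x_i and y_i are the scores of person i on the two measures, \bar{x} and \bar{y} are the two means, and n is the number of people:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when high scores on one measure tend to go with high scores on the other, and negative when they go with low scores. The denominator rescales the result so that r always lies between -1 and +1.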
2 The scatter plot
The graphic display of the correlation coefficient. If there is a perfect positive relation
between two constructs (a correlation coefficient of +1) the dots form a perfectly straight
line with an upward slope. For a correlation coefficient of -1 the dots form a perfectly
straight line with a downward slope. No relation (a correlation coefficient of 0) between two
constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Data sheet (items 1 to 12; only the item 3 column is shown)
Case   Item 3   Total score
1      4        53
2      1        12
3      5        49
4      4        40
5      2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
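The item-total correlation for the five cases on the data sheet can be worked out as follows. This is a sketch only: it assumes that the single item column shown on the data sheet belongs to item 3 and that Pearson's r is the coefficient meant. The function name pearson_r is introduced here for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Scores of cases 1 to 5 as printed on the data sheet
# (assumed to be the item 3 column and the total score column).
item_3 = [4, 1, 5, 4, 2]
total_score = [53, 12, 49, 40, 16]

item_total_r = pearson_r(item_3, total_score)
print(round(item_total_r, 2))  # roughly 0.94
```

A value this close to +1 agrees with the visual impression that people with high total scores also tend to score high on the item.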
Figure: Relation between the scores on item 3 and the scale total (scatter plot)
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale should be more coherent and have a greater degree of reliability.
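The selection step can be sketched in code. The study material asks you to judge the two lists by eye and gives no formula for combining them, so the rule below (discard the items with the lowest item-total correlation, breaking ties by lowest variance, and treating an item with no variance as having a correlation of 0) is an assumption made for illustration, as are the function names and the small invented data sheet.

```python
from math import sqrt
from statistics import pvariance

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        return 0.0
    return cross / sqrt(ss_x * ss_y)

def weakest_items(rows, discard):
    """Return the 1-based numbers of the `discard` weakest items."""
    totals = [sum(row) for row in rows]
    scored = []
    for number, column in enumerate(zip(*rows), start=1):
        scored.append((pearson_r(column, totals), pvariance(column), number))
    scored.sort()  # lowest correlation first, then lowest variance
    return sorted(number for _, _, number in scored[:discard])

# Invented sheet: 5 respondents, 4 items. Item 3 never varies and
# item 4 runs against the total score.
rows = [
    [5, 4, 3, 1],
    [4, 5, 3, 2],
    [2, 1, 3, 5],
    [1, 2, 3, 4],
    [3, 3, 3, 3],
]
to_discard = weakest_items(rows, discard=2)
print(to_discard)
```

For a 12-item scale you would call weakest_items(rows, discard=4) and still check the flagged items by eye before dropping them.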
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions, administer the
same questionnaire to the same group on two consecutive occasions. The scores are correlated
and the correlation coefficient represents the degree of test-retest reliability. The closer the
correlation coefficient is to 1, the more consistent the results. Test-retest reliability thus indicates
stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first, only that each person's
relative position to that of the others in the group stays the same. The time interval should
be at least several days to reduce the possibility of effects such as familiarity with the type
of items or respondents remembering their answers.
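The point that a perfect correlation does not require identical scores can be checked with a small example. This is a sketch with invented scores, assuming Pearson's r as the correlation coefficient; in the second administration every respondent scores three points higher, so nobody's relative position changes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Invented total scores of the same five people on two occasions.
first_occasion = [10, 14, 18, 22, 26]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3

test_retest_r = pearson_r(first_occasion, second_occasion)
print(first_occasion == second_occasion)  # the scores are not identical
print(test_retest_r)                      # yet the correlation is perfect
```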
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable than a longer
one, so the correlation between the two halves tends to underestimate the reliability of the
whole questionnaire. The reliability of the whole questionnaire is called the split-half reliability and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of the internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items, re-number them from 1 to 8, and divide the rating scale into
two halves by grouping the odd items and the even items together. For each
person calculate the total score for the odd items and for the even items.
Person   Odd items (1 3 5 7) total score   Even items (2 4 6 8) total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person take the total score on the odd items and the total score on the even
items, and where the two meet make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
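The odd-even procedure above can be sketched numerically instead of graphically. The ratings below are invented, and the names split_half and pearson_r are introduced here for illustration. The final step, the Spearman-Brown formula that estimates the reliability of the full-length scale from the correlation between its halves, is a standard correction that the study material does not name, so treat it as an assumption added to this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def split_half(rows):
    """Odd/even totals per person, their correlation, and the stepped-up value."""
    odd_totals = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, 7
    even_totals = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, 8
    r_half = pearson_r(odd_totals, even_totals)
    r_full = 2 * r_half / (1 + r_half)              # Spearman-Brown (assumed step)
    return odd_totals, even_totals, r_half, r_full

# Invented ratings: five people, eight items re-numbered 1 to 8.
ratings = [
    [5, 4, 5, 5, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 1, 2, 2],
    [3, 3, 4, 3, 3, 2, 3, 3],
    [4, 4, 3, 5, 4, 4, 5, 3],
    [2, 3, 2, 2, 1, 2, 2, 3],
]
odd_totals, even_totals, r_half, r_full = split_half(ratings)
print(odd_totals, even_totals)
print(round(r_half, 2), round(r_full, 2))
```

Grouping the items differently (for example first four against last four) would give a somewhat different coefficient, which is the limitation of the method noted above.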
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure,
that is, the extent to which the scores
can be used for the intended purpose. There are three categories of validity
evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some current behaviour or
status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At the same time
you would ask psychiatrists or clinical psychologists to classify these patients according to
type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire to predict some
future performance of individuals. For example, to select candidates for entrance into this
course, take a representative group of students applying for the course and administer your
questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the theoretical construct
it aims to measure. Construct validity cannot be expressed in terms of a single validity
coefficient. You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - the extent to which one can draw conclusions about a theoretical construct that underlies
the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration and the scoring of the
questionnaire and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country, and for which age group, this questionnaire has been developed, as
these bear on the subject
matter of this questionnaire. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated - multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and
indicate to what extent they are representative in terms of those characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed and how to deal
with a person asking for an explanation. Provide instructions for scoring. A correct answer
could score a 1 and an incorrect answer a 0. For rating scales, e.g. on a five-point rating scale, 1 = do not at all agree
and 5 = strong agreement with this statement. The total score indicates the person's
attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
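The scoring instructions for a rating scale could be written out along the following lines. This is a minimal sketch, assuming a five-point scale and assuming that a reverse-scored item is converted by subtracting the rating from 6 (so 1 becomes 5 and 5 becomes 1); the function name, parameters and responses are hypothetical.

```python
def score_rating_scale(responses, reversed_items=(), points=5):
    """Total score for one respondent on a rating scale.

    responses      -- ratings in item order (item 1 first)
    reversed_items -- 1-based numbers of items that are reverse scored
    points         -- number of points on the scale (assumed to be 5)
    """
    total = 0
    for number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= points:
            raise ValueError(f"item {number}: rating {rating} is out of range")
        if number in reversed_items:
            rating = points + 1 - rating  # 1 <-> 5, 2 <-> 4, 3 stays 3
        total += rating
    return total

# Invented responses to a four-item scale; items 2 and 4 are worded
# negatively and therefore reverse scored.
total = score_rating_scale([5, 1, 4, 2], reversed_items={2, 4})
print(total)  # 5 + 5 + 4 + 4 = 18
```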
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address
the purpose of the questionnaire, the confidentiality of the information provided in the
questionnaire, and how to complete the questionnaire (that is, how the questionnaire
questions should be tackled). They are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and keeping sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic but the topic is not covered in full
Rate 3: focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: no items or if all items are irrelevant
Rate 1: sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
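The item-sequence rating rule is a count of problems, which can be made explicit in a short sketch. The function and parameter names are hypothetical; the mapping itself follows the rating rule above (no usable items rates 0, all three problems rates 1, no problems rates 4).

```python
def rate_item_sequence(has_relevant_items, response_bias,
                       inefficient, uncomfortable):
    """Rating (0 to 4) for the sequence of items in a questionnaire.

    has_relevant_items -- False if there are no items or all are irrelevant
    response_bias      -- sequence introduces a response style or bias
    inefficient        -- sequence forces inefficient responding
    uncomfortable      -- sequence makes respondents emotionally uncomfortable
    """
    if not has_relevant_items:
        return 0
    problems = sum([response_bias, inefficient, uncomfortable])
    return 4 - problems  # three problems -> 1, none -> 4

# A sequence whose only problem is that it invites a response style.
print(rate_item_sequence(True, True, False, False))  # 3
```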
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: structure limits its functionality with regard to two
Rate 2: structure limits its functionality with regard to one
Rate 3: structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
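Because this rating depends on the earlier rating for the purpose areas covered, it helps to see the rule as a small decision function. The sketch below only restates the rating rule above; the function and parameter names are hypothetical.

```python
def rate_logical_sequence(purpose_areas_rating, in_logical_sequence):
    """Rating (0 to 4) for the logical sequence of the presentation.

    purpose_areas_rating -- the rating (0 to 3) already given for the
                            purpose areas covered in the manual
    in_logical_sequence  -- True if the areas appear in the order (a) (b) (c)
    """
    if purpose_areas_rating not in (0, 1, 2, 3):
        raise ValueError("purpose areas rating must be 0, 1, 2 or 3")
    if purpose_areas_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    base = 1 if purpose_areas_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)

# Two purpose areas covered, presented in the logical order.
print(rate_logical_sequence(2, True))  # 2
```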
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: one kept
Rate 3: two kept
Rate 4: all kept
Effective communication requires clear, precise and correct language written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a) (b) and (c)
Rate 3: any one of (a) (b) and (c)
Rate 4: none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used but the characteristics of the sample group differ from
the target population
Rate 2: describes the sample used and the characteristics of the sample group correspond
to the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of
questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1) (2) and (3)
Rate 3: all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1) (2) and (3)
Rate 3: all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
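The rating for the administration instructions depends only on how many of the five elements the manual covers. A minimal sketch of that mapping follows; the function name is hypothetical, and it assumes that a manual covering none of the elements counts as not providing instructions.

```python
def rate_administration_instructions(elements_covered):
    """Rating (0 to 3) from the number of the five listed elements covered."""
    if not 0 <= elements_covered <= 5:
        raise ValueError("elements_covered must be between 0 and 5")
    if elements_covered == 0:
        return 0  # no instructions provided
    if elements_covered <= 2:
        return 1  # one or two elements
    if elements_covered <= 4:
        return 2  # three or four elements
    return 3      # all five elements

# A manual that says who may administer the questionnaire, in which
# situations, and what material is needed covers three elements.
print(rate_administration_instructions(3))  # 2
```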
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but it does not provide instructions
Rate 1: if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2: if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Action: You need to identify the coverage required for each content area. You need at least
one item on each of these content areas. In some cases one item is not enough. For
example, if you want information on stress levels associated with different crimes, you
might want to use a rating scale. Rating scales do not have a fixed number of items, but for
the purposes of this assignment your rating scale should consist of at least
twelve items. It is also useful to have more than one item dealing with the same aspect to
serve as a control, so that you can see whether the respondent is answering questions
consistently or not. For example, in addition to your rating scale you might also have an
open-ended question that deals with the same content area.
Action: You should evaluate the impact of the characteristics of the respondents and the time
available for completing the questionnaire.
You could cover the content domain comprehensively with 21 items (some of which may be
grouped into a rating scale containing approximately twelve items). We could break down
the coverage of the content areas as follows: the first three items would be closed
questions to collect biographical information, then a filter question (closed yes/no type),
followed by an open question on personal experience of crime, a rating scale (consisting of
twelve items) on levels of stress associated with different crimes, a closed (multiple-choice)
question on personal reactions to crimes and an open question to serve as a control, an
open question on perceptions of the effect of crime, and lastly an open question for any
other comments the respondent may wish to add. You would therefore have five closed items, four
open items and a twelve-item rating scale (a total of 21 items). The questionnaire should not
be too long or complicated.
Layout of the questionnaire
1 Introduction and covering letter
A well-designed questionnaire with a professional appearance is more likely to be
completed. The introduction informs respondents about the purpose, convinces them that
their participation is valued, motivates them to complete the questionnaire, reduces their
fears regarding time and inconvenience, and assures them of confidentiality and safety.
Guidelines for an introduction to a questionnaire
1 the name of the person or organisation conducting the study to establish credibility
2 a general statement of the objectives of the questionnaire
3 assurance that their participation is valued and confidential
4 some estimate of the time required to complete the questionnaire
2 Confidentiality and anonymity
Anonymity elicits more accurate information because respondents have greater freedom to express themselves
without fear that their responses will be used in a way that is not in their interests.
This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent. Ideally the
questionnaire should take about 30 minutes to complete. The length also depends on the characteristics
of the respondents. Specialists are more willing to complete a longer questionnaire. For people with
low levels of literacy or education it is better to keep questionnaires short. Make sure that each
question is directly relevant, but remember that you need thorough coverage of your topic to ensure
<reliability> and <validity>. The aim is to strike a balance between a concise questionnaire and one
that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve-item rating scale (moving from the general to the more specific) - the funnel
approach
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers ...". If there are only a few items on biographical
information they can be placed at the beginning, but if there are a lot of items it is better
to place them at the end
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area
6 Place one or more open-ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the questions.
Respondents are then more likely to feel satisfied that answering the questions was worth the effort
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses)
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions, for example "If the answer is no, skip the next few questions".
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just from
personal curiosity and intuition. Good items are critical to the success of a research project.
They produce reliable data and accurate information upon which valid conclusions can be
based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure
2 Constructing items is both a science and an art. As a science it requires an in-depth
knowledge of one's topic and familiarity with the principles governing good item design. As an
art it requires creativity in selecting or constructing items appropriate to the particular context
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid items
that are humiliating or confusing, or that make respondents feel inadequate
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. Even so, you must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic
Don't be tempted to ask questions that are interesting but not vital to your research
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below)
Do phrase your items in such a way that the language level matches that of your
respondents
If you are not sure whether items would be easily understood, do present them to a small
group of respondents
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective
Do be aware of possible cross-cultural differences
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group
Do try to have your items correctly translated
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways, for example
"Visiting lecturers can help one feel less isolated"
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues, for example
"I am fully occupied and I don't feel lonely"
Rather break such questions or statements into two separate items
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue
Wherever possible don't use negatives
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to, for example
"It is believed by students that they will be given extension by lecturers"
The following is simpler
"Students believe lecturers will give them extension"
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers
Do write items that are specific, simple, clear and to the point
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months)
When asking questions relating to factual information, do make sure your respondents
have the information
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer, for example
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary, for example
"Do you use any word processing packages such as ZZ?"
4 Problems relating to response bias or response style
Response bias and response style are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Deliberate faking may occur when respondents are fully aware of what is being measured and for what
purpose, when their identity is disclosed and when they are aware that their responses will
affect them in some way. Respondents may also lie to protect their real feelings, to justify
their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
A response style is the tendency to make a particular type of response: some respondents tend to
choose extreme responses, while others repeatedly choose central responses. To counter this, design
balanced questionnaires that mix items that are positively stated and negatively scored,
e.g. "People often let me down", with items that are positively stated and positively scored,
e.g. "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. The shortcoming of each of the
following questions is noted below it.
1 "What is your income?"
Shortcoming: vague
2 "Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking?" (Yes/No)
Shortcoming: leading question which contains two inherent issues and a double negative
3 "We should be less passive about what is happening in the environment"
(agree/uncertain/disagree)
Shortcoming: vague - "Who is 'we'?", "Less passive than what?" and "What environment?"
4 "I feel depressed and sad" (never/sometimes/often/all the time)
Shortcoming: two inherent issues; 'Depression' and 'often' may mean different things to different
people
5 "How often do you take drugs?" (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways
6 "Abortion should not be legalised" (agree/disagree)
Shortcoming: too global
7 "Most men are more emotionally stable than most women are" (agree/disagree)
Shortcoming: is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
Shortcoming: does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the 12-item
rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire out on.
The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the ethical checklist should be YES.
Ethical checklist (for each question record Yes/No and any notes)
1 Respondents understand why they are being asked to complete the questionnaire
(a) It is to help you with your studies
(b) It is to help you improve the questionnaire
(c) They will not get their scores back
2 Respondents understand that they don't have to complete the questionnaire
(a) You won't hold it against them if they decide not to do it
(b) You won't tell anybody else if they refused
3 Respondents understand that their responses will be confidential
(a) Their names won't be on the questionnaire
(b) You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can make
improvements. When you get a questionnaire back, quickly scan it to see that the respondent has
completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is to use
what happened during the study. Look again at the notes you made while you were administering
the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis consists of procedures to select the best items for inclusion. Commonly used criteria
are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
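The difficulty index described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 1/0 coding of correct and incorrect responses and the pilot data are all hypothetical, and only the 0.25 to 0.75 band comes from the notes.

```python
def difficulty_index(item_responses):
    """Proportion of respondents who answered the item correctly.

    item_responses is a list with 1 for a correct response and 0 otherwise.
    """
    return sum(item_responses) / len(item_responses)


def flag_items(responses_by_item, low=0.25, high=0.75):
    """Return {item: (difficulty index, keep)}; keep is True inside [low, high]."""
    result = {}
    for item, responses in responses_by_item.items():
        p = difficulty_index(responses)
        result[item] = (p, low <= p <= high)
    return result


# Hypothetical pilot data: 8 respondents, 3 items (1 = correct, 0 = incorrect)
pilot = {
    "item_1": [1, 1, 0, 1, 0, 0, 1, 0],  # 4 of 8 correct
    "item_2": [1, 1, 1, 1, 1, 1, 1, 0],  # 7 of 8 correct: too easy
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0],  # 1 of 8 correct: too difficult
}

flags = flag_items(pilot)
for item, (p, keep) in flags.items():
    print(item, p, "keep" if keep else "discard")
```

For a rating scale there is no "correct" response, which is why the notes go on to look at item variance instead.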
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according to
whatever the measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic, otherwise the instrument loses focus. The higher the correlation
coefficient between the item score and the total score, the more discriminating the item. A minimum
correlation of 0.2 is generally required. Items with negative or zero correlations are almost always
excluded. A negative correlation could indicate that an item should have been reverse scored.
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response scale: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement (the respondent ticks one response for each statement)
I like loud music √
I prefer quiet places √
I enjoy noisy environments √
The ticked options are known as item responses. Item 2 in our example should be 'reverse-scored'
because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Response scale: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                                  Ticked   Item score
I like loud music                          √        5
I prefer quiet places (REVERSE SCORE)      √        5
I enjoy noisy environments                 √        4
The data sheet is divided into rows and columns: one row for each person in your sample,
and one column for each item in your rating scale plus a column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item rating
scale is 12 and the highest is 60.
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
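The scoring step for the data sheet can be sketched as follows. This is a minimal Python illustration, not part of the original notes: the function names are hypothetical, and the ticked response of 1 (Never) for "I prefer quiet places" is inferred from the reverse-scoring explanation rather than stated outright.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # 1 = Never ... 5 = Always


def reverse_score(response):
    """Mirror a response on the 1-5 scale: 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1."""
    return SCALE_MIN + SCALE_MAX - response


def score_respondent(responses, reverse_items=()):
    """Return (item scores, total score); reverse_items holds 1-based item numbers."""
    scores = [
        reverse_score(r) if number in reverse_items else r
        for number, r in enumerate(responses, start=1)
    ]
    return scores, sum(scores)


# The three example statements: loud music, quiet places (reverse scored), noisy environments
ticked = [5, 1, 4]  # the 1 for item 2 is an assumption inferred from the text
item_scores, total = score_respondent(ticked, reverse_items={2})
print(item_scores, total)  # [5, 5, 4] 14
```

With twelve items the same function gives a total of 12 when every item score is 1 and 60 when every item score is 5, which matches the range given in the notes.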
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, that is, items where almost everybody in the sample gets the same item score. You want
your scale to show differences between people.
Compare items and decide which are the better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
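The notes ask you to judge variance by eye. If the data sheet is captured electronically, the same check can be approximated by computing the variance of each column. The sketch below is an illustration only: the data are hypothetical, and the cut-off of 0.5 is an arbitrary assumption, not a value given in the notes.

```python
from statistics import pvariance


def item_variances(data_sheet):
    """data_sheet holds one row per respondent, each row a list of item scores."""
    columns = zip(*data_sheet)  # turn rows into one column per item
    return [pvariance(column) for column in columns]


def low_variance_items(data_sheet, threshold=0.5):
    """Return the 1-based numbers of items whose variance is below the threshold."""
    variances = item_variances(data_sheet)
    return [number for number, v in enumerate(variances, start=1) if v < threshold]


# Hypothetical data sheet: 5 respondents, 3 items
sheet = [
    [5, 3, 4],
    [1, 3, 4],
    [4, 3, 5],
    [2, 3, 4],
    [3, 4, 4],
]

print(item_variances(sheet))
print(low_variance_items(sheet))
```

Item 1 has a good spread of scores, while items 2 and 3 contain mainly one number and are flagged as showing too little variance.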
It is not always possible to explain why most people end up answering an item in the same
way. The item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. The correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. A correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the strength
of the relationship, while the sign (positive or negative) indicates the direction of the relationship.
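The notes do not state a formula for the correlation coefficient. The one most commonly used for scores like these is the Pearson product-moment coefficient, shown here as an assumption; the symbols X, Y and n are introduced only for this illustration (two sets of scores for the same n people).

```latex
r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
              {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}},
\qquad -1 \le r_{XY} \le +1
```

The numerator is positive when people who score above the mean on X also tend to score above the mean on Y, which gives the positive sign; the denominator scales the result so that it cannot fall outside -1 to +1.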
2 The scatter plot
The scatter plot is the graphic display of the relation expressed by the correlation coefficient.
If there is a perfect positive relation between two constructs (a correlation coefficient of +1),
the dots form a perfectly straight line with an upward slope. For a correlation coefficient of -1,
the dots form a perfectly straight line with a downward slope. No relation (a correlation
coefficient of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because this shows that the item measures more or less the same thing as the other items in
the scale.
Case   Score on item 3   Total score
1      4                 53
2      1                 12
3      5                 49
4      4                 40
5      2                 16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a <correlation coefficient> (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
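As an illustration, the item-total correlation for item 3 can be computed from the five cases listed above. The sketch assumes the Pearson coefficient (the notes do not name a specific formula), and `pearson_r` is a hypothetical helper written for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Scores of cases 1 to 5 on item 3 and on the scale total, taken from the data sheet
item_3 = [4, 1, 5, 4, 2]
total = [53, 12, 49, 40, 16]

r_item_total = pearson_r(item_3, total)
print(round(r_item_total, 2))  # 0.94
```

A value of about 0.94 is well above the minimum of 0.2 mentioned earlier, which agrees with the impression that item 3 discriminates well. Note that the total used here includes the item itself, as in the notes.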
Figure: Scatter plot of the relation between the scores on item 3 and the scale total
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale should be more coherent and have a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
The statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions, administer
the same questionnaire to the same group on two consecutive occasions. The two sets of scores are
correlated and the correlation coefficient represents the degree of test-retest reliability. The
closer the correlation coefficient is to 1, the more consistent the results. Test-retest reliability
thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first, only that
each person's relative position to that of the others in the group stays the same. The time interval
should be at least several days to reduce the possibility of effects such as familiarity with the type
of items or respondents remembering their answers.
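The point that a perfect correlation does not require identical scores can be checked with a small example. The Python sketch below uses hypothetical scores for five respondents and assumes the Pearson coefficient as the correlation measure; `pearson_r` is a helper written for this illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical total scores of the same five people on two occasions
first = [20, 25, 31, 36, 42]

# Everyone scores 3 points higher the second time: relative positions are unchanged
second_shifted = [score + 3 for score in first]
r_shifted = pearson_r(first, second_shifted)

# The same five scores, but attached to different people: relative positions change
second_reordered = [31, 20, 42, 25, 36]
r_reordered = pearson_r(first, second_reordered)

print(round(r_shifted, 2), round(r_reordered, 2))
```

In the first comparison no score is identical across occasions, yet the correlation is perfect because each person keeps the same position in the group. In the second comparison the positions change and the correlation drops sharply.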
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to the possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the two halves applies to a questionnaire of only half the length. The estimate
of the reliability of the whole questionnaire is called the split-half reliability. It measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency: the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
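The notes do not say how the correlation between the two halves is turned into an estimate for the whole questionnaire. The standard adjustment for this purpose is the Spearman-Brown formula, shown here as an assumption rather than as the method prescribed by the notes; the function name and the example value of 0.6 are hypothetical.

```python
def spearman_brown(r_half):
    """Estimate the reliability of the full questionnaire from the half-test correlation.

    Uses the Spearman-Brown formula for doubling test length: 2r / (1 + r).
    """
    return 2 * r_half / (1 + r_half)


# Hypothetical correlation of 0.6 between the odd-item and even-item halves
print(round(spearman_brown(0.6), 2))  # 0.75
```

The adjusted value is higher than the half-test correlation, which reflects the statement above that a shorter questionnaire is generally less reliable than a longer one.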
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at your eight items and re-number them from 1 to 8. Then divide the rating scale into
two halves by grouping the odd items and the even items together. For each
person calculate the total score for the odd items and for the even items.
Case   Odd items (1 3 5 7)   Total score   Even items (2 4 6 8)   Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person take the total score on the odd items and the total score on the even
items, and where the two meet make a dot on the graph. Draw a straight line
that follows the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
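The odd-even procedure of this activity can also be carried out numerically instead of by scatter plot. The Python sketch below is an illustration under stated assumptions: the responses of the five respondents are hypothetical, the Pearson coefficient is assumed as the correlation measure, and `half_totals` and `pearson_r` are helper names introduced for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def half_totals(row):
    """Return (total on odd items 1 3 5 7, total on even items 2 4 6 8) for one person."""
    odd = sum(row[0::2])   # list positions 0, 2, 4, 6 hold items 1, 3, 5, 7
    even = sum(row[1::2])  # list positions 1, 3, 5, 7 hold items 2, 4, 6, 8
    return odd, even


# Hypothetical item scores of 5 respondents on the re-numbered 8-item scale
data = [
    [5, 4, 5, 5, 4, 5, 5, 4],
    [2, 1, 2, 2, 1, 1, 2, 2],
    [3, 3, 4, 3, 3, 4, 3, 3],
    [4, 5, 4, 4, 5, 4, 4, 4],
    [1, 2, 1, 2, 1, 1, 2, 1],
]

odd_totals, even_totals = zip(*(half_totals(row) for row in data))
r_halves = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(r_halves, 2))
```

Each pair of odd and even totals corresponds to one dot on the scatter plot described above, and a correlation close to +1 corresponds to dots lying close to an upward-sloping line.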
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that is, the
extent to which the scores can be used for the intended purpose. There are three categories of
validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some current
behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At the
same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for entrance
into this course, take a representative group of students applying for the course and administer
your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of a
single validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire is how well the
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity is relevant when you want to draw conclusions about a theoretical construct
that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the
questionnaire. Give instructions for the administration of the questionnaire and for the
scoring of the questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. Where they bear
on the subject matter of the questionnaire, it is important to state for which country and
age group the questionnaire has been developed. A brief description of the design of the
questionnaire should be provided. The type of items should also be indicated, for example
multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate to
what extent they are representative in terms of those characteristics that define the target
population. Also mention when and under which circumstances the questionnaire was
administered to them. Describe each technique used for item analysis and indicate which
criteria were used to justify the inclusion or exclusion of items in the item selection
process. It is important for the user to know how reliable or consistent the questionnaire
is. Give a brief description of the method used to determine reliability and justify why it
was used. The estimated reliability coefficient is then evaluated in terms of an acceptable
level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring.
An answer could score a 1 (correct) or a 0 (incorrect). Rating scales: on a five-point rating
scale, 1 = do not agree at all and 5 = strong agreement with the statement. The total score
indicates the person's attitude towards the topic under investigation. Reverse scoring might
be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions are provided in a cover letter. They should address the purpose of the
questionnaire, the confidentiality of the information provided in the questionnaire, and how
to complete the questionnaire (that is, how the questions should be tackled).
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide
name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to
provide name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions, by using filter
questions and by keeping a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focuses on the topic
Rate 2: focuses on the topic, but the topic is not covered in full
Rate 3: focuses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the
structure and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two functions
Rate 2: the structure limits its functionality with regard to one function
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of
items for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
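The rating rule for the logical sequence depends on the rating already given for the purpose areas covered in the manual. The sketch below writes that dependency out as a small function; the function and parameter names are hypothetical and are not part of the rating scale itself.

```python
def rate_logical_sequence(purpose_areas_rating, in_logical_sequence):
    """Rate the logical sequence of the presentation (0 to 4).

    purpose_areas_rating: rating (0 to 3) already given for the purpose areas covered.
    in_logical_sequence: True if the purpose areas appear in the order (a), (b), (c).
    """
    if purpose_areas_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose areas rating must be 0, 1, 2 or 3")
    if purpose_areas_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_areas_rating == 2:
        return 2 if in_logical_sequence else 1
    # All three purpose areas are covered.
    return 4 if in_logical_sequence else 3


# Example: a manual that covers all three purpose areas but presents them out of order.
example_rating = rate_logical_sequence(3, in_logical_sequence=False)
print(example_rating)
```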
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains all three of the following
(a) difficult language
(b) ambiguous statements
(c) grammar and spelling mistakes
Rate 2: if the manual contains any two of (a), (b) and (c)
Rate 3: if the manual contains any one of (a), (b) and (c)
Rate 4: if the manual contains none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: the manual describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four
Rate 3: instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for
both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but
that the remaining 82 items did not meet the criteria to be included. Thirteen were excluded,
thus 50 items were retained. The validity coefficient of 0.91 is high.
Anonymity elicits more accurate information: respondents have greater freedom to express
themselves without fear that their responses would be used in a way that is not in their
interests. This is important in surveys that involve 'sensitive' topics.
3 Length of the questionnaire
The length depends on the topic and the degree of interest it holds for the respondent.
Ideally it should take 30 minutes to complete. The length also depends on the
characteristics of the respondents. Specialists are more willing to complete a longer
questionnaire. For people with low levels of literacy or education it is better to keep
questionnaires short. Make sure that each question is directly relevant, but you also need
thorough coverage of your topic to ensure reliability and validity. The aim is to strike a
balance between a concise questionnaire and one that is inclusive enough to ensure validity.
4 Presentation and sequence of questions
1 Try to avoid putting ideas into the respondents' minds or suggesting preferable
attitudes. Start with open questions and then introduce more structured questions.
2 Start with a broad question that orients the respondent to the topic, followed by the
twelve-item rating scale (moving from the general to the more specific) - the funnel
approach.
3 It is better to put personal data questions near the end, preceded by a short explanation
such as "To help us classify your answers ...". Items on biographical information: only a
few items may be placed at the beginning, but if there are a lot of items it is better to
place them at the end.
4 You probably have groups of questions relating to particular aspects of your main topic.
Decide on the order in which to present these groups of questions. There are two main
considerations: the logic of the survey and the likely reactions of the respondents. Start
off with 'awareness' questions relating to the topic in general, followed by 'factual'
questions dealing with the respondents' own actions or behaviour. Then you might
include questions on likes and dislikes, preferences and attitudes.
5 Sensitive or very personal issues should come toward the end of the questionnaire to
avoid embarrassing or offending the respondents. A closed question and an open
question serve as a sort of validity check for this content area.
6 Place one or more open-ended questions at the end to allow the respondents to express
opinions or feelings that are related to the topic but have not been covered by the
questions. Respondents are then more likely to feel satisfied that answering the questions
was worth the effort.
5 Balance of question types
The ideal is to vary the type of questions so that the respondents do not get bored or
irritated (which may affect the validity of their responses)
6 Filter questions
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions, for example: "If the answer is no, skip the next few questions."
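The skip logic behind a filter question can be sketched as follows. The questions, the function name `questions_to_ask` and its parameters are hypothetical and serve only to illustrate how a "no" answer removes the follow-up questions.

```python
def questions_to_ask(questions, filter_index, skip_count, filter_answer):
    """Return the questions a respondent should see after one filter question.

    questions: ordered list of question texts.
    filter_index: position of the filter question in the list.
    skip_count: number of follow-up questions to skip when the answer is "no".
    filter_answer: the respondent's answer to the filter question.
    """
    if filter_answer.strip().lower() == "no":
        start = filter_index + 1
        return questions[:start] + questions[start + skip_count:]
    return list(questions)


# Hypothetical questionnaire fragment (assumption, not taken from the notes).
questions = [
    "Do you study in a group?",              # filter question
    "How many people are in your group?",    # only relevant after "yes"
    "How often does your group meet?",       # only relevant after "yes"
    "How many hours do you study per week?",
]

shown_if_no = questions_to_ask(questions, filter_index=0, skip_count=2, filter_answer="No")
shown_if_yes = questions_to_ask(questions, filter_index=0, skip_count=2, filter_answer="Yes")
```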
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research
project. They produce reliable data and accurate information upon which valid conclusions
can be based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is both a science and an art. As a science it requires an in-depth
knowledge of one's topic and familiarity with the principles governing good item design. As
an art it requires creativity in selecting or constructing items appropriate to the
particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating, confusing or make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability
and validity have already been established. You must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax.
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways:
"Visiting lecturers can help one feel less isolated."
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues:
"I am fully occupied and I don't feel lonely."
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to:
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer:
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary:
"Do you use any word processing packages such as ZZ?"
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable
response. Deliberate faking occurs when respondents are fully aware of what is being
measured and for what purpose, when their identity is disclosed and when they are aware that
their responses will affect them in some way. Respondents may also lie to protect their real
feelings or justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
This is the tendency to make a particular type of response: some respondents tend to choose
extreme responses, others repeatedly choose central responses. Design balanced
questionnaires that include items that are positively stated and negatively scored, e.g.
"People often let me down", and items that are positively stated and positively scored, e.g.
"I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions:
1 What is your income?
- vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
- leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
- vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
- two inherent issues; 'Depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
- imprecise and may be interpreted in various ways
6 Abortion should not be legalised. (agree/disagree)
- too global
7 Most men are more emotionally stable than most women are. (agree/disagree)
- is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
- does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try it out on. The sample
should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES.
Ethical checklist (record Yes/No and notes for each point)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you
can make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first
is to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis refers to procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index
for an item is usually calculated by dividing the number of people who gave a correct
response by the total number of people in the sample. The difficulty index should be between
0.25 and 0.75, and the average difficulty should be about 0.5.
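The difficulty index described above is a simple proportion, so it is easy to compute by hand or in a few lines of code. The sketch below uses hypothetical pilot data and hypothetical names (`difficulty_index`, `keep_item`); the cut-off values of 0.25 and 0.75 come from the guideline in the text.

```python
def difficulty_index(item_scores):
    """Proportion of respondents who answered the item correctly.

    item_scores: list with one entry per respondent, 1 = correct and 0 = incorrect.
    """
    if not item_scores:
        raise ValueError("no responses for this item")
    return sum(item_scores) / len(item_scores)


def keep_item(index, low=0.25, high=0.75):
    """An item is kept if it is neither too difficult nor too easy."""
    return low <= index <= high


# Hypothetical pilot data (assumption): eight respondents, three items.
responses = {
    "item_1": [1, 1, 0, 1, 0, 0, 1, 0],  # 4 of 8 correct
    "item_2": [1, 1, 1, 1, 1, 1, 1, 0],  # 7 of 8 correct, too easy
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0],  # 1 of 8 correct, too difficult
}

indices = {name: difficulty_index(scores) for name, scores in responses.items()}
retained = [name for name, index in indices.items() if keep_item(index)]
```

In this made-up data only the first item falls inside the recommended range, so it is the only one retained on the difficulty criterion.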
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic - otherwise the scale loses focus. The higher the correlation
between the item score and the total score, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored
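The discrimination check can be sketched as an item-total correlation. This is a minimal example under stated assumptions: the data are invented, the helper `pearson` is written out by hand, and the total score includes the item itself (a simplification).

```python
from math import sqrt

# Hypothetical data sheet: one row per respondent, one column per item (scores 1 to 5).
data = [
    [5, 4, 1],
    [4, 5, 2],
    [3, 3, 3],
    [2, 2, 4],
    [1, 1, 5],
]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either list has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


totals = [sum(row) for row in data]
item_total = [pearson([row[i] for row in data], totals) for i in range(len(data[0]))]

# Items below the 0.2 minimum are candidates for exclusion.
weak = [i for i, r in enumerate(item_total) if r < 0.2]
# A negative value may mean the item should have been reverse scored.
negative = [i for i, r in enumerate(item_total) if r < 0]
```

Here the third item runs against the other two, so it is flagged both as weak and as a possible reverse-scoring oversight.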
3 How many items to exclude
It is usual to discard 20% to 30% of the items
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire
Action Compile a data sheet. The possible responses to each item in the scale have a
number from 1 to 5: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                      Response ticked
I like loud music              5 Always
I prefer quiet places          1 Never
I enjoy noisy environments     4 Most of the time
The ticked options are known as item responses. Item 2 in our example should be 'reverse-scored'
because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places and she should therefore get a high score
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                                 Response ticked       Item score
I like loud music                         5 Always              5
I prefer quiet places (REVERSE SCORE)     1 Never               5
I enjoy noisy environments                4 Most of the time    4
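Reverse scoring on a five-point scale can be sketched as follows. The ticked values follow the example (with 'Never' for item 2, as the text describes); the function name `reverse_score` and the dictionary layout are assumptions made for illustration.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # five-point scale used in the example


def reverse_score(raw):
    """Mirror a response around the midpoint: 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - raw


# Ticked responses for the three example statements.
ticked = {"loud music": 5, "quiet places": 1, "noisy environments": 4}
# Items worded in the opposite direction to the scale.
reverse_items = {"quiet places"}

scores = {
    item: reverse_score(raw) if item in reverse_items else raw
    for item, raw in ticked.items()
}
```

Somebody who ticks 'Never' for quiet places therefore gets the same high score as somebody who ticks 'Always' for a noisy item.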
The data sheet is divided into rows and columns - one row for each person in your sample,
one column for each item in your rating scale and one column for the total score. Take a questionnaire
and transfer the score for each item to the first row on the data sheet
Calculate each respondent's total score. With 12 items scored from 1 to 5, the lowest total anybody can have on the rating scale is 12 (12 x 1) and the
highest is 60 (12 x 5)
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
Action Start with the actual item analysis of your rating scale Find items with too little
variance where almost everybody in the sample gets the same item score You want your
scale to show differences between people
Compare items and decide which are better items in terms of the amount of variance they
show
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly only one number the item doesn't
show much variance. If a column contains a good spread of numbers the item shows lots of
variance
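The same check can be done numerically rather than by eye. The sketch below computes the variance of each column of a hypothetical data sheet; the cut-off `MIN_VARIANCE` is an arbitrary assumption and does not come from the text.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item (scores 1 to 5).
data = [
    [5, 3, 4],
    [5, 1, 4],
    [5, 5, 3],
    [5, 2, 4],
    [4, 4, 4],
]

MIN_VARIANCE = 0.5  # assumed cut-off for "too little variance"

# Turn rows into columns so that each tuple holds all responses to one item.
columns = list(zip(*data))
variances = [pvariance(col) for col in columns]

# Items where almost everybody gave the same response.
low_variance = [i for i, v in enumerate(variances) if v < MIN_VARIANCE]
```

The first and third columns contain mainly one number, so they are flagged, while the second column shows a good spread.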
It is not always possible to explain why most people end up answering an item in the same
way - the item may be too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour eg anxiety intelligence stress independence etcetera Correlation
coefficient - the relation between the constructs
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation coefficient ranges from -1 to +1. A value close to 0
indicates a weak relationship, while 0 represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect positive relation
between two constructs (a correlation coefficient of +1) the dots form a perfectly straight
line with an upward slope. For a correlation coefficient of -1 the dots form a perfectly
straight line with a downward slope. No relation (a correlation coefficient of 0) between two
constructs results in an undefined shape
3 Using correlations in item analysis
If the item correlates strongly with the total score we know that it measures more or less
the same thing as the other items
Action To measure differences between people our items need to show some variance but
even if the items show lots of variance the scale may not measure anything in particular
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale
Case    Item 3    Total score
1       4         53
2       1         12
3       5         49
4       4         40
5       2         16
Item 3 does seem to be pretty good at discriminating between high and low scorers
Looking at whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all then there is no correlation; if the line seems to go from top left to bottom
right then there is a negative correlation (the item does discriminate but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items)
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is the cause, but
sometimes it seems inexplicable and one just has to accept that it is so
Action Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 of the 12 items. Study the list of items that don't show much variance and the
list of items that don't discriminate well
Your 8 item scale is more coherent and has a greater degree of reliability
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is the questionnaire should
measure what it claims to measure
Reliability
How consistently the questionnaire measures that which it is supposed to measure
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire eg the occasion on which
the questionnaire is administered or the sample of items in the questionnaire Their effect
on the results is unpredictable and inconsistent These irrelevant conditions are called
unsystematic sources of variation The reliability refers to the consistency of results
over different administrations involving different occasions test forms etc
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient to 1 the more reliable the test
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions, administer the
same questionnaire to the same group on two consecutive occasions. The two sets of scores are correlated
and the correlation coefficient represents the degree of test-retest reliability. The closer the
correlation coefficient is to 1, the more consistent the results. Test-retest reliability thus indicates
stability or consistency of scores over time
A perfect correlation does not mean that the second scores were identical to the first, only that each person's
position relative to the others in the group stays the same. The time interval should
be at least several days to reduce the possibility of effects such as familiarity with the type
of items or respondents remembering their answers
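The point that a perfect correlation does not require identical scores can be checked with a small sketch. The scores are invented and `pearson` is a hand-written helper, not a function from the study material.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores for four people on two occasions.
first_occasion = [10, 14, 18, 22]
# Everybody scores 3 points higher the second time, so no score is identical,
# but each person's position relative to the others is unchanged.
second_occasion = [13, 17, 21, 25]

test_retest = pearson(first_occasion, second_occasion)
```

The coefficient comes out at 1 even though every individual score changed, because only relative positions matter.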
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions or stability over time. This offers some solution to the possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. Because a shorter questionnaire is generally less reliable, the correlation between the two halves
has to be adjusted to estimate the reliability of the whole questionnaire. This estimate is called the split-half reliability and it measures the
degree of equivalence between the two halves, that is the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire
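The text does not name the adjustment from half-length to whole-length reliability. A commonly used choice is the Spearman-Brown formula, shown here as an assumption rather than as the method prescribed by the study material; the half-scale correlation of 0.6 is invented.

```python
def spearman_brown(r_half):
    """Estimate whole-scale reliability from the correlation between two halves.

    Uses the Spearman-Brown formula for doubling the length of a scale:
    r_whole = 2 * r_half / (1 + r_half).
    """
    return 2 * r_half / (1 + r_half)


# Hypothetical correlation between the odd-item and even-item totals.
r_half = 0.6
r_whole = spearman_brown(r_half)
```

Because the whole scale is twice as long as either half, the estimate for the whole scale is higher than the raw correlation between the halves.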
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group
Activity 5.1 Evaluate the reliability of the rating scale
Action You should be able to distinguish between different types of reliability The purpose
of the questionnaire determines which type of reliability is appropriate
The internal consistency of a rating scale - the extent to which the items measure the same
thing Obtain an estimate of the split-half reliability of this rating scale the degree of
equivalence between two halves of the rating scale A limitation of this method is that the
reliability coefficient that one obtains to some extent depends on the items included in
each of the two halves
Action Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together Now re-number your items from 1 to 8 For each
person calculate the total score for the odd items and for the even items
Case    Odd items (1 3 5 7)    Total score    Even items (2 4 6 8)    Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale
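Filling in the two-halves data sheet can be sketched as below. The item scores are invented; items are numbered 1 to 8 as in the text, so odd items sit at list positions 0, 2, 4 and 6.

```python
# Hypothetical final rating scale: one row per person, eight items scored 1 to 5.
data = [
    [5, 4, 5, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2, 1, 2],
    [3, 3, 4, 3, 3, 3, 4, 3],
    [4, 4, 3, 4, 4, 5, 4, 4],
]

# Items 1, 3, 5, 7 are at even list positions (0, 2, 4, 6) because lists start at 0.
odd_totals = [sum(row[0::2]) for row in data]
# Items 2, 4, 6, 8 are at odd list positions (1, 3, 5, 7).
even_totals = [sum(row[1::2]) for row in data]

# Each pair (odd total, even total) is one dot on the scatter plot.
dots = list(zip(odd_totals, even_totals))
```

Correlating `odd_totals` with `even_totals` then gives the estimate of the reliability of either half.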
For each person take the total score on the odd items and the total score on the even
items and where the two meet you make a dot on the graph Draw a straight line
resembling the shape of the scatterplot
If your scatter plot has a very undefined shape the correlation coefficient is close to 0
indicating a weak relation between the two halves If the line has an upward slope the
correlation coefficient falls between 0 and +1 If most dots are close to the line the
correlation coefficient is close to +1 and there is a fairly strong relation
Action In this context the correlation coefficient is a reliability coefficient Values closer to
1 indicate a more reliable rating scale
Validity
Validity is the extent to which a questionnaire measures what it claims to measure, that is the extent to which the scores
can be used for the intended purpose. There are three categories of validity
evidence: content validity, criterion-related validity and construct validity
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks behaviours or attitudes (the content domain)
that it was designed to measure Content validity can be ensured by proper design
Content validity cannot be expressed in terms of a quantitative index
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity) they might be less motivated and
even unwilling to cooperate
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some current behaviour or
status of individuals
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At the same time
you would ask psychiatrists or clinical psychologists to classify these patients according to
type of disturbance
To evaluate predictive validity the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire to predict some
future performance of individuals. For example, to select candidates for entrance into this
course, take a representative group of students applying for the course and administer your
questionnaire to them
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks
To determine criterion-related validity calculate the correlation between the results and the
measures on the criterion the resulting correlation coefficient is known as the validity
coefficient
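A validity coefficient for the course-selection example could be computed along these lines. All figures are invented, and `pearson` is a hand-written helper introduced only for this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical questionnaire scores obtained when students applied for the course.
questionnaire_scores = [30, 42, 25, 48, 36]
# Hypothetical examination marks for the same students at the end of the course
# (the criterion measure).
exam_marks = [55, 70, 50, 82, 60]

validity_coefficient = pearson(questionnaire_scores, exam_marks)
```

A coefficient close to 1 would suggest that the questionnaire predicts the examination marks well for this group.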
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour For example anxiety is not observable but it forms part of a theory
that explains observable behaviours
You have to define your construct in terms of observable behaviours You can thus define
the construct validity as the extent to which it indeed measures the theoretical construct
it aims to measure Construct validity cannot be expressed in terms of a single validity
coefficient You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires
Convergent validity - if two questionnaires measure the same construct you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated you would not expect a
high correlation
Activity 5.2 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity
The content validity of your questionnaire is influenced by how well you designed the
questionnaire The content domain is the universe of tasks behaviours attitudes etcetera
implied by the purpose of your questionnaire The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - the extent to which conclusions can be made about a theoretical construct that underlies
the behaviours measured by the questionnaire
Action Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire
Give instructions for the administration of the questionnaire for the scoring of the
questionnaire and some guidelines on how to interpret the results
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group the questionnaire has been developed and what its subject
matter is. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated - multiple-choice items or rating scales
2.2 Properties of the questionnaire
To determine how effective the questionnaire is you need to administer it to a group of
people who are representative of the target population. Describe this group and
indicate to what extent they are representative in terms of those characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed and how to deal
with a person asking for an explanation. Provide instructions for scoring. A correct answer
could score 1 and an incorrect answer 0. For rating scales, on a five-point rating scale 1 = do not agree at all
and 5 = strong agreement with the statement. The total score indicates the person's
attitude towards the topic under investigation. Reverse scoring might be necessary
The guidelines for interpretation of the results should be based on the aim of the
questionnaire
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address
the purpose of the questionnaire, the confidentiality of the information provided in the
questionnaire and how to complete the questionnaire (that is, how the
questions should be tackled). They can be provided in a cover letter
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but respondents are required to provide their name
Rate 3 absolutely confidential and identity will not be disclosed, and respondents are not required to provide
their name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (ie the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
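The item sequence rating follows a simple counting rule, which can be sketched as a small function. The function name and the problem labels are invented for illustration; the rule itself is the one in the rating scale above.

```python
def rate_item_sequence(has_relevant_items, problems):
    """Rate the item sequence from 0 to 4.

    problems is the set of problems the sequence seems to cause, taken from
    the hypothetical labels "bias", "inefficiency" and "discomfort".
    """
    if not has_relevant_items:
        # No items, or all items irrelevant.
        return 0
    # Three problems -> 1, two -> 2, one -> 3, none -> 4.
    return 4 - len(problems)


rating = rate_item_sequence(True, {"bias", "discomfort"})
```

A questionnaire whose sequence causes two of the three problems therefore receives a rating of 2.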
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(ie what the questionnaire could be used for, what it is capable of) given its declared
purpose (ie the kind of information it is expected to deliver)? In other words the structure
and functionality of a questionnaire are a function of its declared purpose
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire Questionnaire administrators need to know three things
(a) Whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (ie the aim and design of the questionnaire and the population it can be
used with) (b) the functionality of the questionnaire (ie the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (ie instructions for administration scoring and
interpretation) Focus on whether the manual is able to achieve its purpose not whether it
has in fact achieved this purpose
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
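Because this rating depends on the earlier rating for the purpose areas covered, it can help to write the rule out as a function. The sketch below is only an illustration of the scale above; the function and parameter names are invented.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Rate the logical sequence of the presentation from 0 to 4.

    purpose_area_rating is the 0 to 3 rating already given for the purpose
    areas covered in the manual; in_logical_sequence says whether the areas
    that are covered appear in the order (a), (b), (c).
    """
    if purpose_area_rating <= 1:
        # With at most one purpose area covered there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    # All three purpose areas are covered.
    return 4 if in_logical_sequence else 3


rating = rate_logical_sequence(3, in_logical_sequence=False)
```

A manual that covers all three purpose areas but presents them out of order is rated 3.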
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains all three of the following
(a) difficult language
(b) ambiguous statements
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but the characteristics of the sample group differ from those of the
target population
Rate 2 describes the sample used and the characteristics of the sample group correspond
to those of the target population
25 How items were analysed and selected for the questionnaire Rate 0 not sufficient
information about the analysis and selection of
questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
26 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
27 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Start with a filter or screening question that excludes some respondents from answering
irrelevant questions. If the answer is no, skip the next few questions.
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers. Questions may arise from anomalies or gaps that a
researcher has found in existing theories, from a need to solve a practical problem, or just
from personal curiosity and intuition. Good items are critical to the success of a research project.
They produce reliable data and accurate information upon which valid conclusions can be
based.
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure.
2 Constructing items is both a science and an art. As a science it requires an in-depth
knowledge of one's topic and familiarity with the principles governing good item design. As
an art it requires creativity in selecting or constructing items appropriate to the
particular context.
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion.
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing, or that make respondents feel inadequate.
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability and
validity have already been established. You must critically scrutinise each item.
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate, or you may have to eliminate a number of unsuitable items.
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways:
"Visiting lecturers can help one feel less isolated."
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues:
"I am fully occupied and I don't feel lonely."
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to:
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone, and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most, the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer:
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary:
"Do you use any word processing packages, such as ZZ?"
4 Problems relating to response bias or response style
These are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake their answers when they are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware
that their responses will affect them in some way. Respondents may also lie to protect their
real feelings, to justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
A response style is the tendency to make a particular type of response: respondents may tend
to choose extreme responses or repeatedly choose central responses. Design balanced
questionnaires that mix items that are positively stated and negatively scored, eg "People
often let me down", with items that are positively stated and positively scored, eg "I trust
people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions:
1 What is your income?
Vague.
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
A leading question which contains two inherent issues and a double negative.
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
Vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
Two inherent issues. 'Depression' and 'often' may mean different things to different
people.
5 How often do you take drugs? (never/sometimes/often/all the time)
Imprecise and may be interpreted in various ways.
6 Abortion should not be legalised. (agree/disagree)
Too global.
7 Most men are more emotionally stable than most women are. (agree/disagree)
A leading question.
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
This does not necessarily relate to motivation.
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try it out on. The sample
should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES.
Ethical checklist (answer Yes/No to each question and add notes)
Respondents understand why they are being asked to complete the questionnaire
  It is to help you with your studies
  It is to help you improve the questionnaire
  They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
  You won't hold it against them if they decide not to do it
  You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
  Their names won't be on the questionnaire
  You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get the questionnaire back, quickly scan it to see that they have
completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis comprises procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
In an ideal questionnaire about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
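The difficulty index can be sketched in a few lines of Python. This is only an illustration: the three items and their right/wrong responses are invented, and the 0.25 to 0.75 band is the guideline quoted above.

```python
def difficulty_index(responses):
    """Proportion of respondents who answered correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


# Hypothetical right/wrong data: one list per item, one entry per respondent.
items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # nearly everybody correct: too easy
    "item_2": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # about half correct
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # nearly everybody wrong: too difficult
}

indices = {name: difficulty_index(r) for name, r in items.items()}

# Keep only items whose difficulty index falls inside the guideline band.
kept = [name for name, p in indices.items() if 0.25 <= p <= 0.75]

for name, p in indices.items():
    print(name, p, "keep" if name in kept else "discard")
```

With these made-up responses only item_2 falls inside the band; the other two would be candidates for discarding.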
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected
if they measure the same characteristic; otherwise the scale loses focus. The higher the
correlation coefficient between the item and the total score, the more discriminating the
item. A minimum correlation of 0.2 is generally required. Items with negative or zero
correlations are almost always excluded. A negative correlation could indicate that an item
should have been reverse scored.
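A minimal sketch of this screening rule, assuming a small invented data set of four items and five respondents. The `pearson` and `classify` helpers are illustrative names, and the total score here includes the item itself, which is the simple form of item-total correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def classify(r, minimum=0.2):
    """Apply the guideline: negative -> check scoring, below minimum -> weak."""
    if r < 0:
        return "negative: check reverse scoring"
    if r < minimum:
        return "weak: candidate for exclusion"
    return "keep"


# Hypothetical scores: rows are respondents, columns are items (ratings 1 to 5).
scores = [
    [5, 4, 1, 3],
    [4, 5, 2, 2],
    [2, 1, 4, 3],
    [1, 2, 5, 2],
    [3, 3, 3, 3],
]
totals = [sum(row) for row in scores]

verdicts = []
for i, item in enumerate(zip(*scores), start=1):
    r = pearson(list(item), totals)
    verdicts.append(classify(r))
    print(f"item {i}: r = {r:.2f} -> {verdicts[-1]}")
```

In this invented example the third item runs against the total, which is the pattern one would expect from an item that should have been reverse scored, and the fourth item barely relates to the total at all.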
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always
I like loud music | | | | | √
I prefer quiet places | √ | | | |
I enjoy noisy environments | | | | √ |
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored', because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Statement | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always | Item score
I like loud music | | | | | √ | 5
I prefer quiet places (REVERSE SCORE) | √ | | | | | 5
I enjoy noisy environments | | | | √ | | 4
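Reverse scoring on a 1 to 5 scale amounts to mirroring the response around the midpoint. The sketch below assumes the ticked responses were Always, Never and Most of the time, which is consistent with the item scores 5, 5 and 4 in the example; the helper name `reverse_score` is illustrative.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(response):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


# Ticked responses (assumed from the example): statement -> response number.
responses = {
    "I like loud music": 5,
    "I prefer quiet places": 1,
    "I enjoy noisy environments": 4,
}
reverse_items = {"I prefer quiet places"}

item_scores = {
    statement: reverse_score(r) if statement in reverse_items else r
    for statement, r in responses.items()
}
print(item_scores)
```

Only the item score changes; the respondent's original tick stays on the questionnaire as it was.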
The data sheet is divided into rows and columns - one row for each person in your sample,
one column for each item in your rating scale, and one column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
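The total score column and its possible range can be sketched as follows. The two rows of item scores are invented for illustration, and `total_score` is a hypothetical helper that also checks each row has 12 scores between 1 and 5.

```python
N_ITEMS = 12
SCALE_MIN, SCALE_MAX = 1, 5

# Lowest and highest totals anybody can obtain on the rating scale.
lowest_total = N_ITEMS * SCALE_MIN
highest_total = N_ITEMS * SCALE_MAX


def total_score(item_scores):
    """Sum one respondent's item scores after basic sanity checks."""
    if len(item_scores) != N_ITEMS:
        raise ValueError("expected one score per item")
    if any(not SCALE_MIN <= s <= SCALE_MAX for s in item_scores):
        raise ValueError("item scores must lie on the 1 to 5 scale")
    return sum(item_scores)


# Hypothetical data sheet: one row per respondent, one column per item.
data_sheet = [
    [5, 5, 4, 4, 5, 3, 4, 5, 4, 5, 4, 5],
    [3, 3, 2, 3, 2, 3, 3, 2, 3, 3, 2, 3],
]
totals = [total_score(row) for row in data_sheet]
print(lowest_total, highest_total, totals)
```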
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly only one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
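The eyeball check can be backed up by computing the variance of each column. This is a sketch with invented scores; the cut-off of 0.5 is an arbitrary assumption for illustration and is not a value given in the text.

```python
from statistics import pvariance

# Hypothetical data sheet: rows are respondents, columns are items.
data_sheet = [
    [5, 3, 1],
    [5, 3, 5],
    [5, 4, 2],
    [4, 3, 4],
    [5, 3, 3],
]

# Transpose so that each tuple holds all the scores for one item.
columns = list(zip(*data_sheet))
variances = [pvariance(col) for col in columns]

LOW_VARIANCE_CUTOFF = 0.5  # assumed threshold, chosen only for this example
low_variance_items = [
    i for i, v in enumerate(variances, start=1) if v < LOW_VARIANCE_CUTOFF
]

for i, v in enumerate(variances, start=1):
    print(f"item {i}: variance = {v:.2f}")
print("items with little variance:", low_variance_items)
```

Here the first two columns contain mainly one number, so their variance is small, while the third column shows a good spread.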
It is not always possible to explain why most people end up answering an item in the same
way - the item may be too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, eg anxiety, intelligence, stress, independence, etcetera. The correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction
of the relationship.
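To make size and sign concrete, here is a minimal sketch of the Pearson correlation coefficient, the usual product-moment form. The text does not say which formula the course uses, so treat that choice as an assumption; the score lists are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]   # perfect positive relation
falls_with_x = [10, 8, 6, 4, 2]   # perfect negative relation
unrelated = [3, 1, 3, 1, 3]       # no linear relation with x

print(pearson(x, rises_with_x))
print(pearson(x, falls_with_x))
print(pearson(x, unrelated))
```

The first two results sit at the two ends of the range and the third at zero, apart from floating-point rounding.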
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers,
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases | Item 3 (of items 1 to 12) | Total Score
1 | 4 | 53
2 | 1 | 12
3 | 5 | 49
4 | 4 | 40
5 | 2 | 16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
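As a rough check on the example, the item-total correlation for item 3 can be computed from the five cases in the data sheet above. The sketch assumes the Pearson coefficient is the one intended and uses the totals as given, which include item 3 itself.

```python
from math import sqrt

# Scores taken from the example data sheet: item 3 and the scale total per case.
item_3 = [4, 1, 5, 4, 2]
total = [53, 12, 49, 40, 16]

n = len(item_3)
mean_item = sum(item_3) / n
mean_total = sum(total) / n

# Sum of cross-products and sums of squares around the means.
sxy = sum((i - mean_item) * (t - mean_total) for i, t in zip(item_3, total))
sxx = sum((i - mean_item) ** 2 for i in item_3)
syy = sum((t - mean_total) ** 2 for t in total)

item_total_r = sxy / sqrt(sxx * syy)
print(round(item_total_r, 2))
```

The coefficient comes out at roughly 0.94, which agrees with the impression that item 3 discriminates well between high and low scorers.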
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, eg the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
Test-retest reliability shows how consistent the results of a questionnaire are over different
occasions. Administer the same questionnaire to the same group on two consecutive occasions.
The scores are correlated and the correlation coefficient represents the degree of test-retest
reliability. The closer the correlation coefficient is to 1, the more consistent the results.
Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first; it
indicates that each person's position relative to that of the others in the group stays the
same. The time interval should be at least several days to reduce the possibility of effects
such as familiarity with the type of items or respondents remembering their answers.
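The point about relative position can be illustrated with a small sketch. The scores are invented: on the second occasion everybody scores three points higher, so no score is identical, yet the test-retest correlation is still perfect.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores for the same five people on two occasions.
first_occasion = [10, 14, 18, 22, 26]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

test_retest = pearson(first_occasion, second_occasion)
print(first_occasion == second_occasion, test_retest)
```

The scores differ between the two occasions, but the ordering of the people and the distances between them are unchanged, so the coefficient stays at 1 (apart from rounding).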
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the halves understates the reliability of the full-length questionnaire.
The estimated reliability of the whole questionnaire is called the split-half reliability, and
it measures the degree of equivalence between the two halves, that is, the extent to which
they measure the same attribute. It reflects the consistency and indicates the degree of
relatedness of the items. This type of reliability is therefore also regarded as a measure of
internal consistency: the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
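The odd-even procedure can be sketched as follows, with invented scores for five people on an 8-item scale. The step from half-length to full-length reliability uses the Spearman-Brown formula, a standard correction that the text does not name, so treat that choice as an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data sheet: rows are respondents, columns are items 1 to 8.
data_sheet = [
    [5, 4, 5, 4, 4, 5, 5, 4],
    [2, 3, 2, 2, 3, 2, 1, 2],
    [4, 4, 3, 4, 4, 3, 4, 4],
    [1, 2, 2, 1, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 4, 3, 3],
]

# Items 1, 3, 5, 7 sit at list positions 0, 2, 4, 6; items 2, 4, 6, 8 at 1, 3, 5, 7.
odd_totals = [sum(row[0::2]) for row in data_sheet]
even_totals = [sum(row[1::2]) for row in data_sheet]

r_half = pearson(odd_totals, even_totals)
r_full = spearman_brown(r_half)
print(odd_totals, even_totals, round(r_half, 2), round(r_full, 2))
```

Because the corrected value depends on which items land in each half, a different split of the same eight items could give a somewhat different estimate.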
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree
of equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items and re-number them from 1 to 8. Divide the rating scale
into two halves by grouping the odd items and the even items together. For each
person calculate the total score for the odd items and for the even items.
Cases | Odd items 1, 3, 5, 7 | Total Score (odd) | Even items 2, 4, 6, 8 | Total Score (even)
1 | | | |
2 | | | |
3 | | | |
4 | | | |
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that
is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
construct validity as the extent to which the questionnaire indeed measures the theoretical
construct it aims to measure. Construct validity cannot be expressed in terms of a single
validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this
construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - the extent to which you can draw conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country this questionnaire has been developed, and for which age group, where
this is relevant to the subject matter of the questionnaire. A brief description of the design
of the questionnaire should be provided. The type of items should also be indicated, for
example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent it is representative in terms of those characteristics that define the
target population. Also mention when and under which circumstances the questionnaire was
administered. Describe each technique used for item analysis and indicate which criteria
were used to justify the inclusion or exclusion of items in the item selection process. It is
important for the user to know how reliable or consistent the questionnaire is. Give a brief
description of the method used to determine reliability and justify why this method was
used. The estimated reliability coefficient is then evaluated in terms of an acceptable
level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents can complete it on their own or supervision is needed, the material
needed, and how to deal with a person asking for an explanation. Provide instructions for
scoring. For right-or-wrong items, a correct answer could score 1 and an incorrect answer 0.
For rating scales, for example a five-point rating scale, 1 = do not agree at all and
5 = strong agreement with the statement. The total score indicates the person's attitude
towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire, that
is, how the questions should be tackled. The instructions can be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3 absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to or be interpreted
differently by people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions, by using filter
questions and by keeping a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and keeping sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to cause all three of the following problems
(1) it introduces a particular response style or bias
(2) it forces respondents to respond in an inefficient manner
(3) it makes respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
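The item-sequence rating amounts to counting how many of the three sequence problems are present. The following is a minimal sketch of that counting rule, assuming the evaluator has already judged each problem as present or absent. The function and parameter names are hypothetical and are not part of the rating scale itself.

```python
def rate_item_sequence(has_relevant_items, causes_bias, is_inefficient, causes_discomfort):
    """Return the 0-4 item-sequence rating from three yes/no judgements.

    All names are illustrative assumptions; the rule follows the rubric:
    0 = no (relevant) items, 1 = all three problems, ..., 4 = no problems.
    """
    if not has_relevant_items:
        return 0
    # True counts as 1 and False as 0, so sum() gives the number of problems.
    problems = sum([causes_bias, is_inefficient, causes_discomfort])
    return 4 - problems


# Hypothetical evaluation: the sequence only seems to make respondents uncomfortable.
rating = rate_item_sequence(True, False, False, True)
print(rating)
```

Writing the rule out this way makes it easy to see that each additional problem lowers the rating by one point.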
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire Questionnaire administrators need to know three things
(a) Whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
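Because the rating for the logical sequence depends on the rating already given for the purpose areas covered, it can help to see the dependency spelled out. This is a small sketch only; the function name and its two parameters are hypothetical and simply restate the five rating levels above.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Return the 0-4 logical-sequence rating.

    purpose_area_rating: the 0-3 rating given for the purpose areas covered.
    in_logical_sequence: True if the covered areas follow the order (a) (b) (c).
    Both names are illustrative assumptions.
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_area_rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


# Hypothetical manual: all three purpose areas covered, but out of order.
print(rate_logical_sequence(3, False))
```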
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but if the characteristics of the sample group differ from
target population
Rate 2 describes the sample used and if the characteristics of the sample group correspond
to the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale
50 multiple-choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but that
the remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high
3 Write questionnaire items
Outcome product
A set of items for measuring specific content areas
Method
Activity 3.1 Apply criteria for writing questionnaire items
Activity 3.2 Write items for a questionnaire
Resource reference
Writing questionnaire items
Introduction
All research is aimed at finding answers Questions may arise from anomalies or gaps that a
researcher has found in existing theories from a need to solve a practical problem or just
personal curiosity and intuition Good items are critical to the success of a research project
They produce reliable data and accurate information upon which valid conclusions can be
based
Writing questionnaire items
1 General principles guiding the construction of good items
1 The items should be based on a meaningful definition or description of what you want to
measure
2 Constructing items is both a science and an art. It is a science because it requires an
in-depth knowledge of one's topic and familiarity with the principles governing good item
design. It is an art because it requires creativity in selecting or constructing items
appropriate to the particular context
3 The items should be aimed at obtaining meaningful information with a minimum of
distortion
4 Careful thought must be given to the relevance, language level, cultural interpretations
and clarity of the items. It is important that the questionnaire is reader-friendly. Avoid
items that are humiliating or confusing or that make respondents feel inadequate
2 General guidelines for using and modifying existing items
It is recommended that researchers use well-known questionnaires of which the reliability
and validity have already been established. You must still critically scrutinise each item
3 Guidelines for constructing new items
There may be no existing questionnaire that taps the particular construct you intend to
investigate or you may have to eliminate a number of unsuitable items
3.1 Relevance of items
Keep in mind what you are aiming to find out
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic
Don't be tempted to ask questions that are interesting but not vital to your research
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below)
Do phrase your items in such a way that the language level matches that of your
respondents
If you are not sure whether items would be easily understood do present them to a small
group of respondents
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures
Do try to see the items from the respondents' perspective
Do be aware of possible cross-cultural differences
If your target population is different from your own cultural group then do pre-test your
items on a few members of that group
Do try to have your items correctly translated
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless
Do avoid ambiguity, that is, items that can be interpreted in a number of ways, for example
"Visiting lecturers can help one feel less isolated"
(Does this mean that the lecturers do the visiting - or do the students?)
Don't ask questions with two inherent issues, for example
"I am fully occupied and I don't feel lonely"
Rather break such questions or statements into two separate items
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue
Wherever possible don't use negatives
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to, for example
"It is believed by students that they will be given extension by lecturers"
The following is simpler
"Students believe lecturers will give them extension"
Do ask specific questions rather than general or vague questions General items may
not be interpreted in the same way by everyone and thus produce unreliable answers
Do write items that are specific simple clear and to the point
3.5 Fitting items to the choice of responses
When you construct 'closed' items be sure that the given responses are appropriate for each
item
3.6 Factual questions
It can be difficult to remember events in the distant past
Do limit the time frame to the immediate past (at the most the last six months)
When asking questions relating to factual information do make sure your respondents
have the information
3.7 Leading questions
Those that influence respondents to give a particular answer
Don't write items that encourage respondents to give a particular answer, for example
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary, for example
"Do you use any word processing packages such as ZZ?"
4 Problems relating to response bias or response style
Tendencies to choose a particular type of answer
4.1 Social desirability response bias
Tendency to choose what one believes to be the most socially acceptable response.
Deliberate faking may occur when respondents are fully aware of what is being measured and
for what purpose, when their identity is disclosed and when they are aware that their
responses will affect them in some way. Respondents may also lie to protect their real
feelings or justify their behaviour or because they do not want to admit their ignorance
4.2 Response styles
Tendency to make a particular type of response: some respondents tend to choose extreme
responses, while others repeatedly choose central responses. Design balanced questionnaires
that include items that are positively stated and negatively scored, e.g. "People often let
me down", as well as items that are positively stated and positively scored, e.g. "I trust
people"
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. Shortcomings of the following
questions
1 What is your income?
vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
vague: "Who is 'we'?" "Less passive than what?" and "What environment?"
4 I feel depressed and sad (never/sometimes/often/all the time)
two inherent issues; 'depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
too global
7 Most men are more emotionally stable than most women are (agree/disagree)
is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire out
on. The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES
Ethical checklist (answer Yes/No for each point and add notes where needed)
1 Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
2 Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
3 Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise - to make
improvements When you get the questionnaire back quickly scan it to see that they have
completed all of it
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes
Item analysis
Procedures to select the best items for inclusion. Commonly used criteria are item
difficulty (also called item facility or item variance) and item discrimination
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5
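As a worked illustration of the difficulty index, the sketch below divides the number of correct responses by the number of respondents for each item and checks the result against the recommended range. The data, the function names and the idea of coding correct answers as 1 and incorrect answers as 0 are assumptions made for the example.

```python
def difficulty_index(item_scores):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)


def within_recommended_range(index, low=0.25, high=0.75):
    """True if the difficulty index lies in the recommended range."""
    return low <= index <= high


# Hypothetical pilot data: one row per respondent, one column per item.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]

for item_number, column in enumerate(zip(*responses), start=1):
    index = difficulty_index(column)
    verdict = "keep" if within_recommended_range(index) else "too easy or too difficult"
    print(f"Item {item_number}: difficulty index {index:.2f} ({verdict})")
```

In this made-up sample everybody gets item 1 right, so it is too easy to tell respondents apart, while items 2 and 3 fall inside the recommended range.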
2 Item discrimination
The ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic - otherwise the scale loses focus. Discrimination is
usually expressed as the correlation between the item score and the total score. The higher
the correlation coefficient, the more discriminating the item. A minimum correlation of 0.2 is
generally required. Items with negative or zero correlations are almost always excluded. A
negative correlation could indicate that an item should have been reverse scored
3 How many items to exclude
It is usual to discard 20% to 30% of the items
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5

Statement                     1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always
I like loud music                                                                            √
I prefer quiet places         √
I enjoy noisy environments                                              √

The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places, and she should therefore get a high score

Statement                               1 Never   2 Almost never   3 Sometimes   4 Most of the time   5 Always   Item score
I like loud music                                                                                      √          5
I prefer quiet places (REVERSE SCORE)   √                                                                         5
I enjoy noisy environments                                                        √                               4

The data sheet is divided into rows and columns - one row for each person in your sample
and one column for each item in your rating scale, plus a column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 (12 x 1) and the highest is 60 (12 x 5)
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
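Reverse scoring and totalling can be sketched in a few lines. The example assumes a five-point scale and uses the three example statements above, with item 2 reverse-scored. The function names and the way reversed items are passed in are hypothetical choices for illustration.

```python
def reverse_score(response, lowest=1, highest=5):
    """Mirror a response on the scale, so that 1 becomes 5, 2 becomes 4, and so on."""
    return lowest + highest - response


def total_score(responses, reversed_items=()):
    """Sum the item scores for one respondent, reversing the listed item numbers first."""
    return sum(
        reverse_score(response) if item_number in reversed_items else response
        for item_number, response in enumerate(responses, start=1)
    )


# Ticked responses for the three example statements: Always, Never, Most of the time.
ticked = [5, 1, 4]
print(total_score(ticked, reversed_items={2}))  # item scores 5 + 5 + 4

# Range check for a 12-item scale without reversed items.
print(total_score([1] * 12), total_score([5] * 12))
```

The last line confirms the range stated above: twelve items scored 1 to 5 can only add up to a total between 12 and 60.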
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people
Compare items and decide which are better items in terms of the amount of variance they
show
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way - the item may be too extremely worded, it may be too vague, or there may be a
strong 'socially desirable' way of responding
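Running your eye down the columns can be backed up with a simple count. The sketch below flags items where a large share of respondents chose the same response and also reports the variance of each column. The data are made up, and the 80% cut-off is an assumption for illustration only, since the text gives no fixed threshold.

```python
from collections import Counter
from statistics import pvariance


def modal_share(scores):
    """Share of respondents who gave the most common response to an item."""
    most_common_count = Counter(scores).most_common(1)[0][1]
    return most_common_count / len(scores)


def low_variance_items(data_sheet, max_share=0.8):
    """Return the item numbers whose most common response reaches max_share (assumed cut-off)."""
    columns = zip(*data_sheet)
    return [
        item_number
        for item_number, column in enumerate(columns, start=1)
        if modal_share(column) >= max_share
    ]


# Hypothetical data sheet: one row per respondent, one column per item (scores 1 to 5).
data_sheet = [
    [5, 4, 3],
    [1, 4, 3],
    [3, 4, 2],
    [2, 4, 3],
    [4, 5, 3],
]

print("Items with little variance:", low_variance_items(data_sheet))
for item_number, column in enumerate(zip(*data_sheet), start=1):
    print(f"Item {item_number}: variance {pvariance(column):.2f}")
```

Item 1 shows a good spread of scores, while items 2 and 3 are answered almost identically by everyone and would be candidates for removal.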
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour eg anxiety intelligence stress independence etcetera Correlation
coefficient - the relation between the constructs
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction
of the relationship
2 The scatter plot
A scatter plot is the graphic display of the relationship described by the correlation
coefficient. If there is a perfect positive relation between two constructs (a correlation
coefficient of +1) the dots form a perfectly straight line with an upward slope. For a
correlation coefficient of -1 the dots form a perfectly straight line with a downward slope.
No relation (a correlation coefficient of 0) between two constructs results in a cloud of
dots with no recognisable shape
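The text does not say which correlation coefficient is meant. The sketch below assumes the Pearson product-moment coefficient, which is the usual choice for this kind of data, and shows that scores lying on a perfectly straight line give +1 or -1. The function name and the example scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one variable has no variance")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Made-up scores on two constructs.
upward = pearson_r([1, 2, 3], [2, 4, 6])    # dots on a straight line sloping upward
downward = pearson_r([1, 2, 3], [6, 4, 2])  # dots on a straight line sloping downward
print(upward, downward)
```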
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Item 3 (of items 1 to 12)   Total Score
1       4                           53
2       1                           12
3       5                           49
4       4                           40
5       2                           16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
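The item-total correlation for the five cases in the data sheet above can be worked out as
follows. The sketch reads the single item value per case as the score on item 3, which is an
assumption based on the surrounding text; the function name pearson_r is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


item_3 = [4, 1, 5, 4, 2]        # item scores for cases 1 to 5
totals = [53, 12, 49, 40, 16]   # total scores for cases 1 to 5

item_total_r = pearson_r(item_3, totals)
```

The result is roughly 0.94, which agrees with the impression that item 3 discriminates well
between high and low scorers.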
Figure: Scatter plot of the relation between the scores on item 3 and the scale total
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is the cause,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to
measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
Test-retest reliability shows how consistent the results of a questionnaire are over different
occasions. Administer the same questionnaire to the same group on two consecutive occasions.
The two sets of scores are correlated, and the correlation coefficient represents the degree
of test-retest reliability. The closer the correlation coefficient is to 1, the more
consistent the results. Test-retest reliability thus indicates stability or consistency of
scores over time.
A perfect correlation does not indicate that the second scores were identical to the first;
it indicates that each person's position relative to the others in the group stays the same.
The time interval should be at least several days, to reduce the possibility of effects such
as familiarity with the type of items or respondents remembering their answers.
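The point that a perfect correlation does not require identical scores can be shown with a
small example. In the sketch below every person scores exactly 3 points higher on the second
occasion; the scores and the function name pearson_r are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


first_occasion = [20, 25, 31, 36, 40]
# Hypothetical retest: everybody gains 3 points, so the rank order is unchanged.
second_occasion = [score + 3 for score in first_occasion]

retest_r = pearson_r(first_occasion, second_occasion)
```

No score is the same on both occasions, yet the test-retest correlation is 1 because
everybody keeps the same position relative to the rest of the group.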
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable than a
longer one. The reliability of the whole questionnaire is called the split-half reliability,
and it measures the degree of equivalence between the two halves, that is, the extent to
which they measure the same attribute. It reflects consistency and indicates the degree of
relatedness of the items. This type of reliability is therefore also regarded as a measure
of internal consistency, and the closer the split-half reliability is to 1, the higher the
internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability
coefficient should be above 0.90. A reliability coefficient of 0.70 can be useful if the
results are used in combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient one obtains depends to some extent on the items included in
each of the two halves.
Action: Re-number your eight items from 1 to 8. Then divide the rating scale into two halves
by grouping the odd items together and the even items together. For each person, calculate
the total score for the odd items and for the even items.
Cases   Odd items 1 3 5 7   Total Score   Even items 2 4 6 8   Total Score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
resembling the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values
closer to 1 indicate a more reliable rating scale.
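The odd-even procedure can also be carried out as a calculation instead of a drawing. The
sketch below is an illustration under assumptions: the responses to the 8 items are invented,
and the names pearson_r and split_half are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half(rows):
    """Return odd-item totals, even-item totals and their correlation."""
    odd_totals = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, 7
    even_totals = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, 8
    return odd_totals, even_totals, pearson_r(odd_totals, even_totals)


# Hypothetical responses of five people to the 8 re-numbered items.
responses = [
    [5, 4, 5, 5, 4, 5, 5, 4],
    [2, 1, 2, 2, 1, 1, 2, 2],
    [4, 4, 3, 4, 4, 3, 4, 4],
    [1, 2, 1, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4, 3, 3],
]

odd_totals, even_totals, half_r = split_half(responses)
```

For this invented sample the two halves correlate very strongly (above 0.9), which would
point to a high degree of equivalence between the halves.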
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, and
the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and
the measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - the extent to which conclusions can be made about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the
questionnaire. Give instructions for the administration and the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the
characteristics of the target population that are relevant to the aim of the questionnaire.
It is important to state for which country and for which age group the questionnaire has
been developed, where these are relevant to the subject matter of the questionnaire. A brief
description of the design of the questionnaire should be provided. The type of items should
also be indicated, e.g. multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those
characteristics that define the target population. It should also be mentioned when and under
which circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know
how reliable or consistent the questionnaire is. Give a brief description of the method used
to determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct
validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents can complete it on their own or supervision is needed, the material
needed, and how to deal with a person asking for an explanation.
Provide instructions for scoring. For items with a correct answer, a response could score 1
(correct) or 0 (incorrect). For rating scales, e.g. a five-point rating scale, 1 = do not
agree at all and 5 = strong agreement with the statement. The total score indicates the
person's attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
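Reverse scoring is mentioned above without an example. A common use, assumed here and not
stated in the notes, is for items worded in the opposite direction, where 1 on a five-point
scale should count as 5 and vice versa. The function names and the sample answers in the
sketch below are hypothetical.

```python
def reverse_score(score, low=1, high=5):
    """Mirror a score on the rating scale: 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - score


def scale_total(scores, reversed_items=()):
    """Add up item scores, reversing the items whose (1-based) numbers are listed."""
    total = 0
    for number, score in enumerate(scores, start=1):
        total += reverse_score(score) if number in reversed_items else score
    return total


answers = [5, 1, 4, 2]  # hypothetical answers to items 1 to 4
total = scale_total(answers, reversed_items={2, 4})  # items 2 and 4 are reversed
```

The manual's scoring instructions would then have to list which items are reverse-scored.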
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire, that
is, how the questionnaire questions should be tackled. The instructions are provided in a
cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide
name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to
provide name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding, sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focuses on the topic
Rate 2: focuses on the topic, but the topic is not covered in full
Rate 3: focuses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
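The rating for item sequence (3.2) follows a simple rule: start at 4 and subtract one point
for each of the three problems, with 0 reserved for a questionnaire that has no relevant
items. A minimal sketch of that rule, with hypothetical function and parameter names:

```python
def item_sequence_rating(has_relevant_items, causes_bias,
                         forces_inefficiency, causes_discomfort):
    """Return the 3.2 rating (0 to 4) for the item sequence."""
    if not has_relevant_items:
        return 0  # no items, or all items irrelevant
    problems = sum([causes_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems


# Hypothetical evaluation: the sequence only makes respondents uncomfortable.
rating = item_sequence_rating(True, False, False, True)
```

A sequence with all three problems rates 1, and a sequence with none rates 4.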
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the
structure and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two functions
Rate 2: the structure limits its functionality with regard to one function
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of
items for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
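Because the rating for the logical sequence depends on the rating already given for the
purpose areas, it can help to write the rule out as a small decision function. The sketch
below only restates the five rating levels listed above; the function and parameter names
are hypothetical.

```python
def logical_sequence_rating(purpose_area_rating, in_logical_sequence):
    """Return the logical-sequence rating (0 to 4) from the purpose-area rating (0 to 3)."""
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose-area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    base = 1 if purpose_area_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)


# Hypothetical manual: covers all three purpose areas, but not in the order (a), (b), (c).
rating = logical_sequence_rating(3, in_logical_sequence=False)
```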
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: one kept
Rate 3: two kept
Rate 4: all kept
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for
both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
The MPQ consists of 50 multiple-choice items of the following form:
If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 items were really good
but that the remaining 82 items did not meet the criteria to be included. A further thirteen
items were excluded; thus 50 items were retained. The validity coefficient of 0.91 is high.
There may be no existing questionnaire that taps the particular construct you intend to
investigate or you may have to eliminate a number of unsuitable items
3.1 Relevance of items
Keep in mind what you are aiming to find out.
Do's and Don'ts
Do read each item and ask yourself if the item relates to your topic.
Don't be tempted to ask questions that are interesting but not vital to your research.
3.2 Language level
The respondents may not be as knowledgeable or have as large a vocabulary as you.
Don't use academic or technical terminology, jargon, words that are seldom used in
everyday speech, very long sentences or complicated syntax (see example below).
Do phrase your items in such a way that the language level matches that of your
respondents.
If you are not sure whether items would be easily understood, do present them to a small
group of respondents.
3.3 Cultural context
The same item may mean different things to groups with different socio-economic and
cultural backgrounds. Be sure that the questionnaire does not contain phrases that have
different connotations in different cultures.
Do try to see the items from the respondents' perspective.
Do be aware of possible cross-cultural differences.
If your target population is different from your own cultural group, then do pre-test your
items on a few members of that group.
Do try to have your items correctly translated.
3.4 Clarity
If anything in your questionnaire is not understood and/or is misinterpreted, your
results will be useless.
Do avoid ambiguity, that is, items that can be interpreted in a number of ways. For example:
Visiting lecturers can help one feel less isolated.
(Does this mean that the lecturers do the visiting, or do the students?)
Don't ask questions with two inherent issues. For example:
I am fully occupied and I don't feel lonely.
Rather break such questions or statements into two separate items.
Do scrutinise any items that contain the conjunctions 'and' or 'or' to see if they contain
more than one possible issue.
Wherever possible, don't use negatives.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to. For example:
It is believed by students that they will be given extension by lecturers.
The following is simpler:
Students believe lecturers will give them extension.
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most, the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer. For example:
Discrimination in South Africa is horrific, isn't it?
Don't give examples unless it is really necessary. For example:
Do you use any word processing packages, such as ZZ?
4 Problems relating to response bias or response style
Response bias and response style are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake their answers when they are fully aware of what is being measured and for what
purpose, when their identity is disclosed, and when they are aware that their responses will
affect them in some way. Respondents may also lie to protect their real feelings, to justify
their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
A response style is the tendency to make a particular type of response. Some respondents tend to choose extreme responses, while others
repeatedly choose central responses. Design balanced questionnaires that mix both kinds of item:
Positively stated and negatively scored, eg "People often let me down"
Positively stated and positively scored, eg "I trust people"
Activity 3.1 Apply criteria for writing questionnaire items
Action Evaluate existing items according to these criteria. The shortcomings of each of the following
questions are given directly below the question.
1 What is your income?
vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment.
(agree/uncertain/disagree)
vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad. (never/sometimes/often/all the time)
two inherent issues; 'depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
imprecise and may be interpreted in various ways
6 Abortion should not be legalised. (agree/disagree)
too global
7 Most men are more emotionally stable than most women are. (agree/disagree)
is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the 12-item
rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action Identify a suitable sample, that is, get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES.
Ethical checklist (record Yes/No and Notes for each point)
Respondents understand why they are being asked to complete the questionnaire
  It is to help you with your studies
  It is to help you improve the questionnaire
  They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
  You won't hold it against them if they decide not to do it
  You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
  Their names won't be on the questionnaire
  You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can make
improvements. When you get the questionnaire back, quickly scan it to see that they have
completed all of it.
Action There are two ways of using a pilot study to improve your questionnaire. The first is to use what
happened during the study. Look again at the notes you made while you were administering
the questionnaire. Now write a short summary of the changes you need to make.
Item analysis
Item analysis refers to procedures used to select the best items for inclusion. Commonly used criteria are item
difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
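The difficulty index described above can be sketched in a few lines of Python. This is only an illustration: the function names and the pilot data are hypothetical, and responses are assumed to be coded 1 (correct) or 0 (incorrect).

```python
def difficulty_index(responses):
    """Proportion of respondents who gave a correct response to one item.

    `responses` holds one value per person: 1 = correct, 0 = incorrect.
    """
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


def within_difficulty_range(index, low=0.25, high=0.75):
    """True if the index falls inside the 0.25 to 0.75 range given in the text."""
    return low <= index <= high


# Hypothetical pilot data: ten respondents per item.
items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # 9 of 10 correct: too easy
    "item_2": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # 5 of 10 correct
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # 1 of 10 correct: too difficult
}

indices = {name: difficulty_index(r) for name, r in items.items()}
retained = [name for name, p in indices.items() if within_difficulty_range(p)]
print(indices)
print(retained)
```

With this made-up data only item_2 falls inside the range, so items 1 and 3 would be candidates for discarding.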
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic, otherwise the instrument loses focus. Discrimination is expressed as the
correlation between the item score and the total score. The higher the correlation
coefficient, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored.
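As a rough illustration, the selection rule for item discrimination can be written as a small decision function. The function name, the wording of the returned labels and the correlation values are all hypothetical; only the 0.2 minimum and the treatment of negative and zero correlations come from the text.

```python
def classify_item(item_total_r, minimum=0.2):
    """Apply the discrimination rule to one item-total correlation."""
    if not -1.0 <= item_total_r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if item_total_r >= minimum:
        return "keep"
    if item_total_r < 0:
        # A negative correlation may mean the item should have been reverse scored.
        return "exclude, but check whether the item should be reverse scored"
    return "exclude"


# Hypothetical item-total correlations for four items.
correlations = {1: 0.55, 2: 0.05, 3: -0.40, 4: 0.20}
decisions = {item: classify_item(r) for item, r in correlations.items()}
for item, decision in decisions.items():
    print(item, decision)
```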
3 How many items to exclude
It is usual to discard 20% to 30% of the items
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                    1   2   3   4   5
I like loud music                            √
I prefer quiet places        √
I enjoy noisy environments               √
The ticked options are known as item responses. Item 2 in our example should be 'reverse-scored'
because if somebody says she never likes quiet places, she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                               1   2   3   4   5   Score
I like loud music                                       √   5
I prefer quiet places (REVERSE SCORE)   √                   5
I enjoy noisy environments                          √       4
The data sheet is divided into rows and columns: one row for each person in your sample,
and one column for each item in your rating scale plus one for the total score. Take a questionnaire
and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item rating scale is 12 and
the highest is 60.
Cases  Items                                   Total Score
       1   2   3   4   5   6   7   8   9   10  11  12
1      5   5   4
2      3   3
3      3   2
4      3   4
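The scoring step can be sketched as follows, assuming a five-point scale numbered 1 to 5 and the three example statements used earlier (ticks on Always, Never and Most of the time, with item 2 reverse-scored). The function names are illustrative only.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # five-point rating scale
N_ITEMS = 12                 # items in the rating scale described in the text


def reverse_score(response):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


def score_items(responses, reversed_items):
    """Turn ticked options into item scores.

    `responses` lists the ticked option for items 1, 2, 3, ... in order;
    `reversed_items` is the set of item numbers that are reverse-scored.
    """
    scores = []
    for number, response in enumerate(responses, start=1):
        if not SCALE_MIN <= response <= SCALE_MAX:
            raise ValueError(f"item {number}: response {response} is off the scale")
        scores.append(reverse_score(response) if number in reversed_items else response)
    return scores


# Ticked options for the three example statements: Always, Never, Most of the time.
ticked = [5, 1, 4]
scores = score_items(ticked, reversed_items={2})
partial_total = sum(scores)

# Range of possible totals on the full 12-item scale.
lowest_total = N_ITEMS * SCALE_MIN
highest_total = N_ITEMS * SCALE_MAX
print(scores, partial_total, lowest_total, highest_total)
```

The three scores match the first row of the data sheet (5, 5, 4), and the range of totals matches the 12 to 60 stated above.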
Action Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
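The same column-by-column inspection can be done numerically. The sketch below uses a small made-up data sheet (five people, three items) and flags an item when one response accounts for at least 80% of the column. That 80% cut-off is an assumption chosen for illustration, not a figure from the text.

```python
from collections import Counter
from statistics import pvariance

# Hypothetical data sheet: one row per person, one column per item.
data_sheet = [
    [5, 3, 4],
    [5, 1, 2],
    [5, 4, 5],
    [5, 2, 1],
    [4, 5, 3],
]


def column_summary(column):
    """Variance of an item column and the share taken by its most common score."""
    most_common_count = Counter(column).most_common(1)[0][1]
    return {
        "variance": pvariance(column),
        "modal_share": most_common_count / len(column),
    }


columns = list(zip(*data_sheet))  # turn rows (people) into columns (items)
summaries = [column_summary(col) for col in columns]

# Assumed rule of thumb: flag an item if 80% or more gave the same response.
low_variance_items = [
    number for number, s in enumerate(summaries, start=1) if s["modal_share"] >= 0.8
]
print(summaries)
print(low_variance_items)
```

Here item 1 is flagged because four of the five people scored 5, while items 2 and 3 show a good spread.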
It is not always possible to explain why most people end up answering an item in the same
way. The item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, eg anxiety, intelligence, stress, independence, etcetera. A correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. A correlation coefficient lies between -1 and +1. A value close to 0
indicates a weak relationship, while 0 represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect positive relation
between two constructs (a correlation coefficient of +1), the dots form a perfectly straight
line with an upward slope. For a correlation coefficient of -1, the dots form a perfectly
straight line with a downward slope. No relation (a correlation coefficient of 0) between two
constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases  Items                                   Total Score
       1   2   3   4   5   6   7   8   9   10  11  12
1              4                                53
2              1                                12
3              5                                49
4              4                                40
5              2                                16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Figure: scatterplot of the relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score, so the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
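For readers who prefer to calculate rather than draw, the item-total correlation can be computed with the standard Pearson formula. The sketch below uses the five item scores and total scores from the data sheet above, on the assumption that the single filled-in column is item 3. The function name is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return covariation / sqrt(ss_x * ss_y)


# Scores of the five cases on item 3 and on the whole scale.
item_3 = [4, 1, 5, 4, 2]
total = [53, 12, 49, 40, 16]

r = pearson_r(item_3, total)
print(round(r, 2))  # about 0.94
```

A value this close to +1 agrees with the visual impression that item 3 discriminates well between high and low scorers.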
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the questionnaire should
measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, eg the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
Test-retest reliability is how consistent the results of a questionnaire are over different occasions. Administer the
same questionnaire to the same group on two consecutive occasions. The two sets of scores are correlated,
and the correlation coefficient represents the degree of test-retest reliability. The closer the
correlation coefficient is to 1, the more consistent the results. Test-retest reliability thus indicates
stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first, only that each person's
relative position to that of the others in the group stays the same. The time interval should
be at least several days to reduce the possibility of effects such as familiarity with the type
of items or respondents remembering their answers.
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half, giving two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the correlation between the two halves
understates the reliability of the whole questionnaire. The estimated
reliability of the whole questionnaire is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
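The text does not say how the reliability of the whole questionnaire is estimated from the correlation between the two halves. A standard way of doing this is the Spearman-Brown formula, and the sketch below assumes that this is the intended step. The function name and the example value are illustrative.

```python
def spearman_brown(half_r):
    """Estimate whole-questionnaire reliability from the half-half correlation.

    Uses the Spearman-Brown formula for doubling the length of a test:
    r_whole = 2 * r_half / (1 + r_half).
    """
    if not -1.0 < half_r <= 1.0:
        raise ValueError("the half-half correlation must be above -1 and at most +1")
    return 2 * half_r / (1 + half_r)


# Hypothetical correlation between odd-item and even-item totals.
half_correlation = 0.6
whole_reliability = spearman_brown(half_correlation)
print(whole_reliability)
```

With a half-half correlation of 0.6 the estimate for the whole questionnaire is higher, which reflects the point that a longer questionnaire is generally more reliable than a shorter one.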
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person, calculate the total score for the odd items and for the even items.
Cases  Odd items                      Even items
       1   3   5   7   Total Score    2   4   6   8   Total Score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale
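Filling in the data sheet for the two halves amounts to adding up alternate item scores. A minimal sketch, using made-up scores for three people on the eight re-numbered items (the function name and the data are hypothetical):

```python
def split_half_totals(item_scores):
    """Return (odd-item total, even-item total) for one person.

    `item_scores` lists the scores on items 1, 2, 3, ... in order, so
    positions 0, 2, 4, ... hold the odd-numbered items.
    """
    if len(item_scores) % 2 != 0:
        raise ValueError("expected an even number of items")
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, 7
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, 8
    return odd_total, even_total


# Hypothetical scores on the eight-item scale, one list per case.
data_sheet = {
    1: [5, 4, 5, 4, 3, 4, 5, 5],
    2: [2, 1, 2, 3, 1, 2, 2, 1],
    3: [3, 3, 4, 3, 3, 2, 3, 4],
}

points = {case: split_half_totals(scores) for case, scores in data_sheet.items()}
for case, (odd_total, even_total) in points.items():
    print(case, odd_total, even_total)
```

Each pair of totals is one dot on the scatter plot of odd-item totals against even-item totals.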
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure, that is, the extent to which the scores
can be used for the intended purpose. There are three categories of validity
evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some current behaviour or
status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At the same time
you would ask psychiatrists or clinical psychologists to classify these patients according to
type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire to predict some
future performance of individuals. For example, to select candidates for entrance into this
course, take a representative group of students applying for the course and administer your
questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the theoretical construct
it aims to measure. Construct validity cannot be expressed in terms of a single validity
coefficient. You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation
Activity 5.2 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a
questionnaire estimates an individual's position or performance on some outcome measure
Construct validity - how far one can draw conclusions about a theoretical construct that underlies
the behaviours measured by the questionnaire
Action Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
  Aim
  Target population
  Design of the questionnaire
Properties of the questionnaire
  Item analysis and item selection
  Reliability
  Validity
Procedures for administration, scoring and interpretation
  Instructions for administration
  Instructions for scoring
  Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It may be important to
state for which country and for which age group the questionnaire has been developed, depending on the subject
matter of the questionnaire. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated, for example multiple-choice items or rating
scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
you should indicate to what extent they are representative in terms of those characteristics that
define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis, and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed, and how to deal
with a person asking for an explanation. Provide instructions for scoring. A correct answer
could score a 1 and an incorrect answer a 0. For rating scales, on a five-point rating scale 1 = do not agree at all
and 5 = strong agreement with the statement. The total score indicates the person's
attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address
the purpose of the questionnaire, the confidentiality of the information provided in the
questionnaire, and how to complete the questionnaire (that is, how the
questions should be tackled). The instructions may be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3 absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to or be interpreted
differently by some groups
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (ie the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: no items, or all items are irrelevant
Rate 1: the sequence seems to
(1) introduce a particular response style or bias, and
(2) force respondents to respond in an inefficient manner, and
(3) make respondents feel emotionally uncomfortable
Rate 2: the sequence seems to cause any two of these problems
Rate 3: the sequence seems to cause any one of these problems
Rate 4: the sequence does not seem to cause any of these problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he or she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose, and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work, and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire), and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
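The rule for rating the logical sequence depends on the earlier rating for the purpose areas covered, which makes it easy to misapply. As a rough sketch only, the rule can be written as a small Python function. The function name and parameter names are hypothetical and chosen for illustration.

```python
def rate_logical_sequence(purpose_area_rating: int, in_logical_sequence: bool) -> int:
    """Return the rating for the logical sequence of the presentation.

    purpose_area_rating: the rating already given for the purpose areas
        covered in the manual (0 to 3).
    in_logical_sequence: True if the covered purpose areas appear in the
        order (a), (b), (c).
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    # All three purpose areas are covered.
    return 4 if in_logical_sequence else 3


print(rate_logical_sequence(3, False))  # prints 3
```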
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language, and
(b) ambiguous statements, and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four
Rate 3: instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good, but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Do use active rather than passive statements. Passive statements are more difficult
to understand and therefore more difficult to respond to. For example:
"It is believed by students that they will be given extension by lecturers."
The following is simpler:
"Students believe lecturers will give them extension."
Do ask specific questions rather than general or vague questions. General items may
not be interpreted in the same way by everyone and thus produce unreliable answers.
Do write items that are specific, simple, clear and to the point.
3.5 Fitting items to the choice of responses
When you construct 'closed' items, be sure that the given responses are appropriate for each
item.
3.6 Factual questions
It can be difficult to remember events in the distant past.
Do limit the time frame to the immediate past (at the most the last six months).
When asking questions relating to factual information, do make sure your respondents
have the information.
3.7 Leading questions
Leading questions are those that influence respondents to give a particular answer.
Don't write items that encourage respondents to give a particular answer, for example:
"Discrimination in South Africa is horrific, isn't it?"
Don't give examples unless it is really necessary, for example:
"Do you use any word processing packages, such as ZZ?"
4 Problems relating to response bias or response style
Response bias and response style are tendencies to choose a particular type of answer.
4.1 Social desirability response bias
This is the tendency to choose what one believes to be the most socially acceptable response.
Respondents may deliberately fake their answers when they are fully aware of what is being
measured and for what purpose, when their identity is disclosed, and when they are aware
that their responses will affect them in some way. Respondents may also lie to protect their
real feelings, to justify their behaviour, or because they do not want to admit their ignorance.
4.2 Response styles
A response style is the tendency to make a particular type of response: some respondents
tend to choose extreme responses, while others repeatedly choose central responses. Design
balanced questionnaires that include items that are positively stated and negatively scored,
e.g. "People often let me down", as well as items that are positively stated and positively
scored, e.g. "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action: Evaluate existing items according to these criteria. The shortcomings of the following
questions are given below each question.
1 What is your income?
Shortcoming: vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
Shortcoming: leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
Shortcoming: vague ("Who is 'we'?", "Less passive than what?" and "What environment?")
4 I feel depressed and sad (never/sometimes/often/all the time)
Shortcoming: two inherent issues; 'depression' and 'often' may mean different things to
different people
5 How often do you take drugs? (never/sometimes/often/all the time)
Shortcoming: imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
Shortcoming: too global
7 Most men are more emotionally stable than most women are (agree/disagree)
Shortcoming: is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
Shortcoming: does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular, you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire.
Activity 4.1 Administer and revise the questionnaire
Action: Identify a suitable sample, that is, get hold of people to try the questionnaire out
on. The sample should represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action: Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions in the checklist should be YES.
Ethical checklist (record Yes/No and any notes for each point)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis comprises procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75, and the average difficulty should be about 0.5.
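As a minimal sketch of the difficulty index described above, the following Python snippet divides the number of correct responses by the number of respondents for each item and keeps the items that fall in the recommended range. The response data and the variable names are made up for illustration.

```python
def difficulty_index(item_responses):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(item_responses) / len(item_responses)


# Hypothetical pilot data: one row per respondent, one column per item.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]

# Transpose the rows so that each tuple holds all responses to one item.
items = list(zip(*responses))
indices = [difficulty_index(item) for item in items]

# Keep items whose difficulty index lies between 0.25 and 0.75 (item numbers start at 1).
kept_items = [number for number, p in enumerate(indices, start=1) if 0.25 <= p <= 0.75]

print(indices)     # [1.0, 0.5, 0.25, 0.75]
print(kept_items)  # [2, 3, 4]
```

In this made-up sample, item 1 would be discarded as too easy because everybody answered it correctly.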
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected
if they measure the same characteristic, else the instrument loses focus. The higher the
correlation coefficient between the item and the total score, the more discriminating the
item. A minimum correlation of 0.2 is generally required. Items with negative or zero
correlations are almost always excluded. A negative correlation could indicate that an item
should have been reverse scored.
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics helps test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement (one option is ticked √ for each statement)
I like loud music √
I prefer quiet places √
I enjoy noisy environments √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored', because if somebody says she never likes quiet places, she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                                  Item score
I like loud music √                        5
I prefer quiet places (REVERSE SCORE) √    5
I enjoy noisy environments √               4
The data sheet is divided into rows and columns: one row for each person in your sample,
and one column for each item in your rating scale plus one for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the rating
scale is 12 and the highest is 60.
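The scoring steps above (reverse-score where needed, then add up the item scores) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the ticked options in the example (5, 1 and 4) are assumed so that they match the item scores 5, 5 and 4 in the worked example.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # response options run from 1 (Never) to 5 (Always)


def reverse_score(response):
    """Flip a response on the 1-5 scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


def item_scores(responses, reversed_items):
    """Turn ticked options into item scores.

    responses: ticked option (1-5) for each item, in item order.
    reversed_items: set of item numbers (starting at 1) that are reverse-scored.
    """
    return [
        reverse_score(response) if number in reversed_items else response
        for number, response in enumerate(responses, start=1)
    ]


# Assumed ticks: "Always" for loud music, "Never" for quiet places,
# "Most of the time" for noisy environments. Item 2 is reverse-scored.
example = item_scores([5, 1, 4], reversed_items={2})
print(example)  # [5, 5, 4]

# Range of the total score on a 12-item scale.
lowest_total = sum(item_scores([1] * 12, reversed_items=set()))
highest_total = sum(item_scores([5] * 12, reversed_items=set()))
print(lowest_total, highest_total)  # 12 60
```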
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way. The item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
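Instead of judging the spread in each column by eye, one could compute the variance per item. The sketch below uses Python's standard `statistics.pvariance` on a small made-up data sheet and lists the items from least to most variance. The data are hypothetical and no cut-off value is implied.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item (scores 1-5).
data_sheet = [
    [4, 1, 3],
    [4, 5, 3],
    [4, 3, 2],
    [5, 2, 4],
    [4, 4, 3],
]

# Transpose so that each tuple holds all scores for one item.
columns = list(zip(*data_sheet))
variances = [pvariance(column) for column in columns]

# Item numbers (starting at 1) ordered from least to most variance.
least_variance_first = sorted(
    range(1, len(variances) + 1), key=lambda number: variances[number - 1]
)

for number in least_variance_first:
    print(f"item {number}: variance {variances[number - 1]:.2f}")
```

Here item 1 shows the least variance because four of the five respondents gave the same score, so it would be the first candidate for removal.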
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. A correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. A value close to 0
indicates a weak relationship, while 0 represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers,
because it shows that the item measures more or less the same thing as the other items in
the scale.
Data sheet (only the scores for item 3 and the total scores are shown)
Cases   Item 3   Total score
1       4        53
2       1        12
3       5        49
4       4        40
5       2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Looking at whether item scores correspond with total scores is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Figure: scatter plot of the relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
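For readers who prefer a number to a scatter plot, the item-total correlation can be computed directly. The sketch below is an illustration rather than the method prescribed in this guide: it calculates a Pearson correlation coefficient in plain Python, assuming that the two columns of numbers in the data sheet above are the item 3 scores and the total scores of the five cases.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(spread_x * spread_y)


# Assumed reading of the data sheet: item 3 scores and total scores for cases 1 to 5.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson(item_3, totals)
print(round(r, 2))  # 0.94
```

A value this close to +1 agrees with the visual impression that item 3 discriminates well between high and low scorers, and it lies well above the minimum of 0.2 mentioned under item discrimination.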
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire: reliability coefficient close to 0
Reliable questionnaire: reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first; it
indicates that a person's position relative to that of the others in the group stays the same.
The time interval should be at least several days to reduce the possibility of effects such as
familiarity with the type of items or respondents remembering their answers.
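The point that a perfect test-retest correlation does not require identical scores can be checked with a small made-up example. In the sketch below every respondent scores exactly three points higher on the second occasion, so the relative positions stay the same. The scores are hypothetical and the `pearson` helper is a plain-Python illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(spread_x * spread_y)


# Hypothetical total scores of five respondents on two occasions.
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]  # everybody gains 3 points

test_retest = pearson(first_occasion, second_occasion)
print(first_occasion == second_occasion)  # False: the scores differ
print(round(test_retest, 6))              # 1.0: relative positions are unchanged
```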
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable. The
reliability of the whole questionnaire is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency: the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items
included in each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person, calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Cases   Odd items (1 3 5 7)   Total score   Even items (2 4 6 8)   Total score
1
2
3
4
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet you make a dot on the graph. Draw a straight line
resembling the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
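The odd-even procedure can also be carried out numerically. The sketch below uses a made-up data sheet for an 8-item scale, adds up the odd and even items for each person and correlates the two sets of totals. The last step, the Spearman-Brown formula, is not named in this guide; it is included only as an assumption about how one might estimate the reliability of the whole scale from the correlation between its halves.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(spread_x * spread_y)


def spearman_brown(half_correlation):
    """Standard Spearman-Brown step-up for a scale twice as long as each half (assumption)."""
    return 2 * half_correlation / (1 + half_correlation)


# Hypothetical data sheet: one row per person, columns are items 1 to 8 (scores 1-5).
data_sheet = [
    [5, 4, 5, 5, 4, 5, 5, 4],
    [2, 1, 2, 2, 1, 1, 2, 2],
    [4, 4, 3, 4, 4, 3, 4, 4],
    [3, 2, 3, 3, 2, 3, 3, 3],
    [1, 2, 1, 1, 2, 1, 1, 2],
]

odd_totals = [sum(row[0::2]) for row in data_sheet]   # items 1, 3, 5, 7
even_totals = [sum(row[1::2]) for row in data_sheet]  # items 2, 4, 6, 8

half_correlation = pearson(odd_totals, even_totals)
whole_scale_estimate = spearman_brown(half_correlation)

print(odd_totals)   # [19, 7, 15, 11, 5]
print(even_totals)  # [18, 6, 15, 11, 6]
print(round(half_correlation, 2), round(whole_scale_estimate, 2))
```

Because the result depends on which items end up in each half, a different split of the same eight items could give a somewhat different coefficient.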
Validity
The extent to which it measures what it claims to measure The extent to which the scores
can be used for the intended purpose There are three categories of gathering validity
evidence content validity criterion-related validity and construct validity
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks behaviours or attitudes (the content domain)
that it was designed to measure Content validity can be ensured by proper design
Content validity cannot be expressed in terms of a quantitative index
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time, you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity calculate the correlation between the results and the
measures on the criterion the resulting correlation coefficient is known as the validity
coefficient
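As a rough illustration of the validity coefficient, the sketch below correlates questionnaire scores with examination marks using the raw-score form of the Pearson formula, which is convenient for hand calculation. The scores and marks are invented, and the function name `validity_coefficient` is a hypothetical label, not something from the study material.

```python
from math import sqrt

def validity_coefficient(scores, criterion):
    """Pearson correlation via the raw-score formula:
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(scores)
    sum_x, sum_y = sum(scores), sum(criterion)
    sum_xy = sum(x * y for x, y in zip(scores, criterion))
    sum_xx = sum(x * x for x in scores)
    sum_yy = sum(y * y for y in criterion)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator

# Invented data: questionnaire scores at application, exam marks at course end.
questionnaire_scores = [32, 25, 38, 20, 29]
exam_marks = [68, 55, 75, 48, 60]

r = validity_coefficient(questionnaire_scores, exam_marks)
print(round(r, 2))
```

The same calculation applies to concurrent validity; only the timing of the criterion measure differs.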
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour For example anxiety is not observable but it forms part of a theory
that explains observable behaviours
You have to define your construct in terms of observable behaviours You can thus define
the construct validity as the extent to which it indeed measures the theoretical construct
it aims to measure Construct validity cannot be expressed in terms of a single validity
coefficient You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires
Convergent validity - if two questionnaires measure the same construct you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated you would not expect a
high correlation
Activity 5.2 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity
The content validity of your questionnaire is influenced by how well you designed the
questionnaire The content domain is the universe of tasks behaviours attitudes etcetera
implied by the purpose of your questionnaire The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire: how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity: how well one can draw conclusions about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire
Give instructions for the administration of the questionnaire for the scoring of the
questionnaire and some guidelines on how to interpret the results
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country and the age group for which the questionnaire has been developed if these
bear on its subject matter. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated - multiple-choice items or rating
scale items.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent it is representative in terms of those characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to the group. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed
and how to deal with a person asking for an explanation. Provide instructions for scoring. A
correct answer could score a 1 and an incorrect answer a 0. For rating scales, on a five-point
rating scale 1 = do not agree at all and 5 = strong agreement with the statement. The total
score indicates the person's attitude towards the topic under investigation. Reverse scoring
might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of the
information provided in the questionnaire, and how to complete the questionnaire (that is,
explain how the questions should be tackled). The instructions are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but respondents are
required to provide their names
Rate 3 absolutely confidential and identity will not be disclosed, and respondents are not
required to provide their names
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and keeping sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 no items or if all items are irrelevant
Rate 1 sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
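The item sequence rating above amounts to counting how many of the three problems the sequence seems to cause. The following sketch shows one way to express that rule; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def rate_item_sequence(has_relevant_items, causes_bias, is_inefficient, is_uncomfortable):
    """Return the item sequence rating (0 to 4) from the rating scale.

    has_relevant_items is False when there are no items or all items are
    irrelevant; the other three flags mark the three sequence problems.
    """
    if not has_relevant_items:
        return 0
    problems = sum([causes_bias, is_inefficient, is_uncomfortable])
    # Three problems give 1, two give 2, one gives 3, none gives 4.
    return 4 - problems

# Example: the sequence introduces a response bias but nothing else.
print(rate_item_sequence(True, True, False, False))
```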
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
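Because the logical sequence rating depends on the earlier rating for the purpose areas covered, it can help to see the rule as a small decision function. This is a sketch only; the function name and argument names are hypothetical.

```python
def rate_logical_sequence(purpose_areas_rating, in_logical_sequence):
    """Return the logical sequence rating (0 to 4).

    purpose_areas_rating is the rating (0 to 3) given for the purpose areas
    covered in the manual; in_logical_sequence is True when the covered
    areas appear in the order (a), (b), (c).
    """
    if purpose_areas_rating in (0, 1):
        return 0
    if purpose_areas_rating == 2:
        return 2 if in_logical_sequence else 1
    if purpose_areas_rating == 3:
        return 4 if in_logical_sequence else 3
    raise ValueError("purpose_areas_rating must be 0, 1, 2 or 3")

# Example: all three purpose areas are covered, but out of order.
print(rate_logical_sequence(3, False))
```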
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
2.4 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2 describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.7 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnairersquos validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments The MPQ is intended to be used with
the Marston Job Description System (MJDS) a method to describe any job in terms of
behaviour styles Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally) interaction (the ability to work with people) management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations) Each
factor is measured on a ten point scale
50 multiple choice items If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly Nine hundred university students used in the development of the
questionnaire
The original MPQ consisted of 145 items Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included Thirteen were excluded thus 50
items were retained The validity coefficient of 0.91 is high
negatively scored e.g. "People often let me down". Positively stated and positively scored
e.g. "I trust people".
Activity 3.1 Apply criteria for writing questionnaire items
Action Evaluate existing items according to these criteria. Shortcomings of the following
questions
1 What is your income?
vague
2 Don't you disagree with yesterday's Parliamentary decision regarding smoking and
drinking? (Yes/No)
leading question which contains two inherent issues and a double negative
3 We should be less passive about what is happening in the environment
(agree/uncertain/disagree)
vague: "Who is 'we'?", "Less passive than what?" and "What environment?"
4 I feel depressed and sad (never/sometimes/often/all the time)
two inherent items; 'depression' and 'often' may mean different things to different
people
5 How often do you take drugs? (never/sometimes/often/all the time)
imprecise and may be interpreted in various ways
6 Abortion should not be legalised (agree/disagree)
too global
7 Most men are more emotionally stable than most women are (agree/disagree)
is a leading question
8 Suppose you are measuring Unisa students' level of motivation and one of your items is
"How many hours do you spend studying each week?"
does not necessarily relate to motivation
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 4.1 Administer and revise the questionnaire
Activity 4.2 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item. In particular you will use simple item analysis techniques to improve the
12-item rating scale that forms part of your questionnaire
Activity 4.1 Administer and revise the questionnaire
Action Identify a suitable sample: get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES.
Ethical checklist (for each point record Yes/No and notes)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise - to make
improvements When you get the questionnaire back quickly scan it to see that they have
completed all of it
Action There are two ways of using a pilot study to improve your questionnaire Use what
happened during the study Look again at the notes you made while you were administering
the questionnaire Now write a short summary of the changes
Item analysis
Procedures to select the best items for inclusion commonly used criteria are item
difficulty (item facility or item variance) and item discrimination
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
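The difficulty index calculation can be illustrated as follows. This is a minimal sketch with invented right/wrong data, taking the acceptable range as 0.25 to 0.75; the function names are hypothetical.

```python
def difficulty_index(item_responses):
    """Proportion of people who answered the item correctly (1 = correct, 0 = wrong)."""
    return sum(item_responses) / len(item_responses)

def is_acceptable(index, low=0.25, high=0.75):
    """True if the item is neither too easy nor too difficult."""
    return low <= index <= high

# Invented responses of eight people to three items.
items = {
    "A": [1, 1, 0, 1, 0, 1, 0, 0],  # half the people correct
    "B": [1, 1, 1, 1, 1, 1, 1, 0],  # almost everybody correct: too easy
    "C": [0, 0, 0, 1, 0, 0, 0, 0],  # almost nobody correct: too difficult
}

indices = {name: difficulty_index(resp) for name, resp in items.items()}
kept = [name for name, idx in indices.items() if is_acceptable(idx)]
print(indices, kept)
```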
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected
if they measure the same characteristics - else they lose focus. Discrimination is expressed
as the correlation between the item score and the total score: the higher the correlation
coefficient, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored.
3 How many items to exclude
It is usual to discard 20% to 30% of the items
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action Compile a data sheet The possible responses to each item in the scale will have a
number from 1 to 5
Statement: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
I like loud music √
I prefer quiet places √
I enjoy noisy environments √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Statement: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always, followed by the item score
I like loud music √ 5
I prefer quiet places (REVERSE SCORE) √ 5
I enjoy noisy environments √ 4
The data sheet is divided into rows and columns - one row for each person in your sample
and one column for each item in your rating scale and total score Take a questionnaire
and transfer the score for each item to the first row on the data sheet
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 (12 x 1) and the highest is 60 (12 x 5).
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
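Reverse scoring and the total score can be sketched as below. The example reuses the three statements about noise shown earlier (item 2 reverse-scored on a five-point scale); the function names and the way reverse-scored items are passed in are assumptions made for this illustration.

```python
SCALE_MAX = 5  # five-point rating scale, responses run from 1 to 5

def reverse_score(response, scale_max=SCALE_MAX):
    """Flip a response so that 1 becomes 5, 2 becomes 4, and so on."""
    return scale_max + 1 - response

def total_score(responses, reverse_items=()):
    """Sum the item scores; reverse_items holds the 1-based numbers of
    items that must be reverse-scored before adding."""
    return sum(
        reverse_score(response) if number in reverse_items else response
        for number, response in enumerate(responses, start=1)
    )

# Ticked options: item 1 = Always (5), item 2 = Never (1), item 3 = Most of the time (4).
ticked = [5, 1, 4]
print(total_score(ticked, reverse_items={2}))  # item scores 5 + 5 + 4

# Range check for a 12-item scale without reverse-scored items.
lowest = total_score([1] * 12)
highest = total_score([5] * 12)
```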
Action Start with the actual item analysis of your rating scale Find items with too little
variance where almost everybody in the sample gets the same item score You want your
scale to show differences between people
Compare items and decide which are better items in terms of the amount of variance they
show
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
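Scanning the columns by eye can be mimicked in code by checking how large a share of the sample gave the most common response to each item. The sketch below uses invented data, and the 0.8 cut-off is an arbitrary assumption for illustration, not a rule from the study material.

```python
from collections import Counter

def modal_share(column):
    """Share of respondents who gave the most common response to an item."""
    most_common_count = Counter(column).most_common(1)[0][1]
    return most_common_count / len(column)

def low_variance_items(data_sheet, threshold=0.8):
    """Return 1-based item numbers whose column contains mainly one number.

    data_sheet has one row per person and one column per item.
    """
    n_items = len(data_sheet[0])
    flagged = []
    for item in range(n_items):
        column = [row[item] for row in data_sheet]
        if modal_share(column) >= threshold:
            flagged.append(item + 1)
    return flagged

# Invented data sheet: five people, three items.
data_sheet = [
    [5, 3, 4],
    [2, 3, 4],
    [4, 3, 1],
    [1, 3, 5],
    [3, 4, 2],
]
flagged = low_variance_items(data_sheet)
print(flagged)
```

Here item 2 is flagged because four of the five people gave the same response.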
It is not always possible to explain why most people end up answering an item in the same
way - the item may have been too extremely worded, it may be too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour eg anxiety intelligence stress independence etcetera Correlation
coefficient - the relation between the constructs
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction of
the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1) the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1 the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score we know that it measures more or less
the same thing as the other items
Action To measure differences between people our items need to show some variance but
even if the items show lots of variance the scale may not measure anything in particular
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 4 53
2 1 12
3 5 49
4 4 40
5 2 16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Looking at whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Scatter plot: Relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. Dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list for items that don't show much variance and the
other list for items that don't discriminate well.
Your 8 item scale should now be more coherent and have a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 51 Evaluate the reliability of the rating scale
Activity 52 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable that is the questionnaire should measure consistently You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire The interpretations based on the results should also be valid that is it should
measure what it claims to measure
Reliability
How consistently the questionnaire measures that which it is supposed to measure
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire eg the occasion on which
the questionnaire is administered or the sample of items in the questionnaire Their effect
on the results is unpredictable and inconsistent These irrelevant conditions are called
unsystematic sources of variation The reliability refers to the consistency of results
over different administrations involving different occasions test forms etc
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient to 1 the more reliable the test
2 Different types of reliability
21 Test-retest reliability
How consistent the results of a questionnaire are over different occasions. Administer the
same questionnaire to the same group on two consecutive occasions. The two sets of scores
are correlated and the correlation coefficient represents the degree of test-retest
reliability. The closer the correlation coefficient is to 1 the more consistent the results.
Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first.
It indicates that a person's relative position to that of the others in the group stays the
same. The time interval should be at least several days to reduce the possibility of effects
such as familiarity with the type of items or respondents remembering their answers.
22 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1 the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
23 Split-half reliability
A single questionnaire is administered only once This questionnaire is then divided into two
parts regarded as two parallel halves Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. Because a shorter questionnaire is generally less reliable,
the correlation between the two halves tends to underestimate the reliability of the
full-length questionnaire. The estimated reliability of the whole questionnaire is called the
split-half reliability and it measures the degree of equivalence between the two halves, that
is, the extent to which they measure the same attribute. It reflects the consistency and
indicates the degree of relatedness of the items. This type of reliability is therefore also
regarded as a measure of the internal consistency, and the closer the split-half reliability
is to 1 the higher the internal consistency of the questionnaire.
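The text notes that a shorter questionnaire is generally less reliable. One widely used way to step the correlation between two halves up to an estimate for the full-length questionnaire is the Spearman-Brown formula. The source does not name this formula, so its use here is an assumption, and the half-test correlation of 0.70 is an invented value.

```python
def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two halves.

    Uses the Spearman-Brown formula for doubling test length:
    r_full = 2 * r_half / (1 + r_half)
    """
    return 2 * r_half / (1 + r_half)

# Hypothetical correlation between odd-item and even-item totals
r_half = 0.70
r_full = spearman_brown(r_half)
print(round(r_full, 2))  # 0.82
```

The stepped-up value is higher than the half-test correlation, which is consistent with the idea that the longer, full questionnaire is more reliable than either half.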
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 51 Evaluate the reliability of the rating scale
Action You should be able to distinguish between different types of reliability The purpose
of the questionnaire determines which type of reliability is appropriate
The internal consistency of a rating scale - the extent to which the items measure the same
thing Obtain an estimate of the split-half reliability of this rating scale the degree of
equivalence between two halves of the rating scale A limitation of this method is that the
reliability coefficient that one obtains to some extent depends on the items included in
each of the two halves
Action Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together Now re-number your items from 1 to 8 For each
person calculate the total score for the odd items and for the even items
Person  Total Score (items 1 3 5 7)  Total Score (items 2 4 6 8)
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale
For each person take the total score on the odd items and the total score on the even
items and where the two meet you make a dot on the graph Draw a straight line
resembling the shape of the scatterplot
If your scatter plot has a very undefined shape the correlation coefficient is close to 0
indicating a weak relation between the two halves If the line has an upward slope the
correlation coefficient falls between 0 and +1 If most dots are close to the line the
correlation coefficient is close to +1 and there is a fairly strong relation
Action In this context the correlation coefficient is a reliability coefficient Values closer to
1 indicate a more reliable rating scale
Validity
The extent to which the questionnaire measures what it claims to measure, that is, the
extent to which the scores can be used for the intended purpose. There are three categories
of validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks behaviours or attitudes (the content domain)
that it was designed to measure Content validity can be ensured by proper design
Content validity cannot be expressed in terms of a quantitative index
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity) they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity calculate the correlation between the results and the
measures on the criterion the resulting correlation coefficient is known as the validity
coefficient
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour For example anxiety is not observable but it forms part of a theory
that explains observable behaviours
You have to define your construct in terms of observable behaviours You can thus define
the construct validity as the extent to which it indeed measures the theoretical construct
it aims to measure Construct validity cannot be expressed in terms of a single validity
coefficient You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires
Convergent validity - if two questionnaires measure the same construct you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated you would not expect a
high correlation
Activity 52 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity of the questionnaire - how well a questionnaire estimates an
individual's position or performance on some outcome measure.
Construct validity - the extent to which one can draw conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 61 Discuss the process of developing the questionnaire
Activity 62 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire
Give instructions for the administration of the questionnaire for the scoring of the
questionnaire and some guidelines on how to interpret the results
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
21 Aim and design
Should be clear what the questionnaire measures and how this information can be used
The aim of the questionnaire determines for whom it will be used Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group this questionnaire has been developed, as well as
the subject matter of the questionnaire. A brief description of the design of the
questionnaire should be provided. The type of items should also be indicated - multiple-choice
items or rating scales
22 Properties of the questionnaire
To determine how effective the questionnaire is you need to administer it to a group of
people who are representative of the target population This group should be described and
indicated to what extent they are representative in terms of those characteristics that
define the target population It should also be mentioned when and under which
circumstances the questionnaire was administered to them Describe each technique used
for item analysis and you should indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process Important for the user to know how
reliable or consistent the questionnaire is Give a brief description of the method used to
determine reliability and justify why this was used The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability Identify the category of validity
(be it content validity criterion-related validity or construct validity) that is relevant for
your questionnaire
23 Procedures for administration scoring and interpretation
Provide general instructions for administration of a questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed
and how to deal with a person asking for an explanation. Provide instructions for scoring. A
correct answer could score 1 and an incorrect answer 0. Rating scales - on a five-point
rating scale 1 = do not agree at all and 5 = strong agreement with the statement. The total
score indicates the person's attitude towards the topic under investigation. Reverse scoring
might be necessary for items that are worded in the opposite direction.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire
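A minimal sketch of scoring a five-point rating scale with reverse scoring follows. The function names, the example answers and the choice of which items are reverse-scored are all hypothetical. The rule used (a reversed score equals 6 minus the original score on a five-point scale) is the usual convention and is an assumption here, since the text only says that reverse scoring might be necessary.

```python
def reverse_score(score, points=5):
    """Reverse a rating on a scale running from 1 to `points` (1 <-> 5, 2 <-> 4)."""
    return points + 1 - score

def total_score(responses, reversed_items):
    """Sum the ratings, reversing those items whose numbers are in `reversed_items`."""
    total = 0
    for item_number, score in enumerate(responses, start=1):
        if item_number in reversed_items:
            score = reverse_score(score)
        total += score
    return total

# Hypothetical answers to a four-item scale; items 2 and 4 are negatively worded
answers = [5, 1, 4, 2]
print(total_score(answers, reversed_items={2, 4}))  # 5 + 5 + 4 + 4 = 18
```

Reversing first means that a high total always points in the same direction, which is what allows the total to be read as the person's attitude towards the topic.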
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 71 Explore a questionnaire rating scale
Activity 72 Use the questionnaire rating scale to evaluate a questionnaire
Activity 73 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 71 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire and how to complete the questionnaire, that
is, how the questions should be tackled. The instructions are usually provided in a cover
letter
11 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
12 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed but respondents are
required to provide their name
Rate 3 absolutely confidential and identity will not be disclosed and respondents are not
required to provide their name
13 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic presented in understandable language
meaningful for people from different cultural or language backgrounds clear and
unambiguous answerable and non-leading
21 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
22 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
23 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to or be interpreted
differently
Rate 1 most items
Rate 2 some items
Rate 3 no items
24 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
25 Item answerability
Rate 0 all items are unanswerable (ie the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
26 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end
31 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
32 Questionnaire item sequence
Rate 0 no items or if all items are irrelevant
Rate 1 sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
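The item sequence rating can be read as a simple rule: start from 4 and subtract one point for each of the three problems the sequence seems to cause. The sketch below encodes that reading. The function name and the problem labels are illustrative, not part of the rating scale itself.

```python
# Illustrative labels for the three problems listed in the rating scale
PROBLEMS = {"response bias", "inefficient responding", "emotional discomfort"}

def rate_item_sequence(has_relevant_items, problems_found):
    """Return the 0 to 4 rating for questionnaire item sequence.

    Rate 0 applies when there are no items or all items are irrelevant.
    Otherwise the rating is 4 minus the number of problems found.
    """
    if not has_relevant_items:
        return 0
    unknown = set(problems_found) - PROBLEMS
    if unknown:
        raise ValueError(f"Unknown problem labels: {unknown}")
    return 4 - len(set(problems_found))

print(rate_item_sequence(True, {"response bias"}))  # 3
print(rate_item_sequence(True, set()))              # 4
```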
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose
41 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 73 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 81 Explore a manual's rating scale
Activity 82 Use the manual's rating scale to evaluate a manual
Activity 83 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 81 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire Questionnaire administrators need to know three things
(a) Whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (ie the aim and design of the questionnaire and the population it can be
used with) (b) the functionality of the questionnaire (ie the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (ie instructions for administration scoring and
interpretation) Focus on whether the manual is able to achieve its purpose not whether it
has in fact achieved this purpose
11 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
12 The logical sequence of the presentation
Rate 0 if your rating in 11 is '0' or '1'
Rate 1 if your rating in 11 is '2' but purpose areas are not in the logical sequence
Rate 2 if your rating in 11 is '2' and if the purpose areas are logical
Rate 3 if your rating in 11 is '3' but if the purpose areas are not in the logical sequence
Rate 4 if your rating in 11 is '3' and if the purpose areas are logical
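Because the logical sequence rating depends on the earlier rating for the purpose areas covered, it may help to see the rule written out as a small decision function. This is only a sketch of the rating rule as listed. The function and parameter names are invented for illustration.

```python
def rate_logical_sequence(coverage_rating, in_logical_sequence):
    """Rate the logical sequence of the presentation (0 to 4).

    coverage_rating: the 0 to 3 rating given for the purpose areas covered
    in_logical_sequence: True if the purpose areas appear in the order (a) (b) (c)
    """
    if coverage_rating not in (0, 1, 2, 3):
        raise ValueError("coverage_rating must be 0, 1, 2 or 3")
    if coverage_rating <= 1:
        return 0  # too few purpose areas covered to judge their sequence
    if coverage_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3

print(rate_logical_sequence(3, True))   # 4
print(rate_logical_sequence(2, False))  # 1
```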
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
13 The logical grouping of the content topics
Rate 0 if your rating in 11 is lsquo0rsquo or lsquo1rsquo
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
14 The clarity of the manual's language
Rate 0 if your rating in 11 is lsquo0rsquo
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
21 The aim of the questionnaire
22 The population targeted by the questionnaire
23 The design of the questionnaire
24 The sample used to test the questionnaire
25 How items were analysed and selected for the questionnaire
26 The reliability of the questionnaire
27 The validity of the questionnaire
28 Instructions for administering the questionnaire
29 Instructions for scoring the questionnaire
210 Guidelines for interpreting the information obtained via the questionnaire
21 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
22 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
23 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
24 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but if the characteristics of the sample group differ from
target population
Rate 2 describes the sample used and if the characteristics of the sample group correspond
to the target population
25 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
26 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
27 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
28 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
29 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
210 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 82 Use the manual's rating scale to evaluate a manual
Action Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally) interaction (the ability to work with people) management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations) Each
factor is measured on a ten point scale
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire
The original MPQ consisted of 145 items. Item analysis showed that 63 items were really
good but that the remaining 82 items did not meet the criteria to be included. Thirteen were
excluded, thus 50 items were retained. The validity coefficient of 0.91 is high
4 Pilot test the questionnaire
Outcome product
A set of items to be included in the final version of the questionnaire
Method
Activity 41 Administer and revise the questionnaire
Activity 42 Do an item analysis
Resource reference
Correlation coefficient
Item analysis
Introduction
Improve your questionnaire further by actually trying it out and seeing how people respond
to each item In particular you will use simple item analysis techniques to improve the 12-
item rating scale that forms part of your questionnaire
Activity 41 Administer and revise the questionnaire
Action Identify a suitable sample: get hold of people to try it out on. The sample should
represent the population to which you hope to generalise your findings.
One needs at least one more person than there are items in one's scale. Finding a sample is
a matter of balancing practical issues with theoretical requirements.
Action Administer the questionnaire to the sample. Be sure to be ethical about what you
are doing. The answer to each of the questions should be YES
Ethical checklist (answer Yes or No to each point and add notes where needed)
Respondents understand why they are being asked to complete the questionnaire
- It is to help you with your studies
- It is to help you improve the questionnaire
- They will not get their scores back
Respondents understand that they don't have to complete the questionnaire
- You won't hold it against them if they decide not to do it
- You won't tell anybody else if they refused
Respondents understand that their responses will be confidential
- Their names won't be on the questionnaire
- You won't show their responses to anybody else
Keep notes of what kinds of questions people ask and what difficulties arise, so that you can
make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Item analysis comprises procedures to select the best items for inclusion. Commonly used
criteria are item difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
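The difficulty index described above can be sketched in a few lines of Python. This is only an illustration: the function names and the pilot data are hypothetical, and responses are assumed to be coded 1 for a correct answer and 0 for an incorrect one.

```python
def difficulty_index(responses):
    """Proportion of respondents who answered the item correctly.

    `responses` is a list of 1 (correct) and 0 (incorrect) codes.
    """
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


def is_acceptable(index, low=0.25, high=0.75):
    """True when the index lies inside the 0.25 to 0.75 band from the text."""
    return low <= index <= high


# Hypothetical pilot data: 8 respondents, three items.
items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1],  # everybody correct: too easy
    "item_2": [1, 0, 1, 0, 1, 0, 1, 0],  # half correct: close to ideal
    "item_3": [0, 0, 0, 0, 0, 0, 0, 1],  # almost nobody correct: too difficult
}

indices = {name: difficulty_index(codes) for name, codes in items.items()}
kept = [name for name, p in indices.items() if is_acceptable(p)]
```

With these made-up responses only item_2 falls inside the band, so it is the only item that would be kept on the difficulty criterion.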
2 Item discrimination
Item discrimination is the ability of an item to discriminate between respondents according
to whatever the measuring instrument as a whole is measuring. Items should only be selected if
they measure the same characteristic, else the scale loses focus. Discrimination is judged
from the correlation between the item score and the total score: the higher the correlation
coefficient, the more discriminating the item. A minimum correlation of 0.2 is generally
required. Items with negative or zero correlations are almost always excluded. A negative
correlation could indicate that an item should have been reverse scored.
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response options: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                    Ticked option
I like loud music            5 Always
I prefer quiet places        1 Never
I enjoy noisy environments   4 Most of the time
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect saying
that she always likes noisy places, and she should therefore get a high score.
Response options: 1 Never, 2 Almost never, 3 Sometimes, 4 Most of the time, 5 Always
Statement                               Ticked option        Score
I like loud music                       5 Always             5
I prefer quiet places (REVERSE SCORE)   1 Never              5
I enjoy noisy environments              4 Most of the time   4
The data sheet is divided into rows and columns: one row for each person in your sample,
one column for each item in your rating scale and one column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases   Items                                            Total score
        1   2   3   4   5   6   7   8   9   10  11  12
1       5   5   4
2       3   3
3       3   2
4       3   4
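The scoring steps for one respondent can be sketched as follows. This is a minimal illustration, assuming the 1 to 5 response options used in the example; the function names are hypothetical, and the three ticked options are the ones from the loud music example.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # response options run from 1 (Never) to 5 (Always)


def reverse_score(response):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - response


def score_respondent(ticked, reversed_items):
    """Return the item scores and the total score for one respondent.

    `ticked` maps item number to the ticked option; `reversed_items` is the
    set of item numbers that must be reverse-scored.
    """
    scores = {
        item: reverse_score(option) if item in reversed_items else option
        for item, option in ticked.items()
    }
    return scores, sum(scores.values())


# The three example items: item 2 ("I prefer quiet places") is reverse-scored.
ticked = {1: 5, 2: 1, 3: 4}
scores, total = score_respondent(ticked, reversed_items={2})

# Lowest and highest possible totals on a 12-item scale.
_, lowest = score_respondent({i: 1 for i in range(1, 13)}, reversed_items=set())
_, highest = score_respondent({i: 5 for i in range(1, 13)}, reversed_items=set())
```

The item scores come out as 5, 5 and 4, which is the first row of the data sheet, and the two extreme totals confirm the range of 12 to 60.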
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
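The same check can be done numerically instead of by eye. The sketch below computes the variance of each column of a small data sheet; the data are hypothetical and the choice of the population variance from Python's statistics module is an assumption, since the text only asks for a visual inspection.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item.
data_sheet = [
    [5, 3, 1],
    [5, 4, 2],
    [5, 2, 5],
    [4, 3, 3],
    [5, 3, 4],
]


def item_variances(rows):
    """Return the variance of each item (column) on the data sheet."""
    columns = zip(*rows)  # turn the rows into columns
    return [pvariance(column) for column in columns]


variances = item_variances(data_sheet)

# The item with the smallest variance is the first candidate for removal.
weakest_item = variances.index(min(variances)) + 1  # items are numbered from 1
```

In this made-up sheet almost everybody scores 5 on item 1, so it has the smallest variance, while item 3 shows a good spread of scores.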
It is not always possible to explain why most people end up answering an item in the same
way. The item may be too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. A correlation
coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction
of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1 the dots
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Item 3   Total score
1       4        53
2       1        12
3       5        49
4       4        40
5       2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
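For readers who want to go beyond eyeballing the table, the item-total correlation for item 3 can be computed from the five cases shown above. The sketch assumes the Pearson product-moment coefficient, which the text does not name explicitly; the helper function is written out by hand and its name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# Scores on item 3 and total scores for the five cases in the table.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson(item_3, totals)  # roughly 0.94
discriminates = r >= 0.2     # minimum correlation mentioned under item discrimination
```

The coefficient of about 0.94 agrees with the impression from the table that item 3 discriminates well. Note that the total score here still includes the item itself, as in the data sheet.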
[Scatter plot: relation between the scores on item 3 and the scale total]
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score, so the item discriminates well. If the dots don't seem to form
a pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two sets
of scores are correlated and the correlation coefficient represents the degree of test-retest
reliability. The closer the correlation coefficient is to 1, the more consistent the results.
Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first,
only that a person's position relative to that of the others in the group stays the same. The
time interval should be at least several days to reduce the possibility of effects such as
familiarity with the type of items or respondents remembering their answers.
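The point that a perfect correlation does not require identical scores can be checked with a small example. The scores below are hypothetical: everybody scores exactly 5 points higher on the second occasion, so relative positions are unchanged. The Pearson helper is written out by hand and is an illustrative assumption about how the correlation is computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores of five people on two occasions.
first_occasion = [12, 25, 31, 40, 18]
second_occasion = [score + 5 for score in first_occasion]  # everyone gains 5 points

test_retest_r = pearson(first_occasion, second_occasion)
scores_identical = first_occasion == second_occasion
```

Here the test-retest coefficient is 1 (up to rounding) although no single score is the same on both occasions.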
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. Because a shorter questionnaire is generally less reliable, the
correlation between the halves tends to underestimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire estimated in this way is called the
split-half reliability, and it measures the degree of equivalence between the two halves, that
is, the extent to which they measure the same attribute. It reflects the consistency and
indicates the degree of relatedness of the items. This type of reliability is therefore also
regarded as a measure of internal consistency, and the closer the split-half reliability is
to 1, the higher the internal consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items included
in each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person calculate the total score for the odd items and for the even items.
Cases   Odd items (1 3 5 7)   Total score   Even items (2 4 6 8)   Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person take the total score on the odd items and the total score on the even
items, and where the two meet make a dot on the graph. Draw a straight line
resembling the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer
to 1 indicate a more reliable rating scale.
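The split-half procedure for an 8-item scale can also be worked through numerically rather than graphically. The sketch below uses hypothetical responses for five people. The final step, stepping the half-scale correlation up to the full scale length, uses the Spearman-Brown formula; the text does not name a formula, so this choice is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical item scores (items 1 to 8) for five respondents.
responses = [
    [5, 4, 5, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 3, 4, 3],
    [4, 4, 3, 4, 5, 4, 4, 4],
    [1, 2, 2, 1, 2, 1, 1, 2],
]

# Items 1, 3, 5, 7 sit at list positions 0, 2, 4, 6; items 2, 4, 6, 8 at 1, 3, 5, 7.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

# Correlation between the halves: the reliability of either half.
r_half = pearson(odd_totals, even_totals)

# Spearman-Brown step-up (assumed here): estimated reliability of the full scale.
r_full = 2 * r_half / (1 + r_half)
```

For these made-up data the two halves correlate at about 0.98, and the stepped-up estimate for the whole scale is slightly higher, as expected for a scale twice as long as each half.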
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, and the
extent to which the scores can be used for the intended purpose. There are three categories of
validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
Content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of a
single validity coefficient. You would expect groups who are supposed to differ in terms of a
construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity is how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity is the extent to which you can draw conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration and scoring of the questionnaire and some guidelines
on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group the questionnaire has been developed, given its subject
matter. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated, for example multiple-choice items or
rating-scale items.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and it
should be indicated to what extent they are representative in terms of those characteristics
that define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed and
how to deal with a person asking for an explanation. Provide instructions for scoring. An
answer could score 1 (correct) or 0 (incorrect). For rating scales, on a five-point rating
scale 1 = do not agree at all and 5 = strong agreement with the statement. The total score
indicates the person's attitude towards the topic under investigation. Reverse scoring might
be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of the
information provided in the questionnaire, and how to complete the questionnaire (how the
questions should be tackled). These instructions are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3 absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses on the topic
Rate 2 focusses on the topic but the topic is not covered in full
Rate 3 focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
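The rating rule for the item sequence boils down to counting problems. The following sketch encodes it as a small function; the function and parameter names are hypothetical and the rater still has to judge each problem by hand.

```python
def rate_item_sequence(has_relevant_items, introduces_bias,
                       forces_inefficiency, causes_discomfort):
    """Rate the item sequence of a questionnaire on the 0 to 4 scale.

    Returns 0 when there are no items or all items are irrelevant; otherwise
    4 minus the number of problems the sequence seems to cause.
    """
    if not has_relevant_items:
        return 0
    problems = sum([introduces_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems


# A sequence that only makes respondents feel uncomfortable has one problem.
example_rating = rate_item_sequence(
    has_relevant_items=True,
    introduces_bias=False,
    forces_inefficiency=False,
    causes_discomfort=True,
)
```

A questionnaire whose sequence causes one of the three problems therefore gets a rating of 3.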
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
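Because the sequence rating depends on the coverage rating, the two rules can be read as one small decision procedure. The sketch below is only an illustration of that dependency; the function names and the area labels "nature", "functionality" and "instructions" are hypothetical shorthand for purpose areas (a), (b) and (c).

```python
PURPOSE_AREAS = ("nature", "functionality", "instructions")  # (a), (b), (c)


def rate_purpose_coverage(areas_covered):
    """Coverage rating: the number of purpose areas the manual informs about (0 to 3)."""
    return sum(1 for area in PURPOSE_AREAS if area in areas_covered)


def rate_logical_sequence(coverage_rating, in_logical_sequence):
    """Sequence rating (0 to 4), which depends on the coverage rating."""
    if coverage_rating not in (0, 1, 2, 3):
        raise ValueError("coverage rating must be 0, 1, 2 or 3")
    if coverage_rating <= 1:
        return 0
    if coverage_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


# A manual that covers all three areas but presents them out of order.
coverage = rate_purpose_coverage({"instructions", "nature", "functionality"})
sequence = rate_logical_sequence(coverage, in_logical_sequence=False)
```

Such a manual gets the full coverage rating of 3 but a sequence rating of 3 rather than 4.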
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Keep notes of the kinds of questions people ask and the difficulties that arise, so that you
can make improvements. When you get a questionnaire back, quickly scan it to see that the
respondent has completed all of it.
Action: There are two ways of using a pilot study to improve your questionnaire. The first is
to use what happened during the study. Look again at the notes you made while you were
administering the questionnaire. Now write a short summary of the changes.
Item analysis
Procedures to select the best items for inclusion. Commonly used criteria are item
difficulty (item facility or item variance) and item discrimination.
1 Item difficulty/variance
The ideal questionnaire is one where about half the people get each of the items right. Item
analysis involves discarding items that are too easy or too difficult. The difficulty index for
an item is usually calculated by dividing the number of people who gave a correct response
by the total number of people in the sample. The difficulty index should be between 0.25
and 0.75 and the average difficulty should be about 0.5.
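As a minimal sketch of the difficulty index described above, assuming each response is coded 1 (correct) or 0 (incorrect). The function names and the response data are hypothetical and serve only to illustrate the calculation.

```python
def difficulty_index(item_responses):
    """Number of correct responses divided by the number of people in the sample."""
    return sum(item_responses) / len(item_responses)


def has_acceptable_difficulty(item_responses, low=0.25, high=0.75):
    """True if the difficulty index lies within the range given in the notes."""
    return low <= difficulty_index(item_responses) <= high


# Hypothetical responses of eight people to three items (1 = correct, 0 = incorrect).
responses = {
    "item_1": [1, 1, 0, 1, 0, 0, 1, 0],  # half the sample answers correctly
    "item_2": [1, 1, 1, 1, 1, 1, 1, 1],  # everybody answers correctly: too easy
    "item_3": [0, 0, 0, 1, 0, 0, 0, 0],  # one person answers correctly: too difficult
}

for name, item in responses.items():
    print(name, difficulty_index(item), has_acceptable_difficulty(item))
```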
2 Item discrimination
The ability of an item to discriminate between respondents according to whatever the
measuring instrument as a whole is measuring. Items should only be selected if they
measure the same characteristic, otherwise the questionnaire loses focus. The higher the
correlation coefficient between the item and the total score, the more discriminating the
item. A minimum correlation of 0.2 is generally required. Items with negative or zero
correlations are almost always excluded. A negative correlation could indicate that an item
should have been reverse scored.
3 How many items to exclude
It is usual to discard 20% to 30% of the items.
4 Other forms of item analysis
A range of item bias statistics help test constructors to identify items that perform
differently (are biased) for different groups.
Activity 4.2 Do an item analysis
The second way of using a pilot study is to analyse the responses people gave to each item
in the questionnaire. Remove items with too little variance and remove items that don't
discriminate. You only need to perform an item analysis on the rating scale part of your
questionnaire.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Statement                  | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always
I like loud music          |         |                |             |                    | ✓
I prefer quiet places      | ✓       |                |             |                    |
I enjoy noisy environments |         |                |             | ✓                  |
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Statement                             | 1 Never | 2 Almost never | 3 Sometimes | 4 Most of the time | 5 Always | Score
I like loud music                     |         |                |             |                    | ✓        | 5
I prefer quiet places (REVERSE SCORE) | ✓       |                |             |                    |          | 5
I enjoy noisy environments            |         |                |             | ✓                  |          | 4
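Reverse scoring on a five-point scale can be sketched as follows. The sketch assumes the ticks fall on Always, Never and Most of the time, which is consistent with the item scores of 5, 5 and 4 in the example. The function name and the dictionary layout are hypothetical.

```python
def reverse_score(response, low=1, high=5):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, 3 stays 3."""
    return low + high - response


# Ticked response options for one respondent (1 = Never ... 5 = Always).
ticked = {
    "I like loud music": 5,
    "I prefer quiet places": 1,
    "I enjoy noisy environments": 4,
}
# Items worded in the opposite direction to the rest of the scale.
reverse_items = {"I prefer quiet places"}

scores = {
    statement: reverse_score(response) if statement in reverse_items else response
    for statement, response in ticked.items()
}
total_score = sum(scores.values())

print(scores)
print(total_score)
```

After reverse scoring, a high item score always points in the same direction (here, a liking for noise), so the item scores can be added into a total score.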
The data sheet is divided into rows and columns - one row for each person in your sample,
one column for each item in your rating scale and one column for the total score. Take a
questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases Items Total Score
1 2 3 4 5 6 7 8 9 10 11 12
1 5 5 4
2 3 3
3 3 2
4 3 4
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
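The eyeball check of each column can be supplemented with a calculated variance. The following is a minimal sketch under stated assumptions: the data sheet is hypothetical, and the cut-off of 0.5 is an arbitrary illustration, not a value given in these notes.

```python
from statistics import pvariance


def low_variance_items(data_sheet, threshold=0.5):
    """Return the numbers (starting at 1) of items whose variance is below threshold.

    data_sheet has one row per person and one column per item.
    """
    flagged = []
    for column_index in range(len(data_sheet[0])):
        column = [row[column_index] for row in data_sheet]
        if pvariance(column) < threshold:
            flagged.append(column_index + 1)
    return flagged


# Hypothetical data sheet: five people, three items scored 1 to 5.
data_sheet = [
    [5, 3, 1],
    [4, 3, 2],
    [2, 3, 5],
    [1, 3, 4],
    [3, 4, 2],
]

# Item 2 contains mainly the number 3, so it is the one flagged.
print(low_variance_items(data_sheet))
```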
It is not always possible to explain why most people end up answering an item in the same
way - the item may be too extremely worded or too vague, or there may be a strong
'socially desirable' way of responding.
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. The
correlation coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction of
the relationship.
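The notes do not say which formula is used. A common choice is the Pearson product-moment correlation, sketched below as an assumption. The function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores.

    Raises ZeroDivisionError if either list has no variance at all.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


scores = [1, 2, 3, 4, 5]
print(pearson_r(scores, [2, 4, 6, 8, 10]))  # rises with scores: positive sign
print(pearson_r(scores, [10, 8, 6, 4, 2]))  # falls as scores rise: negative sign
```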
2 The scatter plot
The graphic display of the correlation coefficient. If there is a perfect positive relation
between two constructs (a correlation coefficient of +1), the dots form a perfectly straight
line with an upward slope. For a correlation coefficient of -1, the dots form a perfectly
straight line with a downward slope. No relation (a correlation coefficient of 0) between two
constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Items                                     Total Score
        1   2   3   4   5   6   7   8   9   10  11  12
1               4                                 53
2               1                                 12
3               5                                 49
4               4                                 40
5               2                                 16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Figure: scatter plot of the relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation. If the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
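Instead of judging the scatter plot by eye, the item-total correlation can be calculated. The sketch below uses the item 3 scores and total scores from the data sheet above and assumes a Pearson correlation. The function names are hypothetical, and the minimum of 0.2 is the one mentioned under item discrimination.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


def discriminates(item_scores, total_scores, minimum=0.2):
    """True if the item-total correlation reaches the required minimum."""
    return pearson_r(item_scores, total_scores) >= minimum


# Scores of cases 1 to 5 on item 3 and on the whole scale.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r_item_3 = pearson_r(item_3, totals)
print(round(r_item_3, 2))             # about 0.94: a strong positive relation
print(discriminates(item_3, totals))
```

Repeating this for every column of the data sheet gives one coefficient per item, which makes it easier to compare items than 12 separate scatter plots.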
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8 item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
How consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first,
only that a person's position relative to that of the others in the group stays the same. The
time interval should be at least several days to reduce the possibility of effects such as
familiarity with the type of items or respondents remembering their answers.
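The point that a perfect correlation does not require identical scores can be illustrated with a small sketch. The scores are hypothetical, and a Pearson correlation is assumed. Every person scores three points higher on the second occasion, yet each person's relative position is unchanged.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical total scores of the same five people on two occasions.
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]

test_retest_reliability = pearson_r(first_occasion, second_occasion)
print(first_occasion == second_occasion)   # the scores themselves differ
print(round(test_retest_reliability, 6))   # yet the correlation is perfect
```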
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. Because a shorter questionnaire is generally less reliable,
the correlation between the halves is adjusted upwards to estimate the reliability of the
whole questionnaire. This estimate is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency of the responses and indicates the degree of
relatedness of the items. This type of reliability is therefore also regarded as a measure of
internal consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree
of equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person calculate the total score for the odd items and for the even items.
Cases   Odd items         Total Score   Even items        Total Score
        1   3   5   7                   2   4   6   8
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person take the total score on the odd items and the total score on the even
items, and where the two meet you make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
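The odd-even procedure can be sketched in Python as follows. The data sheet is hypothetical and a Pearson correlation is assumed. The step from the half-scale correlation to the whole scale uses the Spearman-Brown formula, a standard adjustment that these notes do not name, so it is included here as an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


def spearman_brown(r_half):
    """Estimate whole-scale reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def split_half(data_sheet):
    """Return (correlation between halves, estimated whole-scale reliability)."""
    odd_totals = [sum(row[0::2]) for row in data_sheet]   # items 1, 3, 5, 7
    even_totals = [sum(row[1::2]) for row in data_sheet]  # items 2, 4, 6, 8
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical data sheet: five people, eight items scored 1 to 5.
data_sheet = [
    [5, 4, 5, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 1, 2, 2],
    [4, 4, 3, 4, 5, 4, 4, 3],
    [3, 3, 3, 2, 3, 3, 2, 3],
    [1, 2, 1, 1, 2, 1, 1, 2],
]

r_half, r_whole = split_half(data_sheet)
print(round(r_half, 2), round(r_whole, 2))
```

A different division into halves (for example first four items against last four) would generally give a different coefficient, which is the limitation of the method noted above.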
Validity
The extent to which the questionnaire measures what it claims to measure, that is, the extent
to which the scores can be used for the intended purpose. There are three categories of
validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course and
administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
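A validity coefficient for the course-selection example could be calculated as sketched below. All figures are hypothetical and a Pearson correlation is assumed. The questionnaire scores are collected at application and the examination marks at the end of the course.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical data for five students, listed in the same order in both lists.
questionnaire_scores = [12, 18, 25, 30, 35]  # obtained when applying
examination_marks = [45, 52, 60, 71, 78]     # the criterion, obtained later

validity_coefficient = pearson_r(questionnaire_scores, examination_marks)
print(round(validity_coefficient, 2))
```

For concurrent validity the calculation is the same, and only the timing of the criterion measure differs.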
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - how well the questionnaire supports conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country and age group the questionnaire has been developed, as these bear on
the subject matter of the questionnaire. A brief description of the design of the questionnaire
should be provided. The type of items should also be indicated - multiple-choice items or
rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those
characteristics that define the target population. It should also be mentioned when and under
which circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion or exclusion
of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of a questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring. A
correct answer could score a 1 and an incorrect answer a 0. Rating scales - on a five-point
rating scale, 1 = do not agree at all and 5 = strong agreement with the statement. The total
score indicates the person's attitude towards the topic under investigation. Reverse scoring
might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire, that
is, how the questions should be tackled. The instructions can be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing forbidding, sensitive and personal
questions towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: no items, or all items are irrelevant
Rate 1: the sequence seems to
(1) introduce a particular response style or bias, and
(2) force respondents to respond in an inefficient manner, and
(3) make respondents feel emotionally uncomfortable
Rate 2: the sequence seems to cause any two of these problems
Rate 3: the sequence seems to cause any one of these problems
Rate 4: the sequence does not seem to cause any of these problems
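The item sequence rating amounts to counting how many of the three problems the sequence
seems to cause. A minimal sketch in Python, assuming the evaluator has already judged each
problem as present or absent (the function and parameter names are illustrative and are
not part of the rating scale itself):

```python
def rate_item_sequence(has_relevant_items, causes_bias, causes_inefficiency, causes_discomfort):
    """Return the 0-4 rating for questionnaire item sequence.

    has_relevant_items is False when there are no items or all items are irrelevant.
    The three other flags record the evaluator's judgement on problems (1), (2) and (3).
    """
    if not has_relevant_items:
        return 0
    problems = sum([causes_bias, causes_inefficiency, causes_discomfort])
    # Three problems -> 1, two -> 2, one -> 3, none -> 4
    return 4 - problems


print(rate_item_sequence(True, True, False, False))  # one problem, so the rating is 3
```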
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose, and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind,
(b) whether the questionnaire will work, and
(c) how the questionnaire should be used.
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire), and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one purpose area
Rate 2: information about any two purpose areas
Rate 3: information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire.
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
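Because the logical sequence rating depends on the rating already given for the purpose
areas covered, it can help to see the rule written out as a lookup. This is a sketch only;
the function name and arguments are hypothetical, and the judgement of whether the areas
follow the order (a), (b), (c) remains the evaluator's:

```python
def rate_logical_sequence(purpose_rating, in_logical_sequence):
    """Return the 0-4 rating for the logical sequence of the presentation.

    purpose_rating: the 0-3 rating given for the purpose areas covered in the manual.
    in_logical_sequence: True if the covered purpose areas follow the order (a), (b), (c).
    """
    if purpose_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_rating must be 0, 1, 2 or 3")
    if purpose_rating <= 1:
        return 0  # too few purpose areas to judge a sequence
    base = 1 if purpose_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)


print(rate_logical_sequence(3, False))  # all three areas covered, wrong order: rating 3
```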
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language, and
(b) ambiguous statements, and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population, but it is not appropriate
Rate 2: describes the target population, and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four of these
Rate 3: instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: scoring (but no decoding) is required, but the manual does not provide instructions
Rate 1: both scoring and decoding are required, and the manual provides instructions for
scoring but not for decoding
Rate 2: both scoring and decoding are required, and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Action: Compile a data sheet. The possible responses to each item in the scale will have a
number from 1 to 5.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                     1   2   3   4   5
I like loud music                             √
I prefer quiet places         √
I enjoy noisy environments                √
The ticked options are known as item responses. Item 2 in our example should be
'reverse-scored' because if somebody says she never likes quiet places, she is in effect
saying that she always likes noisy places, and she should therefore get a high score.
Response options: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Most of the time, 5 = Always
Statement                               1   2   3   4   5   Item score
I like loud music                                       √   5
I prefer quiet places (REVERSE SCORE)   √                   5
I enjoy noisy environments                          √       4
The data sheet is divided into rows and columns: one row for each person in your sample
and one column for each item in your rating scale, plus a column for the total score. Take
a questionnaire and transfer the score for each item to the first row on the data sheet.
Calculate each respondent's total score. The lowest total anybody can have on the 12-item
rating scale is 12 and the highest is 60.
Cases   Items                                             Total Score
        1   2   3   4   5   6   7   8   9   10  11  12
1       5   5   4
2       3   3
3       3   2
4       3   4
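Reverse scoring and the total score can be sketched in a few lines of Python. The sketch
assumes the five-point scale described above, so a reversed response is 6 minus the ticked
number; the function names and the way reversed items are listed are illustrative choices,
not something the study material prescribes:

```python
SCALE_MIN, SCALE_MAX = 1, 5  # five-point scale: 1 = Never ... 5 = Always


def reverse_score(response):
    """Mirror a response on the scale: 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - response


def total_score(responses, reversed_items=frozenset()):
    """Sum the item scores; reversed_items holds the 1-based numbers of items to reverse."""
    return sum(
        reverse_score(response) if number in reversed_items else response
        for number, response in enumerate(responses, start=1)
    )


# The three example statements: Always (5), Never (1, reverse-scored), Most of the time (4)
example = [5, 1, 4]
print(total_score(example, reversed_items={2}))  # item scores 5 + 5 + 4
```

With twelve items, a respondent who scores 1 on every item totals 12 and one who scores 5
on every item totals 60, which matches the range given above.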
Action: Start with the actual item analysis of your rating scale. Find items with too little
variance, where almost everybody in the sample gets the same item score. You want your
scale to show differences between people.
Compare items and decide which are the better items in terms of the amount of variance they
show.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way: the item may have been too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
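Instead of only inspecting the columns by eye, the variance of each item can also be
computed. The following is a minimal sketch with made-up data; the cut-off of 0.5 is an
arbitrary assumption for illustration, since the text gives no numerical threshold:

```python
from statistics import pvariance


def item_variances(data_sheet):
    """Return one variance per item column; each row of data_sheet is one respondent."""
    return [pvariance(column) for column in zip(*data_sheet)]


def low_variance_items(data_sheet, threshold=0.5):
    """Return the 1-based numbers of items whose variance is below the assumed threshold."""
    variances = item_variances(data_sheet)
    return [number for number, v in enumerate(variances, start=1) if v < threshold]


# Hypothetical data sheet: 5 respondents (rows) by 3 items (columns)
sheet = [
    [5, 4, 1],
    [5, 2, 5],
    [5, 3, 2],
    [4, 5, 4],
    [5, 1, 3],
]
print(low_variance_items(sheet))  # item 1: almost everybody gave the same answer
```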
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. The
correlation coefficient expresses the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a correlation coefficient indicates the
strength of the relationship, while the sign (positive or negative) indicates the direction of
the relationship.
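The text does not say how the coefficient is calculated. One common choice is the Pearson
product-moment correlation, sketched below under that assumption; the function name
pearson_r and the sample numbers are illustrative only:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


print(pearson_r([1, 2, 3], [2, 4, 6]))  # scores rise together: positive sign
print(pearson_r([1, 2, 3], [6, 4, 2]))  # one rises as the other falls: negative sign
```

The size of the result reflects the strength of the relationship and its sign the
direction, as described above. The function is undefined when either list has no variance.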
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect
positive relation between two constructs (a correlation coefficient of +1), the dots form a
perfectly straight line with an upward slope. For a correlation coefficient of -1, the scores
form a perfectly straight line with a downward slope. No relation (a correlation coefficient
of 0) between two constructs results in an undefined shape.
3 Using correlations in item analysis
If the item correlates strongly with the total score, we know that it measures more or less
the same thing as the other items.
Action: To measure differences between people, our items need to show some variance, but
even if the items show lots of variance, the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Cases   Items                                             Total Score
        1   2   3   4   5   6   7   8   9   10  11  12
1               4                                         53
2               1                                         12
3               5                                         49
4               4                                         40
5               2                                         16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
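As an illustration, the item-total correlation for item 3 can be computed from the five
cases in the data sheet above. The sketch assumes a Pearson correlation, which the text
does not name explicitly, and the helper pearson_r is written out here only so that the
example is self-contained:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Scores of cases 1 to 5 on item 3 and their total scores, taken from the data sheet
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson_r(item_3, totals)
print(round(r, 2))  # a value close to +1: the item discriminates well
```

Some analysts first subtract the item's own score from the total before correlating; that
refinement is not part of the procedure described here.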
Scatter plot: relation between the scores on item 3 and the scale total
Each dot on the scatter plot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To find out how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first;
it indicates that a person's relative position to that of the others in the group stays the
same. The time interval should be at least several days to reduce the possibility of effects
such as familiarity with the type of items or respondents remembering their answers.
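The point that a perfect correlation does not require identical scores can be checked
with a small example. In this sketch (made-up scores, Pearson correlation assumed) every
respondent scores five points higher on the second occasion, yet the correlation is
still perfect because everyone keeps the same relative position:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical total scores of four respondents on two occasions
first_occasion = [10, 20, 30, 40]
second_occasion = [15, 25, 35, 45]  # everybody gained 5 points

retest_r = pearson_r(first_occasion, second_occasion)
print(retest_r)  # perfect correlation although no score stayed the same
```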
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the two halves is used to estimate the reliability of the whole
questionnaire. This estimate is called the split-half reliability, and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of the internal
consistency, and the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
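The text does not name the adjustment that takes the correlation between the two halves
to an estimate for the whole questionnaire. A standard choice is the Spearman-Brown
formula, shown here as an assumption rather than as the method the study material
prescribes:

```python
def spearman_brown(half_correlation):
    """Estimate full-length reliability from the correlation between two halves."""
    return 2 * half_correlation / (1 + half_correlation)


# A hypothetical correlation of 0.60 between the odd and even halves
print(round(spearman_brown(0.60), 2))  # higher than 0.60, since the full scale is longer
```

The estimate for the whole questionnaire is never lower than the correlation between
the halves (for positive correlations), which fits the remark that a shorter
questionnaire is generally less reliable.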
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree
of equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains to some extent depends on the items included in
each of the two halves.
Action: Re-number your eight items from 1 to 8, then divide the rating scale into two halves
by grouping the odd items together and the even items together. For each
person, calculate the total score for the odd items and for the even items.
Cases   Odd items                     Even items
        1   3   5   7   Total Score   2   4   6   8   Total Score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
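The odd and even totals and their correlation can be sketched as follows. The scores are
invented for illustration, the function names are hypothetical, and a Pearson correlation
is assumed:

```python
from math import sqrt


def half_totals(item_scores):
    """Return (odd total, even total) for one person's scores on items 1 to 8."""
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, 7
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, 8
    return odd_total, even_total


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical item scores of four respondents on the 8-item scale
sample = [
    [5, 4, 5, 5, 4, 5, 4, 4],
    [2, 1, 2, 2, 1, 1, 2, 2],
    [3, 3, 4, 3, 3, 4, 3, 3],
    [4, 4, 3, 4, 5, 4, 4, 3],
]
odd_totals, even_totals = zip(*(half_totals(person) for person in sample))
half_r = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(half_r, 2))
```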
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatter plot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that
is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course
and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity of the questionnaire - how well a questionnaire estimates an
individual's position or performance on some outcome measure.
Construct validity - the extent to which one can make conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual: The purpose and structure of a manual
Manual: The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire and for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration, scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country and age group for which this questionnaire has been developed, as well
as the subject matter of the questionnaire. A brief description of the design of the
questionnaire should be provided. The type of items should also be indicated (e.g.
multiple-choice items or rating scales).
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those
characteristics that define the target population. It should also be mentioned when and
under which circumstances the questionnaire was administered to them. Describe each
technique used for item analysis and indicate which criteria were used to justify the
inclusion or exclusion of items in the item selection process. It is important for the user to
know how reliable or consistent the questionnaire is. Give a brief description of the method
used to determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents can complete it on their own or supervision is needed, the material
needed, and how to deal with a person asking for an explanation. Provide instructions for
scoring. A correct answer could score a 1 and an incorrect answer a 0. For rating scales, on
a five-point rating scale a 1 = do not at all agree and a 5 = strong agreement with the
statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 71 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire and how to complete the questionnaire (that is,
how the questions should be tackled). The instructions are provided in a cover letter.
11 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
12 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed, but respondents are required to provide their name
Rate 3 absolutely confidential and identity will not be disclosed, and respondents are not required to provide their name
13 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
21 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
22 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
23 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
24 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
25 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
26 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
31 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
32 Questionnaire item sequence
Rate 0 no items or if all items are irrelevant
Rate 1 sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
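The rating rule for the item sequence comes down to counting problems. A minimal sketch, assuming the evaluator has already judged whether each of the three problems is present; the function and parameter names are illustrative only.

```python
def rate_item_sequence(has_relevant_items, causes_bias, is_inefficient, causes_discomfort):
    """Rate the item sequence on the 0 to 4 scale described above."""
    if not has_relevant_items:
        # Rate 0: no items, or all items are irrelevant.
        return 0
    # Each argument is True when the sequence seems to cause that problem.
    problems = sum([causes_bias, is_inefficient, causes_discomfort])
    # Three problems -> 1, two -> 2, one -> 3, none -> 4.
    return 4 - problems

print(rate_item_sequence(True, True, False, False))  # one problem -> 3
```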
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
41 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 73 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he or she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 81 Explore a manual's rating scale
Activity 82 Use the manual's rating scale to evaluate a manual
Activity 83 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 81 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
11 The purpose areas covered in the manual
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
12 The logical sequence of the presentation
Rate 0 if your rating in 11 is '0' or '1'
Rate 1 if your rating in 11 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 11 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 11 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 11 is '3' and the purpose areas are in the logical sequence
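The rating for the logical sequence depends on the rating already given for the purpose areas covered. The dependency can be sketched as below; this is an illustration only, and the function and parameter names are assumptions.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Rate the logical sequence (0 to 4) from the purpose-area rating (0 to 3)."""
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose-area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    # All three purpose areas are covered.
    return 4 if in_logical_sequence else 3

print(rate_logical_sequence(3, in_logical_sequence=False))  # 3
```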
Each of the purpose areas refers to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
13 The logical grouping of the content topics
Rate 0 if your rating in 11 is '0' or '1'
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear, precise and correct language, written in short
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
14 The clarity of the manual's language
Rate 0 if your rating in 11 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
21 The aim of the questionnaire
22 The population targeted by the questionnaire
23 The design of the questionnaire
24 The sample used to test the questionnaire
25 How items were analysed and selected for the questionnaire
26 The reliability of the questionnaire
27 The validity of the questionnaire
28 Instructions for administering the questionnaire
29 Instructions for scoring the questionnaire
210 Guidelines for interpreting the information obtained via the questionnaire
21 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
22 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
23 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
24 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but the characteristics of the sample group differ from those of the
target population
Rate 2 describes the sample used and the characteristics of the sample group correspond
to those of the target population
25 How items were analysed and selected for the questionnaire
Rate 0 not sufficient information about the analysis and selection of questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
26 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
27 The validity of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
28 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
29 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
210 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 82 Use the manual's rating scale to evaluate a manual
Action Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded thus 50
items were retained. The validity coefficient of 0.91 is high.
Run your eye down each of the columns on your data sheet and look for items that may not
have sufficient variance. If a column contains mainly one number, the item doesn't
show much variance. If a column contains a good spread of numbers, the item shows lots of
variance.
It is not always possible to explain why most people end up answering an item in the same
way. The item may be too extremely worded or too vague, or there may be a
strong 'socially desirable' way of responding.
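The visual check for variance can also be done by calculation. The sketch below is only an illustration: the data sheet is made up (rows are respondents, columns are items) and the cut-off of 0.5 is an arbitrary assumption, not a rule from the study material.

```python
from statistics import pvariance

# Hypothetical data sheet: one row per respondent, one column per item.
data_sheet = [
    [4, 5, 1],
    [4, 2, 3],
    [4, 4, 5],
    [5, 1, 2],
    [4, 3, 4],
]

# Turn the rows into columns so that each item can be inspected on its own.
item_columns = list(zip(*data_sheet))
variances = [pvariance(column) for column in item_columns]

# Assumed cut-off: flag items whose variance falls below 0.5.
CUT_OFF = 0.5
low_variance_items = [
    item_number
    for item_number, variance in enumerate(variances, start=1)
    if variance < CUT_OFF
]
print(low_variance_items)  # item 1: almost everyone answered 4
```

Item 1 is flagged because its column contains mainly one number, which is exactly what the eye would pick up on the data sheet.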
Correlation coefficient
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour, e.g. anxiety, intelligence, stress, independence, etcetera. The correlation
coefficient describes the relation between constructs.
1 The correlation coefficient
The statistical relationship between two constructs is called a correlation. The correlation
coefficient ranges from -1 to +1. A value close to 0 indicates a weak relationship, while 0
represents no correlation. The numerical size of a
correlation coefficient indicates the strength of the relationship, while the sign (positive or
negative) indicates the direction of the relationship.
2 The scatter plot
The scatter plot is the graphic display of the correlation coefficient. If there is a perfect positive relation
between two constructs (a correlation coefficient of +1), the dots form a perfectly straight
line with an upward slope. For a correlation coefficient of -1 the dots form a perfectly
straight line with a downward slope. No relation (a correlation coefficient of 0) between two
constructs results in an undefined shape.
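The three situations described above can be checked with a small calculation. The sketch assumes the Pearson product-moment correlation (the study material does not name a particular coefficient) and uses made-up scores.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

scores = [1, 2, 3, 4, 5]
rising = pearson_r(scores, [2, 4, 6, 8, 10])    # straight line, upward slope
falling = pearson_r(scores, [10, 8, 6, 4, 2])   # straight line, downward slope
unrelated = pearson_r(scores, [3, 1, 4, 1, 3])  # no pattern

print(round(rising, 2), round(falling, 2), round(abs(unrelated), 2))  # 1.0 -1.0 0.0
```

The sign of the result gives the direction of the relation and its size gives the strength.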
3 Using correlations in item analysis
If the item correlates strongly with the total score we know that it measures more or less
the same thing as the other items
Action To measure differences between people our items need to show some variance, but
even if the items show lots of variance the scale may not measure anything in particular.
Ensure that each item in the scale measures more or less the same thing and that the
items are not too divergent. You want an item to discriminate between high and low scorers
because it shows that the item measures more or less the same thing as the other items in
the scale.
Case    Item 3    Total score
1       4         53
2       1         12
3       5         49
4       4         40
5       2         16
(The data sheet has a column for each of the 12 items; only the scores on item 3 are shown)
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
[Scatter plot: Relation between the scores on item 3 and the scale total]
Each dot on the scatterplot represents a person. The dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score - the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
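Instead of judging the scatter plot by eye, the item-total correlation can be calculated. The sketch below uses the five cases from the data sheet above (scores on item 3 and total scores) and assumes the Pearson correlation coefficient; with only five cases it is an illustration, not a dependable estimate.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

# Scores of cases 1 to 5 on item 3 and on the whole scale.
item_3_scores = [4, 1, 5, 4, 2]
total_scores = [53, 12, 49, 40, 16]

item_total_r = pearson_r(item_3_scores, total_scores)
print(round(item_total_r, 2))  # 0.94
```

A value this close to +1 agrees with the impression that item 3 discriminates well between high and low scorers. The same calculation would be repeated for each of the 12 items.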
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the wording of the item is to blame, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list of items that don't show much variance and the
other list of items that don't discriminate well.
Your 8 item scale should be more coherent and have a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 51 Evaluate the reliability of the rating scale
Activity 52 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
How consistently the questionnaire measures that which it is supposed to measure
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
21 Test-retest reliability
Test-retest reliability shows how consistent the results of a questionnaire are over different
occasions. Administer the same questionnaire to the same group on two consecutive occasions.
The two sets of scores are correlated
and the correlation coefficient represents the degree of test-retest reliability. The closer the
correlation coefficient is to 1, the more consistent the results. Test-retest reliability thus indicates
stability or consistency of scores over time.
A perfect correlation does not indicate that the scores on the second occasion were identical
to those on the first, only that each person's
relative position to that of the others in the group stays the same. The time interval should
be at least several days to reduce the possibility of effects such as familiarity with the type
of items or respondents remembering their answers.
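The point that a perfect correlation does not require identical scores can be shown with a small made-up example, in which every person scores three points higher on the second occasion. The Pearson correlation coefficient is assumed and all scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

# Hypothetical total scores of the same five people on two occasions.
first_occasion = [10, 14, 18, 22, 30]
second_occasion = [score + 3 for score in first_occasion]  # everyone gains 3 points

test_retest_r = pearson_r(first_occasion, second_occasion)
print(round(test_retest_r, 2))  # 1.0, although no score stayed the same
```

The coefficient is 1 because everybody keeps the same position relative to the rest of the group.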
22 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are correlated.
The closer the correlation coefficient (or reliability coefficient) is to 1, the greater the extent
to which the forms are indeed equivalent and thus measure the same attribute. Alternate-forms
reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
23 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the halves is used to estimate the
reliability of the whole questionnaire. This estimate is called the split-half reliability and it measures the
degree of equivalence between the two halves, that is, the extent to which they measure the
same attribute. It reflects the consistency and indicates the degree of relatedness of the
items. This type of reliability is therefore also regarded as a measure of internal
consistency: the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 51 Evaluate the reliability of the rating scale
Action You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action Re-number the eight items of your rating scale from 1 to 8. Divide the rating scale into
two halves by grouping the odd items together and the even items together. For each
person calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Person    Odd items: 1  3  5  7    Total score    Even items: 2  4  6  8    Total score
1
2
3
4
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person take the total score on the odd items and the total score on the even
items and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
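The odd-even procedure can also be worked through by calculation rather than by drawing. The sketch below uses made-up answers of four people to an eight-item scale and assumes the Pearson correlation coefficient. The last step, the Spearman-Brown formula for estimating the reliability of the whole scale from the correlation between the halves, is a common convention that is not named in the study material and is included here as an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

# Hypothetical answers: one row per person, items 1 to 8 from left to right.
responses = [
    [5, 4, 5, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2, 1, 1],
    [3, 3, 4, 3, 3, 2, 3, 3],
    [4, 5, 3, 4, 4, 5, 4, 4],
]

# Items 1, 3, 5, 7 sit at list positions 0, 2, 4, 6; items 2, 4, 6, 8 at 1, 3, 5, 7.
odd_totals = [sum(row[0::2]) for row in responses]   # [19, 6, 13, 15]
even_totals = [sum(row[1::2]) for row in responses]  # [17, 6, 11, 18]

# Correlation between the halves: reliability estimate for either half.
half_r = pearson_r(odd_totals, even_totals)

# Assumed Spearman-Brown step: estimated reliability of the full eight-item scale.
whole_scale_r = 2 * half_r / (1 + half_r)

print(round(half_r, 2), round(whole_scale_r, 2))
```

Splitting the items differently (for example first four against last four) would give a somewhat different coefficient, which is the limitation of the method mentioned above.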
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that is,
the extent to which the scores
can be used for the intended purpose. There are three categories of validity
evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity) they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some current behaviour or
status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them. At the same time
you would ask psychiatrists or clinical psychologists to classify these patients according to
type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire to predict some
future performance of individuals. For example, to select candidates for entrance into this
course, take a representative group of students applying for the course and administer your
questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the questionnaire results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
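A validity coefficient for the selection example could be calculated along the following lines. The questionnaire scores and examination marks are invented for the illustration, and the Pearson correlation coefficient is assumed.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

# Hypothetical data for five students: questionnaire score at application (predictor)
# and examination mark at the end of the course (criterion).
questionnaire_scores = [12, 18, 25, 30, 35]
examination_marks = [45, 52, 60, 71, 78]

validity_coefficient = pearson_r(questionnaire_scores, examination_marks)
print(round(validity_coefficient, 2))
```

For concurrent validity the calculation is the same; only the timing of the criterion measure differs.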
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the theoretical construct
it aims to measure. Construct validity cannot be expressed in terms of a single validity
coefficient. You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated you would not expect a
high correlation
Activity 52 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire.
Criterion-related validity - how well a
questionnaire estimates an individual's position or performance on some outcome measure
Construct validity - the extent to which conclusions can be made about a theoretical construct that underlies
the behaviours measured by the questionnaire
Action Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 61 Discuss the process of developing the questionnaire
Activity 62 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire.
Give instructions for the administration and scoring of the
questionnaire and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
21 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country and the age group for which the questionnaire has been developed where these
are relevant to the subject matter of the questionnaire. A brief description of the design of
the questionnaire should be provided. The type of items should also be indicated, for example
multiple-choice items or rating scales.
22 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described and
it should be indicated to what extent they are representative in terms of those characteristics that
define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring.
Where items have a correct answer, a response could score a 1 or a 0. For rating scales,
for example on a five-point rating scale, 1 = do not agree at all
and 5 = strong agreement with the statement. The total score indicates the person's
attitude towards the topic under investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
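The scoring instructions above can be illustrated with a minimal sketch. It assumes a five-point rating scale and assumes that reverse scoring is applied to negatively worded items, so that a high total always points in the same direction. The function names, the item numbers and the choice of reversed items are hypothetical and only serve the illustration.

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Mirror a rating so that 1 becomes 5, 2 becomes 4, and so on."""
    return scale_max + scale_min - response


def total_score(responses, reversed_items=()):
    """Sum the ratings, reverse-scoring the items listed in reversed_items.

    responses: dict mapping item number -> rating on the five-point scale.
    reversed_items: item numbers assumed to be negatively worded.
    """
    total = 0
    for item, rating in responses.items():
        total += reverse_score(rating) if item in reversed_items else rating
    return total


# Hypothetical respondent; items 2 and 4 are assumed to be negatively worded.
answers = {1: 5, 2: 1, 3: 4, 4: 2}
print(total_score(answers, reversed_items={2, 4}))  # 5 + 5 + 4 + 4 = 18
```

A manual would state explicitly which items, if any, have to be reverse scored before the total is calculated.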
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address
the purpose of the questionnaire, the confidentiality of the information provided in the
questionnaire, and how to complete the questionnaire (that is, how the
questions should be tackled). They are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing forbidding, sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic but the topic is not covered in full
Rate 3: focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) introduce a particular response style or bias
(2) force respondents to respond in an inefficient manner
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
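Because the rating for the logical sequence depends on the rating already given for the purpose areas, the rule can be written out as a small decision function. This is only a sketch of the rubric as summarised here; the function name and parameter names are hypothetical.

```python
def rate_logical_sequence(rating_purpose_areas, in_logical_sequence):
    """Return the logical-sequence rating (0 to 4).

    rating_purpose_areas: rating already assigned for the purpose areas
        covered in the manual (0, 1, 2 or 3).
    in_logical_sequence: True if the covered purpose areas appear in the
        order (a), (b), (c).
    """
    if rating_purpose_areas not in (0, 1, 2, 3):
        raise ValueError("the purpose-area rating must be 0, 1, 2 or 3")
    if rating_purpose_areas <= 1:
        return 0  # fewer than two purpose areas are covered
    if rating_purpose_areas == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


print(rate_logical_sequence(2, True))   # 2
print(rate_logical_sequence(3, False))  # 3
```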
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of
questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Data sheet (items 1 to 12; only the scores for item 3 are shown)
Case   Item 3   Total score
1      4        53
2      1        12
3      5        49
4      4        40
5      2        16
Item 3 does seem to be pretty good at discriminating between high and low scorers.
Checking whether item scores correspond with the total score is called item-total correlation.
Professional questionnaire constructors usually calculate a correlation coefficient (an
index of how strongly two variables are related) to establish how strong each item-total
correlation is.
Figure: Relation between the scores on item 3 and the scale total (scatterplot)
Each dot on the scatterplot represents a person. Dots are arranged roughly in a diagonal
line from bottom left to top right. This means that there is a strong correlation between the
item score and the total score: the item discriminates well. If the dots don't seem to form a
pattern at all, then there is no correlation; if the line seems to go from top left to bottom
right, then there is a negative correlation (the item does discriminate, but the wrong way
round, so it is no good). You will have to draw 12 scatter plots (one for each of the 12
items).
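For readers who want the number behind the scatterplot, the item-total correlation can be sketched as follows. The data are the item 3 scores and total scores of the five cases in the data sheet above; the Pearson correlation coefficient is assumed as the index, and the function name is hypothetical. Note that the total score here still includes item 3 itself.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Item 3 scores and total scores for cases 1 to 5, as read from the data sheet.
item_3 = [4, 1, 5, 4, 2]
totals = [53, 12, 49, 40, 16]

r = pearson_r(item_3, totals)
print(round(r, 2))  # 0.94, a strong positive item-total correlation
```

With only five cases the coefficient is very unstable, so it should be read as an illustration of the calculation and not as evidence about the item.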
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes it is the wording of the item, but
sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list for items that don't show much variance and the
other list for items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
Outcome product
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first,
only that each person's relative position to that of the others in the group stays the same.
The time interval should be at least several days to reduce the possibility of effects such
as familiarity with the type of items or respondents remembering their answers.
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the two halves tends to underestimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire is called the split-half reliability,
and it measures the degree of equivalence between the two halves, that is, the extent to
which they measure the same attribute. It reflects the consistency and indicates the degree
of relatedness of the items. This type of reliability is therefore also regarded as a measure
of internal consistency, and the closer the split-half reliability is to 1, the higher the
internal consistency of the questionnaire.
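These notes do not say how the correlation between the two halves is stepped up to the reliability of the whole questionnaire. One standard way, offered here as an assumption and not as the method prescribed by the course, is the Spearman-Brown formula for a test of double length, which presumes that the two halves are parallel. The function name is hypothetical.

```python
def spearman_brown(r_half):
    """Estimate whole-questionnaire reliability from the half-test correlation.

    Spearman-Brown formula for doubling the length: 2r / (1 + r).
    """
    return 2 * r_half / (1 + r_half)


# A correlation of 0.60 between the halves corresponds to an estimated
# reliability of 0.75 for the full-length questionnaire.
print(round(spearman_brown(0.60), 2))  # 0.75
```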
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree of
equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items, re-number them from 1 to 8, and divide the rating scale
into two halves by grouping the odd items and the even items together. For each
person, calculate the total score for the odd items and for the even items.
Case   Items 1 3 5 7   Total score (odd)   Items 2 4 6 8   Total score (even)
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatterplot.
If your scatter plot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that
is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. Two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course
and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion; the resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - the extent to which conclusions can be made about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
25
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 61 Discuss the process of developing the questionnaire
Activity 62 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire Report the process of
analysing and selecting the items as well as the reliability and validity of the questionnaire
Give instructions for the administration of the questionnaire for the scoring of the
questionnaire and some guidelines on how to interpret the results
2 Structure of a manual
Aim and design
Aim
Target population
Design of the questionnaire
Properties of the questionnaire
Item analysis and item selection
Reliability
Validity
Procedures for administration scoring and interpretation
Instructions for administration
Instructions for scoring
Guidelines for interpretation
21 Aim and design
Should be clear what the questionnaire measures and how this information can be used
The aim of the questionnaire determines for whom it will be used Describe characteristics
of the target population that are relevant to the aim of the questionnaire Important to
state for which country this questionnaire has been developed and age as the subject
26
matter of this questionnaire A brief description of the design of the questionnaire should
be provided The type of items should also be indicated - multiple-choice items or rating
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate to
what extent they are representative in terms of those characteristics that define the target
population. Also mention when and under which circumstances the questionnaire was
administered to them. Describe each technique used for item analysis and indicate which
criteria were used to justify the inclusion or exclusion of items in the item selection
process. It is important for the user to know how reliable or consistent the questionnaire is.
Give a brief description of the method used to determine reliability and justify why this
method was used. The estimated reliability coefficient is then evaluated in terms of an
acceptable level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring. For
items with a correct answer, a response could score 1 if correct or 0 if incorrect. For rating
scales, for example a five-point rating scale, 1 = do not agree at all and 5 = strong agreement
with the statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
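The scoring instructions for a five-point rating scale can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the function name, the item numbers and the choice of which items are negatively worded are all hypothetical, and reverse scoring is taken to mean mirroring the rating (1 becomes 5, 2 becomes 4, and so on).

```python
def score_rating_scale(responses, reverse_items=(), points=5):
    """Return the total (summated) score for one respondent.

    responses: dict mapping item number -> rating from 1 to `points`.
    reverse_items: item numbers assumed to be negatively worded (hypothetical).
    """
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= points:
            raise ValueError(f"item {item}: rating {rating} is outside 1..{points}")
        # Reverse scoring mirrors the rating: 1 <-> 5, 2 <-> 4, 3 stays 3.
        total += (points + 1 - rating) if item in reverse_items else rating
    return total


# Hypothetical respondent; items 2 and 4 are assumed to be negatively worded.
answers = {1: 5, 2: 1, 3: 4, 4: 2}
total = score_rating_scale(answers, reverse_items={2, 4})
print(total)  # 5 + 5 + 4 + 4 = 18
```

A manual would also have to state which items are reverse scored, since the total score is only meaningful if every item points in the same direction.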
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of the
information provided in the questionnaire, and how to complete the questionnaire (that is,
explain how the questions should be tackled). They can be provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
respondents from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focusses on the topic
Rate 2: focusses on the topic but the topic is not covered in full
Rate 3: focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two problems
Rate 3: if the sequence seems to cause any one problem
Rate 4: if the sequence does not seem to cause any of the problems
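The item sequence rating amounts to counting problems: a usable questionnaire starts at 4 and loses one point for each of the three sequence problems. A minimal sketch of that rule follows; the function and parameter names are illustrative and do not come from the rating scale itself.

```python
def rate_item_sequence(has_relevant_items, causes_bias, forces_inefficiency,
                       causes_discomfort):
    """Return the item sequence rating (0 to 4) from three yes/no judgements."""
    if not has_relevant_items:
        # No items, or all items irrelevant: there is no sequence to evaluate.
        return 0
    problems = sum([causes_bias, forces_inefficiency, causes_discomfort])
    # All three problems -> 1, two -> 2, one -> 3, none -> 4.
    return 4 - problems


print(rate_item_sequence(True, causes_bias=True, forces_inefficiency=False,
                         causes_discomfort=False))  # 3
```

The judgements themselves (does the sequence introduce bias, and so on) remain subjective; the sketch only shows how they combine into one rating.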
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: structure limits its functionality with regard to two
Rate 2: structure limits its functionality with regard to one
Rate 3: structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
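Because this rating depends on the rating for the purpose areas covered, it can help to see the two combined. The sketch below is only an illustration: the labels "nature", "functionality" and "instructions" are shorthand invented here for purpose areas (a), (b) and (c), and the input is assumed to be the order in which a manual presents them.

```python
# Purpose areas in the logical order (a), (b), (c); the labels are shorthand.
LOGICAL_ORDER = ["nature", "functionality", "instructions"]


def rate_logical_sequence(areas_as_presented):
    """Return the logical sequence rating (0 to 4) for a manual."""
    # Keep the first appearance of each recognised purpose area.
    seen = []
    for area in areas_as_presented:
        if area in LOGICAL_ORDER and area not in seen:
            seen.append(area)

    areas_covered = len(seen)  # this is the rating for purpose areas covered
    if areas_covered <= 1:
        return 0

    in_logical_order = seen == sorted(seen, key=LOGICAL_ORDER.index)
    if areas_covered == 2:
        return 2 if in_logical_order else 1
    return 4 if in_logical_order else 3


print(rate_logical_sequence(["nature", "functionality", "instructions"]))  # 4
print(rate_logical_sequence(["instructions", "nature", "functionality"]))  # 3
print(rate_logical_sequence(["functionality", "nature"]))                  # 1
```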
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
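1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups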
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
The MPQ consists of 50 multiple-choice items of the form: If <description of work situation>,
I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 were really good but
that the remaining 82 items did not meet the criteria to be included. Thirteen were excluded,
thus 50 items were retained. The validity coefficient of 0.91 is high.
Identify what appear to be the worst items in your scale in terms of failure to discriminate.
The reason why items don't discriminate is usually that they measure something
different from the other items in the scale. Sometimes the cause is the wording of the item,
but sometimes it seems inexplicable and one just has to accept that it is so.
Action: Compile a final version of your questionnaire. Your scale should have
8 items, so discard 4 items. Study the list for items that don't show much variance and the
other list for items that don't discriminate well.
Your 8-item scale is more coherent and has a greater degree of reliability.
5 Evaluate reliability and validity
An evaluation of the reliability of the questionnaire
An evaluation of the validity of the questionnaire
Method
Activity 5.1 Evaluate the reliability of the rating scale
Activity 5.2 Evaluate the validity of the questionnaire
Resource reference
Correlation coefficient
Reliability
Validity
Introduction
The results should be reliable, that is, the questionnaire should measure consistently. You
will evaluate the reliability of the final version of the rating scale included in your
questionnaire. The interpretations based on the results should also be valid, that is, the
questionnaire should measure what it claims to measure.
Reliability
Reliability is how consistently the questionnaire measures that which it is supposed to measure.
1 Measurement error and reliability
Various conditions might affect the results of the questionnaire, e.g. the occasion on which
the questionnaire is administered or the sample of items in the questionnaire. Their effect
on the results is unpredictable and inconsistent. These irrelevant conditions are called
unsystematic sources of variation. Reliability refers to the consistency of results
over different administrations involving different occasions, test forms, etc.
A statistical index of reliability is the reliability coefficient, which ranges between 0 and 1.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient close to 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not indicate that the second scores were identical to the first,
only that each person's position relative to the others in the group stays the same. The time
interval should be at least several days to reduce the possibility of effects such as
familiarity with the type of items or respondents remembering their answers.
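The point that a perfect correlation does not require identical scores can be checked with a small sketch. The scores are made up for illustration, and the Pearson correlation is computed by hand so that the example needs nothing beyond the standard library; the source does not name the type of correlation coefficient, so Pearson is an assumption here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores of five people on two occasions.
first_occasion = [12, 15, 18, 20, 25]
# Everyone scores 3 points higher the second time, so the rank order is unchanged.
second_occasion = [score + 3 for score in first_occasion]

r = pearson(first_occasion, second_occasion)
print(round(r, 3))  # 1.0
```

Every score changed, yet the test-retest reliability is perfect because each person kept the same position relative to the others.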
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half: two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the halves is used to estimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire is called the split-half reliability.
It measures the degree of equivalence between the two halves, that is, the extent to which
they measure the same attribute. It reflects the consistency and indicates the degree of
relatedness of the items. This type of reliability is therefore also regarded as a measure of
the internal consistency, and the closer the split-half reliability is to 1, the higher the
internal consistency of the questionnaire.
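The odd-even procedure can be sketched in a few lines. The ratings below are invented, and the step from the half-test correlation to the whole questionnaire uses the Spearman-Brown formula, which is the usual choice for this adjustment but is an assumption here because the text does not name a formula.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings: one row per person, eight items on a five-point scale.
ratings = [
    [4, 5, 4, 4, 5, 4, 5, 5],
    [2, 1, 2, 2, 1, 2, 1, 2],
    [3, 3, 4, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5, 5, 4],
    [1, 2, 1, 2, 2, 1, 2, 1],
]

odd_totals = [sum(row[0::2]) for row in ratings]   # items 1, 3, 5, 7
even_totals = [sum(row[1::2]) for row in ratings]  # items 2, 4, 6, 8

# Correlation between the halves: reliability of either half.
r_half = pearson(odd_totals, even_totals)
# Spearman-Brown estimate for a questionnaire twice as long as each half.
r_whole = 2 * r_half / (1 + r_half)

print(odd_totals, even_totals)
print(round(r_half, 3), round(r_whole, 3))
```

For any positive half-test correlation below 1 the whole-questionnaire estimate is higher than the correlation itself, which matches the remark that a shorter questionnaire is generally less reliable.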
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains to some extent depends on the items included
in each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person, calculate the total score for the odd items and for the even items.
Person   Odd items: 1  3  5  7   Total score   Even items: 2  4  6  8   Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and where the two meet, make a dot on the graph. Draw a straight line
resembling the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
Validity is the extent to which the questionnaire measures what it claims to measure, that
is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them.
At the same time you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course
and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
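As a rough illustration of the predictive validity example, the validity coefficient can be computed as the correlation between questionnaire scores and later examination marks. All numbers below are invented, and Pearson's coefficient is assumed because the text does not specify which correlation to use.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five applicants.
questionnaire_scores = [10, 14, 18, 22, 30]  # obtained at application
examination_marks = [45, 52, 60, 58, 75]     # criterion, obtained at the end of the course

validity_coefficient = pearson(questionnaire_scores, examination_marks)
print(round(validity_coefficient, 2))
```

For concurrent validity the calculation is the same; the only difference is that the criterion measures are collected at about the same time as the questionnaire scores.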
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms
of a construct to also obtain significantly different scores on a questionnaire measuring
this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity of the questionnaire - how well a
questionnaire estimates an individual's position or performance on some outcome measure.
Construct validity - whether one can draw conclusions about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual: The purpose and structure of a manual
Manual: The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration of the questionnaire, for the scoring of the
questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state for which country the questionnaire has been developed, and for which age group where
age is relevant to the subject matter of the questionnaire. A brief description of the design
of the questionnaire should be provided. The type of items should also be indicated, for
example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate to
what extent they are representative in terms of those characteristics that define the target
population. Also mention when and under which circumstances the questionnaire was
administered to them. Describe each technique used for item analysis and indicate which
criteria were used to justify the inclusion or exclusion of items in the item selection
process. It is important for the user to know how reliable or consistent the questionnaire is.
Give a brief description of the method used to determine reliability and justify why this
method was used. The estimated reliability coefficient is then evaluated in terms of an
acceptable level of reliability. Identify the category of validity (be it content validity,
criterion-related validity or construct validity) that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring. For
items with a correct answer, a response could score 1 if correct or 0 if incorrect. For rating
scales, for example a five-point rating scale, 1 = do not agree at all and 5 = strong agreement
with the statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 71 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire (that
is, how the questions should be tackled). They are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: the purpose is not explained
Rate 1: the purpose is explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: the purpose is explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: confidentiality is not mentioned
Rate 1: states that the information is absolutely confidential or that identity will not be disclosed
Rate 2: states that the information is absolutely confidential and that identity will not be disclosed, but respondents are required to provide their names
Rate 3: states that the information is absolutely confidential and that identity will not be disclosed, and respondents are not required to provide their names
1.3 Instructions for how to handle questions
Rate 0: it is not explained how the questionnaire should be completed
Rate 1: it is explained, but the explanation does not hold for all items
Rate 2: it is explained, and the explanation holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: the questionnaire does not focus on the topic that it is supposed to cover
Rate 1: only part of the questionnaire focusses on the topic
Rate 2: the questionnaire focusses on the topic, but the topic is not covered in full
Rate 3: the questionnaire focusses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to cause all three of the following problems
(1) it introduces a particular response style or bias
(2) it forces respondents to respond in an inefficient manner
(3) it makes respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of the problems
Rate 3: if the sequence seems to cause any one of the problems
Rate 4: if the sequence does not seem to cause any of the problems
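The item-sequence rule above is a count of problems turned into a rating. A minimal sketch in Python, assuming the evaluator has already judged each of the three problems as present or absent; the function name and its parameters are hypothetical and are not part of the rating scale itself:

```python
def rate_item_sequence(has_relevant_items, introduces_bias,
                       forces_inefficiency, causes_discomfort):
    """Return the item-sequence rating (0 to 4).

    All four arguments are yes/no judgements made by the evaluator.
    """
    # Rate 0: there are no items, or all items are irrelevant.
    if not has_relevant_items:
        return 0
    # Count how many of the three sequence problems were observed.
    problems = sum([bool(introduces_bias),
                    bool(forces_inefficiency),
                    bool(causes_discomfort)])
    # Three problems -> 1, two -> 2, one -> 3, none -> 4.
    return 4 - problems


# Example: the sequence only makes respondents uncomfortable.
print(rate_item_sequence(True, False, False, True))
```

The same counting pattern applies to the functionality rating, where the rating rises as fewer of the three functions are limited.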
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (the kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: the manual provides no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: the manual provides information about one purpose area
Rate 2: the manual provides information about any two purpose areas
Rate 3: the manual provides information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
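Rating 1.2 depends on the rating already given in 1.1, which is easy to get wrong when working by hand. The dependency can be sketched as follows, assuming the coverage rating from 1.1 and a yes/no judgement about the order of the purpose areas are available; the function and parameter names are hypothetical:

```python
def rate_logical_sequence(coverage_rating, in_logical_sequence):
    """Return rating 1.2 (0 to 4) from rating 1.1 and the order judgement.

    coverage_rating     -- the rating given in 1.1 (0, 1, 2 or 3)
    in_logical_sequence -- True if the purpose areas that are covered
                           appear in the order (a), (b), (c)
    """
    if coverage_rating not in (0, 1, 2, 3):
        raise ValueError("rating 1.1 must be 0, 1, 2 or 3")
    # With fewer than two purpose areas there is no sequence to judge.
    if coverage_rating <= 1:
        return 0
    if coverage_rating == 2:
        return 2 if in_logical_sequence else 1
    # coverage_rating == 3: all three purpose areas are covered.
    return 4 if in_logical_sequence else 3


# Example: all three purpose areas are covered, but out of order.
print(rate_logical_sequence(3, False))
```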
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: the content topics of none of the three purpose areas are kept in their logical groups
Rate 2: the content topics of one purpose area are kept in their logical group
Rate 3: the content topics of two purpose areas are kept in their logical groups
Rate 4: the content topics of all three purpose areas are kept in their logical groups
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains all three of the following
(a) difficult language
(b) ambiguous statements
(c) grammar and spelling mistakes
Rate 2: if the manual contains any two of (a), (b) and (c)
Rate 3: if the manual contains any one of (a), (b) and (c)
Rate 4: if the manual contains none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: the manual describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes the target population, but it is not appropriate
Rate 2: the manual describes the target population, and it is appropriate
2.3 The design of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: if the manual describes any two
Rate 3: if the manual describes all three
2.4 The sample used to test the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes the sample used, but the characteristics of the sample group
differ from those of the target population
Rate 2: the manual describes the sample used, and the characteristics of the sample group
correspond to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: the manual does not provide sufficient information about the analysis and
selection of questionnaire items
Rate 1: the manual describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: the manual describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: the manual does any two of (1), (2) and (3)
Rate 3: the manual does all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: the manual does any two of (1), (2) and (3)
Rate 3: the manual does all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: the manual does not provide instructions
Rate 1: the instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: the instructions include three or four
Rate 3: the instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: the manual does not provide instructions
Rate 1: the manual does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: the manual does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
The MPQ consists of 50 multiple-choice items of the form: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students were used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed that 63 items were really
good but that the remaining 82 items did not meet the criteria to be included. Thirteen were
excluded, thus 50 items were retained. The validity coefficient of 0.91 is high.
Unreliable questionnaire - reliability coefficient close to 0
Reliable questionnaire - reliability coefficient of 1
The closer the value of the reliability coefficient is to 1, the more reliable the test.
2 Different types of reliability
2.1 Test-retest reliability
To determine how consistent the results of a questionnaire are over different occasions,
administer the same questionnaire to the same group on two consecutive occasions. The two
sets of scores are correlated, and the correlation coefficient represents the degree of
test-retest reliability. The closer the correlation coefficient is to 1, the more consistent
the results. Test-retest reliability thus indicates stability or consistency of scores over time.
A perfect correlation does not mean that the scores on the second occasion were identical
to those on the first; it means that each person's position relative to the others in the
group stayed the same. The time interval should be at least several days to reduce the
possibility of effects such as familiarity with the type of items or respondents
remembering their answers.
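The point that a perfect correlation does not require identical scores can be checked numerically. The sketch below computes a Pearson correlation coefficient by hand for two administrations; the scores are invented for illustration and the function name `pearson_r` is an assumption, not something taken from the study material:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: this fails with a division by zero if either list has no spread.
    return cross / sqrt(ss_x * ss_y)


# Hypothetical total scores for four people on two occasions.
first_occasion = [10, 14, 18, 22]
second_occasion = [13, 17, 21, 25]  # everyone scored 3 points higher

r = pearson_r(first_occasion, second_occasion)
print(round(r, 2))  # 1.0: the scores differ, but the relative positions do not
```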
2.2 Alternate-forms reliability
Two forms of the same questionnaire are often developed. To know how consistent the
results are over different forms, obtain an estimate of the alternate-forms reliability. Both
forms are administered to the same group on two consecutive occasions, and the scores are
correlated. The closer the correlation coefficient (or reliability coefficient) is to 1, the
greater the extent to which the forms are indeed equivalent and thus measure the same
attribute. Alternate-forms reliability is thus a measure of equivalence.
Alternate-forms reliability can also be used to determine the consistency of results over
different occasions, or stability over time. This offers some solution to possible memory
effects experienced with test-retest reliability. A disadvantage of alternate-forms reliability
is that it is expensive and time-consuming, and that it is difficult to produce truly parallel
forms.
2.3 Split-half reliability
A single questionnaire is administered only once. This questionnaire is then divided into two
parts, regarded as two parallel halves. Each person has a total score on the one half and a
total score on the second half - two sets of scores that are then correlated. The correlation
is an estimate of the reliability of either of the two halves and is thus a measure of
equivalence.
A common method used to divide a questionnaire is to compare scores on the odd items
with scores on the even items. A shorter questionnaire is generally less reliable, so the
correlation between the two halves tends to underestimate the reliability of the whole
questionnaire. The reliability of the whole questionnaire is called the split-half reliability,
and it measures the degree of equivalence between the two halves, that is, the extent to
which they measure the same attribute. It reflects the consistency and indicates the degree
of relatedness of the items. This type of reliability is therefore also regarded as a measure
of the internal consistency, and the closer the split-half reliability is to 1, the higher the
internal consistency of the questionnaire.
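The odd-even procedure can be sketched in a few lines of Python. The text above does not say how the correlation between the halves is adjusted to estimate the reliability of the whole questionnaire; the sketch assumes the Spearman-Brown formula, a common choice, which estimates the full-length reliability as 2r / (1 + r), where r is the correlation between the halves. The response data and function names are invented for illustration:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def split_half_reliability(responses):
    """Return (half correlation, Spearman-Brown estimate) for an odd-even split.

    responses -- one list of item scores per person, items in questionnaire order
    """
    # Items 1, 3, 5, ... sit at indices 0, 2, 4, ...; items 2, 4, 6, ... at 1, 3, 5, ...
    odd_totals = [sum(person[0::2]) for person in responses]
    even_totals = [sum(person[1::2]) for person in responses]
    r_half = pearson_r(odd_totals, even_totals)
    # Assumption: Spearman-Brown correction for doubling the test length.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical responses of four people to eight five-point items.
responses = [
    [4, 5, 4, 4, 5, 4, 4, 5],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5, 5, 5],
]
r_half, r_full = split_half_reliability(responses)
print(round(r_half, 3), round(r_full, 3))
```

For a positive half correlation below 1, the corrected estimate is always higher than the half correlation, which matches the observation that a shorter questionnaire is generally less reliable.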
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
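The two benchmark values above can be turned into a small lookup. This is only a sketch of the guideline as stated here: the cut-offs 0.90 and 0.70 come from the text, while the function name, the wording of the returned labels and the treatment of values between the two benchmarks are assumptions:

```python
def evaluate_reliability(coefficient):
    """Describe a reliability coefficient against the two benchmarks in the text."""
    if not 0 <= coefficient <= 1:
        raise ValueError("a reliability coefficient is expected between 0 and 1")
    if coefficient > 0.90:
        # Level the text requires for tests such as intelligence tests.
        return "above 0.90: acceptable for a psychological test"
    if coefficient >= 0.70:
        # Assumption: values from 0.70 up to 0.90 are treated alike.
        return "useful in combination with other information"
    return "below the levels mentioned in the text"


print(evaluate_reliability(0.93))
print(evaluate_reliability(0.72))
```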
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the
degree of equivalence between two halves of the rating scale. A limitation of this method is
that the reliability coefficient that one obtains depends to some extent on the items
included in each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person, calculate the total score for the odd items and for the even items.
Data sheet for two halves of the rating scale
Person | Odd items 1 3 5 7 | Total score | Even items 2 4 6 8 | Total score
1      |                   |             |                    |
2      |                   |             |                    |
3      |                   |             |                    |
4      |                   |             |                    |
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line that follows
the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure,
that is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity
are concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer your questionnaire to them.
At the same time you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the results and the
measures on the criterion. The resulting correlation coefficient is known as the validity
coefficient.
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms
of a construct to also obtain significantly different scores on a questionnaire measuring
this construct.
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity indicates how well a questionnaire estimates
an individual's position or performance on some outcome measure. Construct validity
allows you to make conclusions about a theoretical construct that underlies the behaviours
measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual: The purpose and structure of a manual
Manual: The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration and the scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country and the age group for which the questionnaire has been developed, where
these are relevant to the subject matter of the questionnaire. A brief description of the
design of the questionnaire should be provided. The type of items should also be indicated,
for example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those
characteristics that define the target population. It should also be mentioned when and
under which circumstances the questionnaire was administered to them. Describe each
technique used for item analysis and indicate which criteria were used to justify the
inclusion or exclusion of items in the item selection process. It is important for the user to
know how reliable or consistent the questionnaire is. Give a brief description of the method
used to determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (content validity, criterion-related validity or construct validity) that
is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administering the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents complete it on their own or need supervision, the material needed, and
how to deal with a person who asks for an explanation. Provide instructions for scoring.
For example, a correct answer could score 1 and an incorrect answer 0. For rating scales,
on a five-point rating scale 1 = do not agree at all and 5 = strongly agree with the
statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
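Scoring instructions for a rating scale are easier to follow with a worked case. The sketch below totals the responses on a five-point scale and reverse-scores negatively worded items so that a high total always points in the same direction. The responses, the choice of which items are reversed and the function names are all hypothetical:

```python
def reverse_score(response, scale_min=1, scale_max=5):
    """Mirror a response on the scale: 1 becomes 5, 2 becomes 4, 3 stays 3."""
    return scale_min + scale_max - response


def total_score(responses, reversed_items=()):
    """Sum the responses, reverse-scoring the items at the given 0-based positions."""
    total = 0
    for position, response in enumerate(responses):
        if not 1 <= response <= 5:
            raise ValueError("responses must lie on the five-point scale (1 to 5)")
        if position in reversed_items:
            response = reverse_score(response)
        total += response
    return total


# Hypothetical respondent: items 3 and 4 are negatively worded.
responses = [5, 4, 1, 2]
print(total_score(responses, reversed_items={2, 3}))  # 5 + 4 + 5 + 4 = 18
```

A manual would state which items are reverse-scored; without that information the total score cannot be interpreted consistently.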
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 71 Explore a questionnaire rating scale
Activity 72 Use the questionnaire rating scale to evaluate a questionnaire
Activity 73 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
27
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 71 Explore a questionnaire rating scale
Action The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address
the purpose of the questionnaire confidentiality of the information provided in the
questionnaire and how to complete the questionnaire explain how the questionnaire
questions should be tackled Provided in a cover letter
11 The purpose of the questionnaire
Rate 0 purpose not explained
Rate 1 explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 explained in terms of both statements
12 The confidentiality of information provided
Rate 0 not mentioned
Rate 1 absolutely confidential or identity will not be disclosed
Rate 2 absolutely confidential and identity will not be disclosed required to provide name
Rate 3 absolutely confidential and identity will not be disclosed not required to provide
name
13 Instructions for how to handle questions
Rate 0 not explained how the questionnaire should be completed
Rate 1 is explained but does not hold for all items
Rate 2 is explained and holds for all items
2 Item characteristics
28
All items have to be relevant to the topic presented in understandable language
meaningful for people from different cultural or language backgrounds clear and
unambiguous answerable and non-leading
21 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 none items are irrelevant
22 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 none items
23 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to or be interpreted
differently
Rate 1 most items
Rate 2 some items
Rate 3 none items
24 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 none items
25 Item answerability
Rate 0 all items are unanswerable (ie the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 none items
26 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 none items
29
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire By grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding sensitive and personal questions
towards the end Allow respondents to raise opinions and vent feelings by providing open
ended questions at the end
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: only part of it focuses on the topic
Rate 2: focuses on the topic, but the topic is not covered in full
Rate 3: focuses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias, and
(2) force respondents to respond in an inefficient manner, and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of these problems
Rate 3: if the sequence seems to cause any one of these problems
Rate 4: if the sequence does not seem to cause any of these problems
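The rating rule for item sequence can be written out as a small function. This is only a sketch of the rule as stated in these notes: the function name and the three problem labels are made up for illustration and are not part of the rating scale itself.

```python
# Hypothetical labels for the three sequence problems listed under the
# item-sequence rating; the labels themselves are an assumption.
SEQUENCE_PROBLEMS = {"response_bias", "inefficient", "uncomfortable"}


def rate_item_sequence(has_relevant_items, problems):
    """Return the item-sequence rating (0 to 4).

    has_relevant_items: False if there are no items or all items are irrelevant.
    problems: the set of problem labels the sequence seems to cause.
    """
    if not has_relevant_items:
        return 0
    problems = set(problems)
    unknown = problems - SEQUENCE_PROBLEMS
    if unknown:
        raise ValueError(f"unknown problem labels: {sorted(unknown)}")
    # All three problems -> 1, any two -> 2, any one -> 3, none -> 4.
    return 4 - len(problems)


print(rate_item_sequence(True, {"response_bias", "uncomfortable"}))  # 2
```

Each problem that is present lowers the rating by one, so the rating can be read as "four minus the number of problems", with 0 reserved for questionnaires that have no usable items.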
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose, and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind,
(b) whether the questionnaire will work, and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire), and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not on whether
it has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about any one of the purpose areas
Rate 2: information about any two of the purpose areas
Rate 3: information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
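Because the rating for logical sequence depends on the rating already given for the purpose areas covered, it may help to see the dependency as a small decision function. The sketch below only restates the rule from these notes; the function and parameter names are assumptions made for illustration.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Return the logical-sequence rating (0 to 4).

    purpose_area_rating: the rating (0 to 3) given for the purpose areas
        covered in the manual.
    in_logical_sequence: True if the covered purpose areas appear in the
        order (a), (b), (c).
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("the purpose-area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    # Two areas covered: 1 or 2. Three areas covered: 3 or 4.
    base = 1 if purpose_area_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)


print(rate_logical_sequence(3, False))  # 3
```

A manual that covers all three purpose areas out of order therefore still scores higher (3) than a manual that covers only two areas in the right order (2).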
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
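A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: the content topics of none of the three purpose areas are kept in their logical groups
Rate 2: the content topics of one purpose area are kept in their logical group
Rate 3: the content topics of two purpose areas are kept in their logical groups
Rate 4: the content topics of all three purpose areas are kept in their logical groups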
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language, and
(b) ambiguous statements, and
(c) grammar and spelling mistakes
Rate 2: if the manual contains any two of (a), (b) and (c)
Rate 3: if the manual contains any one of (a), (b) and (c)
Rate 4: if the manual contains none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: the manual describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: the manual describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population, but it is not appropriate
Rate 2: describes the target population, and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: if the manual describes any two of (1), (2) and (3)
Rate 3: if the manual describes all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: does any two of (1), (2) and (3)
Rate 3: does all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: does any two of (1), (2) and (3)
Rate 3: does all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: the instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: the instructions include three or four of these
Rate 3: the instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded thus 50
items were retained. The validity coefficient of 0.91 is high.
items. This type of reliability is therefore also regarded as a measure of internal
consistency: the closer the split-half reliability is to 1, the higher the internal
consistency of the questionnaire.
3 Evaluating reliability
The nature and purpose of the questionnaire determine which type of reliability is
appropriate. For a psychological test such as an intelligence test, the reliability coefficient
should be above 0.90. A reliability coefficient of 0.70 can be useful if the results are used in
combination with other information about the individual or group.
Activity 5.1 Evaluate the reliability of the rating scale
Action: You should be able to distinguish between different types of reliability. The purpose
of the questionnaire determines which type of reliability is appropriate.
The internal consistency of a rating scale is the extent to which the items measure the same
thing. Obtain an estimate of the split-half reliability of this rating scale, that is, the degree
of equivalence between two halves of the rating scale. A limitation of this method is that the
reliability coefficient that one obtains depends to some extent on the items included in
each of the two halves.
Action: Look at these eight items and divide the rating scale into two halves by grouping
the odd items and even items together. Now re-number your items from 1 to 8. For each
person, calculate the total score for the odd items and for the even items.
Person    Odd items (1 3 5 7)    Total score    Even items (2 4 6 8)    Total score
1
2
3
4
Data sheet for two halves of the rating scale
The relation between these two sets of scores will give you an estimate of the reliability of
either of the two halves of the rating scale.
For each person, take the total score on the odd items and the total score on the even
items, and make a dot on the graph where the two meet. Draw a straight line
resembling the shape of the scatterplot.
If your scatterplot has a very undefined shape, the correlation coefficient is close to 0,
indicating a weak relation between the two halves. If the line has an upward slope, the
correlation coefficient falls between 0 and +1. If most dots are close to the line, the
correlation coefficient is close to +1 and there is a fairly strong relation.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer to
1 indicate a more reliable rating scale.
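The odd-even procedure described above can also be worked through numerically instead of graphically. The sketch below is an illustration only: the responses of the four people are invented, the function names are assumptions, and the Pearson correlation is used as the correlation coefficient between the two sets of total scores.

```python
from math import sqrt


def split_half_totals(responses):
    """Return (odd total, even total) for one person's item scores.

    responses: scores for items 1 to 8, in item order.
    """
    odd_total = sum(responses[0::2])   # items 1, 3, 5, 7
    even_total = sum(responses[1::2])  # items 2, 4, 6, 8
    return odd_total, even_total


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores.

    Undefined (division by zero) if either list has no variation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Invented responses of four people to eight items (not real data).
people = [
    [4, 5, 4, 4, 5, 5, 4, 4],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 4, 5, 5],
]

odd_totals, even_totals = zip(*(split_half_totals(p) for p in people))
r = pearson_r(odd_totals, even_totals)
print(f"odd totals:  {odd_totals}")
print(f"even totals: {even_totals}")
print(f"split-half correlation: {r:.2f}")
```

With only four people the coefficient is very unstable, so a real evaluation would use a much larger group; the example is meant to show the steps, not a realistic result.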
Validity
Validity is the extent to which a questionnaire measures what it claims to measure, that is,
the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. Face validity is
based on the subjective evaluation by people who are not necessarily experts, either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity are
concurrent validity and predictive validity.
With concurrent validity, measures on the criterion are obtained at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time you would ask psychiatrists or clinical psychologists to classify these patients
according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your questionnaire
to predict some future performance of individuals. For example, to select candidates for
entrance into this course, take a representative group of students applying for the course
and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the questionnaire
results and the measures on the criterion. The resulting correlation coefficient is known as
the validity coefficient.
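As a rough illustration of the predictive-validity example, the validity coefficient can be computed as the correlation between questionnaire scores and the criterion measures obtained later. The scores and examination marks below are invented, the function name is an assumption, and the Pearson correlation is assumed as the correlation coefficient.

```python
from math import sqrt


def validity_coefficient(scores, criterion):
    """Pearson correlation between questionnaire scores and criterion measures."""
    if len(scores) != len(criterion) or len(scores) < 2:
        raise ValueError("need two equally long lists with at least two values")
    n = len(scores)
    mean_s = sum(scores) / n
    mean_c = sum(criterion) / n
    cov = sum((s - mean_s) * (c - mean_c) for s, c in zip(scores, criterion))
    ss_s = sum((s - mean_s) ** 2 for s in scores)
    ss_c = sum((c - mean_c) ** 2 for c in criterion)
    return cov / sqrt(ss_s * ss_c)


# Invented data: questionnaire scores at application and examination
# marks at the end of the course for the same five students.
questionnaire_scores = [12, 18, 25, 30, 34]
exam_marks = [45, 52, 60, 71, 68]

print(round(validity_coefficient(questionnaire_scores, exam_marks), 2))
```

For concurrent validity the calculation is the same; only the timing of the criterion measure differs.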
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity: if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity: if two constructs are theoretically unrelated, you would not expect a
high correlation.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
The content validity of your questionnaire is influenced by how well you designed the
questionnaire. The content domain is the universe of tasks, behaviours, attitudes, etcetera
implied by the purpose of your questionnaire. The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire. Criterion-related validity concerns how well a questionnaire estimates
an individual's position or performance on some outcome measure.
Construct validity allows you to make conclusions about a theoretical construct that
underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual: The purpose and structure of a manual
Manual: The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration and the scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the characteristics
of the target population that are relevant to the aim of the questionnaire. Depending on the
subject matter of the questionnaire, it may be important to state the country and the age
group for which the questionnaire has been developed. A brief description of the design of
the questionnaire should be provided. The type of items should also be indicated, for
example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
you should indicate to what extent they are representative in terms of those characteristics
that define the target population. It should also be mentioned when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion or exclusion
of items in the item selection process. It is important for the user to know how reliable or
consistent the questionnaire is. Give a brief description of the method used to determine
reliability and justify why this method was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents complete it on their own or supervision is needed, the material needed,
and how to deal with a person asking for an explanation. Provide instructions for scoring. In
a test, an answer could score a 1 (correct) or a 0 (incorrect). On a five-point rating scale,
1 = do not agree at all and 5 = strong agreement with the statement. The total score
indicates the person's attitude towards the topic under investigation. Reverse scoring might
be necessary for items that are worded in the opposite direction.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
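Scoring instructions for a five-point rating scale with reverse scoring could be made concrete along the following lines. This is a minimal sketch under assumptions: the function name, the item numbers and the choice of which items are reverse-scored are all invented for illustration.

```python
def total_score(responses, reverse_items=(), scale_max=5):
    """Sum the ratings on a 1..scale_max scale, reversing the listed items.

    responses: dict mapping item number to the rating given.
    reverse_items: item numbers worded in the opposite direction.
    """
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"item {item}: rating {rating} is out of range")
        if item in reverse_items:
            # On a five-point scale 1 becomes 5, 2 becomes 4, and so on.
            rating = scale_max + 1 - rating
        total += rating
    return total


# Invented answers to four items; items 2 and 4 are assumed to be reverse-scored.
answers = {1: 5, 2: 1, 3: 4, 4: 2}
print(total_score(answers, reverse_items={2, 4}))  # 5 + 5 + 4 + 4 = 18
```

Without reverse scoring the same answers would add up to 12, which would understate this respondent's agreement with the topic.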
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire (they
should explain how the questions should be tackled). The instructions are provided in a
cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: states that the information is absolutely confidential or that identity will not be
disclosed
Rate 2: states both, but the respondent is required to provide a name
Rate 3: states both, and the respondent is not required to provide a name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: explained, but does not hold for all items
Rate 2: explained, and holds for all items
2 Item characteristics
28
All items have to be relevant to the topic presented in understandable language
meaningful for people from different cultural or language backgrounds clear and
unambiguous answerable and non-leading
21 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 none items are irrelevant
22 Item language level
Rate 0 all items presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 none items
23 Item in cultural context
Rate 0 all items presented contain phrases that may be unclear to or be interpreted
differently
Rate 1 most items
Rate 2 some items
Rate 3 none items
24 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 none items
25 Item answerability
Rate 0 all items are unanswerable (ie the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 none items
26 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 none items
29
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire By grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding sensitive and personal questions
towards the end Allow respondents to raise opinions and vent feelings by providing open
ended questions at the end
31 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
32 Questionnaire item sequence
Rate 0 no items or if all items are irrelevant
Rate 1 sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnairersquos functionality
(ie what the questionnaire could be used for what it is capable of) given its declared
purpose (ie the kind of information it is expected to deliver) In other words the structure
and functionality of a questionnaire is a function of its declared purpose
41 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
30
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 73 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name No single person can be absolutely sure that
the ratings heshe assigns are absolutely correct The QWAN is the quality we all strive for
but that no one can claim It is simply a standard against which you measure yourself
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 81 Explore a manualrsquos rating scale
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Activity 83 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 81 Explore a manualrsquos rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire Questionnaire administrators need to know three things
(a) Whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (ie the aim and design of the questionnaire and the population it can be
used with) (b) the functionality of the questionnaire (ie the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (ie instructions for administration scoring and
interpretation) Focus on whether the manual is able to achieve its purpose not whether it
has in fact achieved this purpose
11 The purpose areas covered in the manual
31
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
12 The logical sequence of the presentation
Rate 0 if your rating in 11 is lsquo0rsquo or lsquo1rsquo
Rate 1 if your rating in 11 is lsquo2rsquo but purpose areas are not in the logical sequence
Rate 2 if your rating in 11 is lsquo2rsquo and if the purpose areas are logical
Rate 3 if your rating in 11 is lsquo3rsquo but if the purpose areas are not in the logical sequence
Rate 4 if your rating in 11 is lsquo3rsquo and if the purpose areas are logical
Each of the purpose areas refer to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
13 The logical grouping of the content topics
Rate 0 if your rating in 11 is lsquo0rsquo or lsquo1rsquo
32
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
14 The clarity of the manualrsquos language
Rate 0 if your rating in 11 is lsquo0rsquo
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0 if the manual does not provide sufficient information about the aim
Rate 1 if the manual describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 if the manual describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0 if the manual does not provide sufficient information about the target population
Rate 1 if the manual describes the target population but it is not appropriate
Rate 2 if the manual describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0 if the manual does not provide sufficient information about the design
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 if the manual describes any two of (1), (2) and (3)
Rate 3 if the manual describes all three
2.4 The sample used to test the questionnaire
Rate 0 if the manual does not provide sufficient information about the sample
Rate 1 if the manual describes the sample used but the characteristics of the sample group
differ from those of the target population
Rate 2 if the manual describes the sample used and the characteristics of the sample group
correspond to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0 if the manual does not provide sufficient information about the analysis and
selection of questionnaire items
Rate 1 if the manual describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 if the manual describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0 if the manual does not provide sufficient information about reliability
Rate 1 if the manual does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 if the manual does any two of (1), (2) and (3)
Rate 3 if the manual does all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0 if the manual does not provide sufficient information about validity
Rate 1 if the manual does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 if the manual does any two of (1), (2) and (3)
Rate 3 if the manual does all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0 if the manual does not provide instructions for administration
Rate 1 if the instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 if the instructions include three or four of (1) to (5)
Rate 3 if the instructions include all five
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1 if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2 if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 if the manual does not provide guidelines for interpretation
Rate 1 if the manual does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 if the manual does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
Action: In this context the correlation coefficient is a reliability coefficient. Values closer
to 1 indicate a more reliable rating scale.
Validity
The validity of a questionnaire is the extent to which it measures what it claims to measure,
that is, the extent to which the scores can be used for the intended purpose. There are three
categories of validity evidence: content validity, criterion-related validity and construct
validity.
1 Content validity
The content validity is determined by the degree to which the items in the questionnaire
are representative of the universe of tasks, behaviours or attitudes (the content domain)
that it was designed to measure. Content validity can be ensured by proper design.
Content validity cannot be expressed in terms of a quantitative index.
Face validity refers to the degree to which items appear to be relevant. It is
based on the subjective evaluation by people who are not necessarily experts either in the
particular area or in psychometrics. If the respondents do not regard the items as relevant
(the questionnaire does not have sufficient face validity), they might be less motivated and
even unwilling to cooperate.
2 Criterion-related validity
The criterion-related validity of a questionnaire is the extent to which the scores on the
questionnaire are effective in estimating an individual's position or performance on the
relevant criterion. The two approaches to gathering evidence of criterion-related validity
are concurrent validity and predictive validity.
With concurrent validity, measures are obtained on the criterion at approximately the
same time as the scores on the questionnaire. The extent to which the scores accurately
estimate an individual's present position on the relevant criterion is then determined.
Concurrent validity is determined if you want to use your questionnaire to identify some
current behaviour or status of individuals.
For example, you want to classify psychiatric patients according to their disturbances. Take
a representative group of psychiatric patients and administer the questionnaire to them. At
the same time, you would ask psychiatrists or clinical psychologists to classify these
patients according to type of disturbance.
To evaluate predictive validity, the measures on the criterion are obtained in the future. It
is then determined to what extent the scores accurately predict an individual's scores on
the relevant criterion. Predictive validity is determined if you want to use your
questionnaire to predict some future performance of individuals. For example, to select
candidates for entrance into this course, take a representative group of students applying
for the course and administer your questionnaire to them.
At the end of the course you could obtain the students' examination marks. You will then
determine how effective scores on your questionnaire are in predicting the students'
examination marks.
To determine criterion-related validity, calculate the correlation between the questionnaire
results and the measures on the criterion. The resulting correlation coefficient is known as
the validity coefficient.
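As an illustration, a validity coefficient can be computed as the correlation between questionnaire scores and criterion measures. The sketch below assumes the Pearson product-moment correlation, which the text does not name explicitly. The function name and the example scores are hypothetical.

```python
from math import sqrt


def pearson_r(scores, criterion):
    """Pearson correlation between two equally long lists of numbers."""
    if len(scores) != len(criterion) or len(scores) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(criterion) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, criterion))
    sxx = sum((x - mean_x) ** 2 for x in scores)
    syy = sum((y - mean_y) ** 2 for y in criterion)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data: questionnaire scores at selection and later examination marks.
    questionnaire_scores = [12, 15, 9, 20, 17]
    examination_marks = [55, 62, 48, 80, 70]
    print(round(pearson_r(questionnaire_scores, examination_marks), 2))
```

For predictive validity the criterion list holds measures obtained later (such as examination marks). For concurrent validity it holds measures obtained at about the same time as the questionnaire scores.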
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour. For example, anxiety is not observable, but it forms part of a theory
that explains observable behaviours.
You have to define your construct in terms of observable behaviours. You can thus define
the construct validity of a questionnaire as the extent to which it indeed measures the
theoretical construct it aims to measure. Construct validity cannot be expressed in terms of
a single validity coefficient. You would expect groups who are supposed to differ in terms of
a construct to also obtain significantly different scores on a questionnaire measuring this
construct. Another way to determine construct validity is to look at the correlation
coefficients between different questionnaires.
Convergent validity - if two questionnaires measure the same construct, you would expect
the scores to be significantly correlated.
Discriminant validity - if two constructs are theoretically unrelated, you would not expect a
high correlation between questionnaires that measure them.
Activity 5.2 Evaluate the validity of the questionnaire
Action: You should be able to distinguish between categories of validity.
Content validity - the content validity of your questionnaire is influenced by how well you
designed the questionnaire. The content domain is the universe of tasks, behaviours,
attitudes, etcetera implied by the purpose of your questionnaire. The degree to which the
items in your questionnaire are representative of the content domain determines the
content validity of your questionnaire.
Criterion-related validity - how well a questionnaire estimates an individual's position or
performance on some outcome measure.
Construct validity - the extent to which you can make conclusions about a theoretical
construct that underlies the behaviours measured by the questionnaire.
Action: Consider the content domain of your questionnaire and the questionnaire
specification document, and evaluate the content validity of your questionnaire.
6 Compile a manual
Outcome product
A manual for the questionnaire, consisting of a description of the aim and design, an
evaluation of the properties, and procedures for administration, scoring and interpretation
Method
Activity 6.1 Discuss the process of developing the questionnaire
Activity 6.2 Compile a manual
Resource reference
Manual: The purpose and structure of a manual
Manual: The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. The manual should
therefore report the process of analysing and selecting the items, as well as the reliability
and validity of the questionnaire. It should also give instructions for the administration and
scoring of the questionnaire, and some guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the
characteristics of the target population that are relevant to the aim of the questionnaire. It
may be important to state the country for which the questionnaire has been developed and
the age of the intended respondents, depending on the subject matter of the questionnaire.
A brief description of the design of the questionnaire should be provided. The type of items
should also be indicated, for example multiple-choice items or rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. This group should be described, and
it should be indicated to what extent they are representative in terms of those
characteristics that define the target population. It should also be mentioned when and
under which circumstances the questionnaire was administered to them. Describe each
technique used for item analysis and indicate which criteria were used to justify the
inclusion or exclusion of items in the item selection process. It is important for the user to
know how reliable or consistent the questionnaire is. Give a brief description of the method
used to determine reliability and justify why this method was used. The estimated reliability
coefficient is then evaluated in terms of an acceptable level of reliability. Identify the
category of validity (be it content validity, criterion-related validity or construct validity)
that is relevant for your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (groups or individuals),
whether respondents can complete it on their own or need supervision, the material
needed, and how to deal with a person asking for an explanation. Provide instructions for
scoring. For test items, a correct answer could score 1 and an incorrect answer 0. For rating
scales, for example a five-point rating scale, 1 = do not agree at all and 5 = strongly agree
with the statement. The total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary for some items.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
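The scoring instructions above can be sketched in Python. This is a minimal illustration, assuming a five-point rating scale and assuming that reverse-scored items are identified by their position in the response list. The function and parameter names are hypothetical.

```python
def total_score(responses, reverse_items=(), scale_min=1, scale_max=5):
    """Sum rating-scale responses, reversing the items listed in reverse_items.

    responses: list of ratings, one per item, each between scale_min and scale_max.
    reverse_items: zero-based positions of items that must be reverse scored.
    """
    total = 0
    for position, response in enumerate(responses):
        if not scale_min <= response <= scale_max:
            raise ValueError(f"response {response} at position {position} is off the scale")
        if position in reverse_items:
            # On a 1-to-5 scale this maps 1 to 5, 2 to 4, 3 to 3, 4 to 2 and 5 to 1.
            response = scale_min + scale_max - response
        total += response
    return total


if __name__ == "__main__":
    # Hypothetical respondent: the second item is negatively worded, so it is reversed.
    print(total_score([5, 1, 3], reverse_items={1}))
```

Reverse scoring keeps a high total meaning the same thing for every item, whether the item was worded positively or negatively.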
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain: Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions, provided in a cover letter, should address the purpose of the
questionnaire, the confidentiality of the information provided in the questionnaire, and
how to complete the questionnaire (that is, how the questions should be tackled).
1.1 The purpose of the questionnaire
Rate 0 if the purpose is not explained
Rate 1 if the purpose is explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2 if the purpose is explained in terms of both (1) and (2)
1.2 The confidentiality of information provided
Rate 0 if confidentiality is not mentioned
Rate 1 if the instructions state either that the information is absolutely confidential or that
identity will not be disclosed
Rate 2 if the instructions state both, but respondents are required to provide their names
Rate 3 if the instructions state both, and respondents are not required to provide their
names
1.3 Instructions for how to handle questions
Rate 0 if it is not explained how the questionnaire should be completed
Rate 1 if it is explained but the explanation does not hold for all items
Rate 2 if it is explained and the explanation holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0 all items are irrelevant
Rate 1 most items are irrelevant
Rate 2 some items are irrelevant
Rate 3 no items are irrelevant
2.2 Item language level
Rate 0 all items are presented in language that is too difficult
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.3 Item in cultural context
Rate 0 all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.4 Item clarity
Rate 0 all items are phrased in such a way that they may confuse respondents
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.5 Item answerability
Rate 0 all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1 most items
Rate 2 some items
Rate 3 no items
2.6 Item as a leading question
Rate 0 all items are leading questions
Rate 1 most items
Rate 2 some items
Rate 3 no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and placing sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0 if the questionnaire does not focus on the topic that it is supposed to cover
Rate 1 if only part of it focuses on the topic
Rate 2 if it focuses on the topic but the topic is not covered in full
Rate 3 if it focuses on the topic and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two of these problems
Rate 3 if the sequence seems to cause any one of these problems
Rate 4 if the sequence does not seem to cause any of these problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0 if the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1 if the structure limits its functionality with regard to two of the functions
Rate 2 if the structure limits its functionality with regard to one of the functions
Rate 3 if the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0 if the manual provides no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 if the manual provides information about one purpose area
Rate 2 if the manual provides information about any two purpose areas
Rate 3 if the manual provides information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
24
At the end of the course you could obtain the students examination marks You will then
determine how effective scores on your questionnaire are in predicting the studentsrsquo
examination marks
To determine criterion-related validity calculate the correlation between the results and the
measures on the criterion the resulting correlation coefficient is known as the validity
coefficient
3 Construct validity
A construct is an unobservable quality which forms part of a theory designed to explain
observable behaviour For example anxiety is not observable but it forms part of a theory
that explains observable behaviours
You have to define your construct in terms of observable behaviours You can thus define
the construct validity as the extent to which it indeed measures the theoretical construct
it aims to measure Construct validity cannot be expressed in terms of a single validity
coefficient You would expect groups who are supposed to differ in terms of a construct to
also obtain significantly different scores on a questionnaire measuring this construct
Another way to determine construct validity is to look at the correlation coefficients
between different questionnaires
Convergent validity - if two questionnaires measure the same construct you would expect
the scores to be significantly correlated
Discriminant validity - if two constructs are theoretically unrelated you would not expect a
high correlation
Activity 52 Evaluate the validity of the questionnaire
Action You should be able to distinguish between categories of validity
The content validity of your questionnaire is influenced by how well you designed the
questionnaire The content domain is the universe of tasks behaviours attitudes etcetera
implied by the purpose of your questionnaire The degree to which the items in your
questionnaire are representative of the content domain determines the content validity of
your questionnaire Criterion - related validity of the questionnaire - how well a
questionnaire estimates an individualrsquos position or performance on some outcome measure
Construct validity - to make conclusions about a theoretical construct that underlies
the behaviours measured by the questionnaire
Action Consider the content domain of your questionnaire and the questionnaire
specification document and evaluate the content validity of your questionnaire
25
6 Compile a manual
Outcome product
A manual for the questionnaire consisting of a description of the aim and design an
evaluation of the properties and procedures for administration scoring and interpretation
Method
Activity 61 Discuss the process of developing the questionnaire
Activity 62 Compile a manual
Resource reference
Manual The purpose and structure of a manual
Manual The purpose and structure of a manual
1 Purpose of a manual
Someone else might be interested in using your test or questionnaire. Report the process of
analysing and selecting the items, as well as the reliability and validity of the questionnaire.
Give instructions for the administration and scoring of the questionnaire, and some
guidelines on how to interpret the results.
2 Structure of a manual
Aim and design
- Aim
- Target population
- Design of the questionnaire
Properties of the questionnaire
- Item analysis and item selection
- Reliability
- Validity
Procedures for administration, scoring and interpretation
- Instructions for administration
- Instructions for scoring
- Guidelines for interpretation
2.1 Aim and design
It should be clear what the questionnaire measures and how this information can be used.
The aim of the questionnaire determines for whom it will be used. Describe the characteristics
of the target population that are relevant to the aim of the questionnaire. It is important to
state the country and the age group for which this questionnaire has been developed, where
these bear on its subject matter. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated, for example multiple-choice items or
rating scales.
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and
indicate to what extent they are representative in terms of the characteristics that
define the target population. Also mention when and under which
circumstances the questionnaire was administered to them. Describe each technique used
for item analysis and indicate which criteria were used to justify the inclusion
or exclusion of items in the item selection process. It is important for the user to know how
reliable or consistent the questionnaire is. Give a brief description of the method used to
determine reliability and justify why this method was used. The estimated reliability coefficient is
then evaluated in terms of an acceptable level of reliability. Identify the category of validity
(be it content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for the administration of the questionnaire: who is allowed to
administer it, the situation in which it should be administered (to groups or individuals),
whether respondents complete it on their own or need supervision, the material needed, and
how to deal with a person who asks for an explanation. Provide instructions for scoring. For
example, a correct answer could score 1 and an incorrect answer 0. On a five-point rating
scale, 1 = do not agree at all and 5 = strong agreement with the statement. The total score
indicates the person's attitude towards the topic under investigation. Reverse scoring might
be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
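The scoring instructions above can be sketched in code. This is a minimal sketch that assumes a five-point rating scale and assumes that reverse scoring applies to negatively worded items; the function names, the choice of reversed items and the responses are hypothetical.

```python
def score_item(response, reverse=False, scale_min=1, scale_max=5):
    """Return the score for one rating-scale response, reversing it if needed."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside the scale")
    # Reverse scoring mirrors the response on the scale: 1 <-> 5, 2 <-> 4, 3 stays 3.
    return (scale_min + scale_max - response) if reverse else response

def total_score(responses, reverse_items=()):
    """Sum item scores; reverse_items holds the 0-based positions of reversed items."""
    return sum(
        score_item(r, reverse=(i in reverse_items))
        for i, r in enumerate(responses)
    )

# Hypothetical respondent; the items at positions 1 and 3 are assumed to be
# negatively worded and are therefore reverse scored.
responses = [5, 1, 4, 2, 5]
total = total_score(responses, reverse_items={1, 3})
```

Reversing before summing keeps a high total score pointing in the same direction for every item, which is what makes the total interpretable as a single attitude score.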
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1 the instructions of the questionnaire
2 the characteristics of the items of the questionnaire
3 the characteristics of the questionnaire as a whole
4 the functionality of the questionnaire
1 Questionnaire instructions
These instructions, provided in a cover letter, should address the purpose of the
questionnaire, the confidentiality of the information provided in the questionnaire, and
how to complete the questionnaire (that is, how the questions should be tackled).
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but respondents are
required to provide their name
Rate 3: absolutely confidential and identity will not be disclosed, and respondents are not
required to provide their name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: explained, but the explanation does not hold for all items
Rate 2: explained, and the explanation holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted
differently by, people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions, by using filter
questions and by keeping a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and keeping sensitive and personal questions
for the end. Allow respondents to raise opinions and vent feelings by providing open-ended
questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: part of it focuses on the topic
Rate 2: focuses on the topic, but the topic is not covered in full
Rate 3: focuses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of these problems
Rate 3: if the sequence seems to cause any one of these problems
Rate 4: if the sequence does not seem to cause any of these problems
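The item sequence rating above counts how many of the three sequence problems are present. A minimal sketch of that rule follows; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def rate_item_sequence(has_relevant_items, causes_bias, is_inefficient, causes_discomfort):
    """Rating (0-4) for the questionnaire item sequence.

    has_relevant_items: False if there are no items or all items are irrelevant.
    The other three flags mark the three possible sequence problems.
    """
    if not has_relevant_items:
        return 0
    problems = sum([causes_bias, is_inefficient, causes_discomfort])
    # Three problems -> 1, two problems -> 2, one problem -> 3, no problems -> 4.
    return 4 - problems

# Hypothetical evaluation: the sequence only makes respondents uncomfortable.
rating = rate_item_sequence(True, False, False, True)
```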
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and their sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one purpose area
Rate 2: information about any two purpose areas
Rate 3: information about all three purpose areas
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
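Because the rating for the logical sequence depends on the rating already given for the purpose areas covered, it may help to see the rule written out as a small decision function. This is only a sketch; the function and parameter names are hypothetical.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Rating (0-4) for the logical sequence of the presentation.

    purpose_area_rating: the 0-3 rating given for the purpose areas covered.
    in_logical_sequence: True if the covered areas appear in the order (a), (b), (c).
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_area_rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        return 0  # too few purpose areas are covered to judge their sequence
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3

# Hypothetical manual: all three purpose areas are covered, but out of order.
rating = rate_logical_sequence(3, in_logical_sequence=False)
```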
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
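1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: the content topics of none of the three purpose areas are kept in their logical groups
Rate 2: the content topics of one purpose area are kept in their logical group
Rate 3: the content topics of two purpose areas are kept in their logical groups
Rate 4: the content topics of all three purpose areas are kept in their logical groups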
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population, but it is not appropriate
Rate 2: describes the target population, and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four of (1) to (5)
Rate 3: instructions include all five
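The rating for the administration instructions maps the number of elements covered onto a 0 to 3 scale in steps of two. The sketch below illustrates that mapping. The element labels and the function name are hypothetical shorthand, and treating "does not provide instructions" as zero elements covered is an assumption.

```python
# Hypothetical short labels for the five elements listed above.
ADMIN_ELEMENTS = ("administrator", "situations", "material", "enquiries", "introduction")

def rate_admin_instructions(elements_present):
    """Map the number of administration elements covered to a 0-3 rating."""
    # Only count recognised elements, and count each element once.
    covered = len(set(elements_present) & set(ADMIN_ELEMENTS))
    if covered == 0:
        return 0  # assumption: no instructions means no elements are covered
    if covered <= 2:
        return 1
    if covered <= 4:
        return 2
    return 3

# Hypothetical manual that says who may administer the questionnaire,
# what material is needed and how enquiries should be handled.
rating = rate_admin_instructions(["administrator", "material", "enquiries"])
```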
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten-point scale.
50 multiple-choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias, the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded, thus 50
items were retained. The validity coefficient of 0.91 is high.
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
28 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
29 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
210 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Action Study the manual below
Manual for ldquoTHE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)rdquo
The Marston Personality Questionnaire (MPQ) was designed to measure individualsrsquo
preferred behaviour styles in their work environments The MPQ is intended to be used with
the Marston Job Description System (MJDS) a method to describe any job in terms of
behaviour styles Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
35
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally) interaction (the ability to work with people) management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations) Each
factor is measured on a ten point scale
50 multiple choice items If ltdescription of work situationgt I prefer to
(a) ltdescription of lsquodriversquo actiongt
(b) ltdescription of lsquointeractionrsquo actiongt
(c) ltdescription of lsquomanagementrsquo actiongt
(d) ltdescription of lsquoregulationrsquo actiongt
To counter response bias the sequence in which the action descriptions are provided is
varied randomly Nine hundred university students used in the development of the
questionnaire
The original MPQ consisted of 145 items Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included Thirteen were excluded thus 50
items were retained The validity coefficient of 091 is high
matter of this questionnaire. A brief description of the design of the questionnaire should
be provided. The type of items should also be indicated - multiple-choice items or rating
2.2 Properties of the questionnaire
To determine how effective the questionnaire is, you need to administer it to a group of
people who are representative of the target population. Describe this group and indicate
to what extent it is representative in terms of the characteristics that define the target
population. Also mention when and under which circumstances the questionnaire was
administered to the group. Describe each technique used for item analysis and indicate
which criteria were used to justify the inclusion or exclusion of items in the item
selection process. It is important for the user to know how reliable or consistent the
questionnaire is. Give a brief description of the method used to determine reliability and
justify why this method was used. The estimated reliability coefficient is then evaluated
in terms of an acceptable level of reliability. Identify the category of validity (be it
content validity, criterion-related validity or construct validity) that is relevant for
your questionnaire.
2.3 Procedures for administration, scoring and interpretation
Provide general instructions for administering the questionnaire: who is allowed to
administer it; the situation in which it should be administered (to groups or individuals);
whether respondents can complete it on their own or need supervision; the material needed;
and how to deal with a person who asks for an explanation. Provide instructions for scoring.
Where items have a correct answer, a response could be scored 1 or 0. For rating scales,
for example a five-point scale on which 1 = do not agree at all and 5 = strongly agree with
the statement, the total score indicates the person's attitude towards the topic under
investigation. Reverse scoring might be necessary.
The guidelines for interpretation of the results should be based on the aim of the
questionnaire.
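The scoring instructions above can be sketched in code. This is only a minimal illustration, assuming a five-point rating scale and a hypothetical set of reverse-scored item numbers; the function name `total_score` and the example responses are invented for the sketch and are not part of the study material.

```python
def total_score(responses, reverse_items=(), scale_min=1, scale_max=5):
    """Sum rating-scale responses, reverse-scoring the listed items.

    responses     : dict mapping item number -> rating chosen by the respondent
    reverse_items : item numbers worded in the opposite direction
                    (hypothetical here; a real manual must list them explicitly)
    """
    total = 0
    for item, rating in responses.items():
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"item {item}: rating {rating} is off the scale")
        if item in reverse_items:
            # On a 1..5 scale this maps 1 -> 5, 2 -> 4, 3 -> 3, and so on.
            rating = scale_max + scale_min - rating
        total += rating
    return total


# Hypothetical respondent; items 2 and 4 are assumed to be negatively worded.
answers = {1: 4, 2: 1, 3: 5, 4: 2}
score = total_score(answers, reverse_items={2, 4})
```

Without reverse scoring, the two negatively worded items would pull the total down even though the respondent's attitude is consistently favourable, which is why a manual should state which items are reversed.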
7 Evaluate a questionnaire
Outcome product
An evaluated questionnaire
Method
Activity 7.1 Explore a questionnaire rating scale
Activity 7.2 Use the questionnaire rating scale to evaluate a questionnaire
Activity 7.3 Compare your evaluations to the QWAN
Resource reference
Content domain Identify the content domain for a questionnaire
Item format
Layout of the questionnaire
Suitability of a questionnaire as measuring instrument
Writing questionnaire items
Activity 7.1 Explore a questionnaire rating scale
Action: The four main facets one should consider when evaluating a questionnaire are
1. the instructions of the questionnaire
2. the characteristics of the items of the questionnaire
3. the characteristics of the questionnaire as a whole
4. the functionality of the questionnaire
1 Questionnaire instructions
These instructions should address the purpose of the questionnaire, the confidentiality of
the information provided in the questionnaire, and how to complete the questionnaire (that
is, how the questions should be tackled). The instructions are provided in a cover letter.
1.1 The purpose of the questionnaire
Rate 0: purpose not explained
Rate 1: explained in terms of one of the following
(1) what it is intended to measure
(2) whom it is supposed to be used for
Rate 2: explained in terms of both statements
1.2 The confidentiality of information provided
Rate 0: not mentioned
Rate 1: absolutely confidential or identity will not be disclosed
Rate 2: absolutely confidential and identity will not be disclosed, but required to provide name
Rate 3: absolutely confidential and identity will not be disclosed, and not required to provide
name
1.3 Instructions for how to handle questions
Rate 0: not explained how the questionnaire should be completed
Rate 1: is explained but does not hold for all items
Rate 2: is explained and holds for all items
2 Item characteristics
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding, sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: only part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0: if there are no items or if all items are irrelevant
Rate 1: if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2: if the sequence seems to cause any two of these problems
Rate 3: if the sequence seems to cause any one of these problems
Rate 4: if the sequence does not seem to cause any of these problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of), given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the
structure and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: structure limits its functionality with regard to two
Rate 2: structure limits its functionality with regard to one
Rate 3: structure limits its functionality with regard to none
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1. the extent to which the manual constitutes its purpose and
2. the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of
items for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
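The rule for rating the logical sequence depends on the rating already given for the purpose areas covered, and it can be written out as a small decision function. The sketch below is one possible reading of the scale; the function and parameter names are hypothetical.

```python
def rate_logical_sequence(purpose_area_rating, in_logical_sequence):
    """Return the logical-sequence rating (0 to 4).

    purpose_area_rating : rating already given for the purpose areas
                          covered in the manual (0, 1, 2 or 3)
    in_logical_sequence : True if the covered purpose areas appear in the
                          order nature, functionality, instructions
    """
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose area rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if purpose_area_rating == 2:
        return 2 if in_logical_sequence else 1
    return 4 if in_logical_sequence else 3


# Hypothetical manual covering all three areas, but in the wrong order.
example_rating = rate_logical_sequence(3, in_logical_sequence=False)
```

Writing the rule this way makes the dependency explicit: a manual that covers only one purpose area can never score above 0 on sequence, however well it is written.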
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
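1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: content topics of one purpose area are kept in their logical group
Rate 3: content topics of two purpose areas are kept in their logical groups
Rate 4: content topics of all three purpose areas are kept in their logical groups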
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
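Several items on this scale are rated by counting how many listed elements the manual covers. As a rough sketch of that pattern, the code below rates the administration instructions from the set of elements found; the short element labels and the function name are hypothetical shorthand for the five points listed above.

```python
# Hypothetical short labels for the five listed elements, in the same order.
ADMIN_ELEMENTS = {
    "administrator",  # who is allowed to administer the questionnaire
    "situation",      # situations in which it can be administered
    "material",       # material required for administration
    "enquiries",      # how enquiries about items should be handled
    "introduction",   # how to introduce the questionnaire to respondents
}


def rate_administration_instructions(covered):
    """Return 0 to 3 according to how many of the five elements are covered."""
    covered = set(covered)
    unknown = covered - ADMIN_ELEMENTS
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    count = len(covered)
    if count == 0:
        return 0  # no instructions provided
    if count <= 2:
        return 1  # one or two elements
    if count <= 4:
        return 2  # three or four elements
    return 3      # all five elements


# Hypothetical manual that says who may administer it and what material is needed.
example = rate_administration_instructions({"administrator", "material"})
```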
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items. If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded thus 50
items were retained. The validity coefficient of 0.91 is high.
27 The validity of the questionnaire
Rate 0 not sufficient information
34
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnairersquos validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
28 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
29 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
210 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Action Study the manual below
Manual for ldquoTHE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)rdquo
The Marston Personality Questionnaire (MPQ) was designed to measure individualsrsquo
preferred behaviour styles in their work environments The MPQ is intended to be used with
the Marston Job Description System (MJDS) a method to describe any job in terms of
behaviour styles Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
35
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally) interaction (the ability to work with people) management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations) Each
factor is measured on a ten point scale
50 multiple choice items If ltdescription of work situationgt I prefer to
(a) ltdescription of lsquodriversquo actiongt
(b) ltdescription of lsquointeractionrsquo actiongt
(c) ltdescription of lsquomanagementrsquo actiongt
(d) ltdescription of lsquoregulationrsquo actiongt
To counter response bias the sequence in which the action descriptions are provided is
varied randomly Nine hundred university students used in the development of the
questionnaire
The original MPQ consisted of 145 items Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included Thirteen were excluded thus 50
items were retained The validity coefficient of 091 is high
All items have to be relevant to the topic, presented in understandable language,
meaningful for people from different cultural or language backgrounds, clear and
unambiguous, answerable, and non-leading.
2.1 Item relevance
Rate 0: all items are irrelevant
Rate 1: most items are irrelevant
Rate 2: some items are irrelevant
Rate 3: no items are irrelevant
2.2 Item language level
Rate 0: all items are presented in language that is too difficult
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.3 Item in cultural context
Rate 0: all items contain phrases that may be unclear to, or be interpreted differently by,
people from different cultural or language backgrounds
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.4 Item clarity
Rate 0: all items are phrased in such a way that they may confuse respondents
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.5 Item answerability
Rate 0: all items are unanswerable (i.e. the choice of provided responses does not fit
the item, or the respondent does not have the required information)
Rate 1: most items
Rate 2: some items
Rate 3: no items
2.6 Item as a leading question
Rate 0: all items are leading questions
Rate 1: most items
Rate 2: some items
Rate 3: no items
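The six item scales above share one pattern: the rating drops as more items show the problem being rated. A minimal Python sketch of that mapping follows. The function name rate_items is hypothetical, and the cut-off between "most" and "some" (more than half of the items) is an assumption made for illustration only; the rating scale itself does not say where "some" ends and "most" begins.

```python
def rate_items(total_items: int, flagged_items: int) -> int:
    """Map the number of problematic items to a rating from 0 to 3.

    flagged_items counts the items that show the problem being rated
    (irrelevant, too difficult, unclear, unanswerable, leading, and so on).
    """
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= flagged_items <= total_items:
        raise ValueError("flagged_items must lie between 0 and total_items")
    if flagged_items == total_items:
        return 0  # all items show the problem
    if flagged_items == 0:
        return 3  # no items show the problem
    # Assumption: 'most' means more than half of the items.
    return 1 if flagged_items > total_items / 2 else 2
```

For a 20-item questionnaire with three leading questions, rate_items(20, 3) would give a rating of 2 under this assumption. A rater remains free to draw the line between "some" and "most" differently.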
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic, but they should not be too
lengthy. Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire, by grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents. Put respondents at ease by incorporating neutral and
interesting questions at the beginning and keeping sensitive and personal questions
towards the end. Allow respondents to raise opinions and vent feelings by providing
open-ended questions at the end.
3.1 The scope of the questionnaire
Rate 0: does not focus on the topic that it is supposed to cover
Rate 1: only part of it focusses on the topic
Rate 2: focusses on the topic, but the topic is not covered in full
Rate 3: focusses on the topic, and the topic is covered in full
3.2 Questionnaire item sequence
Rate 0 if there are no items or if all items are irrelevant
Rate 1 if the sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two of these problems
Rate 3 if the sequence seems to cause any one of these problems
Rate 4 if the sequence does not seem to cause any of these problems
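The item sequence scale above amounts to counting problems: start from 4 and subtract one point for each of the three problems the sequence seems to cause, with 0 reserved for a questionnaire that has no items or only irrelevant items. A minimal Python sketch, in which the function and parameter names are hypothetical and chosen only for illustration:

```python
def rate_item_sequence(has_relevant_items: bool,
                       introduces_bias: bool,
                       forces_inefficiency: bool,
                       causes_discomfort: bool) -> int:
    """Return the 0 to 4 rating for the item sequence."""
    if not has_relevant_items:
        return 0  # no items, or all items are irrelevant
    # Each True counts as 1, so this is the number of problems observed.
    problems = sum([introduces_bias, forces_inefficiency, causes_discomfort])
    return 4 - problems
```

The three flags still rest on the rater's own judgement; the sketch only makes the arithmetic of the scale explicit.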
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose. Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnaire's functionality
(i.e. what the questionnaire could be used for, what it is capable of) given its declared
purpose (i.e. the kind of information it is expected to deliver)? In other words, the structure
and functionality of a questionnaire are a function of its declared purpose.
4.1 The functionality of the questionnaire
Rate 0: the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts, comments and attitudes
(3) to facilitate data processing
Rate 1: the structure limits its functionality with regard to two of the functions
Rate 2: the structure limits its functionality with regard to one of the functions
Rate 3: the structure limits its functionality with regard to none of the functions
Activity 7.3 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name. No single person can be absolutely sure that
the ratings he/she assigns are absolutely correct. The QWAN is the quality we all strive for
but that no one can claim. It is simply a standard against which you measure yourself.
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 8.1 Explore a manual's rating scale
Activity 8.2 Use the manual's rating scale to evaluate a manual
Activity 8.3 Compare your evaluations to the QWAN
Resource reference
Manual: The purpose and structure of a manual
Activity 8.1 Explore a manual's rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2 if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3 if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4 if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
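The logical sequence rating depends on the earlier rating for the purpose areas covered, so it can be read as a small decision rule. A minimal Python sketch, assuming the earlier rating is passed in as an integer from 0 to 3 and the rater has already judged whether the areas appear in the logical sequence; the function and parameter names are hypothetical:

```python
def rate_logical_sequence(purpose_area_rating: int,
                          in_logical_sequence: bool) -> int:
    """Return the 0 to 4 rating for the logical sequence of the presentation."""
    if purpose_area_rating not in (0, 1, 2, 3):
        raise ValueError("purpose_area_rating must be 0, 1, 2 or 3")
    if purpose_area_rating <= 1:
        return 0  # fewer than two purpose areas: no sequence to judge
    # Two areas covered starts at 1, three areas covered starts at 3;
    # one extra point if the areas are in the logical sequence.
    base = 1 if purpose_area_rating == 2 else 3
    return base + (1 if in_logical_sequence else 0)
```

A manual that covers all three purpose areas but presents them out of order would therefore be rated 3, one point above a manual that covers only two areas in the right order.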
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
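1.3 The logical grouping of the content topics
Rate 0 if your rating in 1.1 is '0' or '1'
Rate 1 if the content topics of none of the three purpose areas are kept in their logical groups
Rate 2 if the content topics of one purpose area are kept in their logical group
Rate 3 if the content topics of two purpose areas are kept in their logical groups
Rate 4 if the content topics of all three purpose areas are kept in their logical groups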
Effective communication requires clear, precise and correct language, written in short,
direct sentences to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0 if your rating in 1.1 is '0'
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 if the manual contains any two of (a), (b) and (c)
Rate 3 if the manual contains any one of (a), (b) and (c)
Rate 4 if the manual contains none of (a), (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: the manual does not provide sufficient information
Rate 1: describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population, but it is not appropriate
Rate 2: describes the target population, and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: describes any two of (1), (2) and (3)
Rate 3: describes all three of (1), (2) and (3)
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: does any two of (1), (2) and (3)
Rate 3: does all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: does any two of (1), (2) and (3)
Rate 3: does all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: instructions include three or four of these
Rate 3: instructions include all five
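Unlike the scales that award one point per element, the administration scale groups the five elements into bands. A minimal Python sketch of the banding, assuming that "does not provide instructions" corresponds to none of the five elements being present; the function name is hypothetical:

```python
def rate_administration_instructions(elements_covered: int) -> int:
    """Return the 0 to 3 rating from the number of elements (0 to 5) covered."""
    if not 0 <= elements_covered <= 5:
        raise ValueError("elements_covered must lie between 0 and 5")
    if elements_covered == 0:
        return 0  # assumption: no elements means no instructions are given
    if elements_covered <= 2:
        return 1  # one or two elements
    if elements_covered <= 4:
        return 2  # three or four elements
    return 3      # all five elements
```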
2.9 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1 if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2 if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation> I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded thus 50
items were retained. The validity coefficient of 0.91 is high.
29
3 Questionnaire characteristics
Questionnaires should have sufficient items to cover the topic but they should not be too
lengthy Items should be presented in a particular order to
(a) counter response style and item bias
(b) increase the efficiency of the questionnaire By grouping questions and by using filter
questions and a good balance of different question types
(c) be sensitive towards respondents Put respondents at ease by incorporating neutral and
interesting questions at the beginning and forbidding sensitive and personal questions
towards the end Allow respondents to raise opinions and vent feelings by providing open
ended questions at the end
31 The scope of the questionnaire
Rate 0 does not focus on the topic that it is supposed to cover
Rate 1 part of it focusses
Rate 2 focusses but topic is not covered in full
Rate 3 focusses and the topic is covered in full
32 Questionnaire item sequence
Rate 0 no items or if all items are irrelevant
Rate 1 sequence seems to
(1) introduce a particular response style or bias and
(2) force respondents to respond in an inefficient manner and
(3) make respondents feel emotionally uncomfortable
Rate 2 if the sequence seems to cause any two problems
Rate 3 if the sequence seems to cause any one problem
Rate 4 if the sequence does not seem to cause any of the problems
4 Questionnaire functionality
The issue to be evaluated is whether the questionnaire is structured in such a manner that
it can function maximally in the light of its declared purpose Does the structure of the
questionnaire (kinds of items and the sequence) support the questionnairersquos functionality
(ie what the questionnaire could be used for what it is capable of) given its declared
purpose (ie the kind of information it is expected to deliver) In other words the structure
and functionality of a questionnaire is a function of its declared purpose
41 The functionality of the questionnaire
Rate 0 the structure limits its functionality with regard to all three functions
(1) to obtain accurate information
(2) to provide a standard format for recording facts comments and attitudes
(3) to facilitate data processing
Rate 1 structure limits its functionality with regard to two
30
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 73 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name No single person can be absolutely sure that
the ratings heshe assigns are absolutely correct The QWAN is the quality we all strive for
but that no one can claim It is simply a standard against which you measure yourself
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 81 Explore a manualrsquos rating scale
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Activity 83 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 81 Explore a manualrsquos rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire Questionnaire administrators need to know three things
(a) Whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind
(b) whether the questionnaire will work and
(c) how the questionnaire should be used
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (ie the aim and design of the questionnaire and the population it can be
used with) (b) the functionality of the questionnaire (ie the analysis and selection of items
for the questionnaire and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (ie instructions for administration scoring and
interpretation) Focus on whether the manual is able to achieve its purpose not whether it
has in fact achieved this purpose
11 The purpose areas covered in the manual
31
Rate 0 no information about the following purpose areas
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1 information about one
Rate 2 information about any two
Rate 3 information about all three
Proper communication requires a logical presentation and clear and correct language The
manual should start with a description of
(a) the nature of the questionnaire then discuss
(b) the functionality of the questionnaire and conclude with
(c) instructions for using the questionnaire
The logical sequence would be (a) (b) (c)
12 The logical sequence of the presentation
Rate 0 if your rating in 11 is lsquo0rsquo or lsquo1rsquo
Rate 1 if your rating in 11 is lsquo2rsquo but purpose areas are not in the logical sequence
Rate 2 if your rating in 11 is lsquo2rsquo and if the purpose areas are logical
Rate 3 if your rating in 11 is lsquo3rsquo but if the purpose areas are not in the logical sequence
Rate 4 if your rating in 11 is lsquo3rsquo and if the purpose areas are logical
Each of the purpose areas refer to specific content topics
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnairersquos items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics kept together in their distinct groups
namely the three purpose areas
13 The logical grouping of the content topics
Rate 0 if your rating in 11 is lsquo0rsquo or lsquo1rsquo
32
Rate 1 content topics of none of the three purpose areas are kept in their logical groups
Rate 2 one kept
Rate 3 two kept
Rate 4 all kept
Effective communication requires clear precise and correct language written in short
direct sentences to enable unambiguous and precise communication Technical information
should be simple and straightforward
14 The clarity of the manualrsquos language
Rate 0 if your rating in 11 is lsquo0rsquo
Rate 1 if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2 any two of (a) (b) and (c)
Rate 3 any one of (a) (b) and (c)
Rate 4 none of (a) (b) and (c)
2 The quality of the information provided in the manual
The manual should cover ten content topics
21 The aim of the questionnaire
22 The population targeted by the questionnaire
23 The design of the questionnaire
24 The sample used to test the questionnaire
25 How items were analysed and selected for the questionnaire
26 The reliability of the questionnaire
27 The validity of the questionnaire
28 Instructions for administering the questionnaire
29 Instructions for scoring the questionnaire
210 Guidelines for interpreting the information obtained via the questionnaire
21 The aim of the questionnaire
Rate 0 manual does not provide sufficient information
Rate 1 describes one of the following
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2 describes both (1) and (2)
33
22 The population targeted by the questionnaire
Rate 0 not sufficient information
Rate 1 describes the target population but it is not appropriate
Rate 2 describes the target population and it is appropriate
23 The design of the questionnaire
Rate 0 not sufficient information
Rate 1 if the manual describes one of the following
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2 any two
Rate 3 all three
24 The sample used to test the questionnaire
Rate 0 not sufficient information
Rate 1 describes the sample used but if the characteristics of the sample group differ from
target population
Rate 2 describes the sample used and if the characteristics of the sample group correspond
to the target population
25 How items were analysed and selected for the questionnaire Rate 0 not sufficient
information about the analysis and selection of
questionnaire items
Rate 1 describes one of the following
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2 describes both (1) and (2)
26 The reliability of the questionnaire
Rate 0 not sufficient information
Rate 1 does one of the following
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
27 The validity of the questionnaire
Rate 0 not sufficient information
34
Rate 1 does one of the following
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnairersquos validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2 any two of (1) (2) and (3)
Rate 3 all three of (1) (2) and (3)
28 Instructions for administering the questionnaire
Rate 0 does not provide instructions
Rate 1 instructions include one or two of the following
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2 include three or four
Rate 3 include all five
29 Instructions for scoring the questionnaire
Rate 0 if scoring (but no decoding) is required but it does not provide instructions
Rate 1 if both scoring and decoding are required and provides instructions for scoring but
not for decoding
Rate 2 if both scoring and decoding are required and provides instructions for both
210 Guidelines for interpreting the information obtained via the questionnaire
Rate 0 does not provide instructions
Rate 1 does one of the following
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2 does both (1) and (2)
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Action Study the manual below
Manual for ldquoTHE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)rdquo
The Marston Personality Questionnaire (MPQ) was designed to measure individualsrsquo
preferred behaviour styles in their work environments The MPQ is intended to be used with
the Marston Job Description System (MJDS) a method to describe any job in terms of
behaviour styles Once a job has been described by the MJDS one knows what kind of
behaviour style would be required of the person who does the job
35
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally) interaction (the ability to work with people) management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations) Each
factor is measured on a ten point scale
50 multiple choice items If ltdescription of work situationgt I prefer to
(a) ltdescription of lsquodriversquo actiongt
(b) ltdescription of lsquointeractionrsquo actiongt
(c) ltdescription of lsquomanagementrsquo actiongt
(d) ltdescription of lsquoregulationrsquo actiongt
To counter response bias the sequence in which the action descriptions are provided is
varied randomly Nine hundred university students used in the development of the
questionnaire
The original MPQ consisted of 145 items Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included Thirteen were excluded thus 50
items were retained The validity coefficient of 091 is high
30
Rate 2 structure limits its functionality with regard to one
Rate 3 structure limits its functionality with regard to none
Activity 73 Compare your evaluations to the QWAN
QWAN stands for Quality Without A Name No single person can be absolutely sure that
the ratings heshe assigns are absolutely correct The QWAN is the quality we all strive for
but that no one can claim It is simply a standard against which you measure yourself
8 Evaluate a manual
Outcome product
An evaluated manual
Method
Activity 81 Explore a manualrsquos rating scale
Activity 82 Use the manualrsquos rating scale to evaluate a manual
Activity 83 Compare your evaluations to the QWAN
Resource reference
Manual The purpose and structure of a manual
Activity 81 Explore a manualrsquos rating scale
The main issues one should consider when evaluating this kind of manual are
1 the extent to which the manual constitutes its purpose and
2 the quality of the information provided in the manual
1 The extent to which the manual constitutes its purpose
The purpose of a questionnaire manual is to provide information for the person who plans
to use the questionnaire. Questionnaire administrators need to know three things:
(a) whether the questionnaire is relevant for the purpose the questionnaire administrator
has in mind,
(b) whether the questionnaire will work, and
(c) how the questionnaire should be used.
The manual constitutes its purpose if it provides information about (a) the nature of the
questionnaire (i.e. the aim and design of the questionnaire and the population it can be
used with), (b) the functionality of the questionnaire (i.e. the analysis and selection of items
for the questionnaire, and the reliability and validity of the questionnaire) and (c)
instructions for using the questionnaire (i.e. instructions for administration, scoring and
interpretation). Focus on whether the manual is able to achieve its purpose, not whether it
has in fact achieved this purpose.
1.1 The purpose areas covered in the manual
Rate 0: no information about the following purpose areas:
(a) the nature of the questionnaire
(b) the functionality of the questionnaire
(c) instructions for using the questionnaire
Rate 1: information about one
Rate 2: information about any two
Rate 3: information about all three
Proper communication requires a logical presentation and clear and correct language. The
manual should start with a description of
(a) the nature of the questionnaire, then discuss
(b) the functionality of the questionnaire, and conclude with
(c) instructions for using the questionnaire.
The logical sequence would be (a), (b), (c).
1.2 The logical sequence of the presentation
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: if your rating in 1.1 is '2' but the purpose areas are not in the logical sequence
Rate 2: if your rating in 1.1 is '2' and the purpose areas are in the logical sequence
Rate 3: if your rating in 1.1 is '3' but the purpose areas are not in the logical sequence
Rate 4: if your rating in 1.1 is '3' and the purpose areas are in the logical sequence
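The coverage rating and the sequence rating described above form a small decision procedure. The following is a minimal Python sketch of that procedure, not part of the original rating scale; the function names `rate_coverage` and `rate_sequence` and their parameters are hypothetical and chosen only for illustration.

```python
def rate_coverage(has_nature: bool, has_functionality: bool, has_instructions: bool) -> int:
    """Coverage rating (0 to 3): how many of the three purpose areas the manual covers."""
    return sum([bool(has_nature), bool(has_functionality), bool(has_instructions)])


def rate_sequence(coverage: int, in_logical_sequence: bool) -> int:
    """Sequence rating (0 to 4), which depends on the coverage rating."""
    if coverage not in (0, 1, 2, 3):
        raise ValueError("coverage rating must be 0, 1, 2 or 3")
    if coverage <= 1:
        # With fewer than two purpose areas there is no sequence to judge.
        return 0
    if coverage == 2:
        return 2 if in_logical_sequence else 1
    # coverage == 3
    return 4 if in_logical_sequence else 3
```

For example, a manual that covers the nature and the functionality of the questionnaire, but presents functionality first, would receive a coverage rating of 2 and a sequence rating of 1 under this sketch.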
Each of the purpose areas refers to specific content topics.
The nature of the questionnaire refers to
- the aim of the questionnaire
- the target population
- the design of the questionnaire
The functionality of the questionnaire refers to
- the sample used to test the questionnaire
- the analysis and selection of the questionnaire's items
- the reliability of the questionnaire
- the validity of the questionnaire
The instructions for using the questionnaire refer to
- instructions for administration
- instructions for scoring
- instructions for interpretation
A manual should cover ten different content topics, kept together in their distinct groups,
namely the three purpose areas.
1.3 The logical grouping of the content topics
Rate 0: if your rating in 1.1 is '0' or '1'
Rate 1: content topics of none of the three purpose areas are kept in their logical groups
Rate 2: one kept
Rate 3: two kept
Rate 4: all kept
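The grouping rating above can be read as "zero when too few purpose areas are covered, otherwise one plus the number of purpose areas whose content topics stay together". A minimal Python sketch under that reading follows; the function name `rate_grouping` and its parameters are hypothetical.

```python
def rate_grouping(coverage: int, areas_kept_together: int) -> int:
    """Grouping rating (0 to 4).

    coverage: the coverage rating (0 to 3) given for the purpose areas.
    areas_kept_together: how many of the three purpose areas (0 to 3)
        have their content topics kept in one logical group.
    """
    if coverage not in (0, 1, 2, 3):
        raise ValueError("coverage rating must be 0, 1, 2 or 3")
    if areas_kept_together not in (0, 1, 2, 3):
        raise ValueError("areas_kept_together must be 0, 1, 2 or 3")
    if coverage <= 1:
        return 0
    return 1 + areas_kept_together
```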
Effective communication requires clear, precise and correct language, written in short,
direct sentences, to enable unambiguous and precise communication. Technical information
should be simple and straightforward.
1.4 The clarity of the manual's language
Rate 0: if your rating in 1.1 is '0'
Rate 1: if the manual contains
(a) difficult language and
(b) ambiguous statements and
(c) grammar and spelling mistakes
Rate 2: any two of (a), (b) and (c)
Rate 3: any one of (a), (b) and (c)
Rate 4: none of (a), (b) and (c)
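The clarity rating above amounts to starting at 4 and subtracting one point for each of the three language problems present, with the rating forced to 0 when the coverage rating is 0. A minimal Python sketch of that rule follows; the function name `rate_clarity` and its parameter names are hypothetical.

```python
def rate_clarity(coverage: int,
                 difficult_language: bool,
                 ambiguous_statements: bool,
                 grammar_or_spelling_mistakes: bool) -> int:
    """Clarity rating (0 to 4) for the manual's language."""
    if coverage == 0:
        # No purpose area is covered, so there is no content to judge.
        return 0
    problems = sum([bool(difficult_language),
                    bool(ambiguous_statements),
                    bool(grammar_or_spelling_mistakes)])
    return 4 - problems
```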
2 The quality of the information provided in the manual
The manual should cover ten content topics:
2.1 The aim of the questionnaire
2.2 The population targeted by the questionnaire
2.3 The design of the questionnaire
2.4 The sample used to test the questionnaire
2.5 How items were analysed and selected for the questionnaire
2.6 The reliability of the questionnaire
2.7 The validity of the questionnaire
2.8 Instructions for administering the questionnaire
2.9 Instructions for scoring the questionnaire
2.10 Guidelines for interpreting the information obtained via the questionnaire
2.1 The aim of the questionnaire
Rate 0: manual does not provide sufficient information
Rate 1: describes one of the following:
(1) what the questionnaire measures
(2) how the information obtained could be used
Rate 2: describes both (1) and (2)
2.2 The population targeted by the questionnaire
Rate 0: not sufficient information
Rate 1: describes the target population but it is not appropriate
Rate 2: describes the target population and it is appropriate
2.3 The design of the questionnaire
Rate 0: not sufficient information
Rate 1: if the manual describes one of the following:
(1) the domain of the questionnaire
(2) how the questionnaire items cover the domain
(3) the types of items used in the questionnaire
Rate 2: any two
Rate 3: all three
2.4 The sample used to test the questionnaire
Rate 0: not sufficient information
Rate 1: describes the sample used, but the characteristics of the sample group differ from
those of the target population
Rate 2: describes the sample used, and the characteristics of the sample group correspond
to those of the target population
2.5 How items were analysed and selected for the questionnaire
Rate 0: not sufficient information about the analysis and selection of questionnaire items
Rate 1: describes one of the following:
(1) the technique used for item analysis
(2) the criteria used for including items in or excluding items from the questionnaire
Rate 2: describes both (1) and (2)
2.6 The reliability of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following:
(1) describes the method used to determine reliability
(2) motivates why the particular type of reliability is used
(3) evaluates the reliability coefficient in terms of what can be regarded as an
acceptable level of reliability
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.7 The validity of the questionnaire
Rate 0: not sufficient information
Rate 1: does one of the following:
(1) names the category of validity that is relevant to the questionnaire
(2) explains how evidence for the questionnaire's validity was gathered
(3) discusses to what extent the questionnaire measures what it claims to measure
Rate 2: any two of (1), (2) and (3)
Rate 3: all three of (1), (2) and (3)
2.8 Instructions for administering the questionnaire
Rate 0: does not provide instructions
Rate 1: instructions include one or two of the following:
(1) an indication of the kind of person who is allowed to administer the questionnaire
(2) the situations in which the questionnaire can be administered
(3) the material required for the administration of the questionnaire
(4) the ways in which enquiries about items should be handled
(5) guidelines as to how the questionnaire should be introduced to people who are
about to complete the questionnaire
Rate 2: include three or four
Rate 3: include all five
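The rating for the administration instructions above maps a count of five possible elements onto a four-point scale. A minimal Python sketch of that mapping follows. The element labels and the function name `rate_administration` are hypothetical, and treating "none of the five elements present" as "does not provide instructions" is an assumption of this sketch.

```python
from typing import Iterable

# Hypothetical short labels for the five elements listed in the rating scale.
ADMIN_ELEMENTS = frozenset({
    "administrator",   # who is allowed to administer the questionnaire
    "situations",      # situations in which it can be administered
    "material",        # material required for administration
    "enquiries",       # how enquiries about items should be handled
    "introduction",    # how to introduce the questionnaire to respondents
})


def rate_administration(elements_present: Iterable[str]) -> int:
    """Rating (0 to 3) for the instructions for administering the questionnaire."""
    count = len(set(elements_present) & ADMIN_ELEMENTS)
    if count == 0:
        return 0  # assumption: no listed element means no instructions
    if count <= 2:
        return 1
    if count <= 4:
        return 2
    return 3
```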
2.9 Instructions for scoring the questionnaire
Rate 0: if scoring (but no decoding) is required but the manual does not provide instructions
Rate 1: if both scoring and decoding are required and the manual provides instructions for
scoring but not for decoding
Rate 2: if both scoring and decoding are required and the manual provides instructions for both
2.10 Guidelines for interpreting the information obtained via the questionnaire
Rate 0: does not provide instructions
Rate 1: does one of the following:
(1) provides instructions for interpretation of the information
(2) explains how the interpretation fits the aim
Rate 2: does both (1) and (2)
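Several of the content-topic ratings above (the aim, the design, the item analysis, the reliability, the validity and the interpretation guidelines) follow one pattern: the rating equals the number of listed elements that the manual describes. A minimal Python sketch of that shared pattern follows; the function name `rate_by_count` is hypothetical, and the other topics (target population, sample, administration and scoring) use different rules and are not covered by it.

```python
from typing import Sequence


def rate_by_count(elements_described: Sequence[bool]) -> int:
    """Rating for a content topic whose score is the number of listed
    elements, such as (1), (2) and (3), that the manual describes."""
    return sum(1 for described in elements_described if described)


# Example: a manual that describes the method used to determine reliability
# and evaluates the coefficient, but does not motivate the type of reliability.
reliability_rating = rate_by_count([True, False, True])
```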
Activity 8.2 Use the manual's rating scale to evaluate a manual
Action: Study the manual below.
Manual for "THE MARSTON PERSONALITY QUESTIONNAIRE (THE MPQ)"
The Marston Personality Questionnaire (MPQ) was designed to measure individuals'
preferred behaviour styles in their work environments. The MPQ is intended to be used with
the Marston Job Description System (MJDS), a method to describe any job in terms of
behaviour styles. Once a job has been described by the MJDS, one knows what kind of
behaviour style would be required of the person who does the job.
The MPQ measures preferred behaviour style in terms of drive (the ability to get things
done personally), interaction (the ability to work with people), management (the ability to
keep systems going) and regulation (the ability to adhere to rules and regulations). Each
factor is measured on a ten point scale.
50 multiple choice items: If <description of work situation>, I prefer to
(a) <description of 'drive' action>
(b) <description of 'interaction' action>
(c) <description of 'management' action>
(d) <description of 'regulation' action>
To counter response bias the sequence in which the action descriptions are provided is
varied randomly. Nine hundred university students used in the development of the
questionnaire.
The original MPQ consisted of 145 items. Item analysis showed 63 really good but that the
remaining 82 items did not meet the criteria to be included. Thirteen were excluded thus 50
items were retained. The validity coefficient of 0.91 is high.