Measurement, Data Collection, Validity & Reliability Data is your friend

Upload: byron-norman

Post on 13-Dec-2015


Page 1: Measurement, Data Collection, Validity & Reliability Data is your friend

Measurement, Data Collection,

Validity & Reliability

Data is your friend

Page 2

Agenda

• Measurement

• Measures (aka, ways to collect data)

• Validity/reliability, up close and personal

Page 3

Educational Measurement

• Measurement: assignment of numbers to differentiate values of a variable

• GOOD RESEARCH MUST HAVE SOUND MEASUREMENT!!

Page 4

Thought Question

• Consider the following scores on a test

Marco 90      Adriane 85   Linda 75   Christy 99
Chantelle 88  Jay 45       Remi 68    Marcus 97
Chi Bo 92     Donnie 85

• Which measure of central tendency would Adriane use when telling her parents about her performance?
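One way to explore the question is to compute all three common measures of central tendency for these ten scores. The sketch below uses Python's standard `statistics` module; the variable names are ours and are not part of the slides.

```python
from statistics import mean, median, mode

# Scores copied from the slide above.
scores = {
    "Marco": 90, "Adriane": 85, "Linda": 75, "Christy": 99, "Chantelle": 88,
    "Jay": 45, "Remi": 68, "Marcus": 97, "Chi Bo": 92, "Donnie": 85,
}

values = list(scores.values())
class_mean = mean(values)      # arithmetic average of all scores
class_median = median(values)  # middle of the sorted scores
class_mode = mode(values)      # most frequently occurring score

print(f"mean={class_mean}, median={class_median}, mode={class_mode}")
print("Adriane above the mean:  ", scores["Adriane"] > class_mean)
print("Adriane above the median:", scores["Adriane"] > class_median)
```

Comparing Adriane's score with each measure shows why the choice of statistic changes the story she can tell.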

Page 5

Descriptive Statistics

• Statistics: procedures that summarize and analyze quantitative data

• Descriptive statistics: statistical procedures that summarize a set of numbers in terms of central tendency or variation

• Important for understanding what the data tells the researcher

Page 6

Descriptive Statistics: A Caution

• Statistics can provide us with useful information, but they can be interpreted in different ways to say different things

Page 7

Thought Question

If Jay scored an 85 instead of a 45, what changes?

Highly deviant scores (called "outliers") have no more effect on the median than those scores very close to the middle. However, outliers can greatly affect the mean.
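A minimal sketch of the comparison, assuming the ten scores from the earlier Thought Question slide and changing only Jay's score. The names `original` and `revised` are ours.

```python
from statistics import mean, median

original = [90, 85, 75, 99, 88, 45, 68, 97, 92, 85]  # Jay scored 45
revised = [85 if s == 45 else s for s in original]    # Jay scored 85 instead

# How far each measure moves when the outlier is replaced.
mean_shift = mean(revised) - mean(original)
median_shift = median(revised) - median(original)

print(f"mean moved by {mean_shift:.1f} points")
print(f"median moved by {median_shift:.1f} points")
```

Only one score changed, yet the mean moves while the median stays where it was, which is the point the slide makes about outliers.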

Page 8

Descriptive Statistics

• Frequency distributions (see Figure 6.2)

  • Normal - scores distributed symmetrically around the middle

  • Positively skewed - large number of low scores and a small number of high scores; the mean is pulled toward the high (positive) end

  • Negatively skewed - large number of high scores and a small number of low scores; the mean is pulled toward the low (negative) end

Page 9

Normal Distribution

Page 10

An Extreme Example

• Consider the salaries of 10 people

• Group A – All are teachers.

Salaries: $45,000 $45,000 $45,000 $50,000 $50,000 $50,000 $50,000 $55,000 $55,000 $55,000

Page 11

An Extreme Example

• Consider the salaries of 10 people

• Group B – Nine are teachers; 1 is Donovan McNabb.

Salaries: $45,000 $45,000 $45,000 $50,000 $50,000 $50,000 $50,000 $55,000 $55,000 $6,300,000

Page 12

An Extreme Example

• What happens to the mean and median in these 2 examples? Does it change?

• What happens to the normal distribution?
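The two salary lists from the preceding slides can be compared directly. This is a small sketch using Python's `statistics` module; the list names `group_a` and `group_b` are ours.

```python
from statistics import mean, median

# Group A: ten teachers. Group B: nine teachers plus one $6,300,000 salary.
group_a = [45_000] * 3 + [50_000] * 4 + [55_000] * 3
group_b = [45_000] * 3 + [50_000] * 4 + [55_000] * 2 + [6_300_000]

for name, salaries in (("A", group_a), ("B", group_b)):
    print(f"Group {name}: mean=${mean(salaries):,.0f}  median=${median(salaries):,.0f}")
```

A single extreme salary drags the mean of Group B far above what any teacher earns, while the median is the same in both groups. That long tail toward high values is what a positive skew looks like.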

Page 13

Positive Skew

Page 14

Negative Skew

Page 15

Case in Point: Teacher Salary

• Compare Radnor to Philadelphia

• Is the salary distribution for Philadelphia going to be positively or negatively skewed? (Hint: Look at the # years of experience)

Page 16

Descriptive Statistics

• Variability

  • How different are the scores?

  • Types

    • Range: the difference between the highest and lowest scores

    • Standard deviation

      • Roughly the average distance of the scores from the mean

      • The relationship to the normal distribution

        • ±1 SD = about 68% of all scores in a normal distribution

        • ±2 SD = about 95% of all scores in a normal distribution
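A short sketch of both measures of variability, using the ten test scores from the earlier Thought Question slide. Treating the ten students as the whole population (and therefore using `pstdev` rather than `stdev`) is our assumption. The 68% and 95% figures are checked against a standard normal curve with `statistics.NormalDist`.

```python
from statistics import NormalDist, pstdev

scores = [90, 85, 75, 99, 88, 45, 68, 97, 92, 85]

score_range = max(scores) - min(scores)  # highest score minus lowest score
sd = pstdev(scores)                      # population standard deviation

# Share of a normal distribution lying within k SDs of the mean.
standard_normal = NormalDist()  # mean 0, standard deviation 1
within_1sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"range={score_range}, SD={sd:.2f}")
print(f"within 1 SD: {within_1sd:.1%}, within 2 SD: {within_2sd:.1%}")
```

The percentages apply to normally distributed scores; a skewed set such as this class, with one very low score, will not follow them exactly.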

Page 17

Variability

Page 18

Variability

• Why does variability matter?

Page 19

Descriptive Statistics

• Relationship

  • How two sets of scores relate to one another

  • Correlation (positive)

    • Low .10 - .39

    • Moderate .40 - .69

    • High .70 and above
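To make the correlation bands concrete, here is a minimal sketch that computes a Pearson correlation by hand and labels it with the ranges from the slide. The data (`hours_studied`, `quiz_score`) are made up for illustration, and the label for values below .10 is our addition since the slide does not cover them.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def strength(r):
    """Label a positive correlation using the bands on the slide."""
    if r >= 0.70:
        return "high"
    if r >= 0.40:
        return "moderate"
    if r >= 0.10:
        return "low"
    return "negligible or negative (not covered by the slide)"

hours_studied = [1, 2, 3, 4, 5]  # made-up data
quiz_score = [2, 1, 4, 3, 5]     # made-up data

r = pearson_r(hours_studied, quiz_score)
print(f"r = {r:.2f} ({strength(r)})")
```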

Page 20

Example of Correlation

Page 21

Measures of Data Collection

• Tests

• Questionnaires

• Observations

• Interviews

Page 22

Measures (Means of Data Collection)

You must match the instrument to the research question!

Page 23

Questionnaires

http://www.authentichappiness.sas.upenn.edu/

• Thoughts on those you responded to

  • Approaches to Happiness

  • Optimism

  • Grit

Page 24

Examples to critique

• Measures

• Questionnaire – Psychological School Membership Survey used with middle school students

• Interview protocol – for teachers & counselors regarding professional development issues

• Observation instrument – PDE 430 for student teachers

• What are 2 benefits and 2 limitations of this measure?

Page 25

Questionnaires

• Used to obtain a subject’s perceptions, attitudes, beliefs, values, opinions, or other non-cognitive traits

• Scales - a continuum that describes subject’s responses to a statement

  • Likert

  • Checklists

  • Ranked items

Page 26

Questionnaires

• Likert scales

  • Response options require the subject to determine the extent to which they agree with a statement

  • Debate over odd v. even number responses

  • Statements must reflect extreme positive or extreme negative positions

  • Example – CATS evaluations

Page 27

Questionnaires

• Checklists

  • Choose options

• Ranked items

  • Sequential order

  • Avoids marking everything high or low

Page 28

Questionnaires

• Problems with measuring non-cognitive traits

  • Difficulty clearly defining what is being measured

    • Self-concept or self-esteem

  • Response set

    • Responding same way (Ex - all 4’s on CATS)

  • Social desirability

    • “PC filter”

  • Faking

    • Agreeing with statements because of the negative consequences associated with disagreeing

Page 29

Questionnaires

• Controlling problems

  • Equal numbers of positively and negatively worded statements

  • Alternating positive and negative statements

  • Providing confidentiality or anonymity to respondents
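Mixing positively and negatively worded statements has a practical consequence when scoring: the negative items are usually reverse-scored before totals are computed. The slides do not describe this step, so the sketch below is our assumption of a common approach, on an assumed 5-point scale with a hypothetical six-item survey. It also flags a response set, where every item gets the same answer.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed 5-point agreement scale

def reverse_score(response):
    """Flip a response so that 5 becomes 1, 4 becomes 2, and so on."""
    return SCALE_MAX + SCALE_MIN - response

def total_score(responses, negatively_worded):
    """Sum the responses after reverse-scoring the negatively worded items."""
    return sum(
        reverse_score(r) if i in negatively_worded else r
        for i, r in enumerate(responses)
    )

def is_response_set(responses):
    """True when every item received the same answer (e.g. all 4's)."""
    return len(set(responses)) == 1

# Hypothetical survey: items at index 1, 3 and 5 are negatively worded.
negative_items = {1, 3, 5}
engaged = [5, 1, 4, 2, 5, 1]         # agrees with positive, disagrees with negative
straight_liner = [4, 4, 4, 4, 4, 4]  # same answer to everything

for label, answers in (("engaged", engaged), ("straight-liner", straight_liner)):
    print(label, total_score(answers, negative_items), is_response_set(answers))
```

Because half the items are reversed, a respondent who marks the same number throughout ends up with a middling total instead of an inflated one, and the pattern is easy to detect.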

Page 30

Designing Questionnaires

• Online resources

  • http://pareonline.net/getvn.asp?v=5&n=3

  • http://www.peecworks.org/PEEC/PEEC_Inst/I0004E536

  • http://www.statpac.com/surveys/

Page 31

Observations

• Observations - direct observations of behaviors

  • Provide first hand account (ameliorates issues of self-reporting in questionnaires)

  • Natural or controlled settings

    • Ex – classroom vs. lab (child attachment studies)

  • Structured or unstructured observations

    • Ex – frequency counts vs. narrative record

  • Detached or involved observers

Page 32

Observations

• Inference

  • Low inference - involves little if any inference on the observer’s part

    • On-task/Off-task behavior instrument

  • High inference - involves high levels of inference on the observer’s part

    • Teacher effectiveness – PDE form 430

Page 33

Observations

• Controlling observer effects

  • Observer bias

    • Training

    • Inter-rater reliability (e.g., percent agreement or Cohen’s kappa)

    • Multiple observers

  • Contamination - knowledge of the study influences the observation

    • Training

    • Targeting specific behaviors

    • Observers do not know of the expected outcomes

    • Observers are “blind” to which group is which

Page 34

Observations

• Observer effects

  • Halo effect - initial ratings influence subsequent ratings

  • Hawthorne effect - increased performance results from awareness of being part of study

  • Leniency - wanting everyone to do well

  • Central Tendency - measuring in the middle

  • Observer Drift - failing to record pertinent information

Page 35

Interviews

• What are some challenges to doing this kind of interviewing?

http://www.youtube.com/watch?v=d6bXH2k9MKE

Page 36

Interviews

• Advantages

  • Establish rapport & enhance motivation

  • Clarify responses through additional questioning

  • Capture the depth and richness of responses

  • Allow for flexibility

  • Reduce “no response” and/or “neutral” responses

Page 37

Interviews

• Disadvantages

  • Time consuming

  • Expensive

  • Small samples

  • Subjective – interviewer characteristics, contamination, bias

Page 38

Validity and Reliability

What’s all the fuss about?

Page 39

Validity/Reliability and Trustworthiness

• Why do we need validity and reliability in quantitative studies and “trustworthiness” in qualitative studies?

We can’t trust the results if we can’t trust the methods!

Page 40

Reader’s Digest version…

• Reliability

  • The extent to which scores are free from error

  • Error is measured by consistency

• Validity

  • The extent to which inferences are appropriate, meaningful, and useful

  • “Does the instrument measure what it is supposed to measure?”

Page 41

Thought Question

• On the ACT and SAT assessments, there is a definitive script that test administrators are required to follow exactly. What measurement issue are the test makers addressing?

Page 42

Reliability of Measurement

• Reliability - The extent to which measures are free from error

• Error is measured by consistency

Page 43

Reliability of Measurement

• Reliability

  • Measurement

    • 0.00 indicates no reliability or consistency

    • 1.00 indicates total reliability or consistency

    • < .60 = weak reliability

    • > .80 = sufficient reliability

Page 44

Reliability of Measurement

• Types of reliability evidence

  • Stability (i.e. test-retest)

    • Testing the same subject using the same test on two occasions

    • Limitation - carryover effects from the first to second administration of the test

  • Equivalence (i.e. parallel form)

    • Testing the same subject with two parallel (i.e. equal) forms of the same test taken at the same time

    • Limitation - difficulty in creating parallel forms
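Stability evidence is commonly reported as the correlation between the two administrations. The sketch below assumes that approach and uses made-up scores for five students; the `interpret` helper applies the < .60 and > .80 cut-offs from the earlier Reliability of Measurement slide, and the label for values in between is our own since the slides give none.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def interpret(coefficient):
    """Apply the cut-offs from the slides to a reliability coefficient."""
    if coefficient < 0.60:
        return "weak"
    if coefficient > 0.80:
        return "sufficient"
    return "in between (no label given on the slides)"

# Made-up scores for the same five students on two occasions.
first_administration = [70, 80, 90, 60, 85]
second_administration = [72, 78, 93, 63, 84]

stability = pearson_r(first_administration, second_administration)
print(f"test-retest coefficient: {stability:.2f} ({interpret(stability)})")
```

The same calculation works for parallel forms: replace the second administration with scores from the second form.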

Page 45

Reliability of Measurement

• Equivalence and stability

  • Testing the same subject with two forms of the same test taken at different times

  • Limitation - difficulty in creating parallel forms

Page 46

Reliability of Measurement

• Internal consistency

  • Testing the same subject with one test and “artificially” splitting the test into two halves

  • Limitations - must have a minimum of ten (10) questions

  • Often see “Cronbach’s alpha” for reliability coefficient (ex – Learning styles)
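For readers who want to see where an alpha coefficient comes from, here is a minimal sketch of the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the five-respondent, three-item data set are hypothetical; a real instrument would have many more items and respondents.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all the same length."""
    k = len(rows[0])                     # number of items on the instrument
    items = list(zip(*rows))             # regroup scores item by item
    item_variance_sum = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Made-up responses: 5 respondents x 3 items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha rises when respondents who score high on one item also tend to score high on the others, which is what "internal consistency" means.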

Page 47

Reliability of Measurement

• Agreement / Inter-rater reliability

  • Observational measures

  • Multiple observers coding similarly
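Two common ways to quantify how similarly observers code are simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The slides do not name a statistic here, so this choice is our assumption. The codes below are made up: two raters marking ten intervals as on-task or off-task.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of observations that both raters coded identically."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's share for every category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Made-up codes for ten observation intervals.
rater_1 = ["on", "on", "off", "on", "off", "on", "on", "off", "on", "on"]
rater_2 = ["on", "on", "off", "on", "on", "on", "on", "off", "off", "on"]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohen_kappa(rater_1, rater_2)
print(f"percent agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```

Kappa comes out lower than raw agreement because two raters who both code "on" most of the time will agree fairly often by chance alone.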

Page 48

Reliability of Measurement

• Enhancing reliability

  • Standardized administration procedures (e.g. directions, conditions, etc.)

  • Appropriate reading level

  • Reasonable length of the testing period

  • Counterbalancing the order of testing if several tests are being given

Page 49

Validity of Measurement

• Validity: the extent to which inferences are appropriate, meaningful, and useful

• Current example – content tests and teacher licensure

Page 50

Validity of Measurement

• For research results to have any value, validity of the measurement of a variable must exist

  • Use of established and “new” instruments and the implications for establishing validity

  • Importance of establishing validity prior to data collection (e.g. pilot tests)

Page 51

Validity

• Content

• Predictive (criterion-related)

• Concurrent

• Construct

Page 52

Thought Question

• Criticisms of standardized tests like the SAT claim that they discriminate against particular groups of students (especially minorities) and do not represent a broad enough domain of knowledge to adequately assess a student’s academic potential. What issue of validity is operating in these arguments?

Page 53

Thought Question

• Other arguments against the SAT state that the tests do not adequately estimate an individual’s ability to succeed in college. What issue of validity is operating here?

Page 54

Reliability & Validity of Measurement

• What is the relationship of reliability to validity?

  • If a watch consistently gives the time at 1:10 when actually it is 1:00, it is ____ but not ____.

  • ______ is a necessary but not sufficient condition for _______.

  • To be _____, an instrument must be ______, but a ____ instrument is not necessarily _____.

Page 55

Reliability & Validity of Measurement

• What is the relationship of reliability to validity?

  • If a watch consistently gives the time at 1:10 when actually it is 1:00, it is reliable but not valid.

  • Reliability is a necessary but not sufficient condition for validity.

  • To be valid, an instrument must be reliable, but a reliable instrument is not necessarily valid.
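The watch example can be put in numbers. In this sketch, which is our own illustration with made-up clock times, the spread of the watch's errors stands in for reliability (consistency) and the average error stands in for validity (whether it reports the right time).

```python
from statistics import mean, pstdev

# Made-up true times, in minutes after noon, and a watch that runs 10 minutes fast.
true_minutes = [60, 135, 250, 480, 725]
watch_minutes = [t + 10 for t in true_minutes]

errors = [w - t for w, t in zip(watch_minutes, true_minutes)]
spread = pstdev(errors)  # no variation in the error: perfectly consistent
bias = mean(errors)      # constant offset: consistently wrong

print(f"spread of errors = {spread} minutes, average error = {bias} minutes")
```

A spread of zero with a nonzero average error is exactly the "reliable but not valid" case from the slide.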

Page 56

Midterm

• Multiple Choice: 50 pts

• Short Answer: 25 pts

• Article Critique: 25 pts

Bring article with you to class. It’s ok to have notes on it.