From Concepts to Data:
Conceptualization, Operationalization, and Measurement in Educational Research
Larry D. Gruppen, Ph.D.
University of Michigan
Objectives
• Identify key research design issues
• Wrestle with the complexities of educational measurement
• Explain the concepts of reliability and validity in educational measurement
• Apply criteria for measurement quality when conducting educational research
Agenda
• A brief nod to design
• From theory to measurement
• Criteria for measurement quality
– Reliability
– Validity
• Application: analyze an article
Guiding Principles for Scientific Research in Education
1. Question: pose significant question that can be investigated empirically
2. Theory: link research to relevant theory
3. Methods: use methods that permit direct investigation of the question
4. Reasoning: provide coherent, explicit chain of reasoning
5. Replicate and generalize across studies
6. Disclose research to encourage professional scrutiny and critique
Study design
• Study design consists of:
– Your measurement method(s)
– The participants and how they are assigned
– The intervention
– The sequence and timing of measurements and interventions
Comparison Group
• Pre-post design - compare intervention group to itself
• Non-equivalent control group design - compare intervention group to an existing group
• Randomized control group design - compare to equivalent controls
Overview of Study Designs
• Symbols
– Each line represents a group.
– x = Intervention (e.g., treatment)
– O1, O2, O3… = Observation (measurement) at Time 1, Time 2, Time 3, etc.
– R = Random assignment
Non-Experimental Designs
One-Group Posttest
x O1
Quasi-Experimental Designs
Posttest-Only Control Group
x O1
  O1
One-Group Pretest-Posttest
O1 x O2
Control Group Pretest-Posttest
O1 x O2
O1   O2
Experimental Designs
Posttest Only Randomized Control Group
R x O1
R   O1
Randomized Control Group Pretest-Posttest
R O1 x O2
R O1   O2
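As a rough illustration of the randomized pretest-posttest notation, the sketch below assigns participants at random (the R) and compares the average O2 − O1 change between the intervention and control groups. The function names and the scores in the usage note are hypothetical, and a real study would add an appropriate statistical test.

```python
import random
from statistics import mean


def randomly_assign(participants, seed=0):
    """Shuffle participants and split them into two groups (the R in the notation).

    Returns (intervention_group, control_group). The seed is fixed only so
    that this illustration is repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gain(pre_scores, post_scores):
    """Average O2 - O1 change for one group (scores paired by participant)."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


def gain_difference(intervention, control):
    """Mean gain of the x group minus mean gain of the control group.

    Each argument is a (pre_scores, post_scores) pair.
    """
    return mean_gain(*intervention) - mean_gain(*control)
```

For example, with made-up scores, `gain_difference(([50, 60, 70], [60, 70, 80]), ([50, 60, 70], [52, 62, 72]))` compares a 10-point average gain with a 2-point average gain.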
From Theory to Measurement
Theory → Constructs → Operational Definition → Measurement
Measurement
• Measurement: assignment of numbers to objects or events according to rules
• Quality: reliability and validity
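One way to picture "assignment of numbers according to rules" is a scoring rule for a rating-scale questionnaire. The sketch below is only an illustration: the five-point labels, the reverse-keying rule, and the function names are assumptions, not part of any particular instrument.

```python
# Hypothetical rule: map each response label to a number from 1 to 5.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_response(label, reverse=False):
    """Turn one response label into a number; reverse-keyed items are flipped."""
    value = LIKERT[label.strip().lower()]
    return 6 - value if reverse else value


def scale_score(responses, reverse_items=()):
    """Sum the item scores for one respondent.

    reverse_items holds the zero-based positions of negatively worded items.
    """
    return sum(
        score_response(label, reverse=(index in reverse_items))
        for index, label in enumerate(responses)
    )
```

With these rules, `scale_score(["agree", "strongly disagree", "neutral"], reverse_items={1})` adds 4, 5 (a reversed 1), and 3.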
The Challenge of Educational Measurement
• Almost all of the constructs we are interested in are buried inside the individual
• Measurement depends on transforming these internal states, events, capabilities, etc. into something observable
• Making them observable may alter the thing we are measuring
Examples of Measurement Methods
• Tests (knowledge, performance): defined response, constructed response, simulations
• Questionnaires (attitudes, beliefs, preferences): rating scales, checklists, open-ended responses
• Observations (performance, skills): tasks (varying degrees of authenticity), problems, real-world behaviors, records (documents)
Reliability
• Dependability (consistency or stability) of measurement
• A necessary condition for validity
Types of Reliability
• Stability (produces the same results with repeated measurements over time):
– Test-retest
– Correlation between scores at 2 times
• Equivalence/Internal Consistency (produces the same results with parallel items on alternate forms):
– Alternate forms; split-half; Kuder-Richardson; Cronbach’s alpha
– Correlation between scores on different forms; calculate coefficient alpha (α)
• Consistency (produces the same results with different observers or raters):
– Inter-rater agreement
– Correlation between scores from different raters; kappa coefficient
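The three kinds of reliability coefficient named above can be sketched with the standard textbook formulas: a Pearson correlation for test-retest, coefficient alpha for internal consistency, and Cohen's kappa for two raters. This is a minimal standard-library sketch with hypothetical function names; it assumes complete data and, for kappa, exactly two raters who use more than one category between them.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation, e.g. between scores at Time 1 and Time 2."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def cronbach_alpha(item_scores):
    """Coefficient alpha from rows of item scores (one row per respondent).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    item_variances = [variance(column) for column in zip(*item_scores)]
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)
```

As a usage example with made-up ratings, `cohen_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])` has 75% observed agreement and 50% chance agreement.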
Validity
• Refers to the accuracy of inferences based on data obtained from measurement
• Technically, measures aren’t valid, inferences are
• No such thing as validity in the abstract: the key issue is ‘valid’ for what inference
• Want to reduce systematic, non-random error
• Unreliability lowers correlations, weakening validity claims
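The last point can be made concrete with the classical attenuation formula, under which the observed correlation between two measures equals the correlation between their true scores multiplied by the square root of the product of their reliabilities. The sketch below assumes classical test theory (uncorrelated random errors); the function names and the numbers in the usage note are hypothetical.

```python
from math import sqrt


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Correlation expected between two imperfectly reliable measures."""
    return true_r * sqrt(reliability_x * reliability_y)


def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimate of the true-score correlation from an observed correlation.

    Such estimates can exceed 1.0 when the reliabilities are underestimated.
    """
    return observed_r / sqrt(reliability_x * reliability_y)
```

For instance, a true-score correlation of 0.6 measured with reliabilities of 0.81 and 0.64 would be expected to appear as roughly 0.6 × 0.72 ≈ 0.43.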
Conventional View of Validity
• Face validity: logical link between items and purpose—makes sense on the surface
• Content validity: items cover the range of meaning included in the construct or domain. Expert judgment
• Criterion validity: relationship between performance on one measurement and performance on another (or actual behavior). Concurrent and predictive. Correlation coefficients.
• Construct validity: directly connect measurement with theory. Allows interpretation of empirical evidence in terms of theoretical relationships. Based on weight of evidence. Convergent and discriminant evidence. Multitrait-MultiMethod Analysis (MTMM)
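A simplified picture of convergent and discriminant evidence: scores on a new measure should correlate more strongly with another measure of the same construct than with a measure of a different construct. The sketch below is a two-correlation check, not a full MTMM analysis, and the function names and data are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def convergent_discriminant_summary(new_measure, same_construct, other_construct):
    """Compare the convergent correlation with the discriminant correlation."""
    convergent_r = pearson_r(new_measure, same_construct)
    discriminant_r = pearson_r(new_measure, other_construct)
    return {
        "convergent_r": convergent_r,
        "discriminant_r": discriminant_r,
        # Pattern consistent with construct validity evidence (one piece only).
        "expected_pattern": abs(convergent_r) > abs(discriminant_r),
    }


# Hypothetical scores for five participants.
summary = convergent_discriminant_summary(
    new_measure=[1, 2, 3, 4, 5],
    same_construct=[1, 2, 3, 4, 6],
    other_construct=[2, 1, 2, 1, 2],
)
print(summary)
```

A single pair of correlations is only one strand of evidence; the slide's point that construct validity rests on the weight of evidence still applies.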
Unified View of Construct Validity (Messick S, Amer Psych, 1995)
• Validity is not a property of an instrument but rather of the meaning of the scores. Must be considered holistically.
• 6 Aspects of Construct Validity Evidence
– Content: content relevance & representativeness
– Substantive: theoretical rationale for observed consistencies in test responses
– Structural: fidelity of scoring structure to structure of construct domain
– Generalizability: generalization to the population and across populations
– External: convergent and discriminant evidence
– Consequential: intended and unintended consequences of score interpretation; social consequence of assessment (fairness, justice)
Finding Measurement Instruments
• Scan the engineering education literature (obviously)
• Email engineering ed researchers (use the network)
• Examine literature for instruments used in prior studies
• General education/social science instrument databases
– Buros Institute of Mental Measurements (Mental Measurements Yearbook, Tests in Print) http://buros.unl.edu/buros/jsp/search.jsp
– ERIC databases http://www.eric.ed.gov/
– Educational Testing Service Test Collection http://www.ets.org/testcoll/index.html
• Construct your own (last resort!)
– Get some expert consultation (test writing, survey design, questionnaire construction, etc.)
Example
• In your groups, analyze the Steif & Dantzler statics concept inventory article. Look for:
– Theoretical framework
– Constructs used in the study
– How constructs were operationalized
– Measurement process
• Attention to reliability and validity
References
• Campbell DT, Stanley JC. Experimental and quasi-experimental designs for research. Chicago: Rand McNally; 1969.
• Cook TD, Campbell DT. Quasi-experimentation: design and analysis for field settings. Chicago: Rand McNally; 1979.
• Messick S. Validity of psychological assessment: validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist. 1995;50:741-749.
• Messick S. Validity. In: Linn RL, ed. Educational measurement. 3rd ed. New York: American Council on Education & Macmillan; 1989:13-103.