MESSICK’S FRAMEWORK
What Do Evaluators Need to Know?
Outline
What this report will cover
1. Concepts of Validity
2. Messick’s Contributions
3. Messick’s Framework
Validity Concept
The concept of validity has historically seen a variety of iterations that involved “packing” different aspects into the concept and subsequently “unpacking” some of them.
Points of broad consensus
Validity is the most fundamental consideration in the evaluation of the appropriateness of claims about, and uses and interpretations of, assessment results.
Validity is a matter of degree rather than all or none.
SICI Conference 2010, North Rhine-Westphalia: Quality Assurance in the Work of “Inspectors”
Main controversial aspect
…empirical evidence and theoretical rationales…
Validity is “an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment.”
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational Measurement (3rd ed., pp. 13-103). Washington, DC: American Council on Education/Macmillan.
Broad, but not universal agreement
(for a dissenting viewpoint, Lissitz & Samuelsen, 2007)
Karen Samuelsen, Assistant Professor in the Department of Educational Psychology and Instructional Technology.
Robert W. Lissitz, Professor of Education in the College of Education at the University of Maryland and Director of the Maryland Assessment Research Center for Education Success (MARCES).
It is the uses and interpretations of an assessment result, i.e. the inferences, rather than the assessment result itself that is validated.
Validity may be relatively high for one use of assessment results but quite low for another use or interpretation.
Messick’s contributions
According to Angoff (1988), theoretical conceptions of validity and validation practices have changed appreciably over the last 60 years, largely because of Messick’s many contributions to our contemporary conception of validity.
Ruhe, V., & Zumbo, B. Evaluation in Distance Education and E-Learning, pp. 73-91.
1951: for Cureton, the essential feature of validity was “how well a test does the job it was employed to do” (p. 621).
1954: the American Psychological Association (APA) listed four distinct types of validity.
Types of Validity
1. Construct Validity refers to how well a particular test can be shown to assess the construct that it is said to measure.
2. Content Validity refers to how well test scores adequately represent the content domain that these scores are said to measure.
3. Predictive Validity is the degree to which the predictions made by a test are confirmed by the later behavior of the tested individuals.
4. Concurrent Validity is the extent to which individuals’ scores on a new test correspond to their scores on an established test of the same construct, determined shortly before or after the new test.
1966: in the APA Standards for Educational and Psychological Tests and Manuals, concurrent validity and predictive validity were collapsed into criterion-related validity.
1980: Guion referred to the three aspects of validity as the “Holy Trinity.”
1996: Hubley & Zumbo observed that Guion’s “Holy Trinity” implied that at least one type of validity was needed, but that one had three chances to get it.
1957: Loevinger argued that construct validity was the whole of validity, anticipating a shift away from multiple types to a single type of validity.
1988: according to Angoff, validity had been viewed as a property of tests, but the focus later shifted to the validity of a test in a specific context or application, such as the workplace.
1974: the Standards for Educational and Psychological Tests (APA, American Educational Research Association, and National Council on Measurement in Education) shifted the focus of content validity from a representative sample of content knowledge to a representative sample of behaviors in a specific context.
1989: professional standards were established for a number of applied testing areas such as counseling, licensure, certification, and program evaluation (Messick, 1989).
1985: in the Standards (APA, American Educational Research Association, and National Council on Measurement in Education), validity was redefined as the “appropriateness, meaningfulness, and usefulness of the specific inferences made from test scores.”
1985: the unintended social consequences of the use of tests (for example, bias and adverse impact) were also included in the Standards (Messick, 1989).
Validation practice is “disciplined inquiry” (Hubley & Zumbo, 1996) that started out historically with the calculation of measures of a single aspect of validity (content validity or predictive validity) and has since moved to building an argument based on multiple sources of evidence (e.g., statistical calculations, qualitative data, reflections on one’s own values and those of others, and an analysis of unintended consequences).
These calculations are based on logical or mathematical models that date from the early 20th century (Crocker & Algina, 1986)
Messick (1989) describes these single-measure procedures as fragmented approaches to validation
Hubley and Zumbo (1996) describe them as “scanty, disconnected bits of evidence…to make a two-point decision about the validity of a test”
Cronbach (1982) recommended a more comprehensive, argument-based approach to validation that considered multiple and diverse sources of evidence
Validation practice has also evolved from a fragmented approach to a comprehensive, unified approach in which multiple sources of data are used to support an argument
Messick’s framework
What is Validity? Validity is “an integrated evaluative judgment
of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment” (Messick, 1989)
Validity is a unified concept, and validation is a scientific activity based on the collection of multiple and diverse types of evidence (Messick, 1989; Zumbo, 1998, 2007)
Messick’s Conception of Validity
Justification \ Outcomes    Test Interpretation              Test Use
Evidential basis            Construct Validity (CV)          CV + Relevance/Utility (RU)
Consequential basis         Value Implications (CV+RU+VI)    Social Consequences (CV+RU+VI+UC)
The columns distinguish the functions of testing (interpretation vs. use); the rows distinguish the basis for justifying validity (evidential basis vs. consequential basis).
Construct validity refers to traditional scientific evidence (traditional psychometrics); relevance/utility refers to relevance to learners and to society, and to cost-benefit.
The consequential basis is not about poor test practice; rather, the consequences of testing refer to the unanticipated or unintended consequences of legitimate test interpretation and use.
Value implications refer to underlying values, including language or rhetoric, theory, and ideology.
Social consequences are defined as the unintended social effects of testing, including the actual and potential effects of test use, especially issues such as bias, adverse impact, and distributive justice, and any other indirect effects, both actual/potential and positive/negative, of using the test on the overall educational system.
The four facets
The evidential basis of Messick’s framework contains two facets
1. Traditional psychometric evidence
2. The evidence for relevance in applied settings such as the workplace as well as utility or cost-benefit.
Evidential Basis for Test Inferences and Use
The evidential basis for test interpretation is an appraisal of the scientific evidence for construct validity.
A construct is a “definition of skills and knowledge included in the domain to be measured by a tool such as a test” (Reckase, 1998b)
The four traditional types of validity are included in this first facet.
Evidential Basis for Test Inferences and Use
The evidential basis for test use includes measures of predictive validity (e.g., correlations with other tests or behaviors) as well as utility (i.e., a cost-benefit analysis)
Predictive validity coefficients are measures of the behavior to be predicted from the test (e.g., a correlation between scores on a road test and a written driver qualification test)
Cost-benefit refers to an analysis of costs compared with benefits, which in education are often difficult to quantify.
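A predictive validity coefficient of the kind described above is simply the correlation between test scores and scores on the criterion behavior the test is meant to predict. The sketch below, using invented illustrative data (the score lists are hypothetical, not from any cited study), computes a Pearson correlation between written driver qualification test scores and later road test scores:

```python
# Minimal sketch of a predictive validity coefficient: the Pearson
# correlation between test scores and later criterion behavior.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: written qualification test vs. later road test
written = [62, 75, 80, 55, 90, 70]
road = [60, 72, 85, 50, 88, 65]

validity_coefficient = pearson_r(written, road)
print(round(validity_coefficient, 3))
```

A coefficient near 1.0 would suggest the written test predicts the criterion behavior well; a value near 0 would undercut the evidential basis for that use of the scores.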
The consequential basis of Messick’s framework contains two facets
1. Value Implications (VI): CV + RU + VI
2. Social Consequences: CV + RU + VI + UC
Value Implications
Rhetoric Theories Ideologies
Value Implications: The Dimensions
Value implications require an investigation of three components:
1. Rhetoric, or value-laden language and terminology: language that conveys both a concept and an opinion of that concept.
2. Underlying theories: the underlying assumptions or logic of how a program is supposed to work (Chen, 1990).
3. Underlying ideologies: a complex mix of shared values and beliefs that provide a framework for interpreting the world (Messick, 1989).
Rhetoric
Includes language that is discriminatory, exaggerated, or overblown, such as derogatory language used to refer to the homeless.
In validation practice, the rhetoric surrounding standardized tests should be critically evaluated to determine whether these terms are accurate descriptions of the knowledge and skills said to be assessed by a test (Messick, 1989)
Theory
The second component of the value implications category is an appraisal of the theory underlying the test. A theory connotes a body of knowledge that organizes, categorizes, describes, predicts, explains, and otherwise aids in understanding phenomena and in organizing and directing thoughts, observations, and actions (Sidani & Sechrest, 1999)
Ideology
The third component of value implications is an appraisal of the “broader ideologies that give theories their perspective and purpose” (Messick, 1989)
An ideology is a “complex configuration of shared values, affects and beliefs that provides, among other things, an existential framework for interpreting the world” (Messick, 1989).
Values implications challenge us to reflect upon:
a. The personal or social values suggested by our interest in the construct and the name/label selected to represent that construct
b. The personal or social values reflected by the theory underlying the construct and its measurement
c. The values reflected by the broader social ideologies that impacted the development of the identified theory
(Messick, 1980, 1989)
Social Consequences
Social consequences refer to consequences for society stemming from the use of a measure
Remember that construct validity, relevance and utility, value implications and social consequences all work together and impact one another in test interpretation and use.