PRINCIPLES AND APPLICATIONS OF SPECIAL EDUCATION ASSESSMENT CLASS 3: DESCRIPTIVE STATISTICS & RELIABILITY AND VALIDITY FEBRUARY 2, 2015


Page 1: PRINCIPLES AND APPLICATIONS OF SPECIAL EDUCATION …clb0711.weebly.com/uploads/7/2/9/7/7297883/class_3-descriptive_statistics.pdf• Define basic terminology used in assessment, such

PRINCIPLES AND APPLICATIONS OF SPECIAL EDUCATION ASSESSMENT

CLASS 3: DESCRIPTIVE STATISTICS & RELIABILITY AND VALIDITY

FEBRUARY 2, 2015


OBJECTIVES

• Define basic terminology used in assessment, such as validity, reliability, standard deviation, etc.
• Understand how to evaluate the technical adequacy of tests, including the norms, reliability, and validity.
• Interpret information from formal and informal assessments.
• Describe the function of standardized assessment in the eligibility process.


TONIGHT’S SCHEDULE

4:30 – 4:45 Group Presentations – Utah SPED Rules
4:45 – 5:15 Problem-Solving Teams – Case Studies
5:15 – 6:00 Descriptive Statistics
6:00 – 6:15 Break
6:15 – 7:00 Reliability & Validity
7:00 – 7:20 Graduate Students – Annotated Bibliography


REVIEW

§ Children cannot be determined to have a disability because of what?
§ Describe each of the following:
  § RIOT/ICEL
  § RTI and its relationship to the medical model
  § LRE
  § Components of an IEP


CRITERION OR NORM-REFERENCED?

• WISC-IV (Intelligence Test)
• History test
• Correct words on a spelling test
• Woodcock Johnson Achievement Test III
• Driving Test
• Number of steps correctly performed in a dressing routine


WHY IS MEASUREMENT IMPORTANT?

Standardized assessment is heavily applied in the educational decision-making process.

Educators must understand:
§ Test-selection criteria
§ Basic principles of measurement
§ Administration techniques
§ Scoring procedures


CONCERNS IN THE FIELD

High priority is placed on assessment.

Mistakes made by professionals:
• Identifying students based on referral information rather than testing.
• Giving the data presented little role in planning.
• Choosing poor-quality instruments.
• Taking the recommendation at face value.
• Using quick assessments even if those assessments do not address the areas of concern.
• Failing to establish effective rapport with the examinee.
• Failing to document behaviors during the examination that may be of diagnostic value.
• Failing to adhere to the administration rules.
• Making scoring errors.
• Ineffectively interpreting assessment results for educational use.


NUMERICAL SCALES

Nominal Scale
§ Used for identification purposes only; the numbers function like a name (e.g., an ID number)
§ Numbers cannot be used in mathematical operations
§ Least useful scale


NUMERICAL SCALES

Ordinal Scale
§ Used to rank the order of items
§ Numbers have the quality of identification and indicate greater or lesser quality (e.g., first place, second place, etc.)
§ Numbers are not equidistant (i.e., the distance between first and second place and second and third place is not necessarily the same)


NUMERICAL SCALES

Ratio Scale
§ Used for direct comparisons and mathematical manipulations
§ Numbers are equidistant from each other
§ The scale has a true absolute zero
§ Can be used in all mathematical operations (e.g., counts of behaviors, income, height, weight, etc.)


NUMERICAL SCALES

Interval Scale
§ Used to identify and to rank greater or lesser quality or amount
  – Numbers are equidistant (e.g., degrees on a thermometer, IQ scores, rating scales).
  – Most data in education will be interval scale data.
§ Does not have an absolute-zero quality
§ Numbers can be added and subtracted but cannot be used in other mathematical operations (e.g., multiplication)


RAW SCORES

• The score an individual receives when the individual items on a test are summed.
• Raw scores convey very little meaning unless referenced to some standard.
• Calculated by subtracting the number of items the student missed from the number of items presented.
• All other scores, called derived scores, are “derived” from the raw score.


DESCRIPTIVE STATISTICS

Large sets of data are organized and understood through methods known as descriptive statistics.

Derived scores obtain meaning from large sets of data or large samples of scores.

Scores derived from the raw score include:
§ Percentile rank
§ Standard score
§ Grade equivalent
§ Age equivalent


MEASURES OF CENTRAL TENDENCY

• Measures of central tendency: a way to organize data to see how the data cluster, or are distributed, around a numerical representation of the average score.
  § Use caution with this technique if your scores are widely scattered.
• A normal distribution represents the way test scores would fall if a test were given to every single student of the same age or grade in the population.
  § Most students’ scores fall in the middle of the curve.
  § Fewer students’ scores fall at the edges of the curve.
  § The distribution is symmetric, or equal, on either side of the vertical line.


AVERAGE PERFORMANCE

It is important to know how students performed as a group and what constitutes excellent, average, and poor performance.

Frequency Distribution
§ Rank scores from highest to lowest.
§ Tally how many times each score was obtained.

Mode
§ The score that occurs most often.

Bimodal Distribution
§ The distribution has two modes.

Multimodal Distribution
§ A distribution with three or more modes.

Frequency Polygon
§ A line graph that represents the frequency distribution of a data set.
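The tally-and-mode procedure can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the first score list is borrowed from the deck's mean example, and the bimodal list is invented.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, tally) pairs ranked from highest score to lowest."""
    tallies = Counter(scores)
    return sorted(tallies.items(), reverse=True)

def modes(scores):
    """Return every score that ties for the highest tally, highest first."""
    tallies = Counter(scores)
    top = max(tallies.values())
    return sorted((s for s, n in tallies.items() if n == top), reverse=True)

scores = [90, 80, 75, 60, 70, 65, 80, 100, 80, 80]
bimodal = [90, 90, 85, 70, 70, 60]  # hypothetical scores with two modes

# Print a text version of the tally, one mark per occurrence.
for score, tally in frequency_distribution(scores):
    print(score, "|" * tally)

print("Mode(s):", modes(scores))           # [80]
print("Bimodal example:", modes(bimodal))  # [90, 70]
```

Returning a list of modes rather than a single value keeps bimodal and multimodal distributions visible instead of silently picking one score.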


MODE & FREQUENCY POLYGON

[Figure: frequency polygon with the mode labeled]


MEDIAN

• Found by rank ordering the data set, writing each score the number of times it occurs.

• Count halfway down the list of scores; 50% of the data are listed above the median and 50% are below.

• In a data set with an even number of scores, the median score may not actually exist in the data set.


MEDIAN EXAMPLES

Example 1 (median score = 78):
100, 97, 89, 85, 85, 78, 69, 69, 68, 62, 60

Example 2 (median score = 84):
100, 96, 95, 90, 85, 84, 83, 82, 80, 78, 77
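A minimal Python sketch of the median procedure follows. It assumes each example on this slide is one rank-ordered list of 11 scores read down the two columns; the even-sized list is hypothetical and is there only to show a median that does not appear in the data.

```python
def median(scores):
    """Rank-order the scores and return the middle value.

    With an even number of scores, the two middle values are averaged,
    so the result may not appear in the data set.
    """
    ranked = sorted(scores, reverse=True)
    middle = len(ranked) // 2
    if len(ranked) % 2 == 1:
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2

set_one = [100, 97, 89, 85, 85, 78, 69, 69, 68, 62, 60]
set_two = [100, 96, 95, 90, 85, 84, 83, 82, 80, 78, 77]
even_set = [100, 90, 80, 70]  # hypothetical, even number of scores

print(median(set_one))   # 78
print(median(set_two))   # 84
print(median(even_set))  # 85.0
```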


MEAN

• One of the best measures of average performance is the mean.
• The mean is found by calculating the simple average: add the scores and divide by the number of scores.
• The mean can be affected by extreme scores, especially if the group is composed of only a few students.
  § Can be controlled by eliminating extreme scores (i.e., outliers).

Example:
Data set: 90, 80, 75, 60, 70, 65, 80, 100, 80, 80
Sum of scores = 780; 780 ÷ 10 = 78
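The calculation can be checked with a short Python sketch. The first list is the slide's data set; the small group with one extreme score is invented to illustrate how an outlier pulls the mean.

```python
def mean(scores):
    """Simple average: sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)

scores = [90, 80, 75, 60, 70, 65, 80, 100, 80, 80]
print(mean(scores))  # 78.0

# Hypothetical small group with one extreme score (20).
small_group = [80, 82, 78, 20]
print(mean(small_group))       # 65.0, pulled down by the outlier
print(mean(small_group[:-1]))  # 80.0, with the outlier removed
```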


MEASURES OF DISPERSION

• Measures of dispersion are used to calculate how scores are spread from the mean.
• Variability is the way that scores in a set of data are spread apart.
• Range
  § Provides an idea about the spread.
  § Calculated by subtracting the lowest score from the highest score.

Example:
§ Top score = 100; lowest score = 45
§ 100 – 45 = 55
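As a small Python sketch, the range is one subtraction. The two six-score groups below are hypothetical; they share the slide's top and bottom scores, which suggests why the range gives only a rough idea of the spread.

```python
def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical groups with the same top (100) and bottom (45) scores.
clustered = [45, 72, 73, 74, 75, 100]
scattered = [45, 55, 65, 80, 90, 100]

print(score_range(clustered))  # 55
print(score_range(scattered))  # 55
```

Both groups have a range of 55 even though the middle scores are spread very differently, so the range is usually read alongside variance or standard deviation.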


VARIANCE

• Data are described as having variance.
• Variance can be described as the degree or amount of variability or dispersion in a set of scores.
• It describes the dispersion of a set of scores around the mean.
• Applicable to equal-interval and ratio scales, not nominal or ordinal scales.


STANDARD DEVIATION

• A standard deviation is a typical unit of distance above or below the mean (e.g., above or below a mean score of 100).
• Standard deviation is one method of calculating the difference in scores, or variability of scores, known as dispersion.
• You must calculate the variance before you can calculate the standard deviation.
• Standard deviation = √variance
• Any test score that is 1 standard deviation above or below the mean score is considered significant.
• Applicable to equal-interval and ratio scales, not nominal or ordinal scales.
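The two steps (variance first, then its square root) can be sketched in Python using the data set from the mean example. One assumption to flag: this sketch divides by the number of scores (the population formula). Some textbooks divide by the number of scores minus one, which gives a slightly larger result.

```python
import math

def variance(scores):
    """Population variance: the mean of the squared deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [90, 80, 75, 60, 70, 65, 80, 100, 80, 80]  # mean = 78

print(variance(scores))            # 121.0
print(standard_deviation(scores))  # 11.0
```

With a mean of 78 and a standard deviation of 11, a score of 89 sits one standard deviation above the mean and a score of 67 sits one below.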


STANDARD DEVIATION & NORMAL DISTRIBUTION

• In a normal distribution, the standard deviations represent the percentages of scores shown on the bell curve.
• More than 68% of the scores fall within one standard deviation above or below the mean.
• Two standard deviations below the mean = Intellectual Disability
• Two standard deviations above the mean = Gifted
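The percentages on the bell curve can be reproduced with Python's standard library. This sketch assumes a scale with a mean of 100 and a standard deviation of 15, which is common for standard scores but is not stated on the slide.

```python
from statistics import NormalDist

# Assumed scale: mean 100, standard deviation 15.
curve = NormalDist(mu=100, sigma=15)

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    low = curve.mean - k * curve.stdev
    high = curve.mean + k * curve.stdev
    return curve.cdf(high) - curve.cdf(low)

print(round(share_within(1) * 100, 1))  # 68.3
print(round(share_within(2) * 100, 1))  # 95.4

# Share of scores at or below two standard deviations under the mean (70).
print(round(curve.cdf(70) * 100, 1))    # 2.3
```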


SKEWED DISTRIBUTIONS

• When small samples or very restricted populations are used, test results may not distribute into a normal curve.
• Extreme scores can change the appearance of a set of scores and subsequently influence the way the data are described.
• Distributions can be skewed in a positive or negative direction.

Negatively skewed: a large number of scores occur above the mean (the tail extends toward the low scores).

Positively skewed: a large number of scores occur below the mean (the tail extends toward the high scores).


TYPES OF SCORES

Percentile Rank
§ Ranks each score on the continuum of the normal distribution.
§ Percentile ranks range from below the 1st to the 99.9th, with the 50th being average.
§ A person who scores at the 75th percentile scored as well as or better than 75% of the students in that age/grade group.


Percentile Rank

For example: Jalen obtained a percentile rank of 42. This means that Jalen performed as well as or better than 42% of children his age on the test.

Or, 42% of children Jalen's age scored at or below Jalen's score.
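The "at or below" wording translates directly into a count. The Python sketch below is an illustration only: it reuses the ten scores from the mean example as a stand-in comparison group, whereas published tests report percentile ranks from norm tables built on much larger samples, and some use slightly different tie-handling rules.

```python
def percentile_rank(score, group_scores):
    """Percent of scores in the group that are at or below the given score."""
    at_or_below = sum(1 for s in group_scores if s <= score)
    return 100 * at_or_below / len(group_scores)

# Stand-in comparison group (the ten scores from the mean example).
group = [90, 80, 75, 60, 70, 65, 80, 100, 80, 80]

print(percentile_rank(75, group))  # 40.0
print(percentile_rank(80, group))  # 80.0
```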


Descriptors for Percentile Ranges

Percentile Range       Descriptor
98th %ile and above    Upper Extreme
91st to 97th %ile      Well Above Average
75th to 90th %ile      Above Average
25th to 74th %ile      Average
9th to 24th %ile       Below Average
3rd to 8th %ile        Well Below Average
2nd %ile and below     Lower Extreme
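The descriptor table can be expressed as a simple lookup. This Python sketch assumes whole-number percentile ranks, as in the table; the function name is made up for the example.

```python
def descriptor(percentile):
    """Map a whole-number percentile rank to the descriptor in the table."""
    bands = [
        (98, "Upper Extreme"),
        (91, "Well Above Average"),
        (75, "Above Average"),
        (25, "Average"),
        (9, "Below Average"),
        (3, "Well Below Average"),
    ]
    # Bands are listed from highest floor to lowest; return the first match.
    for floor, label in bands:
        if percentile >= floor:
            return label
    return "Lower Extreme"

print(descriptor(42))  # Average
print(descriptor(18))  # Below Average
print(descriptor(3))   # Well Below Average
```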


TYPES OF SCORES

T scores
§ Have an average of 50 and a standard deviation of 10.

Stanines
§ Scores are divided into 9 groups, with 5 being the mean and 2 being the standard deviation.

Deciles
§ Scores are divided into 10 groups, 10 for the lowest group and 100 for the highest.
§ Each group represents 10% of the obtained scores.
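Given a score expressed in standard-deviation units (a z score, a term not used on the slide), the T score and stanine follow from the means and standard deviations listed above. The Python sketch below is a minimal illustration; it rounds the stanine to the nearest whole group and limits it to the 1 to 9 range.

```python
import math

def t_score(z):
    """T scores: mean of 50, standard deviation of 10."""
    return 50 + 10 * z

def stanine(z):
    """Stanines: mean of 5, standard deviation of 2, whole groups 1 to 9."""
    nearest = math.floor(2 * z + 5 + 0.5)  # round half up
    return max(1, min(9, nearest))

for z in (-2.0, -1.0, 0.0, 1.0, 2.5):
    print(z, t_score(z), stanine(z))
# -2.0 30.0 1
# -1.0 40.0 3
#  0.0 50.0 5
#  1.0 60.0 7
#  2.5 75.0 9
```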


STANDARD SCORES

Standard scores are scores of relative standing with a set, fixed, predetermined mean and standard deviation.


CHOICE OF TEST SCORES

Percentile Ranks
§ Preferable to age and grade equivalents
§ Are considered “comparable scores”
§ Straightforward indicators of an individual’s standing within a group
§ Reported in reference to the student’s standing in the group upon which the test was normed


CHOICE OF TEST SCORES (CONTINUED)

Standard Scores
§ Advantages
  § Comparative
  § Based upon a normal or normalized distribution of scores (bell curve)
  § Can be directly translated into percentile ranks
  § Because of a uniform mean (bell curve), they can be compared from one subtest to the next and from one test administration to another.
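The translation from standard score to percentile rank can be sketched with the normal curve. The Python example assumes a mean of 100 and a standard deviation of 15 and a perfectly normal distribution; a test's published norm tables are the authoritative source and may differ by a point or so.

```python
from statistics import NormalDist

def to_percentile(standard_score, mean=100, sd=15):
    """Approximate percentile rank for a standard score on a normal curve."""
    z = (standard_score - mean) / sd
    return 100 * NormalDist().cdf(z)

for ss in (70, 85, 100, 115, 130):
    print(ss, round(to_percentile(ss)))
# 70 2
# 85 16
# 100 50
# 115 84
# 130 98
```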


CHOICE OF TEST SCORES (CONTINUED)

Age and Grade Equivalents:
§ Appear to be the simplest, but in fact they can be the most misinterpreted.
§ Major limitations:
  § Do not provide information about whether the student’s performance is within average limits.
  § Do not describe a student’s current instructional level.
  § Do not indicate which test questions the student answered correctly.
§ A word of caution: findings should be reported and worded carefully to prevent misinterpretation.


GRADE EQUIVALENTS ARE OBTAINED FROM MEAN OR MEDIAN SCORES BY GRADE.


MISLEADING: FALSE IMPRESSION OF PROGRESS

Grade Placement   Grade Equivalent   Years Below   Percentile Rank
2                 1.9                0.1           25th
3                 2.4                0.6           25th
4                 3.1                0.9           25th
5                 3.9                1.1           25th
6                 4.5                1.5           25th
7                 5.3                1.7           25th
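The "Years Below" column is the grade placement minus the grade equivalent. A short Python sketch over the table's rows shows the gap changing every year even though the percentile rank stays at the 25th.

```python
# (grade placement, grade equivalent) pairs from the table above.
rows = [(2, 1.9), (3, 2.4), (4, 3.1), (5, 3.9), (6, 4.5), (7, 5.3)]

def years_below(grade_placement, grade_equivalent):
    """Gap between placement and grade equivalent, to one decimal place."""
    return round(grade_placement - grade_equivalent, 1)

gaps = [years_below(gp, ge) for gp, ge in rows]
print(gaps)  # [0.1, 0.6, 0.9, 1.1, 1.5, 1.7]

# The gap widens each year while relative standing is unchanged.
print(gaps == sorted(gaps))  # True
```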


AGE AND GRADE EQUIVALENTS

What are the scores based on?
Why is this a problem?

[Figure: worksheet of 12 arithmetic items, numbered 1) through 12)]

Chelyn only gets the even-numbered questions correct. Raw score = 6
Lou only gets the odd-numbered questions correct. Raw score = 6

DO THESE STUDENTS HAVE THE SAME SKILLS?
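The point can be made concrete by treating each student's correct answers as a set of item numbers. In this Python sketch the raw scores match, yet the two students have no correctly answered item in common.

```python
items = range(1, 13)  # the 12 items on the worksheet

chelyn_correct = {i for i in items if i % 2 == 0}  # even-numbered items
lou_correct = {i for i in items if i % 2 == 1}     # odd-numbered items

# Same raw score...
print(len(chelyn_correct), len(lou_correct))  # 6 6

# ...but no item answered correctly by both students.
print(chelyn_correct & lou_correct)  # set()
```

Any score derived from the raw score alone, including an age or grade equivalent, would be identical for both students.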


AGE & GRADE EQUIVALENTS

Two years below grade level has different meanings at different grades.
• Kurt is at the 12.5 grade level and obtained a grade equivalent of 10.5 on the Reading Recognition Subtest of the PIAT.
• Mason is at the 3.5 grade level and obtained a grade equivalent of 1.5 on the same test.

Is their performance the same? Who performed better?
• Kurt obtained a standard score of 93 (33rd percentile).
• Mason obtained a standard score of 72 (3rd percentile).


GRADE EQUIVALENTS MEAN DIFFERENT THINGS ON DIFFERENT TESTS

• Billy, grade placement 7.5, obtained a grade equivalent of 5.5 on the WRMT.
• Bobby, grade placement 7.5, obtained a grade equivalent of 5.5 on the Reading Subtest of the WRAT.

Is their performance the same? Who performed better?
• Billy performed at the 18th percentile.
• Bobby performed at the 34th percentile.

• At the same point on the scale and the same age level, identical grade equivalents mean different things on different tests.


RELIABILITY & VALIDITY


RELIABILITY & VALIDITY

Aid in determining test accuracy and dependability.
§ Reliability: the dependability or consistency of an instrument across time or items.
§ Validity: the degree to which an instrument measures what it was designed to measure.

Instruments should have both properties. An instrument that has only one is not a strong instrument.


Correlation (r)

Correlation: the degree of relationship between two variables, for example:
§ Two administrations of the same test
§ Administration of equivalent forms

Correlation coefficient ranges from +1.00 to -1.00
§ Perfect positive correlation = +1.00
§ Perfect negative correlation = -1.00
§ No correlation = 0
§ Numbers closer to +1.00 or -1.00 represent stronger relationships.
§ The greater the degree of the relationship, the more reliable the instrument.
§ The + or - sign does not indicate strength, but direction.
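A correlation coefficient can be computed by hand with the Pearson formula, which is the usual choice for r although the slide does not name it. In this Python sketch the two score lists are invented to stand in for a first and second administration of the same test.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)

# Hypothetical scores for six students on two administrations.
first_administration = [85, 90, 78, 92, 70, 88]
second_administration = [83, 91, 80, 94, 72, 85]

print(round(pearson_r(first_administration, second_administration), 2))

# The sign shows direction only: both of these are perfect relationships.
print(round(pearson_r([1, 2, 3], [2, 4, 6]), 2))  # 1.0
print(round(pearson_r([1, 2, 3], [6, 4, 2]), 2))  # -1.0
```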


Scattergrams provide a graphic representation of a data set and show a correlation.

The more closely the dots on a scattergram approximate a straight line, the nearer to perfect the correlation.

SCATTERGRAM


TYPES OF CORRELATION

POSITIVE CORRELATION

Variables with a positive relationship move in the same direction. Scores on variables increase simultaneously.

NEGATIVE CORRELATION

High scores on one variable are associated with low scores on another variable.

NO CORRELATION

§ When data from two variables are not associated or have no relationship.
§ No linear direction on a scattergram


RELATIONSHIP BETWEEN RELIABILITY & VALIDITY

Suppose I have a faulty measuring tape and I use it to measure each student’s height. My tool is invalid, but it’s still reliable.

On the other hand, if I have a correctly printed measuring tape, my tool is both valid and reliable.


RELIABILITY

Another way to think of reliability is to imagine a kitchen scale. If you weigh five pounds of potatoes in the morning, and the scale is reliable, the same scale should register five pounds for the potatoes an hour later.


VALIDITY

Let’s imagine a bathroom scale that consistently tells you that you weigh 130 pounds. The reliability (consistency) of this scale is very good, but it is not accurate (valid) because you actually weigh 150 pounds (perhaps you re-set the scale in a weak moment).


RELIABILITY CHECKS

• Test-Retest (Stability)
• Equivalent Forms
• Inter-Rater (Agreement)


TEST-RETEST RELIABILITY

Test-retest reliability assumes that the trait being measured is stable over time. If the trait being measured remains constant, re-administering the instrument will result in scores similar to the first scores.
§ It is important to conduct the retest shortly after the first test to control for influencing variables.

Difficulties:
§ Too soon: Students may remember test items (practice effect) and score higher the second time.
§ Too far apart: Greater influence of time variables (e.g., learning, maturation, etc.)


EQUIVALENT (ALTERNATE) FORMS RELIABILITY

Equivalent forms reliability
§ Two forms of the same instrument are used.
§ Items are matched for difficulty.

Advantage: Two tests of the same difficulty level can be administered within a short time frame without the influence of practice effects.


INTERRATER RELIABILITY

Interrater reliability
§ The consistency of a test across examiners.
§ One person administers a test, and a second person rescores the test.
§ The scores are then correlated to determine how much variability exists between the scores.


ASSUMPTIONS OF TESTING

1. People involved are skilled
2. Error is always present
3. Acculturation is comparable
4. Behavior sample is adequate
5. Present behavior is observed


1. PEOPLE ARE SKILLED:

• in administering the test, including establishing rapport
• in scoring the test
• in interpreting the results
• in utilizing the results


2. ERROR IS ALWAYS PRESENT

Obtained Score = True Score + Error

Random error is unreliability
§ e.g., lack of familiarity with tests, examiner fatigue, etc.

Do not make decisions based on error
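The equation can be illustrated with a small simulation. Everything in this Python sketch is hypothetical: a true score of 100, random error with a standard deviation of 5 points, and 10,000 imagined administrations. No real test is administered this way; the sketch only shows how a single obtained score can miss the true score while the errors average out over many administrations.

```python
import random

def obtained_score(true_score, error_sd, rng):
    """Obtained score = true score + random error."""
    return true_score + rng.gauss(0, error_sd)

rng = random.Random(0)  # fixed seed so the run is repeatable
true_score = 100        # hypothetical true score
administrations = [obtained_score(true_score, 5, rng) for _ in range(10_000)]

average = sum(administrations) / len(administrations)

# Individual obtained scores land on both sides of the true score...
print(round(min(administrations)), round(max(administrations)))
# ...while their average stays close to it.
print(round(average, 1))
```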


3. ACCULTURATION IS COMPARABLE

The comparison group has comparable:

Experiential Background
§ e.g., a test item asking an inner-city child how to get out of a forest

Opportunity to Learn
§ e.g., books available in the child’s home


4. BEHAVIOR SAMPLE IS ADEQUATE

All tests are only samples of behavior.

[Figure labels: Samples of Behaviors on Test; Domain of Behaviors of Interest]


5. PRESENT BEHAVIOR IS OBSERVED

Future behavior is inferred. Tests can only inform us directly about present behavior.


TEST VALIDITY

Does the test actually measure what it is supposed to measure?
§ Criterion-related validity: Comparing scores with other criteria known to be indicators of the same trait or skill.
  § Concurrent validity: Two tests are given within a very short timeframe (often the same day). If scores are similar, the tests are said to be measuring the same trait.
  § Predictive validity: Measures how well an instrument can predict performance on some other variable (e.g., ACT or GRE scores).


CONTENT VALIDITY

Ensuring that the items in a test are representative of the content purported to be measured.
§ PROBLEM: Teachers often generalize and assume the test covers more than it does (e.g., the WRAT-3 reading subtest only measures word recognition, not phonemic awareness, phonics, vocabulary, reading comprehension, etc.).

Some of the variables of content validity may influence the manner in which results are obtained and can contribute to bias in testing.
§ Presentation format: the method by which items are presented to the student.
§ Response mode: the method by which the examinee answers items.


VALIDITY OF TEST VS. VALIDITY OF USE

§ Tests may be used inappropriately even though they are valid instruments.
§ Results obtained may be used in an invalid manner.
§ Tests may be biased and/or discriminate against different groups.
§ Item bias occurs when an item is answered incorrectly a disproportionate number of times by one group compared to another.
§ Predictive validity may predict accurately for one group and not another.


NEXT WEEK

• Read Chapter 5
• Submit Online Self-Assessment


SOURCES

§ Overton, T. (2012). Assessing learners with special needs (7th ed.). Upper Saddle River, NJ: Pearson Education Inc.