introduction to the assessment measures (temi) presentation · © 2008 university of texas...
TRANSCRIPT
© 2008 University of Texas System/Texas Education Agency 1
Vaughn Gross Center for Reading and Language Arts
at The University of Texas at Austin
Introduction to the
Assessment Measures
Brian R. Bryant, Ph.D.
Austin, TX
The Texas Early Mathematics Inventories are TEKS-based, norm-referenced, curriculum-associated, group-administered, alternate-form measures of mathematics for students in grades K, 1, and 2.
TEMI-PM
Broad-based TEKS
Number, operations, and quantitative reasoning
TEMI-O
Fall, winter,
spring
1. Identify Struggling Students
[Chart: 08-09 Early Math Progress. X-axis: Student Name (22 students, labeled Student A through Student W). Y-axis: TEMI-PM Total Score (0 to 140). Series: Fall, Winter, Spring. Fall Benchmark = 26; Winter Benchmark = 63; Spring Benchmark = 83.]
2. Monitor Progress Over Time
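The benchmarks shown on the chart (Fall = 26, Winter = 63, Spring = 83) can be applied to a class roster mechanically. The following is a minimal Python sketch, assuming that a TEMI-PM total score strictly below the season's benchmark flags a student as struggling; the slide does not say how a score exactly equal to the benchmark is treated. The function name and the student scores are hypothetical illustration data, not values read from the chart.

```python
# Benchmarks are taken from the chart; everything else is hypothetical.
BENCHMARKS = {"fall": 26, "winter": 63, "spring": 83}


def flag_struggling(scores, season):
    """Return the names whose TEMI-PM total score is below the season benchmark.

    Assumption: "below benchmark" means strictly less than the benchmark.
    """
    cutoff = BENCHMARKS[season]
    return sorted(name for name, score in scores.items() if score < cutoff)


# Made-up fall scores for three hypothetical students.
fall_scores = {"Student A": 31, "Student B": 19, "Student C": 26}
print(flag_struggling(fall_scores, "fall"))
```

Running the same roster through each season's benchmark gives a simple way to follow who moves above or below the line over the year.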
Monitor Progress
[Chart: scores across Fall, Winter, and Spring, with reference points labeled 25F, 25W, and 25S.]
< 25th %ile; qualifies for intervention
> 25th %ile; qualifies to exit
Norms – How the TEMI-PM and TEMI-O
normative scores were established…
Reliability – Evidence that the tests yield
consistent results…
Validity – Evidence that the test scores
are useful for established purposes…
Scores:
Student Scores (Raw Scores/Totals) – How
many points the students score on the
subtests/tests.
Normative Statistics
Scores:
Percentiles – Where a student scores relative
to peers in the normative sample (NOT percent
correct); percentiles range from 1 to 99; if a
student scores at the 33rd %ile, 32% of the
normative sample scored lower than the
student and 66% of the sample scored higher.
Normative Statistics
[Number line: 32% of the normative sample below the student's score, 66% above.]
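The percentile definition on this slide can be illustrated with a small calculation. This is only a sketch: it assumes the common mid-rank convention (percent scoring lower plus half the percent with the same score), rounded half up and limited to the 1 to 99 range stated on the slide. The actual TEMI norm tables may have been built differently, and the normative scores below are invented so that they reproduce the slide's 32% / 66% example.

```python
import math


def percentile_rank(score, norm_scores):
    """Percentile rank of `score` within `norm_scores` (mid-rank convention).

    Assumption: ties count half, and the result is clamped to 1..99.
    """
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    pr = 100.0 * (below + 0.5 * equal) / len(norm_scores)
    return min(99, max(1, math.floor(pr + 0.5)))


# Invented normative sample: 32 students lower, 2 tied, 66 higher.
norm_sample = [10] * 32 + [20] * 2 + [30] * 66
print(percentile_rank(20, norm_sample))
```

Note that the result describes standing relative to peers; it says nothing about the percent of items answered correctly.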
Scores: Descriptive Ratings

%ile rank range   Rating          % of normative sample
> 98              Very Superior   1.99
92–98             Superior        7
76–91             Above Average   16
25–75             Average         50
9–24              Below Average   16
2–8               Poor            7
< 2               Very Poor       1.99
Normative Statistics
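The descriptive ratings on this slide amount to a lookup from percentile rank to a label. A minimal Python sketch of that lookup follows, assuming whole-number percentile ranks from 1 to 99 (so "> 98" means 99 and "< 2" means 1); the function name is hypothetical.

```python
# Lower bound of each percentile band, highest band first (from the slide).
RATING_BANDS = [
    (99, "Very Superior"),
    (92, "Superior"),
    (76, "Above Average"),
    (25, "Average"),
    (9, "Below Average"),
    (2, "Poor"),
    (1, "Very Poor"),
]


def descriptive_rating(percentile):
    """Map a whole-number percentile rank (1..99) to its descriptive rating."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile ranks range from 1 to 99")
    for lower_bound, rating in RATING_BANDS:
        if percentile >= lower_bound:
            return rating


print(descriptive_rating(33))
```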
1,791 Kindergarten
1,764 Grade 1
1,675 Grade 2
69 Districts Fall
65 Districts Winter
65 Districts Spring
Normative Sample
included districts
across all
20 ESCs.
Normative Statistics
Help Wanted
Looking for normative sample districts across all 20 ESCs for the Spanish TEMI.
Bilingual/ESL: Kindergarten, Grade 1, Grade 2
Ethnicity          Grade K State Percentage   Grade K Adjusted State Percentage   Total
African American   13.20%                     13.80%                              247
Asian              3.40%                      3.50%                               63
Native American    0.40%                      0.45%                               8
Hispanic           50.20%                     45.85%                              821
White              32.80%                     36.40%                              652
Total                                                                             1,791
Male: 51.7% Female: 48.3%
Normative Statistics
Ethnicity          Grade 1 State Percentage   Grade 1 Adjusted State Percentage   Total
African American   13.60%                     14.60%                              258
Asian              3.50%                      3.77%                               67
Native American    0.40%                      0.48%                               8
Hispanic           50.30%                     46.80%                              826
White              32.40%                     34.35%                              606
Total                                                                             1,764
Male: 51.8% Female: 48.2%
Normative Statistics
Ethnicity          Grade 2 State Percentage   Grade 2 Adjusted State Percentage   Total
African American   13.80%                     14.80%                              248
Asian              3.50%                      3.74%                               63
Native American    0.30%                      0.34%                               6
Hispanic           49.20%                     45.78%                              767
White              33.20%                     35.34%                              592
Total                                                                             1,675
Male: 51.3% Female: 48.7%
Normative Statistics
Reliability – The consistency with
which a test measures an ability.
Reliability of the Measures
Alternate Forms (Immediate)
Content sampling
TEMI-PM: Median coefficients for
subtests > .75, total > .83
TEMI-O subtests > .7, totals > .8
Reliability of the Measures
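An alternate-forms coefficient is conventionally the correlation between the scores the same students earn on two forms of a test. The sketch below assumes a Pearson correlation, which is the usual choice but is not confirmed by the slide; the Form A and Form B scores are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented scores for six students who took both forms in one sitting.
form_a = [12, 18, 25, 31, 40, 44]
form_b = [14, 17, 27, 29, 41, 47]
print(round(pearson_r(form_a, form_b), 2))
```

If the delayed alternate-forms and scorer-difference coefficients reported in this presentation are also Pearson correlations, the same function would apply to them, with the second list holding the later administration or the second rater's scores.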
Coefficients Alpha
Content sampling
TEMI-O subtests > .8, totals = .9
Median TEMI-PM* subtests > .9, totals >.9
*CAVEAT – Alphas for fluency measures
may be inappropriate; adjustment for items
= >75% answered.
Reliability of the Measures
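Coefficient alpha can be computed directly from item-level scores. The sketch below uses the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The response matrix is invented, and the code makes no attempt at the adjustment for fluency measures mentioned in the caveat.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha.

    item_scores: one row per student, each row a list of item scores.
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented right/wrong (1/0) responses: five students, four items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(coefficient_alpha(responses), 2))
```

On a timed fluency measure, items a student never reached are scored as wrong, which inflates inter-item consistency; that is one reason the slide flags alpha as possibly inappropriate for those measures.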
Alternate Forms (Delayed)
Content + time sampling
TEMI-PM: Median coefficients for
subtests > .75, total > .84
A = 1 – content sampling
B = 1 – alternate forms delayed
C = A + B
D = 1 – C (time sampling error alone)
Reliability of the Measures
Scorer Differences
Two raters separately scored 30
TEMI-PM and 30 TEMI-O protocols
each in K, 1, and 2.
Correlations:
K = .99 – 1.00
1 = .99 – 1.00
2 = .99 – 1.00
Reliability of the Measures
Validity – Evidence that a test’s
scores do what they are intended to
do. The validity of a test’s scores will
vary according to the purposes for
which they are used. For the TEMI, the
scores primarily are used to identify
struggling students and monitor
progress over time.
Validity of the Test Scores
Content-Description Validity – What evidence
is there that the math tests are related to
curricula?
How the tests and items were
developed…TEKS/basals? Why group? Why
fluency?
Traditional item analysis
Differential item functioning
Validity of the Test Scores
Criterion-Prediction Validity – What
evidence is there that the tests relate to
math performance now and in the
future?
Concurrent: SAT-10, ITBS, Teacher
Ratings – relationships .4 to .8
Predictive: SAT-10 – relationships .4 to .8; area under the ROC curve (AUC), moderate to high; good sensitivity and specificity (Se & Sp)
Validity of the Test Scores
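The predictive statistics named on this slide (sensitivity, specificity, and AUC) can be sketched as follows. Everything here is an assumption for illustration: a student "screens positive" when the screening score is below a cut score, the at-risk labels stand in for a later criterion such as low SAT-10 performance, and the scores, labels, and cut score are invented.

```python
def sensitivity_specificity(scores, at_risk, cut):
    """Return (sensitivity, specificity) when score < cut means 'flagged'."""
    pairs = list(zip(scores, at_risk))
    true_pos = sum(1 for s, r in pairs if r and s < cut)
    false_neg = sum(1 for s, r in pairs if r and s >= cut)
    true_neg = sum(1 for s, r in pairs if not r and s >= cut)
    false_pos = sum(1 for s, r in pairs if not r and s < cut)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


def auc(scores, at_risk):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random at-risk student scores lower than
    a random not-at-risk student (ties count half).
    """
    risk = [s for s, r in zip(scores, at_risk) if r]
    no_risk = [s for s, r in zip(scores, at_risk) if not r]
    wins = sum(
        1.0 if a < b else 0.5 if a == b else 0.0
        for a in risk
        for b in no_risk
    )
    return wins / (len(risk) * len(no_risk))


# Invented screening scores and later criterion status for eight students.
scores = [18, 22, 30, 24, 35, 41, 50, 28]
at_risk = [True, True, True, False, False, False, False, False]
se, sp = sensitivity_specificity(scores, at_risk, cut=26)
print(round(se, 2), round(sp, 2), round(auc(scores, at_risk), 2))
```

Sensitivity and specificity depend on the chosen cut score, while AUC summarizes how well the scores separate the two groups across all possible cut scores.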
Construct-Identification Validity –
What evidence is there that the
tests relate to the construct
mathematics? What are the
hypotheses and the evidence?
Validity of the Test Scores
Sensitive to progress within year – scores
Floor/ceiling effects – minimal (most ±2 SD)
Intercorrelations – moderate
Group differentiation – SAT-10 < 25th %ile
Related to spoken language – negligible to low
Related to writing – moderate
Related to reading – moderate
PM: Useful for intervention – published articles
Factor analysis – single factor
Item validity – supported by traditional item analysis (IA)
Validity of the Test Scores
Technical Adequacy Summary
Normative sample is large and representative.
Tests have adequate reliability (content sampling, time sampling, scorer agreement).
Test scores are valid for identification and progress-monitoring purposes, based on a starter set of studies: subtests and totals reflect math and TEKS; relate to current and future math performance; empirically satisfy a variety of validity-related hypotheses.