introduction to the assessment measures (temi) presentation · © 2008 university of texas...

© 2008 University of Texas System/Texas Education Agency

Vaughn Gross Center for Reading and Language Arts at The University of Texas at Austin

Introduction to the Assessment Measures

Brian R. Bryant, Ph.D.

Austin, TX


The Texas Early Mathematics Inventories are TEKS-based, norm-referenced, curriculum-associated, group-administered, alternate-form measures of mathematics for students in grades K, 1, and 2.


TEMI-PM

Broad-based TEKS

Number, operations, and quantitative reasoning

TEMI-O

Fall, winter,

spring


1. Identify Struggling Students

[Figure: 08-09 Early Math Progress. Chart of TEMI-PM Total Score (vertical axis, 0 to 140) by Student Name (Students A–Q and S–W), with Fall, Winter, and Spring series. Fall Benchmark = 26; Winter Benchmark = 63; Spring Benchmark = 83.]
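
As a minimal sketch of how the seasonal benchmarks on this chart might be applied, the Python snippet below lists students whose TEMI-PM total score falls below the benchmark for a given season. The benchmark values (26, 63, 83) come from the chart. The student scores and the `flag_below_benchmark` helper are hypothetical, and treating "strictly below the benchmark" as the flag is an assumption, not a rule stated on the slide.

```python
# Seasonal TEMI-PM total-score benchmarks taken from the chart.
BENCHMARKS = {"fall": 26, "winter": 63, "spring": 83}


def flag_below_benchmark(scores, season):
    """Return the names of students scoring below the season's benchmark.

    Assumption: a score strictly below the benchmark flags the student.
    """
    cutoff = BENCHMARKS[season]
    return sorted(name for name, score in scores.items() if score < cutoff)


# Hypothetical fall scores, for illustration only.
fall_scores = {"Student A": 31, "Student B": 22, "Student C": 26, "Student D": 12}
print(flag_below_benchmark(fall_scores, "fall"))
```
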

2. Monitor Progress Over Time


[Figure: Monitor Progress. Fall, winter, and spring 25th-percentile cut points (25F, 25W, 25S).]

< 25th %ile; qualifies for intervention

> 25th %ile; qualifies to exit
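
The decision rule on this slide can be sketched in a few lines of Python. The `intervention_status` function and the sample percentile ranks are hypothetical. The slide does not say how a score exactly at the 25th percentile is handled, so the sketch reports that case separately rather than guessing.

```python
def intervention_status(percentile, cut=25):
    """Classify a percentile rank against the 25th-percentile cut.

    Assumption: the slide does not specify a score exactly at the cut,
    so this sketch reports it as its own case.
    """
    if percentile < cut:
        return "qualifies for intervention"
    if percentile > cut:
        return "qualifies to exit"
    return "at the cut (not specified on the slide)"


# Hypothetical fall, winter, and spring percentile ranks for one student.
for season, pr in [("fall", 12), ("winter", 21), ("spring", 34)]:
    print(season, pr, intervention_status(pr))
```
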


Norms – How the TEMI-PM and TEMI-O normative scores were established…

Reliability – Evidence that the tests yield consistent results…

Validity – Evidence that the test scores are useful for established purposes…


Scores:

Student Scores (Raw Scores/Totals) – How many points the students score on the subtests/tests.

Normative Statistics


Scores:

Percentiles – Where a student scores relative to peers in the normative sample (NOT percent correct); percentiles range from 1 to 99; if a student scores at the 33rd %ile, 32% of the normative sample scored lower than the student and 66% of the sample scored higher.
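
To make the "NOT percent correct" point concrete, here is a small Python sketch contrasting the two quantities for one raw score. The comparison sample, the 40-point maximum, and both function names are hypothetical. This is not the TEMI norming procedure, and actual percentiles would come from the published norm tables.

```python
def percent_correct(raw_score, max_score):
    """Share of available points earned, as a percentage."""
    return 100 * raw_score / max_score


def percent_scoring_lower(raw_score, norm_scores):
    """Share of a comparison sample with a strictly lower raw score."""
    lower = sum(1 for s in norm_scores if s < raw_score)
    return 100 * lower / len(norm_scores)


# Hypothetical comparison sample of ten raw scores on a 40-point test.
sample = [8, 10, 12, 14, 15, 16, 18, 20, 22, 30]
raw = 16

print(percent_correct(raw, 40))            # 40.0
print(percent_scoring_lower(raw, sample))  # 50.0
```

The same raw score of 16 is 40% correct, yet half of this hypothetical sample scored lower, which is why the two numbers should not be read interchangeably.
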




Scores: Descriptive Ratings

%ile rank range    Rating           % of normative sample
> 98               Very Superior    1.99
92–98              Superior         7
76–91              Above Average    16
25–75              Average          50
9–24               Below Average    16
2–8                Poor             7
< 2                Very Poor        1.99
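
The rating bands in this table translate directly into a lookup. The sketch below is one way to write it in Python; the `descriptive_rating` function name is hypothetical, and the cut points are the ones listed in the table.

```python
def descriptive_rating(percentile):
    """Map a percentile rank to the descriptive rating from the table."""
    if percentile > 98:
        return "Very Superior"
    if percentile >= 92:
        return "Superior"
    if percentile >= 76:
        return "Above Average"
    if percentile >= 25:
        return "Average"
    if percentile >= 9:
        return "Below Average"
    if percentile >= 2:
        return "Poor"
    return "Very Poor"


# Hypothetical percentile ranks, for illustration only.
for pr in (99, 80, 33, 24, 5, 1):
    print(pr, descriptive_rating(pr))
```
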



1,791 Kindergarten

1,764 Grade 1

1,675 Grade 2

69 Districts Fall

65 Districts Winter

65 Districts Spring

Normative Sample included districts across all 20 ESCs.



Help Wanted

Looking for normative sample districts across all 20 ESCs for the Spanish TEMI.

[email protected]

Bilingual/ESL

Kindergarten

Grade 1

Grade 2


Ethnicity           Grade K State Percentage    Grade K Adjusted State Percentage    Total
African American    13.20%                      13.80%                               247
Asian               3.40%                       3.50%                                63
Native American     0.40%                       0.45%                                8
Hispanic            50.20%                      45.85%                               821
White               32.80%                      36.40%                               652
Total                                                                                1,791

Male: 51.7% Female: 48.3%



Ethnicity           Grade 1 State Percentage    Grade 1 Adjusted State Percentage    Total
African American    13.60%                      14.60%                               258
Asian               3.50%                       3.77%                                67
Native American     0.40%                       0.48%                                8
Hispanic            50.30%                      46.80%                               826
White               32.40%                      34.35%                               606
Total                                                                                1,764



Ethnicity           Grade 2 State Percentage    Grade 2 Adjusted State Percentage    Total
African American    13.80%                      14.80%                               248
Asian               3.50%                       3.74%                                63
Native American     0.30%                       0.34%                                6
Hispanic            49.20%                      45.78%                               767
White               33.20%                      35.34%                               592
Total                                                                                1,675



Reliability – The consistency with which a test measures an ability.

Reliability of the Measures


Alternate Forms (Immediate)

Content sampling

TEMI-PM: Median coefficients for subtests > .75, total > .83

TEMI-O subtests > .7, totals > .8



Coefficients Alpha

Content sampling

TEMI-O subtests > .8, totals = .9

Median TEMI-PM* subtests > .9, totals > .9

*CAVEAT – Alphas for fluency measures may be inappropriate; adjustment for items = >75% answered.
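
For readers unfamiliar with coefficient alpha, the sketch below computes it with the standard Cronbach formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The item-level data and the `coefficient_alpha` function are hypothetical. This illustrates the statistic in general, not the analysis behind the TEMI values above, and, as the caveat notes, alpha may not suit timed fluency measures.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha.

    `item_scores` is a list of rows, one per student; each row holds that
    student's score on every item.
    """
    k = len(item_scores[0])              # number of items
    items = list(zip(*item_scores))      # transpose: one tuple per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical right/wrong (1/0) scores: five students, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(coefficient_alpha(responses), 2))
```
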



Alternate Forms (Delayed)

Content + time sampling

TEMI-PM: Median coefficients for subtests > .75, total > .84

A = 1 – content sampling

B = 1 – alternate forms delayed

C = A + B

D = 1 – C (time sampling error alone)



Scorer Differences

Two raters separately scored 30 TEMI-PM and 30 TEMI-O protocols each in K, 1, and 2.

Correlations:

K = .99 – 1.00

1 = .99 – 1.00

2 = .99 – 1.00
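
A scorer-agreement check of this kind can be sketched as a correlation between the totals two raters assign to the same protocols. The slide does not name the coefficient used, so the Pearson correlation below is an assumption, and the rater scores and `pearson_r` function are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical total scores assigned by two raters to the same six protocols.
rater_1 = [42, 55, 61, 38, 70, 49]
rater_2 = [42, 54, 61, 38, 70, 50]
print(round(pearson_r(rater_1, rater_2), 3))
```
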



Validity – Evidence that a test’s scores do what they are intended to do. The validity of a test’s scores will vary according to the purposes for which they are used. For the TEMI, the scores primarily are used to identify struggling students and monitor progress over time.

Validity of the Test Scores


Content-Description Validity – What evidence is there that the math tests are related to curricula?

How the tests and items were developed… TEKS/basals? Why group? Why fluency?

Traditional item analysis

Differential item functioning



Criterion-Prediction Validity – What evidence is there that the tests relate to math performance now and in the future?

Concurrent: SAT-10, ITBS, Teacher Ratings – relationships .4 to .8

Predictive: SAT-10 – relationships .4 to .8; AUC, moderate to high; good sensitivity and specificity (Se & Sp)
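
Sensitivity (Se) and specificity (Sp) summarize how well a screening decision agrees with a later criterion classification. The Python sketch below uses the standard definitions, Se = TP / (TP + FN) and Sp = TN / (TN + FP). The flag and criterion data, and the `sensitivity_specificity` function, are hypothetical and do not reproduce the TEMI studies.

```python
def sensitivity_specificity(flagged, at_risk):
    """Sensitivity and specificity of a screening decision.

    `flagged[i]` is True when the screener flags student i; `at_risk[i]` is
    True when the criterion measure classifies the same student as at risk.
    """
    pairs = list(zip(flagged, at_risk))
    tp = sum(f and a for f, a in pairs)              # flagged, truly at risk
    fn = sum((not f) and a for f, a in pairs)        # missed, truly at risk
    tn = sum((not f) and (not a) for f, a in pairs)  # cleared, not at risk
    fp = sum(f and (not a) for f, a in pairs)        # flagged, not at risk
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening flags and criterion outcomes for eight students.
flagged = [True, True, False, False, False, True, False, False]
at_risk = [True, True, True, False, False, False, False, False]
se, sp = sensitivity_specificity(flagged, at_risk)
print(round(se, 3), round(sp, 3))
```
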



Construct-Identification Validity – What evidence is there that the tests relate to the construct mathematics? What are the hypotheses and the evidence?



Sensitive to progress within year – scores

Floor/ceiling effects – minimal (most ± 2 SD)

Intercorrelations – moderate

Group differentiation – SAT-10 < 25th %ile

Related to spoken language – negligible to low

Related to writing – moderate

Related to reading – moderate

PM: Useful for intervention – published articles

Factor analysis – single factor

Item validity – supported by traditional IA



Technical Adequacy Summary

Normative sample is large and representative.

Tests have adequate reliability (content sampling, time sampling, scorer agreement).

Test scores are valid for identification and progress-monitoring purposes – starter set of studies: Subtests and totals reflect math and TEKS; relate to current and future math performance; empirically satisfy a variety of validity-related hypotheses.
