1
Expanding Adaptive Algorithms in New Ways: RSCAT and Echo-Adapt
Bingnan Jiang and Michelle Barrett
Research Technology, Data Science, and Analytics
2
You may not know us (yet)...
Bingnan Jiang, Ph.D.
Senior Research Scientist
Research Technology, Data Science & Analytics, ACT
• PhD in Electrical & Computer Engineering, Northeastern University, Boston, MA
• Interests: computerized adaptive testing, optimization, data science, Bayesian statistics, software development
• Former:
• Operations Research Scientist, Pacific Metrics, Lakewood, CO
• Operations Research Analyst, Norfolk Southern, Atlanta, GA
Michelle Barrett, Ph.D.
Vice President
Research Technology, Data Science & Analytics, ACT
Chair, AI and Adaptive Technologies, IEEE Industry Consortium on Learning Engineering
• PhD, Research Methodology, Measurement & Data Analysis, University of Twente
• Graduate Certificate in Large-Scale Assessment, University of Maryland College Park
• Certifications in Agile methods, Lean/Six Sigma, Data Science
• Former:
• Middle & high school mathematics teacher
• Sr. Consultant, Colorado Department of Education
• Ed Technologist, specializing in Assessment Technology
Outline
• Adaptive algorithms for learning and assessment
• Computerized adaptive testing with the shadow-test approach
• Ways to use this approach
• RSCAT overview
• RSCAT demo
• Echo-Adapt overview (demo upon request post-session)
3
4
Learning & Assessment
5
Learning...
• Longitudinal
• Focus: process
• Statistical models: Bayesian Knowledge Tracing
Assessment...
• Cross-sectional
• Focus: outcome
• Statistical models: Item Response Theory
Both can be adaptive...
6
Statistical Model(s) for CAT
Item Response Theory (IRT)
• Acknowledges differences in item difficulty, discrimination, and other attributes, depending on the specific model
• Places items and examinees on the same difficulty/ability scale
• Allows for comparison of ability even when different items are administered (e.g., pre/post, different students)
• In adaptive testing, we often MAXIMIZE INFORMATION, selecting the item most informative at the current ability estimate
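To make the selection rule concrete, here is a minimal R sketch of maximum-information item selection under a 2PL model; the item parameters and ability estimate are illustrative, not from an operational pool.

```r
# Minimal sketch: pick the most informative item at the current ability
# estimate under a 2PL model (illustrative parameters).
item_info <- function(theta, a, b) {
  p <- 1 / (1 + exp(-a * (theta - b)))  # probability of a correct response
  a^2 * p * (1 - p)                     # Fisher information at theta
}

a <- c(1.2, 0.8, 1.5, 1.0)   # discrimination parameters
b <- c(-1.0, 0.0, 0.5, 1.5)  # difficulty parameters
theta_hat <- 0.3             # current ability estimate

which.max(item_info(theta_hat, a, b))  # index of the next item to administer
```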
7
Shadow Test Approach
A Fundamental Dilemma in CAT
8
CAT must simultaneously:
• Administer items sequentially to maximize information
• Meet content and other specifications
9
The Shadow Test Approach Solves this Dilemma and Supports Other Operational Needs
Shadow-Test CAT (van der Linden, 2006) supports:
• Optimal item administration
• Efficient item bank use (exposure control)
• Content specification compliance
• Administering new items (field testing)
A Shadow-Test Selects Optimal Items While Conforming to the Test Blueprint
[Diagram: from an item pool (Q1 ... Q1000), a shadow test is assembled, consisting of the items already administered plus the rest of the optimal test (unseen by test takers). The next optimal item i is selected from the shadow test based on:
• Test blueprint constraints
• Real-time constraints
• The test taker's updated ability
while maximizing accuracy and respecting item order and item-passage associations.]
10
11
Shadow Tests Are Assembled Dynamically Throughout Adaptive Testing
van der Linden (2006)
Shadow-Test Assembly Modeled as Mixed Integer Programming (MIP)
12
Objective Function
• Maximize test information: \(\text{maximize} \sum_i I_i(\hat{\theta})\, x_i\)
Constraints
• Test length: \(\sum_i x_i = 50\)
• Items of specific content: \(10 \le \sum_{i \in V_c} x_i \le 15\)
Decision Variables
• Item selection: binary variables \(x_i \in \{0, 1\}\) for every item \(i\)
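A minimal sketch of this MIP in R with the open-source lpSolve package; the pool size, information values, and content labels are simulated stand-ins. In operation, the model is re-solved after each response, with already-administered items fixed in the solution.

```r
# Shadow-test MIP sketch with lpSolve; all data are simulated stand-ins.
library(lpSolve)

set.seed(1)
n_items <- 200
info    <- runif(n_items, 0.1, 2.0)       # I_i at the current theta estimate
content <- sample(1:4, n_items, TRUE)     # content category per item
test_length <- 50

const_mat <- rbind(
  rep(1, n_items),                # test length: sum of x_i
  as.numeric(content == 1),       # content-1 count, lower bound
  as.numeric(content == 1)        # content-1 count, upper bound
)
sol <- lp(direction = "max", objective.in = info,
          const.mat = const_mat, const.dir = c("=", ">=", "<="),
          const.rhs = c(test_length, 10, 15), all.bin = TRUE)

shadow_test <- which(sol$solution == 1)   # the 50 selected items
```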
13
Seems Complicated?
We have options…
14
15
Ways to Implement Shadow-Test CAT
• Open-source package: easy access, flexibility, standalone usability
• At scale: scalability, interoperability, optimized performance, security & reliability
RSCAT Use Cases
1. Academic Research
• Test configuration and simulation
• Investigate item pool and test configuration interactions
• Evaluate measurement error, bias, etc.
2. Ed Tech R&D
• Unpack the algorithm "black box", understand methods
• Evaluate ways to improve outcome measures in products
16
17
Demo
18
Configure CAT and Run Simulations in RSCAT
Software Architecture
• Shiny GUI: no coding work, easy to use, visualization
• R APIs: advanced use, integrated with R programs
• Java APIs: advanced use, integrated with Java programs
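Getting into the Shiny GUI from R is a one-liner; the sketch below assumes the package's entry point is launchApp(), which should be confirmed against the README on GitHub before relying on it.

```r
# Hypothetical sketch of starting RSCAT's Shiny GUI; launchApp() is an
# assumption here: confirm the entry point in the package documentation.
# install.packages("RSCAT")  # once on CRAN; until then, install from GitHub
library(RSCAT)
launchApp()  # opens the GUI for CAT configuration and simulation
```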
19
Available to the Public
License: CC BY-NC 4.0
Email: [email protected]
20
GitHub: https://github.com/act-org/RSCAT
CRAN: coming soon!
21
From Open Source to Scale: Highlights
Feature | RSCAT | Echo-Adapt
MIP Solver | Licensed by user; open-source, community, and commercial solvers available | Highly performant commercial solver
Exposure Control | At item level; overall only | At item and/or passage level; overall and conditional on ability level
User Interface | Shiny app | Web app
Test Configuration Management | Store locally | Lock and release configurations for live delivery
Scalability | Runs locally | Runs in cloud; 40,000+ concurrent examinees, <500 ms response
Field Testing | Not supported | Embedded in operational testing; online IRT calibration for new items
22
References
• Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6(4), 431-444.
• Durlach, P. J. & Spain, R. D. (2014). Framework for Instructional Technology: Methods of Implementing Adaptive Training and Education. Technical Report 1335. U.S. Army Research Institute for the Behavioral and Social Sciences. www.dtic.mil/docs/citations/ADA597411
• Meindl, B. & Templ, M. (2012). Analysis of commercial and free and open source solvers for linear optimization problems. Retrieved April 2017 from http://www.statistik.tuwien.ac.at/forschung/CS/CS-2012-1complete.pdf.
• Mittelmann, H. (2011). Benchmarks for Optimization Software. Retrieved April 2017 from http://plato.asu.edu/bench.html.
• Molnar, M. (2017). Market is booming for digital formative assessments. Retrieved from https://www.edweek.org/ew/articles/2017/05/24/market-is-booming-for-digital-formative-assessments.html
• van der Linden, W. J. (2006). Linear models for optimal test design. Springer Science & Business Media.
• van der Linden, W. J., & Veldkamp, B. P. (2007). Conditional item-exposure control in adaptive testing using item-ineligibility probabilities. Journal of Educational and Behavioral Statistics, 32(4), 398-418.
23
Thank you! Together, we're helping people achieve success.
24
Technical Appendix
25
Assessment Helps to Close the Learning Gap
[Diagram: a learning cycle in which assessment establishes learning status and identifies gaps, and a learning plan with learning guidance closes the gaps.]
Technical Appendix
26
From Open Source to Scale
Feature | RSCAT | Echo-Adapt®
User Interface | Shiny app, installation guide | Web app, user manual, integration guide, technical appendix
Constraint Types | Basic | Expanded
Objective Functions Available | Maximum Fisher information | Maximum Fisher information
Psychometric Model | IRT | IRT
Scoring Method | EAP | EAP, MCMC
Exposure Control | Item, overall | Item/passage, overall/conditional
Simulate Test Administration | Sequential | Load-balanced and auto-scaled
Field Testing | Not available | Embedded in operational testing with online calibration
Test Configuration Management | Store locally; copy, save, edit | Lock & release for test administration
Performance | Limited optimization | Fully optimized
Scalability | Local installation | 40,000+ concurrent examinees with <500 ms response time
Shadow Test MIP Solver | Licensed by user; open-source, community, and commercial solvers available with different performance profiles | Highly performant commercial solver
Interoperability | N/A | With test delivery engines*: IMS Global aQTI CAT standard
*Some test delivery engines are LTI and Caliper compliant.
A MIP Is Built and Solved in Five Steps
1. Choice of decision variables: integer variables, e.g., item selection binary variables
2. Modeling of the objective function: maximize assessment accuracy (test Fisher information at the ability estimate)
3. Modeling of constraints, e.g.:
• The number of items of specific content should be in a range
• The average word count of the test should be in a range
• Enemy items
4. Input the MIP model into a MIP solver, e.g., FICO Xpress
5. Evaluation of solutions
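As an illustration of step 3, the self-contained lpSolve sketch below encodes a test-length constraint, an average word-count range, and an enemy-item pair; every number in it is made up.

```r
# Self-contained sketch of common constraint types; all values are made up.
library(lpSolve)

set.seed(3)
n     <- 60
info  <- runif(n, 0.2, 2.0)           # objective: information at theta-hat
words <- sample(80:140, n, TRUE)      # word count per item
L     <- 10                           # test length

const_mat <- rbind(
  rep(1, n),                            # sum x_i = L
  words / L,                            # average word count, lower bound
  words / L,                            # average word count, upper bound
  as.numeric(seq_len(n) %in% c(3, 17))  # enemy items 3 and 17: at most one
)
sol <- lp("max", info, const_mat, c("=", ">=", "<=", "<="),
          c(L, 100, 130, 1), all.bin = TRUE)
which(sol$solution == 1)              # assembled test
```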
27
Technical Appendix
28
Scoring Method
• Expected a posteriori (EAP)
− The expected ability under the posterior distribution, computed with quadrature points
− Bock & Mislevy (1982)
− Easy to implement and effective
• Markov chain Monte Carlo (MCMC)
− A fully Bayesian approach based on Markov chains
− Gibbs sampling and Metropolis-Hastings
− Works well for items with unknown parameters
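A minimal base-R sketch of EAP scoring with quadrature under a 2PL model; the prior, quadrature grid, and item parameters are illustrative.

```r
# EAP ability estimate via quadrature (Bock & Mislevy, 1982); 2PL model,
# illustrative parameters.
eap_estimate <- function(resp, a, b, n_quad = 41) {
  theta <- seq(-4, 4, length.out = n_quad)   # quadrature points
  prior <- dnorm(theta)                      # standard normal prior
  lik <- sapply(theta, function(t) {         # likelihood of the response pattern
    p <- 1 / (1 + exp(-a * (t - b)))
    prod(p^resp * (1 - p)^(1 - resp))
  })
  post <- lik * prior
  sum(theta * post) / sum(post)              # posterior mean
}

eap_estimate(resp = c(1, 0, 1, 1, 0),
             a = c(1.2, 0.8, 1.5, 1.0, 0.9),
             b = c(-0.5, 0.0, 0.3, 0.8, 1.2))
```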
Technical Appendix
29
MCMC for Scoring and Field Testing
[Diagram: across testing stages 1 to K, the responses feed a Markov chain; a Gibbs sampler with a Metropolis-Hastings step alternately updates the examinee's ability \(\theta_j\) and the field-test item parameters \(\eta_f\) after each administered item.]
Technical Appendix
30
MCMC Scoring
• After examinee \(j\) submits response \(u_{i_k}\) to operational item \(i_k\) in stage \(k\), the posterior of \(\theta_j\) is updated with a Metropolis-Hastings step:
1. Draw a candidate \(\theta^{*}\) from the normal proposal density \(q(\theta \mid \theta^{(t-1)})\).
2. Accept or reject \(\theta^{*}\) with probability
\[
A = \min\left\{ \frac{\pi(\theta^{*})\, P(\theta^{*}; \eta)^{u_{i_k}} \bigl(1 - P(\theta^{*}; \eta)\bigr)^{1-u_{i_k}}}{\pi(\theta^{(t-1)})\, P(\theta^{(t-1)}; \eta)^{u_{i_k}} \bigl(1 - P(\theta^{(t-1)}; \eta)\bigr)^{1-u_{i_k}}},\; 1 \right\},
\]
setting \(\theta^{(t)} = \theta^{*}\) on acceptance and \(\theta^{(t)} = \theta^{(t-1)}\) otherwise.
3. Resample \(\eta^{(t)}\) from its posterior distribution and move to the next iteration, \(t \leftarrow t + 1\).
4. Upon stationarity, the retained draws \(\Theta_j = (\theta^{(1)}, \dots, \theta^{(S)})\) form the posterior sample used for scoring.
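The same step in code: a minimal base-R Metropolis-Hastings sampler for \(\theta\) under a 2PL model with a standard normal prior; the proposal SD, iteration count, burn-in, and parameters are illustrative choices.

```r
# Metropolis-Hastings update of theta given responses (2PL; illustrative
# values). The normal proposal is symmetric, so the acceptance ratio
# reduces to the posterior ratio.
mh_theta <- function(resp, a, b, n_iter = 2000, prop_sd = 0.5) {
  log_post <- function(t) {
    p <- 1 / (1 + exp(-a * (t - b)))
    sum(resp * log(p) + (1 - resp) * log(1 - p)) + dnorm(t, log = TRUE)
  }
  draws <- numeric(n_iter)
  theta <- 0
  for (s in seq_len(n_iter)) {
    cand <- rnorm(1, theta, prop_sd)                    # normal proposal
    if (log(runif(1)) < log_post(cand) - log_post(theta)) theta <- cand
    draws[s] <- theta
  }
  draws
}

post <- mh_theta(c(1, 0, 1, 1, 0),
                 a = c(1.2, 0.8, 1.5, 1.0, 0.9),
                 b = c(-0.5, 0.0, 0.3, 0.8, 1.2))
mean(post[-(1:500)])   # posterior mean after burn-in
```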
Technical Appendix
31
A Shadow Test Selects Operational and Field-Test Items Simultaneously
Shadow-Test MIP Objective Function
Maximize the sum of posterior expected item information (operational items) plus the sum of D-criteria (field-test items):
\[
\text{maximize} \; \sum_{i} \bar{I}_i\, x_i + \sum_{f} \bar{D}_f\, x_f,
\]
where, averaging over the posterior ability draws \(\theta^{(1)}, \dots, \theta^{(S)}\),
\[
\bar{I}_i \equiv S^{-1} \sum_{s=1}^{S} I\bigl(\theta^{(s)}; \eta_i\bigr), \qquad
\bar{D}_f \equiv S^{-1} \sum_{s=1}^{S} \Bigl[ \det\bigl( \mathbf{I}^{-1}(\eta_f) + \mathbf{I}\bigl(\eta_f; \theta^{(s)}\bigr) \bigr) - \det\bigl( \mathbf{I}^{-1}(\eta_f) \bigr) \Bigr],
\]
with \(\mathbf{I}^{-1}(\eta_f)\) the current precision (inverse covariance) matrix of the field-test item parameters.
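A sketch of the field-test D-criterion in R under a 2PL parameterization \(\eta_f = (a, b)\); the posterior draws, current precision matrix, and item parameters are illustrative assumptions.

```r
# D-criterion sketch: expected determinant gain of the (a, b) information
# matrix, averaged over posterior ability draws (illustrative values).
d_criterion <- function(theta_draws, a, b, info_current) {
  gains <- sapply(theta_draws, function(t) {
    p <- 1 / (1 + exp(-a * (t - b)))
    g <- c(t - b, -a)                    # gradient of the 2PL logit w.r.t. (a, b)
    I_ab <- p * (1 - p) * tcrossprod(g)  # item information matrix for (a, b)
    det(info_current + I_ab) - det(info_current)
  })
  mean(gains)
}

theta_draws <- rnorm(500, 0.3, 0.4)   # posterior ability sample
d_criterion(theta_draws, a = 1.1, b = 0.2, info_current = diag(2))
```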
Technical Appendix
32
MCMC to Update Field-Test Item Parameters
• After examinee \(j\) responds to field-test item \(f\) with response \(u_{fj}\):
1. Draw a parameter candidate \(\eta_f^{*}\) from the normal proposal density \(q(\eta_f \mid \eta_f^{(t-1)})\).
2. Accept or reject \(\eta_f^{*}\) with probability
\[
A' = \min\left\{ \frac{\pi(\eta_f^{*})\, P(\theta^{(t)}; \eta_f^{*})^{u_{fj}} \bigl(1 - P(\theta^{(t)}; \eta_f^{*})\bigr)^{1-u_{fj}}}{\pi(\eta_f^{(t-1)})\, P(\theta^{(t)}; \eta_f^{(t-1)})^{u_{fj}} \bigl(1 - P(\theta^{(t)}; \eta_f^{(t-1)})\bigr)^{1-u_{fj}}},\; 1 \right\},
\]
setting \(\eta_f^{(t)} = \eta_f^{*}\) on acceptance and \(\eta_f^{(t)} = \eta_f^{(t-1)}\) otherwise.
3. Resample \(\theta^{(t)}\) from the samples saved through the MCMC ability update process and move to the next iteration, \(t \leftarrow t + 1\).
4. Upon stationarity, the retained draws \(\mathrm{H}_f = (\eta_f^{(1)}, \dots, \eta_f^{(S)})\) update the posterior \(\pi(\eta_f)\).
Technical Appendix
33
Exposure Control: Item-Ineligibility Constraints
• Eligibility probability of item/passage \(i\) in a theta range \(k\):
\[
\hat{P}^{(j+1)}(E_i \mid \theta_k) = \min\left( \frac{r^{\max}\, \varepsilon_{ijk}}{\alpha_{ijk}},\; 1 \right), \quad \text{for } \alpha_{ijk} > 0,
\]
where
\(\alpha_{ijk}\): the number of examinees through \(j\) who visited theta range \(k\) and took item/passage \(i\)
\(\varepsilon_{ijk}\): the number of examinees through \(j\) who visited theta range \(k\) when item/passage \(i\) was eligible
\(r^{\max}\): the exposure goal rate
van der Linden and Veldkamp (2007)
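In code, the update and the eligibility experiment are a few lines; the counts and goal rate below are illustrative.

```r
# Item-ineligibility probability update and eligibility experiment
# (van der Linden & Veldkamp, 2007); counts are illustrative.
r_max   <- 0.25    # exposure goal rate
alpha   <- 40      # examinees in theta range k who took item i
epsilon <- 120     # examinees in theta range k while item i was eligible

p_elig <- if (alpha > 0) min(r_max * epsilon / alpha, 1) else 1

X <- rbinom(1, size = 1, prob = p_elig)  # Bernoulli eligibility experiment
ineligible <- (X == 0)                   # item i ineligible at theta range k
```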
Technical Appendix
34
Exposure Control: Item-Ineligibility Constraints (Cont’d)
• Eligibility experiment
\(X_{ik} \sim B(1, p)\), where \(p = \hat{P}(E_i \mid \theta_k)\).
If \(X_{ik} = 0\), then item/passage \(i\) is ineligible at theta interval \(k\).
• Ineligibility soft constraint: penalize the selection of ineligible items in the objective,
\[
\text{maximize} \sum_{i=1}^{I} I_i(\hat{\theta})\, x_i - M \sum_{i \in S_j} x_i,
\]
where
\(I_i(\hat{\theta})\): the Fisher information of item/passage \(i\) at the current ability estimate
\(S_j\): the set of items/passages ineligible for examinee \(j\)
\(M\): the penalty for selecting ineligible items/passages
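The soft constraint amounts to shrinking the objective coefficients of currently ineligible items, as in this small sketch (all values illustrative):

```r
# Ineligibility soft constraint: subtract a large penalty M from the
# objective coefficient of each ineligible item (illustrative values).
set.seed(4)
n      <- 50
info   <- runif(n, 0.2, 2.0)        # I_i(theta-hat)
p_elig <- runif(n, 0.6, 1.0)        # eligibility probabilities from the update
X      <- rbinom(n, 1, p_elig)      # per-item eligibility experiment
M      <- 10                        # penalty dominating any information value

objective <- info - M * (X == 0)    # feed into the shadow-test MIP objective
```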
35
About Us
36
Mission-Driven
Helping people achieve education and workplace success.
37
We create products that genuinely help our customers succeed.
Our Value Proposition
Authority: We leverage our research and credibility to validate and certify.
Personalization: We provide portable, personalized experiences.
Integration: We embed our assessments into existing processes and cycles.
38
About Us
Our Guiding Principles
Inclusive: We do everything we can to level the playing field for everyone, regardless of needs, backgrounds, or resources.
Holistic: We assess and appreciate each person's unique traits and skills, to help navigate toward college and career success.
Transformational: We lead the industry through our research and technology, constantly evolving as an integral part of the learning process.
We strive to always be…
39
ACT Today
• Serving 3rd Grade - Career
• Customers in 50 states and 130+ countries
• 15M assessments in FY16
• 60% of 2017 graduating class took the ACT
• Fee waivers for 700,000 underserved students: $36M +
• More than 4 million National Career Readiness Certificates awarded
• 1000+ employees, in 37 states
40
Founded in Iowa City, 1959
Products and Services
ACT is widely known for the ACT test, but that is just one aspect of the work we do...
41
42
Students &
Parents
ACT can help you plan your future, prepare for college and career, and achieve success.
K-12 Educators &
Administrators
ACT helps you track student progress and prepare them for success through high school and beyond.
Postsecondary
Professionals
ACT solutions can help you find, attract, place and retain students at your school.
Job Seekers &
Employers
ACT workforce development solutions help job seekers, employers, and business leaders achieve career and business goals.
43
ACT Product Portfolio
[Product portfolio slide; product names appear as logos. Solutions and annual volume (est.):]
• High school: 525K
• 3rd grade to early high school: 5.3M summative, 3.8M interim, 861K classroom
• High school: 4M
• High school: access to enrollment and scholarship opportunities
• Measures foundational, soft skills: 2M
• Improve the skills essential to workplace success: 276K individual learners
44
[Product portfolio slide, continued; product names appear as logos. Solutions and annual volume (est.):]
• Measures social and emotional learning (SEL) skills: high school
• Measures/certifies essential work skills: 11th grade through higher education
• Measures/certifies work skills in a community: 380+ participating counties
• 4M+ awarded; 15,000+ employers recognized
45
Innovative Services & Solutions
46
Growing a portfolio of solutions to advance our mission.
Research-BasedWe use data and research to drive policy, product and business decisions.
47
Research lies at the heart of everything we do.
ACT research guides thought leadership and drives solutions:
• ACT College and Career Readiness Standards
• ACT College Readiness Benchmarks
• Broad Definitions of Readiness
• ACT Policy Platforms
48
We are industry thought-leaders who continually reinvest in our research.
Among our top findings:
49
Readiness Matters
Early Monitoring Matters
Multiple Dimensions Matter
50
Policy Platforms
ACT has articulated policy recommendations in the form of policy platforms in three areas: K–12 education, postsecondary education, and workforce development.
NonprofitWe re-invest in research, programs and services to support our mission.
51
Through purposeful investments, employee engagement, and thoughtful advocacy, the Center for Equity in Learning supports innovative partnerships, initiatives, campaigns, and programs that help young people succeed in education and the workplace.
Follow us @ACTEquity
52
Equity in LearningWe do everything we can to level the playing field for everyone, regardless of needs, backgrounds, or resources.
We believe success is different for everyone.
No matter where you’ve been, where you are, or where you want to go, ACT can help.
53
54
ACTNext supports ACT by pursuing and developing a research agenda that integrates the most recent findings in psychometrics, statistics, assessment design, analytics, measurement, educational data mining, and technological innovation.
ACTNext
Research and Development + Business Innovation Center
Artificial Intelligence (AI) in assessment
• AI scoring – essays, short answer, open-ended math
• Diagnostics, error pattern recognition
• Create optimal learning pathways
• Automated item generation
• Test security: online proctoring to proctorless (anytime, anywhere)
• Real-time authoritative, personalized, integrated instruction, tutoring and advising at scale
55
We’ve disrupted the industry once… …and we’re ready to do it again.
Why? Because at ACT, we’re passionate about what we do and determined to help people achieve education and workplace success.
56
Champion the ACT Mission
Be a voice, make a difference.
57
58
• Represented in all 50 states and the District of Columbia
• More than 10,000 members—and growing
• 640 actively engaged state council members
• 20 elected Steering Committee members
ACTState Organizations
A unique network of teachers, counselors, administrators, enrollment advisors and business professionals.
Work Ready Communities
ACT Work Ready Communities empowers states, regions and counties with data, processes and tools that drive economic growth.
National Career Readiness Certificate (NCRC®) measures and closes the skills gap, building common frameworks that link, align, and match workforce development efforts.
Total Certified Counties: 380+
Employers Supporting: 15,000+
Jobs Profiled: 21,000+
NCRC Total: 4 Million+
59
60
ACT College & Career Readiness Champions
Established in 2013 to create awareness and celebrate achievement in college and career readiness for all.
Help advance our mission
Be a voice, make a difference.
Connect on social.
61
Visit ACT.org