2013 NCES MIS CONFERENCE TITLE: Data Linking for Analytics—K–12 to Community College to University 10:15am, February 14, 2013 Watson (IEBC), Osumi (UH), Ikenaga (UH)

Jean Osumi, Senior Associate for Academic Policy and Evaluation, Hawaii P-20 Partnerships for Education, University of Hawaii

Todd Ikenaga, SLDS Program Manager, Hawaii P-20 Partnerships for Education, University of Hawaii

John Watson, Director of Analytics, Institute for Evidence Based Change (California)

Introductions

Agenda

Background, Systems, Approach

HI-PASS History, Current Effort, Data Sources

Cross-Segment Data Linking

Reporting

Progress, Next Steps

Some IEBC Project History

Our first multi-segment project: Cal-PASS

Started in San Diego in 1998
Became a state-funded project in 2003
Goals:
◦ Collect actionable data
◦ Link primary, secondary, and post-secondary institutions on a regional basis
◦ Track students from one segment to the next

Using Data From A Systems Perspective

[Diagram: Educational System, with Technology and Research Expertise, Organizational Habits, and Human Judgments and Behavior]

Keys to Data Use

Focus on a few key metrics
Key Performance Indicators, a good way to start
Focus on the goal: student learning and completion
Track cohorts, not just snapshots
◦ Look at leakage points
Use data as a way to improve, not to punish
Most important: tell a story
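
One way to read "track cohorts, not just snapshots" is as a simple funnel: follow one entering cohort through successive milestones and see where students drop off. The sketch below is only an illustration. The milestone names and student IDs are hypothetical and do not come from the presentation.

```python
# Hypothetical milestones and student IDs, for illustration only.
MILESTONES = ["enrolled_grade_9", "graduated_hs", "enrolled_college", "completed_award"]

def cohort_funnel(cohort, reached):
    """Count how many members of one cohort reached each milestone in order.

    cohort  -- set of student IDs that define the cohort
    reached -- dict mapping milestone name -> set of student IDs that reached it
    """
    remaining = set(cohort)
    funnel = []
    for milestone in MILESTONES:
        # Only students who reached every earlier milestone are still counted.
        remaining = remaining & reached.get(milestone, set())
        funnel.append((milestone, len(remaining)))
    return funnel

def leakage_points(funnel):
    """Return (from_milestone, to_milestone, students_lost) for each step."""
    return [
        (funnel[i][0], funnel[i + 1][0], funnel[i][1] - funnel[i + 1][1])
        for i in range(len(funnel) - 1)
    ]

cohort = {"s1", "s2", "s3", "s4", "s5"}
reached = {
    "enrolled_grade_9": {"s1", "s2", "s3", "s4", "s5"},
    "graduated_hs": {"s1", "s2", "s3", "s4"},
    "enrolled_college": {"s1", "s2"},
    "completed_award": {"s1"},
}
funnel = cohort_funnel(cohort, reached)
losses = leakage_points(funnel)
for step in losses:
    print(step)
```

A snapshot would report only the current headcount at each stage. The funnel keeps the same students in view, so the step with the largest loss stands out as a leakage point.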

Systems

Main Warehouse
Research DB
Validation > Loading
Web Site/Portal
Reporting
In-SITES Tools (Development | Published)
ETL > Data mart > Data Store

Data vs. Use

We got it wrong: need to focus on the consumer, less on the data
Ron Thomas's rules for data analysis:
◦ We are in the knowledge business, not the data business
◦ Data is about improvement, in particular improvement in instruction
◦ A protocol for using data is important
◦ Must build capacity of practitioners to acquire and use data

HI-PASS History

HI-PASS Initial Groundwork

Started with two groups with different purposes: Maui (PLC) and Hawaii P-20 (assess statewide impact)
2009 Statewide Forum on Longitudinal Data
◦ Top priorities that emerged were data governance and access to data, which drove the overarching MOU
One by one, data sources and funding became clear, all focusing on P-20 as an end goal

Overall Plan

College and Career Readiness Indicators

Completed for every public high school
Classes of 2008, 2009, 2010, 2011
Measures by school
◦ College access nationwide
◦ SAT scores
◦ Percentage of completion of the BOE Recognition Diploma
◦ College-level work: Advanced Placement and Running Start
◦ College-level and remedial/developmental enrollment for Math and English (UH only)

P20W SLDS System Overview

Challenges

MOUs

Lawyers

Changing culture

Data Quality

Sustainability

Federal and State regulations

Data Linking and Expectations

Technology

Microsoft SQL Server, Enterprise Edition

Dundas Dashboard

Windows-compliant CMS

Multi-segment linking

CA: Multiple IDs across segments

K-12 > CC
◦ Encrypted provided derived key
◦ Derived key

K-12 > University
◦ Encrypted provided derived key
◦ Derived key

CC > University
◦ Encrypted provided derived key
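
The slides do not say how a derived key is built or protected, so the following is only a sketch of one common approach: normalize a few demographic fields, then run them through a keyed hash so that segments can compare keys without exchanging raw names. The field choice, the `derived_key` helper, and the shared secret are all assumptions. HMAC-SHA256 is a keyed hash used here as a stand-in and is not claimed to be the encryption the projects actually use.

```python
import hashlib
import hmac
import unicodedata

def normalize(text):
    """Uppercase, split off accents, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed.upper() if ch.isalnum())

def derived_key(last_name, first_name, dob, gender, secret):
    """Build a keyed hash from demographic fields (hypothetical field choice).

    dob is expected as an ISO date string such as "1995-03-07".
    secret is a bytes value shared by the linking parties (an assumption).
    """
    parts = [normalize(last_name), normalize(first_name), normalize(dob), normalize(gender)]
    message = "|".join(parts).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

SECRET = b"example-only-secret"  # placeholder; never hard-code a real secret

# The same student as two segments might record them.
k12_key = derived_key("O'Neil", "María", "1995-03-07", "F", SECRET)
cc_key = derived_key("ONEIL", "Maria ", "1995-03-07", "f", SECRET)
other_key = derived_key("Oneil", "Mario", "1995-03-07", "M", SECRET)

print(k12_key == cc_key)
print(k12_key == other_key)
```

Normalizing before hashing matters because a hash turns even a one-character formatting difference into a completely different key.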

Multi-segment linking

Texas
◦ Texas Pathways (Higher Ed Coordinating Board): encrypted identifier
◦ GC-PASS (Gulf Coast-PASS): provided encrypted identifier > derived key

Deduplication Techniques

There are times when we suspect duplicate records, or specific keys aren't available.
Especially found in cross-segment situations
Some instances:
1) name change
2) typos in name or birthdate
3) detecting false matches
4) detecting conflicting IDs
5) finding transposed names across institutions

Deduplication Techniques

Remedy: multiple-pass deduplication, including:
◦ Creation of metakey based on various techniques
◦ These agree with primary derived keys 95% of the time
◦ Current method (23 stages):
  Rule-based cleaning
  Comparison vectors
  Cosine similarity

Labor Data: Promises, Pitfalls

For an increasing number of projects, there is the hope of understanding what happens to student cohorts and populations as they leave school
Labor data can be the answer
◦ Additional MOUs
◦ Data formats can vary
◦ Data security concerns lead to carefully planned processing
◦ Results promising. Example: Coachella Valley, CA

HI Data Linking

Demographics matching across K-12 to postsecondary, mainly using name, date of birth, and gender in different combinations, with the strictest criteria used first.

Match 1 - last name, first name, dob, gender

Match 2 - last name, first name, dob

Match 3 - last name, first name (embedded), dob, gender

Match 4 - last name (embedded), first name, dob, gender

Match 5 - last name (embedded), first name (embedded), dob, gender

Match 6 - last name, first name (first 3), dob, gender

Match 7 - last name, first name (first 3), dob (month/year), gender

Match 8 - last name (first 3), first name (first 3), dob, gender

Match 9 - last name (first 3), first name (first 3), dob (month/year), gender

Match 10 - last name, first name (first 3), dob (month/day), gender
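
The ten passes above can be sketched as an ordered table of tests, where a record pair is assigned the number of the first (strictest) pass it satisfies. This is a pairwise illustration under stated assumptions: "embedded" is read as one name being contained in the other, "first 3" as the first three letters being equal, and the record layout and helper names are hypothetical. A production system would more likely run each pass as a set-based join and remove matched records before the next pass.

```python
from datetime import date

def clean(text):
    """Uppercase and trim so comparisons ignore case and stray spaces."""
    return text.strip().upper()

def exact(a, b):
    return a == b

def embedded(a, b):
    # Assumed reading of "embedded": one name appears inside the other.
    return bool(a) and bool(b) and (a in b or b in a)

def first3(a, b):
    return a[:3] == b[:3]

def dob_full(a, b):
    return a == b

def dob_month_year(a, b):
    return (a.year, a.month) == (b.year, b.month)

def dob_month_day(a, b):
    return (a.month, a.day) == (b.month, b.day)

# (match number, last-name test, first-name test, dob test, compare gender?)
PASSES = [
    (1, exact, exact, dob_full, True),
    (2, exact, exact, dob_full, False),
    (3, exact, embedded, dob_full, True),
    (4, embedded, exact, dob_full, True),
    (5, embedded, embedded, dob_full, True),
    (6, exact, first3, dob_full, True),
    (7, exact, first3, dob_month_year, True),
    (8, first3, first3, dob_full, True),
    (9, first3, first3, dob_month_year, True),
    (10, exact, first3, dob_month_day, True),
]

def match_level(k12, ps):
    """Return the number of the first pass that accepts the pair, or None."""
    last_a, last_b = clean(k12["last"]), clean(ps["last"])
    first_a, first_b = clean(k12["first"]), clean(ps["first"])
    same_gender = clean(k12["gender"]) == clean(ps["gender"])
    for number, last_test, first_test, dob_test, use_gender in PASSES:
        if (last_test(last_a, last_b)
                and first_test(first_a, first_b)
                and dob_test(k12["dob"], ps["dob"])
                and (same_gender or not use_gender)):
            return number
    return None

# Made-up K-12 record and postsecondary variants of it.
student = {"last": "Kahale", "first": "Mary Ann", "dob": date(1994, 5, 12), "gender": "F"}
print(match_level(student, dict(student)))
print(match_level(student, {**student, "first": "Mary"}))
print(match_level(student, {**student, "dob": date(1994, 5, 21)}))
print(match_level(student, {**student, "last": "Nakamura"}))
```

Recording which pass produced each link makes it possible to review or discard the looser matches later.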

Output / Reporting

Web-based Query Reports

Progress indicators within the LLDI Dashboard

Program Review

Intervention Program

Transition Across Segments

Reporting

OLAP Cubes
◦ K12 to postsecondary transitions: focus on remedial enrollments and placement
◦ Postsecondary enrollments linked to workforce data: shows students, enrollments, GPA, campus, time period, linked to employment data
◦ CTE tool: similar to K12 but focusing on CTE pathways and subsequent majors and awards at postsecondary

HI-PASS: Focus of output
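
To give a feel for what a cube over K12-to-postsecondary transitions holds, the sketch below pre-aggregates a few made-up enrollment rows at every combination of dimension levels, with "ALL" marking a rolled-up dimension. The dimensions, measures, and rows are hypothetical. An actual OLAP cube would be defined in the database's analysis tooling, not in Python.

```python
from collections import defaultdict
from itertools import product

# Hypothetical dimensions for a transitions cube.
DIMENSIONS = ("high_school", "campus", "subject")

def build_cube(rows):
    """Aggregate rows at every combination of dimension levels.

    Each dimension is either kept at its own value or rolled up to "ALL",
    so one row contributes to 2 ** len(DIMENSIONS) cells.
    """
    cube = defaultdict(lambda: {"enrollments": 0, "remedial": 0})
    for row in rows:
        choices = [(row[dim], "ALL") for dim in DIMENSIONS]
        for cell in product(*choices):
            cube[cell]["enrollments"] += 1
            cube[cell]["remedial"] += int(row["remedial"])
    return cube

# Made-up rows: one row per student enrollment in a first college course.
rows = [
    {"high_school": "HS A", "campus": "Campus 1", "subject": "Math", "remedial": True},
    {"high_school": "HS A", "campus": "Campus 1", "subject": "English", "remedial": False},
    {"high_school": "HS A", "campus": "Campus 2", "subject": "Math", "remedial": False},
    {"high_school": "HS B", "campus": "Campus 1", "subject": "Math", "remedial": True},
]
cube = build_cube(rows)

math_all = cube[("ALL", "ALL", "Math")]
hs_a_all = cube[("HS A", "ALL", "ALL")]
print(math_all, hs_a_all)
```

Because every roll-up is precomputed, a question such as "remedial Math enrollments across all high schools and campuses" becomes a single lookup.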

Project Status

Continue with HI-PASS to identify issues
◦ Data quality
◦ Integration problems
◦ Missing elements
◦ Definitions and standards
Complete infrastructure build for permanent system
Continue working on our data governance framework
◦ Research
◦ Data quality
◦ Security and access to data
Establish a framework for data literacy and use
Focus on identifying and training diverse user levels
