a crash course in predictive models - home | act crash course in predictive models ... a tree...

A Crash Course in Predictive Models Concepts and tools behind the data

Upload: dangdieu

Post on 28-Mar-2018

A Crash Course in Predictive Models
Concepts and tools behind the data

Outline
• What is predictive modeling?
• What questions can it help us to answer?
• Brief survey of approaches
• Predictive tools in enrollment marketing services
• DIY tools

Image source: fivethirtyeight.com

Presenter
Presentation Notes
Predictive modeling means creating a statistical model to predict what is going to happen. FiveThirtyEight (538) built this model on the polls, the economy, and historical data. Their predictions were wrong. Some takeaways:
- Bad data yields bad results.
- No forecast is infallible.
- Decision makers need to understand what is behind their data.
- Predictions are only part of the equation; you have to analyze and execute on the information.

Image source: fivethirtyeight.com

Presenter
Presentation Notes
It is easy to look at that map and say predictive models are bunk, but I think it actually illustrates why predictive modeling is so important. Take a look at how close some of these states were, the states that made the difference in the election. What if the campaigns could know what it would take to influence the voters who are up for grabs in these states?

Image source: gizmodo.com

Presenter
Presentation Notes
That may have been exactly what happened. Here is a screenshot from a database that the RNC paid a predictive analytics company to assemble, and then that company left it unsecured on the Internet for a couple of weeks. It predicts what individual voters think about individual issues. What are some ways this could be put to use? Where should Trump invest time and money? Not people who score low on the AmericaFirst index, maybe not people who score super high either. Most impact could come from focusing on those open to swinging his way. If AmericaFirst doesn’t resonate strongly, what else will? That is the message he can serve up to them.

Some questions predictive modeling can address:
• Who will enroll?
• How many students will enroll?
• Where should we focus recruiting efforts?
• Which names should we buy, and what should we do with them?
• Which marketing tools/messages will be most effective?
• How will changes in tuition and financial aid impact enrollment?

Some Examples

Presenter
Presentation Notes
Examples from schools that have taken a stab at this using in-house resources.

Common approaches
• Linear or multiple linear regression
• Logistic regression
• Machine learning

Linear regression

Image source: http://onlinestatbook.com/2/regression/graphics/gpa.jpg

Presenter
Presentation Notes
Predicts a value from several measured (continuous) or binary (two-valued) variables. The result is a continuous number on the fitted line.
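
As a rough illustration of fitting that line, here is a minimal sketch of ordinary least squares with a single predictor. The GPA numbers and the names `fit_line`, `hs_gpa`, and `college_gpa` are made up for the example; a multiple regression would add more predictors in the same spirit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: high school GPA (x) and first-year college GPA (y).
hs_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
college_gpa = [1.8, 2.3, 2.6, 3.2, 3.6]

intercept, slope = fit_line(hs_gpa, college_gpa)

# The prediction for a new student is a continuous number on the fitted line.
predicted = intercept + slope * 3.2
print(f"predicted college GPA: {predicted:.2f}")
```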

Logistic regression

Image source: www.mssqltips.com

Presenter
Presentation Notes
Predicts an outcome from several measured (continuous) or binary (two-valued) variables. The result is binary: one of two possible outcomes.
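
A minimal sketch of how a logistic model gets from inputs to a yes/no answer: it computes a linear score, squashes it into a probability with the logistic function, and compares that probability to a cutoff. The coefficients and the function name `enroll_probability` are invented for illustration, not taken from any model discussed here.

```python
import math


def enroll_probability(act_score, in_state, coef):
    """Logistic model: turn a linear score into a probability between 0 and 1."""
    z = coef["intercept"] + coef["act"] * act_score + coef["in_state"] * in_state
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
coef = {"intercept": -1.0, "act": -0.05, "in_state": 1.5}

p = enroll_probability(act_score=24, in_state=1, coef=coef)

# The binary prediction comes from comparing the probability to a cutoff.
will_enroll = p >= 0.5
print(f"probability of enrolling: {p:.3f}, predicted to enroll: {will_enroll}")
```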

Machine learning

Image source: By Stephen Milborrow (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

Presenter
Presentation Notes
Can predict an outcome or a continuous number. Instead of just using a formula, machine learning focuses on patterns and builds a solution based on what it observes.

Figure: A tree showing survival of passengers on the Titanic ("sibsp" is the number of spouses or siblings aboard). The figures under the leaves show the probability of survival and the percentage of observations in the leaf. By Stephen Milborrow (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons
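
Once a decision tree has been learned, using it amounts to a chain of yes/no questions. The sketch below hand-codes a tree loosely modeled on the Titanic example; the split values (9.5 years, 2.5 siblings or spouses) and the function name are assumptions for illustration, since the transcript does not give them, and a real tool would learn the splits from the data.

```python
def predict_survival(sex, age, sibsp):
    """Walk a small hand-written decision tree and return the predicted outcome."""
    if sex == "female":
        return "survived"
    # Assumed split: older male passengers are predicted not to survive.
    if age > 9.5:
        return "died"
    # Assumed split: young boys traveling with many siblings/spouses.
    if sibsp > 2.5:
        return "died"
    return "survived"


print(predict_survival("male", 5, 1))
```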

Example 1: Will an inquiry enroll? (Goenner and Pauls, 2006)

Presenter
Presentation Notes
One can see that the model correctly predicts the enrollment behavior of 89.25% of inquiries. The sensitivity of the model, which is the model’s ability to predict enrollment of students who do enroll, is 36%, compared to its “specificity” of 97%, which is the model’s ability to predict non-enrollment of students who do not enroll.
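
To make those three terms concrete, here is a small sketch that computes them from a confusion matrix. The counts are hypothetical, chosen only so that sensitivity and specificity come out at 36% and 97%; they are not the paper's data, so the accuracy differs from the 89.25% reported above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        # Share of all inquiries classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual enrollees the model predicted would enroll.
        "sensitivity": tp / (tp + fn),
        # Share of actual non-enrollees the model predicted would not enroll.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 100 inquiries enrolled, 900 did not.
m = classification_metrics(tp=36, fn=64, tn=873, fp=27)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

Because most inquiries never enroll, a model can be right about the large majority overall while still missing most of the students who do enroll.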

How did they do it?

• Formulated the question (Enrolled?: Yes or No)
• Chose an appropriate method: Logistic regression
• Chose variables with the help of the literature and statistical tools

Possible Variables

Goenner and Pauls, 2006

How did they do it?

• Formulated the question (Enrolled?: Yes or No)
• Chose an appropriate method: Logistic regression
• Chose variables with the help of the literature and statistical tools
• Prepped data
• Built model with part of data
• Tested model with part of data
• Can apply to future data
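
The build-then-test steps above can be sketched roughly as follows. This is not the authors' code: the data is synthetic, the single predictor (`visited`, a campus-visit flag) is an assumption, and the model is a bare-bones logistic regression fitted by gradient descent rather than a statistics package. The data is split by position to keep the example reproducible; a real split would usually be random.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=1000):
    """Fit intercept b and weights w by plain batch gradient descent."""
    n = len(rows)
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return b, w


def predict(b, w, x, cutoff=0.5):
    """Return 1 (enroll) if the fitted probability reaches the cutoff, else 0."""
    return 1 if sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) >= cutoff else 0


# Synthetic inquiries: about 80% of campus visitors enroll, 20% of non-visitors.
rows, labels = [], []
for i in range(100):
    visited = i % 2
    if visited:
        enrolled = 0 if i % 10 == 9 else 1
    else:
        enrolled = 1 if i % 10 == 0 else 0
    rows.append([visited])
    labels.append(enrolled)

# Build the model with part of the data, test it with the rest.
train_rows, test_rows = rows[:70], rows[70:]
train_labels, test_labels = labels[:70], labels[70:]

b, w = train_logistic(train_rows, train_labels)
hits = sum(predict(b, w, x) == y for x, y in zip(test_rows, test_labels))
test_accuracy = hits / len(test_rows)
print(f"holdout accuracy: {test_accuracy:.0%}")
```

Once the holdout accuracy looks acceptable, the same `predict` call can be applied to next year's inquiries.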

Significance
“Our strategy is to focus additional recruitment efforts on those students who receive higher model scores. With the focus on a smaller segment of the population, tools such as the direct calling of prospective students becomes feasible”

“Our institution has used our predictive model to segment the market of prospective students by where they live…we find that 70% fall in zip codes that score in the highest 15% of all zip codes”

(Goenner and Pauls, p. 953, 2006)

Example 2: Will an applicant enroll? (Hayes, Price and York, 2013)

Presenter
Presentation Notes
As the table shows, the model correctly classified 75.7% of applicants using a cutoff value of .5. The model predicted total enrollment yield of 31.6% of accepted applicants, compared to the observed value of 43.6% (72.5% of actual enrollment). In order to validate the model, it was applied to a randomly selected holdout sample of 50% of the 2009-10 accepted applicants. Although a model usually fits the sample used for its estimation better than it fits the population, classification accuracy for the holdout sample was 78.1% using a cutoff of .5, as shown in Table 4 under “Unselected Cases”. (Hayes, Price and York, 2013)
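
A quick sketch of the arithmetic behind those figures. The two yield percentages come from the passage above; the five applicant probabilities and the `classify` helper are hypothetical and only show how a cutoff turns probabilities into a predicted yield.

```python
# Figures quoted above: predicted vs. observed enrollment yield.
predicted_yield = 0.316
observed_yield = 0.436

# The model's predicted enrollees as a share of actual enrollment.
share_of_actual = predicted_yield / observed_yield
print(f"share of actual enrollment predicted: {share_of_actual:.1%}")


def classify(probabilities, cutoff=0.5):
    """Label each applicant 1 (enroll) if the probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


# Hypothetical model probabilities for five accepted applicants.
probs = [0.82, 0.47, 0.55, 0.10, 0.31]

at_half = classify(probs)              # cutoff of .5, as in the passage
at_lower = classify(probs, cutoff=0.3)  # a lower cutoff predicts more enrollees

yield_at_half = sum(at_half) / len(at_half)
yield_at_lower = sum(at_lower) / len(at_lower)
print(yield_at_half, yield_at_lower)
```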

How did they do it?

• Formulated the question (Enrolled?: Yes or No)
• Chose an appropriate method: Logistic regression
• Chose variables that were accessible
  • ACT score
  • Discount
  • In state or out of state
• Prepped data
• Built model with part of data
• Tested model with part of data
• Can apply to future data

Significance
“While previous studies typically have utilized numerous variables obtained from a variety of sources, the current model requires only two variables that are readily available to any institution. Our model was able to predict enrollment decisions with a level of accuracy comparable to models employing more variables”

(Hayes, Price and York, p. 67, 2013)

Example 3: Will a prospect enroll? (Herridge and Heil, 2003)

How did they do it?

• Formulated the question (Likelihood of enrollment: score)
• Chose an appropriate method: Multiple linear regression
• Chose variables that were available across dataset and evaluated with the help of statistical tools
• Prepped data
• Built model with part of data
• Tested model with part of data
• Can apply to future data

Significance

(Herridge and Heil, 2003)

Presenter
Presentation Notes
Table 2 is a sample of our prospect search strategy to build this pool. Students from core market segments with higher model scores were automatically treated as inquiries, therefore bypassing the search process. These students immediately received ACU’s freshmen introduction packet (with viewbook, application and corresponding documents) while we speculated they would only be receiving search pieces from our competing institutions. Immediately treating these students as inquiries also allowed us to begin our telecounseling campaigns rather than waiting for search responses or telequalification results from an outside source. What resulted was a 16% increase in early applications.

On the opposite end of this spectrum, students from non-core segments with lower model scores were given a single opportunity to inquire through a search piece mailing. A non-respondent communication did not exist for this group. Students from non-core segments with higher model scores were treated with a traditional search piece and were given a non-respondent piece. Although we felt there was minimal risk in this initiative, this was much different than the traditional conservative approach to ACU’s past prospect search process. (Herridge and Heil, 2003)
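
The treatment rules described in that passage can be sketched as a simple lookup on segment and model score. The function name, the 0-to-1 score scale, and the 0.5 cutoff separating "higher" from "lower" scores are assumptions for illustration; the passage does not say where the line was drawn, and it does not describe the treatment for core-segment students with lower scores.

```python
def search_treatment(core_segment, model_score, high_score_cutoff=0.5):
    """Map a prospect to the search treatment described in the passage above."""
    high = model_score >= high_score_cutoff  # cutoff is a hypothetical value
    if core_segment and high:
        return "treat as inquiry: send introduction packet, start telecounseling"
    if not core_segment and high:
        return "traditional search piece plus non-respondent piece"
    if not core_segment and not high:
        return "single search piece, no non-respondent communication"
    # Core segment with a lower score: not covered in the quoted passage.
    return "not described in the passage"


print(search_treatment(core_segment=True, model_score=0.8))
```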

Significance

(Herridge and Heil, 2003)

Presenter
Presentation Notes
As these data show, there is a clear increase in the overall numbers and clear increases in the core segments, which we wanted to grow. Of particular note is the amazing growth in the Texas Church of Christ segment. Many on our campus had believed this market to be “tapped out” (Herridge and Heil, 2003)

Example 4: Will a prospect enroll? (Hansen, 2015)

Presenter
Presentation Notes
As Figure 1 shows, the ANN did a much better job of accurately predicting matriculation, correctly ranking 77% of matriculating students in the top cohort and identifying nearly 100% within the top three cohorts. (Hansen, 2015)
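
One way to read a result like that: rank prospects by model score, cut the ranked list into equal cohorts, and ask what share of the students who actually matriculated fall in the top cohorts. The sketch below does this on ten made-up prospects; the scores, outcomes, number of cohorts, and function name are all assumptions, not Hansen's data or method.

```python
def cumulative_capture(scores, matriculated, n_cohorts):
    """Rank by score, cut into equal cohorts, and return the cumulative
    share of matriculants captured through each cohort."""
    ranked = sorted(zip(scores, matriculated), key=lambda pair: pair[0], reverse=True)
    total = sum(matriculated)
    size = len(ranked) // n_cohorts
    captured, shares = 0, []
    for c in range(n_cohorts):
        start = c * size
        # The last cohort absorbs any leftover prospects.
        end = len(ranked) if c == n_cohorts - 1 else start + size
        captured += sum(m for _, m in ranked[start:end])
        shares.append(captured / total)
    return shares


# Hypothetical model scores and actual outcomes (1 = matriculated).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
matriculated = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

shares = cumulative_capture(scores, matriculated, n_cohorts=5)
print(shares)
```

A model that ranks well pushes these shares toward 100% within the first few cohorts, which is what lets recruiters concentrate effort on a small part of the pool.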

How did they do it?

• Formulated the question (Likelihood of enrollment: score)
• Chose an appropriate method: Machine learning
• Chose variables that were available across dataset and augmented with census data
• Prepped data
• “Trained” model with part of data
• Tested model with part of data
• Can apply to future data

Significance
“Application of machine learning and ANNs to our University admissions process has been simple to implement and quite effective at very low cost. The three models we have developed provide the University with useful information about prospects throughout the admissions process, accurately identifying prospects that will matriculate the following fall.”

(Hansen, pp. 5-6, 2015)

Where to Start

Easy ways to get started
• ACT predictive indices
• Talk to your vendors
• Academic articles

ACT predictive indices
• Mobility: In-state or out-of-state
• Institution Type: Public or private
• Selectivity: Open enrollment to highly selective
• Size: Small to large

Presenter
Presentation Notes
- Aggregate or individual student level
- Strategically target students for recruitment
- Predict whether a student will enroll at your institution
- Integrate into enrollment management system

(Bassiri and Moore, 2016)

Talk with your vendors
• Hobsons
  • Vast amounts of data from Naviance
  • Uses machine learning to analyze how students progress through the funnel
  • Can help allocate spend more effectively
  • Will soon offer predictive scores by high school through RepVisits

Talk with your vendors
• NRCCUA
  • Regression model built on 3 years of enrollment data
  • Scores names in database
  • Ability to purchase names by predictive score

Talk with your vendors
• AdWords
• Facebook
• Display/mobile ad vendors

What have others done and how?
• Visit your campus library!
• Google Scholar

Presenter
Presentation Notes
Don’t reinvent the wheel. See what others have done. Even if you don’t have any desire to do any predictive modeling yourself, digging into the literature can help prepare you to make better decisions about who to hire, what questions to ask, and which tools to use.

Trying it Yourself

What data is available?
• Data from CRM?
• Data from SIS?
• Data from name purchases?
• Data from website?
• External data (census, IRS, credit bureaus, etc.)?

Presenter
Presentation Notes
What data are you realistically able to use? Who will you need to convince to get to the data you want? Can you do a “pilot” project to bolster your case?

Selecting a statistical test
• Laerd statistical test selector
• University of Illinois statistical test selector
• What test did an existing model use?
• Make a friend on campus

Predictive modeling tools
• Excel
• R
• Minitab
• Weka
• SPSS
• SAS
• Rapid Insight
• IBM

Presenter
Presentation Notes
If you’re experimenting, don’t overdo it. Select a product that fits the question you are trying to answer, and offers the documentation/interface you need to get started.

References

Bassiri, D., & Moore, J. (2016). Predictive modeling for targeted recruitment and predicting enrollment. ACT Enrollment Planners Conference. Chicago, IL.

Cameron, D., & Conger, K. (2017, June 19). GOP Data Firm Accidentally Leaks Personal Details of Nearly 200 Million American Voters. Gizmodo. Retrieved from http://gizmodo.com/gop-data-firm-accidentally-leaks-personal-details-of-ne-1796211612

Goenner, C. F., & Pauls, K. (2006). A predictive model of inquiry to enrollment. Research in Higher Education, 47(8), 935-956.

Hansen, D. (2015). Machine Learning for Predictive Analytics Made Easy – A Case Study. Faculty Publications - Department of Electrical Engineering and Computer Science. Paper 15. http://digitalcommons.georgefox.edu/eecs_fac/15

Hayes, J. B., Price, R. A., & York, R. P. (2013). A simple model for estimating enrollment yield from a list of freshman prospects. Academy of Educational Leadership Journal, 17(2), 61.

Herridge, B., & Heil, R. (2004). Building a Better Applicant Pool–A Case Study of the Use of Predictive Modeling and Market Segmentation to Build and Enroll Better Pools of Students. Journal of Marketing for Higher Education, 13(1-2), 33-55.

Silver, N. (2016, November 08). 2016 Election Forecast. FiveThirtyEight. Retrieved from https://projects.fivethirtyeight.com/2016-election-forecast