Section 2.1 Introduction to Enterprise Miner

Upload: ashlee-ray

Post on 17-Dec-2015


Page 1: Section 2.1 Introduction to Enterprise Miner. 2 Objectives Open Enterprise Miner. Explore the workspace components of Enterprise Miner. Set up a project

Section 2.1

Introduction to Enterprise Miner


Objectives
Open Enterprise Miner.
Explore the workspace components of Enterprise Miner.
Set up a project in Enterprise Miner.
Conduct initial data exploration using Enterprise Miner.


Demonstration

This demonstration illustrates opening Enterprise Miner and exploring its workspace components.


The Scenario
Determine who should be approved for a home equity loan.
The target variable is a binary variable that indicates whether an applicant eventually defaulted on the loan.
The input variables include the amount of the loan, the amount due on the existing mortgage, the value of the property, and the number of recent credit inquiries.


Demonstration

This demonstration illustrates setting up a project in Enterprise Miner and conducting initial data exploration.


Section 2.2

Modeling Issues and Data Difficulties


Objectives
Discuss data difficulties inherent in data mining.
Examine common pitfalls in model building.


Time Line

[Figure: four timelines labeled Projected, Actual, Dreaded, and Needed, each dividing the Allotted Time into segments labeled Data Preparation, Data Analysis, and (Data Acquisition).]


Data Arrangement

Long-Narrow

Acct  type
2133  MTG
2133  SVG
2133  CK
2653  CK
2653  SVG
3544  MTG
3544  CK
3544  MMF
3544  CD
3544  LOC

Short-Wide

Acct  CK  SVG  MMF  CD  LOC  MTG
2133   1    1    0   0    0    1
2653   1    1    0   0    0    0
3544   1    0    1   1    1    1
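The rearrangement from long-narrow to short-wide can be sketched in a few lines of Python. This is only an illustration using the rows shown on the slide; the function name `to_short_wide` is hypothetical, and it is not how Enterprise Miner itself performs the step.

```python
from collections import defaultdict

# Long-narrow rows copied from the slide: one (account, product type) pair per row.
long_rows = [
    (2133, "MTG"), (2133, "SVG"), (2133, "CK"),
    (2653, "CK"), (2653, "SVG"),
    (3544, "MTG"), (3544, "CK"), (3544, "MMF"), (3544, "CD"), (3544, "LOC"),
]

# Column order of the short-wide table on the slide.
PRODUCTS = ["CK", "SVG", "MMF", "CD", "LOC", "MTG"]


def to_short_wide(rows, products=PRODUCTS):
    """Return {account: {product: 0 or 1}} with one entry per account."""
    held = defaultdict(set)
    for acct, product in rows:
        held[acct].add(product)
    return {
        acct: {p: int(p in owned) for p in products}
        for acct, owned in sorted(held.items())
    }


wide = to_short_wide(long_rows)
for acct, flags in wide.items():
    print(acct, *flags.values())
```

Each account ends up on a single row of 0/1 indicators, which is the one-case-per-row layout that predictive models usually expect.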


Derived Inputs

Claim Date  Accident Time  Delay  Season  Dark
11nov96     102396/12:38      19  fall       0
22dec95     012395/01:42     333  winter     1
26apr95     042395/03:05       3  spring     1
02jul94     070294/06:25       0  summer     0
08mar96     123095/18:33      69  winter     0
15dec96     061296/18:12     186  summer     0
09nov94     110594/22:14       4  fall       1
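As a rough sketch, the Delay, Season, and Dark inputs can be derived from the claim date and accident time like this. The slide does not state the rules behind Season or Dark, so both are assumptions here: seasons follow calendar months (December to February is winter, and so on), and "dark" means an accident hour of 20:00 or later, or earlier than 06:00. Those choices happen to reproduce the seven rows shown, but the course authors may have used different cutoffs. The function name `derive` is hypothetical.

```python
from datetime import date, datetime

MONTHS = {name: i for i, name in enumerate(
    "jan feb mar apr may jun jul aug sep oct nov dec".split(), start=1)}

# Assumption: month-based seasons (the slide does not define them).
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def derive(claim, accident, dark_from=20, dark_until=6):
    """Derive (delay in days, season, dark flag) from the two raw fields.

    claim    looks like '11nov96'      (day, month name, two-digit year)
    accident looks like '102396/12:38' (mmddyy/HH:MM)
    dark_from and dark_until are assumed hour cutoffs, not from the slide.
    """
    claim_date = date(1900 + int(claim[5:7]), MONTHS[claim[2:5]], int(claim[:2]))
    accident_dt = datetime.strptime(accident, "%m%d%y/%H:%M")
    delay = (claim_date - accident_dt.date()).days
    season = SEASONS[accident_dt.month]
    dark = int(accident_dt.hour >= dark_from or accident_dt.hour < dark_until)
    return delay, season, dark


print(derive("11nov96", "102396/12:38"))
print(derive("08mar96", "123095/18:33"))
```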


Roll Up

HH    Acct  Sales
4461  2133    160
4461  2244     42
4461  2773    212
4461  2653    250
4461  2801    122
4911  3544    786
5630  2496    458
5630  2635    328
6225  4244     27
6225  4165    759

HH    Acct  Sales
4461  2133  ?
4911  3544  ?
5630  2496  ?
6225  4244  ?
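The slide leaves the rolled-up Sales value as "?", so the right summary is a modeling choice. The sketch below, using the rows from the slide, computes three candidate summaries per household; the function name `roll_up` and the choice of summaries (count, total, largest) are illustrative assumptions, not the course's prescribed answer.

```python
from collections import defaultdict

# (household, account, sales) rows copied from the slide.
rows = [
    (4461, 2133, 160), (4461, 2244, 42), (4461, 2773, 212),
    (4461, 2653, 250), (4461, 2801, 122),
    (4911, 3544, 786),
    (5630, 2496, 458), (5630, 2635, 328),
    (6225, 4244, 27), (6225, 4165, 759),
]


def roll_up(rows):
    """Collapse account-level rows to one summary record per household."""
    sales = defaultdict(list)
    for hh, _acct, amount in rows:
        sales[hh].append(amount)
    return {
        hh: {"n_accts": len(v), "total": sum(v), "largest": max(v)}
        for hh, v in sales.items()
    }


summary = roll_up(rows)
for hh, stats in summary.items():
    print(hh, stats)
```

With these numbers every household has the same total sales, while the number of accounts and the largest single account differ, which suggests why one summary statistic alone can hide information.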


Rolling Up Longitudinal Data

Frequent Flier  Month  Flying Mileage  VIP Member
10621           Jan               650  No
10621           Feb                 0  No
10621           Mar                 0  No
10621           Apr               250  No
33855           Jan               350  No
33855           Feb               300  No
33855           Mar              1200  Yes
33855           Apr               850  Yes
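One possible way to roll these monthly records up to a single row per flier is sketched below. The summaries chosen (total mileage, months with any flying, change from first to last month, and whether the flier was ever a VIP member) are assumptions for illustration; the slide does not say which summaries to keep. The records are assumed to be listed in month order, as they are on the slide.

```python
from collections import defaultdict

# (flier, month, mileage, VIP member) records copied from the slide, in month order.
records = [
    (10621, "Jan", 650, "No"), (10621, "Feb", 0, "No"),
    (10621, "Mar", 0, "No"), (10621, "Apr", 250, "No"),
    (33855, "Jan", 350, "No"), (33855, "Feb", 300, "No"),
    (33855, "Mar", 1200, "Yes"), (33855, "Apr", 850, "Yes"),
]


def roll_up_fliers(records):
    """Summarize each flier's monthly history as one record."""
    history = defaultdict(list)
    for flier, _month, mileage, vip in records:
        history[flier].append((mileage, vip))
    summary = {}
    for flier, rows in history.items():
        miles = [m for m, _ in rows]
        summary[flier] = {
            "total_mileage": sum(miles),
            "months_flown": sum(m > 0 for m in miles),
            "change": miles[-1] - miles[0],  # last month minus first month
            "ever_vip": any(v == "Yes" for _, v in rows),
        }
    return summary


fliers = roll_up_fliers(records)
for flier, stats in fliers.items():
    print(flier, stats)
```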


Hard Target Search

[Figure: labels Transactions and Fraud.]


Oversampling

OK

Fraud


Undercoverage

Accepted Good

Rejected, No Follow-up

Accepted Bad

Next Generation


Errors, Outliers, and Missings

cking  #cking  ADB       NSF  dirdep  SVG  bal
Y      1       468.11    1    1876    Y    1208
Y      1       68.75     0    0       Y    0
Y      1       212.04    0    6            0
       .       .         0    0       Y    4301
y      2       585.05    0    7218    Y    234
Y      1       47.69     2    1256         238
Y      1       4687.7    0    0            0
       .       .         1    0       Y    1208
Y      .       .         .    1598         0
       1       0.00      0    0            0
Y      3       89981.12  0    0       Y    45662
Y      2       585.05    0    7218    Y    234


Missing Value Imputation

[Figure: a data grid of Cases by Inputs with missing cells marked "?".]
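The slide does not name an imputation method. As one simple possibility, the sketch below fills missing values of an input with the median of the observed values and keeps a 0/1 indicator of which cases were filled, so that a model can still use the fact that a value was missing. The function name `impute_median` and the sample values are hypothetical.

```python
from statistics import median


def impute_median(column):
    """Replace None with the median of the observed values.

    Returns (imputed values, missing indicators). Median fill is an
    assumed method for illustration, not one named on the slide.
    """
    observed = [x for x in column if x is not None]
    fill = median(observed)
    imputed = [fill if x is None else x for x in column]
    was_missing = [int(x is None) for x in column]
    return imputed, was_missing


# Hypothetical input column with two missing cases.
values, flags = impute_median([468.11, 68.75, None, 585.05, None])
print(values)
print(flags)
```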


The Curse of Dimensionality

1–D

2–D

3–D
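A small calculation can make the 1-D, 2-D, 3-D progression concrete. Assume, purely for illustration, that each input is cut into 10 equal-width bins and that 1,000 cases are spread evenly; neither number comes from the slide. The number of cells grows as a power of the dimension, so the average number of cases per cell collapses quickly.

```python
def mean_cases_per_cell(n_cases, n_dims, bins_per_dim=10):
    """Average cases per cell when each of n_dims inputs is cut into bins.

    n_cases and bins_per_dim are hypothetical illustration values.
    """
    return n_cases / bins_per_dim ** n_dims


for d in (1, 2, 3):
    print(f"{d}-D: {mean_cases_per_cell(1000, d)} cases per cell")
```

The same data that look dense along one input become sparse once several inputs are considered jointly.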


Dimension Reduction

[Figure: two panels, Redundancy and Irrelevancy; axis labels Input 1, Input 2, Input 3, and E(Target).]


Fool’s Gold

My model fits the training data perfectly...

I’ve struck it rich!


Data Splitting
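Data splitting can be sketched as a seeded random partition of the cases into a training set, used to fit the model, and a test set, held back to check it. The 67% training fraction, the seed, and the function name `split_cases` are hypothetical choices for illustration; the slide does not specify proportions, and this is not a description of how Enterprise Miner partitions data.

```python
import random


def split_cases(case_ids, train_fraction=0.67, seed=12345):
    """Randomly partition case ids into (training, test) lists.

    train_fraction and seed are assumed values, not from the slide.
    """
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is repeatable
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


train, test = split_cases(range(5000))
print(len(train), len(test))
```

Because the test cases play no part in fitting, performance on them is a fairer guide to how the model behaves on new data than performance on the training cases.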


Model Complexity

Too flexible

Not flexible enough


Overfitting

Training Set Test Set


Better Fitting

Training Set Test Set


Section 2.3

Introduction to Decision Trees


Objectives
Explore the general concept of decision trees.
Understand the different decision tree algorithms.
Discuss the benefits and drawbacks of decision tree models.


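Fitted Decision Tree

[Figure: a fitted decision tree with split variables DEBTINC, DELINQ, and NINQ; branch labels <45, 45, 0, 1-2, >2, 0,1, and >1; node percentages 2%, 10%, 21%, 45%, 45%, and 75%.]

New Case
DEBTINC = 20
NINQ = 2
DELINQ = 0
Income = 42K
BAD =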


Divide and Conquer

n = 5,000, 10% BAD

Debt-to-Income Ratio < 45

yes: n = 3,350, 5% BAD
no: n = 1,650, 21% BAD


The Cultivation of Trees
Split Search
– Which splits are to be considered?
Splitting Criterion
– Which split is best?
Stopping Rule
– When should the splitting stop?
Pruning Rule
– Should some branches be lopped off?


Possible Splits to Consider

[Figure: number of possible splits (1 to 500,000) plotted against Input Levels (2 to 20), with one curve for a Nominal Input and one for an Ordinal Input.]
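The chart's scale, from 1 to 500,000 splits over 2 to 20 input levels, is consistent with counting binary splits, although the slide does not say so; treat that as an assumption. Under it, an ordinal input with L levels allows L - 1 binary splits (one per cut point), while a nominal input allows 2^(L-1) - 1 (every way of dividing the levels into two non-empty groups). The function names below are hypothetical.

```python
def ordinal_binary_splits(levels):
    """Binary splits of an ordinal input: one per cut point between levels."""
    return levels - 1


def nominal_binary_splits(levels):
    """Binary splits of a nominal input: ways to form two non-empty groups."""
    return 2 ** (levels - 1) - 1


for levels in (2, 4, 10, 20):
    print(levels, ordinal_binary_splits(levels), nominal_binary_splits(levels))
```

At 20 levels the nominal count exceeds half a million, which matches the top of the chart's axis and shows why split search on high-cardinality nominal inputs is expensive.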


Splitting Criteria

Debt-to-Income Ratio < 45

         Left  Right  Total
Not Bad  3196   1304   4500
Bad       154    346    500

A Competing Three-Way Split

         Left  Center  Right  Total
Not Bad  2521    1188    791   4500
Bad       115     162    223    500

Perfect Split

         Left  Right  Total
Not Bad  4500      0   4500
Bad         0    500    500
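To compare the three candidate splits numerically, one option is the reduction in Gini impurity, sketched below with the counts from the slide. The slide does not say which criterion is intended, so Gini is an assumption here; other criteria, such as a chi-square test or entropy, may rank splits differently. The function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def gini_reduction(branches):
    """Impurity of the parent minus the size-weighted impurity of the branches.

    branches is a list of (not_bad, bad) counts, one pair per branch.
    """
    parent = [sum(col) for col in zip(*branches)]
    n = sum(parent)
    weighted = sum(sum(b) / n * gini(b) for b in branches)
    return gini(parent) - weighted


# Counts copied from the slide as (Not Bad, Bad) per branch.
perfect = [(4500, 0), (0, 500)]
two_way = [(3196, 154), (1304, 346)]
three_way = [(2521, 115), (1188, 162), (791, 223)]

for name, split in [("perfect", perfect), ("two-way", two_way), ("three-way", three_way)]:
    print(name, round(gini_reduction(split), 4))
```

The perfect split removes all of the parent node's impurity, which gives an upper bound against which the two realistic splits can be judged.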


The Right-Sized Tree

Stunting

Pruning


A Field Guide to Tree Algorithms

CART

AID, THAID, CHAID

ID3, C4.5, C5.0


Benefits of Trees

Interpretability

– tree-structured presentation

Mixed Measurement Scales

– nominal, ordinal, interval

Regression trees

Robustness

Missing Values


Benefits of Trees

Automatically
– Detects interactions (AID)
– Accommodates nonlinearity
– Selects input variables

[Figure: Multivariate Step Function; Prob plotted over two Input axes.]


Drawbacks of Trees

Roughness

Linear, Main Effects

Instability


Section 2.4

Building and Interpreting Decision Trees


Objectives
Explore the types of decision tree models available in Enterprise Miner.
Build a decision tree model.
Examine the model results and interpret these results.
Choose a decision threshold theoretically and empirically.


Demonstration

This demonstration illustrates building a decision tree model with Enterprise Miner and examining the results.


Consequences of a Decision

          Decision 1      Decision 0
Actual 1  True Positive   False Negative
Actual 0  False Positive  True Negative


Example
Recall the home equity line of credit scoring example. Presume that every two dollars loaned eventually returns three dollars if the loan is paid off in full.


Consequences of a Decision

          Decision 1                Decision 0
Actual 1  True Positive             False Negative (cost=$2)
Actual 0  False Positive (cost=$1)  True Negative


Bayes Rule

$$\frac{1}{1 + \dfrac{\text{cost of false negative}}{\text{cost of false positive}}}$$
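As a worked example, the expression 1 / (1 + cost of false negative / cost of false positive) can be evaluated with the costs from the home equity example ($2 for a false negative, $1 for a false positive). The sketch assumes the result is used as a cutoff on the predicted probability of class 1, with decision 1 made when the probability exceeds it; the function names are hypothetical.

```python
from fractions import Fraction


def bayes_threshold(cost_false_negative, cost_false_positive):
    """Evaluate 1 / (1 + cost of false negative / cost of false positive)."""
    ratio = Fraction(cost_false_negative) / Fraction(cost_false_positive)
    return 1 / (1 + ratio)


# Costs taken from the example: false negative $2, false positive $1.
threshold = bayes_threshold(2, 1)


def decide(p_class_1):
    """Assumed usage: choose decision 1 when the probability exceeds the cutoff."""
    return int(p_class_1 > threshold)


print(threshold)
print(decide(0.5), decide(0.2))
```

With these costs the cutoff is one third rather than one half, because missing a true class 1 case costs twice as much as a false alarm.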


Consequences of a Decision

          Decision 1                   Decision 0
Actual 1  True Positive (profit=$2)    False Negative
Actual 0  False Positive (profit=$-1)  True Negative
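The profit version of the matrix leads to the same cutoff if each case is given the decision with the larger expected profit. In the sketch below, the two cells without a stated profit (false negative and true negative) are assumed to be $0, and the function names are hypothetical.

```python
from fractions import Fraction

# PROFIT[actual][decision]; the $0 entries are assumptions, since the
# slide states a profit only for the two Decision 1 cells.
PROFIT = {
    1: {1: 2, 0: 0},
    0: {1: -1, 0: 0},
}


def expected_profit(decision, p_actual_1):
    """Expected profit of a decision given the probability that actual = 1."""
    p = Fraction(p_actual_1)
    return p * PROFIT[1][decision] + (1 - p) * PROFIT[0][decision]


def best_decision(p_actual_1):
    """Pick the decision with the larger expected profit (ties go to 0)."""
    return max((0, 1), key=lambda d: expected_profit(d, p_actual_1))


print(expected_profit(1, Fraction(1, 3)))
print(best_decision(Fraction(1, 2)), best_decision(Fraction(1, 5)))
```

Decision 1 has expected profit 2p - (1 - p) = 3p - 1, which is positive exactly when p exceeds one third, matching the cost-based cutoff.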


Demonstration

This demonstration illustrates using the target profile to select a decision threshold.