
Page 1: Sensitivity and Specificity in Predictive Modeling

Sensitivity and Specificity in Predictive Modeling

Sarajit Poddar

7 June 2015

Solving Workforce Problems using Analytics

Page 2: Sensitivity and Specificity in Predictive Modeling

Sensitivity

1. When a Predictive Model is applied to real-life data, Sensitivity is the probability of selecting the correct (positive) outcome.

2. For instance, if a Predictive Model is developed to identify High Performer employees who are likely to leave within 6 months, Sensitivity is the probability of identifying someone who will actually leave.

3. Sensitivity is also called the “True Positive Rate”.

Page 3: Sensitivity and Specificity in Predictive Modeling

Specificity

1. When applying the Predictive Model to real-life data, Specificity is the probability of correctly rejecting the incorrect (negative) outcome.

2. For instance, in the Predictive Model for identifying High Performer attrition, Specificity is the probability of not flagging someone who will not actually leave.

3. Specificity is also called the “True Negative Rate”.

Both rates can be computed directly from the four outcome counts, as in the sketch below.
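A minimal sketch in Python (the counts are illustrative assumptions, not from this deck):

    # Hypothetical outcome counts from applying an attrition model to past data
    tp = 40   # flagged as leavers, actually left      (True Positives)
    fn = 10   # not flagged, actually left             (False Negatives)
    tn = 80   # not flagged, actually stayed           (True Negatives)
    fp = 20   # flagged as leavers, actually stayed    (False Positives)

    sensitivity = tp / (tp + fn)   # True Positive Rate: 40 / 50 = 0.8
    specificity = tn / (tn + fp)   # True Negative Rate: 80 / 100 = 0.8

    print(f"Sensitivity = {sensitivity:.2f}, Specificity = {specificity:.2f}")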

Page 4: Sensitivity and Specificity in Predictive Modeling

Trade-off between Sensitivity and Specificity

1. Sensitivity: When we are too cautious in identifying the potential leavers, we may end up including in our pool someone who will not leave. Thus we end up with a bigger pool of identified employees than we should. If the organisation is devising initiatives for preventing attrition, it may have to allocate more funds than required. Thus, while sensitivity is high, specificity is low.

2. Specificity: When the organisation wants to restrict the pool size, it may set more stringent selection conditions. While this will screen out the “non-leavers”, it may also miss “potential leavers”. Thus, while specificity is high, sensitivity is low.

The sketch below illustrates how moving a single risk cut-off trades one rate against the other.
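A rough sketch in Python, assuming a model that outputs an attrition-risk score between 0 and 1 (the scores and outcomes are illustrative assumptions):

    # Each pair: (risk score from a hypothetical model, did the employee actually leave?)
    scored = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False),
              (0.4, False), (0.35, True), (0.2, False), (0.1, False), (0.05, False)]

    def rates(cutoff):
        # Everyone at or above the cutoff is flagged as a potential leaver.
        tp = sum(1 for s, left in scored if s >= cutoff and left)
        fn = sum(1 for s, left in scored if s < cutoff and left)
        tn = sum(1 for s, left in scored if s < cutoff and not left)
        fp = sum(1 for s, left in scored if s >= cutoff and not left)
        return tp / (tp + fn), tn / (tn + fp)

    for cutoff in (0.7, 0.5, 0.3):
        sens, spec = rates(cutoff)
        print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")

Lowering the cutoff flags more people: sensitivity climbs from 0.50 to 1.00 while specificity falls from 0.83 to 0.50.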

Page 5: Sensitivity and Specificity in Predictive Modeling

“Thus one needs to judge what is more important for addressing the issue at hand. If losing high-performing Sales employees is going to cost the company more (opportunity cost), perhaps increasing sensitivity is going to be more effective.”

Page 6: Sensitivity and Specificity in Predictive Modeling

False Positive (Type 1 Error)

If the predictive algorithm ends up selecting a high performer who has no “flight risk”, this is called a “False Positive”. It is “Positive” because the selection action has happened. It is “False” because the employee selected does not belong to the Target group.

Target group = High performers having high “flight risk”.

Page 7: Sensitivity and Specificity in Predictive Modeling

False Negative (Type 2 Error)

If the predictive algorithm fails to select a high performer who has significant “flight risk”, this is called a “False Negative”. It is “Negative” because someone from the Target group is not selected. It is “False” because the employee not selected belongs to the Target group.

Target group = High performers having high “flight risk”.

Page 8: Sensitivity and Specificity in Predictive Modeling

                          Actual Positive     Actual Negative

Test Outcome Positive     True Positive       False Positive

Test Outcome Negative     False Negative      True Negative
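The same layout can be tallied from paired labels. A minimal sketch in Python (the label pairs are illustrative assumptions):

    # (actual, predicted) pairs for a hypothetical model;
    # True = positive outcome (e.g. "will leave"), False = negative.
    pairs = [(True, True), (True, False), (False, True), (False, False),
             (True, True), (False, False), (False, True), (False, False)]

    cells = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in pairs:
        if predicted and actual:
            cells["TP"] += 1   # Test positive, condition positive
        elif predicted:
            cells["FP"] += 1   # Test positive, condition negative
        elif actual:
            cells["FN"] += 1   # Test negative, condition positive
        else:
            cells["TN"] += 1   # Test negative, condition negative

    print("             Actual +  Actual -")
    print(f"Predicted +  {cells['TP']:^8}  {cells['FP']:^8}")
    print(f"Predicted -  {cells['FN']:^8}  {cells['TN']:^8}")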

Page 9: Sensitivity and Specificity in Predictive Modeling

Re-visiting the Errors

Type 1 Error (False Positive)

Selecting a member outside the Target group

A relaxed selection algorithm, with loose filters, can admit someone outside the target group.

Type 2 Error (False Negative)

Failure to select a member within the Target group.

A stringent selection algorithm, with tight filters, can exclude someone even within the target group.

Page 10: Sensitivity and Specificity in Predictive Modeling

Applying the Concept to Talent Acquisition

Page 11: Sensitivity and Specificity in Predictive Modeling

Sensitivity & Specificity

Sensitivity

Probability of selecting high-quality candidates.

Increasing Sensitivity can mean relaxing the selection parameters, thus also allowing selection of “poor-quality candidates”.

Decreasing Sensitivity can mean setting stringent selection parameters, potentially losing out on “good-quality candidates”.

Specificity

Probability of not selecting poor-quality candidates.

Increasing Specificity can mean setting stringent selection parameters, thus increasing the chance of rejecting “poor-quality candidates”.

Decreasing Specificity can mean relaxing the selection parameters, thus failing to reject “poor-quality candidates”.

The sketch below makes this concrete with a pass-mark example.
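A small sketch of this mapping in Python (the assessment scores and hire quality are illustrative assumptions, with quality known in hindsight):

    # (assessment score, turned out to be a high-quality hire?)
    candidates = [(92, True), (85, True), (81, False), (74, True),
                  (70, False), (66, True), (60, False), (55, False)]

    def screen(pass_mark):
        # Everyone at or above the pass mark is selected.
        tp = sum(1 for s, good in candidates if s >= pass_mark and good)
        fn = sum(1 for s, good in candidates if s < pass_mark and good)
        tn = sum(1 for s, good in candidates if s < pass_mark and not good)
        fp = sum(1 for s, good in candidates if s >= pass_mark and not good)
        return tp / (tp + fn), tn / (tn + fp)

    for pass_mark in (80, 65):
        sens, spec = screen(pass_mark)
        print(f"pass mark {pass_mark}: sensitivity={sens:.2f}, specificity={spec:.2f}")

Relaxing the pass mark from 80 to 65 lifts sensitivity from 0.50 to 1.00 but drops specificity from 0.75 to 0.50, mirroring the bullets above.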

Page 12: Sensitivity and Specificity in Predictive Modeling

Type 1 and Type 2 Errors

Type 1 Error (False Positive)

Selecting “poor-quality candidates”.

Relaxed selection algorithm.

Type 2 Error (False Negative)

Rejecting “high-quality candidates”.

Stringent selection algorithm.

Page 13: Sensitivity and Specificity in Predictive Modeling

Important Ratios

Page 14: Sensitivity and Specificity in Predictive Modeling

Important Ratios

1. True positive rate (TPR), Sensitivity = Σ True positive / Σ Condition positive

2. True negative rate (TNR), Specificity = Σ True negative / Σ Condition negative

3. False positive rate (FPR), Fall-out = Σ False positive / Σ Condition negative

4. False negative rate (FNR), Miss rate = Σ False negative / Σ Condition positive

5. Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population

6. Prevalence = Σ Condition positive / Σ Total population

7. Positive predictive value (PPV), Precision = Σ True positive / Σ Test outcome positive

8. False discovery rate (FDR) = Σ False positive / Σ Test outcome positive

9. False omission rate (FOR) = Σ False negative / Σ Test outcome negative

10. Negative predictive value (NPV) = Σ True negative / Σ Test outcome negative

11. Positive likelihood ratio (LR+) = TPR / FPR

12. Negative likelihood ratio (LR−) = FNR / TNR

13. Diagnostic odds ratio (DOR) = LR+ / LR−

Source: Wikipedia
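As a minimal sketch (not part of the original deck), all thirteen ratios follow mechanically from the four confusion-matrix cell counts; in Python:

    def confusion_ratios(tp, fp, fn, tn):
        # Derived totals
        cond_pos = tp + fn            # Condition positive
        cond_neg = fp + tn            # Condition negative
        test_pos = tp + fp            # Test outcome positive
        test_neg = fn + tn            # Test outcome negative
        total = cond_pos + cond_neg
        tpr, tnr = tp / cond_pos, tn / cond_neg
        fpr, fnr = fp / cond_neg, fn / cond_pos
        return {
            "TPR/Sensitivity": tpr,
            "TNR/Specificity": tnr,
            "FPR/Fall-out": fpr,
            "FNR/Miss rate": fnr,
            "Accuracy": (tp + tn) / total,
            "Prevalence": cond_pos / total,
            "PPV/Precision": tp / test_pos,
            "FDR": fp / test_pos,
            "FOR": fn / test_neg,
            "NPV": tn / test_neg,
            "LR+": tpr / fpr,
            "LR-": fnr / tnr,
            "DOR": (tpr / fpr) / (fnr / tnr),
        }

For example, confusion_ratios(10, 200, 90, 600) reproduces the worked example on the next slide.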

Page 15: Sensitivity and Specificity in Predictive Modeling

Illustration

Scenario: Suppose, out of 900 sales employees, 100 are high performers. 10 among the high performers have left the company in the last 6 months, while 200 among the remaining 800 employees have left (and 600 have stayed). If a predictive algorithm is built that can predict this, what are the various ratios?

                                  Condition Positive     Condition Negative
                                  (High Performers)      (Not High Performers)

Test Outcome Positive (Left)           10 (TP)                200 (FP)

Test Outcome Negative (Stayed)         90 (FN)                600 (TN)

True positive rate (TPR), Sensitivity = Σ True positive / Σ Condition positive = 10 / 100 = 0.1

True negative rate (TNR), Specificity = Σ True negative / Σ Condition negative = 600 / 800 = 0.75

False positive rate (FPR), Fall-out = Σ False positive / Σ Condition negative = 200 / 800 = 0.25

False negative rate (FNR), Miss rate = Σ False negative / Σ Condition positive = 90 / 100 = 0.9

Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population = 610 / 900 ≈ 0.68

Prevalence = Σ Condition positive / Σ Total population = 100 / 900 ≈ 0.11

Positive predictive value (PPV), Precision = Σ True positive / Σ Test outcome positive = 10 / 210 ≈ 0.05

False discovery rate (FDR) = Σ False positive / Σ Test outcome positive = 200 / 210 ≈ 0.95

False omission rate (FOR) = Σ False negative / Σ Test outcome negative = 90 / 690 ≈ 0.13

Negative predictive value (NPV) = Σ True negative / Σ Test outcome negative = 600 / 690 ≈ 0.87

Positive likelihood ratio (LR+) = TPR / FPR = 0.1 / 0.25 = 0.4

Negative likelihood ratio (LR−) = FNR / TNR = 0.9 / 0.75 = 1.2

Diagnostic odds ratio (DOR) = LR+ / LR− = 0.4 / 1.2 ≈ 0.33

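To double-check the arithmetic above, a short standalone sketch in Python using the scenario's four cell counts:

    # Cell counts from the scenario (900 employees, 100 high performers)
    tp, fp, fn, tn = 10, 200, 90, 600

    tpr = tp / (tp + fn)                   # 10 / 100  = 0.10
    tnr = tn / (tn + fp)                   # 600 / 800 = 0.75
    fpr = fp / (tn + fp)                   # 200 / 800 = 0.25
    fnr = fn / (tp + fn)                   # 90 / 100  = 0.90
    acc = (tp + tn) / (tp + fp + fn + tn)  # 610 / 900 ≈ 0.68
    ppv = tp / (tp + fp)                   # 10 / 210  ≈ 0.05
    fdr = fp / (tp + fp)                   # 200 / 210 ≈ 0.95
    npv = tn / (fn + tn)                   # 600 / 690 ≈ 0.87
    print(tpr, tnr, acc, ppv, fdr, npv)

The likelihood ratios follow directly: LR+ = 0.10 / 0.25 = 0.4, LR− = 0.90 / 0.75 = 1.2, so DOR = 0.4 / 1.2 ≈ 0.33.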

Page 16: Sensitivity and Specificity in Predictive Modeling

Thank you