Sensitivity, Specificity and ROC Curve Analysis

Upload: horatio-short

Post on 17-Dec-2015


Criteria for Evaluating a Screening Test

•Validity: provides a good indication of who does and does not have disease

-Sensitivity of the test

-Specificity of the test

•Reliability (precision): gives consistent results when given to the same person under the same conditions

•Yield: amount of disease detected in the population, relative to the effort

-Prevalence of disease/predictive value

Validity of Screening Test (Accuracy)

-Sensitivity: Is the test detecting true cases of disease? Ideal is 100%: 100% of cases are detected; = Pr(T+|D+)

-Specificity: Is the test excluding those without disease? Ideal is 100%: 100% of non-cases are negative; = Pr(T-|D-)

-See Gehlbach, Chp. 10

Example: Screening for Glaucoma using IOP

                True Cases of Glaucoma
                Yes       No
IOP > 22: Yes   50        100
          No    50        1900
(total)         100       2000

Sensitivity = 50% (50/100); False Negative = 50%

Specificity = 95% (1900/2000); False Positive = 5%
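The arithmetic behind the glaucoma table can be checked with a few lines of Python. This is a minimal sketch: the function name and argument order are illustrative choices, and only the four cell counts come from the slide.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # Pr(T+ | D+): detected cases / all true cases
    specificity = tn / (tn + fp)  # Pr(T- | D-): negative non-cases / all non-cases
    return sensitivity, specificity


# Cell counts from the glaucoma table: rows are IOP > 22 (yes/no),
# columns are true glaucoma (yes/no).
TP, FP = 50, 100
FN, TN = 50, 1900

se, sp = sensitivity_specificity(TP, FP, FN, TN)
print(f"Sensitivity = {se:.0%}, false negative rate = {1 - se:.0%}")
print(f"Specificity = {sp:.0%}, false positive rate = {1 - sp:.0%}")
```

The false negative and false positive rates are simply the complements of sensitivity and specificity, which is why the slide reports them side by side.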

Where do we set the cut-off for a screening test?

Consider:

-The impact of a high number of false positives: anxiety, cost of further testing

-The importance of not missing a case: seriousness of disease, likelihood of re-screening

Yield from the Screening Test: Predictive Value

•Relationship between Sensitivity, Specificity, and Prevalence of Disease

-When prevalence is low, even a highly specific test will give large numbers of false positives

•Predictive Value of a Positive Test (PPV): Likelihood that a person with a positive test has the disease

•Predictive Value of a Negative Test (NPV): Likelihood that a person with a negative test does not have the disease

Screening for Glaucoma using IOP

                True Cases of Glaucoma
                Yes       No
IOP > 22: Yes   50        100
          No    50        1900
(total)         100       2000

Specificity = 95% (1900/2000); False Positive = 5%

Positive Predictive Value = 33% (50/150)
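To see how predictive value depends on prevalence, the same numbers can be pushed through Bayes' rule. The sketch below is illustrative only: the function name is made up, and the prevalence values in the loop are hypothetical, chosen to show the trend. The sensitivity, specificity and the 100/2100 prevalence come from the table.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Glaucoma example: 100 true cases among 2100 people screened.
ppv, npv = predictive_values(0.50, 0.95, 100 / 2100)
print(f"PPV = {ppv:.0%}, NPV = {npv:.1%}")

# Hypothetical prevalences (not from the slides) with the same test.
for prevalence in (0.001, 0.01, 0.05, 0.20):
    ppv_at_prev, _ = predictive_values(0.50, 0.95, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV = {ppv_at_prev:.1%}")
```

With sensitivity and specificity held fixed, the PPV falls as the disease becomes rarer, which is the point of the low-prevalence remark above.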

How Good does a Screening Test have to be?

IT DEPENDS

-Seriousness of disease, consequences of high false positivity rate:

-Rapid HIV test should have >90% sensitivity, 99.9% specificity

-Proposed screen for nearsightedness in children: 80% sensitivity, >95% specificity

-Pre-natal genetic questionnaire could be 99% sensitive, 80% specific

Choosing a cut-point: receiver operating characteristic curves

• Situation where screening test yields results as a continuous value (e.g., intraocular pressure for glaucoma)

• Want to select a value above (or below) which to call “diseased” or “at risk”

• How do we select that value?

[Figure: distributions of non-diseased cases and diseased cases along the test result value (or subjective judgment of likelihood that case is diseased), with a threshold marked.]

[Figure: "More typically:" the same plot of non-diseased and diseased cases, with a threshold marked.]

[Figure: threshold placement under a less aggressive, a moderate, and a more aggressive mindset. Each panel shows the threshold on the non-diseased/diseased distributions and the corresponding point on a plot of TP Fraction (sensitivity) against FP Fraction (1-specificity). The final panel shows the entire ROC curve.]

[Figure: entire ROC curves, TP Fraction (sensitivity) against FP Fraction (1-specificity), with the chance line. Curves differ by reader skill and/or level of technology: highly discriminating (good), somewhat discriminating (not as good), non-informative (no better than chance).]

Use the area under the curve (AUC) to judge discriminating ability.

Gehlbach: want AUC>80%
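A minimal sketch of how an ROC curve and its AUC can be computed by sweeping the threshold over a continuous test result. The function names and the six scores and labels are hypothetical, invented for illustration. A case is called positive when its score is at or above the threshold, and the area is taken with the trapezoid rule.

```python
def roc_points(scores, labels):
    """Return (FP fraction, TP fraction) points, one per distinct threshold.

    labels are 1 for diseased and 0 for non-diseased; a case is called
    positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results (not from the slides).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
auc = auc_trapezoid(curve)
print(curve)
print(f"AUC = {auc:.3f}")
```

Lowering the threshold moves along the curve toward the upper right: more true positives are caught at the cost of more false positives.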

Luke Neff: Refractory Burn Shock Data
Logistic Regression and ROC Curve Analysis

Response Profile

Ordered Value   PET   Total Frequency
1               0     22
2               1     20

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   20.2651      1    <.0001
Score              15.3270      1    <.0001
Wald               10.1930      1    0.0014

Analysis of Maximum Likelihood Estimates

Parameter           DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept           1    -3.0649    0.9514           10.3771           0.0013
Admission Lactate   1    0.8436     0.2642           10.1930           0.0014

Odds Ratio Estimates

Effect              Point Estimate   95% Wald Confidence Limits
Admission Lactate   2.325            1.385   3.902
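The odds ratio and its Wald limits follow from the lactate estimate and standard error by exponentiating. The sketch below reproduces that calculation. The variable names are illustrative, and the 1.96 multiplier is the usual normal quantile for a 95% interval, assumed here rather than stated on the slide.

```python
import math

# Reported maximum likelihood estimate for Admission Lactate.
ESTIMATE = 0.8436
STD_ERROR = 0.2642
Z_95 = 1.96  # assumed normal quantile for a 95% Wald interval

odds_ratio = math.exp(ESTIMATE)
lower = math.exp(ESTIMATE - Z_95 * STD_ERROR)
upper = math.exp(ESTIMATE + Z_95 * STD_ERROR)

print(f"Odds ratio per unit of lactate: {odds_ratio:.3f}")
print(f"95% Wald limits: {lower:.3f} to {upper:.3f}")
```

Read this way, each one-unit increase in admission lactate multiplies the odds of the outcome by a bit more than two.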

Area     Standard Error   95% Wald Confidence Limits
0.8489   0.0633           0.7249   0.9729

Pred Prob   True Pos   True Neg   False Pos   False Neg   Se     1 - Sp
0.9995      1          22         0           19          0.05   0
0.9863      2          22         0           18          0.1    0
0.9838      3          22         0           17          0.15   0
0.96        4          22         0           16          0.2    0
0.9402      6          22         0           14          0.3    0
0.9353      7          22         0           13          0.35   0
0.9182      8          22         0           12          0.4    0
0.889       9          22         0           11          0.45   0
0.8401      10         22         0           10          0.5    0
0.8284      11         22         0           9           0.55   0
0.7894      12         22         0           8           0.6    0
0.675       12         21         1           8           0.6    0.05
0.637       12         20         2           8           0.6    0.09
0.5767      12         18         4           8           0.6    0.18
0.5351      13         17         5           7           0.65   0.23
0.493       14         17         5           6           0.7    0.23
0.4302      14         16         6           6           0.7    0.27
0.4096      15         16         6           5           0.75   0.27
0.3894      16         16         6           4           0.8    0.27
0.3695      17         16         6           3           0.85   0.27
0.3312      18         15         7           2           0.9    0.32
0.3127      18         14         8           2           0.9    0.36
0.2611      18         13         9           2           0.9    0.41
0.2299      18         12         10          2           0.9    0.45
0.1881      19         10         12          1           0.95   0.55
0.1637      19         8          14          1           0.95   0.64
0.1525      19         7          15          1           0.95   0.68
0.1419      19         5          17          1           0.95   0.77
0.1226      19         4          18          1           0.95   0.82
0.1056      19         2          20          1           0.95   0.91
0.0907      19         1          21          1           0.95   0.95
0.0718      20         0          22          0           1      1

Corresponds to lactate value of about 3.0

Point that maximizes the sum of sensitivity and specificity.
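A sketch of two calculations behind the annotations: the Se, 1 - Sp and Se + Sp values for one row of the table, and the lactate value a predicted-probability cut-point maps back to under the fitted model. The function names are illustrative. The row used is the one with predicted probability 0.3695, and the intercept and slope are the reported estimates. The mapping assumes the predicted probability comes from that one-variable logistic model.

```python
import math

# Reported estimates from the logistic regression on admission lactate.
INTERCEPT = -3.0649
SLOPE = 0.8436


def row_summary(tp, tn, fp, fn):
    """Return (Se, 1 - Sp, Se + Sp) for one row of the classification table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, 1 - specificity, sensitivity + specificity


def lactate_for_probability(prob):
    """Invert the fitted model: the lactate whose predicted probability is prob."""
    logit = math.log(prob / (1 - prob))
    return (logit - INTERCEPT) / SLOPE


# Row with Pred Prob 0.3695: True Pos 17, True Neg 16, False Pos 6, False Neg 3.
se, fp_fraction, se_plus_sp = row_summary(tp=17, tn=16, fp=6, fn=3)
lactate = lactate_for_probability(0.3695)

print(f"Se = {se:.2f}, 1 - Sp = {fp_fraction:.2f}, Se + Sp = {se_plus_sp:.2f}")
print(f"Lactate at this cut-point: {lactate:.1f}")
```

Applying `row_summary` to every row and comparing the sums is one way to look for the cut-point that maximizes sensitivity plus specificity.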