
Page 1: DIAGNOSTIC  TESTS

DIAGNOSTIC TESTS

Assist. Prof. E. Çiğdem Kaspar, Yeditepe University Faculty of Medicine, Department of Biostatistics and Medical Informatics

Page 2: DIAGNOSTIC  TESTS

Why do we need a diagnostic test?

We need “information” to make a decision, and that “information” is usually the result of a test. Medical tests are used:

To screen for a risk factor (screening test)
To diagnose a disease (diagnostic test)
To estimate a patient’s prognosis (prognostic test)

When, and in whom, should a test be done? When the “information” from the test result has value.

Page 3: DIAGNOSTIC  TESTS

Value of a diagnostic test

The ideal diagnostic test:

Always gives the right answer: a positive result in everyone with the disease, a negative result in everyone else
Is quick, safe, simple, painless, reliable and inexpensive

But few, if any, tests are ideal. Thus there is a need for clinically useful substitutes.

Page 4: DIAGNOSTIC  TESTS

Is the test useful?

Reproducibility (precision)
Accuracy (compared with a “gold standard”)
Feasibility
Effects on clinical decisions
Effects on outcomes

Page 5: DIAGNOSTIC  TESTS

Determining Usefulness of a Medical Test

Question 1: How reproducible is the test?

Possible designs: Studies of intra- and inter-observer and intra- and inter-laboratory variability.

Statistics for results: Proportion agreement, kappa, coefficient of variation, mean and distribution of differences (avoid the correlation coefficient).

Page 6: DIAGNOSTIC  TESTS

Determining Usefulness of a Medical Test

Question 2: How accurate is the test?

Possible designs: Cross-sectional, case-control, or cohort-type designs in which the test result is compared with a “gold standard”.

Statistics for results: Sensitivity, specificity, PV+, PV-, ROC curves, LRs.

Page 7: DIAGNOSTIC  TESTS

Determining Usefulness of a Medical Test

Question 3: How often do test results affect clinical decisions?

Possible designs: Diagnostic yield studies, studies of pre- and post-test clinical decision making.

Statistics for results: Proportion abnormal, proportion with discordant results, proportion of tests leading to changes in clinical decisions; cost per abnormal result or per decision change.

Page 8: DIAGNOSTIC  TESTS

Determining Usefulness of a Medical Test

Question 4: What are the costs, risks, and acceptability of the test?

Possible designs: Prospective or retrospective studies.

Statistics for results: Mean cost, proportion experiencing adverse effects, proportion willing to undergo the test.

Page 9: DIAGNOSTIC  TESTS

Determining Usefulness of a Medical Test

Question 5: Does doing the test improve clinical outcomes, or does it have adverse effects?

Possible designs: Randomized trials, cohort or case-control studies in which the predictor variable is receiving the test and the outcome includes morbidity, mortality, or costs related either to the disease or to its treatment.

Statistics for results: Risk ratios, odds ratios, hazard ratios, number needed to treat, rates and ratios of desirable and undesirable outcomes.

Page 10: DIAGNOSTIC  TESTS

Common Issues for Studies of Medical Tests

Spectrum of disease severity and test results: is there a difference between the sample and the population? Almost all tests do well on the very sick and the very well; the real difficulty is distinguishing healthy people from those with early, presymptomatic disease.

⇒ Subjects should have a spectrum of disease that reflects the clinical use of the test.

Page 11: DIAGNOSTIC  TESTS

Common Issues for Studies of Medical Tests

Sources of variation: differences between patients, observers’ skill, equipment.

⇒ Several different institutions should be sampled to obtain a generalizable result.

Page 12: DIAGNOSTIC  TESTS

Common Issues for Studies of Medical Tests

Importance of blinding (where possible): it minimizes observer bias. Example: ultrasound to diagnose appendicitis (blinded interpretation differs from ordinary clinical practice).

Page 13: DIAGNOSTIC  TESTS

Studies of the Accuracy of Tests

Does the test give the right answer? “Tests” in clinical practice include:

Symptoms
Signs
Laboratory tests
Imaging tests

⇒ To find the right answer, a “gold standard” is required.

Page 14: DIAGNOSTIC  TESTS

How accurate is the test?

Validating tests against a gold standard: new tests should be validated by comparison against an established gold standard in appropriate subjects. Diagnostic tests are seldom 100% accurate (false positives and false negatives will occur).

Page 15: DIAGNOSTIC  TESTS

Describing the performance of a new diagnostic test

Physicians are often faced with the task of evaluating the merit of a new diagnostic test. An adequate critical appraisal of a new test requires a working knowledge of the properties of diagnostic tests and the mathematical relationships between them.

Page 16: DIAGNOSTIC  TESTS

The gold standard test: Assessing a new diagnostic test begins with the identification of a group of patients known to have the disorder of interest, using an accepted reference test known as the gold standard.

Limitations:
1) The gold standard is often the most risky, technically difficult, expensive, or impractical of the available diagnostic options.
2) For some conditions, no gold standard is available.

Page 17: DIAGNOSTIC  TESTS

The basic idea of diagnostic test interpretation is to calculate the probability a patient has a disease under consideration given a certain test result.  A 2 by 2 table can be used for this purpose. Be sure to label the table with the test results on the left side and the disease status on top as shown here: 

Test result | Disease present | Disease absent
Positive | True positive (TP) | False positive (FP)
Negative | False negative (FN) | True negative (TN)

Page 18: DIAGNOSTIC  TESTS

The sensitivity of a diagnostic test is the probability that a diseased individual will have a positive test result. Sensitivity is the true positive rate (TPR) of the test.

Sensitivity = P(T+|D+) = TPR = TP / (TP + FN) = (diseased with a positive test) / (all diseased)

Page 19: DIAGNOSTIC  TESTS

The specificity of a diagnostic test is the probability that a disease-free individual will have a negative test result. Specificity is the true negative rate (TNR) of the test.

Specificity = P(T-|D-) = TNR = TN / (TN + FP) = (disease-free with a negative test) / (all disease-free)

Page 20: DIAGNOSTIC  TESTS

False-positive rate: The likelihood that a nondiseased patient has an abnormal test result.

FPR = P(T+|D-) = FP / (FP + TN) = (disease-free with a positive test) / (all disease-free)

Page 21: DIAGNOSTIC  TESTS

False-negative rate: The likelihood that a diseased patient has a normal test result.

FNR = P(T-|D+) = FN / (FN + TP) = (diseased with a negative test) / (all diseased)

Page 22: DIAGNOSTIC  TESTS

Pretest Probability is the estimated likelihood of disease before the test is done. It is the same thing as prior probability and is often estimated. If a defined population of patients is being evaluated, the pretest probability is equal to the prevalence of disease in the population. It is the proportion of total patients who have the disease.

P(D+) = (TP+FN) / (TP+FP+TN+FN)
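As an illustration of the definitions on the preceding slides, here is a minimal Python sketch (the function name and the example counts are illustrative, not from the slides) that computes these quantities from the four cells of a 2 by 2 table:

def test_characteristics(tp, fp, fn, tn):
    """Basic test characteristics from a 2x2 table."""
    diseased = tp + fn            # all diseased (D+)
    disease_free = tn + fp        # all disease-free (D-)
    total = diseased + disease_free
    return {
        "sensitivity": tp / diseased,      # P(T+|D+) = TPR
        "specificity": tn / disease_free,  # P(T-|D-) = TNR
        "FPR": fp / disease_free,          # P(T+|D-)
        "FNR": fn / diseased,              # P(T-|D+)
        "prevalence": diseased / total,    # pretest probability P(D+)
    }

# Hypothetical counts, for illustration only:
print(test_characteristics(tp=90, fp=40, fn=10, tn=860))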

Page 23: DIAGNOSTIC  TESTS

Sensitivity and specificity describe how well the test discriminates between patients with and without disease. They address a different question than we want answered when evaluating a patient, however. What we usually want to know is: given a certain test result, what is the probability of disease? This is the predictive value of the test.

Page 24: DIAGNOSTIC  TESTS

Predictive value of a positive test is the proportion of patients with positive tests who have disease.

PVP=P(D+|T+) = TP / (TP+FP)

This is the same thing as posttest probability of disease given a positive test. It measures how well the test rules in disease.

Page 25: DIAGNOSTIC  TESTS

Predictive value of a negative test is the proportion of patients with negative tests who do not have disease. In probability notation:

PVN = P(D-|T-) = TN / (TN+FN)

It measures how well the test rules out disease. This is posttest probability of non-disease given a negative test.

Page 26: DIAGNOSTIC  TESTS

Evaluating a 2 by 2 table is simple if you are methodical in your approach. 

Test result | Disease present | Disease absent | Total
Positive | TP | FP | Total positive
Negative | FN | TN | Total negative
Total | Total with disease | Total without disease | Grand total

Page 27: DIAGNOSTIC  TESTS

Bayes’ Rule Method

Bayes’ rule is a mathematical formula that may be used as an alternative to the back-calculation method for obtaining unknown conditional probabilities such as PVP or PVN from known conditional probabilities such as sensitivity and specificity.

The general form of Bayes’ rule is

P(A|B) = P(A) P(B|A) / [ P(A) P(B|A) + P(not A) P(B|not A) ]

Using Bayes’ rule, PVP and PVN are defined as

PVP = P(D+|T+) = [ P(D+) TPR ] / [ P(D+) TPR + (1 - P(D+)) FPR ]

PVN = P(D-|T-) = [ P(D-) TNR ] / [ P(D-) TNR + (1 - P(D-)) FNR ]
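A small Python sketch of these Bayes’ rule formulas (the function and argument names are mine, not from the slides); plugging in the values from the DST example on the following slides reproduces its PVP and PVN:

def predictive_values(prevalence, sensitivity, specificity):
    """PVP and PVN from prevalence, sensitivity and specificity (Bayes' rule)."""
    fpr = 1.0 - specificity
    fnr = 1.0 - sensitivity
    pvp = (prevalence * sensitivity) / (
        prevalence * sensitivity + (1.0 - prevalence) * fpr)
    pvn = ((1.0 - prevalence) * specificity) / (
        (1.0 - prevalence) * specificity + prevalence * fnr)
    return pvp, pvn

# DST example: prevalence 215/368, sensitivity 84/215, specificity 148/153
print(predictive_values(215 / 368, 84 / 215, 148 / 153))  # ~ (0.944, 0.53)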

Page 28: DIAGNOSTIC  TESTS

Example The following table summarizes results of a study to evaluate the dexamethasone suppression test (DST) as a diagnostic test for major depression. The study compared results on the DST to those obtained using the gold standard procedure (routine psychiatric assessment and structured interview) in 368 psychiatric patients.

1. What is the prevalence of major depression in the study group?

2. For the DST, determine:
a) Sensitivity and specificity
b) False positive rate (FPR) and false negative rate (FNR)
c) Predictive value positive (PVP) and predictive value negative (PVN)

Page 29: DIAGNOSTIC  TESTS

DST result | Depression present | Depression absent | Total
+ | 84 | 5 | 89
- | 131 | 148 | 279
Total | 215 | 153 | 368

Sensitivity = P(T+|D+) = TPR = TP / (TP + FN) = 84/215 = 0.391
Specificity = P(T-|D-) = TNR = TN / (TN + FP) = 148/153 = 0.967
FPR = P(T+|D-) = FP / (FP + TN) = 5/153 = 0.033
FNR = P(T-|D+) = FN / (FN + TP) = 131/215 = 0.609
PVP = P(D+|T+) = TP / (TP + FP) = 84/89 = 0.944
PVN = P(D-|T-) = TN / (TN + FN) = 148/279 = 0.53
P(D+) = 215/368 = 0.584
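These figures can be checked directly from the four cell counts; a short Python cross-check (plain arithmetic, nothing beyond the formulas already given):

tp, fp, fn, tn = 84, 5, 131, 148    # DST example counts

print(round(tp / (tp + fn), 3))               # sensitivity -> 0.391
print(round(tn / (tn + fp), 3))               # specificity -> 0.967
print(round(fp / (fp + tn), 3))               # FPR         -> 0.033
print(round(fn / (fn + tp), 3))               # FNR         -> 0.609
print(round(tp / (tp + fp), 3))               # PVP         -> 0.944
print(round(tn / (tn + fn), 3))               # PVN         -> 0.53
print(round((tp + fn) / (tp + fp + fn + tn), 3))  # prevalence -> 0.584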

Page 30: DIAGNOSTIC  TESTS

FNR = 1 - Sensitivity = 1 - 0.391 = 0.609
FPR = 1 - Specificity = 1 - 0.967 = 0.033

Page 31: DIAGNOSTIC  TESTS

Validating tests against a gold standard

A test is valid if:

It detects most people with the disorder (high sensitivity)
It excludes most people without the disorder (high specificity)
A positive test usually indicates that the disorder is present (high PV+)

The best measure of the usefulness of a test is the LR: how much more likely a positive test is to be found in someone with, as opposed to without, the disorder.

Page 32: DIAGNOSTIC  TESTS

ROC (Receiver Operating Characteristic) CURVE

We want to be able to compare the accuracy of diagnostic tests. Sensitivity and specificity are candidate measures for accuracy, but have some problems, as we’ll see.

ROC curves are an alternative measure.

We plot sensitivity against 1 - specificity to create the ROC curve for a test.

Page 33: DIAGNOSTIC  TESTS

ROC (Receiver Operating Characteristic) CURVE

The ROC curve is a graphic representation of the relationship between sensitivity and specificity for a diagnostic test. It provides a simple tool for applying the predictive value method to the choice of a positivity criterion.

The ROC curve is constructed by plotting the true positive rate (sensitivity) against the false positive rate (1 - specificity) for several choices of the positivity criterion.
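A minimal Python sketch of this construction (the names are illustrative; it assumes that higher test values indicate disease, so “positive” means a value at or above the cutoff):

def roc_points(diseased, disease_free):
    """(FPR, TPR) pairs obtained by sweeping the positivity criterion."""
    cutoffs = sorted(set(diseased) | set(disease_free), reverse=True)
    points = [(0.0, 0.0)]
    for c in cutoffs:
        tpr = sum(x >= c for x in diseased) / len(diseased)          # sensitivity
        fpr = sum(x >= c for x in disease_free) / len(disease_free)  # 1 - specificity
        points.append((fpr, tpr))
    return points   # the last point is (1.0, 1.0)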

Page 34: DIAGNOSTIC  TESTS

Plotting the ROC curve is a popular way of displaying the discriminatory accuracy of a diagnostic test for detecting whether or not a patient has a disease or condition.

ROC methodology is derived from signal detection theory [1], where it is used to determine whether an electronic receiver is able to satisfactorily distinguish between signal and noise. It has been used in medical imaging and radiology, psychiatry, non-destructive testing and manufacturing inspection systems.

Page 35: DIAGNOSTIC  TESTS

ROC Curve (figure): sensitivity plotted against 1 - specificity, both on a scale from 0.00 to 1.00. Diagonal segments are produced by ties.

Page 36: DIAGNOSTIC  TESTS

Specific Example (figure): distributions of the test result for patients with the disease and for patients without the disease, plotted on the same axis.

Page 37: DIAGNOSTIC  TESTS

(Figure) A threshold is placed on the test result: results above the threshold are called “positive”, results below it are called “negative”.

Page 38: DIAGNOSTIC  TESTS

Some definitions...

(Figure) True positives: patients with the disease whose test result falls on the “positive” side of the threshold.

Page 39: DIAGNOSTIC  TESTS

(Figure) False positives: patients without the disease whose test result falls on the “positive” side of the threshold.

Page 40: DIAGNOSTIC  TESTS

(Figure) True negatives: patients without the disease whose test result falls on the “negative” side of the threshold.

Page 41: DIAGNOSTIC  TESTS

(Figure) False negatives: patients with the disease whose test result falls on the “negative” side of the threshold.

Page 42: DIAGNOSTIC  TESTS

(Figure) Moving the threshold to the right: fewer results are called “+”, so sensitivity falls and specificity rises.

Page 43: DIAGNOSTIC  TESTS

(Figure) Moving the threshold to the left: more results are called “+”, so sensitivity rises and specificity falls.

Page 44: DIAGNOSTIC  TESTS

(Figure) ROC curve: true positive rate (sensitivity) on the vertical axis versus false positive rate (1 - specificity) on the horizontal axis, both from 0% to 100%.

Page 45: DIAGNOSTIC  TESTS

RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE

Example: SGPT and hepatitis. (Figure: sensitivity vs. 1 - specificity.)

SGPT | D+ | D- | Sum
<50 | 10 | 190 | 200
50-99 | 15 | 135 | 150
100-149 | 25 | 65 | 90
150-199 | 30 | 30 | 60
200-249 | 35 | 15 | 50
250-299 | 120 | 10 | 130
>300 | 65 | 5 | 70
Sum | 300 | 450 | 750
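The ROC points behind this example can be computed from the grouped counts in the table. A Python sketch follows; it treats an SGPT value at or above the lower bound of a row as a positive test, which is an assumption on my part since the slide only shows the resulting curve:

bins = [("<50", 10, 190), ("50-99", 15, 135), ("100-149", 25, 65),
        ("150-199", 30, 30), ("200-249", 35, 15), ("250-299", 120, 10),
        (">300", 65, 5)]            # (SGPT range, D+, D-)

total_dpos = sum(d for _, d, _ in bins)   # 300 with hepatitis
total_dneg = sum(n for _, _, n in bins)   # 450 without

for i in range(1, len(bins)):
    # positive test = SGPT in bins[i:], i.e. at or above that row's lower bound
    tpr = sum(d for _, d, _ in bins[i:]) / total_dpos
    fpr = sum(n for _, _, n in bins[i:]) / total_dneg
    print(bins[i][0], round(tpr, 3), round(fpr, 3))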

Page 46: DIAGNOSTIC  TESTS

ROC curve comparison (figures): a good test has a curve that rises quickly toward the upper-left corner, while a poor test has a curve that stays close to the diagonal. Axes: true positive rate (0% to 100%) versus false positive rate (0% to 100%).

Page 47: DIAGNOSTIC  TESTS

ROC curve extremes (figures): for the best test the two distributions don’t overlap at all and the curve passes through the upper-left corner; for the worst test the distributions overlap completely and the curve lies along the diagonal. Axes: true positive rate (0% to 100%) versus false positive rate (0% to 100%).

Page 48: DIAGNOSTIC  TESTS

‘Classical’ estimation

Binormal model:
X ~ N(0, 1) in the nondiseased population
X ~ N(a, 1/b) in the diseased population

Then ROC(t) = Φ(a + b Φ⁻¹(t)) for 0 < t < 1, where Φ is the standard normal CDF.

Estimate a and b by maximum likelihood using readings from sets of diseased and nondiseased patients.
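A Python sketch of the binormal ROC function as reconstructed above (requires SciPy; the parameter values in the example call are arbitrary illustrations, and the maximum-likelihood fitting step is not shown):

from scipy.stats import norm   # standard normal cdf (Phi) and quantile (Phi^-1)

def binormal_roc(t, a, b):
    """True positive rate at false positive rate t under the binormal model."""
    return norm.cdf(a + b * norm.ppf(t))

print(round(binormal_roc(0.1, a=1.5, b=1.0), 3))   # illustrative parameter values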

Page 49: DIAGNOSTIC  TESTS

ROC curve estimation with continuous data

Many biochemical measurements are in fact continuous, e.g. blood glucose vs. diabetes

Can also do ROC analysis for continuous (rather than binary or ordinal) data

Estimate ROC curve (and smooth) based on empirical ‘survivor’ function (1 – cdf) in diseased and nondiseased groups

Can also do regression modeling of the test result

Another approach is to model the ROC curve directly as a function of covariates.

Page 50: DIAGNOSTIC  TESTS

The most commonly used global index of diagnostic accuracy is the area under the ROC curve (AUC).

Page 51: DIAGNOSTIC  TESTS

Area under ROC curve (AUC)

Overall measure of test performance.
Comparisons between two tests are based on differences between (estimated) AUCs.
For continuous data, the AUC is equivalent to the Mann-Whitney U-statistic (a nonparametric test of difference in location between two populations).
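This equivalence can be illustrated with a few lines of Python: the AUC equals the proportion of diseased/nondiseased pairs in which the diseased subject has the higher test value, counting ties as one half (the Mann-Whitney statistic divided by the number of pairs). Names and example data below are illustrative, not from the slides:

def auc_mann_whitney(diseased, disease_free):
    """AUC as P(X_diseased > X_nondiseased), with ties counted as 1/2."""
    pairs = [(x, y) for x in diseased for y in disease_free]
    wins = sum(x > y for x, y in pairs) + 0.5 * sum(x == y for x, y in pairs)
    return wins / len(pairs)

# Illustrative data; higher values taken to indicate disease.
print(auc_mann_whitney([3.1, 2.8, 4.0], [1.9, 2.8, 2.2]))   # ~ 0.944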

Page 52: DIAGNOSTIC  TESTS

AUC for ROC curves (figures): four ROC curves with AUC = 100%, 90%, 65%, and 50% (the 50% curve is the chance diagonal). Axes: true positive rate (0% to 100%) versus false positive rate (0% to 100%).

Page 53: DIAGNOSTIC  TESTS

Examples using ROC analysis

Threshold selection for ‘tuning’ an already trained classifier (e.g. neural nets)
Defining signal thresholds in DNA microarrays (Bilban et al.)
Comparing test statistics for identifying differentially expressed genes in replicated microarray data (Lönnstedt and Speed)
Assessing performance of different protein prediction algorithms (Tang et al.)
Inferring protein homology (Karwath and King)

Page 54: DIAGNOSTIC  TESTS

Homology Induction ROC

Page 55: DIAGNOSTIC  TESTS

Example: One of the parameters evaluated for the diagnosis of CHD is the HDL/Total Cholesterol ratio. Consider a population consisting of 67 patients with CHD and 93 patients without CHD. The HDL/Total Cholesterol values in these two groups are as follows.

CHD+ HDL/Total Cholesterol: 0.29, 0.26, 0.39, 0.16, ...
CHD- HDL/Total Cholesterol: 0.25, 0.36, 0.30, 0.20, ...

Page 56: DIAGNOSTIC  TESTS

To construct the ROC curve, we need the sensitivity and specificity at each cut-off point. There are two ways to obtain these characteristics:
• Cross tables
• Normal curve

Descriptive statistics for HDL/Total Cholesterol:

GROUP | Mean | SD | Min | Max
CHD- | 0.2926 | 0.066 | 0.16 | 0.52
CHD+ | 0.2301 | 0.048 | 0.06 | 0.34

Page 57: DIAGNOSTIC  TESTS

If the HDL/Total Cholesterol ratio is less than or equal to 0.26, we classify the patient as diseased.

RATIO | CHD- | CHD+ | Total
> 0.26 | 64 (68.8%) | 15 (22.4%) | 79 (49.4%)
<= 0.26 | 29 (31.2%) | 52 (77.6%) | 81 (50.6%)
Total | 93 (100.0%) | 67 (100.0%) | 160 (100.0%)

Sensitivity = 52/67 = 77.6%; specificity = 64/93 = 68.8%.
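The sensitivity and specificity in this cross table follow directly from the counts; a short Python cross-check (a ratio at or below 0.26 counts as test positive, as stated above):

tp, fp = 52, 29    # ratio <= 0.26: CHD+ and CHD-
fn, tn = 15, 64    # ratio >  0.26: CHD+ and CHD-

print(round(tp / (tp + fn), 3))   # sensitivity -> 0.776
print(round(tn / (tn + fp), 3))   # specificity -> 0.688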

Page 58: DIAGNOSTIC  TESTS

Best cutoff point

Cutoff | Sn | 1-Sp
0.000 | 0.000 | 0.000
0.093 | 0.015 | 0.000
0.129 | 0.030 | 0.000
0.142 | 0.045 | 0.000
0.156 | 0.060 | 0.000
0.158 | 0.075 | 0.000
0.162 | 0.075 | 0.011
0.168 | 0.104 | 0.011
0.171 | 0.119 | 0.011
0.173 | 0.119 | 0.022
0.175 | 0.119 | 0.032
... | ... | ...
0.26 | 0.78 | 0.31
... | ... | ...
0.393 | 1.000 | 0.935
0.402 | 1.000 | 0.946
0.407 | 1.000 | 0.957
0.420 | 1.000 | 0.968
0.446 | 1.000 | 0.978
0.493 | 1.000 | 0.989
1.000 | 1.000 | 1.000

Let cutoff = 0.171:

RATIO | CHD- | CHD+ | Total
> 0.171 | 92 (98.9%) | 59 (88%) | 151 (94%)
<= 0.171 | 1 (1.1%) | 8 (12%) | 9 (5.6%)
Total | 93 (100%) | 67 (100%) | 160 (100%)

Usually, the best cut-off point is where the ROC curve “turns the corner”.
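The slide chooses the cut-off informally (where the curve "turns the corner"); one common way to make this precise, added here as an illustration rather than as the slide's own method, is Youden's index J = Sn + Sp - 1, i.e. the point with the largest vertical distance above the chance diagonal:

def best_cutoff_youden(points):
    """points: (cutoff, sensitivity, 1 - specificity) tuples; maximize Sn - (1 - Sp)."""
    return max(points, key=lambda p: p[1] - p[2])

# A few rows from the table above:
rows = [(0.171, 0.119, 0.011), (0.26, 0.78, 0.31), (0.393, 1.000, 0.935)]
print(best_cutoff_youden(rows))   # -> (0.26, 0.78, 0.31) among these rows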

Page 59: DIAGNOSTIC  TESTS

ROC Curve (figure): sensitivity versus 1 - specificity, both on a scale from 0.0 to 1.0. At the cutoff of 0.26: TPR = 0.78, FPR = 0.31, TNR = 0.69, FNR = 0.22.

Page 60: DIAGNOSTIC  TESTS

Area Under the Curve

Test result variable(s): ORAN (the HDL/Total Cholesterol ratio)

Area | Std. Error (a) | Asymptotic Sig. (b) | Asymptotic 95% CI: Lower Bound | Upper Bound
0.778 | 0.036 | 0.000 | 0.708 | 0.849

a. Under the nonparametric assumption
b. Null hypothesis: true area = 0.5

The test result variable ORAN has at least one tie between the positive actual state group and the negative actual state group; statistics may be biased.
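The SPSS output above reports an asymptotic standard error and confidence interval for the AUC. As an added illustration (not part of the slides), a nonparametric bootstrap interval can be obtained by resampling subjects with replacement and recomputing the AUC each time; this sketch reuses the auc_mann_whitney function from the earlier AUC example:

import random

def bootstrap_auc_ci(diseased, disease_free, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        d = [rng.choice(diseased) for _ in diseased]          # resample D+ group
        n = [rng.choice(disease_free) for _ in disease_free]  # resample D- group
        aucs.append(auc_mann_whitney(d, n))
    aucs.sort()
    lo = aucs[int(n_boot * alpha / 2)]
    hi = aucs[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi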