statistics bootcamp 2013


Upload: dean-giustini

Post on 07-May-2015


Statistics Bootcamp 101 for HLABC Members

Penny Brasher, PhD

Vancouver, BC

June 14, 2013

c©PMA Brasher (UBC) Biostats Bootcamp 14.Jun.2013 2 / 57

Statistics are everywhere

Angus Reid Public Opinion surveyed 808 randomly selected B.C. residents from May 1 to 2. It claims a margin of error of +/-3.5


What is Statistics?


What is Biostatistics?

Biostatistics = statistics applied to biomedical problems

design and analysis of experiments

design and analysis of observational studies

measurement, data analysis (description, inference), statistical graphics

detective work

making decisions in the face of uncertainty (variability)

inference from a sample (specific) to a population (general)


Part I

Basic Statistical Concepts


Basic Concepts

Two broad categories of statistics:

Descriptive Statistics

Inferential Statistics


Basic Concepts

Descriptive Statistics

using numerical summaries and figures to summarize or characterize a set of data.

mean, median, variance, range, etc.

histograms, scatterplots, boxplots, etc.

No assumptions are made.

⇒ If the data are a random sample from a certain population, the sample represents the population in miniature.


Part II

Types of Data


Types of Data

Categorical Data

Nominal variables assume values that fall into unordered categories. Nominal data may be binary (dichotomous) or polychotomous (polytomous). Examples: admission status (admitted, not admitted), survival status (alive, dead), race (caucasian, asian, black, ...).

Ordinal variables assume values that fall into ordered categories, but differences between values are not meaningful. Examples: response to treatment (worse, same, improved), degrees of illness (none, mild, moderate, severe), Likert item (strongly disagree, disagree, neutral, agree, strongly agree).

Numerical (Metric, Quantitative) Data

Numerical discrete variables assume a countable number of values. There can be gaps in their possible values. Examples: number of comorbidities, number of falls in a year.

Numerical continuous variables can assume, in theory, infinitely many values in a given range; there are no gaps in their possible values. Examples: age, weight, etc.


Grip strength of health librarians

Data collected:

Variable                     Value    Type
Cohort                       HLABC
ID                           1
Year of birth                1952     numerical discrete
Height (cm)                  161.3    numerical continuous
Sex                          F        nominal
Grip position                2        ordinal
Dominant hand                R        nominal
Order                        RL       nominal
Grip strength, Right (kg)    31.8     numerical continuous
Scrunchy face (R)            1        nominal
Grip strength, Left (kg)     24.6     numerical continuous
Scrunchy face (L)            1        nominal


Descriptive Statistics: Types of Data

Nota bene

There is no such thing as "nonparametric data".

⇒ Parameters belong to models.


Part III

Descriptive Statistics


Grip strength of health librarians: Descriptive Statistics

How would you summarize the characteristics of this sample of librarians?

The characteristics we have collected include:

Year of birth

Height (cm)

Sex

Dominant hand

Grip strength


Descriptive Statistics: Data Summaries

For categorical variables - frequencies & percentages. [1]

For numerical continuous variables, typically, one wants to describe the central tendency (central location) of the data, and the degree to which the data are, or are not, spread out (dispersion).

Why are mean and standard deviation often used to describe continuous variables?

[1] Don’t report percentages if the sample size is small.


The Normal (Gaussian) Distribution

Normal distributions are completely determined by only two values – the mean, µ, and the standard deviation, σ.

[Figure: Gaussian (normal) distributions; three curves labelled (0,1), (3,1) and (0,2); x-axis from −8 to 8]

The mean, µ, determines the center.

The standard deviation, σ, determines the spread (variability).


[Figure: N(µ,σ) density, x-axis from µ−4σ to µ+4σ, central 95% marked]

95% of observations will lie in the interval (µ − 1.96σ, µ + 1.96σ).

∼68% of observations will lie in the interval (µ − σ, µ + σ).

50% of observations will lie in the interval (µ − 0.675σ, µ + 0.675σ).
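The coverage figures for a normal distribution can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `coverage` is illustrative and not part of the original slides.

```python
from math import erf, sqrt


def coverage(k: float) -> float:
    """Probability that a normally distributed observation lies
    within k standard deviations of the mean (any mu, any sigma)."""
    return erf(k / sqrt(2))


# The three multipliers quoted on the slide.
for k in (0.675, 1.0, 1.96):
    print(f"within mu +/- {k} sigma: {coverage(k):.3f}")
```

Because the result depends only on the multiplier k, the same percentages hold for every normal distribution, whatever its mean and standard deviation.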


Describing Data: Data Summaries

For continuous variables that are approximately normally distributed the sample distribution may be summarized with the sample mean, x̄, and the sample standard deviation, sd.

For continuous variables with skewed distributions other summary statistics should be used. If the distribution is unimodal the median and P25 & P75 (Q1 & Q3) or the median and P10 & P90 could be used.

– Altman DG, Bland JM. Quartiles, quintiles, centiles, and other quantiles. BMJ 1994;309:996.


Descriptive Statistics

Part of Table 1 from a randomized trial in patients undergoing CABG.

Table 1. Anthropometric, baseline and procedural characteristics (intent-to-treat and safety population)

Characteristic                                                        Clevidipine (N=49)    Nitroglycerin (N=51)
Age, years; mean (SD)                                                 65.8 (11.3)           63.2 (12.3)
Sex
  Male, n (%)                                                         40 (81.6)             43 (84.3)
  Female, n (%)                                                       9 (18.4)              8 (15.7)
Weight, kg; mean (SD)                                                 79.7 (15.9)           82.1 (18.5)
Height, cm; mean (SD)                                                 170.4 (9.0)           170.5 (12.4)
ASA Physical Status*, n (%)
  I                                                                   0 (0.0)               0 (0.0)
  II                                                                  0 (0.0)               1 (2.0)
  III                                                                 29 (59.2)             33 (64.7)
  IV                                                                  19 (38.8)             16 (31.4)
  V                                                                   1 (2.0)               0 (0.0)
Body Mass Index, kg/m2; mean (SD)                                     27.4 (5.1)            28.2 (5.2)
Index Procedure, n (%)
  CABG                                                                43 (87.8)             45 (88.2)
  CABG plus valve surgery                                             6 (12.2)              6 (11.8)
Target MAP, pre-CPB, mmHg; mean (SD)                                  76.1 (7.0)            76.4 (7.9)
Target MAP, aortic cannulation, mmHg; mean (SD); CLV n=49, NTG n=49   64.6 (11.9)           63.6 (10.4)
Duration of bypass, min; mean (SD); CLV n=47, NTG n=51                102.5 (37.1)          99.2 (35.8)
Duration of aortic cannulation (min) mean (SD); CLV n=35, NTG n=38    18.9 (40.3)           13.3 (26.1)
IABP used, mean (SD)                                                  2 (4.1)               0 (0.0)
Number of grafts, mean (SD)                                           3.1 (0.8)             3.0 (1.0)

Abbreviations: kg = kilograms, cm = centimeters. ASA = American Society of Anesthesiologists. IABP = intra-aortic balloon pump. SD = standard deviation. CLV = clevidipine. NTG = nitroglycerin.

*ASA physical status unknown for 1 NTG-treated patient.

What changes would you make to this table?


⇒ For skewed (asymmetric) data use percentiles.

⇒ For nominal and ordinal variables and for numerical discrete variables with a limited range use a table of frequencies.


Describing data: Data Summaries

Sometimes you don’t need to summarize data:

Times to circulatory collapse (s) were 10, 35, 42, 42, 43, 70 and 5, 46, 50, 50, 54, 64 in the IG and C groups, respectively.


Part V

Inferential Statistics


Basic Concepts

Inferential Statistics

making inferences about a population from a sample.

estimation and hypothesis testing.

Some assumptions are made.


Quantifying the role of chance

A very simple example.

We wish to know if a coin is "fair". By "fair" we mean that the probability of getting a head on any flip is 1/2.

To determine if the coin is "fair" we could take it to the laboratory and:

determine the weight distribution throughout the coin,

determine the aerodynamics of the coin,

etc.

In this way we would discover the "truth".

OR

We could conduct an experiment, compute some statistics and try to get close to the truth.

⇒ Statistical inference.


Significance Testing: Quantifying the role of chance

Returning to our very simple example.

We decide to flip the coin 15 times.

We observe 4 heads in 15 flips.


Significance Testing: Quantifying the role of chance

N     Observed   Expected   Assumed p   Observed p
---------------------------------------------------
15    4          7.5        0.50000     0.26667

Pr(k <= 4) = 0.059 (one-sided test)

Pr(k <= 4 or k >= 11) = 0.118 (two-sided test)
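The two probabilities above are tail sums of the Binomial(n = 15, π = 0.5) distribution. A minimal sketch of that arithmetic, using only the Python standard library (the helper name `binom_pmf` is illustrative and not from the slides):

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n flips when Pr(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, observed, p_null = 15, 4, 0.5

# One-sided: 4 heads or fewer, assuming the coin is fair.
one_sided = sum(binom_pmf(k, n, p_null) for k in range(0, observed + 1))

# Two-sided: add the equally extreme results in the other tail (11 or more heads).
two_sided = one_sided + sum(
    binom_pmf(k, n, p_null) for k in range(n - observed, n + 1)
)

print(f"expected heads   = {n * p_null}")
print(f"observed p       = {observed / n:.5f}")
print(f"Pr(k <= 4)            = {one_sided:.3f}")
print(f"Pr(k <= 4 or k >= 11) = {two_sided:.3f}")
```

With a fair coin the distribution is symmetric, so the two-sided probability is simply twice the one-sided one.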

What does this mean?


Significance Testing

Sir Ronald A. Fisher

"In general, tests of significance are based on hypothetical probabilities calculated from their null hypotheses. They do not generally lead to any probability statements about the real world, but to a rational and well-defined measure of reluctance to the acceptance of the hypotheses they test."

– Fisher RA. Statistical Methods and Scientific Inference (1956)


Significance Testing: A very simple example

Study design: flip coin 15 times.

Test statistic: number of heads.

Evidence against: too many or too few heads.

Probability model: Binomial(n = 15, π = 0.5)

[Figure: theoretical distribution of the number of heads if the coin is fair; x-axis: number of heads (0 to 15), y-axis: probability (0.00 to 0.15)]

Fisher would ask us to consider if 4 heads (plus more extreme results) is unlikely under the null hypothesis, i.e. fair coin.


Significance Testing: The P-value

Pr(k <= 4 or k >= 11 | if coin is fair) = 0.118

The more interesting question is . . .

What is the probability that the coin is fair? i.e. What is the probability that the null hypothesis is true?

I have no idea.


Quantifying the role of chance: The P-value

In significance testing, the P-value is the probability of obtaining a result (i.e. test statistic) at least as extreme as the one that was actually observed, when the null hypothesis is true.

It is Pr(data | H0 is true).

An "unlikely" event suggests that H0 is unlikely but the P-value provides no measure of just how unlikely H0 is.

Akin to proof by contradiction.

We have a model and we examine the extent to which the data contradict the model.

The basis for suggesting a contradiction is observing data that are highly improbable under the model.

⇒ In health research involving human subjects, P-values are next to useless.

And yet they’re everywhere.


Inferential Statistics: P-values vs Confidence Intervals

In a randomized trial comparing two treatments the following mortality results were reported by the authors.

            number        percent
            Std    Exp    Std     Exp
died        19     12     31.7    20.7
survived    41     46     68.3    79.3
total       60     58

P = 0.21, Fisher’s exact test.

The authors concluded "there is no difference in mortality".

What do you think?
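For readers who want to see where a Fisher's exact P-value comes from, here is a minimal sketch for the 2 × 2 mortality table above. It assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one; the slides do not say which convention or software the trial authors used, and the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every
    table (with the same margins) that is no more likely than the observed one.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(x: int) -> Fraction:
        # Probability that the top-left cell equals x, given fixed margins.
        return Fraction(comb(col1, x) * comb(total - col1, row1 - x), denom)

    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    return float(
        sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed)
    )


# Rows: died, survived. Columns: Std, Exp.
p_value = fisher_exact_two_sided(19, 12, 41, 46)
print(f"Fisher's exact P = {p_value:.2f}")
```

Exact fractions are used so that the comparison with the observed table's probability is not affected by floating-point rounding.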


Inferential Statistics: Confidence Intervals

A P-value tells you nothing about the size of the treatment effect.

The estimate of the true treatment effect is:

31.7% - 20.7% = 11.0%, 95% CI: -4.9% to 26.1%, 80% CI: 0.6% to 21.0%.
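The slides do not state how these intervals were computed. As a sketch only, and assuming Newcombe's hybrid score method for a difference of two independent proportions (an assumption, not something the source confirms), the calculation below gives limits close to those quoted. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x: int, n: int, z: float) -> tuple[float, float]:
    """Wilson score interval for a single proportion x/n."""
    p = x / n
    centre = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    return centre - half, centre + half


def newcombe_diff_ci(x1: int, n1: int, x2: int, n2: int, level: float = 0.95):
    """Difference p1 - p2 with a hybrid score (Newcombe) interval.

    Assumed method; the source does not say which one was used.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


# Deaths: 19 of 60 on standard treatment, 12 of 58 on experimental treatment.
for level in (0.95, 0.80):
    diff, lower, upper = newcombe_diff_ci(19, 60, 12, 58, level)
    print(f"{level:.0%} CI: {diff:.1%} ({lower:.1%} to {upper:.1%})")
```

Other methods, such as the simple Wald interval, give slightly different limits for the same data, so small discrepancies between software packages are to be expected.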

What does the confidence interval represent?

statistical definition: If the study were to be repeated 1000 times and a 95% CI was constructed each time, we would expect about 950 of those intervals to include the population parameter. A reported confidence interval from a particular study may or may not include the actual population value.

working definition: Values of the population parameter that are consistent with the sample data.

⇒ The confidence interval gives a plausible range of values for the unknown population parameter.

Would you want to receive standard treatment?


Angus Reid Public Opinion surveyed 808 randomly selected B.C. residents from May 1 to 2. It claims a margin of error of +/-3.5

For a proportion the maximum variance is when p = 0.50.

. cii 808 404

                                    -- Binomial Exact --
Obs    Mean    Std. Err.    [95% Conf. Interval]
----------------------------------------------------
808    0.5     0.01759      0.46496    0.53504

2 × 0.01759 = 0.03518

Angus Reid is reporting the half-width of the widest possible 95% confidence interval, i.e. the margin of error when p = 0.50.
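The poll arithmetic can be reproduced without Stata. This minimal sketch, using only the Python standard library, computes the standard error of a proportion for n = 808 and shows that it is largest at p = 0.50; the function name `standard_error` is illustrative.

```python
from math import sqrt


def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion p based on n observations."""
    return sqrt(p * (1 - p) / n)


n = 808

# The standard error, and hence the margin of error, is largest at p = 0.50.
grid = [i / 100 for i in range(1, 100)]
worst_p = max(grid, key=lambda p: standard_error(p, n))

se = standard_error(0.5, n)
print(f"p with the largest standard error: {worst_p}")
print(f"standard error at p = 0.50: {se:.5f}")
print(f"2 x SE    = {2 * se:.5f}")     # the rule of thumb used on the slide
print(f"1.96 x SE = {1.96 * se:.5f}")  # the usual normal-approximation multiplier
```

Either multiplier gives a margin of error of roughly 3.5 percentage points, which matches the figure quoted by the pollster.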


Confidence Intervals vs P-values: Interpreting Results

Overemphasis on hypothesis testing (and the use of P-values to dichotomize significant or non-significant results) has detracted from more useful approaches to interpreting study results, such as estimation and confidence intervals. In medical studies investigators should usually be interested in determining the size of difference of a measured outcome between groups, rather than a simple indication of whether or not it is statistically significant.

Gardner MJ, Altman DG. Statistics with Confidence

"'The 0.05 syndrome', a severe, debilitating statistical illness."

– Palmer CR. 2002


Inferential Statistics

P-values Confidence intervals


Part VI

The other big problem - useless graphics.


Graphical Displays

[Figure: chart with four categories: Common pitfalls in statistics, Evaluating research articles, Other, What to look for in a clinical trial]

Red larger than yellow or yellow larger than red?


Graphical Displays

[Figure: frequency (axis 0 to 10) by category: What to look for in a clinical trial, Other, Common pitfalls in statistics, Evaluating research articles]


Graphical Displays

"Any data that can be encoded by one of the pop charts [pie charts, divided bar charts, area charts] can also be encoded by either a dot plot or a multiway dot plot that typically provides far more efficient pattern perception and table look-up than the pop-chart encoding."

– WS Cleveland, The Elements of Graphing Data (rev. Ed) 1994.


Graphical Displays

[Figure: 2011 Operating Expenditure Budget ($1.03B). Percent of total budget (axis 0 to 25) by category, listed as: Civic Theatres, Contingency & Transfers, Civic Grants, General Government, Library, Community Services, Engineering, Capital Program & Debt, Fire, Support Services, Parks & Recreation, Police, Utilities]


Graphical Displays: Dynamite plots

"Dynamite pushers", "Skyscrapers with TV-aerials", "Pinhead plots"

[Figure: bar chart of post-treatment score, mean + sd (axis 0 to 100), for the sham and acupuncture groups]

Low information-to-ink ratio. Inaccurate "look-up".


[Figure: two panels for the sham and acupuncture groups, each with axis 0 to 100: post-treatment score, mean + sd (first panel) and post-treatment score (second panel)]


Graphical Displays: Tufte’s list of nine "shoulds" (The Visual Display of Quantitative Information, 1983)

Graphical displays should:

show the data,

induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else,

avoid distorting what the data have to say,

present many numbers in a small space,

make a large data set coherent,

encourage the eye to compare different pieces of data,

reveal the data at several levels of detail, from a broad overview to the fine structure,

serve a reasonably clear purpose: description, exploration, tabulation, or decoration, and

be closely integrated with the statistical and verbal descriptions of a data set.
