statistics bootcamp 2013

Statistics Bootcamp 101 for HLABC Members

Penny Brasher, PhD

Vancouver, BC

June 14, 2013

© PMA Brasher (UBC), Biostats Bootcamp, 14.Jun.2013

Statistics are everywhere

Angus Reid Public Opinion surveyed 808 randomly selected B.C. residents from May 1 to 2. It claims a margin of error of +/-3.5


What is Statistics?


What is Biostatistics?

Biostatistics = statistics applied to biomedical problems

design and analysis of experiments

design and analysis of observational studies

measurement, data analysis (description, inference), statistical graphics

detective work

making decisions in the face of uncertainty (variability)

inference from a sample (specific) to a population (general)


Part I

Basic Statistical Concepts


Basic Concepts

Two broad categories of statistics:

Descriptive Statistics

Inferential Statistics


Basic Concepts


Descriptive statistics: using numerical summaries and figures to summarize or characterize a set of data.

mean, median, variance, range, etc.

histograms, scatterplots, boxplots, etc.

No assumptions are made.

⇒ If the data are a random sample from a certain population, the sample represents the population in miniature.


Part II

Types of Data


Types of Data

Categorical Data

Nominal variables assume values that fall into unordered categories. Nominal data may be binary (dichotomous) or polychotomous (polytomous). Examples: admission status (admitted, not admitted), survival status (alive, dead), race (caucasian, asian, black, ...).

Ordinal variables assume values that fall into ordered categories but differences between values are not meaningful. Examples: response to treatment (worse, same, improved), degrees of illness (none, mild, moderate, severe), Likert item (strongly disagree, disagree, neutral, agree, strongly agree).

Numerical (Metric, Quantitative) Data

Numerical discrete variables assume a countable number of values. There can be gaps in their possible values. Examples: number of comorbidities, number of falls in a year.

Numerical continuous variables assume, in theory, infinitely many values in a given range; there are no gaps in their possible values. Examples: age, weight, etc.


Grip strength of health librarians

Data collected:

Variable                    Example value   Type
Cohort                      HLABC
ID                          1
Year of birth               1952            numerical discrete
Height (cm)                 161.3           numerical continuous
Sex                         F               nominal
Grip position               2               ordinal
Dominant hand               R               nominal
Order                       RL              nominal
Grip strength, Right (kg)   31.8            numerical continuous
Scrunchy face (R)           1               nominal
Grip strength, Left (kg)    24.6            numerical continuous
Scrunchy face (L)           1               nominal


Descriptive Statistics: Types of Data

Nota bene

There is no such thing as "nonparametric data".

⇒ Parameters belong to models.


Part III



Grip strength of health librarians: Descriptive Statistics

How would you summarize the characteristics of this sample of librarians?

The characteristics we have collected include:

Year of birth
Height (cm)
Sex
Dominant hand
Grip strength


Descriptive Statistics: Data Summaries

For categorical variables: frequencies & percentages. [1]

For numerical continuous variables, typically, one wants to describe the central tendency (central location) of the data, and the degree to which the data are, or are not, spread out (dispersion).

Why are mean and standard deviation often used to describe continuous variables?

[1] Don’t report percentages if the sample size is small.


The Normal (Gaussian) Distribution

Normal distributions are completely determined by only two values – the mean, µ, and the standard deviation, σ.

[Figure: Gaussian (normal) distributions; curves labelled (0,1), (3,1) and (0,2); x-axis from −8 to 8.]

The mean, µ, determines the center.

The standard deviation, σ, determines the spread (variability).


The Normal (Gaussian) Distribution


[Figure: N(µ,σ) density; x-axis marked at µ−4σ, µ−2σ, µ, µ+2σ and µ+4σ; central 95% region indicated.]

95% of observations will lie in the interval (µ − 1.96σ, µ + 1.96σ).

∼70% of observations will lie in the interval (µ − σ, µ + σ).

50% of observations will lie in the interval (µ − 0.675σ, µ + 0.675σ).
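
The three coverage statements above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name normal_coverage is an illustrative choice, not something from the slides.

```python
import math


def normal_coverage(z: float) -> float:
    """Probability that a normal observation lies within z standard
    deviations of its mean, i.e. in (mu - z*sigma, mu + z*sigma)."""
    # For a standard normal Z, Pr(|Z| < z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))


for z in (0.675, 1.0, 1.96):
    print(f"within {z} sd of the mean: {normal_coverage(z):.3f}")
```

The result does not depend on µ or σ, which is one reason these two parameters are enough to describe a normal distribution. The middle value comes out slightly below 70% (about 68%).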


Describing Data: Data Summaries

For continuous variables that are approximately normally distributed the sample distribution may be summarized with the sample mean, x̄, and the sample standard deviation, sd.

For continuous variables with skewed distributions other summary statistics should be used. If the distribution is unimodal the median and P25 & P75 (Q1 & Q3) or the median and P10 & P90 could be used.

– Altman DG, Bland JM. Quartiles, quintiles, centiles, and other quantiles. BMJ 1994;309:996.
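
As an illustration of the two kinds of summary, here is a small sketch using Python's standard statistics module. The data are hypothetical (a made-up right-skewed sample), chosen only to show how the mean and sd can mislead when the distribution is skewed.

```python
import statistics

# Hypothetical right-skewed sample (e.g. durations in minutes); not from the slides.
durations = [3, 4, 5, 5, 6, 7, 8, 9, 12, 15, 21, 48, 95]

sample_mean = statistics.mean(durations)
sample_sd = statistics.stdev(durations)      # sample sd (n - 1 denominator)
sample_median = statistics.median(durations)
# quantiles(n=4) returns the three cut points P25, P50, P75.
q1, _, q3 = statistics.quantiles(durations, n=4)

print(f"mean (sd):        {sample_mean:.1f} ({sample_sd:.1f})")
print(f"median (Q1, Q3):  {sample_median} ({q1}, {q3})")
```

Here the two large values pull the mean well above the median, and the sd is larger than the mean even though no duration can be negative. Note that software packages use slightly different definitions of quantiles, so Q1 and Q3 may differ a little between programs.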



Part of Table 1 from a randomized trial in patients undergoing CABG.

Table 1. Anthropometric, baseline and procedural characteristics (intent-to-treat and safety population)

Characteristic                                        Clevidipine     Nitroglycerin
                                                      N=49            N=51
Age, years; mean (SD)                                 65.8 (11.3)     63.2 (12.3)
Sex
  Male, n (%)                                         40 (81.6)       43 (84.3)
  Female, n (%)                                       9 (18.4)        8 (15.7)
Weight, kg; mean (SD)                                 79.7 (15.9)     82.1 (18.5)
Height, cm; mean (SD)                                 170.4 (9.0)     170.5 (12.4)
ASA Physical Status*, n (%)
  I                                                   0 (0.0)         0 (0.0)
  II                                                  0 (0.0)         1 (2.0)
  III                                                 29 (59.2)       33 (64.7)
  IV                                                  19 (38.8)       16 (31.4)
  V                                                   1 (2.0)         0 (0.0)
Body Mass Index, kg/m2; mean (SD)                     27.4 (5.1)      28.2 (5.2)
Index Procedure, n (%)
  CABG                                                43 (87.8)       45 (88.2)
  CABG plus valve surgery                             6 (12.2)        6 (11.8)
Target MAP, pre-CPB, mmHg; mean (SD)                  76.1 (7.0)      76.4 (7.9)
Target MAP, aortic cannulation, mmHg; mean (SD);      64.6 (11.9)     63.6 (10.4)
  CLV n=49, NTG n=49
Duration of bypass, min; mean (SD);                   102.5 (37.1)    99.2 (35.8)
  CLV n=47, NTG n=51
Duration of aortic cannulation (min) mean (SD);       18.9 (40.3)     13.3 (26.1)
  CLV n=35, NTG n=38
IABP used, mean (SD)                                  2 (4.1)         0 (0.0)
Number of grafts, mean (SD)                           3.1 (0.8)       3.0 (1.0)

Abbreviations: kg = kilograms, cm = centimeters. ASA = American Society of Anesthesiologists. IABP = intra-aortic balloon pump. SD = standard deviation. CLV = clevidipine. NTG = nitroglycerin.

*ASA physical status unknown for 1 NTG-treated patient.

What changes would you make to this table?




⇒ For skewed (asymmetric) data use percentiles.

⇒ For nominal and ordinal variables and for numerical discrete variables with a limited range use a table of frequencies.


Describing Data: Data Summaries

Sometimes you don’t need to summarize data:

Times to circulatory collapse (s) were 10, 35, 42, 42, 43, 70 and 5, 46, 50, 50, 54, 64 in the IG and C groups, respectively.


Part V



Basic Concepts


Inferential statistics: making inferences about a population from a sample.

estimation and hypothesis testing.

Some assumptions are made.


Quantifying the role of chance

A very simple example.

We wish to know if a coin is "fair". By "fair" we mean that the probability of getting a head on any flip is 1/2.

To determine if the coin is "fair" we could take it to the laboratory and:

determine the weight distribution throughout the coin,

determine the aerodynamics of the coin,

etc.

In this way we would discover the "truth".

OR

We could conduct an experiment, compute some statistics and try to get close to the truth.

⇒ Statistical inference.


Significance Testing: Quantifying the role of chance

Returning to our very simple example.

We decide to flip the coin 15 times.

We observe 4 heads in 15 flips.


Significance Testing: Quantifying the role of chance

    N   Observed   Expected   Assumed p   Observed p
    ---------------------------------------------------
    15      4         7.5      0.50000     0.26667

    Pr(k <= 4)            = 0.059 (one-sided test)
    Pr(k <= 4 or k >= 11) = 0.118 (two-sided test)

What does this mean?


Significance Testing

Sir Ronald A. Fisher

"In general, tests of significance are based on hypothetical probabilities calculated from their null hypotheses. They do not generally lead to any probability statements about the real world, but to a rational and well-defined measure of reluctance to the acceptance of the hypotheses they test".

– Fisher RA. Statistical Methods and Scientific Inference (1956)


Significance Testing: A very simple example

Study design: flip coin 15 times.

Test statistic: number of heads.

Evidence against: too many or too few heads.

Probability model: Binomial(n = 15, π = 0.5)

Theoretical distribution of the number of heads if the coin is fair.

[Figure: "If the coin was fair"; x-axis: number of heads (0 to 15); y-axis: probability.]

Fisher would ask us to consider if 4 heads (plus more extreme results) is unlikely under the null hypothesis, i.e. fair coin.
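
The calculation behind the one-sided and two-sided probabilities for 4 heads in 15 flips can be sketched in a few lines of Python. This is an illustrative sketch of the exact binomial calculation under the fair-coin model, not the software used to produce the slides; the helper name binom_pmf is an assumption.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n flips when Pr(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p0, observed = 15, 0.5, 4

# One-sided: 4 heads or fewer under the null hypothesis (fair coin).
one_sided = sum(binom_pmf(k, n, p0) for k in range(0, observed + 1))

# Two-sided: with a fair coin the distribution is symmetric, so
# 11 or more heads is just as extreme as 4 or fewer.
two_sided = one_sided + sum(binom_pmf(k, n, p0) for k in range(n - observed, n + 1))

print(f"Pr(k <= 4)            = {one_sided:.3f}")
print(f"Pr(k <= 4 or k >= 11) = {two_sided:.3f}")
```

Both probabilities are computed assuming the coin is fair; neither is the probability that the coin is fair.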


Significance Testing: The P-value

Pr(k <= 4 or k >= 11 | if coin is fair) = 0.118

The more interesting question is . . .

What is the probability that the coin is fair? i.e. What is the probability that the null hypothesis is true?

I have no idea.


Quantifying the role of chance: The P-value

In significance testing, the P-value is the probability of obtaining a result (i.e. test statistic) at least as extreme as the one that was actually observed, when the null hypothesis is true.

It is Pr(data | H0 is true), not Pr(H0 is true | data).

An "unlikely" event suggests that H0 is unlikely but the P-value provides no measure of just how unlikely H0 is.

Akin to proof by contradiction.

We have a model and we examine the extent to which the data contradict the model.

The basis for suggesting a contradiction is observing data that are highly improbable under the model.

⇒ In health research involving human subjects, P-values are next to useless.

And yet they’re everywhere.


Inferential Statistics: P-values vs Confidence Intervals

In a randomized trial comparing two treatments the following mortality results were reported by the authors.

              number          percent
            Std    Exp      Std     Exp
died         19     12     31.7    20.7
survived     41     46     68.3    79.3
total        60     58

P = 0.21, Fisher’s exact test.

The authors concluded "there is no difference in mortality".

What do you think?
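
For readers who want to see where a Fisher's exact P-value comes from, here is a minimal sketch for the 2x2 mortality table above. It uses the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one; whether the trial authors used exactly this convention is an assumption. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: died, survived. Columns: standard, experimental.
p_value = fisher_exact_two_sided(19, 12, 41, 46)
print(f"Fisher's exact test, two-sided P = {p_value:.2f}")
```

The result should be close to the reported P = 0.21. Note what the calculation does not use: any notion of how large a difference in mortality would matter clinically.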


Inferential Statistics: Confidence Intervals

A P-value tells you nothing about the size of the treatment effect.

The estimate of the true treatment effect is:

31.7% - 20.7% = 11.0%, 95% CI: -4.9% to 26.1%, 80% CI: 0.6% to 21.0%.
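
A rough version of these intervals can be sketched with the simple normal-approximation (Wald) formula for a difference of two proportions. This is an assumption for illustration: the limits it gives are close to, but not identical to, those quoted above, which suggests the slide used a different (likely more accurate) interval method. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(d1: int, n1: int, d2: int, n2: int, level: float = 0.95):
    """Wald (normal-approximation) CI for the difference p1 - p2."""
    p1, p2 = d1 / n1, d2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    return diff, diff - z * se, diff + z * se


# Deaths / total: standard 19/60, experimental 12/58.
for level in (0.95, 0.80):
    diff, lower, upper = risk_difference_ci(19, 60, 12, 58, level)
    print(f"{level:.0%} CI: {diff:.1%} ({lower:.1%} to {upper:.1%})")
```

With either method the message is the same: the 95% interval includes zero but also includes differences in mortality of more than 20 percentage points, while the 80% interval lies entirely above zero.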

What does the confidence interval represent?

statistical definition: If the study were to be repeated 1000 times and a 95% CI was constructed each time, we would expect 950 of those intervals to include the population parameter. A reported confidence interval from a particular study may or may not include the actual population value.

working definition: Values of the population parameter that are consistent with the sample data.

⇒ The confidence interval gives a plausible range of values for the unknown population parameter.
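
The statistical definition can be illustrated by simulation. The sketch below repeats a hypothetical study 1000 times, builds a 95% CI for the mean each time, and counts how many intervals include the true value. All numbers (true mean 50, sd 10, n = 30, the random seed) are made up for illustration, and the population sd is treated as known to keep the code short.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage(n_studies=1000, n=30, mu=50.0, sigma=10.0, level=0.95, seed=1):
    """Fraction of simulated studies whose CI for the mean includes mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)  # sigma treated as known, for simplicity
    hits = 0
    for _ in range(n_studies):
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / n_studies


print(f"proportion of intervals that include the true mean: {coverage():.3f}")
```

The proportion should be close to 0.95, but any single interval either includes the true mean or it does not, and in a real study we cannot tell which.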

Would you want to receive standard treatment?


Angus Reid Public Opinion surveyed 808 randomly selected B.C. residents from May 1 to 2. It claims a margin of error of +/-3.5

For a proportion the maximum variance is when p = 0.50.

. cii 808 404

                                    -- Binomial Exact --
    Obs        Mean    Std. Err.    [95% Conf. Interval]
    ----------------------------------------------------
    808         0.5     0.01759      0.46496     .53504

2*.01759 = 0.03518

Angus Reid is providing the half-width of the largest possible 95% confidence interval.
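
The arithmetic above can be reproduced as follows. This is a small sketch, assuming the usual rule of thumb of two standard errors for a 95% margin of error (as in the 2*.01759 calculation); the function name is illustrative.

```python
from math import sqrt


def max_margin_of_error(n: int, multiplier: float = 2.0) -> float:
    """Largest possible margin of error for a proportion from n respondents."""
    p = 0.5  # the variance p*(1-p) of a proportion is largest at p = 0.5
    standard_error = sqrt(p * (1 - p) / n)
    return multiplier * standard_error


moe = max_margin_of_error(808)
print(f"margin of error: +/- {100 * moe:.1f} percentage points")
```

For any observed proportion other than 50% the actual margin of error is somewhat smaller than this quoted maximum.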


Confidence Intervals vs P-values: Interpreting Results

Overemphasis on hypothesis testing – and the use of P-values to dichotomize significant or non-significant results – has detracted from more useful approaches to interpreting study results, such as estimation and confidence intervals. In medical studies investigators should usually be interested in determining the size of difference of a measured outcome between groups, rather than a simple indication of whether or not it is statistically significant.

– Gardner MJ, Altman DG. Statistics with Confidence

"'The 0.05 syndrome', a severe, debilitating statistical illness."

– Palmer CR. 2002



P-values Confidence intervals


Part VI

The other big problem - useless graphics.


Graphical Displays

[Figure: pie chart with four slices: Common pitfalls in statistics, Evaluating research articles, Other, What to look for in a clinical trial.]

Red larger than yellow or yellow larger than red?


Graphical Displays

[Figure: frequency (x-axis, 0 to 10) for each category: What to look for in a clinical trial, Other, Common pitfalls in statistics, Evaluating research articles.]


Graphical Displays

"Any data that can be encoded by one of the pop charts [pie charts, divided bar charts, area charts] can also be encoded by either a dot plot or a multiway dot plot that typically provides far more efficient pattern perception and table look-up than the pop-chart encoding."

– WS Cleveland, The Elements of Graphing Data (rev. Ed) 1994.


Graphical Displays

[Figure: dot plot of the 2011 Operating Expenditure Budget ($1.03B); x-axis: Percent of Total Budget (0 to 25); categories: Civic Theatres, Contingency & Transfers, Civic Grants, General Government, Library, Community Services, Engineering, Capital Program & Debt, Fire, Support Services, Parks & Recreation, Police, Utilities.]


Graphical Displays: Dynamite plots

"Dynamite pushers" "Skyscrapers with TV-aerials" "Pinhead plots"

[Figure: bar chart with error bars; y-axis: post treatment score, mean + sd (0 to 100); groups: sham, acupuncture.]

Low information-to-ink ratio. Inaccurate "look-up".


Graphical Displays: Dynamite plots

[Figure, two panels. Left: post treatment score, mean + sd (0 to 100), for sham and acupuncture. Right: Post treatment score (0 to 100) for sham and acupuncture.]


Graphical Displays: Tufte’s list of nine "shoulds"

The Visual Display of Quantitative Information, 1983

Graphical displays should:

show the data,

induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else,

avoid distorting what the data have to say,

present many numbers in a small space,

make a large data set coherent,

encourage the eye to compare different pieces of data,

reveal the data at several levels of detail, from a broad overview to the fine structure,

serve a reasonably clear purpose: description, exploration, tabulation, or decoration, and

be closely integrated with the statistical and verbal descriptions of a data set.

