Biostatistics in Practice Session 2: Quantitative and Inferential Issues II Youngju Pak Biostatistician http://research.LABioMed.org/ Biostat 1


Post on 04-Jan-2016



Biostatistics in Practice

Session 2: Quantitative and Inferential Issues II

Youngju Pak, Biostatistician

http://research.LABioMed.org/Biostat


What have we learned in Session 1?

• Basic Study Design
• Parameters vs. Statistics
• Inferential vs. Descriptive statistics
• Categorical vs. Quantitative Data: why is this important?
• Summarizing the data with graphs: Contingency Tables, Box Plots, Histograms, etc.
• How to run MYSTAT


Today’s topics

Article: McCann, et al., Lancet 2007 Nov 3;370(9598):1560-7

• Subject selection / Randomization
• Efficiency from study design
• What statistics were used?
• Experimental Units / Independence of Measurements

Normal Distributions

Confidence Intervals & P-values


McCann, et al., Lancet 2007 Nov 3;370(9598):1560-7

Food additives and hyperactive behaviour in 3-year-old and 8/9-year-old children in the community: a randomised, double-blinded, placebo-controlled trial.

• Target population: children aged 3-4 and 8-9 years
• Study design: randomized, double-blinded, controlled, crossover trial
• Sample size: 153 (3 years), 144 (8-9 years) in Southampton, UK
• Objective: test whether intake of artificial food colors and additives (AFCA) affects childhood behavior


McCann, et al., Lancet 2007 Nov 3;370(9598):1560-7

• Sampling: stratified sampling based on SES in Southampton, UK
• Baseline measure: 24h recall by the parent of the child’s pretrial diet
• Groups: three groups, for 3-year-olds
  – mix A: 20 mg of food colorings + 45 mg sodium benzoate, which is a widely used food preservative
  – mix B: 30 mg of food colorings + 45 mg sodium benzoate (current average daily consumption)
  – Placebo
  – For 8/9-year-olds: multiply these by 1.25

Cross-over Design

Each participant receives one of 6 possible random sequences of the three drinks. In a separate study with N=20, no significant difference in the look and taste of the drinks was found among the three groups, even though the percentages reported when people were asked which diet type they had received were ordered placebo (65%) > mix B (52%) > mix A (40%).

[Timeline figure: T0 (baseline), then Week 1 through Week 6; labels: Typical Diet, Randomize (three times), Washout (twice)]
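
The "6 possible sequences" are simply the 3! orderings of mix A, mix B and placebo. A minimal Python sketch of that idea follows. The helper `assign_sequences`, the seed and the child IDs are hypothetical, and simple unrestricted randomization is an assumption; the slide does not say how the trial allocated sequences.

```python
import itertools
import random

# The three challenge drinks named on the slide.
DRINKS = ("mix A", "mix B", "placebo")

# Every ordering of three drinks: 3! = 6 possible sequences.
SEQUENCES = list(itertools.permutations(DRINKS))


def assign_sequences(child_ids, seed=0):
    """Give each child one of the 6 sequences, chosen at random.

    Hypothetical helper: simple (unrestricted) randomization, which may
    differ from the allocation scheme the trial actually used.
    """
    rng = random.Random(seed)
    return {child: rng.choice(SEQUENCES) for child in child_ids}


assignment = assign_sequences(range(1, 7))
for child, order in assignment.items():
    print(child, " -> ".join(order))
```

Because every child receives all three drinks, each child serves as his or her own control; only the order differs between children.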


McCann, et al., Lancet 2007 Nov 3;370(9598):1560-7

Outcome: Global Hyperactivity Aggregate (GHA) score

• Attention-Deficit Hyperactivity Disorder (ADHD) rating scale IV by teachers, scaled 1 – 5; a higher number means more hyperactive
• Weiss-Werry-Peters (WWP) hyperactivity scale by parents
• Classroom observation code
• Conners continuous performance test II (CPTII)

GHA is aggregated from these four scores.


Why standardized outcome measure?

GHA = Global Hyperactivity Aggregate, where a higher value ↔ more hyperactive

For each child at each time:
Z1 = Z-Score for ADHD from Teachers
Z2 = Z-Score for WWP from Parents
Z3 = Z-Score for ADHD in Classroom
Z4 = Z-Score for Conners on Computer

where Z-score = (Score - Score at T0)/SD, to make each measure scaled similarly.

GHA = Mean of Z1, Z2, Z3, Z4
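
The aggregation can be sketched in a few lines of Python. The function names and all numbers below are made up for illustration (they are not trial data), and the slide does not say which SD is used for each instrument, so the SD is simply passed in as an input.

```python
from statistics import mean


def z_score(score, baseline, sd):
    """Change from the T0 (baseline) score, expressed in SD units."""
    return (score - baseline) / sd


def gha(measures):
    """Mean of the z-scores over the available instruments.

    `measures` is a list of (score, baseline_score, sd) tuples, one per
    instrument (teacher ADHD, parent WWP, classroom, Conners).
    """
    return mean(z_score(s, b, sd) for s, b, sd in measures)


# Made-up values for one child at one time point (not trial data).
child = [
    (3.0, 2.5, 1.0),     # teacher ADHD rating:    z = 0.5
    (12.0, 10.0, 4.0),   # parent WWP:             z = 0.5
    (7.0, 8.0, 2.0),     # classroom observation:  z = -0.5
    (55.0, 50.0, 10.0),  # Conners CPTII:          z = 0.5
]
print(gha(child))
```

Dividing by the SD puts a 1 – 5 rating and a computer-test score on the same unitless scale, so no single instrument dominates the average.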


Why normal distribution?

• Symmetric.
• One peak.
• Roughly bell-shaped.
• No outliers.

Many (parametric) statistical tests rely on the assumption that outcome measures follow the normal distribution.


A property of the normal distribution

For bell-shaped distributions of data (“normally” distributed):

• ~ 68% of values are within mean ±1 SD

• ~ 95% of values are within mean ±2 SD: the “(Normal) Reference Range”

• ~ 99.7% of values are within mean ±3 SD
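
These percentages can be checked directly from the normal distribution. The sketch below uses only Python's standard library; `prob_within` is an illustrative name, not something from the slides.

```python
import math


def prob_within(k):
    """P(|X - mean| <= k * SD) when X is normally distributed."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within ±{k} SD: {prob_within(k):.4f}")
```

The value for ±2 SD is slightly above 95%; the multiplier that gives exactly 95% is about 1.96, which is why "2" on the slide is a convenient rounding.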


What if it is not normally distributed?

Skewed [histogram: Frequency vs. Intensity]
Need to transform intensity to another scale, e.g. Log(intensity), or use nonparametric tests.

Multi-Peak [histogram: Frequency vs. Tumor Volume]
Need to summarize with percentiles, not the mean. Use nonparametric tests.
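
To see why a log transform helps with right-skewed data, here is a small simulation sketch. The "intensity" values are simulated from a log-normal distribution (an assumption chosen so that the log scale is exactly normal); they are not the data behind the slide's histogram, and `skewness` is a hand-written helper.

```python
import math
import random


def skewness(values):
    """Sample skewness: mean cubed deviation divided by SD cubed."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return sum((v - m) ** 3 for v in values) / (n * sd ** 3)


rng = random.Random(1)
# Simulated right-skewed "intensity" values (log-normal), not real data.
intensity = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
log_intensity = [math.log(v) for v in intensity]

print(f"skewness, raw scale: {skewness(intensity):.2f}")
print(f"skewness, log scale: {skewness(log_intensity):.2f}")
```

A skewness near 0 on the log scale indicates a roughly symmetric distribution, which is what the parametric tests assume. Real data will rarely transform this cleanly, so a histogram of the transformed values is still worth checking.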


Representative or Random Samples

How were the children in the study selected (second column on the first page)? The authors purposely selected "representative" social classes.

Is this better than a "randomly" chosen sample that ignores social class?

Often hear: Non-random = Non-scientific.


Case Study: Participant Selection

No mention of random samples.


Case Study: Participant Selection

It may be that only a few schools are needed to get sufficient individuals. If, among all possible schools, there are few that are lower SES, none of these schools may be chosen.

So, a random sample of schools is chosen from the lower SES schools, and another random sample from the higher SES schools.
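
The contrast between a simple random sample and a stratified one can be sketched as follows. The school lists (5 lower-SES, 40 higher-SES), the sample sizes and the function name are all hypothetical; they only illustrate the argument above.

```python
import math
import random


def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of schools within each SES stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(schools, n_per_stratum)
        for name, schools in strata.items()
    }


# Hypothetical school lists: few lower-SES schools, many higher-SES ones.
strata = {
    "lower SES": [f"L{i}" for i in range(1, 6)],
    "higher SES": [f"H{i}" for i in range(1, 41)],
}
chosen = stratified_sample(strata, n_per_stratum=3)
print(chosen)

# Chance that a simple random sample of 6 schools out of all 45
# contains no lower-SES school at all.
p_miss = math.comb(40, 6) / math.comb(45, 6)
print(f"P(no lower-SES school in a simple random sample): {p_miss:.2f}")
```

With these made-up numbers, an unstratified draw would miss the lower-SES schools almost half the time, while the stratified draw always includes them and is still random within each stratum.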


Non-Completing or Non-Adhering Subjects

Is it really a random sample? If not, what are the problems?


Why Randomize?

So that groups will be similar except for the intervention.

So that, when enrolling, we will not unconsciously choose an “appropriate” treatment for a particular subject.

Minimizes the chance of introducing bias when attempting to systematically remove it, as in the plant yield example.


Case Study: Crossover Design

Each child is studied on 3 occasions under different diets.

Is this better than three separate groups of children?

Why, intuitively?

How could you scientifically prove your intuition?


Estimated mean changes and their Confidence Intervals

[Figure: line or profile plot, with the confidence intervals marked]

What information is given by these confidence intervals?


Confidence Interval (CI)

• How well does your sample mean (m) reflect the true (or population) mean? How confident are you? 95%?

• A confidence interval (CI) is an inferential statistic that estimates the true, unknown parameter with a range of values rather than a single number.


Confidence Interval for Population Mean

The 95% Reference range, or “Normal Range”, is

sample mean ± 2(SD)

The 95% Confidence interval (CI) for the (true, but unknown) mean for the entire population is

sample mean ± 2(SD/√N)

SD/√N is called the “Std Error of the Mean” (SEM).


Confidence Interval: Case Study

From Table 2:

Confidence Interval:

-0.14 ± 1.99(1.04/√73) = -0.14 ± 0.24 → -0.38 to 0.10

Normal Range:

-0.14 ± 1.99(1.04) = -0.14 ± 2.07 → -2.21 to 1.93

The confidence interval is close to the adjusted CI in Table 2 (values shown: 0.13, -0.12, -0.37).
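
The arithmetic in this case study can be reproduced with a short Python sketch. The function name `mean_interval` is illustrative; the inputs (mean -0.14, SD 1.04, N 73, multiplier 1.99) are the ones quoted on the slide.

```python
import math


def mean_interval(mean, sd, n, multiplier):
    """Return (reference_range, confidence_interval) as (low, high) pairs."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    ref = (mean - multiplier * sd, mean + multiplier * sd)
    ci = (mean - multiplier * sem, mean + multiplier * sem)
    return ref, ci


# Values as quoted on the slide: mean -0.14, SD 1.04, N 73, multiplier 1.99.
ref, ci = mean_interval(-0.14, 1.04, 73, 1.99)
print("normal range: %.2f to %.2f" % ref)
print("95%% CI:       %.2f to %.2f" % ci)
```

The only difference between the two intervals is the division by √N: the normal range describes where individual children fall, while the confidence interval describes the uncertainty in the mean and shrinks as N grows.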


P-values!

• Used as evidence against your null hypothesis (H0)
  – e.g., H0: no difference in mean GHA scores among the three different diets.

• Based on a statistical test
  – e.g., t-test statistic = Signal / Noise
  – if Signal >> Noise → statistically significant

• Usually p < 0.05 is called “statistically significant”, in favor of Ha
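
As a simplified illustration of "Signal / Noise", the sketch below computes a one-sample t statistic in Python, reusing the summary numbers from the confidence-interval case study (mean -0.14, SD 1.04, N 73) purely as example inputs. This is an assumption for illustration only: the trial's actual comparison of three diets used a more elaborate model, and `one_sample_t` is a hypothetical helper.

```python
import math


def one_sample_t(mean, sd, n, null_value=0.0):
    """t = signal / noise = (mean - null value) / (SD / sqrt(N))."""
    noise = sd / math.sqrt(n)  # standard error of the mean
    return (mean - null_value) / noise


t = one_sample_t(-0.14, 1.04, 73)
critical = 1.99  # two-sided 5% cutoff quoted in the confidence-interval case study
print(f"t = {t:.2f}, significant at 5%: {abs(t) > critical}")
```

Here the signal is smaller than 1.99 times the noise, so p > 0.05. This agrees with the 95% confidence interval for the same numbers, which includes 0.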


Experimental Units

Independence of Measurements


Units and Independence

Experiments may be designed such that each measurement does not give additional independent information.

Many basic statistical methods require that measurements are “independent” for the analysis to be valid.

In mathematics, two events are independent if and only if the occurrence of one event makes it neither more nor less probable that the other occurs.


Experimental Units in Case Study

What is the experimental unit in this study?
1. School
2. Child
3. Parent
4. GHA score (results from three diets)

Are all GHA scores (e.g., 153 x 3 groups = 459 GHA scores for 3-4 year old children) independent?

The analysis MUST incorporate this possible correlation (clustering) if it exists, e.g., a Mixed Model allowing for clustering due to schools.
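
One rough way to see why 459 correlated scores are not 459 independent observations is the standard design-effect formula for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of scores per cluster and ICC is the intraclass correlation. The Python sketch below treats each child as a cluster of 3 scores; the ICC values are hypothetical, and this is a back-of-the-envelope illustration for estimating an overall mean, not the mixed model the analysis would actually use.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_n(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered data are worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# 153 children x 3 GHA scores, as on the slide; the ICC values are hypothetical.
for icc in (0.0, 0.5, 1.0):
    print(icc, round(effective_n(153, 3, icc), 1))
```

With no within-child correlation all 459 scores count; with perfect correlation the data are worth only 153 observations, one per child. Treating correlated scores as independent would therefore overstate the precision of such an estimate.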


What have we learned today?


Announcements

• Keys for HW1 and HW 2 will be posted on class website by Wednesday.
