CCEB
Topics in Biostatistics: Part 2
Sarah J. Ratcliffe, Ph.D.
Center for Clinical Epidemiology and Biostatistics
University of Penn School of Medicine
Outline
- Hypothesis testing
- Examples
- Interpreting results
- Resources
Hypothesis testing
Steps:
- Select a one-sided or two-sided test.
- Establish the level of significance (e.g., α = .05).
- Select an appropriate test statistic.
- Compute test statistic with actual data.
- Calculate degrees of freedom (df) for the test statistic.
Hypothesis testing
Steps cont'd:
- Obtain a tabled value for the statistical test.
- Compare the test statistic to the tabled value.
- Calculate a p-value.
- Make a decision to reject or fail to reject the null hypothesis.
Hypothesis testing: One-sided versus Two-sided
Determined by the alternative hypothesis.
Unidirectional = one-sided
Example: Infected macaques given vaccine or placebo. Higher viral-replication in the vaccine group has no benefit of interest.
H0: vaccine has no beneficial effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine lowers viral-replication levels by 6 weeks after infection.
Hypothesis testing: One-sided versus Two-sided
Bi-directional = two-sided
Example: Infected macaques given vaccine or placebo. Interested in whether vaccine has any effect on viral-replication levels, regardless of direction of effect.
H0: vaccine has no effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine affects the viral-replication levels.
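The effect of choosing a one-sided or a two-sided alternative can be sketched numerically. This is only an illustration, assuming a test statistic that is approximately standard normal under H0; the value z = -1.8 is hypothetical (negative meaning lower viral replication in the vaccine group) and does not come from the macaque example.

```python
from statistics import NormalDist


def p_values(z):
    """Return (one-sided lower-tail, two-sided) p-values for a z statistic."""
    std_normal = NormalDist()                      # mean 0, sd 1
    p_lower = std_normal.cdf(z)                    # Ha: vaccine lowers the level
    p_two = 2 * (1 - std_normal.cdf(abs(z)))       # Ha: any difference
    return p_lower, p_two


# Hypothetical statistic: vaccine group lower than placebo.
p_one_sided, p_two_sided = p_values(-1.8)
print(f"one-sided p = {p_one_sided:.3f}, two-sided p = {p_two_sided:.3f}")
```

With the same data, the two-sided p-value is twice the one-sided value in the observed direction, so the choice has to be made from the alternative hypothesis before looking at the results.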
Hypothesis testing: Level of Significance
How many different hypotheses are being examined?
How many comparisons are needed to answer this hypothesis?
Are any interim analyses planned? E.g., test the data and, depending on the results, collect more data and re-test.
=> How many tests will be run in total?
Hypothesis testing: Level of Significance
αtotal = desired total Type-I error (false positives) for all comparisons.
One test: α1 = αtotal
Multiple tests / comparisons: if αi = αtotal, then ∑αi > αtotal
=> Need to use a smaller α for each test.
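A rough sketch of why the per-test α has to shrink, assuming the tests are independent (an assumption made here for simplicity, not stated on the slide). Under that assumption the chance of at least one false positive across k tests is 1 - (1 - α)^k. The helper name `familywise_error` is hypothetical.

```python
def familywise_error(alpha_per_test, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests


alpha_total = 0.05
for k in (1, 5, 10, 20):
    unadjusted = familywise_error(alpha_total, k)      # every test run at alpha_total
    adjusted = familywise_error(alpha_total / k, k)    # alpha_total split evenly
    print(f"{k:2d} tests: unadjusted {unadjusted:.3f}, adjusted {adjusted:.3f}")
```

Running every test at αtotal lets the overall error rate grow quickly with the number of tests, while splitting αtotal evenly keeps it just under the target.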
Hypothesis testing: Level of Significance
Conservative approach: αi = αtotal / number of comparisons
Can give different α's to each comparison.
Formal methods include: Bonferroni, Tukey-Kramer, Scheffé's method, Duncan-Waller.
O'Brien-Fleming boundary or a Lan and DeMets analog can be used to determine αi for interim analyses.
Benjamini Y, and Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSSB, 57:289-300.
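The false discovery rate procedure from the reference above can be sketched as follows, next to the conservative αtotal / number-of-comparisons rule. The p-values are hypothetical and the function name `benjamini_hochberg` is chosen here for illustration; this is a minimal step-up implementation, not a replacement for a statistics package.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from eight comparisons.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
bh = benjamini_hochberg(p_vals, q=0.05)
bonferroni = [p <= 0.05 / len(p_vals) for p in p_vals]
print("Benjamini-Hochberg rejections:", sum(bh))
print("Bonferroni rejections:", sum(bonferroni))
```

On these made-up values the false discovery rate procedure rejects one more hypothesis than the evenly split α, which is the sense in which it is less conservative.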
Hypothesis testing: Selecting an Appropriate test
How many samples are being compared?
- One sample
- Two samples
- Multi-samples
Are these samples independent?
- Unrelated subjects in each sample.
- Subjects in each sample related / same.
Hypothesis testing: Selecting an Appropriate test
Are your variables continuous or categorical?
If continuous, are the data normally distributed?
- Normality can be assessed using a P-P (or Q-Q) plot.
- Plot should be approximately a straight line for normality.
- If not normal, can the data be transformed to normality?
Blindly assuming normality can lead to wrong conclusions!!!
Hypothesis testing: Selecting an Appropriate test
Approximately a straight line
= normal assumption okay
Hypothesis testing: Selecting an Appropriate test
Not a straight line
= NOT normal
Can it be transformed to normality?
Hypothesis testing: Selecting an Appropriate test
The natural log transform of the data is approximately a straight line
= normal assumption okay
Analyze the transformed data NOT the original data.
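The "straight line" check can be approximated numerically. The sketch below builds the Q-Q points by hand and uses their correlation as a rough measure of straightness; the measurements are hypothetical, the plotting positions (i - 0.5) / n are one common convention, and the helper names are made up for this illustration. A plot is still the better tool for judging normality.

```python
import math
from statistics import NormalDist, mean


def qq_points(values):
    """Pair each sorted value with its theoretical standard-normal quantile."""
    n = len(values)
    ordered = sorted(values)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered


def straightness(values):
    """Pearson correlation of the Q-Q points; closer to 1 means straighter."""
    x, y = qq_points(values)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical right-skewed measurements.
raw = [120, 150, 210, 340, 560, 900, 1500, 2600, 4800, 9500]
logged = [math.log(v) for v in raw]          # natural log transform

r_raw = straightness(raw)
r_log = straightness(logged)
print(f"Q-Q correlation: raw = {r_raw:.3f}, log-transformed = {r_log:.3f}")
```

For these values the log-transformed data sit much closer to a straight line than the raw data, which is the situation where the analysis should be run on the transformed values.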
Hypothesis testing: Geometric versus Arithmetic mean
- Geometric mean of n positive numerical values is the nth root of the product of the n values.
- Geometric will always be less than or equal to arithmetic (equal only when all values are identical).
- Geometric better when some values are very large in magnitude and others are small.
- If geometric is used, log-transform the data before analyzing.
- Arithmetic mean of log-transformed data is the log of the geometric mean of the data.
- E.g. t-test on log-transformed data = test for location of the geometric mean.
Langley R., Practical Statistics Simply Explained, 1970, Dover Press
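The link between the geometric mean and the log transform can be checked on a small set of numbers. The values below are hypothetical and chosen so the arithmetic is easy to follow (their product is 2^16, so the 4th root is 16).

```python
import math
from statistics import geometric_mean, mean

values = [2, 8, 32, 128]                 # hypothetical positive measurements

arith = mean(values)                     # ordinary arithmetic mean
geo = geometric_mean(values)             # 4th root of 2 * 8 * 32 * 128

# Arithmetic mean of the logs, back-transformed, equals the geometric mean.
log_values = [math.log(v) for v in values]
geo_from_logs = math.exp(mean(log_values))

print(f"arithmetic = {arith}, geometric = {geo:.2f}, from logs = {geo_from_logs:.2f}")
```

Here the large value 128 pulls the arithmetic mean well above the geometric mean, which stays closer to the middle of the data on a multiplicative scale.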
Source: Richardson & Overbaugh (2005). Basic statistical considerations in virological experiments. Journal of Virology, 79(2): 669-676.

| Type of data | No. of samples being compared | Relationship between samples | Underlying distribution of all samples | Potential statistical tests |
|---|---|---|---|---|
| Binary | 1 | n/a | Binary | One-sample binomial test |
| Binary | 2 | Independent | Binary | Chi-square test, Fisher's exact test |
| Binary | >2 | Independent | Binary | Chi-square test |
| Binary | 2 | Paired | Binary | McNemar's test |
| Binary | >2 | Related | Binary | Cochran's Q test |
| Continuous | 1 | n/a | Normal | One-sample t-test for means, one-sample chi-square test for variances |
| Continuous | 1 | n/a | Non-normal | One-sample Wilcoxon signed-rank test, one-sample sign test |
| Continuous | 2 | Independent | Normal | Two-sample t-test for means, two-sample F test for variances |
| Continuous | 2 | Independent | Non-normal | Wilcoxon rank sum test |
| Continuous | 2 | Paired | Normal | Paired t-test |
| Continuous | 2 | Paired | Non-normal | Wilcoxon signed-rank test, sign test |
| Continuous | >2 | Independent | Normal | One-way ANOVA for means, Bartlett's test of homogeneity for variances |
| Continuous | >2 | Independent | Non-normal | Kruskal-Wallis test |
| Continuous | >2 | Related | Non-normal | Friedman rank sum test |
Hypothesis testing: Selecting an Appropriate test
Other tests are available for more complex situations. For example,
Repeated measures ANOVA: >2 measurements taken on each subject; usually interested in time effect.
GEEs / Mixed-effects models: >2 measurements taken on each subject; adjust for other covariates.
Hypothesis testing
Steps:
- Select a one-tailed or two-tailed test.
- Establish the level of significance (e.g., α = .05).
- Select an appropriate test statistic.
- Run the test.
Example 1
Expression of chemokine receptors on CD14+/CD14- populations of blood monocytes.
Percent of cells positive by FACS.
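CCR8

| subject | CD14+ | CD14- |
|---|---|---|
| 1 | 5 | 17 |
| 2 | 9 | 25 |
| 3 | 13 | 36 |
| 4 | 2 | 9 |
| 5 | 5 | 18 |
| 6 | 0 | 2 |
| 7 | 6 | 6 |
| 8 | 21 | 30 |
| 9 | 5 | 6 |
| 10 | 36 | 35 |
| mean | 10.2 | 18.4 |
| st dev | 10.9 | 12.6 |
| st error | 3.4 | 4.0 |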
Example 1 cont’d
Continuous data, 2 samples
=> t-test, if normal, OR
=> Wilcoxon rank sum or signed-rank test, if non-normal
Are samples independent or paired?
If independent, can test for equality of variances using Levene's test.
Example 1 cont’d
T-tests in Excel
=TTEST(L6:L15,M6:M15,2,2)
- L6:L15: cells containing data from sample 1
- M6:M15: cells containing data from sample 2
- Third argument: 1-sided (1) or 2-sided (2) test
- Fourth argument: type of t-test
  1: paired
  2: independent, equal variance
  3: independent, unequal variance
Example 1 cont'd
Possible results for different assumptions:

| P-values | Normal (t-tests) | Non-normal (non-parametric tests) |
|---|---|---|
| Independent, equal variance | 0.137 | |
| Independent, unequal variance | 0.137 | 0.105 |
| Paired | 0.010 | 0.013 |
Example 1 cont’d
Which result is correct?
- Data are paired.
- The differences for each subject are normally distributed.
=> Paired t-test, p = .0095
There is a difference in the percentage of positive CD14+ and CD14- cells.
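The gap between the paired and independent results can be reproduced from the CCR8 table using only the Python standard library. The sketch below computes the two test statistics and compares them with two-sided critical values at α = .05 taken from a standard t table (2.262 for df = 9, 2.101 for df = 18); exact p-values would need a statistics package or the Excel function.

```python
import math
from statistics import mean, stdev, variance

# Percent of cells positive for CCR8, subjects 1-10 (from the table).
cd14_pos = [5, 9, 13, 2, 5, 0, 6, 21, 5, 36]
cd14_neg = [17, 25, 36, 9, 18, 2, 6, 30, 6, 35]
n = len(cd14_pos)

# Paired t-test: a one-sample t-test on the within-subject differences.
diffs = [neg - pos for pos, neg in zip(cd14_pos, cd14_neg)]
t_paired = mean(diffs) / (stdev(diffs) / math.sqrt(n))
df_paired = n - 1

# Independent, equal-variance t-test: uses the pooled variance instead.
pooled_var = ((n - 1) * variance(cd14_pos) + (n - 1) * variance(cd14_neg)) / (2 * n - 2)
t_indep = (mean(cd14_neg) - mean(cd14_pos)) / math.sqrt(pooled_var * 2 / n)
df_indep = 2 * n - 2

# Two-sided critical values at alpha = .05 from a standard t table.
T_CRIT = {9: 2.262, 18: 2.101}

print(f"paired:      t = {t_paired:.2f}, df = {df_paired}, critical = {T_CRIT[df_paired]}")
print(f"independent: t = {t_indep:.2f}, df = {df_indep}, critical = {T_CRIT[df_indep]}")
```

Both tests use the same mean difference of 8.2; the paired statistic is larger because the standard deviation of the within-subject differences is smaller than the spread of either column.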
A graph of the 95% CIs for the means would give the impression there is no difference …
When it’s really the differences we are testing.
Example 1 cont’d
Note: paired tests don’t always give lower p-values.
A 1-sided test on the CCR5 values would give p-values of:
p = 0.06 independent samples
p = 0.11 paired samples
WHY?
Example 1 cont’d
The differences have a larger spread than the individual variables.
Example 2
Does the level of CCR5 expression on PBLs (basal or upregulated using lentiviral vector) determine the % of entry that occurs via CCR5?
Two viruses: 89.6 and DH12
Example 2 cont'd
CCR5-mediated entry into PBL from 6 donors
[Figure: scatter plot of % of entry mediated by CCR5 (y-axis) against % of cells CCR5 positive (x-axis), with a linear fit for each virus.]
89.6: y = 3.7371x - 0.1265, R2 = 0.4473
DH12: y = 4.1408x + 4.2137, R2 = 0.4333
Example 2 cont’d
How do we know if the slope of the line is significantly different from 0?
Can perform a t-test on the slope estimate. For simple linear regression, this is the same as a t-test for correlation (= square root of R2).
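For simple linear regression the test statistic can be recovered from R2 alone as t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below applies this to the R2 reported for virus 89.6. The number of plotted points is not stated on the slide, so `n_points = 6` (one point per donor) is an assumption, and the function name is made up for illustration.

```python
import math


def slope_t_statistic(r_squared, n_points):
    """Magnitude of the t statistic for H0: slope = 0 in simple linear regression.

    The sign of t follows the sign of the fitted slope; only the size is returned.
    """
    df = n_points - 2
    t = math.sqrt(r_squared * df / (1 - r_squared))
    return t, df


# R2 is from the slide; n_points = 6 is an assumption (one point per donor).
t_896, df_896 = slope_t_statistic(0.4473, 6)

# Two-sided critical value at alpha = .05 for df = 4, from a standard t table.
T_CRIT_DF4 = 2.776
print(f"89.6: t = {t_896:.2f}, df = {df_896}, critical = {T_CRIT_DF4}")
```

Under that assumed n, the statistic falls short of the tabled value, so the slope would not be judged different from 0 at α = .05; with more points the same R2 would give a larger t.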
Interpreting Results
P-values
- Is there a statistically significant result?
- If not, was the sample size large enough to detect a biologically meaningful difference?
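Whether a sample was large enough can be gauged with a rough calculation. The sketch below uses the usual normal-approximation formula for comparing two independent means, n per group = 2 * ((z for α/2 + z for power) * σ / δ)^2. The difference δ = 5 and standard deviation σ = 10 are hypothetical, and the function name is made up; a dedicated power calculator gives slightly larger numbers for small samples because it uses the t distribution.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_beta = z.inv_cdf(power)             # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical: detect a 5-point difference when the standard deviation is 10.
n_needed = n_per_group(delta=5, sigma=10)
print(f"about {n_needed} subjects per group")
```

Halving the difference of interest roughly quadruples the required sample size, which is why a non-significant result from a small study says little about small but meaningful effects.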
Online Resources
Power / sample size calculators
- http://calculators.stat.ucla.edu/powercalc/
- http://www.stat.uiowa.edu/~rlenth/Power/
Free statistical software
- http://members.aol.com/johnp71/javasta2.html#Freebies
BECC – Consulting Center
www.cceb.upenn.edu/main/center/becc.html
- Hourly fee service
- Design and analysis strategies for research proposals;
- Selecting and implementing appropriate statistical methods for specific applications to research data;
- Statistical and graphical analysis of data;
- Statistical review of manuscripts.