![Page 1: WEO Research Workshop · 2018-12-05 · Breath of biostatistics •Descriptive statistics (data types, central tendency, dispersion, exploratory data analysis) •Probability distributions](https://reader034.vdocuments.site/reader034/viewer/2022042412/5f2c66f11c24b00d6314b658/html5/thumbnails/1.jpg)
WEO Research Workshop
Seoul, 17 November 2018
Biostatistician’s role in study design
Dr Sunny H Wong
MBChB, DPhil, FRCPEd, FRCPath, FHKCP, FHKAM
Assistant Professor
Institute of Digestive Disease
The Chinese University of Hong Kong
Breadth of biostatistics
• Descriptive statistics (data types, central tendency, dispersion, exploratory data analysis)
• Probability distributions & confidence intervals
• Hypothesis testing (null hypothesis, type I and II errors, sample size, power)
• Inferential statistics (t-test, chi-square, trend test, Fisher’s test, log rank test, comparative data analysis)
• Correlation & regression
• Multiple comparisons & corrections
• Survival analysis
• Meta-analysis
• Bayesian statistics
• Others (diagnosis, public health, bioinformatics)
Outline
1. Study design and power
2. Descriptive statistics
3. Inferential statistics
4. Software and tips
Study design and power
Concepts of statistics
• The science of (1) collecting and analyzing data to help make decisions under uncertainty, and (2) making inferences about an observed phenomenon
Clinical studies
Types of clinical studies
• Descriptive studies
  • Population
  • Samples
    • Case reports
    • Case series
    • Cross-sectional
• Analytical studies
  • Observational
    • Case control
    • Cohort
  • Experimental
    • RCT
• Meta-analysis
Power calculation
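$$n = \frac{2\sigma^{2}\,(Z_{\beta} + Z_{\alpha/2})^{2}}{d^{2}}$$

• n = sample size (per group)
• σ = standard deviation
• β = type II error rate (power = 1 − β)
• α = level of significance
• d = effect size (difference to detect)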
Loss to follow-up
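The sample size formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a dedicated power calculator: it assumes a two-group comparison of means, takes Z for power as the standard normal quantile at the chosen power, and inflates the result for loss to follow-up by dividing by (1 - dropout). The function name, the default values and the example numbers are all hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sigma, d, alpha=0.05, power=0.80, dropout=0.0):
    """Sample size per group for comparing two means (normal approximation).

    sigma   : standard deviation of the outcome
    d       : difference in means to detect
    dropout : expected fraction lost to follow-up (assumption: simple inflation)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    n = 2 * sigma ** 2 * (z_beta + z_alpha) ** 2 / d ** 2
    return ceil(n / (1 - dropout))                 # round up to whole participants


# Illustrative numbers only: SD of 10 units, difference of 5 units to detect.
print(sample_size_per_group(sigma=10, d=5))
print(sample_size_per_group(sigma=10, d=5, dropout=0.10))
```

A smaller difference to detect or a larger standard deviation raises the required sample size quickly, because both enter the formula as squares.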
Descriptive statistics
Types of variables
• Nominal - categories (e.g. gender, ethnicity)
• Ordinal - categories with a natural order (e.g. cancer stage, grade)
• Numerical / scalar - quantitative
  • Continuous scale (e.g. age, height)
  • Discrete scale (e.g. number of polyps)
Box-and-whisker plot
Mind these!
Bar chart
• Bar height is the mean
• How about error bars?
  • Standard deviation (SD)
  • Standard error of the mean (SE / SEM)
  • Confidence interval (95% CI)
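The three error-bar choices give bars of very different lengths for the same data, which a short calculation makes concrete. The sketch below is illustrative only: the data are made up, the function name is hypothetical, and the 95% CI uses the normal quantile (about 1.96) as a large-sample approximation rather than a t quantile.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def error_bars(values, confidence=0.95):
    """Return mean, SD, SEM and an approximate confidence interval."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)                  # sample SD (n - 1 in the denominator)
    sem = sd / sqrt(n)                  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # ~1.96 for 95%
    return {"mean": m, "sd": sd, "sem": sem, "ci": (m - z * sem, m + z * sem)}


# Hypothetical measurements, for illustration only.
bars = error_bars([2, 4, 4, 4, 5, 5, 7, 9])
print(bars)
```

SD describes the spread of the observations, whereas SEM and the CI describe the precision of the mean, so a figure legend should always state which one is drawn.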
Histogram
Q: What are the differences between a bar chart and a histogram? (P.T.O.)
Bar chart and histogram
Bonus Q: what is a Pareto chart?
Normal distribution
de Moivre, Gauss & Laplace
Gaussian distribution (z-statistics)
• ~68% of values within +/- 1 sd
• ~95% of values within +/- 2 sd
• ~99.7% of values within +/- 3 sd
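The percentages on this slide can be checked directly from the standard normal cumulative distribution function. The helper name below is hypothetical; the calculation itself uses only Python's standard library.

```python
from statistics import NormalDist


def coverage_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"+/- {k} sd: {coverage_within(k):.4f}")
```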
Normality
• Histogram inspection
  • n>30
  • Fitting shape
• Quantile-quantile plot
• Formal tests
  • Kolmogorov-Smirnov test
  • Shapiro-Wilk test
QQ plot
Q: What is the hospital length-of-stay distribution? (right skewed distribution)
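A QQ plot pairs each sorted observation with the normal quantile expected at the same position; points close to a straight line suggest normality. The sketch below computes those pairs and, as a rough numerical summary, the correlation between them. The length-of-stay values are invented for illustration, the plotting position (i - 0.5)/n is one common convention among several, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def qq_points(values):
    """Pair theoretical standard normal quantiles with the sorted observations."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(values):
    """Pearson correlation of the QQ points (near 1 means near a straight line)."""
    points = qq_points(values)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical right-skewed length-of-stay data (days), for illustration only.
length_of_stay = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7, 8, 10, 12, 15, 21, 30]
# Idealised symmetric data built from normal quantiles, for comparison.
symmetric = [NormalDist(50, 10).inv_cdf((i - 0.5) / 20) for i in range(1, 21)]

print(qq_correlation(length_of_stay))
print(qq_correlation(symmetric))
```

For the skewed data the upper points bend away from the line, which lowers the correlation; plotting the pairs from `qq_points` shows the same pattern visually.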
Inferential statistics
Sampling
e.g. systolic blood pressure (SBP) of the Hong Kong population
Sampling distribution
Central limit theorem
Distribution of the means
95% C.I.
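The central limit theorem can be illustrated with a small simulation: repeated samples are drawn from a clearly non-normal population, and the distribution of their means is examined. The sketch assumes an exponential population with mean 1 and SD 1, a sample size of 30 and a fixed random seed; all of these are arbitrary choices made for illustration, and the function name is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n=30, repeats=5000, seed=0):
    """Means of repeated samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(repeats)]


means = simulate_sample_means()
se_theory = 1 / sqrt(30)  # population SD divided by sqrt(n)

# Fraction of sample means within 1.96 standard errors of the true mean (1.0).
inside = sum(abs(m - 1.0) <= 1.96 * se_theory for m in means) / len(means)

print(f"mean of sample means: {mean(means):.3f}")
print(f"SD of sample means:   {stdev(means):.3f} (theory {se_theory:.3f})")
print(f"within 1.96 SE:       {inside:.3f}")
```

Even though the population is strongly skewed, the sample means cluster around the population mean with a spread close to SD/√n, which is the basis of the 95% confidence interval.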
Student’s t distribution
• Described by William Sealy Gosset
• Resembles the normal distribution if the sample size is large (n>30)
Hypothesis testing
• Confirmatory data analysis to assess whether the observed data are compatible with a given hypothesis
• Null hypothesis ‘H0’: statement of no difference or association between variables
• Alternative hypothesis ‘H1’: statement of a difference or association between variables
• Type I (alpha) and Type II (beta) errors
P-value
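Probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one obtained. The null hypothesis is rejected when the P-value falls below the level of significance (α), which is the accepted probability of mistakenly rejecting it.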
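One way to see what a P-value measures is a permutation test, which is not covered in these slides and is used here only because it makes the idea explicit: if the null hypothesis of no difference holds, group labels are interchangeable, so the labels are shuffled many times and the fraction of shuffles giving a difference at least as large as the observed one is counted. The data, the function name and the number of shuffles are hypothetical.

```python
import random


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation P-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # relabel under the null hypothesis
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Illustrative data only: two clearly separated groups.
p_separated = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
# Identical groups: every shuffle is at least as extreme as a difference of zero.
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

A small P-value means the observed difference would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.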
Parametric tests
• Parametric tests assume data come from a population
  • With a known probability distribution (e.g. normal)
  • Based on a fixed set of parameters (e.g. mean, SD)
• Non-parametric tests are usually less powerful (values discarded)
Comparative tests (interval)
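Tests for difference

| Data type | Groups | 2 groups | 3 groups |
|---|---|---|---|
| Interval, parametric distribution: yes | Related | t-test (paired) | ANOVA |
| Interval, parametric distribution: yes | Unrelated | t-test (unpaired) | ANOVA |
| Ordinal, or interval with parametric distribution: no | Related | Wilcoxon signed rank test | Friedman test |
| Ordinal, or interval with parametric distribution: no | Unrelated | Mann-Whitney U / Wilcoxon rank sum test | Kruskal-Wallis test |
| Nominal | Related | McNemar test | |
| Nominal | Unrelated | Chi-squared / Fisher’s exact test | |

3 groups: post-hoc tests for multiple comparisons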
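A walk-through of rank sum test

| Group A | Group B | Rank A | Rank B |
|---|---|---|---|
| 87 | 71 | 19 | 9 |
| 72 | 42 | 10 | 1 |
| 94 | 69 | 22 | 8 |
| 49 | 97 | 2 | 23 |
| 56 | 78 | 4 | 14.5 |
| 88 | 84 | 20 | 17 |
| 74 | 57 | 12 | 5 |
| 61 | 64 | 6 | 7 |
| 80 | 78 | 16 | 14.5 |
| 52 | 73 | 3 | 11 |
| 75 | 85 | 13 | 18 |
|  | 91 |  | 21 |

|  | A | B |
|---|---|---|
| Total rank | 127 | 149 |
| Median | 74 | 75.5 |
| n | 11 | 12 |
| U | 71 | 61 |

U statistic = 61, P-value = 0.76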
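The walk-through can be reproduced in code: pool the two groups, rank the values (ties share the average rank), sum the ranks per group, and convert the rank sums to U. The sketch below uses the data from this slide; the function names are hypothetical, and the P-value uses a normal approximation without continuity or tie correction, which is an assumption about how the slide's figure was obtained.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average rank."""
    return [sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
            for v in values]


def rank_sum_test(a, b):
    """Mann-Whitney U / Wilcoxon rank sum test (normal approximation)."""
    n_a, n_b = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    r_a, r_b = sum(ranks[:n_a]), sum(ranks[n_a:])
    u_a = n_a * n_b + n_a * (n_a + 1) / 2 - r_a
    u_b = n_a * n_b + n_b * (n_b + 1) / 2 - r_b
    u = min(u_a, u_b)
    mu = n_a * n_b / 2                           # mean of U under the null
    sd = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # SD of U under the null
    p = 2 * NormalDist().cdf(-abs((u - mu) / sd))
    return {"rank_a": r_a, "rank_b": r_b, "u_a": u_a, "u_b": u_b, "u": u, "p": p}


group_a = [87, 72, 94, 49, 56, 88, 74, 61, 80, 52, 75]
group_b = [71, 42, 69, 97, 78, 84, 57, 64, 78, 73, 85, 91]
result = rank_sum_test(group_a, group_b)
print(result)
```

As a check, the two rank totals must add up to N(N+1)/2 for N pooled observations, and U(A) + U(B) must equal the product of the two group sizes.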
Comparative tests (categorical)
• Examine differences between observed and expected counts
• Two assumptions
  • Independence of observations
  • Expected count of all cells >5
• Degrees of freedom
  • df = (R-1) x (C-1)
  • Number of values free to vary

|  | Drug A | Drug B | Total |
|---|---|---|---|
| Cured | 20 | 10 | 30 |
| Not cured | 12 | 22 | 34 |
| Total | 32 | 32 | 64 |

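The comparison of observed and expected counts can be worked through for the Drug A / Drug B table. The sketch below computes the expected count for each cell from the row and column totals, sums (observed - expected)² / expected, and derives the degrees of freedom. The function name is hypothetical; no continuity (Yates) correction is applied, and the P-value shortcut via `erfc` is valid only for df = 1, as in a 2 x 2 table.

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Pearson chi-squared test for a 2 x 2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    p = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with df = 1 only
    return chi2, df, p


# Rows: cured, not cured. Columns: Drug A, Drug B.
chi2, df, p = chi_squared_2x2([[20, 10], [12, 22]])
print(f"chi-squared = {chi2:.3f}, df = {df}, P = {p:.4f}")
```

Here every expected count is 15 or 17, so the usual rule of thumb for using the chi-squared approximation is met.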
χ² distribution and df
Correlation
Pearson’s correlation coefficient (parametric)
Spearman’s rank correlation coefficient (non-parametric)
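The two coefficients differ in what they measure: Pearson's captures linear association between the values, while Spearman's is Pearson's coefficient applied to the ranks and so captures any monotonic association. A minimal sketch, with hypothetical function names and invented data:

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average rank."""
    return [sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
            for v in values]


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data: y increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]
print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Because the relationship is perfectly monotonic but curved, Spearman's coefficient reaches 1 while Pearson's stays below it.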
Regression
• Estimating relationships between variables (between dependent and independent variables)
• Logistic regression
• Linear regression
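For the linear case, estimating the relationship means finding the slope and intercept that minimise the squared vertical distances between the line and the data. The sketch below covers simple linear regression with one independent variable only (logistic regression needs iterative fitting and is not shown); the function name and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Illustrative data only: roughly y = 1 + 2x with a little noise.
x = [1, 2, 3, 4, 5]
y = [2.9, 5.2, 6.8, 9.1, 11.0]
slope, intercept = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} x")
```

Unlike a correlation coefficient, the fitted line has units and a direction: it predicts the dependent variable from the independent one.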
Correlation/association ≠ causality
Q: What are the differences between correlation and regression?
Tips and software
Study design
• Most important is your research question
• Consider
  • Time
  • Effort
  • Infrastructure
  • Clinical ethics, governance and compliance
Power calculators
• GPower: http://www.gpower.hhu.de/
• CUHK CCRB: http://www2.ccrb.cuhk.edu.hk/web/
• Clin Calc: http://clincalc.com/stats/samplesize.aspx
Statistical software
R Project for Statistical Computing
Questions?