statistics for anaesthesiologists

Dr John George K. MD,PDCC Associate Professor of Anaesthesiology KMC, Manipal Statistics for Anaesthesiologists

Upload: xeonfusion

Post on 07-May-2015

DESCRIPTION

Statistics for Anaesthesiologists covers basic to intermediate level statistics for researchers especially commonly used study designs or tests in Anaesthesiology research.

TRANSCRIPT

Page 1: Statistics for Anaesthesiologists

Dr John George K. MD, PDCC, Associate Professor of Anaesthesiology

KMC, Manipal

Statistics for Anaesthesiologists

Page 2: Statistics for Anaesthesiologists

Recommended Software

• RStudio (GUI) with R, R Commander, R Commander Plugins like EZR (Free, Cross platform, powerful programming paradigm)

• G*Power (Free, for power analysis)

• SPSS (Commercial, expensive)

• SOFA (Free, basic)

• Graphpad.com

• Spreadsheet software like MS Excel for initial data entry (export as CSV file format)

Page 3: Statistics for Anaesthesiologists

Data Types

• Nominal or Categorical data

• Ordinal data

• Interval data

• Ratio data

Page 4: Statistics for Anaesthesiologists

Data Types

Nominal: Categorical data, including numbers that are used simply as identifiers or names. Ex: social security (Aadhaar) number

Ordinal: An ordered series of relationships or rank order. Ex: first, second, or third place in a contest, Likert scale

Interval: A scale that represents quantity and has equal units, but for which zero is simply one more point of measurement rather than an absence of the quantity. Ex: Fahrenheit scale

Ratio: Similar to the interval scale, but with an absolute zero (no values exist below zero), so ratios of values are meaningful. Ex: Height, Weight

Page 5: Statistics for Anaesthesiologists

Parametric tests

Page 6: Statistics for Anaesthesiologists

Non-parametric tests

Page 7: Statistics for Anaesthesiologists

Reporting data types

OK to compute                       Nominal   Ordinal   Interval   Ratio
Frequency distribution              Yes       Yes       Yes        Yes
Median, percentiles                 No        Yes       Yes        Yes
Mean, SD, SE of mean                No        No        Yes        Yes
Ratio or coefficient of variation   No        No        No         Yes

Page 8: Statistics for Anaesthesiologists

Tests for normality of data

• Kolmogorov-Smirnov Test – inferior to the others; it relies on the goodness of fit of the sample to a normal distribution curve, so avoid its use!

• Shapiro-Wilk Test – better, more specific, more powerful especially with small sample sizes; available in R Commander and SPSS (under menu Analyze>Descriptive Statistics>Explore)

Page 9: Statistics for Anaesthesiologists

Tests for normality of data

• D'Agostino-Pearson test

• Anderson-Darling test

• Q-Q (Quantile-Quantile) Plot – visual guide

• Histogram – inferior, look for Skew or Kurtosis

• Density Plot – better, look for Skew or Kurtosis
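As a rough numerical companion to the visual checks above, skew and kurtosis can be computed directly. The sketch below uses the simple population moment formulas and only the Python standard library; the two data sets are made up for illustration, and other software may apply small-sample corrections that give slightly different values.

```python
from statistics import fmean


def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population moments.

    A normal distribution has skewness 0 and excess kurtosis 0.
    """
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (population)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical samples, for illustration only
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 15]

print("symmetric:   ", skew_and_excess_kurtosis(symmetric))
print("right-skewed:", skew_and_excess_kurtosis(right_skewed))
```

A skewness far from zero (as in the second sample) is a hint to inspect the Q-Q plot and to consider a formal test such as Shapiro-Wilk.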

Page 10: Statistics for Anaesthesiologists

Choosing a statistical test

• Make sure you have an adequate sample size (power) to reject the null hypothesis (Ho)

• Check whether it is a one-tailed comparison (only < or > μ, a single direction) or a two-tailed comparison (≠ μ, significance tested on both sides) – in general use two-tailed

• Look at your data types – ordinal, interval etc

• Compute descriptive statistics

Page 11: Statistics for Anaesthesiologists

Choosing a statistical test

• Test normality of data – tests and visual comparison (especially when n<30)

• Decide to use Parametric Vs Non-parametric tests

• Look at the number of groups: for 2 groups use t-tests (if n<30) or the z-test (n>30), for more groups use ANOVA (F-test), or use their non-parametric equivalents

• For 2 or more groups check if data is paired or independent

Page 12: Statistics for Anaesthesiologists

What is p-value?

Ronald Fisher

Page 13: Statistics for Anaesthesiologists

What is p-value?

Page 14: Statistics for Anaesthesiologists

What is p-value?

• The p-value is a probability computed from the test statistic's sampling distribution under the null hypothesis (the null distribution; we first assume Ho is true!): the probability of a test statistic at least as extreme as the one observed

• The left-tailed p-value is the cumulative probability of the observed test statistic under the null distribution, the right-tailed p-value is one minus that cumulative probability, and the two-tailed p-value is twice whichever of these is smaller.

• The p-value is NOT the probability that the null hypothesis is true, nor is it the probability that the alternative hypothesis is false
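The tail definitions above can be sketched for the simplest case, a z statistic whose null distribution is the standard normal. This is a minimal illustration using Python's standard library; the function name and the example value z = 1.96 are assumptions chosen for the sketch, not part of the original slides.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Left-, right- and two-tailed p-values for a z statistic.

    Assumes the null distribution of the statistic is standard normal.
    """
    null_distribution = NormalDist(mu=0.0, sigma=1.0)
    left = null_distribution.cdf(z)   # P(Z <= z) under Ho
    right = 1.0 - left                # P(Z >= z) under Ho
    two = min(1.0, 2.0 * min(left, right))
    return left, right, two


left, right, two = p_values_from_z(1.96)
print(f"left-tailed  p = {left:.4f}")
print(f"right-tailed p = {right:.4f}")
print(f"two-tailed   p = {two:.4f}")
```

For z = 1.96 the two-tailed p-value is very close to 0.05, which is why 1.96 is the familiar critical value for a two-tailed test at α = 0.05.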

Page 15: Statistics for Anaesthesiologists

What is p-value?

• p-value is NOT the same as α !

• p-value is NOT the probability of rejecting the null hypothesis (we reject Ho when p-value is less than the significance level which is α)

• p-value is computed while α is set by experimental design

• If Ho is true, α is the probability of rejecting null hypothesis
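The last point can be checked by simulation: when Ho is true, about α of all experiments reject it. The sketch below is an illustration under stated assumptions (normally distributed data with known SD of 1, a two-tailed z-test, a fixed random seed); the function name and defaults are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate_under_null(trials=4000, n=30, alpha=0.05, seed=0):
    """Fraction of simulated experiments that reject Ho although Ho is true."""
    rng = random.Random(seed)
    null_distribution = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: the population mean is 0
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # z statistic with known sigma = 1
        p_value = 2.0 * (1.0 - null_distribution.cdf(abs(z)))
        if p_value < alpha:  # the decision rule: reject when p < alpha
            rejections += 1
    return rejections / trials


rate = rejection_rate_under_null()
print(f"Rejection rate with Ho true: {rate:.3f} (alpha was 0.05)")
```

Each experiment produces its own p-value, while α stays fixed at the value chosen in the design; the long-run rejection rate tracks α, not any single p-value.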

Page 21: Statistics for Anaesthesiologists

CHI SQUARE OR FISHER’S EXACT TEST?

• In the days before computers were readily available, people analyzed contingency tables by hand, or with a calculator, using chi-square tests

• The chi-square test works by computing the expected value for each cell if the relative risk (or odds ratio) were 1.0. It then combines the discrepancies between observed and expected values into a chi-square statistic, from which a P value is computed

Page 25: Statistics for Anaesthesiologists

CHI SQUARE OR FISHER’S EXACT TEST?

• The chi-square test is only an approximation!

• Yates' continuity correction is designed to make it better, but it overcorrects and so gives a p-value that is too large (too 'conservative')

• With large sample sizes, Yates' correction makes little difference, and the chi-square test works very well. With small sample sizes, chi-square is not accurate, with or without Yates' correction

Page 26: Statistics for Anaesthesiologists

CHI SQUARE OR FISHER’S EXACT TEST?

• Fisher's exact test, as its name implies, always gives an exact P value and works fine with small sample sizes

• Fisher's test (unlike chi-square) is very hard to calculate by hand (so it is generally used for 2 x 2 or 2 x n tables), but is easy to compute with a computer

• Advisable to use when any cell of the table has expected value < 5

Page 27: Statistics for Anaesthesiologists

CHI SQUARE OR FISHER’S EXACT TEST?

• Most statistics books advise using it instead of the chi-square test (especially for small samples; chi-square becomes acceptable for large sample sizes)

• Fisher's exact test can be used for an m x n table

• Some have criticized it as the exact answer to the wrong question!

Page 28: Statistics for Anaesthesiologists

              Men    Women   Total
Dieting       a      b       a+b
Not Dieting   c      d       c+d
Total         a+c    b+d     (a+b+c+d)=n
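For a 2 x 2 table with cells a, b, c, d laid out as above, both tests can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not a replacement for a statistics package: the chi-square version has no Yates' correction, the two-sided Fisher p-value uses the common rule of summing all tables that are no more probable than the observed one, and the counts in the example are made up.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no Yates' correction) and its p-value, df = 1."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value


def smallest_expected_count(a, b, c, d):
    """Smallest expected cell count under independence of rows and columns."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * k / n for r in rows for k in cols)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value from the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def table_probability(x):
        # Probability that the top-left cell equals x with all margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = table_probability(a)
    total = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: a=3, b=1, c=1, d=3
chi2, p_chi2 = chi_square_2x2(3, 1, 1, 3)
print(f"chi-square = {chi2:.2f}, p = {p_chi2:.4f}")
print(f"smallest expected count = {smallest_expected_count(3, 1, 1, 3):.1f}")
print(f"Fisher's exact p = {fisher_exact_2x2(3, 1, 1, 3):.4f}")
```

With these small counts every expected value is below 5, and the chi-square approximation and the exact test give noticeably different p-values, which illustrates why the exact test is advised for small samples.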

Page 30: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE)

• The one-way analysis of variance (ANOVA) is used to determine whether there are any significant differences between the means of two or more independent (unrelated) groups

• For example, to understand whether exam performance (dependent variable) differs with test anxiety level, students can be divided into three independent groups (e.g., low, medium and high-stressed students)

Page 32: Statistics for Anaesthesiologists

ONE-WAY ANOVA DESIGN

Treatment/Condition   Levels (Independent Variable)
                      Group1    Group2    Group3

CONDITION1            S1 DV     S6 DV     S11 DV
                      S2 DV     S7 DV     S12 DV
                      S3 DV     S8 DV     S13 DV
                      S4 DV     S9 DV     S14 DV
                      S5 DV     S10 DV    S15 DV

DV = Dependent Variable, S = Subject

Page 33: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE)

• It is an omnibus test statistic and cannot tell you which specific groups were significantly different from each other; it only tells you that at least two groups were different.

• Since you may have ≥3 groups in your study design, determining which of these groups differ from each other is done using a Post-hoc test (Tukey’s test is preferred) which gives a Multiple comparisons table.

Page 34: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE)

• To apply ANOVA 6 assumptions must be met:

• Assumption #1: Your dependent variable should be measured at the interval or ratio level (i.e., they are continuous)

• Assumption #2: Your independent variable should consist of two or more categorical, independent groups; it can be used for just two groups (but an independent-samples t-test is more commonly used for two groups)

Page 35: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE)

• Assumption #3: You should have independence of observations, which means that there is no relationship between the observations in each group or between the groups themselves.

• Assumption #4: There should be no significant outliers.

• Assumption #5: Your dependent variable should be approximately normally distributed for each category of the independent variable (but it is quite "robust" to violations of normality)

• Assumption #6: There needs to be homogeneity of variances (tested in SPSS with Levene's test for homogeneity of variances)

Page 36: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE) METHOD

• ANOVA calculates the mean for each of the groups - the Group Means.

• It calculates the mean for all the groups combined - the Overall Mean.

• Then it calculates, within each group, the total deviation of each individual's score from the Group Mean - Within Group (Error) Variation.

Page 37: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE) METHOD

• Next, it calculates the deviation of each Group Mean from the Overall Mean - Between Group Variation.

• Finally, ANOVA produces the F statistic, which is the ratio of the Between Group Variation to the Within Group (Error) Variation, each first divided by its degrees of freedom (i.e., a ratio of mean squares).
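The steps of the method can be traced in a short sketch. It follows the description above, with each variation expressed as a sum of squares and divided by its degrees of freedom before the ratio is taken; the function name and the three small groups are made up for illustration.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA on a list of independent groups."""
    all_values = [x for group in groups for x in group]
    overall_mean = fmean(all_values)                 # the Overall Mean
    group_means = [fmean(group) for group in groups]  # the Group Means

    # Between Group Variation: group means around the overall mean
    ss_between = sum(
        len(group) * (mean - overall_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within Group (Error) Variation: scores around their own group mean
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three groups of three subjects
example_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_statistic = one_way_anova_f(example_groups)
print(f"F = {f_statistic:.2f} with df = (2, 6)")
```

The F statistic is then compared with the F distribution for those degrees of freedom to obtain the p-value; statistical software does this step automatically.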

Page 38: Statistics for Anaesthesiologists

ANOVA (ANALYSIS OF VARIANCE) METHOD

Page 39: Statistics for Anaesthesiologists

TWO-WAY ANOVA DESIGN

Treatment/Condition   Levels (Independent Variable)
(Independent)         Group1    Group2    Group3

CONDITION1            S1 DV     S6 DV     S11 DV
                      S2 DV     S7 DV     S12 DV
                      S3 DV     S8 DV     S13 DV
                      S4 DV     S9 DV     S14 DV
                      S5 DV     S10 DV    S15 DV

CONDITION2            S16 DV    S21 DV    S26 DV
                      S17 DV    S22 DV    S27 DV
                      S18 DV    S23 DV    S28 DV
                      S19 DV    S24 DV    S29 DV
                      S20 DV    S25 DV    S30 DV

Page 40: Statistics for Anaesthesiologists

ANCOVA (ANALYSIS OF COVARIANCE)

• An extension of the one-way ANOVA used to determine whether there are any significant differences between the means of two or more independent (unrelated) groups (specifically, the adjusted means) by adjusting for a third or confounding variable

• The third variable (known as a "covariate" or "confounding variable") is one that may be affecting the results of the ANOVA and that you want to "statistically control"

• Within each of the groups we can compute the correlation coefficient between the third variable and the dependent variable

Page 41: Statistics for Anaesthesiologists

REPEATED MEASURES ANOVA

• A repeated measures ANOVA is used when you have a single group on which you have measured something a few times

• For example, you may have a test of students' understanding of a class. You give this test at the beginning of the topic, at the end of the topic and then at the end of the subject

• You would use a one-way repeated measures ANOVA to see if student performance on the test changed over time

Page 42: Statistics for Anaesthesiologists

REPEATED MEASURES ANOVA

• Repeated measures ANOVA is the equivalent of the one-way ANOVA, but for related, not independent groups, and is the extension of the dependent t-test

• A repeated measures ANOVA is also referred to as a within-subjects ANOVA or ANOVA for correlated samples

• The major advantage with running a repeated measures ANOVA over an independent ANOVA is that the test is generally much more powerful. This particular advantage is achieved by the reduction in variability (due to differences between subjects) during the performance of the test
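The gain in power can be seen by computing F both ways on the same numbers. The sketch below is an illustration only: it assumes a complete design (every subject measured at every time point), uses made-up scores, and the function name is hypothetical. The repeated measures version removes the between-subject variability from the error term, while the independent-groups version leaves it in.

```python
from statistics import fmean


def repeated_vs_independent_f(scores):
    """Return (F repeated measures, F independent groups).

    scores: one row per subject, one column per time point.
    """
    n_subjects, n_times = len(scores), len(scores[0])
    grand_mean = fmean([x for row in scores for x in row])
    time_means = [fmean([row[j] for row in scores]) for j in range(n_times)]
    subject_means = [fmean(row) for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_time = n_subjects * sum((m - grand_mean) ** 2 for m in time_means)
    ss_subjects = n_times * sum((m - grand_mean) ** 2 for m in subject_means)

    ms_time = ss_time / (n_times - 1)

    # Repeated measures: subject differences are taken out of the error term
    ss_error = ss_total - ss_time - ss_subjects
    ms_error = ss_error / ((n_subjects - 1) * (n_times - 1))

    # Independent groups: subject differences stay inside the error term
    ms_within = (ss_total - ss_time) / (n_subjects * n_times - n_times)

    return ms_time / ms_error, ms_time / ms_within


# Hypothetical scores: 3 subjects (rows) measured at T1, T2, T3 (columns)
example_scores = [[1, 2, 3], [2, 4, 3], [3, 3, 6]]
f_repeated, f_independent = repeated_vs_independent_f(example_scores)
print(f"F (repeated measures)  = {f_repeated:.2f}")
print(f"F (independent groups) = {f_independent:.2f}")
```

On these numbers the repeated measures F is larger because the consistent differences between subjects no longer inflate the error term; note that the error degrees of freedom are also smaller, so the two F values are judged against different critical values.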

Page 44: Statistics for Anaesthesiologists

REPEATED MEASURES ANOVA

Subjects   Time/Condition (Independent Variable)
           T1     T2     T3

S1         S1     S1     S1
S2         S2     S2     S2
S3         S3     S3     S3
S4         S4     S4     S4
S5         S5     S5     S5

Page 45: Statistics for Anaesthesiologists

TWO-WAY ANOVA REPEATED MEASURES

Factor          Subjects   Time/Condition (Independent Variable)
(Independent)              T1     T2     T3

GROUP1          S1         S1     S1     S1
                S2         S2     S2     S2
                S3         S3     S3     S3
                S4         S4     S4     S4
                S5         S5     S5     S5

GROUP2          S6         S6     S6     S6
                S7         S7     S7     S7
                S8         S8     S8     S8
                S9         S9     S9     S9
                S10        S10    S10    S10

Page 46: Statistics for Anaesthesiologists

Variable type & CHOOSING A Test

Explanatory Variable   Response Variable   Methods
Categorical            Categorical         Contingency Tables
Categorical            Quantitative        ANOVA
Quantitative           Quantitative        Regression

Page 47: Statistics for Anaesthesiologists

ANOVA – WHY NOT JUST USE t-TESTS?

• Multiple t-tests are not the answer because, as the number of groups grows, the number of pairwise comparisons needed grows quickly. For example, 7 groups give 21 pairs. If we test 21 pairs we should not be surprised to observe things that happen only 5% of the time. Thus, among 21 pairings, a p-value = 0.05 for one pair cannot be considered significant.

• Our level of significance α has to be divided by the number of comparisons (Ex: for the above it becomes α/21, the Bonferroni correction)

• ANOVA puts all the data into one number (F) and gives us one p-value for the null hypothesis.
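The arithmetic behind the 7-group example can be worked through directly. The familywise error rate below assumes the 21 tests are independent, which pairwise comparisons on shared groups are not exactly, so it should be read as an approximation; the variable names are made up for this sketch.

```python
from math import comb

groups = 7
alpha = 0.05

# Number of distinct pairs among 7 groups: 7 choose 2
pairs = comb(groups, 2)

# Chance of at least one false positive across all pairwise tests,
# assuming the tests are independent (an approximation)
familywise_error = 1 - (1 - alpha) ** pairs

# Per-comparison significance level after dividing alpha by the number of tests
bonferroni_alpha = alpha / pairs

print(f"pairwise comparisons: {pairs}")
print(f"chance of at least one false positive: {familywise_error:.2f}")
print(f"per-test alpha after correction: {bonferroni_alpha:.4f}")
```

With 21 uncorrected tests the chance of at least one spurious "significant" pair is roughly two in three, which is the motivation for running a single omnibus F-test first.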

Page 48: Statistics for Anaesthesiologists

ANOVA – WHY NOT JUST USE t-TESTS?

From eBook: Research skills for Psychology Majors by William Gabrenya

Page 49: Statistics for Anaesthesiologists

Likert ITEM & LIKERT Scale

Page 50: Statistics for Anaesthesiologists

Likert ITEM & LIKERT Scale

• Likert scale consists of multiple Likert-type items

• Likert-type scales (such as "On a scale of 1 to 10, with one being no pain and ten being high pain, how much pain are you in today?")

• Represent ordinal data (order, rank, but no real distance)

Page 51: Statistics for Anaesthesiologists

Likert ITEM & LIKERT Scale

• Fundamentally, these scales do not represent a measurable quantity

• An individual may respond 8 and be in less pain than someone else who responded 5

• A person may not be in exactly half as much pain if they responded 4 than if they responded 8

• Visual Analog Scale is a Likert scale but often (wrongly) analyzed as if it were continuous data

Page 52: Statistics for Anaesthesiologists

COMPOSITE SCORE & LIKERT Scale

• Composite scores combine multiple Likert item scales into a single scale

• Composite scores must first be analyzed for internal consistency and inter-item correlation for each item and reported (ex: using Cronbach’s alpha – scale reliability analysis)

• These scores represent ordinal data so must use non-parametric tests and descriptives

Page 53: Statistics for Anaesthesiologists

Cronbach’s Alpha For scales

• Check for internal consistency and overall validity of a multiple Likert-type item scale

• Check α again with each item deleted in turn

• Based on the number of items and a comparison of the item variances with the variance of the total score

Page 54: Statistics for Anaesthesiologists

Cronbach’s Alpha For scales

• Values of α normally range from 0 to 1 (a negative value is possible and signals a problem with the scale)

• Ideally the overall α and the α for each item (when deleted from the scale) must be > 0.7 to 0.8

• Clinical scores need a higher α, > 0.8 to 0.9 (Bland and Altman)
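Cronbach's alpha can be sketched from its usual formula, α = k/(k−1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The code below is a minimal illustration with the Python standard library; the function names and the five respondents' answers are made up, and a real analysis would normally use a statistics package.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, with respondents in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn (needs 3+ items)."""
    return [
        cronbach_alpha(items[:i] + items[i + 1:])
        for i in range(len(items))
    ]


# Hypothetical answers of 5 respondents to 3 Likert-type items (1 to 5)
example_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 1, 3],
]
print(f"overall alpha = {cronbach_alpha(example_items):.2f}")
for index, value in enumerate(alpha_if_item_deleted(example_items), start=1):
    print(f"alpha without item {index} = {value:.2f}")
```

If deleting an item raises α noticeably, that item is a candidate for review because it agrees poorly with the rest of the scale.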

Page 55: Statistics for Anaesthesiologists

Power analysis & effect size

• To calculate sample size (n) we must know the type of statistical test involved in our primary outcome measure

• We must also know:

• Desired α error (usually taken as 0.05)

• Power (1-β) usually taken 0.8 (80%) or greater

• Two or one-tailed comparison

• Effect size

Page 56: Statistics for Anaesthesiologists

Power analysis & effect size

• Power is the fraction of experiments that you expect to yield a "statistically significant" p-value when the effect is real (with 80% power, 80% of such experiments are expected to yield a significant p-value)

• Effect size (Cohen's d for means) depends on the study design; it is calculated from pilot-study or reference-study data

• Effect size should reflect a clinically defined level of significance (ex: more than 20% difference between 2 groups, expressed as a difference in proportions or in mean ± SD data)
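For two independent groups, Cohen's d is the difference between the group means divided by the pooled standard deviation. The sketch below shows that calculation with the Python standard library; the function name and the small "pilot" data sets are made up for illustration.

```python
from math import sqrt
from statistics import fmean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (fmean(group_a) - fmean(group_b)) / sqrt(pooled_variance)


# Hypothetical pilot measurements in two groups
pilot_a = [2, 4, 6]
pilot_b = [1, 3, 5]
print(f"Cohen's d = {cohens_d(pilot_a, pilot_b):.2f}")
```

Here the means differ by 1 unit and the pooled SD is 2 units, giving d = 0.5; this value is what a power analysis tool such as G*Power asks for as the effect size.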

Page 57: Statistics for Anaesthesiologists

Power analysis & effect size

• Cohen's d is usually calculated from pilot studies, but if the effect size is unknown, Jacob Cohen provided 3 rule-of-thumb effect sizes (the values vary slightly for different statistical tests):

1. Small effect, d around 0.2 (requires large sample sizes)

2. Medium effect, d around 0.5 (seen with careful observation; use when in doubt)

3. Large effect, d greater than 0.8 (so large that it is obvious)

• Using d in this way has been criticized as "T-shirt" effect sizes
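To see how strongly the chosen effect size drives the sample size, the sketch below applies the common normal-approximation formula for comparing two independent means, n per group ≈ 2 × (z for α + z for power)² / d². This is an approximation under stated assumptions (equal group sizes, two independent groups), and the function name and defaults are hypothetical; G*Power works from the t distribution and typically reports slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate sample size per group for comparing two independent means."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_power = z(power)  # power = 1 - beta
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label:6s} effect (d = {d}): about {n_per_group(d)} per group")
```

Because n grows with 1/d², halving the expected effect size roughly quadruples the required sample, which is why a small effect demands a large study.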

Page 58: Statistics for Anaesthesiologists

Power analysis & effect size

• Calculation of the required sample size, with a set target for power, before starting the final study is called a priori analysis (before the fact) – the accepted method, especially important to avoid being incorrectly "blind" to a real difference in a negative study (due to a large β error)

• Calculation of the achieved power at the end of the final study is called post hoc analysis (after the fact) – incorrect, as the computed power is a simple reflection of the p-value!

• G*Power software is a free useful resource
