study design and simple statistics 17 th feb 2005 kath bennett

Study design and simple statistics

17th Feb 2005

Kath Bennett

Overview

• Overview of research methods, study design.

• Some common statistical definitions.

Research

Basic research: lab, biochemical, genetic

Epidemiology: distribution and determinants of disease in a population

Clinical: deals with patients with a particular disease

Research

• Clear aims and objectives from the start – hypothesis

• Design the study to be able to address the objectives set out

• Collect complete and accurate data

• Enter and analyse data

• Interpret the data in light of available evidence

• Publish

Types of Clinical Research

Quantitative Qualitative

Types of clinical studies

Quantitative

Observational (epidemiological): cohort, case-control, cross-sectional, case reports

Experimental (interventional), “clinical trials”: randomised controlled trial, open studies, pilot study, large simplified trial

Observational versus Experimental Research

• Observational research is seen as complementary to experimental research:

• An intervention producing a large impact can be shown using observational studies.

• Infrequent adverse events require large numbers, which is impractical in RCTs.

• Follow-up can be longer term than in RCTs.

• Can establish clinical uncertainty, providing evidence for RCTs.

• Useful when it is impractical or unethical to do an RCT.

Comparison of random and non-random studies

HRT and coronary heart disease. Evidence from observational studies and recently published RCT (Lancet 2002)

Relative risk

Observational studies: 0.5-0.75

RCT: 1.29

Quantitative Methods

Advantages

• ‘Objective’ assessment

• Can sample large numbers (cost!)

• Can assess prevalence

• Repeatable results (consistency)

Quantitative Methods

Disadvantages

• Way in which questions are generated
– Researcher decides limits and imposes structure
– Little opportunity to detect “unexpected” new outcomes

• Sources of bias
– Lack of explanatory power
– Limited ability to describe context

Types of clinical studies

Qualitative

Focus group discussions

In-depth interviewing

Observation

Documentary

Primary versus Secondary Research

Primary (original research focused on patients or populations): clinical trials, surveys, cohort studies

Secondary (reanalysis of previously gathered data): systematic reviews, meta-analyses, economic analyses

Clinical trials

• Importance for ventures into clinical research

Principles required

• Appropriate design

• Randomisation

• Blinding

• Study power or sample size

Randomised Controlled Trial - RCT

QUESTION: Treatment (efficacy, safety comparison etc.)

PREFERRED DESIGN: R.C.T. (randomised controlled trial)

Clinical trial design

• Parallel group trials
– RANDOMISED: patients randomly allocated to either one treatment or another
– NON-RANDOMISED: patients not randomly allocated to treatment

• Factorial design
– Patients may receive none, one or more than one of several interventions

• Cross-over trials
– Patients receive one treatment followed by another. Takes longer, but comparisons are within-subject, so there is less variability and results are more precise (fewer patients required)

Randomised parallel group design

Participants satisfying entry criteria

Randomly allocated to receive A or B

A B

Participants followed up exactly the same way

Example: Digoxin vs Placebo – DIG study

Factorial design

Participants satisfying entry criteria

Participants randomly allocated to one of four groups. 2x2 factorial design

Example: Heart Protection Study (simvastatin, vitamins, placebo)

MRC/BHF Heart Protection Study

2x2 factorial treatment comparisons. Randomised to either:

• Simvastatin (40 mg daily) vs placebo tablets

• Vitamins (600 mg E, 250 mg C & 20 mg beta-carotene) vs placebo capsules

Planned mean duration: at least 5 years

Two-period, two-treatment cross-over trial

Participants satisfying entry criteria – sometimes followed by run-in period

Randomised to A followed by B or vice-versa

Sequence 1: A then B

Sequence 2: B then A

Usually ‘washout’ in between

Example: Aspergesic (A) vs ibuprofen (B) in rheumatoid arthritis.

RELIABILITY: to obtain evidence as reliable as possible

CHANCE EFFECTS (random error) versus SYSTEMATIC BIASES (systematic error)

• Minimise chance effects (random error) by
– Increasing the number of patients studied (do large trials and reviews of trials)

• Minimise systematic biases (systematic error) by
– Using an appropriate method of allocation (randomisation)
– Ensuring investigator and/or subject are unaware of treatment allocation (blinding)
– Basing the analyses on the allocated treatment (intention-to-treat)
– Including all relevant evidence (systematic review of similar trials)

Randomisation

• Clinical trials, and any studies, need to avoid bias
– By doctor, e.g. treatment preferences
– By individual patient
– By choice of design

• Randomisation avoids bias by removing choice of treatment by doctor or patient

• Randomisation is not always possible for practical or ethical reasons, leading to a controlled clinical trial (treated group compared directly with non-treated group)
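The slides do not say how the random allocation is generated. As one possible illustration only, the sketch below uses permuted blocks, a common scheme that keeps the two arms balanced; the function name, block size and seed are assumptions made for this example, not part of the original material.

```python
import random


def block_randomise(n_participants, block_size=4, seed=2005):
    """Allocate participants to treatment A or B in randomly permuted blocks.

    Hypothetical helper for illustration: each block holds equal numbers
    of A and B, so the arms stay balanced as recruitment proceeds.
    """
    if block_size % 2:
        raise ValueError("block_size must be even for two equal arms")
    rng = random.Random(seed)  # fixed seed so the list can be reproduced
    allocations = []
    while len(allocations) < n_participants:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        allocations.extend(block)
    return allocations[:n_participants]


allocations = block_randomise(20)
print(allocations)
print(allocations.count("A"), allocations.count("B"))
```

In practice the list would be prepared and held by someone independent of recruitment, so that neither doctor nor patient can foresee or choose the next allocation.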

Blinding

• Avoidance of bias in subjective assessment (e.g. pain, frequency of side effects) is achieved through blinding

• Double blind (masked) trials – when both patients & investigators are not aware of which treatment group has been assigned

• Single blind (masked) trials – when only the study participant is not aware of the treatment group assigned to them

• ‘Placebo’ is also useful in avoiding bias

Intention to treat (ITT)

• Intention of randomisation is to establish similar groups of patients in each arm

• Problems arise when non-adherence may be related to outcome or prognosis, leading to biased representation

• ITT analyses all patients according to randomised treatment irrespective of protocol violations etc.

• However, it does not solve all problems

Number of patients required – sample size

• Requirement for well-designed studies

• Most journals now require sample size calculations

• Reassurance that money is well spent – likelihood that the study will give unequivocal results

• Requirement of regulatory authorities, e.g. FDA

• A low sample size can be a reason for not recognising that one treatment is superior

• Unethical to perform a study if numbers are too small to detect a useful difference

What is “power” of a study?

• “the ability to detect a true difference of clinical importance” (Doug Altman)

• “the confidence with which the investigator can claim that a specified treatment benefit has not been overlooked” (Sheila Gore)

Estimating sample size and power

• Identify a single major outcome measure – primary endpoint
– Survival, response rate, quality of life

• Specify size of difference required to detect
– Improvement in response from 20% to 30%

• ‘We want to be reasonably certain of detecting such a difference if it really exists’
– ‘detecting a difference’ refers to P<0.05
– ‘reasonably certain’ refers to having a chance of at least 80% of obtaining such a P value
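To make these ingredients concrete, here is a minimal sketch for the 20% to 30% response example, assuming the usual normal-approximation formula for comparing two proportions with a two-sided test. The function name is made up for this illustration, and published tables or software may give slightly different figures.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect p1 vs p2.

    Assumes a two-sided test and the simple normal approximation
    n = (z_alpha + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for P<0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


n_response = n_per_group_two_proportions(0.20, 0.30)
print(n_response)  # roughly 290 patients per group
```

Under these assumptions a 10 percentage point improvement needs close to 300 patients in each arm, which shows how quickly numbers grow when the difference sought is modest.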

Methods to calculate sample size

• Equations
– Mathematical equations available for computing sample size given the effect size $\Delta$, significance level $\alpha$ and power $(1-\beta)$

• Tables
– Based on equations above

• Nomogram
– Summarises figures in a graph, easy to use

• Computer packages

Example

• Objective: to compare effect of drug A vs drug B using blood pressure as outcome measure

• Design: RCT – half to drug A, half to drug B

• Require 80% power, and significance level set at 5%

• Expected mean difference between the two groups = 6

• Pooled standard deviation SD = 10

• $\Delta$ = difference in means/SD (effect size) = 6/10 = 0.6

• From tables n=45 per group
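The tabulated figure can be roughly cross-checked with the standard normal-approximation formula for comparing two means, n per group = 2(z_alpha + z_beta)^2 / (effect size)^2. This is a sketch under that assumption, with a function name invented for the example; it is not the method used to build the tables.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(effect_size, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of means.

    effect_size is the difference in means divided by the pooled SD.
    Uses the normal approximation, so it slightly understates the
    figure obtained from t-distribution based tables.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


n_bp = n_per_group_two_means(6 / 10)
print(n_bp)  # 44 per group with this approximation
```

The approximation gives 44 per group, in line with the n=45 quoted from tables, which typically include a small correction for using the t distribution.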

Common statistical definitions

Classification of data

• Different types of data
– Nominal / categorical: used in classification (e.g. blood groups; also female / male)
– Ordinal: ordered categorical data (e.g. non-smoker, <10 a day, 10-20 a day, >20 a day)
– Interval / continuous data (e.g. age, birthweight, plasma K levels)

Graphical presentations

BAR CHARTS

• Bar charts are used to show (graphically) frequency distributions for categorical data.

• The height of each ‘bar’ in the bar chart is proportional to the number of observations or frequency of the observations in each category.

[Figure: Bar chart of blood groups. x-axis: blood group (O, B, AB, A); y-axis: number of patients]

Histograms

• Similar to bar charts but for continuous (interval) data

• The width of the bars varies only when the data intervals vary in width.

• Boundaries of histogram ‘bars’ are taken as half way between the upper limit of the lower group and the lower limit of the upper group.

[Figure: Histogram of pre-operative haemoglobin. x-axis: pre-operative % haemoglobin (30 to 100); y-axis: frequency (number of patients). Mean = 61.3, Std. Dev = 14.40, N = 45]

The Normal distribution

• An important distribution in statistics
– used for continuous data
– bell-shaped curve
– symmetric about the mean (or median)

[Figure: Standard Normal curve. 95% of the area lies between -1.96 and 1.96, with 2.5% in each tail]
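The 95% and 2.5% areas on the standard Normal curve can be checked numerically. The short sketch below uses Python's standard library; the variable names are chosen for this example only.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between -1.96 and +1.96
central_area = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Area in the lower tail, below -1.96 (the upper tail is the same by symmetry)
lower_tail = standard_normal.cdf(-1.96)

print(f"central area: {central_area:.3f}")
print(f"each tail:    {lower_tail:.3f}")
```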

Measures of location

• Gives an idea of the ‘average’ value on a particular scale

Common measures are:
– Mean: sum of observations / number of observations
– Median: middle value of the sample when arranged in order
– Mode: most common value (used when only a few different values)
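As a small illustration of the three measures, the sketch below applies them to a made-up sample (the values are hypothetical, not data from the lecture). The single large value pulls the mean up but leaves the median almost untouched.

```python
from statistics import mean, median, mode

# Hypothetical sample, e.g. length of hospital stay in days
stay_days = [4, 5, 5, 6, 7, 8, 21]

sample_mean = mean(stay_days)      # sum of observations / number of observations
sample_median = median(stay_days)  # middle value when arranged in order
sample_mode = mode(stay_days)      # most common value

print(sample_mean, sample_median, sample_mode)  # 8 6 5
```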

Variation

• Humans differ in response to exposure to adverse effects

• Humans differ in response to treatment

• Humans differ in disease symptoms

• Diagnosis and treatment are often probabilistically based

Measures of variation

• Gives an idea of the spread or variability of the data

• Common measures are:
– Range
– Quartiles: the ‘inter-quartile range’ is the difference between the 25th and 75th centiles
– Sample variance: $\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

Measures of dispersion (contd.)

The standard deviation ($\sigma$) is the square root of the variance.

– Standard error (if repeated samples were taken, the standard deviation of the means from each sample)

• $SE(\text{Mean}) = \sigma / \sqrt{n}$
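These measures of spread can be computed directly. The sketch below uses a made-up sample (hypothetical values) and the standard library, where `variance` and `stdev` use the n-1 denominator of the sample variance.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical sample of eight measurements
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)    # range
sample_variance = variance(values)         # sum of squared deviations / (n - 1)
sample_sd = stdev(values)                  # square root of the variance
standard_error = sample_sd / sqrt(len(values))  # SD of the mean over repeated samples

print(f"range    = {value_range}")
print(f"variance = {sample_variance:.2f}")
print(f"SD       = {sample_sd:.2f}")
print(f"SE(mean) = {standard_error:.2f}")
```

The standard error is smaller than the standard deviation because it describes the uncertainty in the mean, not the spread of individual observations.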

Confidence intervals

• Over-emphasis on hypothesis testing and p-values.

• The size and range of the difference between two groups is more informative than whether it is statistically significant or not.

• Confidence intervals, if appropriate to the type of study, should be used for major findings in both main text and abstract.

Confidence intervals

• If a CI is constructed, the significance of a hypothesis test can be inferred from it.

• For example, a 95% CI for the difference of two means that contains 0 would imply that the difference between the means is non-significant at the 5% level.

Systolic blood pressure in 100 diabetic and 100 non-diabetic men

[Figure: Histograms of systolic blood pressure (mm Hg) for diabetics (sample mean 146.4) and non-diabetics (sample mean 140.4)]

Difference between sample means = 6 mm Hg.

Systolic blood pressure in 100 men with diabetes and 100 men without

• Difference of 6.0 mm Hg found between mean systolic blood pressures, standard error 2.5 mm Hg.

• 95% confidence interval for population difference is from 1.1 to 10.9 mm Hg.

• Loosely, we can be 95% confident that the indicated range includes the ‘true’ population difference in mean blood pressure: if the study were repeated many times, about 95% of intervals built this way would contain it.
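The interval quoted for the blood pressure example can be reproduced as estimate ± 1.96 × standard error. The sketch below assumes a Normal approximation, which is reasonable with 100 men per group; the function name is illustrative only.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation confidence interval for an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return estimate - z * standard_error, estimate + z * standard_error


lower, upper = confidence_interval(6.0, 2.5)
print(f"{lower:.1f} to {upper:.1f} mm Hg")  # 1.1 to 10.9 mm Hg
```

Calling `confidence_interval(6.0, 2.5, level=0.99)` gives a wider interval, since more confidence requires a wider range.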

What affects the width of a CI?

• The sample size: the width shrinks in proportion to $1/\sqrt{n}$, so a smaller sample size leads to lower precision.

• Variability of data: the less variable the data, the more precise the estimate.

• Degree of confidence. 95% is most commonly used. If greater or less confidence is required, the CIs become wider or narrower respectively.

P-values and CIs

• One can infer from CIs whether there is a statistically significant difference, but not vice versa.

• Example, difference in BP between diabetics and non-diabetics found to be 6mm Hg. 95% confidence interval for population difference is from 1.1 to 10.9 mm Hg.

• The interval does not contain ‘0’ so we can infer that there is a statistically significant difference between the groups. In fact, the p-value from an independent t-test was p=0.02.
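As a rough check on the quoted p-value, the sketch below uses a Normal approximation with the difference of 6.0 mm Hg and standard error of 2.5 mm Hg given earlier. The slides report an independent t-test, so this is an approximation under stated assumptions, not a reproduction of that analysis; the function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(estimate, standard_error):
    """Two-sided p-value for 'no difference', by Normal approximation."""
    z = abs(estimate / standard_error)        # test statistic, here 2.4
    return 2 * (1 - NormalDist().cdf(z))      # area in both tails


p_value = two_sided_p_value(6.0, 2.5)
print(f"p = {p_value:.2f}")  # p = 0.02
```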

Probability

• Probability and statistical tests
– Statistical tests are used to assess the weight of evidence and to estimate the probability that the data arose by chance
– Presented as a ‘p value’, usually p<0.05, i.e. the observed difference would be expected to have arisen by chance less than 5% of the time, or p<0.001, less than 0.1% of the time
– 5% or 1% is known as the significance level of the test, or alpha ($\alpha$)

Effect on significance

• ‘Non-significance’
– Indicates insufficient weight of evidence
– Does not mean ‘no clinically important difference between groups’
– If power of test is low (i.e. sample size too small), all one can conclude is that the question of difference between groups is unresolved

• Confidence intervals show, more informatively, the impact of sample size upon precision of a difference

Reporting p-values

P value Wording Summary

>0.05 Not significant ns

0.01 to 0.05 Significant *

0.001 to 0.01 Very significant **

< 0.001 Extremely significant ***

Report the actual p-value

Measuring effectiveness

Risk

PROPORTION

A ratio where the numerator (top) is part of the denominator (bottom).

RISK

Number of subjects in a group who have an event divided by the total number of subjects in the group. It is the probability (proportion) of having an event in that group (P). It is called incidence when expressed per unit time.

RELATIVE RISK (RR)

Ratio of risk in exposed group to risk in not exposed group (P1/P2)

Example

Type of vaccine   Got influenza   Avoided influenza   Total
I                 43              237                 280
II (Control)      52              198                 250

Risk of disease in Vaccine Group I = 43/280=0.154

Risk of disease in Vaccine Group II=52/250=0.208

Relative Risk (Risk Ratio) =0.154/0.208 =0.74

Odds

ODDS

Probability of developing disease divided by probability of not developing disease. P/ (1-P)

Often expressed as number of times something expected not to happen: number of times something expected to happen.

ODDS RATIO (OR)

Ratio of odds for exposed group divided by odds for not exposed group.

{P1/(1-P1)}/{P2/(1-P2)}

Odds ratios are often treated as approximations to relative risks, especially when events are rare, and they emerge naturally in some types of studies (case-control studies).

Example

Odds of disease in Vaccine Group I = 0.154/(1-0.154)=0.182

Odds of disease in Vaccine Group II= 0.208/(1-0.208)=0.263

Odds ratio of getting disease in Group I relative to Group II=0.182/0.263=0.69 (close to relative risk of 0.74)
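The vaccine example can be worked through in a few lines. This sketch uses the counts from the table in the slides; the helper names are made up for the illustration.

```python
def risk(events, total):
    """Proportion of subjects in a group who have the event."""
    return events / total


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


risk_vaccine_1 = risk(43, 280)  # Vaccine Group I
risk_vaccine_2 = risk(52, 250)  # Vaccine Group II (control)

relative_risk = risk_vaccine_1 / risk_vaccine_2
odds_ratio = odds(risk_vaccine_1) / odds(risk_vaccine_2)

print(f"RR = {relative_risk:.2f}")  # RR = 0.74
print(f"OR = {odds_ratio:.2f}")     # OR = 0.69
```

The odds ratio lies a little further from 1 than the relative risk because influenza was not rare in this example (15% to 21% of each group).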

Absolute risk reduction

Absolute risk reduction (ARR)

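Risk in control group minus risk in treated group

ARR = p2 - p1 (p1 is the risk in the treated group and p2 the risk in the control group, as for the relative risk)

Number needed to treat (NNT) = 1/ARR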

This is the number you would need to treat under each of two treatments to get one extra person cured under the new treatment

Example

Absolute risk reduction for vaccine I = 0.208 - 0.154 = 0.054

NNT = 1/0.054 = 18.5

Thus on average one would have to give vaccine I to 19 patients to expect one extra patient to be protected from influenza compared with vaccine II.
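The same calculation from the raw counts, as a brief sketch (variable names are illustrative). The absolute risk reduction is taken as control risk minus treated risk, matching the worked example.

```python
from math import ceil

risk_vaccine_1 = 43 / 280  # risk of influenza with vaccine I
risk_vaccine_2 = 52 / 250  # risk of influenza with vaccine II (control)

absolute_risk_reduction = risk_vaccine_2 - risk_vaccine_1
number_needed_to_treat = 1 / absolute_risk_reduction

# Round up: a fraction of a patient cannot be treated
patients_to_vaccinate = ceil(number_needed_to_treat)

print(f"ARR = {absolute_risk_reduction:.3f}")
print(f"NNT = {number_needed_to_treat:.1f}, i.e. {patients_to_vaccinate} patients")
```

Using unrounded risks the NNT comes out at about 18.4 where the slide shows 18.5 from rounded figures; both round up to 19 patients.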

Summary

• Have clear objectives and aims to study

• Choose the study design that best addresses these aims

• Use randomisation, blinding etc. where appropriate

• Make sure sufficient numbers of individuals studied to be able to reliably answer the question.

Useful statistical references

• M Bland. An Introduction to Medical Statistics.

• MJ Campbell and D Machin. Medical Statistics: a commonsense approach. Wiley, 1993.

• DG Altman. Practical statistics for medical research. London: Chapman & Hall, 1991.

• DS Moore and GP McCabe. Introduction to the practice of statistics. 3rd Edition. New York: WH Freeman and Company, 1999.
