Research Skills Basic understanding of P values and Confidence limits CHE Level 5 March 2014 Sian Moss



Research Skills

Basic understanding of P values and Confidence limits

CHE Level 5
March 2014
Sian Moss


Effect Size

• Valuable for quantifying the effectiveness of a particular intervention, relative to some comparison

• Allows us to move beyond the simplistic 'Does it work or not?' to the far more sophisticated 'How well does it work in a range of contexts?'

• Emphasises the size of the difference rather than confounding it with sample size


Why measure size of effect?

• Generating p-values depends essentially on two things: the size of the effect and the size of the sample

• A 'significant' result is seen either if the effect is very big (despite having only a small sample) or if the sample is very big (even if the actual effect size is tiny)

• It is important to know the statistical significance of a result, since without it there is a danger of drawing firm conclusions from studies where the sample is too small to justify such confidence

• However, statistical significance does not tell you the most important thing: the size of the effect. One way to overcome this confusion is to report the effect size, together with an estimate of its likely 'margin for error' or 'confidence interval'.
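To make the point about effect size and sample size concrete, here is a minimal sketch using a pooled two-proportion z-test (a normal approximation chosen for illustration, not a method named in these slides). The function name and all trial counts are hypothetical.

```python
import math

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: the same tiny effect (10% vs 11% risk) at two sample sizes
tiny_effect_small_sample = two_proportion_p_value(10, 100, 11, 100)
tiny_effect_large_sample = two_proportion_p_value(10_000, 100_000, 11_000, 100_000)

# Hypothetical data: a big effect (10% vs 40% risk) in a small sample
big_effect_small_sample = two_proportion_p_value(5, 50, 20, 50)

print(tiny_effect_small_sample, tiny_effect_large_sample, big_effect_small_sample)
```

The tiny effect is non-significant in the small sample but highly significant in the very large one, while the big effect reaches significance even with 50 patients per arm. The p-value alone does not say which situation applies.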


Measuring Effect Size

• Relative risk reduction (RRR)
• Absolute risk reduction (ARR)
• Number needed to treat (NNT)

Relative measures tend to emphasise potential benefits.
Absolute measures provide an across-the-board summary.

Either may be appropriate, subject to correct interpretation.
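The contrast between relative and absolute measures can be sketched with two hypothetical patient groups that share the same relative risk reduction but start from different baseline risks. The function name and the risks are illustrative assumptions. Rounding NNT up to the next whole patient is a common convention assumed here, since the slides only say it is rounded to a whole number.

```python
import math
from fractions import Fraction

def absolute_and_relative(control_risk, treated_risk):
    """Return (ARR, RRR, NNT) for risks given as exact fractions."""
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    # Assumption: NNT is rounded up to the next whole patient
    nnt = math.ceil(1 / arr) if arr > 0 else math.inf
    return arr, rrr, nnt

# Hypothetical high-risk group: 20% risk on control, 15% on treatment
high_risk = absolute_and_relative(Fraction(20, 100), Fraction(15, 100))

# Hypothetical low-risk group: 2% risk on control, 1.5% on treatment
low_risk = absolute_and_relative(Fraction(2, 100), Fraction(15, 1000))

for label, (arr, rrr, nnt) in [("high risk", high_risk), ("low risk", low_risk)]:
    print(f"{label}: ARR = {float(arr):.1%}, RRR = {float(rrr):.0%}, NNT = {nnt}")
```

Both groups show an RRR of 25%, yet the absolute benefit and the number needed to treat differ tenfold. This is why the relative measure on its own tends to emphasise potential benefit.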


Table 1 Summary of effect measures

| Measure of effect | Abbreviation | Definition | No effect | Total success |
|---|---|---|---|---|
| Absolute risk reduction | ARR | Absolute change in risk: risk of event in control group − risk of event in Tx group | ARR = 0% | ARR = initial risk |
| Relative risk reduction | RRR | Proportion of risk removed by Tx: ARR / initial risk in control group | RRR = 0% | RRR = 100% |
| Relative risk | RR | Risk of event in Tx group / risk of event in control group (expressed as a decimal proportion or %) | RR = 1 or RR = 100% | RR = 0 |
| Odds ratio | OR | Odds of an event in Tx group / odds of event in control group (expressed as a decimal proportion) | OR = 1 | OR = 0 |
| Number needed to treat | NNT | Number of Px needed to be treated to prevent one event (reciprocal of ARR, usually rounded to a whole number) | NNT = infinity | NNT = 1 / initial risk |

Davies and Crombie 2009
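The definitions in Table 1 can be checked with a short sketch that computes every measure from the event counts in each group. The function name and the counts are hypothetical, and NNT is left unrounded so that the 'no effect' and 'total success' columns can be verified directly.

```python
import math

def effect_measures(events_tx, n_tx, events_ctrl, n_ctrl):
    """Compute the Table 1 measures from event counts.

    Assumes neither group has events in 100% of patients (odds would be
    undefined) and that the control group has at least one event.
    """
    risk_tx = events_tx / n_tx
    risk_ctrl = events_ctrl / n_ctrl
    arr = risk_ctrl - risk_tx
    odds_tx = events_tx / (n_tx - events_tx)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return {
        "ARR": arr,
        "RRR": arr / risk_ctrl,
        "RR": risk_tx / risk_ctrl,
        "OR": odds_tx / odds_ctrl,
        "NNT": math.inf if arr == 0 else 1 / arr,  # unrounded reciprocal of ARR
    }

# Hypothetical trials with 100 patients per group
partial = effect_measures(15, 100, 20, 100)    # some benefit
no_effect = effect_measures(20, 100, 20, 100)  # identical risks
total = effect_measures(0, 100, 20, 100)       # no events on treatment

for name, result in [("partial", partial), ("no effect", no_effect), ("total", total)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```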


Robustness

Q) How trustworthy are the findings?

Q) Are the findings likely to be true about similar groups of Px?

Q) Has any Tx benefit arisen due to the way the study has been conducted?

Two issues to address

BIAS (Risk of Bias assessments) + CHANCE (Confidence intervals and p-values)


Hypothesis testing and P-values

Assesses whether findings are ‘significantly different’ or not from a reference value (in trials this is usually the value reflecting ‘no effect’)

E.g. a new treatment appears to outperform a standard therapy in a trial. Is this effect likely to be REAL or a chance finding?

Calculating the p-value
1. Assume there is no true difference between the two Tx: the NULL HYPOTHESIS (NH)
2. Calculate how likely it is that a difference at least as large as the one observed would arise by chance if the NH is correct

This is the p-value

The probability of observing a difference at least this large, given that there is really no difference between the treatments!
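The two steps can be sketched with an exact calculation. The example below uses a one-sided Fisher's exact test, which is one possible way to do step 2 and is not a method named in these slides. The trial counts and the function name are hypothetical.

```python
from math import comb

def one_sided_exact_p(improved_new, n_new, improved_std, n_std):
    """Probability of at least `improved_new` improvers in the new-treatment
    group if treatment labels were unrelated to outcome (the null hypothesis)."""
    total_improved = improved_new + improved_std
    total_patients = n_new + n_std
    not_improved = total_patients - total_improved
    # Step 1: under the NH, every way of choosing which n_new patients
    # received the new treatment is equally likely.
    all_allocations = comb(total_patients, n_new)
    # Step 2: count the allocations at least as extreme as the one observed.
    largest_possible = min(n_new, total_improved)
    extreme_allocations = sum(
        comb(total_improved, k) * comb(not_improved, n_new - k)
        for k in range(improved_new, largest_possible + 1)
    )
    return extreme_allocations / all_allocations

# Hypothetical trial: 8 of 10 improve on the new Tx, 3 of 10 on standard therapy
p_value = one_sided_exact_p(8, 10, 3, 10)
print(round(p_value, 4))
```

For these made-up counts the p-value falls below 0.05, so a difference this large would be unusual if the two treatments were truly equivalent.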


Hypothesis testing and P-values

If the p-value is SMALL, the findings are UNLIKELY to have occurred by chance
We REJECT the Null Hypothesis
(The smaller the value, the greater the significance)

If the p-value is LARGE, findings like these could easily have arisen by chance
We CANNOT reject the Null Hypothesis
(The idea that there is no difference between treatments is not rejected, but it is not accepted either)

Convention: SMALL means p ≤ 0.05, i.e. there is no more than a 1 in 20 chance that a difference this large would have arisen by chance if there was really no true difference!

The results are said to be ‘SIGNIFICANTLY DIFFERENT’


Confidence Limits

A measure of treatment effect

• Show the range within which the true treatment effect is likely to lie

• Are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data

• A confidence interval that embraces the value of no difference between treatments indicates that the treatment under investigation is not significantly different from the control


Hypothesis testing and Confidence Intervals

Hypothesis testing produces a decision about observed differences

Confidence Intervals provide a range about the observed effect size

Definition
‘A range of values for treatment effect, constructed so that the range has a specified probability of including the true value of the effect’

The specified probability is called the CONFIDENCE LEVEL; the end points of the interval are the CONFIDENCE LIMITS

The confidence level is usually 95%, which corresponds to a significance level of 0.05
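The 'specified probability' in the definition can be illustrated by simulation: repeat a hypothetical study many times with a known true risk and count how often the 95% interval captures it. This sketch assumes a simple Wald interval for a single proportion (a normal approximation) and a fixed random seed. The function names and every number are illustrative.

```python
import math
import random

def wald_interval(events, n, z=1.96):
    """Approximate 95% confidence interval for a single proportion."""
    p_hat = events / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_risk, n, repeats, seed=0):
    """Fraction of simulated studies whose interval contains the true risk."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Simulate one study of n patients, each with the same true risk
        events = sum(rng.random() < true_risk for _ in range(n))
        low, high = wald_interval(events, n)
        if low <= true_risk <= high:
            hits += 1
    return hits / repeats

observed_coverage = coverage(true_risk=0.3, n=200, repeats=2000)
print(observed_coverage)
```

The observed coverage should come out close to, though not exactly, 0.95, because the interval is an approximation and the simulation is finite.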


Confidence Intervals

At the 95% level, 95% of the time the CI should contain the true value of an effect

If the confidence interval does capture the value reflecting ‘no effect’, this represents a difference that is statistically non-significant

If the confidence interval does not enclose the value reflecting ‘no effect’, this represents a difference that is statistically significant

(for a 95% CI it is significance at the 5% level, corresponding to p<0.05)

In addition, the intervals show the largest and smallest effects that are likely given the observed data

CIs from large studies tend to be narrow, leading to more precision in estimating the size of a real effect. Smaller studies have wider CIs, compatible with a wide range of effect sizes
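A minimal sketch of how study size changes the interval: the same observed risks (14% on treatment, 18% on control) are analysed for a small and a large hypothetical trial. It assumes a normal-approximation interval for the risk difference, for which the 'no effect' value is 0. The function name and counts are made up for illustration.

```python
import math

def risk_difference_ci(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Approximate 95% CI for (control risk - treatment risk)."""
    risk_tx = events_tx / n_tx
    risk_ctrl = events_ctrl / n_ctrl
    difference = risk_ctrl - risk_tx
    se = math.sqrt(risk_tx * (1 - risk_tx) / n_tx
                   + risk_ctrl * (1 - risk_ctrl) / n_ctrl)
    return difference - z * se, difference + z * se

# Hypothetical small study: 100 patients per arm
small_low, small_high = risk_difference_ci(14, 100, 18, 100)

# Hypothetical large study: 5,000 patients per arm, same observed risks
large_low, large_high = risk_difference_ci(700, 5000, 900, 5000)

print(f"small study: {small_low:.3f} to {small_high:.3f}")
print(f"large study: {large_low:.3f} to {large_high:.3f}")
```

The small study's interval is wide and captures 0, so the difference is non-significant. The large study's interval is narrow and excludes 0, even though the observed effect is identical.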


An example of the use of Confidence intervals

Ramipril is an angiotensin-converting enzyme (ACE) inhibitor which has been tested for use in patients at high risk of cardiovascular events. In one study published in the New England Journal of Medicine, a total of 9,297 patients were recruited into a randomised, double-blind, controlled trial. The key findings presented on the primary outcome and deaths are shown below.

Incidence of primary outcome and deaths from any cause

| Outcome | Ramipril (n=4,645), number (%) | Placebo (n=4,652), number (%) | Relative risk (95% CI) |
|---|---|---|---|
| Cardiovascular event (including death) | 651 (14.0) | 826 (17.8) | 0.78 (0.70–0.86) |
| Death from non-cardiovascular cause | 200 (4.3) | 192 (4.1) | 1.03 (0.85–1.26) |
| Death from any cause | 482 (10.4) | 569 (12.2) | 0.84 (0.75–0.95) |

(New England Journal of Medicine, 2000; 342:145-153)


These data indicate that fewer people treated with ramipril suffered a cardiovascular event (14.0%) compared with those in the placebo group (17.8%). This gives a relative risk of 0.78, or a reduction in (relative) risk of 22%. The 95% confidence interval for this estimate of the relative risk runs from 0.70 to 0.86. Two observations can then be made from this confidence interval:

• First, the observed difference is statistically significant at the 5% level, because the interval does not embrace a relative risk of one.

• Second, the observed data are consistent with as much as a 30% reduction in relative risk or as little as a 14% reduction in risk.

Similarly, the last row of the table shows that statistically significant reductions in the overall death rate were recorded: a relative risk of 0.84 with a confidence interval running from 0.75 to 0.95. Thus, the true reduction in deaths may be as much as a quarter or it could be only as little as 5%; however, we are 95% certain that the overall death rate is reduced in the ramipril group.
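The reading of these intervals can be sketched as a small helper that takes a published relative risk with its 95% limits and reports whether the interval excludes 1, plus the corresponding percentage changes in risk. The function name is hypothetical; the figures are the published ones quoted above.

```python
def interpret_relative_risk(rr, lower, upper):
    """Summarise a relative risk and its 95% CI.

    Percentage changes are relative to the control group;
    negative values are reductions in risk.
    """
    def percent_change(value):
        return round((value - 1) * 100, 1)

    return {
        # Significant at the 5% level if the interval does not contain 1
        "significant": not (lower <= 1 <= upper),
        "point_change_pct": percent_change(rr),
        "range_pct": (percent_change(lower), percent_change(upper)),
    }

cardiovascular_event = interpret_relative_risk(0.78, 0.70, 0.86)
death_any_cause = interpret_relative_risk(0.84, 0.75, 0.95)

print(cardiovascular_event)
print(death_any_cause)
```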


Exploring the data presented in the middle row shows an example of how a confidence interval can demonstrate non-significance. There were a few more deaths from non-cardiovascular causes in the ramipril group (200) compared with the placebo group (192).

Because of this, the relative risk is calculated to be 1.03, showing a slight increase in risk in the ramipril group. However, the confidence interval is seen to capture the value of no effect (relative risk = 1), running as it does from 0.85 to 1.26. The observed difference is thus non-significant: the true value could be anything from a 15% reduction in non-cardiovascular deaths for ramipril to a 26% increase in these deaths.

Not only do we know that the result is not significant, but we can also see how large or small a true difference might plausibly be, given these data.
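As a rough check, a crude relative risk and an approximate 95% interval can be computed directly from the counts in the table. This sketch assumes the standard log-scale normal approximation for a ratio of two proportions. The published figures come from the trial's own analysis, so the crude values should be close to them but need not match exactly. The function name is hypothetical.

```python
import math

def relative_risk_ci(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Crude relative risk with an approximate 95% CI (log-scale method)."""
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    # Standard error of log(RR) for two independent proportions
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx
                          + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Counts from the ramipril table (ramipril n=4,645; placebo n=4,652)
non_cv_death = relative_risk_ci(200, 4645, 192, 4652)
cv_event = relative_risk_ci(651, 4645, 826, 4652)

for label, (rr, lower, upper) in [("non-CV death", non_cv_death),
                                  ("CV event", cv_event)]:
    verdict = "non-significant" if lower <= 1 <= upper else "significant"
    print(f"{label}: RR {rr:.2f} ({lower:.2f} to {upper:.2f}), {verdict}")
```

The crude interval for non-cardiovascular deaths still captures 1, while the interval for cardiovascular events lies entirely below 1. This agrees with the conclusions drawn from the published intervals.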


Errors and Validity

P-values and confidence intervals HELP with interpretation of research findings with regard to CHANCE

Important pitfalls exist
1. At the 5% level, about 1 in 20 comparisons where there is no real effect will still appear significant, leading us to believe something that is not real: TYPE I error
2. Clinical difference and statistical difference are not the same; it is the size of the effect, not just the size of the significance, that matters
3. We may conclude from a non-significant result that there is no effect when in fact there is a real effect: TYPE II error (how carefully have the findings been interpreted?)
4. External validity: the findings relate to the participants of the study, but how well are they applicable to other groups? Do they particularise to the individual? (Assessment of external validity is based on Px characteristics and on the setting and conduct of the trial)
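The first pitfall can be illustrated by simulating many hypothetical trials in which both arms have exactly the same true risk, then counting how often the result is 'significant' at p < 0.05. This sketch assumes a pooled two-proportion z-test and a fixed random seed. The function names, sample sizes and risks are illustrative.

```python
import math
import random

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (events_a / n_a - events_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(risk, n_per_arm, trials, alpha=0.05, seed=1):
    """Share of null trials (identical true risk in both arms) with p < alpha."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        arm_a = sum(rng.random() < risk for _ in range(n_per_arm))
        arm_b = sum(rng.random() < risk for _ in range(n_per_arm))
        if two_proportion_p_value(arm_a, n_per_arm, arm_b, n_per_arm) < alpha:
            false_positives += 1
    return false_positives / trials

type_one_rate = false_positive_rate(risk=0.3, n_per_arm=200, trials=2000)
print(type_one_rate)
```

The rate should land near 0.05: roughly 1 in 20 of these no-effect trials looks significant purely by chance.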


FYI Note:

Internal validity
The extent to which the design and conduct of a study are likely to have prevented bias. Variation in quality can explain variation in the results of studies included in a systematic review. More rigorously designed (better quality) trials are more likely to yield results that are closer to the truth. (Also called methodological quality but better thought of as relating to bias prevention.)

Intention to treat
An assessment of the people taking part in a clinical trial, based on the group they were initially (and randomly) allocated to. This is regardless of whether or not they dropped out, fully complied with the treatment or switched to an alternative treatment. Intention-to-treat analyses are often used to assess clinical effectiveness because they mirror actual practice: that is, not everyone complies with treatment and the treatment people receive may be changed according to how they respond to it.

Useful websites