Page 1: The  Bland-Altman  LIMITS OF AGREEMENT: How  Often  HAVE THEY  Been  Misapplied?

THE BLAND-ALTMAN LIMITS OF AGREEMENT:

HOW OFTEN HAVE THEY BEEN MISAPPLIED?

Introduction to Medicine, 23 May 2011, Class 13


INTRODUCTION


Background

Due to advances in technology, new methods of clinical measurement appear constantly, and they keep becoming more innovative. [1]

In the 1980s, Bland and Altman noticed that the correlation coefficient was widely used to evaluate the agreement between two methods of clinical measurement.

They realized it was not adequate for that purpose, so they created their own method: the Bland-Altman limits of agreement. [2]

1 - Zietman A, Goitein M, Tepper JE. Technology evolution: is it survival of the fittest? Journal of Clinical Oncology, 2010 Sep 20; 28(27): 4275-4279.
2 - Bland JM, Altman DG. Statistical Methods For Assessing Agreement Between Two Methods of Clinical Measurement. Lancet, 1986; i: 307-310.

Statistical methods for assessing agreement between two methods of clinical measurement (The Lancet, 1986)

Objective of the method: assess the agreement between two methods of clinical measurement.

Importance: if the two methods do not agree, there is a high risk of diagnostic mistakes, which may lead to severe consequences. [3]

3 - Stoker M. Common Errors in Clinical Measurement. Anaesthesia & Intensive Care Medicine, December 2008; 9(12): 553-558.

How do we apply the method? [2]

[Diagram: Instrument 1 and Instrument 2 each provide a measurement; remaining labels: Average, Difference, Correlation coefficient.]

2 - Bland JM, Altman DG. Statistical Methods For Assessing Agreement Between Two Methods of Clinical Measurement. Lancet, 1986; i: 307-310.

Average of the differences = 0: no systematic error

Average of the differences ≠ 0: systematic error
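The calculation behind these slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `bland_altman`, the sample readings, and the multiplier 1.96 (the usual normal-theory choice, which the slides do not state) are assumptions.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("both methods must measure the same subjects")
    # One difference per subject: instrument A minus instrument B.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # average of the differences (systematic error)
    sd = stdev(diffs)    # sample standard deviation of the differences
    # Assumed multiplier 1.96: about 95% of differences fall inside the
    # limits if the differences are normally distributed.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings from two instruments (illustrative only).
instrument_1 = [10.0, 12.0, 14.0, 16.0]
instrument_2 = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(instrument_1, instrument_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero suggests no systematic error; the width of the interval between the two limits reflects the random disagreement between the instruments.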

If the limits of agreement are…

• Too wide: there are random errors associated with the measuring instrument; it is unacceptable for clinical use.

• Small, but the average of the differences is ≠ 0: there is a systematic error; the measuring device must be calibrated.

Whether the limits of agreement are too wide or adequate may be somewhat subjective. Therefore, it is important that the maximum acceptable limits of agreement are defined according to clinical needs.
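One way to make this judgement explicit is to compare the result against thresholds fixed by clinical needs. The sketch below is only an illustration: the function name, the two thresholds (`max_width`, `bias_tolerance`), the example numbers, and the third "acceptable" outcome are all assumptions, not part of the slides.

```python
def interpret_limits(bias, lower, upper, max_width, bias_tolerance):
    """Classify a limits-of-agreement result into the cases described above.

    max_width and bias_tolerance are hypothetical thresholds that would
    have to be chosen according to clinical needs.
    """
    if upper - lower > max_width:
        # Limits too wide: random error dominates.
        return "too wide: not acceptable for clinical use"
    if abs(bias) > bias_tolerance:
        # Narrow limits, but the average difference is clearly not zero.
        return "systematic error: calibrate the device"
    return "acceptable agreement"


# Illustrative values only.
wide = interpret_limits(bias=0.1, lower=-6.0, upper=6.2,
                        max_width=4.0, bias_tolerance=0.5)
shifted = interpret_limits(bias=1.5, lower=0.5, upper=2.5,
                           max_width=4.0, bias_tolerance=0.5)
print(wide)
print(shifted)
```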

Assumptions

• the differences between the measured values must follow a normal distribution;

• the standard deviation of the differences must be constant, i.e. there must be no relation between the averages and the differences.

Images: Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22: 85-93.
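The two assumptions are normally checked graphically (a histogram of the differences and a plot of differences against averages). As a rough numerical companion, and only as an assumed illustration, the sketch below computes the skewness of the differences and the Pearson correlation between averages and differences. The function names and the data are hypothetical, and the correlation only detects a linear trend.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def skewness(values):
    """Population skewness; values near 0 are compatible with symmetry."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def assumption_summary(method_a, method_b):
    """Summarise both assumptions for paired measurements."""
    averages = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    diffs = [a - b for a, b in zip(method_a, method_b)]
    return {
        "r_avg_vs_diff": pearson_r(averages, diffs),
        "skewness_of_diffs": skewness(diffs),
    }


# Hypothetical data in which the difference grows with the magnitude.
summary = assumption_summary([10, 20, 30, 40, 50], [9, 18, 27, 36, 45])
print(summary)
```

In this made-up example the differences increase steadily with the averages, so the second assumption is clearly violated even though the differences are symmetric.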


Example of the existence of a relation between the averages and differences

Images: Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22, 85-93.

Bland-Altman Method

It had a great impact on the scientific community and, after being published in The Lancet, was cited more than 17000 times. [4]

BUT some of the citations/applications of this method may not have been made correctly!

Bland and Altman themselves noticed that their limits of agreement were being misapplied, leading to false conclusions about the agreement between two instruments of clinical measurement. [5]

4 - Ryan TP and Woodall WH. The Most Cited Statistical Papers. Journal of Applied Statistics, 2005; 32: 461-474.
5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22: 85-93.


RESEARCH QUESTION AND AIMS

Research Question

“What is the percentage of articles in which the Bland-Altman method is applied correctly?”

Our secondary aims are to find out:

• at what level the method is misapplied;

• which assumption is the least fulfilled one;

• if, through the years, the percentage of articles applying the method incorrectly has varied;

• if the percentage of articles applying the method correctly varies according to whether it is used to obtain primary or secondary data;

• what percentage of articles fit into each of the document types defined by ISI;

• if the impact factor of a journal influences the percentage of articles published in it that apply the method correctly.


METHODS

Methods

Sample: 70 randomly chosen articles indexed by ISI that cite the article, published in The Lancet, in which Bland and Altman present their method.
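A random selection of this kind can be made reproducible with a seeded generator. The sketch below is an assumption about how such a draw could be done, not a description of what was actually used: the function name, the seed, and the size of the list of article ids are hypothetical.

```python
import random


def draw_sample(article_ids, sample_size=70, seed=None):
    """Draw sample_size distinct articles uniformly at random."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(article_ids, sample_size)


# Hypothetical sampling frame: ids standing in for the citing articles.
citing_articles = list(range(1, 1001))
sample = draw_sample(citing_articles, sample_size=70, seed=2011)
print(len(sample), "articles selected")
```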

Check-list

Evaluates the article when it comes to the:

• verification of the assumptions;

• application of the method itself;

• interpretation of the obtained limits of agreement.


The check-list also gathers some relevant data about each article: the type of article, and the year and journal in which it was published.

Reproducibility of the check-list

[Diagram: Student A and Student B each evaluate the same Article X.]

Comparison between the answers given by the two students.
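The comparison between the two students can be summarised as a percentage of matching answers per question. This is a minimal sketch under that assumption; the function name and the yes/no answers are hypothetical.

```python
def percent_agreement(answers_a, answers_b):
    """Percentage of items on which two raters gave the same answer."""
    if len(answers_a) != len(answers_b):
        raise ValueError("both raters must answer the same items")
    matches = sum(a == b for a, b in zip(answers_a, answers_b))
    return 100.0 * matches / len(answers_a)


# Hypothetical answers of two students to one check-list question,
# one entry per article evaluated by both.
student_a = ["yes", "no", "yes", "yes", "no"]
student_b = ["yes", "no", "yes", "no", "no"]
agreement = percent_agreement(student_a, student_b)
print(f"{agreement:.0f}% agreement")
```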

To analyze our results…

• We calculated the median of the impact factor and of the year of publication.

• For each variable, we created two groups: ≤ median and > median.
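The grouping step can be sketched as follows. The function name and the publication years are hypothetical; only the rule (≤ median versus > median) comes from the slide.

```python
from statistics import median


def median_split(values):
    """Split values into a '<= median' group and a '> median' group."""
    cut = median(values)
    low = [v for v in values if v <= cut]
    high = [v for v in values if v > cut]
    return cut, low, high


# Hypothetical publication years of five articles.
years = [1995, 2001, 2004, 2008, 2010]
cut, older, newer = median_split(years)
print(cut, older, newer)
```

Because values equal to the median go to the lower group, the two groups are not necessarily the same size.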

How do we know the differences are significant?

Data of the tables related to the year, journal of publication and type of data of each article: Chi-Square Test [6]

p ≤ 0.05: statistically significant

6 - Perla RJ, Carifio J. Use of the Chi-square Test to Determine Significance of Cumulative Antibiogram Data. American Journal of Infectious Diseases, 2005; 1(4): 162-167.
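For two groups and a yes/no outcome, the test reduces to a 2×2 table. The sketch below computes the Pearson chi-square statistic and its p-value with the standard library only. It assumes no continuity correction (the slides do not say whether one was applied), and the counts are invented for illustration.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], 1 degree of freedom.

    Assumes no continuity correction and non-zero row and column totals.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are the two groups (<= median, > median),
# columns are "check-list point fulfilled" and "not fulfilled".
chi2, p_value = chi_square_2x2(10, 15, 18, 8)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, significant={p_value <= 0.05}")
```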


EXPECTED RESULTS

Many articles will have misapplied the method.

Main reason: lack of verification of the assumptions, or wrong verification of the assumptions. [5]

Least fulfilled assumption: verifying whether the differences follow a normal distribution.

Why? It requires the construction of a separate graph (a histogram of the differences), while the other assumption can be verified on the plot of averages vs. differences, which is often used anyway to display the limits of agreement.

5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22: 85-93.

The percentage of articles misapplying the method will vary over the years.

Why? Researchers started to notice that the method was being misapplied.

How? They realized that two methods of clinical measurement that had passed the Bland-Altman test in terms of agreement did not actually agree very well. Example: they did not agree for values higher than the ones used for the test. [5]

5 - Bland JM and Altman DG. Applying the Right Statistics: Analyses of Measurement Studies. Ultrasound in Obstetrics and Gynecology, 2003; 22: 85-93.

The impact factor of a journal should influence the percentage of misapplications of the method in the articles published there.

Why?

Higher impact factor, higher quality, more attention to scientific correctness.


RESULTS

Reproducibility of the check list

To ensure the correct analysis of the articles, two students analyzed the same article.

Of the 5 articles analyzed by two different students, there was 100% agreement on all questions, except for the one that asked whether the article had interpreted the outcome correctly according to the clinical needs, on which there was a disagreement concerning 1 article.

The two students who disagreed re-evaluated the question and came to an agreement.


What percentage of articles fit into each of the document types defined by ISI.

Articles - 16230

Reviews - 291

Meeting abstracts - 70

Reprints - 2

Proceeding papers - 1059

Notes - 121

Corrections/Additions - 2

Correction - 1

Letters - 471

n= 18360
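The counts above can be converted into the percentages this aim asks for. The sketch below is only illustrative: using the reported n = 18360 as the denominator is an assumption, since the nine listed counts themselves add up to 18247.

```python
# Counts per ISI document type, as listed on the slide.
document_counts = {
    "Articles": 16230,
    "Reviews": 291,
    "Meeting abstracts": 70,
    "Reprints": 2,
    "Proceeding papers": 1059,
    "Notes": 121,
    "Corrections/Additions": 2,
    "Correction": 1,
    "Letters": 471,
}
reported_total = 18360  # n as reported on the slide (assumed denominator)


def percentages(counts, total):
    """Share of each document type, in percent of total."""
    return {name: 100.0 * count / total for name, count in counts.items()}


shares = percentages(document_counts, reported_total)
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```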

The Sample

70 (articles and proceedings papers)

• Unavailable (no full-text version, or in foreign languages): 14

• Available: 56

Out of those 56, 5 weren’t applications of the Bland and Altman limits of agreement, while 51 were.

THE MAIN FINDINGS of our study with regard to our original research question and aims:

Table 1. n (%) of articles which fulfill each point of the check-list.

… if the impact factor of a journal influences the percentage of articles published in it that apply the method correctly

Table 2 – Percentage of articles fulfilling each main point of the check list, divided according to the impact factor of the journal where they were published. We used a Chi-Square test to compare the percentages between the two levels of impact factor. LA – Limits of agreement.

p > 0.05

… if, through the years, the percentage of articles applying the method incorrectly has varied

Table 3 – Percentage of articles fulfilling each main point of the check list, divided according to the year when they were published. We used a Chi-Square test to compare the percentages between the two publication-year groups. LA – Limits of agreement. * – statistically significant.

p < 0.05

… if the percentage of articles applying the method correctly varies according to whether it is used to obtain primary or secondary data.

Table 4 – Percentage of articles fulfilling each main point of the check list, divided according to the type of data obtained by using the limits of agreement. We used a Chi-Square test to compare the percentages between the two types of data. LA – Limits of agreement. * – statistically significant.


DISCUSSION

“What is the percentage of articles in which the Bland-Altman method is applied correctly?”

Out of all the articles we analyzed, not one applied the method correctly in its entirety. The method seems to be mostly misapplied at the level of verifying the assumptions.

The least fulfilled assumption was the second one. The 7 articles that verified it correctly (a mere 14%) also correctly verified the first one. So the errors in the articles that correctly verified the second assumption were only minor ones (see Table 1).

… if the impact factor of a journal influences the percentage of articles published in it that apply the method correctly (Table 2; p > 0.05)

• The articles published in journals with a lower IF appear to have a higher percentage of correct applications.

• The differences are, however, not statistically significant.

… if, through the years, the percentage of articles applying the method incorrectly has varied (Table 3; p < 0.05)

• In every category, the articles published more recently have a higher percentage of correct application of the method.

• Only one of the results is not statistically significant.

• Over time, authors have come to realize that employing the Bland-Altman method sometimes leads to incorrect findings.

• This would lead the authors of more recent studies to be more careful when employing the method.

… if the percentage of articles applying the method correctly varies according to whether it is used to obtain primary or secondary data (Table 4)

• Articles that use the method to obtain primary data have a higher percentage of correct application than those that use it to obtain secondary data.

• Authors are more likely to pay attention to the correct use of a scientific method if it is their main method, or one of their main methods, for acquiring data.

Limitations of our work

• Relatively small sample

• Human error

• No other works to cross-reference with

Acknowledgements

• Prof. Dr. Cristina Santos

• Prof. Dr. Altamiro Rodrigues da Costa Pereira

• João Cláudio Antunes, MSc

• Class 4 (Turma 4)