12SL Statement of Validation and Accuracy


  • 8/3/2019 12SL Statement of Validation and Accuracy


    12SL PROGRAM

    Statement of Validation and Accuracy for the

    Physician's Guide

    416791-003 Revision A


    12SL Statement of Validation and Accuracy, Revision A, 416791-003, 23 October 2000

    NOTE: The information in this manual only applies to the 12SL Statement of Validation and Accuracy. Due to continuing product innovation, specifications in this manual are subject to change without notice.

    Listed below are GE Medical Systems Information Technologies trademarks. All other trademarks contained herein are the property of their respective owners.

    900 SC, ACCUSKETCH, AccuVision, APEX, AQUA-KNOT, ARCHIVIST, Autoseq, BABY MAC, C Qwik Connect, CardioServ, CardioSmart, CardioSys, CardioWindow, CASE, CD TELEMETRY, CENTRA, CHART GUARD, CINE 35, CORO, COROLAN, COROMETRICS, Corometrics Sensor Tip, CRG PLUS, DASH, Digistore, Digital DATAQ, E for M, EAGLE, Event-Link, FMS 101B, FMS 111, HELLIGE, IMAGE STORE, INTELLIMOTION, IQA, LASER SXP, MAC, MAC-LAB, MACTRODE, MANAGED USE, MARQUETTE, MARQUETTE MAC, MARQUETTE MEDICAL SYSTEMS, MARQUETTE UNITY NETWORK, MARS, MAX, MEDITEL, MEI, MEI in the circle logo, MEMOPORT, MEMOPORT C, MINISTORE, MINNOWS, Monarch 8000, MULTI-LINK, MULTISCRIPTOR, MUSE, MUSE CV, Neo-Trak, NEUROSCRIPT, OnlineABG, OXYMONITOR, Pres-R-Cuff, PRESSURE-SCRIBE, QMI, QS, Quantitative Medicine, Quantitative Sentinel, RAC RAMS, RSVP, SAM, SEER, SILVERTRACE, SOLAR, SOLARVIEW, Spectra 400, Spectra-Overview, Spectra-Tel, ST GUARD, TRAM, TRAM-NET, TRAM-RAC, TRAMSCOPE, TRIM KNOB, Trimline, UNION STATION, UNITY logo, UNITY NETWORK, Vari-X, Vari-X Cardiomatic, VariCath, VARIDEX, VAS, and Vision Care Filter are trademarks of GE Medical Systems Information Technologies registered in the United States Patent and Trademark Office.

    12SL, 15SL, Access, AccuSpeak, ADVANTAGE, BAM, BODYTRODE, Cardiomatic, CardioSpeak, CD TELEMETRY-LAN, CENTRALSCOPE, Corolation, EDIC, EK-Pro, Event-Link Cirrus, Event-Link Cumulus, Event-Link Nimbus, HI-RES, ICMMS, IMAGE VAULT, IMPACT.wf, INTER-LEAD, IQA, LIFE WATCH, Managed Use, MARQUETTE PRISM, MARQUETTE RESPONDER, MENTOR, MicroSmart, MMS, MRT, MUSE CardioWindow, NST PRO, NAUTILUS, O2SENSOR, Octanet, OMRS, PHi-Res, Premium, Prism, QUIK CONNECT V, QUICK CONNECT, QT Guard, SMART-PAC, SMARTLOOK, Spiral Lok, Sweetheart, UNITY, Universal, Waterfall, and Walkmom are trademarks of GE Medical Systems Information Technologies.

    GE Medical Systems Information Technologies, 2000. All rights reserved.


    Statement of Validation and Accuracy

    This paper reviews several aspects regarding the development and validation of the GE Marquette 12SL Program. Program accuracy levels for major statement categories are supplied. When the program is used in conjunction with a cardiologist, it has been shown that it can improve both the speed¹ and accuracy of reviewing ECGs.²

    1. Sheffield et al., 1987. Electrocardiography and computerization: a winning combination. Card. Prod. News. 47-51

    2. Willems et al., 1990. Assessment of the diagnostic performance of ECG computer programs and cardiologists. In: Common Standards for Electrocardiography: 10th and Final Progress Report. Leuven: Ref. Nr. CSE 90-12-31, 148-266


    Introduction

    The first human electrocardiogram was taken over a hundred years ago, and computerized electrocardiography has been in existence since the late 1950s.¹,² In spite of its widespread use,³ long history, and the voluminous amount of literature regarding the scientific aspects of this technology, there is little written that directly addresses the intent of computerized electrocardiography.

    The pioneers of this technology had motivations which ranged from the esoteric goal of proving that a computer could mimic human activity to the basic requirement of efficiently recording artifact-free tracings.⁴ Some of the favorable developments which resulted from the evolution of this technology were hardly imagined at its inception. Consider, for example, work patterns at facilities which provide ECG services; they have been greatly streamlined.⁵ Additionally, computerization has resulted in two practical advantages for the overreading physician. First, the computer can serve as an additional expert opinion. Second, physicians have found that it is possible for them to overread computer-analyzed tracings in half the time required for conventional, non-analyzed ECGs.⁶

    The computer, therefore, is not only used to efficiently record, store, transmit, and present the ECG; it is also used to assist the physician in overreading an ECG. Consequently, developers inherit a certain responsibility. GE Marquette has accepted this serious challenge. GE Marquette must develop the most accurate program possible, substantiate the program with data, and document the extent of the computer's capabilities. This is the purpose of this paper.

    It should be made clear that a computerized analysis is not a substitute for human interpretation. There are two reasons for this. First, statements of accuracy need to be viewed from a statistical perspective. Although accuracy levels may be high, outliers can and will exist. Second, a computer does not have the ability to include the entire clinical picture of the patient. Despite the fact that the 12SL analysis program has a high level of accuracy, it will occasionally not correctly interpret an ECG. The ECG tracing is significant only when interpreted in conjunction with clinical findings. Thus, it is critical that a physician utilizes his/her best clinical judgement when reviewing the ECG interpretation.

    1. Burch et al., 1964. A history of electrocardiography. Year Book Medical Publishers. Chicago, Ill.

    2. Pipberger et al., 1975. Computer methods in electrocardiography. Annu. Rev. Biophys. Bioeng. 4:15-42

    3. Drazen et al., 1988. Survey of computer-assisted electrocardiography in the United States. J Electro. 21 (suppl):98-104

    4. Rowlandson, I., 1990. Computerized electrocardiography: a historical perspective. Ann. New York Academy of Science. 601:343-352

    5. Sheffield et al., 1987. Electrocardiography and computerization: a winning combination. Card. Prod. News. 47-51

    6. Sheffield et al., 1987. Seminar on computer applications for the cardiologist, computer-aided electrocardiography. JACC, 10(2):448-55


    Signal Acquisition/Signal Conditioning

    A program's accuracy is directly dependent upon the quality of the signal it acquires. In 1979, GE Marquette introduced an electrocardiograph that simultaneously acquired all of the leads from the 12-lead electrocardiogram. Prior to this time, all commercially available electrocardiographs could only acquire 3 leads at a time.

    Simultaneous recording was adopted so that the computer could use all signals from all 12 leads to properly detect and classify each QRS complex. The advantage of this technique was independently verified by Willems et al.: "Conclusion: The simultaneous recording and analysis of all 12 standard leads...is certainly an improvement over the conventional recording of three leads at a time. Similarly...multilead programs proved to be more stable than those obtained by conventional programs analyzing three leads at a time..."¹

    1. Willems et al., 1987. A reference data base for multilead electrocardiographic computer measurement programs. JACC, 10(6):1313-21


    Median Beat/Signal Averaging

    Computer measurements of features within the QRS complex are very susceptible to artifact. To remove artifact, filtering may be used. Beyond filtering, there is another method able to eliminate noise from the QRS complex: signal averaging. Instead of analyzing a single QRS complex, the GE Marquette 12SL Program generates a median complex. In other words, all QRSs of the same shape are aligned in time. Next, the algorithm generates a representative QRS complex from the median voltages that are found at each successive sample time. This is more complicated than creating an average, but the method results in a cleaner signal since it is able to disregard outliers.
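    The idea of a sample-wise median across aligned beats can be illustrated with a minimal sketch. It is not the 12SL implementation: beat detection, morphology grouping, and time alignment are assumed to have been done already, and the function names and the toy voltages are hypothetical.

```python
from statistics import median


def median_beat(aligned_beats):
    """Return the sample-wise median of time-aligned beats.

    aligned_beats: list of equal-length lists of voltages, one list per
    QRS complex of the same shape, already aligned in time (assumption).
    """
    if not aligned_beats:
        raise ValueError("at least one beat is required")
    length = len(aligned_beats[0])
    if any(len(beat) != length for beat in aligned_beats):
        raise ValueError("beats must be aligned to the same length")
    # zip(*beats) groups the voltages that share a sample time.
    return [median(samples) for samples in zip(*aligned_beats)]


def mean_beat(aligned_beats):
    """Sample-wise mean, shown only for comparison with the median."""
    return [sum(samples) / len(samples) for samples in zip(*aligned_beats)]


# Three clean beats plus one beat with an artifact spike at sample 2
# (toy values in microvolts, invented for illustration).
beats = [
    [0, 100, 900, 100, 0],
    [0, 110, 910, 90, 0],
    [0, 90, 890, 110, 0],
    [0, 100, 3000, 100, 0],  # artifact
]

print("median:", median_beat(beats))
print("mean:  ", mean_beat(beats))
```

    In this toy data the artifact pulls the mean at the spiked sample far from the clean beats, while the median stays close to them, which is the outlier-rejection property the text describes.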

    Willems et al. independently verified the value of this technique: "On the basis of the finding of the present study, a measurement strategy based on selective averaging is recommended for diagnostic ECG computer programs."¹ A specific study that included results from the GE Marquette 12SL Program also found that "Comparisons of all averaging against all non-averaging programs yields a 70% higher detection instability for the non-averaging programs and worse performance in most tested measurements."²

    1. Willems et al., 1987. Influence of noise on wave boundary recognition by ECG measurement programs. Comp. Biomed. Res., 83

    2. Zywietz et al., 1987. Noise tolerance of ECG amplitude measurements: results and recommendations from the CSE project. Comp. in Cardiol.


    Measurements/Onsets and Offsets

    All ECG computer programs are composed of two parts: one which measures the waveforms and one which does the analysis based upon those measurements. The main task of computerized measurement is to determine the location of major reference points (onsets and offsets for P, QRS, and T waves).

    Consistent with the signal processing portion of the program, the major wave onsets and offsets are delineated by an analysis of the slopes in all 12 simultaneous leads. That is, QRS duration is measured from the earliest onset in any lead to the latest deflection in any lead. Similarly, the QT interval is measured from the earliest detection of depolarization in any lead to the latest detection of repolarization in any lead.
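    The "earliest onset in any lead to latest offset in any lead" rule can be sketched as follows. This is an illustration only: the per-lead onset and offset times are assumed to come from some upstream slope analysis that is not shown, and the function name and the millisecond values are hypothetical.

```python
def global_interval(onsets_ms, offsets_ms):
    """Measure an interval from the earliest onset to the latest offset.

    onsets_ms / offsets_ms: dicts mapping lead name -> time in milliseconds
    (hypothetical per-lead reference points from an upstream detector).
    Returns (earliest_onset, latest_offset, duration).
    """
    if not onsets_ms or not offsets_ms:
        raise ValueError("need at least one lead")
    earliest_onset = min(onsets_ms.values())
    latest_offset = max(offsets_ms.values())
    return earliest_onset, latest_offset, latest_offset - earliest_onset


# Toy per-lead QRS boundaries for four of the twelve leads.
qrs_onsets = {"I": 402, "II": 400, "V1": 396, "V5": 401}
qrs_offsets = {"I": 488, "II": 492, "V1": 490, "V5": 496}

onset, offset, duration = global_interval(qrs_onsets, qrs_offsets)
print(f"global QRS: {onset}-{offset} ms, duration {duration} ms")
```

    Because the onset and the offset may come from different leads, the global duration can be longer than the duration seen in any single lead, as it is in this toy example.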

    The measurement accuracy of the GE Marquette 12SL Program was evaluated in an independent study. It used a sample of 250 ECGs. These ECGs represented a wide spectrum of waveform patterns. They were then analyzed by five referee cardiologists and sixteen different programs. Results from this study can be found in Appendix I. The GE Marquette 12SL Program was found to be within 1 millisecond of the reference standard for the measurement of P duration, PR interval, QRS duration, and QT interval.

    In another independent study by Mulcahy et al., the computer determined heart rate, PR interval, QRS duration, QT interval and mean frontal QRS axis. These results were then compared to measurements by two cardiologists. They stated: "We conclude that the GE Marquette MAC II Computer Augmented Cardiograph can be relied upon for routine electrocardiographic measurements..."¹

    After the onsets and offsets for P, QRS, and T complexes have been demarcated, the waves within each complex are then measured according to published standards.² These amplitudes and durations result in a measurement matrix containing more than 1600 values. Measurements are then passed on to the criteria portion of the program so that it can generate an interpretation.

    1. Mulcahy et al., 1986. Can a computer assisted electrocardiograph replace a cardiologist for ECG measurements? Irish J. of Med. Sci., 155(12):410-414

    2. Willems et al., 1982. Common standards for quantitative electrocardiography: second progress report. CSE reference 82-11-20. Leuven: Acco, 1982:1-246.


    Computer Interpretation/Development Process

    The GE Marquette 12SL Program was introduced in 1980. All improvements to the program have been accomplished via a systematic, logical, and controlled methodology. A major aspect of this methodology benefits from the use of stored ECGs.

    GE Marquette stores ECGs in such a fashion that they can be re-analyzed by the 12SL Program.¹,² In other words, the fidelity of the stored ECG is such that it can be used as if it were newly acquired. This allows us to access large volumes of stored ECGs for the purposes of either training or testing the program.

    Any change to the program requires a great deal of research. This effort can be instigated by a variety of sources. The constant pursuit of clinically correlated databases can yield statistics that indicate whether a change should be considered. New criteria published in the literature can be evaluated and sometimes incorporated into the program. Consultations with cardiologists also stimulate investigations. This is especially true when they have stored ECGs whose interpretation has been verified by other, non-ECG data.

    Before a change can be instituted, it must always be evaluated in relation to the current performance of the program. This validation process is facilitated by a set of research tools developed specifically for this purpose. These tools are then used in conjunction with our reanalysis capability. As an ECG is re-analyzed, a score pertaining to the item under investigation is stored and collated by the computer. Later, after many ECG records have been scored, the computer can generate statistics on the entire set of ECGs that it reanalyzed and stored. To determine ECGs that might be affected by a program change, the entire ECG set is reanalyzed twice: once with the change and once without. After this is done, the computer automatically culls out and plots any ECGs that scored differently between the two versions of the program. This work has resulted in an efficient set of research tools that allows GE Marquette to automatically determine how a change might affect program performance on a large database.³
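    The "reanalyze twice and cull the differences" step can be sketched generically. This is not GE Marquette's tooling: the two analyzer functions below are toy stand-ins for two program versions, and the record layout, IDs, and threshold are invented for illustration.

```python
def changed_records(ecgs, analyze_old, analyze_new):
    """Return (id, old_result, new_result) for ECGs that score differently."""
    changed = []
    for ecg_id, ecg in sorted(ecgs.items()):
        before = analyze_old(ecg)
        after = analyze_new(ecg)
        if before != after:
            changed.append((ecg_id, before, after))
    return changed


# Toy stand-ins for the program without and with a change.
# The 120 ms boundary is a placeholder, not a 12SL criterion.
def analyze_old(ecg):
    return "wide QRS" if ecg["qrs_ms"] > 120 else "normal QRS"


def analyze_new(ecg):
    return "wide QRS" if ecg["qrs_ms"] >= 120 else "normal QRS"


stored_ecgs = {
    "ecg-001": {"qrs_ms": 96},
    "ecg-002": {"qrs_ms": 120},
    "ecg-003": {"qrs_ms": 142},
}

diff = changed_records(stored_ecgs, analyze_old, analyze_new)
for ecg_id, before, after in diff:
    print(f"{ecg_id}: {before} -> {after}")
```

    Only the records whose output changed need human review, which is what makes a comparison over a very large stored database practical.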

    1. Huffman, D.A., 1952. A method for the construction of minimum redundancy codes. Proc. Inst. Radio Eng.: 1098-1101

    2. Reddy et al., 1991. Data compression for storage of resting ECGs digitized at 500 samples per second. Association for the Advancement of Medical Instrumentation Meeting

    3. Rowlandson I., 1986. New techniques in criteria development. Computerized Interpretation of the Elec. X., 177-184

    Given these sophisticated tools, the next issue relevant to the development process is the selection of an appropriate database. Appendix II contains a list of gold standard databases that GE Marquette has used for program validation. These databases are extremely valuable, because they are time consuming and expensive to obtain. Nevertheless, they are an essential ingredient. Without an objective yardstick, the program will not excel since the target for performance will be vaguely and inconsistently defined by the consensus cardiologist.¹

    It should also be noted that GE Marquette uses different databases during the development and validation process. This precludes us from developing an algorithm that works beautifully on the training set but cannot be applied, with the same success, to other populations.²,³

    During the training phase, we use a database that has been correlated with a gold standard. The choice of the gold standard depends upon the problem we are investigating. For criteria that reference a particular patho-physiologic state (like myocardial infarction), we use a database that is correlated with other non-ECG evidence (like cardiac catheterization, echocardiography, autopsy, cardiac enzymes, patient history, etc.). For measurements, or arrhythmia statements that can be confirmed by the ECG itself, we use the ECG in conjunction with expert opinion.

    During the test phase, we not only use an independent gold standard database, but we also use other databases. We do this because gold standard databases have some limits. Examples include the following:⁴

    - The gold standard may not be representative of the disease in the clinical setting. For example, an ECG database which contains autopsy-proven myocardial infarctions (MI) may not be indicative of what a typical MI looks like since many patients survive an MI.

    - Gold standard databases often contain only one, isolated disease. For example, a database may only contain MIs and normals. The algorithm, however, must also operate in the presence of ischemia, LVH, drug effects, etc.

    - There may be a systematic bias when selecting patients for a gold standard test. CATH-proven normals often receive the test because they were symptomatic.

    - A gold standard database may only contain extremes of normal versus abnormal. Algorithms don't operate in a black and white world.

    - And finally, a gold standard cannot be considered perfect: every test comes with its own inherent level of inaccuracy.

    1. Gorman et al., 1964. Observer variation in interpretation of the electrocardiogram. Med. Ann. DC 33:97-99

    2. Devijver P.A. and Kittler J., 1982. Pattern recognition: a statistical approach. Prentice Hall International, Englewood Cliffs, New Jersey

    3. Talmon J.L., 1983. Pattern recognition of the ECG: a structured analysis. Univ. of Amsterdam, Amsterdam, Holland

    4. Laks M.M., 1981. Gold standard for ECG diagnosis - revised 1981. In: Bonner et al., eds., Comp. Interp. of the ECG VI. New York: Engineering Foundation, 1982; 267-75


    Given these aforementioned limitations, testing must go beyond the use of gold standard databases. We must test the algorithm with a wider spectrum of data. GE Marquette accomplishes this by measuring the algorithm's performance on a large database (>150,000 ECGs acquired from one of the over 400 installed GE Marquette ECG storage systems). This process, which the computer can do in less than a day, confronts the algorithm with multiple diseases and varying degrees of abnormality. ECGs that changed their analysis results due to program modification can be further investigated with confirmation from medical records and/or expert opinion.

    Only after this retrospective testing is complete can we finally incorporate the change into an ECG cart and evaluate its performance at a clinical site. If this last test is successful, the change is incorporated into the program for general release.


    Accuracy of Computer Interpretation

    The following section summarizes the diagnostic performance for the major statement categories as made by the GE Marquette 12SL Program. These categories include: rhythm, conduction, hypertrophy, infarction, repolarization, and overall classification (that is, normal versus abnormal).

    Rhythm

    Tables 1 and 2 presented below are from a study that reported retrospective and prospective evaluation results of the GE Marquette 12SL Program.¹ The tables show the performance indices of 12SL for major rhythm categories.

    1. Reddy et al., 1998. Prospective evaluation of a microprocessor-assisted cardiac rhythm algorithm: results from one clinical center. J. Electrocardiol. 30:28-33

    Table 1. Sensitivity, Specificity, and Negative and Positive Predictive Accuracy of Rhythm Interpretation of MAC-RHYTHM Analysis During Prospective Testing

                             No.   Sensitivity (%)   Specificity (%)   NPA (%)   PPA (%)
    Sinus rhythms          9,324   98.7              91.0              91.5      98.6
    Atrial fibrillation      832   87.5              99.4              99.0      92.4
    Atrial flutter           106   76.4              99.7              99.8      71.7
    Junctional                64   92.2              99.5              100.0     52.7 (72.8*)
    2nd-degree AV blocks      26   80.8              99.6              100.0     32.8

    * After excluding paced ECGs with failed pace detection. NPA, negative predictive accuracy; PPA, positive predictive accuracy.
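    Table 1 shows high specificity alongside low PPA for the rarer rhythms. A short sketch of the standard relationship between sensitivity, specificity, prevalence, and predictive accuracy shows how that can happen. The function name is hypothetical. The prevalence is an assumption: the study's total sample size is not given in this excerpt, so the sum of the "No." column is used as a stand-in.

```python
def predictive_accuracy(sensitivity, specificity, prevalence):
    """Return (PPA, NPA) as fractions, given rates as fractions (0..1)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppa = true_pos / (true_pos + false_pos)
    npa = true_neg / (true_neg + false_neg)
    return ppa, npa


# ASSUMPTION: total = sum of the rows listed in Table 1; the true study
# total may differ, so the result is only approximate.
assumed_total = 9324 + 832 + 106 + 64 + 26
av_block_prevalence = 26 / assumed_total

# Sensitivity and specificity for 2nd-degree AV block, taken from Table 1.
ppa, npa = predictive_accuracy(0.808, 0.996, av_block_prevalence)
print(f"PPA ~ {100 * ppa:.0f}%, NPA ~ {100 * npa:.1f}%")
```

    Under this assumed prevalence the computed PPA is roughly one in three, close to the tabulated 32.8%. When a rhythm is rare, even a false-positive rate of a fraction of a percent can produce more false positives than true positives.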


    Table 2. Sensitivity and Specificity of Current Release (MACR) and an Earlier 12SL Release (Dec. 1993) in Retrospective Testing

                                        Sensitivity (%)       Specificity (%)
                                n       MACR   Old    P       MACR   Old    P
    Sinus rhythms           3,438       98.6   98.5   NS      91.7   86.5   .0001
    Abnormal rhythms
      Atrial fibrillation     283       82.0   65.0   .0001   99.4   99.0   NS
      Atrial flutter          100       80.0   73.0   NS      99.7   99.9   NS
      Junctional rhythm       101       81.2   39.6   .0001   99.1   99.7   NS
      2nd-degree AV block     167       79.6   12.0   .0001   99.7   99.9   NS
      Others*                  59       67.8   15.3   .0001   99.7   99.9   NS
      Subtotal: abnormal      710       79.4   44.2   .0001   98.6   98.5   NS
    Total                   4,176       95.2   88.9   .0001

    * Others: atrioventricular dissociation and complete heart blocks. NS, not significant.


    Hypertrophy

    Two independent studies have evaluated the performance of the GE Marquette 12SL Program for left ventricular hypertrophy (LVH) using echocardiography (ECHO).

    At the Mayo Clinic, an ECHO was performed within 30 days of the ECG.¹ ECGs demonstrating WPW syndrome, paced rhythm, or LBBB were excluded from the study. ECHO studies were excluded for patients who were less than 21 years of age. All two-dimensional and M-mode ECHO studies were technically adequate and required clear delineation of interventricular septal thickness (IVST), posterior wall thickness (PWT), and left ventricular internal dimension (LVID). Patients with IVST/PWT >1.5, segmental wall motion abnormalities, pericardial effusion, or infiltrative cardiomyopathy were excluded from the study. This resulted in a test population of 4,300 patients.

    ECHO measurements were made according to the American Society of Echocardiography. ECHO studies revealed LVH in 1,029 patients. LVH was defined as:

    ECHO LV mass > 265 g

    LV mass = 1.04 ((LVID + PWT + IVST)³ - (LVID)³) - 13.6 g
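    The LV mass formula and the 265 g cut-off can be written out as a small worked example. The text does not state the measurement units; centimeters are assumed here, and the function names and sample dimensions are hypothetical.

```python
def lv_mass_g(lvid_cm, pwt_cm, ivst_cm):
    """LV mass in grams from the formula quoted above.

    ASSUMPTION: all three dimensions are in centimeters.
    """
    outer_cube = (lvid_cm + pwt_cm + ivst_cm) ** 3
    cavity_cube = lvid_cm ** 3
    return 1.04 * (outer_cube - cavity_cube) - 13.6


def echo_lvh(lvid_cm, pwt_cm, ivst_cm, threshold_g=265.0):
    """Apply the study's ECHO definition of LVH (LV mass > 265 g)."""
    return lv_mass_g(lvid_cm, pwt_cm, ivst_cm) > threshold_g


# Two invented sets of measurements (LVID, PWT, IVST).
for dims in [(5.0, 1.0, 1.0), (5.5, 1.3, 1.3)]:
    mass = lv_mass_g(*dims)
    print(dims, f"mass = {mass:.1f} g", "LVH" if echo_lvh(*dims) else "no LVH")
```

    For the first set, (5.0 + 1.0 + 1.0)³ - 5.0³ = 343 - 125 = 218, so the mass is 1.04 × 218 - 13.6 = 213.12 g, below the cut-off. The thicker walls in the second set push the result above 265 g.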

    The GE Marquette 12SL Program correctly identified 328 patients with LVH and 3,010 patients without LVH. The program was scored as stating LVH for the full breadth of statements that refer to the abnormality, including "minimal (and moderate) voltage criteria for LVH, may be normal." Table 5 summarizes the program's performance.
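    The counts in the Mayo Clinic study can be turned into sensitivity and specificity directly. The sketch below uses only numbers stated in the text (4,300 patients, 1,029 with LVH by ECHO, 328 and 3,010 correctly classified); the function and variable names are hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


total_patients = 4300
with_lvh = 1029                          # LVH by ECHO
without_lvh = total_patients - with_lvh  # 3,271

true_pos = 328                       # LVH correctly identified
false_neg = with_lvh - true_pos      # LVH missed
true_neg = 3010                      # non-LVH correctly identified
false_pos = without_lvh - true_neg   # LVH stated but absent on ECHO

sens, spec = sensitivity_specificity(true_pos, false_neg, true_neg, false_pos)
print(f"sensitivity = {100 * sens:.1f}%, specificity = {100 * spec:.1f}%")
```

    These counts give 328 / 1,029 and 3,010 / 3,271, which round to the percentages reported for this study.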

    Table 5. Sensitivity and Specificity for LVH

    SENSITIVITY            SPECIFICITY
    328 / 1,029 (31.9%)    3,010 / 3,271 (92%)

    In addition to the Mayo Clinic study, another international study evaluated program performance for LVH.² There are fewer patients in this study (normals = 382, LVH = 183). A normal individual was defined as being free of significant cardiopulmonary disease on the basis of a health screening examination (negative history, normal physical exam, normal chest X-ray; n=285) or invasive cardiac study (n=97). Invasive studies usually entailed cardiac catheterization (CATH) for atypical chest pain or ST/T abnormalities evident at rest or during exercise. LVH was based on CATH or ECHO. ECHO diagnosis of LVH was based on an increased indexed LV mass >100 g/m² for females and >134 g/m² for males. Patients had left valvular acquired or congenital heart disease, including aortic stenosis (NYHA grade 2,3), aortic regurgitation (grade 2,3,4), mitral incompetence (grade 2,3,4) or aortic coarctation. Cardiomyopathy patients were excluded from this study. The statement output of the program was scored similarly to the Mayo study. Table 6 summarizes the performance of the program from this study.

    1. Hammil et al., 1988. Comparison of a computerized interpretive ECG program using xyz leads with a program using scalar leads. J. of Elect. (Supp) 88

    2. Willems et al., 1990. Assessment of the diagnostic performance of ECG computer programs and cardiologists. In: Common Standards for Electrocardiography: 10th and Final Progress Report. Leuven: Ref. Nr. CSE 90-12-31, 148-266

    The difference in sensitivity between these two studies is probably due to the degree of disease in the two populations. The study with the higher sensitivity has LVH of a greater extreme than the wider spectrum of disease that was recorded at the Mayo Clinic.

    It should also be noted that the criteria for LVH are age sensitive. If age is not entered into the electrocardiograph, the program defaults to the most sensitive criteria (old age, i.e. >80 years).

    The international study by Willems et al. also investigated right ventricular hypertrophy (RVH). This abnormality is notoriously difficult to verify. As a result, the presence of RVH was determined via a minimum requirement of clear-cut clinical evidence coupled with investigative evidence of conditions which would be expected to cause RVH. The RVH cases (n=55) had acquired congenital RV pressure or volume overload, primary or secondary pulmonary hypertension (mean pulmonary pressure > 25 mmHg) or left-to-right shunt at the atrial level of more than 1.5. The results of this study were as follows:

    Table 6. Sensitivity and Specificity for LVH

    SENSITIVITY    SPECIFICITY
    76.2%          91.2%

    Table 7. Sensitivity and Specificity for RVH

    SENSITIVITY    SPECIFICITY
    29.1%          100%


    Infarction

    The acute infarction interpretation package in the current 12SL program has, based on an evaluation at GE Marquette, an overall sensitivity of 65% with a specificity of 98%.¹ Kudenchuk et al.,² in a separate independent study, found that the GE Marquette 12SL program had an overall sensitivity of 74% in chest pain patients whose ECGs had ST elevation and other abnormalities. The study also determined the sensitivity for anterior acute infarction alone to be 56% and for inferior infarction alone to be 87%. The overall specificity was found to be 100% for the 12SL program versus 82% for the electrocardiographer in that same chest pain data set. However, in the ECGs where only ST elevation was present without other concomitant repolarization abnormalities, the 12SL program was found to have a sensitivity of 45% for correctly interpreting acute infarction, but retained the same high specificity.

    The improvements to the current acute infarction package which led up to these figures are described in detail in the reference article "The Dilemma of Sensitivity versus Specificity in Computer-Interpreted Acute Myocardial Infarction." To summarize, the original package had very high specificity (99-100%), but a very low sensitivity (21%). GE Marquette worked with Dr. Douglas Weaver and used his enzyme-correlated chest pain acute MI database, which is described in Kudenchuk's article, to develop new interpretation criteria to increase the sensitivity for interpretation of acute infarction, and not decrease the high specificity. The package for interpreting acute infarction was required to have extremely high specificity to minimize the possibility of clinicians treating inappropriate patients, thereby needlessly subjecting the patients to the risk of potentially life-threatening complications of the medication. GE Marquette succeeded in developing more sensitive interpretive criteria without jeopardizing the already high specificity by requiring evidence of concomitant or reciprocal repolarization changes to be present in the ECGs before interpreting them as having evidence of acute infarction. Concomitant repolarization changes refer to evidence of other ST segment depression and/or T wave inversion in leads other than those exhibiting ST elevation.
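    The general shape of such a rule, ST elevation that only counts when other leads show concomitant changes, can be sketched as a toy example. This is not the 12SL criteria: the thresholds, lead handling, and function name are all assumptions chosen for illustration, and real criteria involve many more conditions.

```python
# HYPOTHETICAL thresholds in microvolts, for illustration only.
ST_ELEVATION_UV = 100
ST_DEPRESSION_UV = -50


def suggests_acute_infarction(st_uv, t_inverted_leads):
    """Toy rule: ST elevation plus concomitant change in other leads.

    st_uv: dict mapping lead name -> ST level in microvolts.
    t_inverted_leads: iterable of leads showing T wave inversion.
    """
    elevated = {lead for lead, level in st_uv.items() if level >= ST_ELEVATION_UV}
    if not elevated:
        return False
    other_leads = set(st_uv) - elevated
    depressed = {lead for lead in other_leads if st_uv[lead] <= ST_DEPRESSION_UV}
    inverted = set(t_inverted_leads) - elevated
    # Require evidence in leads other than those with ST elevation.
    return bool(depressed or inverted)


# Elevation in II, III, aVF with depression in I and aVL (toy values).
with_reciprocal = {"I": -60, "aVL": -80, "II": 150, "III": 200, "aVF": 180, "V2": 10}
# Elevation in V2, V3 with no other repolarization change (toy values).
isolated = {"I": 0, "II": 10, "V2": 150, "V3": 160}

print(suggests_acute_infarction(with_reciprocal, []))
print(suggests_acute_infarction(isolated, []))
```

    A rule of this shape trades sensitivity for specificity: isolated ST elevation is not called, which matches the lower sensitivity the text reports when other repolarization changes are absent.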

    As stated above, the 12SL program has a much higher sensitivity for acute infarction when other repolarization changes are present in the ECG (74%) than when they are not present (45%). The 12SL emphasis is to always maintain the current specificity.

    1. Elko PP, Weaver WD, Kudenchuk PJ, Rowlandson GI: The dilemma of sensitivity versus specificity in computer-interpreted acute myocardial infarction. J of Electrocardiol 24:S2-7, 1991

    2. Kudenchuk PJ, Ho MT, Weaver WD, et al.: Accuracy of computer-interpreted electrocardiography in selecting patients for thrombolytic therapy. J Am Coll Cardiol 1991;17:1486-91

    Table 8. Sensitivity and Specificity for Overall Acute MI

    SENSITIVITY    SPECIFICITY
    65%            98%


    Table 9. Sensitivity and Specificity in Chest Pain Patients

    SENSITIVITY    SPECIFICITY
    74%            98%

    Kudenchuk et al. have presented a very good synopsis of their findings on the performance of the GE Marquette 12SL program in the chest pain screening area. Part is included here:

    "...in patients with suspected acute myocardial infarction, the overall sensitivity for the current computer algorithm (GE Marquette 12SL) to detect acute injury was 52% compared with the 66% sensitivity of the electrocardiographer in identifying patients with acute myocardial infarction who had ST elevation on their first ECG. This difference occurred in part because the electrocardiographer used less stringent criteria (100 µV ST elevation without consideration of associated QRS or ST changes, unless left ventricular hypertrophy, left bundle branch block or a ventricular arrhythmia was present) than those used by the computer algorithm. The computer algorithm was developed to help differentiate early repolarization and nonspecific ECG changes from those of acute injury and, unlike the electrocardiographer, did not presume that ST elevation in a patient with chest pain was more likely than not to indicate acute infarction. Although more sensitive, the electrocardiographer has an overall incidence of 5% false positive diagnoses, including a 22% incidence of false positive diagnoses in patients with isolated ST segment elevation. In contrast, the computer was nearly perfect at excluding patients without acute myocardial infarction, but did so at the expense of diminished sensitivity. Computer specificity was as high or higher than that of the electrocardiographer, regardless of the patient's clinical characteristics (gender, age, cardiac history or duration of symptoms), and inappropriate patients were extremely unlikely to be designated for thrombolytic intervention.

    The present algorithm is clearly adequate for first line screening of patients with chest pain by paramedics or in the emergency department. Its sensitivity is no worse than that of the emergency physician and its specificity is superior to that of a trained electrocardiographer. Use of the computer-interpreted ECG will provide for almost immediate triage of patients and obviate the time delays required when consulting with an electrocardiographer before proceeding with treatment. If findings are interpreted as normal or nondiagnostic, the ECG can then be further assessed by a skilled electrocardiographer. We are unaware of any similar extensive testing of other commercially available algorithms."


    Otto and Aufderheide¹ reinforce the usefulness of concomitant repolarization evidence for interpretation of acute infarction from the ECG. They state:

    ST segment elevation criteria are not the sole factors used to determine the final ECG diagnosis of acute myocardial infarction because they lack the added impact of observer interpretation. ST segment elevation criteria function to triage ECGs and represent a minimally acceptable guideline requiring additional observer interpretation for consideration of the acute myocardial infarction diagnosis. Significant observer variation in interpreting ECGs is well established. Improving the positive predictive value of ST segment elevation criteria is likely to decrease the impact of observer variation. This study demonstrated that inclusion of reciprocal changes in prehospital ST segment elevation criteria improves the positive predictive value from 49% to more than 90%. Clinically, reciprocal changes are present in patients with large myocardial infarctions, lower ventricular ejection fractions, and higher mortality rates. Reciprocal changes also can be seen in some patients with significant disease of a noninfarct related artery. Therefore, inclusion of reciprocal changes into ST segment elevation criteria not only significantly improves positive predictive value but also selects the subgroup of acute myocardial infarction patients that stands to benefit most from rapid intervention.

    To summarize, the current 12SL program relies heavily on concomitant or reciprocal repolarization changes, together with ST elevation, to achieve what GE Marquette believes is the best acute infarction package in the industry in terms of combined sensitivity and specificity.

    GE Marquette Medical Systems' position with respect to the use of the 12SL computerized ECG analysis program is: Despite the fact that the 12SL analysis program has a high level of accuracy, it will occasionally not correctly diagnose an ECG. The ECG tracing is significant only when interpreted in conjunction with clinical findings. Thus, it is critical that a physician use his or her best clinical judgment when reviewing the ECG interpretation and that all ECGs be reviewed by a physician.

    This explanation and information is intended to provide the user with a better, more accurate, and realistic understanding of the capabilities of the 12SL program for the detection of acute infarction. GE Marquette continually strives to make the 12SL ECG analysis algorithm better, and believes that the program currently is the best and most completely validated in the industry, although it is not perfect.

    1. Otto LA, Aufderheide TP: Evaluation of ST Segment Elevation Criteria for the Prehospital Electrocardiographic Diagnosis of Acute Myocardial Infarction. Annals of Emergency Medicine 1994; vol 23, no. 1:17-24

    Anterior Myocardial Infarction

    Selection of Patients

    Using a computerized database of the patients who had undergone cardiac catheterization at the Syracuse Veterans Administration


    Medical Center (VAMC) from 1975 through 1993, we identified all patients who were angiographically normal (normal subjects) and all patients who had angiographic evidence of previous Anterior Myocardial Infarction (AMI). We defined the former as those who had no evidence of coronary arterial abnormalities on coronary angiography in multiple projections, no abnormalities of the left ventricle in the right anterior oblique projection (mean ejection fraction >70), and normal (


    Repolarization

    GE Marquette 12SL Program accuracy for recognizing ST segment elevation associated with acute myocardial infarction has been evaluated in a large study (n=1,189) that acquired ECGs from patients within 6 hours of the onset of chest pain. This study used cardiac enzymes as the gold standard.¹ Their conclusion: the positive predictive value of the computer- and physician-interpreted ECG was, respectively, 94% and 86% and the negative predictive value was 81% and 85%.

    The present algorithm is clearly adequate for first line screening of patients with chest pain by paramedics or in the emergency department. Its sensitivity is no worse than that of the emergency physician and its specificity is superior to the trained electrocardiographer.

    Raw numbers for algorithm performance are given in Table 8.

    These results also led to the following conclusion: Although more sensitive, the electrocardiographer had an overall incidence of a 5% false positive diagnosis, including a 22% incidence of false positive diagnoses in patients with isolated ST segment elevation. In contrast, the computer was nearly perfect at excluding patients without acute myocardial infarction, but did so at the expense of diminished sensitivity.

    With regard to other repolarization abnormalities, program evaluation presents a difficult dilemma. As opposed to such abnormalities as hypertrophy and infarction, there are few gold standards against which to quantify performance for conditions such as metabolic disorders and drug effects. Furthermore, unlike rhythm and conduction, the ECG itself cannot reliably be used as the source of the correct interpretation of these abnormalities. This dilemma is especially evident for statements such as "non-specific ST/T abnormality." These statements refer to the recognition of electrocardiographic features that have been identified without quantification, and there is still no standard for measuring them. Without a standard and without an objective measure, either from the ECG itself or from other non-ECG data, it is difficult to quantify performance.

    1. Kudenchuk et al., 1991. Accuracy of computer-interpreted electrocardiography in selecting patients for thrombolytic therapy. JACC 17(7):1486-91

    Table 11. Sensitivity and Specificity for acute MI as determined by cardiac enzymes

                  SENSITIVITY        SPECIFICITY
    Computer      202/391 (52%)      785/798 (98%)
    Physician     259/391 (66%)      757/798 (95%)
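
    The predictive values quoted in the text (94% and 86% positive, 81% and 85% negative) can be re-derived from the raw counts in the table above. The following is a minimal sketch of that arithmetic, not code from the 12SL program; the class and field names are illustrative assumptions, and it assumes every patient falls into exactly one of the four outcome cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reader:
    """Counts for one ECG reader, copied from the sensitivity/specificity table.

    The names here are illustrative assumptions, not terms from the source.
    """
    true_pos: int    # enzyme-confirmed MI and called acute MI
    with_mi: int     # all enzyme-confirmed MI patients
    true_neg: int    # no MI and not called acute MI
    without_mi: int  # all patients without MI

    @property
    def false_neg(self) -> int:
        return self.with_mi - self.true_pos

    @property
    def false_pos(self) -> int:
        return self.without_mi - self.true_neg

    def sensitivity(self) -> float:
        return self.true_pos / self.with_mi

    def specificity(self) -> float:
        return self.true_neg / self.without_mi

    def ppv(self) -> float:
        # Positive predictive value: correct positive calls / all positive calls.
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        # Negative predictive value: correct negative calls / all negative calls.
        return self.true_neg / (self.true_neg + self.false_neg)


computer = Reader(true_pos=202, with_mi=391, true_neg=785, without_mi=798)
physician = Reader(true_pos=259, with_mi=391, true_neg=757, without_mi=798)

for name, r in (("Computer", computer), ("Physician", physician)):
    print(f"{name:9s} sens={r.sensitivity():.0%} spec={r.specificity():.0%} "
          f"PPV={r.ppv():.0%} NPV={r.npv():.0%}")
```

    With these counts the computer makes 13 false positive calls out of 798 patients without infarction and the physician makes 41, which is consistent with the 5% false positive incidence quoted for the electrocardiographer.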


    Overall Classification

    Several studies have addressed the issue of whether or not the computer can reliably classify the ECG as either normal or abnormal. These studies found:

    The GE Marquette program is reliable in diagnosing normality: even the disagreements are arguable.¹

    From a practical point of view, the eventual consensus opinion of the cardiologists was that only one tracing reported as normal by the GE Marquette system definitely should have been reported as abnormal to a family doctor, resulting in a negative predictive value of 98.4%. In view of the cardiologists' inter-observer variation with regard to what is normal, this may well be higher than an individual cardiologist's negative predictive value and suggests that the system examined may safely be used to exclude major abnormalities which would affect clinical management.

    A total of 39,238 electrocardiograms were reviewed... The program placed the ECG into the following diagnostic classifications: normal 22%, otherwise normal 6%, borderline 5%, abnormal 66%. The reviewing physician agreed with this classification in 96.3% of all cases... The most striking information shows the agreement of the physicians with the computer diagnosis of an abnormal electrocardiogram in 97.7% of the 25,295 tracings. In only 204 records out of 25,987 tracings (.8%), the physicians edited a computer-called abnormal electrocardiogram and changed it to normal. Likewise, in only 63 of 8,632 (.7%) tracings of which the computer called normal did the physicians edit this tracing to read abnormal.²
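
    The percentages in the passage above follow directly from the quoted counts. The sketch below only repeats that arithmetic; the variable names are illustrative assumptions. The passage gives two different totals for computer-called abnormal tracings (25,295 and 25,987), and this sketch assumes the 25,987 figure because that is the one paired with the 204 edited records.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts as quoted in the passage; names are illustrative.
total_tracings = 39_238
called_normal = 8_632
called_abnormal = 25_987          # assumption: the total paired with 204
abnormal_edited_to_normal = 204
normal_edited_to_abnormal = 63

print(f"called normal:    {percent(called_normal, total_tracings):.0f}% of all tracings")
print(f"called abnormal:  {percent(called_abnormal, total_tracings):.0f}% of all tracings")
print(f"abnormal -> normal edits: {percent(abnormal_edited_to_normal, called_abnormal):.1f}%")
print(f"normal -> abnormal edits: {percent(normal_edited_to_abnormal, called_normal):.1f}%")
```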

    1. Graham et al., 1986. User evaluation of a commercially available computerized electrocardiographic interpretation system. In: Willems et al., eds. Computer ECG Analysis: Towards Standardisation. Amsterdam: North Holland, 1986;191-3.

    2. Mulcahy et al., 1986. Can a computer diagnosis of normal ECG be trusted? Irish J of Med. Sci., 155(12):410-414


    Appendix I
    Mean Differences and Corresponding Standard Deviations (in ms) of Basic Intervals Derived from the Global Reference Standards

    CSI Program       P Duration   PR Interval  QRS Duration   QT Interval
                         (n=218)       (n=218)       (n=218)       (n=218)
                     Mean     SD   Mean     SD   Mean     SD   Mean     SD
    GE Marquette     -0.4    9.0   -0.6    5.8   -0.6    5.4    0.9   12.2
    Louvain           9.3    8.8    3.3    6.9    0.6    4.2   -7.0   11.6
    Hannover          2.4    7.7    2.6    7.3   -1.2    6.0    1.8    9.5
    HP                9.3   16.2    2.7   11.5    1.6    7.2    6.0   14.7
    IBM               7.8    7.5    1.2    5.8    8.2    7.7    0.2   10.7
    Nagoya            8.9   14.1    3.3   13.1    1.8   15.2    5.8   15.8
    Lyon             11.4    8.5   -5.6    6.0   -0.1    6.1    0.3   10.4
    AVA              -9.6    7.4   -3.8    4.8   -5.1    4.8   -7.3    8.9
    Glasgow           0.5    9.3   -1.9    7.0    0.1    4.9    4.5   12.2
    Halifax         -12.9   12.4   -3.9    6.3   -2.2    7.0   -3.2    9.2
    Padova           -2.8    8.7   -2.4    5.4   -3.4    5.8   -4.1    6.4
    Telemed          10.4   12.8   -0.2   10.3    8.6    8.1    4.6   12.3
    Modular           6.9    7.9    4.3    4.2    0.3    7.8    3.9    9.8
    Sicard-Riedl      1.7    8.1   -3.3    6.5    2.2    5.9   -0.9    7.9
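
    Each Mean/SD pair in the table above summarizes the differences between a program's interval measurements and the reference values. A minimal sketch of that kind of summary follows. The function name and the five sample values are hypothetical, not data from the study, and the use of the sample (n-1) standard deviation is an assumption, since the source does not say which form was used.

```python
from statistics import mean, stdev


def difference_stats(program_ms, reference_ms):
    """Mean and sample SD of (program - reference) differences, in ms.

    A mean near zero indicates little systematic bias; a small SD
    indicates consistent agreement with the reference.
    """
    if len(program_ms) != len(reference_ms):
        raise ValueError("need one reference value per program measurement")
    diffs = [p - r for p, r in zip(program_ms, reference_ms)]
    return mean(diffs), stdev(diffs)


# Hypothetical QRS durations (ms) for five ECGs; NOT data from the study.
reference = [92, 100, 88, 110, 96]
program = [90, 102, 88, 108, 98]

bias, spread = difference_stats(program, reference)
print(f"mean difference = {bias:+.1f} ms, SD = {spread:.1f} ms")
```

    Read this way, a row with a small mean difference but a large SD describes a program that is unbiased on average yet inconsistent from one ECG to the next, so both columns need to be considered together.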

    Appendix II
    Gold Standard Databases

    • Normals, proven via health screening examination (5000)
    • Prolonged QT syndrome, proven via personal and family history (300)
    • Inferior Myocardial Infarction, proven via CATH (700)
    • Inferior Myocardial Infarction, proven via ECHO (200)
    • Anterior Myocardial Infarction, proven via CATH (500)
    • Normals, proven via CATH (500)
    • Acute Myocardial Infarction, proven via cardiac enzymes (1200)
    • Myocardial Infarction, proven via autopsy (500)
    • Left Ventricular Hypertrophy, proven via ECHO (500)
    • Normals from Health and Nutrition Evaluation Survey (2000)
    • ECGs from Common Standards for Electrocardiography (1000)
    • Serial ECGs correlated with evolving infarction (several thousand)
