
Information Extraction from Multimodal ECG Documents

Fei Wang, Tanveer Syeda-Mahmood, David Beymer

IBM Almaden Research Center
650 Harry Road, San Jose, CA 95120
{wangfe,stf,beymer}@us.ibm.com

Abstract

With the rise of tools for clinical decision support, there is an increased need for automatic processing of electrocardiogram (ECG) documents. Many systems have already been developed to perform signal processing tasks such as 12-lead off-line ECG analysis and real-time patient monitoring. All these applications require accurate detection of the heart rate from the ECG. In this paper, we present the idea that the image form of an ECG is actually a better medium for detecting periodicity. When the ECG trace is scanned or rendered in videos, the peaks of the waveform (R waves) are often traced thicker due to pixel dithering. We exploit this pixel thickness information, for the first time, as a reliable feature for determining periodicity. Results are presented on a database of 16,613 12-channel ECG waveforms and demonstrate the robustness and accuracy of our image-based period detection method on ECGs covering various cardiovascular diseases. Using our estimated heart period as the disease criterion, 94.5% of bradycardia and tachycardia patient records are correctly identified.

1 Introduction

An electrocardiogram (ECG) is an important and commonly used aid in cardiovascular disease diagnosis. An ECG is an electrical recording of the heart that depicts the cardiac cycle, and it is routinely used as a first course of choice in diagnosing many cardiovascular diseases. Often, the electrical activity of the heart is recorded from 12 different leads. A normal ECG waveform (in lead II) has the characteristic shape shown in Fig. 1b. Many disturbances in heart function show up as characteristic variations of the sinus rhythm waveform of Fig. 1b and can serve as important cues for diagnosing disease. Physicians routinely make a diagnosis by simple visual examination of these ECG waveforms.

Figure 1. Illustration of the electrocardiogram. (a) The heart cycle. (b) A normal ECG.

With the rise of tools for clinical decision support, there has been an increased need for automatic processing of the electrocardiogram (ECG). Many systems have already been developed to perform signal processing tasks such as 12-lead off-line ECG analysis and real-time patient monitoring. All these applications require accurate detection of the heart rate so that the disease-related features within a heart cycle can be extracted. Furthermore, heart rate estimation is a prerequisite for identifying arrhythmias such as bradycardia and tachycardia. Much of the prior work has focused on determining the heart rate from digital ECG time series. Although most hospitals now have digital ECG recorders, a large amount of legacy ECG data is still in paper form, and its digital records are created as scanned images, as shown in Fig. 2b. Not much prior art exists on determining the period from ECG images obtained from scanned paper ECGs. Unlocking these paper ECG records and exposing them to digital analysis would be useful: it provides additional data for current ECG analysis techniques, as well as historical data for comparative studies. Digital ECG recordings are sampled very finely (1000 samples/sec) and contain a large amount of noisy data with baseline wandering problems, whereas the paper printouts are actually cleaner in appearance. Further, all 12-lead information is displayed compactly in a standardized (the lead order is fixed) 3-row by 4-column format.

Figure 2. Illustration of the electrocardiogram from different sources. (a) Digital ECG. (b) Paper ECG. (c) ECG embedded in an echocardiogram video. (d) ECG extracted from the echocardiogram frame in (c).

The goal of this paper is to present a computational algorithm, as well as a medical decision support tool, for finding the heart period from ECG images. Specifically, we use the knowledge that when an ECG trace is scanned or rendered in videos, the peaks of the waveform (R waves) are often traced thicker due to pixel dithering. We exploit this pixel thickness information, for the first time, as a reliable feature for determining periodicity. Existing period estimation algorithms, such as estimating R peaks as the largest positive peaks [13], often produce detection errors when the R peaks are lower than the P or T peaks. Our method is relatively robust in these cases, since the pixel-dithering feature it latches onto is more consistent among corresponding R peaks. Furthermore, using our estimated heart period as the disease criterion, 94.5% of bradycardia and tachycardia patient records are correctly identified on a database of 16,613 ECG waveforms.

This paper makes several novel contributions. To our knowledge, this is the first period estimation work on image ECG waveforms from scanned ECG printouts or ECGs embedded in echo videos. It is also the first practical application of image-based techniques that use pixel dithering to identify the ECG period. Our method can also be used for disease diagnosis validation in healthcare decision support applications, where it can identify arrhythmia ECG records from patients with bradycardia or tachycardia. Our algorithm can be especially useful for scanned paper ECGs and ECGs embedded in echocardiogram videos, and it also applies to digital ECGs once they are converted into image form.

While there is considerable prior art on ECG period detection from digital time series, there has not been much work on detecting periodicity directly from ECG images (rather than after conversion to a time series). We briefly review some related methods from the literature next.

1.1 Previous methods

Most existing approaches that deal with paper ECGs actually trace the signals to form a digital time series, so that time series correlation methods can be used to detect periodicity. A considerable number of algorithms are available for period detection in a single ECG; they have been analyzed extensively using a variety of classical methods, including spectral analysis, time-frequency analysis, wavelets, and machine learning (see the surveys in Koler et al. [4] and Tompkins [9]). One of the most popular approaches is based on detecting the R peak or QRS peak of the ECG waveform (Tompkins [9]). In Zimmerman et al. [13], the R wave is detected as the largest positive peak in a 3-second sample window, after the ECG signal is first filtered to remove the wandering zero-volt baseline. Similar approaches were used in Reisman et al. [6], except that different kinds of bandpass filters were employed. Examples of such filters include the LPC filter [4], the two-pole recursive filter [9], and the bandpass filter in Reisman et al. [6]. These approaches all assume that R peaks are the largest positive peaks in an ECG signal, which is not always true for ECG recordings. A modified feature-based autocorrelation function is used in Syeda-Mahmood et al. [8] and Wrublewski et al. [11], in which the peaks of the autocorrelation function correspond to the various periodicity patterns found in the signal. The most common inter-peak duration is taken as representative of the heart beat duration. This approach is, however, sensitive to noise, and it tends to produce a segment shorter than a heart cycle because of multiple peaks within a heart beat.
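To make the autocorrelation idea concrete, the following is a minimal sketch of period estimation from a plain autocorrelation function. It is not the modified feature-based variant of [8] and [11]; the function names, the lag bounds, and the synthetic spike train are assumptions made for illustration only.

```python
def autocorrelation(signal):
    """Return the autocorrelation of a mean-removed signal for lags 0..n-1."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Pick the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    acf = autocorrelation(signal)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])


# Synthetic stand-in for an ECG (assumption): one unit spike every 50 samples.
beat = [1.0] + [0.0] * 49
signal = beat * 10
period = estimate_period(signal, min_lag=10, max_lag=200)
print(period)
```

The result depends on the chosen lag range: a lower bound that is too small lets secondary peaks within a beat win, which is the failure mode described above.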

The power spectrum of the ECG waveform can provide useful information about the QRS complex. These spectra can be generated using either the wavelet transform (Saxena et al. [7] and Ghaffari et al. [3]) or the Fourier transform (Tompkins [9]). The peaks of the resulting frequency spectrum correspond to the peak energy of the QRS complex. For an overview of these methods, please refer to Chapter 12 of Tompkins' book [9]. These methods typically require that the frequency (or the scale, in the wavelet transform) of the QRS complex be known beforehand, so that QRS candidates can be searched for among the higher peaks in a defined vicinity.

More recently, there has been an increasing trend towards machine learning methods for heart period estimation [1]. For example, in Mehta et al. [5], a Support Vector Machine (SVM) was used as a classifier for detecting QRS complexes in the ECG, and Vijaya et al. [10] propose using neural networks to estimate ECG periods. A critical difference between these learning-based approaches and ours is that they require training on ECG data, whereas our method has no such requirement.

Another source of ECGs is the synchronization ECG used in echocardiograms, which appears as a waveform embedded in the images, as shown in Fig. 2c. The problem of periodicity estimation is frequently encountered in spatio-temporal analysis of echo videos (Ebadollahi et al. [2]). While it is difficult to estimate the heart cycle directly from the depicted heart region in the video, it is relatively easy to estimate the heart rate from the synchronizing ECG. Thus, methods are needed that can estimate the periodicity from ECGs embedded in images.

2 Model

In this section, we describe our algorithm in detail. Our method can be divided into two major steps: first we extract the ECG envelopes from images, and then we estimate the period based on these image-based features. Once the ECG is in image form, either scanned or extracted from echocardiographic videos, the next important step is to preprocess the data into a form that is useful for estimating the period. Our approach to period detection from ECG images is based on the key observation that the difference between the upper and lower envelopes of the ECG trace signals the local maxima and minima of the original signal. The upper envelope (shown in red) and the lower envelope (shown in blue) of the ECG are plotted in Fig. 4b. Note that the gap between the two envelopes increases significantly when the signal reaches a peak or valley. Our algorithm exploits this phenomenon to detect the heart rate; it cannot be exploited if the ECG is treated as a time series signal, since all points would then have equal tracing width. We first discuss the extraction of ECG waveforms from the scanned ECG images.

2.1 ECG envelope extraction

We extract the ECG waveforms as curves in the respective image segments, where each lead position in the image is segmented. Due to noise in the recordings as well as the stylus speed of the ECG recorder, there are often gaps that cause problems in curve extraction. Note that these recordings are actually time series, i.e., functions of lead voltage versus time. A general-purpose algorithm such as curve following or skeletonization may therefore fail to enforce the constraint that a single y value occurs for each x position in image coordinates. In places where the axis bifurcates or turns back on itself, a post-processing step would be required to choose between multiple values of y for a particular value of x.

Since the ECG curve may be multiple pixels thick at the scanned resolution, our curve tracing algorithm follows the upper and lower edges of the curve. Fig. 3 depicts these curves, $y_u(x)$ and $y_l(x)$. In our tracing algorithm, we continue the curve trace from $x - 1$, thus extending $y_u(x - 1)$ and $y_l(x - 1)$. We define two search ranges centered around these values:

$$R_u = [\, y_u(x-1) - W,\; y_u(x-1) + W \,] \tag{1}$$

$$R_l = [\, y_l(x-1) - W,\; y_l(x-1) + W \,] \tag{2}$$

where $W$ is a search window. In practice, $W$ needs to be set large enough to handle the voltage spike at an R wave. Next, we define

$$y_u(x) = \min_{y \in R_u} y \quad \text{such that } T(x, y) = 0 \tag{3}$$

$$y_l(x) = \max_{y \in R_l} y \quad \text{such that } T(x, y) = 0 \tag{4}$$

which keeps $y_u$ and $y_l$ tracing the upper and lower envelopes of the ECG curve. After tracing the envelopes, the average

$$y(x) = \frac{y_u(x) + y_l(x)}{2} \tag{5}$$

is used as the traced value $y(x)$.

To provide robustness to noise and missing curve fragments, we use morphological operators and a grouping algorithm across the gaps. A morphological open (erode + dilate) helps fill in holes inside the curve, and a morphological close (dilate + erode) eliminates noise pixels near the curve. To close small gaps, curve tracing is started from a number of seed points along the expected ECG curve location. Gaps between pairs of consecutive fragments are closed if the gap is small enough (4 pixels = 10 msec). As for gaps larger than this threshold, we have found that most of them occur at the R wave, where the signal spikes up and down. At the R wave, however, the curve is nearly vertical, so the sampling in equations (1)-(2) is nearly tangent to the curve. Thus, we can handle a large fraction of vertical dropout, as shown in Fig. 3. Fig. 4b illustrates the upper and lower envelopes of the ECG curve. The difference between the two envelopes, $y_{\mathrm{diff}} = y_u - y_l$, is shown in Fig. 4e. It is evident from this plot that the peak of the difference signal is quite consistent among all R peaks and can therefore serve as a good feature for identifying the R peak locations.
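The tracing rule of equations (1)-(5) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the image is taken to be a list of rows with `image[y][x] == 0` marking a curve pixel (playing the role of $T(x, y) = 0$), rows grow downward so the upper envelope has the smaller y, an empty column simply carries the previous value forward, and the envelope difference is reported as an absolute value so that it does not depend on the axis direction. The name `trace_envelopes` and the toy image are hypothetical.

```python
def trace_envelopes(image, y_start_upper, y_start_lower, W):
    """Trace upper/lower envelopes of a dark curve, column by column.

    image[y][x] == 0 marks a curve pixel; rows grow downward, so the
    upper envelope is the smaller y value in each column.
    """
    height, width = len(image), len(image[0])
    y_u, y_l = [y_start_upper], [y_start_lower]
    for x in range(1, width):
        # Search ranges R_u and R_l centered on the previous column's values.
        r_u = range(max(0, y_u[-1] - W), min(height, y_u[-1] + W + 1))
        r_l = range(max(0, y_l[-1] - W), min(height, y_l[-1] + W + 1))
        hits_u = [y for y in r_u if image[y][x] == 0]
        hits_l = [y for y in r_l if image[y][x] == 0]
        # Assumption: on an empty column (a gap), carry the last value forward.
        y_u.append(min(hits_u) if hits_u else y_u[-1])
        y_l.append(max(hits_l) if hits_l else y_l[-1])
    y_mid = [(u + l) / 2 for u, l in zip(y_u, y_l)]
    y_diff = [abs(u - l) for u, l in zip(y_u, y_l)]
    return y_u, y_l, y_mid, y_diff


# Toy image: a one-pixel baseline on row 6 with a vertical spike in column 3.
rows, cols = 9, 7
image = [[1] * cols for _ in range(rows)]
for x in range(cols):
    image[6][x] = 0
for y in range(1, 7):
    image[y][3] = 0

y_u, y_l, y_mid, y_diff = trace_envelopes(image, 6, 6, W=5)
print(y_diff)
```

In this toy case the envelope difference is nonzero only at the spike column, and `W=5` is just large enough for the upper envelope to jump to the spike tip and back, which mirrors the remark that $W$ must accommodate the R wave.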

2.2 ECG period detection

Figure 3. Illustration of curve tracing across gaps in ECG images. The upper and lower bounding curves of the ECG, $y_u$ and $y_l$, are shown in green and red. The R wave spikes show a number of gaps in the original curve that are correctly bridged by our tracing technique.

Figure 4. Illustration of various waveforms from different stages of the period detection algorithm.

Differentiation forms the basis of many QRS detection algorithms [12, 11]. The differentiator, in effect, acts as a high-pass filter that characterizes the steep slope of the QRS complex of the ECG signal. Specifically, the first-order derivative of the ECG signal amplifies the higher frequencies characteristic of the QRS complex while attenuating the lower frequencies of the P and T waves. For these reasons, we take the product of the ECG derivative with the envelope difference signal to filter out the high-frequency noise of the envelope difference signal. Differentiator filters take several different forms [4]; we choose to approximate the derivative of the input ECG signal by the backward difference.
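As a short worked note on why the backward difference behaves as a high-pass filter, the standard frequency-response identity is given below. It assumes unit sample spacing (one sample per pixel column) and is a textbook signal-processing result rather than something derived in this paper.

```latex
y'(x) \approx y(x) - y(x-1),
\qquad
H(e^{j\omega}) = 1 - e^{-j\omega},
\qquad
\left| H(e^{j\omega}) \right| = \sqrt{2 - 2\cos\omega} = 2\left|\sin(\omega/2)\right|
```

The gain is zero at $\omega = 0$ and rises to 2 at $\omega = \pi$, so slowly varying components such as the P and T waves are attenuated relative to the steep QRS slopes.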

Once the ECG envelopes are extracted from the image, the final feature is the product of three waveforms, expressed as

$$S(x) = \overrightarrow{y}(x) * \overrightarrow{y_{\mathrm{diff}}}(x) * \overrightarrow{y'}(x) \tag{6}$$

where $\overrightarrow{y}$ is the normalized ECG waveform with the wandering baseline removed, $\overrightarrow{y_{\mathrm{diff}}}$ is the normalized envelope difference (shown in Fig. 4e), and $\overrightarrow{y'}$ is the normalized first-order ECG derivative. The normalized product waveform is shown in Fig. 4f, in which the peaks of the product correspond to the R waves of the ECG signal. The period of the ECG can therefore be estimated as the interval between these peaks.
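The product waveform of equation (6) and the peak-interval period estimate can be sketched as below. Several details are assumptions, because the text does not specify them: normalization is taken as scaling by the largest absolute value, baseline removal is approximated by subtracting the median, and peaks are local maxima above half the maximum of $S(x)$. All function names and the synthetic beat are hypothetical.

```python
from statistics import median


def normalize(values):
    """Scale so the largest absolute value is 1 (assumed normalization)."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


def backward_difference(values):
    """First-order derivative approximation: y'(x) = y(x) - y(x - 1)."""
    return [0.0] + [values[i] - values[i - 1] for i in range(1, len(values))]


def product_waveform(y, y_diff):
    """S(x): pointwise product of normalized ECG, envelope gap and derivative."""
    baseline = median(y)  # assumed stand-in for wandering-baseline removal
    y_n = normalize([v - baseline for v in y])
    d_n = normalize(y_diff)
    dy_n = normalize(backward_difference(y))
    return [a * b * c for a, b, c in zip(y_n, d_n, dy_n)]


def period_from_peaks(s, threshold=0.5):
    """Median spacing between local maxima of S above threshold * max(S)."""
    level = threshold * max(s)
    peaks = [
        x for x in range(1, len(s) - 1)
        if s[x] >= level and s[x] > s[x - 1] and s[x] >= s[x + 1]
    ]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return median(gaps) if gaps else None


# Synthetic 40-sample beat (assumption): a sharp R spike at sample 10 and a
# broader T wave that is taller than the R spike.
beat_y = [0.0] * 40
beat_y[10] = 1.0
beat_y[22:29] = [0.3, 0.7, 1.0, 1.2, 1.0, 0.7, 0.3]
# Envelope gap: thin trace everywhere except at the R spike.
beat_gap = [1.0] * 40
beat_gap[10] = 5.0

y, y_diff = beat_y * 5, beat_gap * 5
s = product_waveform(y, y_diff)
period = period_from_peaks(s)
print(period)
```

In this constructed example the tallest sample of `y` lies on the T wave, yet the largest values of `s` fall on the R spikes, because only there do a large envelope gap and a steep slope coincide.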

3 Experimental Results

We now present experimental results from applying our algorithm both to ECGs synthesized from digital waveforms and to ECGs extracted from echocardiogram videos. First, to demonstrate the robustness of our algorithm, we apply it to an ECG from an atrial fibrillation patient, in which the ECG waveform is quite noisy and the QRS complexes are not pronounced. Conventional approaches such as the derivative-based method fail to recover the period, as is evident from the derivative waveform in Fig. 5b. The product waveform S(x) produced by our algorithm is shown in Fig. 5c. It is evident that our method is able to pick up the R peaks from the noisy waveform, even when the noise level is high. Our method exhibits stronger resistance to noise than the derivative-based and other filter-based approaches.

Next, we conduct experiments on a large collection of digital ECGs obtained from a large hospital network. The database contains 16,613 12-channel ECG waveforms, of which 13,043 come with a ground-truth heart rate. Using our method, we achieve an accuracy of 88%, where an estimate counts as correct if its error relative to the ground truth is under 5%. In comparison, cross-correlation achieves only 72% accuracy. A closer examination of the cases in which the relative error exceeds 5% reveals that the majority are either (i) arrhythmia cases in which each channel gives a distinct period estimate, which confuses the algorithms, or (ii) cases in which the ground truth for the ECG is inaccurate. We believe the accuracy of our algorithm can be further improved if those cases are excluded. Additionally, of the 16,613 ECGs, we were able to find a period for 2,641 more cases in which the patients had no documented period. We found 2,538 cases of bradycardia and 1,541 cases of tachycardia among those matching the ground truth, which represents 94.5% of the bradycardia and tachycardia patients when compared with the ground-truth disease labels. Sample bradycardia and tachycardia ECGs retrieved by our algorithm are shown in Fig. 6.
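How an estimated period might be turned into a bradycardia or tachycardia label can be sketched as follows. The paper does not state its exact criterion, so the cutoffs below (under 60 and over 100 beats per minute, the commonly used adult resting-rate thresholds) are assumptions, as are the function names. The default pixel scale reuses the figure quoted in Section 2.1 (4 pixels = 10 msec) and would differ for other scans or for digital data.

```python
def heart_rate_bpm(period_px, ms_per_px=2.5):
    """Convert a period in pixels to beats per minute.

    The default 2.5 ms/pixel follows the scale quoted in Section 2.1
    (4 pixels = 10 msec); other scans would need their own value.
    """
    period_s = period_px * ms_per_px / 1000.0
    return 60.0 / period_s


def classify_rate(bpm, brady_below=60.0, tachy_above=100.0):
    """Label a heart rate using assumed adult resting-rate cutoffs."""
    if bpm < brady_below:
        return "bradycardia"
    if bpm > tachy_above:
        return "tachycardia"
    return "normal rate"


# Illustrative periods in pixels (made up, not data from the paper).
for period_px in (200, 320, 480):
    bpm = heart_rate_bpm(period_px)
    print(period_px, round(bpm, 1), classify_rate(bpm))
```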

Figure 5. Robustness in the presence of large noise. (a) ECG image; (b) first-order derivative of the ECG waveform; (c) product waveform from our method.

Figure 6. Sample bradycardia and tachycardia ECGs retrieved by our algorithm. (a) Bradycardia. (b) Tachycardia.

Finally, we apply our algorithm to estimate the heart cycle from ECGs extracted from a large collection of 1,178 echocardiogram video sequences. For each of these video clips, we first extract the region of interest that contains the ECG signal and feed it to our period detection algorithm. After estimating the period for all these videos, we randomly pick 100 videos with ground-truth periods. On these we achieve an accuracy of 84%, where the error between the ground truth and the estimate is under 5%. We believe the main cause of error is the variation of the heart beat within the single-channel ECG waveform embedded in the videos.
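The accuracy figures above count an estimate as correct when its relative error against the ground truth is under 5%. A minimal sketch of that metric is shown below; the function name and the sample values are made up for illustration and are not data from the paper.

```python
def accuracy_within_tolerance(estimates, ground_truth, tolerance=0.05):
    """Fraction of records whose relative error is under `tolerance`."""
    hits = sum(
        1 for est, ref in zip(estimates, ground_truth)
        if abs(est - ref) / ref < tolerance
    )
    return hits / len(ground_truth)


# Made-up heart rates (bpm), for illustration only; not data from the paper.
estimated = [72.0, 58.0, 131.0, 90.0]
reference = [70.0, 60.0, 120.0, 91.0]
accuracy = accuracy_within_tolerance(estimated, reference)
print(accuracy)
```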

4 Conclusions

In this paper, we present a novel computational algorithm, as well as a medical decision support tool, for finding the heart period from multimodal electrocardiogram (ECG) documents. We exploit pixel thickness information, for the first time, as a reliable feature for determining periodicity. Existing period estimation algorithms, such as estimating R peaks as the largest positive peaks, often produce detection errors when the R peaks are lower than the P or T peaks. Our method is relatively robust in these cases, since the pixel-dithering feature it latches onto is more consistent among corresponding R peaks. Furthermore, we are able to correctly identify 94.5% of bradycardia and tachycardia patient records (on a database of 16,613 ECG waveforms) using our estimated heart period as the disease criterion. Future work will focus on building a mathematical model of the dithering effect so that our algorithm can be generalized to documents of other digital signals.

References

[1] H. Bunke and A. Sanfeliu, editors. Syntactic and Structural Pattern Recognition Theory and Applications. World Scientific, 1990.

[2] S. Ebadollahi, S.-F. Chang, and H. Wu. Automatic view recognition in echocardiogram videos using parts-based representation. In CVPR, pages 2–9, 2004.

[3] A. Ghaffari, H. Golbayani, and M. Ghasemi. A new mathematical based QRS detector using continuous wavelet transform. Comput. Electr. Eng., 34(2):81–91, 2008.

[4] B.-U. Koler, C. Hennig, and R. Orglmeister. The principles of software QRS detection. IEEE Engineering in Medicine and Biology Magazine, 21:42–57, 2002.

[5] J. Mehta and N. S. Lingayat. Comparative study of QRS detection in single lead and 12-lead ECG based on entropy and combined entropy criteria using support vector machine. JATIT, 3:8–18, 2007.

[6] S. Reisman and W. Tapp. Variable threshold R wave detector: Use in automated ECG processing. Physiology & Behavior, 35:815–818, 1985.

[7] S. C. Saxena, V. Kumar, and S. Hamde. Feature extraction from ECG signals using wavelet transform for disease diagnostics. IJSS, 33:1073–1085, 2002.

[8] T. Syeda-Mahmood, D. Beymer, and F. Wang. Shape-based matching of ECG recordings. IEEE International Conference on Engineering in Medicine and Biology, pages 2012–2018, 2007.

[9] W. J. Tompkins, editor. Biomedical digital signal processing: C-language examples and laboratory experiments for the IBM PC. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1993.

[10] G. Vijaya, V. Kumar, and H. Verma. ANN-based QRS-complex analysis of ECG. J Med Eng Technol., pages 160–167, 1998.

[11] T. A. Wrublewski, Y. Sun, and J. Beyer. Real-time early detection of R waves of ECG signals. Engineering in Medicine and Biology Society, pages 38–39, 1989.

[12] Y.-C. Yeh and W.-J. Wang. QRS complexes detection for ECG signal: The difference operation method. Comput. Methods Prog. Biomed., 91(3):245–254, 2008.

[13] T. Zimmerman and T. Syeda-Mahmood. Automatic detection of heart disease from twelve channel electrocardiogram waveforms. In Computers in Cardiology, pages 809–812, 2007.