

Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network

Sandeep Chaplot a, L.M. Patnaik a,*, N.R. Jagannathan b

a Computational Neurobiology Group, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560012, India

b Department of Nuclear Magnetic Resonance, All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India

Received 17 February 2006; received in revised form 8 May 2006; accepted 11 May 2006

Abstract

In this paper, we propose a novel method using wavelets as input to neural network self-organizing maps and support vector machine for classification of magnetic resonance (MR) images of the human brain. The proposed method classifies MR brain images as either normal or abnormal. We have tested the proposed approach using a dataset of 52 MR brain images. A classification accuracy of more than 94% was achieved using the neural network self-organizing maps (SOM) and 98% using the support vector machine. We observed that the classification rate is higher for the support vector machine classifier than for the self-organizing map-based approach.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Magnetic resonance imaging (MRI); Discrete wavelet transform (DWT); Artificial neural network (ANN); Self-organizing maps (SOM); Support

vector machine (SVM)

www.elsevier.com/locate/bspc

Biomedical Signal Processing and Control 1 (2006) 86–92

1. Introduction

Magnetic resonance (MR) imaging was introduced into

clinical medicine and has ever since assumed an unparalleled

role of importance in brain imaging. Magnetic resonance

imaging is an advanced medical imaging technique that has

proven to be an effective tool in the study of the human brain.

The rich information that MR images provide about the soft

tissue anatomy has dramatically improved the quality of brain

pathology diagnosis and treatment. However, the amount of

data is far too much for manual interpretation and hence there is

a great need for automated image analysis tools. The most

important advantage of MR imaging is that it is a non-invasive

technique. The level of detail that can be seen is extraordinary

as compared to other imaging modalities. MR imaging provides

two-dimensional and three-dimensional images of organs and

structures inside the body.

* Correspondence to: Microprocessor Applications Laboratory, CEDT Building, First Floor, Room No. 239, Indian Institute of Science, Bangalore, Karnataka 560012, India. Tel.: +91 80 23600451; fax: +91 80 2360683.

E-mail address: [email protected] (L.M. Patnaik).

1746-8094/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.

doi:10.1016/j.bspc.2006.05.002

Extracting essential features from the MR brain images is

imperative for proper analysis of these images. Independent

component analysis (ICA) [1], Fourier transform [2] and

wavelet transform [3] are a few of the methods used to extract

features from images. Although a relatively recent construct,

wavelets have become a tool of choice for engineers, physicists,

and mathematicians for time–space–frequency analysis. Unlike

Fourier transform, which provides only frequency analysis of

signals, wavelet transforms provide time–frequency analysis,

which is particularly useful for pattern recognition.

We used machine learning algorithms to obtain the

classification of images under two categories, either normal

or abnormal.

Artificial neural networks (ANNs) [4,5] are biologically

inspired. They are composed of many non-linear computational

elements operating in parallel and are arranged in patterns similar

to biological neural nets. ANNs modify their behavior in

response to their environment, learn from experience, and

generalize from previous examples to new ones. ANNs have

become the preferred technique for a large class of pattern

recognition tasks. Lippmann [6] has given an excellent review on

ANNs. A self-organizing map (SOM) is an unsupervised

algorithm that has advantages over other networks: it can form

similarity diagrams automatically and can produce abstractions.


The self-organizing map is generalizable to non-vectorial, say

symbolic, data, whereas the other networks are not [7].

The support vector machine (SVM) is a machine learning technique that originated from statistical learning theory [8] and is used here for the classification of images. Its primary advantage is that it is able to model highly non-linear systems, and the special properties of the decision surface ensure very good generalization. The SVM has been widely used in pattern recognition applications due to its computational efficiency and good generalization performance.

Thus, ANN and SVM are very attractive in recognition and

classification tasks. The results of both the approaches are

compared.

The rest of the paper is organised as follows. A short

description of the input dataset of images and the methods for

feature extraction as well as for classification are presented in

Section 2. Section 3 contains results and discussion, while

conclusions are presented in Section 4.

2. Materials and methods

2.1. Input dataset

The input dataset consists of axial, T2-weighted, 256 × 256 pixel MR brain images (Fig. 1). These images were downloaded from the Harvard Medical School website (http://med.harvard.edu/AANLIB/) [9]. Only those sections of the brain in which the lateral ventricles are clearly seen are considered in our study. The input dataset contains 52 MR brain images, of which 6 are of normal brains and 46 are of abnormal brains. The abnormal image set consists of images of brains affected by Alzheimer’s disease.

Fig. 1. An axial, T2-weighted MR brain image (http://med.harvard.edu/AANLIB/).

The remarkable feature of a normal human brain is the symmetry that it exhibits in axial and coronal images. Asymmetry in an axial MR brain image strongly indicates abnormality. Hence symmetry in axial MR images is an important feature that needs to be considered in deciding whether the MR image at hand is of a normal or an abnormal brain. A normal and an abnormal T2-weighted MR brain image are shown in Fig. 2a and b, respectively. The lack of symmetry in the abnormal brain MR image is clearly seen in Fig. 2b.

Fig. 2. Axial T2-weighted MR brain images: (a) normal brain; (b) abnormal brain (http://med.harvard.edu/AANLIB/).

Asymmetry beyond a certain degree is a sure indication of a diseased brain, and this has been exploited in our work for an initial classification at a gross level. For further detailed classification, we extract finer features using wavelets and use these features for self-organizing neural network and support vector machine classification.

2.2. Wavelet-based feature extraction

Wavelets are mathematical functions that decompose data into different frequency components and then study each component with a resolution matched to its scale. Wavelets have emerged as powerful new mathematical tools for analysis of complex datasets. The Fourier transform provides a representation of an image based only on its frequency content. Hence this representation is not spatially localized, while wavelet functions are localized in space. The Fourier transform decomposes a signal into a spectrum of frequencies, whereas wavelet analysis decomposes a signal into a hierarchy of scales, starting from the coarsest scale. Hence the wavelet transform [3], which provides a representation of an image at various resolutions, is a better tool for feature extraction from images.

2.2.1. Discrete wavelet transform (DWT)

The DWT is an implementation of the wavelet transform using a discrete set of wavelet scales and translations obeying some defined rules. For practical computations, it is necessary to discretize the wavelet transform. The scale parameter ($s$) is discretized on a logarithmic grid. The translation parameter ($\tau$) is then discretized with respect to the scale parameter, i.e. sampling is done on the dyadic sampling grid (the base of the logarithm is usually chosen as two). The discretized scale and translation parameters are given by $s = 2^{-m}$ and $\tau = n 2^{-m}$, where $m, n \in \mathbb{Z}$, the set of all integers. Thus, the family of wavelet functions is represented in Eq. (1),

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^{m} t - n). \tag{1}$$

The wavelet transform decomposes a signal $x(t)$ into a family of synthesis wavelets as given below in Eqs. (2) and (3),

$$x(t) = \sum_{m} \sum_{n} c_{m,n}\, \psi_{m,n}(t), \tag{2}$$

where

$$c_{m,n} = \langle x(t), \psi_{m,n}(t) \rangle. \tag{3}$$

For a discrete-time signal $x[n]$, the wavelet decomposition on $I$ octaves is given by

$$x[n] = \sum_{i=1}^{I} \sum_{k \in \mathbb{Z}} c_{i,k}\, g_i[n - 2^{i}k] + \sum_{k \in \mathbb{Z}} d_{I,k}\, h_I[n - 2^{I}k], \tag{4}$$

where $c_{i,k}$, $i = 1, \ldots, I$, are the wavelet coefficients and $d_{I,k}$ are the scaling coefficients.

The wavelet and the scaling coefficients are given by

$$c_{i,k} = \sum_{n} x[n]\, g_i^{*}[n - 2^{i}k], \tag{5}$$

$$d_{I,k} = \sum_{n} x[n]\, h_I^{*}[n - 2^{I}k], \tag{6}$$

where $g_i[n - 2^{i}k]$ and $h_I[n - 2^{I}k]$ represent the discrete wavelets and scaling sequences, respectively, and (*) indicates the complex conjugate.

2.2.2. DWT in two dimensions

In case of images, the DWT is applied to each dimension separately. This results in an image $Y$ being decomposed into a first level approximation component $Y^{1}_{a}$ and detailed components $Y^{1}_{h}$, $Y^{1}_{v}$ and $Y^{1}_{d}$, corresponding to horizontal, vertical and diagonal details [10]. Fig. 3 depicts the process of an image being decomposed into approximate and detailed components.

Fig. 3. Wavelet transform of an image $Y$ up to level two. $Y^{1}_{a}$ and $Y^{2}_{a}$ are first and second level approximation components, respectively. $Y^{1}_{h}$, $Y^{1}_{v}$ and $Y^{1}_{d}$ are first level detail components. $Y^{2}_{h}$, $Y^{2}_{v}$ and $Y^{2}_{d}$ are second level detail components.

The approximation component ($Y_a$) contains low frequency components of the image while the detailed components ($Y_h$, $Y_v$ and $Y_d$) contain high frequency components. Thus,

$$Y = Y^{1}_{a} + \{ Y^{1}_{h} + Y^{1}_{v} + Y^{1}_{d} \}. \tag{7}$$

If the DWT is applied to $Y^{1}_{a}$, the second level approximation and detailed components are obtained. Higher-level decomposition is performed similarly. If the process is repeated up to $N$ levels, the image $Y$ can be written in terms of the $N$th approximation component ($Y^{N}_{a}$) and all detailed components as given below in Eq. (8),

$$Y = Y^{N}_{a} + \sum_{i=1}^{N} \{ Y^{i}_{h} + Y^{i}_{v} + Y^{i}_{d} \}. \tag{8}$$

At each decomposition level, the length of the decomposed signals is half the length of the signal in the previous stage. Hence the size of the approximation component obtained from the first level decomposition of an N × N image is N/2 × N/2, that from the second level is N/4 × N/4, and so on. As the level of decomposition is increased, a more compact but coarser approximation of the image is obtained. Thus, wavelets provide a simple hierarchical framework for interpreting the image information [11].
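The repeated decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the Haar wavelet rather than the DAUB4 wavelet adopted in this paper, because the Haar filters are short enough to write out by hand, and the names `haar_dwt2`, `decompose` and `energy` are hypothetical helpers, not part of any toolbox. It shows each level halving the side of the approximation component while the total energy of the coefficients stays equal to that of the image.

```python
def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar DWT (illustrative stand-in for DAUB4).

    img is a list of rows with an even number of rows and columns.
    Returns (approx, horiz, vert, diag), each half the size of img.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top pair of the 2x2 block
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom pair of the block
            a_row.append((p + q + s + t) / 2)  # local average (low frequency)
            h_row.append((p + q - s - t) / 2)  # change from one row to the next
            v_row.append((p - q + s - t) / 2)  # change from one column to the next
            d_row.append((p - q - s + t) / 2)  # diagonal change
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def decompose(img, levels):
    """Re-apply the DWT to the approximation component, level by level."""
    details = []
    approx = img
    for _ in range(levels):
        approx, h, v, d = haar_dwt2(approx)
        details.append((h, v, d))
    return approx, details


def energy(mat):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(x * x for row in mat for x in row)


image = [[(r * c) % 7 for c in range(8)] for r in range(8)]  # toy 8 x 8 "image"
approx2, details = decompose(image, levels=2)
print(len(approx2), len(approx2[0]))  # 2 2
```

Using the level-2 approximation as the feature vector keeps 4 of the 64 values of this toy image, which mirrors the trade-off between compactness and coarseness noted in the text.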

2.2.3. Choice of mother wavelet

The basis function of a wavelet transform, which is compressed or localized, is called the mother wavelet of the transform. In case of MR images of the brain, the pixel intensity values vary smoothly, and such variation cannot be represented very efficiently by a Haar wavelet [12]. The Daubechies-4 (DAUB4) wavelet [13], though more expensive to compute, has the advantage of better resolution for smoothly changing signals compared to the Haar wavelet. Hence we have chosen the Daubechies-4 wavelet, which renders excellent classification accuracy. In this paper, we have extracted the DAUB4 wavelet approximation coefficients of the MR brain images and used them as the feature vector for classification.
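The advantage of a Daubechies wavelet on smoothly varying data can be illustrated numerically. The sketch below is an assumption-laden example rather than the authors' procedure: it uses the four-tap Daubechies filter purely for illustration (the filter length used in the authors' toolbox call is not stated here), ignores boundary handling by skipping windows that would run past the end of the signal, and the helper names `high_pass` and `detail_coeffs` are hypothetical. On a linear ramp, the Daubechies detail coefficients vanish, so all information stays in the approximation, whereas the Haar detail coefficients do not.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Four-tap Daubechies low-pass (scaling) filter; an assumed stand-in for DAUB4.
D4_LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
          (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
HAAR_LOW = [1 / math.sqrt(2.0), 1 / math.sqrt(2.0)]


def high_pass(low):
    """Quadrature-mirror high-pass filter: g[k] = (-1)^k * low[L - 1 - k]."""
    n = len(low)
    return [(-1) ** k * low[n - 1 - k] for k in range(n)]


def detail_coeffs(signal, low):
    """One level of detail coefficients, using only windows that fit in the signal."""
    g = high_pass(low)
    last_start = len(signal) - len(g)
    return [sum(g[k] * signal[i + k] for k in range(len(g)))
            for i in range(0, last_start + 1, 2)]


ramp = [float(n) for n in range(16)]        # smoothly (linearly) varying signal
haar_detail = detail_coeffs(ramp, HAAR_LOW)  # every entry is about -0.7071
d4_detail = detail_coeffs(ramp, D4_LOW)      # every entry is zero up to rounding
```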

2.3. Self-organizing maps (SOM)

Kohonen et al. [7] have proposed an unsupervised method called the self-organizing map. The SOM [14] is a powerful neural network method for the analysis and visualization of high-dimensional data. It maps non-linear statistical relationships between high-dimensional measurement data into simple geometric relationships, usually on a two-dimensional grid. Because the SOM is an unsupervised method, we just give it a series of input patterns, and it learns for itself how to group these together so that similar patterns produce similar outputs.

The first part of a SOM is the data. The idea of self-organizing maps is to project the n-dimensional data into something that can be better understood visually. The second component of a SOM is the set of weight vectors.

The input data points are mapped onto SOM units on a two-dimensional grid. The mapping is learnt from the training data samples by a simple stochastic learning process, in which the SOM units are adjusted by small steps with respect to feature vectors that are extracted from the data and presented one after another (in random order).

SOMs organise themselves by competing for representation of the samples. Neurons are also allowed to change themselves by learning to become more like the samples, in the hope of winning the next competition. It is this selection and learning process that makes the weights organize themselves into a map representing similarities.

The first step in constructing a SOM is to initialize the weight vectors. From that point, we select a sample vector randomly and search the map of weight vectors to find the weight vector that best represents that sample. Since each weight vector has a location, it also has neighbouring weights that are close to it. The steps of the algorithm are as follows:

Step 1. Initializing the weights: initialize each weight $w_i = [w_{i1}, w_{i2}, \ldots, w_{ij}]^{T} \in \mathbb{R}^{j}$. The weights are initialized to random numbers between 0 and 0.5.

Step 2. Get the best matching unit: present an input pattern $x = [x_1, x_2, \ldots, x_j]^{T} \in \mathbb{R}^{j}$ to the SOM. Calculate the distance from each weight ($w_i$) to the chosen sample vector $x$ and identify the winning weight, or best matching unit, $c$ such that $\lVert x - w_c \rVert = \min_i \{ \lVert x - w_i \rVert \}$. The weight with the shortest distance is the winner; if more than one weight has the same shortest distance, the winning weight is chosen randomly among them.

Step 3. Scale neighbors: adjust the weights of the best matching unit $c$ and all neighbor units according to $w_i(t+1) = w_i(t) + h_{ci}(t)[x(t) - w_i(t)]$, where $i$ is the index of the neighbor weight and $t$ is an integer, the discrete-time coordinate. The neighborhood kernel $h_{ci}(t)$ is a function of time and of the distance between neighbor weight $i$ and the winning weight $c$; it defines the region of influence that the input pattern has on the SOM and consists of two parts [15], the neighborhood function $h(\lVert \cdot \rVert, t)$ and the learning rate function $\alpha(t)$. The kernel is given by $h_{ci}(t) = h(\lVert r_c - r_i \rVert, t)\, \alpha(t)$, where $r$ is the location of the weight on the 2D map grid. The learning rate function $\alpha(t)$ is a decreasing function of time.

Step 4. The process is repeated for a given number of iterations by changing the learning parameters (initial weight range and initial learning rate) and the radius of the neighborhood neurons.
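Steps 1 to 4 can be sketched as a short Python program. This is a minimal illustration and not the authors' Matlab program: the Gaussian form of the neighborhood function, the linear decay of the learning rate and radius, the grid size and the toy data are all assumptions made for the example, and the function names are hypothetical.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance; it has the same minimiser as the distance itself."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x, rng):
    """Step 2: index of the weight closest to x; ties are broken at random."""
    dists = [squared_distance(w, x) for w in weights]
    shortest = min(dists)
    return rng.choice([i for i, d in enumerate(dists) if d == shortest])


def train_som(data, rows, cols, iterations, alpha0=0.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]  # locations r_i on the 2-D map
    # Step 1: random initial weights between 0 and 0.5.
    weights = [[rng.uniform(0.0, 0.5) for _ in range(dim)] for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    for t in range(iterations):  # Step 4: repeat for a given number of iterations
        remaining = 1.0 - t / iterations
        alpha = alpha0 * remaining            # assumed: linearly decreasing learning rate
        sigma = max(sigma0 * remaining, 0.5)  # assumed: shrinking neighborhood radius
        x = rng.choice(data)                  # samples are presented in random order
        c = best_matching_unit(weights, x, rng)
        for i, w in enumerate(weights):
            # Step 3: h_ci(t) = h(||r_c - r_i||, t) * alpha(t), with an assumed Gaussian h.
            h_ci = alpha * math.exp(-squared_distance(grid[c], grid[i]) / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += h_ci * (x[k] - w[k])
    return weights, grid


# Toy two-cluster data standing in for wavelet feature vectors.
data = [[0.10, 0.10], [0.15, 0.05], [0.90, 0.90], [0.85, 0.95]]
weights, grid = train_som(data, rows=2, cols=2, iterations=200)
```

Because every update is a convex combination of the old weight and the presented sample, the weights always remain inside the range spanned by the initial weights and the data.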

2.4. Support vector machine (SVM)

SVM is a binary classification method that takes as input labeled data from two classes and outputs a model file for classifying new unlabeled/labeled data into one of two classes. The SVM originated from the idea of structural risk minimization developed by Vapnik. Support vector machines are primarily two-class classifiers that have been shown to be an attractive and more systematic approach to learning linear or non-linear class boundaries. The use of SVM, like any other machine learning technique, involves two basic steps, namely training and testing. Training an SVM involves feeding known data to the SVM along with previously known decision values, thus forming a finite training set. It is from the training set that an SVM gets its intelligence to classify unknown data.

2.4.1. Review of SVM learning for classification

Let vector $x \in \mathbb{R}^{n}$ denote a pattern to be classified, and let scalar $y$ denote its class label, that is, $y = \pm 1$. Let $\{(x_i, y_i),\ i = 1, 2, \ldots, l\}$ denote a given set of $l$ training samples. The problem is how to construct a classifier (i.e. a decision function $f(x)$) that can correctly classify an input pattern $x$ that is not necessarily from the training set.

2.4.2. Linear SVM classifier

This is the simplest case, in which the input patterns are linearly separable. There exists a linear function of the form

$$f(x) = W^{T}x + b, \tag{9}$$

such that for each training example $x_i$, the function yields $f(x_i) \ge 0$ for $y_i = +1$ and $f(x_i) < 0$ for $y_i = -1$. Hence, training samples from the two different classes are separated by the hyperplane

$$f(x) = W^{T}x + b = 0. \tag{10}$$

For a given set, there exist many hyperplanes that separate the two classes, but the SVM classifier is based on the hyperplane that maximizes the separating margin between the two classes [16].
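The margin can be made explicit with a standard rescaling argument; the short derivation below is a textbook sketch added for clarity, not a step stated in this paper. If $W$ and $b$ are scaled so that the training points closest to the hyperplane satisfy $|f(x_i)| = 1$, then

$$y_i \,(W^{T}x_i + b) \ge 1, \quad i = 1, 2, \ldots, l, \qquad \text{margin} = \frac{2}{\lVert W \rVert},$$

so maximizing the margin is equivalent to minimizing $\tfrac{1}{2}\lVert W \rVert^{2}$, which is the first term of the criterion in Eq. (12).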

2.4.3. Non-linear SVM classifier

The linear SVM classifier can be readily extended to a non-linear classifier by first using a non-linear operator $\Phi(\cdot)$ to map the input pattern $x$ into a higher-dimensional space. The non-linear classifier so obtained is defined as in Eq. (11),

$$f(x) = W^{T}\Phi(x) + b, \tag{11}$$

which is linear in terms of the transformed data $\Phi(x)$ but non-linear in terms of the original data $x \in \mathbb{R}^{n}$. Following the non-linear transformation, the parameters of the decision function $f(x)$ are determined by the following minimization criterion,

$$\min\ J(W, \xi) = \frac{1}{2}\lVert W \rVert^{2} + C \sum_{i=1}^{l} \xi_i, \tag{12}$$

subject to

$$y_i \,(W^{T}\Phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, l. \tag{13}$$
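To make the roles of the slack variables and the constant C concrete, the criterion and its constraints can be evaluated for a fixed candidate classifier. The sketch below is a hedged illustration: it takes the map Φ to be the identity (a linear classifier), uses made-up two-dimensional samples, and does not solve the optimization problem; it only computes the smallest slack each sample needs and the resulting value of J. The function names are hypothetical.

```python
def decision_value(w, b, x):
    """f(x) = W^T x + b, i.e. Eq. (11) with the identity map in place of Phi."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def slack(w, b, x, y):
    """Smallest xi_i >= 0 that satisfies y_i * f(x_i) >= 1 - xi_i, as in Eq. (13)."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))


def objective(w, b, samples, C):
    """J(W, xi) = 0.5 * ||W||^2 + C * sum_i xi_i, as in Eq. (12)."""
    norm_sq = sum(wi * wi for wi in w)
    return 0.5 * norm_sq + C * sum(slack(w, b, x, y) for x, y in samples)


# Made-up samples: the second lies inside the margin, the fourth is misclassified.
samples = [([2.0, 0.0], +1), ([0.5, 0.0], +1), ([-1.0, 0.0], -1), ([0.5, 1.0], -1)]
J = objective([1.0, 0.0], 0.0, samples, C=1.0)
print(J)  # 2.5
```

A larger C penalizes the two non-zero slacks (0.5 and 1.5) more heavily, pushing the optimizer toward fewer margin violations at the cost of a larger norm of W.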

2.4.4. SVM kernel functions

The kernel function in an SVM plays the central role of implicitly mapping the input vector (through an inner product) onto a high-dimensional feature space. The data points are usually not linearly separable, in which case a linear function does not classify well. This is addressed by mapping the data through a kernel, together with the slack variables $\xi_i$ of Eq. (13), which relax the margin so that some points are allowed to lie on the wrong side of it. However, when choosing a kernel function, it is necessary to check whether it is associated with the inner product of some non-linear mapping. Mercer’s theorem states that such a mapping indeed underlies a kernel $K(\cdot,\cdot)$ provided that $K(\cdot,\cdot)$ is a positive integral operator [17]; that is, for every square integrable function $g(\cdot)$ defined on the domain of the kernel $K(\cdot,\cdot)$, the kernel satisfies the following condition,

$$\iint K(x, y)\, g(x)\, g(y)\, dx\, dy \ge 0. \tag{14}$$

Examples of kernels satisfying Mercer’s condition include polynomials and radial basis functions (RBFs). These are among the most commonly used kernels in SVM research. The polynomial kernel is defined as follows,

$$K(x, y) = (x^{T}y + 1)^{p}, \tag{15}$$

where $p > 0$ is a constant that is the order of the kernel.

There are several types of kernel learning methods, such as polynomial and RBF [18]. The convolution of the inner product of feature vectors allows the construction of decision functions that are non-linear in the input space. The decision function is defined as,

$$f(x) = \operatorname{sign}\Big( \sum_{\text{support vectors}} y_i\, \alpha_i\, K(x_i, x) - b \Big), \tag{16}$$

where $\alpha_i$ are the Lagrange multipliers, $x_i$ the support vectors, and $K(x_i, x)$ the convolution of the inner product for the feature space. These decision functions are equivalent to linear decision functions in the high-dimensional feature space $\psi_1(x), \psi_2(x), \ldots, \psi_N(x)$. Using different functions for the convolution of the inner product $K(x, x_i)$, one can construct learning machines with different types of non-linear decision surfaces in the input space.

2.4.5. Polynomial learning machine

To construct polynomial decision rules of degree $d$, one can use the following function for convolution of the inner product,

$$K(x, x_i) = [(x \cdot x_i) + 1]^{d}. \tag{17}$$

Then the decision function becomes,

$$f(x, \alpha) = \operatorname{sign}\Big( \sum_{\text{support vectors}} y_i\, \alpha_i\, [(x_i \cdot x) + 1]^{d} - b \Big), \tag{18}$$

which is a factorization of the d-dimensional polynomials in n-dimensional input space.

2.4.6. Radial basis function machines

The classical radial basis function machine uses the following set of decision rules,

$$f(x) = \operatorname{sign}\Big( \sum_{i=1}^{N} \alpha_i\, y_i\, K_{\gamma}(|x - x_i|) - b \Big), \tag{19}$$

where $N$ is the number of support vectors, $\gamma$ is the width parameter of the kernel function, and $K_{\gamma}(|x - x_i|)$ depends on the distance $|x - x_i|$ between two vectors.
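The kernel decision rules above can be tried out on a small example. The sketch below is illustrative only: the Gaussian form chosen for the radial basis kernel is an assumption (the text does not give its formula), the multipliers and bias for the XOR-like toy problem are the values commonly quoted for it rather than the output of a trained SVM, and the function names are hypothetical.

```python
import math


def polynomial_kernel(x, xi, d):
    """K(x, x_i) = [(x . x_i) + 1]^d, the polynomial kernel of Eq. (17)."""
    return (sum(a * b for a, b in zip(x, xi)) + 1.0) ** d


def rbf_kernel(x, xi, gamma):
    """Assumed Gaussian form: K(x, x_i) = exp(-gamma * |x - x_i|^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def decide(x, support_vectors, labels, alphas, b, kernel):
    """sign(sum_i y_i * alpha_i * K(x_i, x) - b), the common form of Eqs. (16), (18), (19)."""
    total = sum(y * a * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if total - b >= 0 else -1


# XOR-like toy problem: not linearly separable, but separable with a degree-2 polynomial.
xor_points = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
xor_labels = [-1, 1, 1, -1]
xor_alphas = [0.125] * 4  # illustrative multipliers, with b = 0


def poly2(u, v):
    return polynomial_kernel(u, v, 2)


predictions = [decide(p, xor_points, xor_labels, xor_alphas, 0.0, poly2)
               for p in xor_points]
print(predictions)  # [-1, 1, 1, -1]
```

Swapping `poly2` for a function built on `rbf_kernel` changes only the shape of the decision surface; the decision rule itself is unchanged.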


Table 1
Classification results from self-organizing maps

Total number of images            52
Number of normal images           6
Number of abnormal images         46
Number of images misclassified    3
Classification accuracy (%)       94

3. Results and discussion

3.1. Level of wavelet decomposition

We obtained wavelet coefficients of 52 brain MR images, each of size 256 × 256. Level-1 DAUB4 wavelet decomposition of a brain MR image produces 17,161 wavelet approximation coefficients, while level-2 and level-3 produce 4761 and 1444 coefficients, respectively. The third level of wavelet decomposition greatly reduces the input vector size but results in a lower classification percentage. With the first level decomposition, the vector size (17,161) is too large to be given as an input to a classifier. By analysing the wavelet coefficients through simulation in Matlab 7.1, we came to the conclusion that level-2 features are the most suitable for the neural network self-organizing map and the support vector machine, whereas level-1 and level-3 features result in lower classification accuracy. The second level of wavelet decomposition not only gives virtually perfect results in the testing phase, but also has a reasonably manageable number of features (4761) that can be handled without much difficulty by the classifier.
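The coefficient counts quoted above can be reproduced with simple arithmetic. The sketch below rests on two assumptions that are inferred from the reported numbers rather than stated in the text: that each decomposition level maps a side of length n to floor((n + L - 1)/2), the output length of a one-level DWT with boundary extension for a filter of L taps, and that the filter has L = 8 taps. The function names are hypothetical.

```python
def approx_side(n, filter_len):
    """Side length of the approximation after one level (assumed boundary-extension rule)."""
    return (n + filter_len - 1) // 2


def approximation_counts(side, filter_len, levels):
    """Number of approximation coefficients of a square image at levels 1..levels."""
    counts = []
    for _ in range(levels):
        side = approx_side(side, filter_len)
        counts.append(side * side)
    return counts


counts = approximation_counts(256, filter_len=8, levels=3)
print(counts)  # [17161, 4761, 1444]
```

The sides are 131, 69 and 38, slightly more than the 128, 64 and 32 expected from pure halving, because of the boundary extension.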

3.2. Classification of brain images

The classification of brain MR images has been done by two

approaches. The wavelet features are given as input to a self-

organizing map as well as to a support vector machine. We

observed that the classification rate is higher for a support

vector machine classifier compared to the self-organizing map-

based approach.

3.2.1. Classification from self-organizing maps

Feature extraction using wavelet decomposition and classification through a self-organizing map-based neural network are employed. Wavelet coefficients of the MR images were obtained using the wavelet toolbox of Matlab 7.1. A program for the self-organizing neural network was written using Matlab 7.1.

Those images that are marked as abnormal in the first stage are not considered in the second stage, so expensive calculations for wavelet decomposition of these images are avoided. The second level DAUB4 wavelet approximation coefficients are obtained and given as input to the self-organizing neural network classifier. The results of classification are given in Table 1. The number of MR brain images in the input dataset is 52, of which 6 are of normal brains and 46 are of abnormal brains. Experimentation was done with varying levels of wavelet decomposition. The final categories obtained after classification by a self-organizing map depend on the order in which the input vectors are presented to the network. Hence we have randomized the order of presentation of the input images. Our experiment was repeated, each time with a different order of input presentation, and we obtained the same classification percentage and normal–abnormal categories in all the experiments.

3.2.2. Classification from support vector machine

We have implemented SVM in Matlab 7.1 with the inputs being the wavelet-coded images, using the OSU SVM Matlab toolbox [19], for our classification. This is a 2D classification technique. In this paper, we treat the classification of MR brain images as a two-class pattern classification problem. To every wavelet-coded MR image, we apply a classifier to determine whether it is normal or abnormal. As mentioned previously, the use of SVM involves training and testing the SVM with a particular kernel function, which in turn has specific kernel parameters. Training an SVM is the most crucial part of the machine learning process, as the ‘thinking’ procedure of the SVM depends on its past ‘experience’ in the form of the training set. The standard training and testing sets are created. We have used the second level approximation wavelet coefficients of four normal images and six abnormal images (randomly chosen) for training the SVM, and the rest of the images are used for testing. The RBF and polynomial functions have been used for non-linear training and testing with degrees 2, 3 and 4. The linear kernel was also used for SVM training and testing, but it shows a lower classification rate than the polynomial and RBF kernels. The SVM was later tested using 52 images with different combinations in testing, which were previously ‘unknown’ to the SVM. The classification results with linear, polynomial and RBF kernels are shown in Table 2. According to Table 2, the classification accuracy with the RBF and polynomial kernels (98%) is higher than with the linear kernel (96.15%).

Table 2
Classification results from support vector machine

Kernel used             Total number   Images in training    Images in testing     Images          Classification
                        of images      Normal   Abnormal     Normal   Abnormal     misclassified   accuracy (%)
Linear                  52             4        6            6        46           2               96.15
Polynomial              52             4        6            6        46           1               98
Radial basis function   52             4        6            6        46           1               98
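The accuracies reported in Tables 1 and 2 follow directly from the misclassification counts. As a quick check, assuming that accuracy is computed over all 52 images as the tables suggest, the figures can be recomputed as follows (the function name is hypothetical):

```python
def accuracy_percent(total, misclassified):
    """Percentage of correctly classified images."""
    return 100.0 * (total - misclassified) / total


# Misclassification counts taken from Table 1 (SOM) and Table 2 (SVM).
reported_errors = {
    "SOM": 3,
    "SVM, linear kernel": 2,
    "SVM, polynomial kernel": 1,
    "SVM, RBF kernel": 1,
}
accuracies = {name: round(accuracy_percent(52, errors), 2)
              for name, errors in reported_errors.items()}
print(accuracies["SOM"], accuracies["SVM, linear kernel"], accuracies["SVM, RBF kernel"])
# 94.23 96.15 98.08
```

The values 94.23% and 98.08% are the unrounded counterparts of the 94% and 98% quoted in the tables.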


4. Conclusions

With the help of wavelet and machine learning approach, we

can classify whether a brain image is normal or abnormal. A

novel approach for classification of MR brain images using

wavelet as an input to self-organizing maps and support vector

machine has been proposed and implemented in this paper.

Classification percentage of more than 94% in case of self-

organizing maps and 98% in case of support vector machine

demonstrates the utility of the proposed method. In this paper,

we have applied this method only to axial T2-weighted images

at a particular depth inside the brain. The same method can be

employed for T1-weighted, T2-weighted, proton density and

other types of MR images. With the help of the above approaches,

one can develop software for a diagnostic system for the

detection of brain disorders such as Alzheimer’s, Huntington’s

and Parkinson’s diseases.

Acknowledgements

We would like to thank the Department of Biotechnology,

Government of India, for providing financial support to

complete the work reported in this paper.

Dr. R. Anjan Bharati, ex-Consultant Radiologist from the

M.S. Ramaiah Memorial Hospital, Bangalore, is acknowledged

for providing several clarifications on the medical aspects of the

work.

References

[1] C.H. Moritz, V.M. Haughton, D. Cordes, M. Quigley, M.E. Meyerand,

Whole-brain functional MR imaging activation from finger tapping task

examined with independent component analysis, Am. J. Neuroradiol. 21

(2000) 1629–1635.

[2] R.N. Bracewell, The Fourier Transform and its Applications, third ed.,

McGraw-Hill, New York, 1999.

[3] S.G. Mallat, A theory of multiresolution signal decomposition: the

wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (7)

(1989) 674–693.

[4] P.D. Wasserman, Neural Computing, Van Nostrand Reinhold, New York,

1989.

[5] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed.,

PHI, 1994.

[6] R.P. Lippmann, An introduction to computing with neural nets, IEEE

Acoustics Speech Signal Processing Mag. 4 (2) (1987) 4–22.

[7] T. Kohonen, The self-organizing map, IEEE Proc. 78 (1990) 1464–

1477.

[8] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.

[9] Harvard Medical School, Web: data available at http://med.harvard.edu/

AANLIB/.

[10] R.C. Gonzalez, R.E. Woods, Digital Image Processing, second ed.,

Pearson Education, Ch. Wavelet and Multiresolution Processing, 2004 ,

pp. 349–408.

[11] J. Koenderink, The structure of images, Biol. Cybern. 50 (5) (1984) 363–

370.

[12] A. Cohen, Biorthogonal basis of compactly supported vectors, Commun.

Pure Appl. Math. 21 (1992) 485–560.

[13] I. Daubechies, Ten lectures on wavelets, in: CBMS Conference Lecture

Notes 61, SIAM, Philadelphia, 1992.

[14] T. Kohonen, Self-organizing Maps, second ed., Springer Series in

Information Sciences, 1997.

[15] J. Vesanto, Data mining techniques based on the self-organizing map,

Master’s thesis, Helsinki University of Technology, 1997.

[16] C. Burges, Tutorial on support vector machine for pattern recognition,

Data Mining Knowl. Discov. 2 (1998) 955–974.

[17] B. Scholkopf, Advances in Kernel Methods: Support Vector Learning,

MIT Press, Cambridge, MA, 1999.

[18] C.A. Micchelli, Interpolation of scattered data: distance matrices and

conditionally positive definite functions, Constr. Approximation 2 (1986)

11–22.

[19] C.C. Chang, C.C. Lin, LIBSVM: a library of support vector machines,

Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.