Classification of magnetic resonance brain images using wavelets
as input to support vector machine and neural network
Sandeep Chaplot a, L.M. Patnaik a,*, N.R. Jagannathan b
a Computational Neurobiology Group, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560012, India
b Department of Nuclear Magnetic Resonance, All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
Received 17 February 2006; received in revised form 8 May 2006; accepted 11 May 2006
Abstract
In this paper, we propose a novel method using wavelets as input to neural network self-organizing maps and support vector machine for
classification of magnetic resonance (MR) images of the human brain. The proposed method classifies MR brain images as either normal or
abnormal. We have tested the proposed approach using a dataset of 52 MR brain images. A classification accuracy of more than 94% was
achieved using the neural network self-organizing maps (SOM) and 98% using the support vector machine. We observed that the classification rate is
higher for the support vector machine classifier than for the self-organizing map-based approach.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Magnetic resonance imaging (MRI); Discrete wavelet transform (DWT); Artificial neural network (ANN); Self-organizing maps (SOM); Support
vector machine (SVM)
www.elsevier.com/locate/bspc
Biomedical Signal Processing and Control 1 (2006) 86–92
1. Introduction
Magnetic resonance (MR) imaging was introduced into
clinical medicine and has ever since assumed an unparalleled
role of importance in brain imaging. Magnetic resonance
imaging is an advanced medical imaging technique that has
proven to be an effective tool in the study of the human brain.
The rich information that MR images provide about the soft
tissue anatomy has dramatically improved the quality of brain
pathology diagnosis and treatment. However, the amount of
data is far too much for manual interpretation and hence there is
a great need for automated image analysis tools. The most
important advantage of MR imaging is that it is a non-invasive
technique. The level of detail that can be seen is extraordinary
compared with other imaging modalities. MR imaging provides
two-dimensional and three-dimensional images of organs and
structures inside the body.
* Correspondence to: Microprocessor Applications Laboratory, CEDT Build-
ing, First Floor, Room No. 239, Indian Institute of Science, Bangalore,
Karnataka 560012, India. Tel.: +91 80 23600451; fax: +91 80 2360683.
E-mail address: [email protected] (L.M. Patnaik).
1746-8094/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.bspc.2006.05.002
Extracting essential features from the MR brain images is
imperative for proper analysis of these images. Independent
component analysis (ICA) [1], Fourier transform [2] and
wavelet transform [3] are a few of the methods used to extract
features from images. Although a relatively recent construct,
wavelets have become a tool of choice for engineers, physicists,
and mathematicians for time–space–frequency analysis. Unlike
Fourier transform, which provides only frequency analysis of
signals, wavelet transforms provide time–frequency analysis,
which is particularly useful for pattern recognition.
We used machine learning algorithms to obtain the
classification of images under two categories, either normal
or abnormal.
Artificial neural networks (ANNs) [4,5] are biologically
inspired. They are composed of many non-linear computational
elements operating in parallel and are arranged in patterns similar
to biological neural nets. ANNs modify their behavior in
response to their environment, learn from experience, and
generalize from previous examples to new ones. ANNs have
become the preferred technique for a large class of pattern
recognition tasks. Lippmann [6] has given an excellent review on
ANNs. A self-organizing map (SOM) is an unsupervised
algorithm that has advantages over other networks: it can form
similarity diagrams automatically and can produce abstractions.
The self-organizing map is generalizable to non-vectorial data, such as
symbolic data, whereas the other networks are not [7].
The support vector machine (SVM) is a machine learning
technique that originated from statistical learning theory [8] and is
used for the classification of images. Its primary advantage is
that it is able to model highly non-linear systems and the special
properties of the decision surface ensure very good general-
ization. The SVM has been widely used in pattern recognition
applications due to its computational efficiency and good
generalization performance.
Thus, ANN and SVM are very attractive in recognition and
classification tasks. The results of both the approaches are
compared.
The rest of the paper is organised as follows. A short
description on the input dataset of images and methods for
feature extraction as well as for classification are presented in
Section 2. Section 3 contains results and discussion while
conclusions are presented in Section 4.
2. Materials and methods
2.1. Input dataset
The input dataset consists of axial, T2-weighted,
256 × 256 pixel MR brain images (Fig. 1). These images were
downloaded from the Harvard Medical School website
(http://med.harvard.edu/AANLIB/) [9]. Only those sections of the brain
in which lateral ventricles are clearly seen are considered in our
study. The number of MR brain images in the input dataset is 52
of which 6 are of normal brain and 46 are of abnormal brain. The
abnormal brain image set consists of images of brain affected by
Alzheimer’s disease. The remarkable feature of a normal human
brain is the symmetry that it exhibits in the axial and coronal
images. Asymmetry in an axial MR brain image strongly
indicates abnormality. Hence symmetry in axial MR images is an
important feature that needs to be considered in deciding whether
the MR image at hand is of a normal or an abnormal brain. A
normal and an abnormal T2-weighted MR brain image are shown
in Fig. 2a and b, respectively.
Fig. 1. An axial, T2-weighted MR brain image (http://med.harvard.edu/AANLIB/).
The lack of symmetry in an abnormal brain MR image is
clearly seen in Fig. 2b.
Asymmetry beyond a certain degree is a sure indication of
a diseased brain, and this has been exploited in our work for an
initial classification at a gross level. For further detailed
classification, we extract finer features using wavelets and use
these features for self-organizing neural network and for
support vector machine classifications.
Fig. 2. Axial T2-weighted MR brain images: (a) normal brain; (b) abnormal brain (http://med.harvard.edu/AANLIB/).
2.2. Wavelet-based feature extraction
Wavelets are mathematical functions that decompose data
into different frequency components and then study each
component with a resolution matched to its scale. Wavelets
have emerged as powerful new mathematical tools for analysis
of complex datasets. The Fourier transform provides a
representation of an image based only on its frequency content.
Hence this representation is not spatially localized, while
wavelet functions are localized in space. The Fourier transform
decomposes a signal into a spectrum of frequencies, whereas
wavelet analysis decomposes a signal into a hierarchy of scales
ranging from the coarsest scale. Hence the wavelet transform [3],
which provides a representation of an image at various
resolutions, is a better tool for feature extraction from images.
2.2.1. Discrete wavelet transform (DWT)
The DWT is an implementation of the wavelet transform
using a discrete set of wavelet scales and translations
obeying some defined rules. For practical computations, it is
necessary to discretize the wavelet transform. The scale
parameter ($s$) is discretized on a logarithmic grid. The
translation parameter ($\tau$) is then discretized with respect to
the scale parameter, i.e. sampling is done on the dyadic (as the
base of the logarithm is usually chosen as two) sampling grid.
The discretized scale and translation parameters are given by
$s = 2^{-m}$ and $\tau = n2^{-m}$, where $m, n \in \mathbb{Z}$, the set of all integers.
Thus, the family of wavelet functions is represented in
Eq. (1),
$$\psi_{m,n}(t) = 2^{m/2}\,\psi(2^{m}t - n). \tag{1}$$
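As a brief check of Eq. (1), the dyadic grid can be substituted into the usual continuous wavelet family. The normalization $\psi_{s,\tau}(t) = s^{-1/2}\,\psi((t - \tau)/s)$ is not stated in the text and is assumed here as the standard form:

$$\psi_{s,\tau}(t)\Big|_{s = 2^{-m},\ \tau = n2^{-m}} = (2^{-m})^{-1/2}\,\psi\!\left(\frac{t - n2^{-m}}{2^{-m}}\right) = 2^{m/2}\,\psi(2^{m}t - n).$$

Under that assumption, each increment of $m$ halves the scale, and $n$ shifts the wavelet in steps of one scale unit.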
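Fig. 3. Wavelet transform of an image $Y$ up to level two. $Y_a^1$ and $Y_a^2$ are first and second level approximation components, respectively. $Y_h^1$, $Y_v^1$ and $Y_d^1$ are first level detail components. $Y_h^2$, $Y_v^2$ and $Y_d^2$ are second level detail components.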
The wavelet transform decomposes a signal $x(t)$ into a
family of synthesis wavelets as given below in Eqs. (2) and (3),
$$x(t) = \sum_{m} \sum_{n} c_{m,n}\,\psi_{m,n}(t), \tag{2}$$
where
$$c_{m,n} = \langle x(t), \psi_{m,n}(t) \rangle. \tag{3}$$
For a discrete-time signal $x[n]$, the wavelet decomposition
on $I$ octaves is given by
$$x[n] = \sum_{i=1}^{I} \sum_{k \in \mathbb{Z}} c_{i,k}\, g_i[n - 2^{i}k] + \sum_{k \in \mathbb{Z}} d_{I,k}\, h_I[n - 2^{I}k], \tag{4}$$
where $c_{i,k}$, $i = 1, \ldots, I$, are the wavelet coefficients and $d_{I,k}$ are the
scaling coefficients.
The wavelet and the scaling coefficients are given by
$$c_{i,k} = \sum_{n} x[n]\, g_i^{*}[n - 2^{i}k], \tag{5}$$
$$d_{I,k} = \sum_{n} x[n]\, h_I^{*}[n - 2^{I}k], \tag{6}$$
where $g_i[n - 2^{i}k]$ and $h_I[n - 2^{I}k]$ represent the discrete wavelets
and scaling sequences, respectively, and $(*)$ indicates
complex conjugate.
2.2.2. DWT in two dimensions
In the case of images, the DWT is applied to each dimension
separately. This results in an image $Y$ being decomposed into a
first level approximation component $Y_a^1$, and detailed
components $Y_h^1$, $Y_v^1$ and $Y_d^1$, corresponding to horizontal,
vertical and diagonal details [10]. Fig. 3 depicts the process of
an image being decomposed into approximate and detailed
components.
The approximation component ($Y_a$) contains low frequency
components of the image while the detailed components ($Y_h$, $Y_v$
and $Y_d$) contain high frequency components. Thus,
$$Y = Y_a^1 + \{Y_h^1 + Y_v^1 + Y_d^1\}. \tag{7}$$
If DWT is applied to $Y_a^1$, the second level approximation and
detailed components are obtained. Higher-level decomposition
is performed similarly. If the process is repeated up to $N$ levels,
the image $Y$ can be written in terms of the $N$th approximation
component ($Y_a^N$) and all detailed components as given below in
Eq. (8),
$$Y = Y_a^N + \sum_{i=1}^{N} \{Y_h^i + Y_v^i + Y_d^i\}. \tag{8}$$
At each decomposition level, the length of the decomposed
signals is half the length of the signal in the previous stage.
Hence the size of the approximation component obtained from
the first level decomposition of an $N \times N$ image is $N/2 \times N/2$,
second level is $N/4 \times N/4$ and so on. As the level of
decomposition is increased, a more compact but coarser approximation
of the image is obtained. Thus, wavelets provide a simple
hierarchical framework for interpreting the image information
[11].
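The halving of the approximation component at each level can be illustrated with a minimal sketch. It uses the Haar wavelet purely for brevity, not the DAUB4 wavelet chosen in this paper, and the function names `haar_dwt2` and `approximation_features` are hypothetical. The labelling of the horizontal and vertical subbands follows one common convention and is an assumption.

```python
import numpy as np

def haar_dwt2(image):
    """One level of a separable 2D Haar DWT with orthonormal filters.

    Returns (approx, horizontal, vertical, diagonal); each subband has half
    the input size along both axes. Assumes even height and width.
    """
    img = np.asarray(image, dtype=float)
    # Low-pass / high-pass along each row, then downsample by two.
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    # Repeat along each column to obtain the four subbands.
    approx = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2.0)
    horizontal = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2.0)
    vertical = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2.0)
    diagonal = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2.0)
    return approx, horizontal, vertical, diagonal

def approximation_features(image, levels):
    """Apply the DWT repeatedly to the approximation and flatten the result."""
    approx = np.asarray(image, dtype=float)
    for _ in range(levels):
        approx = haar_dwt2(approx)[0]
    return approx.ravel()

# Synthetic stand-in for a 256 x 256 MR image (random values, not real data).
rng = np.random.default_rng(0)
image = rng.random((256, 256))
features = approximation_features(image, levels=2)
print(features.shape)  # (4096,)
```

With the two-tap Haar filter, the level-2 approximation of a 256 × 256 image is exactly 64 × 64. Longer filters such as DAUB4 add border coefficients, so the counts they produce are somewhat larger.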
2.2.3. Choice of mother wavelet
The basis function of a wavelet transform, which is compressed or
localized, is called the mother wavelet of the transform. In the
case of MR images of the brain, the pixel intensity values vary
smoothly, which cannot be very efficiently represented by a
Haar wavelet [12]. The Daubechies-4 (DAUB4) [13] wavelet,
though expensive to compute, has the advantage of better
resolution for smoothly changing signals compared to the Haar
wavelet. Hence we have chosen the Daubechies-4 wavelet, which
renders excellent classification accuracy. In this paper, we have
extracted the DAUB4 wavelet approximation coefficients of the
MR brain images and used them as the feature vector for
classification.
2.3. Self-organizing maps (SOM)
Kohonen [7] proposed an unsupervised method
called the self-organizing map. SOM [14] is a powerful neural
network method for the analysis and visualization of
high-dimensional data. It maps non-linear statistical relationships
between high-dimensional measurement data into simple
geometric relationships, usually on a two-dimensional grid.
Because SOM is an unsupervised method, we just give it a series of input
patterns, and it learns for itself how to group these together so
that similar patterns produce similar outputs.
The first part of a SOM is the data. The idea of
self-organizing maps is to project the n-dimensional data into
something that can be better understood visually. The second
component of a SOM is the weight vectors.
The input data points are mapped onto SOM units on a
two-dimensional grid. The mapping is learnt from the training data
samples by a simple stochastic learning process, where the
SOM units are adjusted by small steps with respect to feature
vectors that are extracted from the data and are presented one
after another (in random order).
The way SOMs go about organising themselves is by
competing for representation of the samples. Neurons are also
allowed to change themselves by learning to become more like
the samples, in the hope of winning the next competition. It is this
selection and learning process that makes the weights organize
themselves into a map representing similarities.
The first step in constructing a SOM is to initialize the
weight vectors. From that point, we select a sample vector
randomly and search the map of weight vectors to find which
weight vector best represents that sample. Since each weight
vector has a location, it also has neighbouring weights that are
close to it. The steps of the algorithm are as follows:
Step 1. Initializing the weights: initialize each weight vector
$w_i = [w_{i1}, w_{i2}, \ldots, w_{ij}]^{T} \in \mathbb{R}^{j}$.
The weights are initialized to random numbers between 0 and 0.5.
Step 2. Get the best matching unit: present an input pattern
$x = [x_1, x_2, \ldots, x_j]^{T} \in \mathbb{R}^{j}$ to the SOM. Calculate the distance from
each weight $w_i$ to the chosen sample vector $x$ and identify the
winning weight or the best matching unit $c$ such that
$\|x - w_c\| = \min_i \{\|x - w_i\|\}$. The weight with the shortest distance is
the winner; if more than one weight has the same
distance, the winning weight is chosen randomly among
the weights with the shortest distance.
Step 3. Scale neighbors: adjust the weights of the best matching
unit $c$ and all neighbor units,
$w_i(t + 1) = w_i(t) + h_{ci}(t)[x(t) - w_i(t)]$,
where $i$ is the index of the neighbor weight and $t$ is an integer,
the discrete-time coordinate. The neighborhood kernel $h_{ci}(t)$ is
a function of time and of the distance between neighbor weight $i$
and the winning weight $c$; $h_{ci}(t)$ defines the region of influence
that the input pattern has on the SOM and consists of two parts
[15], the neighborhood function $h(\|\cdot\|, t)$ and the learning rate
function $\alpha(t)$. The kernel is given by $h_{ci}(t) = h(\|r_c - r_i\|, t)\,\alpha(t)$,
where $r$ is the location of the weight on the 2D map grid. The
learning rate function $\alpha(t)$ is a decreasing function of time.
Step 4. The process is repeated for a given number of iterations
by changing the learning parameters (initial weight range and
initial learning rate) and the radius of the neighborhood neurons.
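Steps 1 to 3 above can be sketched as follows. This is a minimal illustration rather than the program used in this work: the Gaussian neighborhood function, the linear decay of the learning rate and radius, the grid size, and all names (`train_som`, `best_matching_unit`, `alpha0`, `sigma0`) are assumptions made for the example.

```python
import numpy as np

def train_som(data, grid_shape=(4, 4), n_iter=500, alpha0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on the rows of `data`; returns (weights, locations)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    dim = data.shape[1]
    # Step 1: random initial weights between 0 and 0.5.
    weights = rng.uniform(0.0, 0.5, size=(rows * cols, dim))
    # Location r_i of every unit on the 2D map grid.
    locations = np.array(
        [(r, c) for r in range(rows) for c in range(cols)], dtype=float
    )
    for t in range(n_iter):
        # Present one randomly chosen input pattern x.
        x = data[rng.integers(len(data))]
        # Step 2: best matching unit c minimizes ||x - w_i||; ties broken at random.
        distances = np.linalg.norm(weights - x, axis=1)
        winners = np.flatnonzero(distances == distances.min())
        c = rng.choice(winners)
        # Step 3: h_ci(t) = h(||r_c - r_i||, t) * alpha(t), with an assumed
        # Gaussian h and linearly decreasing alpha(t) and radius sigma(t).
        frac = t / n_iter
        alpha = alpha0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 1e-3
        grid_dist = np.linalg.norm(locations - locations[c], axis=1)
        h = alpha * np.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))
        weights += h[:, None] * (x - weights)
    return weights, locations

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

# Toy data: two well-separated synthetic clusters (not MR features).
rng = np.random.default_rng(1)
cluster_a = 0.1 + 0.02 * rng.standard_normal((20, 3))
cluster_b = 0.9 + 0.02 * rng.standard_normal((20, 3))
data = np.vstack([cluster_a, cluster_b])
weights, locations = train_som(data)
print(best_matching_unit(weights, cluster_a.mean(axis=0)),
      best_matching_unit(weights, cluster_b.mean(axis=0)))
```

Step 4 would correspond to calling `train_som` again with different values of `alpha0`, `sigma0` and the initial weight range.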
2.4. Support vector machine (SVM)
SVM is a binary classification method that takes as input
labeled data from two classes and outputs a model file for
classifying new unlabeled/labeled data into one of two
classes. The SVM originated from the idea of structural
risk minimization that was developed by Vapnik. Support
vector machines are primarily two-class classifiers that have
been shown to be an attractive and more systematic approach to learning
linear or non-linear class boundaries. The use of SVM, like
any other machine learning technique, involves two basic
steps, namely training and testing. Training an SVM involves
feeding known data to the SVM along with previously known
decision values, thus forming a finite training set. It is from the
training set that an SVM gets its intelligence to classify
unknown data.
2.4.1. Review of SVM learning for classification
Let vector $x \in \mathbb{R}^{n}$ denote a pattern to be classified, and let
scalar $y$ denote its class label, that is, $y = \pm 1$. Let $\{(x_i, y_i),\ i = 1, 2, \ldots, l\}$
denote a given set of $l$ training samples. The problem is
how to construct a classifier (i.e. a decision function $f(x)$) that
can correctly classify an input pattern $x$ that is not necessarily
from the training set.
2.4.2. Linear SVM classifier
This is the simplest case in which the input patterns are
linearly separable. There exists a linear function of the form,
$$f(x) = W^{T}x + b, \tag{9}$$
such that for each training example $x_i$, the function yields
$f(x_i) \ge 0$ for $y_i = +1$ and $f(x_i) < 0$ for $y_i = -1$. Hence, training
samples from the two different classes are separated by the
hyperplane,
$$f(x) = W^{T}x + b = 0. \tag{10}$$
For a given set, there exist many hyperplanes that separate
the two classes but the SVM classifier is based on the
hyperplane that maximizes the separating margin between the
two classes [16].
2.4.3. Non-linear SVM classifier
The linear SVM classifier can be readily extended to a
non-linear classifier by first using a non-linear operator $\Phi(\cdot)$ to map
the input pattern $x$ into a higher-dimensional space. The
non-linear classifier so obtained is defined as in Eq. (11),
$$f(x) = W^{T}\Phi(x) + b, \tag{11}$$
which is linear in terms of the transformed data $\Phi(x)$ but
non-linear in terms of the original data $x \in \mathbb{R}^{n}$. Following non-linear
transformation, the parameters of the decision function $f(x)$ are
determined by the following minimization criteria,
$$\min\ J(W, \xi) = \frac{1}{2}\|W\|^{2} + C \sum_{i=1}^{l} \xi_i, \tag{12}$$
subject to
$$y_i(W^{T}\Phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, l. \tag{13}$$
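To make Eqs. (12) and (13) concrete, the following sketch evaluates the objective for a fixed, hand-picked classifier. It reads $\xi_i$ as slack variables and $C$ as a penalty constant, which is the usual interpretation but is not spelled out in the text. It also takes $\Phi$ as the identity map, and the data, `W`, `b` and `C` are illustrative values, not quantities from this study.

```python
import numpy as np

def soft_margin_objective(W, b, X, y, C):
    """Evaluate J(W, xi) of Eq. (12) using the smallest slacks allowed by Eq. (13).

    Assumes Phi(x) = x, so the decision function is f(x) = W^T x + b.
    """
    margins = y * (X @ W + b)
    # Smallest xi_i satisfying y_i f(x_i) >= 1 - xi_i and xi_i >= 0.
    xi = np.maximum(0.0, 1.0 - margins)
    objective = 0.5 * float(W @ W) + C * float(xi.sum())
    return objective, xi

# Three hypothetical 2D training samples with labels +1, +1, -1.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])
W = np.array([1.0, 0.0])

objective, xi = soft_margin_objective(W, b=0.0, X=X, y=y, C=10.0)
print(objective)  # 5.5
print(xi)         # only the second sample needs slack (0.5)
```

Only the second sample lies inside the margin, so it alone contributes to the penalty term; a larger `C` makes such violations more costly relative to the margin width.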
2.4.4. SVM kernel functions
The kernel function in an SVM plays the central role of
implicitly mapping the input vector (through an inner product)
onto a high-dimensional feature space. It is usual that the data
points are not linearly separable, in which case a linear function
does not classify well. This is solved by the introduction of a
kernel, as a relaxation of the margin so that some points are
allowed to invade the opposite margin. However, when
choosing a kernel function, it is necessary to check whether it is
associated with the inner product of some non-linear mapping.
Mercer’s theorem states that such a mapping indeed underlies a
kernel $K(\cdot,\cdot)$ provided that $K(\cdot,\cdot)$ is a positive integral operator
[17]; that is, for every square integrable function $g(\cdot)$ defined on
the kernel $K(\cdot,\cdot)$, the kernel satisfies the following condition,
$$\iint K(x, y)\, g(x)\, g(y)\, dx\, dy \ge 0. \tag{14}$$
Examples of kernels satisfying Mercer’s condition include
polynomials and radial basis functions (RBFs). These are
among the most commonly used kernels in SVM research. The
polynomial kernel is defined as follows,
$$K(x, y) = (x^{T}y + 1)^{p}, \tag{15}$$
where $p > 0$ is a constant that is the order of the kernel.
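A finite-sample analogue of the condition in Eq. (14) is that the matrix of kernel values on any set of points has no negative eigenvalues. The sketch below checks this numerically for the polynomial kernel of Eq. (15) on random synthetic points. The function name, the choice $p = 3$ and the tolerance are assumptions of the example, and the check is an illustration for integer $p$ rather than a proof.

```python
import numpy as np

def polynomial_kernel(X, Y, p=2):
    """Polynomial kernel of Eq. (15), evaluated for all pairs of rows of X and Y."""
    return (X @ Y.T + 1.0) ** p

# Twenty random 5-dimensional points (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
K = polynomial_kernel(X, X, p=3)

# Discrete counterpart of Eq. (14): g^T K g >= 0 for every vector g, which
# holds exactly when the symmetric matrix K has no negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(K)
tolerance = 1e-8 * abs(eigenvalues.max())
print(eigenvalues.min() >= -tolerance)
```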
There are several types of kernel learning methods such as
polynomial and RBF [18]. The convolution of the inner product
of feature vectors allows the construction of decision functions
that are non-linear in the input space. The decision function is
defined as,
$$f(x) = \operatorname{sign}\left(\sum_{\text{support vectors}} y_i \alpha_i K(x_i, x) - b\right), \tag{16}$$
where $\alpha_i$ are the Lagrange multipliers, $x_i$ the support vectors, and
$K(x_i, x)$ the convolution of the inner product for the feature
space. These decision functions are equivalent to linear decision
functions in the high-dimensional feature space $\psi_1(x)$, $\psi_2(x)$,
$\ldots$, $\psi_N(x)$. Using different functions for the convolution of the
inner product $K(x, x_i)$, one can construct learning machines with
different types of non-linear decision surfaces in the input
space.
2.4.5. Polynomial learning machine
To construct polynomial decision rules of degree $d$, one
can use the following function for convolution of the inner
product,
$$K(x, x_i) = [(x \cdot x_i) + 1]^{d}. \tag{17}$$
Then the decision function becomes,
$$f(x, \alpha) = \operatorname{sign}\left(\sum_{\text{support vectors}} y_i \alpha_i [(x_i \cdot x) + 1]^{d} - b\right), \tag{18}$$
which is a factorization of the $d$-dimensional polynomials in
$n$-dimensional input space.
2.4.6. Radial basis function machines
The classical radial basis function machine uses the following
set of decision rules,
$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K_\gamma(|x - x_i|) - b\right), \tag{19}$$
where $N$ is the number of support vectors, $\gamma$ is the width parameter
of the kernel function, and $K_\gamma(|x - x_i|)$ depends on the distance
$|x - x_i|$ between two vectors.
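The decision rule of Eq. (19) can be sketched directly. The Gaussian form $K_\gamma(|x - x_i|) = \exp(-\gamma |x - x_i|^2)$ is an assumption, since the text does not give the kernel explicitly. The support vectors, multipliers, bias and `gamma` below are hand-picked illustrative values, not the result of training.

```python
import numpy as np

def rbf_kernel(x, xi, gamma):
    """Assumed Gaussian form of K_gamma(|x - x_i|) = exp(-gamma * |x - x_i|^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.exp(-gamma * np.sum(diff ** 2)))

def decide(x, support_vectors, labels, alphas, b, gamma):
    """Evaluate Eq. (19): sign(sum_i alpha_i * y_i * K_gamma(|x - x_i|) - b)."""
    total = sum(
        a * y * rbf_kernel(x, sv, gamma)
        for a, y, sv in zip(alphas, labels, support_vectors)
    )
    return int(np.sign(total - b))

# Hypothetical model: one support vector per class.
support_vectors = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
labels = [1, -1]
alphas = [1.0, 1.0]

print(decide([0.1, 0.2], support_vectors, labels, alphas, b=0.0, gamma=0.5))  # 1
print(decide([1.9, 2.1], support_vectors, labels, alphas, b=0.0, gamma=0.5))  # -1
```

Each test point takes the label of the nearer support vector, because the kernel value decays with distance; a larger `gamma` makes that decay sharper.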
Table 1
Classification results from self-organizing maps
Total number of images 52
Number of normal images 6
Number of abnormal images 46
Number of images misclassified 3
Classification accuracy (%) 94
3. Results and discussion
3.1. Level of wavelet decomposition
We obtained wavelet coefficients of 52 brain MR images,
each of size 256 × 256. Level-1 DAUB4 wavelet
decomposition of a brain MR image produces 17,161 wavelet
approximation coefficients, while level-2 and level-3 produce
4761 and 1444 coefficients, respectively. The third level of
wavelet decomposition greatly reduces the input vector size but
results in a lower classification percentage. With the first level
decomposition, the vector size (17,161) is too large to be given
as an input to a classifier. By analysing the wavelet
coefficients through simulation in Matlab 7.1, we came to the
conclusion that level-2 features are the most suitable for the neural
network self-organizing map and the support vector machine,
whereas level-1 and level-3 features result in lower
classification accuracy. The second level of wavelet
decomposition not only gives virtually perfect results in the testing
phase, but also has a reasonably manageable number of features
(4761) that can be handled without much hassle by the
classifier.
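The coefficient counts quoted above can be reproduced with simple arithmetic under two assumptions that are not stated in the text: that the DAUB4 filter has 8 taps, and that the border handling yields floor((n + L − 1)/2) approximation coefficients per axis for a signal of length n and a filter of length L. The function names are hypothetical.

```python
def approx_side_length(n, filter_length):
    """Approximation length along one axis after one DWT level (assumed border rule)."""
    return (n + filter_length - 1) // 2

def approx_coefficient_counts(n, filter_length, levels):
    """Number of approximation coefficients of an n x n image at each level."""
    counts = []
    for _ in range(levels):
        n = approx_side_length(n, filter_length)
        counts.append(n * n)
    return counts

print(approx_coefficient_counts(256, 8, 3))  # [17161, 4761, 1444]
```

Under these assumptions the approximation side lengths are 131, 69 and 38, which matches the 17,161, 4761 and 1444 coefficients reported for levels 1 to 3.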
3.2. Classification of brain images
The classification of brain MR images has been done by two
approaches. The wavelet features are given as input to a self-
organizing map as well as to a support vector machine. We
observed that the classification rate is higher for a support
vector machine classifier compared to the self-organizing map-
based approach.
3.2.1. Classification from self-organizing maps
Feature extraction using wavelet decomposition and
classification through a self-organizing map-based neural network
are employed. Wavelet coefficients of MR images were
obtained using the wavelet toolbox of Matlab 7.1. A program
for the self-organizing neural network was written using Matlab
7.1.
Those images, which are marked as abnormal in the first
stage, are not considered in the second stage. So expensive
calculations for wavelet decomposition of these images are
avoided. The second level DAUB4 wavelet approximation
coefficients are obtained and given as input to the self-
organizing neural network classifier. The results of
classification are given in Table 1. The number of MR brain
images in the input dataset is 52, of which 6 are of normal
brain and 46 are of abnormal brain. Experimentation was
done with varying levels of wavelet decomposition. The final
categories obtained after classification by a self-organizing
map depend on the order in which the input vectors are
presented to the network. Hence we have randomized the
order of presentation of the input images. Our experiment
was repeated, each time with a different order of input
presentation, and we have obtained the same classification
percentage and normal–abnormal categories in all the
experiments.
3.2.2. Classification from support vector machine
We have implemented the SVM in Matlab 7.1, using the OSU SVM
Matlab toolbox [19], with the wavelet-coded images as inputs
for our classification. This is a 2D classification
technique. In this paper, we treat the classification of MR
brain images as a two-class pattern classification problem. To
every wavelet-coded MR image, we apply a classifier to
determine whether it is normal or abnormal. As mentioned
previously, the use of SVM involves training and testing the
SVM with a particular kernel function, which in turn has
specific kernel parameters. Training an SVM is the most
crucial part of the machine learning process, as the ‘thinking’
procedure of the SVM depends on its past ‘experience’ in the
form of the training set. The standard training and testing sets
are created. We have used the second level approximation
wavelet coefficients of four normal images and six abnormal
images (randomly chosen) for training the SVM, and the rest of
the images are used for testing. The RBF and polynomial
functions have been used for non-linear training and testing,
with degrees 2, 3 and 4 for the polynomial kernel. The linear kernel was also used for
SVM training and testing, but it shows a lower classification
rate than the polynomial and RBF kernels. The SVM was later
tested using 52 images with different combinations in testing,
which were previously ‘unknown’ to the SVM. The
classification results with linear, polynomial and RBF kernels
are shown in Table 2. The accuracy of classification is high for
the RBF kernel in comparison with the polynomial and linear
kernels.
Table 2
Classification results from support vector machine
| Kernel used | Total number of images | Images in training (normal) | Images in training (abnormal) | Images in testing (normal) | Images in testing (abnormal) | Images misclassified | Classification accuracy (%) |
|---|---|---|---|---|---|---|---|
| Linear | 52 | 4 | 6 | 6 | 46 | 2 | 96.15 |
| Polynomial | 52 | 4 | 6 | 6 | 46 | 1 | 98 |
| Radial basis function | 52 | 4 | 6 | 6 | 46 | 1 | 98 |
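The accuracy figures in Tables 1 and 2 follow from the misclassification counts over the 52 images, as the short calculation below shows. The function name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(total, misclassified):
    """Percentage of correctly classified images."""
    return 100.0 * (total - misclassified) / total

results = [
    ("SOM", 3),
    ("SVM, linear kernel", 2),
    ("SVM, polynomial kernel", 1),
    ("SVM, RBF kernel", 1),
]
for name, wrong in results:
    print(f"{name}: {accuracy_percent(52, wrong):.2f}%")
```

This gives 94.23% for the self-organizing map, 96.15% for the linear kernel and 98.08% for the polynomial and RBF kernels, consistent with the rounded values of 94, 96.15 and 98 in the tables.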
4. Conclusions
With the help of wavelet and machine learning approach, we
can classify whether a brain image is normal or abnormal. A
novel approach for classification of MR brain images using
wavelet as an input to self-organizing maps and support vector
machine has been proposed and implemented in this paper.
Classification percentage of more than 94% in case of self-
organizing maps and 98% in case of support vector machine
demonstrates the utility of the proposed method. In this paper,
we have applied this method only to axial T2-weighted images
at a particular depth inside the brain. The same method can be
employed for T1-weighted, T2-weighted, proton density and
other types of MR images. With the help of the above approaches,
one can develop software for a diagnostic system for the
detection of brain disorders such as Alzheimer’s, Huntington’s
and Parkinson’s diseases.
Acknowledgements
We would like to thank the Department of Biotechnology,
Government of India, for providing financial support to
complete the work reported in this paper.
Dr. R. Anjan Bharati, ex-Consultant Radiologist from the
M.S. Ramaiah Memorial Hospital, Bangalore, is acknowledged
for providing several clarifications on the medical aspects of the
work.
References
[1] C.H. Moritz, V.M. Haughton, D. Cordes, M. Quigley, M.E. Meyerand,
Whole-brain functional MR imaging activation from finger tapping task
examined with independent component analysis, Am. J. Neuroradiol. 21
(2000) 1629–1635.
[2] R.N. Bracewell, The Fourier Transform and its Applications, third ed.,
McGraw-Hill, New York, 1999.
[3] S.G. Mallat, A theory of multiresolution signal decomposition: the
wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (7)
(1989) 674–693.
[4] P.D. Wasserman, Neural Computing, Van Nostrand Reinhold, New York,
1989.
[5] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed.,
PHI, 1994.
[6] R.P. Lippmann, An introduction to computing with neural nets, IEEE
Acoustics Speech Signal Processing Mag. 4 (2) (1987) 4–22.
[7] T. Kohonen, The self-organizing map, IEEE Proc. 78 (1990) 1464–
1477.
[8] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[9] Harvard Medical School, Web: data available at http://med.harvard.edu/
AANLIB/.
[10] R.C. Gonzalez, R.E. Woods, Digital Image Processing, second ed.,
Pearson Education, Ch. Wavelet and Multiresolution Processing, 2004,
pp. 349–408.
[11] J. Koenderink, The structure of images, Biol. Cybern. 50 (5) (1984) 363–
370.
[12] A. Cohen, Biorthogonal basis of compactly supported vectors, Commun.
Pure Appl. Math. 21 (1992) 485–560.
[13] I. Daubechies, Ten lectures on wavelets, in: CBMS Conference Lecture
Notes 61, SIAM, Philadelphia, 1992.
[14] T. Kohonen, Self-organizing Maps, second ed., Springer Series in
Information Sciences, 1997.
[15] J. Vesanto, Data mining techniques based on the self-organizing map,
Master’s thesis, Helsinki University of Technology, 1997.
[16] C. Burges, Tutorial on support vector machine for pattern recognition,
Data Mining Knowl. Discov. 2 (1998) 955–974.
[17] B. Scholkopf, Advances in Kernel Methods: Support Vector Learning,
MIT Press, Cambridge, MA, 1999.
[18] C.A. Micchelli, Interpolation of scattered data: distance matrices and
conditionally positive definite functions, Constr. Approximation 2 (1986)
11–22.
[19] C.C. Chang, C.C. Lin, LIBSVM: a library of support vector machines,
Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.