01564328

10
74 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2006 Data-Driven Brain MRI Segmentation Supported on Edge Condence and  A Priori  Tissue Information Juan Ramón Jiménez-Alaniz*  , Student Member , IEEE , V erónica Medina-Bañuelos  , Member , IEEE , and Oscar Yáñez-Suárez  , Member , IEEE  Abstract—Brain magnetic resonance imaging segmentation is accomplished in this work by applying nonparametric density es- timation, using the mean shift algorithm in the joint spatial-range domain. The quality of the class boundaries is improved by in- cluding an edge condence map, that represents the condence of truly being in the presence of a border between adjacent regions; an adjacency graph is then constructed with the labeled regions, and analyzed and pruned to merge adjacent regions. In order to assign image regions to a cerebral tissue type, a spatial normaliza- tion between image data and standard probability maps is carried out, so that for each structure a maximum  a posteriori probability criterion is applied. The method was applied to synthetic and real images, keeping all parameters constant throughout the process for each type of data. The combination of region segmentation and edge detection proved to be a robust technique, as adequate clusters were automatically identied, regardless of the noise level and bias. In a comparison with reference segmentations, average Tanimoto indexes of 0.90–0.99 were obtained for synthetic data and of 0.59 –0.9 9 for real data , cons ider ing gra y matt er , whit e matter , and backgro und.  Index T erms—Brain MRI, edge detection, image segmentation, mean shift, nonparametric estimation. I. INTRODUCTION S EGMENTATION of the brain structures from magnetic resona nce imaging (MRI) is appli ed in the study of many disorders, such as multiple sclerosis, schizophrenia, epilepsy, Parkinson’s disease, Alzheimer’s disease, cerebral atrophy, etc. MRI is particularly suitable for brain studies [ 1] because it is virtually noninvasive, and it presents a high spatial resolution and an excellent contrast of soft tissues [ 2]. Additionally, MRI segmentation is an important image processing step to identify anatomical areas of interest for diagnosis, treatment, surgical planning, image registration and functional mapping. In MRI , the tiss uesarecha rac ter ize d by vo xel intensities orig - inated from spatially differentiated MR signals. The goal of the segmentation procedure in medical applications is to partition Manuscript received September 6, 2005; revised September 28, 2005. This work was supported in part by the Consejo Nacional de Ciencia y Tecnologiá (CON ACy T) unde r Grant41203 and the Labo ratori o Franc o Mex icanode Infor - matica (LAFMI). The Associate Editor responsible for coordinating the review of this paper and recommending its publication was C. Davatzikos.  Asterisk in- dicates corresponding author. *J. R. Jiménez-Alaniz is with the Neuroimaging Laboratory, Department of Electrical Engineering, Universidad Autónoma Metropolitana—Izta palapa, Av. San Rafael Atlixco 186, Col. Vicentina, México, D. F., 09340, México (e-mail:  [email protected]). V. Medina-B añuel os and O. Yáñez -Suár ez are with the Neuro imag ing Laboratory, Depa rtmen t of Electr ical Engin eerin g, Univ ersid ad Autó noma Metro poli tana—Iztapalapa, Col. Vi cent ina D. F ., 0934 0, Méxi co (e-mail: vera@xanum .uam.mx; yaso@xanu m.uam.mx). Digital Object Identier 10.1109/TMI.2005.86 0999 an image into signicant anatomical regions, which are homo- geneous according to a specic property; this is carried out by nding the discriminating tissue properties and by labeling the voxels considering these features. MRI segmentation using pattern recognition techniques has been particularly successful for brain images [ 3]. Statistical pat- tern recognition approaches usually dene a parametric model, which assumes a particular probability density function (pdf) of the selected features. Parametric methods are useful when the dist rib uti on of fea tur es for the dif ferentcla sse s is well kno wn, or can be adequately modeled. A parametric approach frequently used is the mixture model, where the mixture compone nts are normally distributed [4]–[7]. Model parameters are commonly determined by maximum-likelihood (ML) [ 2], [8] or maximum a posteriori (MAP) estimates [9], using algorithms such as ex- pectation-maximization. Nevertheless, in MRI the distribution of tissue classes is not necessarily a known statistical function, which results in a biased estimation, thus producing poor seg- mentations[4]. Also, onemajor tis sue cla ss is usu all y ass oci ate d with a number of normal components within the mixture, thus requiring the estimation of a signicant amount of parameters; thi s can cau se sta bil ityandcon ver gen ce pro ble ms. Fur the rmo re, borders between classes in the pdf can be complex and not ex- pressible with a parametric form [ 10]. Segmentation stability and convergence can be improved by using nonparametric clustering algorithms which do not rely on predened distributions, being rather built directly from the ac- tual data samples, without requiring knowledge or assumptions about their statistical properties [3], [4]. In such methods, the proble m is to ndclusters of vo xels wit h homogeneous features, by nding the cluster centers and the class membership assign- ments of each cluster [6]. Most of the published clustering tech- niques are usually inadequate to analyze feature s paces directly from real data, because they rely upon  a priori  knowledge of the number of clusters present (e.g., k-means), or they implic- itly assume a given cluster geometry, which makes them unable to handle the complexity of the real feature space. Since these techniques impose a rigid delinea tion over the feature space and require a reasonable guess for the number of clusters present, they can return erroneous results when the embedded assump- tions are not satised. Therefore, arbitrarily structured feature spaces can only be analyzed by nonparametric methods [ 11]. As a robust alternative, a clustering technique which does not require prior knowledge of the number of clusters, and does not constrain their shape, is built upon the mean shift (MS). This is an iterative technique which estimates the modes of the multi- variate distribution underlying the feature space. The number of 0278-0062 /$20.00 © 2006 IEEE

Upload: desiree-lopez-palafox

Post on 22-Feb-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 1/10

74 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2

Data-Driven Brain MRI Segmentation Supported Edge Condence and A Priori Tissue Information

Juan Ramón Jiménez-Alaniz* , Student Member, IEEE , Verónica Medina-Bañuelos , Member, IEEE , andOscar Yáñez-Suárez , Member, IEEE

Abstract— Brain magnetic resonance imaging segmentation isaccomplished in this work by applying nonparametric density es-timation, using the mean shift algorithm in the joint spatial-rangedomain. The quality of the class boundaries is improved by in-cluding an edge condence map, that represents the condence of truly being in the presence of a border between adjacent regions;an adjacency graph is then constructed with the labeled regions,and analyzed and pruned to merge adjacent regions. In order toassign image regions to a cerebral tissue type, a spatial normaliza-tion between image data and standard probability maps is carriedout, so that for each structure a maximum a posteriori probabilitycriterion is applied. The method was applied to synthetic and realimages, keeping all parameters constant throughout the processfor each type of data. The combination of region segmentationand edge detection proved to be a robust technique, as adequateclusters were automatically identied, regardless of the noise leveland bias. In a comparison with reference segmentations, averageTanimoto indexes of 0.90–0.99 were obtained for synthetic dataand of 0.59–0.99 for real data, considering gray matter, whitematter, and background.

Index Terms— Brain MRI, edge detection, image segmentation,mean shift, nonparametric estimation.

I. INTRODUCTION

S EGMENTATION of the brain structures from magneticresonance imaging (MRI) is applied in the study of manydisorders, such as multiple sclerosis, schizophrenia, epilepsy,Parkinson’s disease, Alzheimer’s disease, cerebral atrophy, etc.MRI is particularly suitable for brain studies [1] because it isvirtually noninvasive, and it presents a high spatial resolutionand an excellent contrast of soft tissues [2]. Additionally, MRIsegmentation is an important image processing step to identifyanatomical areas of interest for diagnosis, treatment, surgicalplanning, image registration and functional mapping.

In MRI, thetissuesarecharacterized by voxelintensities orig-inated from spatially differentiated MR signals. The goal of thesegmentation procedure in medical applications is to partition

Manuscript received September 6, 2005; revised September 28, 2005. Thiswork was supported in part by the Consejo Nacional de Ciencia y Tecnologiá(CONACyT) under Grant41203and theLaboratorio Franco Mexicanode Infor-matica (LAFMI). The Associate Editor responsible for coordinating the reviewof this paper and recommending its publication was C. Davatzikos. Asterisk in-dicates corresponding author.

*J. R. Jiménez-Alaniz is with the Neuroimaging Laboratory, Department of Electrical Engineering, Universidad Autónoma Metropolitana—Iztapalapa, Av.San Rafael Atlixco 186, Col. Vicentina, México, D. F., 09340, México (e-mail: [email protected]).

V. Medina-Bañuelos and O. Yáñez-Suárez are with the NeuroimagingLaboratory, Department of Electrical Engineering, Universidad AutónomaMetropolitana—Iztapalapa, Col. Vicentina D. F., 09340, México (e-mail:[email protected]; [email protected]).

Digital Object Identier 10.1109/TMI.2005.860999

an image into signicant anatomical regions, which are homgeneous according to a specic property; this is carried out bnding the discriminating tissue properties and by labeling thvoxels considering these features.

MRI segmentation using pattern recognition techniques habeen particularly successful for brain images [3]. Statistical pat-tern recognition approaches usually dene a parametric modewhich assumes a particular probability density function (pdf) the selected features. Parametric methods are useful when thdistribution of features for thedifferentclasses is well known,ocan be adequately modeled. A parametric approach frequentlused is the mixture model, where the mixture components anormally distributed [4]–[7]. Model parameters are commonlydetermined by maximum-likelihood (ML) [2], [8] or maximuma posteriori (MAP) estimates [9], using algorithms such as ex-pectation-maximization. Nevertheless, in MRI the distributioof tissue classes is not necessarily a known statistical functiowhich results in a biased estimation, thus producing poor sementations [4]. Also, onemajor tissueclass is usually associatedwith a number of normal components within the mixture, thurequiring the estimation of a signicant amount of parameterthis cancause stabilityandconvergence problems.Furthermoreborders between classes in the pdf can be complex and not expressible with a parametric form [10].

Segmentation stability and convergence can be improved busing nonparametric clustering algorithms which do not rely opredened distributions, being rather built directly from the atual data samples, without requiring knowledge or assumptionabout their statistical properties [3], [4]. In such methods, theproblem is to ndclusters of voxels with homogeneous featuresby nding the cluster centers and the class membership assigments of each cluster [6]. Most of the published clustering tech-niques are usually inadequate to analyze feature spaces directfrom real data, because they rely upon a priori knowledge of

the number of clusters present (e.g., k-means), or they impliitly assume a given cluster geometry, which makes them unabto handle the complexity of the real feature space. Since thestechniques impose a rigid delineation over the feature space anrequire a reasonable guess for the number of clusters presenthey can return erroneous results when the embedded assumptions are not satised. Therefore, arbitrarily structured featuspaces can only be analyzed by nonparametric methods [11].As a robust alternative, a clustering technique which does nrequire prior knowledge of the number of clusters, and does nconstrain their shape, is built upon the mean shift (MS). This an iterative technique which estimates the modes of the multvariate distribution underlying the feature space. The number

0278-0062/$20.00 © 2006 IEEE

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 2/10

JIM ÉNEZ-ALANIZ et al. : DATA-DRIVEN BRAIN MRI SEGMENTATION SUPPORTED ON EDGE CONFIDENCE AND a priori TISSUE INFORMATION 75

clusters is obtained automatically by nding the centers of thedensest regions in the space [ 12].

Another frequently used approach in segmentation consistson the integration of high-level knowledge. A digital atlas pro-vides a priori information about the spatial distribution of prob-abilities that a voxel belongs to a given brain structure; this

strengthens the process of de ning complex organs [ 13], andcan be used to nd the most probable class membership. Theuse of a priori information atlases must be preceded by a reg-istration step and this is carried out through a process of spatialnormalization that minimizes a cost function between the atlasand the image. Several atlases have been proposed for brain seg-mentation [ 8], [14], [15].

This work proposes a nonparametric estimation strategy,based on the MS algorithm, that uses the local modes of theunderlying joint space-range density function to de ne thecluster centers [ 16]–[18]. This algorithm has proven to berobust under the presence of different levels and types of noiseand doesn ’t assume a probability density function form. Thequality of the class boundaries is improved by including anedge con dence map that represents the con dence of trulybeing in the presence of a border between adjacent regions. Thecondence measure is also used to merge regions with weak edges, through the iterative application of transitive closureoperations in a region adjacency graph (RAG). The fusion stepis completed with a pruning process to eliminate very smallregions. To automate class labeling, a MAP decision is taken,based on a digital brain atlas against which a spatial normaliza-tion is carried out. The robustness of the MS estimation and itsedge-preserving ltering property, together with the edge andregion analysis incorporated in the proposed method, make it

amenable to process MR images with varying degrees of eldinhomogeneity and additive noise.

The paper is organized as follows: MS algorithm and edgecondence map concepts are detailed in Section II. Regionfusion is undertaken in Section III, and the incorporation of a priori information in Section IV. Section V presents a brief summary of the entire procedure. Results for synthetic andreal data are presented in Section VI, and nally, the proposedalgorithm is discussed to draw some conclusions in Section VII.It is important to point out that some preliminary results of thisresearch have been published elsewhere [ 19], [20].

II. M EAN SHIFT AND EDGE CONFIDENCE MAP FILTERING

A. Nonparametric Estimation

Fukunaga [ 16] proposed a nonparametric method of kernelestimation to cluster an observations set in different classes,by assigning each observation to the pdf ’s nearest mode alongits gradient direction; this is called the MS algorithm. Subse-quently, in 1995, Cheng [ 17] generalized the MS as a modeseeking process of any real function. Comaniciu and Meer [ 18]rst applied MS for color image segmentation; they proposedan algorithm, that is initialized with a data sample to nd thecluster centers, to validate them, and nally, to delineate theclusters using the nearest neighbor method. Several works werepublished afterwards related to MS [ 11 ], [12], [19]–[22].

B. Mean Shift Algorithm

Let be an arbitrary set of observations in the-dimensional Euclidean space , drawn from an unknown

probability density function . According to the a priori knowl-edge of an estimate of the probability density function canbe constructed based on a nonparametric strategy.

The feature spaces in an image are often characterized byvery irregular data clusters whose number and shape are notavailable. This strongly suggests that a nonparametric approach,which provides reliable detection of the local maxima of theunderlying density, should be employed for the analysis. Thekernel density estimation is a simple and reliable estimationmethod, and for those kernels satisfying mild conditions the es-timate is asymptotically unbiased, consistent in a mean squaresense and uniformly consistent in probability.

The multivariate kernel density estimator with kernel andwindow radius (also called smoothing parameter or band-width), computed at the -dimensional point is de ned as [ 23]

(1)

Thequality of theestimate is expressedas themean integratedsquared error (MISE), and the optimal kernel yielding minimumMISE is the multivariate Epanechnikov kernel [ 23], given by

if otherwise

(2)

where is the volume of a unit -dimensional sphere. Byemploying a differentiable kernel an estimate of the densitygradient can be de ned as the gradient of the kernel densityestimate (1)

(3)

Conditions on the kernel and the window radius toguarantee asymptotic unbiasedness, mean-square consistency,and uniform consistency of the estimate (3) are derived in [ 16].For the Epanechnikov kernel (2) the density gradient estimate(3) becomes

(4)

where is a hypersphere of radius having the volume, centered on , and containing data points. The last

term in (4)

(5)

is known as the sample MS [ 18], [19], [21].The quantity is the kernel density estimate

computed within and, thus, (4) can be written as

(6)

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 3/10

76 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2006

Fig. 1. MS paths, showing two local maxima (circles) and all points that werelead to these maxima. The ltered result is obtained by replacing the originalintensity value with the value at the attained maxima.

which yields

(7)

This property can be used in an optimization procedure toclimb over the estimated pdf surface, , to search for thelocal maxima, i.e., the modes of the distribution [ 19]. The MSvector is proportional to the normalized pdf gradient estimate[11 ], [17], points in the pdf gradient direction, and since it isaligned with the local gradient estimate in , it can de ne a pathleading to a stationary point (mode) of the estimated pdf, as itcan be appreciated in Fig. 1.

The procedure computes for eachdata point, shiftsthekernel centers by these quantities, and iterates until the magni-tudes of the shifts are less than a given threshold or a certainnumber of iterations is attained.

C. Edge Condence Map

The most frequently used edge detection methods are basedon gradient orientation. These methods rely in the coherence of edges close to the gradient direction, and their main drawback is the lack of a way to extract a reliable edge direction; thisproblem can be solved by using an edge con dence measure,as proposed in [ 24]. Gradient estimation is carried out using adifferentiation mask, , de ned in an window, anddening a gradient subspace in . As it can be observedin Fig. 2, each mask vector ( and ) de nes a subspacedimension, and the data is an arbitrary vector in inthe orthogonal complement of subspace.

Gradient estimation is equivalent to projecting the data intothe gradient subspace, the projection is the vector , and itsorientation , is the angle between the data projection and oneof the mask vectors. The parameter can be used to de ne anideal edge template, , with the same estimated gradient orienta-tion. A measure of con dence for the presence of an edge in theanalyzed data can be de ned as: , which correspondsin the image domain, to the correlation coef cient between thenormalized template and the data (Fig. 2). This edge con dencemeasure has the advantage of being independent of the gradientmagnitude.

Fig. 2. Determination and de nition of the measure of con dence from anedge. and are the vectors corresponding to the differentiation masks,dening the gradient subspace, which are used to obtain the projection

of data onto the plane of the gradient operator. The estimated orientationof the gradient allows to de ne and ideal edge template somewhere outsidethe subspace. The measure of con dence for the presence of an edge isthe absolute value of the correlation coef cient between the data and thetemplate .

Since the gradient vector always points toward the directionof maximum increment, the estimated edge orientation is givenby

(8)

where is the estimated gradient orientation, and is the datamatrix, centered in each image voxel and of the same size that

. Edge orientation is used to de ne the template of an idealedge model with orientation and, thus, the con dence mea-

sure of each voxel, can be calculated by correlating the data ma-trix with its corresponding template.

Instead of the gradient magnitude, its normalized values areused, that is, the percentiles of their cumulative distribution of the gradient magnitude, as criterion for the presence of an edge.Finally, the con dence map is computed for each voxel as alinear combination of the cumulative distribution function andthe measure of edge con dence

(9)

where is a constant between 0 and 1 which controls theblending of gradient magnitude and the local patterninformation.

D. Coupling Mean Shift Filtering and Edge Condence

An improved MS estimate can be obtained by weighting eachvoxel within the region by a function of its edge con dence ,so that voxels that lie close to an edge (edge con dence 1) areless in uential in the determination of the new cluster center.The modi ed MS that includes weighted edge con dence is,from (5)

(10)

This weighted MS procedure is applied for the data pointsspeci ed both in the spatial and intensity domains, which con-stitute a joint spatial-range domain [ 21]. An Euclidean metric

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 4/10

JIM ÉNEZ-ALANIZ et al. : DATA-DRIVEN BRAIN MRI SEGMENTATION SUPPORTED ON EDGE CONFIDENCE AND a priori TISSUE INFORMATION 77

is used to control the quality of the segmentation, which is de-pendent on the radii and , corresponding to the resolu-tion parameters of the kernel estimate in the spatial and inten-sity (range) domains. After the MS procedure is applied to eachdata, those points that are suf ciently close in the joint domainare fused to obtain the homogeneous regions in the image. The

number of clusters present in the image is automatically deter-mined by the number of signi cant modes detected. A lteredimage can be drawn from the output of the MS procedure, byreplacing each voxel with the gray level of the mode it is asso-ciated to.

III. FUSION OF REGIONS

Once the ltered image is obtained, the tissue regions must bedelineated in the feature space, by clustering all those regionsthat are near within the spatial and range domains, that is, allthose voxels that have converged to the same point are fused andlabeled. The region delineation can be re ned by incorporating

a RAG that is then simpli ed with a union- nd algorithm [ 25],as described next.

A. Region Adjacency Analysis

The most straightforward representation for graphs is theso-called adjacency matrix representation, in which booleanvalues are established to indicate if there exists an edge betweenregions. To construct the adjacency matrix, each labeled regionof the ltered image is considered a vertex and a value of 1 isassigned to the adjacent labels, to identify neighboring regionsand 0 otherwise.

The number of regions determined by the weighted MS pro-

cedure may turn to be arbitrarily large, which produces an over-segmentation of the image. In order to fuse together homoge-neous adjacent regions that have been split apart by the MSprocedure, a transitive closure operation [ 26] on the RAG canbe performed. This operation links regions that: have associ-ated modes that are located within a distance of apart, andsatisfy a weak boundary strength condition. Such a boundarystrength measure can be derived directly from the edge con -dence (9) by adding the con dence values for the voxels alongthe boundary separating the regions. Whenever the measure isbelow a given threshold , the regions are nally fused.

A reduction procedure is then carried out on the RAG by ap-plying a union- nd algorithm, in order to nd the labels con-nected in the graph. The operations of transitive closure andunion- nd are applied iteratively until the number of regions be-tween iterations, remains unchanged.

B. Pruning

When the fused image is obtained, a pruning process is ap-plied to remove all the regions whose area is less than a min-imum size . In order to make this removal, the RAG is tra-versed, uniting regions whose area is less than with a regionthat is either adjacent and the distance of its mode is a minimumto that of the region being pruned; or it is the only adjacentregion having an area greater than the threshold. The pruningprocess is repeated iteratively, until the number of regions re-mains unchanged, or a given number of iterations is reached.

Fig. 3. A priori information maps. Example of the same slice in each one of the probabilistic maps: (a) background, (b) CSF, (c) GM, and (d) WM.

IV. INCORPORATION OF A P RIORI INFORMATION

The procedures mentioned up to now, produce homogeneousintensity regions, reducing as much as possible the number of regions that constitute the brain image volume. However, thenumber of regions can still be high, considering that in this work we are interested only on the identi cationof: gray matter (GM),white matter (WM), and cerebrospinal uid (CSF). In order toobtain the desired number of regions, the use of a priori knowl-edge, taken from probability maps, is proposed. Those mapsrepresent the a priori probability of a voxel being either GM,WM, or CSF, after an image has been normalized to the mapspace [ 27], [28]. Since only GM, WM, and CSF maps are avail-able, a fourth probability map for background and noncerebraltissue was calculated. Fig. 3 shows a coronal slice of the nor-

malized probabilistic maps.

A. Image Registration

Prior to segmentation, a registration step between data andprobabilistic maps must be applied. Image registration is car-ried out by rst using an af ne spatial transformation that takesimages of different subjects into roughly the same coordinatesystem, followed by a nonlinear spatial normalization to cor-rect gross differences in head shapes. The spatial transforma-tion that maps each voxel to its equivalent location in the prob-ability map, is determined by using template images placed inthe same stereotactic space (Talairach and Tournoux), than theprobability images. Template and probabilistic maps were takenfrom the digital atlas distributed with the statistical parametric

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 5/10

78 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2006

mapping (SPM) [ 28] and the registration procedure was carriedout using the same software.

B. Region Classi cation

The choice of properties or, equivalently, the de nition of aclass, is the fundamental matter in the classi cation problem.

Once the probability images have been transformed to the dataspace, the classi cation of cerebral MRI requires essentially tocalculate the probabilities for each found region, and to assignthem to the class with greater probability, according to a classi-cation rule.

1) Bayesian Classi er: Bayes rule allows to compute the a posteriori probability, , from the a priori probability

and the class-conditional density . Letbe the image regions obtained from the RAG anal-

ysis, to be classi ed into classes, . Each regionis represented by a feature vector , so that the conditionalprobabilities are , . Assuming that the a priori probabilities and the class-conditional probabilitydensity functions are known, then the probability den-sity of nding a region that is in class and has feature vector

can be written as: ;rearranging terms, the Bayes formula [ 29] is

(11)

The a priori probability represents the information of theprior knowledge contained in the probabilistic atlas. The likeli-hood of with respect to , , quanti es the probabilitycontributed by the observed data , and indicates the class

for which is large enough to be the true class. In aclassi cation task, the Bayes decision rule for a region , rep-resented by the feature vector , amounts to assigning to class

if

(12)

For each homogeneous region found in the step of RAG anal-ysis, the a posteriori probability is computed, and the regionsare classi ed according to the Bayes decision rule, to obtain aprobability map associated to each of the four desired classesin the brain image (background, CSF, GM, and WM). Next, a

seed region is extracted, by searching the largest region withmaximum probability in each map, to obtain four initial seeds,characterized by the intensity mode found by the MS algorithm.The classi cation of the other regions needs a distance measurethat establishes how far the region ’s intensity is from theseed ’s intensity . For each region , its distance to classis

(13)

and can be turned into a probability as follows:

(14)

where is the probability of the region belongingtoclass , and isassigned tothe class withmaximum prob-ability .

V. M ETHODOLOGY OVERVIEW

The segmentation methodology proposed can be summarizedas follows:

1) computation of the edge con dence map from the data tobe segmented (Section II-C);

2) ltering by weighted MS procedure (Section II-B);3) region adjacency analysis, applying transitive closure op-

erations and pruning (Sections III-A and B);4) spatial normalization between image data and a priori

probability maps (Section IV-A);5) region classi cation computing the a posteriori probabili-

ties for each region and selecting the corresponding seeds(Section IV-B).

VI. R ESULTS

In order to test the proposed segmentation technique, the al-gorithm was implemented in 4D, in the joint spatial (x, y, z) andrange (intensity) domain. Synthetic and real images were used,taken from the Internet Brain Segmentation Repository (IBSR)[30] and the BrainWeb: Simulated Brain Database (SBD) [ 31].One type of synthetic data corresponds to ve volumes sim-ulating the brain, with a very simple three-dimensional (3-D)model, and the other to a simulated MRI volume for normalbrain. The real data consist of 20 normal brain MR data setsand their corresponding manual segmentations, provided by theCenter for Morphometric Analysis at Massachusetts GeneralHospital and available at IBSR.

A. Synthetic Data

As a rst example, we utilize the synthetic sphere phantommodel from the IBSR. The basic model is built from 52 slicesof 256 256 pixels containing two concentric spheres, wherethe inner sphere represents WM and the outer sphere GM. GMpixels are drawn from a normal distribution N(107, 8), whileWM pixels are drawn from distribution N(147, 5). Available testvolumes are derived from this basic model by further contami-nation with varying intensity bias elds, producing from zero to

full class overlap. In Fig. 4, the results of the different segmenta-tion process steps for a single slice can be appreciated. In orderto test the robustness of the proposed algorithm and its general-ization capability, ve synthetic volumes were processed underdifferent noise conditions and without any variation of the pa-rameters, which are shown in Table I.

Fig. 4 shows an example slice, where the intensities of the twocontained classes are strongly overlapped due to bias artifacts;the intermediate results through all the segmentation processcan also be observed. The rst step is to compute the edgecon dence map [Fig. 4(b)] starting from the original image[Fig. 4(a)]; in this step, the involved parameters are: the gra-dient window size, , the edge threshold, , and the blendingfactor, . The second step consists on ltering the originalimage, where the spatial radius, , and the intensity radius,

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 6/10

JIM ÉNEZ-ALANIZ et al. : DATA-DRIVEN BRAIN MRI SEGMENTATION SUPPORTED ON EDGE CONFIDENCE AND a priori TISSUE INFORMATION 79

Fig. 4. Segmentation of synthetic model with noise and bias: (a) original data,(b) edge con dence map, (c) ltered data, (d) labeling of ltered data, (e) resultafter applying transitive closure, and (f) segmented slice.

TABLE IPARAMETER VALUES

, must be considered. The ltered image [Fig. 4(c)] contains

many homogeneous regions (1165 in this example) which arelabeled with an arbitrary color scale in Fig. 4(d); this result isthen fed through the third step for region adjacency analysis,to merge and reduce the number of regions through transitiveclosure operations. The threshold of boundary strengthand the intensity radius must be considered in this step,to obtain a fused image, shown in Fig. 4(e), with 83 regionsin this case. In order to nish this step a pruning process isapplied, where those regions having a number of voxels lowerthan the region size threshold are fused to other regions,to nally obtain a fully segmented image, shown in Fig. 4(f).The resulting segmentation contains only two clusters (three, if

the background is considered), and it can be observed that theprocess is able to identify and separate the clusters, despite of noise and intensity overlap. The same behavior was observedfor all the ve synthetic volumes tested.

The second synthetic volume was obtained from the SBD andcorresponds to a brain digital phantom. This phantom consistsof T1 weighted images with a , 1 mmvoxel resolution, 3% noise, and a 20% intensity inhomogeneity.For this volume the ground truth is available for quantitativecomparison; in order to measure the similarity between the seg-mented image obtained with our procedure and the ground truth,the average Tanimoto coef cient was evaluated for the wholestack. It is important to mention here that Tanimoto indexes areglobal performance metrics that have been criticized for theirlack of local sensitivity to error; however, they have come to be

Fig. 5. Intermediate results corresponding to each step of the segmentationprocess: (a) original image, (b) con dence map, (c) ltered image, (d) labeledimage, (e) fused image by transitive closure, (f) pruned image, (g) segmentedimage with modes, (h) a posteriori probabilities image, and (i) classi ed image.

the standard metric in MRI segmentation analysis, and for per-formance comparison purposes they still have to be used.The Tanimoto coef cient (TC), between the segmented image

(X) and expert segmentation (Y), is the ratio of the number of el-ements they have in common classi ed as class , to the numberof all elements classi ed as class . It is then de ned as [ 29]

(15)Fig. 5 shows all the intermediate results corresponding to

each step of the methodology (see Section V). Fig. 5(a) is theoriginal image; Fig. 5(b) presents the corresponding con dence

map (step 1); Fig. 5(c) contains the mean-shift ltered image(step 2); Fig. 5(d) is the same ltered image, labeled to applyregion adjacency analysis, Fig. 5(e) is the image obtainedafter transitive closure and Fig. 5(f) is the result obtainedafter pruning (step 3); Fig. 5(g) shows the segmented image,labeled with the corresponding estimated intensities; Fig. 5(h)presents the a posteriori probabilities for each region of image5f, estimated to select seeds, and Fig. 5(i) shows the nalclassi cation (step 5). Fig. 6 shows the results obtained fortwo slices. The rst column shows the original images; thesecond, the segmented results applying the methodology andthe third, the reference images. The similarity indexes (mean

std) between the phantom brain segmented by the proposedmethodology and the ground truth were as follows: 0.9960.005 for background, 0.871 0.031 for CSF, 0.9 0.012 for

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 7/10

80 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2006

Fig. 6. Two examples of synthetic brain segmentations: column (a) originalimages, column (b) segmented images, and column (c) reference images.

Fig. 7. Tanimoto indexes along the axial scans (slices) of a selected case.Comparison is made between a segmented synthetic brain (phantom) andthe reference truth. Synthetic scans have been processed with the proposedmethodology.

GM, and 0.924 0.038 for WM. Fig. 7 shows the variation of

the Tanimoto indexes throughout the phantom stack.

B. Real Data

The same procedure was applied to real data, by processingthe 20 data sets and their manual segmentation collected fromthe IBSR. The expert manual segmentation is considered thereference set to be used to compare our procedure with othermethods also available at the IBSR. Comparison is achievedthrough the computation of Tanimoto indexes between the ex-perimental results and the reference. The parameter values usedto process these volumes are shown in Table I, and were keptconstant for all slices of all subjects; note that these are also theparameters used in the synthetic brain example. Two segmen-tations of coronal slices are shown in Fig. 8; the rst column

Fig. 8. Two examples of real image segmentations: (a) original images,(b) segmentation results, and (c) manual classi cations by the expert.

Fig. 9. Volume reconstructionsof (I)GM, (II)WM, and(III) CSF. Comparisonbetween (a) reference and (b) automatically segmented volumes.

presents the original data, followed by the segmentation ob-tained with the proposed methodology in the second column,that can be compared with the manual segmentation carried outby the expert shown in the third column. Fig. 9 shows a 3-Dview of a whole stack segmentation obtained with the approach(column b), for GM (row I), WM (row II), and CSF (row III),which can be compared with the corresponding reference incolumn (a).

Table II shows a comparison of the average TC, obtained bysegmenting the 20 available volumes, with several approachesreported in the IBSR; the last row presents the quanti cation of our results obtained for the same volumes. It can be observed

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 8/10

JIM ÉNEZ-ALANIZ et al. : DATA-DRIVEN BRAIN MRI SEGMENTATION SUPPORTED ON EDGE CONFIDENCE AND a priori TISSUE INFORMATION 81

TABLE IITANIMOTO C OEFFICIENT B ETWEEN THE SEGMENTED IMAGE

AND THE REFERENCE A DAPTED FROM [30]

that the proposed approach has a superior performance for thethree segmented structures of interest.

The performance of the proposed algorithm for each stack can be observed and compared in Fig. 10, which shows the in-dividual Tanimoto indexes for each volume obtained with otherreported procedures, for both GM and WM. Stack ordering fol-

lows the degree of segmentation dif culty as assigned by humanexperts (IBSR), with stack one being the hardest to process andstack 20 the easiest.

The algorithm has been implemented on a PC desktop ma-chine (Pentium III Xeon 1 GHz), using Matlab; on this platform,the segmentation of a 256 256 image takes about one minutein a slice implementation and 13 minutes for the volumetric im-plementation.

VII. D ISCUSSION

Most of the brain segmentation techniques discussed in theeld require conformance to diverse assumptions, such as a spe-

cic probability density distribution [ 7], [15], [32]–[34], piece-wise linear gray level homogeneity [ 35], or a given numberof tissue classes present in the image [ 1], [7], [15], [34]. Fur-ther, there are usually requirements for initialization procedures[32]–[35], or image preprocessing for bias correction [ 1], [2],[8], [33], [35], all performed previous to actual image segmen-tation. On the other hand, feature space analysis for image seg-mentation as proposed in this paper is exclusively guided bythe data when the MS algorithm is applied, avoiding the afore-mentioned shortcomings (statistical assumptions, initializationor preprocessing). MS segmentation has found its applicationsin object recognition and tracking, and in color image segmen-

tation and compression. To the best of our knowledge, the ap-proach proposed in this paper is the rst application of such apowerful technique to brain image segmentation.

The edge con dence map weighted MS algorithm has showna good performance in the search and localization of the modesof an unknown pdf, allowing the delineation of homogenous re-gions of a cerebral MRI. Furthermore, the proposed classi ca-tion strategy allows the identi cation of tissue type that each re-gion corresponds to. Combination of edge con dence maps andhomogenousregions segmentation, showed a good performancein presence of noise and intensity inhomogeneity, preserving theborders between regions, and proving to be robust in the separa-tion of image classes. In the proposed strategy, the identi cationof neuroanatomical structures depends on the number of proba-bilistic maps available.

Fig. 10. Tanimoto coef cients for (a) GM and (b) WM for each of the 20case studies, as processed by six reported methods and the proposed procedure(adapted from [ 30]). The brain scans have been roughly ordered by theirdif culty to be segmented: scan 1 is the most dif cult and scan 20 the least.The “ms” line shows the indexes obtained by our approach; the other linescorrespond to the methods reported in IBSR.

Even though the procedure does not require an initializationstage, the parameters might have to be tuned prior to segmen-tation. We carried out this process by estimating the set of pa-rameters that best segmented a synthetic brain test image, con-sidering the Tanimoto index as the evaluation criterion. Thesevalues were also successfully used for real images, thus indi-cating that a default parameter set can be determined.

Spatial resolution has a major effect in execution time: thegreater the spatial radius, the larger the processing time. Onthe other hand, segmentation quality depends considerably onthe intensity resolution: a small produces many regions, thusover-segmenting the image, while a higher intensity radius canproduce sub-segmented images (few regions). Only image fea-tures with large spatial support are represented in the lteredimage when increases, andonly structures with very differentintensity values remain independent when is large.

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 9/10

82 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 25, NO. 1, JANUARY 2006

A value of edge threshold near zero or one produces, re-spectively, over or sub-segmentation; whereas a small blendingfactor (relevance of con dence larger than that of gradient mag-nitude) preserves weak edges and deals with bias effects. Theboundary strength threshold is calculated from the con dencemap, and in uences the region fusion step, over-segmenting

when its value is near zero, and highlighting the edges whenit tends to one, producing sub-segmentation.In summary, the parameters that have a higher in uence in

the algorithm performance are the kernel radii and theblending factor . At this moment, the approach proposedhere is not intended to be run in a fully automatic mode and,thus, an initial calibration step might be necessary for param-eter tuning. In future work, an adaptive radii scheme can beproposed.

Figs. 8 and 9 provide a qualitative comparison between thesegmentations obtained with the proposed method and the refer-ence stack, in two-dimensional and 3-D view, respectively. Thesimilarities between segmented structures can be clearly appre-ciated for GM and WM. In the case of the CSF reconstructions,it is seen that our algorithm is capable of detecting both intra-and extra-ventricular CSF, which the reference stacks do notinclude. While in a rst impression this condition appears assegmentation noise, it is indeed a more complete description of the corresponding class. As for quantitative results, the perfor-mance of our method is clearly appreciated for brain simula-tions, where Tanimoto indexes of 0.87 and more were obtainedfor all structures. In real data our results show a better perfor-mance compared to the methods reported in [ 30] (see Table II),while other authors [ 15], [33] have obtained higher indexes forthesame image cases.However, the lack of sensitivityof theper-

formance measure for describing local changes makes it hard toargue about the overall superiority of one method with respect tothe other. In this perspective, the method comparisons are morerelevant in the domain of model assumptions, required prepro-cessing, or computational load.

To conclude, the main contributions of the method proposedin this paper are: 1) the application is totally driven by the datacontained in the image, using a xed parameter set that is notindividually tuned to each case; 2) the MS estimation is robusteven in the presence of severe bias inhomogeneity, without theneed of a prior or simultaneous bias estimation and correctionprocedure; and 3) the use of an edge con dence map adequatelypreserves tissue boundaries, even for weak edges due to partialvolume effects.Beyond initial parameter selection, the approachis fully automatic being, therefore, an alternative tool for radiol-ogists, saving time and providing information that depends onlyon the image data.

ACKNOWLEDGMENT

The authors wish to acknowledge and thank the guidance andcomments of Dr. P. Meer, fundamental for the development of their research.

REFERENCES

[1] M. A. Gonz ález Ballester, A. Zisserman, and M. Brady, “Segmenta-tion and measurement of brain structures in MRI including con dencebounds, ” Med. Image Anal. , vol. 4, pp. 189 –200, 2000.

[2] W. M. Wells, W. E. L. Grimson, R. Kikinis, and F. A. Jolesz, “Adaptivesegmentation of MRI data, ” IEEE Trans. Med. Imag. , vol. 15, no. 4, pp.429–442, Aug. 1996.

[3] L. P. Clarke, R. P. Velthuizen, M. A. Camacho, J. J. Heine, M.Vaidyanathan, L. O. Hall, R. W. Thatcher, and M. L. Silbiger, “MRIsegmentation: methods and applications, ” Magn. Reson. Imag. , vol. 13,no. 3, pp. 343 –368, 1995.

[4] J. C. Bezdek, L. O. Hall, and L. P. Clarke, “Review of MR image seg-

mentation techniques usingpattern recognition, ” Med. Phys. , vol.20, no.4, pp. 1033 –1048, 1993.[5] A. K. Jain, R. P. W. Duin, and J. Mao, “Statistical pattern recognition:

a review, ” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 22, no. 1, pp.4–37, Jan. 2000.

[6] P. Schroeter, J. M. Vesin, T. Langenberger, and R. Meuli, “Robustparameter estimation of intensity distributions for brain magnetic res-onance images, ” IEEE Trans. Med. Imag. , vol. 17, no. 2, pp. 172 –186,Apr. 1998.

[7] Y. Wang, T. Adah, S. Y. Kung, and Z. Szabo, “Quanti cation andsegmentation of brain tissues from MR images: a probabilistic neuralnetwork approach, ” IEEE Trans. Image Process. , vol. 7, no. 8, pp.1165 –1181, Aug. 1998.

[8] K. L. Leemput, F. Maes, D. Vandermeulen, and P. Suetens, “Automatedmodel-based tissue classi cation of MR images of the brain, ” IEEE Trans. Med. Imag. , vol. 18, no. 10, pp. 897 –908, Oct. 1999.

[9] Z. Liang, J. R. MacFall, and F. P. Harrington, “Parameter estimation and

tissue segmentation from multispectral MR images, ” IEEE Trans. Med. Imag. , vol. 13, no. 3, pp. 441 –449, Sep. 1994.

[10] K. Fukunaga, Introduction to Statistical Pattern Recognition , 2nded. New York: Academic, 1990.

[11] D. Comaniciu and P. Meer, “Mean shift: a robust approach toward fea-ture space analysis, ” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 24,no. 5, pp. 603 –619, May 2002.

[12] B. Georgescu, I. Shimshoni, and P. Meer, “Mean shift based clusteringin high dimensions: a texture classi cation example, ” in Proc. 9th Int.Conf. Computer Vision , Nice, France, 2003, pp. 456 –463.

[13] H. Park, P. H. Bland, and C. R. Meyer, “Construction of an abdom-inal probabilistic atlas and its application in segmentation, ” IEEE Trans. Med. Imag. , vol. 22, no. 4, pp. 483 –492, Apr. 2003.

[14] K. L. Leemput, F. Maes, D. Vandermeulen, and P. Suetens, “Automatedmodel-based bias eld correction of MR images of the brain, ” IEEE Trans. Med. Imag. , vol. 18, no. 10, pp. 885 –896, Oct. 1999.

[15] J. L. Marroquin, B. C. Vemuri, S. Botello, F. Calderon, and A. Fer-nandez-Bouzas, “An accurate and ef cient Bayesian method for auto-matic segmentation of brain MRI, ” IEEE Trans. Med. Imag. , vol. 21, no.8, pp. 934 –945, Aug. 2002.

[16] K. Fukunaga and L. D. Hostetler, “The estimation of the gradient of adensity function, with applications in pattern recognition, ” IEEE Trans. Info. Theory , vol. IT-21, pp. 32 –40, 1975.

[17] Y. Cheng, “Mean shift, mode seeking, and clustering, ” IEEE Trans. Pat-tern Anal. Mach. Intell. , vol. 17, no. 8, pp. 790 –799, Aug. 1995.

[18] D. Comaniciu and P. Meer, “Distribution free decomposition of multi-variate data, ” Pattern Anal. Applicat. , vol. 2, pp. 22 –30, 1999.

[19] J. R. Jim énez, V. Medina, and O. Y áñez, “Nonparametric density gra-dient estimation for segmentation of cerebral MRI, ” in Proc. 2nd Joint EMBS/BMES Conf. , 2002, pp. 1076 –1077.

[20] , “Nonparametric MRI segmentation using mean shift and edgecondence maps, ” in Proc. SPIE Conf. Medical Imaging 2003: ImageProcessing , vol. 5032, M. Sonka and J. M. Fitzpatrick, Eds., 2003, pp.

1433 –1441.[21] D. Comaniciu and P. Meer, “Mean shift analysis and applications, ” in

Proc. IEEE Int. Conf. Computer Vision , Kerkyra, Greece, 1999, pp.1197 –1203.

[22] D. Comaniciu, “An algorithm for data-driven bandwidth selection, ” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 25, no. 2, pp. 281 –288,Feb. 2003.

[23] B. W. Silverman, Density Estimation for Statistics and Data Analysis ,1st ed. London, U.K.: Chapman & Hall/CRC, 1998.

[24] P. Meer and B. Georgescu, “Edge detection with embedded con dence, ” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 23, no. 12, pp. 1351 –1365,Dec. 2001.

[25] C. M. Christoudias, B. Georgescu, and P. Meer, “Synergism in low levelvision, ” in Proc. 16th Int. Conf. Pattern Recognition , vol. 4, 2002, pp.150–155.

[26] R. Sedgewick, Algorithmsin C . Reading, MA:Addison-Wesley, 1990.[27] J. Ashburner and K. Friston, “Multimodal image coregistration and

partitioning —a unied framework, ” NeuroImage , vol. 6, no. 3, pp.209–217, 1997.

7/24/2019 01564328

http://slidepdf.com/reader/full/01564328 10/10

JIM ÉNEZ-ALANIZ et al. : DATA-DRIVEN BRAIN MRI SEGMENTATION SUPPORTED ON EDGE CONFIDENCE AND a priori TISSUE INFORMATION 83

[28] J. Ashburner, K. Friston, A. Holmes, and J-B. Poline. (2000, Jan.) Sta-tistical Parametric Mapping —SPM99b. The Wellcome Dept. CognitiveNeurology, Univ. College London, London, U.K.. [Online]. Available:http://www. l.ion.ucl.ac.uk/spm/

[29] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classi cation , 2nded. New York: Wiley Interscience, 2000.

[30] Center for Morphometric Analysis. (2005, Sep.) Internet Brain Seg-mentation Repository. [Online]. Available: http://www.cma.mgh.har-

vard.edu/ibsr/ [31] McConnell Brain Imaging Centre. (2003, Nov.) BrainWeb: SimulatedBrain Database. [Online]. Available: http://www.bic.mni.mcgill.ca/ brainweb/

[32] C. Baillard, P. Hellier, and C. Barillot, “Segmentation of brain 3-D MRimages using level set and dense registration, ” Med. Image Anal. , vol. 5,pp. 185 –194, 2001.

[33] X. Zeng, L. H. Staib, R. T. Schultz, and J. S. Duncan, “Segmentation of the cortex from 3-D MR images using coupled-surfaces propagation, ” IEEE Trans. Med. Imag. , vol. 18, no. 10, pp. 927 –937, Oct. 1999.

[34] R. Vald és-Cristerna, V. Medina-Ba ñuelos, and O. Y áñez-Su árez, “Cou-pling of radial-basis network and active contour model for multispectralbrain MRI segmentation, ” IEEE Trans. Biomed. Eng. , vol. 51, no. 3, pp.459 –470, Mar. 2004.

[35] Y. Zhang, M. Brady, and S. Smith, “Segmentation of brain MR images

through a hidden Markov random eld model and the expectation-maxi-mization algorithm, ” IEEE Trans. Med. Imag. , vol. 20, no. 1, pp. 45 –57,Jan. 2001.