

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 12, NO. 5, MAY 1990


Invariant Image Recognition by Zernike Moments

ALIREZA KHOTANZAD AND YAW HUA HONG

Abstract-This correspondence addresses the problem of rotation-, scale-, and translation-invariant recognition of images. A new set of rotation invariant features is introduced. They are the magnitudes of a set of orthogonal complex moments of the image known as Zernike moments. Scale and translation invariance are obtained by first normalizing the image with respect to these parameters using its regular geometrical moments. A systematic reconstruction-based method for deciding the highest order of Zernike moments required in a classification problem is developed. The quality of the reconstructed image is examined through its comparison to the original one. More moments are included until the image reconstructed from them is close enough to the original picture. The orthogonality property of the Zernike moments, which simplifies the process of image reconstruction, makes the suggested feature selection approach practical. Furthermore, the features of each order can also be weighted according to their contribution (their image representation ability) to the reconstruction process. The method is tested using clean and noisy images from a 26-class character data set and a 4-class lake data set. The superiority of Zernike moment features over regular moments and moment invariants is experimentally verified.

Manuscript received August 29, 1988; revised October 13, 1989. Recommended for acceptance by C. Y. Suen. This work was supported in part by the Defense Advanced Research Projects Agency under Grant MDA-903-86-C-0182.
A. Khotanzad is with the Image Processing and Analysis Laboratory, Department of Electrical Engineering, Southern Methodist University, Dallas, TX 75275.
Y. H. Hong was with the Image Processing and Analysis Laboratory, Department of Electrical Engineering, Southern Methodist University, Dallas, TX 75275. He is now with Texas Instruments Incorporated, Dallas, TX.
IEEE Log Number 8933763.

Index Terms-Feature selection, image recognition, image reconstruction, invariant pattern recognition, moment invariants, Zernike moments.

I. INTRODUCTION

An important problem in pattern analysis is the automatic recognition of an object in a scene regardless of its position, size, and orientation. Such problems arise in a variety of situations, such as inspection and packaging of manufactured parts [14], classification of chromosomes [3], target identification [2], [15], and scene analysis [5]. The current approaches to invariant two-dimensional shape recognition include extraction of global image information using regular moments [15], boundary-based analysis via Fourier descriptors [12], [14], [15], [18] or autoregressive models [9], image representation by circular harmonic expansion [6], and syntactic approaches [3]. A fundamental element of all these schemes is the definition of a set of features for image representation and data reduction. Normally, additional transformations are needed to achieve the desired invariant properties for the selected features. After the invariant features are computed, they are input to a designed classification rule to decide a labeling for the underlying image. The utilization of good features is not the only decisive factor in the success of these methods. An additional parameter to be decided upon is the number of such features to be used. However, the majority of the existing techniques use an ad hoc procedure for arriving at such a decision. The aim of this work is to develop new features along with a systematic method for selecting the required number of features.

Moments and functions of moments have been utilized as pattern features in a number of applications [1], [2], [7], [15]. Such features capture global information about the image and do not require closed boundaries as boundary-based methods such as Fourier descriptors do. Regular moments have by far been the most popular type of moments. They are defined as

m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x, y)\, dx\, dy    (1)

where m_{pq} is the (p + q)th order moment of the continuous image function f(x, y). For digital images the integrals are replaced by summations and m_{pq} becomes

m_{pq} = \sum_x \sum_y x^p y^q f(x, y).    (2)
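In code, the discrete regular moment just defined can be sketched as follows (a minimal Python sketch; treating the column index as x and the row index as y is an assumption about the pixel convention, not something the correspondence specifies):

```python
import numpy as np

def regular_moment(img, p, q):
    """Discrete regular moment m_pq = sum_x sum_y x^p * y^q * f(x, y)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]        # row index taken as y, column as x
    return float(np.sum((xs ** p) * (ys ** q) * img))

# A single unit-intensity pixel at (x = 2, y = 1):
f = np.zeros((4, 4))
f[1, 2] = 1.0
print(regular_moment(f, 0, 0))   # m00 = 1.0 (total "mass")
print(regular_moment(f, 1, 0))   # m10 = 2.0
print(regular_moment(f, 0, 1))   # m01 = 1.0
```

For binary shapes, m_00 counts the shape pixels, a fact the normalization stage of Section VI relies on.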

Hu [7] introduced seven nonlinear functions defined on regular moments which are translation, scale, and rotation invariant. These


seven so-called moment invariants were used in a number of pattern recognition problems [2], [13].

The definition of regular moments has the form of the projection of the f(x, y) function onto the monomials x^p y^q. Unfortunately, the basis set x^p y^q is not orthogonal. Consequently, the recovery of the image from these moments is quite difficult and computationally expensive. Moreover, it implies that the information content of the m_pq's has a certain degree of redundancy. Teague [16] suggested orthogonal moments, based on the theory of orthogonal polynomials, to overcome the problems associated with regular moments. The Zernike moments used in this study are a class of such orthogonal moments. The reason for selecting them from among the other orthogonal moments is that they possess a useful rotation invariance property: rotating the image does not change the magnitudes of its Zernike moments. Hence, they can be used as rotation invariant features for image representation. These features can easily be constructed to an arbitrarily high order. Another main property of Zernike moments is the ease of image reconstruction from them. The orthogonality property enables one to separate out the individual contribution of each order of moments (its information content) to the reconstruction process. Simple addition of these individual contributions generates the reconstructed image. Taking advantage of this characteristic, a method for selecting the required number of features (maximum order of moments) is developed. This technique evaluates the image representation ability of the features of each order through comparison of the image reconstructed from them with the original one. The maximum order required is the one for which the reconstructed image is close to the original one. Furthermore, one can weight the features according to their relative contribution to the reconstruction process.

The defined Zernike features are only rotation invariant. To obtain scale and translation invariance, the image is first subjected to a normalization process using its regular moments. The rotation invariant Zernike features are then extracted from the scale- and translation-normalized image.

Teh and Chin [17] examined the noise sensitivity and information redundancy of Zernike moments along with five other moments. They concluded that higher order moments are more sensitive to noise. It was also shown that orthogonal moments, including Zernike moments, are better than other types of moments in terms of information redundancy and image representation.

The organization of this correspondence is as follows. Section II defines the Zernike moments and their properties. In Section III, the reconstruction of an image from its Zernike moments is shown. Section IV discusses the rotation invariant features obtained from Zernike moments. Section V describes the synthesis-based feature selection method. Section VI contains the scale and translation normalization approach and examines the performance of the proposed features and the accompanying feature selection method through experimental studies involving a 26-class English character data set and a 4-class lake data set. Performance comparisons to moment invariants and regular moments are also presented in this section. Section VII gives the conclusion of our study.

II. ZERNIKE MOMENTS

In [19], Zernike introduced a set of complex polynomials which form a complete orthogonal set over the interior of the unit circle, i.e., x^2 + y^2 = 1. Let the set of these polynomials be denoted by {V_{nm}(x, y)}. The form of these polynomials is

V_{nm}(x, y) = V_{nm}(\rho, \theta) = R_{nm}(\rho) \exp(jm\theta)    (3)

where

n    positive integer or zero;
m    positive and negative integers subject to the constraints n - |m| even, |m| \le n;
\rho    length of the vector from the origin to the (x, y) pixel;
\theta    angle between the vector \rho and the x axis in the counterclockwise direction;
R_{nm}(\rho)    radial polynomial defined as

R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \frac{(n-s)!}{s!\,\left(\frac{n+|m|}{2}-s\right)!\,\left(\frac{n-|m|}{2}-s\right)!}\, \rho^{n-2s}.

Note that R_{n,-m}(\rho) = R_{nm}(\rho). These polynomials are orthogonal and satisfy

\int\!\!\int_{x^2+y^2 \le 1} V^*_{nm}(x, y)\, V_{pq}(x, y)\, dx\, dy = \frac{\pi}{n+1}\, \delta_{np}\, \delta_{mq}

with

\delta_{ab} = \begin{cases} 1, & a = b \\ 0, & \text{otherwise.} \end{cases}

Zernike moments are the projection of the image function onto these orthogonal basis functions. The Zernike moment of order n with repetition m for a continuous image function f(x, y) that vanishes outside the unit circle is

A_{nm} = \frac{n+1}{\pi} \int\!\!\int_{x^2+y^2 \le 1} f(x, y)\, V^*_{nm}(\rho, \theta)\, dx\, dy.    (4)

For a digital image, the integrals are replaced by summations to get

A_{nm} = \frac{n+1}{\pi} \sum_x \sum_y f(x, y)\, V^*_{nm}(\rho, \theta), \qquad x^2 + y^2 \le 1.    (5)
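Assuming NumPy and a linear mapping of pixel centres into [-1, 1] x [-1, 1] (the correspondence does not fix a particular mapping), the discrete moment computation in (5) can be sketched as:

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Radial polynomial R_nm(rho); requires n - |m| even and |m| <= n."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s)
                * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        out += c * rho ** (n - 2 * s)
    return out

def zernike_moment(img, n, m):
    """A_nm = (n+1)/pi * sum over the unit disk of f(x,y) * Vnm*(rho,theta)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (2.0 * xs - (w - 1)) / (w - 1)      # pixel centres -> [-1, 1]
    y = (2.0 * ys - (h - 1)) / (h - 1)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    inside = rho <= 1.0                     # drop pixels outside the unit circle
    kernel = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(img[inside] * kernel[inside])
```

Quick sanity checks on the radial polynomials: R_11(rho) = rho and R_20(rho) = 2 rho^2 - 1, both of which radial_poly reproduces.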

To compute the Zernike moments of a given image, the center of the image is taken as the origin and pixel coordinates are mapped to the range of the unit circle, i.e., x^2 + y^2 \le 1. Those pixels falling outside the unit circle are not used in the computation. Also note that A^*_{nm} = A_{n,-m}.

III. IMAGE RECONSTRUCTION FROM ZERNIKE MOMENTS

Suppose that one knows all moments A_{nm} of f(x, y) up to a given order n_{max}. It is desired to reconstruct a discrete function \hat{f}(x, y) whose moments exactly match those of f(x, y) up to the given order n_{max}. Zernike moments are the coefficients of the image expansion into the orthogonal Zernike polynomials. By the orthogonality of the Zernike basis,

\hat{f}(x, y) = \sum_{n=0}^{n_{max}} \sum_m A_{nm} V_{nm}(\rho, \theta)    (6)

with m having similar constraints as in (3). Note that as n_{max} approaches infinity, \hat{f}(x, y) will approach f(x, y).

Since it is easier to work with real-valued functions, one can expand (6) noting that V^*_{nm}(\rho, \theta) = V_{n,-m}(\rho, \theta):


\hat{f}(x, y) = \sum_n \left[ \frac{C_{n0}}{2} R_{n0}(\rho) + \sum_{m>0} \left( C_{nm} \cos m\theta + S_{nm} \sin m\theta \right) R_{nm}(\rho) \right]    (7)

with

C_{nm} = 2\,\mathrm{Re}(A_{nm}) = \frac{2n+2}{\pi} \int\!\!\int_{x^2+y^2 \le 1} f(x, y)\, R_{nm}(\rho) \cos m\theta\, dx\, dy

S_{nm} = -2\,\mathrm{Im}(A_{nm}) = \frac{2n+2}{\pi} \int\!\!\int_{x^2+y^2 \le 1} f(x, y)\, R_{nm}(\rho) \sin m\theta\, dx\, dy.

Indeed, the C_{nm} and S_{nm} expressions could be used in place of (4) to compute A_{nm} as well.

This reconstruction process is illustrated in Fig. 1 for two 64 x 64 binary images of the letters E and F. The reconstructed binary images are generated by using (7) followed by mapping to the [0, 255] range, histogram equalization [8], and binarization using a threshold of 128. It is evident that lower order moments capture gross shape information, and high-frequency details are filled in by higher order moments. In this example, moments of order 2 through 12 are used. The reason for omitting orders 0 and 1 is the nature of the preprocessing done on the original images, which will be discussed later.

IV. ROTATION INVARIANT FEATURES DERIVED FROM ZERNIKE MOMENTS

Consider a rotation of the image through angle \alpha. If the rotated image is denoted by f^r, the relationship between the original and rotated images in the same polar coordinates is

f^r(\rho, \theta) = f(\rho, \theta - \alpha).    (8)

The Zernike moment expression can be mapped from the xy plane into polar coordinates by changing the variables in the double-integral form of (4). This can be seen from [8]:

\int\!\!\int f(x, y)\, dx\, dy = \int\!\!\int f(\rho \cos\theta, \rho \sin\theta) \left| \frac{\partial(x, y)}{\partial(\rho, \theta)} \right| d\rho\, d\theta    (9)

where \partial(x, y)/\partial(\rho, \theta) denotes the Jacobian of the transformation and is the determinant of the matrix

\begin{bmatrix} \partial x/\partial\rho & \partial x/\partial\theta \\ \partial y/\partial\rho & \partial y/\partial\theta \end{bmatrix}.    (10)

For this case, where x = \rho \cos\theta and y = \rho \sin\theta, the Jacobian becomes \rho. Hence

A_{nm} = \frac{n+1}{\pi} \int_0^{2\pi}\!\!\int_0^1 f(\rho, \theta)\, R_{nm}(\rho) \exp(-jm\theta)\, \rho\, d\rho\, d\theta.    (11)

The Zernike moment of the rotated image in the same coordinates is

A^r_{nm} = \frac{n+1}{\pi} \int_0^{2\pi}\!\!\int_0^1 f(\rho, \theta - \alpha)\, R_{nm}(\rho) \exp(-jm\theta)\, \rho\, d\rho\, d\theta.    (12)


Fig. 1. The reconstructed images of characters E and F. From top row and left to right: original image, reconstructed images with up to second order moments through up to twelfth order moments.
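The reconstruction pipeline behind Fig. 1 can be sketched as follows (a minimal Python illustration of the moment-and-sum loop in (5) and (6); the grid-to-disk mapping, the smooth radial test image, and the omission of the histogram-equalization step are assumptions made to keep the sketch short):

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Radial polynomial R_nm(rho); requires n - |m| even, |m| <= n."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        out += c * rho ** (n - 2 * s)
    return out

def disk_coords(h, w):
    """Map pixel centres linearly into [-1, 1] x [-1, 1]."""
    ys, xs = np.mgrid[0:h, 0:w]
    x = (2.0 * xs - (w - 1)) / (w - 1)
    y = (2.0 * ys - (h - 1)) / (h - 1)
    return np.hypot(x, y), np.arctan2(y, x)

def reconstruct(img, n_max):
    """Sum A_nm * V_nm over all valid (n, m) with n <= n_max, as in (6).
    Per (5), the discrete moment omits the pixel-area element, so the
    result is a scaled version of the image; that scale is absorbed by
    the [0, 255] remapping described in the text."""
    h, w = img.shape
    rho, theta = disk_coords(h, w)
    inside = rho <= 1.0
    out = np.zeros((h, w), dtype=complex)
    for n in range(n_max + 1):
        for m in range(-n, n + 1):
            if (n - abs(m)) % 2:
                continue                    # n - |m| must be even
            v = radial_poly(n, m, rho) * np.exp(1j * m * theta)
            a = (n + 1) / np.pi * np.sum(img[inside] * np.conj(v[inside]))
            out += a * v
    out[~inside] = 0.0
    return out.real                         # +m / -m terms are conjugate pairs

# A smooth radial test image: brightness equal to the radius inside the disk.
rho, theta = disk_coords(32, 32)
inside = rho <= 1.0
g = np.where(inside, rho, 0.0)
rec = reconstruct(g, 6)
```

On this smooth test image the order-6 reconstruction already correlates strongly with the original inside the disk, mirroring the coarse-to-fine behaviour visible in Fig. 1.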

By a change of variable \theta_1 = \theta - \alpha,

A^r_{nm} = \left[ \frac{n+1}{\pi} \int_0^{2\pi}\!\!\int_0^1 f(\rho, \theta_1)\, R_{nm}(\rho) \exp(-jm\theta_1)\, \rho\, d\rho\, d\theta_1 \right] \exp(-jm\alpha)    (13)

= A_{nm} \exp(-jm\alpha).

Equation (13) shows that Zernike moments have simple rotational transformation properties; each Zernike moment merely acquires a phase shift on rotation. This simple property leads to the conclusion that the magnitudes of the Zernike moments of a rotated image function remain identical to those before rotation. Thus |A_{nm}|, the magnitude of the Zernike moment, can be taken as a rotation invariant feature of the underlying image function. Note that since A_{n,-m} = A^*_{nm}, then |A_{nm}| = |A_{n,-m}|; thus, one can concentrate on |A_{nm}| with m \ge 0 as far as the defined Zernike features are concerned. Table I lists the rotation invariant Zernike features and their corresponding numbers from order 0 to order 12.

This rotation invariance property is illustrated by an experiment. Fig. 2 shows a 64 x 64 binary image of character A and five rotated versions of it, with rotation angles of 30°, 60°, 150°, 180°, and 300°, respectively. Table II lists the magnitudes of their Zernike moments for orders 2 and 3, their respective sample mean \mu, sample standard deviation \sigma, and \sigma/\mu\%, which indicates the percentage spread of the |A_{nm}| values from their corresponding means. It is observed that rotation invariance is very well achieved. For example, \sigma/\mu\% is 0.30% and 0.90% for |A_{22}| and |A_{33}|, respectively. These are to be compared to the exact invariance value of 0%. The reason for not obtaining exact invariance is that the image function is discrete rather than continuous.
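The phase-shift property in (13) can also be checked numerically. The sketch below (same assumed grid-to-unit-disk mapping as before, which the correspondence does not specify) uses a 90° rotation, for which the square sampling grid maps exactly onto itself, so the moment magnitudes should agree to floating-point precision:

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Radial polynomial R_nm(rho); requires n - |m| even, |m| <= n."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        out += c * rho ** (n - 2 * s)
    return out

def zernike_moment(img, n, m):
    """Discrete A_nm over the pixels inside the unit circle, as in (5)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (2.0 * xs - (w - 1)) / (w - 1)
    y = (2.0 * ys - (h - 1)) / (h - 1)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    inside = rho <= 1.0
    kernel = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(img[inside] * kernel[inside])

f = np.random.default_rng(2).random((32, 32))
f_rot = np.rot90(f)                       # rotate the test image by 90 degrees

for n, m in [(2, 2), (3, 1), (4, 4)]:
    a, b = zernike_moment(f, n, m), zernike_moment(f_rot, n, m)
    assert abs(abs(a) - abs(b)) < 1e-9    # |A_nm| unchanged by rotation
```

The phases do change, by a factor exp(-jm alpha), which is exactly the behaviour (13) predicts; only the magnitudes survive as invariants.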

V. FEATURE SELECTION VIA RECONSTRUCTION

Having shown that the magnitudes of Zernike moments can be taken as rotation invariant features, a main question to be answered is: how big should n be? In other words, up to what order are moments needed for a good classification of a given database? In fact, a major shortcoming of many previously developed feature sets for image representation is the lack of a systematic method for automatic selection of this number.

A good set of features is one that can characterize and represent the image well. The difference between an image and its reconstructed version from a finite set of its moments is a good measure


TABLE I
LIST OF ZERNIKE MOMENTS AND THEIR CORRESPONDING NUMBER OF FEATURES FROM ORDER ZERO TO ORDER TWELVE

Fig. 2. The image of character A and five rotated versions of it. From left to right, the rotation angles are: 0°, 30°, 60°, 150°, 180°, and 300°.

TABLE II
MAGNITUDES OF SOME OF THE ZERNIKE MOMENTS FOR THE ROTATED IMAGES SHOWN IN FIG. 2 AND THEIR CORRESPONDING STATISTICS

of the image representation ability of the considered set of moments. The ease of image reconstruction from Zernike moments makes it practical to base the feature selection process on such a measure. The idea is that n^*, the maximum needed order, is one which can generate a reconstructed image that is similar to the original in the sense of a defined threshold. In the following discussion, we will concentrate on binary images. However, the extension to gray-level images is straightforward.

Let \hat{f}_i denote the binary image reconstructed by using moments of order 0 through i extracted from the original image f. \hat{f}_i is generated using

\hat{f}_i = F\left[ \sum_{n=0}^{i} \sum_m A_{nm} V_{nm}(\rho, \theta) \right]

where F represents mapping to the [0, 255] gray-level range, histogram equalization, and thresholding at 128. A simple measure of image representation ability is the difference between \hat{f}_i and the original binary image f. The Hamming distance between the two, H(\hat{f}_i, f), is employed to quantify this difference. The Hamming distance is the total number of pixels that are different in the two images. If H(\hat{f}_i, f) \le \epsilon, where \epsilon is a preselected threshold, then it can be concluded that enough information has been extracted and no additional order of moments needs to be computed, i.e., n^* = i.

The above procedure not only specifies the highest order needed for a prototype, but also provides a means for treating the features of each order differently. It is apparent that different order moments capture different characteristics of the image. One can isolate the contribution of the ith order moments to the reconstruction process and use its relative strength to weight the corresponding features. The contribution of the ith order moments can be measured by computing how much closer \hat{f}_i is to f compared to \hat{f}_{i-1}. The Hamming distance is again employed to carry out this task. The contribution of the ith order moments, denoted by C(i), is computed as

C(i) = H(\hat{f}_{i-1}, f) - H(\hat{f}_i, f).

A large positive value of C(i) indicates that the ith order moments capture a lot of important information about the shape. On the other hand, a small positive or a negative C(i) is an indication that the corresponding moments focus on unimportant aspects of the image under study. Consequently, it makes sense to weight important (in the sense of reconstruction) moments and their corresponding features more, and vice versa. Thus, one can introduce a weighting mechanism for the ith order features based on their corresponding C(i)'s. All ith order features can be weighted by w_i during the classification stage, where

w_i = \frac{C(i)}{D(n^*)}

and

D(n^*) = \sum_{j=3}^{n^*} C(j).

If C(i) is negative, w_i is set to zero. Note that the w_i's sum up to 1.0. Fig. 3 and Table III show the synthesized images and the corresponding C(i) and w_i values for character A when \epsilon = 300 pixels. Again, note that the zeroth and first order moments are not used due to the preprocessing explained later. The weight for the second order moments is set to zero since there is no previous image (i.e., \hat{f}_1) for comparison. Also note that the unit-circle part of a 64 x 64 image, which consists of 3096 pixels, is the basis for comparison and Hamming distance calculation. If more than one prototype exists for a class, each one may give rise to a different n^* and w_i. In that case, the highest n^* and the average of the w_i are used.

The same procedure can be used in the case of gray-level images. The only needed modification is changing the difference measure from the Hamming distance to either a correlation-type measure or the mean squared error.

Up to now, the discussion has centered on how to select the right order of moments and the feature weights for a single class from its given training samples. In a multiclass problem, the highest order moment to be extracted from an unknown image is n^*_{max}, the maximum n^* value among all the classes to be considered.
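Once the per-order Hamming distances are in hand, the order-selection rule and the weighting scheme reduce to a few lines. The distances below are hypothetical numbers, invented only to make the arithmetic concrete:

```python
# Hypothetical Hamming distances H(f_i, f) after reconstructing with
# moments up to order i = 2 .. 7 (values invented for illustration).
H = {2: 1200, 3: 900, 4: 610, 5: 400, 6: 310, 7: 280}
eps = 300                                  # preselected threshold (pixels)

n_star = min(i for i in H if H[i] <= eps)  # first order meeting the criterion

# C(i) = H(f_{i-1}, f) - H(f_i, f): how much order i improves the match.
C = {i: H[i - 1] - H[i] for i in sorted(H) if i - 1 in H}

D = sum(max(c, 0.0) for c in C.values())   # negative C(i) contribute zero
w = {i: max(C[i], 0.0) / D for i in C}     # weights over orders 3 .. n*

print(n_star)                              # -> 7
print(round(sum(w.values()), 10))          # -> 1.0 (weights sum to one)
```

With these numbers, order 3 carries the largest weight (C(3) = 300 out of D = 920) and order 7 the smallest, matching the intuition that early orders recover the gross shape.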

VI. EXPERIMENTAL STUDY

In this section, the classification power of the proposed Zernike moment features and the accompanying feature selection method is experimentally tested and the results are reported. Furthermore, the performance of these features is compared to those of moment invariants and regular moments. The noise sensitivity of the Zernike features is also examined.

A. The Utilized Data Sets

Two different data sets of shapes are generated. The first data set consists of the 26 upper-case English characters from A to Z. Twelve different 64 x 64 binary images of each character (for a total of 312 images) are considered. Four slightly different silhouettes of each character are generated, and three scaled, rotated,


Fig. 3. The reconstructed images of character A. From left to right: original image and reconstructed images with up to second order moments through up to seventh order moments. n^* = 7 is selected with \epsilon = 300.

TABLE III
STATISTICS FOR THE RECONSTRUCTED IMAGES SHOWN IN FIG. 3

Fig. 4. The twelve scaled, translated, and rotated images of letter A in the character data set. Note the slight intraclass variations in shape.

and translated versions of each silhouette are considered to make up the twelve images per class. Fig. 4 shows the twelve generated images of character "A". Note the within-class differences of the shapes and their scale, orientation, and translation. In Fig. 5, four (unrotated) out of the twelve images of each of the other characters are shown.

The second data set consists of four classes of shapes, which are the aerial views of lakes Erie, Huron, Michigan, and Superior. Again, twelve differently oriented 64 x 64 binary images of each lake are generated. Fig. 6 shows these images. Note that no within-class shape difference is considered for this data set.

As discussed before, the proposed Zernike features are only rotation invariant. However, the considered images have scale and translation differences as well. Therefore, prior to extraction of the Zernike features, these images should be normalized with respect to scale and translation. A regular-moment-based approach is taken toward this stage, which is discussed in the next section.

B. Scale and Translation Normalization

To achieve scale and translation uniformity, the regular moments (i.e., m_pq) of each image are utilized. Translation invariance is achieved by transforming the image into a new one whose first order moments, m_01 and m_10, are both equal to zero. This is done by transforming the original f(x, y) image into f(x + \bar{x}, y + \bar{y}), where \bar{x} and \bar{y} are the centroid location of the original image computed from

\bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}}.

Fig. 5. Four out of the twelve images of letters B to Z in the character data set. The remaining eight images per character are rotated, scaled, and translated versions of the shown four.

In other words, the origin is moved to the centroid before moment calculation.

Scale invariance is accomplished by enlarging or reducing each shape such that its zeroth order moment m_00 is set equal to a predetermined value \beta. Note that in the case of binary images, m_00 is the total number of shape pixels in the image. Let f(x/a, y/a) represent a scaled version of the image function f(x, y). Then m_pq, the regular moment of f(x, y), and m'_pq, the regular moment of


Fig. 6. The twelve scaled, translated, and rotated images of lakes Erie, Huron, Michigan, and Superior.

f(x/a, y/a), are related by

m'_{pq} = a^{p+q+2}\, m_{pq}.

Since the objective is to have m'_{00} = \beta, one can let a = \sqrt{\beta/m_{00}}. Substituting a = \sqrt{\beta/m_{00}} into m'_{00}, one obtains m'_{00} = a^2 m_{00} = \beta. Thus scale invariance is achieved by transforming the original image function f(x, y) into a new function f(x/a, y/a), with a = \sqrt{\beta/m_{00}}.

In summary, an image function f(x, y) can be normalized with respect to scale and translation by transforming it into g(x, y), where

g(x, y) = f(x/a + \bar{x},\; y/a + \bar{y})

with (\bar{x}, \bar{y}) being the centroid of f(x, y) and a = \sqrt{\beta/m_{00}}, with \beta a predetermined value. Wherever (x/a + \bar{x}, y/a + \bar{y}) does not correspond to a grid location, the function value associated with it is interpolated from the values of the four nearest grid locations around it.

Fig. 7 shows the effect of this normalization stage on the images of character "A" using \beta = 800. Fig. 8 shows one normalized image of each of the lakes.

This scale and translation normalization stage does affect two of the Zernike features, namely |A_{00}| and |A_{11}|: |A_{00}| is going to be the same for all images, and |A_{11}| is equal to zero. This is seen as follows.

    Fig. 7. The scale and translation normalized images of those shown inFig. 4.

    Fig. 8. The scale and translation normalized images of lakes. One out ofthe twelve images per lake is shown.
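A rough sketch of the normalization g(x, y) = f(x/a + x̄, y/a + ȳ) follows (assumptions: nearest-neighbour resampling instead of the four-point interpolation described above, and the output re-centred on the image centre):

```python
import numpy as np

def normalize(img, beta=800.0):
    """Translate the centroid to the image centre and rescale so that the
    zeroth order moment approaches beta (nearest-neighbour resampling)."""
    h, w = img.shape
    m00 = img.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    xbar = (xs * img).sum() / m00
    ybar = (ys * img).sum() / m00
    a = np.sqrt(beta / m00)                 # a > 1 enlarges, a < 1 shrinks
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            src_x = (j - cx) / a + xbar     # g(x, y) = f(x/a + xbar, y/a + ybar)
            src_y = (i - cy) / a + ybar
            si, sj = int(round(src_y)), int(round(src_x))
            if 0 <= si < h and 0 <= sj < w:
                out[i, j] = img[si, sj]
    return out

# A 4x4 binary square (m00 = 16) rescaled so that m00 becomes 64:
f = np.zeros((32, 32))
f[4:8, 4:8] = 1.0
g = normalize(f, beta=64.0)
```

For binary shapes, nearest-neighbour resampling only approximates m'_00 = beta in general (here the factor a = 2 happens to make it exact); the four-point interpolation of the text behaves better on gray levels.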

C_{00} = \frac{2}{\pi} \int\!\!\int_{x^2+y^2 \le 1} g(x, y)\, dx\, dy = \frac{2}{\pi}\, m_{00} = \frac{2\beta}{\pi}    (19)

and

S_{00} = 0.    (20)

Since S_{00} = 0 and m_{00} = \beta, it is evident that |A_{00}| = |(C_{00}/2) - j(S_{00}/2)| = C_{00}/2 = \beta/\pi for all the normalized images. Therefore, |A_{00}| is not taken as one of the features utilized in the classification. For |A_{11}|,

C_{11} = \frac{4}{\pi} \int\!\!\int_{x^2+y^2 \le 1} g(x, y)\, \rho \cos\theta\, dx\, dy = \frac{4}{\pi} \int\!\!\int_{x^2+y^2 \le 1} g(x, y)\, x\, dx\, dy = \frac{4}{\pi}\, m_{10}    (21)

and

S_{11} = \frac{4}{\pi} \int\!\!\int_{x^2+y^2 \le 1} g(x, y)\, y\, dx\, dy = \frac{4}{\pi}\, m_{01}.    (22)

Since m_{10} = m_{01} = 0 for all normalized images, |A_{11}| = |(C_{11}/2) - j(S_{11}/2)| = 0 for them, and |A_{11}| will not be included as one of the utilized features. Thus, in the following experiments, the extracted Zernike features start from the second order moments.

C. Classification Rules

Two different classifiers, namely nearest-neighbor (NN) and minimum-mean-distance (MMD), are used in this study. A description of each follows.

The nearest-neighbor classifier labels an unknown image, represented by an m-dimensional feature vector X = [x_1, x_2, \ldots, x_m], with the label of the nearest neighbor of X among all the training


samples. The distance between X and a training sample is measured using the Euclidean distance. This is a mapping from the m-dimensional feature space onto a one-dimensional Euclidean space. However, to prevent the domination of a subgroup of features, one has to normalize the features. The normalization consists of subtracting the sample mean and dividing by the standard deviation of the corresponding class.

In a c-class problem, let t_k^{(i)} = [t_{k1}^{(i)}, t_{k2}^{(i)}, \ldots, t_{km}^{(i)}] and N_i denote the kth m-dimensional training feature vector of the ith class and the number of available training samples of class i, respectively. The unknown test sample X is classified to class i^*, where

i^* = \arg\min_{i}\; \min_{1 \le k \le N_i}\; \sum_{j=1}^{m} \left( \frac{x_j - \mu_j^{(i)}}{\sigma_j^{(i)}} - \frac{t_{kj}^{(i)} - \mu_j^{(i)}}{\sigma_j^{(i)}} \right)^{2}

with \mu_j^{(i)} and \sigma_j^{(i)} representing the sample mean and standard deviation of the jth element of the m-dimensional training feature vectors of class i.

The second classifier is a weighted minimum-mean-distance rule. Each class is represented by the sample means and variances of its training features. The utilized classifier measures the sum of the squared distances between the feature vector of the test image X and the mean of the feature vectors of each of the classes, weighted by the inverse of the corresponding variances. The purpose of the weighting is to balance the effect of each of the m components of the feature vector.

Let d(X, i) be the weighted distance between the test image X and the representation of class i. Then

d(X, i) = \sum_{j=1}^{m} \frac{\left( x_j - \mu_j^{(i)} \right)^2}{\sigma_j^{(i)\,2}}.

X is classified to the class i^* for which the distance d(X, i) is minimum among \{d(X, i),\; i = 1, 2, \ldots, c\}.

D. Error Estimation Schemes

To estimate the error rate associated with the selected features, the available samples must be divided into two sets, one for training (design) and one for testing. Three different partitioning schemes are considered. The first method is known as "leave-one-out." This means that out of the N samples from the c classes per database, N - 1 of them are used to train the classifier and the remaining one to test it. This process is repeated N times, each time leaving a different sample out. Therefore, all of the samples are ultimately used for testing. The ratio of the number of misclassifications to the total number of tested samples yields an upper bound on the classification error for the considered set [4]. Since the number of samples in the lake data set is rather small, it is only tested with this scheme. But for the character data set, two additional partitionings are considered.

In a leave-one-out scheme, the classifier is trained on rotated images. To test the rotation invariance power of the features, a second partitioning method is considered. In this scheme, called "trained on unrotated," the classifier is trained on the four unrotated images per character and tested using the remaining rotated images. This translates into 104 training and 208 test samples.

In the above two cases, the classifier sees all four silhouettes of each character during the training phase. The third partitioning scheme is designed to test the sensitivity of the method to slight variations in shape. The classifier is trained on three images of only one of the four silhouettes per character and is tested using the remaining nine images from the other three silhouettes. This means 78 training and 234 test samples are considered. This method is referred to as "train on one silhouette."

    E. Class i j i ca t ion Resul t sTo decide on n * , eight images per class are used. In the case ofcharac ter data set, two images from each of the four silhouettes percharacter are used. The selected threshold is E = 300 pixels. Therationale for selection of this number is that it represents around10% difference between the original and the reconstructed imagewhich is a good degree of closeness. The relation between E an dn* is shown later in this section. Tables IV apd V list the obtainedn* along with the Hamming distance of H ( n. , f ) for each of theeight considered images in the two data sets. A slight modification

    to the algorithm was allowed, which caps the maximum order at 12. For some of the eight images of the three characters B, P, and R, which do not satisfy the closeness criterion for n ≤ 12, n* = 12 is therefore selected and no higher order is considered. Based on these results, it is concluded that for the character data set one needs to extract Zernike moments up to the twelfth order, corresponding to 47 features. For the lake data set, moments up to the eighth order, corresponding to 23 features, are sufficient. The features are not weighted; the effect of weighting is investigated later.

    The classification accuracy rates obtained for the character data set using 47 features and the three described error-estimation schemes are listed in the first row of Table VI. In the same table, classification results using orders lower than 12 are also presented. Table VII provides the relationship between ε and the selected order. For the lake data set, only the leave-one-out method along with the minimum-mean-distance classifier is considered; a perfect classification accuracy is obtained.

    F. Effect of Feature Weighting
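The order-selection rule (grow the order until the reconstruction's Hamming distance to the original falls below ε, capped at 12) can be sketched as below. The `reconstruct` callable is a hypothetical placeholder standing in for the actual reconstruction from Zernike moments:

```python
def select_order(image, reconstruct, eps=300, n_max=12):
    """Pick the smallest moment order n* whose reconstruction is
    'close enough' to the original binary image.

    `reconstruct(image, n)` is assumed to return the binary image
    rebuilt from Zernike moments of order <= n (hypothetical helper).
    H(n, f) is the Hamming distance: the number of pixels at which
    the original and the reconstruction disagree.
    """
    for n in range(1, n_max + 1):
        recon = reconstruct(image, n)
        hamming = sum(a != b
                      for row_a, row_b in zip(image, recon)
                      for a, b in zip(row_a, row_b))
        if hamming <= eps:
            return n, hamming
    # Closeness criterion never met (as for characters B, P, R):
    # fall back to the capped maximum order.
    return n_max, hamming
```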

    In the next set of experiments, the utilized features are weighted according to the scheme described earlier. The resulting recognition accuracies are very close to those obtained using the unweighted features: in each case either no improvement is observed or the error is decreased by at most one sample. This is expected, since for the considered data sets the unweighted results are very accurate to start with, leaving very little room for improvement. However, it can be argued that the absence of any performance degradation validates the proposed feature weighting scheme.

    G. Effect of Missing Phase Information
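Feature weighting enters the classifier through the distance computation. The paper's own weighting scheme is defined in an earlier section and is not reproduced here; the sketch below only shows the generic mechanism, with the weights themselves an illustrative assumption (e.g. inverse within-class feature variances):

```python
import numpy as np

def weighted_nn_classify(train_X, train_y, x, weights):
    """Nearest-neighbor classification under a weighted Euclidean
    distance: each squared feature difference is scaled by its weight
    before summation. With all weights equal to 1 this reduces to the
    unweighted classifier.
    """
    train_X = np.asarray(train_X, dtype=float)
    w = np.asarray(weights, dtype=float)
    diffs = train_X - np.asarray(x, dtype=float)
    d = np.sqrt((diffs ** 2 * w).sum(axis=1))
    return train_y[int(np.argmin(d))]
```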

    The considered features are the magnitudes of complex Zernike moments; the phase information is dropped to obtain rotation invariance. The effect of deleting the phase information on classification is investigated through a set of experiments involving the four unrotated images of each character. These are the images shown in the first column of Fig. 4 and the images presented in Fig. 5. The first two images per character are used for training and the remaining two for testing, with the nearest-neighbor classifier. Two sets of experiments are carried out: in the first set, both magnitudes and phases are used as features, while in the second set only magnitudes are considered. Perfect recognition accuracies are obtained for both cases when the maximum allowable order is 7 ≤ n* ≤ 12. When n* = 6, magnitudes and phases give perfect accuracy while magnitudes alone yield a 98% rate (one error in 52 tests). Thus it can be inferred that the influence of the loss of phase information on classification is rather insignificant, especially when high-order moments are included.

    H. Performance Comparisons
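Why dropping the phase yields rotation invariance follows from the standard property that rotating the image by an angle a multiplies the moment A_nm by exp(−jma): the phase shifts, the magnitude does not. A minimal numeric illustration:

```python
import cmath

def magnitude_features(moments):
    """Rotation-invariant features: magnitudes of complex Zernike
    moments, with the phase discarded."""
    return [abs(z) for z in moments]

def magnitude_phase_features(moments):
    """Features keeping both magnitude and phase, as in the first
    experiment set. The phase is not rotation invariant: a rotation
    by angle a multiplies A_nm by exp(-j * m * a)."""
    feats = []
    for z in moments:
        feats.append(abs(z))
        feats.append(cmath.phase(z))
    return feats
```

For a moment z with repetition m = 2 and a rotation a = 0.5 rad, the rotated moment z * exp(−j·m·a) has a different phase but exactly the same magnitude.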

    In this section, the performance of Zernike features is compared to those of moment invariants and regular moments. In the case of moment invariants, six features are utilized, since generating a bigger number of them is not a trivial task. These features are log₁₀|φᵢ|, i = 1, 2, ..., 6, where the φᵢ are defined in [7]. Note that in this case there is no need to normalize the images, since the φᵢ are invariant not only to rotation but also to scale and translation. The same classifiers and error-estimation schemes are utilized and the results are listed in the third row of Table VIII. We also ex-



    496    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 12, NO. 5, MAY 1990

    TABLE IV
    n*'s and H(n*, f)'s for the eight prototype images of the character data set. The last column shows the selected n* (the maximum n* among the eight) for each class.
    [Table body not legible in this copy: each row lists a character with its eight (n, H) pairs and the selected n*.]

    TABLE V
    n*'s and H(n*, f)'s for the eight prototype images of the lake data set. The last column shows the selected n* (the maximum n* among the eight) for each class.

    perimented with such features extracted from translation- and scale-normalized images and got very similar classification results.

    Unlike moment invariants, regular moments m_pq can be constructed for any positive p and q. However, as noted earlier, they are not rotation invariant. To experiment with them, the images need to be corrected for rotation as well. This is done by the method of principal axis described in [15]. After translation and scale normalization, the principal axis of the image is found, and the image is rotated so that this axis lines up with the horizontal axis. However, there are problems associated with this technique. If the image is n-fold symmetric, there will be multiple possible sets of principal axes; in the character data set, such a problem occurred for some of the symmetric characters like C. In addition, the presence of even a moderate amount of noise significantly affects the accuracy of rotation correction.

    To make a fair comparison, 47 regular moments are extracted from translation-, scale-, and rotation-normalized images. The pq orders are the same as the nm orders of the considered Zernike moments. The same classifiers and error-estimation schemes are utilized and the results are presented in the second row of Table VIII.

    The entries of Table VIII clearly verify the assertion regarding the superiority of Zernike moment features over moment invariants and regular moments. The same relative strength is preserved if orders lower than twelve are considered.

    I. Performance on Noisy Images
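The principal-axis correction mentioned above can be sketched from second-order central moments; the orientation of the axis is θ = ½·atan2(2μ₁₁, μ₂₀ − μ₀₂), and rotating the image by −θ aligns it with the horizontal axis. This is a generic illustration of the idea, not the exact procedure of [15]:

```python
import math

def principal_axis_angle(points):
    """Orientation of the principal axis of a 2-D point set (e.g. the
    'on' pixels of a binary image), from its second-order central
    moments. As noted in the text, the axis is ambiguous for n-fold
    symmetric shapes, where several principal axes exist.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
```

For points lying along the line y = x the returned angle is π/4, as expected.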

    In this section, the noise tolerance and sensitivity of the Zernike features are studied experimentally. Eight noisy images with dif-

    TABLE VI
    Recognition accuracy rates for the character data set using Zernike features. NN and MMD stand for nearest-neighbor and minimum-mean-distance classifiers, respectively. Columns: leave one out, train on unrotated, and train on one silhouette, each with NN and MMD sub-columns.
    [Table body not legible in this copy.]

    TABLE VII
    The relation between threshold ε (in pixels) and the selected maximum order.

    TABLE VIII
    Performance comparisons among Zernike, regular moment, and moment invariant features.
    [Table bodies not legible in this copy.]

    ferent orientations are generated for each character by randomly selecting some of the 4096 pixels of a normalized noise-free binary image and reversing their values from 0 to 1 or vice versa. The random pixel selection is done according to a uniform distribution over the 4096 pixel locations. Different sets with different noise levels are generated; the signal-to-noise ratios (SNR) of the generated sets are 30 dB, 25 dB, and 17 dB. Fig. 9 shows one of the eight noisy images of characters A and B for each SNR. Although square images are shown, only the unit-circle portion of them is used in the experiments. Using the noise-free normalized images as training samples and the noisy images as test samples, the performance of the features is tested in three sets of experiments. All twelve clean translation- and scale-normalized images, only the four unrotated images, and only the three images from one silhouette are used for training in the first, second, and third sets of experiments, respectively. The results are tabulated in Tables IX, X, and XI. In [17] it is shown that higher-order moments are more sensitive to noise. Our experiments verify this point, since including them in many cases degrades the accuracy. Overall, the performance is rather good when the SNR is 25 dB or higher.
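The pixel-flip noise model can be sketched as follows. The mapping from an SNR in dB to a flip count is an assumption made for illustration, since the paper does not spell out its SNR definition here; with unit-energy pixel errors, SNR = 10·log₁₀(N / k) for k flipped pixels out of N, giving k = round(N / 10^(SNR/10)):

```python
import random

def add_flip_noise(image, snr_db, rng=None):
    """Flip randomly chosen pixels of a binary image (0 <-> 1),
    mimicking the paper's noise model: flip positions are drawn
    uniformly over all pixels (4096 for a 64x64 normalized image).

    The SNR-to-flip-count conversion below is an illustrative
    assumption: k = round(N / 10**(snr_db / 10)).
    """
    rng = rng or random.Random(0)
    flat = [p for row in image for p in row]   # copy, row-major
    n = len(flat)
    k = round(n / 10 ** (snr_db / 10.0))
    for idx in rng.sample(range(n), k):
        flat[idx] ^= 1                         # reverse 0 -> 1 or 1 -> 0
    w = len(image[0])
    return [flat[i * w:(i + 1) * w] for i in range(len(image))]
```

Under this definition, a 64x64 all-zero image at 30 dB receives round(4096/1000) = 4 flipped pixels; lower SNRs flip proportionally more.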

    VII. CONCLUSION

    In this correspondence, a new set of features defined on the Zernike moments, which are a mapping of an image function onto a set




    Fig. 9. One out of the eight noisy images of characters A and B. From leftto right SNR is 30 dB, 25 dB, and 17 dB.

    TABLE IX
    Recognition accuracy rates for the noisy character data set. The classifiers are trained on all twelve clean images per character.

    TABLE X
    Recognition accuracy rates for the noisy character data set. The classifiers are trained on the four unrotated clean images per character.

    of orthogonal basis functions over the unit circle has been developed. These features are the magnitudes of complex Zernike moments and are proven to be rotation invariant. The orthogonality property of Zernike moments makes image reconstruction from the moments computationally simple. Moreover, it enables one to evaluate the image-representation ability of each order of moments as well as its contribution to the reconstruction process. Based on this property, a systematic method for selecting the required number of features in a classification problem is developed: the selected highest order of moments is the one that yields a reconstructed image close to the original one. The discrimination power of the proposed Zernike moment features and the developed feature selection method are tested by a series of experiments on two different data sets using a nearest-neighbor as well as a minimum-mean-distance classifier. The considered images differ in scale, translation, and rotation; they are first normalized with respect to scale and translation using regular-moment-based techniques. The obtained classification accuracy for the 26-class character data set is 99%, while a perfect recognition rate is reported for the 4-class lake data set. Thus one can conclude that the proposed features and the accompanying feature selection method are quite effective for the image classification problem. In

    497

    TABLE XI
    Recognition accuracy rates for the noisy character data set. The classifiers are trained on three images of one of the silhouettes per character.

    addition, the superiority of Zernike moment features over regular moments and moment invariants is shown. Finally, the noise sensitivity of the Zernike features is studied, and it is concluded that they can perform well in the presence of a moderate level of noise.

    REFERENCES

    [1] Y. S. Abu-Mostafa and D. Psaltis, "Image normalization by complex moments," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-7, no. 1, pp. 46-55, Jan. 1985.
    [2] S. A. Dudani, K. J. Breeding, and R. B. McGhee, "Aircraft identification by moment invariants," IEEE Trans. Comput., vol. C-26, no. 1, pp. 39-45, Jan. 1977.
    [3] K. S. Fu, Syntactic Pattern Recognition and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1982.
    [4] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic, 1972.
    [5] R. C. Gonzalez and P. Wintz, Digital Image Processing. Reading, MA: Addison-Wesley, 1977.
    [6] Y. N. Hsu, H. H. Arsenault, and G. April, "Rotational invariant digital pattern recognition using circular harmonic expansion," Appl. Opt., vol. 21, pp. 4012-4015, 1982.
    [7] M. K. Hu, "Visual pattern recognition by moment invariants," IRE Trans. Inform. Theory, vol. IT-8, pp. 179-187, Feb. 1962.
    [8] R. E. Johnson, F. L. Kiokemeister, and E. S. Wolk, Calculus with Analytic Geometry, 6th ed. Boston, MA: Allyn and Bacon, 1978.
    [9] R. L. Kashyap and R. Chellappa, "Stochastic models for closed boundary analysis: Representation and reconstruction," IEEE Trans. Inform. Theory, vol. IT-27, pp. 627-637, Sept. 1981.
    [10] A. Khotanzad and Y. H. Hong, "Rotation and scale invariant features for texture classification," in Proc. IASTED Int. Conf. Robotics and Automation, Santa Barbara, CA, May 1987, pp. 16-17.
    [11] —, "Rotation invariant pattern recognition using Zernike moments," in Proc. 9th ICPR, Rome, Italy, Nov. 1988, pp. 326-328.
    [12] A. Krzyzak, S. Y. Leung, and C. Y. Suen, "Reconstruction of two-dimensional patterns by Fourier descriptors," in Proc. 9th ICPR, Rome, Italy, Nov. 1988, pp. 555-558.
    [13] S. Maitra, "Moment invariants," Proc. IEEE, vol. 67, no. 4, pp. 697-699, Apr. 1979.
    [14] E. Persoon and K. S. Fu, "Shape discrimination using Fourier descriptors," IEEE Trans. Syst., Man, Cybern., vol. SMC-7, pp. 170-179, Mar. 1977.
    [15] A. P. Reeves, R. J. Prokop, S. E. Andrews, and F. Kuhl, "Three-dimensional shape analysis using moments and Fourier descriptors," IEEE Trans. Pattern Anal. Machine Intell., vol. 10, no. 6, pp. 937-943, Nov. 1988.
    [16] M. Teague, "Image analysis via the general theory of moments," J. Opt. Soc. Amer., vol. 70, no. 8, pp. 920-930, Aug. 1980.
    [17] C. H. Teh and R. T. Chin, "On image analysis by the methods of moments," IEEE Trans. Pattern Anal. Machine Intell., vol. 10, no. 4, pp. 496-513, July 1988.
    [18] C. T. Zahn and C. T. Roskies, "Fourier descriptors for plane closed curves," IEEE Trans. Comput., vol. C-21, pp. 269-281, Mar. 1972.
    [19] F. Zernike, Physica, vol. 1, p. 689, 1934.