2-d shape recognition using recursive landmark ...ece.ucf.edu/~qu/journals/j09.pdf ·...

15
2-D Shape Recognition using Recursive Landmark Determination and Fuzzy ART Network Learning APIWAT SAENGDEEJING, ZHIHUA QU, NOPPHAMAS CHAEROENLAP, and YUFANG JIN School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA. e-mail: [email protected] Abstract. In this Letter, 2-D shape recognition is done using a combination of recursive search of landmarks, landmark-based invariant features, and a fuzzy ART neural-network classifier. To make this novel combination work well, an upper limit is imposed on the number of total landmarks allowed, and this maximum size is then translated into fixed dimensions of invariant features and into the neural processing of the features. It is shown that the recursive landmark search approximates very well any smooth 2-D shape contour, that the shape features used are independent of perspective transformation, and that, when combined with a fuzzy ART clas- sifier, unknown features can be efficiently learned on-line to identify multiple distinct objects. An illustrative example is used to demonstrate effectiveness of the proposed algorithm. Key words. fuzzy ART network, recursive landmark determination, 2-D shape recognition 1. Introduction Object recognition is one of the essential characteristics of an intelligent system, and its importance is clear in such areas as machine vision, inspection and automation, and target identification. For recognizing geometrically different objects, shape con- tour is the most obvious and important feature to be used. Although an object can be described well by its shape contour, the shape contour may change depending on camera perspective and/or motion of the object. Furthermore, in order to recognize several known objects and to learn unknown objects, various shape contours need to be compared and classified. To this end, an analytical shape representation has to be developed. There have been two ways of modelling shape representations. In the spatial domain, methods such as, chain code [16], run length coding [7, 9], polygon approxi- mation [6, 14], convex hull [10, 17] are proposed. In the transform domain, there are such techniques as Fourier Descriptor [8, 18], short-time Fourier Transform, Gabor, and wavelet transforms [1]. These methods have been studied by many researchers and have been used successfully in many applications. To accommodate changes in per- spective and to guarantee successful recognition, invariant features can be developed to eliminate the impact of any affine transform [2, 3, 15]. It is known that, if the depth of an object is small compared with its distance to the camera, the object can be mod- elled as a 2-D object and its viewing and projection transformation becomes affine. Neural Processing Letters 18: 81–95, 2003. 81 # 2003 Kluwer Academic Publishers. Printed in the Netherlands.

Upload: others

Post on 22-Jan-2021

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

2-D Shape Recognition using Recursive Landmark

Determination and Fuzzy ART Network Learning

APIWAT SAENGDEEJING, ZHIHUA QU, NOPPHAMAS CHAEROENLAP,and YUFANG JINSchool of Electrical Engineering and Computer Science, University of Central Florida,Orlando, FL 32816, USA. e-mail: [email protected]

Abstract. In this Letter, 2-D shape recognition is done using a combination of recursive searchof landmarks, landmark-based invariant features, and a fuzzy ART neural-network classifier.

To make this novel combination work well, an upper limit is imposed on the number of totallandmarks allowed, and this maximum size is then translated into fixed dimensions of invariantfeatures and into the neural processing of the features. It is shown that the recursive landmark

search approximates very well any smooth 2-D shape contour, that the shape features used areindependent of perspective transformation, and that, when combined with a fuzzy ART clas-sifier, unknown features can be efficiently learned on-line to identify multiple distinct objects.An illustrative example is used to demonstrate effectiveness of the proposed algorithm.

Key words. fuzzy ART network, recursive landmark determination, 2-D shape recognition

1. Introduction

Object recognition is one of the essential characteristics of an intelligent system, and

its importance is clear in such areas as machine vision, inspection and automation,

and target identification. For recognizing geometrically different objects, shape con-

tour is the most obvious and important feature to be used. Although an object can be

described well by its shape contour, the shape contour may change depending on

camera perspective and/or motion of the object. Furthermore, in order to recognize

several known objects and to learn unknown objects, various shape contours need to

be compared and classified. To this end, an analytical shape representation has to be

developed.

There have been two ways of modelling shape representations. In the spatial

domain, methods such as, chain code [16], run length coding [7, 9], polygon approxi-

mation [6, 14], convex hull [10, 17] are proposed. In the transform domain, there are

such techniques as Fourier Descriptor [8, 18], short-time Fourier Transform, Gabor,

and wavelet transforms [1]. These methods have been studied by many researchers and

have been used successfully in many applications. To accommodate changes in per-

spective and to guarantee successful recognition, invariant features can be developed

to eliminate the impact of any affine transform [2, 3, 15]. It is known that, if the depth

of an object is small compared with its distance to the camera, the object can be mod-

elled as a 2-D object and its viewing and projection transformation becomes affine.

Neural Processing Letters 18: 81–95, 2003. 81# 2003 Kluwer Academic Publishers. Printed in the Netherlands.

Page 2: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

In this Letter, a new method of 2-D object identification is proposed for applica-

tions in robotics and automation [13]. The proposed method includes three compo-

nents: a recursive search routine to find a limited set of best landmarks as shape

features (that is, a constrained optimal polygon approximation), a new composite

invariant feature that is calculated based on the landmarks found and is independent

of reference points, and the application of fuzzy adaptive resonance theory [5]. This

combination makes it possible to efficiently describe any shape contour (known or

unknown), to extract transformation-independent features, and to learn and classify

distinct objects in a set of still or time sequential images. Computational efficiency

and real-time learning capability are the main advantages of the proposed method.

An example of target identification is used to illustrate effectiveness of the proposed

method.

2. Shape Acquisition and Processing

Upon completing edge detection, a specific perspective of a target to be identified is

obtained, and it can usually be characterized as one closed contour or several con-

tours. To illustrate the main idea, we assume that the perspective is described one

closed contour.

If the camera is moving relative to the target, the perspectives will change over

time and as long as the depth of the target can be neglected compared to the distance

between the camera and the target, the target can be treated as 2-dimensional and its

perspective changes can be described by an affine transformation [12].

To identify the specific target without knowing the transformation, one needs to

determine a number of invariant features for the target. In the problem of identifying

multiple targets, their corresponding invariant features need to be learned automa-

tically. To this end, a multiple-object recognition technique is proposed in this Letter

and it consists of the following modules:

. A group of landmarks will be chosen automatically through a recursive algorithm

to characterize closed contours in a perspective of the targets.

. A sequence of transformation-invariant features will be extracted based on the

landmarks associated with each closed contour.

. A fuzzy ART classifier is used to separate and learn the invariant features of

various objects and hence to distinguish different objects.

3. Recursive Selection of Landmarks

In general, a curve (that is, a portion of a closed, shape contour) can be simply

approximated by such pre-defined geometric primitives as line segments. This

approach, traced back to [11, 14] and often called polygonal approximation in com-

puter graphics, is used in the proposed recognition technique. Here, the polygonal

approximation is applied to best capture the shape features in a perspective by

searching for the maximum distances between pairs of points on the contour until

82 APIWAT SAENGDEEJING ET AL.

Page 3: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

a certain minimum resolution is reached and, to be computationally efficient, a max-

imum value is imposed on the number of vertices to be selected. Specifically, a recur-

sive algorithm is proposed to locate vertices of a fitted polygon, called landmarks, by

partitioning the contour according to the maximum distance found and by repeating

the partition through a tree-like search.

To describe the proposed algorithm of locating landmarks, let

. Xp be the set of all points, pi, on a digital image of the closed contour, i.e.,

Xp ¼ f pi; i ¼ 1; . . . ;Npg ¼ f½xi; yi; 0�T; i ¼ 1; 2; . . . ;Npg such that point pi is

adjacent to both piþ1 and pi�1.

. O � Xp be the set of landmark points to be found, that is, O ¼ fqj; j ¼ 1; . . . ;Nlg

where qj is the jth landmark and Nl, to be found, is the dimension of set O.

Typically, Nl << Np.

. Dmax is a positive integer representing the maximum search depth (or the maxi-

mum number of recursion steps, or maximum search level in the tree). Normally,

2Dmaxþ1 5Nl.

. Od be the set of landmark points found up to the dth iteration, where

d ¼ 0; . . . ;Dmax þ 1. Thus, O1 � O2 � ¼ O. Assume without loss of gene-

rality that Od ¼ fqðd Þi ; i ¼ 1; . . . ;Nðd Þg be sorted such that landmark point q

ðd Þi

is adjacent to both qðd Þiþ1 and q

ðd Þi�1.

. Sðd Þi be the set of points that are on the contour segment with its two end points

at qðd Þi and q

ðd Þiþ1.

Then, the proposed recursive algorithm is as follows: given any positive constant Eand integer Dmax,

Initialization: Set d ¼ 0, Nð0Þ ¼ 2, and find qð0Þ1 and q

ð0Þ2 such that

kq1 � q2k ¼ maxpi;pj2Xp

kpi � pjk;

where k k denotes the Euclidean norm.

Recursion at Level d: Set Od ¼ Od�1. For each contour segment Sðd�1Þi ,

i ¼ 1; . . . ;Nðd�1Þ, find q0i such that

kqðd�1Þi � q0ik þ kq0i � q

ðd�1Þiþ1 k ¼ max

p2Sðd�1Þi

½kqðd�1Þi � pk þ kp� q

ðd�1Þiþ1 k�:

If

kqðd�1Þi � q0ik þ kq0i � q

ðd�1Þiþ1 k > kq

ðd�1Þi � q

ðd�1Þiþ1 k þ E; ð3:1Þ

add q0i into set Od; otherwise, ignore q0i.

Once segments Sðd�1Þi are exhausted, set Od is obtained after a simple sorting within

itself. Let d ¼ dþ 1. If d4Dmax, continue the recursion; otherwise, exit the loop.

2-D SHAPE RECOGNITION 83

Page 4: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

As an example, consider the closed contour in Figure 1. During the initial step,

two entry-level landmarks A1 and B1 are found by searching for the maximum

distance between the points on the contour. These two landmarks partition the shape

contour into two segments. Using the proposed algorithm, one can recursively locate

landmark points at different levels (denoted by Ai, Bi, Ci, and so on for i ¼ 2; . . .) by

finding the maximum accumulated distance (i.e., maximizing the sum of the two dis-

tances between a point in a segment to the end points of the segment). Note that the

landmarks found in the previous steps are the ending points of various segments in

the current recursion. In doing so, landmark points A2 and B2 are found in the level-

1 search, and landmark points A3, B3, C3 and D3 are found in the level-2 search, and

so on. This process continues until either the maximum search level is reached or the

maximum accumulated distance becomes less than E. It is obvious from this recursive

mechanism that, given a relatively large Dmax and a relatively small E, the whole set

of landmarks can approximate any smooth shape contour. In fact, the proposed

recursion represents the process of a constrained optimal approximation.

To illustrate the proposed landmark search algorithm, consider the shaded region

in Figure 2. A simple edge detection algorithm can be readily applied to find its shape

contour. Upon completion, the proposed landmark search algorithm is carried out,

the results and corresponding polygon segments are shown in Figure 2, and

Figure 1. Recursive assignment of landmark locations.

Figure 2. An arbitrary shape and the corresponding results of landmark search.

84 APIWAT SAENGDEEJING ET AL.

Page 5: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

effectiveness of the algorithm is apparent. In this example, Dmax ¼ 8 and E ¼ 1 are

set, and the proposed algorithm locates 54 (i.e., Nl) landmark points.

The maximum search level Dmax imposed by the algorithm ensures that the num-

ber of landmarks is limited. This size limitation is needed for both computational

efficiency and feature characterization. As will be shown in the next section, a set

of new invariant features are developed, and they are based on the number of land-

marks. By limiting the number of maximum landmarks, new invariant features can

be represented by a vector of fixed dimension, which is critical for automatic classi-

fication using neural network learning.

In essence, the landmark set provides an approximation of contour shape features

in an image. Constant E can be viewed as the measure of resolution, and its value

typically depends on engineering choices. If E is sufficiently small and Dmax is suffi-

ciently large, O ¼ Xp. As aforementioned, the compression from Xp to O is impor-

tant and necessary. The proposed recursive algorithm ensures that landmarks are

properly spread along any (non-uniform) contour via an efficient and uniform search

through a tree-like structure.

Obviously, choices of Dmax and E require tradeoffs between accuracy and compu-

tational efficiency. In general, searching depth and/or landmark resolution should be

increased as complexity of the shape contour increases. For a very complicated con-

tour, searching depth and landmark resolution can be set locally for parts of the con-

tour. For instance, one can first apply an algorithm of water flow to determine such

features as loop, loop size, and variations of gradient. Using a criterion based on

these features, searching depth and landmark resolution can be selected intelligently

and automatically.

4. A Vector of New Projection-Invariant Features

In many applications, 2-D images are obtained by projecting a 3-D object onto the

imaging plane. In case that the depth of the object is much smaller than the distance

between the object and the projection plane, shape contours in the images can be

treated as ones of 2-D. For moving objects, their shape contours will change as a

function of projective mappings. To classify distinct but unknown objects based

on image sequences, a generic projection-invariant measure is developed in this sec-

tion for distinguishing general objects so that automatic classification can be pursued

in the next section by means of learning.

To this end, consider four distinct points on the shape contour in an image: pi, pj,

pk, and pl. Using coordinates of any three points (say, pi, pj, and pk), one can formu-

late the following determinant:

vði; j; kÞ ¼1

2

xi xj xkyi yj yk1 1 1

������������: ð4:1Þ

Note that vði; j; kÞ 6¼ 0 unless the three points are on a straight line.

2-D SHAPE RECOGNITION 85

Page 6: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

For general projective images, cross ratios of a linear shape measure are invariant

to affine transform action [2, 15]. When the perspective changes, the four points in

the new image become p0i, p0j, p

0k, and p0l, and the relationship between pi and p0i is

given by an affine transformation

p0i ¼ Api þ b;

where A is the rotational/scaling matrix and b is the translation vector. By employing

the homogeneous representation, we have

x0iy0i1

24

35 ¼

A b0 1

� � xiyi1

24

35 ¼

DT

xiyi1

24

35:

Therefore, if adopted as a feature, ratio

vði; j; kÞ

vði; j; l Þð4:2Þ

is projectively invariant as expected since

v0ði; j; kÞ ¼1

2

x0i x0j x0ky0i y0j y0k1 1 1

������������ ¼

jTj

2

x0i x0j x0ky0i y0j y0k1 1 1

������������

and consequently

vði; j; kÞ

vði; j; l Þ¼

v0ði; j; kÞ

v0ði; j; l Þ:

Ratio (4.2) is both shape relevant and projection invariant and will be adopted as

the seed feature. This is because determinant (4.1) is a measure corresponding to geo-

metric area of a triangle formed by landmark points. Specifically, the triangle with

vertices at pi, pj and pk has the following area:

A ¼1

2jð pi � pkÞ � ð pj � pkÞj:

On the other hand, it follows that, since the value of a determinant is invariant under

elementary operations,

vði; j; kÞ ¼1

2

xk yk 1xi yi 1xj yj 1

������������:

Subtracting the first row from second and third rows and then multiplying the last

column by ð�xkÞ and ð�ykÞ and adding the results to the first and second columns,

respectively, yield

vði; j; kÞ ¼1

2

1 1 1xi � xk yi � yk 0xj � xk yi � yk 0

������������ ¼

1

2ð pi � pkÞ � ð pj � pkÞ:

111

24

35: ð4:3Þ

86 APIWAT SAENGDEEJING ET AL.

Page 7: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

Therefore, determinant (4.1) is a signed geometric measure (related to the area). Fur-

thermore, vði; j; kÞ has the property that, if pk is changed to ð1 � EÞpk for some small

constant E, vði; j; kÞ is perturbed by the same percentage, that is, sensitivity of vði; j; kÞ

with respect to a variation in pi (or pj or pk) has a unity value, which can be simply

denoted by

S vp ¼ 1: ð4:4Þ

Nonetheless, ratio (4.2) is not an appropriate shape feature for unknown objects

for two reasons. First, while being invariant under affine transformation, its value

is too dependent upon the choices of points and thus captures too little information

about the shape to be effective. Theoretically, one could overcome this problem by

calculating all the values associated with combinations of all the points. However,

this method of exhaustion is undesirable computationally. Second, if calculation

of the ratio is selectively done for some of the points, selection of these points need

to be carried out intelligently so that shape information of unknown projects can be

described by a fixed feature that can be computed efficiently.

The approach taken in the Letter resolves the aforementioned difficulties by intro-

ducing a new invariant feature vector I. The vector is calculated in two steps (so as to

address the aforementioned problems). First, only the landmark points will be con-

sidered for computing the invariant feature vector. It has been shown in Section 3

that, through the recursive search algorithm, information of shape contour is com-

pressed and captured by the landmark points. Second, motivated by ratio (4.2), the

following vectors of fixed dimension are calculated:

Ci ¼vlð1; iþ 1; iþ 2Þ

vlði; iþ 1; iþ 2Þ

vlð2; iþ 1; iþ 2Þ

vlði; iþ 1; iþ 2Þ

vlðNl � 1; iþ 1; iþ 2Þ

vlði; iþ 1; iþ 2Þ

�vlðNl; iþ 1; iþ 2Þ

vlði; iþ 1; iþ 2Þ0 0

�T; ð4:5Þ

where i ¼ 1; . . . ;Nl � 2, Nmax ¼ 2Dmaxþ1 � 2, CI 2 <Nmax , and vlð; ; Þ are the deter-

minants of form (4.1) for the set of landmarks. That is, given a set of landmarks

O ¼ fqj; j ¼ 1; . . . ;Nlg,

vlði; j; kÞ ¼1

2

qxi qxj qxkqyi qyj yyk1 1 1

������������:

Then, the proposed invariant feature vector is denoted by I 2 <Nmax , and its kth

element is defined to be

Ik ¼

PNl�2

i¼1

Cik k2 i ¼ 1;

1I1

PNl�2

i¼1

PNmax

j¼1

Ciðmodðkþ j� 1;NmaxÞÞCið j Þ i ¼ 2; . . . ;Nmax

8>>><>>>:

ð4:6Þ

2-D SHAPE RECOGNITION 87

Page 8: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

where Cið j Þ is the jth element of vector Ci, and function modðk; nÞ is defined to be

the modulus of k by n.

Remark. It follows from (3.1) that, given the shape resolution characterized by

E > 0, any three adjacent landmarks ðqi; qiþ1; qiþ2Þ are selected not to be on a straight

line. Therefore, vectors in (4.5) and feature (4.6) are well defined (i.e., there is no

singularity).

The proposed feature also has the following properties:

. Vector I is invariant with respect to any affine transformation on the elements in

O.

. Vector I is invariant with respect to the ordering in set O.

. Feature component Ik can be viewed as the correlation factor of vectors Ci with

step size k, and thus vector I is the self correlation vector of collection fCig.

Furthermore, it is straightforward to show that jIkj4 1 for k ¼ 2; 3; . . . ;Nmax

and that

I1 þ I1XNmax

k¼2

Ik

XNl�2

j¼0

Cj

����������2

:

. If O is replaced by O [ O (so that Nl is doubled while Nmax is fixed and not

exceeded by Nl), then the non-zero elements of Ci are repeated once within

the resulting new set, the corresponding value of feature Ik is increased by a con-

stant multiplier, and the non-zero elements of vector I are repeated once within

the new set as well.

The last property together with (4.4) justifies the use of landmarks in preserving

shape information in I while limiting computational effort to the landmark points.

To illustrate invariance of the proposed feature vector, consider the two projec-

tions of an object in Figure 3. Landmarks are found by the recursive search algo-

rithm, and they are shown in Figure 3. The corresponding features Ik, invariant to

translation, scaling, rotation, and reference point, are calculated, verified, and then

plotted in Figure 4.

Figure 3. Landmark locations of two perspectives of an object.

88 APIWAT SAENGDEEJING ET AL.

Page 9: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

5. Automatic Classification by Unsupervised Learning

Having the capability of pattern learning and automatic recognition is one of the

most essential and desirable functions of today’s intelligent systems. For automatic

classification, unknown patterns need to be processed, characterized, and learned so

that they can be assigned into different classes or categories according to their char-

acteristics. For online applications, a classifier must be versatile in handling pattern

features, fast in online learning, and stable in classification. Adaptive resonance the-

ory (ART) is one of the neural network classes that possess such properties. In this

Letter, fuzzy ART is selected and implemented as the classifier because, unlike an

ART network which operates only at discrete values, the fuzzy ART network admits

inputs whose values are between 0 and 1. The basic structure and fundamental func-

tionality of a (fuzzy) ART network is given by Figure 5. In what follows, its algo-

rithm is illustrated.

As shown in Figure 5, the network consists of two layers: an input layer and a

competitive layer. These two layers are interconnected by forward and backward

weighting matrices W1!2 and W2!1 (both of which have non-negative entries and

the initial values of their entries are typically set to be 1, and start with one compe-

titive node, i.e. M ¼ 1), respectively. When a vector of shape features (say, x 2 <N

with 04 xi 4 1 and for i ¼ 1; . . . ;N) is presented, the state of the input layer

becomes

X ¼ ½x1; x2; . . . ; xN; 1 � x1; 1 � x2; . . . ; 1 � xN�T:

Then, a search is conducted at the competitive layer for the output node that yields

the maximum response, and the network either starts to resonate or continues its

Figure 4. Circular plot of invariant sequence Ik where w1 ¼ cosð2kp=NlÞ and w2 ¼ sinð2kp=NlÞ.

2-D SHAPE RECOGNITION 89

Page 10: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

search until the best node representing the corresponding input is found. Specifically,

during the search/resonance phase, the network first selects the node candidate that

yields the maximum response TJ, that is,

TJ ¼ maxWj

1!22W1!2

kWj1!2 ^ Xk1

bþ kWj1!2k1

where b > 0, Wj1!2 is the weighting row vector from the input nodes to the jth out-

put node, ða ^ bÞi ¼ minðai; biÞ, kvk1 ¼P2N

i¼1 vi, Wj2!1 is the jth column vector of

W2!1 and it is set to be Wj2!1 ¼ ½Wj

1!2�T. Upon identifying the Jth node, the node

is validated by the so-called vigilance test, i.e., whether the inequality

kWJ2!1 ^ Xk1 5rkXk1 � rN ð5:1Þ

holds for a vigilance constant r 2 ½0; 1�. If true, the node represents the closest clas-

sification of the input, its weighting will be updated by adaptation law

WJ1!2 ¼ WJ

1!2 ^ X; ð5:2Þ

and the network is kept at to this state until the next input arrives. If false, the maxi-

mally responding node is not a valid candidate, and it will be excluded from the next

round of search (by resetting the Jth node attribute so that this node is prohibited

from winning in the next round of search), and the network continues its search.

If there is no convergence (i.e., no valid result emerges from the search), a new node

is created to represent the input. In this case, output node index M will be increased

by one (that is, Mnew ¼ Mþ 1 ), W2!1 adds a new column WMnew

1!2 , initial values of all

its elements are set to be 1, and as a result TMnew¼ N=ðbþ 2NÞ. Thus, in checking

(5.1), a new node will be added if all the current values of Tj are less than

N=ðbþ 2NÞ � 1=2.

Figure 5. Fuzzy ART.

90 APIWAT SAENGDEEJING ET AL.

Page 11: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

In the Letter, the proposed shape recognition method is based on both the fuzzy

ART network and the new invariant feature vector I in Section 4. To produce the

normalized input vector, we let N ¼ Nmax and

x ¼ ½ �I1; �I2; . . . ; �INmax�T;

where �Ik 2 ½0; 1� is defined by

�I1 ¼ 1; �Ik ¼Ik � min Ik1 � min Ik

; i ¼ 2; . . . ;Nmax

and Ik is that in equation (4.6).

In essence, the vigilance test ensures the matching condition, inequality (5.1),

between the input pattern and the selected weight WJ1!2, and the fuzzy ART network

assigns a node (or classification) to represent the input by maximizing the corre-

sponding response while being vigilant. Note that matching condition (5.1) and

update law (5.2) form a contraction mapping. Therefore, as new input vectors come,

the vigilance constant r can be viewed and used as the compression ratio between

patterns and their corresponding nodes. Specifically, if r approaches one, the net-

work trends to generate new nodes rather than update the existing ones in order

to keep the match between an input pattern and its corresponding node.

In general, the adaptive resonance theory has one unique advantage over other

neural networks. That is, an ART has the property of being able to learn of a

new pattern without losing previously stored information. Specifically, adaptive

resonance theory is developed to solve the crucial tradeoff between stability and plas-

ticity [4]. A typical neural network tends to loss previously stored information as it

starts learning a new pattern.

In the ART architecture, the solution to the problem is the use of resonance and

feedback inter-connected weights that connect the output from the second layer back

to the input layer, as shown in Figure 5. The feedback mechanism allows the net-

work to either resonate or search for the best representation through different can-

didate nodes until an existing or a new output becomes the desired node, and that

node is then considered as the closest representation among all of the nodes.

Specifically, the important feature of the ART model is its automatic switching

between convergence (stable) and learning (plasticity) modes. This property is

achieved by the design of the weight update law and resonance mechanism since

the network updates only the chosen node while leaving all other weights intact. This

prevents cross interference from happening during the network updates. Whenever a

totally different pattern arrives, the resonance mechanism creates a new node to

represent the pattern. As the result, these key properties make the ART network

much more stable, without losing previously stored information and also automati-

cally generate new nodes to encode new patterns when they arise.

By combining the new invariant feature vector and fuzzy ART network learning

capability, the proposed method has all necessary properties and functions to

achieve online and automatic shape recognition.

2-D SHAPE RECOGNITION 91

Page 12: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

6. Illustrative Example

In this section, the proposed algorithm is tested for automatic classification of air-

planes. The test uses a set of different geometrical perspective images of the aircraft

to illustrate functionality and capability of the proposed fuzzy ART neural network

based classification method. Specifically, in the simulation, we prepare a set of 16

images, P ¼ fPi; i ¼ 1; 2; . . . ; 16g for testing the network. The first four images,

P1; . . . ;P4, are the pictures that appear in Figure 6 and 7, and the others are a set

of different geometrical perspectives of P1; . . . ;P4 (e.g., Figure 8), denoted by

P4iþj ¼ TiðPj Þ, j ¼ 1; 2; . . . ; 4. These images are generated by three affine transforms,

TiðÞ, i ¼ 1; 2; 3. Results of the network presented by a sequence of the test images

are shown in Tables I and II.

In the simulation, the fuzzy ART network and the recursive landmark parameters

are set to be b ¼ 0:1; r ¼ 0:9; Dmax ¼ 7, M ¼ 1, and E ¼ 1. The matching scores

between patterns and their representing nodes and the results of the network respon-

ses are summarized in Tables I and II. It is apparent that the proposed network

recognizes the types of aircrafts with high matching scores.

Figure 6. Planes of type-1 and type-2 with their features.

92 APIWAT SAENGDEEJING ET AL.

Page 13: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

Figure 8. Testing image and its invariant features.

Figure 7. Planes of type-3 and type-4 with their features.

2-D SHAPE RECOGNITION 93

Page 14: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

7. Conclusion

A method of shape recognition using landmark invariant features and a fuzzy ART

network is presented. It is shown to possess such unique properties as learning and

recognizing object shape online with a combination of invariant features against

weak perspective distortion and stable learning capability of fuzzy ART network.

The proposed method has the potential of enabling automatic shape learning and

recognition in many industry applications. Illustrative simulations have been carried

out to demonstrate the effectiveness of the proposed method, and a summary of

results is included.

Acknowledgement

This research is supported in part by Microtronic Inc. We would like to thank the

editor and reviewers for their help.

References

1. Antoine, J.: Shape characterization with the wavelet transform, Signal Processing, 62(3)

(1997), 265–290.

Table I. Network responses and matching scores of presenting images (in percentage).

Image P1 P2 P3 P4

Network response create node 1 create node 2 create node 3 create node 4

Matching score 100 99.9 99.9 99.9

Image P5 P6 P7 P8

Network response match node 1 match node 2 match node 3 match node 4

Matching score 97.8 99.5 94.8 95.8

Table II. Network response and matching scores of presenting images (in percentage).

Image P9 P10 P11 P12

Network response match node 1 match node 2 match node 3 match node 4

Matching score 97.5 98.7 93.9 95.1

Image P13 P14 P15 P16

Network response match node 1 match node 2 match node 3 match node 4

Matching score 96.1 97.5 93.5 96.2

94 APIWAT SAENGDEEJING ET AL.

Page 15: 2-D Shape Recognition using Recursive Landmark ...ece.ucf.edu/~qu/Journals/J09.pdf · approach,tracedbackto[11,14]andoften calledpolygonal approximation incom-puter graphics, is used

2. Barrett, E. B., Brill, M. H., Haag, N. N. and Payton, P. M.: Some invariant linear meth-

ods in photogrammetry and model-matching, Proceedings of IEEE Conference on Compu-ter Vision and Pattern Recognition, pp. 122–128, 1992.

3. Barrett, E. B. and Payton, P. M.: Geometric invariant for rational polynomial cameras,

Proceedings of Applied Imagery Pattern Recognition Workshop, 29 (2000), 223–234.4. Carpenter, G. A., Grossberg, S., Markuzon, N. and Reynolds, J. H.: Fuzzy ARTMAP:

A neural network architecture for incremental supervised learning for analog multidimen-

sion maps, IEEE Transactions on Neural Networks, 5 (1992), 698–713.5. Carpenter, G. A., Grossberg, S. and Rosen, D. B.: Fast stable learning and catergoriza-

tion of analog patterns by an adaptive resonance system, Neural Networks, 4 (1991),759–771.

6. Fischler, M. A., Wolf, H. C.: Locating perceptually salient points on planar curves, IEEETransactions on Pattern Analysis and Machine Intelligence, 16(2) (1994), 113–129.

7. Freeman, H.: On the encoding of arbitrary geometric configurations, IRE Transactions on

Electronic Computers, 10 (1961), 260–268.8. Granlund, G.: Fourier preprocessing for hand print character recognition, IEEE Transac-

tions on Computers, 21(2) (1972), 195–201.

9. Kim, S. D., Lee, J. H. and Kim, J. K.: A new chain-coding algorithm for binary imagesusing run-length codes, Computer Vision, Graphics and Image Processing, 41 (1988),114–128.

10. Luo, D., Macleod, J. E. S., Leng, X. and Smart, P.: Automatic orientation analysis of

particles of soil microstructures, Geotechnique, 42 (1992), 97–107.11. Morel, J. M. and Soliminii, S.: Variational Methods in Image Segmentation. Boston:

Birkhauser, 1995.

12. Mundy J. L. and Zisserman, A.: Geometric Invariance in Computer Vision. Cambridge,Mass.: MIT Press, 1992.

13. Qu, Z.: Robust Tracking Control of Robotic Manipulators. New York: IEEE, 1996.

14. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves,Computer Graphics and Image Processing, 1 (1972), 244–256.

15. Reiss, T.: Recognizing Planar Objects Using Invariant Image Features. New York:

Springer-Verlag, 1993.16. Sarkar, D.: A simple algorithm for detection of significant vertices for polygonal-approxi-

mation of chain-codeed curves, Pattern Recognition Letters, 14 (1993), 959–964.17. Sklansky, J.: Measuring concavity on a rectangular mosaic, IEEE Transactions on Com-

puters, 21 (1972), 1355–1364.18. Zahn, C. T. and Roskies, R. Z.: Fourier descripters for plane closed curves, IEEE Trans-

actions on Computers, 21(2) (1972), 269–281.

2-D SHAPE RECOGNITION 95