
Recognition of Sequential Patterns by Nonmonotonic Neural Networks

Masahiko Morita

Institute of Information Sciences and Electronics, University of Tsukuba, Tsukuba, Japan 305-8573

Satoshi Murakami

Doctoral Program in Engineering, University of Tsukuba, Tsukuba, Japan 305-8573

SUMMARY

A neural network model that recognizes sequential patterns without expanding them into spatial patterns is presented. This model forms trajectory attractors in the state space of a fully recurrent network by a simple learning algorithm using nonmonotonic dynamics. When a sequential pattern is input after learning, the network state is attracted to the corresponding learned trajectory and the incomplete part of the input pattern is restored in the input part of the model; at the same time, the output part indicates which sequential pattern is being input. In addition, this model can recognize learned patterns correctly even if they are temporally extended or contracted. © 1999 Scripta Technica, Syst Comp Jpn, 30(4): 11–19, 1999. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J81-D-II, No. 7, July 1998, pp. 1679–1688.

Key words: Sequential pattern; pattern recognition; nonmonotonic neural network; trajectory attractor; learning algorithm.

1. Introduction

A number of models have been proposed concerning pattern recognition by neural networks, but they basically deal with static spatial patterns only. To recognize a sequential pattern, they have to expand it into a spatial pattern using multilayer delay units or various time-delay filters [1, 2]. However, these methods of recognition have such problems as the temporal length of the pattern being limited by the maximum delay time and difficulty in coping with temporal expansion and contraction of the pattern.

Moreover, most of the conventional recognition models use a layered neural network without feedback, and cannot utilize the dynamics of the network, namely, the dynamic interaction between many neurons. It is true that the Elman model [3], for example, has feedback connections in the middle layer, but it is based on discrete-time dynamics requiring synchronization between neurons and sampling of the input pattern at discrete times, so that it does not make the best use of the network dynamics.

Another feature of the conventional models is that they use a complex learning algorithm like the back-propagation algorithm and give priority to improving it. Especially when there are feedback connections, the learning algorithm tends to be very complicated. As a result, the computational cost increases explosively and learning does not succeed easily, in addition to other problems.

On the other hand, the mechanism of sequential pattern recognition in the human brain seems far different from that of the above models, for the following reasons. First, no delay circuit which can convert a long sequence into a spatial pattern is found in the higher centers of the brain. In contrast, the brain has plenty of feedback connections among neurons, which suggests that it actively uses the network dynamics. Moreover, actual neurons seem to act neither synchronously with a clock nor according to a highly complex learning rule.

Of course, we need not use the same mechanism as the brain if our purpose is only practical application. However, it seems hardly possible to achieve a neural network model as excellent as the brain within the conventional framework. The purpose of this study is to construct a neural network model that recognizes sequential patterns based on a framework more similar to the brain than the conventional one, and to show its ability and possibilities.

An interesting model from such a standpoint has been proposed by Futami and Hoshimiya [4]. Their model, using the state transitions of a neural network with mutual connections, can recognize a sequential pattern without converting it into a spatial pattern. It also differs from the Elman model in that the neurons act time-continuously and the transition of the network state is continuous. However, this model is based on the local representation of information, or "grandmother-cell" type encoding [5]; that is, each neuron represents only a specific part of a specific sequence. Besides, the network cannot take every possible state, but only very limited states. Accordingly, this model neither uses neurons efficiently nor makes the most of the merits of parallel distributed processing.

The root cause of these problems is that the dynamics of conventional neural networks is not appropriate for controlling the network to make a smooth and stable state transition between arbitrary states [6, 7]. It is therefore necessary to use improved dynamics for the above purpose.

A very simple and effective method of improving network dynamics is to use nonmonotonic dynamics, that is, to change the monotonic output function of each neuron into a nonmonotonic one [8, 9]. As reported recently [6], a neural network with such dynamics (a nonmonotonic neural network) can form a trajectory attractor along a given orbit in its state space, using a simple learning algorithm. It is thus expected that the nonmonotonic neural network can recognize sequential patterns based on a principle similar to that of the brain.

The existing nonmonotonic neural network model, however, is a model of memory, in which stored sequential patterns have to be mutually separate in the pattern space so that confusion does not occur in recall. On the other hand, a model of sequential pattern recognition should treat sequences in which the same spatial pattern appears repeatedly in different contexts; otherwise, it is nothing more than a model of spatial pattern recognition. We thus need some contrivance so that the network can learn trajectories that intersect and overlap. This paper presents a concrete method for that purpose.

Incidentally, neurons with nonmonotonic input–output characteristics (nonmonotonic neurons) are not found in the brain. In this respect, the nonmonotonic neural network seems quite implausible as a model of the brain. However, similar dynamics can be achieved by combining excitatory and inhibitory neurons with monotonic characteristics, and such a model [5, 10, 11] is supported by some physiological findings. Nevertheless, the use of the nonmonotonic neuron has the advantages of simple architecture, concise description, and ease of analysis. Since the main purpose of this study is not to construct a biologically plausible model but to present a new principle of recognition, we use the nonmonotonic neuron as a component of the model here.

2. Principle

2.1. Structure and dynamics

This model has a simple structure composed of n nonmonotonic neurons with fully recurrent connections. These neurons are divided into three groups, the input, middle, and output parts, though all neurons obey the same dynamics and learning rules. For convenience, we give serial numbers to the neurons such that neurons 1 to k are the input part, k + 1 to l are the middle part, and l + 1 to n are the output part.

The dynamics of the network is expressed by

    τ du_i/dt = −u_i + Σ_j w_ij y_j + z_i,    (1)

where u_i is the potential of neuron i, w_ij is the synaptic weight from neuron j, z_i is the external input, and τ is a time constant. The output y_i is given by

    y_i = f(u_i),    (2)

where f(u) is a nonmonotonic function as shown in Fig. 1. We use, as the nonmonotonic output function,

    f(u) = [(1 − e^(−cu)) / (1 + e^(−cu))] · [(1 + κ e^(c′(|u|−h))) / (1 + e^(c′(|u|−h)))],    (3)

where c, c′, h, and κ are constants (we substitute c = 50, c′ = 10, h = 0.5, κ = −1 in the experiments described later).

Fig. 1. Nonmonotonic output function.
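For concreteness, the dynamics can be sketched numerically as follows. This is a minimal illustration of Eqs. (1)–(3) as reconstructed above, not part of the original formulation; the function names, the Euler discretization, and the step size dt are assumptions made here for the sketch.

```python
import numpy as np

def nonmonotone_f(u, c=50.0, c_prime=10.0, h=0.5, kappa=-1.0):
    """Nonmonotonic output function of Eq. (3) with the constants used in the paper."""
    # First factor: a steep sigmoid; (1 - e^{-cu}) / (1 + e^{-cu}) = tanh(cu / 2).
    sig = np.tanh(c * u / 2.0)
    # Second factor: suppresses (and, with kappa = -1, reverses) the output for large |u|.
    g = np.exp(c_prime * (np.abs(u) - h))
    return sig * (1.0 + kappa * g) / (1.0 + g)

def euler_step(u, w, z, tau=1.0, dt=0.1):
    """One Euler step of Eq. (1): tau du_i/dt = -u_i + sum_j w_ij y_j + z_i."""
    y = nonmonotone_f(u)
    du = (-u + w @ y + z) / tau
    return u + dt * du
```

With these constants, the sketched f(u) rises steeply to about 0.9 near u ≈ 0.2 and falls to about −0.9 by u ≈ 0.8, reproducing the peaked-then-reversed shape indicated for Fig. 1.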

Since the polarity of u_i is important in nonmonotonic neural networks, we consider x_i = sgn(u_i) and treat the vector x = (x_1, ..., x_n) as the network state, where sgn(u) = 1 for u > 0 and −1 for u ≤ 0.

The network state x at an instant is represented by a point in the state space consisting of the 2^n possible states. When x changes, it almost always moves to an adjacent point in the state space, because the x_i change asynchronously. Consequently, x leaves a track with time, which we call the trajectory of x. Similarly, we call x^in = (x_1, ..., x_k), x^mid = (x_{k+1}, ..., x_l), and x^out = (x_{l+1}, ..., x_n) the states of the input, middle, and output parts, respectively, and consider the trajectories of x^in, x^mid, and x^out in the state space of each part.

2.2. Learning

For simplicity of discussion, we assume for the present that k = l, or that the model has no middle part; the case of l > k will be treated in section 4, but the basic principle is the same.

Let s^1(t), ..., s^m(t) be m sequential patterns to be recognized, where s^μ = (s_1^μ, ..., s_k^μ). We assume that the elements s_i^μ are ±1 and change asynchronously. Then we can consider m trajectories corresponding to the s^μ in the k-dimensional pattern space, regarded in the same light as the state space of the input part. These trajectories may intersect or overlap with one another.

We perform learning so that the state x^out of the output part becomes a target state S^μ = (s_{k+1}^μ, ..., s_n^μ) when the sequential pattern s^μ is input into the input part. The learning algorithm is as follows.

First, we create a learning signal vector r = (r_1, ..., r_n) with binary elements (r_i = ±1). The learning signal r^in corresponding to the input part is s^μ itself, that is, r_i = s_i^μ for i ≤ k. The learning signal r^out corresponding to the output part is a spatiotemporal pattern changing gradually from a static pattern O to S^μ, where we assume O = (−1, ..., −1) without loss of generality. Since r is an n-dimensional binary vector, as is x, r is regarded as moving in the state space of the network from (s^μ(0), O) = (s_1^μ(0), ..., s_k^μ(0), −1, ..., −1) to (s^μ(T), S^μ), where T is the temporal length of s^μ.

Next, we give the network an initial state x = (s^μ(0), O) and input r in the form z_i = λ_i r_i while the network acts according to Eq. (1). Here, λ_i denotes the input intensity of r_i, which is a constant λ^in for the input part (i ≤ k) and a variable λ^out, decreasing as learning proceeds, for the output part (i > k).

We simultaneously modify all synaptic weights w_ij according to

    τ′ dw_ij/dt = −w_ij + α r_i y_j,    (4)

where τ′ denotes the time constant of learning (τ′ >> τ) and α is a learning coefficient. Since learning performance is better when α is a decreasing function of |u_i| [6], we put α = α′ x_i y_i, with α′ a positive constant.

When r is moving in the state space, x follows slightly behind, leaving its track as a gutter in the energy landscape of the network [6]. When r reaches the end of the trajectory, we keep r = (s^μ(T), S^μ) and continue modifying w_ij until x comes close enough to r.

We apply this procedure to all μ, and repeat it over a number of cycles, gradually decreasing λ^out. If x^out can reach a state near S^μ even when λ^out = 0, then the learning is completed.
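As a rough illustration, the following sketches one learning sweep along a single sequence, using the reconstructed update of Eq. (4). The helper r_of_t (which must supply the learning signal described above), the nonmonotonic function f passed in as an argument, the Euler discretization, and the omission of the final hold phase at r = (s^μ(T), S^μ) are all simplifying assumptions of this sketch, not the paper's procedure verbatim.

```python
import numpy as np

def train_on_sequence(u, w, r_of_t, T, k, f, lam_in=0.2, lam_out=0.2,
                      tau=1.0, tau_w=5000.0, alpha_prime=2.0, dt=0.1):
    """One sweep of the learning signal r(t) along one sequence.

    r_of_t(t) returns the n-dimensional learning signal at time t: its first
    k elements follow s^mu(t), and the remaining elements move gradually
    from O = (-1, ..., -1) toward the target S^mu.
    f is the nonmonotonic output function, e.g., the sketch of Eq. (3) above.
    """
    n = len(u)
    lam = np.concatenate([np.full(k, lam_in), np.full(n - k, lam_out)])
    for t in np.arange(0.0, T, dt):
        r = r_of_t(t)
        y = f(u)                                    # Eq. (2)
        x = np.where(u > 0, 1.0, -1.0)              # x_i = sgn(u_i)
        u = u + dt * (-u + w @ y + lam * r) / tau   # Eq. (1) with z_i = lambda_i * r_i
        alpha = alpha_prime * x * y                 # alpha_i = alpha' x_i y_i, small for large |u_i|
        w = w + dt * (-w + np.outer(alpha * r, y)) / tau_w   # Eq. (4), as reconstructed
    return u, w
```

Over repeated cycles through all μ, λ^out would be lowered step by step toward 0, as described above.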

2.3. Recognition

By the above learning, the trajectories of x, which are roughly the same as those of r, become attractors of the dynamical system formed by the nonmonotonic neural network [5, 6]. Accordingly, there exist m trajectory attractors in the state space after learning. Using this, the sequential patterns s^μ are recognized in the following way.

Let us assume that s′ = (s′_1, ..., s′_k) is an input pattern made by transforming (e.g., adding noise to) s^1 and that each s′_i is 1, −1, or 0. We input it to the model in the form z_i = λ^in s′_i (i ≤ k). To the output part, we give the initial state O and input nothing (z_i = 0) thereafter.

When s′ is input in this way, x is attracted to the nearest trajectory attractor, which is thought to correspond to s^1. Consequently, it is expected that the output state x^out becomes nearly equal to S^1 when we finish inputting s′.
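The recognition procedure can be summarized in code as follows. This is a hedged sketch assuming k = l as in this section; the argument names, the Euler step, and the readout by direction cosine against each S^μ are illustrative choices, with the nonmonotonic function f passed in from the earlier sketch.

```python
import numpy as np

def recognize(u, w, s_dash_of_t, T_in, k, f, lam_in=0.2, tau=1.0, dt=0.1):
    """Drive the trained network with a (possibly noisy or partial) input
    sequence s'(t) whose elements are +1, -1, or 0; the output part receives
    no external input after being initialized near O = (-1, ..., -1)."""
    z = np.zeros_like(u)
    for t in np.arange(0.0, T_in, dt):
        z[:k] = lam_in * s_dash_of_t(t)          # z_i = lambda_in * s'_i  (i <= k)
        z[k:] = 0.0                              # nothing is input to the output part
        u = u + dt * (-u + w @ f(u) + z) / tau   # Eq. (1)
    x_out = np.where(u[k:] > 0, 1.0, -1.0)       # state of the output part
    return x_out

def classify(x_out, targets):
    """Pick the target S^mu with the largest direction cosine to x_out."""
    sims = [x_out @ S / (np.linalg.norm(x_out) * np.linalg.norm(S)) for S in targets]
    return int(np.argmax(sims)), max(sims)
```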

3. Behavior of the Model

To confirm the above principle, computer simulations were performed using a network with 300 input and 200 output neurons (k = l = 300, n = 500) [12].

Four sequential patterns s^1 = {ABC}_{40τ}, s^2 = {ABD}_{40τ}, s^3 = {DAC}_{40τ}, and s^4 = {DBC}_{40τ} were used in the experiment, where A, B, C, and D are k-dimensional binary vectors selected at random; {ABC} represents the shortest path from A via B to C, and {ABC}_T denotes a spatiotemporal pattern whose trajectory is {ABC} and whose temporal length is T. The target states S^1, S^2, S^3, and S^4 were selected at random, but S^1 and S^2 were selected such that they have a similarity of 0.5, where similarity is defined as the direction cosine between two vectors. The reason we make S^1 and S^2 similar is described below.
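One way to realize such test patterns is sketched below, assuming that the "shortest path" between two ±1 vectors is traversed by flipping their differing elements one at a time in random order (the asynchronous element changes described in section 2.2); the function names and the uniform resampling to a fixed number of steps are assumptions of this sketch.

```python
import numpy as np

def similarity(a, b):
    """Direction cosine between two vectors (the similarity measure used here)."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def shortest_path_sequence(waypoints, n_steps, rng=None):
    """States along the shortest path A -> B -> C -> ..., obtained by flipping
    the differing elements one at a time in random order, resampled to n_steps."""
    rng = np.random.default_rng() if rng is None else rng
    states = [np.asarray(waypoints[0], dtype=float).copy()]
    for src, dst in zip(waypoints[:-1], waypoints[1:]):
        x = states[-1]
        for j in rng.permutation(np.where(np.asarray(src) != np.asarray(dst))[0]):
            x = x.copy()
            x[j] = dst[j]
            states.append(x)
    pick = np.linspace(0, len(states) - 1, n_steps).astype(int)
    return [states[i] for i in pick]

# Example in the spirit of section 3: four random 300-dimensional patterns
# and the sequence {ABC} sampled over 400 time steps.
k = 300
rng = np.random.default_rng(0)
A, B, C, D = (rng.choice([-1.0, 1.0], size=k) for _ in range(4))
s1 = shortest_path_sequence([A, B, C], n_steps=400, rng=rng)
```

Since random ±1 vectors of this dimension are nearly orthogonal, similarity(x, A) + similarity(x, B) stays close to 1 while x moves along the segment {AB}, which is the behavior referred to in section 3.1.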

After finishing 10 cycles of learning, we input various patterns and examined the behavior of the model. The parameters were α′ = 2, τ′ = 5000τ, and λ^in = 0.2; λ^out was decreased by degrees from 0.2 to 0.

3.1. Recognition process

Figure 2 shows the process of recognition when part of s^1 was input. Specifically, z_i = 0.2 s_i^1 for a randomly chosen half of the elements of the input part and z_i = 0 for the other half; at t > 40τ, z_i = 0 for all i. Similarities (direction cosines) between x^out and S^μ, denoted by d_out(S^μ), are plotted in the top graph, and those between x^in and A to D are plotted in the bottom one. The abscissa is time scaled by the time constant τ.

Fig. 2. A process of recognition.

The similarity d_in(A) between x^in and A increases very rapidly from the initial value of 0.5 to more than 0.9 and then decreases gradually. In this process, d_in(A) + d_in(B) is constant and nearly equal to 1, which indicates that x^in is moving along the path {AB}. We also see that x^in ≈ B at t = 20τ and x^in ≈ C at t > 40τ, and that the whole of s^1 is restored in the input part.

On the other hand, the similarity d_out(S^1) between x^out and S^1 (thick line) increases consistently with time, and x^out ≈ S^1 at t > 40τ. This means that the model has correctly recognized the input pattern as s^1. We should note that the trajectory {ABC} of s^1 overlaps everywhere with other trajectories, and thus s^1 cannot be identified from the instantaneous input pattern at any single moment.

We should also note that d_out(S^1) and d_out(S^2) rise in the same manner while t < 25τ and rapidly separate at t ≈ 30τ. This indicates that x^out moves in the middle of the trajectories {OS^1} and {OS^2} at first and then approaches {OS^1} when x^in approaches C after passing through B.

This process is schematically shown in Fig. 3, where the n-dimensional state space of the network is represented three-dimensionally. Panels (a) and (b) depict the same thing from different angles. The origin represents the initial state x = (A, O), meaning x^in = A and x^out = O. The thick line represents the trajectory of x, and the broken lines represent the trajectories r^1 and r^2 of the learning signals for s^1 and s^2. The gray lines represent the projections onto the x_1–x_2 or x_3–x_4 plane, that is, the trajectories in the state space of the input or output part.

Fig. 3. Schematic of the recognition process.

If we observe only the input part, the two trajectories r^1 and r^2 overlap in their first half and then diverge in their second half. On the other hand, the trajectories in the output part diverge from the starting point. Thus, as a whole, r^1 and r^2 are separate but rather close in the first half.

Nonmonotonic neural networks have the property that when some attractors exist nearby, states lying between them are comparatively stable [6, 7]. In other words, the energy landscape of the network has a "flat" bottom between neighboring attractors because the attractive force is smaller near the attractors (note that in the case of conventional neural networks, the energy landscape is "sharp" at attractors). Consequently, while x^in is moving along the common path {AB}, x^out moves in the middle of {OS^1} and {OS^2}.

As x^in departs from B toward C, the distance from x to r^2 increases rapidly, whereas that to r^1 does not increase to such an extent, so that x can no longer remain in between the two. As a result, x is attracted to r^1 and x^out approaches S^1.

We can see from the above discussion why a spatiotemporal pattern {OS^μ}_T, rather than a static pattern S^μ, should be used for the learning signal r^out. That is, if r^out is a static pattern, r^1 and r^2 are far apart over the entire path, and thus x is attracted to one of the two soon after starting from the origin; once x is attracted to r^1, for example, x can hardly transfer to r^2 even if x^in goes along {BD} afterwards.

By similar reasoning, if the trajectories of s^1 and s^2 are identical or very similar in their first section, the corresponding target states S^1 and S^2 should be similar so that the distance between r^1 and r^2 is decreased. Then there is less possibility that x is attracted to r^1 or r^2 before sufficient information has been given to the model, and even if that happens, x can transfer to the correct trajectory more easily.

3.2. Recognizing patterns with blank sections

The trajectory attractor formed by the above learning not only has a strong surrounding flow that runs into it, but also has a gentle flow that moves along the trajectory as fast as r did [6]. Accordingly, this model can recognize a learned sequential pattern even if the input pattern has some blank sections.

As an example, the first quarter of s^2 (from A to the midpoint between A and B) was input for 0 ≤ t ≤ 10τ, and then the input was cut off. Figure 4 shows the behavior of the model in the same way as Fig. 2, but the thick line represents d_out(S^2). We see that the movement of x for 10τ < t ≤ 20τ is roughly the same as that in Fig. 2, although there is no external input. However, when x^in had passed through B and slightly approached C and D, x stopped. This is thought to be an equilibrium state in which the attracting forces from multiple attractors balance out.

Fig. 4. Behavior when the sequence with blank sections was input.

Then the third quarter of s^2 (from B to the midpoint between B and D) was input for 30τ ≤ t ≤ 40τ, and the input was cut off again. We see that x is attracted to r^2 and finally comes close to (D, S^2).

In this way, this model complements the blank sections of the input pattern and recognizes it correctly, provided that no other trajectory attractors exist near the blank sections.

3.3. Recognizing patterns with temporal extension and contraction

As described above, x basically moves at the same pace as in learning after being attracted to the learned trajectory. However, when s^μ is input at a different pace, x^in follows the input pattern unless the pace is too fast; x^out then keeps pace with x^in and approaches S^μ. That is, the model can recognize a temporally extended or contracted version of s^μ.

Figure 5(a) shows the recognition process when s^3 = {DAC}_{40τ} was input at one-fifth the pace, and Fig. 5(b) shows the case when s^4 = {DBC}_{40τ} was input at double the pace. We see that the input patterns are correctly recognized, though the transition of x^in is slightly delayed in the latter case.

Fig. 5. Behavior when the sequences with temporal extension and contraction were input.

In this connection, if the input pace is still faster, the delay of x^in increases so much that the model fails in recognition. Also, the more noise the input pattern contains, the less tolerant the model is of temporal extension and contraction of the pattern.

4. Introduction of the Middle Part

In the example of the previous section, we treated the case where the input spatiotemporal patterns have rather simple trajectories, and thus we could associate x^in (≈ s^μ) directly with an x^out that has a short, straight trajectory from O to S^μ. If the s^μ are long and intricately intertwined, however, the above method does not work well because the difference in trajectory structure between r^in and r^out is too large.

We can reduce the difference by increasing the dimension of the output part and making the trajectory of r^out long and curved. However, this causes another problem: we cannot know the result of recognition until x^out comes close enough to the end point S^μ of the trajectory of r^out, whereas in the above model we can easily identify (e.g., with a perceptron) the destination of x^out as soon as it is attracted to one of the trajectories.

To solve this problem, we introduce hidden neurons in the middle part, leaving r^out unchanged. That is, we expect that we can associate x^in with x^out through the middle part, where x^mid draws an intermediary trajectory.

4.1. Generating the learning signal for the middle part

The structure and dynamics of the network with hidden neurons were described previously. The point is how to generate the learning signal r^mid for the middle part.

Error signals obtained by the back-propagation algorithm are often used for training hidden neurons. This method, however, has the problems described in section 1 and is unsuitable for the present model. It is also undesirable to generate r^mid off-line by some complex method. We thus choose to supplement the above model with a network that generates r^mid from r^in and r^out in real time. Of course, the supplementary network should be as simple as possible and should not require complex learning.

The requirements for r^mid are as follows: (1) its trajectories may be curved, but they should be shorter and less intertwined than those of r^in; (2) it should reflect the changes in r^in and r^out to some extent; (3) its starting point should be near the initial state O′ = (−1, ..., −1) of the middle part. Since all of these properties lie in between those of r^in and r^out, a desired r^mid should be obtainable by mixing r^in and r^out with a randomly connected network.

Based on this idea, we constructed the model shown in Fig. 6. The lower half of the figure is the supplementary network, which is used only in learning. This network consists of ordinary binary neurons of the same number, l − k, as the middle part. Each neuron i (i = k + 1, ..., l) receives r^in and r^out through synaptic weights a_ij and b_ij, respectively, and its output r_i is given to the corresponding hidden neuron as the learning signal. It also has a self-connection of positive strength ρ, which prevents r^mid from fluctuating sharply and makes the trajectory of r^mid smooth. In mathematical terms,

    r_i = sgn( Σ_{j=1}^{k} a_ij r_j + Σ_{j=l+1}^{n} b_ij r_j + ρ r_i ),    (5)

where k + 1 ≤ i ≤ l and we put r_i = −1 for t < 0.

Fig. 6. Structure of the model with the middle part.

The synaptic weights a_ij and b_ij are determined randomly, but the average of the b_ij should be positive so that r^mid is close to O′ in the initial state, when r^out = O. In the following experiments, the a_ij are normally distributed random numbers with mean 0 and variance 1/k, the b_ij are normally distributed with mean 1/(n − l) and variance 1/(n − l), and ρ = 1; these values were determined by several trials and are not necessarily optimal.
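A sketch of this supplementary network is given below, under the reconstruction of Eq. (5) above; the synchronous update and the function names are simplifying assumptions of the sketch.

```python
import numpy as np

def make_mixing_weights(k, l, n, seed=None):
    """Random weights of the supplementary network: a_ij ~ N(0, 1/k) from the
    input part and b_ij ~ N(1/(n-l), 1/(n-l)) from the output part."""
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, np.sqrt(1.0 / k), size=(l - k, k))
    b = rng.normal(1.0 / (n - l), np.sqrt(1.0 / (n - l)), size=(l - k, n - l))
    return a, b

def next_r_mid(r_mid, r_in, r_out, a, b, rho=1.0):
    """One update of the middle-part learning signal following Eq. (5):
    r_i = sgn(sum_j a_ij r_j^in + sum_j b_ij r_j^out + rho * r_i)."""
    h = a @ r_in + b @ r_out + rho * r_mid
    return np.where(h > 0, 1.0, -1.0)

# r_mid is initialized to (-1, ..., -1), i.e., near O', before learning begins.
```

Because the b_ij have a positive mean and r^out starts at O = (−1, ..., −1), the net field is negative on average at the start, so r^mid indeed begins near O′, as required above.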

4.2. Computer simulation

Computer simulations were performed on the model with 400 input, 600 hidden, and 200 output neurons (k = 400, l = 1000, n = 1200).

We prepared 21 sequential patterns s^1–s^21, as shown in Table 1. These patterns were generated by connecting 400-dimensional binary vectors A–G in random order. On average, A–G each appear 15 times; it is also calculated that each unit section such as {AB} or {AC} appears twice on average (actually, the frequency of appearance ranged from 0 to 6). The temporal length T was set to 80τ for all s^μ.

Table 1. Sequential patterns for the experiment

The target states S^μ of the output part were selected randomly from the vectors orthogonal to O, but if the trajectories of s^μ and s^ν (μ ≠ ν) are identical in their first p quarters, the corresponding S^μ and S^ν were selected such that their similarity is p/4.
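For two ±1 vectors of dimension m that differ in d components, the direction cosine is (m − 2d)/m, so a target with a prescribed similarity q can be obtained by flipping d = m(1 − q)/2 components. The short sketch below illustrates this; the function name is illustrative, and the orthogonality to O is not enforced here.

```python
import numpy as np

def target_with_similarity(S_mu, q, rng=None):
    """Return a +/-1 vector whose direction cosine with S_mu is approximately q,
    by flipping d = m * (1 - q) / 2 randomly chosen components of S_mu."""
    rng = np.random.default_rng() if rng is None else rng
    m = len(S_mu)
    d = int(round(m * (1.0 - q) / 2.0))
    S_nu = np.array(S_mu, dtype=float)
    flip = rng.choice(m, size=d, replace=False)
    S_nu[flip] *= -1.0
    return S_nu
```

For example, with m = 200 and q = 3/4 (trajectories identical in the first three quarters), 25 components would be flipped.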

Using r^in = s^μ, r^out = {OS^μ}_T, and r^mid generated in the above way, the model was trained 15 times on each sequential pattern. As learning proceeded, the input intensity λ^mid of r^mid was decreased gradually from 0.2 to 0, as was λ^out. The parameters for learning are the same as those in section 3 except that τ′ = 40000τ.

Figure 7 shows the recognition process when {C′E′A′C′A′}_T was input after learning; A′, C′, and E′ are vectors obtained by randomly flipping 100 components (the noise ratio is 50%) of A, C, and E, respectively. The top and bottom graphs indicate the time courses of d_out(S^μ) and of d_in(A) to d_in(G), respectively. The middle graph indicates the state transition of the middle part in a different way: the similarities between x^mid(t) and the r^mid(t) used in learning each s^μ are plotted at each time t.

Fig. 7. A recognition process of the model with the middle part.

We see that the model recognized the input pattern as s^8 = {CEACA}_T. In this process, x^mid moves roughly along the trajectory of the r^mid for s^8, departing gradually from those of the other sequences. It should be noted that s^8 overlaps with s^7 = {CEABD}_T in their first half, and thus x at first moves along an intermediate trajectory between r^7 and r^8.

In the same way, patterns with a noise ratio of 50% were correctly recognized for 14 of the 21 s^μ. Recognition failed for the other 7, but all of them were correctly recognized when the noise ratio was 25%.


Figure 8 shows the behavior when the vague pattern {(AD)E′(AG)F′(AC)}_T was input. Here, E′ and F′ are vectors containing 50% random noise, and (AD) denotes a vector lying midway between A and D; (AG) and (AC) are also such intermediate vectors. Note that (AD) is quite different from A′ and D′.

Fig. 8. Behavior when the input pattern has vague sections.

We see that the model recognized this pattern as s^3 = {AEGFA}_T, which is the most similar overall. We also see that G and A are restored in the input part while (AG) and (AC) are being input, respectively, whereas an intermediate pattern between A and D appears when (AD) is input at the start.

Generally, however, this kind of vague pattern is difficult to recognize, and in many cases x^out does not reach any target state, so that recognition fails. One of the causes is that in this experiment, vectors A and B are nearly orthogonal and the vector (AB) is very distant from both of them. The performance would thus be better if similar objects that can be confused with each other were encoded into similar vectors; such improvement, however, remains for future study.

5. Conclusions

We have described a model that can recognize sequential patterns by the use of trajectory attractors formed in a nonmonotonic neural network. The distinctive features of this model are enumerated below.

1. The input sequence is not expanded into a spatial pattern, so that no delay elements are necessary and the length of the sequence is not restricted.

2. The network changes its state continuously, based on a fully distributed representation. This enables flexible recognition of the kind shown above.

3. The network has a simple architecture. The learning algorithm is also simple and does not require many iterations of input.

4. The input sequential pattern is not only recognized but also restored; that is, defects of the pattern are repaired in both the spatial and temporal dimensions.

Given these features, the model seems much closer to the brain in its working principle than existing models of sequential pattern recognition, and thus we expect that it has great potential in brain modeling as well as in technological applications.

However, the model described in this paper is a basic one, and many subjects remain for future study. For example, we should proceed with experimental and theoretical analysis of the properties of the model. Also, the model has much room for further development; for example, we may be able to treat complex sequences more efficiently by introducing a hierarchical structure into the middle part. Improvement of learning-signal generation and application to speech recognition are also subjects for future work.

Acknowledgments. The authors thank Mr. Hidekazu Kimura (currently with NTT Data Co.) for preliminary experiments in this study. Part of this study was supported by a Grant-in-Aid for Scientific Research (#08279105, #08780328, and #0978031) from the Ministry of Education of Japan.

REFERENCES

1. Waibel A. Modular construction of time-delay networks for speech recognition. Neural Computation 1989;1:328–339.

2. Tank DW, Hopfield JJ. Neural computation by concentrating information in time. Proc Natl Acad Sci USA 1987;84:1896–1900.

3. Elman JL. Finding structure in time. Cognitive Sci 1990;14:179–211.

4. Futami R, Hoshimiya N. A neural sequence identification network (ANSIN) model. Trans IEICE 1988;J71-D:2181–2190.

5. Morita M. Neural network models of learning and memory. In: Toyama T, Sugie N, editors. Brain and computational theory. Asakura Publishing; 1997. p 54–69.

6. Morita M. Memory and learning of sequential patterns by nonmonotone neural networks. Neural Networks 1996;9:1477–1489.

7. Morita M. Associative memory of sequential patterns using nonmonotone dynamics. Trans IEICE 1995;J78-D-II:678–688.

8. Morita M, Yoshizawa S, Nakano K. Analysis and improvement of the dynamics of autocorrelation associative memory. Trans IEICE 1990;J73-D-II:232–242.

9. Morita M. Associative memory with nonmonotone dynamics. Neural Networks 1993;6:115–126.

10. Morita M. A neural network model of the dynamics of a short-term memory system in the temporal cortex. Trans IEICE 1991;J74-D-II:54–63.

11. Morita M. Computational study on the neural mechanism of sequential pattern memory. Cognitive Brain Res 1996;5:137–146.

12. Morita M, Murakami S. Recognition of spatiotemporal patterns by nonmonotone neural networks. Proc ICONIP'97, vol 1, p 6–9.

AUTHORS (from left to right)

Masahiko Morita graduated in 1986 from the Department of Mathematical Engineering and Information Physics, Faculty of Engineering, University of Tokyo, where he obtained a D.Eng. degree in 1991. He has been an assistant professor at the University of Tsukuba since 1992. He is engaged in research on biological information processing and neural networks. He received a Research Award and a Paper Award from the Japanese Neural Network Society in 1993 and 1994, respectively.

Satoshi Murakami graduated in 1995 from the College of Engineering Systems, University of Tsukuba. He is currently in the Doctoral Program in Engineering there, studying neural information processing.