Multidimensional Neural Networks: Unified Theory
G. Rama Murthy, New Age International (P) Ltd., Publishers, 2007



    Copyright 2008, New Age International (P) Ltd., Publishers

    Published by New Age International (P) Ltd., Publishers

    All rights reserved.

    No part of this ebook may be reproduced in any form, by photostat, microfilm,

    xerography, or any other means, or incorporated into any information retrieval

    system, electronic or mechanical, without the written permission of the publisher.

    All inquiries should be emailed to [email protected]

PUBLISHING FOR ONE WORLD

    NEW AGE INTERNATIONAL (P) LIMITED, PUBLISHERS

    4835/24, Ansari Road, Daryaganj, New Delhi - 110002

    Visit us at www.newagepublishers.com

    ISBN (13) : 978-81-224-2629-8


Dedicated to the memory of

    JESUS CHRIST

    and

    SARASWATI


Preface

This book deals with a novel paradigm of neural networks, called multidimensional neural networks. It also provides a comprehensive description of a certain unified theory of control, communication and computation. This book can serve as a textbook for an advanced course on neural networks or computational intelligence/cybernetics. Both senior undergraduate and graduate students can benefit from such a course. It can also serve as a reference book for practicing engineers utilizing neural networks. Furthermore, the book can be used as a research monograph by neural network researchers.

In the field of electrical engineering, researchers have innovated sub-fields such as control theory, communication theory and computation theory. Concepts such as logic gates, error correcting codes and optimal control vectors arise in the computation, communication and control theories respectively. In one dimensional systems, the concepts of error correcting codes and logic gates are related to neural networks. The author, in his research efforts, showed that the optimal control vectors (associated with a one dimensional linear system) constitute the stable states of a neural network. Thus a unified theory is discovered and formalized in one dimensional systems. Questioning the possibility of logic gates operating on higher dimensional arrays resulted in the discovery as well as formalization of the research area of multi/infinite dimensional logic theory. The author has generalized the known relationship between one dimensional logic theory and one dimensional neural networks to multiple dimensions. He has also generalized the relationship between one dimensional neural networks and error correcting codes to multidimensions (using a generator tensor).

On the way to unification in multidimensional systems, the author has discovered and formalized the concept of tensor state space representation of certain multidimensional linear systems.

It is well accepted that the area of complex valued neural networks is a very promising research area. The author has proposed a novel activation function called the complex signum function. This function has enabled proposing a complex valued neural associative memory on the complex hypercube.

He has also proposed novel models of the neuron (such as a linear filter model of the synapse).

This book contains 10 chapters. The first chapter provides an introduction to the unified theory of control, communication and computation. Chapter 2 introduces a mathematical model of multidimensional neural networks and the associated convergence theorem. In Chapter 3, the concepts of multidimensional error correcting codes, multidimensional neural networks and optimization of multi-variate polynomials (associated with a tensor) over various subsets of the multidimensional lattice are related from different viewpoints. In Chapter 4, Tensor State Space Representation (TSSR) of certain multidimensional linear systems is discussed. In Chapter 5, the Unified Theory of Control, Communication and Computation in multidimensional linear systems is summarized. In Chapter 6, the author proposes a novel complex signum function. In Chapter 7, a novel optimal filtering problem associated with a one dimensional linear system is formulated and solved. In Chapter 8, a linear filter model of the synapse is proposed; also, a novel continuous time associative memory and the associated convergence theorem are discussed. In Chapter 9, a novel model of the neuron and associated real/complex neural networks are proposed. Finally, in Chapter 10, an advanced theory of evolution based on the unified theory is briefly discussed.

The chapters in this book are organized in such a way that there is considerable flexibility in its use by its readers. For instance, Chapters 1 to 5 can form the basis for a graduate course on multidimensional neural networks and unified theory. This course is a compulsory course for students interested in doing research on computational intelligence (cybernetics). The students/researchers interested in doing research on complex valued neural networks will find interesting material in Chapters 6 and 9. Further, the students/researchers interested in exploring the interrelationship between signal processing and neural networks will enjoy understanding the material in Chapters 7 and 8. Finally, Chapter 10 will provide counter-intuitive insights into the theory of organic evolution.

This writing project would not have been possible without the cooperation of my brother Dr. G.V.S.R. Prasad and my beloved mother. I thank many colleagues at IIIT and those around the world who believe that this book is my first masterpiece. I specially thank Sri Damodaran and other employees of New Age International (P) Ltd. for making my dream of publishing this book a reality.

    G. Rama Murthy


Contents

Preface

1. Introduction
   Logical Basis for Computation
   Logical Basis for Control
   Logical Basis of Communication
   Advanced Theory of Evolution

2. Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory
   2.1 Introduction
   2.2 Mathematical Model of Multidimensional Neural Networks
   2.3 Convergence Theorem for Multidimensional Neural Networks
   2.4 Multidimensional Logic Theory, Logic Synthesis
   2.5 Infinite Dimensional Logic Theory: Infinite Dimensional Logic Synthesis
   2.6 Neural Networks, Logic Theories, Constrained Static Optimization
   2.7 Conclusions

3. Multi/Infinite Dimensional Coding Theory: Multi/Infinite Dimensional Neural Networks, Constrained Static Optimization
   3.1 Introduction
   3.2 Multidimensional Neural Networks: Minimum Cut Computation in the Connection Structure: Graphoid Codes
   3.3 Multidimensional Error Correcting Codes: Associated Energy Functions, Generalized Neural Networks
   3.4 Multidimensional Error Correcting Codes: Relationship to Stable States of Energy Functions
   3.5 Non-Binary Linear Codes
   3.6 Non-Linear Codes
   3.7 Constrained Static Optimization
   3.8 Conclusions

4. Tensor State Space Representation: Multidimensional Systems
   4.1 Introduction
   4.2 State of the Art in Multi/Infinite Dimensional Static/Dynamic System Theory: Representation by Tensor Linear Operator
   4.3 State Space Representation of Certain Multi/Infinite Dimensional Dynamical Systems: Tensor Linear Operator
   4.4 Multi/Infinite Dimensional System Theory: Linear Dynamical Systems, State Space Representation by Tensor Linear Operators
   4.5 Stochastic Dynamical Systems
   4.6 Distributed Dynamical Systems
   4.7 Conclusions

5. Unified Theory of Control, Communication and Computation: Multidimensional Neural Networks
   5.1 Introduction
   5.2 One Dimensional Logic Functions, Codeword Vectors, Optimal Control Vectors: One Dimensional Neural Networks
   5.3 Optimal Control Tensors: Multidimensional Neural Networks
   5.4 Multidimensional Systems: Optimal Control Tensors, Codeword Tensors and Switching Function Tensors
   5.5 Conclusions

6. Complex Valued Neural Associative Memory on the Complex Hypercube
   6.1 Introduction
   6.2 Features of the Proposed Model
   6.3 Convergence Theorems
   6.4 Conclusions

7. Optimal Binary Filters: Neural Networks
   7.1 Introduction
   7.2 Optimal Signal Design Problem: Solution
   7.3 Optimal Filter Design Problem: Solution (Dual of Signal Design Problem)
   7.4 Conclusions

8. Linear Filter Model of a Synapse: Associated Novel Real/Complex Valued Neural Networks
   8.1 Introduction
   8.2 Continuous Time Perceptron and Generalizations
   8.3 Abstract Mathematical Structure of Neuronal Models
   8.4 Finite Impulse Response Model of Synapses: Neural Networks
   8.5 Novel Continuous Time Associative Memory
   8.6 Multidimensional Generalizations
   8.7 Generalization to Complex Valued Neural Networks (CVNNs)
   8.8 Conclusions

9. Novel Complex Valued Neural Networks
   9.1 Introduction
   9.2 Discrete Fourier Transform: Some Complex Valued Neural Networks
   9.3 Complex Valued Perceptron
   9.4 Novel Model of a Neuron: Associated Neural Networks
   9.5 Continuous Time Perceptron Learning Law
   9.6 Some Important Generalizations
   9.7 Some Open Questions
   9.8 Conclusions

10. Advanced Theory of Evolution of Living Systems
    10.1 Unified Theory: Cybernetics
    10.2 Organic Evolution
    10.3 Evolution of Living Systems: Innovative Principles
    10.4 Conclusions

Index


CHAPTER 1

Introduction

Ever since the dawn of civilization, the homo-sapien animal, unlike other lower level animals, was constantly creating tools that enabled the community not only to take advantage of the physical universe but also to develop a better understanding of physical reality through the discovery of the underlying physical laws. The homo-sapien, like other lower level animals, had two primary necessities: metabolism and reproduction. But more important was the obsession with other developed necessities such as art, painting, music and sculpture. These necessities naturally led to the habit of concentration. This most important habit enabled him to develop abstract tools utilized to study nature in the most advanced civilizations. Thus the homo-sapien animal achieved the distinction of being a higher animal compared to the other animals in nature.

In ancient Greece, the homo-sapien civilization was highly advanced in many matters compared to all other civilizations. Such a lead was symbolized by the development of the subject of mathematics in various important stages. The most significant indication of such development is left to posterity in the form of 13 books called Euclid's Elements. These books provide the first documented effort of axiomatic development of a mathematical structure such as Euclidean geometry. The Greek and Babylonian civilizations also made important strides in algebra: solving linear and quadratic equations and studying the quadratic homogeneous forms in two variables (for conic sections). Algebra was revived during the Renaissance in Italy, where the solution of cubic and quartic equations was carried out by the Italian algebraists. This constituted the intellectual and cultural heritage along with religious and social traditions.

To satisfy the curiosity of observing the heavens, various star constellations and astronomical objects were classified. In navigating ships for battle purposes as well as trade, astronomical observations were made. These provided the first curious data related to the natural world. In an effort to understand the non-living material universe, homo-sapiens have devised various tools: measuring equipment, experimental equipment, mathematical procedures, mathematical tools, etc.


With the discovery by Copernicus that the Sun is the center of our relative motion system, Ptolemaic theory was permanently forsaken. It gave Galileo the curious motivation for deriving empirical laws of far-flung significance in natural philosophy/natural science/physics. Kepler, after strenuous efforts, derived the laws of planetary motion, leading to some of the laws of Newton. Isaac Newton formalized the laws of Galileo by developing calculus. He also developed a theory of gravitation based on the empirical laws of Kepler. Michael Faraday derived the empirical laws of electric and magnetic phenomena. Though Newton's mechanical laws were successfully utilized to explain the heat phenomenon and the kinetic theory of gases as being due to the mechanical motion of molecules and atoms, they were inadequate for electrical phenomena. Maxwell formalized Faraday's laws of electro-magnetic induction, leading to his field equations. Later, physics developed at a feverish pace.

These results in physics were paralleled by developments in other related areas such as chemistry, biology, etc. Thus, the early efforts of homo-sapiens matured into a clearer view of the non-living world. The above description summarizes the pre-20th century progress of homo-sapien contributions to understanding the non-living material universe.

In making conclusive statements on the origin and evolution of physical reality, the developments of the 20th century are more important. In that endeavor, Einstein's general theory of relativity was one of the most important cornerstones of 20th century physics. It enabled him to develop a general, more correct theory of gravitation, outdating the Newtonian theory. It showed that gravitation is due to the curvature of the space-time continuum. The general theory of relativity also showed that all natural physical laws are invariant under general (non-linear) coordinate transformations. This result was a significant improvement over the special theory of relativity, where he showed that all natural physical laws are invariant under linear Lorentz transformations. This result (in the special theory of relativity) was achieved when Einstein realized that, due to the finiteness of the velocity of light, one must discard the notions of absolute space and time. They must be replaced by the notion of the space-time continuum, i.e. space and time are not independent of one another, but are dependent. Thus, the special and general theories of relativity constrained the form of natural physical law.

In the 20th century, along with the theory of relativity, quantum mechanics was developed due to the efforts of M. Planck, E. Schrodinger and W. Heisenberg. This theory showed that the electromagnetic field at the quantum level was quantized. This, along with the wave-particle duality of light, was considered irreconcilable with the general theory of relativity. To reconcile the general theory of relativity with various quantum theories, Y. Nambu proposed a string model for fundamental particles and formalized the dynamics of the light string. Utilizing the experimentally verified quantum theories of chromodynamics, electrodynamics and supersymmetry of fundamental particles (unifying Bosons and Fermions), it was possible to supersymmetrize the string model of fundamental particles, resulting in the so-called superstring (supersymmetric string) theory. Currently, to explain the non-living universe, the string model hopes to be an experimentally verifiable, theoretically viable model.

But the material universe consists of the living universe as well as the non-living universe. All efforts in science probed the non-living universe using experimental as well as theoretical methodology. The efforts of all scientists enabled them to see farther by standing on the shoulders of earlier giants. The homo-sapien animals, by devising various tools, discovered and formalized various laws and theories related to non-living physical-reality based dynamical systems. The homo-sapien animal learned to build machines to facilitate his life and that of the community surrounding him. By understanding the mechanism of various functional units in living systems such as the ear and the eye, various machines such as the telephone, television and loud-speaker were built. Also, in the research area of artificial intelligence in Electrical Engineering, various functions of the human brain are simulated in machines called robots.

In the case of the living universe, the scene was entirely different. The author made various pioneering innovations on living systems, unlike the extended, stretched-over effort on non-living systems by various eminent scientists. The objective/goal of this is to provide artificial/manufacturable models of living systems, i.e. robots which resemble living systems in every respect. In arriving at artificial models, the efforts of various eminent mathematicians and scientists, culminating in those of N. Wiener (who coined the word CYBERNETICS), were helpful. The important discovery and the associated formalization belonged to the pioneering efforts of the author.

LOGICAL BASIS FOR COMPUTATION

George Boole developed the algebra in which the variables assume true or false values. This algebra is called Boolean algebra. Certain elementary Boolean algebraic expressions are realized in equipment called logic gates. When the logic gates are combined/co-ordinated, an arbitrary Boolean algebraic expression can be computed. The combination of Boolean logic gates (an assemblage with some minimum configuration of gates) and memory elements forms an arithmetic unit. When such a unit is coupled with a control unit, the Central Processing Unit (CPU) of a computer is realized. The CPU, in association with memory, input and output units, forms a computational unit without intelligence. This is just a machine which can be utilized to perform computational tasks in a fast manner. Various thought-provoking modifications make it operate on data in an efficient manner and provide computational results related to various problems.
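As a minimal illustration of the idea that combined logic gates compute arbitrary Boolean expressions (an example added here, not taken from the book; the gate and adder names are arbitrary), the following Python sketch composes elementary gates into a one-bit half adder, the kind of building block an arithmetic unit is made of:

```python
# Elementary Boolean logic gates on {0, 1} values.
def AND(a, b): return a & b
def OR(a, b):  return a | b
def NOT(a):    return 1 - a
def XOR(a, b): return OR(AND(a, NOT(b)), AND(NOT(a), b))

# Combining gates realizes a more complex Boolean expression:
# a one-bit half adder (sum bit and carry bit).
def half_adder(a, b):
    return XOR(a, b), AND(a, b)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b))
```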

    LOGICAL BASIS FOR CONTROL

Faraday conducted experiments related to electrical and magnetic phenomena. He discovered the laws of electro-magnetic induction. Based on his investigations, Fleming discovered that a time varying electric field leads to a magnetic field, which can be capitalized on for the motion of a neutral body. He also discovered that a time varying magnetic field leads to an electric field inside a neutral conductor and a flow of current takes place. These formed Fleming's left hand and right hand rules relating the relativistic effects between the electric field, the magnetic field and the conductor. These investigations of Faraday and other scientists naturally paved the way for electric circuits consisting of resistors, inductors and capacitors. Such initial efforts led to canonical circuits such as the RL circuit, RLC circuit, RC circuit, etc. The systems of differential equations and their responses were computed utilizing analytical techniques. The ability to control the motion of an arbitrary neutral object led to applications of electrical circuits and their modifications for the control of trajectories of aircraft. Thus, automata which can perform CONTROL tasks were generated. These control automata were primarily based on electrical circuits and operate in continuous time with the ability to make synchronization at discrete instants. Later, utilizing the Sampling Theorem, sampled-data control systems operating in discrete time were developed.
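The flavour of such a sampled-data control loop can be sketched numerically (an illustration added here, not from the book; the plant constants and gain are arbitrary choices): a first-order continuous plant is discretized at the sampling instants and driven by simple proportional feedback.

```python
import numpy as np

# First-order plant dx/dt = a*x + b*u, discretized with sampling period dt
# under a zero-order hold: x[k+1] = Ad*x[k] + Bd*u[k].
a, b, dt = -1.0, 2.0, 0.1
Ad = np.exp(a * dt)
Bd = (Ad - 1.0) / a * b

x, reference, gain = 0.0, 1.0, 5.0
for k in range(50):
    u = gain * (reference - x)   # feedback computed only at the sampling instants
    x = Ad * x + Bd * u          # discrete time (sampled-data) state update

# Settles near (not exactly at) the reference: pure proportional control leaves an offset.
print(round(x, 3))
```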

    LOGICAL BASIS OF COMMUNICATION

The problem of communication is to convey a message from one point in space to another point in space as reliably as possible. The message, on being transmitted through the channel, is changed/garbled by being subjected to various forms of disturbance (noise). By coding the message (through the addition of redundancy), it is possible to retrieve the original message from the received message.
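As a toy illustration of recovering a message through added redundancy (an example added here; the multidimensional codes developed in Chapter 3 of the book are far more general), a repetition code copies each bit several times and a majority vote at the receiver undoes isolated bit flips:

```python
import random

def encode(bits, r=3):
    """Repetition code: transmit each bit r times (added redundancy)."""
    return [b for b in bits for _ in range(r)]

def decode(received, r=3):
    """Majority vote over each block of r received symbols."""
    return [int(sum(received[i:i + r]) > r // 2) for i in range(0, len(received), r)]

message = [1, 0, 1, 1, 0]
channel = encode(message)
# Channel noise: flip each transmitted symbol with small probability.
noisy = [b ^ (random.random() < 0.05) for b in channel]
print(decode(noisy) == message)   # usually True: the redundancy allows recovery
```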

Thus, the three problems of control, communication and computation can be described through the illustration in Figure 1.1. From the illustration, the message that is generated may be in continuous time or discrete time. Utilizing the Sampling Theorem, if the original signal is band-limited, then the message can be sampled. The sampled signal forms the message in discrete time. The message is then encoded through an encoder. It is then transmitted through a channel. If the channel is a waveform channel, various digital modulation schemes are utilized in encoding. The signal, on reaching the receiver, is demodulated through the demodulator and then it is decoded. This whole assembly of hardware equipment forms the COMMUNICATION equipment.

The above summary described the efforts of engineers, scientists and mathematicians to synthesize the automata which serve the purpose of CONTROL, COMMUNICATION AND COMPUTATION. These functions are the basis of automata that simulate living systems. These automata model the living systems. In other words, control, communication and computation automata, when properly assembled and co-ordinated, lead to robots which simulate some functions of various living systems.

In the above effort at simulating the functions of living systems in machines, traditionally the control, communication and computation automata led to sophisticated robots (which served the purpose pretty well). Thus, the utilitarian viewpoint was partially satisfied. But the author took a more FUNDAMENTAL approach to the problem of simulating a


are extended to multi/infinite dimensional linear systems. Also, the results developed in one dimension for the computation of optimal control are immediately extended to certain multi/infinite dimensional linear systems. This result, in association with the formalization of multi/infinite dimensional logic theory and multi/infinite dimensional coding theory (as an extension of one dimensional linear and non-linear codes), provided the formal UNIFIED THEORY in multi/infinite dimensional linear systems. The formal mathematical detail on models of living system functions is provided in Chapters 2 to 5. These chapters provide the details on control, communication and computation automata in multiple dimensions. Several generalized models of neural networks are discussed in Chapters 5 to 9. Also, the relationship between neural networks and optimal filters is discussed in Chapter 7. In Chapter 10, an advanced theory of evolution is discussed.

    ADVANCED THEORY OF EVOLUTION

Mathematical models of living system functions motivated us to take a closer look at the functions of natural living systems observed in physical reality. In physical reality, we observe homo-sapiens as well as lower level animals such as tigers, lions, snakes, etc. It is reasoned that some of the functions of natural living systems are misunderstood or not understood.

Biological living systems such as homo-sapiens lead to a biological culture. In a biological culture that originated during the ice age in the oceans, various living species were living in the oceans. Through some process, the two necessities of metabolism and reproduction were developed by all living species. The homo-sapien species was responsible for our current understanding of various activities and functions of observed living systems. The author hypothesizes that the homo-sapien interpretations are totally wrong. For instance, metabolism which leads to the killing of one species by another is unnecessary to sustain life.

The belief (like many superstitions) that death and aging are inevitable is only partially true.

To be more precise, it should be possible to take the non-decayed organs of a living species and, by recharging the dead cells, make it living. Many such innovative ideas on living systems are discussed in Chapter 10.

The only necessities of natural living systems that are observed are metabolism and reproduction. By and large, the only organization and community formation that we see in natural systems other than homo-sapiens are of the following form:

    Migratory pattern of birds

    Sharing the information on the place of food

    Forming a group of families to satisfy the reproductive needs

    Occasional bird songs of mutual courtship

Occasional rituals related to protecting the members of their group, etc.


The organization and culture observed in other biological systems and other natural living systems are nowhere comparable to those observed in the homo-sapien species. But the author hypothesizes that this marginal/poor organization is primarily due to a lack of co-ordination, which is achieved through language. Thus, the major effort in organizing the lower level species of living systems is through teaching a language. Thus, organization of living systems other than the homo-sapiens (for homo-sapien and other purposes) should be possible.

An important part of organizing the homo-sapiens was the educational system through an associated language. In the same spirit, by teaching some lower level animals to speak a certain language, they could be organized/educated to understand as well as develop science and technology. When the lower level animals are organized in a zoo through various methods, they could lead to a culture and a civilization.

Various natural living machines have developed organs/functional units due to evolutionary needs. These functional units essentially include sensors to collect video and audio information or, more generally, sensors to collect data on the surrounding environment in the universe. The data gathered by the living machine from the surrounding environment in physical reality is utilized to perform some primary functions such as metabolism, reproduction, etc. The data is processed by various functional sub-units inside the brain of a living machine. Thus the understanding of the operation of various functional sub-units in the brain of natural living machines leads to building artificial living machines which are far superior in functional capabilities.


CHAPTER 2

Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory

    2.1 INTRODUCTION

One dimensional logic theory is concerned with the study of static/dynamic transformations on one dimensional arrays of zeroes and ones to arrive at arrays of zeroes and ones. Various standard logic gates such as AND, OR, NOT, NAND, XOR, NOR are defined on one dimensional arrays/vectors. The logic synthesis of digital integrated circuits, consisting of the interconnection of logic gates which transit through a set of states, is performed through the utilization of the associated state transition diagram. The set of allowed transitions in the state space leads to various classes of digital circuits such as shift registers, counters, flip-flops, etc. In one dimensional logic theory, various theorems on the decomposition and synthesis of Boolean functions are proved and are utilized in the logic synthesis of complex digital integrated circuits. In the practical implementation of such digital integrated circuits, semiconductor technology with devices such as diodes, transistors and field effect transistors was effectively utilized.

The design and implementation of complex digital integrated circuits led to the development of highly sophisticated computers and computer systems serving various practical applications. Some practical applications such as those in medical imaging, remote sensing and pattern recognition led to the design and implementation of various types of parallel computers. These computers operate on two dimensional arrays of zeroes and ones. But the processing units in these computers treat the two dimensional array elements as those from one dimensional arrays. Thus, the two dimensional nature of an array with a dependency structure is never capitalized on. This limitation led the author to innovate information processing units which operate on two/multidimensional arrays. Such information processing units should necessarily be based on sub-units which operate on arrays of binary data and produce binary arrays. These sub-units constitute the two/multidimensional logic circuits. A more general class of information processing sub-units, and thus units, operates on arrays whose entries are allowed to assume multiple (not necessarily binary) values.

Automata which operate on multidimensional arrays to perform a desired operation can be defined heuristically in many ways. In some applications, such as 3-d array/image processing, the information processing operation can only be defined heuristically based on the required function. But a more organized approach to define multidimensional logic functions is discovered and formalized by the author. In this chapter, the author describes the mathematical formalization for multidimensional logic units. The relationship between multidimensional logic units and multidimensional neural networks is also discussed. The generalization of the results to infinite dimensions is also briefly described.

Two dimensional neural networks were utilized by various researchers working in the area of neural networks. The application of two dimensional neural networks to various real world problems was also extensively studied. But an effective mathematical abstraction for modeling two/multi/infinite dimensional neural networks was lacking. The author in this chapter demonstrates that tensors provide a mathematical abstraction to model multi/infinite dimensional neural networks.

    The contents of this chapter are summarized as follows:

A mathematical model of an arbitrary multidimensional neural network is developed. A convergence theorem for an arbitrary multidimensional neural network represented by a fully symmetric tensor is stated and proved. The input and output signal states of a multidimensional logic gate/neural network are related through an energy function, defined over the fully symmetric tensor representing the multidimensional logic gate, such that the minimum/maximum energy states correspond to the output states of the logic gate realizing a logic function. Similarly, a logic circuit consisting of the interconnection of logic gates, represented by a symmetric tensor, is associated with a quadratic/higher degree energy function. Multidimensional logic synthesis is described. Infinite dimensional logic theory and logic synthesis are briefly discussed through the utilization of infinite dimension/order tensors.

This chapter is organized as follows. In section 2, a mathematical model of an arbitrary multidimensional neural network and the associated terminology is developed. In section 3, a convergence theorem for an arbitrary multidimensional neural network is proved. In section 4, the input/stable states of a multidimensional neural network are associated with the input/output signal states of a multidimensional logic gate. A mathematical model of an arbitrary multidimensional logic gate/circuit is described. Thus, multidimensional logic theory and logic synthesis are formalized. In section 5, infinite dimensional logic theory and logic synthesis are described. In section 6, the relationship between multidimensional neural networks, multidimensional logic theories and various constrained static optimization problems is elaborated. Various constrained optimization problems that commonly arise are listed. Various innovative ideas in multidimensional neural networks are briefly described. The chapter concludes with a set of conclusions.


    2.2 MATHEMATICAL MODEL OF MULTIDIMENSIONAL NEURAL NETWORKS

A discrete time multidimensional neural network paradigm is a dynamical system evolving in discrete time. It can be represented by a weighted connectionist structure in multidimensions. Thus, there is a weight attached to each edge of the connectionist structure in multidimensions and a threshold value attached to each node. At each node of the connectionist structure, a certain algebraic threshold function is computed.

It is well known in the theory of one dimensional neural networks that a symmetric matrix can be utilized to represent a one dimensional neural network. With the motivation of applications of one dimensional neural networks, two dimensional neural networks were heuristically designed and utilized for various applications. But the author for the first time realized that the tensor is the most natural mathematical abstraction that can be utilized to represent two/multidimensional neural networks.

Multidimensional Neural Networks: Tensors

Before describing the mathematical model of multidimensional neural networks, the following discussion on tensors and associated concepts is very relevant.

It is important to realize that, given n independent variables X_1, X_2, \ldots, X_n, the expression

    \sum_{i=1}^{n} C_i X_i                                                  (2.1)

is called a homogeneous linear form in the variables, the expression

    \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} X_i X_j                            (2.2)

is called a homogeneous quadratic form, the expression

    \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} C_{ijk} X_i X_j X_k        (2.3)

is called a homogeneous form (BoT) of degree three, and so on. Given the components of a tensor of order n and dimension m, it is possible to define a homogeneous form of degree n.
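As an aside (an illustration added here, not part of the book), the homogeneous forms (2.1)-(2.3) are conveniently evaluated as tensor contractions, e.g. with numpy's einsum; the array names and sizes below are arbitrary:

```python
import numpy as np

n = 4
X = np.random.choice([-1.0, 1.0], size=n)    # the independent variables

c1 = np.random.randn(n)                      # order-1 tensor -> linear form (2.1)
c2 = np.random.randn(n, n)                   # order-2 tensor -> quadratic form (2.2)
c3 = np.random.randn(n, n, n)                # order-3 tensor -> cubic form (2.3)

linear    = np.einsum('i,i->', c1, X)
quadratic = np.einsum('ij,i,j->', c2, X, X)
cubic     = np.einsum('ijk,i,j,k->', c3, X, X, X)
print(linear, quadratic, cubic)
```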

The connection structure of a one dimensional neural network, the symmetric matrix, is naturally associated with a homogeneous quadratic form as the energy function, which is optimized over the one dimensional hypercube. Thus, in one dimension, to utilize a homogeneous form of degree n as the energy function, a generalized neural network is employed, in which, at each neuron, an arbitrary algebraic threshold function is computed. But, in multidimensions, to describe the connection structure of a neural network, a tensor is necessarily utilized.


With the above description of the necessity of tensors to represent generalized/multidimensional neural networks, some notation related to tensors is provided to facilitate the description of the mathematical model of an arbitrary multidimensional neural network.

Tensors, Tensor Products

Matrices are utilized to represent quadratic forms, whereas tensors are necessary to represent a homogeneous form of degree n.

Suppose one second order tensor is a linear function of another second order tensor, i.e.

    A_{ik} = \lambda_{iklm} B_{lm}        (summation over the repeated indices l, m)        (2.4)

where \lambda_{iklm} is a set of k^4 coefficients. It is easy to see that \lambda_{iklm} is a tensor of dimension k and order 4. This is illustrative of the linear transformation of tensors.

Now, we discuss some concepts in the multiplication of tensors.

Let A_{ik} and B_{lm} be the components of two second order tensors. Consider all possible products of the form

    C_{iklm} = A_{ik} B_{lm}                                                (2.5)

Then, the numbers C_{iklm} are the components of a fourth-order tensor, called the outer product of the tensors with components A_{ik} and B_{lm}.

Multiplication of any number of tensors of arbitrary order is defined similarly (BoT), i.e. the product of two or more tensors is the product of the components of the tensors which are factors. The order of a tensor product is clearly the sum of the orders of the factors.

Contraction of Tensors: The operation of summing a tensor of order n (n > 2) over two of its indices is called contraction. It is clear that contraction of a tensor of order n leads to a tensor of order n - 2. Such a tensor can be repeatedly contracted to arrive at a tensor of order one or a scalar, depending on whether n is odd or even.

The result of multiplying two or more tensors and then contracting the product with respect to indices belonging to different factors is often called an inner product of the given tensors.

Thus, based on the notation associated with the indices, it is understood from the context whether the inner product or the outer product of tensors is utilized.
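As a quick numerical illustration of this notation (added here; not from the book), the outer product, contraction and inner product of second order tensors can be written directly as einsum expressions:

```python
import numpy as np

A = np.random.randn(3, 3)                  # second order tensor A_ik
B = np.random.randn(3, 3)                  # second order tensor B_lm

outer = np.einsum('ik,lm->iklm', A, B)     # outer product: order-4 tensor C_iklm
contr = np.einsum('iklk->il', outer)       # contraction over one pair of indices: order 2
inner = np.einsum('ik,ik->', A, B)         # multiply, then contract over both index pairs:
                                           # an inner product of A and B (a scalar)
print(outer.shape, contr.shape, inner)
```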

With the above requisite notation from tensor algebra summarized, before describing a mathematical model of an arbitrary multidimensional neural network, the following intuitive discussion is provided to facilitate easier understanding.

The state of a neuron at the discrete time instant n+1 is computed by summing the contributions from other neurons connected to it through synaptic weights, which are the components of a fully symmetric tensor S representing the connection structure, and the state tensor of neuronal states at the time instant n. Thus, we first compute the outer product of the connection tensor and the state tensor of neurons at the time instant n and perform the contraction over all the indices (representing the neurons) connected to a chosen neuron. This inner product operation, followed by determining its sign/parity/polarity (positive or negative value), gives us the state tensor at time instant n+1. This procedure is repeated at all the neurons where the state is updated.

    Remark

Throughout this chapter, the notation multidimensional neural network is utilized. The standard notation associated with tensors utilizes the term dimension to represent the number of values an independent variable can assume, and the term order to represent the number of independent variables. Thus, the order of the state tensor represents the number of independent dimensions in the multidimensional neural network MN. The notational confusion between the usage of the terms order and dimension should be resolved from the context.

    Mathematical Model Description

Let MN be a multidimensional neural network of dimension m and order n; then MN is uniquely specified by (S, T), where (the number of neurons in each independent variable/dimension/order index is m):

S is a fully symmetric tensor of order 2n and dimension m. S, the connection structure of the multidimensional neural network, is a fully symmetric tensor in the following sense:

    S_{i_1, i_2, \ldots, i_n;\, j_1, j_2, \ldots, j_n} = S_{j_1, j_2, \ldots, j_n;\, i_1, i_2, \ldots, i_n}        (2.6)

for all {i_1, i_2, \ldots, i_n}, {j_1, j_2, \ldots, j_n}. This captures the intuitive notion that the multidimensional neural network has nodes which correspond to the multidimensional neurons. The connectionist structure of the network, in the fully connected case, has a synaptic connection from every neuron to every other neuron and thus specifies the number of order indices/dimensions/variables of the fully symmetric tensor. Furthermore, it is fully symmetric since there is a link between any two nodes and the weight attached to the link is the same in both directions.

T is a tensor compatible with S such that each component is the threshold at the node (i_1, i_2, \ldots, i_n) of the multidimensional neural network.

Every node (multidimensional neuron) can be in one of two possible states, either +1 or -1. The state of node (i_1, i_2, \ldots, i_n) at time t is denoted by X_{i_1, i_2, \ldots, i_n}(t). The state of MN at time t is the tensor X_{i_1, i_2, \ldots, i_n}(t), where X is a tensor of dimension m and order n. The state evolution at node (i_1, i_2, \ldots, i_n) is computed by

    X_{i_1, i_2, \ldots, i_n}(t+1) = Sign( H_{i_1, i_2, \ldots, i_n}(t) ),        (2.7)

where

    H_{i_1, \ldots, i_n}(t) = \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n} X_{j_1, \ldots, j_n}(t) - T_{i_1, \ldots, i_n}(t)        (2.8)

The next state of the network X_{i_1, \ldots, i_n}(t+1) is computed from the current state by performing the evaluation (2.7) at a subset of the nodes of the multidimensional neural network, to be denoted by G. The modes of operation of the network are determined by the method by which the subset G is selected in each time interval.

If the computation is performed at a single node in any time interval, i.e. |G| = 1, then we will say that the network is operating in a serial mode, and if |G| = m^n, then we will say that the network is operating in a fully parallel mode. All other cases, i.e. 1 < |G| < m^n, will be called parallel modes of operation. Unlike a one dimensional neural network, a multidimensional neural network lends itself to various parallel modes of operation. It is possible to choose G to be the set of neurons placed in each independent dimension or a union of such sets. The set G can be chosen at random or according to some deterministic rule. A state of the network is called stable if and only if

    X_{i_1, \ldots, i_n}(t) = Sign( (S \otimes X(t))_{i_1, \ldots, i_n} - T_{i_1, \ldots, i_n} )        (2.9)

where \otimes denotes the inner product, i.e. the outer product followed by contraction over the appropriate indices. Once the network reaches such a state, there is no further change in the state of the network, no matter what the mode of operation is.
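To make the model concrete, here is a small numerical sketch (added for illustration; the array shapes, names and the choice Sign(0) = +1 are assumptions made here, not the book's) of a network with order n = 2 and dimension m, whose connection tensor S has order 2n = 4. It implements the update (2.7)-(2.8) and the stability test (2.9) as einsum contractions:

```python
import numpy as np

m = 3                                          # dimension m, order n = 2: states are m x m tensors
rng = np.random.default_rng(0)

# Fully symmetric connection tensor S of order 4: S[i1,i2,j1,j2] = S[j1,j2,i1,i2].
S = rng.standard_normal((m, m, m, m))
S = 0.5 * (S + S.transpose(2, 3, 0, 1))
T = np.zeros((m, m))                           # thresholds (taken as zero here)

def local_field(S, X, T):
    """H: inner product of S with the state tensor X, minus thresholds (eq. 2.8)."""
    return np.einsum('ijkl,kl->ij', S, X) - T

def update_node(S, X, T, node):
    """Serial-mode update (eq. 2.7) at a single node (i1, i2)."""
    X = X.copy()
    X[node] = 1.0 if local_field(S, X, T)[node] >= 0 else -1.0
    return X

def is_stable(S, X, T):
    """Stability test (eq. 2.9): every node already agrees with the sign of its field."""
    return np.all(X == np.where(local_field(S, X, T) >= 0, 1.0, -1.0))

X = rng.choice([-1.0, 1.0], size=(m, m))       # initial bipolar state tensor
print(is_stable(S, X, T))
X = update_node(S, X, T, (0, 1))
```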

2.3 CONVERGENCE THEOREM FOR MULTIDIMENSIONAL NEURAL NETWORKS

Utilizing a fully symmetric tensor to represent the connection structure of a multidimensional neural network, and utilizing the notation of tensor products, in the following the convergence theorem for an arbitrary multidimensional neural network is stated and proved.

Theorem 2.1: Let MN = (S, T) be a multidimensional neural network of dimension m and order n, where S is a fully symmetric tensor of order 2n and dimension m with S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} \geq 0. The network MN always converges to a stable state while operating in the serial mode (i.e. there are no cycles in the state space) and to a cycle of length at most 2 while operating in a fully parallel mode (i.e. the cycles in the state space are of length at most 2).

    in a fully parallel mode (i.e. the cycles in the state space are of length 2).Proof: Serial mode of operation of the multidimensional neural network is first considered.In this mode of operation, during each time step of the operation of the neural network,the state of only one neuron is updated. In other words, the state of each neuron is onlyupdated serially. At each multidimensional neuron in the networkMN, the total synaptic

  • 8/8/2019 Multidimensional Neural Networks Unified Theory Rama Murthy_NEW AGE_2007

    28/168

    Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory 15

    contribution from all neurons is first determined and its sign is determined to arrive at the

    updated state of the neuron. Mathematically, this is achieved by computing the outerproduct of the fully symmetric tensor Sand the {+1, 1} state tensor of the multidimensionalneural network. In tensor notation, this is specified by

    =1,..., ; 1,..., 1,..., ; 1,..., 1,..., .i in j jn i in j jn j jnC S X (2.10)

    The total synaptic contribution at any neuron located at the location ( i1, i2,..., in)isdetermined by contracting the above outer product over all the indices {j1,j2,...,jn} i.e.over all the neurons connected to it through the synaptic weights determined by thecomponents of the fully symmetric tensor S. The resultant scalar synaptic contributionat any neuron (i1, i2,..., in) is thus determined by the inner product operation. The sign ofthe resulting scalar constitutes the updated state of neuron. Thus, the state of any neuron(i1, i2,..., in) in the multidimensional neural network in the serial mode of operation is

    given by

    = =

    + = 1, 2,..., 1,..., ; 1,..., 1,...,1 1 1

    ( 1) ( ... ( ) )m m

    i i in i in j jn i inj jn

    X k Sign C k T (2.11)

    = ( ( ) )Sign S X k T (2.12)

    where is utilized as the symbol to denote the inner product between compatible tensors.This symbol is sometimes suppressed and it should be understood from the context whetherinner product/outer product between the tensors is meant.

With the state updating scheme in the tensor notation specified, the energy function that is optimized in the network MN is described. It is given by

    E( X(k), S ) = \langle X(k), S \otimes X(k) \rangle = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n} X_{i_1, \ldots, i_n}(k) X_{j_1, \ldots, j_n}(k)        (2.13)

where \langle \cdot , \cdot \rangle denotes the inner product operator between compatible tensors. It is assumed in the above specification of the energy function of the neural network MN that the threshold at each neuron is zero. This is no loss of generality, since by augmenting the tensor S and the state tensor, the threshold values can be forced to be zero. It is easy to see that such a thing can always be done by considering a one dimensional neural network in which the threshold at each neuron is non-zero and arriving at a network in which the threshold at each neuron can be made zero by augmenting the state vector as well as the connection matrix.
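A one-line numerical version of the energy (2.13) (added here for illustration, with zero thresholds and the same array shapes as the earlier sketch) is simply a full contraction of S against two copies of the state tensor:

```python
import numpy as np

def energy(S, X):
    """E(X, S) = <X, S (inner product) X>, eq. (2.13), for an order-2 state tensor X."""
    return float(np.einsum('ijkl,ij,kl->', S, X, X))

# Example with arbitrary sizes: m = 3, n = 2, so S has order 2n = 4.
rng = np.random.default_rng(1)
S = rng.standard_normal((3, 3, 3, 3))
S = 0.5 * (S + S.transpose(2, 3, 0, 1))       # enforce the full symmetry (2.6)
X = rng.choice([-1.0, 1.0], size=(3, 3))
print(energy(S, X))
```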

Utilizing the definition of the above energy function of the network, let \Delta E = E(t+1) - E(t) (the discrete time index t is used instead of k) be the difference in the energy associated with two consecutive states (transited in the serial mode of operation of the multidimensional neural network), and let \Delta X_{i_1, \ldots, i_n} denote the difference between the next state and the current state of the node at location (i_1, i_2, \ldots, i_n) at some arbitrary time t. Clearly,

    \Delta X_{i_1, \ldots, i_n} =  0, if X_{i_1, \ldots, i_n}(t) = Sign( H_{i_1, \ldots, i_n}(t) )
    \Delta X_{i_1, \ldots, i_n} = -2, if X_{i_1, \ldots, i_n}(t) = +1 and Sign( H_{i_1, \ldots, i_n}(t) ) = -1        (2.14)
    \Delta X_{i_1, \ldots, i_n} = +2, if X_{i_1, \ldots, i_n}(t) = -1 and Sign( H_{i_1, \ldots, i_n}(t) ) = +1

By assumption, the computation (2.14) is performed only at a single node at any given time. Suppose this computation is performed at an arbitrary node at location (i_1, i_2, \ldots, i_n); then the difference in energy resulting from updating the network state is given by

    \Delta E = \Delta X_{i_1, \ldots, i_n} ( \sum_{(j_1, \ldots, j_n) \neq (i_1, \ldots, i_n)} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n} X_{j_1, \ldots, j_n} + \sum_{(j_1, \ldots, j_n) \neq (i_1, \ldots, i_n)} S_{j_1, \ldots, j_n;\, i_1, \ldots, i_n} X_{j_1, \ldots, j_n} )
             + S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} ( 2 X_{i_1, \ldots, i_n} \Delta X_{i_1, \ldots, i_n} + (\Delta X_{i_1, \ldots, i_n})^2 ) - 2 \Delta X_{i_1, \ldots, i_n} T_{i_1, \ldots, i_n}        (2.15)

Utilizing the fact that S is fully symmetric and the definition of H_{i_1, \ldots, i_n}(t), it follows that

    \Delta E = 2 \Delta X_{i_1, \ldots, i_n} H_{i_1, \ldots, i_n} + S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} (\Delta X_{i_1, \ldots, i_n})^2        (2.16)

Hence, since \Delta X_{i_1, \ldots, i_n} H_{i_1, \ldots, i_n} \geq 0 and S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} \geq 0, it follows that at every time instant \Delta E \geq 0. Thus, since the energy E is bounded from above by the appropriate norm of S, the value of the energy will converge. Now, it is proved in the following that convergence of the energy implies convergence to a stable state.

Once the energy in the network has converged, it is clear from the following facts that the network will reach a stable state after at most m^{2n} time intervals:

(a) if \Delta X = 0, then it follows that \Delta E = 0;

(b) if \Delta X \neq 0, then \Delta E = 0 only if the change in X_{i_1, \ldots, i_n}(t) is from -1 to +1, with H_{i_1, \ldots, i_n} = 0.

In the fully parallel mode of operation of the network MN, the state updating scheme for the state tensor of MN is given by

    X_{i_1, \ldots, i_n}(t+1) = Sign( (S \otimes X(t))_{i_1, \ldots, i_n} - T_{i_1, \ldots, i_n} )        (2.17)

where \otimes denotes the inner product between compatible tensors. Since the serial mode proof shows that a stable state is always reached with the above stated updating scheme, it is immediate that by pairwise flipping of the values of any two dimension variables in the state tensor, the same energy function value is attained. This, in turn, implies that in the parallel mode of operation of a multidimensional neural network, either a stable state is reached or a cycle of length at most 2 is reached (the two state tensors lead to the same value of the energy function). This approach to the proof for the parallel mode of operation follows the one provided in reference [Br G].  Q.E.D.
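The convergence argument can be checked numerically. The sketch below (an illustration added here, reusing the conventions of the earlier snippets) runs serial-mode updates with non-negative "diagonal" entries S_{i_1,...,i_n; i_1,...,i_n}, as Theorem 2.1 requires, and verifies that the energy (2.13) never decreases until a stable state is reached:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 3
S = rng.standard_normal((m, m, m, m))
S = 0.5 * (S + S.transpose(2, 3, 0, 1))           # full symmetry (2.6)
for i in range(m):                                 # non-negative diagonal entries,
    for j in range(m):                             # as required by Theorem 2.1
        S[i, j, i, j] = abs(S[i, j, i, j])

energy = lambda X: np.einsum('ijkl,ij,kl->', S, X, X)
field = lambda X: np.einsum('ijkl,kl->ij', S, X)

X = rng.choice([-1.0, 1.0], size=(m, m))
energies = [energy(X)]
for sweep in range(20):                            # serial mode: one node per time step
    for node in np.ndindex(m, m):
        X[node] = 1.0 if field(X)[node] >= 0 else -1.0
        energies.append(energy(X))

print(np.all(np.diff(energies) >= -1e-9))                      # True: energy is non-decreasing
print(np.array_equal(X, np.where(field(X) >= 0, 1.0, -1.0)))   # True: a stable state was reached
```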


    2.4 MULTIDIMENSIONAL LOGIC THEORY, LOGIC SYNTHESIS

One dimensional logic theory as well as logic synthesis deal with information processing logic gates and logic circuits which operate on one dimensional arrays of zeroes and ones (or, more generally, one dimensional arrays containing finitely many symbols). The operations performed by AND, OR, NOR, NAND, XOR gates have an appropriate intuitive interpretation in terms of the entries of the one dimensional arrays, i.e. vectors. Any effort to generalize the one dimensional logic operations to multidimensions leads to various heuristic possibilities and requires considerable ingenuity in formalizing a definition. But, in the following, utilizing the multidimensional neural network model described above, a formal/mathematical approach to multidimensional logic theory is described.

The input and output signal states of a multidimensional logic gate are related through an energy function. Equivalently, the multidimensional logic functions are associated with the local optima of various energy functions defined over the set of input m-d arrays. In view of the mathematical model of a multidimensional neural network described in section 2.2, it is most logical to define the minimum/maximum energy states of a multidimensional neural network (optimizing an energy function over the multidimensional hypercube) to correspond to the multidimensional logic gate functions operating on the input arrays.

    Definition 2.1

A multidimensional logic function realized through a multidimensional logic gate (with inputs and outputs) is defined to be the local minimum/maximum of the energy function of an associated multidimensional neural network.

Equivalently, the local optima of the energy function of a multidimensional neural network correspond to the logic functions that are realized through various logic gates. The following detailed description is provided to consolidate the above definition, vital to multidimensional logic theory.

The logic functions which operate on the input array are identified to be the stable states of a multidimensional neural network (in multiple independent variables, i.e. time, space, etc.). These are the transformations between a set of input states of a multidimensional neural network which converge to a stable state on iteration of the multidimensional neural network. In other words, in multiple independent variables, the mapping between the input states and the stable states to which the network converges on iteration is defined to be the logic function realized by a multidimensional logic gate.
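The definition can be made tangible with a brute-force sketch (added here; the tiny sizes are chosen only so the hypercube can be enumerated): iterate a small multidimensional network from every possible input tensor and record the stable state it converges to. The resulting input-to-stable-state mapping is what Definition 2.1 calls a multidimensional logic function.

```python
import itertools
import numpy as np

m = 2                                              # 2 x 2 bipolar state tensors: 16 inputs
rng = np.random.default_rng(3)
S = rng.standard_normal((m, m, m, m))
S = 0.5 * (S + S.transpose(2, 3, 0, 1))            # fully symmetric connection tensor
for i in range(m):
    for j in range(m):
        S[i, j, i, j] = abs(S[i, j, i, j])         # non-negative diagonal (Theorem 2.1)

def converge(X):
    """Serial-mode iteration until a stable state (guaranteed by Theorem 2.1)."""
    X = X.copy()
    while True:
        changed = False
        for node in np.ndindex(m, m):
            new = 1.0 if np.einsum('ijkl,kl->ij', S, X)[node] >= 0 else -1.0
            if new != X[node]:
                X[node], changed = new, True
        if not changed:
            return X

# The multidimensional "logic function": input state tensor -> stable state tensor.
logic_function = {}
for bits in itertools.product([-1.0, 1.0], repeat=m * m):
    X_in = np.array(bits).reshape(m, m)
    logic_function[bits] = tuple(converge(X_in).ravel())

print(len(set(logic_function.values())), "distinct stable states (output arrays)")
```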

By the proof of the convergence theorem, the logic functions are invariants of a tensor on the multidimensional hypercube. The definition of a multidimensional logic function is illustrated in Figure 2.1.

In the case of one dimensional logic theory, it has been shown that the set of stable states of a neural network corresponds to various one dimensional logic functions (CAB).

With the definition of a multidimensional logic function stated and clarified above, multidimensional logic synthesis is described in the following.


    Multidimensional Logic Synthesis

    A multidimensional logic circuit consists of an arbitrary interconnection ofmultidimensional logic gates. Multidimensional logic synthesis, as in one dimension,involves synthesizing logic circuits for different purposes.

    In view of the above definition of multidimensional logic functions defined throughthe local optima of energy functions (realized through multidimensional neural networks),it is natural to see if it is possible to associate energy functions with multidimensionallogic circuits. When such a scalar valued energy function can be associated with logiccircuits, the problem of multidimensional logic synthesis, is reduced to realizing such energyfunctions. In the following, this important idea is developed.

A multidimensional logic circuit consists of an interconnection of multidimensional logic gates. The interconnection structure of a single multidimensional logic gate is represented by a fully symmetric tensor. Since every two gates in a logic circuit need not necessarily be connected to one another, the connection structure of a multidimensional logic circuit is represented by a tensor of necessary/compatible order which is not necessarily fully symmetric but is required to be minimally symmetric. Thus, this block symmetric tensor, which is fully symmetric within the blocks (each block representing the connection structure of the multidimensional neural network corresponding to a component logic gate), provides a representation of a multidimensional logic circuit. This tensor is utilized to associate quadratic/higher degree energy functions with the multidimensional logic circuit. The set of local optima of these energy functions constitutes the stable states of one or more interconnected logic gates. Thus, the set of input states (input pins) and output states (output pins) of an entire multidimensional logic circuit are related through an energy function defined over the connection structure of a very high dimensional neural network. The local optima of the energy function relating the input and output pins of a multidimensional logic circuit realize various multidimensional logic functions.

From the above description, it is evident that multidimensional logic synthesis depends on how the multidimensional logic gates are connected to one another. The structure of interconnection determines the structure of the symmetric tensor representing the multidimensional logic circuit. The essential result in multidimensional logic synthesis is summarized through the following theorem.

Theorem 2.2: Given a multidimensional logic circuit, there exists a block symmetric tensor S representing the interconnection structure of the multidimensional neural networks (modeling the multidimensional logic gates). The mapping between the input and output states of the multidimensional logic circuit corresponds to the mapping between input tensors and the local optima of the energy function (quadratic/higher degree) represented by the block symmetric tensor. The stable states of the interconnected multidimensional neural networks represent the multidimensional logic functions synthesized by the logic circuit.
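As a rough illustration of the block symmetric structure in Theorem 2.2 (the gate sizes, the flattening of each gate's connection tensor into a symmetric matrix, and the uncoupled block layout are assumptions made for this sketch, not constructions from the text), the circuit level connection structure can be assembled block by block and a quadratic energy evaluated over a joint state of all pins.

    # A minimal sketch: two component gates, each represented by a (flattened)
    # fully symmetric connection matrix, assembled into a block symmetric
    # circuit structure.  The gates are left uncoupled here; coupling between
    # interconnected gates would occupy the off-diagonal blocks.
    import numpy as np

    rng = np.random.default_rng(1)
    n1, n2 = 4, 9
    gate1 = rng.standard_normal((n1, n1)); gate1 = (gate1 + gate1.T) / 2.0
    gate2 = rng.standard_normal((n2, n2)); gate2 = (gate2 + gate2.T) / 2.0

    S_circuit = np.zeros((n1 + n2, n1 + n2))
    S_circuit[:n1, :n1] = gate1            # block of gate 1 (fully symmetric)
    S_circuit[n1:, n1:] = gate2            # block of gate 2 (fully symmetric)

    x = rng.choice([-1.0, 1.0], size=n1 + n2)    # joint state of all pins
    print(x @ S_circuit @ x)                     # quadratic energy of the circuit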


The proof of Theorem 2.2 follows from the convergence theorem and is omitted for brevity.

The classification of multidimensional logic circuits is based on the type of transitions allowed between the states in the multidimensional state space. The types of state transitions are classified as follows:

(a) whether the next state reached depends only on the past state or not, as in one dimensional logic synthesis,

(b) the type of neighbourhood of states about the current state on which the next state depends. The types of neighbourhoods about the current state are classified into a few classes, similar to those utilized in the theory of random fields and in multidimensional image processing,

(c) the classification of trajectories traversed by the multidimensional neural network or a local optimum computing circuit/scheme.

In the above discussion, we considered quadratic forms as the energy functions

(motivated by the simplest possible neural network model) optimized by the logic gates, which when connected together lead to logic circuits. This approach toward multidimensional logic theory motivates the definition of more general switching/logic functions as the local optima of higher degree forms over various subsets of the multidimensional lattice (hypercube, bounded lattice etc.).

    Definition 2.2

A generalized logic function (representing a generalized logic gate or a generalized logic circuit) is defined as a mapping between an m-dimensional input array and a local optimum of a tensor based form of degree greater than or equal to two, over various subsets of the multidimensional lattice (the multidimensional hypercube, the multidimensional bounded lattice). These local optima of a higher degree form (based on a tensor) are realized through the stable states of a generalized multidimensional neural network.

In (Rama 3), it is shown that the strictly generalized logic function defined above has better properties than the ordinary logic function described in Definition 4.1. The generalized logic function is related to a multidimensional encoder utilized for communication through multidimensional channels.
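As an informal one dimensional illustration of Definition 2.2 (the degree four tensor W, the greedy single-coordinate search and the sizes below are assumptions made for this example, not the construction used in (Rama 3)), a generalized gate can be sketched as a map from an input vector to a local maximum of a degree four form over the hypercube.

    # A minimal sketch of a "generalized" gate in one dimension: the energy is a
    # degree-4 form E(x) = sum_{i,j,k,l} W[i,j,k,l] x_i x_j x_k x_l, and the gate
    # maps an input vector to a local maximum reached by greedy single-bit flips.
    import numpy as np

    def degree4_energy(W, x):
        return np.einsum('ijkl,i,j,k,l->', W, x, x, x, x)

    def greedy_local_max(W, x0):
        x = x0.astype(float)
        improved = True
        while improved:
            improved = False
            for i in range(len(x)):
                e0 = degree4_energy(W, x)
                x[i] = -x[i]                      # try flipping coordinate i
                if degree4_energy(W, x) > e0:
                    improved = True               # keep the flip
                else:
                    x[i] = -x[i]                  # undo the flip
        return x

    rng = np.random.default_rng(3)
    m = 5
    W = rng.standard_normal((m, m, m, m))         # illustrative degree-4 tensor
    x_in = rng.choice([-1, 1], size=m)
    print(greedy_local_max(W, x_in))              # a local maximum over {-1, +1}^m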

Now, with the generalized multidimensional logic gate defined above, logic synthesis with these types of logic gates involves interconnecting them in a certain topology.

Both the ordinary and the generalized approaches to multidimensional logic gate definition and logic synthesis are depicted in Figures 2.1 to 2.3. Detailed documentation on logic synthesis and the design of future information processing machines is being pursued.


Proof: A one dimensional neural network with a state vector of infinite size is uniquely defined by (S, T), where S is an infinite dimensional (rows as well as columns) symmetric matrix and T is an infinite dimensional vector of thresholds at all the neurons.

The state of the neural network at time t is a vector whose components are +1 and -1. The next state of a node is computed by

X_i(t+1) = \mathrm{Sign}(H_i(t)) = \begin{cases} +1, & \text{if } H_i(t) \ge 0 \\ -1, & \text{otherwise} \end{cases} \qquad (2.18)

where

H_i(t) = \sum_{j=1}^{\infty} S_{ij}\, X_j(t) - T_i \qquad (2.19)

The entries of S are such that the infinite sum in the above expression converges. The next state of the network, i.e. X(t+1), is computed from the current state by performing the evaluation (2.18) at a subset of the nodes of the network, to be denoted by K. The mode of operation of the network is determined by the method by which the set K is selected at each time interval, i.e. if |K| = 1, then we say that the network is operating in a serial mode. Without loss of generality, T = 0.

In the following, we consider the serial mode of operation. We argue that with the above stated updating scheme at an arbitrarily chosen neuron, the (quadratic) energy function does not decrease.

E(k) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k)\, X_j(k) \qquad (2.20)

Without loss of generality, consider the case where all the thresholds are set to zero. It is easy to see (set the last component of the state vector to 1 and augment the entries of S appropriately) that for any finite L, we have

\sum_{i=1}^{L} \sum_{j=1}^{L} S_{ij}\, X_i(k)\, X_j(k) \;\le\; \sum_{i=1}^{L} \sum_{j=1}^{L} S_{ij}\, X_i(k+1)\, X_j(k+1) \qquad (2.21)

by the convergence theorem for one dimensional neural networks of order L, for any arbitrary L. Now let L tend to infinity. Hence

\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k)\, X_j(k) \;\le\; \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} S_{ij}\, X_i(k+1)\, X_j(k+1) \qquad (2.22)

    Thus, in the serial mode, the network converges to a stable state.
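The finite truncation step (2.21) can also be checked numerically; the following sketch (the size L, the random symmetric matrix with nonnegative diagonal, and the convention Sign(0) = +1 are assumptions of this illustration) verifies that a serial sweep of sign updates never decreases the quadratic energy.

    # A small numerical check (not a proof) of the monotonicity step (2.21):
    # for a finite truncation of size L, one serial sweep of sign updates
    # never decreases the quadratic energy sum_ij S_ij X_i X_j.
    import numpy as np

    rng = np.random.default_rng(2)
    L = 20
    S = rng.standard_normal((L, L)); S = (S + S.T) / 2.0
    np.fill_diagonal(S, np.abs(np.diag(S)))      # nonnegative diagonal entries
    X = rng.choice([-1.0, 1.0], size=L)

    energy = lambda x: x @ S @ x
    for i in range(L):                           # serial mode: one node at a time
        e_before = energy(X)
        X[i] = 1.0 if S[i] @ X >= 0 else -1.0    # sign update of node i
        assert energy(X) >= e_before - 1e-9      # energy is non-decreasing
    print("serial sweep completed: energy never decreased")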

By the convergence theorem for one dimensional neural networks (with the state vector size finite) in the parallel mode of operation, if any finite set of nodes is state updated, there is either convergence or the existence of a cycle of length 2. Thus, when an infinite


dimensional vector is state updated in the parallel mode, then for every finite segment of it, either there is convergence or a cycle of length 2 (at most two vectors for which the energy values are the same) exists. Since the energy function associated with the infinite dimensional vector is the limit of those associated with the finite segments, it is evident that the scalar energy values converge or a cycle of length at most two exists. Q.E.D.

Now, we briefly discuss the other infinite dimensional neural networks, of dimension infinity and order finite/infinite (modeling tensor variables).

The following lemma is well known from set theory.

Lemma 2.1: A countable union of countable sets is countable.

The above lemma implies that the convergence theorem proved above, in association with the convergence theorem for multidimensional neural networks (its proof argument in section 3), provides the convergence proof for a large class of infinite dimensional neural networks (in which the dimension and/or order of the tensors utilized in modeling is infinite). Details on the convergence theorem for infinite dimensional neural networks are provided below.

The tensors utilized to represent the connection structure and the neuronal states of an infinite dimensional neural network are such that either the dimension or the order (or both) is infinite; they are not both finite.

In one dimension, when the number of neurons is infinite and a quadratic energy function is optimized through a neural network scheme, by a straightforward extension of the results in (Rama 3), the stable states of the neural network constitute a graph-theoretic code (with the length of the codewords being infinite). The set over which the optimization is carried out is the unbounded unit hypercube (a countable number of entries in the infinite dimensional state vector), a subset of the lattice (based on one independent variable).

The following theorem is concerned with the points on the lattice in multi/infinite dimensions. This theorem is the infinite dimensional extension of the result proved in section 3.

Theorem 2.4: Let MN = (S, T) be an infinite dimensional neural network of order n/infinite and dimension infinity (the number of neurons in each dimension). S is a fully symmetric tensor of dimension infinity and order 2n/infinite, with nonnegative diagonal entries S_{i_1,\ldots,i_n;\, i_1,\ldots,i_n} \ge 0. The network MN always converges to a stable state while operating in a serial mode (i.e., there are no cycles in the state space), while in the parallel mode the network will always converge to a stable state or to a cycle of length 2 (i.e., the cycles in the state space are of length at most 2).

Proof: For a multidimensional neural network modeled by a tensor of finite dimension and order, in the serial mode of operation the network always converges to a stable state. Since the quadratic energy function is a scalar value defined over the connection tensor (whose order and dimension are finite), by letting the dimension and/or order tend to infinity in (2.13), it is immediate that the energy function value increases in the serial mode until a stable state is reached, starting from a given initial state. Thus, for the various infinite


dimensional neural networks considered, convergence to a stable state in the serial mode of operation is ensured (i.e. there are no cycles in the state space).

In the parallel mode of operation of the infinite dimensional neural network, by the same reasoning as in Theorem 2.1, the network will always converge to a stable state or to a cycle of length 2, depending on the order of the network (i.e. the cycles in the state space are of length less than or equal to 2). Q.E.D.

As in the case of multidimensional logic theory, the above convergence theorem is utilized as the basis to describe infinite dimensional logic theory as well as logic synthesis. It should be noted that infinite dimensional logic synthesis has only theoretical importance. A brief discussion of the infinite dimensional versions is provided for the sake of completeness.

    Definition 2.3

An infinite dimensional logic function realized through an infinite dimensional logic gate (with inputs and outputs) is defined to be a local minimum/maximum of the energy function of an associated infinite dimensional neural network. Equivalently, the local optima of the energy function of an infinite dimensional neural network correspond to the logic functions that are realized through various logic gates.

With the above definition of an infinite dimensional logic function, detailed results in infinite dimensional logic synthesis are being developed along the lines of those in multidimensional logic synthesis. A brief description is provided in the following for the sake of completeness.

An infinite dimensional logic circuit consists of an arbitrary interconnection of infinite dimensional logic gates. Infinite dimensional logic synthesis, as in one dimension, involves synthesizing logic circuits for different purposes. These infinite dimensional logic circuits are only of theoretical interest. Infinite dimensional logic synthesis depends on how the infinite dimensional logic gates are connected to one another. The structure of interconnection determines the structure of the symmetric tensor (whose order and/or dimension is infinite) representing the infinite dimensional logic circuit.

2.6 NEURAL NETWORKS, LOGIC THEORIES, CONSTRAINED STATIC OPTIMIZATION

Multidimensional neural networks provide a computational paradigm to determine the local optima of quadratic as well as higher degree forms defined in terms of tensors (including matrices) over various subsets of the multidimensional lattice. These units, which map a multidimensional array/tensor to a local optimum (a stable state of the multidimensional neural network), thus constitute the multidimensional logic gates. An interconnection of such multidimensional logic gates constitutes a multidimensional logic circuit. Thus, multidimensional logic circuits are interconnected multidimensional neural


networks. The interconnection structure weights are represented through a symmetric tensor. Thus, multidimensional logic theory/logic synthesis is associated with the theory of multidimensional neural networks. These theories are in turn related to static optimization of various forms (quadratic as well as higher degree) over different subsets of the lattice and other sets.

Various constrained static optimization problems that are of interest in different applications (neural networks, logic theories etc.) are summarized below:

(1) Optimization of a quadratic form in finitely many variables over the one dimensional hypercube (one independent variable); see the illustrative sketch after this list,

(2) Optimization of a higher degree form in finitely many variables over the one dimensional hypercube (one independent variable),

(3) Optimization of a quadratic form over the infinite dimensional (size of the state vector) hypercube in one dimension,

(4) Optimization of a higher degree form over the infinite dimensional (size of the state vector) hypercube in one dimension,

(5) Optimization of a quadratic form over the finite/infinite dimensional hypercube in finitely/infinitely many dimensions,

(6) Optimization of a higher degree form over the finite/infinite dimensional hypercube in finitely/infinitely many dimensions,

(7) Optimization of a quadratic form over a bounded lattice in finitely/infinitely many dimensions,

(8) Optimization of a higher degree form over a bounded lattice in finitely/infinitely many dimensions,

(9) Optimization of a quadratic form over the unbounded lattice in finitely/infinitely many dimensions,

(10) Optimization of a higher degree form over the unbounded lattice in finitely/infinitely many dimensions.
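As an illustrative sketch of problem (1) (the matrix S, the size n and the brute-force enumeration are assumptions made for this example only), the points of the hypercube {-1, +1}^n that are local maxima of the quadratic form x^T S x under single-coordinate flips can be listed exhaustively for small n; these are exactly the stable states of the corresponding neural network.

    # Brute-force enumeration of the local maxima of a quadratic form over the
    # one dimensional hypercube {-1, +1}^n (feasible only for tiny n).
    import itertools
    import numpy as np

    rng = np.random.default_rng(4)
    n = 4
    S = rng.standard_normal((n, n)); S = (S + S.T) / 2.0
    np.fill_diagonal(S, 0.0)

    def is_local_max(x):
        e = x @ S @ x
        for i in range(n):
            y = x.copy(); y[i] = -y[i]           # flip one coordinate
            if y @ S @ y > e:
                return False
        return True

    for bits in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(bits)
        if is_local_max(x):
            print(bits, x @ S @ x)               # a stable state and its energy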

When the constraint set is the lattice (unbounded lattice) in finitely/infinitely many dimensions and the number of state variables is not finite but countable, the objective function is a power series each of whose terms is a quadratic/higher degree form. It is proved in (Rama 3) that some of these constrained optimization problems arise in the design of multi/infinite dimensional codes. In (Rama 4), the various optimization problems described above are utilized in a dynamic optimization setting.

In the following, various innovative themes in multi/infinite dimensional neural networks are briefly discussed.


    Continuous Time Neural Networks

The well known model of a neural network is a discrete time system in one or multiple dimensions. A signal design problem for optical/magnetic recording channels modeled as linear systems led to the discovery of continuous time neural networks (Rama 5). The state updating scheme of the continuous time neural network takes the following form:

X(t) = \mathrm{Sign}\left( \int_{0}^{T} R(t, s)\, X(s)\, ds \right) \qquad (2.23)

In that technical memorandum, the author for the first time associates energy functions with this state updating scheme. The multidimensional versions of these continuous time neural networks are discussed in (Rama 4).
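A discretized sketch of the update rule (2.23) is given below; the kernel R(t, s), the grid size and the stopping rule are illustrative assumptions, not choices taken from the memorandum.

    # A discretized sketch of (2.23): the integral over [0, T] is approximated
    # by a Riemann sum on a uniform grid, and the sign pattern of X is iterated
    # until it stops changing (or an iteration cap is hit).
    import numpy as np

    T, N = 1.0, 100
    t = np.linspace(0.0, T, N)
    dt = t[1] - t[0]
    R = np.exp(-np.abs(t[:, None] - t[None, :]))     # illustrative kernel R(t, s)

    X = np.sign(np.random.default_rng(5).standard_normal(N))
    for _ in range(100):
        X_new = np.sign(R @ X * dt)                  # Sign( integral R(t,s) X(s) ds )
        X_new[X_new == 0] = 1.0                      # convention: Sign(0) = +1
        if np.array_equal(X_new, X):
            break                                    # a fixed sign pattern is reached
        X = X_new
    print(X[:10])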

    Complex Neural Networks

Neural networks in which the entries of the connection structure as well as the state variables (indicating the binary states of the neurons) are complex valued have already been studied in one dimension. These results have corresponding multidimensional versions and parallel the results for real neural networks. They are aided by the fact that the quadratic form associated with a Hermitian symmetric matrix is always real, and thus the eigenvalues of a Hermitian symmetric matrix are always real.
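The fact invoked above is easy to confirm numerically; the following small sketch (with an arbitrary random Hermitian matrix) checks that the quadratic form x^H A x is real and that the eigenvalues are real.

    # A quick numerical check: for a Hermitian symmetric matrix A, the form
    # x^H A x is real for any complex vector x, and the eigenvalues are real.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = (B + B.conj().T) / 2.0                   # Hermitian symmetric matrix
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    q = x.conj() @ A @ x
    print(np.isclose(q.imag, 0.0))               # True: the quadratic form is real
    print(np.linalg.eigvalsh(A))                 # eigenvalues, returned as real numbers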

    Adaptive Neural Networks

These are neural networks in which the connection structure of the one/multidimensional neural network varies with a discrete/continuous time index. More explicitly, the connection tensor, whose elements constitute the synaptic weights between the neurons located in one/two/multiple dimensions, varies with the time index in some orderly (or random) manner. The analysis of such one/multidimensional neural networks is currently being pursued.

    2.7 CONCLUSIONS

A mathematical model of an arbitrary multidimensional neural network is described. This model is utilized to prove the convergence theorem for multidimensional neural networks. Utilizing the convergence theorem, multidimensional logic functions are defined and multidimensional logic synthesis is discussed. Infinite dimensional logic synthesis is briefly described. Various constrained static optimization problems of utility in control, communication, computation and other applications are summarized. Several innovative themes on one/multidimensional neural networks are outlined.


    REFERENCES

(BoT) A. I. Borisenko and I. E. Tarapov, Vector and Tensor Analysis with Applications, Dover Publications Inc., New York.

(BrG) J. Bruck and J. W. Goodman, A Generalized Convergence Theorem for Neural Networks, IEEE Transactions on Information Theory, Vol. 34, No. 5, September 1988.

(CAB) S. T. Chakradhar, V. D. Aggarwal and M. L. Bushnell, Neural Models and Algorithms for Digital Testing, Kluwer Academic Publishers.

(HoT) J. J. Hopfield and D. W. Tank, Neural Computations of Decisions in Optimization Problems, Biological Cybernetics, Vol. 52, pp. 141-152, 1985.

(Rama 1) Garimella Rama Murthy, Multi/Infinite Dimensional Logic Synthesis, Manuscript in Preparation.

(Rama 2) Garimella Rama Murthy, Unified Theory of Control, Communication and Computation - Part 1, Manuscript to be submitted to IEEE Proceedings.

(Rama 3) Garimella Rama Murthy, Multi/Infinite Dimensional Coding Theory: Multi/Infinite Dimensional Neural Networks: Constrained Static Optimization, Proceedings of the 2002 IEEE Information Theory Workshop, October 2002.

(Rama 4) Garimella Rama Murthy, Optimal Control, Codeword, Logic Function Tensors: Multidimensional Neural Networks, International Journal of Systemics, Cybernetics and Informatics, October 2006, pages 9-17.

(Rama 5) Garimella Rama Murthy, Signal Design for Magnetic and Optical Recording Channels: Spectra of Bounded Functions, Bellcore Technical Memorandum, TM-NWT-018026.


CHAPTER 3

Multi/Infinite Dimensional Coding Theory: Multi/Infinite Dimensional Neural Networks: Constrained Static Optimization

    3.1. INTRODUCTION

In recent years, technological developments in parallel data transfer mechanisms led to HIPPI (high performance parallel interface), SMMDS (switched multi-megabit data service) and FDDI (fiber distributed data interface). To match these high speed parallel data transfer mechanisms, multidimensional coding theory originated, and some ad hoc procedures were developed for designing linear as well as non-linear codes.

Multidimensional codes are utilized to encode arrays of symbols for transmission over a multidimensional communication channel. Thus, the central objective in multidimensional coding theory is to design codes that can correct many errors and whose encoding/decoding procedures are computationally efficient. A multidimensional error correcting code can be described by an energy landscape, with the peaks of the landscape being the codewords. The decoding of a corrupted codeword (array), which is a point in the energy landscape that is not a peak, is equivalent to looking for the closest peak in the energy landscape. An alternative way to describe the problem is to design a constellation, which consists of a set of points on a multidimensional lattice enclosed within a finite region, in such a way that a certain optimization constraint is satisfied.
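The "closest peak" view of decoding can be illustrated with a toy two dimensional code (the codebook below is a made-up 2 x 2 repetition code, not one from the text): for +/-1 arrays, maximizing the correlation with the received array is the same as minimizing the Hamming distance.

    # A toy sketch of maximum likelihood decoding as a search for the closest
    # codeword array; the codebook (a 2-D repetition code) is illustrative.
    import numpy as np

    codebook = [np.ones((2, 2)), -np.ones((2, 2))]   # the two codeword arrays
    received = np.array([[1.0, 1.0], [1.0, -1.0]])   # one entry corrupted

    def ml_decode(received, codebook):
        # over +/-1 arrays, maximum correlation <=> minimum Hamming distance
        return max(codebook, key=lambda c: float(np.sum(c * received)))

    print(ml_decode(received, codebook))             # recovers the all +1 codeword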

The neural network model, simulated annealing and relaxation techniques are some of the computational models (based on optimization) that have been attracting much interest because they seem to have properties similar to those of biological and physical systems. The standard computation performed in a neural network is the optimization of an energy function. The state space of a neuro-dynamical system can be described by the topography defined by the energy function associated with the network. The connection structure of a neural network can either be distributed on a plane or in multiple dimensions (Rama 2).

Thus, the field of multidimensional neural network theory and the field of multidimensional coding theory are linked through the common thread of optimization of



multivariate polynomials (tensor based) over various subsets of the multidimensional lattice. In a nutshell, multidimensional error correcting codes and multidimensional neural networks can both be associated with such polynomials.

In contrast to the traditional ad hoc attempts to design multidimensional codes by a generation of researchers, the author for the first time discovered and formalized the idea of utilizing the theory of tensor spaces to represent and study multidimensional error correcting codes. The theory of tensor spaces enables the design of codes in one dimension (encoding as well as decoding techniques) to be translated to multi/infinite dimensions.

Utilizing this representation, the author took a significant step forward in formally demonstrating the relationship between multidimensional neural networks, multidimensional codes and the optimization of multivariate polynomials/monomials over various subsets of the multidimensional lattice. This relationship provides new insights into the design of multidimensional encoders as well as decoders. Also, the relationships between concepts such as the minimum distance and the correctable errors of multidimensional codes can be derived through new proof arguments. Furthermore, the relationship enables the utilization of multidimensional decoding techniques for the optimization of multivariate polynomials over the multidimensional hypercube (and other subsets of the multidimensional lattice), a difficult problem that arises in various applied fields such as operations research, theoretical computer science etc. Also, utilizing the powerful techniques developed in these applied areas for such problems, new algorithms for maximum likelihood decoding of multidimensional error correcting codes can be designed.

    Thus, the results in this chapter are summarized in the following three paragraphs.

The concepts of multidimensional neural networks, multidimensional error correcting codes, and the optimization of quadratic/higher degree forms based on the components of a tensor (tensor component based multivariate polynomials) over various subsets of the multidimensional lattice are related from different viewpoints.

It is proved that, given a multidimensional linear block code, a neural network (generalized neural network) can be constructed in such a way that every local maximum of the energy function corresponds to a codeword tensor and every codeword tensor corresponds to a local maximum. It is shown that determining the global maximum of the energy function of a multidimensional neural network/generalized neural network is equivalent to performing maximum likelihood decoding in a linear block multidimensional code. The results are generalized to multidimensional non-linear as well as non-binary codes.

Theorems related to the optimization of tensor based multivariate polynomials (whose terms/monomials are based on the components of tensors) over arbitrary open/closed sets are proved. The infinite dimensional extension of the results is briefly discussed.

This chapter is organized as follows. In section 2, after briefly reviewing the theory of multidimensional neural networks, it is proved that finding the global optimum of the energy function of the network is equivalent to finding a minimum cut in a certain


graphoid, the connection structure of a multidimensional neural network. In section 2, a connection between the multidimensional neural network model and graphoid based codes is established. It is shown that maximum likelihood decoding in a graphoid based code is equivalent to finding a minimum cut in a certain graphoid. Thus, it is shown that maximum likelihood decoding in a graphoid based code is equivalent to finding a maximum of the energy function in a multidimensional neural network. In section 3, the results are extended to general multidimensional linear block codes. A general energy function, not necessarily quadratic, is defined based on the generator tensor of a given linear block code. It is proved that finding the global maximum of the energy function is equivalent to maximum likelihood decoding in the code. In section 3, it is also briefly discussed how infinite dimensional codes are represented through a generator tensor of infinite order/dimension (either the order or the dimension or both is infinite), the entries of which satisfy some regularity conditions, thus enabling the infinite dimensional versions of the results to be derived. In section 4, the energy function associated with the parity check tensor of the multidimensional linear block code is described. When the tensor is written in systematic form, it is shown that each codeword tensor corresponds to a local maximum of the multivariate polynomial associated with the parity check tensor and that each local maximum corresponds to a codeword tensor. The results are interpreted as the dual of those in the previous section for defining the Maximum Likelihood Decoding (MLD) problem. In section 5, the results are generalized to non-binary codes. Further, in section 6, the results are generalized to non-linear multidimensional codes. In section 7, by means of a decomposition principle, theorems related to the optimization of tensor based (based on the components of a tensor) multivariate polynomials over arbitrary open/closed sets are proved. Also, various innovative ideas on the utilization of the results in previous sections to derive very general results in static optimization are described. The chapter concludes with a summary of the results derived.

    The results in this chapter are exactly the multidimensional versions of those in (BrB).

3.2 MULTIDIMENSIONAL NEURAL NETWORKS: MINIMUM CUT COMPUTATION IN THE CONNECTION STRUCTURE: GRAPHOID CODES

A discrete time multidimensional neural network is a discrete time dynamical system represented by a weighted undirected connectionist structure in multiple dimensions. At each multidimensional neuronal element there is a threshold value which, on being crossed, fires the neuron. Each neuronal element computes an algebraic threshold function in the input variables.

Let MN be a multidimensional neural network of dimension m and order n; then MN is uniquely specified by (S, T), where (the number of neurons in each dimension being m, i.e. the number of values assumed by each independent dimension variable) S is a fully symmetric tensor of dimension m and order 2n, and T is a tensor of thresholds attached to the neuronal elements with compatible order (n) and dimension (m). Every node can be in one of two


possible states, +1 and -1. The state of node (i_1, i_2, \ldots, i_n) at time t is denoted by X_{i_1, i_2, \ldots, i_n}(t). The state of MN at time t is the tensor X_{i_1, i_2, \ldots, i_n}(t) of dimension m and order n. The state evolution at node (i_1, i_2, \ldots, i_n) is computed by

X_{i_1, i_2, \ldots, i_n}(t+1) = \mathrm{Sign}\left( H_{i_1, i_2, \ldots, i_n}(t) \right) \qquad (3.1)

where

H_{i_1, \ldots, i_n}(t) = \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1,\ldots,i_n;\, j_1,\ldots,j_n}\, X_{j_1, j_2, \ldots, j_n}(t) \; - \; T_{i_1,\ldots,i_n}(t)

The next state of the network, i.e. X_{i_1, i_2, \ldots, i_n}(t+1), is computed from the current state by performing the evaluation (3.1) at a subset of the nodes of the multidimensional neural network, to be denoted by G. The modes of operation are determined by the method by which the subset G is selected in each time interval. If the computation (3.1) is performed at a single node in any time interval, i.e. |G| = 1, then we say that the network is operating in the serial mode, and if |G| = m^n, then we say that the network is operating in the fully parallel mode. A state is called stable if and only if

X_{i_1, i_2, \ldots, i_n}(t) = \mathrm{Sign}\left( S_{i_1, \ldots, i_n} \odot X(t) - T_{i_1, \ldots, i_n} \right) \qquad (3.2)

where \odot denotes the inner product (the symbol is sometimes suppressed for notational brevity). Once a neural network has reached such a state, there is no change in the state of the network no matter what the mode of operation is.
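To make equations (3.1) and (3.2) concrete, the sketch below (dimension m = 3, order n = 2, a random connection tensor and zero thresholds; all of these are illustrative assumptions) computes the local field tensor with a single tensor contraction and tests whether a state is stable.

    # A minimal sketch of the update rule (3.1) and the stability test (3.2)
    # for order n = 2: state tensor X[i1, i2], connection tensor
    # S[i1, i2, j1, j2] and threshold tensor T[i1, i2].
    import numpy as np

    rng = np.random.default_rng(7)
    m = 3
    S = rng.standard_normal((m, m, m, m))
    S = (S + S.transpose(2, 3, 0, 1)) / 2.0      # full symmetry requirement
    T = np.zeros((m, m))
    X = rng.choice([-1.0, 1.0], size=(m, m))

    def local_field(S, X, T):
        # H[i1,i2] = sum_{j1,j2} S[i1,i2,j1,j2] X[j1,j2] - T[i1,i2]
        return np.einsum('abcd,cd->ab', S, X) - T

    def is_stable(S, X, T):
        H = local_field(S, X, T)
        return np.array_equal(np.where(H >= 0, 1.0, -1.0), X)

    X_next = np.where(local_field(S, X, T) >= 0, 1.0, -1.0)   # one fully parallel step
    print(is_stable(S, X, T), is_stable(S, X_next, T))        # either may still be unstable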

An important feature of the network MN is the convergence theorem stated below.

Theorem 3.1: Let MN = (S, T) be a multidimensional neural network of dimension m and order n, where S is a fully symmetric tensor of order 2n and dimension m. The network MN always converges to a stable state while operating in the serial mode (i.e. there are no cycles in the state space) and to a cycle of length at most 2 while operating in the fully parallel mode (i.e. the cycles in the state space are of length at most 2).

This theorem is proved in (Rama 2). It suggests the utilization of MN as a device for performing a local search for the optimum of an energy function. In the following, we formulate a problem that is equivalent to determining the global maximum of an energy function and show how to map it onto a multidimensional neural network.

    Definition 3