
Pergamon Computers Elect. Engng Vol. 20, No. 2, pp. 99-119, 1994

Copyright © 1994 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.

0045-7906/94 $6.00 + 0.00

MACHINE PERCEPTION OF EDGES USING COGNITIVE MAPPING

MADAN M. GUPTA and GEORGE K. KNOPF†

Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, Sask., Canada S7N 0W0

(Accepted for publication 15 June 1993)

Abstract--An algorithm for the perception of edge information by a vision machine is presented in this paper. In this approach, the sensored visual data are transformed by the process of cognition into generalized manifestations called percepts and concepts. Machine cognition is hypothesized as the transformation of sensory information between the information domain and the comprehension domain. Percepts are the mental impressions of the sensory data which arise within the information domain. On the other hand, concepts are meanings which are attached to these percepts in the comprehension domain. The mechanisms for transforming raw data into percepts and then attaching conceptual meanings to them may be described by cognitive mapping functions. In this paper, a fuzzy set theoretic approach is used to emulate the cognitive mapping processes that occur within the information domain. More specifically, this algorithm employs fuzzy mapping functions to transform sensory gray-level images into various properties associated with a perceived edge. This new notion interprets an edge as a field with distributed membership values that may be observed at various levels of perception.

Key words: Machine perception, Information domain, Comprehension domain, Gray-level images, Cognitive mapping functions, Perceived edges, Levels of perception.

1. INTRODUCTION

During the past three decades numerous scientists and engineers have been working towards the development of sophisticated machine systems that are able to mimic certain aspects of human behavior. As yet, no machine has been developed that can be called "truly" intelligent or autonomous because all artificial systems suffer from inherent limitations when dealing with unstructured environments. In this context, it is imperative that an autonomous machine be able to adequately acquire and interpret a variety of information about its local environment in order to function effectively. A fast and efficient vision system is, therefore, an integral part of any autonomous machine. The role of the vision system is not merely to improve picture quality but to extract the necessary information required for making intelligent decisions [1-4]. All aspects of image processing, feature extraction and scene analysis must be performed independently by the system. Fundamental problems in vision systems design arise when deciding what information within the original sensor image is important and how to synthesize the large variety of information attributes into a simplified coherent form.

If autonomous intelligent machines, similar to that shown in Fig. 1, could achieve even a small degree of the robust characteristics exhibited by living systems it would significantly increase their usefulness as tools for a large variety of engineering applications. At the moment, a critical distinction in the way information is processed in biological and computer vision systems is the notion of perception. Some researchers feel that machine perception is distinct from biological perception [1]. They hold the view that machine perception is nothing more than the process of quantitatively interpreting the measurements obtained from a variety of sensors. However, this view does not account for the subjective nature of human awareness and interpretation.

Contrary to many existing computer vision paradigms, some psychological evidence [5] suggests that biological visual information is processed in terms of relative grades rather than absolute values. Relative grades are able to reduce the massive volume of possible information embedded

†Present address: Mechanical Engineering Department, University of Western Ontario, London, Canada.



Fig. 1. A generalized framework of an autonomous intelligent machine (work environment, sensor module, perception module, reasoning/knowledge base module, control module and actuator).

in the signal by means of generalizations. These grades reflect properties such as relative contrast, relative constancy and relative motion. Thus, in order to perceive any visual phenomenon there must be some relative difference in the visual signal. It is the degree to which this relative difference exists in the sensor signal that forms the basis of biological perception. From a design perspective, an approach based on relative information would enable both the reduction and aggregation of immense volumes of sensory signals into acceptable and manageable forms for the higher levels of comprehension.

A small sub-class of the overall visual perception problem is addressed in the following paragraphs. In the edge perception problem, the ambiguity arises because of the subjective definition of an edge in the image. In many conventional edge detection algorithms, the edge is defined as a locus of boundary points that exist between two significantly different intensity levels. In other words, an edge is assumed to be binary in the sense that it either does or does not exist. In machine vision applications, such edges are often detected by finding the locations in the image where the intensities change above a certain predefined threshold [6,7]. However, the definition of what is a significant difference in intensity is very subjective, and in many ways contradictory to the precise or crisp definition of an edge. In natural vision it is the process of perception rather than detection that plays an important role in the extraction of the visual information embedded in the image. Thus, for the edge perception problem we assume that the mental processes create an ambiguistic (fuzzy) edge phenomenon called an edge field which is comprised of a distribution of possible edges over the interval [0, 1].

The ambiguity inherent within gray-level images can be modeled using the notion of graded membership functions. The fuzzy set theoretic approach has been successfully applied to processing digital gray-level images [8-16]. In order to extract perceptual information, such as edges, the cognitive mapping functions are used to transform the gray-level digital images from the sensor space to the perception space. Once in the perception space, various visual attributes associated with perception are determined by employing fuzzy-logic operations. This notion of perceiving edges over a range of values leads to various new edge definitions such as perception level and edge field. Several simulated examples are given in Section 4 which further illustrate this theory for machine vision applications.


2. COGNITIVE INFORMATION PROCESSING

2.1. The cognitive domain

Experimental scientists and engineers interpret the universe in terms of physical entities or events. The mathematical tools often employed to describe these events are probability theory and binary set theory. These traditional approaches provide a well-developed algebra to describe and relate various observed events. Fundamental to these approaches is the measurable, experimental and reproducible nature of the events. Notions of observability and randomness are pertinent to the employment of these theories. However, the cognitive universe which contains all human information and thought processes does not invoke exercises in experimentation or in the observation of random events. Cognition is interpreted as the process whereupon we extract information from sensory signals and attach meanings or definitions to such perceived information. These meanings are both descriptive and subjective, and can be expressed in the form of concepts. The concept is a generalized subjective idea and is, therefore, fuzzy in nature.

All external knowledge of the environment in which we operate is physically acquired through natural sensors and reshaped by the processes of cognition. This sensory information does not directly influence our mental understanding; rather, it is first generalized into vague manifestations called percepts. These percepts or impressions form our perception of a physical phenomenon. For machine vision applications, cognition may be summarized as the process that initially maps the sensor data into perceived information, and later maps concepts onto this perceived information in order to attach meanings.

The process of extracting percepts from sensor data is confined to the information domain F [13]. The cognitive mechanisms associated with perception map information from the sensor space S to the perception space P. Thus, the information domain F may be described as:

F = S ∪ P. (1)

Correspondingly, all activities related to the recognition, understanding or comprehension of the perceived information are assumed to occur within the comprehension domain Θ [13]. This domain is occupied by both the perception space P and the concept space C. The comprehension domain is thus described as:

Θ = P ∪ C. (2)

Note that both the information and the comprehension domains intersect at the perception space. Therefore, the universal cognitive domain U can be defined as the representational space where all mental activities occur, and is given by:

U = S ∪ P ∪ C. (3)

In summary, the overall perception of a visual scene is the result of attaching conceptual meaning to the sensor information. That is, the individual elements of a sensor image xi (xi ∈ S) are mapped onto the percepts Ai (Ai ⊂ P) contained within the information domain F. Correspondingly, the percepts Ai are mapped onto the concepts Bi (Bi ⊂ C) in the comprehension domain Θ. This process of cognition is summarized in Fig. 2.

2.2. Notion of graded membership

The fuzzy set theoretic approach [17] is a powerful tool for modeling the processes of both cognition and perception. In general, fuzzy set theory is based on the notion of graded membership, μA(x), and deals with inexact or ambiguous phenomena. The uncertainty found in cognitive information may arise due to the following three situations:

(i) Generality: This notion refers to the variety of possible situations that may arise in a particular phenomenon; that is, the defined universe is not a single point.

(ii) Ambiguity: This notion describes more than one distinguishable sub-phenomenon such that the membership will have several local maxima.

(iii) Vagueness: This notion reflects a set of phenomena with non-precise or non-crisp boundaries.


Fig. 2. The process of perception in the universal cognitive space (U, universal cognitive domain; F, information domain; Θ, comprehension domain; S, sensor space; P, perception space; C, concept space).

In terms of cognitive information processing, a fuzzy set may be used to describe a percept that corresponds to any possible attribute of the sensored information such as red, dimpled, edge, etc. Each percept is considered to be a fuzzy set with an elastic membership that can be stretched and shaped to fit our perception of the corresponding attribute found in the sensor data. The degree of membership assigned to a physically acquired set of data depends upon the amount of effort involved in expanding the percept definition in order to fit it.

In general, a percept A may be defined as a set that accepts partial membership over the interval [0,1]. If we let X = {xi} be the sensor data, then the percept of an attribute can be defined by the ordered pair:

A = {xi, μA(xi)},  xi ∈ X,  X ⊂ S, (4)

where the "membership function" μA(xi) describes our perception, confidence or belief that xi belongs to A. In other words, the grade of membership can be interpreted as the degree of compatibility between the sensor data xi and the attribute represented by the percept A, or the degree of possibility that xi is restricted to A.

The membership function μA(xi) maps the fuzzy set A onto the interval [0, 1]:

μA(xi): A → [0, 1]. (5)

The grades 1 and 0 represent respectively the full membership and non-membership of element xi in an attribute A. A precise percept is nothing more than a set whose assigned membership is bounded by the crisp values {0, 1}. It should be noted that the assignment of a membership function to a percept is subjective and, in general, reflects the context in which the problem is viewed [10-13].
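As a simple illustration of equation (4), a percept can be stored as a set of (sensor value, membership) pairs. The sketch below is a hypothetical Python example; the attribute "bright" and the piecewise-linear membership are illustrative assumptions, not taken from the paper.

```python
# A discrete percept A = {(x_i, mu_A(x_i))}: the attribute "bright" over 8-bit intensities.
# The piecewise-linear membership used here is only an illustrative choice.
def mu_bright(x, low=64, high=192):
    """Membership of intensity x in the hypothetical percept 'bright'."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / float(high - low)

sensor_data = [10, 80, 128, 200, 255]
percept_bright = {x: mu_bright(x) for x in sensor_data}
print(percept_bright)   # e.g. {10: 0.0, 80: 0.125, 128: 0.5, 200: 1.0, 255: 1.0}
```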

2.3. Cognitive mapping functions

Consider a set of visual data obtained from a sensor which is viewed as being fuzzy. This fuzziness arises during the perception of the raw visual data because of generality, ambiguity and vagueness. The set of visual data may be represented by a set of real continuous phenomena or a set of discrete countable events. For a continuous phenomenon x the overall perceived information A may be represented as an aggregation:

A = ∫X μA(x), (6)

where ∫X denotes an aggregation over the membership functions μA(x). For a set of countable discrete events, such as the pixels of a digital image, the perceived information may again be represented as an aggregation given by:

A = ∪i μA(xi). (7)

Now let the fuzzy sensory data X be defined over the real interval [xm, xM], where xm and xM correspond to the possible lower and upper bounds of the acquired visual data obtained from the sensor. In digital gray-level image problems the bounds xm and xM may correspond to the minimum and maximum intensity levels, respectively. A variety of cognitive mapping functions may be employed in order to transform the image from the sensory space over the range [xm, xM] to the perception space over [0, 1]. Many types of mapping functions exist in the literature [11,17] which are able to transform subjective and ambiguous phenomena X onto a real line in the information domain. We will now briefly discuss a few such functions that apply to the edge perception problems.

2.3.1. The S-mapping function. For the fuzzy set of sensory data X given over the real interval [xm, xM], we define a symmetrical crossover point xc as:

xc = ½(xm + xM), (8a)

with a corresponding mapping function S as:

μS(x) = S(x, xc, xM) = ½[1 + sgn(x − xc) |sin(π(x − xc)/(2(xM − xc)))|^γ],  xm ≤ x ≤ xM, (8b)

where the power index γ is a positive real constant, γ > 0. The mapping function S assigns low membership values, 0 ≤ μ < 0.5, for x ≤ xc, and high values, 0.5 ≤ μ ≤ 1, for x ≥ xc. The assignment of membership values can be enhanced by the power index 0 < γ < 1 as shown in Fig. 3.
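To make the mapping concrete, the following Python/NumPy sketch implements the sinusoidal S-function as reconstructed in equation (8). The function name, the NumPy vectorization and the clipping of the normalized argument are implementation choices rather than details from the paper.

```python
import numpy as np

def s_map(x, x_m, x_M, gamma=2.0):
    """Sinusoidal S-mapping function (a sketch of eq. 8).

    Maps sensor values on [x_m, x_M] onto the perception interval [0, 1],
    with the crossover point x_c = (x_m + x_M)/2 assigned membership 0.5.
    """
    x = np.asarray(x, dtype=float)
    x_c = 0.5 * (x_m + x_M)                       # symmetrical crossover point, eq. (8a)
    ratio = (x - x_c) / (x_M - x_c)               # normalized distance from the crossover
    s = np.sin(0.5 * np.pi * np.clip(np.abs(ratio), 0.0, 1.0))
    return 0.5 * (1.0 + np.sign(x - x_c) * s**gamma)

# example: 8-bit intensities perceived as "bright"
mu = s_map(np.arange(256), 0, 255, gamma=2.0)
print(mu[0], mu[128], mu[255])   # approx. 0.0, 0.5, 1.0
```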

2.3.2. The π-mapping function. As illustrated in Fig. 4, we define a symmetrical π-mapping function for a fuzzy phenomenon X on the real line over the interval [xm, xM]:

xs = ½(xm + xM), (9a)

Fig. 3. Symmetrical S-mapping function.


Fig. 4. Symmetrical π-mapping function.

and the lower and upper crossover points as:

xc1 = ½(xm + xs) = ¼(3xm + xM) (9b)

and

xc2 = xs + (xc1 − xm) = ¼(xm + 3xM). (9c)

Using the definitions of the points xs, xc1 and xc2, a symmetrical π-mapping function can now be defined as:

μπ(x) = π(x, xm, xM) = πA(x, xc1, xs) ∪ πB(x, xc2, xM), (9d)

where πA(x, xc1, xs) is defined over xm ≤ x ≤ xs, as is shown in Fig. 4, and is given by:

πA(x, xc1, xs) = S(x, xc1, xs) = ½[1 + sgn(x − xc1) |sin(π(x − xc1)/(2(xs − xc1)))|^γ], (9e)

and πB(x, xc2, xM) is defined over xs < x ≤ xM, and is given by:

πB(x, xc2, xM) = 1 − S(x, xc2, xM) = ½[1 − sgn(x − xc2) |sin(π(x − xc2)/(2(xM − xc2)))|^γ],  0 < γ < 1. (9f)

The membership function μπ(x) is a convex mapping function which increases monotonically from 0 to 1 over the interval [xm, xs] and decreases monotonically from 1 to 0 over [xs, xM]. At the crossover points xc1 and xc2:

μπ(xc1) = μπ(xc2) = 0.5. (9g)
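Continuing the sketch, a π-mapping function can be composed from two S-functions, reusing the s_map helper defined after equation (8). This is again an illustrative reading of equation (9), not the authors' code.

```python
import numpy as np

def pi_map(x, x_m, x_M, gamma=2.0):
    """Symmetrical pi-mapping function (a sketch of eq. 9).

    Rises from 0 at x_m to 1 at x_s = (x_m + x_M)/2 and falls back to 0 at
    x_M, with membership 0.5 at the crossover points x_c1 and x_c2.
    Reuses s_map() from the S-function sketch above.
    """
    x = np.asarray(x, dtype=float)
    x_s = 0.5 * (x_m + x_M)                       # eq. (9a)
    rising = s_map(x, x_m, x_s, gamma)            # pi_A = S(x, x_c1, x_s), eq. (9e)
    falling = 1.0 - s_map(x, x_s, x_M, gamma)     # pi_B = 1 - S(x, x_c2, x_M), eq. (9f)
    return np.where(x <= x_s, rising, falling)

mu = pi_map(np.arange(256), 0, 255)
print(mu[0], mu[64], mu[128], mu[191], mu[255])   # approx. 0, 0.5, 1, 0.5, 0
```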

2.3.3. General comments. The cognitive mapping functions defined above map the real values of x ∈ X, over the interval [xm, xM], onto the perception space over the real interval [0, 1]. For example, the S-mapping function, S(x, xc, xM), maps the interval [xm, xM] onto the interval [0, 1]:

S: [xm, xM] → [0, 1]. (10)

The S-mapping function is a monotonic function and essentially gives the perception of a phenomenon such as the image is bright for a histogram profile over the intensity levels [xm, xM]. The π-mapping function, on the other hand, monotonically increases over the interval [xm, xs] and then decreases over [xs, xM]. This function essentially represents the perception of a phenomenon such as the brightness of the image is xs for a histogram profile centered at xs. The S-mapping function covers the entire region from xm to xM, whereas the π-mapping function separates this region into two sections, xm to xs and xs to xM.


2.4. Mathematical operations performed on cognitive functions

The notion of graded membership plays an important role in many fuzzy mathematical operations not found in ordinary set theory. Let A be a fuzzy set of X with the membership μA(x), where x ∈ X. Now let the following primitive mathematical operations be defined:

(i) The power set, A^g:

A^g = ∫x∈X {μA(x)}^g,  g > 0. (11)

(ii) Concentration, CON(A):

CON(A) = ∫x∈X {μA(x)}^g,  g > 1. (12)

(iii) Dilation, DIL(A):

DIL(A) = ∫x∈X {μA(x)}^g,  0 < g < 1. (13)

Usually, for CON (A) the value of the exponent g is 2, and for DIL (A) it is 0.5. Several other important operations used for processing information in the perception space are described below.

Recall that a percept is the basic element of acquired or perceived information. The shape of the membership function used to define the percept is determined by the "context" in which the information is obtained and the "experience" that the system has obtained in dealing with it. A shift in the definition of a percept occurs if conditions in either of these two areas change. The contextual definition of a percept may be generally altered by the five fuzzy mathematical operations illustrated in Fig. 5. Blurring and intensification retain the basic definition of the percept by fixing the point of maximum uncertainty at μA(x) = 0.5. The lower and upper bounds, or limit points, are altered to emphasize or de-emphasize membership values in order to achieve the desired changes.

Blurring, BLR(A), increases the ambiguity associated with the definition of the percept A by increasing the membership if it is less than 0.5, and decreasing the membership if it is greater than 0.5. The operation is defined as:

BLR(A) = DIL(A) for all x, 0 ≤ μA(x) ≤ 0.5;  CON(A) for all x, 0.5 < μA(x) ≤ 1. (14)

Intensification, INT(A), reduces the amount of ambiguity associated with the percept and, hence, the entropy associated with it. INT(A) is, therefore, the opposite of BLR(A). This operation is very useful in machine vision problems that involve image enhancement and edge perception:

INT(A) = CON(A) for all x, 0 ≤ μA(x) ≤ 0.5;  DIL(A) for all x, 0.5 < μA(x) ≤ 1. (15)

Stretching, STR(A), and contracting, CTR(A), are operations that subtly change the percept's definition. These operations require only one point of minimum uncertainty to remain fixed. The point of maximum uncertainty is allowed to fluctuate slightly from the original location, thereby altering the percept's definition. Stretching moves the upper bound forward by employing dilation:

STR(A) = DIL(A),  0 ≤ μA(x) ≤ 1.0. (16)

Correspondingly, contraction moves the lower bound in the reverse direction by employing concentration:

CTR(A) = CON(A),  0 ≤ μA(x) ≤ 1.0. (17)

The fuzziness inherent in a percept may also be described by the notion of second- and higher-order fuzziness. These higher-order fuzzy sets, or ultra-fuzzy sets, can be defined using the power set as:

A: μA(x) → [0, 1]^k. (18)
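The concentration, dilation, intensification and blurring operations translate directly into array operations. The sketch below follows equations (12)-(15) literally; pairing the exponents g and 1/g is an illustrative assumption consistent with the typical values g = 2 and 0.5 mentioned above.

```python
import numpy as np

def con(mu, g=2.0):
    """Concentration, eq. (12): raise memberships to a power g > 1."""
    return np.asarray(mu, dtype=float) ** g

def dil(mu, g=0.5):
    """Dilation, eq. (13): raise memberships to a power 0 < g < 1."""
    return np.asarray(mu, dtype=float) ** g

def intensify(mu, g=2.0):
    """Contrast intensification, eq. (15): concentrate below 0.5, dilate above."""
    mu = np.asarray(mu, dtype=float)
    return np.where(mu <= 0.5, con(mu, g), dil(mu, 1.0 / g))

def blur(mu, g=2.0):
    """Blurring, eq. (14): the opposite of intensification."""
    mu = np.asarray(mu, dtype=float)
    return np.where(mu <= 0.5, dil(mu, 1.0 / g), con(mu, g))

mu = np.array([0.1, 0.4, 0.5, 0.6, 0.9])
print(intensify(mu))   # low values pushed towards 0, high values towards 1
print(blur(mu))        # memberships made more ambiguous (moved away from the 0/1 extremes)
```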


Fig. 5. Mathematical operations to alter definitions of cognitive functions: (a) blur operation; (b) intensification operation; (c) stretch operation; (d) contraction operation; and (e) ultra-fuzzy operation.

Fig. 6. Shift in the point of maximum uncertainty, μA(x) = 0.5.


An alternate type of percept modification occurs due to changes instigated by the diverse experiences that the system encounters. This often causes the point of maximum uncertainty in the membership function to shift in a manner similar to that in Fig. 6. In other words, the experience gained by the system changes the definition of the percept, which is reflected in the shift of the point of maximum uncertainty, μA(x) = 0.5.

3. MACHINE PERCEPTION OF EDGES

3.1. Fuzzy edges

The distribution of visual attributes embedded in a scene often follows deterministic (shapes) or random (motion) patterns. However, our perception of these patterns often exhibits ambiguistic characteristics. The processes associated with mental activity receive physical information through the various natural sensors such as the eye. Depending upon our original objective, we often "concentrate" only upon those attributes that are most relevant to the given task at hand. Based upon an individual's past experience in dealing with this visual information, these deterministic and probabilistic attributes are transformed into generalities called ambiguistic cognitive information [10,12,13]. The process of "seeing" depends entirely upon the perception of visual attributes due to the mechanisms of mental activity rather than measurements of their physical characteristics.

Ambiguity arises in the edge perception problem because of the subjective definition of an edge. The spatial distribution of the intensity gradients which represent an edge may appear as a random phenomenon throughout the image. In many conventional edge detection algorithms edges are considered to be binary in the sense that a particular intensity gradient does or does not represent an edge. During the process of edge perception, the cognitive mechanisms create edge fields which have continuous distributions of possible edge points bounded from a minimum (=0) to a maximum (= 1) level. For this edge perception algorithm, we consider the edges to be an ambiguistic (fuzzy) phenomenon with a distribution of possible intensity levels over [0, 1].

The ambiguity associated with gray-level images may be interpreted in terms of graded memberships. Using the notion of graded membership, the gray-level images are transformed from the sensor space to the perception space by the various cognitive mapping functions discussed in Section 2.

3.2. Separation of gray-level images into multi-region intensity levels

Consider a gray-level digital image with varying intensities over the range [xm, xM] and with a possible k-modal intensity profile as shown in Fig. 7. From this histogram, one can identify k distinct levels that correspond to peaks in the histogram. This histogram profile can be perceived as a composition of several vague intensity levels, x1, x2, ..., xq, ..., xk, where the intensity levels xq, q = 1, ..., k, represent a set of fuzzy numbers. This perception of the intensity profile can be described linguistically (in fuzzy terms) as follows:

"Over the intensity range [xm, xM], the gray-level digital image is composed of a set of intensity levels (x1, x2, ..., xq, ..., xk)."

These intensity levels are clearly identified at their peaks xq with membership grade (= 1), where q = 1, ..., k. A maximum amount of uncertainty exists, however, at the valleys xc1, xc2, ..., xck, in the sense that it is not clear to which adjacent peak intensity level a valley belongs. For example, the intensity level at the valley xc2 may belong to the intensity x1 or the intensity x2 with an equal possibility, μ = 0.5.

A maximum amount of certainty and, hence, no ambiguity exists at the peaks (μ = 1), and a maximum amount of uncertainty exists at the valleys of the histogram (μ = 0.5). These valleys may be considered as the crossover points with μ = 0.5. Thus, the entire intensity range can be divided into several possible regions, R0, R1, ..., Rk, with each region being separated by points of maximum uncertainty.

Fig. 7. Multi-modal histogram of intensity values.

A histogram given over an intensity range [xm, xM] with k valleys, xcq, q = 1, 2, ..., k, can be separated into (k + 1) regions, Rq, q = 0, 1, ..., k, where each region is bounded by the adjacent crossover points. Thus, the width of the qth region Rq is given by:

dq = (xc(q+1) − xcq), (19)

where

μ(xc(q+1)) = μ(xcq) = 0.5.

The mid-point between the crossover points can be approximated as a point of maximum certainty; that is:

x̄q = ½(xcq + xc(q+1)). (20)

Note that the actual peaks of the histogram are at x = xq, which may or may not correspond to the mid-point x = x̄q of the two adjacent crossover points given in equation (20).
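A rough version of this region separation can be coded directly from a smoothed histogram. In the sketch below, a valley is simply any interior bin lower than both of its neighbours; the paper's selection of only the most prominent valleys (Section 3.4.2) is not reproduced, and the toy histogram is invented for illustration.

```python
import numpy as np

def histogram_regions(hist):
    """Locate crossover (valley) points and region mid-points, cf. eqs (19)-(20)."""
    hist = np.asarray(hist, dtype=float)
    interior = np.arange(1, len(hist) - 1)
    valleys = interior[(hist[interior] < hist[interior - 1]) &
                       (hist[interior] < hist[interior + 1])]
    widths = np.diff(valleys)                        # region widths d_q, eq. (19)
    midpoints = 0.5 * (valleys[:-1] + valleys[1:])   # points of maximum certainty, eq. (20)
    return valleys, widths, midpoints

# toy trimodal histogram over 12 intensity levels
hist = [1, 6, 9, 4, 2, 6, 10, 5, 3, 7, 8, 2]
print(histogram_regions(hist))   # valleys at bins 4 and 8
```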

3.3. The multi-region cognitive mapping function

The main objective in a gray-level image edge perception problem is to separate any two adjacent intensity regions by letting the intensity level of one region go to 0 (maximum darkness) and the other go to 1 (maximum brightness). This process of dividing adjacent regions into distinct levels can be achieved by alternately mapping the regions R0, R1, ..., Rk in the histogram into "low and high" or "high and low" intensity profiles as shown in Fig. 8.

Fig. 8. Multi-region cognitive mapping function.

Using this alternate "low and high" intensity profile, it is possible to employ a multi-region cognitive mapping function given by:

Φ[x] = M0(x) ∪ M1(x) ∪ ... ∪ Mk(x), (21a)

where the Mq(x), q = 0, 1, ..., k, are defined as follows:

(i) for the extreme left region, R0:

M0(x) = sin²(π(x − xm)/(4(xc1 − xm))),  xm ≤ x ≤ xc1; (21b)

(ii) for the regions R1 to Rk−1:

Mq(x) = ½[1 + sin^γ(π(x − xcq)/(xc(q+1) − xcq))]  for q = 1, 3, 5, ... (odd numbered regions), xcq ≤ x ≤ xc(q+1), (21c)

and

Mq(x) = ½[1 − sin^γ(π(x − xcq)/(xc(q+1) − xcq))]  for q = 2, 4, 6, ... (even numbered regions), xcq ≤ x ≤ xc(q+1); (21d)

(iii) for the extreme right region, Rk:

Mk(x) = 1 − sin²(π(xM + x − 2xck)/(4(xM − xck)))  for xck ≤ x ≤ xM and k even,

Mk(x) = sin²(π(xM + x − 2xck)/(4(xM − xck)))  for xck ≤ x ≤ xM and k odd. (21e)

3.4. Algorithm for edge perception

Figure 9 is the functional block diagram for the edge perception algorithm presented in this paper. A description of the various parameter and operational blocks is given below:

I = [im,n]M×N, the original gray-level scene in the real world;
Y = [ym,n]M×N, the noise corrupted digital image in the sensor space, ym,n ∈ [0, L − 1];
F = [fm,n]M×N, the smoothed digital image in the sensor space, fm,n ∈ [0, L − 1];
Φ[·] = the multi-region cognitive mapping function with (k + 1) intensity regions;
Λ = [λm,n]M×N, the transformed image in the perception space, λm,n ∈ [0, 1];
INT[·] = the contrast intensification operator;
Λ' = [λ'm,n]M×N, the intensified transformed image in the perception space, λ'm,n ∈ [0, 1];
EDGE[·] = the edge field operator;
Ω = [ωm,n]M×N, the overall possible edge field in the perception space, ωm,n ∈ [0, 1];
Ωα = the edge field at the α-level of perception, ωm,n ∈ [α, 1].

3.4.1. Smoothing operation. Consider Y to be a noise corrupted discrete gray-level image whose intensity levels are spatially distributed within the 2-D sensor space. Therefore, the 2-D sensor data may be mathematically represented as a matrix of order M × N having a pixel intensity ym,n in the position defined by the mth row and nth column. A noise corrupted digital image Y may be thus defined as:

Y = [ym,n]M×N;  m = 1, 2, ..., M,  n = 1, 2, ..., N. (22)

The individual gray level for any pixel ym,n has p possible intensity levels, where p = 0, 1, ..., L − 1. The bounds p = 0 and p = L − 1 correspond to "completely" black and white intensities, respectively.


Fig. 9. A functional block diagram of the edge perception algorithm: the image IM×N, corrupted by noise to give YM×N, is smoothed to FM×N, segmented into intensity regions and cognitively mapped to ΛM×N, intensified to Λ'M×N, passed through the edge field operator to give ΩM×N, and finally thresholded at the α-level of perception to give ΩαM×N.

In the presence of noise, the distinct clusters or regions of intensity levels may not be clearly identifiable from the histogram. In order to minimize the effect of this noise during the vision process, it is desirable to filter it out as much as possible. For this purpose, a simple point averaging scheme represented by a window Wξ×ξ of size (ξ × ξ) is used. The window size is selected based upon the noise characteristics and intensity levels in the sensor image Y. Any smoothed pixel fm,n in the image may be defined as an average pixel intensity taken over the window:

fm,n = (1/(ξ × ξ)) Σ(i,j ∈ Wξ×ξ) βi,j ym+i,n+j, (23)

where βi,j are the weighting coefficients such that:

Σ βi,j = 1.0. (24)
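A plain reading of the smoothing step is a small weighted-average (box) filter. The sketch below treats the weights βi,j as already normalized to sum to 1 and uses an equal-weight window; both choices, and the use of scipy.ndimage for the convolution, are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def smooth(y, size=3):
    """Window averaging of eqs (23)-(24) with equal weights beta_ij = 1/size**2."""
    beta = np.full((size, size), 1.0 / size**2)        # weights summing to 1, eq. (24)
    return convolve(np.asarray(y, dtype=float), beta, mode='nearest')

# example: smooth a noisy 8-bit image held in a NumPy array `y_img`
# f_img = smooth(y_img, size=3)
```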

3.4.2. Cognitive mapping of the histogram profile. Images taken from real-world scenes often exhibit a very rough intensity histogram profile similar to that shown in Fig. 10. This intensity histogram contains numerous local minimum points which correspond to the many possible boundaries that may exist between the different intensity regions. It is necessary for the algorithm to select the most prominent local minima as the crossover points between distinct regions. Since the locations of the most prominent valleys or local minima are not precisely known, a simple weighted averaging window is applied to the intensity histogram. The size of the window is generally small and depends upon the number of possible intensity levels in the histogram. This procedure not only eliminates some of the less significant valleys but also introduces additional ambiguity into the intensity histogram profile.

Fig. 10. Multi-modal intensity histogram of a gray-level image.

Consider an M × N smoothed image defined as:

F = [fm,n]M×N. (25)

A histogram of a smoothed image, as shown in Fig. 11, has k distinct local minima at the intensity levels lc1, lc2, ..., lck and local maxima at the intensity levels l0, l1, ..., lk. The intensity histogram over the range [0, L − 1] can, therefore, be divided into (k + 1) regions separated by the local minima points lcq, q = 1, 2, ..., k. Corresponding to these (k + 1) distinct intensity regions in the image, we can define k edge fields. Each perceived edge field has intensity levels that lie around one of the k local minima, lcq, found in the smoothed intensity histogram profile.

The smoothed image F of the sensor space, represented by the intensity histogram, can now be transformed into the perceived properties Λ in the perception space by employing the multi-region cognitive mapping function Φ[·] as defined in Section 3.3. Thus:

F = [fm,n] → Λ = [λm,n], (26)

where F ⊂ S and Λ ⊂ P. This mapping function can be written as:

Φ: fm,n ∈ [0, L − 1] → λm,n ∈ [0, 1]. (27)

The cognitive mapping function divides the entire intensity profile into (k + 1) distinct regions, assigning alternately the low and high perception values. These perception values lie within the low ∈ [0, 0.5] and high ∈ [0.5, 1] ranges. This mapping from fm,n to λm,n may be considered as perceiving the particular intensity level in generalized terms.

Fig. 11. Smoothed intensity histogram of a gray-level image.

3.4.3. Contrast intensification. Once the various intensity clusters of the image have been mapped into alternating low and high perception regions, these regions may be enhanced further by using the contrast intensification operation of Section 2.4 on the pixels λm,n. Thus:

Λ' = [λ'm,n]M×N = INT[λm,n]M×N,  λ'm,n ∈ [0, 1]. (28)

This INT[λm,n] operation assigns very low membership values to a pixel in the low range [0, 0.5), and very high membership values to a pixel in the high range [0.5, 1.0]. In carrying out this procedure, the intensification operator reduces the amount of ambiguity and, therefore, the entropy associated with each pixel obtained during the process of data acquisition and transformation into the perception space [13]. The mathematical formulation given above for the transformation of images from the sensor space to the perception space attempts to emulate some of the subjectivity inherent in human visual information processing for the development of a flexible machine vision system.
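Tying Sections 3.4.1-3.4.3 together, the sketch below chains the earlier helper functions (smooth, histogram_regions, multi_region_map and intensify) into the Y → F → Λ → Λ' transformation of Fig. 9. The helper names and the 256-bin histogram are assumptions carried over from the previous sketches.

```python
import numpy as np

def perceive(y_img, window=3, gamma=2.0, g=2.0):
    """Sensor image -> intensified perception-space image (a sketch of eqs 22-28)."""
    f_img = smooth(y_img, size=window)                     # eq. (23), smoothed image F
    hist, _ = np.histogram(f_img, bins=256, range=(0, 255))
    valleys, _, _ = histogram_regions(hist)                # crossover intensity levels l_cq
    lam = multi_region_map(f_img, 0, 255, valleys, gamma)  # eqs (26)-(27), image Lambda
    return intensify(lam, g)                               # eq. (28), image Lambda'
```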

3.5. The edge field

The contrast intensified image [λ'm,n]M×N obtained in (28) provides the basis for determining the field of possible edges. Each point in the image has a membership in the edge percept by some value over the range [0, 1]. The stronger the impression of an edge, or edge percept, the closer the membership value is to 1. The absence of edge membership, such as a point on a continuous surface, will result in a value around 0. All points in the image whose memberships are above a small threshold α are candidates for the edge field Ωα.

An edge field operator EDGE[·] is used to determine the possible edge points inherent in the enhanced image [λ'm,n]M×N of the perception space. Most points in the image lie in either the lower perception range [0, 0.5) or the higher perception range (0.5, 1.0]; therefore, they belong to specific perceived regions and not to the boundaries between such regions. Some points, however, have membership values in the neighborhood of 0.5. These points reflect most of the regional uncertainty because their association to specific regions is not crisply identifiable. Such points possess an entropy value around the maximum possible value of 1. Thus, the points that lie between two adjacent regions will generate a field of points in the 2-D perception space with maximum, or near maximum, entropy values.

Now consider a point x at the spatial location (m, n) in the perception space. Centre a window Wξ×ξ of size (ξ × ξ) around the point x. The size of the window determines an acceptable width of the perceived edge field. If greater edge detail (high resolution) is desired, then a small window (3 × 3) may be employed. If only a rough edge representation is required, then a larger window size may be used.

Fig. 12. Level of perceived edge in the edge field Ω: macroscopic edges lie in [0.75, 1]; macroscopic and microscopic edges together span [0, 1].


Let ωm,n be the membership value of the possible edge point at location (m, n) in the overall edge field. The mapping from the general perceived information λ'm,n ∈ [0, 1] to the edge field ωm,n ∈ [0, 1] may be derived by the following EDGE[·] operator:

EDGE[·]: λ'm,n ∈ [0, 1] → ωm,n ∈ [0, 1]. (29)

The EDGE[·] operator may be defined by one of the following min-max operations:

(i) ωm,n = | λ'm,n − max(i,j) λ'i,j |, (30a)

or

(ii) ωm,n = | λ'm,n − min(i,j) λ'i,j |, (30b)

or

(iii) ωm,n = [ max(i,j) λ'i,j − min(i,j) λ'i,j ], (30c)

where (i, j) ≠ (m, n), (i, j) ∈ Wξ×ξ, λ'i,j ∈ [0, 1] and ωm,n ∈ [0, 1].

The edge field is a field that contains the edge membership values ωm,n for all points in the perception space:

edge field = ∪m ∪n ωm,n. (31)

Again, each point ωm,n has a possible membership to the edge percept by a value over the real number interval [0, 1].

The edge field contains all points which are possible candidates for a perceived edge; however, each point may be interpreted at different levels. This notion of separating perception into different levels of perception corresponds to the degree of visibility of the edges in a scene. We define an α-level of perception as the edge information contained in the edge field Ω that is bounded by the interval [α, 1], as shown in Fig. 12. The resultant edge field with respect to a particular α-level is, therefore, denoted by Ωα. A value for α in the neighborhood of 1.0 creates an edge field with only macroscopic edges, while a value in the neighborhood of 0 creates an edge field with microscopic details. Generally, the macroscopic edges represent the basic outline of the objects contained in the scene while the microscopic edges represent greater surface contour information. The microscopic edges may be helpful in aiding many decision-making tasks encountered in machine vision. However, in most situations sufficient detail for object recognition is provided by extracting macroscopic edges only.
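A compact sketch of the edge field operator and the α-level selection is given below, using the minimum form (30b) that the simulation studies also adopt. The use of scipy.ndimage.minimum_filter, which includes the centre pixel in the window (the paper excludes (i, j) = (m, n)), and the zeroing of sub-threshold points are simplifying assumptions.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def edge_field(lam_prime, size=3):
    """Edge field operator, eq. (30b): deviation from the local window minimum.

    Note: the local minimum here includes the centre pixel itself, a small
    simplification of the (i, j) != (m, n) condition in the paper.
    """
    lam_prime = np.asarray(lam_prime, dtype=float)
    return np.abs(lam_prime - minimum_filter(lam_prime, size=size))

def alpha_level(omega, alpha=0.75):
    """alpha-level of perception: keep only edge memberships in [alpha, 1]."""
    return np.where(omega >= alpha, omega, 0.0)

# usage: omega = edge_field(lam_prime); macroscopic = alpha_level(omega, 0.75)
```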

4. SIMULATION STUDIES

4.1. Examples of edge perception

Several sets of gray-level imagery are presented in order to demonstrate the various concepts described in the previous sections on edge perception. The edge perception algorithm was implemented on a VAX 11/780 computer and a COMTAL image processor. Each of the original 256 × 256 gray-level images contained intensity levels bounded over the range of 0 to 255. Unless otherwise stated, the edge perception algorithm employed a power index of γ = 2 for the S- and π-cognitive mapping functions (21), g = 2 for the intensification operator (28) and the minimum EDGE[·] operator given by (30b).

For illustration purposes, only two different sets of imagery are reported in this paper. The first set of images is of a geometric object digitized under ambient lighting conditions. The second set corresponds to an outdoor scene of a marathon runner. These images possess numerous edges with varying degrees of edge ambiguity, thereby representing a variety of possible situations found in real-world environments. The various perceived edges at different α-levels of perception, α = 0.25, 0.5, 0.75, for the geometric object are illustrated in Fig. 13. Figure 14 shows the effect of modifying the contrast intensification operator, with g = 1, 2, 4 and 16, on the degree of edges perceived over the entire image. Correspondingly, Fig. 15 shows the perceived edges over the different α-levels for the marathon runner, and Fig. 16 shows the effect of changes in contrast intensification on the edge field content.

Fig. 13. Images of a geometric object: (a) original image YM×N; (b) perception space Λ'M×N; (c) edge field, α = 0; (d) edge field, α = 0.75; (e) edge field, α = 0.5; (f) edge field, α = 0.25.

4.2. Discussion

The two sets of gray-level imagery are presented in this section to demonstrate the strength of the edge field that is found by transforming information from the sensor space to the perception space by cognitive mapping functions. The perception of the edge attributes is now defined for various possible levels of perception α, where each level α is a value within the interval [0, 1]. Within this definition of cognitive activity, the robust nature of edge perception lies in the utilization of information ambiguity and the dynamic positioning of the perception level α for a specific task. The perceived edge information extracted from both the geometric object and the marathon runner illustrates these notions. By changing the value of α, the degree of edge connectivity can be enhanced. Any additional modification of the extracted information may be achieved without recomputing the edges in the scene; rather, it is accomplished by selecting a new range of possible edge points, bounded by α, in the overall edge field Ω.

Fig. 14. Edge fields, α = 0, of a geometric object for various intensification operations g: (a) g = 1; (b) g = 2; (c) g = 4; (d) g = 16.

Machine perception based on cognitive mapping is also capable of distinguishing macroscopic and microscopic edges. The macroscopic edges are generally the most prominent ones, and these primarily identify the overall outline of the objects in a scene. As α is lowered, the microscopic edges become more apparent and reflect greater surface details. This characteristic is more clearly evident in the example of the marathon runner when α = 0 (Fig. 15c). In many machine vision problems only the macroscopic edges need to be perceived for adequate recognition of objects. However, in certain circumstances, such as detail manufacturing or surveillance, the microscopic details embedded in the scenes are necessary. In such a situation the machine vision system is required to interpret the perceived edges over [α, 1], where α is in the neighborhood of 0.

Another important generalization is that by increasing the power index g of the intensification operator (28), it is possible to obtain more crisply defined edge information. This is, however, at a loss of general edge information, as reflected by a reduction in edge possibility and thereby edge detail. As the edges become more crisp, the possibility of other potential edge points is reduced until only one possible edge exists. This final edge is not necessarily the "optimum" or "true" edge, but often represents only one possible edge. This procedure effectively diminishes the robust nature of the edge perception algorithm presented in this paper.

Fig. 15. Images of a marathon runner: (a) original image YM×N; (b) perception space Λ'M×N; (c) edge field, α = 0; (d) edge field, α = 0.75; (e) edge field, α = 0.5; (f) edge field, α = 0.25.

Fig. 16. Edge fields, α = 0, of a marathon runner for various intensification operations g: (a) g = 1; (b) g = 2; (c) g = 4; (d) g = 16.

5. CONCLUSIONS

Perception is an inherently biological process that accounts for ambiguities in sensor data by creating impressions of the information embedded therein. At the present time it is not possible to emulate the physiological processes of perception in machines because we are restricted to existing computer paradigms. However, it is within this paradigm that an algorithm for the machine perception of edges using cognitive mapping functions was described. The new notion of mapping ambiguous sensory information from the sensor space onto a graded perception space was introduced. The mapping between the different spaces uses the fuzzy set theoretic approach. Additional notions of edge fields and perceived edges at various different levels of perception have also been introduced. Although the idea of perception is not new, it is introduced for the first time in this type of machine vision problem. A small sample of examples has been given to illustrate the various situations where this notion may be useful. In this paper only 2-D gray-level images have been considered. However, extensions of these notions to multidimensional, coloured and textured images are possible.

REFERENCES

1. R. Jain, Perception engineering. Machine Vision and Applications, Vol. 1, No. 2, pp. 73-74 (1988).
2. R. Kasturi and R. C. Jain, Computer Vision: Advances and Applications. IEEE Computer Society Press, Los Alamitos, CA (1991).
3. L. Uhr, Psychological motivation and underlying concepts. In Structured Computer Vision (S. Tanimoto and A. Klinger, Eds), pp. 1-30. Academic Press, New York (1980).
4. L. Uhr, Highly parallel, hierarchical, recognition cone perceptual structures. In Parallel Computer Vision (L. Uhr, Ed.), pp. 249-292. Academic Press, Orlando, FL (1987).
5. L. M. Hurvich, Color Vision. Sinauer Assoc. Inc., Sunderland, MA (1981).
6. K. R. Castleman, Digital Image Processing. Prentice-Hall, Englewood Cliffs, NJ (1979).
7. W. K. Pratt, Digital Image Processing. Wiley, New York (1978).
8. M. R. Civanlar and H. J. Trussel, Digital signal restoration using fuzzy sets. IEEE Trans. Acoustics, Speech Signal Process. 34, 919-936 (1986).
9. T. K. De and B. N. Chatterji, The concept of deenhancement in digital image processing. Patt. Recog. Lett. 2, 329-332 (1984).
10. M. M. Gupta and G. K. Knopf, Theory of edge perception for computer vision feedback control. J. Intell. Robot. Syst. 2, 123-151 (1989).
11. M. M. Gupta, G. K. Knopf and P. N. Nikiforuk, Sinusoidal-based cognitive mapping functions. In Fuzzy Logic in Knowledge-Based Systems (M. M. Gupta and T. Yamakawa, Eds), pp. 69-92. North-Holland, Amsterdam (1988).
12. M. M. Gupta, G. K. Knopf and P. N. Nikiforuk, Edge perception using fuzzy logic. In Fuzzy Computing: Theory, Hardware, and Applications (M. M. Gupta and T. Yamakawa, Eds), pp. 35-51. North-Holland, Amsterdam (1988).
13. M. M. Gupta and G. K. Knopf, Concepts and conceptual fields for pattern understanding. Proc. 1989 IFSA Congr., pp. 751-754, Seattle, WA (1989).
14. S. K. Pal and R. A. King, Image enhancement using smoothing with fuzzy sets. IEEE Trans. Syst. Man Cybernet. 11, 494-501 (1981).
15. S. K. Pal and R. A. King, On edge detection of X-ray images using fuzzy sets. IEEE Trans. Patt. Anal. Mach. Intell. 5, 69-77 (1983).
16. S. K. Pal, Decision making through fuzzy measures. In Approximate Reasoning in Expert Systems (M. M. Gupta, A. Kandel, W. Bandler and J. B. Kiszka, Eds), pp. 179-199. North-Holland, Amsterdam (1985).
17. A. Kaufmann and M. M. Gupta, Introduction to Fuzzy Arithmetic. Van Nostrand Reinhold, New York (1985).
18. D. H. Hubel, Eye, Brain and Vision. W. H. Freeman, New York (1988).

AUTHORS' BIOGRAPHIES

Madan M. Gupta--Madan Gupta (Fellow, IEEE and SPIE) is Professor of Engineering and the Director of the Intelligent Systems Research Laboratory and the Center of Excellence on Neuro-Vision Research at the University of Saskatchewan, Saskatoon, Canada. He received the B.Eng. (Hons.) in 1961 and the M.Sc. in 1962, both in electronics-communications engineering, from the Birla Engineering College (now the BITS), Pilani, India. He received the Ph.D. degree for his studies in adaptive control systems in 1967 from the University of Warwick, U.K.

Dr Gupta's field of research has been in the areas of adaptive control systems, non-invasive methods for the diagnosis of cardiovascular diseases, monitoring the incipient failures in machines, and fuzzy logic. His present research interests are expanded to the areas of neuro-vision, neuro-control, fuzzy neural networks, neuronal morphology of biological vision systems, intelligent systems, cognitive information, new paradigms in information and chaos in neural systems.

In addition to publishing over 400 research papers, Dr Gupta has co-authored two books on fuzzy logic with Japanese translation, and has edited 14 volumes in the field of adaptive control systems and fuzzy logic/computing and fuzzy neural networks. He was elected to the IEEE Fellowship for his contributions to the theory of fuzzy sets and adaptive control systems, and to the advancement of the diagnosis of cardiovascular disease. He has been elected to the grade of Fellow of SPIE for his contributions to the field of neuro-vision, neuro-control and neuro-fuzzy systems.

He has served the engineering community in various capacities through societies such as IEEE, IFSA, IFAC, SPIE, NAFIP, UN and ISUMA. He has been elected as a Visiting Professor and as a Special Advisor (in the areas of high technology) to the European Centre for Peace and Development (ECPD), University for Peace, established by the United Nations.


George K. Knopf--George Karl Knopf was born in Saskatchewan, Canada. He received a B.A. degree in the humanities and a B.E. degree in mechanical engineering from the University of Saskatchewan in 1984, and the M.Sc. and Ph.D. degrees in machine vision from the University of Saskatchewan in 1987 and 1991, respectively. Dr Knopf was a research associate with the Centre of Excellence on Neuro-Vision Research (IRIS) at the University of Saskatchewan. Recently he joined the Department of Mechanical Engineering, University of Western Ontario, London. He has co-authored numerous technical papers in the field of neuro-vision systems. His major research interests include machine vision systems, neural networks, robotics, fuzzy approximate reasoning methods for ill-defined systems and biological paradigms for engineering applications. He is the co-editor (with Dr M. M. Gupta) of the IEEE Press book Neuro-Vision Systems: Theory and Applications.