curve detection in a noisy image - carnegie mellon universitysamondjm/papers/pizloetal1997.pdf ·...

PergamonVisionRes., Vol.37,No.9, pp. 1217–1241,1997Q 1997ElsevierScienceLtd.AUrightsreserved

PII: S0042-6989(96)00220-9 Printedin GreatBritain0042-6989/97$17.00+ 0.00

Curve Detection in a Noisy ImageZYGMUNT PIZLO,*$ MONIKA SALACH-GOLYS~* AZRIEL ROSENFELD~

Received 29 December 1995; in revisedform6 June 1996

In this paper we propose a new theory of the Gestalt law of good continuation.In this theoryperceptual processes are modeled by an exponential pyramid algorithm. To test the new theory weperformed three experiments. The subject’s task was to detect a target (a set of dots arranged alonga straight or curved 1ine)among background dots. Detectability was high when: (a) the target waslong; (b) the density of target dots relative to the density of background dots was large; (c) the localchange of angle was small along the entire line; (d) local properties of the target were known to thesubject. These results are consistent with our new model and they contradict prior models. @ 1997Elsevier Science Ltd. All rights reserved.

Figure–groundsegregation Localvs globalanalysis

INTRODUCTION

The goal of visual perception is to provide the observerwith visual information about the three-dimensionalenvironment so that the observer can recognize objects,manipulatethem, and navigate in the environment.Therehave been many theories and models that attempted todescribe and explain three-dimensionalvisualperception(e.g. Braunstein, 1976; Cutting, 1986; Gibson, 1950,1979; Johansson, 1977; Marr, 1982; Pizlo, 1994; Rock,1983;Shepard & Cooper, 1982;Zusne, 1970).It is clear,however, that before any three-dimensionalobject can bereconstructed or recognized, the visual system must first“decide” whether the retina contains an image of anyobject at all. If it does, the next questionis which parts ofthe retinal image correspond to this object. Thephenomenon where a region on the retina representinga given object and the boundary of this region aredetermined is called figure-ground segregation. Theimportance of this phenomenon was recognized quiteearly by Gestalt Psychologists(see, for example, Koffka,1935).

Solving the figure-ground segregation problem maynot be easy if the scene is cluttered and the objectoccluded. This is illustrated in Fig. 1, which shows ahouse behind trees and bushes. But solving the figure–ground segregation problem is difficult also (or evenprimarily) because a given retinal image does notuniquely determine the three-dimensional scene: thereis always an infinitenumberof possibleinterpretationsof

*Department of Psychological Sciences, Purdue University, WestLafayette, IN 47907-1364,U.S.A.

~Center for Automation Research, University of Maryland, CollegePark, MD 20742-3275,U.S.A.

*To whom all correspondence should be addressed [EmaiZ [email protected]].

the regions and contours in the retinal image. However,despitethis ambiguityinherent in figure-ground segrega-tion, the observer’s percept is usually unambiguous.Figure 2 shows an example. This figure contains acollectionof individualdots and there are many possibleinterpretations of this figure. For example, one couldperceive this figureas a set of individualdotsunrelated toone another.Alternatively, the observer could group thedots, which can lead to the percept of two curved lines.Note, however, that there are at least two differentgroupings possible because it is unclear whether thisfigurerepresentstwo smoothlines that give rise to an “X”intersection or whether there are two lines, one being a“V” and the otherbeing an inverted“V”, meeting at theircorners. Clearly, this figure has many possible inter-pretations.Phenomenologically,however,we tend to seetwo smooth lines forming an “X” intersection.How doesthe visual system arrive at this particular interpretation?Why is this interpretation preferred, over many othersthat are geometricallyequivalent?

According to Gestalt Psychologists, to decide amongmany possible interpretations, the visual system usessome rules or laws. These laws determinehow the imageis organized before it becomes a percept. Hence, theselaws are called laws of organization.The fundamentallaw of organization in Gestalt Psychology is called thePragnanz (or simplici~) principle. According to thisprinciple, the interpretation that is preferred by theobserver is the simplest one that can be derived from agiven retinal image. Thus, in the case shown in Fig. 2, theobserver perceives two smooth lines, rather than twolines each having a V shape, because in the former casethe lines are closer to straight lines, and a straight line issimpler than a V-shaped line. The simplicity principle,when applied to cases similar to that in Fig. 2, isconventionallycalled the law of good continuation.The

1217

1218 Z. PIZLO et al,

name of this law, good continuation,is related to theperceptual tendency to prefer a smooth line whenresolving the direction of a given line at an intersectionwith another line. This law was described first byWertheimer (1923/1958). He provided a number ofexamples illustrating this law and conjectured that goodcontinuationrefers to “an appropriatenessof thecurve, aninner belongingness, a good whole or good configura-tion”. Because of its vagueness, this conjecture did notreceive much attention and subsequent theories of goodcontinuation were based on the concept of simplicity,which due to its mathematical tractability could easilylead to testable theories (interestingly,such a formulationof good continuationis not consistentwith Wertheimer’sunderstandingof this perceptual phenomenon:he specu-lated that good continuation “does not imply a mathe-matical simplicity”).

The simplicity principle is not the only principle thatcan be used to explain the phenomenonof figure-groundsegregation. An alternative principle is called thelikelihood principle. According to this principle ourpercept follows the interpretation which is the mostlikely, or the most probableone, that can be derivedfroma given retinal image.Again, considerthe examplein Fig.2. Assume that this figure represents the observer’sretinal image produced by two lines in three-dimensions.This retinal image could have been produced by two V-shaped lines in three-dimensions only if the observerviewed these lines from one particularviewing direction,so that the images of the corners of the two lineshappened to touch one another. Clearly, such a situationis quite unlikely in everyday life and it happens withprobability close to zero (this is called a degenerateview). On the other hand, two smooth lines in three-dimensionscan give rise to an X intersectionon the retinafor a wide range of viewing directions. It is obvious thatthis can happen with probabilitygreater than zero and, asa result, this situation is more likely as compared to thecase with two V-shaped lines. Thus, there have been twodifferent ways of explaining figure-ground segregation,one based on the Pragnanz or simplicity principle of

9●

●b

●

●●

b

●

●

4b

b

FIGURE 2. Illustration of the law of good continuation. Despite thefact that this set of dots determines many possible interpretations, itusually gives rise to the percept of two smooth lines forming an X

intersection.

Gestalt and the other based on the likelihoodprinciple ofthe empiristic school of psychology.

We will begin with a review of prior research, inpsychology and in computer vision, on the law of goodcontinuation.Next, we will provide a new theory of thislaw based on a new definitionof smoothnessrooted in thelikelihoodprinciple.After the smoothnessis defined,wewill presenta stimulationmodel,based on an exponentialpyramid algorithm, of the perceptual mechanism under-lying this law. Then we will report the results of threepsychophysicalexperimentsthat tested the new definitionof smoothnessas well as the new model.

PRIOR RESEARCH

In this review we will classify the theories and theexperimentalresultswith respect to such features as localvs global processing,and bottom-up(data based) vs top-down (model based) processing. Also, we will try toidentify whether a given theory or experimental result isconsistentwith the simplicity principle of Gestalt or thelikelihoodprinciple of empiristicpsychology.

Traditionally,figure-ground segregationwas assumedto be the first stage of perceptualprocessing,which was aprerequisite for further stages such as object reconstruc-tion and recognition. This assumption implied that inorder to ensure that recognition and reconstruction ofobjects is performed within reasonably short times, thisfirst stage of visualprocessinghad to be fast. Next, sincefigure-ground segregation by itself was not assumed tolead to elaborated percepts of three-dimensionalobjectsand scenes, it seemed natural to assume that figure–ground segregation involves relatively simple mechan-isms. These apparently obvious assumptions, namely,that figure-ground segregationhad to be fast and simple,restrictedthe classof modelsthat havebeen consideredinthe past as possiblemodels of figure-ground segregation.We will show, however, that these restrictions mighthave been too severe and that using them led toinadequatemodels of figure-ground segregation,

Uttal’s (1975) study was one of the first quantitativeapproaches to curve detection in the presence of noise.Uttal used dots as stimuli.The backgroundwas a randomdot pattern and a target was some regular arrangementof

CURVE DETECTIONIN A NOISY IMAGE 1219

dots (e.g. a set of dots along a straightor curved line). Ifthe observer is presented with such a stimulus, theobservercan often easily detect the presenceof the target.Note that in such dotted stimuli, both the target and thebackground consist of the same elements, dots, whoseonly property is position. This allowed Uttal toinvestigate the role of perceptual organization itself,simply by changing the arrangementof the dots, withoutchanging the physical properties of the individual dots,like intensity, duration or blur.

Uttal tested the effects of various parameters like thedensityof dots in the target and in the backgroundand theshapeof the target, on the target’sdetectability.The mainresults of Uttal’s experiments can be summarized asfollows: the easiest target to detect was a straight line.When the target was a curved line, performance wasworse. More precisely, the higher the curvature (i.e. thegreater the departure from straightness), the worse wasthe performance. This result supported the Gestaltsimplicity principle. Next, the detectability of the targetwas higher when:

1. The density of dots in the target was greater;2. The target was longer; and3. The density of dots in the backgroundwas lower.

If the spacing of dots in the target was irregular,or thepositionsof the dots in the target were randomlychangedso that the target was not a smooth line, detectabilitywaslower.

To account for these results Uttal proposed anautocorrelation model. This model analyzed the entirestimulusglobally and this analysiswas purely bottom-up(data based). However, the purely global nature of theautocorrelationmodel is a problembecause the outputofthe model carries little or no informationabout the targetitself (Caelli et al., 1978).Furthermore,if the backgroundis highlyregular, for exampleif it consistsof a set of shortvertical line segments, and the target is an equally shortoblique line segment, the autocorrelationmodel will notdetect the target at all. But the human observer can stillquite easily detect a target under such conditions [thisphenomenon is called pop-out, Treisman & Gelade(1980)]. To overcome some of these problems, othermodels have been proposed [but see Ben-Av & Sagi(1995) for a recent revival of an autocorrelationmodel].

van Oeffelen and Vos (1983) proposed a model whereprocessingbegan with locally convolvingthe image witha Gaussian distribution function. Then, grouping wasdetermined by finding sets of dots that were locatedwithin a single iso-density contour. This operationrepresented the proximi@ law of Gestalt. The goodcontinuation principle was incorporated into this modelby using an elliptical Gaussian distributionwith orienta-tion close to the orientationof the line to be found (Smitset al., 1986). The problem with this model was that theorientation of this line had to be known in advance.

To generalize this model to the case of a straight linewith unknown orientation as well as to curved lines,Smits and Vos (1986) introduceda connectivitymeasure.In this model the image was convolved with a circular

Gaussiandistributionand then the principal directionsofcurvature of the resulting surface at saddle points weredetermined. One of these directions was parallel to theline connecting the two dots and the other wasperpendicular to this direction. Thus, these directionsalloweddeterminationof the orientationof this line. Thisconnectivity measure, along with proximity (measuredby the average distance between neighboring dots) andgood continuation(measuredby the standarddeviationofthe orientationsbetween adjacentlines),were assumedtoaffect the saliency measure of a curve.

This approach in which the local analysis involvedpairs of dots, has been more recently generalized byincluding a search for triplets of dots that are approxi-mately collinear (Vos & Helsper, 1991).This new modelcan detect approximatelycollinearsets of dots and it maylead to detection of polygonal figures. Vos and Helsperpointedout, however,that this model is only a firststep inmodeling the phenomenonof good continuationand thatan adequate model should be general enough to includethe case of curved patterns. Furthermore, they con-jectured that a psychologicallyplausiblemodel cannotbepurelybottom-up(databased), as are all currentand priormodels. Instead, it has to allow for some top-downconstraints(hypotheses).

An approach which also involved computing saliencyof curves was described by Sha’ashua and Unman(1988). In their algorithm the saliency of a curveinvolved concepts of good continuation and proximity.The saliency was greater if the curve was longer,smoother and with fewer gaps (occlusions). Thesmoothness itself was measured by curvature or curva-ture variation. Despite the fact that this algorithm candetect curves in similar cases to those where a humanobserver detects them, this algorithm is not a plausiblemodel of the human perceptual mechanisms underlyingcurve detection. First, in their algorithm the saliency isaccumulated in a serial way along the entire curve. Thismakes the processing time proportional to the length ofthe curve. But it is known that the time of perceptualprocessing of a line is insensitive to the line’s length(Pizlo et al., 1995).Another problem is related to usingcurvature as a measure of smoothness. Althoughcurvature is a measure of departure from straightnesscurvatureis scale dependent(for example, circles havingdifferent diameters have different curvatures). As aresult, changing size on the retina affects curvaturesproportionally,which then should change the perceptualsaliencyof curves in the scene.This is a problembecausein everyday life the retinal sizes of objects change veryoften when the distances of the objects from theobserver’s eye change, but the percepts of the contoursof the objects do not seem to change [see also Alter &Basri (1996) for their analysis of mathematical aspectsrelated to the scale dependence of Sha’ashua andUnman’s algorithm].

Finally,we will describebriefly an approach based, inpart, on the anatomical and physiological properties ofthe visual cortex (Zucker, 1985; Dobbins et al., 1989).

1220 Z. PIZLO et al.

FIGURE 3. Schematic illustration of a non-overlappedexponentialpyramid.

Zucker and his colleaguesshowed that parts of a straightor curved line can be detected by detectors that aresensitive to the orientation and curvature of a line, andwhose spatial properties resemble the receptive fields ofsimple, complex, and hypercomplex cells. They con-jectured that after parts of a line are detected, they can beintegrated in a curve synthesis stage.

It is seen from our reviewthat prior theoriesshared twoaspects: they all used bottom-up processing and they allinvolved the simplicityprinciple. The fact that the priormodels did not involve any top-down effects (e.g.familiarity with the target to be detected) allowed forrelatively fast processing. However, it is known thathuman figure-ground segregation can be affected byfamiliaritywith the object to be detected (Leeper, 1935).Therefore, a psychologicallyplausible model of figure-ground segregationshould allow for such effects. Below,we describe a class of algorithmscalled pyramidswhoseproperties make them a possible class of models offigure-ground segregation. Specifically, they allowperforming both local and global operations as well asapplying bottom-up and top-down analysis.

Pyramid algorithms

General features. A pyramid consists of a stack oflayers containingprocessing units (nodes) (see Fig. 3 fora schematic illustrationof a pyramid). Processingbeginsat the bottom layer (number O) which has the largestnumber of nodes (N). Other layers have fewer nodes: thehigher the layer, the smaller the number of nodes. Moreexactly, the number of nodes in layer k is equal to N/bk,where b is called a reductionratio. Because this reductionis exponential, such an architecture is called anexponentialpyramid. Each node in the pyramid receivesinformation from a limited region of the image, called areceptive field. The receptive fields of nodes on higherlevels are larger. This increase of the receptive field sizesis accomplished by projecting information from several“children”nodes in a lower layer to one “parent”node in..a higher layer.

A pyramid has three general properties.First, it allowssimultaneous (parallel) processing of different parts ofthe image. Second, it contains a number of representa-tions of the image, different representations havingdifferent spatial scales. Third, it allows for integrationof the different representations in either of two ways:fine-to-coarse(bottom-up) or coarse-to-fine(top-down).It is worth pointing out that the traditional distinctionbetween global and local analysis (we used thisdistinction in the previous section to classify differenttheories and models) does not really exist in a pyramid.Properties that are local in the higher layers of thepyramid are at the same time global in the lower layers ofthe pyramid.As a result,pyramidsoffer a natural solutionto the traditionalcontroversyof whether to perform localor global analysis; namely,“both types of processing areperformed simultaneously.

Exponential pyramid algorithms have been usedduring the last 20 yr in computer vision to solve a widerange of “earlyvision”problemslike image segmentationand feature extraction [seeTanimoto and Pavlidis (1975)for an early publication on pyramids, and Jolion andRosenfeld (1994) for a recent review of pyramids].Interestingly,however,pyramidshave not been used veryoften as models of human vision IPizlo et al. (1995) isone of the few exceptions].This fact is surprisingbecauseon the one hand the pyramid’s organization andfunctioning are similar to the known anatomical andphysiologicalpropertiesof the visual system, and on theother hand, a pyramid can account for a wide range ofperceptual observations and experiments. Consider firstthe anatomy and physiologyof the visual system.

Biologicalplausibilityofpyramid models. The humanvisual system processes visual information in a highlyparallel way. Simultaneous activations of differentreceptorsin the retina (there are about 120x 106receptorsin the retina of each eye) are passed through the opticnerve, which contains 106 nerve fibers, to the lateralgeniculate nucleus and then to the primary visual cortex(area Vi). Nerve fibers that are projected from differentparts of the retina terminate at differentparts of area V1.This allows simultaneous processing of the visualinformation.The primaryvisual cortex is a topographicalmap of the retina, i.e. neighboringparts of the retina arerepresentedby neighboringparts of V1. Area VI projectsto area V2, which then projects to other areas represent-ing higher stages of visual processing. Thus, the visualsystem has, to some extent, a hierarchical organization.The separate areas in the visual cortex (Vi, V2, V4 etc)are well defined and the numbers reflect the order ofprocessing.Althoughno area has connectionsto only oneother area, the connections are not random, but insteadform quite clear patterns. As a result, processing of theretinal image is done in stages, where the cells at laterstagesof processingselectivelyrespondto more complexstimulation than the cells at earlier stages of processing.Finally, the sizes of the receptive fields in the visualcortex are different in different areas (Zeki, 1993). Thereceptivefieldsof cells in area V1 are the smallest.These

CURVEDETECTIONIN A NOISY IMAGE 1221

cells project to ,areaV2 in such a way that severalcells inVI converge to one cell in V2. Such connectionsmakethe receptive fields of the cells in V2 larger than theircounterparts lri VI. The cells from area S72 haveconnections to area V4 (from the thin strips in V2) orV5 (from the thick strips in V2). Again, several cellsproject to one cell in the higher area, and thus thereceptive fieldsare still larger in areas V4 and V5. So, thehigher in the processinghierarchy the cell is, the larger isits receptive field. All these anatomical features of thehuman visual system: parallel processing,preserving thetopographical map of the retina, hierarchical organiza-tion, and increasing receptive field sizes at successivestages of processing, are consistentwith the organizationof the exponentialpyramid.

Psychological plausibility of pyramid models. Now,we briefly review psychophysical results that areconsistent with the pyramid’s architecture. Treismanand Gelade (1980) discovered conditionsunder which atarget can easily be detected among distracters withoutdirection of attention to any particular part of the scene(the “pop-out” phenomenon). Similarly, Julesz (1962)and Beck (1966) determined conditions under whichtexture segregation occurs effortlessly and pre-attenta-tively. These two groups of phenomena suggest that thevisual system is analyzing the entire scene in a parallelfashion. Next, consider the multiresolution property ofthe visual system. This property was first demonstratedby Campbelland l?obson(1968).They postulatedthat thehuman visual systemhas distinctchannels, each of whichshows the greatest responseto a certain spatialfrequency.This work on spatial properties of the visual system hasbeen generalized by Watt (1987) in his model of spatio-temporal integration of visual information. He testeddiscrimination sensitivity for length, orientation, curva-ture and stereoscopicdepth for stimuliwith various sizesand for various exposure durations. He found that thediscrimination sensitivity increased with the exposureduration of the stimuli. Watt explained his results byinvokingspatial frequency channelsand assumingthat atthe onset of the stimulus only the coarsest channel(resolution)is used. Then, this channel is “switched off’and a finer channel is used, and so on, until the channelwith the finest resolution remains. Clearly, this model issimilar to (although not identical with) coarse-to-fineprocessing in the pyramid. This work of Watt waselaborated by Pizlo et al. (1995). They pointed out thatfor a wide range of lengths in Watt’s (1987) study, theWeber fraction (the ratio of the discriminationthresholdfor length to the length itself) was constant [see alsoBurbeck & Yap (1990) for a similar result]. This resultindirectly implied that the speed of length processing inthe visual systemdoes not dependon the length itself,butinstead on the required precision of length judgment.Pizlo et al. (1995) showed that this result can be modeledby an exponential pyramid algorithm and that such amodel can better account for a wider range of perceptualphenomena of size perception and mental size transfor-mation than other models.

To summarize, the architecture of the pyramid is aplausible model of the anatomy of the human visualsystem and the known computational properties of theexponential pyramid seem to agree with results ofpsychophysicalexperiments. Therefore, we believe thatthe exponential pyramid is a possible (perhaps even aplausible) model of the perceptual mechanisms under-lying human vision*. In this paper we use this model tostudy and explain the law of good continuation,which isan elementof the figure-groundsegregationphenomenon.

In this studywe performed three experiments.First wedescribe Experiment 1 in which we tested the detect-ability of a set of collinear dots among background dots.Next, we presentour exponentialpyramidmodel of curvedetectionalongwith a new definitionof smoothness.Thismodel was then used to perform simulation experimentsunder the same conditionsas those used in Experiment 1.Then we report Experiment 2 which further tests ourmodel and the new definition of smoothness. In thisexperiment a number of different curves were used as’targets: straight lines, circular arcs and arbitrarily shapedcurves. Again, psychophysicalresults were compared tosimulation results from our exponentialpyramid model.FinaIly,our Experiment3 tests the roles of familiarity of

*Oneof the reviewers indicated that a pyramid is a discrete precursorof a continuous scale space model (Koenderink, 1984). In thismodel a given image is represented by a one parameter family ofderivedimageswhere resolutionis a (continuous)parameter. It hasto be pointed out that the scale space model is a continuousequivalent of only one special case of an exponential pyramid,caIleda mrdtiresolrrtionpyramid.In a multiresolrrtionpyramid(andin the scale space model)the onlyspatial operationwhich is appliedacross levels of scale is blurring(whichcorrespondsto computingaweightedaverage) and this operationis applied to onlyone feature:intensity.Also, the representationof a stimulusin a multiresolutionpyramid (and in the scale space model) is derived in a purelybottom-upfashion. In a general case of an exponentialpyramid,onthe other hand, one can perform a variety of operations, likecomputing higher order statistics (e.g. variance) and detectingpropertiesof a frequencydistributionsuch as bimodalityor outliers(Rosenfeld, 1990).These operationscan be applied to a numberoffeatures, and the features themselvescan be either continuous(e.g.hue, size, orientation, curvature) or discrete variables (textureelements). Finally, an exponential pyramid allows changing theparameters of the locat operations, allowing for top-down(modelbased) effects. It is possible that some of the capabilities ofexponential pyramids could be modeled by continuous models(similar to the scale space model), but we do not think that alldiscrete pyramidshave continuouscounterparts.We want to pointout, however,that despite the clear theoretical distinctionbetweendiscrete and continuous models, this distinction may be lessimportant on a computationallevel. Namely, if the reductionratiois close to one (as is the case in “fractional pyramids’’—Burt,1981),the discrete pyramidbecomes computationallyvery similarto a continuousmodel.The question remains of which of the twotypes of models, discrete or continuous, is psychologically andbiologically more plausible. Existing perceptual results, including“pop-out” phenomena, texture segregation and top-doyn effects,are more consistent with the exponential (discrete) pyramidmodels, since those models, but not the scale space model, canaccount for the perceptual results. Similarly, on a biological level,the discrete nature of the nervous system seems to be moreconsistentwith discrete, rather than continuous,models.


the shape and orientationof the target in its detectability.The last section provides a summary and conclusions.

EXPERIMENT1:THE EFFECTOF THE DENSITYANDLENGTH OF A TARGET ON ITS DETECTABILITY

In the first experimentwe used collinear sets of dots astargets and we tested the effect of the density and lengthof the target on its detectability. From Uttal’s (1975)results it is known that the detectability of a targetincreases when either the density or length of the targetincreases. These results seem to be intuitively obviousand we did not expect to measure different effects. Thepurpose of this experiment was to describe theserelationships in a more systematic way and to extendUttal’s results to different stimuli.Specifically,we used alonger exposure duration and a larger visual angle of thestimuli and we did not mix many target shapes in onesession (this last factor is related to familiarity with thetarget, and it will be directly tested in Experiment 3).

Methods

Subjects. Two of the authors (MSG and ZP) served assubjects. They had prior experience as subjects inpsychophysical experiments and had extensive practicefor this experiment. MSG was an emmetrope. ZP was amyope and used his normal correcting glasses.

Stimuli. The stimulus consisted of a target andbackground. The target was a set of collinear, equallyspaced dots. The background was a set of similar,randomly distributeddots (see Fig. 4 for an example of astimulus). The stimuli were displayed on the monitor(1152 x 900 pixels) of a Spare 2GS computer. The dotswere white (luminance56.7 cd/m2).The size of each dotwas equal to the size of a pixel. The black backgroundhad a luminanceof <0.001 cd/m2.The stimulusoccupieda square array of 840x 840 pixels. The viewing distancewas 150 cm. The stimulus size was 9.4 deg.

The numbern of dots in a target of length 1and densityd was calculated as: n = l-d+ 1. The length here isspecified in pixels, and density is the reciprocal of thedistancebetween neighboringdots (to avoid ambiguityofhow the pixels are counted for different orientations ofthe line,we assumethat a pixelhas a squareshapeand thepixel’s size is measured as the length of the side of thissquare). The position and orientation of the target wasrandomfrom trial to trial. The coordinatesof backgrounddots were randomly generated from a uniform distribu-tion. The total number of dots (target and background)ina stimuluswas 400.

Procedure. The experiment consisted of two blocks;each block had six sessions. The first block tested theeffect of target length for a fixed value of target density(one density level per session). This block is called thejixed densitycondition.The secondblock tested the effectof target density for a fixed value of target length (onelength level per session). This block is called the jixedlength condition. The order of the blocks was differentfor the two subjects, and the order of sessions within ablock was randomized.

The method of constant stimuli was used. In a givensession, for a fixed level of the density (or length), therewere seven equally spaced levels of length (or density).The range of levels of density and length were chosenseparately for each subject in a preliminary experiment,so that the extremes of the ranges gave rise to close tochance and close to perfect performance.There were 100trials per level of length (or density).There were also anadditional 100 catch trials in each session, in which notarget was presented.Thus, the total number of trials in asession was 800. Before the experimental session 80practice trials were used (10 trials per level plus 10 catchtrials). The subject was instructed to adopt a responsecriterionso as to keep the error rate on catch trials as lowas possible, but above zero. After each response thesubject was given feedback about the accuracy of theresponse.The subjectrespondedby using the buttonsof acomputer mouse. The time for response was unlimited.The stimuli in the practice and experimental trials werepresented in a random order. After the block of practicetrials the subject was informed about the proportion oferrors on the catch trials. The subject had an option torepeat the block of practice trials, but this option wasrarely used.

Before each trial a fixation cross was presented in thecenter of the screen. The subject started the trial bypressing a button of a computer mouse. Then, the crossdisappeared and after 100 msec the stimulus waspresented. The exposure duration of the stimulus was100 msec. After this time the screen went black and thefixation cross was displayed. The session lasted about30 min.

The subject sat in a dark room. The monitor wasviewed monocularly with the right eye. The subject’shead was supportedby a chin-foreheadrest. The monitorwas adjustedin such a way that the subject’sline of sightwas orthogonal to the screen.


(A) 5.1 (a)1

0908070.6050.40.3020.1

0

density .025

p(Chi2)=.0875

4)

0 100 200 300 400 500 SOo 700 Seo I

1-09-0.8-0.7-06-0.5-0.4- density .030.30.2-

p(Chi2)=,1795

00 100 200 300 400 500 600 700 800 I

10.9-0.8-070.6-0.5-0,4- densify .03503-0.2-

p(Chi2)=.ffi95

0.1 *b

o 100 200 300 400 500 600 700 800

10.90.6-0.7-0.6-0.5-0.4- density .040.3-0.2

p(Chi2)=.3647

0.10

0 100 200 Xlo 400 500 600 700 600

10.9-0.6-0.7-0.6-0.5-0.4- densify ,0450.3-0.2

p(Chi2)=..CO17

0.10

0 100 200 300 400 500 600 700 600

10.9-0.6-0.7-06-0.5-0.4- density .050.3-0.2

p(Chi2)=,0103

0.10

0 100 200 300 400 500 600 700 S00

5.1 (b)1-

0.9-0.8-0.7-0.6-0.5-0.4- densify .030.3 --0.2 --

p(Chi2)=.4597

0

10.9-0.6-0.7- I

0.6-0.5-0.4- density .0350.3-0.2-

p(Chi2)=.C002

0.10

0 IIJI 203 3c0 4W 5(HI 6C43 7CQ 8CC

09-0,8- I0.7-0.6-0.5-0,4- density .040,3-0.2-

p(Chi2)=.C047

0.10

0 too 200 300 400 500 600 700 000

1-0.9-0.8-0.7-0.6-05-0,4- density .0450.3-0,2-

p(Chi2)=.0535

0.10

I o 100 NO 3ce 4CC we mo 703 6CO[

10,9-0.6-07-0.6-0.5-0.4- densiiy .050,3-0,2-

p(Chiz)=.5381

0.10

0 10Z 203 3C0 40o 5C0 600 7C0 602

10,9-0.60.7.0.6-0.5-0.4- density .0550.3-0,2-

p(Chi2)=. C055

0.1 -0

I o ICO 2C6 300 4CC 5Ci3 6CQ 700 800

Iergth of a line in pix?ls

FIGURE5(A).

Iengfh of a line in pimls

Analysis. For each session, the relationship between function was estimated only from points representingthe proportion of “target detected” responses and the trials in which the target was present. The quality of theindependent variable (density or length) was approxi- fit was evaluated using a X2statistic with 5 d.f. (sevenmated by a cumulative Gaussian distribution function data points minus two estimated parameters of theusing ProbitAnalysis (Finney, 1971).The approximating distribution).

1224 Z. PIZLO et d.

(B) 5.2(a)1

0,9-0,8-0.7-0.6-0.5-0.4-0.3-

length 2W

0.2p(Chi2)=.0775

0.10

\ O 0,0? 0.0, 0.03 0,04 0.05 0.06 0.07 0.08

10.90.80.70.60,50,40.30.20,1

0

lengthWo

P(Ct’i2)=.1581

I 0 oO> 002 003 0.04 0.05 ,.06 0.,, 0,061

10.9-0.8-0.7-0.6-0.5-0.4- !en@h 4C00,3-0,2

p(Chi~=.1128

0.10

0 0.01 0.02 0.03 0.04’ 0.05 0,06 0.07 0.08

0,80.7l-’-t06 f0.50.40,30.201““”l f

length S00

p(Chi2)=.@35

, ..O /I 0 001 002 003 004 0.05 0,06 0,07 o.cal

t-0.9.0.8.0.7 -0.6.0.5.0.4- Ienglfr 6000.3. - P(Cti2)=.55280.20.1

0I O 0.01 0.02 003 004 0,05 0.08 0.07 0.081

0 0.01 0,02 0,03 0.04 0.05 0.06 0,07 0,08 I

densify of a line

0.80.8.0.7. -0,6- - 10.5-0.4-0,3-0.2-0.1 --

p(Chi2)=.@300

0?0 0.01 0.02 0,03 0.C4 0.05 0.C6 0.07, 0.08

0.9-0.8-0,7-0.6-0,5-0,4-0,3- length 3000,2- p(cw)=.owlo0,10

0 0.01 0.02 0,03 0,04 0.05 0.06 0,07 O,oc

10.9-0.8-0.7- E

0.6-0.5-0.4-0.3- kw@h 4000.2-0.1

p(ChF)=.CO07

o0 0.01 0.02 O.ro 0.04 0.05 0.03 0.07 0.06 I

10.9-0.8-0.7- - $

0.6-0.5-0.4-0,3- length W0,2- P(CiW)=.03260,1

0

I o ool 002 0.03 0.04 0.05 0.06 0.07 0,L781

10.9-0.8-0.7-0,8-0.5-0,4-0.3- Iw@h 0000.2- p(ctw)..ecm0.10

0 0.01 0,02 0.03 0.04 0.05 0.06 0,07 0,08

1-0,9- m0,8-0.7-0.6-0.5-0.4-0.3- length 7C00.2- II(Cti2)=.02550.1

00 0.01 0.02 0.03 0.04 0.05 0,06 0.07 0.08

density of a line

FIGURE 5. (A) Results of Experiment 1 for fixed density condition.The ordinate shows the proportionof responses “targetpresent” andthe abscissa showsthe lengthof the target in pixels.Differentpanels correspondto different levels of the densityofdots in the target. Circles represent the trials with target and diamondsrepresent the trials without target (i.e. catch trials). (a)Subject MSG; (b) subject ZP. (B) Results of Experiment 1 for fixed length condition.The ordinate shows the proportion ofresponses“target present” andthe abscissa showsthe densityof dots in the target. Differentpanelscorrespondto different levelsof the length of the target in pixels. Circles represent the trials with target and diamondsrepresent the trials without target (i.e.

catch trials). (a) Subject MSG; (b) subject ZP.


-4.5 -4 -3.5 -3 -2.5In (density)

-+- MSG-fixeddensity

]+ MSG-fixedlength ]

]+zp-fixeddensity \+ ZP-fixedlength 1

FIGURE6. Fifty percent detectabilitycurves for Experiment1.The squaresand diamondsrepresentresults from subiect MSG,and the circles &d triangles from subject ZP. The a~eaabove and to the right of the 5070de~ectabilitycurve represe~ts targetsdetected in >50% of the trials, whereas the area below and to the left of the curve represents targets detected in <50% of the

trials.

Results and discussion

Figure 5(A) shows the results from the fixed densitycondition, and Fig. 5(B) shows the results from the fixedlength condition, for the two subjects. The circlesrepresent trials when the target was present, whereasthe diamonds represent catch trials. For each data pointthere are shown error bars representingthe magnitudeofsampling error computed as [(p(l-p)JNll’2, where p isthe proportionof “target detected” responsesandN is thenumber of observations per data point (N= 100). Thecontinuous line is the best fitting cumulative Gaussiandistribution function. The probability of obtaining a X2statisticequal to the value computedfrom the data pointsor larger is given in each panel. If the fit of theapproximating function to the data points is good, thevalue of the test statistic Z2is small, and the correspond-ing probability is large. Conventionally,a value of this

. probability >0.1 is assumed to represent a good fit.For subject MSG the fit was good (P > 0.1) in 8 out of

16 conditionsand for subjectZP the fit was good in 2 outof 16 conditions. These results suggest the presence ofsome heterogeneities in the data that could have beenproduced either:

1. By variability of the dependent variable largerthan the variability assumed from the samplingerror; or

2. By using an inadequate approximatingfunction.

To check this latter possibility, we repeated theapproximationusing a logarithmic transformationof theindependentvariable. This time the fit was good in 5 outof 16 conditions for MSG and in 7 out of 16 conditionsfor ZP. These results do not allow rejecting one type ofindependentvariable in favor of the other. We concludethat it is more likely that the heterogeneitiesobserved inthe data are produced by increased variability of thesubject’s responses as compared to the theoreticalvariability predicted from the sampling error. Suchincreased variability could have come about, for

example, by instability in the subject’s criterion for theresponse “target detected”.

It is seen from these results that it is easier to detect aline when the line is longer and the density of dots in theline is greater. These results seem to be intuitivelyobvious and they are consistent with prior results (e.g.Uttal, 1975).To investigatethe joint effect of length anddensity of a line on its detectability, we plotted 50%detectability curves (Fig. 6). This graph shows therelationshipbetween the length of a line and the densityof dots in the line in log–log coordinates from bothsessions: fixed density and fixed length. For the fixeddensity condition, the values on the abscissa are thevalues of density used in the experiment, and the valueson the ordinate are the lengths of the lines which weredetected in 50% of the trials (i.e. they represent the 50thpercentile of the fitted cumulative Gaussian distributionfunction). For the fixed length condition, the values onthe ordinate are the lengths of the targets used in thiscondition,and the values on the abscissaare the densitiesof the lines detected in 50’%of the trials. Thus, the pointsplotted in this graph represent 50% detectability curvesfor each subject across the two conditions (the squaresand diamonds for MSG and the circles and triangles forZP). The area above and to the right of the 50%detectabilitycurve representstargetsdetected in >50% ofthe trials, whereas the area below and to the left of thecurve represents targets detected in <50’%of the trials.

It is seen from this graph that the results from eachsubject are quite consistent across the two conditionsused, namely, the data points correspondingto the fixeddensityconditionand the data pointscorrespondingto thefixedlengthconditioncan allbe approximatedby a singleline. Next note that the 50% detectability curves for thetwo subjects are nearly parallel to one another [theirslopes, –1.98 for MSG and –1.79 for ZP, are notsignificantlydifferent (P > 0.1)]. These results suggestthat the commonperceptualmechanismis representedbythe similar slopes of the 50% detectability curves, andthat the differencein sensitivitybetween the two subjects


(the sensitivityof MSG was higher than that of ZP—the50% detectabilitycurve for ZP is translatedto the top andright from that for MSG) is represented by the differentintercepts.

In this experiment two features of the target weremanipulated, namely, density and length, but the back-ground was kept the same. To evaluate the effect of thebackground on the subject’s performance a follow-upexperimentwas performed in which both the length anddensity of the target were kept constant (150 and 0.04,respectively), but the density of the background varied,by using different numbers of dots ranging from 100 to400. The method of constantstimuliwas used with sevenlevels of the number of background dots. During the“catch trials”, when no target was presented, the numberof backgrounddots was chosenrandomlyfrom the sevenlevels used in the other trials. ZP was tested in onesession (800 trials). It was found that the proportion of“target present” responseswas greater when the numberof background dots was smaller (the proportionof errorson the “catch trials” was 0.13). The relationshipbetweenthe proportion of “target present” responses and thenumber of background dots was approximated by thebest-fitting cumulative Gaussian distribution function.The fit, evaluated by the Z2 test, was quite good(P> 0.29). If the logarithmof the numberof backgrounddots was used as the independent variable, the fitimproved (P > 0.68).*

To summarize the qualitative results of this experi-ment, detectability of a target (a collinear set of dots) ishigherwhen the target is longer, and the densityof dots inthe target relative to the densityof dots in the backgroundis greater. In the next section we will describe our newmodel based on an exponentialpyramid architectureandcompare simulationresults to the psychophysicalresults.

THE EXPONENTIALPYRAMIDMODEL OF CURVEDETECTION

First, we will introduce our new definition ofsmoothness. Then, we will describe the new model ofcurve detectionand illustratethe model’sperformanceasa functionof the values of its free parameters.Finally,wewill show that this model can generate psychometriccurves that fit quite well to those from our Experiment 1.

*We want to point out that the fact that using a logarithmictransformationof the independentvariable (length, target density,backgrounddensity)tendedto improvethe fit of the approximatingfunctions or simplified their relationships, could be regarded assupportingan exponentirdpyramidmodel.This is the case becausein the exponential pyramid algorithm the independent variablesenter in logarithmic transformations(see Pizlo et al., 1995).Note,however, that there are many other, qualitatively differenttransformations of the independentvariables that could give riseto equally good fits, especially if the range of variation of thevariables is relatively small (as is the case in most experimentswhere detection or discrimination thresholds are measured).Therefore, we will base our evaluation of the model on moredirect tests that involve comparison of psychophysical andsimulation results.

changeof angle= rx2-od

FIGURE7. Local change of angle (W-Cil) is definedas the differencebetween two successive changes of orientation, Uzand Ml.

A new definitionof smoothness

In mostprior theoriesof curve detection,the curvewaseasier to detect if it was closer to a straight line (see thereview in the Prior Research section). This theoreticalclaim is rooted in the simplicityprincipleof Gestalt and itis consistentwith the resultsof somepsychophysicaltests(e.g. Uttal, 1975;Field et al., 1993).The departure from“straightness”was usually defined as the magnitude ofcurvature (e.g. Sha’ashua & Unman, 1988). We pointedout, however, that curvature itself is psychologicallynotplausible because curvature is not scale invariant. Theeffect of scale on curvatureimplies that if the observer isapproachinga given object (or moving away from it), allcurvatures on the retina will change because of changesof retinal sizes. Subjectively,however, the percept doesnot seemto be affectedby the changeof distancebetweenthe objectand the observer.Therefore,we believe that thedeparturefrom straightness(or more generally,departurefrom “smoothness”)shouldbe measuredby angles ratherthan curvatures,because angles are scale invariant.

We pointed out in the Introduction that there is analternative approach (different from the simplicityprinciple) to figure-ground segregation in general, andthe law of good continuationin particular,which involvesthe likelihoodprinciple. According to this approach, thepercept favors interpretationsthat are likely to occur ineveryday life. If this principle is applied to curve (orcontour) detection, then the percept should be equallysensitiveto a wide range of possibleshapesof curves, notjust to straight line segments or circles, because thecontours found in images of everyday life scenes (e.g.contours of images of animals and persons) do notinclude many straight line segments or even parts of acircle or an ellipse. Despite the fact that the contours ofimages in everyday life scenes are characterized bydifferent degrees of departure from straightness,they allshare one feature, namely they are piece-wise smooth.

We definesmoothnesshere as a small change of anglealong the curve. This definition is illustrated in Fig. 7.Assume that a curve is representedby a set of dots, as inour experiments (note that if a curve is continuous anddoes not have distinctivedots, one can always divide thecurve into short segmentsand use the endpointsof thesesegments in lieu of dots). The change of orientationbetween successivetuples of dots is called an angle (e.g.al), and the difference between two successive angles(ctz-ctl) is the change of angle. We conjecture that a


curve is smooth (i.e. conforms to the law of goodcontinuation) if the change of angle is small along theentire curve. In a straight line all angles (and changes ofangle) are zero. Thus, a straight line is smooth accordingto this definition.In all circular arcs, the change of angleis also zero everywhere (even though the angle is nolongerzero), and, therefore,all circles are smooth,too. Inthe two lines shown in Fig. 2, the angles are not zero andthe changes of angle are not zero either,but the changeofangle is small and the curves are perceived as smooth.Before this phenomenological observation is tested(Experiment 2), we describe the details of our modeland show that this model can account for the results ofour Experiment 1.

The new model

We proposea hierarchicalpyramid as a modelof curvedetection. The structure of the pyramid and the basicoperations performed by each node were chosen on thebasis of what is known aboutthe anatomyand physiologyof the human visual system, as well as on the basis ofcurrent knowledgeof the perceptualmechanismsof earlyvision. First, we describe the main properties of thestructure of our model.

We used a pyramid with four layers. The bottom layerof the pyramid contained 64 nodes organized in an 8 x 8square. The number of nodes at the bottom layer wassmaller than the number of pixels in the input image, soeach node in this layer received an inputfrom a portionofthe image. The ratio of the number of nodesbetween twosuccessive layers (reduction ratio) was 4. Thus, each“parent node” received input from four “childrennodes”.In the bottom layer the receptive fields were enlargedby t a half of the width (height) of the individualreceptive field. As a result, the receptive fields in alllayers overlappedby this amount. This allowed avoidingproblems in cases when the target to be detected fell onthe border of two nodes. Finally, we assumed one globalcoordinate system in the pyramid and no interactionsamong nodes in a given layer.

Our pyramid model, whose structure is describedabove, represents only one particular instance of a moregeneral class of pyramid algorithms. In the general casethe reduction ratio can be arbitrary, and it does not evenhave to be a whole number (Burt, 1981).Receptivefieldsmay or may not overlap. The shape of the receptive fielddoes not have to be a square. In fact, the receptive fieldsizes and shapes may change as a function of a stimulus(Meer, 1989).Finally,nodes in a given layer may interactand each node may use its own local coordinate system.

Next, we describe rules according to which ourpyramid model operates. Each node in the pyramidstores coordinates of dots that are located within itsreceptive field, the descriptionof figures (i.e. sets of dotsforming possible targets) detected in its receptive field,along with a measure of “goodness”of each figure. Thegoodnessdepends on the figure’slength and density.If anode detects at least one figure in its receptive field, this

figure,but not the individualdots, is passedup to the nextlayer for further processing.

A node in the first layer starts processingby analyzingdots in its receptive field to determine if they can form afigure. Our definitionof smoothness,which involves theconcept of a change of angle, implies that the simplestfigure consists of four dots (a quadruple of dots is thesmallest set for which a change of angle can becomputed). The examination of dots in the receptivefields involves three stages. It begins if the number ofdots in the receptivefieldsis greater than some minimum(in our simulationsthe minimum number of dots was 6).Otherwise, all dots are passed up for further processing.This requirement prevents the pyramid from losinginformation: if the dots are very sparse, the target mightbe missed not because the dots do not form a figure,butbecause there were not enough dots to verify theexistence of a figure or its part. The rules involved inthe individualstages are given below:

1. In the first stage, pairs nti of dots are formed, if thedots Pi, Pj are sufficientlyclose to one another:

Pi, Pj form a pair ~ti if d(P~,Pj) S 6

where d(”) is a Euclidean distance, and 6 is a maximaldistance (reciprocal of minimaldensity). The restrictionthat d(Pi,Pj)is small representsthe fact that only dots thatare close to one anotherare likely to lead to the percept ofa figure(proximityrule).At the sametime, this restrictionlimits the computationaltime by limiting the number ofpossible pairs. In the visual system this stage can beperformed by simple cells.

2. In the second stage, triplets rti~of dots are formedfrom pairs, if the two pairs have one dot in commonand the angle ctti~formed by the two pairs issufficientlysmall.Before themagnitudeof the anglewas evaluated,a randomnumberx. was added to theactual angle. This random number represented inour simulationsperceptual noise in judging angles:

my, ~j/k form a triplet rijkif j = j’ and l~~k+ d%I S ~max

where x. is a random variable subject to a normaldistributionN(O, OU2)(cr. is called here angle standarddeviation), and ct~,, is the maximal angle (anglecriterion). In the visual system, this stage can beperformedby hypercomplexcells (Dobbinset al., 1989).

3. In the third stage, quadruples~~klof dots are formedfrom triplets. Two triplets can form a quadruple, ifthey overlap by two consecutive dots, and thechangeof angleAtiklis sufficientlysmall. Before themagnitude of the change of angle was evaluated, arandom number y~ was added to the actual changeof angle. This random number represented in oursimulationsperceptualnoise in judging a differencebetween angles:

I_tikand Tfk(lform a quadruple X@tlif j = j’

and k = k’ and Aijkl= I(Chjk– ~jfkrl) -1 YAI < Amax

where yA is a random variable subject to a normal


(a) 1~ 0.9- -~ 0.8- -

SJ 0.7 -

~ 0.6- -

g 0.5- -: 0.4- -.: 0.3 --& 0.2 --

= 0.1 --

00 100 200 300 400 500 600

Ienghtofa line

(b) I -

~ 0.9- -~ 0.8 -

g 0.7 --

~ 0.6-

g 0.5-

: 0.4- -

.3 0.3- -g 0.2 --

a 0.1

00 0.01 0.02 0.03 0.04 0.05

density ofa line

FIGURE 8. “Psychometric” functions generated by the exponential pyramid model using the same conditions as in ourpsychophysicalExperiment 1. The axes in this figure have the same units as in Fig. 5(A) and (B). Specificrdly,the ordinateshowsthe proportionof the model’sresponses“target present”. (a) Fixeddensitycondition.The abscissa showsthe lengthof thetarget in p-keis. Different curves repre~entdifferent levels of the density of dots in the target. fb) Fixed length condition.Theabscissa shows the density of dots in the target. Different curves represent different levels of the length of the target in pixels.

distributionN(O,~A2)(a~ is called here charzge-of-anglestandard deviation),and A~.x is the maximal change ofangle (change-of-angle criterion).

After figures are identified, they are passed up forfurther processing according to the following rule.

4. If a node on the second (or higher level) receivesfrom its children descriptions of figures #ii,...,,.,itchecks whether it is possible to merge pairs of thefigures. Such a merge is possible if two figuresoverlap by more than two dots (starting from one ofthe endpoints) or they overlap by two dots and thechange of angle at the site of the merge is less thanthe change-of-anglecriterion:

Oil,...,in and $j,,...jmcan be ‘erfiled ‘f :

(a) i.-~ =jI,..., in = j~+l andk22

or

(b) in_l = jl and i. = jz and &.2in-1i~3

= l(~in-,i.-li. – ~j1~3)+YAI < Amm

Each node orders the figures by using a goodness

measure and stores the 10 “best” figures for furtherprocessing (storing only a small set of figures isconsistent with the assumption that the nodes in thepyramid have limited computational and memorycapacity). A figure is “good” if it is long and dense.Since we found that the iso-sensitivitycurve in Experi-ment 1 was a straight line in log(length)vs log(density)coordinates, the goodness was measured by the ratiolog(length)/log(density). The longer and denser thetarget, the smaller the value of this criterion (but thelarger in absolutevalue)-see Fig. 6, in which the betterfigures correspond to points in the upper right corner ofthe graph.At the apex of the pyramid the goodnessof thebest figure was compared to a goodness criterion todecide whether the best figure is to be classified as atarget.

In the next sectionwe illustratethe performanceof themodel by analyzing the effects of the length and densityof a target on the proportionof correct detections,as wellas by analyzing the effects of several parameters of themodel on its performance.


“Gxxtness” criterion1

0.9- -0.s0.7-0,6-0.5-0.4- -0.3-0.2-0.1

0.0 e3 103 150 3U0 250 m 350

I

Angle criterion

1=”o 50 Iol Iw ZrHl 250 303 350

Angle standard deviation1-

0,9- -0,6-0.7.0.60.50,40.30.2-0.1

0.

0 WI 100 150 200 2SI 3eo 350

Change of angle criterion

ILz!zl’o 50 102 150 200 250 301 350

I

Change of angle standard deviation

EzlE?!I’

I 1,Minirmrl density

I0.90.s0.70.60.s0.40.30.20.1

0 L

length of a line in pixels

FIGURE9. Illustrationof the effect of model parameters on the modelperformance. The ordinate shows the proportion of the model’sresponses “target present” and the abscissa shows the length of thetarget in pixels. The middle curve (representedby circles) is the samein all six graphs.The other two curves in each panel representthe effect

of one parameter on the simulated psychometricfunction.

Performance of the model

Before we fit the model to the data (next section) weanalyze the model’sperformancein order to examine thefamily of possible simulationresults. If, by changing themodel’s parameters, one could obtain an arbitrary curverepresenting the relationship between the proportion of“target present” responses and the independentvariable(lengthor density),then the predictivevalue of the modelwould be relatively weak (any result can be accountedfor). If, on the other hand, the model can generate only arestricted family of curves, its value as a theory would bemuch greater (because it can be invalidated).

In the first set of simulations, all parameters of themodel were kept constant, and the proportion of “targetpresent” responses as a function of density and lengthwas computed. In these simulationsthe stimuli were thesame as those used in Experiment 1 for subject MSG.Each data point corresponds to 100 trials (as inExperiment 1). The target was always a straight line.The values of the model parameters were as follows:angle criterion (ct~,X):7 deg, angle standard deviation(cr.): 5 deg, change-of-angle criterion (Am=): 5 deg,change-of-anglestandard deviation (~A):4 deg, maximaldistance (6) between dots that can form a pair: 60 pixels(minimum density: 0.016), goodness criterion: –1.55.These values were chosen in such a way that the modelwould give rise to psychometricfunctions in a range ofthe independentvariable (length, density) similar to thatproduced by the subjects.

Figure 8(a) shows a family of simulated psychometricfunctions for a fixed density condition and Fig. 8(b)showsthe functionsfor a fixed lengthcondition.It is seenthat these simulation curves are qualitatively similar tothoseobtainedfrom the subjectsin Experiment 1 (Fig. 5).This means that our pyramidmodel is a possibletheoryofcurve detection in the noisy image.

Next, we analyze the effect of the model’s parameterson the simulatedpsychometricfunctions.Figure 9 showssix panels, each panel having three functionscorrespond-ing to three levelsof a given parameter.The values of theother parameters were kept constant at their middlevalues. The targets used in this simulationwere straightlines with a fixed density of 0.035. It is seen that all thefunctions are of the same type, namely, all areapproximately monotonic. Changing the parametersresulted in changing the slope or positionof the function,or both. Thus, the parameters of the model give rise tosystematic changes of the psychometric function. Thismeans that our model cannot reproduce any function.Instead, the model always produces a function having ashape similar to the shape of a cumulative Gaussiandistribution function. The next section will test howclosely the simulated functions can reproduce (approx-imate) the psychometric functions measured in Experi-ment 1.

Fitting the model to the data

Fitting the model to the data consisted of determiningthe optimalvalues of the parametersthat led to the best fit

1230

(A) 10.1 (a)

04 I0 IW 200 300 4CA 500 6C0 7CQ 8W

10,9- -0.8-0.7- -0,6. -0,5. - .,

0.4 -0.3- - density .030.20,1 B

~Chi2=12.10)=,~73

0.0 100 200 200 400 500 600 700 20

0.4-0,3- i density .0350.2

[P(CN2=2.67)=.6145

0.103

0 10) 200 300 40o SW 62Q 7CC 8Ct

10.9-0.8-0.7-0.6-0.5-0.4-0.3- demity .040,2- p(Chi&5.79)=.56450.1

00 lfll 203 300 4W SCM 600 7W 800

10.9. -0.8. -0.7.0.6- -0.5-0.4-0.3-0,20.1

p(ChF=22.EY3)=.CKN6

1’0 Io 100 200 300 400 500 600 700 600

1-0,9-0.6-0.7-0.6- -0.5-0.4-0,3 density .050.2- p(ChP=6.11)=.52700.1

00 100 200 300 400 500 zoo 700 mo

length of a line in pimls

0 103 2@3 3C0 4&l 5(0 600 71xl 8@3

10,9

\ !r~.

0.8 .. .. .

0.70.60.5

/5,.. ““’

0.40.3 ...$”””- density .0350.2 ..0.1

p(Chi2=14.23)=.0472

o0 IW 2W 3CQ 400 500 6CU 7CW3 600

e---=in-cl0.24 ..JY p(Chi2=3.62)=.43090.34 -.7 wrmy

0.100 100 2@3 3C0 400 SW 603 720 8CQ

1 >0.9-0.8- -0.7-0.6-0.5-0.4-0.3- density ,0450.2-0,1

p(Chi2=21.63)=, C027

00 103 202 3W 40U 503 600 700 WO

1. I10.9-0.6-0,7-0.6-0.5-0.4-0.3- density .050.2-0.1

p(Chi2=14.61)=.0413

o 1C6 2C0 300 4LYI 500 620 7C0 6CQ

10.9

*L, ,,,.-

..

0.60.7 ..0.60.50.40.3 density .0550,20.1

p(Chi2=8.03)=.3300

oI o 100 21XI wm .m mo 6C0 7(M m]

length of a line in pbels

(as measured by a ~z test) of the simulatedpsychometric Experiment2). As a result, only the followingparametersfunctions to those obtained in Experiment 1. Since in were subject to optimization:Experiment 1 only straightlineswe~eused as targets, thisoptimization did not involve the change-of-anglecriter- 1. Maximal distance (d) between two dots that couldion (A~,X)or the change-of-anglestandarddeviation(~A) form a pair in a figure (reciprocal of minimum(these two parameters will be used in the simulationsof density);


(B) 10.2(a)

10.9

~DÌ‹~DÌ‹ , ~

. . .. . .

08 ..0.70.60.50,403 Iertgth 2C00.2 ..0.1

p(Chi~15.45)=.0306

o0 0.01 0.02 0.03 0.C4 0.Q5 0.C6

10.90.80.70.6050.4 r0.30.201 i z“o~

0 0,01 0.02 0.03 0,04 0.05 0.06

J.. Iemgth4CXI

p(Chi&ll.94)=.1025

1’0 Io 0.01 0.02 0,03 0.04 0,05 O.ffi

1-0,9-0.8-0.7-0.6-0.5-0.4-0.3- Iergth SXl

0.2 p(Chi&16.27)=,02280.1

1’0 I0 0.01 0.02 0,03 0.04 O.ffi 0,26

0 0.01 0.02 0.03 O.(M 0.05 O,CM

10,90.60.70.60,5

~•Ë‹L+ƒ•ˆ•Ä‹”ö••0.40.3 kqth 7000,20.1

p(Chi+35.31)=0

I 0: 0:01 A O.ffl o:iM0:05O’ffil

demity of a line

10.2(b)

1,0.9-0.6-0.7-0.6-0,5-0.4-0.3-0.2-0,1

p(Chi2=67.46)=0

00 0.01 0,02 0.03 0.04 0,05 0.06 0,07 0.08 I

10.9

~\, ,[

.. . . . . ... .. .

0.60.7 ..”

0.60.5 ..,’

0,4 ,,.

0.3 length 3000.2 .. p(Chi2=30.53)=.00010,1

00 0.01 0.02 0.03 0,04 0.05 0.06 0.07 0.08 ]

10,90,80.70.60.50.40.30.20.1

0

length 40Qp(Chi2=22.97)=.0017

*0 0,01 0.02 0,03 0.04 0.05 0.06 0,07 0.08

10.9

;, [; ,,~DÌ‹ ,

. .

0.80.70,60.50,40.3 length 500

...0,20.1

p(Chi2.30.06)=.0COl..

00 0.01 0.02 0.03 0.04 005 0.06 0,07 0.06

10.9-0.6-0.7-0,6-0.5-0.4-0.3- ien@h WO

0.2-0.1

p(Chi2=16.22)=.0232

00 0.01 0.02 0,03 0.04 0.05 0.06 0.07 0,0

i-0.9-0,6-0.7-0.6- .0,5-0.4-0.3- . Ie@h 7000,2 p(Chi2=56.62)=00.1

00 0.01 0.02 0,03 0,04 0,05 0,06 0,07 0,08

density of e line

FIGURE 10. (A) Simulated and psychophysicalpsychometric functions for the fixed density condition from Experiment 1.Filled circles (connected by a solid line) and a diamond represent psychophysical results [in Fig. 5(A)] and open circles(connected by a dashed line) and a square represent simulated results. (a) Subject MSG; (b) subject ZP. (B) Simulated andpsychophysicalpsychometric functions for the fixed length conditionfrom Experiment 1. Filled circles (connectedby a solidline) and a diamondrepresent psychophysicalresults [shownin Fig. 5(B)] and open circles (connectedby a dashed line) and a

square represent simulated results. (a) Subject MSG; (b) subject ZP.

1232 Z. PIZLG et al.

2. Angle criterion (a~=);3. Angle standard deviation (o.);4. Goodnesscriterion.

The optimization was performed separately for eachsubject because it was reasonable to assume that theoptimal values of the parameters in the model may bedifferent for different subjects. Among the four para-meters listed above, the first three are likely tocharacterize the structure and the functioning of thevisual system itself, whereas the fourth parameter (thegoodnesscriterion)is likely to correspondto the subject’sresponse criterion. Therefore, the values of the threeparameters were estimated only once (for one psycho-metric function) and then were kept constant. The valueof the fourth parameter (the goodness criterion), on theother hand, was optimized for each psychometricfunction.

Thus, the optimizationwas performed in two stages. Inthe first stage, optimalvalues of alI four parameterswereestimated by fitting the simulated curve to the psycho-physical one for the middle (third) session of the fixeddensity condition. The optimal values of the threeparameters (these values were kept constant in theremaining optimization) were as follows (for MSG andZP, respectively):

1. Minimum density (1/6): 0.013 and 0.012;2. Angle criterion (am=): 6 and 4 deg;3. Angle standard deviation (rr.): 9 and 10 deg.

The remaining 11 psychometric functions (five fromthe fixed density condition and all six from the fixedlength condition)from the first experimentwere fittedbyusing only one free parameter, namely, the goodnesscriterion. The results are shown in Fig. 10. The datapoints representing the psychophysical experiment arerepresented by filled symbols:circles joined with a solidline for trials with target, and diamonds for catch trials.The data points representing the simulation experimentare represented by open symbols: circles joined with adotted line for trials with target, and squares for catchtrials. It is seen that the fit was very good for the thirdsession in the fixed density condition, for which fourparameters were subject to optimization(in this case thenumber of degrees of freedomwas four). The fitwas alsogood for several other sessions from the fixed densitycondition (in these cases the number of degrees offreedom was seven). Note that the quality of the fit issimilar to that obtained by fitting the cumulativeGaussian distribution function to the psychophysicalpsychometric functions (see Fig. 5). In the fixed lengthcondition,where the simulated functionswere producedby using estimates of three parameters from the tieddensity condition, the fit, as measured by a Z*test, wasworse. It is quitepossiblethat the fit could be improvedifmore than one parameter was allowed to be optimized.Itis seen, nevertheless, that the shapes of the simulatedpsychometric functions are similar to the shapes of thepsychophysical psychometric functions. We concludethat the simulationresultsproducedby our new modelare

. . . . . . . . . . . . . . . . . .

.. ””...“

... -. .

. .“

. . . . . . . . . . . . . . . . . .

. . . . .. .. . . ,. ”’””. . .

A straightline (angle Odeg)

B. circulararc(angle 10 deg)

C. circularam(angle20 deg)

D. irrqularly shaped mm (angle +-10 deg)

E. irregularlyehaped cum (angle +-20 deg)

FIGURE11. Targets used in Experiment2. The targets have differentshapes, but they all have the same length and density.

consistent with the psychophysical results. It is quitepossible, however, that similarly good fits could havebeen obtained by other models as well [e.g. Uttal’s(1975)]. Therefore, new experiments are needed thatallow the different models to be told apart. Suchexperimentswill be presented in the next two sections.

EXPERIMENT2: DETECTIONOF CURVEDLINES

Existing definitionsof smoothness (or good continua-tion) of a curve can be classifiedby two parameters: localangle and local change of angle. According to most priortheories, a curve is smooth if the local angles are smallalong the curve (e.g. Uttal, 1975;Grossberg& Mingolla,1985a,b).As a result, in these theories, a straight line isthe easiest to detect and detectabilityshoulddeteriorateifthe local angle increases (e.g. a small circle should bemore difficult to detect than a large one). According toother theories (e.g. Smits & Vos, 1986; Yuen et al.,1990), a curve is smooth if the local angle is constantalong the curve (i.e. the change of local angle is zero). Inthese theories all circles and straight lines should beequallyeasy to detect and detectabilityshoulddeteriorateif the shape of the curve is irregular. On the contrary,according to our new definition of smoothness, detect-ability shouldbe equallygood for a wide range of shapes:straight lines, circles, and irregularly shaped curves,providedthe local changeof angle is small.To test whichof the theories, if any, is psychologicallyplausible, weused the following shapes of targets (see Fig. 11):

1. Straight line;2. Circular arc with local angle 10 deg;3. Circular arc with local angle 20 deg;4. Irregularly shaped curve with maximal change of

local angle 20 deg;5. Irregularly shaped curve with maximal change of

local angle 40 deg.

All targets had the same density of dots and the samelength. Therefore, any differences in the subjects’performance can be attributed to the shape itself, ratherthan to other unrelated factors.


Methods

Subjects. Three subjects, including two of the authors,were tested. Subjects MSG and ZP served in the firstexperiment. The third subject, BJ, was naive about thehypothesisbeing tested.BJ was a slightmyopeand he didnot usuallywear his correctiveglasses.Therefore,he wastested in the experimentwithout glasses.He did not haveprevious experience as a subject in psychophysicalexperiments. He was given extensive practice for thecurrent experiment.

Stimuli.The length of the target and the densityof dotsin the target were constant for all sessions for a givensubject. Subjects ZP and BJ were presented with targetsof length 400 and density 0.045. Subject MSG waspresented with targets of length 600 and density 0.03,which were more difficult than the targets used by theother two subjects. Testing MSG with more difficultstimuli led to similar performance in all subjects, due tothe fact that MSG had greater detectability (see theResults section of Experiment 1 for comparison of thedetectability of MSG and ZP). The number of dots ineach target was 19 (in all sessions and for all threesubjects), except in the case of a circular arc with localangle 20 deg [Fig. 11(C)]. In this case the circlewas closed and, as a result, the first and the last dotcoincided.

Circular arc targets [Fig. 11(B and C)] were generatedby randomly choosing the starting point and theorientation of the segment represented by the first twodots. The remaining dots were generated on thecircumference of the circle in such a way that the angle

*Thedetectabilityd’ for a straight line, estimated in this experimentbythe use of signal detection theory, agrees quite well with thediscriminability estimated in Experiment 1, where the method ofconstant stimuli was used. Consider first the results of MSG inExperiment 1 in the fixed length condition for length 600 (i.e. thelength used in the present experiment). The proportion of “targetpresent” responseson catch trials was 0.15.This correspondsto thestandard normal variable z = –1.04. Next, consider a target withdensity 0.03 (the density used in the present experiment).A targetwith this density was not used in the session with length 600;therefore there is no measurementof performancefor such a target.However, the performance (i.e. the proportionof “target present”responses) can be estimated from the best fitting cumulativeGaussian distributionfunction. In this session the fitting curve hadmean 0.0197 and standard deviation 0.0043. This means that thedensity 0.03 corresponds to a z score of +2.40. As a result, thediscriminability d’ between noise and a target of length 600 anddensity 0.03 is estimated as 3.44. This estimate agrees quite wellwith the d’estimated from the present experimentfor MSGand thestraight line condition (see Fig. 12). Next, we perform a similarcomparisonfor ZP. Considerthe fixedlengthconditionwith length400 and take the data points representing catch trials and a targetwith density0.045.The proportionof “target present” responsesoncatch trials was 0.05, which corresponds to z = –1.64, and theproportionof “target present”responsesfor density0.045was 0.91,which corresponds to z = 1.34. Thus, the d’ estimated fromExperiment 1 is 2.98. Again, this estimate is quite close to the d’measured in the present experiment (see Fig, 12). This agreementbetween the results obtainedin the two experimentsmeans that thesubjects’ performance did not depend on the psychophysicalmethod used (constant stimuli in Experiment 1 vs signal detectionin Experiment 2).

formed by three consecutivedots was constant and equalto the given angle w (we used u equal to 10 deg and20 deg). The shape of the irregularlyshaped targets [Fig.11(D and E)] was random from trial to trial and wasobtained by using the following method. The first twodots in the case of an irregularly shaped target weregenerated identically to those in the case of a circulartarget. The remaining dots were generated in such a waythat the angleformedby three consecutivedotswas equalto plus or minus the given angle a (again,we used a equalto 10 and 20 deg). As a result, the local change of anglecould take values O,~2g, or –2m

Procedure. The experimentconsisted of five sessions.The order of the sessions was randomized for eachsubject. In each session only one target was used. Theshape of an irregularly shaped target Fig. 11(D and E)was random and changed from trial to trial.

The method of signal detectionwith confidenceratingwas used. In a given session there were 300 trials inwhich a target was present, and 300 trials without atarget, Before each session 60 practice trials werepresented to the subject; in half of the trials the targetwas present.

The subject’s task was to detect a target. After thestimulusdisappeared the subjectwas presented with fivepossible responses: “noise”, “probably noise”,“uncertain”, “probablysignal”, and “signal”.

Signal detection theory for the data containingconfidence ratings was used to analyze the results ofthe experiment (Macmillan & Creelman, 1991). Theanalysis provided the detectability measure d’ and itsestimated standard deviation.

Results

The results are shown in Fig. 12. Each panel shows theresultsof all sessionsfor one subject. On the abscissathetype of targetused in a given sessionis indicated(straightline, circulararc with 10 deg local angle, circulararc with20 deg local angle, irregularly shaped curve with+10 deg local angle, irregularly shaped curve with~ 20 deg local angle). For each session (target) the valueof d’ is represented by the height of a bar. Additionally,error bars, with heights equal to t 1 SD of d’, aremarked.

It is seen that all subjectsproduced the same pattern ofresults. The straight line was easiest to detect.* The nextthree targets:circulararc with local angle 10 deg, circulararc with local angle 20 deg, and irregularlyshaped curvewith local angle 10 deg, show similar detectability foreach subject. The differences among these three condi-tions are small and comparableto the standarddeviationsof d’. Finally, the session with irregularly shaped curveswith local angle 20 deg produced the poorest perfor-mance.

Discussion

We will now comparethese resultsto the predictionsofprior theories of the law of good continuation.First, thefact that the detectability of an irregularly shaped curve


MSG4

3.5

3

2.5

d’ 2

1.5

1

0.5

0line arc 10 arc 20 irregular irregular

10 20

l-r2.5

2

~’1.5

1

0.5


10 20

2.5

2d’

1.5

1

0.5


10 20

FIGURE 12. Results of Experiment 2. Individualbars represent different targets (shown in Fig. 11). Different panels showresults for different subjects.

with a local angle &10 deg is not different from thedetectability of a circular arc implies that goodcontinuation (or smoothness) is not equivalent tosimplicity of the curve representation. An irregularlyshaped curve changes its direction randomly; to describethe shape of such a curve the number of parametersneeded would be comparableto the number of dots in thecurve, whereas the shape of a circular arc can be

describedby only two parameters:the local angle and thelength of the arc. Thus, our results contradict thesimplicity principle and all prior theories based on thisprinciple.The onlypropertycommonto a circulararc andto an irregularly shaped curve with local angle +10 degis that these curves do not change direction rapidly, i.e.the change of local angle is small along the curve (in thisexperiment this change of local angle was not >20 deg).


When the change of local angle was greater (40 deg), asin the case of an irregularlyshapedcurvewith local angle~ 20 deg, detectability was poorer. These results areconsistent with our new definition of smoothness andthey seem to reflect the operation of rule 3 (see The NewModel section).

Next, consider the fact that the performance was thesame for the two circular arcs. This result impliesthat thevalue of the local angle itself (or the value of curvature)has no effect on the detectability of an arc. Therefore,good continuation (or smoothness) of a curve does notseem to involvelocal angle (or curvature)and thus seemsto reflect in this case the absence of rule 2 (The NewModel section)or the use of a liberal criterion u~., in thisrule. This result is consistent with our definition ofsmoothnessand it contradicts those prior theories whichclaimed that the greater the departure from straightness,the worse is the performance.

Next, consider a straight line, where performancewasbest. This superiority of the straight line target was notpredicted by our definitionof smoothness.According toour definition, if the change of local angle is small, thedetectability shouldbe high, and the same shouldbe truefor different curves. Can our pyramid model account forthis higher performance in the case of a straight line? Ifthe model “knows”that the target is a straightline, then itcan use the constraint on the bottom layers of thepyramid, that all local angles are close to zero. Thisconstraintis in fact representedby rule 2 (TheNewModelsection) (recall that using this rule allowed our model toachieve the same (high) performance as the subjects didin Experiment 1). A question is, however, why the anglecriterion produced by knowledge about the local anglecharacterizing the target can improveperformance in thecase of straight lines, but not in the case of circular arcs?Considera circular arc and assumethat the local angle&.formed by a triplet of consecutive dots is known. If theorientationof the circular arc is unknown,a given pair ofdots does not determine the position of the third dotuniquelybecause the angle MOcan be positiveor negative.In other words, given a pair of dots from a circular arc, itis not known whether the arc was generatedclockwiseorcounterclockwise.In such a case if informationabout a.is to be used, it would correspond to the followingmodificationof rule 2 (rule 2a):

la@ +Xa + cull< a~..

This inequality means that the algorithm includes inthe analysisall tripletsof dotswhoseangles(%j~+-%) arewithin the region having angular magnitude of 4a~m(MOt u~,Xand –u. + a~=). Inthe case of a straightline(a. = O)the algorithmincludes in the analysisonly thosetriplets of dots whose angles (aij~+ ~~) are within theregion having angular magnitude of 2a~w (+ a~~).Thus, the region that has to be analyzed is two timessmaller in the case of a straight line as compared to thecase of a circular arc. In other words, in the case of astraight line, predictabilityof the location of the next dotin the line is two-fold greater than that in the case of acircular arc with unknown orientation. This means that

the knowledge of local angle in the latter case could beused, but the benefit from such top-downprocessingmaybe small enough to be overshadowedby the performancebased on bottom-up processing. This reasoning can betested by measuring performance for a circular arc withfixedorientation.In such a case one can take the directionof the chord of the arc as the reference orientation (thisorientationwould be constant and known to the subject).This direction would determine the sign of a. in rule 2a,and, as a result, the uncertainty (and thus the perfor-mance) would be similar or equal to the uncertainty (andperformance)in the case of a straight line with unknownorientation. Such a test will be described in the nextsection.

Finally, consider how top-down effects could accountfor the apparent contradiction between Uttal’s results,where he observed that the greater the departure fromstraightness,the poorer the performance, and our results,where we failed to observe such an effect (see thecomparisonbetween circular arcs with local angle 10 and20 deg). In Uttal’s experiments straight line targets andcurved targets were all presented in random order in asingle session.In such a case, if the subjectused, at leastin some trials, a criterion that the local angle is close tozero (rule 2) (as would be the case if only a straight linetarget was used), performance would be systematicallyaffected by the departure from straightness.This is whatUttal observed. In our experiment, on the other hand,straightline targetswere never mixed with curved targetsin a singlesessionand, therefore, the subjectdid not haveto use two differentcriteria at the same time. As a result,when a curved line was the target, the subject was notlikely to use the criterion for straight lines, andconversely, when a straight line was the target, thesubject was likely to use the criterion for straightnessinall trials, which could give rise to high performance.

In the next section, we present the results ofsimulationsin which our pyramid model was tested withthe same stimuli as those used in Experiment 2.

SimulationsSince the model has alreadybeen tested for the case of

straight line targets (see Fig. 10), the present simulationswere performed to test the model’s capability to accountfor the subjects’performance in the case of curved lines.Therefore, the angle criterion (u~.X)and angle standarddeviation (o.) were not used in the present simulations.Instead, the change-of-angle criterion (Amm)and thechange-of-angle standard deviation (~A) were used.Figure 13 shows the results of simulations for threeconditionsrepresentedby differentpairs of values for thechange-of-angle criterion and change-of-angle standarddeviation: 10 and 8 deg, 20 and 10 deg, and 25 and20 deg, respectively.These three conditionswere chosenin such a way that they gave rise to qualitativelydifferentresults. In the first case, both the criterion and standarddeviation for change of local angle were small. In thesecond, the standard deviation was small, but thecriterion was large. In the last simulation, both the


Changeof angle:criterionIOrleg,standarddev.6deg (-1.65)

2.51 I

2

1.5-d’

1-

0.5-

0(

I line arc 10 arc 20 irregular irregular

I 10 20

Changeofangle:critetion20deg,standarddev. Iodeg (-1.74)

2.5 , I

d’1-

0.5-


10 20

Changeofangle:criterion25deg,standarddev.20deg (-1.75)

2.5 I I

I 1.5d’

1-

0.5-

0,line arc 10 arc 20 irregular irregular

10 20

FIGURE13.Resultsof simulationsfor Experiment2. Individualbars representdifferenttargets. Differentpanels showdifferentcombinations of the model parameters: change-of-anglecriterion and change-of-angle standard deviation. The numbers in

parentheses are values of the goodnesscriterion.

criterion and standard deviation were large. The casewhen the criterion is small and the standard deviation islarge is not interesting here because this would lead to asituationwhere many (or most of) the targets are missed.As a result, detectability would be close to the chancelevel for all targets.

In the simulations, the target had length 600 anddensity 0.03 (the same as used by MSG). The sessionconsistedof 300 images with target and 300 imageswithnoise. The proportion of the model’s “target present”judgments for trials with and without target wererecorded and used to compute the detectabilitymeasured’.

The resultsof the simulationsare shownin Fig. 13.Theform of this graph is identical to the form of the graphsshown in Fig. 12. Each panel in Fig. 13 represents onesimulation for one pair of values for the criterion andstandard deviation of change-of-angle.It is seen that thepattern of results depends on the values of the twoparametersand that the results shown in the middle panelmost closely resemble the pattern of results obtained bythe subjects. Specifically,the model’s detectabilityof anarc with local angle of 10 deg, an arc with local angle of20 deg and an irregularly shaped curve with local angle+10 deg are similar (the differences in d’ among theseconditionsare small and comparable to standard errors).

CURVEDETECTION

At the same time, ~’ for an irregularly shaped curve withlocal angle +20 deg is clearly lower. This pattern ofresults is identical to that obtained by the subjects. Thisagreement between the psychophysical and simulationresultsprovidessupportfor our model of curve detection,along with our new definhion of smoothness.

There is, however, one psychophysical result whichwas not reproduced in our simulation. Namely, thesubjects were able to detect the straight line target betterthan any other target (see Fig. 12). This result was notobtained in our simulations because the simulations werebased on the new definhion of smoothness, and, in thisdefinition, the straight line target is not different fromother smooth line targets. This comparison of thepsychophysical result with the simulation results suggeststhat detecting straight lines may involve using anadditional constraint on local angle, which leads to amore efficient mechanism (as pointed out in the previoussection). Such a constraint could have been used by oursubjectsbecause in the sessionwith a straight line target,no other target was used and the subjectsknew this fact.As we already showed in the section describing ourmodel, where the psychophysicalpsychometricfunctionswere approximatedquite well by the simulationpsycho-metric functions, this constraint can be implemented inour pyramid model as well, and it leads to performanceasgood as that of the subjects.

To summarize, we can conclude that our simulationmodel can account quite well for the results of ourpsychophysicalexperiments on detection of straight andcurved targets. Furthermore, our model allows forimplementingsome top-downconstraintsthat may resultfrom knowledge about properties of the target to bedetected. This fact agrees with our psychophysicalresults. First, in Experiment 2, the detectability of astraight line was clearly better then the detectability ofother lines. This result shows that some informationabout the shape of the target can be used to improvedetectability. Second, we showed in our Experiment 2that the detectability of circular arcs (whose shape wasconstant in a session and known to the subject) was thesame as detectabilityof an irregularly shaped curve withlocal angle ~ 10 deg (whose shape was random andunknown to the subject). This result suggests thatknowledge of shape is not always used, and this resultwas consistent with our new definition of smoothness,accordingto which an arrangementof dots is classifiedasa smooth line (and thus detected) if.the change of localangle is small along the entire curve.These two groupsofresults show that curve detectabilityconformsto the newdefinitionof smoothness,but at the same time it allowsfor using knowledge about some propertiesof the target,provided that the knowledge refers to local propertiesofthe target.

To further study the role of knowledge about a target,Experiment3 was performed. In this next experimenttheeffect of knowledgeof a localproperty(orientation)and aglobal property (shape) of a curved target on itsdetectabilitywere tested. Experiment 2 already revealed

IN A NOISY IMAGE 1237

the lack of an effect of knowledge about a shape on itsdetectability. However, this result was obtained bycomparing detectability for circular arcs with that forirregularly shaped curves. As a result, the knowledgeabout the shape was confounded with the target shapeitself. To test the effect of knowledge about local andglobal properties of a target, unconfounded with thetarget’s shape, Experiment 3 tested the effect ondetectabilityof:

1.2.

3.

Knowledge of the orientationof a circular target;Knowledge of the shape of an irregularly shapedcurve; andKnowledge of both the shape and orientation of anirregularly shaped curve.

EXPERIMENT3: THE ROLE OF KNOWLEDGEOFTHE SHAPE AND ORIENTATIONOF A TARGET

Consider first theoretical predictions of how knowl-edge about a target can be used. In general, suchknowledge can be used in two ways: globally, whilemaking a decision about whether a seen figure is thetarget, or locally, to restrict the number of acceptedfragments of figures. The two possibilities of usingknowledgeaboutthe target have differentimplications.Ifthe knowledge is to be used globally, the figure found inthe image must be compared with the expected target. Inthis method, there is no restriction on the type ofinformationthat may be useful. It is clear, however, thatthis method cannot be implemented in the pyramidalgorithmbecause global information about the target isnot availableto nodeson the layers that are lower than thelayer of the root node (the root is a node which can “see”the entire stimulus). Clearly, the pyramid algorithm canuse only local information. More precisely, only suchlocal information can be used, which can be translatedinto rules that can be applied locally, so that many nodesin the pyramid structure, including the nodes that do not“see” the entire stimulus, can use this information torestrict the number of possible targets. As a result, thepyramid algorithm predicts benefit from knowing boththe shape and orientationof a target and no benefit fromknowing only the shape of an irregularly shaped target.

Methods

Subjects. Three subjects, including the two authorswho served in the first and second experiments, weretested. The third subject, PG, was an emmetrope. He didnot have any prior experience as a subject in psycho-physical experiments. He was given extensive practicefor the current experiment.

Stimuli.Only two types of targetswere used: a circulararc with local angle of 10 deg and an irregularly shapedcurve with local angle of *2O deg. The targets hadconstantlength and density,the same for all sessionsfor agiven subject.SubjectsMSG and PG were presentedwithtargets of length 600 and density 0.03, whereas subjectZP was presented with targets of length 400 and density0.045.


MSG3.0

1

❑ arc 10❑ irregular 20-order 12.5❑ irregular 20-order 2

2.0 Ei irregular 20h n

d’ 1.5

1.0

0.5

0 rlnknownshape knownshape knownshape&orient &orient

ZP3.0

F

❑ arc 10

2.5 H irregular 20-order 1❑ irregular 20-order2 m

2.0

d’ 1.5

1.0

0.5

n.unknown shape knownshape knownahape

&orient &orient

PG3,0

2,5

2.0

d’ 1.5

1.0

0.5

0 unknownshape knownshape knownshape&orient &orient

FIGURE 14. Results of Experiment3. The three groups of bars represent the three levels of familiarity with the target. Theindividualbars represent the followingconditions:(fromthe left): irregularlyshapedcurvewith unknownshapeandorientation,circular arc with unknownorientation, irregularlyshaped curve with knownshape but unknownorientation(order 1, order 2),

circular arc with knownorientation, irregularly shaped curve with knownshape and orientation (order 1, order 2).

Procedure. The experiment consisted of five sessions.The order of the sessions was randomized for eachsubject. In each sessiononly one type of target was used.The following targets were used:

1. Circular arc (local angle 10 deg) with randomizedorientation (as in Experiment 2);

2. Circular arc (local angle 10 deg) with fixed orienta-tion;

3. Irregularlyshapedcurve (local angle *2O deg)withrandomizedshape and orientation(as in Experiment2);

4. Irregularlyshapedcurve (localangle *2O deg)withfixed shape, and randomized orientation;

5. Irregularlyshapedcurve (local angle *2O deg) withfixed shape and orientation.

The position of the target was randomized in allconditions. Note that conditions 1, 2, 4, and 5 use aconstantand knownshape(the subjectswere familiarizedwith the shape before the session). Condition 3 uses arandomlychangingshapeof the same type as in condition4 and 5. Conditions1,3, and 4 use randomorientationsofthe target and conditions2 and 5 use fixed orientationsof


the target. To be able to compare conditions 4 and 5 itwas important that they used a curve with the same shape.Otherwise, the advantage of condition 5 over 4 could berelated to the fact that the shape in condition 5 wascoincidentally easier to remember, rather than to the factthat orientation was fixed. Note, however, that using thesame shape in conditions 4 and 5 could give rise to anorder effect, for example, if condition 5 was run aftercondition 4, the better detectability in condition 5 couldbe attributed to the fact that the subject had moreexperience with this curve while running condition 5. Tounconfound the effect of order, shape and orientation,conditions 4 and 5 were repeated twice for each subject:in order 4, 5 and in order 5, 4. In each such set of twosessions the same shape was used.

At the beginning of each session the subject was showna sample target for unlimited inspection. The subjectswere informed what properties of the target would stayconstant in a given session, and were instructed tocarefully examine and remember the sample target. Therest of the details of the experimental design were thesame as in the second experiment.

Results and discussionThe results are shown in Fig. 14. We will concentrate

here on describing and discussing two results that wereobserved for all three subjects. First is the lack of effect ofknowing the shape of an irregularly shaped target on itsdetectability when its orientation is unknown (condition 3vs condition 4). Second is the improvement of detect-ability in the case of a target whose shape and orientationare known (conditions 2 and 5), as compared to a targetwhose shape is known but whose orientation is unknown(conditions 1 and 4, respectively).

Consider now in more detail the case of a circular arc(conditions 1 and 2). Condition 1 was identical to one ofthe conditions in Experiment 2. As expected, theperformance in this condition of the two subjects whoparticipated in all three experiments (MSG and ZP) wasidentical in both experiments. Next, consider condition 2in the present experiment. In this condition, theorientation of the circular arc was constant and knownto the subject. This knowledge could have been used bythe subject to improve performance, according to ourpredictions derived from the pyramid model (rule 2a inthe Discussion section of Experiment 2). More exactly,performance in this condition was expected to becomparable to performance for the case of a straight linetarget in Experiment 2 because the local uncertaintyabout the direction of a line, as predicted from any pair ofdots, is the same for a straight line with unknownorientation and for a circular arc with known orientation.And, in fact, the performance in condition 2 in the presentexperiment for subjects MSG and ZP was similar to theirperformance in the case of a straight line target inExperiment 2.

Clearly, the results of this experiment are consistentwith the predictions of the pyramid model. Only someinformation about the target can be used in a detection

task. The information which can be used is restricted toinformation that can be expressed as a set of simple rulesthat can be applied locally by nodes at different levels ofthe pyramid. Thus, the information about the shape of anirregularly shaped target is not likely to be useful locallybecause this shape is a global property. The orientation ofa known shape, however, can be used locally.

We conclude that the psychophysical results obtainedin all three experiments can be adequately explained byour pyramid model, after the new definition of smooth-ness is supplemented by the possibility of using someprior knowledge about the local properties of a target.This knowledge can be implemented in the form ofadditional rules similar to rules 2, 2a or 3 in the currentversion of the model.

SUMMARYAND CONCLUSIONS

In this paper we have analyzed one of the Gestalt lawsof organization that are involved in the phenomenon offigure-ground segregation, namely, the law of goodcontinuation. Almost all prior theories assumed thatfigure-ground segregation must operate in a bottom-upfashion and, thus, is pre-attentive. This assumptionseemed to be necessary to assure that this first stage ofvisual processing is fast. We showed, however, that thisassumption is too restrictive and it does not lead to apsychologically plausible explanation of the phenomen-on of figure-ground segregation.

To explain the perceptual mechanisms underlying thelaw of good continuation, we proposed a new theorybased on an exponential pyramid algorithm. This theoryis different from prior theories in several respects:

1. It does not assume purely bottom-up (pre-attentive)processing. Instead, it also allows for the effect ofknowledge of, or familiarity with, the target to bedetected.

2. It does not involve purely local or purely globalanalysis of an image. The processing involves acombination of both these types of analysis.

3. Our definition of smoothness is rooted in thelikelihood, rather than the simplicity, principle.

We showedthat our new modelcan accountfor a widerrange of psychophysical results, as compared to priormodels. Furthermore, the new predictions of our modelwere confirmed by the results of our experiments. Webelieve that thismodel can be further elaboratedso that itcan explain not only the law of good continuation, butalso the general phenomenon of figure-ground segrega-tion.

Finally, we briefly discuss one element of themethodology of our approach. Traditionally, psycholo-gical research has been conducted by first formulating anew theory that made predictions qualitatively differentfrom predictions of prior theories [e.g. Hering’s (1878),opponent process theory vs Young–von Helmholtz’s(1852), trichromatic theory of color vision], and thenperforming experiments testing these predictions [e.g.Hurvich and Jameson’s (1951), experiment on fusion of


green and red]—we call this approach “qualitative”.Since the predictions were qualitatively different, theresultsof such experimentswere usuallyunequivocalandthus easy to interpret. Another approach (we call itquantitative) involves formulating a new theory thatmade predictions different from the predictions of priortheories, but the differences did not have to bequalitative. In such cases, experiments were usuallyfollowed by mathematical or computational modelingand fittingthe model to the data. The theory that led to thebetter fit was accepted (if none of them provided asatisfactoryfit, none was accepted).The advantageof the“quantitative”approach was that it involved quantitativemodels and, thus, the theory was formulated in a moreprecise way. The disadvantagewas, however, that in thepresence of variability in the subject’s responses it isoften difficult, if even possible, to decide conclusivelythat one theory is clearly better than others if the onlydifference between the theories being compared is thedegrees of relationships, rather than their presence ordirection [e.g. see Luce (1986) for a discussion ofdifferent models of the speed–accuracytradeoffl.

Our approach is more conservative.On the one hand,we believe that formulatinga theory in a quantitativewayis important because only then can one be sure that noimplicit assumptions are involved in the theory. On theother hand, we believe that the comparison of the newtheory to previous theories should be made by deriving,from the new theory, predictions that are qualitativelydifferent from the predictionsof prior theories. The newtheory is accepted if experimental results are quantita-tively consistent with the new theory and qualitativelydifferentfrom predictionsof prior theories.By doing thisone can avoid the problems inherent in a purelyqualitative or purely quantitative approach and retainthe advantages of both approaches [in fact, we appliedthis kind of approach not only in this present study, butalso in our prior work: Pizlo (1994); Pizlo et al. (1995);Pizlo & Salach-Golyska(1995)].

REFERENCES

Alter, T. D. & Basri, R. (in press). Extracting salient curves fromimages: an analysis of the saliency network. Proceedings of theIEEE Conference on Computer Vision and Pattern Recognition.IEEE Computer Society Press, in press.

Beck, J. (1966). Effect of orientation and of shape similarity onperceptual grouping.Perception and Psychophysics, 1,300-302.

Ben-Av, M. B. & Sagi, D. (1995). Perceptual groupingby similarityand proximity: Experimental results can be predicted by intensityautocorrelations. Vision Research, 35, 853–866.

Braunstein, M. L. (1976). Depth perception through motion. NewYork: Academic Press.

Burbeck, C. A. & Yap, Y. L. (1990). Two mechanisms forlocalization? Evidence for seperation-dependent and separation-independent processing of position information. Vision Research,30, 739–750.

Burt, P. J. (1981). Fast filter transforms for image processing.Computer Graphics and Image Processing, 16, 20-5L

Caelli, T. M., Preston,G. A. N. & Howell,E. R. (1978).Implicationsof.spatial summation models for process of contour perception: Ageometric perspective. Vision Research, 18,723-734.

Campbell, F. W. & Robson, J. G. (1968). Application of Fourier

analysis to the visibility of gratings. Journal of Physiology, 197,551-566.

Cutting, J. E. (1986).Perception with an eye for motion. Cambridge,MA: MIT Press.

Dobbins,A., Zucker,S. W. & Cynader,M. S. (1989).Endstoppingandcurvature. Vision Research, 29, 1371–1387.

Field, D. J., Hayes,A. & Hess, R. F. (1993).Contourintegrationby thehumanvisual system:Evidencefor a local “associationfield”. VisionResearch, 33, 173–193.

Finney, D. J. (1971). Probit analysis. Cambridge: CambridgeUniversityPress.

Gibson, J. J. (1950). The perception of the visual world. Boston:HoughtonMifflin.

Gibson, J. J. (1979). The ecological approach to visual perception.Boston: HoughtonMifflin.

Grossberg, S. & Mingolla, E. (1985a) Neural dynamics of formperception: Boundary completion, illusory figures, and neon colorspreading.Psychological Review, 92, 173–211.

Grossberg,S. & Mingolla,E. (1985b)Neural dynamics of perceptualgrouping: Textures, boundaries, and emergent segmentations.Perception and Psychophysics, 38, 141–171.

von Helmholtz, H. (1852). On the theory of compound colors.Philosophical Magazine, 4, 519–534.

Hering, E. (1878).Zur Lehre vom Lichtsinn. Vienna: Gerold.Hurvich,L. M. & Jameson, D. (1951).The binocular fusion of yellow

in relation to color theories. Science, 114, 199–202.Johansson, G. (1977). Spatial constancy and motion in visual

perception. In Epstein, W. (Ed.), Stability and constancy in visualperception: mechanisms and processes (pp. 375419). New York:Wiley.

Jolion,J. M. & Rosenfeld, A. (1994) .Apyramidalfiamework for earlyvision. The Netherlands:Kfuwer,Dordrecht.

Julesz, B. (1962). Visual pattern discrimination.IRE Transactions onInformation Theory, 2, 84-92.

Koenderink, J. J. (1984). The stmcture of images. BiologicalCybernetics, 50, 363–370.

Koffka, K. (1935). Principles of Gestalt Psycholo~. New York:Harcourt Brace.

Leeper, R. (1935). A study of a neglected portion of the field oflearning-the development of sensory organization. Journal ofGenetic Psychology, 46, 41–75.

Luce, R. D. (1986). Response times. New York: Oxford UniversityPress.

Macmillan, N. A. & Creelman, C. D. (1991). Detection theory: Auser’s guide. Cambridge:CambridgeUniversityPress.

Marr, D. (1982). Vision. New York: W. H. Freeman.Meer, P. (1989). Stochastic image pyramids. Computer Vision

Graphics and Image processing, 45, 269–294.van Oeffelen, M. P. & Vos, P. G. (1983). An algorithm for pattern

descriptionon the level of relative proximity.Pattern Recognition,16, 341–348.

Pizlo, Z. (1994). A theory of shape constancy based on perspectiveinvariants. Vision Researchj 34, 1637–1658.

Pizlo, Z., Rosenfeld, A. & Epelboim, J. (1995). AU exponentialpyramid model of the time-course of size processing. VisionResearch, 35, 1089–1107.

Pizlo, Z. & Salach-Golyska, M. (1995). 3-D shape perception.Perception and Psychophysics, 57, 692–714.

Rock, I. (1983).The logic ofperception. Cambridge,MA: MIT Press.Rosenfeld, A. (1990). Pyramid algorithms for efficient vision. In

Blakemore, C. (Ed.), Vision: Coding and efficiency (pp. 423-430).Cambridge:CambridgeUniversity Press.

Sha’ashua,A. & Unman, S. (1988).Structural saliency: The detectionof global salient structures using a locally connected network. InProceedings of the International Conference on Computer Vision

(PP.321-327). Washington,D.C.: IEEE ComputerSociety press.Shepard, R. N. & Cooper L. A. (1982). Mental images and their

transformations. Cambridge,MA: MIT Press.Smits, J. T. S. & Vos, P. G. (1986). A model for the,.perception of

curves in dot figures: The role of local salience of ‘{virtuallines”.Biological Cybernetics, 54,407416.


Smits, J. T. S., Vos, P.G. & vanOeffelen,M. P. (1986).Tireperceptionof a dotted line in a noise: A model of good continuationand someexperimental results. Spatial Vision, 1, 163–177.

Tanimoto, S. & Pavlidis, T. (1975). A hierarchical data structure forpicture processing. Computer Graphics and Image Processing, 4,10*1119.

Treisman, A. & Gelade, G. (1980). A feature integration theory ofattention. Cognitive Psychology, 12, 97–136.

Uttal, W. R. (1975). An autocorrelation theory of form detection.Hillsdale, NJ: Lawrence Erlbaum.

Vos, P. G. & Helsper, E. L. (1991). Modelling detection of linearstructure in dot patterns. In Dvorak, 1. & Holden, A. V. (Eds),Mathematical approaches to brairrfimctional diagnostics (pp. 429–440). New York: Manchester UniversityPress.

Watt, R. J. (1987). Scanning from coarse to fine spatial scales in thehuman visual system after the onset of the stimulus.Journal of theOptical Society of America A, 4, 200&2021.

Wertheimer,M. (1923/1958).Principles of perceptual organization.InBeardslee, D. C. & Wertheimer, M. (Eds), Readings in perception

(Pp. 115-135). princeton, NJ: D. van Nostrand.Yuen, H. K., Princen, J., IIlingworth, J. & Kittler, J. (1990).

Comparativestudy of Hough transform methods for circle finding.Image and Vision Computing, 8, 71–77.

Zeki, S. (1993).A vision of the brain. Oxford: Blackwell ScientificPublications.

Zucker, S. W. (1985). Early orientation selection: Tangent fields andthe dimensionalityof their support.Computer Vision, Graphics, andImage Processing, 32, 76103.

Zusne, L. (1970). Visual perception of form. New York: AcademicPress.

Acknowledgement—The authors wish to thank the anonymousreviewers for useful comments and suggestions.

curve detection in a noisy image - carnegie mellon universitysamondjm/papers/pizloetal1997.pdf ·...

Documents