
Machine Vision and Applications (2007) 18:331–342
DOI 10.1007/s00138-006-0065-8

ORIGINAL PAPER

Retina simulation using cellular automata and GPU programming

Stéphane Gobron · François Devillard · Bernard Heit

Received: 14 June 2006 / Accepted: 19 October 2006 / Published online: 24 January 2007
© Springer-Verlag 2007

Abstract This article shows how the architectural modelization of the biological retina allows real-time performances on standard widespread computing systems. First, we describe the biological retina with regard to its pipeline architecture, detailing its layer behaviours and properties. Then we propose a corresponding pipelined model of artificial retina based on cellular automata. In this work, the main innovation is the computing method, based on the programming of a personal computer graphical card using the OpenGL shading language. The last section demonstrates the efficiency of our model through numerical and graphical results. We lay particular emphasis on the fact that our direct implementation on the Graphical Processor Unit (GPU) provides computation power about 20 times as fast as conventional programming.

S. Gobron
Movement and Perception laboratory, University of Marseille 2,
163, avenue de Luminy, 13288 Marseille cedex 9, France
e-mail: [email protected]
URL: http://www.laps.univ-mrs.fr/gobron

F. Devillard (B)
UMR CNRS 7039 (CRAN-CNRS), Henri Poincaré University,
Saint-Dié-des-Vosges Institute of Technology,
11, rue de l'Université, 88100 Saint-Dié-des-Vosges, France
e-mail: [email protected]

B. Heit
UMR CNRS 7039 (CRAN-CNRS), Henri Poincaré University, ESSTIN,
2, rue Jean Lamour, 54500 Vandœuvre-lès-Nancy, France
e-mail: [email protected]

Keywords Computer vision · Real-time computing · Artificial retina · Graphical Processor Unit programming · OpenGL shading language · Cellular automata

    1 Introduction

Today, computer vision requires increasing visual information to describe and analyse natural scenes efficiently. Visual clues such as shapes, movements, and colours bring additional information about the observed scene which is necessary for a reliable interpretation. The simultaneous perception of several properties provides a better understanding of the scene. Methods applied to image processing often fail because of their excessive specialization and their lack of flexibility, whereas biological visual systems demonstrate outstanding achievements. The alternative bio-inspired methods have produced good results but require costly computing systems to reach real-time processing. So using graphical processor unit programming to avoid both constraints mentioned above, and to provide flexibility, real-time computation and rendering, is highly relevant here.

    1.1 Background

We have recently observed real-time systems dedicated to retinian implementations, essentially based on Application-Specific Integrated Circuit (ASIC) components. Here are a few examples from the related literature:

- analog components [4,9,19];
- more rarely, digital Very Large Scale of Integration (VLSI) components [27].


These works implement photodetectors and basic processings and so are classified as artificial retinas. Most of them are extremely specific, while our work puts forward an original, real-time and flexible model using cheap and widespread programmable systems.

    1.2 Overview

The paper is organized as follows. In Sect. 2 we first describe the biological retina, emphasizing its pipelined structure and corresponding layer properties. The next two sections develop the heart of our model. Section 3 describes the data flow technical architecture; Sect. 4 details the reasons for using GPU programming and how to compute each retina layer resorting to a cellular automaton on GPU. Results in terms of computational rates and corresponding graphical plates are presented in Sect. 5. Finally, Sect. 6 concludes this paper and puts forward some of the related future work, including parallel pipeline and three-dimensional upgrades.

    2 Methodology

Our algorithm describes the architectural modelization of the retinian pipeline processing. The resulting implementation requires well-adapted computing systems. In this article, we show that a standard personal computer solution directly using the parallel GPU architecture produces spectacular performances, especially in this kind of application. Results (Sect. 5) show that direct and optimized programming using a mixed algorithm (Object-Oriented Language, OOL [7], and OpenGL Shading Language, GLSL [22]) first reduces software development compared to very low-level programming, and second reduces the computing times compared to a conventional method. Therefore, the proposed implementation obtains real-time performances on standard widespread computing systems.

To understand the method of implementation, we need first to understand well how the biological retina works (keeping in mind the computer science (CS) context). The following subsections describe the global architecture of the biological retina, then the specific key role of the Outer and Inner Plexiform Layers (respectively OPL and IPL [11]) and finally the link between data flows going through layers in view of their respective functionalities.

    2.1 Biological retina

The mechanisms of visual perception [10,11] (see Fig. 1a) start with a scene acquisition (symbolized by black arrows at the top of the figure) through photoreceptors and end with several data flows specialized in characteristics of that scene. The retina is composed of several layers, organized to filter visual information. Retinian filters behave similarly to low-pass frequency spatiotemporal filters modelized by signal processing [5,28].

Figure 1 shows the pipelined architecture of the retina: Fig. 1a being a vertical section through a human retina, we suggest grouping the main functionalities of the retina around the OPL and IPL layers presented in Fig. 1b and c.

The photopic visual perception starts with the scene captured by photoreceptors (cone and rod cells). Rod cells hold photoreceptor properties specialized for low light intensity, which are not required for the current approach; this is why only the cone cells are taken into account. Image luminance is sampled, then transmitted by the cone layer towards the downstream retinian and cortical layers. A pipeline of cellular layers can be identified going from the cone to the ganglionar layers [14]. Two functional layers, the OPL and IPL, classify the pipeline into two functionality categories.

Through the biological retina, each layer is composed of a network of interconnected cells linked by more or less distant connections. Therefore, each cell can be modelized by a processor or a basic filter which contains:

- several inputs established by connections with a variable number of neighbouring cells;
- an output which performs the cell activity, comparable to a low-pass frequency filter.

    2.2 OPL and IPL areas of neuropil

The OPL layer is where connections between rods and cones, vertically running bipolar cells, and horizontally oriented horizontal cells occur [20]. OPL properties are generated by the synaptic triad, which is composed of three kinds of interconnected cells:

- The cone cells constitute a layer of transduction and regularization processing. Transduction converts luminance into electrochemical potentials aimed at the downstream layers. Regularization consists in filtering input signals with a light low-pass frequency filtering. The cone cells are defined by their shapes (midget, diffuse, ...), their types of response (on, off) and their functions (colour sensibilities: LMS, respectively red, green, and blue colours). In addition, their behaviours depend locally on the luminance intensity and contrast [29].


Fig. 1 Three ways of showing the human retina architecture, in terms of: a biological section through a real retina; b region names and general pipeline of processes; c pipeline functionalities

- The horizontal cells constitute a layer of strong regularization processing. The output response is computed from the cone cells' output to elaborate a spatial average of the image intensity [29].

- The bipolar cells make the difference between the horizontal (luminance average) and the cone outputs. So bipolar cells estimate the local contrasts of the image intensity to increase visual indices. Like cone cells, bipolar cells vary considerably according to shapes (midget, diffuse, ...), types of response (bipolar on, off, ...) and functions (contrast estimation: red/green, blue/green + red, ...) [12].

The bipolar axons transfer the OPL outputs to the IPL area. The IPL processing is performed by amacrine and ganglion cells:

- Amacrine cells have multiple functions. Their main function is to transform a temporal variation into a peak signal, typically similar to high-pass frequency filtering [17].

- Ganglion cells make the difference between the bipolar cells' (contrast evaluation) and the amacrine cells' outputs. The ganglion cells are classified like bipolar cells according to their shapes (midget, diffuse, ...), their types of response (on, off, on + off), and their functions (spatiotemporal contrast estimation and luminance estimation) [12].

Because of the various types of cells, the retina pipeline is composed of three pathways which sort visual information (luminance, contrast, colours, and movement). The known pathways in the inner retinian areas are called K, M, and P: respectively, the koniocellular, magnocellular, and parvocellular pathways (see Fig. 1).

    2.3 The retinian data flows

    Gradually, the visual information flow is divided intothree known pathways with weak redundancies. Theretinian output provides split information, organizedalong three specialized pathways:

- The koniocellular path (K) presents numerous features which make it sensitive to luminance and chrominance [6].

- The magnocellular path (M) carries out an initial processing step which is essential to movement perception. Antagonistic receptive fields are used [3].

- The parvocellular path (P) achieves a pre-processing step enabling it to extract the visual clues characterizing the objects. The cells making up the P pathway are by far the most numerous. They are small-sized, tonic (of less sensitivity to temporal modulations), and situated mainly in the central retina. Sensitive to high spatial frequencies (thin details or small objects) and to low time frequencies, they are the seat of visual acuteness and thus code the specific position of the stimulus and the colour [3].

The previous description of the whole retina showed how difficult it is to modelize this matter accurately. The inner retinian mechanisms (feedback, adaptation, ...) are complex and sometimes not even located; therefore not all have been studied in depth (for instance the K pathway). The proposed modelization uses a layer organization which copies the architecture shown in Fig. 1. The whole processing approaches the P pathway of the retina. The P pathway is considered in an achromatic version. We have designed a generalized layer (shared by each kind of cell) intended to give it the main properties. This layer is repeated to build the retinian architecture.


Fig. 2 Equation graphs presenting the four steps of our algorithm implemented for a 1D layer. The variable x represents the cell position and k the time value, with unit-scale samplings. z⁻¹ is the unit delay operator

    3 Approach

    3.1 Layer modelling

Each layer is built by a network of cells. Each cell is individually a spatiotemporal low-pass filter. Figure 2 shows how our algorithm computes for a cell considered in a 1D spatial dimension network. The cell has an output s (cell activity) and takes the signal e and its close neighbourhood as inputs. The output s is obtained by an iterative computation. Four performing steps are necessary, illustrated by Fig. 2. Variables x and k are, respectively, the location and the instant, using a unit sampling. All layers use our cellular algorithm with an adapted setting depending on their functionalities.

The resulting filter is composed of two categories of pipelined filters: a Finite Impulse Response (FIR) filter and an Infinite Impulse Response (IIR) filter, which smooth cell inputs spatially and temporally, respectively.

The parameters that spatially influence the cell output are n and α. They simulate the lateral interaction between neighbouring cells. The FIR filter coefficient α (see Fig. 2) performs the influence between two immediate neighbouring cells. n is the number of FIR filter iterations that is applied to each cell. Thus the values α and n adjust the dendritic spreading out of each cell (2n + 1 by 2n + 1). Coefficient α is adjusted to obtain a pseudo-Gaussian low-pass frequency response (best approximation for α = 0.25) [5]. The parameter that temporally influences the cell output is r. It creates a persistence effect on the cell output (produced by the cell membrane capacity).

The usual cell layer has input connections with one downstream layer. There are two exceptions: the transduction layer (luminance is the input for the cone layer) and the differentiation layers (two downstream layers' outputs are inputs for the bipolar and ganglion layers). Cone layers sample another luminance image I for each k value (which forms a video sequence input). Spatially, we consider that layer and cell output are, respectively, modelized by image and pixel in our processings. Furthermore, to simplify complexity, cone cells are considered as achromatic photoreceptors.

    3.2 Cellular automaton modelling

The implementation of our dynamic model is based on simple rules of cell operation and neighbourhood evolution. Thus, we have naturally chosen to model each cell layer through a cell network associated with a cellular automaton [26]. The cellular automaton constitutes a network of cells sharing common geometrical properties, to which a change of state (a definite set of rules) is applied according to a common synchronization. Cellular automata enable us to approach the local properties of the retinian layers simply, from an architectural and behavioural point of view. The simplicity of the cells and connections makes it possible to consider efficient implementation using GPU processors, characterized by their highly parallel architectures and their reduced instruction sets. Each cell follows rules of operation induced by its state. The change of state of the cell is conditioned by the iteration number n.

    Each cell is a state machine. It is composed of statesassociated with behavioural rules. The four steps consti-tute an iterative cycle of processing which is performedby the cellular automaton. The individual cell behaviourof the automaton is driven by the four following steps:

- Step #1: input sampling;
- Step #2: FIR filter performing (locally various numbers of iterations);
- Step #3: IIR filter performing (one iteration);
- Step #4: automaton synchronization.

Step 1 (one iteration) consists in sampling the upstream layer output for all the cells (or the brightness signal for the cone layer). The cell remains in step 2 (duration of n iterations) as long as the localized spatial filtering is


Table 1 Table of rules of the CA driving a 1D retinian layer. ē and s̄ are the sampled internal values of, respectively, e and s in each CA cell

Step   Action                                                       Condition         Counter update
#1     ēx,k ← ex,k                                                  ix < 0            ix ← ix + 1
#2     s̄x,k ← (1 − 2αx,k) ēx,k + αx,k ēx+1,k + αx,k ēx−1,k          0 ≤ ix < nx,k     ix ← ix + 1
#3     s̄x,k ← (1 − rx,k) s̄x,k + rx,k s̄x,k−1                        ix == nx,k        ix ← ix + 1
#4     sx,k ← s̄x,k                                                  ix > nx,k         ix ← −1

not successful. The cell then moves on to step 3 (duration of one iteration). The cell remains at step 4 (variable duration) as long as all the other cells have not individually reached this step.

For a row of cells, this processing method is presented in Fig. 2. The computing steps for the cell at location x are given by a counter value ix (initialized to −1 at the start). Table 1 shows the actions and activation conditions for all processing steps:

The 2D release of the network described in Fig. 2 slightly complicates the whole processing. The network grid has a square shape and we have chosen to use only four connections for each cell. Thus each cell is connected with the four cells of its immediate neighbourhood, along the horizontal and vertical directions centred on the processed cell.

    4 GPU retina simulation

    4.1 Retina pipeline software design

Similarly to [2] (which suggests a CPU-based OPL simulation), our representation of the retina pipeline is modelized by using a succession of two-dimensional matrices, often associated with a cellular automaton (CA) for computation.

For readers not familiar with CA, Gérard Vichniac [8] gives the following definition, fitting the issues of the current retina model: "Cellular automata are dynamical systems where space, time, and variables are discrete. Cellular automata are capable of non-numerical simulation of physics, chemistry, biology, and others and they are useful for faithful parallel processing of lattice models. In general, they constitute exactly computable models for complex phenomena and large-scale correlations that result from very simple short-range interactions." The technical design of our model also covers other specific domains

using CA. Here is a list of references presenting CA mixed with computer graphics and/or image processing:

- [21] and [1], respectively, for notions of grammar and computer engineering;
- [13,24,25] for CA and graphics-based simulations;
- [16] for neural networks (similar to CA) and image processing, especially segmentation;
- [15,18,23] for CA and GPU programming.

    Figure 3 shows OPL and IPL sections. They are com-puted by a pipelined architecture and composed of fivemajor steps:

From an input I (that can be an AVI video or a webcam), the information is first transmitted to the cone layer. The resulting computation C is then transmitted both to the horizontal and bipolar layers, where the result of the horizontal cells H is simultaneously subtracted. In a similar way, the bipolar cells' output B is transmitted to both the amacrine and ganglion layers. Finally, the ganglion outputs G are the result of the subtraction of the amacrine outputs A from B.

As shown in Fig. 3, each layer has a specific behaviour defined by one or two functions. These functions can be simulated through cellular automata with thresholds enabling real parameters.1 Cone cells have a light blur effect on the image that can be simulated with a cellular rule equivalent to step 2 (Table 1). Similarly to cone cells, horizontal cells produce a blur effect, but an intensive one. This effect is associated with a persistence property that is presented with the cellular rule of step 3 (Table 1). Bipolar cells provide output values (C − H). Amacrine cells use a persistence function which is already defined by the horizontal cells' representation. And finally, ganglion cells behave similarly to the bipolar layer, using (B − A) instead.

    1 Classical cellular automata use non-real parameters, and there-fore thresholds are not necessary.


    Fig. 3 Simplified retina pipeline viewed in terms of cellular layers

Fig. 4 Settings of different CA on the GPU

    Fig. 5 Graphical interpretations of Table 2: a outputs using a direct input file and GPU; b outputs using a webcam input and GPU

    4.2 Real-time computation using GPU

Due to the increasing needs of the video game industry (especially in terms of fast computation of large textures), graphical cards now have high computational potential. This potential can be accessed by using graphical processor units that can be directly programmed in assembly language or specific hardware languages. Fortunately, they can also be programmed with C-like languages called shader programming languages, such as GLSL (the OpenGL shading language proposed by the OpenGL ARB) or Cg (developed by nVIDIA); see [12,15,24,25]. Note that a graphical card is composed of a set of GPUs capable of parallelizing these shader programs and is therefore much more powerful than a single CPU of the same generation.

As mentioned in the previous section, we suggest using this powerful tool not for rendering purposes, but as a computational unit specialized in the retina pipeline data transformation. To do so, we first associate every cellular layer matrix with a two-dimensional graphical texture. Second, with GLSL we associate a general cellular automaton program which can be adjusted for each specific retina functionality (see Table 1).
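On the GPU, each shader pass renders one texture into another; the CPU-side equivalent is plain double-buffering. A minimal sketch (hypothetical C++; applyPass stands in for one shader invocation):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Tex = std::vector<double>;   // stand-in for a 2D texture

// Run `count` ping-pong passes: each pass "draws" src into dst through
// the stand-in shader, then the buffers are swapped so the output
// becomes the next input, as with GPU render-to-texture.
template <typename Pass>
Tex runPasses(Tex src, int count, Pass applyPass) {
    Tex dst(src.size());
    for (int i = 0; i < count; ++i) {
        applyPass(src, dst);
        std::swap(src, dst);
    }
    return src;
}
```

This is, for instance, how the nc and nh blur iterations of the cone and horizontal layers can be chained without ever copying data back to the CPU.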

The communication links between the CPU program (that controls the retina pipeline) and the GPU program (that controls local cellular behaviour) are presented in Fig. 4.

The first main task consisted in making our C++ software build the CA shader program (see Appendix). The second task was to transform the continuous input data flow into a 2D graphical texture buffer and send this texture as input to the GPU pipeline. As shown in Fig. 4, we have tested in this paper two types of input animations: an AVI file and a webcam buffer. From this starting point, the CPU program leads the GPU via specific


Table 2 Output image rates in frames per second (fps)

Blur factors   GPU approach, direct input file   GPU approach, webcam inputs   CPU model, webcam inputs
nc   nh        C     H     B     A     G         C    H    B    A    G         B-output
1    1         889   531   440   390   351       40   30   28   27   27        3
2    4         626   264   240   225   211       33   24   22   22   21        3
3    7         483   176   165   158   151       29   20   19   18   18        2
4    10        393   132   126   121   117       27   19   18   17   17        < 2
5    13        331   105   101   98    96        25   18   17   17   17        < 2
6    16        286   88    85    83    81        23   17   16   16   16        < 2
7    19        252   75    73    72    70        22   17   16   16   16        < 2
10   28        186   53    52    52    51        21   16   16   16   16


    Fig. 6 Simplified retina pipeline viewed in terms of textures

would decrease fps rates in a linear manner. For instance, using a resolution of 1280 × 960 as input would be four times slower than the results presented in this paper.

Figure 5a and b graphically illustrate the two right sections of Table 2. From this figure, we can infer that on the left-hand diagram the horizontal cell computation is far from being the most extended time, as shown in the right-hand diagram.

For a set of n couples (see the parameters of the layer modelling described in Sect. 3.1), Table 2 proposes the resulting frames per second (fps) at the C, H, B, A, and G outputs (see Fig. 6). As horizontal cells are characterized by a blur factor much higher than cone cells, we have selected the n couples so that nh increases three times faster than nc. The use of such a regular factor is justified by Bonnet [3], p. 13: sizes of receptor fields (simulated in this paper by the n couple) increase linearly with the retina eccentricity.

First, on the left-hand box, the table presents values assuming direct files are used as input, such as an AVI sequence or even a single image. Even for relatively high (nc, nh) values, results are always over the real-time threshold rate (around 25 fps). These high frame frequencies demonstrate that this model of artificial retina will soon allow the design of more advanced and powerful models (see perspectives in Sect. 6).

Resulting time expenses when using a webcam as input are shown in the right-hand box (without the right-hand column). The sharp drop in performance between the previous results and the ones shown in this section of the table is due to the specific webcam used: it has low-efficiency capture rates (less than 15 fps in the worst conditions) and frequent interruption requests which virtually freeze the hardware performance. Nevertheless, even with such poor-quality material, frame rates never dropped below 15 fps. Notice that due to webcam interruptions the resulting animation often appears jerky.

For a significant comparison of our model with CPU-based works [2], the last column of Table 2 presents the resulting fps for a similar CPU retina simulation. To make this possible, only the OPL layers' computational results (at the bipolar, i.e. B, outputs) are compared; further layers using CPU have much too slow rates. Based on experimental values (comparing the GPU column B with the CPU column) we can conclude that the approach presented in this paper is at least 18 times faster than the CPU solution. This coefficient of 18 can also be found when we compare the GPU fps rates with those of the CPU: respectively, 1,537 divided by 85 fps and 82 divided by 4 (see Table 3).

Table 3 Input image rates in fps

               Direct input file   Webcam inputs
GPU approach   1,537               82
CPU model      85                  4

For webcam tests, high values of fps are obtained by oversampling the video input buffer.

As the hardware used for the experiment in this paper is different from the one used in [2], we reproduced the same conditions and compared identical outputs to make an exact comparison between the GPU and CPU approaches. Table 3 presents the GPU- and CPU-based models' input video rates. Additionally, Fig. 7a compares the GPU and CPU bipolar outputs (see Table 2). Figure 7b presents the ratio of computational cost, in ms, divided by n. Resulting values are proposed in the left-hand table, and a corresponding graph was drawn on the right, respectively in pink/violet for cone and in dark blue for horizontal n values. Two interesting experimental values can be deduced from that diagram:

- First, as both functions aim at an identical constant value for n reaching high values, the time cost for a single blur functionality computation using GPU is approximately 0.47 ms.
- Second, cones appear to be more costly for low n values and then reach the tangent 0.47 ms value. The difference between cone and horizontal functionalities is only visible on the GPU resetting. Hence, as cone cells make the first GPU call, the GPU resetting cost should be about (0.734 − 0.473) ms, which is about 0.26 ms.


Fig. 7 Compared results: a CPU and GPU approaches using a webcam input, graphical interpretations of Table 2 (red for CPU and blue for GPU); b cone versus horizontal cells' shader computational expenses

Fig. 8 Main steps of the real-time retina pipeline: a input webcam animation; b bipolar output; c saturated (i.e. [0.0, 0.5, 1.0]) bipolar output; d amacrine output; e and f ganglion outputs, respectively with and without movements

    5.3 Graphics results

Figure 8 shows the main resulting steps of the retina pipeline using webcam inputs. As the main viewed object, we used a sheet of printed paper with a large, highly contrasted font: the letter Q is an interesting sample as it shows a natural loop crossed by a segment; for a more complex character visual simulation, we used the Japanese kanjis ni-hon (actually meaning Japan).

This plate shows, in particular, that from an average/low-contrast input sequence (Fig. 8a) our model produces the expected behaviours for each main retina layer in real time. Figure 8b illustrates the bipolar information enhancing: localization, magnitude, and length of the light. These crucial properties can be used for edge detection. We tried to focus on that contour determination, using a threshold b on cell brightness cb in Fig. 8c, respectively: if cb < (0.5 − b) then cb ← 0.0; else if cb > (0.5 + b) then cb ← 1.0; else cb ← 0.5.
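The three-level rule above is straightforward to implement; a short C++ sketch (keeping the threshold name b from the text):

```cpp
// Saturate a bipolar brightness cb in [0, 1] to one of three levels,
// using a band of half-width b around the mid-grey value 0.5.
double saturateBrightness(double cb, double b) {
    if (cb < 0.5 - b) return 0.0;   // dark side of the band
    if (cb > 0.5 + b) return 1.0;   // bright side of the band
    return 0.5;                     // inside the band: mid-grey
}
```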

    Figure 8d presents the amacrine output cellular layerwhich is similar to delayed bipolar outputs.

The last two pictures, presenting the ganglion outputs, give precious information. The contrast between Figs. 8e and f is obvious: Fig. 8e presents immediate contour detection of the object even with a very slight movement, giving a depth visual impression. On the contrary, Fig. 8f demonstrates that if objects stop moving (and as the webcam cannot move by itself, as human eyes can), all the scene fades into greyish colour intensities and only the pure grey contour of the sheet of paper remains. These effects can be tested by anyone looking straight at a contrasted object.

Figure 9 shows a comparison, on a ganglion layer, between a fast and abrupt hand movement (Fig. 9a) and a slightly detectable one, the tip of the forefinger (see Fig. 9b). From these two pictures, five properties can be observed: the faster the movement, the brighter the apparent contrast seems (see Fig. 9a, b); behind, the object contour also appears contrasted (see Fig. 9a), similar to comic strips; the directions of moving objects are also obvious (see Fig. 9a); all the scene seems to disappear as movement decreases (see Fig. 9b); on the contrary, a motionless object is visible when a movement occurs (see Fig. 9a).

    The authors would like to point out that characteris-tics and properties demonstrated by using the model


Fig. 9 Ganglion multifunctionalities: a fast movement; b slight motion

we presented have been identified insofar as real-time computation was possible. Hence, studying a GPU-based architecture was not only a way to obtain frame-rate improvements, but also an open door to major improvements in the field of the artificial retina.

    6 Conclusion and future work

Focusing on the OPL and IPL layers, we presented a new, original model in order to simulate retina cellular layer behaviours. We showed that real-time simulation was possible, using GPU programming based on the OpenGL Shading Language. After describing the biological retina, we first identified how its layer behaviours could be interpreted in terms of a pipeline. We then proposed a computer-aided design for this pipeline, detailing our cellular automaton GPU approach and its relationship with the CPU program. To support the model's efficiency, we specified the resulting time expenses and compared this new GPU model with a previous CPU-based approach. In particular, we demonstrated that the current GPU-based approach is at least 20 times as fast as a classical CPU model.

Many valuable aspects are to be further investigated. We are currently extending our model to a simulation specialized in real-time contour restoration and vectorization. A complete analysis of the influence of the parameters defined in this model (about 20 in the whole pipeline) should reveal interesting properties. In addition, another key area for further work is the influence of low-brightness visualization using rod cells. Photoreceptors' (i.e. cone and rod cells') spatial positioning and densities in the eye sphere have not been taken into account in this paper. Such a three-dimensional approach would lead to the design of a model capable of focusing automatically on specific image areas, thus vastly reducing the analysis cost. Furthermore, a non-trivial study would consist in adding the RGB data flow as the biological retina does, hence identifying separate colorimetric intensities. To conclude, we are convinced that those new fields of research will trigger off the study of multiple pipelined architectures and will lead to fast, low-cost, and efficient artificial retinas.

Acknowledgments We would like to thank Mr Hervé Bonafos for his technical assistance and advice concerning the OpenGL Shading Language. Special thanks are also due to Mrs Aline Caspary and Mrs Martine Faure for the proofreading of the draft and daily services.

    Appendix: Algorithm

To help understand the software architecture, here follows the general algorithm of our retina model.

    1. Initialization
       (a) init environment
       (b) init classical OpenGL
       (c) init OpenGL shading language (ARB calls auto-compile the shader programs)
       (d) init input source (for instance webcam or AVI file)
       (e) init texture

    2. Main loop
       (a) Run retina pipeline (see Fig. 6):
           open virtual projection buffer
           cone LCB²: inputTex → coneTex, call shader[ blur-CA( coneTex ) ] and scan virtual buffer nc times; StopPipeline if needed
           horizontal LCB: coneTex → persHoriTex, call shader[ blur-CA( persHoriTex ) ] and scan virtual buffer nh times, call shader[ persistence( persHoriTex, horiTex ) ]; StopPipeline if needed
           bipolar LCB: call shader[ subTex2D( coneTex - horiTex ) ], scan virtual buffer → bipoTex; StopPipeline if needed
           amacrine LCB: bipoTex → persAmacTex, call shader[ persistence( persAmacTex, amacTex ) ], scan virtual buffer → amacTex; StopPipeline if needed
           ganglion LCB: call shader[ subTex2D( bipoTex - amacTex ) ], scan virtual buffer → gangTex
           StopPipeline: close virtual projection buffer
       (b) User event management (mouse, keyboard interruptions, …)
       (c) Information management (current view, parameters, fps, …)
       (d) Scene rendering

    3. Release dynamic allocations

    ² LCB stands for layer cellular behaviour.
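    The per-pixel operations named in the pipeline above (blur-CA, subTex2D, persistence) can be sketched on the CPU. The following is an illustrative Python emulation on grayscale images stored as 2D lists of floats in [0, 1], not the GLSL fragment shaders used in the paper; the blur neighbourhood and weights, and the persistence decay factor k, are assumptions chosen for the sketch.

    ```python
    # Hypothetical CPU emulation of the pipeline's per-pixel shader operations.
    # Images are 2D lists of floats in [0, 1].

    def blur_ca(tex):
        """One cellular-automaton blur step: each cell averages itself with its
        4-connected neighbours (assumed neighbourhood; borders clamp to edge)."""
        h, w = len(tex), len(tex[0])
        def at(y, x):
            return tex[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
        return [[(at(y, x) + at(y - 1, x) + at(y + 1, x)
                  + at(y, x - 1) + at(y, x + 1)) / 5.0
                 for x in range(w)] for y in range(h)]

    def sub_tex2d(tex_a, tex_b):
        """Per-pixel subtraction clamped to [0, 1], emulating subTex2D(texA - texB),
        e.g. the bipolar (coneTex - horiTex) and ganglion (bipoTex - amacTex) stages."""
        return [[max(0.0, min(1.0, a - b)) for a, b in zip(ra, rb)]
                for ra, rb in zip(tex_a, tex_b)]

    def persistence(pers_tex, new_tex, k=0.5):
        """Temporal persistence: blend the previous state with the new frame.
        k is a hypothetical decay factor (its value is not given here)."""
        return [[k * p + (1.0 - k) * n for p, n in zip(rp, rn)]
                for rp, rn in zip(pers_tex, new_tex)]

    cone = [[0.8, 0.6], [0.4, 0.2]]
    hori = [[0.3, 0.3], [0.3, 0.3]]
    bipo = sub_tex2d(cone, hori)   # bipolar layer output
    ```

    On the GPU, each of these functions corresponds to one fragment-shader pass over a screen-aligned quad, with the virtual projection buffer copied back into a texture ("scan virtual buffer") between passes.
    
    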



    Author Biographies

    S. Gobron Specialized in software engineering and computer graphics, Stéphane Gobron embarked on an international studies plan at the age of 19, starting with a 4-year US B.Sc. degree followed by a 2-year M.Sc. degree in France. He then spent 5 years in Japan teaching and studying for a Ph.D. degree specialized in cellular automata at Iwate University. After graduating, his research activities focused on the numerical modelization of environmental phenomena. These studies led to a large range of applied domains in numerical computation and visual modelling, such as ceramics and glaze fracture propagation, metallic patina and corrosion simulations, three-dimensional material deformations, and virtual surgery. Recently, he has been teaching mainly computer science, computer engineering, physics, and computer graphics at the undergraduate level at Henri Poincaré University in France. His research interests mainly include developments relating to weathering, cellular automata, dynamic cellular networks, GPU programming, and layer-based retina simulation. Finally, he has a strong interest in industrial development and international relations, especially between the European Union, the United States of America, and Japan.

    F. Devillard François Devillard was born in Luxeuil-les-Bains, France, on October 27, 1963. He was awarded the B.Sc. degree in 1984 and the M.Sc. degree in 1986 from the University of Grenoble, France. He received the Ph.D. degree in 1993 in Signal, Image and Speech from the Institut National Polytechnique de Grenoble. Since 1994, he has been Assistant Professor at the Centre de Recherche en Automatique de Nancy, Henri Poincaré University, France. His first research interests were image and signal processing, where most case studies led to implementations with real-time performance. Currently his research is oriented toward the architectural modelization of the biological retina.


    B. Heit Bernard Heit was born in 1954 in France. He obtained his Ph.D. in electrical engineering in 1984 and from then on was Assistant Professor at the Centre de Recherche en Automatique de Nancy, Henri Poincaré University, France. Ten years later he was promoted to Associate Professor, and since 1997 he has been Full Professor at Henri Poincaré University, Nancy, France. His research interests include monocular passive vision, image processing, and their real-time computer implementation. Lately, his work has focused on the computer implementation of a retinal layer architectural model based on mechanisms observed in the biological visual system.
