

ELECTRONIC IMAGING NEWSLETTER NOW AVAILABLE ON-LINE

Technical Group members are being offered the option of receiving the Electronic Imaging Newsletter electronically. An e-mail is being sent to all group members with advice of the web location for this issue, and asking members to choose between the electronic and printed version for future issues. If you are a member and have not yet received this message, then SPIE does not have your correct e-mail address.

To receive future issues electronically please send your e-mail address to: [email protected] with the word EI in the subject line of the message and the words electronic version in the body of the message.

If you prefer to receive the newsletter in the printed format, but want to send your correct e-mail address for our database, include the words print version preferred in the body of your message.

JANUARY 2004, VOL. 14, NO. 1
SPIE International Technical Group Newsletter

Special Issue on: Smart Image Acquisition and Processing

Guest Editors
François Berry, Univ. Blaise Pascal
Michael A. Kriss, Sharp Lab. of America

Aliasing in digital cameras

Aliasing arises in all acquisition systems when the sampling frequency of the sensor is less than twice the maximum frequency of the signal to be acquired. We usually consider the spatial frequencies of the visual world to be unlimited, and rely on the optics of the camera to impose a cut-off. Thus, aliasing in any camera can be avoided by finding the appropriate match between the optics’ modulation transfer function (MTF) and the sampling frequency of the sensor. In most digital cameras, however, the focal-plane irradiance is additionally sampled by a Color Filter Array (CFA) placed in front of the sensor, composed of a mosaic of color filters. Consequently, each photo site on the sensor has only a single chromatic sensitivity. Using a CFA, a method first proposed by Bayer,1 allows a single sensor (CCD or CMOS) to sample color scenes. Missing colors are subsequently reconstructed, using a so-called demosaicing algorithm, to provide a regular three-color-per-pixel image.

In a CFA image acquisition system, a match between the optics’ MTF and the sensor’s sampling frequency is more difficult to establish because the sampling frequencies generally vary for each color (i.e. filter type). In the Bayer CFA, for example, there are twice as many green as red and blue filters (Figure 1a), resulting in different sampling frequencies for green and red/blue. Additionally, the horizontal and vertical sampling frequency for the green pixels is different from the diagonal frequency.

Thus, Greivenkamp2 and Weldy3 proposed an optical system called a birefringent lens that has varying spatial MTFs depending on wavelength. With such a lens it is possible to design a camera where the MTF of the optics matches the sampling frequency of each filter in the CFA. Thus, a color image could be reconstructed without artifacts. In practice, however, this method has not yet been applied because the resulting images are too blurry: they have a spatial resolution far lower than the resolving power of modern CCD or CMOS sensors. Most studies have therefore concentrated on how to reconstruct aliased images resulting from CFA camera systems where the optics is designed to pass high spatial frequencies.

If the captured scene has high spatial frequencies, the demosaiced image can contain visible artifacts. Depending on the scene content and the specific demosaicing algorithm used for reconstruction, they are more or less visible.

Figure 1. (Left) An example of a color filter array, the Bayer Color Filter Array. (Right) The Fourier representation of an image acquired with the Bayer Color Filter Array.

Continues on page 8.


Smart camera and active vision: the active-detector formalism

Active vision techniques attempt to simulate the human visual system. In human vision, head motion, eye jerks and motions, and adaptation to lighting variations are important to the perception process. Active vision, therefore, simulates this power of adaptation. Despite major shortcomings that limit the performance of vision systems—sensitivity to noise, low accuracy, lack of reactivity—the aim of active vision is to develop strategies for adaptively setting camera parameters (position, velocity, ...) to allow better perception. Pahlavan proposed that these parameters be split into four categories: optical parameters, for mapping the 3D world onto the 2D image surface; sensory parameters, for mapping from the 2D image to the sampled electrical signal; mechanical parameters, for the positioning and motion of the camera; and algorithmic integration to allow control of these parameters.

In the active approach to perception we assume that the outside world serves as its own model. Thus, perception involves exploring the environment, allowing many traditional vision problems to be solved with low-complexity algorithms. Based on this concept, the methodology used in this work involves integrating imager control in the perception loop and, more precisely, in early vision processes. This integration allows us to design a reactive-vision sensor. The goal is to adapt sensor attitude to environment evolution and the task currently being performed. Such smart cameras allow basic processing and the selection of relevant features close to the imager. This faculty reduces the significant problem of sensor communication flow. Further, as its name suggests, an active vision system actively extracts the information it needs to perform a task. By this definition, it is evident that one of its main goals is the selection of windows of interest (WOI) in the image and the concentration of processing resources on them: the notion of local study becomes predominant.

Another important feature is the control of sensing parameters. As explained above, active vision devices generally focus on optical parameters (such as zoom, focus, and aperture), mechanical parameters (such as pan and tilt), and sometimes algorithmic integration (for example, early biologically-inspired vision systems). Thus, the visual sensor requires its own level of autonomy in order to: perform pre-processing, like the adjustment of sensing parameters or automatic WOI tracking; reduce the high-level system load; and/or reduce communication flow between the sensor and the host system.

Figure 1. Global architecture of the sensor. The blocks drawn with dashed lines represent optional modules that are not currently implemented.

Figure 2. Prototype of the sensor.

Active detection
Here, we include sensing parameters in the perception loop by introducing the notion of active detectors that control all levels of perception flow. These consist of hardware elements—such as the use of a sub-region of the imager or hardware implementation using a dedicated architecture—and software elements such as control algorithms.

An active detector can be viewed as a set of ‘visual macro functions’ where a visual task is decomposed into several subtasks. This is similar to Ullman’s work1 on the notion of a collection of ‘visual routines’ representing different kinds of basic image-processing sub-functions. These can be used in goal-directed programs to perform elaborate tasks. By contrast, the active detector consists of both hardware and software. Thus, in this approach, the sensor has a key role in the perception process, its task more important than just performing image pre-processing. As a result, the hardware architecture and implementation are vital.

Smart architecture based on FPGA and CMOS imaging
Our work is based on the use of a CMOS imager that allows full random-access readout and a massive FPGA architecture. There are several options for the choice of an imager and a processing unit. It is important to consider an active detector as a visual control loop: the measure is the image and the system to control is the sensor. For a given visual problem, the active detector must optimize and serve the sensor in order to achieve the task. For this reason, the architecture of this active vision sensor can be viewed as a set of parallel control loops where the bottleneck is the imager. Indeed, current CMOS imagers have a sequential behavior and their acquisition rates are slow in comparison with the performance of current dedicated architectures.

Continues on page 9.
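As a concrete illustration of the window-of-interest idea, here is a minimal Python sketch of an active-detector-style control loop: only a small sub-window is read from the imager, a simple feature measure is computed, and the window is re-centered for the next acquisition. The read_window() interface, the centroid-based feature measure, and the window size are illustrative assumptions, not the authors' hardware or algorithms.

```python
import numpy as np

class WindowOfInterestTracker:
    """Sketch of a software control loop around a random-access imager:
    acquire a window of interest (WOI), measure a feature inside it, and
    servo the window position for the next frame."""

    def __init__(self, imager, centre, size=32):
        self.imager = imager                  # assumed to expose read_window()
        self.centre = np.asarray(centre, dtype=int)
        self.size = size

    def step(self):
        half = self.size // 2
        # Acquire only the WOI: this is what keeps the data flow low.
        woi = self.imager.read_window(self.centre - half, self.size, self.size)
        # Toy feature measure: intensity-weighted centroid of the window.
        ys, xs = np.mgrid[0:self.size, 0:self.size]
        w = woi.astype(float) + 1e-9
        cy = (ys * w).sum() / w.sum()
        cx = (xs * w).sum() / w.sum()
        # Re-center the window on the measured feature (the "servo" step).
        shift = np.round([cy - half, cx - half]).astype(int)
        self.centre = self.centre + shift
        return woi, self.centre
```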


Reflectance-sensitive retina

It is well established that intelligent systems start with good sensors. Attempts to overcome sensor deficiencies at the algorithmic level alone inevitably lead to inferior and unreliable overall-system performance: inadequate or missing information from the sensor cannot be made up in the algorithm. For example, in image sensors there is nothing we can do to recover the brightness of the image at a particular location once that pixel saturates. Sensors that implement some processing at the sensory level—computational sensors—may provide us with a level of adaptation that allows us to extract environmental information that would not be obtainable otherwise.

Most present and future vision applications—including automotive, biometric, security, and mobile computing—involve unconstrained environments with unknown and widely-varying illumination conditions. Even when an image sensor is not saturated, the vision system has to account for changes in object appearance caused by variations in illumination. To illustrate this, the left panel of Figure 1 shows a set of face images captured under varying illumination directions by a CCD camera. Even to a human observer, these faces do not readily appear to belong to the same person. We have recently introduced a reflectance perception model1 that can be implemented at the sensory level and which largely removes the illumination variations, as shown in the right panel of Figure 1. These images appear to be virtually identical. Using several standard face-recognition algorithms, we have shown that recognition rates are significantly improved when operating on images whose variations due to the illumination are reduced by our method.2

In the most simplistic model, image intensity I(x,y) is a product of object reflectance R(x,y) and the illumination field L(x,y), that is I(x,y) = R(x,y)L(x,y). An illumination pattern L is modulated by the scene reflectance R and together they form the radiance map that is collected into a camera image I. R describes the scene. In essence, R is what we care about in computer vision. When the illumination field L is uniform, I is representative of R. But L is rarely uniform. For example, the object may occlude light sources and create shadows.

Obviously, estimating L(x,y) and R(x,y) from I(x,y) is an ill-posed problem. In many related illumination-compensation methods, including Retinex,1,4 a smooth version of the image I is used as an estimate of the illumination field L.

Figure 1. The reflectance-sensitive retina removes illumination-induced variations (simulation results). The input images on the left are taken under varying illumination directions, resulting in substantial appearance changes. Our reflectance-recovery method largely removes these, resulting in virtually uniform appearance across different illumination conditions, as shown on the right.

Figure 2. The resistive grid that minimizes energy J(I), therefore finding the smooth version of the input image I. Perceptually important discontinuities are preserved because the horizontal resistors are controlled with the local Weber-Fechner contrast.

Figure 3. Horizontal intensity line profiles through the middle of the subject’s eyes in the top middle picture of Figure 1. The thin black line in the top graph is the original image I(x,y), the thick gray line is the computed L(x,y), and the bottom graph is R(x,y) = I(x,y)/L(x,y).

If this smooth version does not properly account for discontinuities, objectionable ‘halo’ artifacts are created in R along the sharp edges in the image.

In our method, L is estimated with the resistive network shown in Figure 2. Here, we use a one-dimensional example to keep the notation simple. The image pixel values are supplied as voltage sources and the solution for L is read from the nodal voltages. To preserve discontinuities, the horizontal resistors are modulated proportionally to the Weber-Fechner contrast5 between the two points interconnected by the horizontal resistor. Therefore, the discontinuities with large Weber-Fechner contrast will have a large resistance connecting the two points: smoothing less and allowing the voltage at those two points to be kept further apart from each other. Formally, in the steady state, the network minimizes the energy it dissipates, as expressed by the equation J(I) shown in Figure 2. The first term is the energy dissipated on the vertical resistors Rv; the second term is the energy dissipated on the horizontal resistors Rh.

Once L(x,y) is computed, I(x,y) is divided by it to produce R(x,y). Figure 3 illustrates this process. It can be observed that the reflectance variations in shadows are amplified and ‘pulled up’ to the level of reflectance variations in the brightly-illuminated areas. All the details in the shadow region, which are not ‘visible’ in the original, are now clearly recognizable. We are currently designing an image sensor that implements this form of adaptation on the sensor chip before the signal is degraded by the readout and quantization process.
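For readers who want to experiment with the idea in software, the following one-dimensional Python sketch imitates the behavior of such a network: it relaxes an energy of the form J(L) = Σ(L_i − I_i)² + Σ w_i(L_{i+1} − L_i)², where the horizontal weights w_i shrink with the local Weber-Fechner contrast so that smoothing is suppressed across strong edges. The exact energy, weight function, and parameter values are assumptions for illustration, not the circuit of Figure 2.

```python
import numpy as np

def estimate_illumination_1d(intensity, n_iter=200, lam=5.0, eps=1e-3):
    """Edge-preserving smoothing of a 1D intensity profile (a numpy array).
    Returns an estimate of the illumination L; reflectance is then I / L."""
    I = np.asarray(intensity, dtype=float)
    # Weber-Fechner contrast between neighbours: large contrast means a
    # large "horizontal resistance", i.e. a small smoothing weight.
    contrast = np.abs(np.diff(I)) / (np.minimum(I[:-1], I[1:]) + eps)
    w = 1.0 / (1.0 + lam * contrast)
    L = I.copy()
    for _ in range(n_iter):
        # Jacobi relaxation: each node moves toward a weighted average of
        # its neighbours (horizontal links) and its own input (vertical link).
        left = np.concatenate(([L[0]], L[:-1]))
        right = np.concatenate((L[1:], [L[-1]]))
        wl = np.concatenate(([0.0], w))   # weight to the left neighbour
        wr = np.concatenate((w, [0.0]))   # weight to the right neighbour
        L = (I + wl * left + wr * right) / (1.0 + wl + wr)
    return L

# Example: row = image[120, :]; R = row / estimate_illumination_1d(row)
```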

Vladimir Brajovic
The Robotics Institute
Carnegie Mellon University, USA
E-mail: [email protected]

References
1. V. Brajovic, A Model for Reflectance Perception in Vision, Proc. SPIE, Vol. 5119, 2003, to appear.
2. R. Gross and V. Brajovic, An Image Pre-processing Algorithm for Illumination Invariant Face Recognition, 4th Int’l Conf. on Audio- and Video-Based Biometric Person Authentication, 2003.
3. E. H. Land and J. J. McCann, Lightness and Retinex theory, J.O.S.A. 61 (1), pp. 1-11, January 1971.
4. D. J. Jobson, Z. Rahman, and G. A. Woodell, A multi-scale Retinex for bridging the gap between color images and the human observation of scenes, IEEE Trans. Image Processing 6 (7), pp. 965-976, 1997.
5. B. A. Wandell, Foundations of Vision, Sinauer, Sunderland, MA, 1995.


Color processing for digital cameras

Digital capture is becoming mainstream in photography, and a number of new color processing capabilities are being introduced. However, producing a pleasing image file of a natural scene is still a complex job. It is useful for the advanced photographer or photographic engineer to understand the overall digital camera color processing architecture in order to place new developments in context, evaluate and select processing components, and to make workflow choices.

The steps in digital camera color processing are provided below.1 Note that in some cases the order of steps may be changed, but the following order is reasonable. Also, it is assumed that proprietary algorithms are used to determine the adopted white2 and the color-rendering transform.3

• Analog gains (to control sensitivity and white balance, if used)
• Analog dark-current subtraction (if used)
• A/D conversion
• Encoding nonlinearity (to take advantage of noise characteristics to reduce encoding bit-depth requirements for raw data storage, if used)

This is the first raw-image data-storage opportunity.

• Linearize (undo any sensor and encoding nonlinearities; optionally clip to desired sensor range)
• Digital dark-current subtraction (if no analog dark-current subtraction)
• Optical flare subtraction (if performed)

This is the last raw-image data-storage opportunity (before significant lossy processing).

• Digital gains (to control sensitivity and white balance, if used)
• White clipping (clip all channels to the same white level; needed to prevent cross-contamination of channels in matrixing)
• Demosaic (if needed)
• Matrix (to convert camera color channels to scene color channels)

This is the scene-referred image data storage opportunity (standard scene-referred color encodings include scRGB4 and RIMM/ERIMM RGB5).

• Apply color rendering transform (to take scene colors and map them to pleasing picture colors)
• Apply transform to standard output-referred encoding

This is the output-referred image-data storage opportunity (standard output-referred color encodings include sRGB,6 sYCC,7 and ROMM RGB8).

Workflow choices
The primary workflow choice is the image state for storage or exchange. The standard options are ‘raw’, ‘scene-referred’ and ‘output-referred’.9 The advantage of storing the image earlier in the processing chain is that decisions about subsequent processing steps can be changed without loss, and more advanced algorithms can be used in the future. The most lossy steps are white clipping and color rendering. However, the white-clipping step is not needed if the camera-to-scene matrix does not need to be applied (i.e. the camera color channels can be encoded as scene color channels without matrixing), and carefully-designed color rendering transforms can minimize color rendering loss.

The disadvantages of exchanging image data using the raw or scene-referred image states are that many of the decisions affecting the appearance of the final picture have not yet been made. Control of the appearance is relinquished to the downstream processing. If a raw file is exchanged, the white-balance and color-rendering choices made after exchange can produce a variety of results, some of which may be quite different from the photographer’s intent. Scene-referred image data has undergone white balancing (so the overall color cast of the image is communicated), but the color rendering step offers many opportunities for controlling the final image appearance. Output-referred exchange enables the photographer to communicate the desired final appearance in the image file, thereby ensuring more consistent output.

Current open-image exchange supports output-referred color encodings like sRGB. Generally it is recommended that output-referred images be exchanged for interoperability, although a raw or scene-referred image may be attached to allow other processing choices to be made in the future, or by other parties after exchange.

Proprietary component choices
These include the methods for: determining the adopted white, flare subtraction, demosaicing (if needed), determining the matrix from camera color to scene color,10 and determining the color-rendering transform. A camera’s color reproduction quality will depend on each of these components. Generally, it is good to consider them independently, though one component may partially compensate for deficiencies in another. For example, if flare subtraction is omitted, a saturation boost in the camera-color-to-scene-color matrix or the color-rendering step may help, but the quality obtained will generally not be as good. Also, if the job of one step is deferred to another, image-data exchange in open systems is degraded: there may be no standard way to communicate that some operation was deferred.

Optional proprietary step: scene relighting
Some scenes have variable lighting, or very high dynamic ranges due to light sources or cavities in the scene. Proprietary scene-relighting algorithms attempt to digitally even out the scene illumination.11

Figure 1. Top left: raw image data with analog dark-current subtraction and gamma = 2.2 encoding nonlinearity, but no analog gains. Top right: “camera RGB” image data, after flare subtraction, white balancing and demosaicing (displayed using gamma = 2.2). Bottom left: scene-referred image data, after matrixing to scRGB color space (displayed using gamma = 2.2). Bottom right: sRGB image data, after color rendering and encoding transforms.

Continues on page 8.
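To make the order of the digital steps listed at the start of this article concrete, here is a minimal Python sketch of the stages from digital white balance through output-referred encoding, assuming linearized, dark-subtracted, demosaiced camera RGB scaled to [0, 1]. The function names, the 3×3 matrix, and the plain power-law stand-in for the color-rendering transform are illustrative assumptions, not any camera's proprietary processing.

```python
import numpy as np

def srgb_encode(x):
    """Standard sRGB transfer curve used for the output-referred encoding."""
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308, 12.92 * x,
                    1.055 * np.power(x, 1.0 / 2.4) - 0.055)

def camera_rgb_to_srgb(camera_rgb, wb_gains, cam_to_scene):
    """Sketch of the later pipeline stages for an HxWx3 float image."""
    img = camera_rgb.astype(float)
    # Digital gains (white balance), one gain per channel.
    img = img * np.asarray(wb_gains)[None, None, :]
    # White clipping: clip all channels to the same white level before
    # matrixing, so clipped highlights are not cross-contaminated.
    img = np.clip(img, 0.0, 1.0)
    # Matrix camera color channels to scene color channels (e.g. toward
    # a scene-referred encoding such as scRGB).
    img = np.clip(img @ np.asarray(cam_to_scene).T, 0.0, 1.0)
    # Color rendering: map scene colors to pleasing picture colors.
    # A simple power-law tone adjustment stands in for the real transform.
    img = img ** 0.9
    # Transform to the standard output-referred encoding (sRGB).
    return srgb_encode(img)

# Example use (values are only placeholders):
# out = camera_rgb_to_srgb(raw_rgb, wb_gains=(1.8, 1.0, 1.4), cam_to_scene=np.eye(3))
```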


Real-time image processing in a small, systolic, FPGA architecture

The need for high-performance computer architectures and systems is becoming critical in order to solve real image-processing applications. The implementation of such systems has become feasible with microprocessor technology development; however, conventional processors cannot always provide the computational power to fulfill real-time requirements due to their sequential nature, the large amount of data, and the heterogeneity of computations involved. Moreover, new trends in embedded system design restrict further the use of complex processors: large processing power, reduced physical size, and low power consumption are hard constraints to meet.1 On the other hand, the inherent data- and instruction-level parallelism of most image-processing algorithms can be exploited to speed things up. This can be done through the development of special-purpose architectures on a chip based on parallel computation.2

Within this context, our research addresses the design and development of an FPGA hardware architecture for real-time window-based image processing. The wide interest in window-based or data-intensive processing is due to the fact that more complex algorithms can use low-level results as primitives to pursue higher-level goals. The addressed window-based image algorithms include generic image convolution, 2D filtering and feature extraction, gray-level image morphology, and template matching.

Our architecture consists of a configurable, 2D, systolic array of processing elements that provide throughputs of over tens of giga operations per second (GOPs). It employs a novel addressing scheme that significantly reduces the memory-access overhead and makes explicit the data parallelism at a low temporal storage cost.3 A specialized processing element, called a configurable window processor (CWP), was designed to cover a broad range of window-based image algorithms. The functionality of the CWPs can be modified through configuration registers according to a given application.

Figure 1 shows a block diagram of the 2D systolic organization of the CWPs.4 The systolic array exploits the 2D parallelism through the concurrent computation of window operations through rows and columns in the input image. For each column of the array there is a local data collector that collects the results of CWPs located in that column. The global data collector module collects the results produced in the array and sends them to the output memory.

As a whole, the architecture operation starts when a pixel from the input image is broadcast to all the CWPs in the array. Each concurrently keeps track of a particular window-based operation. At each clock cycle, a CWP receives a different window coefficient W—stored in an internal register—and an input image pixel P that is common to all the CWPs. These values are used to carry out a computation, specified by a scalar function, and to produce a partial result of the window operation. The partial results are incrementally sent to the local reduction function implemented in the CWP to produce a single result when all the pixels of the window are processed. The CWPs in a column start working progressively: each a clock cycle delayed from the previous one, as shown in Figure 1. The shadowed squares represent active CWPs in a given clock cycle.
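In software terms, each CWP combines a window coefficient W with the incoming pixel P through a scalar function and folds the partial results with a reduction function. The sequential Python sketch below emulates that generic window operation; the function names and the two example configurations are assumptions chosen to mirror the description, not the hardware itself.

```python
import numpy as np

def window_operator(image, coeffs, scalar_fn, reduce_fn, init):
    """Generic window-based operator: for every output pixel, apply
    scalar_fn(W, P) over the window and fold the results with reduce_fn."""
    kh, kw = coeffs.shape
    h, w = image.shape
    out = np.full((h - kh + 1, w - kw + 1), init, dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            acc = init
            for u in range(kh):
                for v in range(kw):
                    acc = reduce_fn(acc, scalar_fn(coeffs[u, v], image[i + u, j + v]))
            out[i, j] = acc
    return out

# Correlation-style filtering: multiply then accumulate.
# filtered = window_operator(img, kernel, lambda w, p: w * p, lambda a, x: a + x, 0.0)
# Gray-level erosion with a flat 7x7 structuring element: subtract then take the minimum.
# eroded = window_operator(img, np.zeros((7, 7)), lambda w, p: p - w, min, np.inf)
```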

A fully-parameterizable description of the modules of the proposed architecture was implemented using VHDL. The digital synthesis was targeted to a XCV2000E-6 VirtexE FPGA device. For an implemented 7×7 systolic-array prototype, the architecture provides a throughput of 3.16 GOPs at a 60MHz clock frequency with an estimated power consumption of 1.56W.

Figure 1. Block diagram of the 2D systolic array of configurable window processors (CWPs) for window-based image processing. D is a delay line or shift register and LDC is a local data collector.

Figure 2. Performance of the proposed architecture for a 512×512 gray-level image with different window sizes.

Figure 3. Input image (left) and the output images for LoG filtering (middle) and gray-level erosion (right).

Continues on page 9.


Cameras with inertial sensors

Inertial sensors attached to a camera can provide valuable data about camera pose and movement. In biological vision systems, inertial cues provided by the vestibular system play an important role, and are fused with vision at an early processing stage. Micromachining enables the development of low-cost single-chip inertial sensors that can be easily incorporated alongside the camera’s imaging sensor, thus providing an artificial vestibular system. As in human vision, low-level image processing should take into account the ego-motion of the observer. In this article we present some of the benefits of combining these two sensing modalities.

Figure 1 shows a stereo-camera pair with an inertial measurement unit (IMU), assembled with three capacitive accelerometers and three vibrating-structure gyros. The 3D-structured world is observed by the visual sensor, and its pose and motion are directly measured by the inertial sensors. These motion parameters can also be inferred from the image flow and known scene features. Combining the two sensing modalities simplifies the 3D reconstruction of the observed world. The inertial sensors also provide important cues about the observed scene structure, such as vertical and horizontal references. In the system, inertial sensors measure resistance to a change in momentum: gyroscopes sense angular motion, and accelerometers sense change in linear motion. Inertial navigation systems obtain velocity and position by integration, and do not depend on any external references, except gravity.

The development of Micro-Electro-Mechanical Systems (MEMS) technology has enabled many new applications for inertial sensors beyond navigation, including aerospace and naval applications. Capacitive linear acceleration sensors rely on proof-mass displacement and capacitive mismatch sensing. MEMS gyroscopes use a vibrating structure to measure the Coriolis effect induced by rotation, and can be surface micromachined, providing lower-cost sensors with full signal-conditioning electronics. Although their performance is not suitable for full inertial navigation, under some working conditions or known system dynamics they can be quite useful.

In humans, the sense of motion is derived both from the vestibular system and retinal visual flow, which are integrated at very basic neural levels. The inertial information enhances the performance of the vision system in tasks such as gaze stabilisation, and visual cues aid spatial orientation and body equilibrium. There is also evidence that low-level human visual processing takes inertial cues into account, and that vertical and horizontal directions are important in scene interpretation. Currently-available MEMS inertial sensors have performances similar to the human vestibular system, suggesting their suitability for vision tasks.1

The inertially-sensed gravity vector provides a unique reference for image-sensed spatial directions. If the rotation between the inertial and camera frames of reference is known, the orthogonality between the vertical and the direction of a level-plane image vanishing point can be used to estimate the camera focal distance.1 When the rotation between the IMU and camera is unknown from construction, calibration can be performed by having both sensors measure the vertical direction.2 Knowing the vertical reference and stereo-camera parameters, the ground plane is fully determined. The collineation between image ground-plane points can be used to speed up ground-plane segmentation and 3D reconstruction (see Figure 2).1 Using the inertial reference, vertical features starting from the ground plane can also be segmented and matched across the stereo pair, so that their 3D position is determined.1

The inertial vertical reference can also be used after applying standard stereo-vision techniques. Correlation-based depth maps obtained from stereo can be aligned and registered using the vertical reference and dynamic-motion cues. In order to detect the ground plane, a histogram in height is performed on the vertically-aligned map, selecting the lowest local peak (see Figure 3). Taking the ground plane as a reference, the fusion of multiple maps reduces to a 2D translation and rotation problem. The dynamic inertial cues can be used as a first approximation for this transformation, providing a fast depth-map registration method.3

In addition, inertial data can be integrated into optical-flow techniques: by compensating for camera ego-motion, it improves interest-point selection, the matching of interest points, and the subsequent image-motion detection and tracking for depth-flow computation. The image focus of expansion (FOE) and centre of rotation (COR) are determined by camera motion and can both be easily found using inertial data alone, provided that the system has been calibrated. This information can be useful during vision-based navigation tasks.
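The height-histogram step lends itself to a compact sketch. The Python function below assumes a 3D point cloud from stereo and a rotation R_vert (for example from the calibrated IMU) that aligns the camera frame with the gravity vertical; the bin size and the local-peak test are illustrative choices, not the authors' exact procedure.

```python
import numpy as np

def ground_plane_height(points_cam, R_vert, bin_size=0.05, min_frac=0.01):
    """Rotate an Nx3 point cloud so that z is vertical, histogram the
    heights, and return the height of the lowest significant local peak,
    taken as the ground plane."""
    pts = points_cam @ R_vert.T
    heights = pts[:, 2]
    edges = np.arange(heights.min(), heights.max() + bin_size, bin_size)
    hist, edges = np.histogram(heights, bins=edges)
    threshold = min_frac * len(heights)
    for i in range(1, len(hist) - 1):
        # Lowest bin that is a local maximum and holds enough points.
        if hist[i] > threshold and hist[i] >= hist[i - 1] and hist[i] >= hist[i + 1]:
            return 0.5 * (edges[i] + edges[i + 1])
    return None
```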

Studies show that inertial cues play an important role in human vision, and that the notion of the vertical is important at the first stages of image processing. Computer-vision systems for robotic applications can benefit from low-cost MEMS inertial sensors, using both static and dynamic cues. Further studies in the field, as well as bio-inspired robotic applications, will enable a better understanding of the underlying principles. Possible applications go beyond robotics, and include artificial vision and vestibular bio-implants.

Jorge Lobo and Jorge Dias
Institute of Systems and Robotics
Electrical and Computer Engineering Department
University of Coimbra, Portugal
E-mail: {jlobo, jorge}@isr.uc.pt

References
1. J. Lobo and J. Dias, Vision and Inertial Sensor Cooperation, Using Gravity as a Vertical Reference, IEEE Trans. on Pattern Analysis and Machine Intelligence 25 (12), 2003.
2. J. Alves, J. Lobo, and J. Dias, Camera-Inertial Sensor modelling and alignment for Visual Navigation, Proc. 11th Int’l Conf. on Advanced Robotics, pp. 1693-1698, 2003.
3. J. Lobo and J. Dias, Inertial Sensed Ego-motion for 3D vision, Proc. 11th Int’l Conf. on Advanced Robotics, pp. 1907-1914, 2003.

Figure 1. Stereo cameras with an inertial measurement unit used in experimental work.

Figure 2. Ground-plane 3D-reconstructed patch.

Figure 3. Aligned depth map showing histogram for ground-plane detection.


Color artifact reduction in digital, still, color cameras

The tessellated structure of the color filter array overlaid on CMOS/CCD sensors in commercial digital color cameras requires the use of a considerable amount of processing to reconstruct a full-color image. Starting with a tessellated color array pattern—the Bayer Color Array is popularly used, and some other common filter-array tessellations are shown in Figure 1—we need to reconstruct a three-channel image. Clearly, missing data (colors not measured at each pixel) needs to be estimated. This is done via a process called demosaicing that introduces a host of color artifacts. Broadly, these can be split into two groups: so-called zipper and confetti artifacts.1 The former occur at locations in the image where intensity changes are abrupt, and the latter when highly intense pixels are surrounded by dark pixels (usually a result of erroneous sensors). These artifacts may be reduced through a series of ‘good’ choices that range from the lens system to the choice of a very dense sensor (lots of photosites), used in conjunction with processing steps for correction.

Figure 1. Popularly used color filter array tessellations. R, G, B, C, M, Y, W stand for red, green, blue, cyan, magenta, yellow, and white respectively. (a) An RGB Bayer array. (b) A CMYW rectangular array. (c, d) Color arrays used in some Sony cameras. (e) A relatively new hexagonal sensor used in some Fuji cameras.

To remove these artifacts, the processing can either be done during or after the demosaicing step. Before we consider how these artifacts are removed/reduced, we need to bear in mind that the objective of commercial electronic photography is not so much the accurate reproduction of a scene, but a preferred or pleasing reproduction. In other words, even if there are errors introduced by the artifact-removal stage, so long as the image ‘looks’ good, the consumer remains satisfied.2

As alluded to earlier, the reduction of artifacts could be performed during or after the demosaicing step. However, it is common to perform the artifact reduction at both stages of the image-processing chain. Most demosaicing techniques make use of the fact that the human visual system is preferentially sensitive in the horizontal and vertical directions when compared to other directions (diagonal). When performing demosaicing, depending upon the strength of the intensity change in a neighborhood (horizontal, vertical, or diagonal), estimation kernels are used3 that may be fixed or adaptive.4,5 These are determined by operations over local neighborhoods—the goal being to interpolate along edges rather than across them (which leads to zipper errors).

Once a full-color image has been generated after demosaicing a filter-array image, the artifacts are either highly pronounced or relatively subdued depending on the technique used and image content. Most color-image-processing pipelines implement another collection of post-processing techniques to make the images appealing.

Most professional and high-end consumer cameras also have a post-demosaicing noise-reduction step: usually a proprietary algorithm. However, a common algorithm used to reduce color artifacts is a median filter. Such artifacts usually have a salt-and-pepper type distribution over the image, for which the median filter is well suited. The human eye is known to be highly sensitive to sharp edges: we prefer sharp edges in a scene to blurred ones. Most camera manufacturers use an edge-enhancement step such as unsharp masking to make the image more appealing by reducing the low-frequency content in the image and enhancing the high-frequency content. Another technique is called coring, used to remove detail information that has no significant contribution to image detail and behaves much like noise. The term coring originates from the manner in which the technique is implemented. Usually a representation of the data to be filtered is generated at various levels of detail, and noise reduction is achieved by thresholding (or ‘coring’) the transform coefficients computed at the various scales. How much coring needs to be performed (how high the threshold needs to be set) is a heuristic.
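As a small illustration of this median-filter step, the Python sketch below filters the color-difference planes rather than the channels themselves, which is one common way to suppress zipper and confetti artifacts while leaving detail in the green channel untouched. The 3×3 window and the use of green as the reference channel are assumptions for illustration, not a specific manufacturer's algorithm.

```python
import numpy as np
from scipy.ndimage import median_filter

def reduce_color_artifacts(rgb, size=3):
    """Median-filter the R-G and B-G difference planes of a demosaiced
    HxWx3 float image in [0, 1] and rebuild R and B from them."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = median_filter(r - g, size=size)    # salt-and-pepper-like artifacts
    bg = median_filter(b - g, size=size)    # are removed from the differences
    out = np.stack([g + rg, g, g + bg], axis=-1)
    return np.clip(out, 0.0, 1.0)
```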

Rajeev Ramanath and Wesley E. Snyder
Department of Electrical and Computer Engineering
NC State University, USA
E-mail: [email protected] and [email protected]

References
1. R. Ramanath, W. E. Snyder, G. Bilbro, and W. A. Sander, Demosaicking Methods for Bayer Color Arrays, J. of Electronic Imaging 11 (3), pp. 306-315, 2002.
2. P. M. Hubel, J. Holm, G. D. Finlayson, and M. S. Drew, Matrix calculations for digital photography, Proc. IS&T/SID 5th Color Imaging Conf., pp. 105-111, 1997.
3. J. E. Adams, Design of practical Color Filter Array interpolation algorithms for digital cameras, Proc. SPIE 3028, pp. 117-125, 1997.
4. R. Kimmel, Demosaicing: Image reconstruction from color ccd samples, IEEE Trans. on Image Processing 8 (9), pp. 1221-1228, 1999.
5. R. Ramanath and W. E. Snyder, Adaptive Demosaicking, J. of Electronic Imaging 12 (4), pp. 633-642, 2003.


Aliasing in digital cameras
Continued from the cover.

In general, aliased signals cannot be recovered easily, and only complicated methods using non-linear iterative processing4 or prior knowledge5 are able to effectively deal with this.

By studying the nature of aliasing in digital cameras, we have found a demosaicing solution that does not require excessive optical blur or complicated algorithms. As shown in Figure 1(b) and described formally elsewhere,6 the Fourier spectrum of an image acquired with a Bayer CFA has a particular pattern. Luminance (i.e. R + 2G + B) is localized in the center, and chrominance, composed of two opponent chromatic signals (R-G, -R+2G-B), is localized in the middle and corner of each side. The Fourier representation of a CFA image thus has the property of automatically separating luminance and opponent chromatic channels and projecting them into specific locations in the Fourier domain. Consequently, it is possible to directly estimate the luminance and chrominance signals with low- and high-pass filters, respectively, and then to reconstruct a color image by adding the estimated luminance and the estimated and interpolated chrominance.6,7
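A deliberately simplified Python sketch of this frequency-selection idea is given below: a low-pass filter of the raw CFA image stands in for the luminance estimate, the residual carries the spatially multiplexed chrominance, and the chrominance is demultiplexed per channel and interpolated before being added back. The assumed Bayer phase, the Gaussian low-pass filter, and the box interpolation are illustrative choices, not the published estimation filters.

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def demosaic_frequency_selection(cfa):
    """cfa: 2D float mosaic with G at (0,0) and (1,1), R at (0,1), B at (1,0)
    (an assumed Bayer phase). Returns an HxWx3 reconstruction."""
    h, w = cfa.shape
    # Luminance sits around the center of the Fourier plane of the CFA
    # image, so a low-pass estimate of the mosaic approximates it.
    lum = gaussian_filter(cfa, sigma=1.0)
    # The residual is the modulated (multiplexed) chrominance.
    chroma = cfa - lum
    # Demultiplex: keep each color's chrominance where it was sampled.
    masks = np.zeros((h, w, 3), dtype=bool)
    masks[0::2, 1::2, 0] = True               # R sites
    masks[0::2, 0::2, 1] = True               # G sites
    masks[1::2, 1::2, 1] = True
    masks[1::2, 0::2, 2] = True               # B sites
    out = np.zeros((h, w, 3))
    box = np.ones((3, 3))
    for c in range(3):
        sparse = np.where(masks[..., c], chroma, 0.0)
        weight = convolve(masks[..., c].astype(float), box, mode='mirror')
        interp = convolve(sparse, box, mode='mirror') / np.maximum(weight, 1e-6)
        out[..., c] = lum + interp            # luminance + interpolated chrominance
    return out
```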

Note, however, that luminance and opponent chromatic signals share the same two-dimensional Fourier space. Artifacts may result in the demosaiced image if their representations overlap (aliasing).

Using the Fourier representation thus also helps to illustrate the artifacts that may occur when applying any demosaicing algorithm: blurring occurs when luminance is estimated with a filter that is too narrow-band. False colors are generated when chrominance is estimated with a filter bandwidth that is too broad, resulting in high frequencies of luminance inside the chrominance signal. Grid effects occur when luminance is estimated with a bandwidth that is too broad, resulting in high frequencies of chrominance in the luminance signal. And, finally, water colors are generated when chrominance is estimated with a filter bandwidth that is too narrow. With many demosaicing algorithms, the two most visible effects are blurring and false color. For visual examples of the different artifacts, see Reference 7.

In general, algorithms that totally remove aliasing artifacts do not exist. However, in the case of a CFA image, the artifacts are not due to ‘real’ aliasing because they correspond to interference between luminance and chrominance: two different types of signals. This is certainly why many demosaicing methods work quite well. With our approach, one can optimally reconstruct the image without having recourse to any complicated de-aliasing methods. Our demosaicing-by-frequency-selection algorithm gives excellent results compared to other published algorithms and uses only a linear approach without any prior knowledge about the image content.7 Also, our approach allows us to explicitly study demosaicing artifacts that could be removed by tuning spectral sensitivity functions,8 optical blur, and estimation filters.

Further information about this work and color illustrations are available at:
http://ivrgwww.epfl.ch/index.php?name=EI_Newsletter

David Alleysson* and Sabine Süsstrunk†
*Laboratory of Psychology and Neurocognition
Université Pierre Mendès France
E-mail: [email protected]
†Audiovisual Communications Laboratory
School of Communications and Computing Sciences
Ecole Polytechnique Fédérale de Lausanne, Switzerland
E-mail: [email protected]

References
1. B. E. Bayer, Color Imaging Array, US Patent 3,971,065, to Eastman Kodak Company, Patent and Trademark Office, Washington, D.C., 1976.
2. J. Greivenkamp, Color dependent optical prefilter for the suppression of aliasing artifacts, Appl. Optics 29 (5), p. 676, 1990.
3. J. A. Weldy, Optimized design for a single-sensor color electronic camera system, SPIE Optical Sensors and Electronic Photography 1071, p. 300, 1989.
4. R. Kimmel, Demosaicing: Image Reconstruction from Color Samples, IEEE Trans. Image Processing 8, p. 1221, Sept. 1999.
5. D. H. Brainard, Reconstructing Images from Trichromatic Samples, from Basic Research to Practical Applications, Proc. IS&T/SID 3rd Color Imaging Conf., p. 4, 1995.
6. D. Alleysson, S. Süsstrunk, and J. Hérault, Color demosaicing by estimating luminance and opponent chromatic signals in the Fourier domain, Proc. IS&T/SID 10th Color Imaging Conf., 2002.
7. http://ivrgwww.epfl.ch/index.php?name=EI_Newsletter
8. D. Alleysson, S. Süsstrunk, and J. Marguier, Influence of spectral sensitivity functions on color demosaicing, Proc. IS&T/SID 11th Color Imaging Conf., 2003.

Color processing for digital cameras
Continued from page 4.

This can make scenes look more like they do to a human observer, because the human visual system also attempts to compensate for scene lighting variability. Scene-relighting algorithms should be evaluated based on how well they simulate real scene re-lighting, and the appearance of scenes as viewed by human observers. It is important to remember that re-lit scenes will then be color rendered; sometimes these two proprietary steps are combined to ensure optimal performance.

Jack Holm
Hewlett-Packard Company, USA
E-mail: [email protected]

References
1. J. Holm, I. Tastl, L. Hanlon, and P. Hubel, Color processing for digital photography, Colour Engineering: Achieving Device Independent Colour, Phil Green and Lindsay MacDonald, editors, Wiley, 2002.
2. P. M. Hubel, J. Holm, and G. D. Finlayson, Illuminant Estimation and Color Correction, Colour Imaging in Multimedia, Lindsay MacDonald, editor, Wiley, 1999.
3. J. Holm, A Strategy for Pictorial Digital Image Processing (PDIP), Proc. IS&T/SID 4th Color Imaging Conf.: Color Science, Systems, and Applications, pp. 194-201, 1996.
4. IEC/ISO 61966-2-2:2003, Multimedia systems and equipment—Colour measurement and management—Extended RGB colour space—scRGB.
5. ANSI/I3A IT10.7466:2002, Photography—Electronic Still Picture Imaging—Reference Input Medium Metric RGB Color Encoding: RIMM RGB.
6. IEC 61966-2-1:1999, Multimedia systems and equipment—Colour measurement and management—Default RGB colour space—sRGB.
7. IEC 61966-2-1 Amendment 1: 2003.
8. ANSI/I3A IT10.7666:2002, Photography—Electronic Still Picture Imaging—Reference Output Medium Metric RGB Color Encoding: ROMM RGB.
9. ISO 22028-1:2003, Photography and graphic technology—Extended colour encodings for digital image storage, manipulation and interchange—Part 1: architecture and requirements.
10. J. Holm, I. Tastl, and S. Hordley, Evaluation of DSC (Digital Still Camera) Scene Analysis Error Metrics—Part 1, Proc. IS&T/SID 8th Color Imaging Conf.: Color Science, Systems, and Applications, pp. 279-287, 2000.
11. J. McCann, Lessons learned from mondrians applied to real images and color gamuts, Proc. IS&T/SID 7th Color Imaging Conf.: Color Science, Systems, and Applications, pp. 1-8, 1999.


Smart camera and active vision: the active-detector formalism
Continued from page 2.

The global architecture shown in Figure 1 presents the different modules of our device. The blocks drawn with dashed lines represent optional modules that are not currently implemented. The first prototype (see Figure 2) is composed of three parts: the imager, the main board, and the communications board. The first board includes a CMOS imager, an analog/digital converter, and optics. The main board is the core of the system: it consists of a Stratix from Altera; several private memory blocks; and, on the lower face of the board, an optional DSP module that can be connected for dedicated processing and an SDRAM module socket that allows the memory to be extended to 64 Mb. The communications board is connected to the main board and manages all communications with the host computer. On this card, we can connect a 3D accelerometer, zoom controller, and motor controller for an optional turret.

Our initial results show high-speed tracking of a gray-level template (see Figure 3). According to the size of the window, the acquisition rate varies from 200-5500 frames per second.2

Figure 3. High-speed tracking of a gray-level template (32×32 at ~2000 frames per second).

François Berry and Pierre Chalimbaud
LASMEA Laboratory, Université Blaise Pascal, France
E-mail: {berry, chalimba}@lasmea.univ-bpclermont.fr

References
1. S. Ullman, Visual Routines, Cognition 18, pp. 97-159, 1984.
2. P. Chalimbaud, F. Berry, and P. Martinet, The task ‘template tracking’ in a sensor dedicated to active vision, IEEE Int’l Workshop on Computer Architectures for Machine Perception, 2003.

Real-time image processing in a small, systolic, FPGA architecture
Continued from page 5.

The architecture uses 6118 slices, i.e. around 30% of the FPGA, and was validated on an RC1000-PP FPGA AlphaData board. The performance improvement over a software implementation running on a Pentium IV processor is more than an order of magnitude.

The processing times for a window-based operation on 512×512 gray-level images for different window sizes are plotted in Figure 2. The array was configured to use the same number of CWPs as the window size. For all the cases it was possible to achieve real-time performance with three to four rows processed in parallel. The processing time for a generic window-based operator with a 7×7 window mask on 512×512 gray-level input images is 8.35ms; thus the architecture is able to process about 120 512×512 gray-level images per second. Among the window-based image algorithms already mapped and tested are generic convolution, gray-level image morphology, and template matching. Figure 3 shows a test image and two output images for LoG filtering and gray-level erosion.

According to theoretical and experimental results, the architecture compares favorably with other dedicated architectures in terms of performance and hardware resources. Due to its configurable, modular, and scalable design, the architecture constitutes a platform to explore more complex algorithms such as motion estimation and stereo disparity computation, among others. The proposed architecture is well suited to be the computational core of a completely self-contained vision system due to its efficiency and compactness. The architecture can be coupled with a digital image sensor and memory banks on a chip to build a compact smart sensor for mobile applications.

César Torres-Huitzil and Miguel Arias-Estrada
Computer Science Department
INAOE, México
E-mail: {ctorres, ariasm}@inaoep.mx

References
1. J. Silc, T. Ungerer, and B. Robic, A Survey of New Research Directions in Microprocessors, Microprocessors and Microsystems 34, pp. 175-190, 2000.
2. N. Ranganathan, VLSI & Parallel Computing for Pattern Recognition & Artificial Intelligence, Series in Machine Perception and Artificial Intelligence 18, World Scientific Publishing, 1995.
3. Miguel Arias-Estrada and César Torres-Huitzil, Real-time Field Programmable Gate Array Architecture for Computer Vision, J. Electronic Imaging 10 (1), pp. 289-296, January 2001.
4. César Torres-Huitzil and Miguel Arias-Estrada, Configurable Hardware Architecture for Real-time Window-based Image Processing, Proc. FPL 2003, pp. 1008-1011, 2003.


Tell us about your news, ideas, and events!
If you're interested in sending in an article for the newsletter, have ideas for future issues, or would like to publicize an event that is coming up, we'd like to hear from you. Contact our technical editor, Sunny Bains ([email protected]), to let her know what you have in mind and she'll work with you to get something ready for publication.

Deadlines for the next edition, 14.2, are:
19 January 2004: Suggestions for special issues and guest editors.
26 January 2004: Ideas for articles you'd like to write (or read).
26 March 2004: Calendar items for the twelve months starting June 2004.



2004IS&T/SPIE 16th International SymposiumElectronic Imaging: Science andTechnology18–22 JanuarySan Jose, California USAProgram • Advance Registration Ends17 December 2003Exhibitionhttp://electronicimaging.org/program/04/

Photonics West 200424–29 JanuarySan Jose, California USA

Featuring SPIE International Symposia:• Biomedical Optics• Integrated Optoelectronic Devices• Lasers and Applications in Science and

Engineering• Micromachining and MicrofabricationProgram • Advance Registration Ends 7 January 2004Exhibitionhttp://spie.org/Conferences/Programs/04/pw/

For More Information ContactSPIE • PO Box 10, Bellingham, WA 98227-0010 • Tel: +1 360 676 3290 • Fax: +1 360 647 1445

E-mail: [email protected] • Web: www.spie.org

SPIE International SymposiumMedical Imaging 200414–19 FebruarySan Diego, California USAProgram • Advance Registration Ends 5 February 2004Exhibitionhttp://spie.org/conferences/programs/04/mi/

SPIE International SymposiumOptical Science and Technology

SPIE’s 49th Annual Meeting2–6 AugustDenver, Colorado USACall for Papers • Abstracts Due 5 January 2004Exhibitionhttp://spie.org/conferences/calls/04/am/

SPIE International SymposiumITCom 2004

Information Technologies and Communications12–16 SeptemberAnaheim, California USACo-located with NFOEC

26th International Congress onHigh Speed Photography and Photonics20–24 SeptemberAlexandria, Virginia USACall for Papers • Abstracts Due 15 March 2004http://spie.org/conferences/calls/04/hs/

NIH Workshop onDiagnostic Optical Imaging and Spectroscopy20–22 SeptemberWashington, D.C. USAOrganized by NIH, managed by SPIE

SPIE International SymposiumPhotonics Asia 20048–11 NovemberBeijing, ChinaCall for Papershttp://spie.org/conferences/calls/04/pa/

Calendar

Space-variant image processing: taking advantage of data reduction and polar coordinates
Continued from page 12.

acquisition of frames per second is accelerated since the images are very small. The frame-grabber size is also dramatically reduced. Combined, these two effects make the exploitation of differential algorithms especially interesting. Such image-processing algorithms systematically apply simple operations to the whole image, computing spatial and temporal differences. These can be computationally intensive for large images, and the simultaneous storage of several frames for computing temporal differences can be a hardware challenge. Log-polar image-data reduction can therefore contribute to the effective use of differential algorithms in real applications.7

In addition to the selective reduction of information, another interesting advantage of the log-polar representation is related to polar coordinates. In this case, approaching movement along the optical axis has, in the sensor plane, only a radial coordinate. This type of movement is often present with a camera on top of a mobile platform like an autonomous robot. If the machine is moving along its optical axis, the image displacement due to its own movement has only a radial component. Thus, complex image-processing algorithms are simplified and accelerated.3,7,8 Further, the hardware reduction achieved in storing and processing images, combined with the density of programmable devices, makes possible a full image-processing system on a single chip.9 This approach is especially well suited to systems with power-consumption and hardware constraints. We would argue it is the natural evolution of the reconfigurable architectures employed for autonomous robotic navigation systems.7
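To illustrate the data reduction, the following Python sketch resamples a Cartesian image onto a log-polar grid by nearest-neighbor lookup: radius is sampled logarithmically and angle uniformly, so a large image collapses into a small (rho, theta) array whose rows correspond to radial motion. The grid sizes, the minimum radius, and the image-center origin are illustrative assumptions, not the geometry of the CMOS sensor described in the references.

```python
import numpy as np

def to_log_polar(image, n_rho=64, n_theta=128, r_min=2.0):
    """Nearest-neighbor log-polar resampling of a 2D (or HxWxC) image."""
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0      # sample around the image center
    r_max = min(cy, cx)
    rho = np.exp(np.linspace(np.log(r_min), np.log(r_max), n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing='ij')
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return image[ys, xs]                        # shape (n_rho, n_theta[, C])
```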

This work is supported by the GeneralitatValenciana under project CTIDIA/2002/142.

Jose A. Boluda and Fernando Pardo
Departament d'Informàtica
Universitat de València, Spain
E-mail: [email protected]
http://www.uv.es/~jboluda/

References
1. M. Tistarelli and G. Sandini, Dynamic aspects in active vision, CVGIP: Image Understanding 56 (1), p. 108, 1992.
2. R. M. Hodgson and J. C. Wilson, Log polar mapping applied to pattern representation and recognition, Computer Vision and Image Processing, Ed. Shapiro & Rosenfeld, Academic Press, p. 245, 1992.
3. M. Tistarelli and G. Sandini, On the advantages of polar and log-polar mapping for direct estimation of time-to-impact from optical flow, IEEE Trans. on PAMI 15 (4), p. 401, April 1993.
4. R. S. Wallace, P. W. Ong, B. B. Bederson, and E. L. Schwartz, Space-variant image processing, Int’l J. of Computer Vision 13 (1), p. 71, 1994.
5. F. Jurie, A new log-polar mapping for space variant imaging: Application to face detection and tracking, Pattern Recognition 32 (5), p. 865, May 1999.
6. F. Pardo, B. Dierickx, and D. Scheffer, Space-Variant Non-Orthogonal Structure CMOS Image Sensor Design, IEEE J. of Solid-State Circuits 33 (6), p. 842, June 1998.
7. J. A. Boluda and J. J. Domingo, On the advantages of combining differential algorithms, pipelined architectures and log-polar vision for detection of self-motion from a mobile robot, Robotics and Autonomous Systems 37 (4), p. 283, December 2001.
8. J. A. Boluda and F. Pardo, A reconfigurable architecture for autonomous visual navigation, Machine Vision and Applications 13 (5-6), p. 322, 2003.
9. J. A. Boluda and F. Pardo, Synthesizing on a reconfigurable chip an autonomous robot image processing system, Field Programmable Logic and Applications, Springer Lecture Notes in Computer Science 2778, pp. 458-467, 2003.

Space-variant image processing: taking advantage of data reduction and polar coordinates. Continued from page 12.


Join the SPIE/IS&T Technical Group... and receive this newsletter

JEI 2003 subscription rates (4 issues):        U.S.    Non-U.S.
Individual SPIE or IS&T member*                $ 55    $ 55
Individual nonmember and institutions          $255    $275

Your subscription begins with the first issue of the year. Subscriptions are entered on a calendar-year basis. Orders received after 1 September 2003 will begin January 2004 unless a 2003 subscription is specified.

*One journal included with SPIE/IS&T membership. Price is for additional journals.

Send this form (or photocopy) to:
SPIE • P.O. Box 10
Bellingham, WA 98227-0010 USA
Tel: +1 360 676 3290
Fax: +1 360 647 1445
E-mail: [email protected]

Please send me:
❏ Information about full SPIE membership
❏ Information about full IS&T membership
❏ Information about other SPIE technical groups
❏ FREE technical publications catalog

Technical Group Membership fee is $30/year, or $15/year for full SPIE and IS&T Members.

Amount enclosed for Technical Group membership $ _________

❏ I also want to subscribe to IS&T/SPIE’s Journal of Electronic Imaging (JEI) $ _________

Total $ _________
❏ Check enclosed. Payment in U.S. dollars (by draft on a U.S. bank, or international money order) is required. Do not send currency. Transfers from banks must include a copy of the transfer order.

❏ Charge to my: ❏ VISA ❏ MasterCard ❏ American Express ❏ Diners Club ❏ Discover

Account # _______________________________________ Expiration date ______________

Signature ____________________________________________________________________

This newsletter is produced twice yearly and is available only as a benefit of membership of the SPIE/IS&T Electronic Imaging Technical Group.

IS&T—The Society for Imaging Science and Technology has joined with SPIE to form a technical group structure that provides a worldwide communication network and is advantageous to both memberships.

Join the Electronic Imaging Technical Group for US$30. Technical Group members receive these benefits:
• Electronic Imaging Newsletter
• SPIE's monthly publication, oemagazine
• annual list of Electronic Imaging Technical Group members

People who are already members of IS&T or SPIE are invited to join the Electronic Imaging Technical Group for the reduced member fee of US$15.

Please Print    ❏ Prof.  ❏ Dr.  ❏ Mr.  ❏ Miss  ❏ Mrs.  ❏ Ms.

First (Given) Name ______________________________________ Middle Initial __________________

Last (Family) Name ___________________________________________________________________

Position ____________________________________________________________________________

Business Affiliation ___________________________________________________________________

Dept./Bldg./Mail Stop/etc. ______________________________________________________________

Street Address or P.O. Box _____________________________________________________________

City _______________________________ State or Province ________________________________

Zip or Postal Code ___________________ Country ________________________________________

Phone ____________________________________ Fax ____________________________________

E-mail (required for credit card orders) ____________________________________________________


Electronic Imaging
The Electronic Imaging newsletter is published by SPIE—The International Society for Optical Engineering, and IS&T—The Society for Imaging Science and Technology. The newsletter is the official publication of the International Technical Group on Electronic Imaging.

Technical Group Chair: Arthur Weeks
Technical Group Cochair: Gabriel Marcu
Technical Editor: Sunny Bains
Managing Editor/Graphics: Linda DeLano

Articles in this newsletter do not necessarily constitute endorsement or the opinions of the editors, SPIE, or IS&T. Advertising and copy are subject to acceptance by the editors.

SPIE is an international technical society dedicated to advancing engineering, scientific, and commercial applications of optical, photonic, imaging, electronic, and optoelectronic technologies.

IS&T is an international nonprofit society whose goal is to keep members aware of the latest scientific and technological developments in the fields of imaging through conferences, journals, and other publications.

SPIE—The International Society for Optical Engineering, P.O. Box 10, Bellingham, WA 98227-0010 USA. Tel: +1 360 676 3290. Fax: +1 360 647 1445. E-mail: [email protected].

IS&T—The Society for Imaging Science and Technology, 7003 Kilworth Lane, Springfield, VA 22151 USA. Tel: +1 703 642 9090. Fax: +1 703 642 9094.

© 2003 SPIE. All rights reserved.

EI Online: Electronic Imaging Web

Discussion Forum
You are invited to participate in SPIE's online discussion forum on Electronic Imaging. To post a message, log in to create a user account. For options see "subscribe to this forum."

You'll find our forums well designed and easy to use, with many helpful features such as automated email notifications, easy-to-follow 'threads,' and searchability. There is a full FAQ for more details on how to use the forums.

Main link to the new Electronic Imaging forum:

http://spie.org/app/forums/tech/

Related questions or suggestions can be sent to [email protected].

Reference Code: 3537



Space-variant image processing: taking advantage of data reduction and polar coordinates

The human retina exhibits a non-uniform photo-receptor distribution: more resolution at the center of the image and less at the periphery. This space-variant vision emerges as an interesting image-acquisition system, since there is a selective reduction of information. Moreover, the log-polar mapping—as a particular case of space-variant vision—shows interesting mathematical properties that can simplify several widely-studied image-processing algorithms.1-4 For instance, rotations around the sensor center are converted to simple translations along the angular coordinate, and homotheties (uniform scalings) with respect to the center in the sensor plane become translations along the radial coordinate.
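These properties follow directly from the definition of the mapping. A compact statement, in our own notation (the article itself does not spell out the equations), is:

```latex
% Log-polar (cortical) coordinates of a Cartesian pixel (x, y) -- our notation:
\rho = \log\sqrt{x^{2}+y^{2}}, \qquad \theta = \operatorname{atan2}(y,\, x).
% A rotation by \alpha about the sensor center and a homothety of ratio k
% then act as pure translations in the cortical plane:
(\rho,\, \theta) \;\mapsto\; (\rho,\, \theta + \alpha) \quad\text{(rotation)},
\qquad
(\rho,\, \theta) \;\mapsto\; (\rho + \log k,\, \theta) \quad\text{(homothety)}.
```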

The sensor (with the space-variant density of pixels) and computational planes are called the retinal and cortical planes, respectively. The resolution of a log-polar image is usually expressed in terms of rings and number of cells (sectors) per ring. A common problem with this transformation is how to solve the central singularity: if the log-polar equations are strictly followed, the center would contain an infinite density of pixels, which cannot be achieved. This problem of the fovea (the central area with maximum resolution) can be addressed in different ways: the central blind-spot model, Jurie's model,5 and other approaches that give special transformation equations for this central area. Figure 1 shows an example of a log-polar transformation. At the left there is a Cartesian image of 440×440 pixels; at the center is the same image after a log-polar transformation with a central blind spot that gives a resolution of 56 rings with 128 cells per ring. Notice there is enough resolution at the center to perceive the cat in detail. The rest of the image is clearly worse than the Cartesian version, but this is the periphery of the image. This retinal image occupies less than 8 kB (56 × 128 = 7,168 cells), while the equivalent Cartesian image, at 440 × 440 = 193,600 pixels, is around 189 kB: roughly 24 times larger. The computational plane of the image is shown in Figure 1 (right).

The best way to obtain log-polar images depends on the available hardware and software. The simplest approach is to use software to transform a typical Cartesian image from a standard camera. This is done using the transformation equations between the retinal plane and the Cartesian plane. Since the transformation parameters can be tuned online, this solution is flexible. However, it can be excessively time-consuming if the computer must first process these images in order to perform another task. The other option is the purely-hardware solution: the log-polar transformation made directly by a sensor with this particular pixel distribution. An example of a log-polar sensor is a CMOS visual sensor designed with a resolution of 76 rings and 128 cells per ring.6 The fovea comprises the inner 20 rings, which follow a linear- (not log-) polar transformation to avoid the center singularity. This method fixes the image-transformation parameters and is not flexible.
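For the software route, a minimal sketch follows (our code, not the authors' implementation; the ring and sector counts and the blind-spot radius r0 are illustrative assumptions). It builds the log-polar image by direct lookup of Cartesian pixels, with a central blind spot to avoid the singularity. Recent OpenCV versions also ship a built-in log-polar warp (cv2.warpPolar), but the explicit lookup below keeps the mapping visible.

```python
# Sketch (our illustration) of a software Cartesian -> log-polar transform with
# a central blind spot. Parameters mirror the example in the text but are
# otherwise arbitrary.
import numpy as np

def cartesian_to_logpolar(img: np.ndarray, rings: int = 56, sectors: int = 128,
                          r0: float = 5.0) -> np.ndarray:
    """Nearest-neighbour log-polar resampling of a square grayscale image."""
    h, w = img.shape
    cx, cy = w / 2.0, h / 2.0
    r_max = min(cx, cy)
    # Ring i samples radius r0 * (r_max/r0)**(i/(rings-1)): logarithmic spacing
    # outside the central blind spot of radius r0.
    radii = r0 * (r_max / r0) ** (np.arange(rings) / (rings - 1))
    angles = 2.0 * np.pi * np.arange(sectors) / sectors
    out = np.zeros((rings, sectors), dtype=img.dtype)
    for i, r in enumerate(radii):
        xs = np.clip((cx + r * np.cos(angles)).astype(int), 0, w - 1)
        ys = np.clip((cy + r * np.sin(angles)).astype(int), 0, h - 1)
        out[i] = img[ys, xs]
    return out

# Usage on a synthetic 440 x 440 image, as in the article's example:
cartesian = np.random.randint(0, 256, (440, 440), dtype=np.uint8)
logpolar = cartesian_to_logpolar(cartesian)      # shape (56, 128), under 8 kB
print(logpolar.shape, logpolar.nbytes, "bytes")
```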

As an intermediate approach, a circuit that performs a Cartesian-to-log-polar image transformation can be implemented on a programmable device. This solution gives the advantage of speed while retaining flexibility: the transformation parameters can be changed on the fly. Moreover, the complexity and density of current reconfigurable devices represent a new trend in computer architecture, since it is possible to include microprocessors, DSP cores, custom hardware, and small memory blocks in a single chip.

The log-polar image-data reduction has several positive consequences for the processing system. The first and most obvious is that the

Figure 1. Left: A 440×440 Cartesian image. Center: A 128×56 log-polar image. Right: The computational image.

Continues on page 10.