
Color by Correlation: A Simple, Unifying Framework for Color Constancy

Graham D. Finlayson, Steven D. Hordley, and Paul M. Hubel

Abstract: This paper considers the problem of illuminant estimation: how, given an image of a scene, recorded under an unknown light, we can recover an estimate of that light. Obtaining such an estimate is a central part of solving the color constancy problem, that is, of recovering an illuminant-independent representation of the reflectances in a scene. Thus, the work presented here will have applications in fields such as color-based object recognition and digital photography, where solving the color constancy problem is important. The work in this paper differs from much previous work in that, rather than attempting to recover a single estimate of the illuminant as many previous authors have done, we instead set out to recover a measure of the likelihood that each of a set of possible illuminants was the scene illuminant. We begin by determining which image colors can occur (and how these colors are distributed) under each of a set of possible lights. We discuss in the paper how, for a given camera, we can obtain this knowledge. We then correlate this information with the colors in a particular image to obtain a measure of the likelihood that each of the possible lights was the scene illuminant. Finally, we use this likelihood information to choose a single light as an estimate of the scene illuminant. Computation is expressed and performed in a generic correlation framework which we develop in this paper. We propose a new probabilistic instantiation of this correlation framework and we show that it delivers very good color constancy on both synthetic and real images. We further show that the proposed framework is rich enough to allow many existing algorithms to be expressed within it: the gray-world and gamut-mapping algorithms are presented in this framework and we also explore the relationship of these algorithms to other probabilistic and neural network approaches to color constancy.

Index Terms: Color constancy, illuminant estimation, correlation matrix.

G.D. Finlayson and S.D. Hordley are with the School of Information Systems, University of East Anglia, Norwich NR4 7TJ, UK. E-mail: {steve, graham}@sys.uea.ac.uk.

P.M. Hubel is with Hewlett-Packard Labs, Hewlett Packard Inc., 1501 Page Mill Road, Palo Alto, CA 94306. E-mail: [email protected].

Manuscript received 26 July 1999; revised 25 Sept. 2000; accepted 9 Mar. 2001. Recommended for acceptance by S. Sarkar.


1 INTRODUCTION

An image of a three-dimensional scene depends on a number of factors. First, it depends on the physical properties of the imaged objects, that is, on their reflectance properties. But, it also depends on the shape and orientation of these objects and on the position, intensity, and color of the light sources. Finally, it depends on the spectral sampling properties of the imaging device. Various applications require that we be able to disambiguate these different factors to recover the surface reflectance properties of the imaged objects or the spectral power distribution (SPD) of the incident illumination. For example, in a number of applications ranging from machine vision tasks, such as object recognition, to digital photography, it is important that the colors recorded by a device are constant across a change in the scene illumination. As an illustration, consider using the color of an object as a cue in a recognition task. Clearly, if such an approach is to be successful, then this color must be stable across illumination change [28]. Of course, it is not stable since changing the color of the illumination changes the color of the light reflected from an object. Hence, a preliminary step in using color for object recognition must be to remove the effect of illumination color from the object to be recognized.

Accounting for the prevailing illumination is important in photography, too; it is known [1], [3] that the human visual system corrects, at least partially, for the prevailing scene illumination; therefore, if a photograph is to be an accurate representation of what the photographer saw, the photograph must be similarly corrected.

Central to solving the color constancy problem is recovering an estimate of the scene illumination, and it is that problem which is the focus of this paper. Specifically, we consider how, given an image of a scene taken under an unknown illuminant, we can recover an estimate of that light. We present in this paper a simple new approach to solving this problem, which requires only that we have some knowledge about the range and distribution of image colors which can be recorded by a camera under a set of possible lights. We discuss later in the paper how such knowledge can be obtained.

The work presented here builds on a range of computational theories previously proposed by other authors [21], [23], [8], [9], [4], [5], [27], [29], [14], [10]. The large number of such theories illustrates that the problem is difficult to solve. Part of the difficulty is due to the fact that the problem is inextricably tied up with other confounding phenomena: We have to account for changes in image intensity and color which are due to the shape of the objects, viewing and illumination geometry, as well as those due to changes in the spectral power distribution of the illuminant and the spectral reflectance properties of the imaged objects. Thus, to simplify the problem, many researchers [21], [23], [14], [8] have considered a simplified two-dimensional world in which all objects are flat, matte, Lambertian surfaces, uniformly illuminated. In this idealized scenario, image formation can be described by the following simple equation:

$$p_k^x = \int_\omega E(\lambda)\, S^x(\lambda)\, R_k(\lambda)\, d\lambda \qquad (1)$$

In this equation, $S^x(\lambda)$ represents the surface reflectance at a point $x$ in the scene: It defines what fraction of the incident light is reflected on a per-wavelength basis. $E(\lambda)$ is the spectral power distribution of the incident illuminant, which defines how much power is emitted by the illuminant at each wavelength. $R_k(\lambda)$ is the relative spectral response of the imaging device's $k$th sensor, which specifies what proportion of the light incident at the sensor is absorbed at each wavelength. These three terms, multiplied together and the product integrated over the interval $\omega$ (the range of wavelengths to which the sensor is sensitive), give $p_k^x$: the response of the imaging device's $k$th sensor at pixel $x$. It is clear from (1) that changing either the surface reflectance function or the spectral power distribution of the illuminant will change the values recorded by the imaging device. The task for a color constancy algorithm is to transform the $p_k^x$ so that they become independent of $E(\lambda)$ and, hence, correlate with $S(\lambda)$. Equivalently, the problem can be posed as that of recovering an estimate of $E(\lambda)$ since, with this knowledge, it is relatively straightforward [19] to recover an image which is independent of the prevailing illumination.

Even with the simplified model of image formation in (1), the problem remains difficult to solve. To see why, consider a typical imaging device with three classes of sensors: $k = 3$. (It is common to refer to the triplet of sensor responses $(p_1^x, p_2^x, p_3^x)$ as R, G, and B, or simply RGB, because typically the sensors measure the long (red), medium (green), and short (blue) wavelengths, respectively. These expressions are used interchangeably throughout this paper.) Now, if there are $n$ different surfaces in an image, then we have $3n$ knowns from (1). From these equations, we must estimate parameters for the $n$ surfaces and a single illuminant. Surface reflectance functions and illuminant SPDs, as well as sensor response functions, are typically specified by their value at a number ($m$) of discrete sample points within the visible spectrum. In this case, the image formation equation (1) can be rewritten as:

$$p_k^x = \sum_{i=1}^{m} E(\lambda_i)\, S^x(\lambda_i)\, R_k(\lambda_i)\, \Delta\lambda \qquad (2)$$

where the $\lambda_i$ are the sample points and $\Delta\lambda$ is the width between them. If surfaces and illuminants are each described by $m$ parameters in this way, then we have a total of $m(n + 1)$ parameters to solve for. It is clear that the number of knowns, $3n$, can never be bigger than the number of unknowns, $m(n + 1)$, regardless of how many distinct surfaces appear in a scene.
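To make the discrete model concrete, here is a minimal sketch of (2) in Python; the flat illuminant, the ramp reflectance, and the Gaussian sensor curves are invented placeholders, not data from the paper:

```python
import numpy as np

# Hypothetical sampled spectra: m = 31 samples across 400-700 nm.
wavelengths = np.linspace(400.0, 700.0, 31)
delta_lambda = wavelengths[1] - wavelengths[0]

E = np.ones(31)                    # illuminant SPD E(lambda_i), flat for illustration
S = np.linspace(0.2, 0.8, 31)      # one surface reflectance S(lambda_i)
R = np.stack([np.exp(-0.5 * ((wavelengths - mu) / 30.0) ** 2)
              for mu in (610.0, 540.0, 450.0)])   # R_k(lambda_i), k = 1..3

# Equation (2): p_k = sum_i E(lambda_i) S(lambda_i) R_k(lambda_i) * delta_lambda
p = (R * (E * S)).sum(axis=1) * delta_lambda
print(p)   # the 3-vector of sensor responses (R, G, B)
```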

Fortunately, it is often unnecessary to recover the full spectra of lights and surfaces; rather, it is sufficient to represent a light by the response of a device to a perfect diffuser viewed under it and, similarly, to represent a surface by the response it induces under some canonical light. Continuing with the case of an imaging device with three classes of sensor, this implies that lights and surfaces are described by three parameters each, so that the total number of parameters to be solved for is $3n + 3$ and the problem is thus further constrained. Nevertheless, the problem is still underconstrained; there are three more unknowns than knowns to solve for.

In this paper, we make one further simplification of the problem: rather than representing lights and surfaces by a 3-vector of sensor responses, $(p_1, p_2, p_3)^t$, we instead represent them in terms of their 2D chromaticity vectors, $(c_1, c_2)^t$, calculated from the original sensor response by discarding intensity information. There are many ways in which we might discard intensity information: One common way is to divide two of the sensor responses by the response of the third:

$$c_1^x = \frac{p_1^x}{p_3^x}, \qquad c_2^x = \frac{p_2^x}{p_3^x} \qquad (3)$$

Ignoring intensity information means that changes in surface color due to geometry or viewing angle, which change only intensity, will not affect our computation and, in addition, we have reduced the problem from a 3D one to a 2D one. Furthermore, we point out that, under the model of image formation described by (1), illumination can only be recovered up to a multiplicative constant.^1 However, even with this simplification, we still have $2(n + 1)$ unknowns and $2n$ knowns, so that the problem remains underconstrained.

^1 Within our model of image formation, the light incident at the imaging device is the product of illuminant spectral power and surface reflectance, $E(\lambda)S(\lambda)$. Clearly, the product $(sE(\lambda))(S(\lambda)/s)$ will result in the same incident light for any value of $s$. Hence, $E(\lambda)$ can only be recovered up to a multiplicative constant.
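A short sketch of the chromaticity transform (3), assuming an (N_pix, 3) array of sensor responses; the example pixel values are illustrative only:

```python
import numpy as np

def chromaticities(rgb):
    # Equation (3): c1 = p1/p3, c2 = p2/p3, discarding intensity.
    rgb = np.asarray(rgb, dtype=float)
    return np.stack([rgb[:, 0] / rgb[:, 2], rgb[:, 1] / rgb[:, 2]], axis=1)

# Scaling all three channels by the same factor leaves (c1, c2) unchanged.
pixels = np.array([[200.0, 10.0, 200.0], [100.0, 5.0, 100.0]])
print(chromaticities(pixels))   # both rows map to the same chromaticity
```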

Many authors [21], [5], [18], [23], [8] have tried to deal with the underconstrained nature of the color constancy problem by making additional assumptions about the world. For example, Land [21] assumes that every image contains a white patch; hence, there are now only 3n unknowns and 3n equations. Another assumption [5], [18] is that the average reflectance of all surfaces in a scene is achromatic. In this case, the average color of the light leaving the scene will be the color of the incident illumination. Yet another approach [23], [8] has been to model lights and surfaces using low-dimensional linear models and to develop recovery schemes which exploit the algebraic features of these models. Other authors have tried to exploit features not present in the idealized Mondrian world, such as specularities [22], [27], [29], shadows [11], or mutual illumination [15], to recover information about the scene illuminant. Unfortunately, the assumptions made by all these algorithms are quite often violated in real images, so that many of the algorithms work only inside the lab [15], [22], [29] and, while others can work on real images, their performance still falls short of good enough color constancy [16].

The fact that the problem is underconstrained implies that, in general, the combination of surfaces and illuminant giving rise to a particular image is not unique. So, setting out to solve for a unique answer (the goal of most algorithms) is perhaps not the best way to proceed. This point of view has only been considered in more recent algorithms. For example, the gamut-mapping algorithms developed by Forsyth [14] and, later, by Finlayson [10] and Finlayson and Hordley [12] do not, in the first instance, attempt to find a unique solution to the problem; rather, the set of all possible solutions is found and, from this set, the best solution is chosen. Other authors [4], [26], [9], recognizing that the problem does not have a unique solution, have tried to exploit information in the image to recover the most likely solution. Several authors [4], [9] have posed the problem in a probabilistic framework and, more recently, Sapiro [26], [25] has developed an algorithm based on the Probabilistic Hough Transform. The neural network approach [17] to color constancy can similarly be seen as a method of dealing with the inherent uncertainty in the problem. While these algorithms which model and work with uncertainty represent an improvement over earlier attempts at solving the color constancy problem, none of them can be considered the definitive solution. They are neither good enough to explain our own color constancy [3], nor are they good enough to support other visual tasks such as object recognition [16].

In Section 2, we address the limitations of existing algorithms by presenting a new illuminant estimation algorithm [20] within a general correlation framework. In this framework, illuminant estimation is posed as a correlation of the colors in an image and our prior knowledge about which colors can appear under which lights. The light that is most correlated with the image data is the most likely illuminant. Intuitively, this idea has merit. If the colors in an image "look" more yellow than they ought to, then one might assume that this yellowness correlates with a yellow illuminant. The correlation framework implements this idea in a three-step process. First, in a preprocessing step, we code information about the interaction between image colors and illuminants. Second, we correlate this prior information with the information present in a particular image. That is, the colors in an image are used to derive a measure of the likelihood that each of the possible illuminants was the scene illuminant. Finally, these likelihoods are used to recover an estimate of the scene illuminant. We develop a particular instantiation of this framework (a new correlation algorithm) which has a number of attractive properties. It enforces the physical realisability of lights and surfaces, it is insensitive to spurious image colors, it is fast to compute, and it calculates the most likely answer. Significantly, we can also calculate the likelihood of all possible illuminants; effectively, we can return the best answer together with error bars.

We further show (in Section 3) that the correlation framework we have developed is general and can be used to describe many existing algorithms. We will see how algorithms ranging from Gray-World [5] to Gamut Mapping [14], [10] to the Neural Network approach [17] relate to different definitions of color, likelihood, and correlation; yet, in all cases, the same correlation calculation results. Moreover, by examining these algorithms in the same framework, we will come to understand how our new simple algorithm builds on and, more importantly, improves upon other algorithms. Furthermore, we will show that algorithms such as gamut mapping, previously criticized for their complexity, are, in fact, no more complex than the simplest type of color constancy computation.

Finally, in Section 4, we present experimental evidence to show that our new algorithm formulated in this framework does provide very good color constancy: It performs better than the other algorithms we tested.

2 COLOR BY CORRELATION

We pose the color constancy problem as that of recovering an estimate of the scene illumination from an image of a scene taken under an unknown illuminant since, from this estimate, it is relatively straightforward to transform image colors to illuminant-independent descriptors [19]. Here, we restrict attention to the case of an imaging system with three classes of sensor. In such a case, it is not possible to recover the full spectral power distribution of the illuminant; so, instead, an illuminant with spectral power distribution $E(\lambda)$ is characterized by $p_E$: the response of the imaging device to an achromatic surface viewed under $E(\lambda)$. An estimate of the illuminant will accordingly be a 3-vector sensor response, $p_E$. However, as pointed out earlier, since we cannot recover the overall intensity of the illuminant, but only its chromaticity, we instead represent an illuminant by its 2D chromaticity vector $c_E$, derived from the 3-vector of sensor responses by discarding intensity information (for example, using (3)). We represent surfaces in a similar fashion; specifically, we define surface color by the chromaticity of the response which the surface induces in a device when viewed under some canonical illuminant.

The chromaticity coordinates define a two-dimensional space of infinite extent. However, to help us formulate our solution to the illuminant estimation problem, we make two further assumptions. First, we assume that a given device will produce responses only within a finite region of this space; for a device such as a digital camera giving 8-bit data, this is clearly valid since sensor responses will be integers in the range 0 to 255 and, using (3), calculable chromaticity coordinates will be in the range 1/255 to 1. Second, we assume that we can partition this space into $N \times N$ uniform regions. This assumption is justified on two grounds. First, all devices have some measurement error, which implies that all chromaticities within a region defined by this error must be considered equal. Second, for many applications, lights and surfaces with chromaticities within a certain distance of one another can effectively be considered to have the same color. Exactly how finely we need to partition this space (how big $N$ should be) will depend on the application. We consider this issue later in the paper.

Partitioning the chromaticity space in this way implies that there are at most $N^2$ distinct illuminants and $N^2$ distinct surfaces, so that there can be at most $N^2$ distinct chromaticities in an image. In practice, the range of illuminants which we encounter in the world is much more restricted than this, so that the number ($N_{ill}$) of possible illuminants will be much smaller than $N^2$. For convenience, we define an $N_{ill} \times 2$ matrix $C_{ill}$ whose $i$th row is the chromaticity of the $i$th illuminant. And we use the notation $C_{im}$ to represent the $N_{pix} \times 2$ matrix of image chromaticities.


We now solve for color constancy in three stages. First, we build a correlation matrix to correlate possible image colors with each of the set of $N_{ill}$ possible scene illuminants (see Fig. 1). For each illuminant, we characterize the range of possible image colors (chromaticities) that can be observed under that light (Fig. 1a). More details on how this can be done are given later in the paper. This information is used to build a probability distribution (Fig. 1b) which tells us the likelihood of observing an image color under a given light. The probability distributions for each light form the columns of a correlation matrix $M$ (Fig. 1c) (each row of the matrix corresponds to one of the $N \times N$ discrete cells of the partitioned chromaticity space). Given a correlation matrix and an image whose illuminant we wish to estimate, we perform the following two steps (illustrated in Fig. 2). First, we determine which image colors are present in the image (Fig. 2a). This information is coded in a vector $v$ of ones and zeros corresponding to whether or not a given chromaticity is present in the image. To this end, we define two operations, $chist()$ and $thresh()$. The operation $chist()$ returns an $N^2 \times 1$ vector $h$:

$$h(x_i, y_i) = \frac{count_i}{N_{pix}}, \qquad count_i = \sum_{j=1}^{N_{pix}} c_j, \qquad c_j = \begin{cases} 1 & \text{if } C_{im}(j) = (x_i, y_i) \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

that is, the $i$th element of $h$ holds the number of times a chromaticity corresponding to $(x_i, y_i)$ occurs in the image, normalized by $N_{pix}$, the total number of pixels in the image. For example, if chromaticity $(x_i, y_i)^t$ occurs 10 times in the image, then $h(x_i, y_i) = 10/N_{pix}$. The second operation, $thresh(h)$, ensures that each image chromaticity is counted only once, regardless of the number of times it occurs in the image. Formally,

$$thresh(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (5a)$$

$$thresh\left((h_1, h_2, \cdots, h_N)^t\right) = (thresh(h_1), thresh(h_2), \cdots, thresh(h_N))^t \qquad (5b)$$

With these definitions, $v$ can be expressed:

$$v = thresh(chist(C_{im})) \qquad (6)$$

Then, we determine a measure of the correlation between this image data $v$ and each of the possible illuminants. The usual expression of a correlation is as a vector dot-product. For example, if $a$ and $b$ are vectors, then they are strongly correlated if $a \cdot b$ is large. We use a similar dot-product definition of correlation here. Each column of the correlation matrix $M$ corresponds to a possible illuminant, so that the elements of the vector returned by the product $v^t M$ are a measure of how strongly the image data correlates with each of the possible illuminants.

Fig. 1. Three steps in building a correlation matrix. (a) We first characterize which image colors (chromaticities) are possible under each of our reference illuminants. (b) We use this information to build a probability distribution for each light. (c) Finally, we encode these distributions in the columns of our matrix.

Fig. 2 is a graphical representation of this process. The highlighted rows of the correlation matrix correspond to chromaticities present in the image (entries of $v$ which are one). To obtain a correlation measure for an illuminant, we simply sum the highlighted elements of the corresponding column. The result of this is a vector, $l$ (Fig. 2b), whose elements express the degree of correlation of each illuminant: the bigger an element of this vector, the greater the correlation between the image data and the corresponding illuminant:

$$l = v^t M = thresh(chist(C_{im}))^t M \qquad (7)$$

The final stage in solving for color constancy is to recover an estimate of the scene illuminant based on the correlation information (Fig. 2c). For example, we could choose the illuminant which is most highly correlated with the image data:

$$c_E = thresh2\left(thresh(chist(C_{im}))^t M\right) C_{ill} \qquad (8)$$

where $thresh2()$ returns a vector with entries corresponding to:

$$thresh2(h) = h', \qquad h'_i = \begin{cases} 1 & \text{if } h_i = \max(h) \\ 0 & \text{otherwise} \end{cases} \qquad (9)$$
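Equations (4)-(9) amount to a histogram, a threshold, and a matrix product. The following sketch assumes chromaticities have already been scaled into [0, 1) and that M, C_ill, and N are supplied; the binning scheme is an illustrative assumption:

```python
import numpy as np

def chist(C_im, N):
    # Equation (4): normalized histogram of image chromaticities over
    # the N x N discretized space, flattened to an (N*N,) vector.
    ix = np.clip((C_im[:, 0] * N).astype(int), 0, N - 1)
    iy = np.clip((C_im[:, 1] * N).astype(int), 0, N - 1)
    h = np.zeros(N * N)
    np.add.at(h, ix * N + iy, 1.0)
    return h / len(C_im)

def estimate_illuminant(C_im, M, C_ill, N):
    v = (chist(C_im, N) > 0).astype(float)   # thresh(), equations (5)-(6)
    l = v @ M                                # correlation vector l, equation (7)
    best = (l == l.max())                    # thresh2(), equation (9)
    # Equation (8); if several illuminants tie, we average their rows.
    return C_ill[best].mean(axis=0)
```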

Equation (8) represents our framework for solving color constancy. To completely specify the solution, however, we must define the entries of the correlation matrix $M$. Given a set of image data $C_{im}$, we would like to recover $Pr(E|C_{im})$: the probability that $E$ was the scene illuminant given $C_{im}$. If we know the probability of observing a certain chromaticity $c$ under illuminant $E$, $Pr(c|E)$, then Bayes' rule [7] tells us how to calculate the corresponding probability $Pr(E|c)$: the probability that the illuminant was $E$, given that we observe chromaticity $c$:

$$Pr(E|c) = \frac{Pr(c|E)\, Pr(E)}{Pr(c)} \qquad (10)$$

Here, $Pr(E)$ is the probability that the scene illuminant is $E$ and $Pr(c)$ is the probability of observing the chromaticity $c$; the set of possible illuminants is defined by the $N_{ill} \times 2$ matrix $C_{ill}$ and the range of possible chromaticities is defined by the $N \times N$ partitions of the chromaticity space. From (10), it follows that the probability that the illuminant was $E$, given the image data $C_{im}$, is given by:

$$Pr(E|C_{im}) = \frac{Pr(C_{im}|E)\, Pr(E)}{Pr(C_{im})} \qquad (11)$$

Now, noting that, for a given image, $Pr(C_{im})$ is constant and assuming that image chromaticities are independent, we can rewrite (11) as:

$$Pr(E|C_{im}) \propto \left[\prod_{\forall c \in C_{im}} Pr(c|E)\right] Pr(E) \qquad (12)$$

Furthermore, if we assume that all illuminants are equally likely, then we have:

$$Pr(E|C_{im}) = k \prod_{\forall c \in C_{im}} Pr(c|E) \qquad (13)$$

where $k$ is some constant. Of course, in general, it is not clear that all illuminants will occur with the same frequency; however, in the absence of information about the frequency of occurrence of individual lights, it makes sense to assume that all are equally likely. For some applications, it may be possible to provide such prior information and, in this case, with a slight modification to the framework, such information can be incorporated into the algorithm. For now, we simply assume that all lights are equally likely, but we also remind the reader that we are not considering all possible chromaticities as the set of potential illuminants; rather, we use a restricted set corresponding to "reasonable" lights.

Fig. 2. Solving for color constancy in three stages. (a) Histogram the chromaticities in the image. (b) Correlate this image vector v with each column of the correlation matrix. (c) This information is used to find an estimate of the unknown illuminant, for example, the illuminant which is most correlated with the image data.

Now, we define a likelihood function:

$$l(E|C_{im}) = \sum_{\forall c \in C_{im}} \log(Pr(c|E)) \qquad (14)$$

and note that the illuminant which maximizes $l(E|C_{im})$ will also maximize $Pr(E|C_{im})$. The log-probabilities measure the correlation between a particular chromaticity and a particular illumination. We can then define a correlation matrix $M_{Bayes}$ whose $ij$th entry is:

$$\log(Pr(\text{image chromaticity } i \,|\, \text{illuminant } j))$$

It follows that the correlation vector $l$ defined in (7) becomes the log-likelihood $l(E|C_{im})$:

$$l(E|C_{im}) = thresh(chist(C_{im}))^t M_{Bayes} \qquad (15)$$

and our estimate of the scene illuminant can be written:

$$c_E = thresh2\left(thresh(chist(C_{im}))^t M_{Bayes}\right) C_{ill} \qquad (16)$$

Equation (16) defines a well-founded maximum-likelihood solution to the illuminant estimation problem. It is important to note that, since we have computed likelihoods for all illuminants, we can augment the illuminant calculated in (16) with error bars. We could do this by returning the maximum-likelihood answer defined in (16) together with the likelihoods for each possible illuminant, defined in (15). These likelihoods tell us how much confidence we should have in the maximum-likelihood answer. Alternatively, rather than just returning the most likely answer, we could return this answer together with other illuminants which had similar likelihoods. That is, we could return a set of plausible illuminants $C_{plausible}$, which we define:

$$C_{plausible} = diag\left(thresh3(thresh(chist(C_{im}))^t M_{Bayes})\right) C_{ill} \qquad (17)$$

where we replace the thresholding function $thresh2()$ used in (16) with a new function $thresh3()$, such that:

$$thresh3(x) = \begin{cases} 1 & \text{if } x \geq m \\ 0 & \text{otherwise} \end{cases} \qquad (18)$$

and $m$ is chosen in an adaptive fashion such that $m \leq \max(h)$. The function $diag(a)$ returns a diagonal matrix whose nonzero entries are the elements of $a$, so that $C_{plausible}$ is an $N_{ill} \times 2$ matrix whose nonzero rows correspond to possible lights. This enables us to inform the user that the illuminant was, for example, either yellow tungsten or cool white fluorescent, but that it was definitely not blue sky daylight. Once more, we can return $C_{plausible}$ together with the likelihoods of all the illuminants. We believe that this is a major strength inherent in our method and will be of significant value in other computer vision applications. For example, face tracking based on color can fail if the color of the skin (defined in terms of image colors) varies too much. In the context of the current paper, this is not a problem so long as the variation is explained by the color constancy computation.
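A sketch of how M_Bayes might be assembled from per-illuminant chromaticity counts and used to return a plausible set as in (17) and (18); the smoothing constant and the adaptive margin are assumptions, since the paper does not fix them:

```python
import numpy as np

def build_m_bayes(counts, eps=1e-9):
    # counts[i, j]: how often discretized chromaticity i was observed
    # under illuminant j.  Entries are log Pr(chromaticity i | illuminant j);
    # eps guards log(0) for chromaticities never seen under a light.
    probs = counts / counts.sum(axis=0, keepdims=True)
    return np.log(probs + eps)

def plausible_illuminants(v, M_bayes, C_ill, frac=0.05):
    # Equations (17)-(18): keep every light whose log-likelihood is within
    # an adaptive threshold m <= max(l) of the best; frac is illustrative.
    l = v @ M_bayes
    m = l.max() - frac * abs(l.max())
    return C_ill[l >= m]
```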

3 OTHER ALGORITHMS IN THE FRAMEWORK

Equation (8) encapsulates our solution to the illuminant estimation problem. At the heart of this framework is a correlation matrix which encodes our knowledge about the interaction between lights and image colors. We show in this section that many existing algorithms can be reformulated in this same framework simply by changing the entries of the correlation matrix so as to reflect the assumptions about the interactions between lights and image colors made (implicitly) by these algorithms.

3.1 Gray-World

We begin with the so-called Gray-World algorithm, which, as well as being one of the oldest and simplest, is still widely used. This algorithm has been proposed in a variety of forms by a number of different authors [5], [18], [21] and is based on the assumption that the spatial average of surface reflectances in a scene is achromatic. Since the light reflected from an achromatic surface is changed equally at all wavelengths, it follows that the spatial average of the light leaving the scene will be the color of the incident illumination. Buchsbaum [5], who was one of the first to explicitly make the gray-world assumption, used it, together with a description of lights and surfaces as low-dimensional linear models [6], to derive an algorithm to recover the spectral power distribution of the scene illuminant $E(\lambda)$ and the surface reflectance functions $S(\lambda)$. To recover an estimate of the scene illuminant in the form we require, that is, in terms of the sensor response of a device to the illuminant, is trivial: we simply need to take the average of all sensor responses in the image. That is:

$$p_E = mean(RGB_{im}) \qquad (19)$$

Equation (19) can equivalently be written as:

$$p_E = hist(RGB_{im})^t \, I \, RGB_{ill} \qquad (20)$$

where $RGB_{ill}$ and $RGB_{im}$, respectively, characterize the set of all possible illuminants and the set of image pixels in camera RGB space (they are the 3D correlates of $C_{ill}$ and $C_{im}$ defined earlier). The operation $hist()$ is $chist()$ modified to work on RGBs rather than chromaticities, and the matrix $I$ is the identity matrix. In this formulation, $I$ replaces the correlation matrix $M_{Bayes}$ and, as before, can be interpreted as representing our knowledge about the interaction between image colors and surfaces. In this interpretation, the columns and rows of $I$ represent possible illuminants and possible image colors, respectively. Hence, $I$ tells us that, given a sensor response $p$ in an image, the only illuminant consistent with it is the illuminant characterized by the same sensor response. Correspondingly, the vector

$$l = hist(RGB_{im})^t \, I \qquad (21)$$


whose elements contain the number of image colors consistent with each illuminant, can be interpreted as a measure of the likelihood that each illuminant (each distinct RGB present in the image) is the scene illuminant. Based on these likelihoods, we calculate an estimate of the scene illuminant by taking the weighted average of all illuminants. For example, if $(200, 10, 200)^t$ appears in an image, then $(200, 10, 200)^t$ could be the illuminant color. Equally, if $(100, 50, 20)^t$ appears in the image, it too is a candidate illuminant. If both image colors appear in the image, then, according to $I$, they are both possible candidates for the illuminant. The mean estimate of the two provides a well-founded statistical estimate of the illuminant relative to the priors we have used (the diagonal matrix $I$). In fact, this example highlights one of the problems with using these priors since, according to our model of image formation (1), if either of these RGBs is considered the RGB of the illuminant, then the other cannot possibly be so.

While it is often used for color constancy, the Gray-World algorithm has a number of limitations. First, Gershon et al. [18] have pointed out that the spatial average computed in (19) is biased towards surfaces of large spatial extent. They proposed a modified algorithm which alleviates this problem by segmenting the image into patches of uniform color prior to estimating the illuminant. The sensor response from each segmented surface is then counted only once in the spatial average, so that surfaces of different size are given equal weight in the illuminant estimation stage. It is trivial to add this feature in our framework; we simply need to apply the thresholding operation $thresh()$, defined in (5), to the output from $hist()$:

$$p_E = thresh\left(hist(RGB_{im})^t\right) I \, RGB_{ill} \qquad (22)$$

A second limitation of the Gray-World algorithm is highlighted by (20): the identity matrix $I$ does not accurately represent our knowledge about the interaction of lights and surfaces. Improving color constancy, then, amounts to finding matrices such as $M_{Bayes}$ defined above which more accurately encode that knowledge.
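Read this way, Gray-World and its modified form are one line each; a sketch (array conventions as in the earlier examples):

```python
import numpy as np

def gray_world(rgb_im):
    # Equation (19): the mean sensor response is the illuminant estimate.
    return rgb_im.mean(axis=0)

def modified_gray_world(rgb_im):
    # Equation (22): each distinct sensor response is counted once, as in
    # the segmentation-based variant of Gershon et al. [18].
    return np.unique(rgb_im, axis=0).mean(axis=0)
```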

3.2 3D Gamut Mapping

The matrix $I$ tells us that a reddish RGB observed in an image is only consistent with an illuminant of that color. In fact, such an RGB is consistent with both a red surface under a white light and a white surface under a red light, and with many other combinations of surface and illuminant. Forsyth [14] developed an algorithm, called CRULE, to exploit this fact. CRULE is founded on the idea of color gamuts: the set of all possible sensor responses observable under different illuminants. Forsyth showed that color gamuts are closed, convex, and bounded and that, most importantly, each is a strict subset of the set of possible image colors. The gamut of possible image colors for a light can be determined by imaging all possible surfaces (or a representative subset thereof) under that light. We can similarly determine gamuts for each of our possible illuminants, i.e., for each row of $RGB_{ill}$, and can code them in a correlation matrix. That is, we define a matrix $M_{For}$, such that, if image color $i$ can be observed under illuminant $j$, then we put a one in the $ij$th entry of $M_{For}$; otherwise, we put a zero. The matrix $M_{For}$ more accurately represents our knowledge about the world and can be used to replace $I$ in (20).

From the colors present in an image, CRULE determined a set of feasible illuminants. An illuminant is feasible if all image colors fall within the gamut defined by that illuminant. In our framework, the number of image colors consistent with each illuminant can be calculated:

$$l = thresh\left(hist(RGB_{im})^t\right) M_{For} \qquad (23)$$

where $thresh()$ and $hist()$, as defined previously and used together, ensure that each distinct image color is counted only once. If there are $N_{surf}$ distinct surfaces in an image, then any illuminants corresponding to entries of $l$ which are equal to $N_{surf}$ are consistent with all image colors and are therefore feasible illuminants.

Once the set of feasible illuminants has been determined, the final step in CRULE is to select a single illuminant from this set as an estimate of the unknown illuminant. Previous work has shown [2] that the best way to do this is to take the mean of the feasible illuminants. In our framework, this can be achieved by:

$$p_E = thresh2\left(thresh(hist(RGB_{im}))^t M_{For}\right) RGB_{ill} \qquad (24)$$

where $thresh2()$ is defined as before. Equation (24), though equivalent to CRULE (with mean selection), is a much simpler implementation of it. In Forsyth's CRULE, the notion of which image colors can appear under which lights was modeled analytically as closed continuous convex regions of RGB space. Also, the illuminants themselves were not represented explicitly. Rather, an illuminant $p_o$ is defined by the mapping (actually a $3 \times 3$ diagonal matrix) that takes responses observed under $p_o$ to reference or canonical lighting conditions $p_c$. In CRULE, computation involves calculating mapping sets for each image color and then intersecting these sets to arrive at the overall plausible set (which contains those mappings that take all image colors to canonical counterparts).

Computation aside, our new formulation has another significant advantage over CRULE: rather than saying that an illuminant is possible if and only if it is consistent with all image colors, we can instead look for illuminants that are consistent with most image colors. This subtle change cannot be incorporated easily into the CRULE algorithm, yet it is important that it is since, in CRULE, if no illuminant is globally consistent, there is no solution to color constancy. To implement majority consistency in our framework is straightforward: we simply replace $thresh2()$ with the thresholding function $thresh3()$ defined in (18). Thus, the improved CRULE algorithm can be written as:

$$p_E = thresh3\left(thresh(hist(RGB_{im}))^t M_{For}\right) RGB_{ill} \qquad (25)$$
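A sketch of the reformulated CRULE of (23)-(25), assuming a precomputed binary gamut matrix M_For; the fallback to majority consistency mirrors the thresh3() variant:

```python
import numpy as np

def crule_estimate(v, M_for, RGB_ill):
    # v: binary image-color vector; M_for[i, j] = 1 iff color i can be
    # observed under illuminant j.  Equation (23): consistency counts.
    l = v @ M_for
    feasible = (l == v.sum())        # lights consistent with all image colors
    if not feasible.any():
        feasible = (l == l.max())    # majority consistency, equation (25)
    return RGB_ill[feasible].mean(axis=0)   # mean selection, equation (24)
```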

3.3 Color In Perspective (2D Gamut Mapping)

While the formulation of Forsyth's CRULE algorithm given above addresses some of its limitations, there are other problems with CRULE which this formulation doesn't resolve. First, Finlayson [10] recognized that features such as shape and shading affect the magnitude of the recovered light but not, to any significant degree, its color. To avoid calculating the intensity of the illuminant (which cannot be recovered [23]), Finlayson carried out computation in a 2D chromaticity space. Once again, if we characterize image colors and illuminants by their chromaticities, we can define a new matrix, $M_{Fin}$, whose $ij$th element will be set to one when chromaticity $i$ can be seen under illuminant $j$ and to zero otherwise. We can then substitute $M_{Fin}$ in (8):

$$c_E = thresh2\left(thresh(chist(C_{im}))^t M_{Fin}\right) C_{ill} \qquad (26)$$

Assuming the thresholding operations $thresh()$ and $thresh2()$ are chosen as for Forsyth's algorithm (we could, of course, use $thresh3()$ instead of $thresh2()$ if we wanted to implement majority consistency), the illuminant estimate $c_E$ is the averaged chromaticity of all illuminants consistent with all image colors. Previous work [12] has shown, however, that the mean chromaticity is not the best estimate of the illuminant and that the chromaticity transform should be reversed before the averaging operation is performed. This can be achieved here by defining a matrix $RGB^N_{ill}$ whose $i$th row is the $i$th row of $RGB_{ill}$ normalized to unit length. The illuminant estimate is now calculated:

$$p_E = thresh2\left(thresh(chist(C_{im}))^t M_{Fin}\right) RGB^N_{ill} \qquad (27)$$

Another problem with gamut mapping is that not all chromaticities correspond to plausible illuminants (for example, purple lights do not occur in practice). This observation is also simple to implement since we can simply restrict the columns of $M_{Fin}$ to those corresponding to plausible lights.

3.4 Illuminant Color by Voting

Sapiro [26], [25] has recently proposed an algorithm for estimating the scene illuminant which is based on the Probabilistic Hough Transform. In this work, Sapiro represents lights and surfaces as low-dimensional linear models and defines, according to this model, a probability distribution from which surfaces are drawn. Given a sensor response from an image, a surface is selected according to the defined distribution. This surface, together with the sensor response, is used to recover an illuminant. If the recovered illuminant is a feasible illuminant (in Sapiro's case, an illuminant on the daylight locus), a vote is cast for that illuminant. For each sensor response, many surfaces are selected and, so, many votes are cast. To get an estimate of the illuminant, the cumulative votes for each illuminant are calculated by summing the votes from all sensor responses in the image. The illuminant with maximum votes is selected as the scene illuminant.

The votes for all illuminants for a single sensor response $p$ represent an approximation to the probability distribution $Pr(E|p)$: the conditional probability of the illuminant given the observed sensor response. Sapiro chooses the illuminant which maximizes the function:

$$\sum_{p \in RGB_{im}} Pr(E|p) \qquad (28)$$

Since we know the range of possible image colors, rather than compute the probability distributions $Pr(E|p)$ on a per-image basis, we could instead, using Bayes' rule, compute them once for all combinations of sensor responses and illuminants. We can then define a matrix $M_{Sapiro}$ whose $ij$th entry is $Pr(\text{illuminant } j \,|\, \text{image color } i)$, which, by Bayes' rule, is proportional to the probability of observing image color $i$ under illuminant $j$. It then follows that Sapiro's estimate of the illuminant can be found in our framework by:

$$p_E = thresh3\left(hist(RGB_{im})^t M_{Sapiro}\right) RGB_{ill} \qquad (29)$$

where the matrix $M_{Sapiro}$ is equal to $k e^{M_{Bayes}}$. We note that, in (29), the image histogram is not thresholded, so that Sapiro's algorithm, like Gray-World, will be sensitive to large areas of uniform color.
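Since M_Sapiro stores probabilities where M_Bayes stores log-probabilities, one can be obtained from the other by an elementwise exponential (the constant k can be absorbed into the unnormalized entries); a sketch:

```python
import numpy as np

def sapiro_estimate(h, M_bayes, RGB_ill):
    # Equation (29): the histogram h is deliberately left unthresholded,
    # which is why large uniform regions can dominate the estimate.
    M_sapiro = np.exp(M_bayes)       # proportional to Pr(color | light)
    l = h @ M_sapiro
    return RGB_ill[l == l.max()].mean(axis=0)
```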

3.5 Probabilistic Algorithms

Brainard and Freeman [4] have recently given a Bayesian formulation of the color constancy problem. Their approach is again founded on a linear-models representation of lights and surfaces. That is, each light and surface is represented by a weighted sum of a small number of basis functions, so that these weights are sufficient to define a light or surface. Principal component analyses of collections of surfaces and illuminants were used to determine suitable basis functions and the corresponding weights for each light and surface. The authors then defined probability distributions for these weights and used Bayesian decision theory to recover estimates of the weights for the surfaces and illuminant in an image. So, if $x$ represents the combined vector of weights for all the surfaces and the light in an image, the problem is to estimate $x$.

If there are $N_{surf}$ surfaces in the image, then the vector to be recovered is $(3N_{surf} + 3)$-dimensional. Estimating $x$ is therefore computationally extremely complex. The authors have implemented the algorithm as a numerical search problem and shown results for the case $N_{surf} = 8$. However, since typical images contain many more surfaces than eight, as a practical solution for color constancy, this approach is far too complex. A precise formulation of their algorithm is not possible within our framework; however, we can use their prior distributions on surfaces and illuminants when constructing our correlation matrix. If we then restrict the problem to that of recovering an estimate of the unknown illuminant, then our approach should produce similar results.

An approach which is much closer to the algorithm we have presented was proposed by D'Zmura and Iverson [9]. They also adopted a linear-models representation of surfaces, but they used these models to derive a likelihood distribution $Pr((x, y)|E(\lambda))$: that is, the probability of observing a given CIE-xy chromaticity [31] coordinate under an illuminant $E(\lambda)$. This is done by first defining distribution functions for the weights of their linear model of surfaces. Then, they generated a large number of surfaces by selecting weights according to these distributions and calculated the corresponding chromaticity coordinates for these surfaces. By selecting a large number of surfaces, a good approximation to $Pr((x, y)|E(\lambda))$ can be found. If we put likelihoods corresponding to these probabilities in a correlation matrix, then this algorithm can be formulated in the framework we have developed. We point out that this algorithm, like Gray-World, takes no account of the relative frequency of individual chromaticities: the function $thresh()$ is not used. As such, the algorithm is again highly sensitive to large image areas of uniform color and, so, can suffer serious failures.

3.6 Neural Networks

The final algorithm we consider is the Neural Network approach of Funt et al. [17]. Computation proceeds in three stages and is summarized below:

$$\begin{aligned} output_1 &= thresh4\left(thresh(hist(C_{im})^t)\, M_{Funt}\right) \\ output_2 &= thresh4\left(output_1^t\, M_{Funt,1}\right) \\ output_3 &= output_2^t\, M_{Funt,2} \end{aligned} \qquad (30)$$

where $thresh4()$ is similar to $thresh3()$, but its exact definition has not been specified [17]. In the parlance of neural networks, $output_1$ is the first stage in a three-layer perceptron calculation. The second, hidden-layer computation is modeled by the second equation and the final output of the neural net is $output_3$. The correlation matrix $M_{Funt,1}$ typically has many fewer columns than $M_{Funt}$. Moreover, $M_{Funt,2}$ has only two columns and, so, the whole network outputs only two numbers. These two numbers are trained to be the chromaticity of the actual scene illuminant. As such, we can replace $M_{Funt,2}$ by $C_{ill}$ (though it is important to realize that, here, $C_{ill}$ is discovered as a result of training and does not bear a one-to-one correspondence with actual illuminants).

In the context of this paper, $output_1$ is very similar to (8), albeit with a different correlation matrix and slightly different threshold functions. The other two stages address the question of how a range of possible illuminants is translated into a single illuminant chromaticity estimate. As one might imagine, the Neural Net approach, which basically fits a parametric equation to model image data, has been shown to deliver reasonably good estimates. However, unlike the approach advocated here, it is not possible to give certainty measures with the estimate, nor is it possible to really understand the nature of the computation that is taking place.
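For comparison, (30) can be written as an ordinary forward pass; since thresh4() is unspecified in [17], a sigmoid stands in for it here purely as an assumption:

```python
import numpy as np

def neural_net_estimate(v, M_funt, M_funt_1, C_ill):
    # Equation (30) as a three-layer perceptron; the nonlinearity is assumed.
    thresh4 = lambda x: 1.0 / (1.0 + np.exp(-x))
    out1 = thresh4(v @ M_funt)       # first layer
    out2 = thresh4(out1 @ M_funt_1)  # hidden layer
    return out2 @ C_ill              # two outputs: illuminant chromaticity
```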

4 RESULTS

We conducted two experiments to assess the performance of our new correlation algorithm and to compare it to existing algorithms. First, to get some idea of the algorithm's performance over a large data set, we tested it on synthetically generated images. In a second experiment, we tested the algorithm on a number of real images captured with a digital camera. We show exemplar results of these tests for a small number of images.

In the experiments on synthetic images, we tested five algorithms: Gray-World, Modified Gray-World, 2D Gamut Mapping (with mean selection), Sapiro's algorithm, and the new Color by Correlation algorithm described in this paper. Gray-World simply uses the average of all the sensor responses in the image as an estimate of the unknown illuminant. Modified Gray-World is similar, except that each distinct sensor response is counted only once when forming the average, regardless of how many times it occurs in the image. More sophisticated segmentation algorithms could be used, as Gershon et al. [18] suggest; however, the question of how best to segment is still open and, so, we adopt the simplest approach here. Two-dimensional Gamut Mapping is Finlayson's Color In Perspective algorithm, formulated in the correlation framework, and Sapiro's algorithm is also reformulated in this same framework.

Before running the correlation-based algorithms, we must make the correlation matrices themselves. In the case of 2D Gamut Mapping (matrix $M_{Fin}$), we want to determine the range of image chromaticities that are possible under each of the illuminants between which we wish to distinguish. One way to obtain this information would be to take the camera and capture a wide range of surface reflectances under each of the lights between which we wish to distinguish. In this way, we can define the gamut of colors which the camera records under each light. However, this approach is somewhat cumbersome, especially if the number of possible lights is large. Fortunately, given some knowledge about our camera, we can instead generate these gamuts using synthetic data. Specifically, if we know the spectral response characteristics of the camera, then we need only measure the surface reflectances of a range of objects and the spectral power distribution of each illuminant between which we wish to distinguish. We can then use (2) to generate the set of possible sensor responses under each illuminant and, from these, we can calculate the corresponding chromaticities. We then take the convex hull of these chromaticities and consider that any chromaticity within the hull is part of the gamut (its corresponding entry in the correlation matrix is one) and that all other chromaticities are outside it (their corresponding entries are zero [14]).

We point out that, in using (2) to generate the sensor responses, we are assuming that the camera has a linear response whereas, in practice, this is often not the case. In such cases, we must account for any nonlinearity in the camera's response when calculating the responses; that is, we must characterize the nature of the camera's nonlinearity and modify (2) to take account of it. Alternatively, we could calculate the matrices using the assumption of linear data and then linearize the data recorded by the camera before applying the illuminant estimation algorithm.

The entries of the matrix $M_{Bayes}$ can also be found using synthetic responses generated using (2). However, now, rather than recording only whether a chromaticity is possible or not, we want to record the relative frequency with which it occurs. To do this, we can use the same set of surface reflectances to calculate chromaticity coordinates as before; then, to estimate the probability of a given image color, we simply count the number of surfaces falling in each bin of the discretized chromaticity space. The entries of $M_{Bayes}$ are the log of these raw counts normalized by the total number of chromaticities. The matrix $M_{Sapiro}$ is created in the same way, except that we put actual probabilities rather than log probabilities in the matrix.

When calculating the correlation matrices, we must decide how to partition the chromaticity space. This partitioning will depend both on the characteristics of the camera's sensors and on which chromaticity space is being used. There are many chromaticity spaces which could potentially be employed; for the experiments reported here, we used the coordinates:


$$c_1 = \left(\frac{p_1}{p_2}\right)^{1/3}, \qquad c_2 = \left(\frac{p_3}{p_2}\right)^{1/3} \tag{31}$$

since this leads to chromaticities which are reasonably uniformly distributed. This has the advantage that a simple uniform discretization of the space can be employed. The method is quite insensitive to the level of discretization of the space; we have achieved good results using discretizations ranging from 16×16 to 64×64. The results reported here are based on a 24×24 discretization of the space, which we have found to work well for a range of cameras.
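In code, the coordinates of (31) and a uniform 24×24 discretization might look like this minimal sketch (lo and hi are assumed to bound the coordinates over the chosen lights and surfaces; the flat bin index matches the column layout used above):

import numpy as np

def chroma(P):
    # The cube-root coordinates of (31); P: (n, 3) sensor responses
    # (p1, p2, p3) per row, with p2 assumed nonzero.
    return np.stack([(P[:, 0] / P[:, 1]) ** (1.0 / 3.0),
                     (P[:, 2] / P[:, 1]) ** (1.0 / 3.0)], axis=1)

def bin_index(c, lo, hi, n_bins=24):
    # Uniform n_bins x n_bins discretization of the chromaticity space.
    ij = np.clip(np.floor((c - lo) / (hi - lo) * n_bins), 0, n_bins - 1)
    return (ij[:, 0] * n_bins + ij[:, 1]).astype(int)   # flat bin index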

We tested the algorithms' performance using images synthesized using (2). For sensor response curves, we used monochromator measurements of an HP Photosmart C500 digital camera. These curves were then used to generate the correlation matrices. Key to the success of the method is the choice of lights and surfaces used to build the correlation matrices. For the experiments reported here, we wanted to generate a matrix which would give good results for a wide range of illuminants, so we chose a set of 37 lights representing a range of commonly occurring indoor and outdoor illumination. The set includes daylights with correlated color temperatures ranging from D75 to D40, Planckian blackbody radiators ranging from 3,500 K to 2,400 K, and a variety of fluorescent sources. The set of surface reflectances we use is most important when we test the algorithm on real images, since how well the distribution of these surfaces matches the distribution of reflectances in the real world will have a large effect on the success of the algorithm. For the synthetic experiments, it is enough to choose a wide range of reflectance functions. We used two sets: a set of 462 Munsell chips [31] and a collection of object surface reflectances measured by Vrhel et al. [30].

To create the synthetic images on which we tested the algorithms, we randomly selected between 2 and 64 surface reflectances and a single illuminant drawn from the set of 37. To make the test more realistic, we used reflectances from a set of natural surface reflectances measured by Parkkinen et al. [24], rather than the reflectances from which the correlation matrices were built. We calculated the sensor response for each surface and then assigned each surface a random number of pixels, chosen so that the pixels in each image totaled 512×512. Since many real images consist of large areas of uniform color (for example, outdoor scenes often contain large regions of blue sky), to make the images more realistic, the factors were chosen such that one surface always occupied at least 40 percent of the image.
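One way this pixel allocation might be realized is sketched below; the exact weighting scheme is our assumption, the paper only fixes the total pixel count and the 40 percent constraint:

import numpy as np

def surface_pixel_counts(n_surfaces, total=512 * 512, min_frac=0.4, rng=None):
    # Random pixel counts per surface: one surface covers at least
    # min_frac of the image, the rest share the remainder randomly.
    if rng is None:
        rng = np.random.default_rng()
    w = rng.random(n_surfaces - 1)
    w = w / w.sum() * (1.0 - min_frac)
    counts = np.round(np.append(min_frac, w) * total).astype(int)
    counts[0] += total - counts.sum()   # absorb rounding error
    return counts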

To determine an algorithm's estimate of the illuminant, we simply calculate an image histogram and then the likelihood for each illuminant according to (7). These likelihoods are then used to select a single illuminant from the set as an estimate of the scene illuminant. For the 2D Gamut Mapping algorithm, we used the mean selection method [12], whereas, for Sapiro's algorithm and the new Color by Correlation approach, we chose the illuminant with maximum likelihood.
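A sketch of this estimation step, assuming the image's chromaticities have already been mapped to flat bin indices as above:

import numpy as np

def estimate_illuminant(image_bins, M):
    # Correlation step: build a binary histogram over chromaticity bins
    # (each image colour counted once); the likelihood computation of (7)
    # then reduces to h^T M. M: (num_bins, num_lights) correlation
    # matrix. Returns the index of the maximum-likelihood light; for
    # 2D gamut mapping, the mean of the feasible lights would be taken
    # instead of the argmax [12].
    h = np.zeros(M.shape[0])
    h[np.unique(image_bins)] = 1.0
    likelihoods = h @ M
    return int(np.argmax(likelihoods))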

To assess the relative performance of the algorithms, we chose a root mean square error (RMSE) measure: specifically, the chromaticity error between the image under D65 illumination and an estimate (calculated using each algorithm's estimate of the illumination) of the image under D65. RMSE is commonly used in the computational color constancy literature [2], [17] and, while it is not the most intuitive error measure (it is not easy to interpret what a given RMS error means in terms of visual difference between images), it at least allows the relative performance of the algorithms to be assessed. To help give some idea of what a certain RMS error corresponds to in terms of visual difference, in the second experiment, on real images, we give RMS errors for a number of corrected images, together with a print of those images. RMS error is calculated as follows: Each of the algorithms was used to generate an estimate of the unknown scene illuminant in terms of a 2D chromaticity coordinate. We used this estimate to rerender the image as it would have appeared under standard daylight D65 illumination. Rerendering was performed by finding the diagonal mapping which takes the algorithm's estimate of the scene illuminant to the chromaticity of D65 illumination. We then applied this mapping to all RGBs in the image to obtain an estimate of the scene as it would have appeared under D65 illumination. We also performed a similar correction using a mapping from the chromaticity of the actual scene illuminant to D65 illumination. We then calculated the root mean square error in chromaticity space between these two images. Since algorithm performance is measured in a 2D chromaticity space and since the Gray-World algorithms work in 3D sensor space, it might be thought that the algorithms which set out to recover a 2D estimate would have an unfair advantage over the 3D algorithms. We tested this theory by modifying the Gray-World algorithms to work in 2D chromaticity space and found that this, in fact, led to worse performance than in the 3D case. For this reason, only the 3D results are reported here.
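The correction and error computation can be sketched as follows; the rg chromaticity used for the error is purely illustrative, as the paper does not pin the error measure to a particular 2D space:

import numpy as np

def rerender_to_d65(rgb, est_white, d65_white):
    # Diagonal (von Kries-style) mapping: scale each channel so that the
    # estimated illuminant maps to D65. rgb: (h, w, 3) linear data;
    # est_white, d65_white: (3,) sensor responses to the two lights.
    return rgb * (np.asarray(d65_white) / np.asarray(est_white))

def rms_chromaticity_error(img_a, img_b, eps=1e-12):
    # RMS difference between two images in a 2D chromaticity space,
    # here r = R/(R+G+B), g = G/(R+G+B) as an assumed example.
    def rg(img):
        s = img.sum(axis=-1, keepdims=True) + eps
        return (img / s)[..., :2]
    return float(np.sqrt(np.mean((rg(img_a) - rg(img_b)) ** 2)))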

Fig. 3 shows the relative performance of the five algorithms in terms of the average RMS chromaticity error against the number of surfaces in the image. Results were calculated for images with 2, 4, 8, 16, 32, and 64 surfaces and, in each case, the average was taken over 500 images. We can draw a number of conclusions from these results. First, accurately encoding information about the world leads to improved color constancy performance; the gamut-mapping algorithm and the two algorithms exploiting probability information all perform considerably better than the gray-world algorithms. Further, we can see that adding information about the probabilities of image colors under different illuminants further improves performance. It is important, though, that this information is encoded correctly. Our new algorithm, which correctly employs Bayes's rule, gives a lower average RMSE than the second best algorithm. However, Sapiro's algorithm, which does not correctly encode probability information, performs slightly worse than the 2D gamut-mapping algorithm.

Previous work [12] has demonstrated that 2D gamut mapping produces better results than most other algorithms [13], [12]. So, it is significant that our new approach delivers much better constancy. Moreover, the Neural Net approach (for which there is insufficient information for us to implement it) has also been shown [17] to perform similarly to 2D gamut mapping. Thus, on the basis of the results presented here, it is fair to say that, to our knowledge, the new algorithm outperforms all other algorithms.


The second experiment we ran was to test the performance of the algorithm on real images. We used images from two different digital cameras: an HP Photosmart C500 and a prototype digital still camera based on a Sony ICX085 CCD. Both cameras were modified so that they gave raw sensor data and were calibrated to ensure that this data was linearly related to incident light; thus, (2) is an accurate model of image formation. We measured the spectral sensitivity curves of both devices using a monochromator and used these measurements when building the correlation matrices for the various algorithms.

The raw data from the camera was averaged down by a factor of five (in width and height) and this averaged image was used as input to the illuminant estimation algorithms. Using each algorithm's estimate of the scene illuminant, we rerendered the full-size captured image to D65 illumination, following the procedure described above.
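The downsampling step is simple block averaging; a sketch (edges that do not fill a whole block are discarded here, a detail the paper does not specify):

import numpy as np

def block_average(img, f=5):
    # Average an image down by a factor f in width and height by taking
    # the mean over non-overlapping f x f blocks. img: (h, w, c).
    h, w, c = img.shape
    h2, w2 = h - h % f, w - w % f
    return img[:h2, :w2].reshape(h2 // f, f, w2 // f, f, c).mean(axis=(1, 3))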


Fig. 3. Average RMS chromaticity error for: Gray-World (dotted line with *), Modified Gray-World (dotted line), Sapiro's algorithm (dash-dot line), 2D Gamut Mapping (dashed line), and Color by Correlation (solid line).

Fig. 4. Left to right: raw camera image, followed by corrections based on the measured illuminant, Gray-World, and Color by Correlation. Images were taken under daylight D50 (top) and simulated D65 (bottom).


Figs. 4 and 5 and Table 1 show typical examples of the algorithms' performance for the two cameras. Fig. 4 shows results for two scenes captured with the first camera. For each scene, we show four images: the raw data as captured by the camera, the image rerendered to D65 illumination using a spectral measurement of the scene illuminant (obtained by placing a white tile in the scene and measuring the spectrum of the reflected light), the image rerendered using the illuminant estimate recovered by our new algorithm, and the image rerendered using the Gray-World estimate. It is clear that the image rerendered using the new algorithm's estimate of the illuminant is a very close match to that obtained using the measured illuminant. In contrast, the performance of Gray-World is worse: while it is better in terms of RMSE (see Table 1) than doing no correction, the images are visually a very poor match to the properly corrected image. Fig. 5 shows results for the second camera. Here, rather than comparing performance with the Gray-World algorithm, we show how well the 2D Gamut Mapping algorithm performs (the second best algorithm in the experiments with synthetic images). The 2D Gamut Mapping algorithm does perform better than the Gray-World approach; however, its performance is still some way behind that which can be obtained using the new algorithm. Table 1 summarizes the algorithm performance for the four images in terms of the RMS error measurement used in the synthetic experiments. Again, using this measurement, the new algorithm performs very well and better than the other algorithms tested. Four images do not represent an exhaustive test of the algorithm's performance and the results presented here are intended only to give an idea of the kind of performance that can be obtained with the various algorithms. Though the results are typical of the performance we have achieved with the new algorithm, we are currently in the process of compiling a database of images on which to test performance more thoroughly.

5 CONCLUSIONS

In this paper, we have considered the color constancy problem: that is, how we can find an estimate of the unknown illuminant in a captured scene. We have seen that existing constancy algorithms are inadequate for a variety of reasons; for example, many of them make unrealistic assumptions about images, or their computational complexity is such that they are unsuitable as practical solutions to the problem. Here, we have presented a correlation framework in which to solve for color constancy. The simplicity, flexibility, and robustness of this framework makes solving for color constancy easy (in a complexity sense). Moreover, we have shown how a particular Bayesian instantiation of the framework leads to excellent color constancy (better than the other algorithms tested). A number of other previously proposed algorithms were also placed within the correlation framework, and others, which cannot be precisely formulated within the framework, were shown to be closely related to it.

ACKNOWLEDGMENTS

This work was supported by Hewlett-Packard Incorporated and the EPSRC.


Fig. 5. Left to right: raw camera image, followed by corrections based on the measured illuminant, 2D Gamut Mapping, and Color by Correlation. Images were taken under illuminant A (top) and cool white fluorescent (bottom).

TABLE 1. Average Root Mean Square Error between Images Corrected to D65 Illumination Using an Estimate of the Scene Light and Images Corrected Using a Measurement of the Scene Light


REFERENCES

[1] L. Arend and A. Reeves, "Simultaneous Color Constancy," J. Optical Soc. Am. A, vol. 3, no. 10, pp. 1743-1751, 1986.

[2] K. Barnard, "Computational Color Constancy: Taking Theory into Practice," master's thesis, Simon Fraser Univ., School of Computing Science, 1995.

[3] D.H. Brainard, W.A. Brunt, and J.M. Speigle, "Color Constancy in the Nearly Natural Image. I. Asymmetric Matches," J. Optical Soc. Am. A, vol. 14, no. 9, pp. 2091-2110, 1997.

[4] D.H. Brainard and W.T. Freeman, "Bayesian Color Constancy," J. Optical Soc. Am. A, vol. 14, no. 7, pp. 1393-1411, 1997.

[5] G. Buchsbaum, "A Spatial Processor Model for Object Colour Perception," J. Franklin Inst., vol. 310, pp. 1-26, 1980.

[6] J. Cohen, "Dependency of the Spectral Reflectance Curves of the Munsell Color Chips," Psychonomic Science, vol. 1, pp. 369-370, 1964.

[7] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.

[8] M.M. D'Zmura and G. Iverson, "Color Constancy. I. Basic Theory of Two-Stage Linear Recovery of Spectral Descriptions for Lights and Surfaces," J. Optical Soc. Am. A, vol. 10, no. 10, pp. 2148-2165, 1993.

[9] M.M. D'Zmura and G. Iverson, "Probabilistic Color Constancy," Geometric Representations of Perceptual Phenomena: Papers in Honor of Tarow Indow's 70th Birthday, M.M. D'Zmura, D. Hoffman, G. Iverson, and K. Romney, eds., Laurence Erlbaum Assoc., 1994.

[10] G.D. Finlayson, "Color in Perspective," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 10, pp. 1034-1038, Oct. 1996.

[11] G.D. Finlayson and B.V. Funt, "Color Constancy with Shadows," Perception, vol. 23, pp. 89-90, 1994.

[12] G. Finlayson and S. Hordley, "A Theory of Selection for Gamut Mapping Colour Constancy," Proc. Computer Vision and Pattern Recognition '98, pp. 60-65, June 1998.

[13] G.D. Finlayson, P.M. Hubel, and S. Hordley, "Color by Correlation," Proc. Fifth Color Imaging Conf., pp. 6-11, Nov. 1997.

[14] D.A. Forsyth, "A Novel Algorithm for Colour Constancy," Int'l J. Computer Vision, vol. 5, no. 1, pp. 5-36, 1990.

[15] B.V. Funt, M.S. Drew, and J. Ho, "Color Constancy from Mutual Reflection," Int'l J. Computer Vision, vol. 6, pp. 5-24, 1991.

[16] B. Funt, K. Barnard, and L. Martin, "Is Machine Colour Constancy Good Enough?" Proc. Fifth European Conf. Computer Vision, pp. 455-459, June 1998.

[17] B.V. Funt, V. Cardei, and K. Barnard, "Learning Color Constancy," Proc. Fourth Color Imaging Conf., pp. 58-60, Nov. 1996.

[18] R. Gershon, A.D. Jepson, and J.K. Tsotsos, "From [R,G,B] to Surface Reflectance: Computing Color Constant Descriptors in Images," Perception, pp. 755-758, 1988.

[19] P.M. Hubel, J. Holm, G.D. Finlayson, and M.S. Drew, "Matrix Calculations for Digital Photography," Proc. Fifth Color Imaging Conf., pp. 105-111, Nov. 1997.

[20] P.M. Hubel and G.D. Finlayson, "Whitepoint Determination Using Correlation Matrix Memory," US Patent Application, submitted.

[21] E.H. Land, "The Retinex Theory of Color Vision," Scientific Am., pp. 108-129, 1977.

[22] H.-C. Lee, E.J. Breneman, and C.P. Schulte, "Modeling Light Reflection for Computer Color Vision," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 4, pp. 402-409, 1990.

[23] L.T. Maloney and B.A. Wandell, "Color Constancy: A Method for Recovering Surface Spectral Reflectance," J. Optical Soc. Am. A, vol. 3, no. 1, pp. 29-33, 1986.

[24] J. Parkkinen, T. Jaaskelainen, and M. Kuittinen, "Spectral Representation of Color Images," Proc. IEEE Ninth Int'l Conf. Pattern Recognition, vol. 2, pp. 933-935, Nov. 1988.

[25] G. Sapiro, "Color and Illuminant Voting," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 11, pp. 1210-1215, 1999.

[26] G. Sapiro, "Bilinear Voting," Proc. Int'l Conf. Computer Vision '98, pp. 178-183, Jan. 1998.

[27] S.A. Shafer, "Using Color to Separate Reflection Components," Color Research and Application, vol. 10, no. 4, pp. 210-218, 1985.

[28] M.J. Swain and D.H. Ballard, "Color Indexing," Int'l J. Computer Vision, vol. 7, no. 1, pp. 11-32, 1991.

[29] S. Tominaga, "Surface Reflectance Estimation by the Dichromatic Model," Color Research and Application, vol. 21, no. 2, pp. 104-114, 1996.

[30] M.J. Vrhel, R. Gershon, and L.S. Iwan, "Measurement and Analysis of Object Reflectance Spectra," Color Research and Application, vol. 19, no. 1, pp. 4-9, 1994.

[31] G. Wyszecki and W.S. Stiles, Color Science: Concepts and Methods, Quantitative Data and Formulas, second ed. New York: Wiley, 1982.

Graham D. Finlayson obtained the BSc degree in computer science from the University of Strathclyde (Glasgow, Scotland) and the MSc and PhD degrees in 1992 and 1995, respectively, from Simon Fraser University, Vancouver, British Columbia, Canada. From August 1995 to September 1997, he was a lecturer in computer science at the University of York, England, and from October 1997 until August 1999, he was a reader in color imaging at the Colour and Imaging Institute, University of Derby, England. In September 1999, he was appointed a professor in the School of Information Systems, University of East Anglia, Norwich, England.

Steven D. Hordley obtained the BSc degree in mathematics from the University of Manchester, England, in 1992 and the MSc degree in image processing from Cranfield Institute of Technology, Cranfield, England, in 1996. Between 1996 and 1999, he studied at the University of York (York, England) and the University of Derby, England, for the PhD degree in color science. In September 1999, he was appointed a senior research fellow at the University of East Anglia, Norwich, England, and, in October 2000, he was appointed a lecturer at the same institution.

Paul M. Hubel received the BSc degree in optical engineering from the University of Rochester in 1986 and the DPhil degree in engineering science from Oxford University in 1990. His doctoral thesis was on color reflection holography. Dr. Hubel also worked as a postdoctoral researcher at the MIT Media Lab in the areas of color holography, color reproduction, and color holographic video. He has been a member of technical staff at Hewlett-Packard Laboratories for six years, working in the printing technology department. His recent work has been on illumination estimation and color correction for digital cameras; past projects include color reproduction for scanners, copiers, and printers. Dr. Hubel is a member of IS&T and SPIE.

