

JOURNAL OF THE OPTICAL SOCIETY OF AMERICA

Statistics of Color-Matching Data*

W. R. J. BROWN

Kodak Research Laboratories, Rochester 4, New York

(Received November 19, 1951)

The distribution of color matches about a color center is investigated. Pearson's $\chi^2$ test is used, and the data are found to be normally distributed. A comparison is made between the observed variations of $g_{11}$, $g_{22}$, and $g_{33}$, the principal terms of the color-discrimination tensor, and the variations expected from statistical theory. The experimental and theoretical variations are found to agree very closely. Recently published values of $g_{ik}$ are shown to be too small by a factor of two. The reproducibility of color-discrimination ellipsoids, determined from sets of thirty color matches, is shown.

INTRODUCTION

When Silberstein¹,² obtained the formulas for determining the metric tensor of color space from color matchings in a three-primary colorimeter, he made one very reasonable assumption. He supposed that the distribution of these matches about any color center would be a normal or Gaussian distribution. This assumption was not without some experimental support. Silberstein and MacAdam³ analyzed the color matches made by Nutting in a colorimeter which provided a constant luminance field with variations of chromaticity in one direction through a color center.⁴ These data were distributed normally about their mean value. Although the data analyzed by Silberstein and MacAdam were results of "guided" color matches in distinction from the "free" color matches obtainable in a three-primary instrument, still the visual problem facing the observer is the same in both cases. He must match a fixed color in one-half of a bipartite field, using the variables provided for him in the other half of the field. If guided matches along a line through the color center are normally distributed, then it may be expected that free matches in any direction about the color would also be distributed normally. It was on this evidence that the theory was based, and on the theory, in turn, was based an extensive program of color matchings.⁵

The colorimeters used in this investigation permitted the observer to make "free" color matches. The observer may adjust the amounts of the three primary colors used in the colorimeter in any manner. In this way he may produce any kind of a color difference between the variable half of the colorimeter field and the fixed half. "Guided" matches, on the other hand, are those made with an apparatus which restricts the color changes to one specific variable at a time. Such variations are analogous to movements along a line in space, whereas "free" color matches are analogous to unrestricted movements in three-dimensional space.

* Communication No. 1465 from Kodak Research Laboratories.
1 L. Silberstein, Phil. Mag. (Ser. 7) 37, 126 (1946).
2 L. Silberstein, J. Opt. Soc. Am. 38, 71 (1948).
3 L. Silberstein and D. L. MacAdam, J. Opt. Soc. Am. 35, 32 (1945).
4 D. L. MacAdam, J. Opt. Soc. Am. 32, 247 (1942).
5 W. R. J. Brown and D. L. MacAdam, J. Opt. Soc. Am. 39, 808 (1949).

It might be mentioned at this point that if the distribution of color matches about any color center does not follow the Gaussian distribution, Silberstein's conclusions are not seriously affected. Nonnormal⁶ distributions have been analyzed statistically. Such analyses are considerably more difficult, and their results are only slightly different from the simpler analyses applicable to the Gaussian distribution function. Furthermore, if any more complex distribution is assumed, it implies a higher degree of assurance concerning color-matching data than the experiments warrant.

NORMALITY OF COLOR-MATCHING DATA

Some check on the observations seemed desirable, however, in order to ascertain the extent to which the assumption of normality of distribution in the data was fulfilled. The most satisfactory test of this normality is the chi-square test as defined by Pearson. In that test the function $\chi^2 = \sum (f_o - f_c)^2 / f_c$, where $f_o$ is the observed frequency of matches and $f_c$ is the calculated frequency, is evaluated for a uniform group of color matches. The probability $P(\chi^2)$ of obtaining any particular value of $\chi^2$ from a normal parent distribution may be found from tables published by Pearson⁷ and Fisher.⁸
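The chi-square statistic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the readings are simulated (not the paper's data), the bins are ten equal-width intervals chosen here for convenience, and the function name `chi_square_normality` is hypothetical. The degrees of freedom are taken as the number of bins minus three, the usual convention when the mean and standard deviation are estimated from the same data.

```python
import math
import random
import statistics


def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def chi_square_normality(readings, n_bins=10):
    """Return (chi2, dof) comparing binned readings with a fitted normal curve.

    Bins are equal-width intervals spanning the data; the two outer bins
    also absorb the tails of the fitted curve. Assumes the readings are
    not all identical.
    """
    mean = statistics.fmean(readings)
    sd = statistics.stdev(readings)
    lo, hi = min(readings), max(readings)
    width = (hi - lo) / n_bins
    edges = [lo + k * width for k in range(1, n_bins)]  # interior bin edges

    # Observed frequencies f_o.
    observed = [0] * n_bins
    for r in readings:
        k = min(int((r - lo) / width), n_bins - 1)
        observed[k] += 1

    # Calculated frequencies f_c from the fitted normal distribution.
    cdf = [0.0] + [normal_cdf(e, mean, sd) for e in edges] + [1.0]
    n = len(readings)
    expected = [n * (cdf[k + 1] - cdf[k]) for k in range(n_bins)]

    chi2 = sum((fo - fc) ** 2 / fc for fo, fc in zip(observed, expected))
    dof = n_bins - 3  # bins, less one for the total and two fitted parameters
    return chi2, dof


# Simulated stand-in for one group of about 240 scale readings.
random.seed(1)
readings = [random.gauss(0.0, 20.0) for _ in range(240)]
chi2, dof = chi_square_normality(readings)
print(f"chi2 = {chi2:.1f} with {dof} degrees of freedom")
```

The probability $P(\chi^2)$ would then be read from a chi-square table for the stated degrees of freedom, as the text describes.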

Values of $\chi^2$ and $P(\chi^2)$ for four groups of data are shown in Table I. The data analyzed are from repeated attempts to match the color of a semicircular red field, the diameter of which subtended two degrees at the eyes of the observer. The condition that varied from one to another of the four groups was the color of the surrounding field. (Each group included about 240 color matches.) In Table I the data for each of the three primaries are analyzed for each of the four divisions of data.

The criterion of $P(\chi^2)$ for the normality of data is a particularly stringent one. Low values of $P(\chi^2)$ are encountered, more frequently than the theory indicates, in the analysis of data drawn from parent distributions which are almost certainly normal.⁹ Worthing and Geffner¹⁰ suggest that values of $P(\chi^2)$ greater than 0.01 are sufficient to class a distribution as probably normal. In all but two cases in Table I, this standard is met.

6 T. C. Fry, Probability and Its Engineering Uses (D. Van Nostrand Company, Inc., New York, 1928), p. 205.
7 K. Pearson, Trans. Am. Math. Soc. 31, 133 (1929).
8 R. A. Fisher, Statistical Methods for Research Workers (Oliver and Boyd, London, 1925), Appendix, Table III.
9 T. C. Fry, Probability and Its Engineering Uses (D. Van Nostrand Company, Inc., New York, 1928), p. 294.

VOLUME 42, NUMBER 4, APRIL 1952

TABLE I. Values of $\chi^2$ and $P(\chi^2)$ for color matches made with a red field.

Field   Surround   Primary   No. of groups   $\chi^2$   $P(\chi^2)$
Red     Red        Red       18              38.8       <0.01
                   Green     18              24.1       0.16
                   Blue      11              13.6       0.25
        Green      Red       13              27.4       0.01
                   Green     21              41.0       <0.01
                   Blue      10              10.1       0.46
        White      Red        9              12.0       0.22
                   Green     16              25.0       0.07
                   Blue      13              19.5       0.11
        Blue       Red       12              23.3       0.03
                   Green     19              11.7       0.92
                   Blue      16              20.3       0.20

In order to illustrate the differences between the observed and calculated frequency distributions, the two are plotted together in Figs. 1 to 3. The smooth curve is the calculated frequency distribution, $f_c$, plotted on the basis of the standard deviation of the group of observations. The observed frequencies of color matches are connected by straight lines. The ordinate in the three figures is the frequency of data occurring in a given range of abscissas. The abscissas are plotted as deviations from the mean in terms of instrument scale-readings.

From these curves it can be seen that there is no significant difference between the observed distributions and those calculated on the assumption of a perfectly normal distribution. Some of the departures from the normal curve are quite large, it is true, but they seem to occur in a random fashion from one group to the next. Therefore, the data may be safely assumed to be normally distributed.

In order to assess the reproducibility of color-discrimination ellipsoids⁵ obtained from three-color matchings, it is of interest to ascertain the standard deviations of the components $g_{ik}$ of the discrimination or metric tensor. Silberstein²,¹¹ published a formula giving the errors of the three principal components of the tensor, $g_{11}$, $g_{22}$, and $g_{33}$. The formula is incorrectly labeled as the probable error of $g_{ii}$; it actually gives the standard deviation. The standard deviation of $g_{ii}$ may be written as

$$\sigma(g_{ii}) = g_{ii}\,(2/n)^{1/2}, \qquad i = 1, 2, 3,$$

where $n$ is the number of observations used to obtain the values of $g_{ik}$. It is of interest to note that the same formula for the precision of a precision index had been derived previously, and in a slightly different form, by Kendall.¹²

10 A. G. Worthing and J. Geffner, Treatment of Experimental Data (John Wiley and Sons, Inc., New York, 1943), p. 187.
11 L. Silberstein, J. Opt. Soc. Am. 36, 464 (1946).

FIG. 1. Frequency distribution of the red component of color matches made with a red field. Zigzag line represents the observed frequencies; smooth curve shows the calculated frequencies. (Abscissa: deviation from the mean, -80 to 80.)
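As a quick arithmetic check, the formula for the standard deviation of $g_{ii}$, that is $g_{ii}(2/n)^{1/2}$, can be evaluated for the $g_{ii}$ values listed in Table II. The sketch below assumes $n = 30$, the number of observations per set stated in the text; the dictionary layout and the function name `sigma_g` are illustrative choices, and the tabulated numbers are copied from Table II.

```python
import math

N_OBSERVATIONS = 30  # observations per set, as stated in the text

# (surround, g_ii, calculated sigma as tabulated) for the red field, from Table II
TABLE_II = {
    "g11": [("Red", 91.74, 23.6), ("Green", 10.66, 2.7),
            ("Blue", 51.12, 13.2), ("White", 13.98, 3.6)],
    "g22": [("Red", 609.10, 157.0), ("Green", 86.17, 22.2),
            ("Blue", 301.48, 77.8), ("White", 103.32, 26.6)],
    "g33": [("Red", 125.58, 32.4), ("Green", 78.20, 20.2),
            ("Blue", 63.80, 16.5), ("White", 102.10, 26.4)],
}


def sigma_g(g_ii, n):
    """Standard deviation of g_ii from sigma = g_ii * (2/n)^(1/2)."""
    return g_ii * math.sqrt(2.0 / n)


for component, rows in TABLE_II.items():
    for surround, g_ii, tabulated in rows:
        computed = sigma_g(g_ii, N_OBSERVATIONS)
        print(f"{component} {surround:5s}  computed {computed:7.2f}  tabulated {tabulated:7.1f}")
```

The computed values reproduce the tabulated "calculated" column to within rounding, which supports reading the formula with $n = 30$.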

In order to check this calculated value of $\sigma(g_{ii})$ with the data obtained from color-matching, the values of $g_{ik}$ were computed from eight sets of thirty observations each. This is the number of observations taken in a single matching session. From the eight values of $g_{ik}$ so obtained, the value of $\sigma(g_{ii})$ was computed. This observed value of the standard deviation of the principal components of $g_{ik}$ is compared with the calculated values in Table II. From this table it can be seen that in most cases the agreement between calculated and observed values of $\sigma(g_{ii})$ is very close.
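The comparison of observed and calculated scatter can be imitated with a small simulation. This is a one-dimensional analogue only, not the author's three-dimensional computation: it assumes that a precision index $g$ may be taken as the reciprocal of the sample variance of thirty normally distributed readings, and it uses many simulated sets rather than eight so that the result is stable. The function name and parameters are hypothetical.

```python
import math
import random
import statistics


def simulate_precision_spread(n_obs=30, n_sets=2000, true_sd=1.0, seed=0):
    """Return (mean, standard deviation) of g = 1/variance over many simulated sets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sets):
        sample = [rng.gauss(0.0, true_sd) for _ in range(n_obs)]
        estimates.append(1.0 / statistics.variance(sample))  # g for this set
    return statistics.fmean(estimates), statistics.stdev(estimates)


mean_g, sd_g = simulate_precision_spread()
print(f"simulated sigma(g)/g = {sd_g / mean_g:.3f}")
print(f"formula (2/n)^(1/2)  = {math.sqrt(2.0 / 30):.3f}")
```

Under these assumptions the simulated ratio tends to come out a little above $(2/n)^{1/2}$, since that expression is a large-sample approximation and thirty observations is a modest sample.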

The variations in $g_{ii}$ given in the above equation are the variations expected from sampling an infinite normal parent distribution of color matches. Many other variables are encountered in a psychophysical experiment, which tend to reduce the accuracy of the result. These variations include instrument changes, changes in the attitude of the observer, changes in the experimental conditions, and physiological changes in the observer. All these factors tend to upset the uniformity of the normal parent distribution from which the color matches are drawn. The agreement between the calculated and observed values of $\sigma(g_{ii})$ indicates that these fluctuations in the experimental conditions were not serious in the present experiment.

12 M. G. Kendall, The Advanced Theory of Statistics (Charles Griffin and Company, London, 1948), fourth edition, p. 224.

FIG. 2. Frequency distribution of the green component of color matches made with a red field. Zigzag line represents the observed frequencies; smooth curve shows the calculated frequencies. (Abscissa: deviation from the mean, -20 to 20.)

COMPARISON OF RESULTS

The values of $g_{ik}$ obtained by Brown and MacAdam⁵ and by Brown¹³ have been found to be not compatible with those derived by MacAdam¹⁴ from older experimental data.⁴ The ellipsoids derived from the more recently published values of $g_{ik}$, though labeled "standard deviation ellipsoids," are actually too large for that designation. A standard deviation ellipsoid would enclose 68 percent of the color-matchings made about a color center. The ellipsoids derived from the values given in the two most recent papers cited enclose 84 percent of the normally distributed color matches. To make the ellipsoids compatible with the standard deviation ellipsoids, it is necessary to reduce all their radii by a factor of $1/\sqrt{2} = 0.707$. The shapes and orientations of the ellipsoids are not affected. The corresponding values of $g_{ik}$ are twice the published values. All the published values of $C_{ik}$ and $G_{ik}$ also need to be doubled, in order to correspond to standard deviations of color-matching. The values of $g_{ik}$ given in the present paper correspond directly to the standard deviation ellipsoid.

13 W. R. J. Brown, J. Opt. Soc. Am. 41, 684 (1951).
14 D. L. MacAdam, J. Opt. Soc. Am. 33, 18 (1943).
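The 68 and 84 percent figures quoted above coincide with the fractions of a one-dimensional normal distribution lying within one standard deviation and within $\sqrt{2}$ standard deviations of the mean. The sketch below checks that arithmetic, along with the statement that shrinking the radii by $1/\sqrt{2}$ doubles $g_{ik}$ (a quadratic-form coefficient scales as the inverse square of the radius). The function name is illustrative.

```python
import math


def fraction_within(k):
    """Fraction of a one-dimensional normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


within_sigma = fraction_within(1.0)                  # about 0.683
within_root2_sigma = fraction_within(math.sqrt(2))   # about 0.843

radius_scale = 1.0 / math.sqrt(2.0)   # reduction applied to every radius
g_scale = 1.0 / radius_scale ** 2     # resulting factor on g_ik, about 2

print(f"within 1 sigma:       {within_sigma:.3f}")
print(f"within sqrt(2) sigma: {within_root2_sigma:.3f}")
print(f"radius factor {radius_scale:.3f} -> g_ik factor {g_scale:.3f}")
```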

TABLE II. Observed and calculated values of the standard deviation of $g_{ii}$.

Field   Surround   Component   $g_{ii}$   $\sigma_{obs}(g_{ii})$   $\sigma_{cal}(g_{ii})$
Red     Red        $g_{11}$     91.74      23.8                     23.6
        Green      $g_{11}$     10.66       4.0                      2.7
        Blue       $g_{11}$     51.12      16.1                     13.2
        White      $g_{11}$     13.98       5.3                      3.6
        Red        $g_{22}$    609.10     209.4                    157.0
        Green      $g_{22}$     86.17      11.0                     22.2
        Blue       $g_{22}$    301.48      97.0                     77.8
        White      $g_{22}$    103.32      27.2                     26.6
        Red        $g_{33}$    125.58      41.2                     32.4
        Green      $g_{33}$     78.20      24.0                     20.2
        Blue       $g_{33}$     63.80      29.2                     16.5
        White      $g_{33}$    102.10      47.2                     26.4

The error arose from a definition of $C_{ik}$ compatible with the modulus of precision, rather than with the standard deviation as stated and intended. The inappropriate definition of $C_{ik}$ is implicit in the formula

$$P = A \exp(-C_{ik} x_i x_k)\, dx_1\, dx_2\, dx_3,$$

which appeared on page 809 of reference 5. With that definition and for equally large ranges, $dx_1\, dx_2\, dx_3$, the probability of a match for which $C_{ik} x_i x_k = 1$ is $1/e$ times the probability of a match near the color center, for which $C_{ik} x_i x_k = 0$. According to accepted nomenclature, this ratio of probabilities is characteristic of deviations equal to the reciprocal of the modulus of precision, usually designated $h$. On the other hand, the intention was to define $C_{ik}$ so that $C_{ik} x_i x_k = 1$ for the standard deviation, usually designated as $\sigma$. According to accepted nomenclature, the standard deviation is that for which the probability is $e^{-1/2}$ times the probability of a match near the color center. Therefore, if $C_{ik}$ is to be defined such that $C_{ik} x_i x_k = 1$ for the standard deviation of $x_1$, $x_2$, $x_3$, then the original definition of $C_{ik}$ must be revised as follows:

$$P = A \exp(-\tfrac{1}{2} C_{ik} x_i x_k)\, dx_1\, dx_2\, dx_3.$$

The formulas derived for determining the values of $C_{ik}$, as originally defined, were all correct. Therefore, those formulas give one-half of the values of $C_{ik}$, as redefined. Consequently, the published values should be doubled to obtain the desired values of $C_{ik}$. Since the published values of $G_{ik}$ and $g_{ik}$ were derived from the published values of $C_{ik}$ by use of correct linear combinations, they should likewise all be doubled.†
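The relation between the two definitions can be checked numerically in one dimension. The sketch below is an illustration under assumptions: it uses a single variable $x$ with a hypothetical standard deviation `sigma`, so that the original definition reads $\exp(-C_{old} x^2)$ and the revised one $\exp(-\tfrac{1}{2} C_{new} x^2)$. The variable names are not from the paper.

```python
import math

sigma = 3.0  # hypothetical standard deviation, in instrument scale units

# A normal density is proportional to exp(-x^2 / (2 sigma^2)).
c_old = 1.0 / (2.0 * sigma ** 2)  # matches exp(-c_old * x^2), the original definition
c_new = 1.0 / sigma ** 2          # matches exp(-0.5 * c_new * x^2), the revised definition

x_old = 1.0 / math.sqrt(c_old)    # deviation at which c_old * x^2 = 1
x_new = 1.0 / math.sqrt(c_new)    # deviation at which c_new * x^2 = 1


def relative_density(x):
    """Density at x relative to the density at the center."""
    return math.exp(-x ** 2 / (2.0 * sigma ** 2))


print(f"c_new / c_old = {c_new / c_old:.1f}")
print(f"original unit deviation = {x_old / sigma:.3f} sigma, relative density {relative_density(x_old):.3f}")
print(f"revised unit deviation  = {x_new / sigma:.3f} sigma, relative density {relative_density(x_new):.3f}")
```

In this one-dimensional case the revised coefficient is twice the original, the unit deviation shrinks by $1/\sqrt{2}$, and the relative densities are $1/e$ and $e^{-1/2}$, matching the argument in the text.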

These corrections do not involve any error in the experimental method, data, theory, or results but are rather a revision of the manner of representation. The revision represents the original results of the color-discrimination experiments, but the scale of all the ellipsoids has been reduced by the factor 0.707, so as to correspond to the standard deviations of color matches, in place of the larger ellipsoids originally shown and misnamed.

FIG. 3. Frequency distribution of the blue component of color matches made with a red field. Zigzag line represents the observed frequencies; smooth curve shows the calculated frequencies. (Abscissa: deviation from the mean; panels for red, green, white, and blue surrounds.)

† The text of Brown and MacAdam (reference 5) may be corrected to correspond to the redefinition of $C_{ik}$, as follows:
Place ½ before the brackets following "exp" wherever exp appears on page 809.
Multiply $\pi^3$ by 8 in all cases.
Multiply $C_{ik} \sum x_i x_k / n$ by 4 in line 16 of the second column of page 809.
Delete the number 2 in lines 17, 33, and 35 of the second column of page 809 and in the third line from the bottom of page 810.
In the last line on page 826, change the number 8 to 4 in line 4 of page 810.
Delete V1 from lines 23 and 26 (three occurrences on page 810).

FIG. 4. Constant-luminance cross sections of color-discrimination ellipsoids for a field subtending two degrees with a blue surrounding-field, left, and a white surrounding-field, right.

When this correction is applied to the value of Weber's fraction obtained by Brown and MacAdam,⁵ the corrected values agree somewhat better with previously published results. The same is true of the values of this fraction obtained by Brown.¹³

The agreement of the cross sections at constant luminance of the color-discrimination ellipsoid with those of Nutting is also somewhat improved by the correction. The Nutting ellipses are correctly presented in the figures.⁵ Those for observers DLM and WRJB need reduction by a factor of 0.707.

The reduction of the size of the ellipsoids, produced by the correction just explained, brings them into better agreement with the data reported by Davidson.¹⁵ He found that in color judgments of textile samples the least-perceptible difference of chromaticity corresponded rather well with the published standard deviation ellipsoids. Repeated tests have indicated that the standard deviation of color-matching corresponds to only about one-third of the color difference which is "just-noticeable" in the same colorimeter. This ratio is based directly on instrument scale-readings and is not subject to the error discussed above. Therefore, Davidson's report seems a little hard to understand. The observation conditions in these experiments were somewhat different. In Davidson's experiment the observer viewed large textile samples with light surroundings. The colorimeter experiments permitted only a small (two-degree) matching field and a dark surrounding field. Comparison of large, closely juxtaposed samples might be expected to yield somewhat greater sensitivity to color differences than observation in a colorimeter, but a threefold advantage seemed excessive. Part of the discrepancy may be attributed to the fact that the "just-perceptible" color differences Davidson reported were based on a criterion of nine positive reports out of twelve judgments, rather than on judgments of color differences visible with certainty. The latter, admittedly an ambiguous criterion, was the basis for the report that the "just-noticeable" difference is three times as large as the standard deviation of color-matching. Finally, since the published ellipsoids to which Davidson referred were 30 percent too large, because of the inappropriate definition of $C_{ik}$, the discrepancy between the just-noticeable differences in the colorimeter and the least-perceptible differences in textile inspection is significantly less than first appeared.

15 H. R. Davidson, J. Opt. Soc. Am. 41, 104 (1951).

FIG. 5. Constant-luminance cross sections of color-discrimination ellipsoids for a red field subtending two degrees with a green surrounding-field, left, and a red surrounding-field, right.

FIG. 6. Constant-luminance cross sections of color-discrimination ellipsoids for a blue field subtending two degrees, left, and twelve degrees, right. The surrounding field was dark.

REPRODUCIBILITY OF ELLIPSOIDS

The specification of the standard deviation of the major components of $g_{ik}$ does not completely define the experimental uncertainty of the color-discrimination ellipsoid. Variations occur in the other three terms of the standard deviation ellipsoid. Whereas the three terms, $g_{11}$, $g_{22}$, and $g_{33}$, specify the size of the ellipsoid, these, together with the other three terms, $g_{12}$, $g_{23}$, and $g_{31}$, specify its orientation.

In order to assess the magnitude of these variations, the discrimination ellipsoid for each set of thirty observations was computed. Since eight of these sets of observations constitute a homogeneous group of data, the variation of the ellipsoids from one set to the next in a single group is indicative of the reproducibility of the ellipsoids. Cross sections of these ellipsoids through the plane of constant luminance are shown in Figs. 4-6. Figures 4 and 5 show four groups of data taken with a red matching-field, as indicated by the chromaticity coordinates. The differences between the four groups are variations in the chromaticity of the surrounding visual field. Figure 6 shows cross sections at constant luminance through the standard deviation ellipsoid for a blue field. The difference between these two groups of data is only the size of the matching field.

From these figures it can be seen that a set of thirty observations establishes the size and orientation of the discrimination ellipsoid quite well. Only occasionally does a set of observations yield a result which is divergent from the rest of the group. The variations are only a little greater than would be expected from the errors of sampling of a uniform parent distribution. This is, of course, indicated by the agreement in Table II between the calculated and observed values of $\sigma(g_{ii})$.

This conclusion is encouraging, for it indicates that the effects of random, uncontrolled variables in the experiment are not serious. It also suggests that the number of observations in one experimental group could be reduced from the 240 color matches now used to a somewhat lesser number with little loss in accuracy. This would permit a reduction in the time (and effort) required to collect color-discrimination data. This, in turn, would tend to reduce any long-term drifts which could be encountered in the experiment. It would allow the completion of a program of observations before the observer is an old man.
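To put rough numbers on the trade-off between group size and precision, the sampling formula $\sigma(g_{ii}) = g_{ii}(2/n)^{1/2}$ can be tabulated for a few values of $n$. This is only a sketch of the sampling contribution; the intermediate group sizes are arbitrary choices made here, and the other sources of variation discussed in the text are not modeled.

```python
import math


def relative_sigma(n):
    """Relative standard deviation of g_ii expected from sampling alone."""
    return math.sqrt(2.0 / n)


for n in (30, 60, 120, 240):
    print(f"n = {n:3d}: sigma(g_ii)/g_ii = {relative_sigma(n):.3f}")
```

On this basis alone, halving a group of 240 matches to 120 would raise the relative uncertainty from about 9 percent to about 13 percent.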

SUMMARY

To summarize these results, it may be said that the distribution of color matches about a color center is normal for the three-primary colorimeter used in color-discrimination investigation. The variation of the size of the color-discrimination ellipsoid is closely predicted by the variation caused by the error of statistical sampling.

Discrimination ellipsoids determined from relatively small sets of color matches are quite reproducible.
