analysis of rayleigh match data with psychometric · pdf fileanalysis of rayleigh match data...

11
Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1807 Analysis of Rayleigh match data with psychometric functions William H. Swanson Retina Foundationof the Southwest, Suite 400, 9900 North Central Expressway,Dallas, Texas 75231,and Department of Ophthalmology, University of Texas Southwestern Medical Center at Dallas, 5323 Harry Hines Boulevard, Dallas, Texas 75235 Received July 14, 1992; revised manuscript received January 21, 1993; accepted January 26, 1993 Color matches have been used for a variety of purposes, yet the psychometric properties of color-matching data have not been thoroughly investigated. A method is given for generating psychometric functions for the two ends of the color-matching range by use of a perceptual dimension for stimulus magnitude based on ratios of cone quantal catches. The analysis was applied to Rayleigh match data gathered from 250 naive observers with an automated protocol. Slopes of the psychometric functions were significantly shallower for anomalous trichromats than for normal trichromats, consistent with the assumption that stimulus magnitude is based on ratios of cone quantal catches. These results indicate that the tester's criterion for response consistency can strongly affect Rayleigh match widths. The analysis may also be useful for other perceptual tasks, such as con- trast matching and spatial alignment. INTRODUCTION Color-matching data have been used for more than a cen- tury to study abnormalities in color vision. In 1881, Lord Rayleigh' reported matching data for red-green color mixtures compared with a yellow standard. He noted that there was significant variability among normals and that color defective observers accepted matches outside the normal range. Twenty-six years later Nagel 2 de- signed a clinical instrument for color matching that he named the anomaloscope. Since then a variety of color- mixture equations and anomaloscope designs have been introduced. 3 Anomaloscopes have been used for a variety of tasks beyond identifying congenital color defects. Studies of Rayleigh match variation among normal trichromats have provided evidence for polymorphisms of human-cone photopigments, 4 and comparisons of Rayleigh matches with genetic data have provided insight into relationships between gene sequences and perception.' Studies of patients with retinal disease have shown that Rayleigh matches can be used to detect abnormalities in cone- photopigment regeneration' and that abnormal Rayleigh matches can result from early age-related macular changes. 7 Evaluation of Rayleigh matches of patients with a range of eye diseases has shown that optic-nerve diseases can produce widened matching ranges with nor- mal luminosity and that retinal diseases can produce widened matching ranges with abnormal luminosity. 3 The usefulness of anomaloscopy in this range of endeav- ors is due in large part to the fact that reliable data can be gathered quickly from inexperienced observers, provided that the tester has been thoroughly trained and follows a standardized protocol. 3 In a typical Rayleigh match two contiguous semicircles are presented: a standard field filled with a single monochromatic light (yellow) and a mixture field filled with two monochromatic lights (red and green). The tester's goal is to determine the range of red-green ratios for which the observer perceives the mix- ture field as matching the monochromatic field in color and brightness. The extremes of this range are the match end points, the middle of the range is the match midpoint, and the difference between the two end points is the match width. By systematically presenting different mixture ratios and noting the observer's responses, the tester can typically determine the two match end points in a few minutes. Despite the widespread use of color matching, there has been little effort to investigate it as a psychophysical task. This may be due to the fact that in color matching the stimulus magnitude varies along a perceptual, rather than a physical, dimension. The stimulus is the chromatic dif- ference between the mixture field and the standard field. Since the two fields are never physically identical, the match is perceptual and not physical. Whereas the use of a perceptual dimension for stimulus magnitude is not unique to color matching, it is quite different from the large majority of psychophysical tasks that employ stimuli that vary along a physical dimension of intensity (related tasks are mentioned below,in the Discussion section). In the latter case psychometric functions can be measured by determining probability of detection as a function of stimulus intensity; stimuli with a zero intensity can be used for blank trials, either to measure the false-alarm rate or to construct criteria-free methods (such as two- alternative forced choice). For instance, in measuring acuity the intensity is stimulus size, in measuring contrast thresholds the intensity is stimulus contrast, and in mea- suring spatial-frequency discrimination the intensity is difference in spatial frequency. For all these types of in- tensity, one end of the physical scale is zero intensity and the other end is maximum intensity. The analysis of observer consistency in color matching is complicated by the fact that blank trials cannot be used. If the observer gives inconsistent responses (i.e., gives different responses to repeated presentations of the same 0740-3232/93/081807-11$06.00 C 1993 Optical Society of America William H. Swanson

Upload: lytruc

Post on 30-Jan-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1807

Analysis of Rayleigh match data with psychometricfunctions

William H. Swanson

Retina Foundation of the Southwest, Suite 400, 9900 North Central Expressway, Dallas, Texas 75231, andDepartment of Ophthalmology, University of Texas Southwestern Medical Center at Dallas, 5323 Harry Hines

Boulevard, Dallas, Texas 75235

Received July 14, 1992; revised manuscript received January 21, 1993; accepted January 26, 1993

Color matches have been used for a variety of purposes, yet the psychometric properties of color-matching datahave not been thoroughly investigated. A method is given for generating psychometric functions for the twoends of the color-matching range by use of a perceptual dimension for stimulus magnitude based on ratios ofcone quantal catches. The analysis was applied to Rayleigh match data gathered from 250 naive observerswith an automated protocol. Slopes of the psychometric functions were significantly shallower for anomaloustrichromats than for normal trichromats, consistent with the assumption that stimulus magnitude is based onratios of cone quantal catches. These results indicate that the tester's criterion for response consistency canstrongly affect Rayleigh match widths. The analysis may also be useful for other perceptual tasks, such as con-trast matching and spatial alignment.

INTRODUCTION

Color-matching data have been used for more than a cen-tury to study abnormalities in color vision. In 1881, LordRayleigh' reported matching data for red-green colormixtures compared with a yellow standard. He notedthat there was significant variability among normals andthat color defective observers accepted matches outsidethe normal range. Twenty-six years later Nagel2 de-signed a clinical instrument for color matching that henamed the anomaloscope. Since then a variety of color-mixture equations and anomaloscope designs have beenintroduced.3

Anomaloscopes have been used for a variety of tasksbeyond identifying congenital color defects. Studies ofRayleigh match variation among normal trichromats haveprovided evidence for polymorphisms of human-conephotopigments,4 and comparisons of Rayleigh matcheswith genetic data have provided insight into relationshipsbetween gene sequences and perception.' Studies ofpatients with retinal disease have shown that Rayleighmatches can be used to detect abnormalities in cone-photopigment regeneration' and that abnormal Rayleighmatches can result from early age-related macularchanges.7 Evaluation of Rayleigh matches of patientswith a range of eye diseases has shown that optic-nervediseases can produce widened matching ranges with nor-mal luminosity and that retinal diseases can producewidened matching ranges with abnormal luminosity.3

The usefulness of anomaloscopy in this range of endeav-ors is due in large part to the fact that reliable data can begathered quickly from inexperienced observers, providedthat the tester has been thoroughly trained and follows astandardized protocol.3 In a typical Rayleigh match twocontiguous semicircles are presented: a standard fieldfilled with a single monochromatic light (yellow) and amixture field filled with two monochromatic lights (redand green). The tester's goal is to determine the range of

red-green ratios for which the observer perceives the mix-ture field as matching the monochromatic field in colorand brightness. The extremes of this range are the matchend points, the middle of the range is the match midpoint,and the difference between the two end points is thematch width. By systematically presenting differentmixture ratios and noting the observer's responses, thetester can typically determine the two match end pointsin a few minutes.

Despite the widespread use of color matching, there hasbeen little effort to investigate it as a psychophysical task.This may be due to the fact that in color matching thestimulus magnitude varies along a perceptual, rather thana physical, dimension. The stimulus is the chromatic dif-ference between the mixture field and the standard field.Since the two fields are never physically identical, thematch is perceptual and not physical. Whereas the use ofa perceptual dimension for stimulus magnitude is notunique to color matching, it is quite different from thelarge majority of psychophysical tasks that employ stimulithat vary along a physical dimension of intensity (relatedtasks are mentioned below, in the Discussion section). Inthe latter case psychometric functions can be measuredby determining probability of detection as a function ofstimulus intensity; stimuli with a zero intensity can beused for blank trials, either to measure the false-alarmrate or to construct criteria-free methods (such as two-alternative forced choice). For instance, in measuringacuity the intensity is stimulus size, in measuring contrastthresholds the intensity is stimulus contrast, and in mea-suring spatial-frequency discrimination the intensity isdifference in spatial frequency. For all these types of in-tensity, one end of the physical scale is zero intensity andthe other end is maximum intensity.

The analysis of observer consistency in color matchingis complicated by the fact that blank trials cannot be used.If the observer gives inconsistent responses (i.e., givesdifferent responses to repeated presentations of the same

0740-3232/93/081807-11$06.00 C 1993 Optical Society of America

William H. Swanson

Page 2: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

1808 J. Opt. Soc. Am. A/Vol. 10, No. 8/August 1993

mixture or gives a response of "redder" to a mixture thathas less red light than does a previous mixture to whichthe response was "greener"), then the trained tester mustrely on decision criteria based on testing expertise to de-termine the matching range end points. Since the numberof trials is small, it is not possible to quantify responseconsistency. These factors may reduce test-retest relia-bility and may introduce a source of test-site dependencefor color-matching data.

Recently, automated protocols that use computer-controlled anomaloscopes have been introduced that offerthe advantage of greater standardization and thus thepotential for more widespread clinical use of anoma-loscopy.8 9 Large numbers of trials and finer control overstep size provide the potential for analysis of responseconsistency. The current study evaluated a large clinicaldatabase gathered with an automated protocol. A methodwas developed for analyzing color-matching data in termsof psychometric functions by use of a perceptual stimulusdimension based on ratios of cone quantal catches. Re-sponse consistency was analyzed in terms of the slopes ofthese psychometric functions, and matching range endpoints were determined as the 6Q% points of the functions.

METHODS

ObserversA total of 250 naive observers participated in the study(data from laboratory personnel were excluded). Observ-ers ranged in age from 4 to 79 years [mean 1 standarddeviation (SD) = 26 + 19 years]. There were 120 observ-ers free of ocular disease (83 normal trichromats, 26deuteranomalous trichromats, and 11 protanomaloustrichromats) and 130 patients referred by ophthalmolo-gists specializing in retinal diseases or glaucoma (patientswith congenital color defects were excluded from thisanalysis). All observers were tested with the use of 2-degfields; for a separate study of the effect of field size, 30normal trichromats free of eye disease were also testedwith a range of field sizes.

ApparatusRayleigh matches were measured with a custom-madecomputer-controlled briefcase anomaloscope,8 which usesa combination of light-emitting diodes (LED's) and inter-ference filters in Maxwellian view (with a 2-mm artificialpupil) to yield narrow-band primaries of 667, 588, and551 nm. The left semicircle that is presented is filledwith a red-green mixture (667 and 551 nm), and the rightsemicircle is filled with a yellow standard (588 nm).

The experiment was controlled by a Macintosh II com-puter equipped with a six-channel 12-bit digital-to-analogconverter (DAC) board (National Instruments NB-AO-6)and a three-port digital input/output board (National In-struments NB-DIO-24). Each LED was driven by a lin-ear current amplifier that in turn received input from one12-bit DAC. For finer control over the red and greenLED's, their DAC's had variable voltage ranges that wereset by one additional DAC each.

ProcedureFor each trial, the stimulus was presented continuouslyuntil the observer pushed one of three response buttons:

either the left or the right button to indicate the left or theright semicircle as redder or else the middle button to in-dicate a perfect match. A bidirectional switch permittedcontrol over the radiance of the yellow standard. For datacollection on young children, the child described the colorsseen on each trial and the tester pushed the appropriatebutton.

The strategy of asking the observer to decide whichside was redder was adopted because of two problems ex-perienced in early tests that used the more usual task ofasking whether the mixture field was redder or greenerthan the standard. First, it was noted that some patientswere not able to respond properly with color names of red-der or greener yet were still able to define a range of mix-ture fields that matched the standard (e.g., some of thesepatients had severe tritan defects and referred to the colorof the green LED as white). By asking the patients to de-termine which of the two sides was redder, we eliminatedreference to apparent color, and the patient was requiredonly to refer to color differences. Second, in the middle ofthe test some naive normals would forget which of thebuttons indicated which color and would begin pushing abutton indicating a response opposite to the one that theyintended. The strategy of asking which side was redderminimized this problem, since the buttons were aligned sothat the left button meant left redder and the right buttonmeant right redder.

The mean retinal illuminance of the red-green mixturefield was held constant at 57 trolands (Td) (comparablewith the 3 cd/n 2 of the Nagel model I anomaloscope), andif necessary the observer adjusted the radiance of theyellow standard. As with the Nagel anomaloscope, for allmixtures the radiances of the LED's were such that, fordeuteranopes, the brightness of the mixture matched thebrightness of the standard. In this way R + G was heldconstant across all mixtures, and the stimuli are expressedin units of R/(R + G) ranging from 0.0 (pure green, withno red light) to 1.0 (pure red, with no green light). Theinitial brightness matches of the standard to the red andgreen primaries were used to determine whether an ob-server had deutan, protan, or scotopic spectral sensitivity.The anomaloscope was then run in deutan mode, protanmode, or scotopic mode. In deutan mode the retinal illu-minance of the standard was always 57 Td. In protan andscotopic modes the radiance of the standard was variedaccording to protanopic or scotopic spectral sensitivity.These modes worked well for most observers, but someobservers did require adjustments to the radiance of thestandard on each trial; when this occurred it lengthenedthe duration of the session.

A staircase procedure8 was employed to determine thetwo ends of the matching range by use of two interleavedstaircases, with one staircase controlling the mixture fieldfor even-numbered trials and the other controlling themixture field for odd-numbered trials. Each staircasebegan at either the red or the green end of the range, witha step size of half the range. At each reversal of the stair-case the step size was halved, until a step size of 0.003 wasobtained. The staircase continued until six reversalswere obtained at this step size. Each staircase followeda "left redder" response by making the mixture fieldgreener for the next presentation in the staircase and fol-lowed a "right greener" response by making the mixture

William H. Swanson

Page 3: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1809

0.6-

t0.4-

0.2-

0.010 20 30 40 50

TRIAL NUMBERFig. 1. Example of staircases obtained with the briefcase-anomaloscope procedure.8 Open squares are from the staircasesearching for the red end of the matching range, and filledsquares are from the staircase searching for the green end.

field redder for the next presentation in the staircase.The two staircases differed in how they followed a "match"response: the staircase for the even-numbered trialsmade the next presentation redder, whereas the staircasefor the odd-numbered trials made it greener. Thus thestaircases converged on the two match end points (com-puted as the means of the last six reversals for each stair-case). The mean (±1 SD) number of trials for the250 observers was 62 (±17); with normal adults this typi-cally required 2 to 5 min, although more time was requiredfor children and for some color-defective observers. Anexample is shown in Fig. 1.

Data AnalysisTo analyze the data in greater detail, two yes-no psycho-metric functions were constructed for each data set bycombining responses for both staircases: a G functionfor the question "Is the mixture greener than the stan-dard?" and an R function for the question "Is the mixtureredder than the standard?" The green (EG) and red (ER)

ends of the matching range were determined from thesefunctions. A response of "right redder" was coded as"yes" for the G function and "no" for the R function. Aresponse of "left redder" was coded as "no" for theG function and "yes" for the R function. A response of"match" was coded as "no" for both functions. For eachfunction the fraction of "yes" responses was computed forevery mixture condition presented. Figure 2 shows thetwo psychometric functions generated from the staircasesshown in Fig. 1.

Since the difference in color between the red-greenmixture and the yellow standard is a perceptual ratherthan a physical dimension for stimulus magnitude, it isnecessary to construct an appropriate stimulus metric.The metric used was based on the assumption that thematch midpoint is the mixture ratio at which the quantalcatch ratio for the long-wavelength-sensitive and middle-wavelength-sensitive cones equals the cone quantal catchratio for the yellow standard. The magnitude of the chro-

matic difference between any given mixture field and thestandard is therefore determined by the difference be-tween the cone quantal catch ratios for the mixture andfor the match midpoint.

Two constraints are placed on the stimulus metric.First, the metric must be referenced to the match endpoint, so that the end point is the 50% point of the corre-sponding psychometric function. For stimulus magnitudex, the probability P(x) of an observer's perceiving the colordifference (i.e., perceiving the test field as either redder orgreener than the standard, depending on which functionwas being evaluated) was computed by using

P(x) = 1 -2- (1)

where /3 is a parameter that determines the slope of thefunction." Therefore the metric must be constructed sothat x = 1 when the mixture value equals the end point.Second, the metric must be such that the slope of thepsychometric function is independent of the match mid-point, so that when the match midpoint is changed thefunction is simply moved horizontally along the stimulusaxis without change in shape. These two constraints weresatisfied by setting stimulus magnitude x equal to theantilogarithm of the difference between test mixture Mand the corresponding end of the matching range:

x = 1 0E-M for the G function,

x = 10m-R for the R function.

(2)

(3)

Usually the slope of a psychometric function is consid-ered to reflect effects of internal noise. However, in thiscase the slope could also be affected by changes in conephotopigments. If the photopigments have altered conequantal catches (as a result of either abnormal photo-pigments or reduced cone optical density), then the ratiosof cone quantal catches for the test and standard hemi-fields may change more slbwly with mixture ratio than fornormal cone quantal catches. This would effectivelyminify the stimulus magnitude, reducing the slope by aconstant factor.

During testing, some observers reported that they occa-sionally pushed the wrong button by mistake; others mayhave made such errors without being aware of it. Fittingsuch data with a function that asymptotes to 100% wouldresult in a shallow slope. To avoid this potential source ofshallow slopes, the parameter y was introduced to denotethe upper asymptote:

fraction of "yes" responses = yP(x). (4)

To fit the data, psychometric functions were con-structed for a range of values of the following parame-ters: EG and ER from 0.0 to 1.0 in steps of 0.0025, log(3)from 0.3 to 2.25 in steps of 0.15, and y from 90% to 100%in steps of 2%. The likelihood of obtaining the data setwas calculated for each parameter set, and the parameterset with the highest likelihood was used as the maximum-likelihood estimate." Details for this type of analysis aregiven elsewhere.2

In Fig. 2 the R function shows considerable responseinconsistency (fractions "yes" other than 0.0 or 1.0) when0.55 < R/(R + G) < 0.58. This region is shown in theinset, along with the G and R functions with the highest

William H. Swanson

Page 4: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

1810 J. Opt. Soc. Am. A/Vol. 10, No. 8/August 1993

0.8 -

wIL> 0.6-z0

U.-

0.2 -

0.04-0.0

-- -&em

I

0.2 0.4

-0- G-function|I+* R-function|

[_ I00.6 0.8 1.0

R/(R+G)Fig. 2. R and G functions derived from staircase data shown in Fig. 1. Open circles connected by dashed lines give the fractions "yes"for the G function, and filled circles connected by solid lines give the fractions "yes" for the R function. Note that the R function issomewhat erratic just below R/(R + G) = 0.6. Since only a small number of trials (1-5) were presented for each mixture (there werea total of 52 trials across all mixtures), a few mistakes in button pushing could have produced this amount of noise in the data; the ob-server reported after the experiment had concluded that she had mistakenly pushed the wrong button on a few occasions. ForR/(R + G) = 0.578, the observer made two responses of "same" early in the staircases and one response of "redder" later on, giving 33%for fraction "yes. " For R/(R + G) = 0.569, the observer responded "same" on one trial and "redder" on the other. Responses were "red-der" for all of the remaining trials for test mixtures between 0.566 and 0.581. The inset shows data from the range R/(R + G) = 0.54-0.59, along with the best-fitting psychometric functions. Solid curves are the best-fit functions with all parameters permitted to vary;the dashed curve is the best fit R function when the upper asymptote y was fixed at 100%o. These two R functions have similar values forthe red end (ER) of the matching range (differing by only 0.5 Nagel unit) but much different slopes (differing by 0.6 log unit).

likelihood of having generated the data from the staircasesshown in Fig. 1. For each psychometric function therewere 1-5 trials per point, and likelihoods were weightedby the number of trials. The best-fitting R function hady = 90%; if the upper asymptote were fixed at 100%, thena much shallower slope would be required, as shown by thedashed curve. Both functions gave similar values for thered end (ER) of the matching range.

SimulationsTo investigate the reliability of the parameters obtainedwith the use of the data analysis, Monte Carlo simulationswere performed by using model observers. Details of thistype of simulation are given elsewhere.'2 Each model ob-server was defined by a match midpoint and a range andby a slope and an upper asymptote for the two psycho-metric functions (the same slope and upper asymptotewere used for each function).

For each model observer 100 simulated experiments

were conducted. Each experiment used the same algo-rithm as was used in testing patients. Responses of themodel observer to a test mixture were generated by select-ing one random number (between 0 and 1) for each psycho-metric function; when the random number was higherthan the value of the psychometric function at the testmixture, then this was coded as "higher"; otherwise it wascoded as "lower." If both of the psychometric functionsyielded "lower," then a response of "same" was returned.If only one of the psychometric functions yielded "higher,"then the corresponding response ("redder" or "greener")was returned. If both functions yielded "higher," thenthe response returned corresponded to the function thathad the greater difference between random number andfunction value.

For each simulated experiment, results were analyzedby using the same maximum-likelihood procedure usedfor analyzing patient data. Results were summarized bystoring the means and SD's of the differences between the

w

IL

R/(R+G)

-- w I

1111

b

William H. Swanson

I

Page 5: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1811

parameters obtained by maximum-likelihood estimationand the parameters used to define the model observer.The mean differences provide estimates of the bias in themaximum-likelihood estimates, and the SD's provide esti-mates of the accuracy of the estimates.

Simulations were performed for a total of 36 differentmodel observers, based on all possible combinations forthree different midpoints, two different widths, threedifferent slopes, and two different upper asymptotes.The three midpoints were selected as the mean values forthe healthy normal trichromats [R/(R + G) = 0.56], thedeuteranomalous trichromats [R/(R + G) = 0.26], andthe protanomalous trichromats [R/(R + G) = 0.86]. Thethree slopes were selected as the mean value for all 250observers [log(Q) = .1.35] and as the values 2 SD's aboveand below this [log(Q) = 0.36 and 2.34]. The upperasymptotes were selected as high (100%) and low (90%),and the match widths were selected as narrow (0.01) andwide (0.20).

RESULTS

Psychometric FunctionsThe slopes of the psychometric functions showed consider-able individual variability within each group, as indicatedby the SD's for log(Q) listed in Table 1. There was alsoone significant between-group difference: congenitalcolor-defective observers tended to have shallower slopesthan did normal trichromats. Figure 3 compares valuesfrom healthy normal trichromats with values from healthyanomalous trichromats. The difference in mean valuesfor log(B) was highly significant [Student's t test (t) =7.96, p < 0.001], although the difference in SD's was not[F distribution (F) = 1.29, p > 0.20]. Among normaltrichromats there were no slope differences between adultswith healthy eyes (21 to 66 years old), children withhealthy eyes (4 to 10 years old), or adults with eye diseases(21 to 79 years old) (for comparisons among the threegroups, in all cases t < 0.85, p > 0.50 and F < 1.67,p > 0.05).

In addition to the slope differences across individuals,there were also slope differences within individuals. Themean values for log(f3) across all 250 data sets were notdifferent for the G function and the R function (t = 0.72,p > 0.50). However, for any given individual there wereoften considerable differences between the slopes of the Gand R functions: the mean of the absolute value of thedifference in log(Q) was 0.35 ± 0.35 log unit. Slopes alsotended to be shallower for the smallest fields. For the 30normal trichromats tested with the use of several fieldsizes, mean log(Q) was 1.72 ± 0.35 with a 4-deg field and1.23 ± 0.45 with a 0.5-deg field (t = 6.53, p < 0.001).

For the majority of data sets the upper asymptotes were100%: 142 (57o) of the data sets had y = 100% for bothG and R functions, and only 5 (2%) of the data sets hady < 100% for both G and R functions. The mean value±1 SD for y was 98% ± 4% for both the G and the R func-tions. For each of the subsets listed above, the meanvalue for y was also 98%.

Match Midpoints and WidthsOverall, there was good agreement between matchingranges determined by the means of the final six reversals

of the staircase and by maximum-likelihood estimation.For the following comparisons the R/(R + G) values weremultiplied by 73, yielding units equivalent to those on theNagel model I anomaloscope. Averaged across all 250 datasets, the difference between maximum-likelihood andmean-of-reversals estimates was -0.3 ± 2.2 for EG, 0.2 ±1.8 for ER, 0.1 ± 1.5 for the midpoint, and 0.5 ± 2.7 for thewidth. The small mean differences indicate that there islittle tendency for the maximum-likelihood procedure con-sistently to yield higher or lower estimates than the meanof reversals. On the other hand, the SD's indicate thatthe difference between the two estimates was large forsome data sets. For 192 (77%) of the data sets the differ-ence between the 2 estimates was not >2 Nagel units forboth match midpoint and matching range, yet for 45 (18%)of the data sets the difference was >5 Nagel units foreither match midpoint or matching range. Table 2 sum-marizes differences between these two groups of data sets.For the data sets in which the two estimates were in pooragreement, slopes were shallower, upper asymptotes werelower, and numbers of trials were greater (indicating thatthe staircases took longer to obtain the required numberof reversals); for all parameters the differences were sig-nificant (in all cases t > 2.7, p < 0.01).

Linear regression was used to search for effects of ageon results from the 83 normal observers. There was noeffect of age on either log(Q) [coefficient of correlation(r) = 0.13, p > 0.05) or y (r = 0.03, p > 0.50)]. Bothmatch width (r = 0.34, p < 0.002) and match midpoint(r = 0.52, p < 0.001) tended to decrease with age, as shownin Fig. 4. More-detailed analyses were conducted by com-

Table 1. Slopes of Psychometric Functions andAges for Different Groups of Observers

Group n log(3) Age (years)

No eye diseaseNormal trichromats 83 1.45 ± 0.40 19 ± 16Anomalous trichromats 37 0.97 ± 0.46 15 ± 13

No congenital color defectHealthy adults 30 1.51 ± 0.41 37 ± 13Adult patients 96 1.46 ± 0.49 42 ± 14Healthy children 43 1.43 ± 0.38 7 ± 2

20 -

-J

I-o 15-

U

0I- 10-

LL 5.IL

normal trichromatsC anomalous trichromats

0.0 0.5 1.0 1.5 2.0log(p)

Fig. 3. Distributions for slopes of psychometric functions fornormal and anomalous trichromats (both groups free of eyedisease). The mean slope was significantly lower for anomaloustrichromats, although the SD's were not different.

William H. Swanson

-

02

Page 6: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

1812 J. Opt. Soc. Am. A/Vol. 10, No. 8/August 1993

Table 2. Comparison of Data Sets for Which Mean-of-Reversals and Maximum-Likelihood EstimatesWere in Good Agreementa and Those for Which These Estimates Were in Poor Agreement'

Group n R Function 13 G Function G R Function y () G Function y (%) Number of Trials

Good agreement 192 1.50 ± 0.44 1.47 ± 0.39 98 ± 4 98 ± 4 59 ± 14Poor agreement 45 0.69 ± 0.40 0.62 ± 0.41 96 ± 5 95 ± 5 86 ± 35

'Within 2 Nagel units for both range and midpoint.bDifferent by more than 5 Nagel units for either range or midpoint.

paring the data for the 47 children of ages 4-13 years withthe data for the 36 observers of ages 16-66 years. Therewas no effect of age on match width within either agegroup (r < 0.15, p > 0.20), whereas the difference inmatch widths between the two age groups was highlysignificant (t = 5.02, p < 0.001). In addition, the 3- to13-year-old age group had much greater variability inmatch widths than did the 16- to 66-year-old age group(F = 6.59, p < 0.001). There was no effect of age onmatch midpoint within either age group (r < 0.25,p > 0.10), whereas the difference in match midpoints be-tween the two age groups was highly significant (t = 6.23,

0.65

i+a2 0.60Ci)

0.550IL

0M )

-

zI-

0z

10-

8-

0

1. 0 0

0

0

0 %

* 0

A:~~~~

10 20 30 40

AGE IN YEARS

(a)

00

* 0

10 20 30 40

AGE IN YEARS(b)

Fig. 4. (a) Match midpoint and (b) match widage for the normal trichromats free of ocularregression line is shown in each panel (in p < 0.002). Match midpoint decreases with ayellowing of the lens. Match width decreasetent with either increase in chromatic sensitia stricter criterion for a color match. Thesethe use of adult normative data may be inappichildren.

p < 0.001). There was no difference between the twoage groups in the variability of the match midpoints(F = 1.54, p > 0.10).

For the 30 normal observers tested with 5 differentfield diameters, as field diameter decreased the matchwidth increased and the match midpoint decreased, asshown in Fig. 5. Linear regression across all 150 datapoints showed that these effects were significant (regres-sion against log field diameter, r = 0.26 for midpoint,r = 0.48 for width; in both cases p < 0.001). The meanvalue for the midpoint decreased monotonically from 8 degto 1 deg and then increased slightly for 0.5 deg. The SDfor the midpoint was relatively constant for 1-8 deg (be-tween 0.034 and 0.044) but was much larger for 0.5 deg(0.072). The mean value for the match width was ap-proximately equal at 8 and 4 deg, increased slightly asdiameter decreased down to 1 deg, and then increaseddramatically at 0.5 deg. The SD for the match width alsoincreased dramatically at 0.5 deg.

SimulationsTo evaluate the effect of variations in the upper asymptotey on estimations of slope /3, results were compared for twodifferent analyses: one permitting all three parametersto vary to fit the data and one keeping y fixed at 100%.

I , , These simulations were run for match midpoint = 0.56,50 60 70 range = 0.01, and all three slopes. When the upper

asymptote for the simulated observer was 100%, bothanalyses yielded similar results for log(,6); for both biasand accuracy, maximum differences did not exceed0.08 log unit and mean differences were 0.03 log unit.However, when the analysis in which the upper asymptotewas fixed at 100% was compared with that in which theupper asymptote was permitted to vary, for the conditionwhen the upper asymptote for the simulated observer was90%, the analysis with upper asymptote fixed at 100% in-creased the bias of the slope by as much as 0.6 log unit(underestimating slope) and worsened the accuracy of theslope estimate by as much as 0.4 log unit. This estab-lishes that fixing the upper asymptote at 100% can lead tounderestimates of slope.

Once the necessity for permitting the upper asymptoteto vary was established, simulations were performed for50 60 70all 36 model observers (all combinations of 3 midpoints, 2widths, 3 slopes, and 2 upper asymptotes). The primary

th as a function of factor affecting the bias and accuracy of the parameterdisease. A linear estimates was the slope: as slope decreased, bias in-both cases r > 0.3, creased and accuracy worsened. Upper asymptote was age, consistent with secondary factor: when the model observer had y = 90%,s with age, consis- the accuracy of the width estimate was worse, the bias ofvity or adoption of the slope estimate was increased, and the accuracy of thedata indicate thatropriate for testing asymptote estimate was worse. Match midpoint affected

only the bias of the midpoint estimate (the bias was

6 - e e 00 0 %op

4- 019. 0

8. " 02 - I 0

8 : 0 -'. . . In I I 0 . 0 0 . el I

William H. Swanson

0

Page 7: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1813

larger for the two anomalous midpoints than for the nor-mal midpoint). Width affected only the accuracy of theslope estimate (accuracy was better for the smaller width)and the bias of the width estimate (bias was smaller forthe larger width).

To evaluate bias and accuracy, the means and SD's ofthe differences between actual and estimated parameterswere computed in Nagel units (0.5 to 1 Nagel unit is theresolution limit of most manual anomaloscopes). Formodel observers with the midpoint of a normal trichromat,in all cases the bias of the midpoint estimate was <0.7Nagel unit. However, for some of the model observers theaccuracy of the midpoint estimate was poor: the SD's ofthe estimates were as large as 4.5 Nagel units for modelobservers with log(^) = 0.36, dropping to 1.1 Nagel unitsfor model observers with log(f) > 0.36 and to 0.6 Nagelunit for model observers with = 100% and log(13) >0.36. For model observers with anomalous midpoints, thebias averaged 1.7 times larger, whereas accuracy wasunchanged.

For the width estimates, in many cases there was biastoward overestimating width (underestimates of width

a 0.56-

I 0.55-n4 0.54-I-2O 0.53-CL

2 0.52-

0.51-

n A

uJI-

aC)I-

6

5-

4-

3

1 -

I.DU I . . . . .4 8 6 7 8 0 j 1

FIELD DIAMETER ()(a)

4 6 6 N 1'

4 i 6 g'f 4 6 1 10

FIELD DIAMETER (0)

(b)Fig. 5. (a) Match midpoint and (b) match width as a function offield diameter for the 30 normal trichromats free of ocular diseasewho were tested with different field sizes. Error bars show thestandard error of the mean. Both data sets are consistent withprevious reports on effect of field size. For the 0.5-deg field thematch width was much larger than for other sizes, the matchmidpoint increased relative to 1 deg, and the standard errorswere larger than for other field sizes. This indicates that datafor the 0.5-deg field are less reliable than for other field sizes.

were never >0.5 Nagel unit). For model observers withthe smaller width, for log(3) = 0.36, bias was as large as12 Nagel units and accuracy as poor as 6 Nagel units,whereas, for log(3) > 0.36, bias was never >1.3 Nagelunits and accuracy never worse than 1.3 Nagel units. Formodel observers with the wider width, the bias was usu-ally half that for the model observers, with the smallerwidth and otherwise identical parameters.

Slope estimates tended to be good for model observerswith log(3) = 0.36, with bias never >0.1 log unit (overesti-mated) nor accuracy worse than 0.3 log unit. For modelobservers with log(3) = 1.35, slope tended to be slightlyunderestimated, with bias never >0.2 log unit nor accu-racy worse than 0.6 log unit. For model observers withlog(Q) = 2.34, slope was more severely underestimated:bias was as large as 1 log unit and accuracy as poor as0.8 log unit. Upper (lower) limits for the estimated slopeswere computed as the mean ±2 SD's. For model observerswith log(3) = 0.36, the upper limit was 1.1 log units. Formodel observers with log(3) = 2.34, with the smaller ac-tual width the lower limit was always >1.2 log units, butfor the larger width the lower limit decreased dramatically,to as low as -0.1. This means that a steep estimatedslope rules out a shallow actual slope, but when the esti-mated width is high a shallow estimated slope does notrule out a steep actual slope.

Upper-asymptote estimates were fairly good when theactual asymptote was 100%, with the bias never >1.8%and the accuracy never worse than 3.8%. However, whenthe actual asymptote was 90% the bias was as large as 8%and the accuracy as poor as 5%. This means that ifthe mean estimated asymptote for a population is near90% then the actual asymptote is below 100%, but a highmean estimated asymptote does not rule out a low actualasymptote.

DISCUSSION

The results presented here demonstrate that color-matching data can be analyzed by means of psychometricfunctions that use a perceptual dimension based on ratiosof cone quantal catches. This analysis interprets incon-sistent responses to repeated presentation of a givenstimulus value as indicating that this stimulus value liesunder the ascending portion of one of the psychometricfunctions, whereas stimulus values yielding highly consis-tent responses are under the upper or lower asymptotes ofboth functions. This approach provides a method forevaluating and interpreting response consistency in color-matching data that not only should reduce test-site vari-ability but also should permit more detailed investigationof the causes of reduced chromatic discrimination.

Analysis of data from 250 nalve observers shows thatthere is considerable within-group variability in slopes ofthe psychometric function and that for a given individualthe slopes may vary across field sizes or between the twoends of the matching range. This indicates that themethod used for evaluating response consistency can beimportant in interpreting Rayleigh match data.

Validity of the AnalysisThe finding that anomalous trichromats have shallowerslopes than do normal trichromats, despite similar SD's

William H. Swanson

Page 8: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

1814 J. Opt. Soc. Am. A/Vol. 10, No. 8/August 1993

for the two distributions of slopes, provides strong supportfor the assumption that perceptual stimulus magnitude isrelated to the difference in cone quantal catches for themixture field and the yellow standard. For anomaloustrichromats, the rate of change in cone quantal catch ratioversus color mixture is much lower than for normaltrichromats.' 3 This means that a mixture that is a givendistance in Nagel units from a match end point shouldyield a much lower perceptual magnitude for an anoma-lous trichromat than for a normal trichromat, shifting thedistributions of slopes to lower values without affectingthe SD. This is exactly what was found in the currentstudy, as illustrated in Fig. 3. Pokorny and Smith' 3 givean estimate of -0.4 log unit for the effect of deutera-nomaly on the rate of change in cone quantal catch ratioversus color mixture; this is quite close to the 0.5-log-unitdifference found in the current study for mean values forthe slopes of anomalous versus normal trichromats.

The mean-of-reversals estimates were in general quitesimilar to the maximum-likelihood estimates, indicatingthat the two staircases tended to converge on the 50%points of the two psychometric functions. This is similarto the situation with the more common experimental tech-nique of using a staircase to manipulate physical stimulusintensity: a 1-up-i-down yes/no staircase estimatesthe intensity that is detected 50% of the time. For thesmall fraction of data sets in which there was a largediscrepancy between the staircase estimates and themaximum-likelihood estimates, the slopes tended to beshallower and the upper asymptotes lower; for experimentsmanipulating physical stimulus intensity, these factorshave been shown to increase the difference in the twotypes of threshold estimate.' 2

The match midpoints and widths obtained from normaltrichromats free of ocular disease show systematic effectsof age and field size, indicating that the technique is use-ful for measuring subtle effects on Rayleigh matches inpopulations of inexperienced observers.

SimulationsResults of simulations indicate that shallow slopes increasethe bias of the width estimate (causing overestimates) butnot of the midpoint estimate and that they decrease theaccuracy of both the midpoint and the width estimates.Therefore it is useful to be able to identify data sets gener-ated by shallow slopes. Permitting the upper asymptoteto vary provides better estimates of the actual slope thandoes using a fixed upper asymptote. Simulation resultsindicate that data sets can be excluded as unreliable bysetting a lower limit to acceptable slopes but that this willexclude some data sets for observers with shallow slopesand large match widths. As can be seen in Fig. 3, a lowerlimit of log(,3) = 1.0 (suggested by the simulations) wouldexclude few normal trichromats but many anomaloustrichromats.

Effect of Tester CriterionThe ability of the analysis to make match width relativelyindependent of the slopes of psychometric functions makesit possible to distinguish between reduced chromatic dis-crimination resulting from postreceptoral factors and re-

duced perceptual stimulus magnitude resulting fromaltered cone quantal catches. The 50% points on thepsychometric functions do not vary with changes in slope,so match widths measured at the 50% points should not beaffected by changes in cone spectral sensitivities. Inmanual anomaloscopy there is typically no method for de-termining what points on the psychometric functions areevaluated, since the number of trials is usually small andthe rules for handling variable responses are not alwaysexplicit. If a value other than the 50% point on thepsychometric functions is determined, then match widthshould be dependent on the slopes of the psychometricfunctions.

The fact that the current study determined the 50%points of the psychometric functions may explain why thematching ranges obtained for normal trichromats andanomalous trichromats were in general smaller than ex-pected from the results of population studies that usedmanual anomaloscopy. The following are comparisonswith results of Helve's large population study,4 which aretypical for the Nagel model I anomaloscope.' For the cur-rent study 17 (46%) of the anomalous trichromats hadmatch widths <3 Nagel units, whereas Helve reportedvery few anomalous trichromats with match widths thissmall. For the 83 normal trichromats in the currentstudy, match widths were also in general smaller than ex-pected from Helve's data: for 25%, match width was <1Nagel unit, and for 48%, match width was <2 Nagelunits, compared with <5% and <25%, respectively, fromHelve's data.

To model the possible performance of these observersunder manual anomaloscopy with a stricter criterion forconsistent responses, match widths for all observers testedwith a 2-deg field were recomputed by use of the 80%points on the psychometric functions. The match widthsof the anomalous trichromats increased by an average of8.9 ± 6.8 Nagel units, so that only 3 (8%) had widths <3Nagel units. In contrast, for normal trichromats thewidths increased by only 2.9 ± 2.6 Nagel units (also bring-ing their data closer to those of Helve). This criterionchange also caused small changes in the match midpoints:the mean of the absolute value of the change in thematch midpoint was 0.4 ± 0.9 Nagel unit for the normaltrichromats and 0.8 ± 1.1 Nagel units for the anomaloustrichromats.

Limitations of the AnalysisThe current study considers response variability in termsof psychometric functions based on cone quantal catchratios. It does not account for the fact that an observer'sresponse might not be "match" even when the mixturematches the standard. If the "left redder" and "rightredder" responses were made randomly whenever the mix-ture matched the standard, the psychometric functionscould be shallow and the width could be artifactually nar-rowed. If the observer always responded "left redder"whenever the mixture matched the standard, then thepsychometric functions could be quite steep, the midpointshifted toward green, and the width artifactually nar-rowed; this could be harder to detect than randomguessing. These problems may also affect manualanomaloscopy but may not be obvious because of the

William H. Swanson

Page 9: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1815

smaller number of trials. Further research is needed todetermine how best to handle these problems.

For 5 (2%o) of the data sets the value for ER was less thanthe value for EG, yielding a negative width; of these, theestimated slopes were shallow and the largest magnitudeof the negative width was 2 Nagel units. This indicatesthat the estimates cannot be accurate, since match widthsshould never be less than zero. Of the 3,600 simulatedexperiments, there were also 5 (0.1%) for which the esti-mated width was less than zero; in all cases these werefor the shallowest slope and the narrowest width. For the5 simulated experiments with negative widths, the largestmagnitude was also 2 Nagel units, the estimated slopeswere shallow, and the largest error in estimate of the mid-point was as great as 12 Nagel units. This indicates thata negative estimated match width is evidence of an unreli-able data set.

Advantages over Conventional AnomaloscopyAn advantage of the proposed analysis is its method forhandling situations in which the observer never reports acolor match. In conventional anomaloscopy, data cannotbe obtained when patients never report a match.6 In thecurrent study several patients (and some normal observerstested with a large field) remarked that the two sidesnever matched. The protocol for the briefcase anomalo-scope estimates the ends of the matching range with theuse of two staircases, even if a response of "same" is neverused. The psychometric-function analysis used in thecurrent study also produces match end points even if"same" is not used. The relatively large number of trialsmakes it possible to evaluate response consistency, and ifconsistency is high the match end points are meaningfuleven if the observer never reports a perceptual match:the match end points indicate transitions from the rangeof color mixtures over which the observer does not giveconsistent responses to the ranges of color mixtures overwhich the observer gives highly consistent responses.The two ranges over which the observer gives highlyconsistent responses define the color mixtures that theobserver can reliably discriminate as redder than thestandard (in one range) or greener than the standard (inthe other range). This provides information about boththe photopigments and the postreceptoral processing.If the range of inconsistent responses lies entirely withinthe region of normal Rayleigh matches, then the observercannot have abnormal photopigments. For instance, oneof the normal observers who never accepted a match washighly consistent, responding "left redder" for all mixtureratios >0.561 and "right redder" for all mixture ratios<0.556. The region of inconsistent responses was only0.4 Nagel unit wide; this observer appears to have normalphotopigments and excellent discrimination, as verifiedon other color-vision tests. If the range of inconsistentresponses lies entirely within the region of matches madeby simple deuteranomalous trichromats and the region ofreliable "mixture redder than standard" responses in-cludes all the region of normal Rayleigh matches, thenthis is strong evidence that the observer is deuteranoma-lous. If the range of inconsistent responses is quitenarrow, then the observer probably has good color discrimi-nation and does not have a significant red-green defect.

Conventional anomaloscopy requires that the patientmake a brightness adjustment on each trial. Nagel de-signed his anomaloscope to be in deutan mode, so that thecolor mixtures produced a constant quantal catch for anaverage deuteranope; these are also brightness matchesfor most normal trichromats and deuteranomalous trichro-mats. This means that, for deuteranopes, deuteranoma-lous trichromats, and normal trichromats, at most onlysmall brightness mismatches will be present, and the ob-server never needs to make large adjustments of thebrightness of the standard. In comparison, protanopes,protanomalous trichromats, and monochromats need tomake large adjustments to the brightness. If brightnessmatches are not made, such observers can use brightnesscues to make matches, artifactually narrowing the match-ing range. In the current study several protanomaloustrichromats were intentionally retested in deutan mode,for which their matching ranges were smaller than whentested in protan mode and for which their match midpointswere closer to normal.

In the current study the computer estimated the ob-server's spectral-luminosity function based on initialbrightness matches (matching the standard to the greenand red primaries) and selected one of three brightnessconditions: deutan, protan, or scotopic. For the remain-ing presentations the radiance of the standard was ad-justed to be a deutan, a protan, or a scotopic match to thetest mixture, minimizing the need for observers to makebrightness matches. This was particularly useful in test-ing young children for whom it was not possible to obtainreliable brightness matches because the appropriatebrightness mode was determined from performance on abattery of color-vision tests (including flicker photo-metry).'5 Since brightness matches that are significantlydifferent from deutan, protan, or scotopic matches are ofdiagnostic utility for acquired defects,3 automated anoma-loscopy could be improved by including a method forindividualizing the brightness mode for each observer,continually estimating the observer's spectral sensitivityby analyzing brightness matches throughout the test.16

Clinical ConsiderationsClinically, anomaloscopy is used to diagnose congenitalcolor defects and to evaluate acquired color-vision defects.For these purposes manual anomaloscopy by a skilled andtrained tester may be preferable to automated anoma-loscopy, since it permits personal interaction with an ex-perienced tester. The analysis presented in the currentstudy provides a formal method for evaluating and inter-preting response consistency and can be applied to datafrom manual anomaloscopy if enough trials are presentedand a careful record is kept of observer responses. Forclinical research, the standardization provided by auto-mated anomaloscopy may be desirable. Since in mostcases the staircase and maximum-likelihood estimateswere similar, the simpler staircase procedure shouldoffer most of the benefits of the maximum-likelihoodestimation.

The psychometric functions are defined in terms ofquestions regarding whether the mixture is redder orgreener than the standard. It is important to note thatthese were not the questions put to the observer but are

William H. Swanson

Page 10: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

1816 J. Opt. Soc. Am. A/Vol. 10, No. 8/August 1993

rather ways of interpreting the observer's responses. Asmentioned above (in the Procedure subsection), the ob-server was actually asked which side was redder. In deal-ing with some patients, particularly those with tritandefects, it may be impossible for the patient to respond interms of "mixture redder" and "mixture greener."

Match width provides an index of red-green discrimi-nation that can be useful in clinical studies, and use of astandardized method for evaluating response consistencyshould improve the accuracy of this index. For the 130patients with eye disease in the current study, when theend points were recomputed for the 80% points of thepsychometric functions the match width increased by4.3 ± 4.9 Nagel units. This illustrates the importance ofsystematizing the calculation of match width when study-ing a clinical population.

The data from 83 normals tested with a 2-deg fieldshowed effects of age on both match width and matchmidpoint. The decrease in match width with age appearsto be due to the fact that many young children had largematch widths. For the 4-13-year-old age group, the SDof the match width was much greater than for the 16-66-year-old age group. This suggests that children are muchmore variable than are adults in either their criteria for amatch or their color-discrimination abilities. In compari-son, SD's in match midpoints were similar for both chil-dren and adults, indicating that the age effect was not dueto greater variability than among the children. This sug-gests that children have either lower photopigment opticaldensity or less dense prereceptoral filters than do adults.Whereas the sources of these age effects are not clear, itappears that in clinical use the norms for children shouldbe established separately from the norms for adults.

For data from 30 normals tested with a range of fielddiameters, match width increased and match midpointdecreased as field size decreased. The increase in matchwidth with decrease in field diameter is consistent withthe known dependence of chromatic discrimination onfield size.'7 The decrease in match midpoint with de-crease in field diameter is a phenomenon known as thecolor-match-area effect and has previously been used withmanual anomaloscopy to document photoreceptor changesinduced by retinal disease.' The results indicate that forfield sizes of 1-8 deg the protocol is suitable for measuringthe color-match-area effect in a naYve population. How-ever, with the 0.5-deg field the variability in both matchwidth and match midpoint was greater, mean match widthwas dramatically larger, and, instead of decreasing fur-ther, the mean match midpoint actually increased slightly.Estimated slopes were shallower for the 0.5-deg field, indi-cating less reliable parameter estimates. These resultssuggest that, for clinical studies of the color-match-areaeffect, field diameters smaller than 1 deg may not yieldreliable data.

Comparisons with Psychometric Functions for OtherPerceptual TasksRelated analyses have previously been performed with per-ceptual tasks such as vernier acuity and hue judgments.In some cases a perceptual dimension of stimulus magni-tude has been evaluated by use of a psychometric functionbased on a physical aspect of the stimulus. However, such

metrics may not be appropriate. Whereas the currentstudy is not directly applicable to tasks other than colormatching, the basic principles and conclusions may berelevant.

Studies of spatial distortion in amblyopia have foundthat, for some amblyopes, test objects that are physicallyaligned may subjectively appear to be misaligned and somethat are misaligned may appear to be aligned.'8 Study ofthis phenomenon involves analysis of a perceptual stimu-lus offset in terms of physical stimulus offset. Rentschlerand Hilz'9 plotted fraction of "left" responses as a functionof linear vernier offset and fitted the data with a psycho-metric function whose 50% point determined the posi-tional offset and whose slope determined vernier acuity.Their analysis carries the implicit assumption that de-tectability of a vernier offset is a linear function of physi-cal displacement. If the relation is actually more complex(such as logarithmic), then the shapes of the psychometricfunctions fitted in this way would become distorted forsubjects with large differences between perceived align-ment and physical alignment. As with the current study,if the observer adopted a strategy of giving a fixed re-sponse whenever the stimulus magnitude was nonde-tectable, the midpoint estimate could be biased. Forvernier data, if the observer responded "left" wheneverthe stimuli seemed aligned, then the 50% point of thepsychometric function would be at the left end of therange of physical locations that appeared subjectivelyaligned, and the SD could be quite small. Thus estimatesof both offset and vernier acuity could be affected by ob-server strategy.

Unique-hue measurements require adjustment of aphysical parameter (wavelength) to obtain a specific per-ception (unique hue). Schefrin and Werner2 0 constructedpsychometric functions for unique-hue judgments. Forunique green they asked the subject to respond either"blue" or "yellow" and plotted the proportion of "yellow"responses as a function of wavelength. They then definedunique green as the 50% point of this function. Thisanalysis involves the implicit assumption that perceptualintensity is a linear function of wavelength. If this is nottrue, then the shapes of the psychometric functions couldbe distorted. Schefrin and Werner acknowledged thisand performed control experiments to determine the effectof the use of a wavelength axis. However, the problem ofobserver strategy remains: if an observer always re-sponded "yellow" when the hue appeared neither yellownor green, then the unique-hue estimate could be affected.

A range of studies has measured perceived contrast,perceived spatial frequency, and perceived temporal fre-quency, often by using adaptive staircase procedures.When comparisons are made between eyes or between dif-ferent retinal regions in the same eye, physically identicalstimuli may be perceptually different, and the situation issimilar to that in studies of spatial distortion. When dif-ferent spatial or temporal frequencies are compared at thesame retinal location (for instance, in contrast matching),the stimuli are never physically identical and usually arenot perceptually identical, and a single perceptual at-tribute is compared. This is similar to color matching,except that in color matching it is often possible to pro-duce perceptually identical stimuli. As with the color-matching data discussed in this paper, for these other

William H. Swanson

Page 11: Analysis of Rayleigh match data with psychometric · PDF fileAnalysis of Rayleigh match data with psychometric functions ... trast matching and spatial alignment. ... red-green ratios

Vol. 10, No. 8/August 1993/J. Opt. Soc. Am. A 1817

types of data it may be useful to analyze the staircases byuse of psychometric functions if appropriate metrics canbe found.

ACKNOWLEDGMENTSThis research was supported in part by National Eye In-stitute grant EY07716, by the Home Interiors and GiftsFund of the Communities Foundation of Texas, Inc., andby the Delta Gamma Foundation of Dallas.

REFERENCES AND NOTES

1. Lord Rayleigh, "Experiments on colour," Nature (London)25, 64-66 (1881).

2. W A. Nagel, "Zwei Apparate fur die augendrztliche Funktion-spriifung: Adaptometer und kleines Spektralphotometer(Anomaloskop)," Atschr. Augenh. 17, 201-222 (1907).

3. J. Pokorny, V C. Smith, G. Verriest, and A. J. L. G. Pinckers,Congenital and Acquired Color Vision Defects (Grune &Stratton, New York, 1979).

4. J. Neitz and G. H. Jacobs, "Polymorphism in normal humancolor vision and its mechanism," Vision Res. 4, 621-636(1990).

5. J. Winderickx, D. T. Lindsey, A. Sanbcki, D. Y. Teller, A. G.Motulsky, and S. S. Deeb, "Polymorphism in red photopig-ment underlies variation in color matching," Nature (London)356, 431-433 (1992).

6. A. E. Elsner, S. A. Burns, L. A. Lobes, and B. H. Doft, "Conephotopigment bleaching abnormalities in diabetes," Invest.Ophthalmol. Vis. Sci. 28, 718-724 (1987).

7. A. Eisner, V D. Stoumbos, M. L. Klein, and S. A. Fleming,"Relations between fundus appearance and function. Eyeswhose fellow eye has exudative age-related macular degenera-tion," Invest. Ophthalmol. Vis. Sci. 32, 8-20 (1991).

8. J. Pokorny, V C. Smith, and M. Lutze, 'A computer-controlledbriefcase anomaloscope," in Colour Vision Deficiencies IX,B. Drum and G. Verriest, eds. (Kluwer, Dordrecht, TheNetherlands, 1989), pp. 515-522.

9. M. Pellizone, J. Sommerhalder, A. Roth, and D. Hermes,'Automated Rayleigh and Moreland matches on a computer-controlled anomaloscope," in Colour Vision Deficiencies X,B. Drum, J. D. Mdreland, and A. Serra, eds. (Kluwer,Dordrecht, The Netherlands, 1991), pp. 151-159.

10. Base 2 is used for convenience, so that the 50% point is atx = 1.0.

11. Complete consistency of responses would yield an infiniteslope, and high levels of inconsistency could yield slopes nearzero, so permitting extremely high and low values for log(Q)could skew the means and produce large SD's. The limitedrange of 0.30-2.25 for log(f3) was used to avoid this prob-lem. Across all 500 values obtained for log(B), 17 (3%) were0.30 and 37 (7%) were 2.25, indicating that the range wasindeed limited, but not severely so. Since low values of ycould permit higher values for log(,3), if y had been permittedto be quite low this could have skewed the values of log(3) tohigher values. To minimize this problem the lowest valuethat was allowed for y was 0.90, since it seems unlikely thatthe wrong button was pushed more than 10% of the time.Out of the 500 values obtained for y, 94 (19%) were 0.90, indi-cating that such a lower limit was needed. Given the limitedrange of values of log(j) and the use of the y parameter, theSD for log(^3) can be considered a conservative estimate of in-dividual differences in slopes of the psychometric functions.

12. W H. Swanson and E. E. Birch, "Extracting thresholds fromnoisy psychophysical data," Percept. Psychophys. 51, 409-422 (1992).

13. J. Pokorny and V C. Smith, "Evaluation of single-pigmentshift model of anomalous trichromacy," J. Opt. Soc. Am. 67,1196-1209 (1977).

14. J. Helve, 'A comparative study of several diagnostic tests ofcolour vision used for measuring types and degrees of con-genital red-green defects," Acta Ophthalmol. Suppl. 115,1' 64 (1972).

15. W H. Swanson and M. Everett, "Color vision screening ofyoung children," J. Pediatr. Ophthalmol. Strabis. 29, 49-54(1992).

16. Implementing this may be complex since, for patients withcontrast defects, brightness matches can be highly variableand, at the extremes of the matching range, the color differ-ence increases apparent brightness for normals, causing anonlinear relation between brightness and luminance.

17. J. Pokorny, V C. Smith, and J. T. Ernest, "Macular color vi-sion defects: specialized psychophysical testing in acquiredand hereditary chorioretinal diseases," Int. Ophthalmol. Clin.20, 53-81 (1980).

18. H. E. Bedell and M. C. Flom, "Monocular spatial distortion instrabismic amblyopia," Invest. Ophthalmol. Vis. Sci. 20, 263-268 (1981).

19. I. Rentschler and R. Hilz, 'Amblyopic processing of positionalinformation. Part I. Vernier acuity," Exp. Brain Res. 60,270-278 (1985).

20. B. E. Schefrin and J. S. Werner, "Loci of spectral unique huesthroughout the life span," J. Opt. Soc. Am. A 7, 305-311(1990).

William H. Swanson