© S. Hirzel Verlag · EAA. ACUSTICA · acta acustica, Vol. 85 (1999) 650-656

A Multidimensional Technique for Sound Quality Assessment

Patrick Susini, Stephen McAdams*, Suzanne Winsberg
Institut de Recherche et de Coordination Acoustique/Musique (IRCAM), 1 place Igor-Stravinsky, 75004 Paris, France

* Also with Laboratoire de Psychologie Expérimentale (CNRS), Université René Descartes, EPHE, 28 rue Serpente, 75006 Paris

Received 15 December 1996, accepted 1 November 1998.

Summary

One approach to improving sound quality is to create a preference map on the basis of several acoustic parameters relevant to auditory perception. The map is derived from several stages of subjective testing, acoustic analysis, and auditory modeling. The multidimensional scaling technique CLASCAL reveals common perceptual dimensions shared by sets of sound samples, perceptual features specific to each sound, and the different subject classes among listeners. The listeners are asked to judge the degree of dissimilarity of all pairs of sounds on a continuous scale. The analysis gives a perceptual spatial representation of the sounds. From this analysis, acoustic and auditory modelling analyses can be performed to determine the stimulus parameters that are strongly correlated with different perceptual dimensions and, where possible, with the specific features. The next stage in the analysis involves determining the probability of one sound being preferred to another. An analysis of the data allows a projection of the structure of listeners' preferences onto the physical parameter space underlying the previously determined multidimensional perceptual space. In many cases, it is found that the physical parameters having the most effect on the listeners' preferences are dependent on the set of stimuli being compared. Furthermore, when one stimulus parameter is kept constant across trials, this may alter the effects of other parameters on the listeners' preferences. Therefore, context effects must be taken into account in multidimensional sound quality analysis, particularly since the qualitative aspects of most sounds are clearly multidimensional.

PACS no. 43.66.L, 43.66.Y, 43.50.Y

1. Introduction

Noises emitted by domestic sound objects (e.g., light switches, vacuum cleaners, and coffee machines) or equipment (e.g., car motors, air conditioners, and windshield wipers) are sources whose intrinsic perceptual and acoustic properties can be characterized and evaluated by classical methods of experimental assessment [1, 2] and measurement [3]. Concerning the characterization of the quality or annoyance associated with a sound object, it is important to define the appropriate techniques in order to allow designers to define the acoustic properties of a product based on perceptual data derived from human listeners. In addition, certain acoustic aspects contribute in a significant way to the auditory image projected by the object that produces them and are important for emphasizing its identity, ergonomics, esthetics, and functionality.

The problem is that the sounds are complex and thus have a multidimensional nature from both acoustical and perceptual points of view. Further, not all of the possible dimensions have been characterized psychophysically. It is thus necessary at this stage in the development of sound quality research to determine the number of salient perceptual dimensions for a corpus of sounds associated with a given type of object, as well as their auditory and physical nature, in order to characterize the sonic identity of the object type under study. Subsequently, it is necessary to determine in a similar way the multidimensional nature of the quality or annoyance created by a sound source and thus to determine the combined or independent contributions of the salient perceptual dimensions. However, the large majority of studies of sound quality often adopt the method of magnitude estimation [4] or the method of predefined semantic differentials [5]. In the former case, the human listener must give a numerical value that is proportional to unpleasantness, for example, and the data analysis yields a unidimensional unpleasantness scale. This scale cannot be used to deduce the contribution of different auditory attributes and acoustic parameters to the position of each judged sound source on the scale. In the latter case, the listener evaluates the sounds on different continuous scales, the extremities of which are defined by opposing adjectives: bright/dull, loud/soft, agreeable/disagreeable, etc. Each scale thus defined allows a measurement of the predefined auditory (semantic) attributes without specifying their perceptual salience. These methods thus have limits in their ability to characterize the quality of sound objects, particularly when the salient perceptual, acoustic, and semantic properties of the objects are not known in advance.

Research in the cognitive psychology of audition reveals the interest of multidimensional analysis methods in order to characterize perceptually the dimensions of the timbre of musical instruments [6, 7]. Recently at IRCAM [8], a three-dimensional perceptual space was obtained using the multidimensional scaling program CLASCAL [9]. A quantitative correlation was established between acoustic parameters and the coordinates of the instrument sounds in the perceptual space, one dimension having a spectral nature, another a temporal nature, and the last a spectrotemporal nature [10].


Figure 1. Global schema of the subjective characterization approach involving multidimensional scaling of dissimilarity judgements (I), acoustic analyses and correlations with the perceptual space (II), and construction of a multidimensional preference map (III). [Diagram labels: physical description; dissimilarity judgement ("How different are the stimuli perceived to be?"), CLASCAL; perceptually significant parameters; preference judgement ("Which stimulus is preferred?"), preference model; preference map.]

This approach is valid for other types of sound objects. This article proposes a general method for characterizing sound quality on the basis of objective criteria. The different stages of the method, as well as the precautions to be taken in using it, will be discussed below, with a study performed on interior car sounds being used as an example [11, 12]. This study was performed at IRCAM in collaboration with the automobile manufacturers Renault and PSA Citroen and falls under the framework of a contractual agreement. Due to confidentiality constraints, the description will remain formal without explicitly describing the results obtained (identity of sound sources and exact nature of acoustic parameters). This constraint does not, however, affect the main aim of the article, which is the presentation of the experimental methods and data analysis techniques employed to explore sound quality in all its natural sonic complexity.

A global view of the method, showing the relations among the different stages, is presented in Figure 1. At the outset, the sound corpus must be defined and the sounds carefully selected in order to obtain a relatively homogeneous corpus (section 2). As a function of the class of sound object(s) being studied, the panel of listeners must be selected according to appropriate criteria (section 3). Globally, a first step consists in determining the perceptual attributes common to the panel of listeners that are used to compare sounds to one another (section 4). A multidimensional scaling analysis (CLASCAL) yields a spatial model that can be represented graphically and which reveals the perceptual structure underlying the listeners' judgments in terms of continuous dimensions shared by all the sound samples and specific features (specificities) of each sound. The CLASCAL output also allows an analysis of different judgment strategies used by listeners, corresponding to their grouping into latent classes. A subsequent acoustic analysis phase attempts to determine the acoustic and psychoacoustic parameters of the sound signals that are correlated with the positions of the samples along the perceptual dimensions (section 5). In the last stage, the degree of preference (or, inversely, annoyance) associated with each sound is evaluated as a function of the perceptually significant acoustic parameters revealed in the previous stage (section 6). The advantage of this approach is that it does not limit the exploration and characterization of the components of sound quality to parameters that are already known. It provides a method for finding new perceptually salient parameters that engineers had not imagined, but that listeners hear and use in their evaluations anyway.

2. Establishment of the Sound Corpus

In order to ensure a realistic restitution of the sound field in the experimental situation that is as close as possible to the original conditions, it is necessary to use binaural techniques [13]. The car sounds used in the study were recorded digitally by the automobile manufacturers with a dummy-head system seated in each car in the front passenger seat, with the car running on a test track. The sounds were reproduced in the experiments by IRCAM's ISPW real-time digital signal processing environment [14] using the MAX application on the NeXT workstation. All sounds were edited to 5 s in duration with 50 ms attack and decay ramps, taking care to avoid sounds with recording artifacts. The three important points for sound restitution are the following:

• the specifications of the binaural recording conditions,
• the inverse filtering that compensates for the effects of the transfer function of the sound acquisition and presentation systems, and
• the calibration of the sound level for headphone presentation.

An alternative to dummy-head recordings may also be used and is often sufficient, if not as realistic: simple stereo microphone recordings. A recent study [15] on urban sound environments has shown that a stereo recording with cardioid microphones spaced 60 cm apart with a 100° opening conserves the ecologically valid character of sound samples played back over loudspeakers.

Certain acoustic characteristics perceptually dominate and overpower other, less salient ones. This is often the case with loudness, for example. Two sounds that differ mainly in terms of loudness will be judged different according to this dimension, with little contribution from other dimensions of variation being taken into account. In order to eliminate overly obvious parameters, it is sometimes necessary to equalize a corpus of sounds along these parameters, such as loudness, duration, and pitch. However, in order to study the influence of loudness, pitch, and broadband noise level (the latter two being related here to motor speed and car speed) on the perceptual representation, four groups of 16 car sounds were used. The cars were the same for each group. Two combinations of motor speed and gear setting were employed. In the first part of the study, the loudness was not modified and the sounds were presented as they would have been perceived in the car interior. In the second part, the loudness of the sound samples was equalized over each of the two sound sets. The composition of the four groups is summarized in Table I.

Table I. Composition of four groups of car sounds used.

                      Motor speed 1   Motor speed 2
Loudness variable     Group 1         Group 2
Loudness equalized    Group 3         Group 4
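The editing step mentioned earlier in this section (excerpts with 50 ms attack and decay ramps) could be sketched as follows. This is only an illustration: the paper does not state the ramp shape, so a linear ramp is assumed here, and the function name `apply_ramps` and its parameters are hypothetical.

```python
def apply_ramps(samples, sample_rate, ramp_ms=50.0):
    """Return a copy of `samples` with linear fade-in and fade-out ramps.

    Assumption: linear ramps; the source only gives the 50 ms duration.
    """
    n_ramp = int(round(sample_rate * ramp_ms / 1000.0))
    # Never let the two ramps overlap on very short signals.
    n_ramp = min(n_ramp, len(samples) // 2)
    out = list(samples)
    for k in range(n_ramp):
        gain = k / n_ramp          # rises from 0 toward 1 across the ramp
        out[k] *= gain             # attack ramp at the start
        out[-1 - k] *= gain        # decay ramp, mirrored at the end
    return out


# Toy signal: 1 s of constant amplitude at a 1 kHz sample rate.
ramped = apply_ramps([1.0] * 1000, sample_rate=1000)
```

The middle of the signal is left untouched; only the first and last 50 ms are scaled.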

An initial listening to the sound samples is important to discern salient perceptual properties that might influence the judgments of subjects in the main experiments. In fact, it is necessary to avoid characteristics that are too recognizable in certain samples, due either to their strong identity or to recording artifacts. The set of cars tested corresponded to the same price range. There was no risk that individual cars would be recognized. It is possible to determine more precisely the structure of the corpus used by classing the sounds according to different predefined dominant criteria. A cluster analysis [15] can reveal the degree of homogeneity of the sound corpus. If the tree structure obtained reveals a strong categorization of the corpus, it is advisable to determine which categories best represent the objectives of the study in order to obtain appropriate stimuli.

3. Subject Selection

According to the type of study, it is generally advisable not to use subjects having too great a knowledge of the test product, which could induce biases in their judgments. A car mechanic, for example, would listen in a different, much more analytical way than an average listener (one might speak of "professional deformation"). In the present study, subjects aged 25 to 45 years were recruited from a list of volunteers such that an equal distribution of sex and age category was obtained. Thirty subjects were recruited for both the dissimilarity (section 4) and preference (section 6) studies, and an additional 30 were recruited for the preference study. The subjects were required to have a driving permit, and a car of the appropriate price range had to be owned by someone in their immediate family. They were reimbursed for their participation.

4. Dissimilarity Scaling: The CLASCAL Program

4.1. Apparatus

The digitized sounds were processed by inverse filtering on a NeXT computer equipped with an ISPW card and the MAX program. They were subsequently converted to analog signals with ProPort converters before being amplified through a Canford stereo amplifier and sent to AKG 1000 open-air headphones. The experiment was run in the PsiExp computer environment [16], which provides stimulus presentation, data acquisition, and graphic interface for the subject.

4.2. Procedure

The experiment was performed in four sessions, with one session for each gear setting/motor speed combination, with and without loudness equalization. All 120 different pairs between the 16 car sounds were presented in each session. At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Next, 15 training trials were presented to give the subject the possibility to become familiar with the rating task. On each trial, a pair of sounds was presented, separated by a 1-s silence. The subject saw a horizontal slider on the computer screen with a cursor that could be moved with the computer mouse. The scale was labelled "Very Similar" at the left end and "Very Dissimilar" at the right end. A judgment was made by moving the cursor to the desired position on the slider and clicking on a button to record it in the computer. The initial position of the cursor was random on each trial, and the subject had to move it at least once before entering the judgment. Once the judgment was entered, the program proceeded automatically to the next trial. The judgments were coded on a scale from 0 (very similar) to 1 (very dissimilar), for example. The scale could equally well vary from 0 to any positive number.
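As a quick check of the trial count, the sketch below enumerates every unordered pair of 16 stimuli, which gives the 120 pairs presented in a session. The function name, the seeding, and the randomization of presentation order within and across pairs are assumptions made for illustration; the procedure above does not specify how trial order was generated.

```python
import itertools
import random


def dissimilarity_trials(n_stimuli, seed=None):
    """Build one randomized list of all unordered stimulus pairs."""
    rng = random.Random(seed)
    pairs = [list(p) for p in itertools.combinations(range(n_stimuli), 2)]
    for pair in pairs:
        rng.shuffle(pair)      # assumed: which sound plays first is random
    rng.shuffle(pairs)         # assumed: trial order is random
    return [tuple(pair) for pair in pairs]


trials = dissimilarity_trials(16, seed=0)
n_trials = len(trials)         # N(N-1)/2 pairs for N = 16
```

Each subject would then contribute one dissimilarity value per entry of `trials`.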

4.3. Multidimensional scaling analysis with CLASCAL

The data used in a multidimensional scaling (MDS) analysis consist of judgments of dissimilarity between each pair of the stimuli under consideration, i.e., N(N-1)/2 judgments for N stimuli from each subject. The dissimilarities are modelled as distances in an extended Euclidean space of R dimensions. In the spatial representation of the N stimuli, a large dissimilarity is represented by a large distance. The CLASCAL model for the distance between stimulus i and stimulus j, developed by Winsberg and De Soete [9, 17] and presented in Figure 1, postulates common dimensions shared by all stimuli, specific attributes, or specificities, particular to each stimulus, and latent classes, each of which have different saliences, or weights, for each of the common dimensions and the set of specificities. Maximum likelihood estimates of the model parameters are determined (that is, the coordinates of each stimulus on each dimension of the common space, the specificities of the stimuli, the dimension weights for each class, and the proportion of subjects in each class). An EM (expectation-maximization) algorithm is used to determine the class structure. (See [9] for a complete description of the CLASCAL algorithm and the latent class approach to MDS. Space does not permit an adequate description here.)

Figure 2. Perceptual space of the stimuli from Group 3 derived from dissimilarity scaling. [Three-dimensional plot; labelled axes include Dimension I and Dimension II.]

The class structure is latent: there is no a priori assumption concerning the latent class to which a given subject belongs. Model selection, including the choice of the appropriate number of latent classes, the number of common dimensions, and the presence or absence of specificities, is based on the BIC statistic, which is information based and depends on the log likelihood, the number of model parameters, and the number of observations [18], as well as on a Monte Carlo procedure [19]. Models with the lowest BIC values are chosen. The Monte Carlo procedure is used to determine the appropriate number of latent classes and to verify the spatial model. (See [9] for a discussion on model selection.) The CLASCAL analyses yield a spatial representation of the N stimuli on the R dimensions, the specificity of each stimulus, the probability that each subject belongs to each latent class (generally equal to one for one of the classes), and the weights, or saliences, of each perceptual dimension for each class. Also included in the output results are the model variance, the log likelihood, the value of the BIC statistic, and the results of the Monte Carlo procedure, if used.
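The distance model described in words above (common dimensions weighted per latent class, plus specificities) can be illustrated with a small sketch. The exact formula is not reproduced in this article, so the form used here, a weighted squared Euclidean term plus a weighted sum of the two specificities under a square root, is an assumption based on how the CLASCAL model is usually written; the function and argument names are hypothetical.

```python
import math


def clascal_distance(x_i, x_j, weights, s_i=0.0, s_j=0.0, v=1.0):
    """Model distance between stimuli i and j for one latent class.

    x_i, x_j : coordinates on the R common dimensions
    weights  : the class's weight for each common dimension
    s_i, s_j : specificities of the two stimuli (assumed non-negative)
    v        : the class's weight on the set of specificities
    Assumed form: sqrt(sum_r w_r * (x_ir - x_jr)**2 + v * (s_i + s_j)).
    """
    common = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x_i, x_j))
    return math.sqrt(common + v * (s_i + s_j))


# With unit weights and no specificities this is the ordinary Euclidean distance.
d_plain = clascal_distance([0.0, 0.0], [3.0, 4.0], weights=[1.0, 1.0])
# Specificities push two stimuli apart even when their coordinates are unchanged.
d_spec = clascal_distance([0.0, 0.0], [3.0, 4.0], weights=[1.0, 1.0],
                          s_i=5.5, s_j=5.5)
```

In the actual analysis these quantities are estimated by maximum likelihood rather than supplied by hand.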

4.4. Results

A single class was sufficient for all data sets. An analysis with the number of dimensions varying between 1 and 6 without specificities, and between 1 and 5 with specificities, was then performed for a single subject class. BIC indicated that the best model was three-dimensional with specificities for Group 3 and two-dimensional with specificities for the other groups. A graphic representation of the data corresponding to the sounds of Group 3 is given in Figure 2. Each sample is represented in the three-dimensional common space. The specificities of the stimuli are not shown in the figure.
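Model selection by BIC, as used for these results, can be sketched with the standard definition of the statistic (minus twice the log likelihood plus the number of parameters times the log of the number of observations). The candidate log likelihoods and parameter counts below are invented purely for illustration and are not values from the study; only the observation count (30 subjects times 120 pairs) follows from the procedure described earlier.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


n_obs = 30 * 120  # 30 subjects, 120 pairs each

# Hypothetical candidates: name -> (log likelihood, number of parameters).
candidates = {
    "2D": (-620.0, 40),
    "3D": (-480.0, 56),
    "4D": (-475.0, 72),
}

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

The penalty term is what stops the four-dimensional candidate from winning on likelihood alone.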

5. Physical Parameters Underlying the Perceptual Space

5.1. Acoustic analysis

Once the perceptual configuration is obtained, it is important to give a physical interpretation. An interpretation means some systematic relationship between the stimulus characteristics and the locations in the space.
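One simple way to quantify such a systematic relationship, and the one reported in Table II, is the correlation coefficient between a candidate parameter and the stimulus coordinates along a dimension. The sketch below computes a Pearson coefficient in plain Python; the choice of Pearson's coefficient and the toy data are assumptions, since the article does not name the estimator.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data: a parameter that decreases linearly along a perceptual dimension.
coords = [-0.3, -0.1, 0.1, 0.3]
param = [8.0, 6.0, 4.0, 2.0]
r = pearson_r(coords, param)
explained = r ** 2   # share of variance in the dimension accounted for
```

Squaring the coefficient gives the proportion of variance explained, so a coefficient of -0.92 corresponds to roughly 85% of the variance.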

An appropriate spectral representation for the signal analyses is chosen. Since the car sounds studied (constant motor speed) had a quasi-stationary character, the method of Welch [20] was used. In order to take into account auditory response as a function of level and frequency, the analytic formulations of different classic weighting functions were used (dBA, dBB, dBC). A physiological modeling approach was also used, in which the operation of the peripheral auditory system was considered at two levels: the filtering due to the external and middle ears was modeled with a second-order Butterworth bandpass filter with -3-dB cutoff frequencies of 450 Hz and 8500 Hz [21]. As for the cochlea, Moore and Glasberg's [22] ERB (equivalent rectangular bandwidth) formula was adopted for auditory filter bandwidths, and Patterson et al.'s [23] gammatone filters were used. On the basis of these early analysis stages, a set of parameters derived from the psychoacoustic literature [24] and from studies on timbre [10] was calculated. These parameters were chosen by way of an empirical loop that consisted of listening to the sounds with respect to their relative positions in the multidimensional space in order to isolate the kind of variation that characterized a given dimension. Next, hypotheses on the new physical or psychoacoustic indices that were appropriate for the stimuli under study were made, and the analyses were performed to extract the quantitative values. Finally, the correlation of each parameter with the coordinates of the sound samples along the perceptual dimensions was evaluated. A good correlation is taken as an indication that the parameter is a strong candidate for a quantitative predictor of the perceptual dimension.

5.2. Correlations

Table II presents the correlation coefficients between the perceptual dimensions and the physical or psychoacoustic parameters for each sample group. The probability (p) that the perceptual coordinates and the parameter values are independent (or unrelated) is given as well. The variance in the perceptual dimension explained by the predictor parameter can be determined by taking the square of the correlation coefficient. Finally, for each of the dimensions, a quantitative descriptor was determined. It is important to note that in equalizing loudness over the set of samples (Groups 3 and 4), other dimensions emerged that correlated with parameters not revealed in the analysis of Groups 1 and 2, with loudness variation. This kind of context effect is particularly important to emphasize, since it demonstrates that the salient perceptual parameters one obtains in such a study depend very strongly on the kinds of physical variation present in the stimulus set. The preference analysis to be examined in the next section will determine the utility for preference of each of the predictor parameters. Those that have a strong weight can thus, in turn, be equalized over the sound set in a subsequent stage of experimentation. As such, this approach allows a progressive minimization of perceived differences in terms of preference or unpleasantness between the stimuli.

Table II. Correlation coefficients between perceptual coordinates and objective parameters. The probability p that the two measures are independent is indicated for p < 0.01 () and for p < 0.05 (†). Parameter P1 = Loudness.

             Group 1            Group 2
Parameter    Dim 1    Dim 2     Dim 1    Dim 2
P1           -0.92    -0.06     -0.91     0.28
P2            0.09     0.80       -        -
P3             -        -        0.43    -0.88

             Group 3                     Group 4
Parameter    Dim 1    Dim 2    Dim 3     Dim 1    Dim 2
P2            0.35    -0.7     -0.14     -0.93    -0.29
P3             -        -        -       -0.51†    0.86
P4           -0.81     0.32    -0.33       -        -
P5           -0.32     0.00    -0.83       -        -

6. Preference Analysis: A Thurstone Case V Preference Model with Spline Transformations

6.1. Procedure

At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Then all pairs of different samples were presented in a random order. On each trial, the subject heard a pair of samples only once and had to choose which sound was preferred. The data consist of the proportion of subjects that preferred one sample over another for each pair of the N stimuli.

6.2. Stages in the preference analysis

This phase of the study attempts to construct a preference map. De Soete and Winsberg [25] developed a Thurstonian pairwise choice model with univariate and multivariate spline transformations. In their preference model, the probability that stimulus i is preferred over stimulus j is defined as a function of the difference in utility of each stimulus, or sample. The preference analysis program developed by De Soete and Winsberg is presently limited to two dimensions, generally considered as adequate to describe a preference space (as is the case with Groups 1, 2, and 4). If three perceptual dimensions are recovered in an MDS analysis (as with Group 3), all pairs of candidates may be tested. The model seeks a function that transforms the values of the physical parameters $a_i$ and $b_i$ for stimulus i into a utility value $u_i$ that depends on the preference probability. The dependence is expressed as follows:

$$P_{ij} = \Phi(u_i - u_j) \tag{1}$$

where $P_{ij}$ is the probability of preferring sample i over sample j, $\Phi$ is the cumulative normal (Gaussian) function, and $u_i$ is the utility of sample i.

The greater and more positive the difference in utility between the two samples is, the more sample i will be preferred over sample j.

Two types of models were tested: additive and multivariate. In the additive model (2), it is assumed that the contributions of the two parameters are independent and that the global utility of a sample is the sum of the utilities of each parameter. Thus, u is a function of the predictor parameters (a and b), each having its own transform function (f and g, respectively):

$$u = f(a) + g(b) \tag{2}$$
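The Thurstone Case V relation described in section 6.2, in which the probability of preferring sample i over sample j is the cumulative normal function of the difference in their utilities, can be sketched as follows. The unit scaling of the utility difference and the two transform functions are assumptions for illustration; in the study, the transforms are splines fitted to the preference data, and all names below are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probability(u_i, u_j):
    """Probability that sample i is preferred over sample j."""
    return normal_cdf(u_i - u_j)


def additive_utility(a, b, f, g):
    """Additive model: each parameter contributes through its own transform."""
    return f(a) + g(b)


# Hypothetical transforms: utility falls with level, peaks at b = 1.
def level_transform(level_db):
    return -0.1 * level_db


def second_transform(b):
    return -(b - 1.0) ** 2


u_quiet = additive_utility(60.0, 1.0, level_transform, second_transform)
u_loud = additive_utility(70.0, 1.0, level_transform, second_transform)
p_quiet_preferred = preference_probability(u_quiet, u_loud)
```

Equal utilities give a probability of one half, and swapping the two samples gives the complementary probability.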

The functions f and g are splines, that is, piecewise polynomial functions, the number and order of each polynomial being variable. The order of the spline is taken to be equal to the maximum degree of the polynomial plus one. The junctions between the polynomials of a given spline correspond to the internal knots and are of finite number. Attention is restricted to splines with maximal continuity at each knot. That is, for a spline of order k, the function and its first k - 2 derivatives are continuous. The default knot positions are located at a ladder of quantiles; thus, an equal number of observations is used to determine each spline parameter. Each spline of a given degree can be represented as a linear combination of a set of basis functions.

In the multivariate model (3), the utility is a function of both parameters:

$$u = f(a, b) \tag{3}$$

This multivariate function is represented by a multivariate spline that can be defined as a linear combination of tensor products of univariate basis splines. In this model, therefore, the effect of one parameter on utility depends on the value of the other parameter.

For a preference matrix and a given set of predictor parameters, a large number of models may be tested by varying the type of model (additive or multivariate) and the order and number of internal knots of the splines. BIC is used to choose between the candidate models. For a given preference matrix, this criterion can also be used to choose from among various pairs of predictor parameters that had more or less similar correlations with the perceptual space. This latter choice is of a heuristic nature, since the correlations with the different perceptual dimensions are also taken into account. In general, the best model is chosen for each set of parameters, and the resulting utility functions are examined in order to compare the different sets of parameters in terms of their interpretability. The analyses yield the preference function for each dimension for an additive model and the preference surface for a multivariate model.

As an example, the utility functions for acoustic level and another parameter (parameter 2) are shown in Figure 3, in which the additive model was retained. Not surprisingly, the utility function for level is more or less monotonic: the more the level increases, the more the utility for preference decreases. The variation in utility with the other parameter is nonmonotonic; however, its overall influence on preference is comparatively small and is completely overpowered by the effect of level variation. For the level-equalized sample sets, the parameters had more equivalent influence on preference, and some of these were clearly nonmonotonic, indicating that there are zones in the middle of the range of variation that are maximally or minimally preferred by the panel of listeners.

Figure 3. Evolution of the utility of Leq (dBA) and parameter 2 from the stimuli of Group 2. The circles (o) correspond to the stimuli and the asterisks (*) correspond to the placement of the internal knots of the spline. [Two panels: utility as a function of Leq (dBA) and as a function of parameter 2.]

7. Conclusions

The data analysis reveals representations of the sets of car sounds that have several perceptual dimensions for both combinations of gear setting and motor speed. The loudness equalization allowed the emergence of other factors contributing both to similarity perception and, quite notably, to preference. When loudness is a factor in the comparison of car sounds, it clearly dominates the preference judgments. The roles of the secondary parameters vary with gear setting and motor speed and globally have less importance in these judgments. Generally, the perceptual salience of the dimensions is not the same for the two gear setting/motor speed combinations, indicating that the relative importance of perceptual cues evolves with changes in car functioning and context (equalized loudness or not).

This method, when applied to very different sets of sound objects (musical instruments and car sounds), thus reveals 1) a multidimensional mental representation of complex sound events, 2) a close relation between the perceptual dimensions and acoustic properties or their auditory transforms, and 3) specific properties of certain sounds that affect their perception and comparison with other sounds. For musical instrument sounds, differences between listeners in the perceptual importance accorded to the different dimensions and specificities have also been found. While such differences were not found in the studies on car sounds mentioned above, they should be systematically verified in such experiments in case they do exist. The coherence of the results across these different stimulus sets indicates a great stability in the perceptual processes involved in the appreciation of homogeneous sets of sounds. Further, it is important to emphasize that, aside from loudness, none of the objective parameters revealed corresponded to those currently included by default in most sound quality measuring devices. A cautionary note on a methodological issue is warranted here, however. In another similar study with a set of environmental sounds that were extremely heterogeneous, a perceptual

656 Susini et al Applying psychoacoustics

structure was found that was strongly categorical in natureand which was associated with a high degree of identi-fication of the individual sound sources by the listeners[26] Judgments of dissimilarity seemed in this case to bebased entirely on the categories that the stimuli belongedto and less to their individual acoustic and sensory proper-ties Thus this predominant cognitive factor-recognitionclassification and identification of the sound source [12]-rendered inappropriate the conception of the perceptualspace in terms of continuous underlying dimensions Theabove techniques of preference analysis are not applica-ble in such cases and other techniques need to be usedHowever for homogenous sound sets the experimentalapproach described above is a powerful method for under-standing the perceptual underpinnings of sound quality

References

[1] S. S. Stevens: The direct estimation of sensory magnitudes: loudness. American Journal of Psychology 69 (1956) 1-25.

[2] S. S. Stevens, E. H. Galanter: Ratio scales and category scales for a dozen perceptual continua. Journal of Experimental Psychology 54 (1957) 377-411.

[3] E. Zwicker, K. Deuter, W. Peisl: Loudness meters based on ISO 532 B with large dynamic range. InterNoise 85, Proceedings 1985 International Conference on Noise Control Engineering, 1985, 1119-1122.

[4] B. Berglund, U. Berglund, T. Lindvall: Scaling loudness, noisiness, and annoyance of aircraft noise. J. Acoust. Soc. Am. 57 (1975) 930-934.

[5] E. A. Bjork: The perceived quality of natural sounds. Acustica 57 (1985) 185-188.

[6] J. M. Grey: Multidimensional perceptual scaling of musical timbres. J. Acoust. Soc. Am. 61 (1977) 1270-1277.

[7] C. L. Krumhansl: Why is musical timbre so hard to understand? In: Structure and Perception of Electroacoustic Sound and Music. S. Nielzen, O. Olsson (eds.). Amsterdam, 1989, 43-53.

[8] S. McAdams, S. Winsberg, S. Donnadieu, G. De Soete, J. Krimphoff: Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research 58 (1995) 177-192.

[9] S. Winsberg, G. De Soete: A latent class approach to fitting the weighted Euclidean model, CLASCAL. Psychometrika 58 (1993) 315-330.

[10] J. Krimphoff, S. McAdams, S. Winsberg: Caractérisation du timbre des sons complexes. II: Analyses acoustiques et quantification psychophysique. Journal de Physique 4(C5) (1994) 625-628.

[11] P. Susini, S. McAdams, S. Winsberg: Caractérisation perceptive des bruits de véhicules. Actes du 4ème Congrès Français d'Acoustique 1 (1997) 543-546.

[12] S. McAdams: Recognition of auditory sources and events. In: Thinking in Sound: The Cognitive Psychology of Human Audition. S. McAdams, E. Bigand (eds.). Oxford University Press, Oxford, 1993, 146-198.

[13] K. Genuit: Investigations and simulation of vehicle noise using the binaural measurement technique. Noise and Vibration Conference, Traverse City, Michigan, 1987, 28-30.

[14] C. Vogel, V. Maffiolo, J.-D. Polack, M. Castellengo: Validation subjective de la prise de son extérieur. 4ème Congrès Français d'Acoustique, Marseille, France, 1997, 307-310.

[15] B. Everitt: Cluster analysis. John Wiley, New York, USA.

[16] B. Smith: PsiExp: an environment for psychoacoustic experimentation using the IRCAM musical workstation. Society for Music Perception and Cognition Conference '95, University of California, Berkeley, USA, 1995.

[17] S. Winsberg, G. De Soete: An extended Euclidean model for multidimensional scaling. Proceedings of the IFCS, Kobe, Japan, 1996.

[18] G. Schwarz: Estimating the dimension of a model. Annals of Statistics 6 (1978) 461-464.

[19] A. C. Hope: A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, Series B 30 (1968) 582-598.

[20] A. V. Oppenheim, R. W. Schafer: Digital signal processing. Prentice-Hall, NJ, 1975.

[21] J. O. Pickles: An introduction to the physiology of hearing. 2nd ed. Academic Press, London, 1988.

[22] B. C. J. Moore, B. R. Glasberg: Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753.

[23] R. D. Patterson, M. H. Allerhand, C. Giguere: Time-domain modeling of peripheral auditory processing: A modular architecture and a software platform. J. Acoust. Soc. Am. 98 (1995) 1890-1894.

[24] E. Zwicker, H. Fastl: Psychoacoustics: Facts and models. Springer-Verlag, Berlin, 1990.

[25] G. De Soete, S. Winsberg: A Thurstonian pairwise choice model with univariate and multivariate spline transformations. Psychometrika 58 (1993) 233-256.

[26] P. Susini, N. Misdariis, S. Winsberg, S. McAdams: Caractérisation perceptive de bruits. Acoustique et Techniques 13 (1998) 11-15.

Figure 1. Global schema of the subjective characterization approach, involving multidimensional scaling of dissimilarity judgements (I), acoustic analyses and correlations with the perceptual space (II), and construction of a multidimensional preference map (III). [Diagram labels: physical description; dissimilarity judgement ("How different are the stimuli perceived to be?"); CLASCAL; preference judgement ("Which stimulus is preferred?"); preference model; perceptually significant parameters; preference map.]

This approach is valid for other types of sound objects. This article proposes a general method for characterizing sound quality on the basis of objective criteria. The different stages of the method, as well as the precautions to be taken in using it, are discussed below, with a study performed on interior car sounds being used as an example [11, 2]. This study was performed at IRCAM in collaboration with the automobile manufacturers Renault and PSA Citroën and falls under the framework of a contractual agreement. Due to confidentiality constraints, the description remains formal and does not explicitly describe the results obtained (identity of sound sources and exact nature of acoustic parameters). This constraint does not, however, affect the main aim of the article, which is the presentation of the experimental methods and data analysis techniques employed to explore sound quality in all its natural sonic complexity.

A global view of the method, showing the relations among the different stages, is presented in Figure 1. At the outset, the sound corpus must be defined and the sounds carefully selected in order to obtain a relatively homogeneous corpus (section 2). Depending on the class of sound objects being studied, the panel of listeners must be selected according to appropriate criteria (section 3). The first step consists in determining the perceptual attributes, common to the panel of listeners, that are used to compare sounds to one another (section 4). A multidimensional scaling analysis (CLASCAL) yields a spatial model that can be represented graphically and that reveals the perceptual structure underlying the listeners' judgments in terms of continuous dimensions shared by all the sound samples and specific features (specificities) of each sound. The CLASCAL output also allows an analysis of the different judgment strategies used by listeners, corresponding to their grouping into latent classes. A subsequent acoustic analysis phase attempts to determine the acoustic and psychoacoustic parameters of the sound signals that are correlated with the positions of the samples along the perceptual dimensions (section 5). In the last stage, the degree of preference (or, inversely, annoyance) associated with each sound is evaluated as a function of the perceptually significant acoustic parameters revealed in the previous stage (section 6). The advantage of this approach is that it does not limit the exploration and characterization of the components of sound quality to parameters that are already known. It provides a method for finding new perceptually salient parameters that engineers had not imagined, but that listeners hear and use in their evaluations anyway.

2 Establishment of the Sound Corpus

In order to ensure a realistic restitution of the sound field in the experimental situation, as close as possible to the original conditions, it is necessary to use binaural techniques [13]. The car sounds used in the study were recorded digitally by the automobile manufacturers with a dummy-head system seated in the front passenger seat of each car, with the car running on a test track. The sounds were reproduced in the experiments by IRCAM's ISPW real-time digital signal processing environment [14], using the MAX application on the NeXT workstation. All sounds were edited to 5 s in duration with 50 ms attack and decay ramps, taking care to avoid sounds with recording artifacts. The three important points for sound restitution are the following:
• the specifications of the binaural recording conditions,
• the inverse filtering that compensates for the effects of the transfer function of the sound acquisition and presentation systems, and
• the calibration of the sound level for headphone presentation.

An alternative to dummy-head recordings may also be used and is often sufficient, if not as realistic: simple stereo microphone recordings. A recent study [15] on urban sound environments has shown that a stereo recording with cardioid microphones spaced 60 cm apart with a 100° opening conserves the ecologically valid character of sound samples played back over loudspeakers.

Certain acoustic characteristics perceptually dominate and overpower other, less salient ones. This is often the case with loudness, for example. Two sounds that differ mainly in terms of loudness will be judged different according to this dimension, with little contribution from other dimensions of variation. In order to eliminate overly obvious parameters, it is sometimes necessary to equalize a corpus of sounds along these parameters, such as loudness, duration, and pitch. However, in order to study the influence of loudness, pitch, and broadband noise level (the latter two being related here to motor speed and car speed) on the perceptual representation, four groups of 16 car sounds were used. The cars were the same for each group. Two combinations of motor speed and gear setting were employed. In the first part of the study, the loudness was not modified and the sounds were presented as they would have been perceived in the car interior. In the second part, the loudness of the sound samples was equalized over each of the two sound sets. The composition of the four groups is summarized in Table I.

Table I. Composition of the four groups of car sounds used.

                      Motor speed 1    Motor speed 2
Loudness variable     Group 1          Group 2
Loudness equalized    Group 3          Group 4

An initial listening to the sound samples is important to discern salient perceptual properties that might influence the judgments of subjects in the main experiments. In particular, it is necessary to avoid characteristics that are too recognizable in certain samples, due either to their strong identity or to recording artifacts. The set of cars tested corresponded to the same price range. There was no risk that individual cars would be recognized. It is possible to determine the structure of the corpus more precisely by classing the sounds according to different predefined dominant criteria. A cluster analysis [15] can reveal the degree of homogeneity of the sound corpus. If the tree structure obtained reveals a strong categorization of the corpus, it is advisable to determine which categories best represent the objectives of the study in order to obtain appropriate stimuli.
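As an illustration of how a cluster analysis can expose a strongly categorized corpus, the following is a minimal sketch of agglomerative clustering on a dissimilarity matrix. The text does not say which clustering method was used, so single linkage is an assumption made here for simplicity, and the 4 x 4 matrix `toy` is hypothetical data, not taken from the study. A large jump in the merge heights suggests that the corpus falls into distinct categories.

```python
def single_linkage(dissim):
    """Agglomerative single-linkage clustering (assumed method, not the authors').

    dissim: symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns the merges in order as (members_a, members_b, height).
    """
    clusters = [frozenset([i]) for i in range(len(dissim))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the two clusters whose closest members are least dissimilar.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                h = min(dissim[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or h < best[0]:
                    best = (h, x, y)
        h, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), h))
        merged = clusters[x] | clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return merges


# Hypothetical dissimilarities: sounds 0-1 and sounds 2-3 form two tight groups.
toy = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
merges = single_linkage(toy)
heights = [h for _, _, h in merges]
print(heights)  # a large final jump indicates two well-separated categories
```

In this toy case the last merge happens at a much greater height than the first two, which is the kind of tree structure that would argue for restricting the corpus to one category.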

3 Subject Selection

Depending on the type of study, it is generally advisable not to use subjects whose knowledge of the test product is so great that it could bias their judgments. A car mechanic, for example, would listen in a different, much more analytical way than an average listener (one might speak of "professional deformation"). In the present study, subjects aged 25 to 45 years were recruited from a list of volunteers such that an equal distribution of sex and age category was obtained. Thirty subjects were recruited for both the dissimilarity (section 4) and preference (section 6) studies, and an additional 30 were recruited for the preference study. The subjects were required to have a driving permit, and a car of the appropriate price range had to be owned by someone in their immediate family. They were reimbursed for their participation.

4 Dissimilarity Scaling: The CLASCAL Program

4.1 Apparatus

The digitized sounds were processed by inverse filtering on a NeXT computer equipped with an ISPW card and the MAX program. They were subsequently converted to analog signals with ProPort converters before being amplified through a Canford stereo amplifier and sent to AKG 1000 open-air headphones. The experiment was run in the PsiExp computer environment [16], which provides stimulus presentation, data acquisition, and a graphic interface for the subject.

4.2 Procedure

The experiment was performed in four sessions, with one session for each gear setting/motor speed combination, with and without loudness equalization. All 120 different pairs among the 16 car sounds were presented in each session. At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Next, 15 training trials were presented to allow the subject to become familiar with the rating task. On each trial, a pair of sounds was presented, separated by a 1-s silence. The subject saw a horizontal slider on the computer screen with a cursor that could be moved with the computer mouse. The scale was labelled "Very Similar" at the left end and "Very Dissimilar" at the right end. A judgment was made by moving the cursor to the desired position on the slider and clicking on a button to record it in the computer. The initial position of the cursor was random on each trial, and the subject had to move it at least once before entering the judgment. Once the judgment was entered, the program proceeded automatically to the next trial. The judgments were coded on a scale from 0 (very similar) to 1 (very dissimilar), for example; the scale could equally well vary from 0 to any positive number.
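The trial structure of a session can be sketched as follows: 16 stimuli give 16 x 15 / 2 = 120 unordered pairs, which are then presented in random order. The function name `make_trial_list` is hypothetical, and randomizing which sound of a pair is played first is an assumption; the text only states that all 120 pairs were presented.

```python
import itertools
import random


def make_trial_list(n_stimuli, seed=None):
    """Return every unordered pair of stimuli once, in random trial order."""
    rng = random.Random(seed)
    pairs = [list(p) for p in itertools.combinations(range(n_stimuli), 2)]
    for p in pairs:
        rng.shuffle(p)  # assumption: within-pair playback order is randomized
    rng.shuffle(pairs)  # random order of trials across the session
    return [tuple(p) for p in pairs]


trials = make_trial_list(16, seed=0)
print(len(trials))  # 120
```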

4.3 Multidimensional scaling analysis with CLASCAL

The data used in a multidimensional scaling (MDS) analysis consist of judgments of dissimilarity between each pair of the stimuli under consideration, i.e. N(N-1)/2 judgments for N stimuli from each subject. The dissimilarities are modelled as distances in an extended Euclidean space of R dimensions. In the spatial representation of the N stimuli, a large dissimilarity is represented by a large distance. The CLASCAL model for the distance between stimulus i and stimulus j, developed by Winsberg and De Soete [9, 17] and presented in Figure 1, postulates common dimensions shared by all stimuli, specific attributes or specificities particular to each stimulus, and latent classes, each of which has different saliences or weights for each of the common dimensions and for the set of specificities. Maximum likelihood estimates of the model parameters are determined (that is, the coordinates of each stimulus on each dimension of the common space, the specificities of the stimuli, the dimension weights for each class, and the proportion of subjects in each class). An EM (expectation-maximization) algorithm is used to determine the class structure. (See [9] for a complete description of the CLASCAL algorithm and the latent class approach to MDS. Space does not permit an adequate description here.) The class structure is latent: there is no a priori assumption concerning the latent class to which a given subject belongs.

Figure 2. Perceptual space of the stimuli from Group 3, derived from dissimilarity scaling (three-dimensional plot; recoverable axis labels: Dimension I, Dimension II).

Model selection, including the choice of the appropriate number of latent classes, the number of common dimensions, and the presence or absence of specificities, is based on the BIC statistic, which is information based and depends on the log likelihood, the number of model parameters, and the number of observations [18], as well as on a Monte Carlo procedure [19]. Models with the lowest BIC values are chosen. The Monte Carlo procedure is used to determine the appropriate number of latent classes and to verify the spatial model. (See [9] for a discussion of model selection.) The CLASCAL analyses yield a spatial representation of the N stimuli on the R dimensions, the specificity of each stimulus, the probability that each subject belongs to each latent class (generally equal to one for one of the classes), and the weights or saliences of each perceptual dimension for each class. Also included in the output are the model variance, the log likelihood, the value of the BIC statistic, and the results of the Monte Carlo procedure, if used.
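The distance model described at the start of this section can be sketched in a few lines. The formula itself is not spelled out in this text; the form used below (a weighted squared Euclidean term over the common dimensions plus a class-weighted sum of the two specificities, under a square root) is the one generally given for the extended Euclidean model in the cited literature [9], and should be read as an assumption. All argument names are hypothetical.

```python
import math


def clascal_distance(x_i, x_j, s_i, s_j, w_t, v_t):
    """Distance between stimuli i and j for latent class t (assumed model form).

    x_i, x_j: coordinates on the R common dimensions
    s_i, s_j: non-negative specificities of the two stimuli
    w_t:      weights of class t on each common dimension
    v_t:      weight of class t on the set of specificities
    """
    common = sum(w * (a - b) ** 2 for w, a, b in zip(w_t, x_i, x_j))
    specific = v_t * (s_i + s_j)
    return math.sqrt(common + specific)


# With unit weights and no specificities this reduces to Euclidean distance.
d = clascal_distance([0.0, 0.0], [3.0, 4.0], 0.0, 0.0, [1.0, 1.0], 1.0)
print(d)  # 5.0
```

Two stimuli at the same location in the common space can still be judged dissimilar if either has a non-zero specificity, which is how the model accommodates features unique to a single sound.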

4.4 Results

A single class was sufficient for all data sets. An analysis with the number of dimensions varying between 1 and 6 without specificities, and between 1 and 5 with specificities, was then performed for a single subject class. BIC indicated that the best model was three-dimensional with specificities for Group 3 and two-dimensional with specificities for the other groups. A graphic representation of the data corresponding to the sounds of Group 3 is given in Figure 2. Each sample is represented in the three-dimensional common space. The specificities of the stimuli are not shown in the figure.
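The BIC-based choice among candidate models can be sketched as below, using the standard definition BIC = -2 log L + k log n from Schwarz [18]. The log likelihoods and parameter counts are hypothetical placeholders chosen only to show the mechanics; they are not results from the study. The number of observations assumes 30 subjects each judging 120 pairs.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Schwarz's criterion: lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


n_obs = 30 * 120  # assumption: one observation per subject per pair

# Hypothetical candidates: name -> (log likelihood, number of parameters)
candidates = {
    "model A": (-1000.0, 10),
    "model B": (-990.0, 20),
}

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # model A
```

Here the richer model B fits slightly better, but its ten extra parameters cost more than the gain in log likelihood, so the simpler model is retained.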

5 Physical Parameters Underlying the Perceptual Space

5.1 Acoustic analysis

Once the perceptual configuration is obtained, it is important to give it a physical interpretation. An interpretation means some systematic relationship between the stimulus characteristics and the locations in the space.

An appropriate spectral representation for the signal analyses is chosen. Since the car sounds studied (constant motor speed) had a quasi-stationary character, the method of Welch [20] was used. In order to take into account auditory response as a function of level and frequency, the analytic formulations of different classic weighting functions were used (dBA, dBB, dBC). A physiological modeling approach was also used, in which the operation of the peripheral auditory system was considered at two levels. The filtering due to the external and middle ears was modeled with a second-order Butterworth bandpass filter with -3-dB cutoff frequencies of 450 Hz and 8500 Hz [21]. As for the cochlea, Moore and Glasberg's [22] ERB (equivalent rectangular bandwidth) formula was adopted for auditory filter bandwidths, and Patterson et al.'s [23] gammatone filters were used. On the basis of these early analysis stages, a set of parameters derived from the psychoacoustic literature [24] and from studies on timbre [10] was calculated. These parameters were chosen by way of an empirical loop that consisted of listening to the sounds with respect to their relative positions in the multidimensional space in order to isolate the kind of variation that characterized a given dimension. Next, hypotheses were made about the new physical or psychoacoustic indices that were appropriate for the stimuli under study, and the analyses were performed to extract the quantitative values. Finally, the correlation of each parameter with the coordinates of the sound samples along the perceptual dimensions was evaluated. A good correlation is taken as an indication that the parameter is a strong candidate for a quantitative predictor of the perceptual dimension.

5.2 Correlations

Table II presents the correlation coefficients between the perceptual dimensions and the physical or psychoacoustic parameters for each sample group. The probability (p) that the perceptual coordinates and the parameter values are independent (or unrelated) is given as well. The variance in the perceptual dimension explained by the predictor parameter can be determined by taking the square of the correlation coefficient. Finally, for each of the dimensions, a quantitative descriptor was determined. It is important to note that in equalizing loudness over the set of samples (Groups 3 and 4), other dimensions emerged that correlated with parameters not revealed in the analysis of Groups 1 and 2 with loudness variation. This kind of context effect is particularly important to emphasize, since it demonstrates that the salient perceptual parameters one obtains in such a study depend very strongly on the kinds of physical variation present in the stimulus set. The preference analysis to be examined in the next section will determine the utility for preference of each of the predictor parameters. Those that have a strong weight can thus in turn be equalized over the sound set in a subsequent stage of experimentation. As such, this approach allows a progressive minimization of perceived differences, in terms of preference or unpleasantness, between the stimuli.

Table II. Correlation coefficients between perceptual coordinates and objective parameters. The probability p that the two measures are independent is indicated for p < 0.01 (*) and for p < 0.05 (†). Parameter P1 = Loudness.

             Group 1             Group 2
Parameter    Dim 1    Dim 2      Dim 1    Dim 2
P1           -0.92    -0.06      -0.91     0.28
P2            0.09     0.80        -        -
P3             -        -         0.43    -0.88

             Group 3                       Group 4
Parameter    Dim 1    Dim 2    Dim 3       Dim 1    Dim 2
P2            0.35    -0.7     -0.14       -0.93    -0.29
P3             -        -        -         -0.51†    0.86
P4           -0.81     0.32    -0.33         -        -
P5           -0.32     0.00    -0.83         -        -

6 Preference Analysis: A Thurstone Case V Preference Model with Spline Transformations

6.1 Procedure

At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Then all pairs of different samples were presented in a random order. On each trial, the subject heard a pair of samples only once and had to choose which sound was preferred. The data consist of the proportion of subjects that preferred one sample over another for each pair of the N stimuli.

6.2 Stages in the preference analysis

This phase of the study attempts to construct a preference map. De Soete and Winsberg [25] developed a Thurstonian pairwise choice model with univariate and multivariate spline transformations. In their preference model, the probability that stimulus i is preferred over stimulus j is defined as a function of the difference in utility of each stimulus or sample. The preference analysis program developed by De Soete and Winsberg is presently limited to two dimensions, generally considered adequate to describe a preference space (as is the case with Groups 1, 2, and 4). If three perceptual dimensions are recovered in an MDS analysis (as with Group 3), all pairs of candidate parameters may be tested. The model seeks a function that transforms the values of the physical parameters a_i and b_i for stimulus i into a utility value u_i that is related to the preference probability. The dependence is expressed as follows:

$$P_{ij} = \Phi(u_i - u_j) \qquad (1)$$

where P_ij is the probability of preferring sample i over sample j, Φ is the cumulative normal (Gaussian) function, and u_i is the utility of sample i.

The greater and more positive the difference in utility between the two samples, the more sample i will be preferred over sample j.

Two types of models were tested: additive and multivariate. In the additive model (2), it is assumed that the contributions of the two parameters are independent and that the global utility of a sample is the sum of the utilities of each parameter. Thus, u is a function of the predictor parameters (a and b), each having its own transform function (f and g, respectively):

$$u = f(a) + g(b) \qquad (2)$$
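The preference model of section 6.2, in which the probability of preferring sample i over sample j is the cumulative normal function of the utility difference and the utility is the sum of two one-parameter transforms, can be sketched as follows. The transforms `f` and `g` below are simple hypothetical placeholders (a decreasing line in level and an inverted parabola); in the method described here they are splines fitted to the preference data, which this sketch does not attempt.

```python
import math


def normal_cdf(z):
    """Cumulative standard normal function, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probability(u_i, u_j):
    """Probability that sample i is preferred over sample j."""
    return normal_cdf(u_i - u_j)


def additive_utility(a, b, f, g):
    """Additive model: each predictor contributes independently."""
    return f(a) + g(b)


# Hypothetical transforms, for illustration only.
def f(a):
    return -0.1 * (a - 70.0)  # utility falls as level a (in dB) rises


def g(b):
    return -((b - 0.5) ** 2)  # utility peaks in the middle of the range of b


u_quiet = additive_utility(65.0, 0.5, f, g)
u_loud = additive_utility(75.0, 0.5, f, g)
p = preference_probability(u_quiet, u_loud)
print(round(p, 3))  # probability that the quieter sample is preferred
```

Because the model depends only on the utility difference, equal utilities give a probability of 0.5, and the probabilities of the two possible choices for a pair always sum to one.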

The functions f and g are splines, that is, piecewise polynomial functions, the number and order of the polynomials being variable. The order of the spline is taken to be equal to the maximum degree of the polynomial plus one. The junctions between the polynomials of a given spline correspond to the internal knots and are finite in number. Attention is restricted to splines with maximal continuity at each knot. That is, for a spline of order k, the function and its first k - 2 derivatives are continuous. The default knot positions are located at a ladder of quantiles; thus, an equal number of observations is used to determine each spline parameter. Each spline of a given degree can be represented as a linear combination of a set of basis functions.

In the multivariate model (3), the utility is a function of both parameters:

$$u = f(a, b) \qquad (3)$$

This multivariate function is represented by a multivariate spline that can be defined as a linear combination of tensor products of univariate basis splines. In this model, therefore, the effect of one parameter on utility depends on the value of the other parameter.

For a preference matrix and a given set of predictor parameters, a large number of models may be tested by varying the type of model (additive or multivariate) and the order and number of internal knots of the splines. BIC is used to choose among the candidate models. For a given preference matrix, this criterion can also be used to choose from among various pairs of predictor parameters that had more or less similar correlations with the perceptual space. This latter choice is of a heuristic nature, since the correlations with the different perceptual dimensions are also taken into account. In general, the best model is chosen for each set of parameters, and the resulting utility functions are examined in order to compare the different sets of parameters in terms of their interpretability. The analyses yield the preference function for each dimension for an additive model and the preference surface for a multivariate model.

As an example, the utility functions for acoustic level and another parameter (parameter 2) are shown in Figure 3, in which the additive model was retained. Not surprisingly, the utility function for level is more or less monotonic: the more the level increases, the more the utility for preference decreases. The variation in utility with the other parameter is nonmonotonic; however, its overall influence on preference is comparatively small and is completely overpowered by the effect of level variation. For the level-equalized sample sets, the parameters had more equivalent influence on preference, and some of these were clearly nonmonotonic, indicating that there are zones in the middle of the range of variation that are maximally or minimally preferred by the panel of listeners.

Figure 3. Evolution of the utility of Leq (dBA) and parameter 2 for the stimuli of Group 2. The circles (o) correspond to the stimuli and the asterisks (*) correspond to the placement of the internal knots of the spline.

7 Conclusions

The data analysis reveals representations of the sets of car sounds that have several perceptual dimensions for both combinations of gear setting and motor speed. The loudness equalization allowed the emergence of other factors contributing both to similarity perception and, quite notably, to preference. When loudness is a factor in the comparison of car sounds, it clearly dominates the preference judgments. The role of the secondary parameters varies with gear setting and motor speed, and globally they have less importance in these judgments. Generally, the perceptual salience of the dimensions is not the same for the two gear setting/motor speed combinations, indicating that the relative importance of perceptual cues evolves with changes in car functioning and context (equalized loudness or not).

This method, when applied to very different sets of sound objects (musical instruments and car sounds), thus reveals 1) a multidimensional mental representation of complex sound events, 2) a close relation between the perceptual dimensions and acoustic properties or their auditory transforms, and 3) specific properties of certain sounds that affect their perception and comparison with other sounds. For musical instrument sounds, differences between listeners in the perceptual importance accorded to the different dimensions and specificities have also been found. While such differences were not found in the studies on car sounds mentioned above, they should be systematically checked for in such experiments in case they do exist. The coherence of the results across these different stimulus sets indicates a great stability in the perceptual processes involved in the appreciation of homogeneous sets of sounds. Further, it is important to emphasize that, aside from loudness, none of the objective parameters revealed corresponded to those currently included by default in most sound quality measuring devices. A cautionary note on a methodological issue is warranted here, however. In another similar study with a set of environmental sounds that were extremely heterogeneous, a perceptual structure was found that was strongly categorical in nature and that was associated with a high degree of identification of the individual sound sources by the listeners [26]. Judgments of dissimilarity seemed in this case to be based largely on the categories to which the stimuli belonged and less on their individual acoustic and sensory properties. Thus, this predominant cognitive factor (recognition, classification, and identification of the sound source [12]) rendered inappropriate the conception of the perceptual space in terms of continuous underlying dimensions. The above techniques of preference analysis are not applicable in such cases, and other techniques need to be used. However, for homogeneous sound sets, the experimental approach described above is a powerful method for understanding the perceptual underpinnings of sound quality.

References

[] S S Stevens The direct estimation of sensory magnitude-loudness American Journal of Psychology 69 (1956) 1-25

[2] S S Stevens E H Galanter Ratio scales and category scalesfor a dozen perceptual continua Journal of Experimental Psy-chology 54 (1957) 377-411

l3J E Zwicker K Deuter W Peisl Loudness meters based onISO 532 B with large dynamic range InterNoise 85 Pro-ceedings 1985 International Conference on Noise Control En-gineering 19851119-1122

[4] B Berglund U Berglund T Lindvall Scaling loudness nois-ness and annoyance of aircraft noise J Acoust Soc Am 57(1975) 930-934

[5] E A Bjork The perceived quality of natural sounds Acustica57 (1985) 185- 188

[6] J M Grey Multidimensional perceptual scaling of musicaltimbres J Acoust Soc Am 61 (1997) 1270-1277

[7] C L Krumhansl Why is musical timbre so hard to under-stand - In Structure and Perception of Electroacoustic Soundand Music S Nielzen O Olsson (eds) Amsterdam 198943-53

[8] S McAdams S Winsberg S Donnadieu G De Soete JKrimphoff Perceptual scaling of synthesized musical timbresCommon dimensions specificities and latent subject classesPsychological Research 58 (1995) 177-192

[9] S Winsberg G De Soete A latent class approach to fitting theweighted Euclidean model elascal Psychometrika 58 (1993)315-330

ACUSTICAmiddot acla acuslicaVol 85 (1999)

[10] J Krimphoff S McAdams S Winsberg Caracterisation dutimbre des sons complexes II Analyses acoustiques et quan-tification psychophysique Journal de Physique 4(C5) (1994)625-628

[I] 1 P Susini S McAdams S Winsberg Caracterisation percep-tive des bruits de vchiClles Aetes du 4eme Congres Francaisd Acoustique I (1997) 543-546

[12] S McAdams Recognition of auditory sources and events-In Thinking in Sound The Cognitive Psychology of HumanAudition S McAdams E Bigand (eds) Oxford UniversityPress Oxford 1993 146-198

[13] K Genuit Investigations and simulation of vehicle noise us-ing the binaural mesurement technique Noise and VibrationConference Traverse City Michigan 198728-30

[14J C Vogel VMaffiolo JmiddotD Polack M Castellcngo Validationsubjective de la prise de son exterieur 4eme Congres Francaisd Aeoustique Marseille France 1997307-3 O

[I5J B Everitt Cluster analysis John Wiley New York USA[16] B Smith PsiExp an environment for psychoacoustic experi-

mentation using the IRCAM musical workstation Society forMusic Perception and C)gnition Conference95 University ofCalifornia Berkeley USA 1995

[17] S Winsberg G De Sode An extended Euclidean model formultidimensional scaling Proceedings of the IFCS KoheJapan 1996

[] 8] G Schwarz Estimating the dimension of a model Annals ofStatistics 6 (1978) 46 I-464

[19] A C Hope A simplifil~d Monte Carlo significance test pro-cedure Journal of the Royal Statistical Society Series B 30(1968) 582-598

[20J A V Oppenheim R W Schafer Digital signal processingPrentice-Hall NJ 1975

[21] 1 M Pickles An introduction to the physiology of hearing2nd ed Academic Presi London 1988

[22] B C J Moore B R Glasberg Suggested formulae for calcu-lating auditory-bandwidths and exitation patterns J AcousSoc Am 74 (1983) 75C-753

[23] R D Patterson M H Allerhand C Giguere Time-domainmodeling of peripheral auditory processing A modular archi-tecture and a software platform J Acoust Soc Am 98 (1995)1890-1894

[241 E Zwicker H Fastl Psychoacoustics Facts and modelsSpringer- Verlag Berlin 1990

[25] G De Soete S Winsberg A Thurstonian pairwise choicemodel with univariate and multivariate splinc transformationPychometrika 58 (1993233-256

[26] P Susini N Misdarii S Winsberg S McAdams Car-acterisation perceptive je bruits Acoustique et Techniques13 (998) 11-15


Table I. Composition of the four groups of car sounds used.

                      Motor speed 1    Motor speed 2
Loudness variable     Group 1          Group 2
Loudness equalized    Group 3          Group 4

ing with cardioid microphones spaced 60 cm apart with a 100° opening conserves the ecologically valid character of sound samples played back over loudspeakers. Certain acoustic characteristics perceptually dominate and overpower other, less salient ones. This is often the case with loudness, for example. Two sounds that differ mainly in terms of loudness will be judged different according to this dimension, with little contribution from other dimensions of variation being taken into account. In order to eliminate overly obvious parameters, it is sometimes necessary to equalize a corpus of sounds along these parameters, such as loudness, duration, and pitch. However, in order to study the influence of loudness, pitch, and broadband noise level (the latter two being related here to motor speed and car speed) on the perceptual representation, four groups of 16 car sounds were used. The cars were the same for each group. Two combinations of motor speed and gear setting were employed. In the first part of the study, the loudness was not modified and the sounds were presented as they would have been perceived in the car interior. In the second part, the loudness of the sound samples was equalized over each of the two sound sets. The composition of the four groups is summarized in Table I.

An initial listening to the sound samples is important to discern salient perceptual properties that might influence the judgments of subjects in the main experiments. In fact, it is necessary to avoid characteristics that are too recognizable in certain samples, due either to their strong identity or to recording artifacts. The set of cars tested corresponded to the same price range. There was no risk that individual cars would be recognized. It is possible to determine more precisely the structure of the corpus used by classing the sounds according to different predefined dominant criteria. A cluster analysis [15] can reveal the degree of homogeneity of the sound corpus. If the tree structure obtained reveals a strong categorization of the corpus, it is advisable to determine which categories best represent the objectives of the study in order to obtain appropriate stimuli.
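As an illustration of how a cluster analysis can expose a strongly categorical corpus, the following minimal sketch runs average-linkage agglomerative clustering on a small dissimilarity matrix and reports the merge heights. The choice of average linkage, the function name `average_linkage`, and the 4 x 4 matrix are assumptions made for the example; the text does not specify which clustering method or input data were used.

```python
def average_linkage(dist):
    """Agglomerative clustering with average linkage.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns the merges as (members_a, members_b, height), in merge order.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the two clusters with the smallest mean pairwise dissimilarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                height = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((clusters[a], clusters[b], height))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical dissimilarities for four sounds forming two tight pairs.
example = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.15],
    [0.80, 0.90, 0.15, 0.00],
]
merges = average_linkage(example)
heights = [h for _, _, h in merges]
print(heights)
```

A large jump between the last merge height and the earlier ones, as in this toy matrix, is the kind of tree structure that would suggest a strongly categorized corpus.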

3. Subject Selection

According to the type of study, it is generally advisable not to use subjects whose knowledge of the test product is great enough to induce biases in their judgments. A car mechanic, for example, would listen in a different, much more analytical way than an average listener (one might speak of "professional deformation"). In the present study, subjects aged 25 to 45 years were recruited from a list of volunteers such that an equal distribution of sex and age category was obtained. 30 subjects were recruited for both the dissimilarity (Section 4) and preference (Section 6) studies, and an additional 30 were recruited for the preference study. The subjects were required to have a driving permit, and a car of the appropriate price range had to be owned by someone in their immediate family. They were reimbursed for their participation.

4. Dissimilarity Scaling: The CLASCAL Program

4.1. Apparatus

The digitized sounds were processed by inverse filtering on a NeXT computer equipped with an ISPW card and the MAX program. They were subsequently converted to analog signals with ProPort converters before being amplified through a Canford stereo amplifier and sent to AKG 1000 open-air headphones. The experiment was run in the PsiExp computer environment [16], which provides stimulus presentation, data acquisition, and a graphic interface for the subject.

4.2. Procedure

The experiment was performed in four sessions, with one session for each gear setting/motor speed combination, with and without loudness equalization. All 120 different pairs between the 16 car sounds were presented in each session. At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Next, 15 training trials were presented to give the subject the possibility to become familiar with the rating task. On each trial, a pair of sounds was presented, separated by a 1-s silence. The subject saw a horizontal slider on the computer screen with a cursor that could be moved with the computer mouse. The scale was labelled "Very Similar" at the left end and "Very Dissimilar" at the right end. A judgment was made by moving the cursor to the desired position on the slider and clicking on a button to record it in the computer. The initial position of the cursor was random on each trial, and the subject had to move it at least once before entering the judgment. Once the judgment was entered, the program proceeded automatically to the next trial. The judgments were coded on a scale from 0 (very similar) to 1 (very dissimilar), for example. The scale could equally well vary from 0 to any positive number.
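The trial structure of one session can be sketched as follows. This is a minimal illustration, not the PsiExp implementation: the function name `make_trials`, the dictionary keys, and the randomization of trial order and of the order of the two sounds within a pair are assumptions; only the number of pairs and the random initial cursor position come from the procedure described above.

```python
import itertools
import random


def make_trials(n_stimuli, rng):
    """Build one session: every unordered pair of stimuli appears exactly once."""
    pairs = list(itertools.combinations(range(n_stimuli), 2))
    rng.shuffle(pairs)  # assumed: trial order is randomized
    trials = []
    for i, j in pairs:
        # Assumed: which sound of the pair is played first is also random.
        first, second = (i, j) if rng.random() < 0.5 else (j, i)
        trials.append({
            "first": first,
            "second": second,
            # Random initial slider position on the 0-1 dissimilarity scale.
            "cursor_start": rng.random(),
        })
    return trials


trials = make_trials(16, random.Random(0))
print(len(trials))  # 120, i.e. 16 * 15 / 2
```

Passing an explicit `random.Random` instance keeps a session reproducible, which is convenient when a subject's trial order has to be reconstructed later.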

4.3. Multidimensional scaling analysis with CLASCAL

The data used in a multidimensional scaling (MDS) analysis consist of judgments of dissimilarity between each pair of the stimuli under consideration, i.e. N(N-1)/2 judgments for N stimuli from each subject. The dissimilarities are modelled as distances in an extended Euclidean space of R dimensions. In the spatial representation of the N stimuli, a large dissimilarity is represented by a large distance. The CLASCAL model for the distance between stimulus i and stimulus j, developed by Winsberg and De Soete [9, 17] and presented in Figure 1, postulates common dimensions shared by all stimuli, specific attributes or specificities particular to each stimulus, and latent classes, each of which has different saliences or weights for each of the common dimensions and the set of specificities. Maximum likelihood estimates of the model parameters are determined (that is, the coordinates of each stimulus on each dimension of the common space, the specificities of the stimuli, the dimension weights for each class, and the proportion of subjects in each class). An EM (expectation-maximization) algorithm is used to determine the class structure. (See [9] for a complete description of the CLASCAL algorithm and the latent class approach to MDS. Space does not permit an adequate description here.) The class structure is latent: there is no a priori assumption concerning the latent class to which a given subject belongs.

Model selection, including the choice of the appropriate number of latent classes, the number of common dimensions, and the presence or absence of specificities, is based on the BIC statistic, which is information based and depends on the log likelihood, the number of model parameters, and the number of observations [18], as well as on a Monte Carlo procedure [19]. Models with the lowest BIC values are chosen. The Monte Carlo procedure is used to determine the appropriate number of latent classes and to verify the spatial model. (See [9] for a discussion on model selection.) The CLASCAL analyses yield a spatial representation of the N stimuli on the R dimensions, the specificity of each stimulus, the probability that each subject belongs to each latent class (generally equal to one for one of the classes), and the weights or saliences of each perceptual dimension for each class. Also included in the output results are the model variance, the log likelihood, the value of the BIC statistic, and the results of the Monte Carlo procedure, if used.

Figure 2. Perceptual space of the stimuli from Group 3 derived from dissimilarity scaling.
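The distance model of Section 4.3 can be made concrete with a short sketch. It assumes the usual form of the extended weighted Euclidean model published for CLASCAL, in which the squared distance for latent class t is the weighted sum of squared coordinate differences plus the class weight on specificities times the sum of the two specificities; Figure 1 is not reproduced here, so this form, the function name `clascal_distance`, and all numerical values are assumptions for illustration only.

```python
import math


def clascal_distance(x_i, x_j, s_i, s_j, w_t, v_t):
    """Model distance between stimuli i and j for one latent class t.

    x_i, x_j : coordinates of the two stimuli on the R common dimensions
    s_i, s_j : specificities of the two stimuli (non-negative)
    w_t      : weights of class t on each common dimension
    v_t      : weight of class t on the set of specificities
    """
    common = sum(w * (a - b) ** 2 for w, a, b in zip(w_t, x_i, x_j))
    return math.sqrt(common + v_t * (s_i + s_j))


# Illustrative values only (not taken from the study).
x_a = [0.2, -0.1, 0.3]
x_b = [-0.1, 0.3, 0.3]
d = clascal_distance(x_a, x_b, s_i=0.04, s_j=0.0, w_t=[1.0, 1.0, 1.0], v_t=1.0)
print(round(d, 4))
```

With a single latent class, as in the results reported for the car sounds, the weights can be fixed to one and the model reduces to a Euclidean distance augmented by the specificities.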

4.4. Results

A single class was sufficient for all data sets. An analysis with the number of dimensions varying between 1 and 6 without specificities, and between 1 and 5 with specificities, was then performed for a single subject class. BIC indicated that the best model was three-dimensional with specificities for Group 3 and two-dimensional with specificities for the other groups. A graphic representation of the data corresponding to the sounds of Group 3 is given in Figure 2. Each sample is represented in the three-dimensional common space. The specificities of the stimuli are not shown in the figure.
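Model selection by BIC, as used for these results, can be sketched as follows. The sketch uses the standard Schwarz definition, BIC = -2 log L + k log n; whether CLASCAL computes the statistic in exactly this form is an assumption, and the log likelihoods and parameter counts below are invented purely to show the mechanics (they are not results from the study). The number of observations assumes 30 subjects each judging 120 pairs.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Schwarz's Bayesian information criterion (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


n_obs = 30 * 120  # assumed: 30 subjects x 120 dissimilarity judgments

# Hypothetical candidate models: values are illustrative, not from the paper.
candidates = [
    {"dims": 2, "specificities": False, "loglik": -520.0, "n_params": 29},
    {"dims": 2, "specificities": True, "loglik": -450.0, "n_params": 45},
    {"dims": 3, "specificities": False, "loglik": -470.0, "n_params": 42},
    {"dims": 3, "specificities": True, "loglik": -400.0, "n_params": 58},
]

for model in candidates:
    model["bic"] = bic(model["loglik"], model["n_params"], n_obs)

best = min(candidates, key=lambda m: m["bic"])
print(best["dims"], best["specificities"], round(best["bic"], 2))
```

The penalty term k log n is what prevents the model with the most dimensions and specificities from winning automatically: a richer model is retained only if its gain in log likelihood outweighs the added parameters.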

5. Physical Parameters Underlying the Perceptual Space

5.1. Acoustic analysis

Once the perceptual configuration is obtained, it is important to give it a physical interpretation. An interpretation means some systematic relationship between the stimulus characteristics and the locations in the space.

An appropriate spectral representation for the signal analyses is chosen. Since the car sounds studied (constant motor speed) had a quasi-stationary character, the method of Welch [20] was used. In order to take into account auditory response as a function of level and frequency, the analytic formulations of different classic weighting functions were used (dBA, dBB, dBC). A physiological modeling approach was also used in which the operation of the peripheral auditory system was considered at two levels. The filtering due to the external and middle ears was modeled with a second-order Butterworth bandpass filter with -3-dB cutoff frequencies of 450 Hz and 8500 Hz [21]. As for the cochlea, Moore and Glasberg's [22] ERB (equivalent rectangular bandwidth) formula was adopted for auditory filter bandwidths, and Patterson et al.'s [23] gammatone filters were used. On the basis of these early analysis stages, a set of parameters derived from the psychoacoustic literature [24] and from studies on timbre [10] was calculated. These parameters were chosen by way of an empirical loop that consisted of listening to the sounds with respect to their relative positions in the multidimensional space in order to isolate the kind of variation that characterized a given dimension. Next, hypotheses on the new physical or psychoacoustic indices that were appropriate for the stimuli under study were made, and the analyses were performed to extract the quantitative values. Finally, the correlation of each parameter with the coordinates of the sound samples along the perceptual dimensions was evaluated. A good correlation is taken as an indication that the parameter is a strong candidate for a quantitative predictor of the perceptual dimension.

5.2. Correlations

Table II presents the correlation coefficients between the perceptual dimensions and the physical or psychoacoustic parameters for each sample group. The probability (p) that the perceptual coordinates and the parameter values are independent (or unrelated) is given as well. The variance in the perceptual dimension explained by the predictor parameter can be determined by taking the square of the correlation coefficient. Finally, for each of the dimensions, a quantitative descriptor was determined. It is important to note that in equalizing loudness over the set of samples (Groups 3 and 4), other dimensions emerged that correlated with parameters not revealed in the analysis of Groups 1 and 2, with loudness variation. This kind of context effect is particularly important to emphasize, since it demonstrates that the salient perceptual parameters one obtains in such a study depend very strongly on the kinds of physical variation present in the stimulus set. The preference analysis to be examined in the next section will determine the utility for preference of each of the predictor parameters. Those that have a strong weight can thus in turn be equalized over the sound set in a subsequent stage of experimentation. As such, this approach allows a progressive minimization of perceived differences in terms of preference or unpleasantness between the stimuli.

Table II. Correlation coefficients between perceptual coordinates and objective parameters. The probability p that the two measures are independent is indicated for p < 0.01 ( ) and for p < 0.05 (†). Parameter P1 = Loudness.

                 Group 1             Group 2
Parameter     Dim 1    Dim 2      Dim 1    Dim 2
P1            -0.92    -0.06      -0.91     0.28
P2             0.09     0.80        -        -
P3              -        -         0.43    -0.88

                 Group 3                      Group 4
Parameter     Dim 1    Dim 2     Dim 3     Dim 1    Dim 2
P2             0.35    -0.7[?]   -0.14     -0.93    -0.29
P3              -        -         -       -0.51†    0.86
P4            -0.81     0.32     -0.33       -        -
P5            -0.32     0.00     -0.83       -        -

6. Preference Analysis: A Thurstone Case V Preference Model with Spline Transformations

6.1. Procedure

At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Then all pairs of different samples were presented in a random order. On each trial, the subject heard a pair of samples only once and had to choose which sound was preferred. The data consist of the proportion of subjects that preferred one sample over another for each pair among the N stimuli.

6.2. Stages in the preference analysis

This phase of the study attempts to construct a preference map. De Soete and Winsberg [25] developed a Thurstonian pairwise choice model with univariate and multivariate spline transformations. In their preference model, the probability that stimulus i is preferred over stimulus j is defined as a function of the difference in utility of each stimulus or sample. The preference analysis program developed by De Soete and Winsberg is presently limited to two dimensions, generally considered as adequate to describe a preference space (as is the case with Groups 1, 2, and 4). If three perceptual dimensions are recovered in an MDS analysis (as with Group 3), all pairs of candidates may be tested. The model seeks a function that transforms the values of the physical parameters a_i and b_i for stimulus i into a utility value u_i that depends on the preference probability. The dependence is expressed as follows:

    P_{ij} = \Phi(u_i - u_j)    (1)

where P_{ij} is the probability of preferring sample i over sample j, \Phi is the cumulative normal (Gaussian) function, and u_i is the utility of sample i.

The greater and more positive the difference in utility between the two samples, the more sample i will be preferred over sample j.

Two types of models were tested: additive and multivariate. In the additive model (2), it is assumed that the contributions of the two parameters are independent and that the global utility of a sample is the sum of the utilities of each parameter. Thus u is a function of the predictor parameters (a and b), each having its own transform function (f and g, respectively):

    u = f(a) + g(b)    (2)

The functions f and g are splines, that is, piecewise polynomial functions, the number and order of each polynomial being variable. The order of the spline is taken to be equal to the maximum degree of the polynomial plus one. The junctions between the polynomials of a given spline correspond to the internal knots and are of finite number. Attention is restricted to splines with maximal continuity at each knot. That is, for a spline of order k, the function and its first k - 2 derivatives are continuous. The default knot positions are located at a ladder of quantiles; thus an equal number of observations is used to determine each spline parameter. Each spline of a given degree can be represented as a linear combination of a set of basis functions.

In the multivariate model (3), the utility is a function of both parameters:

    u = f(a, b)    (3)

This multivariate function is represented by a multivariate spline that can be defined as a linear combination of tensor products of univariate basis splines. In this model, therefore, the effect of one parameter on utility depends on the value of the other parameter.

For a preference matrix and a given set of predictor parameters, a large number of models may be tested by varying the type of model (additive or multivariate) and the order and number of internal knots of the splines. BIC is used to choose between the candidate models. For a given preference matrix, this criterion can also be used to choose from among various pairs of predictor parameters that had more or less similar correlations with the perceptual space. This latter choice is of a heuristic nature, since the correlations with the different perceptual dimensions are also taken into account. In general, the best model is chosen for each set of parameters, and the resulting utility functions are examined in order to compare the different sets of parameters in terms of their interpretability. The analyses yield the preference function for each dimension for an additive model and the preference surface for a multivariate model.

As an example, the utility functions for acoustic level and another parameter (parameter 2) are shown in Figure 3, in which the additive model was retained. Not surprisingly, the utility function for level is more or less monotonic: the more the level increases, the more the utility for preference decreases. The variation in utility with the other parameter is nonmonotonic; however, its overall influence on preference is comparatively small and is completely overpowered by the effect of level variation. For the level-equalized sample sets, the parameters had more equivalent influence on preference, and some of these were clearly nonmonotonic, indicating that there are zones in the middle of the range of variation that are maximally or minimally preferred by the panel of listeners.

Figure 3. Evolution of the utility of Leq (dBA) and parameter 2 from the stimuli of Group 2. The circles (o) correspond to the stimuli and the asterisks (*) correspond to the placement of the internal knots of the spline.

7. Conclusions

The data analysis reveals representations of the sets of car sounds that have several perceptual dimensions for both combinations of gear setting and motor speed. The loudness equalization allowed the emergence of other factors contributing both to similarity perception and, quite notably, to preference. When loudness is a factor in the comparison of car sounds, it clearly dominates the preference judgments. The roles of the secondary parameters vary with gear setting and motor speed and globally have less importance in these judgments. Generally, the perceptual salience of the dimensions is not the same for the two gear setting/motor speed combinations, indicating that the relative importance of perceptual cues evolves with changes in car functioning and context (equalized loudness or not).

This method, when applied to very different sets of sound objects (musical instruments and car sounds), thus reveals 1) a multidimensional mental representation of complex sound events, 2) a close relation between the perceptual dimensions and acoustic properties or their auditory transforms, and 3) specific properties of certain sounds that affect their perception and comparison with other sounds. For musical instrument sounds, differences between listeners in the perceptual importance accorded to the different dimensions and specificities have also been found. While such differences were not found in the studies on car sounds mentioned above, they should be systematically tested for in such experiments in case they do exist. The coherence of the results across these different stimulus sets indicates a great stability in the perceptual processes involved in the appreciation of homogeneous sets of sounds. Further, it is important to emphasize that, aside from loudness, none of the objective parameters revealed corresponded to those currently included by default in most sound quality measuring devices. A cautionary note on a methodological issue is warranted here, however. In another similar study with a set of environmental sounds that were extremely heterogeneous, a perceptual structure was found that was strongly categorical in nature and which was associated with a high degree of identification of the individual sound sources by the listeners [26]. Judgments of dissimilarity seemed in this case to be based entirely on the categories to which the stimuli belonged rather than on their individual acoustic and sensory properties. Thus this predominant cognitive factor (recognition, classification, and identification of the sound source [12]) rendered inappropriate the conception of the perceptual space in terms of continuous underlying dimensions. The above techniques of preference analysis are not applicable in such cases, and other techniques need to be used. However, for homogeneous sound sets, the experimental approach described above is a powerful method for understanding the perceptual underpinnings of sound quality.

References

[1] S. S. Stevens: The direct estimation of sensory magnitudes - loudness. American Journal of Psychology 69 (1956) 1-25.

[2] S. S. Stevens, E. H. Galanter: Ratio scales and category scales for a dozen perceptual continua. Journal of Experimental Psychology 54 (1957) 377-411.

[3] E. Zwicker, K. Deuter, W. Peisl: Loudness meters based on ISO 532 B with large dynamic range. InterNoise 85, Proceedings 1985 International Conference on Noise Control Engineering, 1985, 1119-1122.

[4] B. Berglund, U. Berglund, T. Lindvall: Scaling loudness, noisiness, and annoyance of aircraft noise. J. Acoust. Soc. Am. 57 (1975) 930-934.

[5] E. A. Björk: The perceived quality of natural sounds. Acustica 57 (1985) 185-188.

[6] J. M. Grey: Multidimensional perceptual scaling of musical timbres. J. Acoust. Soc. Am. 61 (1977) 1270-1277.

[7] C. L. Krumhansl: Why is musical timbre so hard to understand? In: Structure and Perception of Electroacoustic Sound and Music. S. Nielzén, O. Olsson (eds.). Amsterdam, 1989, 43-53.

[8] S. McAdams, S. Winsberg, S. Donnadieu, G. De Soete, J. Krimphoff: Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research 58 (1995) 177-192.

[9] S. Winsberg, G. De Soete: A latent class approach to fitting the weighted Euclidean model, CLASCAL. Psychometrika 58 (1993) 315-330.

[10] J. Krimphoff, S. McAdams, S. Winsberg: Caractérisation du timbre des sons complexes. II: Analyses acoustiques et quantification psychophysique. Journal de Physique 4(C5) (1994) 625-628.

[11] P. Susini, S. McAdams, S. Winsberg: Caractérisation perceptive des bruits de véhicules. Actes du 4ème Congrès Français d'Acoustique 1 (1997) 543-546.

[12] S. McAdams: Recognition of auditory sources and events. In: Thinking in Sound: The Cognitive Psychology of Human Audition. S. McAdams, E. Bigand (eds.). Oxford University Press, Oxford, 1993, 146-198.

[13] K. Genuit: Investigations and simulation of vehicle noise using the binaural measurement technique. Noise and Vibration Conference, Traverse City, Michigan, 1987, 28-30.

[14] C. Vogel, V. Maffiolo, J.-D. Polack, M. Castellengo: Validation subjective de la prise de son extérieur. 4ème Congrès Français d'Acoustique, Marseille, France, 1997, 307-3[?]0.

[15] B. Everitt: Cluster analysis. John Wiley, New York, USA.

[16] B. Smith: PsiExp: an environment for psychoacoustic experimentation using the IRCAM musical workstation. Society for Music Perception and Cognition Conference '95, University of California, Berkeley, USA, 1995.

[17] S. Winsberg, G. De Soete: An extended Euclidean model for multidimensional scaling. Proceedings of the IFCS, Kobe, Japan, 1996.

[18] G. Schwarz: Estimating the dimension of a model. Annals of Statistics 6 (1978) 461-464.

[19] A. C. Hope: A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, Series B 30 (1968) 582-598.

[20] A. V. Oppenheim, R. W. Schafer: Digital signal processing. Prentice-Hall, NJ, 1975.

[21] J. O. Pickles: An introduction to the physiology of hearing. 2nd ed. Academic Press, London, 1988.

[22] B. C. J. Moore, B. R. Glasberg: Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753.

[23] R. D. Patterson, M. H. Allerhand, C. Giguère: Time-domain modeling of peripheral auditory processing: A modular architecture and a software platform. J. Acoust. Soc. Am. 98 (1995) 1890-1894.

[24] E. Zwicker, H. Fastl: Psychoacoustics: Facts and models. Springer-Verlag, Berlin, 1990.

[25] G. De Soete, S. Winsberg: A Thurstonian pairwise choice model with univariate and multivariate spline transformations. Psychometrika 58 (1993) 233-256.

[26] P. Susini, N. Misdariis, S. Winsberg, S. McAdams: Caractérisation perceptive de bruits. Acoustique et Techniques 13 (1998) 11-15.

Page 4: ( -G>F

ACUSTICAmiddot acta acusticaVo1S5 (1999) Susini et al Applying psychoacoustics 653

03

02

g 010c

~ deg15-01

-02

-03

02

deg-02

Dimension II -04 -04Dimension I

04

Figure 2 Perceptual space of the stimuli fromGroup 3 derived from dissimilarity scaling

stimulus i and stimulus j developed by Winsberg and DeSoete [9 17] and presented in Figure 1 postulates com-mon dimensions shared by all stimuli specific attributes orspecificities particular to each stlmulus and latent classeseach of which have different saliences or weights for eachof the common dimensions and the set of specificitiesMaximum likelihood estimates of the model parametersare determined (that is the coordinates of each stimuluson each dimension of the common space the specifities ofthe stimuli the dimension weights for each class and theproportion of subjects in each class) An EM (expectation-maximization) algorithm is used to determine the classstructure (See [9] for a complete description of the CLAS-CAL algorithm and the latent class approch to MOSSpace does not permit an adequate description here) Theclass structure is latent there is no apriori assumption con-cerning the latent class to which a given subject belongsModel selection including the choice of the appropriatenumber of latent classes the number of common dimen-sions and the presence or absence of specificities is basedon the BIC statistic which is information based and de-pends on the log likelihood the number of model param-eters and the number of observations [18] as well as on aMonte Carlo procedure [19] Models with the lowest BICvalues are chosen The Monte Carlo procedure is used todetermine the appropriate numh~r of latent classes and toverify the spatial model (See [9] for a discussion on modelselection) The CLASCAL analyses yield a spatial repre-sentation of the N stimuli on the R dimensions the speci-ficity of each stimulus the probability that each subjectbelongs to each latent class (generally equal to one forone of the classes) and the weights or saliences of eachperceptual dimension for each class Also included in theoutput results are the model variance the log likelihoodthe value of the BIC statistic and the results of the MonteCarlo procedure if used

44 Results

A single class was sufficient for all data sets An anal-ysis with the number of dimensions varying between Iand 6 without specificities and between I and 5 withspecificities was then performed for a single subject classBIC indicated that the best model was three-dimensionalwith specificities for Group 3 and two-dimensional withspecificities for the other groups A graphic representa-tion of the data corresponding to the sounds of Group 3 isgiven in Figure 2 Each sample is represented in the three-dimensional common space The specifities of the stimuliare not shown in the figure

5 Physical Parameters Underlying the PerceptualSpace

51 Acoustic analysis

Once the perceptual configuration is obtained it is im-portant to give a physical interpretation An interpretationmeans some systematic relationship between the stimuluscharacteristics and the locations in the space

An appropriate spectral representation for the signalanalyses is chosen Since the car sounds studied (constantmotor speed) had a quasi-stationary character the methodof Welch [20] was used In order to take into account audi-tory response as a function of level and frequency the ana-lytic formulations of different classic weighting functionswere used (dBA dBB dBC) A physiological modelingapproach was also used in which the operation of the pe-ripheral auditory system was considered at two levels thefiltering due to the external and middle ears was modeledwith a second-order Butterworth bandpass filter with -3-dB cutofffrequencies of 450 Hz and 8500 Hz [21] As forthe cochlea Moore and Glasbergs [22] ERB (equivalent

654 Susini et al.: Applying psychoacoustics. ACUSTICA · acta acustica, Vol. 85 (1999)

6 Preference Analysis: A Thurstone Case V Preference Model with Spline Transformations

6.1 Procedure

At the beginning of the session, the subject listened to all the samples in a random order to get a sense of the range of variation possible. Then all pairs of different samples were presented in a random order. On each trial, the subject heard a pair of samples only once and had to choose which sound was preferred. The data consist of the proportion of subjects that preferred one sample over another for each pair of the N stimuli.
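As a minimal sketch of how such paired-comparison data might be tabulated, the function below turns a list of trials into a matrix of preference proportions. The trial format `(i, j, chosen)`, the function name `preference_proportions` and the example data are assumptions made for illustration; they do not describe the software used in the study.

```python
def preference_proportions(trials, n_stimuli):
    """Return P where P[i][j] is the proportion of trials on pair (i, j)
    in which stimulus i was chosen; None where the pair was never presented."""
    wins = [[0] * n_stimuli for _ in range(n_stimuli)]
    for i, j, chosen in trials:
        if i == j or chosen not in (i, j):
            raise ValueError("a trial needs two different stimuli and a winner among them")
        other = j if chosen == i else i
        wins[chosen][other] += 1

    P = [[None] * n_stimuli for _ in range(n_stimuli)]
    for i in range(n_stimuli):
        for j in range(n_stimuli):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                P[i][j] = wins[i][j] / total
    return P


# Hypothetical data: four listeners each judging the three pairs of three samples.
trials = [
    (0, 1, 0), (0, 1, 0), (0, 1, 0), (0, 1, 1),
    (0, 2, 0), (0, 2, 0), (0, 2, 2), (0, 2, 2),
    (1, 2, 2), (1, 2, 2), (1, 2, 2), (1, 2, 2),
]
P = preference_proportions(trials, 3)
print(P[0][1], P[0][2], P[1][2])
```

By construction P[i][j] + P[j][i] = 1 for every presented pair, so only one triangle of the matrix carries independent information.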

6.2 Stages in the preference analysis

This phase of the study attempts to construct a preference map. De Soete and Winsberg [25] developed a Thurstonian pairwise choice model with univariate and multivariate spline transformations. In their preference model, the probability that stimulus i is preferred over stimulus j is defined as a function of the difference in utility of each stimulus, or sample. The preference analysis program developed by De Soete and Winsberg is presently limited to two dimensions, generally considered as adequate to describe a preference space (as is the case with Groups 1, 2 and 4). If three perceptual dimensions are recovered in an MDS analysis (as with Group 3), all pairs of candidates may be tested. The model seeks a function that associates the values of the physical parameters $a_i$ and $b_i$ for stimulus i with a utility value $u_i$ that depends on the preference probability. The dependence is expressed as follows:

$P_{ij} = \Phi(u_i - u_j)$   (1)

where $P_{ij}$ is the probability of preferring sample i over sample j, $\Phi$ is the cumulative normal (Gaussian) function, and $u_i$ is the utility of sample i.

The greater and more positive the difference in utility between the two samples, the more sample i will be preferred over sample j.

Two types of models were tested: additive and multivariate. In the additive model (2), it is assumed that the contributions of the two parameters are independent and that the global utility of a sample is the sum of the utilities of each parameter. Thus, u is a function of the predictor parameters (a and b), each having its own transform function (f and g, respectively):

$u = f(a) + g(b)$   (2)
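The relation between utilities and choice probabilities can be sketched numerically. The example below assumes the Case V form in which the preference probability is the standard normal cumulative function of the utility difference, and an additive utility u = f(a) + g(b). The transforms f and g are stood in for by piecewise-linear (order-2) splines whose knots and values are invented; they are not the fitted functions of the study, and all function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def preference_probability(u_i, u_j):
    """Probability that sample i is preferred over sample j (Case V form)."""
    return normal_cdf(u_i - u_j)


def piecewise_linear(knots, values):
    """Order-2 spline through (knots, values); constant outside the knot range."""
    def transform(x):
        if x <= knots[0]:
            return values[0]
        if x >= knots[-1]:
            return values[-1]
        for k in range(len(knots) - 1):
            if knots[k] <= x <= knots[k + 1]:
                t = (x - knots[k]) / (knots[k + 1] - knots[k])
                return values[k] + t * (values[k + 1] - values[k])
    return transform


# Invented transforms: utility falls with level, and peaks mid-range for b.
f = piecewise_linear([60.0, 70.0, 80.0], [1.0, 0.0, -1.5])
g = piecewise_linear([0.0, 1.0, 2.0], [-0.2, 0.2, -0.2])


def additive_utility(a, b):
    """Additive model: each parameter contributes independently."""
    return f(a) + g(b)


u_quiet = additive_utility(65.0, 0.5)  # quieter sample
u_loud = additive_utility(75.0, 1.0)   # louder sample with a more favourable b
p = preference_probability(u_quiet, u_loud)
print(round(p, 3))
```

In this invented example the level term dominates: the louder sample has the more favourable value of b, yet the quieter one is still predicted to be chosen in most comparisons.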

rectangular bandwidth) formula was adopted for auditory filter bandwidths, and Patterson et al.'s [23] gammatone filters were used. On the basis of these early analysis stages, a set of parameters derived from the psychoacoustic literature [24] and from studies on timbre [10] was calculated. These parameters were chosen by way of an empirical loop that consisted of listening to the sounds with respect to their relative positions in the multidimensional space in order to isolate the kind of variation that characterized a given dimension. Next, hypotheses were made on the new physical or psychoacoustic indices that were appropriate for the stimuli under study, and the analyses were performed to extract the quantitative values. Finally, the correlation of each parameter with the coordinates of the sound samples along the perceptual dimensions was evaluated. A good correlation is taken as an indication that the parameter is a strong candidate for a quantitative predictor of the perceptual dimension.

Table II. Correlation coefficients between perceptual coordinates and objective parameters. The probability p that the two measures are independent is indicated for p < 0.01 (*) and for p < 0.05 (†). Parameter P1 = Loudness.

             Group 1          Group 2
Parameter    Dim 1   Dim 2    Dim 1   Dim 2
P1           -0.92   -0.06    -0.91    0.28
P2            0.09    0.80      -       -
P3             -       -       0.43   -0.88

             Group 3                  Group 4
Parameter    Dim 1   Dim 2   Dim 3    Dim 1   Dim 2
P2            0.35   -0.7    -0.14    -0.93   -0.29
P3             -       -       -      -0.51†   0.86
P4           -0.81    0.32   -0.33      -       -
P5           -0.32    0.00   -0.83      -       -

5.2 Correlations

Table II presents the correlation coefficients between the perceptual dimensions and the physical or psychoacoustic parameters for each sample group. The probability (p) that the perceptual coordinates and the parameter values are independent (or unrelated) is given as well. The variance in the perceptual dimension explained by the predictor parameter can be determined by taking the square of the correlation coefficient. Finally, for each of the dimensions a quantitative descriptor was determined. It is important to note that in equalizing loudness over the set of samples (Groups 3 and 4), other dimensions emerged that correlated with parameters not revealed in the analysis of Groups 1 and 2, in which loudness varied. This kind of context effect is particularly important to emphasize, since it demonstrates that the salient perceptual parameters one obtains in such a study depend very strongly on the kinds of physical variation present in the stimulus set. The preference analysis to be examined in the next section will determine the utility for preference of each of the predictor parameters. Those that have a strong weight can thus in turn be equalized over the sound set in a subsequent stage of experimentation. As such, this approach allows a progressive minimization of perceived differences in terms of preference, or unpleasantness, between the stimuli.
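The step from a correlation coefficient to explained variance can be made concrete with a short sketch. It assumes an ordinary Pearson correlation between the coordinates of the samples on one perceptual dimension and the values of one candidate parameter; the coordinates and loudness values below are invented, and the function name `pearson_r` is hypothetical. As a worked case, a coefficient of -0.92 corresponds to 0.92² ≈ 0.85, that is, about 85% of the variance on that dimension.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented example: coordinates on one perceptual dimension and a loudness
# value for each of six samples.
coordinates = [-1.3, -0.7, -0.2, 0.3, 0.8, 1.1]
loudness = [31.0, 27.5, 26.0, 22.0, 20.5, 17.0]

r = pearson_r(coordinates, loudness)
explained = r ** 2  # share of variance on the dimension explained by loudness
print(round(r, 3), round(explained, 3))
```

The sign of r only reflects the arbitrary orientation of the MDS axis; it is the squared value that measures how well the parameter predicts the dimension.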

The functions f and g are splines, that is, piecewise polynomial functions, the number and order of the polynomials being variable. The order of the spline is taken to be equal to the maximum degree of the polynomial plus one. The junctions between the polynomials of a given spline correspond to the internal knots and are finite in number. Attention is restricted to splines with maximal continuity at each knot. That is, for a spline of order k, the function and its first k - 2 derivatives are continuous. The default knot positions are located at a ladder of quantiles; thus an equal number of observations is used to determine each spline parameter. Each spline of a given degree can be represented as a linear combination of a set of basis functions.

In the multivariate model (3), the utility is a function of both parameters:

$u = f(a, b)$   (3)

This multivariate function is represented by a multivariate spline that can be defined as a linear combination of tensor products of univariate basis splines. In this model, therefore, the effect of one parameter on utility depends on the value of the other parameter.

For a preference matrix and a given set of predictor parameters, a large number of models may be tested by varying the type of model (additive or multivariate) and the order and number of internal knots of the splines. BIC is used to choose between the candidate models. For a given preference matrix, this criterion can also be used to choose from among various pairs of predictor parameters that had more or less similar correlations with the perceptual space. This latter choice is of a heuristic nature, since the correlations with the different perceptual dimensions are also taken into account. In general, the best model is chosen for each set of parameters, and the resulting utility functions are examined in order to compare the different sets of parameters in terms of their interpretability. The analyses yield the preference function for each dimension for an additive model and the preference surface for a multivariate model.

As an example, the utility functions for acoustic level and another parameter (parameter 2) are shown in Figure 3, in which the additive model was retained. Not surprisingly, the utility function for level is more or less monotonic: the more the level increases, the more the utility for preference decreases. The variation in utility with the other parameter is nonmonotonic; however, its overall influence on preference is comparatively small and is completely overpowered by the effect of level variation. For the level-equalized sample sets, the parameters had more equivalent influence on preference, and some of these were clearly nonmonotonic, indicating that there are zones in the middle of the range of variation that are maximally or minimally preferred by the panel of listeners.

[Figure 3: utility plotted against Leq (dBA) and against parameter 2]

Figure 3. Evolution of the utility of Leq (dBA) and parameter 2 for the stimuli of Group 2. The circles (o) correspond to the stimuli and the asterisks (*) correspond to the placement of the internal knots of the spline.

7 Conclusions

The data analysis reveals representations of the sets of car sounds that have several perceptual dimensions for both combinations of gear setting and motor speed. The loudness equalization allowed the emergence of other factors contributing both to similarity perception and, quite notably, to preference. When loudness is a factor in the comparison of car sounds, it clearly dominates the preference judgments. The role of the secondary parameters varies with gear setting and motor speed, and globally they have less importance in these judgments. Generally, the perceptual salience of the dimensions is not the same for the two gear setting/motor speed combinations, indicating that the relative importance of perceptual cues evolves with changes in car functioning and context (equalized loudness or not).

This method, when applied to very different sets of sound objects (musical instruments and car sounds), thus reveals 1) a multidimensional mental representation of complex sound events, 2) a close relation between the perceptual dimensions and acoustic properties or their auditory transforms, and 3) specific properties of certain sounds that affect their perception and comparison with other sounds. For musical instrument sounds, differences between listeners in the perceptual importance accorded to the different dimensions and specificities have also been found. While such differences were not found in the studies on car sounds mentioned above, they should be systematically verified in such experiments in case they do exist. The coherence of the results across these different stimulus sets indicates a great stability in the perceptual processes involved in the appreciation of homogeneous sets of sounds. Further, it is important to emphasize that, aside from loudness, none of the objective parameters revealed corresponded to those currently included by default in most sound quality measuring devices. A cautionary note on a methodological issue is warranted here, however. In another similar study with a set of environmental sounds that were extremely heterogeneous, a perceptual structure was found that was strongly categorical in nature and which was associated with a high degree of identification of the individual sound sources by the listeners [26]. Judgments of dissimilarity seemed in this case to be based entirely on the categories to which the stimuli belonged and less on their individual acoustic and sensory properties. Thus, this predominant cognitive factor (recognition, classification, and identification of the sound source [12]) rendered inappropriate the conception of the perceptual space in terms of continuous underlying dimensions. The above techniques of preference analysis are not applicable in such cases, and other techniques need to be used. However, for homogeneous sound sets, the experimental approach described above is a powerful method for understanding the perceptual underpinnings of sound quality.

References

[1] S. S. Stevens: The direct estimation of sensory magnitudes - loudness. American Journal of Psychology 69 (1956) 1-25.

[2] S. S. Stevens, E. H. Galanter: Ratio scales and category scales for a dozen perceptual continua. Journal of Experimental Psychology 54 (1957) 377-411.

[3] E. Zwicker, K. Deuter, W. Peisl: Loudness meters based on ISO 532 B with large dynamic range. InterNoise 85, Proceedings 1985 International Conference on Noise Control Engineering, 1985, 1119-1122.

[4] B. Berglund, U. Berglund, T. Lindvall: Scaling loudness, noisiness, and annoyance of aircraft noise. J. Acoust. Soc. Am. 57 (1975) 930-934.

[5] E. A. Bjork: The perceived quality of natural sounds. Acustica 57 (1985) 185-188.

[6] J. M. Grey: Multidimensional perceptual scaling of musical timbres. J. Acoust. Soc. Am. 61 (1977) 1270-1277.

[7] C. L. Krumhansl: Why is musical timbre so hard to understand? In: Structure and Perception of Electroacoustic Sound and Music. S. Nielzen, O. Olsson (eds.). Amsterdam, 1989, 43-53.

[8] S. McAdams, S. Winsberg, S. Donnadieu, G. De Soete, J. Krimphoff: Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research 58 (1995) 177-192.

[9] S. Winsberg, G. De Soete: A latent class approach to fitting the weighted Euclidean model, CLASCAL. Psychometrika 58 (1993) 315-330.

[10] J. Krimphoff, S. McAdams, S. Winsberg: Caractérisation du timbre des sons complexes. II: Analyses acoustiques et quantification psychophysique. Journal de Physique 4(C5) (1994) 625-628.

[11] P. Susini, S. McAdams, S. Winsberg: Caractérisation perceptive des bruits de véhicules. Actes du 4ème Congrès Français d'Acoustique 1 (1997) 543-546.

[12] S. McAdams: Recognition of auditory sources and events. In: Thinking in Sound: The Cognitive Psychology of Human Audition. S. McAdams, E. Bigand (eds.). Oxford University Press, Oxford, 1993, 146-198.

[13] K. Genuit: Investigations and simulation of vehicle noise using the binaural measurement technique. Noise and Vibration Conference, Traverse City, Michigan, 1987, 28-30.

[14] C. Vogel, V. Maffiolo, J.-D. Polack, M. Castellengo: Validation subjective de la prise de son extérieur. 4ème Congrès Français d'Acoustique, Marseille, France, 1997, 307-310.

[15] B. Everitt: Cluster analysis. John Wiley, New York, USA.

[16] B. Smith: PsiExp: an environment for psychoacoustic experimentation using the IRCAM musical workstation. Society for Music Perception and Cognition Conference '95, University of California, Berkeley, USA, 1995.

[17] S. Winsberg, G. De Soete: An extended Euclidean model for multidimensional scaling. Proceedings of the IFCS, Kobe, Japan, 1996.

[18] G. Schwarz: Estimating the dimension of a model. Annals of Statistics 6 (1978) 461-464.

[19] A. C. Hope: A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, Series B 30 (1968) 582-598.

[20] A. V. Oppenheim, R. W. Schafer: Digital signal processing. Prentice-Hall, NJ, 1975.

[21] J. M. Pickles: An introduction to the physiology of hearing. 2nd ed. Academic Press, London, 1988.

[22] B. C. J. Moore, B. R. Glasberg: Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753.

[23] R. D. Patterson, M. H. Allerhand, C. Giguère: Time-domain modeling of peripheral auditory processing: A modular architecture and a software platform. J. Acoust. Soc. Am. 98 (1995) 1890-1894.

[24] E. Zwicker, H. Fastl: Psychoacoustics: Facts and models. Springer-Verlag, Berlin, 1990.

[25] G. De Soete, S. Winsberg: A Thurstonian pairwise choice model with univariate and multivariate spline transformations. Psychometrika 58 (1993) 233-256.

[26] P. Susini, N. Misdariis, S. Winsberg, S. McAdams: Caractérisation perceptive de bruits. Acoustique et Techniques 13 (1998) 11-15.

Page 6: ( -G>F

ACUSTlCA acta acusticaVol 85 (1999) Susini et a1Applying psychoacoustics 655

15

-1_511 --_--- -b~_1

Leq (dElA)

of parameters in terms of their interpretability The anal-yses yield the preference function for each dimension foran additive model and the preference surface for a multi-variate model

As an example the utility functions for acoustic leveland another parameter (parameter 2) are shown in Figure 3in which the additive model was retained Not surpris-ingly the utility function for level is more or less mono-tonic the more the level increases the more the utilityfor preference decreases The variation in utility with theother parameter is nonmontonic however its overall in-fluence on preference is comparatively small and is com-pletely overpowered by the effect of level variation Forthe level-equalized sample sets the parameters had moreequivalent influence on preference and some of these wereclearly nonmonotonic indicating that there are zones inthe middle of the range of variation that are maximally orminimally preferred by the panel of listeners

11105paramete 2

N 1

~ 05 ~~~ -4

- 00~-O5

5 -1

-15

Figure 3 Evolution of the utility ofLeq (dBA) and parameter 2 fromthe stimuli of group 2 The circles (0) correspond to the stimuli andthe asterics () correspond to the placement of the internal knots ofthe spline

7 Conclusions

The functions f and g are spline that is piecewise poly-nomial functions the number and order of each polyno-mial being variable The order of the spline is taken to beequal to the maximum degree of the polynomial plus oneThe junctions between the polynomials of a given splinecorrespond to the internal knots and are of finite num-ber Attention is restricted to splines with maximal con-tinuity at each knot That is for a spline of order k thefunction and its k - 2 derivatives are continuous The de-fault knot positions are located at a ladder of quantilesthus an equal number of observations is used to determineeach spline parameter Each spEne of a given degree canbe represented as a linear comhination of a set of basisfunctions

In the multivariate model (3), the utility is a function of both parameters:

u = f(a, b)    (3)

This multivariate function is represented by a multivariate spline that can be defined as a linear combination of tensor products of univariate basis splines. In this model, therefore, the effect of one parameter on utility depends on the value of the other parameter.

For a preference matrix and a given set of predictor parameters, a large number of models may be tested by varying the type of model (additive or multivariate) and the order and number of internal knots of the splines. BIC is used to choose between the candidate models. For a given preference matrix, this criterion can also be used to choose from among various pairs of predictor parameters that had more or less similar correlations with the perceptual space. This latter choice is of a heuristic nature, since the correlations with the different perceptual dimensions are also taken into account. In general, the best model is chosen for each set of parameters, and the resulting utility functions are examined in order to compare the different sets.
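The model comparison by BIC can be sketched as follows. This is a minimal illustration using the standard form of the criterion, BIC = -2 log L + p log n, where L is the maximized likelihood, p the number of free parameters, and n the number of observations; the candidate models, their log-likelihoods, and their parameter counts are hypothetical values chosen only to show the mechanics.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_model(candidates, n_obs):
    """Return the candidate with the smallest BIC."""
    return min(
        candidates,
        key=lambda m: bic(m["log_likelihood"], m["n_params"], n_obs),
    )

# Hypothetical fits of an additive and a multivariate spline model.
candidates = [
    {"name": "additive", "log_likelihood": -120.0, "n_params": 6},
    {"name": "multivariate", "log_likelihood": -118.0, "n_params": 12},
]
best = select_model(candidates, n_obs=200)
```

In this toy case the multivariate model fits slightly better, but its extra parameters are penalized more heavily than the gain in likelihood, so the additive model is retained.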

The data analysis reveals representations of the sets of car sounds that have several perceptual dimensions for both combinations of gear setting and motor speed. The loudness equalization allowed the emergence of other factors contributing both to similarity perception and, quite notably, to preference. When loudness is a factor in the comparison of car sounds, it clearly dominates the preference judgments. The roles of the secondary parameters vary with gear setting and motor speed and globally have less importance in these judgments. Generally, the perceptual salience of the dimensions is not the same for the two gear setting/motor speed combinations, indicating that the relative importance of perceptual cues evolves with changes in car functioning and context (equalized loudness or not).

This method, when applied to very different sets of sound objects (musical instruments and car sounds), thus reveals 1) a multidimensional mental representation of complex sound events, 2) a close relation between the perceptual dimensions and acoustic properties or their auditory transforms, and 3) specific properties of certain sounds that affect their perception and comparison with other sounds. For musical instrument sounds, differences between listeners in the perceptual importance accorded to the different dimensions and specificities have also been found. While such differences were not found in the studies on car sounds mentioned above, they should be systematically verified in such experiments in case they do exist. The coherence of the results across these different stimulus sets indicates a great stability in the perceptual processes involved in the appreciation of homogeneous sets of sounds. Further, it is important to emphasize that, aside from loudness, none of the objective parameters revealed corresponded to those currently included by default in most sound quality measuring devices. A cautionary note on a methodological issue is warranted here, however. In another similar study with a set of environmental sounds that were extremely heterogeneous, a perceptual structure was found that was strongly categorical in nature and which was associated with a high degree of identification of the individual sound sources by the listeners [26]. Judgments of dissimilarity seemed in this case to be based entirely on the categories that the stimuli belonged to and less on their individual acoustic and sensory properties. Thus this predominant cognitive factor (recognition, classification, and identification of the sound source [12]) rendered inappropriate the conception of the perceptual space in terms of continuous underlying dimensions. The above techniques of preference analysis are not applicable in such cases, and other techniques need to be used. However, for homogeneous sound sets, the experimental approach described above is a powerful method for understanding the perceptual underpinnings of sound quality.

References

[1] S. S. Stevens: The direct estimation of sensory magnitudes: loudness. American Journal of Psychology 69 (1956) 1-25.

[2] S. S. Stevens, E. H. Galanter: Ratio scales and category scales for a dozen perceptual continua. Journal of Experimental Psychology 54 (1957) 377-411.

[3] E. Zwicker, K. Deuter, W. Peisl: Loudness meters based on ISO 532 B with large dynamic range. InterNoise 85, Proceedings 1985 International Conference on Noise Control Engineering, 1985, 1119-1122.

[4] B. Berglund, U. Berglund, T. Lindvall: Scaling loudness, noisiness, and annoyance of aircraft noise. J. Acoust. Soc. Am. 57 (1975) 930-934.

[5] E. A. Bjork: The perceived quality of natural sounds. Acustica 57 (1985) 185-188.

[6] J. M. Grey: Multidimensional perceptual scaling of musical timbres. J. Acoust. Soc. Am. 61 (1977) 1270-1277.

[7] C. L. Krumhansl: Why is musical timbre so hard to understand? In: Structure and Perception of Electroacoustic Sound and Music. S. Nielzen, O. Olsson (eds.). Amsterdam, 1989, 43-53.

[8] S. McAdams, S. Winsberg, S. Donnadieu, G. De Soete, J. Krimphoff: Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research 58 (1995) 177-192.

[9] S. Winsberg, G. De Soete: A latent class approach to fitting the weighted Euclidean model, CLASCAL. Psychometrika 58 (1993) 315-330.

[10] J. Krimphoff, S. McAdams, S. Winsberg: Caractérisation du timbre des sons complexes. II: Analyses acoustiques et quantification psychophysique. Journal de Physique 4(C5) (1994) 625-628.

[11] P. Susini, S. McAdams, S. Winsberg: Caractérisation perceptive des bruits de véhicules. Actes du 4ème Congrès Français d'Acoustique 1 (1997) 543-546.

[12] S. McAdams: Recognition of auditory sources and events. In: Thinking in Sound: The Cognitive Psychology of Human Audition. S. McAdams, E. Bigand (eds.). Oxford University Press, Oxford, 1993, 146-198.

[13] K. Genuit: Investigations and simulation of vehicle noise using the binaural measurement technique. Noise and Vibration Conference, Traverse City, Michigan, 1987, 28-30.

[14] C. Vogel, V. Maffiolo, J.-D. Polack, M. Castellengo: Validation subjective de la prise de son extérieur. 4ème Congrès Français d'Acoustique, Marseille, France, 1997, 307-3 O.

[15] B. Everitt: Cluster analysis. John Wiley, New York, USA.

[16] B. Smith: PsiExp: an environment for psychoacoustic experimentation using the IRCAM musical workstation. Society for Music Perception and Cognition Conference '95, University of California, Berkeley, USA, 1995.

[17] S. Winsberg, G. De Soete: An extended Euclidean model for multidimensional scaling. Proceedings of the IFCS, Kobe, Japan, 1996.

[18] G. Schwarz: Estimating the dimension of a model. Annals of Statistics 6 (1978) 461-464.

[19] A. C. Hope: A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, Series B 30 (1968) 582-598.

[20] A. V. Oppenheim, R. W. Schafer: Digital signal processing. Prentice-Hall, NJ, 1975.

[21] J. O. Pickles: An introduction to the physiology of hearing. 2nd ed. Academic Press, London, 1988.

[22] B. C. J. Moore, B. R. Glasberg: Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74 (1983) 750-753.

[23] R. D. Patterson, M. H. Allerhand, C. Giguère: Time-domain modeling of peripheral auditory processing: A modular architecture and a software platform. J. Acoust. Soc. Am. 98 (1995) 1890-1894.

[24] E. Zwicker, H. Fastl: Psychoacoustics: Facts and models. Springer-Verlag, Berlin, 1990.

[25] G. De Soete, S. Winsberg: A Thurstonian pairwise choice model with univariate and multivariate spline transformations. Psychometrika 58 (1993) 233-256.

[26] P. Susini, N. Misdariis, S. Winsberg, S. McAdams: Caractérisation perceptive de bruits. Acoustique et Techniques 13 (1998) 11-15.
