Page 1: Emotional Responses to the Perceptual Dimensions of Timbre: A …slegroux/publications/pubs/... · 2016. 4. 4. · Emotional Responses to the Perceptual Dimensions of Timbre: A Pilot

Emotional Responses to the Perceptual Dimensions of Timbre: A Pilot Study Using Physically Informed Sound Synthesis

Sylvain Le Groux¹ and Paul F.M.J. Verschure¹,²

¹ The Laboratory for Synthetic, Perceptive, Emotive and Cognitive Systems (SPECS), Universitat Pompeu Fabra, Roc Boronat, 138, 08018 Barcelona, Spain

² Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

{sylvain.legroux,paul.verschure}@upf.edu

Abstract. Music is well known for affecting human emotional states, and most people enjoy music because of the emotions it evokes. Yet, the relationship between specific musical parameters and emotional responses is not clear. A key question is how musical parameters can be mapped to emotional states of valence, arousal and dominance. Whereas many studies have focused on emotional responses to performance or structural musical parameters [6–8], little is known about the role of timbre on emotions. Here, we propose a sound synthesis-based approach to the study of sonic emotion using physically-inspired sound models, and investigate the emotional effects of three perceptually salient timbre descriptors, namely log-attack time, spectral centroid and spectral flux. Our results show that an increase in spectral centroid and spectral flux parameters produces an increased emotional response on the arousal scale. Moreover, we observed interaction effects between log-attack time and spectral flux on the dominance scale. This study suggests a rational approach to the design of emotion-driven music systems for multimedia installations, live performance and music therapy.

Key words: Timbre, Perception, Emotion, Physically Inspired Sound Synthesis

1 Perceptual Dimensions of Timbre

Sounds can be described by five main perceptual components, namely pitch, loudness, duration, timbre, and spatialization [25]. To a first degree of approximation, pitch, loudness and duration are unidimensional parameters which relate to fundamental frequency, sound level, and time respectively. On the other hand, timbre is notoriously difficult to define [28]. The American Standards Association (ASA) proposes the following definition [32]: timbre is the perceptual attribute of sound that allows a listener to distinguish among sounds that are otherwise equivalent in pitch, loudness and subjective duration.

Over the years, several approaches to the study of timbre have been investigated, leading to disparate but related results. One possible approach is to search for a reduced semantic field that would describe sounds accurately. Following this approach, [4] asked subjects to rate sounds on 30 verbal attributes. Based on a multidimensional scaling analysis of the semantic-differential data, he found four statistically significant axes, namely the dull-sharp, compact-scattered, full-empty and colorful-colorless pairs.

Other studies have tried to relate the perceptual dimensions of sounds to acoustic descriptors. Those descriptors can be spectral, temporal or spectro-temporal [22], and generate a “timbre space” [35].

The timbre space is determined using multidimensional analysis of data derived from experiments where a listener estimates the dissimilarity between pairs of sounds with different timbral characteristics. Multidimensional analysis techniques make it possible to extract a set of “principal axes” or dimensions that are commonly assumed to model the criteria used by participants to estimate the dissimilarity. It is then possible to find acoustical descriptors that correlate with the dimensions derived from multidimensional analysis. The three descriptors consistently reported by dissimilarity studies on instrumental sounds are the spectral centroid, the log-attack time and the spectral flux [21, 9, 10].

In this study we investigate the relationship between the timbre of a musical note and the emotional response of the listener. To this end, we use a physically informed synthesizer that can produce realistic sounds efficiently and intuitively from a three-dimensional timbre space model mapped to a reduced set of physical parameters.

2 Sound Generation: Modal Synthesis

In modal synthesis, a sound is modeled as a combination of modes which oscillate independently (Figure 1). While this kind of modeling is in theory only accurate for sounds produced by linear phenomena, it allows for efficient real-time implementations [5, 1, 33].

2.1 Modal Synthesis

The dynamics of a simple unidimensional mass-spring-damper system (our model of a mode) is given by the second-order differential equation:

$$m\ddot{x} + \mu\dot{x} + kx = 0 \tag{1}$$

where $m$ is the mass, $\mu$ the damping factor, $k$ the tension and $x$ the displacement. The solution is an exponentially decaying oscillator:

$$x = e^{-\alpha t} A \cos(\omega_0 t + \phi) \tag{2}$$

where $\alpha = \frac{\mu}{2m}$, $\omega_0 = \sqrt{\frac{k}{m} - \left(\frac{\mu}{2m}\right)^2}$, $A$ is the amplitude and $\phi$ the phase.
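As an illustration of Equations 1 and 2, the following Python sketch computes α and ω0 from the physical parameters and checks numerically that the decaying oscillator satisfies the differential equation. The function names and the example values (m = 1, µ = 0.5, k = 100) are illustrative assumptions, not parameters taken from the synthesizer described here.

```python
import math

def mode_parameters(m, mu, k):
    """Return (alpha, omega0) of Equation 2 for an underdamped mode."""
    alpha = mu / (2.0 * m)
    radicand = k / m - alpha ** 2
    if radicand <= 0:
        raise ValueError("mode is not underdamped, so it does not oscillate")
    return alpha, math.sqrt(radicand)

def displacement(t, m, mu, k, amplitude=1.0, phase=0.0):
    """Evaluate Equation 2 at time t (seconds)."""
    alpha, omega0 = mode_parameters(m, mu, k)
    return amplitude * math.exp(-alpha * t) * math.cos(omega0 * t + phase)

def residual(t, m, mu, k, h=1e-4):
    """Finite-difference estimate of m*x'' + mu*x' + k*x (Equation 1) at time t."""
    x_prev = displacement(t - h, m, mu, k)
    x_now = displacement(t, m, mu, k)
    x_next = displacement(t + h, m, mu, k)
    velocity = (x_next - x_prev) / (2.0 * h)
    acceleration = (x_next - 2.0 * x_now + x_prev) / (h * h)
    return m * acceleration + mu * velocity + k * x_now

# Hypothetical mode: the residual should be close to zero at any time t.
print(mode_parameters(1.0, 0.5, 100.0))
print(residual(0.3, 1.0, 0.5, 100.0))
```

The residual is not exactly zero because the derivatives are approximated by finite differences; it shrinks as the step h is reduced, down to the limit set by rounding error.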

The modal approximation consists in modeling the time-varying sound $s(t)$ by a linear combination of damped oscillators representing the vibrating properties of a structure. In the general case, the damping factor $\alpha_n$ is frequency-dependent and relates to the physical properties of the material.

$$s(t) = \sum_{n \geq 1} s_n(t) = \sum_{n \geq 1} A_n e^{2i\pi f_n t} e^{-\alpha_n t} \tag{3}$$

Fig. 1. Within the modal paradigm, a vibrating structure is decomposed into a sum of simple modes.
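The modal sum of Equation 3 can be sketched directly in Python. This is a minimal offline illustration, assuming each mode is given as a hypothetical (amplitude, frequency in Hz, damping in 1/s) tuple and that the audible signal is the real part of the complex sum; the mode values and the 8 kHz sample rate are arbitrary choices made for the example.

```python
import cmath
import math

def modal_signal(t, modes):
    """Evaluate the complex modal sum of Equation 3 at time t (seconds).

    `modes` is a list of (A_n, f_n, alpha_n) tuples.
    """
    return sum(
        a * cmath.exp(2j * math.pi * f * t) * math.exp(-alpha * t)
        for a, f, alpha in modes
    )

def render(modes, duration=1.0, sample_rate=8000):
    """Sample the real part of the modal sum at a fixed sample rate."""
    n_samples = int(duration * sample_rate)
    return [modal_signal(n / sample_rate, modes).real for n in range(n_samples)]

# Two hypothetical modes: the higher one is weaker and decays faster.
example_modes = [(1.0, 440.0, 3.0), (0.5, 880.0, 6.0)]
samples = render(example_modes, duration=0.5)
print(len(samples), samples[0])
```

Evaluating every mode at every sample is simple but slow; the recursive filter formulation of Section 2.2 is what makes real-time operation practical.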

2.2 Implementation

By discretizing Equation 1 with finite differences, using a sampling interval $T = t/n$ and writing $r$ for the damping factor ($\mu$ in Equation 1), we obtain:

$$x[n] = x[n-1]\,\frac{2m + Tr}{m + Tr + T^2 k} - x[n-2]\,\frac{m}{m + Tr + T^2 k} \tag{4}$$

which is the formulation of a standard two-pole IIR resonant filter. Consequently, we implemented a real-time synthesizer as a bank of biquad resonant filters in Max5 [11, 37] excited by an impulse (Figure 2).
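The recursion can be checked by hand: replacing $\ddot{x}$ by $(x[n] - 2x[n-1] + x[n-2])/T^2$ and $\dot{x}$ by $(x[n] - x[n-1])/T$ in Equation 1 (backward differences) and solving for $x[n]$ gives the two coefficients of Equation 4. The Python sketch below runs that recursion for one mode excited by a unit impulse. It is not the Max implementation; the helper `stiffness_for_frequency`, the unit input gain and the example values are assumptions made for illustration.

```python
import math

def resonator_coefficients(m, r, k, T):
    """Coefficients (a1, a2) of Equation 4: x[n] = a1*x[n-1] - a2*x[n-2]."""
    denom = m + T * r + T * T * k
    return (2.0 * m + T * r) / denom, m / denom

def stiffness_for_frequency(m, freq_hz):
    """Assumed helper: stiffness giving an undamped resonance near freq_hz."""
    return m * (2.0 * math.pi * freq_hz) ** 2

def impulse_response(m, r, k, sample_rate, n_samples):
    """Run the two-pole recursion with a unit impulse as excitation."""
    a1, a2 = resonator_coefficients(m, r, k, 1.0 / sample_rate)
    x1 = x2 = 0.0                      # x[n-1] and x[n-2]
    out = []
    for n in range(n_samples):
        excitation = 1.0 if n == 0 else 0.0   # assumed input gain of 1
        x0 = excitation + a1 * x1 - a2 * x2
        out.append(x0)
        x2, x1 = x1, x0
    return out

# Hypothetical mode near 440 Hz, rendered for 0.1 s at 44.1 kHz.
k = stiffness_for_frequency(1.0, 440.0)
response = impulse_response(1.0, 2.0, k, 44100, 4410)
print(response[0], max(abs(v) for v in response[-100:]))
```

A bank of such resonators, one per mode, summed and shaped by an envelope, mirrors the structure described in the text. Note that the backward-difference scheme itself introduces some additional numerical damping, so the decay of the discrete filter is faster than that of the continuous model.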

3 Perceptual Control of the Synthesis Parameters

Previous psychoacoustic studies on timbre have emphasized the perceptual importance of features such as spectral centroid, log-attack time and spectral flux (cf. Section 1). As a means to generate perceptually relevant sounds, we have built a simple and efficient physically-inspired modal synthesizer that produces realistic percussive sounds. We will now show how to map those psychoacoustical features to sound synthesis parameters.


Fig. 2. The sounds are produced by an impulse that injects energy into a bank of exponentially-decaying resonant filters, modulated by a time envelope. (Diagram labels: Impulse Generator; Resonator (F, A, d), repeated; +; ADSR; Amp; Attack Time; Freq, Amp, Decay.)

3.1 Background

Few studies have explored the control of audio synthesis from perceptual parameters. We can distinguish four main trends: the machine learning view, where a model of the timbre is learned from audio data; the concatenative view, where the timbre is “constructed” by the juxtaposition of pre-existing sonic grains; the signal processing view, where the transformations on timbre are model-dependent and direct applications of the signal model possibilities; and the physical model view, where the sound generation is modulated by intuitive physical parameters. One of the earliest examples of the machine learning point of view can be found in [36], where an additive synthesis model was driven by pitch, loudness and brightness using artificial neural networks. This paradigm has more recently been generalized and adapted to Support Vector Machine learning [16]. In the case of concatenative synthesis [29], a sound is defined as a combination of pre-existing samples in a database. These samples are already analyzed and classified, and can be retrieved by their audio characteristics. For signal models, a set of timbral transformations based on additive modeling is proposed. For instance, by explicitly translating and distorting the spectrum of a sound, one can achieve control over vibrato, tremolo and gender transformation of a voice [30]. In addition, a few studies have investigated the relationship between perceptual and physical attributes of sounding objects [2, 20].

Here we follow this last approach with a simple physical model of impact sound and propose a set of acoustic parameters corresponding to the tridimensional model of timbre proposed by psychoacoustic studies [21, 9, 10] (cf. Section 1).


3.2 Tristimulus / Spectral Centroid

We propose to control the spectral centroid of the sound by using the tristimulus values to specify brightness surfaces [24, 26].

The tristimulus analysis of timbre proposes to quantify timbre in terms of three coordinates (x, y, z) associated with band-loudness values. Inspired by the tristimulus theory of colour perception, it associates high values of x with dominant high frequencies, high values of y with dominant mid-frequency components and high values of z with a dominant fundamental frequency. The coordinates are normalized so that x + y + z = 1.

$$T_1 = \frac{a_1}{\sum_{h=1}^{H} a_h}, \qquad T_2 = \frac{a_2 + a_3 + a_4}{\sum_{h=1}^{H} a_h}, \qquad T_3 = \frac{\sum_{h=5}^{H} a_h}{\sum_{h=1}^{H} a_h} \tag{5}$$
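To make Equation 5 concrete, the short Python sketch below computes the three tristimulus values from a list of harmonic amplitudes a_1 ... a_H. The function name and the example amplitudes are hypothetical. Reading the text together with the equation, z (fundamental) would correspond to T1, y (mid frequencies) to T2 and x (high frequencies) to T3; that correspondence is an interpretation, not something the text states explicitly.

```python
def tristimulus(amplitudes):
    """Return (T1, T2, T3) of Equation 5.

    `amplitudes[0]` is a_1 (the fundamental), `amplitudes[1]` is a_2, and so on.
    """
    total = sum(amplitudes)
    if total <= 0:
        raise ValueError("at least one partial must have a positive amplitude")
    t1 = amplitudes[0] / total          # fundamental
    t2 = sum(amplitudes[1:4]) / total   # partials 2 to 4
    t3 = sum(amplitudes[4:]) / total    # partials 5 to H
    return t1, t2, t3

# Hypothetical six-partial spectrum.
t1, t2, t3 = tristimulus([4.0, 2.0, 1.0, 1.0, 1.0, 1.0])
print(t1, t2, t3, t1 + t2 + t3)
```

Because the three values share the same denominator, they always sum to 1, which is why a sound can be drawn as a single point inside a triangle.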

Fig. 3. An intuitive 2D representation of the tristimulus timbre space is a triangle. The arrow represents a specific time-varying trajectory of a sound in the tristimulus timbre space [24]. (Axes: X and Y, each from 0 to 1; corners labeled Fundamental, High-Frequency and Mid-Frequency.)

For synthesis, the sound is approximated by a sum of weighted damped oscillators at frequencies that are multiples of the fundamental. We use an additive model and neglect the effect of phase:

$$s(t) \simeq \sum_{k=0}^{N-1} e^{-\alpha_k t} A_k \cos(2\pi f_k t) \tag{6}$$

where $t$ is the time, $N$ is the number of harmonics in the synthetic signal, $A_k$ is the amplitude of the $k$th partial, and $f_k$ is the frequency of the $k$th partial, with $f_k = k f_0$ where $f_0$ is the fundamental frequency (harmonicity hypothesis).

In the synthesis model, the harmonic modes belong to three distinct frequency bands or tristimulus bands: $f_0$ belongs to the first low-frequency band, frequencies $f_{2 \ldots 4}$ belong to the second mid-frequency band, and the remaining partials $f_{5 \ldots N}$ belong to the high-frequency band.

The relative intensities in the three bands can be visualized on a tristimulus triangular diagram where each corner represents a specific frequency band. We use this representation as an intuitive spatial interface for timbral control of the synthesized sounds (Figure 4). The inharmonicity of the sound was set by scaling the values of partials following a piano-like law proposed by [2]: $f_k = k f_0 \sqrt{1 + \beta k^2}$.
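The piano-like inharmonicity law can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the coefficient is called `beta` here, partials are numbered from 1, and the example values (100 Hz, beta = 0.001) are arbitrary rather than taken from the experiment.

```python
import math

def partial_frequencies(f0, n_partials, beta=0.0):
    """Frequencies k * f0 * sqrt(1 + beta * k^2) for k = 1 .. n_partials.

    With beta = 0 the series is perfectly harmonic; a positive beta stretches
    the upper partials progressively, as in a stiff piano string.
    """
    return [
        k * f0 * math.sqrt(1.0 + beta * k * k)
        for k in range(1, n_partials + 1)
    ]

print(partial_frequencies(100.0, 5))               # harmonic series
print(partial_frequencies(100.0, 5, beta=0.001))   # slightly stretched series
```

Setting `beta` to zero recovers the harmonicity hypothesis $f_k = k f_0$ used for Equation 6.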

Fig. 4. The synthesizer GUI allows for graphical control over the tristimulus values (left). It automatically updates and displays the amplitudes of the corresponding partials (right).

3.3 Damping / Spectral Flux

While the tristimulus is only a static property of the spectrum, the modal synthesis technique described in Section 2 also allows for realistic control of the time-varying attenuation of the spectral components. As a matter of fact, the spectral flux, defined as the mean value of the variation of the spectral components, is known to play an important role in the perception of sounds [21]. We decided, as a first approximation, to indirectly control the spectral flux (or variation of brightness) by modulating the relative damping value $\alpha_r$ of the frequency-dependent damping parameter $\alpha$ (Figure 5 and Equation 3) proposed by [2]:

$$\alpha(\omega) = \exp(\alpha_g + \alpha_r \omega) \tag{7}$$
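The effect of the relative damping α_r in Equation 7 can be sketched as follows. The text does not state the unit or normalisation of ω, so this example assumes, purely for illustration, a frequency normalised to the range 0 to 1; the values 0.23, 0.6 and -1.6 are the ones reported for the experiment in Section 4.2.

```python
import math

def damping(omega, alpha_g, alpha_r):
    """Frequency-dependent damping of Equation 7."""
    return math.exp(alpha_g + alpha_r * omega)

ALPHA_G = 0.23  # global damping used in the experiment
for label, alpha_r in (("high damping", 0.6), ("low damping", -1.6)):
    # omega is assumed to be normalised to [0, 1] (hypothetical choice).
    curve = [damping(w / 4.0, ALPHA_G, alpha_r) for w in range(5)]
    print(label, [round(value, 3) for value in curve])
```

With a positive α_r the upper partials are damped more strongly than the lower ones, so the spectrum darkens over time; with a negative α_r the opposite holds. This is why α_r acts as an indirect control over the variation of brightness.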

Fig. 5. The synthesizer GUI allows for graphical control of the damping parameters $\alpha_g$ and $\alpha_r$.

3.4 Time Envelope / Log Attack Time

The log-attack time (LAT) is the logarithm (decimal base) of the time elapsed between the moment the signal starts and the moment it reaches its stable part [22]. Here, we control the log-attack time manually using our synthesizer’s interface (Figure 6). Each time the synthesizer receives a MIDI note-on message, an Attack-Decay-Sustain-Release (ADSR) time envelope corresponding to the desired LAT parameter is triggered (Figure 7).

$$\mathrm{LAT} = \log_{10}(t_{stop} - t_{start}) \tag{8}$$
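Equation 8 and its inverse are simple enough to sketch in Python. The example assumes times expressed in seconds (the unit is not stated next to the equation) and uses the two attack durations reported for the experiment, 1 ms and 150 ms; the function names are hypothetical.

```python
import math

def log_attack_time(t_start, t_stop):
    """Equation 8: base-10 logarithm of the attack duration (seconds assumed)."""
    duration = t_stop - t_start
    if duration <= 0:
        raise ValueError("t_stop must be later than t_start")
    return math.log10(duration)

def attack_duration(lat):
    """Inverse of Equation 8: attack duration in seconds for a given LAT."""
    return 10.0 ** lat

print(log_attack_time(0.0, 0.001))   # 1 ms attack, LAT close to -3
print(log_attack_time(0.0, 0.150))   # 150 ms attack, LAT close to -0.82
```

The inverse function is the direction a synthesizer needs: given a desired LAT, it yields the attack segment length to program into the ADSR envelope.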

Fig. 6. The synthesizer GUI allows for graphical control over the log-attack time parameter via the Attack-Decay-Sustain-Release envelope.

4 From Perception to Emotion

4.1 Background

Music and Emotion. Music and its effect on the listener have long been a subject of fascination and scientific exploration, from the Greeks speculating on the acoustic properties of the voice [13] to Muzak researchers designing “soothing” elevator music. It has now become an omnipresent part of our day-to-day life, whether by choice when played on a personal portable music device, or imposed when diffused in malls during shopping hours, for instance.

Fig. 7. The ADSR time envelope allows for direct control over the LAT each time a note is triggered. (Plot titled “Envelope”: amplitude, 0 to 12000, against time in seconds, 0 to 4, with t_start and t_stop marked.)

Studies on musical emotion have traditionally been influenced by “standard” research on emotion induction and have usually assumed that musical emotion relies on general emotional mechanisms such as cognitive appraisal. Nevertheless, observing that most emotional responses to music do not involve implications for goals in life, a recent review article [12] challenges the cognitive appraisal perspective and puts forward six additional mechanisms that should be taken into account, namely brain stem reflex, evaluative conditioning, emotional contagion, visual imagery, episodic memory and musical expectancy.

In our experiment, we decided to limit our focus to the investigation of the relationship between perceptual acoustic properties and emotional responses. In order to control for mechanisms such as musical expectancy and emotional contagion, the stimuli consisted of one short note. In fact, previous studies have revealed that only a few seconds of music are necessary to elicit an emotional reaction [3, 23]. There was no conditioning task involved, and the sounds generated by our model were synthetic, but realistic. Therefore, within the framework proposed in [12], we are most probably looking at reflex-like mechanisms reflecting the impact of auditory sensation in the most basic sense.

Emotional Responses. There exist a number of different methodologies to measure emotional responses. They can be broadly classified into three categories depending on the extent to which they access subjective experience, alterations of behavior or the impact on physiological states [19]. Self-reports of emotional experience measure the participant’s “subjective feeling” of emotions, i.e. “the consciously felt experience of emotions as expressed by the individual” [31]. Self-reports can be subcategorized into verbal [34] and visual self-reports [27]. We decided to use visual reports during experiments since these are more universal than the verbal measures.

We followed the well-established three-dimensional theory of emotions: valence and intensity of activation or arousal [27], plus dominance, using the Self-Assessment Manikin (SAM) scale [14] (Figure 8). Emotions can then be placed in a three-dimensional emotional space, where the valence scale ranges from positive to negative, the activation scale extends from calmness to high arousal and the dominance scale goes from dominated to dominant.

4.2 Methods

Fig. 8. The presentation software S-Blast, programmed in Max/MSP, uses SAM scales (Dominance, Arousal and Valence) [14] to measure the subjects’ emotional responses to sound stimuli.


Participants. A total of ten university students (2 female; mean age 29 years) took part in the pilot experiment. The experiment was conducted in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki.

Stimuli. Eight sound samples of about 4 s duration each were used as acoustic stimuli³. They were presented to the participants using stereo headphones (AKG K66).

We investigated the effect of three timbre descriptors (two levels each) on arousal, valence and dominance:

– Attack time: low level corresponded to 1 ms while high level corresponded to 150 ms.

– Spectral centroid (controlled by tristimulus): low level corresponded to energy in the first and second tristimulus bands while high level corresponded to energy in the third tristimulus band only.

– Spectral flux (controlled by damping): high level corresponded to high damping $\alpha_r = 0.6$ while low level corresponded to low damping $\alpha_r = -1.6$; $\alpha_g$ was kept at the constant value 0.23 (cf. Equation 7).

– All the other parameters of the synthesizer were fixed (partial amplitude decay, harmonicity, even/odd ratio, decay, sustain, release), and the sound samples were normalized in amplitude with Peak Pro (BIAS⁴).
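The list above describes a 2 × 2 × 2 factorial design, which accounts for the eight stimuli. A small Python sketch can enumerate the conditions using the three-letter codes of Figure 9 (Attack, Centroid, Flux; L for low, H for high). The dictionary and function names are hypothetical; the parameter values are those given in the list.

```python
from itertools import product

ATTACK_MS = {"L": 1, "H": 150}              # attack time, in milliseconds
CENTROID_BANDS = {"L": (1, 2), "H": (3,)}   # tristimulus bands carrying energy
ALPHA_R = {"L": -1.6, "H": 0.6}             # relative damping (spectral flux)
ALPHA_G = 0.23                              # global damping, constant

def conditions():
    """All eight condition codes, from 'LLL' to 'HHH'."""
    return ["".join(code) for code in product("LH", repeat=3)]

def describe(code):
    """Synthesis settings for a condition code such as 'HHL'."""
    attack, centroid, flux = code
    return {
        "attack_ms": ATTACK_MS[attack],
        "bands": CENTROID_BANDS[centroid],
        "alpha_r": ALPHA_R[flux],
        "alpha_g": ALPHA_G,
    }

for code in conditions():
    print(code, describe(code))
```

Every participant rates all eight conditions, which is what makes the later analysis a repeated-measures design.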

Procedure. We investigated the influence of different sound features on the emotional state of the participants using a fully automated, computer-based stimulus presentation and response registration system. In our experiment, each subject was seated in front of a PC with a 15.4” LCD screen and interacted with custom-made stimulus delivery and data acquisition software called S-Blast (Figure 8) made with the Max programming language [37]. Sound stimuli were presented through headphones (K-66 from AKG). At the beginning of the experiment, the subject was exposed to a sinusoidal sound generator to calibrate the sound level to a comfortable level. Subsequently, a number of sound samples with specific sonic characteristics were presented together with the different scales (Figure 8).

The subject had to rate each sound sample in terms of its emotional content (valence, arousal, dominance) by clicking on the SAM manikin representing her feelings [14]. The subject was given the possibility to repeat the playback of the samples. The SAM graphical scale was converted into an ordinal scale (from 0 to 5) where 0 corresponds to the most dominated, aroused and positive and 5 to the most dominant, calm and negative (Figure 8). The data was automatically stored in a SQLite⁵ database composed of a table for demographics and a table containing the emotional ratings. The SPSS (IBM) statistical software suite was used to assess the significance of the influence of sound parameters on the affective responses of the subjects⁶.

³ Available from http://www.dtic.upf.edu/~slegroux/confs/CMMR10
⁴ http://www.bias-inc.com/
⁵ http://www.sqlite.org/
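The storage step described in the procedure (one table for demographics, one for ratings on the 0 to 5 scale) could look like the following Python sketch using the standard `sqlite3` module. The actual S-Blast schema is not given in the text, so every table name, column name and function name here is a hypothetical choice made for illustration.

```python
import sqlite3

# Hypothetical schema: the real S-Blast tables are not described in detail.
SCHEMA = """
CREATE TABLE demographics (
    subject_id INTEGER PRIMARY KEY,
    age        INTEGER,
    gender     TEXT
);
CREATE TABLE ratings (
    subject_id INTEGER NOT NULL REFERENCES demographics(subject_id),
    stimulus   TEXT    NOT NULL,
    valence    INTEGER NOT NULL CHECK (valence   BETWEEN 0 AND 5),
    arousal    INTEGER NOT NULL CHECK (arousal   BETWEEN 0 AND 5),
    dominance  INTEGER NOT NULL CHECK (dominance BETWEEN 0 AND 5)
);
"""

def open_database(path=":memory:"):
    """Create the two tables in a new database (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def store_rating(conn, subject_id, stimulus, valence, arousal, dominance):
    """Insert one SAM rating; the CHECK constraints reject values outside 0..5."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO ratings VALUES (?, ?, ?, ?, ?)",
            (subject_id, stimulus, valence, arousal, dominance),
        )

conn = open_database()
conn.execute("INSERT INTO demographics VALUES (1, 29, 'f')")
store_rating(conn, 1, "HHL", 2, 1, 3)
print(conn.execute("SELECT * FROM ratings").fetchall())
```

Keeping ratings in a separate table keyed by subject makes it straightforward to export one row per subject and condition for a repeated-measures analysis.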

4.3 Results

We verified that, for each condition, the data passed the Kolmogorov-Smirnov test of normality. We then conducted a factorial (three-way) repeated-measures ANOVA to look at the influence of perceptual parameters of timbre on the emotional responses of the participants as measured by five-point SAM scales.

Arousal. Mauchly’s test indicated that the assumption of sphericity had been violated for the effects of attack, centroid and damping (p < 0.05). Therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity.

There was a significant main effect of centroid (F(1,9) = 11.663, p < 0.05) as well as damping (F(1,9) = 5.857, p < 0.05) on arousal. Post-hoc pairwise comparisons with Bonferroni corrections showed a significant mean difference of -0.725 between high spectral centroid (M = 1.825) and low spectral centroid (M = 2.55), showing that higher centroid values produce higher arousal ratings (i.e. lower values on the scale). Similarly, a significant mean difference of -1.075 was observed between low damping (M = 1.65) and high damping (M = 2.725), showing that low damping was more arousing than high damping (Figure 9).

Valence. No sound property significantly affected ratings along the valence scale.

Dominance. Mauchly’s test indicated that the assumption of sphericity had been violated (p < 0.05). Therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity.

There was a significant interaction effect between attack and damping (F(1,9) = 7.23, p < 0.05) on the dominance scale. A contrast analysis for repeated measures revealed that the increased dominance found when using low damping (compared to high damping) is actually reversed when a high log-attack time is used instead of a low log-attack time. In other words, the increase in dominance due to low damping is significantly greater when used in conjunction with a short log-attack time (low value).

⁶ http://www.spss.com/

Fig. 9. Boxplot of the different conditions on the arousal scale (from 0 to 5, with 0 the maximum arousal). Notation: a three-character string codes for condition (Attack, Centroid and Flux, in that order) and levels (High, Low). For example, HHL corresponds to High Attack-time, High Spectral Centroid and Low Spectral Flux. (Axes: Conditions LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH against Arousal, 0 to 4.)

Fig. 10. Interaction graph between log-attack time and spectral flux on the dominance scale (from 0 to 5, where 5 represents highest dominance). Estimated Marginal Means are obtained by taking the average of the means for a given condition. (Plot titled “Interaction Graph”: Log(Attack Time), levels High and Low, against Estimated Marginal Means, 1.6 to 2.2; one line per Damping level, High and Low.)

5 Conclusion

In this study we investigated the role of timbre on emotional responses as measured by the SAM scale. We proposed a physically-inspired model to synthesize sound samples from a set of well-defined and perceptually-grounded sound parameters, and showed their correspondence to arousal and dominance scales. In particular, spectral centroid and damping were significantly related to arousal subjective ratings, while interaction effects between log-attack time and damping were found, showing that short attack with low damping maximized dominant responses.

The study of the relationship between sonic features and emotional responses, combined with techniques for real-time sound synthesis, paves the way for the design of affective interactive music systems. We plan to use this type of system for affective state diagnosis, affective state modulation (e.g. involving Alzheimer patients and autistic children [15]) and neurofeedback studies [17].

In future experiments, we want to investigate in more detail the potential of timbre and higher-level musical parameters to induce specific affective states. We wish to expand our analysis to time-varying parameters and to co-varying parameters. Our results, however, already demonstrate that a rational approach towards the definition of affective interactive music systems is possible. We will integrate this knowledge into SiMS, the biomimetic interactive music system we have developed [18], and explore adaptive emotional mappings of musical parameters using algorithms such as reinforcement learning with a response evaluation loop.

References

1. J. Adrien. The missing link: Modal synthesis. 1991.

2. M. Aramaki, R. Kronland-Martinet, T. Voinier, and S. Ystad. A percussive sound synthesizer based on physical and perceptual attributes. Computer Music Journal, 30(2):32–41, 2006.

3. E. Bigand, S. Vieillard, F. Madurell, J. Marozeau, and A. Dacquet. Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts. Cognition & Emotion, 2005.

4. G. Bismarck. Timbre of steady sounds: A factorial investigation of its verbal attributes. Acustica, 30(3):146–59, 1974.

5. P. Cook. Real sound synthesis for interactive applications. AK Peters, Ltd., 2002.

6. A. Friberg, R. Bresin, and J. Sundberg. Overview of the KTH rule system for musical performance. Advances in Cognitive Psychology, Special Issue on Music Performance, 2(2-3):145–161, 2006.

7. A. Gabrielsson and P. N. Juslin. Emotional expression in music performance: Between the performer’s intention and the listener’s experience. In Psychology of Music, volume 24, pages 68–91, 1996.

8. A. Gabrielsson and E. Lindström. Music and Emotion - Theory and Research, chapter The Influence of Musical Structure on Emotional Expression. Series in Affective Science. Oxford University Press, New York, 2001.

9. J. Grey. An exploration of musical timbre using computer-based techniques for analysis, synthesis and perceptual scaling. PhD thesis, Dept. of Psychology, Stanford University, 1975.

10. P. Iverson and C. Krumhansl. Isolating the dynamic attributes of musical timbre. The Journal of the Acoustical Society of America, 94:2595, 1993.

11. T. Jehan, A. Freed, and R. Dudas. Musical applications of new filter extensions to Max/MSP. In International Computer Music Conference (Beijing, China, 1999), ICMA, pages 504–507. Citeseer.

12. P. N. Juslin and D. Västfjäll. Emotional responses to music: the need to consider underlying mechanisms. Behav Brain Sci, 31(5):559–75; discussion 575–621, Oct 2008.

13. P. Kivy. Introduction to a Philosophy of Music. Oxford University Press, USA, 2002.

14. P. Lang. Behavioral treatment and bio-behavioral assessment: computer applications. In J. Sidowski, J. Johnson, and T. Williams, editors, Technology in Mental Health Care Delivery Systems, pages 119–137, 1980.

15. S. Le Groux, M. Brotons, and P. F. M. J. Verschure. A study on the influence of sonic features on the affective state of dementia patients. In Conference for the Advancement of Assistive Technology in Europe, San Sebastian, Spain, October 2007.

16. S. Le Groux and P. F. M. J. Verschure. PERCEPTSYNTH: Mapping perceptual musical features to sound synthesis parameters. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, April 2008.

17. S. Le Groux and P. F. M. J. Verschure. Neuromuse: Training your brain through musical interaction. In Proceedings of the International Conference on Auditory Display, Copenhagen, Denmark, May 18-22 2009.

18. S. Le Groux and P. F. M. J. Verschure. Situated interactive music system: Connecting mind and body through musical interaction. In Proceedings of the International Computer Music Conference, Montreal, Canada, August 2009. McGill University.

19. R. W. Levenson. The nature of emotion: Fundamental questions, chapter Human emotion: A functional view, pages 123–126. Oxford University Press, 1994.

20. S. McAdams, A. Chaigne, and V. Roussarie. The psychomechanics of simulated sound sources: Material properties of impacted bars. The Journal of the Acoustical Society of America, 115:1306, 2004.

21. S. McAdams, S. Winsberg, S. Donnadieu, G. De Soete, and J. Krimphoff. Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research, 58:177–192, 1995.

22. G. Peeters, S. McAdams, and P. Herrera. Instrument sound description in the context of MPEG-7. In Proceedings of the 2000 International Computer Music Conference. Citeseer, 2000.

23. I. Peretz, A. J. Blood, V. Penhune, and R. Zatorre. Cortical deafness to dissonance. Brain, 124(Pt 5):928–40, May 2001.

24. H. Pollard and E. Jansson. A tristimulus method for the specification of musical timbre. In Acustica, volume 51, 1982.

25. R. Rasch and R. Plomp. The perception of musical tones. In D. Deutsch (ed.), The Psychology of Music (pp. 89-112), 1999.

26. A. Riley and D. Howard. Real-time tristimulus timbre synthesizer. Technical report, University of York, 2004.

27. J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39:345–356, 1980.

28. G. Sandell. Definitions of the word “timbre”, 2008.

29. D. Schwarz. Data-Driven Concatenative Sound Synthesis. PhD thesis, Ircam - Centre Pompidou, Paris, France, January 2004.

30. X. Serra and J. Bonada. Sound transformations based on the SMS high level attributes, 1998.

31. P. A. Stout and J. Leckenby. Measuring emotional response to advertising. Journal of Advertising, 15(4):35–42, 1986.

32. Acoustical Terminology S1.1-1960. American Standards Association, New York, 1960.

33. K. van den Doel and D. Pai. Synthesis of shape dependent sounds with physical modeling. In Proceedings of the International Conference on Auditory Displays, 1996.

34. D. Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54:1063–1070, 1988.

35. D. Wessel. Timbre space as a musical control structure. In Computer Music Journal, 1979.

36. D. Wessel, C. Drame, and M. Wright. Removing the time axis from spectral model analysis-based additive synthesis: Neural networks versus memory-based machine learning. In Proc. ICMC, 1998.

37. D. Zicarelli. How I learned to love a program that does nothing. Computer Music Journal, (26):44–51, 2002.