physical modelingof plucked string instruments with application...

PAPERS

Physical Modeling of Plucked String Instruments withApplication to Real-Time Sound Synthesis*

VESA VALIMAKI**, AES Member, JYRI HUOPANIEMI, AES Member, and MA'rTI KARJALAINEN, AES Member

Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, F/N-02/50 Espoo, Finland

AND

ZOLTA, N J.&.NOSY, AES Member

Technical University of Budapest, Department of Telecommunications, H-I I 11 Budapest, Hungary

An efficient approach for real-time synthesis of plucked string instruments using physi-cal modeling and DSP techniques is presented. Results of model-based resynthesis areillustrated to demonstrate that high-quality synthetic sounds of several string instrumentscan be generated using the proposed modeling principles. Real-time implementation usinga signal processor is described, and several aspects of controlling physical models ofplucked string instruments are studied.

0 INTRODUCTION The former approach is typical for physicists who aimat understanding the acoustics of musical instruments.

Physical modeling of musical instruments has become The latter approach--the one we have taken--is morean increasingly active field of research in musical acous- practical, but a comprehensive understanding of thetics and computer music. The term physical modeling physics of the musical instruments to be synthesized isrefers in this case to the mathematical or computational still needed.simulation of sound production mechanisms of musi- Three different viewpoints are of paramount impor-cal instruments, tance when designinga physicallybased synthesistech-

Traditionally musical sound synthesis, such as FM nique. First, essential features of the physics of a musi-

synthesis or waveshaping, has tried to achieve a desired cal instrument have to be studied carefully. Second, thewaveform or spectrum [1], [2]. In sampling or wavetable properties of the human auditory system need to be con-synthesis, digital recordings of acoustic sounds are ed- sidered because it is the ear that finally judges whetherited and processed for resynthesis. Although sampled the synthetic sound is satisfactory or not. In practice atones may sound perfect, convincing synthesis of musi- physical model can be constructed by simplifying thecai instruments has been difficult since one sample cor- underlying physical principles. Here the knowledge of

responds to a sound played by one instrument in a certain hearing can be of help: features that are not perceptuallyway. A small change in the playing technique would relevant need not be simulated precisely, which leadsdemand the synthetic sound to be resampled, to simplification. The third point of view is that of a

The approach taken in physical modeling can be dj- DSP engineer. The model should be computable in realvided into two classes: time, preferably on a commercial (signal) processor.

1) Mathematical modeling of physical principles Thus the model should be easily and efficiently

2) Design of model-based sound synthesizers, programmable.In this paper we show how synthesis models for

* Presented at the98th Convention of the Audio Engineering plucked string instruments, such as guitars, the banjo,Society, Paris, France, 1995 February 25-28; revised 1995 the mandolin, and the kantele, can be constructed fol-July 7 and 1996 March 13. lowing these principles. In our approach the model for** Also with School of Electronic and Manufacturing Sys-tems Engineering, University of Westminster, London W1M a single string is based on refinements of the Karplus-8JS, UK. Strong (KS) algorithm [3], which is a simple computa-

J.AudioEng.Soc.,Vol.44,No.5,1996May 331

i i i

VALIMAKIETAL. PAPERS

tional technique for synthesizing plucked string sounds, modeling string instrumenis are discussed in Section 2.

As reported by Jaffe and Smith [4], this algorithm is a Two basic models for a vibrating string and signal pro-simplification of a physical model for a vibrating lossy cessing techniques such as Lagrange interpolation arestring, which is based on the solution of the wave equa- briefly described. The basic models are extended bytion. Later, Smith generalized this approach to what is introducing, for example, a dual-polarization stringnowadays called digital waveguide modeling [5]-[7]. model. The modeling of the body of a guitar is consid-

We have further improved these principles in our model, ered from three points of view--digital filter approxima-such as with a continuously variable digital delay line tion, a principle based on commutativity of linear sys-using Lagrange interpolation for adjusting the length of terns, and a hybrid where some resonances of the bodythe string [8] and a simple yet precisely tuned low,pass are explicitly modeled and a processed input signal in-filter--called a loop filter--to bring about the frequen- eludes the rest of the body response. Section 3 concen-cy-dependent damping of a string [9]. The model incor- trates on the analysis techniques that can be used forporates the influence of the body in an exceptionally estimating the values of the modelparameters. Examplesefficient way [7], [9]-[ 11]. Additions to the basic string of synthesis results are reported in Section 4. Real-timeand body model are discussed in order to include beat implementation techniques are considered in Section 5.

effects inherent in string vibration and sympathetic ecu- In Section 6 control aspects of the physical models ofplings between strings [4], [11], [12]. plucked string instruments are discussed. Finally, con-

After the model structure has been designed, an esti- clusions are drawn and directions for future work are

mation method is needed for calibrating the values of given in Section 7.the model parameters. A particular subproblem is the

estimation of the coefficients of the loop filter. This I OVERVIEW OF PLUCKED STRINGproblem was first addressed by Smith in the early 1980s INSTRUMENTS[12], [13]. He studied several parametric methods for

matching the loop filter and also proposed some nonpara- The history of guitarlike plucked string instrumentsmetric methods, such as the use of deconvolution. In spans to antiquity. Cultural and geographical differences

[14] a time-frequency representation was applied to the have led to the evolution of a large family of stringanalysis of musical sounds, and the loop filter was de- instruments over a long period of time. In this papersigned based on those data. In [15] the same approach we have concentrated on the analysis and model-basedwas used in the analysis, but instead of using the KS sound synthesis of six example cases of the pluckedmodel, a pair of poles was assigned to each sinusoidal string instrument family:component. The technique was applied to the synthesis 1) The modern classical nylon-string acoustic guitarof the piano, but was reported to have been tested for 2) The flat-top steel-string acoustic guitarguitartonesas well. 3) Theelectricguitar

We have obtained prototype frequency responses for 4) The banjothe loop filters by means of short-time Fourier analysis 5) The mandolinand envelope tracking of the harmonics. A first-order 6) The kantele

all-pole filter is matched to the analysis data using a All these string instruments exhibit some special char-weighted least-squares design. Smith has independently acteristics, which are to be taken into account when

developed a very similar analysis method [16]. There- designing model-based sound synthesizers. The goal inafter the input signal for the model can be extracted this research was, however, not only to create soundfrom a recorded string instrument sound using inverse synthesis methods, but also to use physical modeling asfiltering. This approach has also been proposed in [12], a tool in the research of different string instruments. A[1;4], [15]. The residual can be truncated or windowed, short overview of the string instruments featured in thisand the resulting short-duration sequence can be fed into paper is given in this section. A more detailed analysisthe synthesis model to produce an approximation of the of the acoustics of some of these instruments can be

original signal. The parameters of the model can be found in [17], for example.modified, and yet a very natural sounding signal willresult. 1.1 Acoustic Guitar

This paper presents an overview of our recent work Analysis and modeling of the classical acoustic guitaron model-based sound synthesis ofplucked string instru- has been presented in previous papers [8], [9], [11],ments. Our contribution has been to introduce high- [18], but asan extension to guitar modeling we measuredquality real-time synthesis for the acoustic guitar [8], and modeled the behavior of the steel-string acoustic[11] and for several other instruments. The work in- guitar. The main difference in the acoustics of these

eludes refinements of modeling techniques, such as frae- instruments is causedby two facts: 1) the material of thetional delay filtering, body modeling, estimation of the strings and 2) the plucking method. Steel-string guitarsmodel parameters, implementation techniques, and typically have crossed bracing in the soundboard due tomethods for controlling the physical models, the higher tension of the strings [18]. The use of a plec:

The organization of the paper is as follows. An over- trum instead of the finger as the excitation method resultsview of the plucked string instruments that are studied in a stronger pluck and a brighter tone. These featuresin this paper is given in Section 1. The principles of can be taken into account in a physical model.

332 J. Audio Eng. Soc., Vol. 44, No. 5, 1996 May

PAPERS PHYSICAL MODELING OF PLUCKED STRING INSTRUMENTS

of a banjo tone is depicted in Fig. 2. The figure shows1.2 Electric Guitar the response after the first string was plucked with a

The main difference between acoustic and electric finger pick at a distance of 140 mm from the bridge

guitars is in the body of the guitar. The solid-body con- while other strings were damped. The resonances of thestruction, as the one we measured (Fender Stratocaster), body can be observed in the first 200 ms of the tone.radiates very little sound from the body itself or the The impulse response of the soundboard has been foundstrings. Our model-based approach to the electric guitar to be quite long. This must be taken into account in thedoes not cover the effects of magnetic pickups or nonlin- excitation signal of the synthesis model (see Sectionearly behaving amplifiers. Instead we focused on the 3.5). There is also two-stage decay behavior in the thirdbehavior of the string vibration in order to design a harmonic, which may result from mode summation orphysical model for the plain electric guitar, coupling effects.

The three-dimensional plot in Fig. 1 exhibits thesound behavior of an electric guitar. The third string 1.5 Kantelewas plucked while other strings were damped. The neck The kantele is an ancient Finnish instrument that fea-pickup was used for the recording. It can be seen that tures 5 to 40 strings. Fig. 3 shows the five-string kanteledue to the lack of body resonances the attack part is that has been used in the analysis. There are two special

very well behaved. The decay of the harmonics is also characteristics in the kantele that are not found in othersmooth, that is, there is no nonlinear behavior. The sig- string instruments [20].nal was analyzed using the short-time Fourier transform 1) There are very strong beats in harmonic envelopes.(STFT) technique [19], which is discussed in Section 2) A prominent second harmonic exists due to longitu-3.2. dinalvibrationeffects.

Both effects were found to be due to the unusual way1.3 Mandolin the strings have been terminated. The beats are gener-

The mandolin is a string instrument used widely in ated when the horizontally and vertically polarized vi-

folk and bluegrass music in western countries. It features bration components are superimposed since there is aeight steel strings, which are tuned in four pairs. The clear difference in the effective string length in the twosound of the mandolin is brighter and decays faster than polarizations (see Section 2.6.2). This is due to the knotthat of the acoustic guitar. The pairs of equally tuned termination around the metal bar [Fig. 4(a)]. The strongstrings usually result in beating due to a minor difference second harmonic comes from the longitudinal tensionin the tuning. The STFT-based spectral analysis of one variation that bends the tuning peg [Fig. 4(b)] and radi-string of the first string pair of the mandolin plucked utes from the soundboard proportionally to the squarewith a plectrum near the bridge is discussed in Section of the transversal string displacement.4.2 (see Fig. 27). The remaining strings, including the Fig. 5 shows a three-dimensional STFT analysis resultother string of the string pair, were carefully damped, of a kantele tone. Note that the second harmonic is al-The quite rapid decay of the harmonics as well as a most 10 dB louder than the first harmonic in the attacklinear behavior of the strings can be observed, part, but it attenuates at a faster rate. Strong beating can

be observed in higher harmonics.1.4 Banjo

The banjo differs from the classical acoustic guitar 1.6 Instrument Measurementsmainly in the construction of the instrument body. The In order to find relevant analysis data for model-basedfive strings have been coupled via a metal bridge to a synthesis we conducted accurate and thorough measure-drumlike resonating plate. The time-varying spectrum ments of the chosen string instruments. Professional mu-

.. .. ::.

01·_ .... :

0 !

=?o] o

_-20 _-20 _

0.5 1 0_ _ 0

1.5 1.5

2.5 _ 0.6

Frequency(kHz) Time(s) Frequency(kHz) Time(s)

Fig. 1. STFT analysis of electric guitar tone. (Fender Strato- Fig. 2. STFT analysis of banjo tone. (Gibson Epiphone, firstcaster, third open string, fundamental frequency of 195 Hz.) open string, fundamental frequency 289 Hz.)

J. Audio Eng. Soc., Vol. 44, No. 5, 1996 May 333

VALIM.a,KI ET AL. PAPERS

sicians and high-quality instruments were used. The ba- of the string. The system is assumed to be linear, andsic measurements consisted of recording single notes thus all losses and other linear nonidealities may beplayed on each string at several fret positions. Both fin- lumped to the termination and excitation or pickupger picking and plectrum picking were used as the string points. This is because we are only interested in theexcitation method for instruments such as the electric output of the system and not in the amplitude values atguitar and the steel-string guitar. Typical playing styles, arbitrary points inside the system. Thus we can combine

such as strumming, vibrato, and glissando, were also all the linear operations, namely, delays and filters, be-recorded to obtain information for the control of the tween the input and output points (see, for example,models. [6]). The stringitselfcan thenbe describedas an ideal

All measurements were carried out in a large anechoic lossless waveguide, which is a bidirectional discrete-

chamber using Briiel & Kjaer microphones and pream- time delay line [22], [6]. This system may be modeledplifiers. The measurement data were first stored on a using a pair of delay lines and a pair of reflection filtersDAT recorder at 44.1-kHz sampling rate and 16-bit Rb(Z) and Rf(z), as depicted in Fig. 7. Each of thesequantization, giving a theoretical signal-to-noise ratio filters includes the reflection function of the correspond-of about 96 dB. In the second phase the data were trans- ing termination of the string and the dissipative andferred onto a hard disk using the QuickSig software dispersive contributionduetothestringmaterial. A loss-[21 ] designed at the Laboratory of Acoustics and Audio less delay line can be implemented computationally ef-Signal Processing of the Helsinki University of Tech- ficiently as a circular buffer where the data do not move,

nology, butonly readandwritepointersareupdated.This mayreduce the computational load by several orders of mag-

2 GENERAL MODEL FOR A STRING nitude as compared to a shift register, where the dataINSTRUMENT are actually moved.

A string instrumentcan be dividedinto functional iblocks, as illustrated in Fig. 6. Thesound sourcesof theinstrument are the strings, which have been connected tothe bodyvia the bridgeand whichalso interactwith ::

eachother.Thesoundradiatesmainlythroughthebody. _The vibrating strings themselves act as dipole sources _ illand radiate sound inefficiently. First we concentrate onthestrings. 0

2.1 Basic Model for a Vibrating String lThe general solution of the wave equation for a vibrat- 2

ing string is composedof two independenttransversal 3 0.2 0wavestraveling in opposite directions(see, for example, 0.6 0.4[17]). At the string terminations the waves reflect back Frequency(kHz) 4 1 0.8 Time (s)

with inverted polarity and form standing waves. The Fig. 5. STFT analysis of kantele tone. (Fourth open string,losses in the system,damp the quasi-periodic vibration fundamental frequency 349 Hz.)

MetalBar FiveStrings TuningPegs

Soundboard

length = 60 cm

fFig. 3. Kantele, a traditional Finnish string instrument.

Metal Bar

String String MetalN Tuning

_l_ Knot _ Peg

---_ _-- Differenceofeffective Y////////////ASoundboardbending points of twovibrating modes

(a) (b)

Fig. 4. String terminations of kantele. (a) Termination at metal bar, causing beat effect. (b) Termination at tuning peg, creatinglongitudinal nonlinearity.



An efficient structure for a string model is obtained a similar comb filter [24].by commuting one of the delay lines with one reflection The fact that the finger or plectrum exciting the string

filter. Then the delay lines may be combined into a has a finite (nonzero) touching width has been ignoreddouble-length line and the filters into a single one that in this model, and it has been assumed that the excitationcan be expressed as acts at a single point. Widening the excitation point to

an interval adds more low-pass filtering to the excitation.Hi(z) = Rb(z)Rf(z) . (1) We can bring about this effect using a low-order excita-

tion filter E(z), which models the dependence of theThe transfer function Hi(z ) is called the loop filter. This spectral tilt on the dynamic level of the pluck. Jaffeformulation is in practice equivalent to the model of Fig. and Smith [4] suggested that a one-pole digital filter7 when a comb filter P(z) is cascaded with the string be suitable for this purpose. Furthermore the complexmodel to cause the filtering effect due to the plucking interaction between the finger and the string has not beenposition [4]. The main difference is the delay between modeled. Instead, the excitation has been supposed tothe excitation point and the output, but this is insignifi- be an event that can be modeled using linear operations.cant and can be compensated if necessary. This modified In reality, the finger may grab the string for a shortsystem, which we call a generalized KS model, is illus- while, causing nonlinear or linear, but time-varyingtrated in Fig. 8. A more detailed derivation and analysis interaction.are given in [23].

The filter P(z) is called a plucking-point equalizer, 2.2 Length of Stringand its transfer function is The effective delay length of the feedback loop in the

string model determines the fundamental frequency ofP(z) = 1 + z -M Rf(z) (2) the output signal. The delay length (in samples) can be

computed aswhere Rf(z) is the reflection filter of one end of the

string model and M is the delay (in samples) between _ fs (3)the excitation points in the upper and lower lines of the L - fooriginal waveguide model (see Fig. 7). In practice Rf(z)

in Eq. (2) need not be modeled very accurately since its where fs and fo are the sampling rate of the synthesismagnitude response is only slightly less than unity at all system and the desired fundamental frequency, respec-frequencies. In practice the most remarkable difference tively. In general L is a positive real number, and thusbetween direct and reflected excitation is the inverted implementation of the delay line calls for the use of aphase, and thus the filter can be replaced by a constant fractional delay filter, which we denote by F(z). It is amultiplier rf = -1 + e, where e is a small nonnegative phase equalizer that brings about a controllable frac-real number. The contribution of a pickup point in an tional delay. A first-order all-pass filter [4] and a third-electric guitar can be incorporated into the model using order maximally flat FIR filter based on Lagrange inter-

polation [8] have been suggested as alternatives for theExcitation implementation of the fractional-length delay line in a

stringmodel.I String1 _

InteractionS,· _ Sound We used Lagrange interpolation for fractional delayString_ Body [Radiation. approximation. The filter coefficients h(n)of the La-

grange interpolator are computed as (see, for example,/_/--i : [25])

Fig. 6. Block diagram of general model for plucked string /vinstrument, h(n) = I'I D - k

k=on- k' n = 0,1,2 ..... N (4)k_n

Excitation

Output where N is the order of the FIR filter and D the desired

Rf_lDelayLiiel 1_ v

. delay. Lagrange interpolation is equivalent to the maxi-mally flat approximation (at _o = 0) of band-limited

Delay_ interpolation. The approximation error (and its N deriva-

Fig. 7. Waveguide model for vibrating lossy string, tives) is zero at o_ = 0 and negative everywhere else(unless d = 0), that is, the Lagrange interpolator is alow-pass filter. The computational efficiencies of Nth-

PluckPoint String order maximally fiat all-pass and FIR fractional delayInput Equalizer Model Output filters are compared in [26]. All-pass fractional delayx(n) P(z)_ _ y(n) filters are recursive and thus are prone to transient effects

Fig. 8. Generalized Karplus-Strong model consisting of string when their delay is changed. In [27], [28] a new methodmodel S(z) cascaded with plucking-point equalizer P(z). was introduced to suppress the transients in time-varying

d. Audio Eng. Soc., Vol. 44, No. 5, 1996 May 335


recursive filters. This method was tested in the context the filter. For/-/l(Z) tO be a stable low-pass transfer func-

of string modeling [27]. An extensive review of digital tion, we require that - 1 < al < 0. The numerator 1 +'filter approximations for fractional delay has been pre- a I scales the frequency response (divided by g) at 0 tosented by Laakso et al. [25]. unity, thus allowing control of the gain at co = 0 using

Real strings are more or less stiff so that the wave the coefficient g. We require that Igl_< 1.

propagation is dispersive, that is, the total loop delay is Fig. 10 shows the magnitude response and the groupfrequency dependent. In principle, a single filter may delay of the loop filter HI(z) [Eq. (6)] with three different

implement both the magnitude and the phase in the string values of coefficient a1. These values have been chosenmodel loop, but in practice it is easier to use three filters so that they represent cases found in practice. In this exam-to control the three parts separately-- a linear-phase all- ple g = 0.99. (A change in g merely shifts the magnitudepass approximation F(z)to control the fundamental fre- response curve vertically.) Note that the group delay ofquency, the loop filter Hi(z) to control the magnitude the loop filter is very small in all three cases.response, and an all-pass filter D (z) to adjust the frequen- Fig. 11 shows the impulse response of the string modelcy-dependent overall phase delay. See [29], [30] for a S(z) with the three loop filters of Fig. 10. Here the lengthdiscussion on the implementation of dispersive wave-guide models using all-pass filters. For low strings, espe-

cially on the piano [31], [32], dispersivity is important xp(n) '_'ri_)_Ht(z_ Il L y(n)or crucial to the timbre of sound synthesis. For higherstrings the dispersivity is relatively small. In mostplucked string instruments the effect due to dispersion Fig. 9. Block diagram of the string model S(z). F(z)--FIR orcan be neglected and the filter D(z) may be left out of all-pass-type fractional delay filter; Hl(z)--loo p filter.the model.

To conclude our discussion on the frequency depen- i' 'al:-o00, _

dence of the string length, let us express L((o) by means 41=---¢_i-

of the three filters discussed, _0.95i ai_L(co) = L I + 'tH(CO) + 'TF(00 ) "{- 'rD(la) ) (5) _'0'9 t085'

0 0.05 0:l 0.15 0'.2 0.25 0'.3 0.35 0.4 0.45 0.5

where L[ is the integer number of unit delays in the·delay N.... lized Frequency

line, and Xn(CO),xr(co)m, and 'rD(m)are the phase delays (a)

of the loop filter Hi(z), the fractional delay filter F(z), _______ . , ..,_and the dispersion filter D(z), respectively. Fig. 9 illus- 0.05

2.3trateSLoopthe implementationFilterof the string model S(z). _0_ {_l_ -- -- _'"'"_ _ !l----__i_alal=-0.01=-0.001In the original KS model the loop filter Hl(z ) is a -0o5........' -0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5

two-tap averager which is easy to implement without NormalizedFrequency

multiplications [3]. This simple filter, however, cannot (b)match the frequency-dependent damping of a physical Fig. 10. Magnitude response and group delay of one-pole loopstring. We feel that an FIR filter is not suited to this filter for three different values of a I. g = 0.99 in all cases.purpose since its order should be rather high to matchthe desired characteristics. Instead we suggest the use

al = -0.001, g = 0.99, L = 19

o a,m ,owpass.,terforsimu,ati. th dam,,in 'ILIILLLi III1Icharacteristics of a physical string. The same decision ?was made in [12]. From [17], [33] it can be concluded _0.5

that a second- or third-order all-pole filter could be suita- < 0_10 100 260 300 400 560 600 700 800 900 1000ble for a loop filter. It is important to use the simplest Sample Indexsatisfying loop filte r since one such filter is needed for 41=-0.ol.g=o.99.L=19

the modelof each string.Also, the loop filteris not a i ]. j i ''_static filter, but its coefficients should be changed as a _0_tll 1function of the string length and other playing param- < o/tit ,eters. This is easiest to achieve using a low-order IIR 0 100 200 300 400 500 600 700 800 900 1000

· SampleIndex

filter. We found that a reasonable approximation mayal = -O.O5,g =0.99, L = 19

be obtainedusinga first-orderall-polefilter, i .........

_0.5_ ' ' ' ', '_1 + a I (6) <EoHr(z) = g 1 + a]z-1 0 100 200 300 400 500 600 700 800 900 1000

Sample Index

where g is the·gain of the filter at 0 Hz and a 1 is the Fig. 11. Impulse responses of string model obtained using loopfilter coefficient that determines the cutoff frequency of filters of Fig. 10.

336 J.AudioEng.Soc.,VoL44,No.5,1996May


of the delay line is L = 19. It is seen that the impulse tic guitar, respectively. The impulse response can beresponse decays more rapidly as the magnitude of al seen to be long enough so that the temporal envelope canincreases, beassumedto be perceptuallyimportant.Thesituationis

comparable to the perception of reverberation in a small2.4 Modeling the Body room. This implies that a digital filter approximation

From the viewpoint of signal processing, the body of should not be designed exclusively based on magnitude-the acoustic guitar and the transfer function from the response criteria. The spectral envelope seen in Fig. 13 isbridge to a specific direction may be considered as a relatively flat, but there are a large number of resonanceshigh-order linear filter. A set of transfer functions would starting from the lowest resonance around 100 Hz.be needed to simulate the directivity pattem of a string We have studied several digital filter approximationsinstrument [34]. In most physical models only one transfer for modeling the body of the acoustic guitar [18]. Anfunction has been included. This simplified approach simu- FIR filter model of th6 body response must be at leastlates the radiation of the sound to one point in front of the 50 to 100 ms long (more than 1000 taps) to yield good

sound hole of the string instrument. This kind of model synthetic sound. Linear prediction (LPC) analysis sug-will thus at best produce a sound reminiscent of a recording gests an all-pole filter model of order 500 or more. Bothof a musical instrument in a relatively anechoic room. In of these are computationally too expensive for real-timeSection 2.5 we discuss different methods to incorporate implementation using a single DSP processor. We alsodirectivity characteristics into physical models, designed reduced-order IIR filters that approximate the

The transfer function from the bridge to the listener frequency resolution of the human auditory system, butcan be measured approximately by exciting the bridge even these did not reduce the computational load enoughwith an impulse hammer and registering the radiated without audible degradation of the sound quality. A moresound. Figs. 12 and 13 show the impulse response and detailed discussion of the body modeling problem isthe magnitude spectrum of the body of a classical acous- given elsewhere [23], [35].

To overcome the inherently heavy computational loadof the filter-based body models a novel method wasinvented [7], [9]-[ 11]. Let us consider the string instru-

0_ ment model of Fig. 14(a). The output signal y(n) of thissystem can be expressed as

0.6

0.4 y(n) = e(n) * h(n) * b(n) (7)

.g$. 02 where the asterisk denotes the operation of linear dis-

crete convolution, e(n) is the excitation signal [that is,<

the impulse response of the excitation filter E(z)], and-o h(n) and b(n) are the impulse responses of the general-

ized KS string model [H(z) = P(z)S(z)] and the body,-04 respectively. Fundamental properties of linear opera--0.6 tors, such as associativity and commutativity, are valid

0 2'0 4m0 6'0 8mo 100 120 140 160 180 200 for discrete convolution, and thus Eq. (7) can be rewrit-Time (ms) ten as

Fig. 12. Impulse response of body of classical acoustic guitar.y(n) = x(n) * h(n) (8)

i// where we have definedx(n) = e(n) * b(n). (9)

-10

Thus it is possible to swap the transfer functions of'15

the generalized KS model and the body, and, in addition,

-25 Excitation GKS String Body-_ Impulse Filter Model Model Output

_-30 fi(n) _, E(z) e(n)_ · y(n)

-35 (a)

-40

Excitation Excitation GKS String-45 SequenceFilter Model Output

-5( b(n) E(z)_ · y(n)1 2 3 4 5 6 7 8 9 10

Frequency (kHz) (b)

Fig.. 13. Frequency response of body of classical acoustic Fig. 14. (a) Block diagram of string instrument model.gmtar. (b) Equivalent model obtained by system reordering.

d.AudioEng.Soc.,Vol.44,No.5,1996May 337

VALIMAKI ET AL. PAPERS

e(n) and b(n) can be combined, as shown in Fig. 14(b). cessing point of view and as good as possible from theThe signal x(n) is then used as the input to the general- perceptual point of view. In earlier studies we have con-ized KS string model. The combination of the excitation sidered three strategies [34], [37]:signal and the impulse response of the body is motivated 1) Directional filteringby the fact that then the body need not be modeled 2) Set of elementary sourcesexplicitly during real-time synthesis. The signal x(n) 3) Direction-dependent excitation.

combining the excitation and the body response can be A direction-dependent digital filter may be attachedestimated by precomputing it based on some model, by to each path from the source to the listener. Moving andmeasuring it somehow, or by inverse filtering a digital rotating sources can be modeled by changing the filterrecording of an instrument, as will be discussed in Sec- parameters of the paths in a proper way (for example,tion 3.5. theLeslieeffectof a rotatingloudspeakercan be simu-

We may develop the model further by explicitly mod- lated). The directional filtering method was studied foreling the excitation signal. The first step toward this the acoustic guitar. We came to the conclusion that even

direction is to extract the most prominent resonances of first- or second-order directivity filters give useful re-the body, measure their central frequencies and Q val- suits, thus leading to an efficient implementation.

ues, and design a second-order all-pole filter to represent Fig. 15 depicts the modeling of direction-dependenteach resonance. These resonances must then be removed radiation of the acoustic guitar (in the horizontal plane)from the excitation signal, for example, by using nar- relative to the main-axis radiation. Shown in the figurerow-band linear-phase FIR filters, are magnitude spectra for second-order IIR filters at azi-

We have successfully extracted the two lowest reso- muth angles of 90, 135, and 180°. The reference magni-nances of the body of an acoustic guitar. The main ad- tude spectrum at 0 ° is assumed to be flat. The low-vantage of this approach is that the residual signal with pass characteristic is noticeably increased as the relativethe low-frequency high-Q resonances removed decays angle is greater. The measurement was carried out bymore rapidly than the original one. As a consequence, exciting the bridge of the instrument using an impulsea shorter excitation signal can be used, and the memory hammer and registering the reference response at 0° andrequirements for the model-based synthesizer are low- the related response in various directions. The measuredered. Also, the Q values and the central frequencies of reference and the directional response were fitted sepa-the lowest resonances are now parameterized and thus rately with first- or second-order AR models. A simpleindependently controllable, division of the models was performed to obtain the

pole-zero directivity filter.2.5 Modeling Directional Properties of String Fig. 16(a) shows the transfer function measured atInstruments azimuth angles of 0 and 180 °. Fig. 16(b) gives the re-

The directional characteristics of musical instrument sponse at 0° filtered with a first-order IIR directivitysound radiation are of interest in several applications filter and the actual measured response at 180 ° azimuth.using model-based sound synthesis. From the scientific Note that the spectral slopes are nearly the same, as waspoint of view the physical modeling approach serves as expected. In this example the transfer function of thea flexible tool for analysis and simulation of the directiv- directional filter isity of the instruments. Directional characteristics must

be taken into account when model-based sound synthesis R (z) - bo + biz- lmethods are used for sound generation in room acoustics 1 + a]z- _ (10)simulation and other virtual reality applications [34],[36].

Plucked string instruments exhibit complex sound ra- o ____<__

diation patterns due to various reasons. The resonant- _,mode frequencies of the instrument body account formost of the sound radiation (see, for example, [17]). In -4

string instruments different mode frequencies of the -6 obodyhave their ownpatterns, such as monopoles,di- _ _ _35°poles, or quadrupoles, and their combinations. The _-_

sound radiated from the vibrating strings, however, is :-10weak and can be neglected in the simulation.

Another noticeable factor in the modeling of directiv- -_2

ityis masking caused by (and reflection from) the player -_4of the instrument.Maskingplays an importantrole in _80°virtual environmentswhere the listener and sound -x6 __

sources are freely moving in a space. -_8.....It is clear that computational modeling of the detailed o5 I 11.5 '_ 21.5 3 3.5 4Frequency (kHz)

directivity patterns is beyond the capacity of real-timeFig. 15. Modeling direction-dependent radiation of acoustic

DSP sound synthesis. It is therefore important to find guitar relative to main-axis radiation with second-order HRsimplified models that are efficient from the signal pro- filter.



where the coefficient values are bo = 0.2041, b I = in parallel as there are directions to be included. This-0.1431, and a_ = -0.4389. limits the number of simulated directions, for example,

When zooming to the details of the lowest resonance to the six main directions of the Cartesian coordinate

modes we notice, as described in [17], that the individual system. Even then this approach is computationallymodes behave differently. To model such details, a rela- very inefficient.tively high-order directional filter is needed. It is im- The foregoing considerations as well as our experi-portant to notice that due to the critical-band frequency ments have shown that the directional filtering techniqueresolution, an auditory smoothing may be applied to the is normally the most practical one. A first- or second-responses before directional filter estimation. This helps order filter approximation is often a satisfactory solutionto reduce the order of the filter, in a real-time implementation.

The use of elementary sources is based on the idea thatthe radiation pattern of an instrument is approximated by 2.6 Extensions of the String Modela small number of monopoles or dipoles. In general the The string model can be extended in several ways tomethod is computationally expensive if a large number more carefully simulate a vibrating physical string. Inof paths to the receiver is needed since each new elemen- the following we present three examples. For a moretary source adds a new set of path filters, detailed discussion, see [23].

In the case of commutative excitation [such as theplucked string instrument model shown in Fig. 14(b)] 2.6.1 Sympathetic Vibrationsthe directivity filtering may be included in the composite Sympathetic vibration is a phenomenon where someexcitation in a way similar to the inclusion of the (early) modes of a string are excited by the vibration of otherroom response [7]. The problem with this method is that strings primarily due to couplings through the bridge.the same number of string model instances are to be run It can be simulated by feeding a small fraction of the

output of each string model to the other strings [4], [11].This simplified technique does not take into account thefrequency dependence of sympathetic couplings, but it

mo still addsrealismto the synthetictones.Sincethereis90 feedbackviaall strings,thevaluesof thecouplingcoef-

ficients must be carefully adjusted so that they are smalls0 enoughnot to makethe systemunstable.Anotherand

70 theoreticallymorecorrectwaytoincorporatethesympa-thetic vibrations is to represent the bridge by a separate

60 filterthatis commonto all stringsin a waveguidemodelof an instrument [7]. This approach adds, however, c0m-

50 putationalcomplexityto the overallmodel.

40 2.6.2 Dual-Polarization String Model

30 Physicalstringsvibratein boththe verticaland the0 I 2 3 4 5 6 7 8 9 10 horizontal directions. If the effective length of the string

Freque,cy(mz) is not the same in the two polarization planes, the mixing(a) of the two subsignals of slightly different frequencies

il°Il createsbeatsin thesound.Thishasbeenfoundto bean

]00H[]_ importantfeature in the sound qualityof the kantele[_ll.I_' [20], wheretheexplanationforthis effectis thepeculiar

9°_'_[[1__ _a A_, Fi't_redr_fer_nce.180° way in which the strings have been terminated. In otherstring instruments, however, the effective length in the

80 two directions can differ due to changes in the driving-

7o point impedance of the bridge. This can be incorporatedin the model by using two basic string models with

60 different L for each string [11]. Fig. 17 illustrates acomplete model, including dual-polarization string,

5o wavetableexcitation,andsympatheticcouplings.

40 In thecaseof themandolin,beatingis causedbythefact that there are four pairs of equally tuned strings.

30 This can be modeled using two separate string models0 _ 2 3 4 5 6 7 s 9 10 for each pair.

Frequency (Hz)

(b) 2.6.3 Nonlinearities

Fig. 16. Modeling of guitar sound radiation with first-order In strings the polarization of vibration may changeIIR filters. (a) Transfer functions measured at azimuth angles0 and 180°. (b) Response at 0° filtered with directivity filter continuously in a complex manner since there are weakand measured transfer function at 180°. nonlinearities transferring energy between modes and


m · · ....

V]_LIM_.KIETAL. PAPERS

polarizations. In the kantele, bending the tuning peg at where y(n) is the signal to be analyzed and w(n) is aone end of the strings has been shown to induce nonlin- window function (such as the Hamming window). Anearity due to longitudinal forces [20]. This effect has estimate for the pitch is obtained by searching for thebeen simulated successfully in the following way. The maximum of _bk(m) for each k.

delay line signal is squared, then filtered using a leaky Typical pitch contours of a guitar and a kantele toneintegrator, and the result is added to the output of the are shown in Fig. 18. The contours have been smoothed

string model. This procedure boosts the even harmonics using a three-point running median filter [40]. In bothof the signal, cases the pitch decreases with time and approaches a

In [38] a passive nonlinear filter was PrOPosed for constant value after about 0.5 s. In the kantele the de-

producing effects similar to mode coupling in string increase of the fundamental frequency is quite substantial..struments. A first-order digital all-pass filter is attached The problem now is to determine the best estimate foto the end of the delay line of the string model. The · for the fundamental frequency in a perceptual sense. Incoefficient of the all-pass filter depends on the sign of practice a good solution is to use the average pitch valuethe delay line signal. Another alternative is to use a after some 500 ms as the nominal value, since it isvariable delay based on a Lagrange interpolator FIR important to have a reliable pitch estimate toward the

filter. This technique was found to be suitable for pro- end of the note. This improves the quality of synthe-ducing synthetic signals that exhibit time-varying behav- sized tones.

ior in their decay parts. Once the estimate .f0 for the fundamental frequencyhas been chosen, the effective length L of the delay line

3 CALIBRATION OF MODEL PARAMETERS can be computed as

The synthesis system includes three parts that corn- fspletely determine the character of the synthetic sound-- L = r-. (13)fothe string model, the plucking-point equalizer, and the

input sequence. This implies that to calibrate the model 250[ , jto some particular instrument it is needed to estimate _249.8Ithe values for the length of the delay line L, the coeffi- __

cients of the loop filter Hi(z), the delay M and the param- _249'6I_249.4I

eter rf of the plucking-point equalizer, and the input _ 249'2Isignal x(n). In this section the parameter estimation pro-cedures that will extract these values are described. 249_ o'.l o'.2 o13 0'.4 o15 0.6

(a)3.1 Estimation of the Delay Line Length L

The delay length L (in samples) determines the funda- 409 1mental frequency fo of the synthetic signal according to __4o8Eq. (3) so that

fo = (11) _--i

406 01.! 0'.2 0'.3 01.4 0.5 0.6

Time(s)wherefs is the sample rate. For pitch detection we have (b)

used a well-known method based on the autocorrelation Fig. 18. Fluctuation of fundamental frequency. (a) Steel-stringfunction. The short-term or windowed autocorrelation gmtar tone. (b) Kantele tone.functio n is defined as [39]

1N-1

_k(m) = _ _ [y(n + k)w(n)y(n + k + m)w(n + m)] , 0 _<m _<M (12)_'=o

Sympathetic SympatheticPluck &Body Coupling from Horizontal Coupling toWavetables Other Strings Polarization Other Strings

Ill OutputWavetable n E(z_

VerticalPolarization

Fig.. 17. Overview of plucked string synthesis model, including dual-polarization strings and sympathetic coupling betweenstnngs.

340 J.AudioEng.Soc.,VoL44,No.5,1996May

PAPERS PHYSICALMODELINGOFPLUCKEDSTRINGINSTRUMENTS

When the loop filter has also been designed, we can Spectral peak detection results in a sequence of magni-subtract its phase delay from L and define the nominal tude and frequency pairs for each harmonic. The se-fractional delay Lr as quence of magnitude values for one harmonic is called

the envelope curve of that harmonic. A straight line isLr = L - 'tH(C00)- floor[L - %/(COo)] (14) matched to each envelope curve on a decibel scale, since

ideally the magnitude of every harmonic should decaywhere cb0 = 2_rjeo. The delay Lr is used as the desired exponentially, that is, linearly on a logarithmic scale.delay in the design of the fractional delay filter F(z). In Measurements show that this idealized case is rarely metpractice the phase delay of this filter can be expressed in practice, and many different kinds of deviations areas xF(Coo) = Mr + Lr, where positive integer Mr is the common. It is possible to decrease the error in fit byintegral part of the phase delay that depends on the order starting it at the maximum of the envelope curve [43]of the fractional delay filter and Lr is the fractional part. and by terminating it before the data get mixed with the

noise floor, such as after the decay of about 40 dB. As3.2 Measuring the Frequency-Dependent a result, a collection of slopes I_k for k = 1, 2.....Damping N h is obtained (see Fig. 19). The corresponding loop

Next we discuss how to measure damping of a string gain of the string model at the harmonic frequencies isas a function of frequency. As discussed in Section 2.1, computed asall losses, including the reflection at the two ends of thestring, are incorporated in the loop filter Hi(z). In order Gk = 10_Aa°H , k = 1,2 ..... N h (17)to design a loop filter we need to estimate the damping

factors at the harmonic frequencies of the sound signal, where [3k are the slopes of the envelopes and H is theThis is achieved using the short-time Fourier transform hop size. The sequence Gk determines the prototype(STFT) [19] and tracking the amplitude of each har- magnitude response of the loop filter Hl(o_) at the har-monic, monic frequencies o_k, k = 1, 2 ..... N h, as illustrated

The STFT of y(n) is a sequence of'discrete Fourier in Fig. 20.transforms (DFT) defined as The desired phase response for Hl(_) can be deter-

mined based on the frequency estimates of the harmon-N- _ ics. We do not, however, try to match the phase response

Ym(k)= _ w(n)y(n + mH)e -j'°_ , m = 0, 1, 2,... of the filter. There are two principal reasons for this.n=O

(15) First we believe that it is much more important to matchthe time constants of the partials of the synthetic sound

with than the frequencies of the partials, since rather smalldeviations from harmonic frequencies occur in the case

2_rk of plucked string instruments. In case we wanted to,r_°k - N ' k = 0, 1, 2 ..... N - 1 ' (16) resynthesize strongly inharmonic tones, the phase re-

sponse of the loop filter should be considered. The sec-where N is the length of the DFT, w(n) is a window ond reason is that as we want to use a low-order loopfunction, and H is the hop size or time advance (in filter, the complex approximation of the frequency re-'

samples) per frame. In practice we compute each DFT sponse would not be very successful. Thus we restrictusing the FFT algorithm. To obtain a suitable compro- ourselves to magnitude approximation only in the loopmise between time and frequency resolution, we use a filter design.

window length of four times the period length of thesignal. The overlap of the windows is 50%, implyingthat H is 0.5 times the window length. We apply exces- . . ' .....

sive zero padding by filling the signal buffer with zeros _ ' 2 '__to reach N = 4096 in order to increase the resolution

inthefrequencydomain.by Thespectralpeakscorrespond- _ _ .. 3._

ing to harmonics can be found from the magnitude spec-

trum first finding the nearest local minimum at both -_c

sides of an assumed maximum. The largest magnitude _-_5 _---_______between these two local minima is assumed to be the ._harmonic peak. The estimates for the frequency and :_magnitude of the peak are fine-tuned by applying para- -2cbolic interpolation around the local maximum and solv-ing for the value and location of the maximum of this -2_interpolating function (see [41] or [42] for details). The

number of harmonics to be detected is typically Nh < -3c 0.05 01! 0.15 012 0.25 0'.3 0.35 0'.4 0.45 0.520 (for the acoustic guitar). The STFT analysis is applied Time(s)

to the portion of the signal that starts before the attack Fig. 19. Straight-line fits (dotted lines) of envelope curvesand ends after some 0.5-1 s after the attack. (solid lines) of six lowest harmonics of electric guitar tone.

J.AudioEng.Soc.,Vol.44,No.5, 1996May 341

VALIMAKI ET AL. PAPERS

Let us denote the numerator of Eq. (6) by3.3 Loop Filter Design

It is known that our hearing is sensitive to the change A = g(1 + al) (20)of decay rate of a sinusoid, and in practice we measure

the time constant of some 20 lowest harmonics with the and HI(to) with the numerator removed byintent of matching the frequency response of the loop

filter Hl(Z) to those data. Since we use a first-order loop //l(to) - HI(t°)filter, it is clear that there are more restrictions than A (21)

unknowns in this problem, and only an approximate

solution is possible. Then Eq. (18) can be rewritten asIn principle the loop filter should be designed based

on an auditory criterion, but since that would be compli- tq- 1cated, we decided to use weighted least-squares design. E = _ W(GD[IAPi(%)I - Gk]2 . (22)

k=0The error function to be minimized in the magnitude-

only approximation can be defined as The gain g of the loop filter at low frequencies can

Nh-l be chosen based on the loop gain values of the lowest

E = ___ W(GO[lHl(toOI- Gk]2 (18) harmonics. In many cases it is good enough to set g =k=0 Go, whereas sometimes the average of two or three low-

est loop gain values gives a better result.

where N h is the number of frequency points where the The value for the coefficient a I that minimizes E canloop gain Gk is approximated, % = kO0 are the central be found by differentiating Eq. (22) with respect to a 1.frequencies of the N h lowest harmonics, and W(Gk) is This yieldsa nonnegative error weighting function. It is reasonable

ltO choose a weighting function W_ Ut _ that gives a larger

Nh - 1OE- 2A0 _'. W(GO01[-/1(%)1''[IAo[/l(to0l- Gk].

weight to the errors in the time constants of the slowly Oa1 k=o Oaldecaying harmonics since our hearing tends to focus ontheirdecay.A candidatefor sucha weightingfunctionis (23)

By substituting Eqs. (6) and (19), we can writeNh-l

OE _ 2A° _ Aocos tok(1 + alcos%) -3 - Gkcos ton(1 + alcos%) -l (24)Oal k=o 1 -- Gk

The aim is now to find the zero of this function. In

1 practice we find a near-optimal solution in the following

W(GO - 1 - Gk' (19) way. The value of the derivative is evaluated, and de-pending on the sign of the result, a I is changed by asmall increment, the derivative is evaluated again, and

We require that 0 _< Gk < 1 for all k, which is a physi- so on. After the derivative has reached a very smallcally reasonable assumption since the system to be mod- value, the iteration is terminated and the final value foreled is passive and stable. a1 is used in the synthesis model. We have verified

the convergence of this design procedure in practice by....... analyzing signals generated using the synthesis model.

'9 Theloopfilterthatwasdesignedbasedonthe analysis0.995 data yields the same filter parameters--within numeri-

cal accuracy--as were used in the synthesis. Also the

°'99I I/ [ T"_ _-- _ match with natural tones has been found to be satisfac-

·S['_°'985[ I I_I '"'-( tory in many cases. Fig. 20 illustrates the loop filter

094r I I I I I design fora typical kantele tone.

0.975 It iswellunderstoodthatwhena stringisput tovibra-90.97 tion by plucking it, the sound signal will lack those har-

monies that have a node at the plucking point (see, for.0'9651 example, [17]). However, in general the string is not

0 96_/ ' 015 '_'.s ' _' _'._ , _ , _ plucked exactly at the node of any of the lowest harmon-' . _.5 2 2.5 3 3.5 ics, and since the amplitude of the higher harmonics isFrequency (kHz)

Fig. 20. Loop filter design for kantele. Circles--loop gains considerably small anyway, it is not possible to accu-G2 at harmonic frequencies; solid line--the magnitude re- rately detect the plucking point by simply searching forsponse [H_(z)l2. the lacking harmonics in the magnitude spectrum. An-

342 d. Audio Eng. Soc., Vol. 44, No. 5, 1996 May


other practical problem may occur because of nonlinear using a time-varying inverse filter, which cancels thebehavior of the string. Namely, the amplitude of vibra- impulse response of the string. This filter can be' de-tion of a weak harmonic can gain energy from other signed based on the STFT analysis discussed above.modes so that its amplitude begins to rise, reaching a However, satisfactory results are obtained when usingmaximum about 100 ms after the attack, and then begins carefully chosen tones that behave well, that is, thereto decay [44]. This can often be seen in the analysis of are no strong nonlinear effects nor beating present. Fig.harmonic envelopes of the guitar. For these reasons we 21 shows the spectrum of a mandolin tone and the magni-believe that a more comprehensive understanding of the tude response of the inverse filter. The magnitude spec-effect of the plucking point can be achieved by studying trum of the residual (the result of inverse filtering) isthe time-domain behavior of the string in terms of the illustrated in Fig. 22. Note that in this case the harmonicsshort-time autocorrelation function. A frequency- have been canceled quite accurately.

domain technique for estimating the plucking point has

been reported by Bradley et al. [45]. 4 SYNTHESIS OF PLUCKED STRINGSEstimation of the plucking position is an inherently

difficult problem since a recorded tone can include con- The residual signals resulting from the analysis andtributions of several delays of approximately the same inverse filtering of an original string instrument tone canmagnitude, such as early reflections from objects near be directly applied as the excitation to the string model,the player, the floor, the ceiling, or a wall. To minimize as described in Sections 2 and 3. The synthesis procedurethe effect of these additional factors, we suggest per- is carried out in the following way.

forming the analysis of tones recorded in an anechoic 1) Obtain the residual signal by inverse filtering,chamber. 2) Window the first 100 ms of the residual signal

It is, however, not absolutely mandatory to estimate using, for example, the right half of a Hammingand model the effect of the plucking position. Its contri- window.bution can be left in the excitation signal obtained usinginverse filtering, which is described in the followingsection.

0

3.5 Inverse Filtering

hein utsignalxn, an s ima e usinginv rse_,o...!1filtering, that is, filtering the signal y (n) which is now _

assumed to be the output of the model with the in- _-20 ·

vetted transfer function S-'(z) of the string model. The _ii [

transfer function of the string model shown in Fig. 9canbeexpressedas 30

1

S(z) = 1 - z -L_F(z)HI(z)' (25)

If S(z) had zeros outside the unit circle, the inverse filter - ' ' 9 1S-l(z) would be unstable. In that case, inverse filtering Fr*q.... y(mz)would not yield an acceptable result. Fortunately this is Fig. 21. Magnitude spectrum of mandolin tone (solid line) andmagnitude response of inverse filter (dotted line).not a problem in practice, as can be seen by substitutingEq. (6) into Eq. (25). Inverting this equation yields

S-i(2)=l-[-alz-1--lg(l+alz-+ alz -1 1)z-L' F(z) (26) -it-10

This technique is simple to apply, but since the order -_5

of the loop filter is low, the harmonics are not canceled 4_20very accurately using the inverse filter S-_(z). The re- vsulting x(n) may suffer from high-frequency noise. The _-25low-order loop filter was chosen primarily since it is then -30

efficient to implement the synthesis model. However,inverse filtering is an off-line procedure, where accuracy -35is much more important than efficiency. We could thus -40

design a higher order filter to be used in the inverse -45filtering (see for example, [12] or [15]).

Another deficiency of the inverse filtering technique I 2 3 4 5 6 7 8 9 l0

is that it does not take into account the nonlinear effects Frequency(kHz)

of the physical strings. This can be accounted for by Fig. 22. Magnitude spectrum of inverse filtered mandolin tone.



3) Use the truncated signal as the excitation to the The analysis and synthesis results of the steel-string

string model, acoustic guitar were similar to those of the nylon-string4) Run the string model using the parameters derived acoustic guitar. The use of the plectrum instead of the

from the analysis, finger in the string excitation results in a different-sound-

The original and the synthesized signals presented and ing residual signal, but the string model can be the samediscussed in this chapter are downloadable as sound files for both instruments.

(AIFF, WAV, and MPEG-1 layer II formats) using theWorld Wide Web [46]. 4.2 Mandolin

The analysis and resynthesis of the mandolin are illus-

4.1 Guitars trated in Figs. 25 and 26. We notice that windowing and

'From the analysis and synthesis results of the electric truncation of the residual signal have only minor effects,

guitar shown in Figs. 23 and 24 it can be seen that the shortening the decay of the lowest body resonances. The

length of the residual signal used in resynthesis can be resynthesized tone in Fig. 26(b) has been calculated using

reduced to about 50 ms without any significant loss of the first 100 ms of the residual and a single-polarization

information. This is due to the lack of body resonances, string model. The original mandolin signal shows a slight

The resulting sound is quite similar to that of the basic beating effect after the transient part, whereas the resynthe-

KS model excited with an impulse. It is clear that in this sized tone has an exponentially decaying behavior. It is,

case physical modeling principles should be extended to however, almost impossible 'to distinguish the synthetic

various magnetic pickup combinations and amplifiers, tone from the original one by ear. Using a dual-polarization

i0.5 0. -

-_o. i'ii. .-1 0.05 0.1 ' 0.15 0.2 0.25 I 0.! 0.]5 012 0.25

(a) (a)

0._

-o, -o_

i i i

-10 0.05 0.1 0.15 012 0.25 0.05 0.1 0.115 012 0.25

Time(s) Time(s)

(b) (b)

Fig. 23. (a) Electric guitar tone. (b) Residual signal after in- Fig. 25. (a) Mandolin tone. (b) Residual signal after inverseverse filtering, filtering.

0.5 0.5

0 *. 0

-0.5 -0.

' i ' ' '-10 0.05 0 1 0.15 0 2 0.25 0 0.05 0.1 0.15 0.2 0.25

(a) (a)

I , , , , ,

0'5I0 '_/

-0.

. i

0 0.05 0.1 0.15 0.2 0.25 0 0.05 0.1 O.15 0.2 0.25

Time(s) Time(s)

(b) (b)

Fig. 24. (a) Truncated residual signal. (b) Resynthesized elec- Fig. 26. (a) Truncated residual signal. (b) Resynthesized man-tricguitartone. dolintone.

344 J.AudioEng.Soc.,Vol.44, No.5, 1996May


string model with a very small beating effect would result attack part of the signal. The distinctive sound of thein an even more natural tone. Figs. 27 and 28 depict the banjo is retained in the residual signal after inverse fil-temporal characteristics of the six lowest harmonics of the tering. Figs. 31 and 32 depict the results of the analysisoriginal and the synthetic mandolin tones, respectively. It and the synthesis.can be clearly seen that the transient part is identical in In this example some of the harmonics of the originalboth plots. The decay of the harmonics is very similar but signal decay much faster than the first harmonic. Thenot identical, synthesis model with a first-order loop filter is not able

to catch this phenomenon. For this reason the decay4.3 Kantele rate of some harmonics is too slow in the resynthesized

The use of the dual-polarization string model is essen- signal. This can be seen by comparing Figs. 31 andtial in the case of the kantele. As shown in Fig. 5, there 32. Apart from this slight difference, the overall soundis significant beating in the signal, which clearly affects quality of the synthetic signal is excellent.the sound quality. Modeling this dual-polarization be-

havior is quite straightforward with two separate string 5 REAL-TIME IMPLEMENTATION ON A SIGNALmodels, as was discussed in Section 2.6.2. Figs. 29 and PROCESSOR30 depict the analysis and synthesis results for a kanteletone. A dual-polarization string model was used for the The real-time synthesis models were implemented us-synthesis. The nonlinear characteristics of the kantele ing a Texas Instruments TMS320C30 floating-point sig-have been resynthesized according to the principles pre- nal processor. This processor is capable of executing asented in Section 2.6.3. maximum of 30 million floating-point operations per

4.4 Banjo _ .....

In the analysis of the banjo we experienced that most osof the characteristics of a banjo tone are included in the

0

': 4L

..' '· · 0 0.05 O.I 0.15 0.2 0.25

(a)_-lO, '

._-30, 0.5

0 I.............. )AJ_........ Jt i I.

-0.5[1 ':: -lo 0.85 oh 0.'15 012 0.252 Time (s)

o (b)3 o.2 o.t Fig. 29. (a) Originalkanteletone. (b) Residualsignalafter0.3

0.4 inverse filtering.Frequency (kHz) 4 0.5 Time (s)

Fig. 27. STFT analysis of mandolin tone. (Gibson model A,first open single string, fundamental frequency 649 Hz.)

0.i

.f.. .!.

: : :. -0

_,0, '" -_ 0.85 d, o._s d2 o.ls-20, ". (a)

0-30

-50_0" 0.1 ' -0.52

0 - I 0.05 0]1 0.15 0.2 0.253 0.2 0.1 Time(s)

0.30.4 (b)

Frequency(kHz) 4 0.5

Time (s) Fig. 30. (a)Truncated residualsignal. (b) Resynthesizedkan-Fig. 28. STFr analysis of resynthesized mandolin tone. tele tone.

J. Audio Eng, Soc., Vol. 44, No. 5, 1996 May 345

· m - -

VALIM_.KI ET AL. PAPERS

second (30 MFLOPS, 15 MIPS). Our experience was, PC platforms. We use commercially available DSP boards.however, that this limit cannot be reached in practice, Our developments include using a multiprocessor en-

except in special cases when register, pipeline, and mem- vironment for running several instruments in parallelory conflicts can be totally avoided. Our hand-optimized or for a more detailed simulation of the instruments,assembly code taking up about 110 instructions per output including nonlinear effects. We have experimented withsample for a single string was able to run six strings in a system containing two TMS320C31 processors, whichreal time with a sampling rate of 22 kHz, including host are less expensive and slightly reduced versions of thecommunication and parameter calculation. However, this TMS320C30, and we have built a more advanced systemwas found to be adequate for producing excellent plucked using TMS320C40 floating-point processors, which sup-

string sounds since there is not much energy in the spec- port multiprocessing in hardware [36].tram of these signals above 10 kHz. The implementation of plucked string synthesis on a

An overview of the software and hardware environ- signal processor follows the principles of Fig. 9. Each

ment for the real-time implementation of physical mod- substring (single polarization) consists of a ring buffer

els is given in Fig. 33. Programs written in the QuickC30 delay line with third-order Lagrange interpolation (forenvironment [47] can be run on different hardware plat- fine-tuning) and a loop filter. Table look-up is used forforms without modification. All hardware-dependent de- computing the interpolator coefficients from pitch infor-

tails are hidden from the application by the use of spe- mation. Also the loop filter parameters have been tablecialized macros and functions. So far the QuickC30 coded. String excitations are read from wavetables that

system has been implemented on the Macintosh and the have been constructed using inverse filtering techniques.Most control parameters are MIDI-like, that is, they

cover a number range from 0 to 127. The computation

of the model parameters, such as the delay and loop-

{). filter parameters and excitation filtering, is done as much

as possible by table look-up since computationally ex-pensive divisions, logarithms, and exponents are other-

-o._ wise needed. (For control parameters, see also Section6.)" 0;5 or, 015 0'2 ,,25

(a) The updating of model parametersis executedevery1 ms, which has been found fast enough in most tran-sientsandtransitions.Thereducedparameterupdaterate

llll is necessaryto keepthecomputationalcost lowenough,o.5 suchas inthecaseof guitarsynthesis.

5.1 Multirate String Model

-0 Astringinstrumenttone,likemostnaturalsignals,is

0 005 o l 0'15 0'2 o25 a low-pass signal whose spectrum varies through timeTimeCs) SO that high-frequency components are damped faster(b) than low-frequencycomponents. Thus it would be ad-

Fig. 31. (a) Original banjo tone. (b) Residual signal after vantageous to design a multirate synthesis model where

inversefiltering, the input sequenceis fed in at a high samplingrate, butwhere the delay line.and the loop filter Hi(z) would runat a rate considerablylower.This idea has alsobeenmentioned in Smith [48, p. 50]. As a result, the attack

0.5

-0.5

DSP

r ,-I 0.05 0.1 0,15 02 (}.25

(a)

0.1

MIDI controllers· -0.5

0.05 0.1 Time (s) 0.15 0'2 0.25 other MIDI applications _ _,

I I

U(b)

Fig. 32. (a) Truncated residual signal. (b) Resynthesized Fig. 33. Software and hardware environment for real-time

banjotone. implementationofphysicalmodels.



part of the synthetic sound will still have energy at high modeling, we believe that model-based synthesis is su-frequencies, thus preventing the sound to be too low- perior to conventional methods when it comes to control

pass filtered, of the synthesis. Most model parameters have a directFig. 34(a) illustrates the idea of the multirate string physical meaning, and thus adjusting them changes the

model. In the feedback loop the signal is first decimated sound in a predictable way. One disadvantage of model-by a factor K, which is a small integer number, say, 2 based methods is that each instrument family has to be

or 3. In decimation, the signal is low-pass filtered using modeled and controlled differently. In order to get aa linear-phase FIR filter Hr(z), and every Kth sample is convincing sound, one has to study the performanceretained. The delay line length L has to be divided by techniques used with the given instrument, and, in aK to keep the fundamental frequency constant. Also the way, the performer has to be modeled as well.loop filter Hl(z ) has to be redesigned. The output of theloop filter is upsampled by the factor K and added to 6.1 Problems Related to String Instruments

the input sequence. Most plucked string instruments have several strings,A more efficient version of the multirate model is the length of which can be fixed (such as the kantele or

presented in Fig. 34(b). Now two versions of the excite- the harp) or variable (as in the lute family). There aretion signal are needed--the original xp(n) (the output several problems of sound synthesis control that appearof the plucking-point equalizer) and a decimated _p(n). only with certain types of instruments [49]. Here weIn this model the excitation signal xp(n) is not processed consider two of them--note transitions and string allo-at all since the feedback loop is computed separately, cation.

The output of the feedback loop, which runs at a lower Note transitions appear when the next note has to besample rate, is upsampled using a linear-phase interpo- played on the same string that was used for the previouslating FIR filter Hr(z ). note. Transitions are perceptually very important. The

The advantages of the multirate approach are reduced lack of transitions makes the sound artificial in a melodicmemory requirements (that is, shorter delay lines) and context, even if the individual notes sound realistic. Cor-

computational savings due to the lower sample rate in rect simulation of note transitions depends on two condi-the feedback loop. However, this method also requires tigris--the ability to control the synthesizer in a propera more sophisticated fractional delay filter F(z). The way and the ability to recognize when transitions areLagrange interpolator does not work well in this case needed. Physical modeling synthesis produces convinc-since its magnitude response error is unacceptable at lng note transitions without any special effort when thehigh frequencies. An all-pass fractional delay filter is parameters of a model are adjusted for a new note. How-better suited to this case. ever, special care must be taken for eliminating extra

noises caused by parameter updating.

6 CONTROL OF THE STRING INSTRUMENT The string allocation mechanism is responsible forMODELS finding the proper sound generation (that is, string

model) for a new note. This is straightforward for instru-In this section we examine the necessary environment ments with fixed strings, since each pitch can be played

for using the models as musical instruments. Though only on a single string. If the number of generators isthe ideas described herein are not restricted to physical less than the number of available pitches on the real

instrument (as with certain types of the kantele or, forexample, in the case of the piano), a voice allocation

xe(n) ___H+z)Ht(z__e-y(n)_K mechanism is needed [49]. Otherwise each string can

have a separate sound generator assigned to it. In thiscase the algorithm is straightforward: select the genera-tor with the given (fixed) pitch. There are no note transi-tions in the sounds of these instruments.

A problem appears with variable-length strings. To(a) allow polyphony, several strings are used on a single

instrument. The pitch ranges of individual strings usu-xp(n) _1) r- y(n) ally overlap. Thus the pitch does not uniquely determine

_z_ the string. On the real instrument the performer has toselect the appropriate string for a given note. Since indi-

_ vidual strings typically have different physical proper-

z_ ties (thickness, mass, and tension), choosing a different

_An) l' ' ' ' ' ' one will result in a slightly different timbre. When thei computer has to perform a piece of music, a string alloca-(b) tion mechanismis needed to simulate this decision. This

Fig. 34. (a) Multirate synthesis model where feedback loop is mechanism is entirely different from the voice alloca-computed at lower sample rate fs/K, with K being the decimation. It is needless to say that string allocation determinestion factor. (b) Equivalent system, where original input signalxp(n) is fed directly to output and a decimated input signal note transitions as well._,(n) is used in feedback loop. The simplest string allocation mechanism tries to as-


V_,LIMAKIET AL. PAPERS

sign the lowest string that can play the given pitch to While pitch is a well-defined property--in practice itthe note. A more elaborate method would be allocating corresponds roughly to the fundamental frequency of

the strings so that the movement of the left hand would a note--expression cannot be uniquely defined. Still,be minimized. This can be accomplished easily if it is everyone "feels" what it means. Although these feelings

possible to preprocess the score. However, this latter are not necessarily the same, it can be agreed that amethod cannot be used with real-time control. As a eom- low value of expression means something like colorless,

promise, nearly real-time control can be used [49]. passing, or soft, whereas a high value can be interpretedas emotional, bright, emphasized, or loud. Moreover

6.2 MIDI and Intelligent Synthesis Control the interpretation of the expression seems to be closelyMIDI has several caveats when used with complex related to the performance style. The realization of this

sound synthesis such as physical modeling. In principle, led to the idea of combining the timbre and the mappingone could use many different controller messages for of pitch and expression to the actual synthesis param-detailed control over the synthesis. This approach, how- eters into performance styles, like classical guitar, bluesever, has severe drawbacks. It is rather difficult to con- guitar, or flamenco guitar.

trol many parameters manually and there are many pa- Reducing the control parameters to only two drasti-rameter combinations that simply do not make a sound, cally reduces the complexity of the control interface,

Another problem is that MIDI glues together the pitch and it still allows human interaction. For more detailedset and note triggering into a single "note on" message, control, the attack part of the notes, which carries theThese events are clearly separated in time on the variable most important indication of the playing technique,source instruments. Therefore when such sound is syn- could be controlled separately. On the keyboard thesethesized, these control events should be separated to expression controls could be assigned to the aftertouchcreate natural note transitions, and key velocity, respectively.

In spite of these problems it would be convenient touse existing MIDI equipment with new synthesizers. 6.4 Overview of OuickMuslcWe suggest a new approach for controlling complex QuickMusic is the musical sound synthesis controlsynthesis with MIDI: let the control interface--which package of the QuickSig system (see Fig. 33) and ismaps the performance events into control events--be written entirely in Common Lisp/CLOS. It contains amore intelligent. By intelligent we mean that it should platform-independent sequencer, which uses a Lisp-syn-work reasonably, even if not all details are given explic- tax musical notation and a real-time MIDI interface.itly. The missing information can be deducted from a Originally the package had been developed under Macin-knowledge base created by studying the performance tosh Common Lisp (MCL). Recently we have ported ittechniques of the given instrument and analyzing the (together with other parts of the QuickSig environment)impact of different playing techniques on the sound, to the PC platform under Allegro Common Lisp forThis approach has an additional advantage--it is much Windows.easier to play an intelligent instrument than a complexbut dumb synthesizer. 6.5 The Lisp Sequencer

The Lisp sequencer can be used to quickly create test6.3 Simplifying the Control Problem note sequences or even demo pieces. For describing a

The usual timbre space has three independent param- score a Lisp-like musical notation is used, allowing easy

eters--pitch, level, and timbre. These parameters are parameterizing of the' individual notes by keyword pa-too general for controlling specialized synthesizers such rameters. The sequencer is object oriented and thus cas-as those based on physical models. However, during the ily expandable.

performance of a musical piece the timbre will probably The score is processed in four phases--parsing, per-not exit a bounded subspace stretched by the possible forming, exploding, and playing. During the parse phasesounds of, for example, a given acoustic guitar within a high-level object description of the note sequence is

a given performance style. Naturally there will be differ- generated. The perform phase was included to allowent subspaces for different instruments and performance automatic performance of the piece using performance

styles, but the main point is that the performer can rules. This part has not been exploited so far. The ex-choose the adequate subspace a priori. It may also be plode phase generates (usually several)low-level controlnecessary to allow shifting from one subspace into an- events from the notes. Finally the built-in schedulerother during the performance if the conditions change plays the piece on the DSP hardware in real time.substantially.

For moving around inside one of these subspaces we 6.6 MIDI Control Interfacecan use fewer parameters--pitch and something we call The most important elements of QuickMusic are the"expression." Choosing pitch as the primary parameter different control interfaces that map the input MIDIis motivated by our interest in traditional Western music, events (such as key on/off, wheel movement) to thewhich is entirely based on the twelve-tone scale and actual synthesis parameters. They were added to allowwhere the most important property of a note is its pitch, interactive performance and easy experimenting with re-In our approach all internal parameter calculations are al-time synthesis algorithms. As an additional advantagebased on the desired pitch, of using MIDI, an external MIDI sequencer can be used


PAPERS PHYSICAL MODELING OF PLUCKEDSTRING INSTRUMENTS

to record and replay performances. Unfortunately this technique cannot be used in real-A platform-independent MIDI message dispatch sys- time performance.

tem has been built on top of the low-level interface.Message dispatching is accomplished automatically by 6.7.3 Special Control Modesthe CLOS method dispatching mechanism, since the ge- We have experimented with advanced control modesneric method process-midi-message is specialized on for the guitar synthesizer. They help in creating a naturalboth the instrument class and the type of MIDI message, sounding guitar performance using a standard MIDI key-The MIDI control interfaces are implemented as differ- board. Two basic styles have been investigated--soloent instrumentclasses, playingand strumming.

Solo Guitar: Playing a guitar solo is slightly different6.7 Operating Modes from using the MIDI mono mode. On the guitar a solo

The interactivity of Lisp makes it easy to experiment is usually played on several strings. Therefore automaticwith different control strategies. We have developed sev- string allocation is used. In the solo mode all specialeral different modes, partly to overcome the limitations playing techniques that will be discussed later areposed by MIDI itself or to facilitate playing "guitarlike" available.on a normal keyboard. The keyboard, of course, by no Rhythm Guitar: One difficulty of playing voiced guitarmeans can replace a guitar controller (just as much as chords correctly on the keyboard is the relatively largethe synthesizer does not replace the acoustic guitar), but internote separation of the chord members. To reducethese specialized modes help getting the most out of the this problem we have implemented an intelligent chordsynthesizer in terms of fidelity and ease of playing, recognizer, allowing simplified chord fingering. The al-

gorithm tries to fit different chords to the depressed keys6.7.1 Standard MIDI Modes and the most probable chord is selected (such as A, Am,

In order to be able to use existing MIDI files se- A7). It is then revoiced for the guitar using look-upquenced by others we have implemented the standard tables. When alternative voicings are possible (as inMIDI control modes as an option, most cases), the position of the previously played chord

Mono Mode: The MIDI mono mode is often used with can be used to select the chord in the closest position.MIDI guitar controllers to send the performance data The player can influence the selection of the chord posi-on separate MIDI channels for each string. This mode tion by using different octaves of the keyboard.simplifies the MIDI implementation in the receiver. Eachperformancestylecanhaveitsownsetofchords,There is one sound generator per channel, and there is which is based on the most often used chords of theno need for voice allocation. Ambiguous pitches, which given style. Furthermore, styles can contain chords thatcould be played on different strings, do not cause a use only the upper strings, leaving the lower strings freeproblem. Pitch bend can be used individually on differ- for playing bass notes, as in folk or Latin-Americanent strings without affecting other notes, styles. In this case the bass notes can be played on

While the mono mode is perfectly suited for the MIDI another part of the keyboard using the keyboard splitguitar, it is difficult to use it with a keyboard. Although mode, or they can be played by using a special note-keyboard split could be used to send data on different triggering mode. The notes are not triggered immedi-channels, playing chords is quite difficult if not impossi- ately by fingering a chord, but separate keys--one keyble this way. per string--are used topluckthestringsafterward.This

Poly Mode: In the poly mode the data for all strings technique can be used for playing nice arpeggios as well.are received on a single MIDI channel, and the control With these techniques many common guitar styles caninterface has to assign the notes to available sound gener- be played from the keyboard with little effort.ators. There is a potential problem with instruments ofthe variable source type. Assigning a different generator 6.8 Mixing Control Modesto a new note will not result in a correct transition. String It is possible to switch between control modes duringallocation is necessary for most string instruments, performance. This feature helps to play chordal solos

interleaved with short runs. It is also possible to split6.7.2 Smart MIDI Modes the keyboard and define different control modes over

Smart MIDI modes are enhanced MIDI modes. While the two ranges.they accept MIDI events intended for normal use, theytry to add more details to the sound, such as by splitting 6.9 Special Techniquesthe "note on" to separate pitch set and note start events. In this section we discuss some special playing tech-Since setting the pitch happens before starting the note, niques not available on keyboard instruments. Thesethis can only be achieved by delaying the start of a techniques can be simulated with our intelligent con-note by a small amount. Other enhancements, such as trol interfaces.automatic fret noise generation, need even more delay.Allowing a relatively large (0.5-1 s) constant pro- 6.9.1 Hammer-On and Puli-Offcessing delay helps in both cases. When the synthesizer Hammer-on and pull-off are left-hand techniques thatis driven by a sequence, the delay can be compensated a guitar player sometimes uses to pluck a note. By strik-by advancing the given track by the amount of delay, lng a finger against a fret, one can set the string into

J. Audio Eno. Soc., Vol. 44, No. 5, 1996 May 349

i I II

VALIM_.KI ET AL. PAPERS

motion without plucking it with the right hand. This decreasing the pressure on the frets with the left hand

technique can be used if the pitch of the hammered note (a technique often used in country or beat strummingis higher than the previous one on the same string. It is styles). Both techniques result in a quick damping ofconvenient for playing grace notes or fast passages. A the string. Right-hand muting often causes an extra

similar technique is fret tapping, used by many electric sound when the palm hits the guitar body and the strings.guitarists, which allows playing very fast solos with both In contrast, left-hand muting usually does not introduce

hands on the fret board, extra noise, and it is often only partial, damping thePull-off is a complementary technique, which results higher harmonics more than the fundamental.

in a lower pitch. The left-hand finger releases the fret In our system muting can be accomplished by steppingused for the previous note, and at the same time it plucks on the' sustain pedal. The same effect can be assignedthe string by pulling it a little sideways. Advanced play- to a key. Partial muting--an often used effect with the

ers can play complete melodies without using their right electric guitar--can be controlled with the expressionhandatall. pedal.

When hammer-on recognition is on, the control inter-face assumes that the new "note on" that is played on 6,9.5 Changing Pluck Position

an already sounding string should be a hammer-on or a Plucking a string close to the bridge causes a softer,pull-off, depending on the direction of the pitch change, brighter, and sharper tone, while plucking toward theIn the solo mode this can be accomplished by hitting neck makes a louder, mellower sound. Because of the

the next key before the previous one has been released position of the right-hand fingers, the low strings are(similarly to the fingered portamento in the mono mode usually plucked further away from the bridge than theon many synthesizers). In the poly mode the string allo- higher ones. Moreover, varying the pluck position hascation algorithm suggests a hammer-on or pull-off when an artistic effect by changing the overall timbre. On theit cannot allocate a new string with the given configura- keyboard the position can optionally be mapped to thetion. In every case, releasing a key before the next one key velocity (softer touch will move it closer to theis pressed ensures that the new note will be plucked, bridge) or to the modulation wheel.

6.9.2 Tremolo Picking 6.10 Parameter Calculation

Tremolo picking is a fast up-down alternate stroke The calculation of the actual synthesis parameters--plucking technique, typically used with the mandolin, delay-line length, loop filter coefficients, and fractionalthe bouzouki, or the balalaika. The string is not damped delay filter coefficients--is based on two-dimensionalmanually between the individual strokes, but a new look-up tables indexed by the desired pitch and expres-pluck will automatically damp the previous note, creat- sion values. We are experimenting with an alternativelng a characteristic sound. This effect can be somehow method that employs a neural network for implementingimitated on a normal keyboard by fast repetition of a the nonlinear mapping functions. Both methods use pa-key. However, this way it is not possible to tell the rameters derived from actual performances recorded pre-stroke direction, which is important in double-stringed viously and analyzed by a computer.instruments. Moreover, the standard interpretation of

, "note on" and "note off" would cut away the note in 7 CONCLUSIONS AND FURTHER WORKbetween the repetitions, which sounds very unnatural.Though the sustain pedal could be used to avoid this A physically based modeling technique for pluckedartifact, as a result the new notes would be triggered string instruments has been developed aiming at high-without damping the old ones, thus making a ringing quality real-time sound synthesis. The model for a vi-effect instead of the tremolo picking, brating string used in this method is based on a general-

The use of the special string excitation keys (discussed ized KS model, which is constructed of digital delayunder Rhythm Guitar) automatically results in a correct lines and linear filters. Many approaches to the modelingtransition to the new note. Furthermore there is a sepa- of the body of string instruments have been studied, andrate tremolo key which, when pressed, plucks the last particularly successful results in terms of efficiency andnote again with a different pluck sequence, correspond- sound quality have been obtained by using an inverselng to the opposite stroke direction, filtered input signal.

The parameter values of the synthesis model can be6.9.3 Fingered Vibrato calibrated based on the analysis of acoustic signals, and

Automatic vibrato sounds usually rather unnatural. It thus the model can imitate the sound of a real instrument.

can be replaced by mapping the keyboard aftertouch to The analysis consists of pitch detection and short-time"pitch bend up" with a small pitch bend range. This way Fourier analysis. Estimates for the length of the delayexpressive vibratos can be played easily by hand. line of the string model and the coefficients of the loop

filter can be obtained based on these analysis data.

6.9.4 Muting the Strings The synthesis model accurately reproduces the attackUnlike on the piano, on the guitar the strings are not part (the first 100 ms or less) of the signal and approxi-

muted automatically. The guitar player can mute them mates the average decay rate of the harmonics. Sinceeither by using his or her right hand or palm, or by these two aspects are of great importance in the recogni-

350 d. Audio Eng. Soc., Vol. 44, No. 5, 1996 May


tion of a musical instrument, the individual synthetic Conf. (Tokyo, Japan, 1993 Sept. 10-15), pp. 64-71.

signals are often indistinguishable by ear from the origi- [8] M. Karjalainen and U. K. Laine, "A Model fornal ones. Real-TimeSynthesisof GuitarUsing a Floating-Point

Principles presented in this paper can be applied to Signal Processor," inProc. 19911EEEInt. Conf. Acous-many plucked string instruments. We have investigated tics, Speech, and Signal Processing (Toronto, Ont.,model-based synthesis of the acoustic guitar, the steel- Canada, 1991 May), vol. 5, pp. 3653-3656.string guitar, the electric guitar, the banjo, the mandolin, [9] M. Karjalainen and V. Vfilim_iki, "Model-Basedand the kantele. Nonlinear extensions which are required Analysis/Synthesis of the Acoustic Guitar," in Proc.

for imitating the beating of harmonics were also dis- 1993 Stockholm Music Acoustics Conf. (Stockholm,cussed. Real-time implementation of the synthesis Sweden, 1993 July 29-Aug. 1), pp. 443-447. Soundmodel using a signal processor was studied, and a novel examples included on SMAC'93 CD.multirate implementation technique was developed. Sew [10] J. O. Smith, M. Karjalainen, and V. X/_ilim'fiki,eral aspects of controlling the physical models of personal communication (New Paltz, NJ, 1991 Oct.).plucked string instruments were studied. [11] M. Karjalainen, V. V_lim'aki, and Z. Jtinosy,

Future work in modeling plucked string instruments "Towards High-Quality Synthesis of the Guitar andincludes the design of a more sophisticated inverse ill- String Instruments," in Proc. 1993 Int. Computer Musictering technique that would take into account the time- Conf. (Tokyo, Japan, 1993 Sept. 10-15), pp. 56-63.varying character of each harmonic. Furthermore a [12] J. O. Smith, "Techniques for Digital Filter De-method to estimate the plucking point is needed. The sign and System Identification with Applications to theplucking point of the string has to be determined to Violin," Ph.D. thesis, CCRMA, Dept. of Music, Stan-cancel its effect from the analyzed signal, ford University, Stanford, CA, Rep. STAN-M-14 (1983

June).

8 ACKNOWLEDGMENT [13] J. O. Smith, "Synthesis of Bowed Strings," inProc. 1982 Int. Computer Music Conf. (Venice, Italy,

The authors wish to thank Timo I. Laakso for his 1982), pp. 308-340.

helpful comments on an earlier version of this paper and [14] J. Laroche and J. M. Jot, "Analysis/SynthesisJulius O. Smith for inspiring discussions on model-based of Quasi-harmonic Sound by the Use of the Karplus-synthesis. The various plucked string instruments were Strong Algorithm," in Proc. 2nd French Congress onplayed by Jukka Savijoki, Lassi Logr6n, Tuomas Lo- Acoustics (Arcachon, France, 1992 Apr.).gr6n, and Tuomo Laakso. Their contribution is acknowl- [15] J. Laroche and J. L. Meillier, "Multi-channeledged. This work was conducted while Matti Karja- Excitation/Filter Modeling of Percussive Sounds withlainen was a visiting scholar at CCRMA, Stanford Application to the Piano," IEEE Trans. Speech AudioUniversity, Stanford, CA, during the academic year Process., vol. 2, pp. 329-344 (1994 Apr.).1994-1995. [16] J. O. Smith, "StructuredSamplingSynthesis--

Part of this research was carried out at CARTES (Es- Automated Construction of Physical Modeling Synthesis

poo, Finland). This work was financially supported by Parameters and Tables from Recorded Sounds," pre-the Academy of Finland. sented at CCRMA Associates Meeting, Stanford Univer-

sity, Stanford, CA (1993 May).

9 REFERENCES [17] N. H. Fletcher and T. D. Rossing, The Physicsof Musical Instruments (Springer, New York, 1991).

[1] C. Roads and J. Strawns, Eds., Foundations of [18] M. Karjalainen, U. K. Laine, and V. V_ilim'fiki,Computer Music (M.I.T. Press, Cambridge, MA, 1985). "Aspects in Modeling and Real-Time Synthesis of the

[2] F. R. Moore, Elements of Computer Music (Pren- Acoustic Guitar," in Proc. 1991 IEEE ASSP Workshoprice-Hall, Englewood Cliffs, NJ, 1990). on Applications of Signal Processing to Audio and

[3] K. Karplus and A. Strong, "Digital Synthesis of Acoustics (New Paltz, NY, 1991 Oct.).Plucked String and Drum Timbres," Computer Music [19] J. B. Allen and L. R. Rabiner, "A Unified Ap-J., vol. 7, no. 2, pp. 43-55 (1983). proach to Short-Time Fourier Analysis and Synthesis,"

[4] D. Jaffe and J. O. Smith, "Extensions of the Kar- Proc. IEEE, vol. 65, pp. 1558-1564 (1977 Nov.).

plus-Strong Plucked String Algorithm," Computer Mu- [20] M. Karjalainen, J. Backman, and J. P61kki,sic J., vol. 7, no. 2, pp. 56-69 (1983). "Analysis, Modeling, and Real-Time Sound Synthesis

[5] J. O. Smith, "Efficient Yet Accurate Models for of the Kantele, a Traditional Finnish String Instrument,"Strings and Air Columns Using Sparse Lumping of Dis- in Proc. 1993 IEEE Int. Conf. Acoustics, Speech, andtributed Losses and Dispersion," CCRMA, Dept. of Mu- Signal Processing (Minneapolis, MN, 1993 Apr.sic, Stanford University, Stanford, CA, Rep. STAN-M- 27-30), vol. 1, pp. 229-232.67 (1990 Dec.). [21] M. Karjalainen, "DSP Software Integration by

[6] J. O. Smith, "Physical Modeling Using Digital Object-Oriented Programming: A Case Study of QuickSig,"Waveguides," Computer Music J., vol. 16, no. 4, pp. IEEE ASSP Mag., vol. 7, pp. 21-31 (1990 Apr.).74-87 (1992). [22] J. O. Smith, "Music Applications of Digital

[7] J. O. Smith, "Efficient Synthesis of Stringed Mu- Waveguides," CCRMA, Dept. of Music, Stanford Uni-sical Instruments," in Proc. 1993 Int. Computer Music versity, Stanford, CA, Rep. STAN-M-39 (1987 May).



[23] M. Karjalainen, Ed., "Modeling and Synthesis oja, J. Huopaniemi, T. Huotilainen, and M. Karja-of String Instruments," URL: http://www.hut.fi/HUT/ lainen, "An Integrated System for Virtual Audio Real-Acoustics/stringmodels/(1996). ity," to be presented at the 100th Convention of the

[24] C. R. Sullivan, "Extending theKarplus-Strong Audio Engineering Society, Copenhagen, DenmarkAlgorithm to Synthesize Electric Guitar Timbres with (1996 May 11-14).

Distortion and Feedback," Computer MusicJ., vol. 14, [37] M. Karjalainen, J. Huopaniemi, and V. Viili-no. 3, pp. 26-37 (1990). miiki, "Direction-Dependent Physical Modeling of Mu-

[25] T. I. Laakso, V. Viilimiiki, M. Karjalainen, and sical Instruments," in Proc. 15th Int. Congr. on Acous-U. K. Laine, "Splitting the Unit Delay--Tools for Frae- tics (Trondheim, Norway, 1995 June 26-30), vol. III,tional Delay Filter Design," IEEE Signal Process. Mag., pp. 451-454.

vol. 13, pp. 30-60 (1996 Jan.). [38] S. A. Van Duyne, J. R. Pierce, and J. O. Smith,[26] T. I. Laakso, V. Viilim'aki, M. Karjalainen, and "Traveling Wave Implementation of a Lossless Mode-

U. K. Laine, "Real-Time Implementation Techniques Coupling Filter and the Wave Digital Hammer," in Proc.

for a Continuously Variable Digital Delay in Modeling 1994 Int. Computer Music Conf. (Aarhus, Denmark,Musical Instruments," in Proc. 1992 Int. ComputerMu- 1994 Sept. 12-17), pp. 411-418.sic Conf. (San Jose, CA, 1992 Oct.), pp. 140-141. [39] L. R. Rabiner, "On the Use of Autocorrelation

[27] V. V_ilim'fiki, T. I. Laakso, and J. Mackenzie, Analysis for Pitch Detection," IEEE Trans. Acoust.,"Elimination of Transients in Time-varying All-pass Speech, SignalProcess., vol. 25, pp. 24-33 (1977 Feb.).Fractional Delay Filters with Application to Digital [40] L. R. Rabiner, M. R. Sambur, and C. E.

Waveguide Modeling," in Proc. 1995 Int. Computer Schmidt, "Applications of a Nonlinear Smoothing Algo-Music Conf. (Banff, Alta., Canada, 1995 Sept. 3-7), rithm to Speech Processing," IEEE Trans. Acoust.pp. 327-334. Speech, Signal Process., vol. 23, pp. 552-557 (1975

[28] V. V_ilim_iki, "Discrete-Time Modeling of Dec.).

Acoustic Tubes Using Fractional Delay Filters," doc- [41] J. O. Smith and X. Serra, "PARSHL: An Analy-toral thesis, Helsinki University of Technology, Labora- sis/Synthesis Program for Non-harmonic Sounds Basedtory of Acoustics and Audio Signal Processing, Espoo, on a Sinusoidal Representation," in Proc. 1987 Int.Finland, Rep. 37 (1995 Dec.). computer Music Conf. (Urbana, IL, 1987 Aug.), pp.

[29] A. Paladin and D. Rocchesso, "A Dispersive 290-297.

Resonator in Real-Time on MARS Workstation," in [42] X. Serra, "A System for Sound Analysis/Trans-Proc. 1992 Int. Computer Music Conf. (San Jose, CA, form/Synthesis Based on a Deterministic plus Stochastic1992 Oct.), pp. 146-149. Decomposition," Ph.D. thesis, CCRMA, Dept. of Mu-

[30] S. A. Van Duyne and J. O. Smith, "A Simplifed sic, Stanford University, Stanford, CA, Rep. STAN-M-Approach to Modeling Disperion Caused by Stiffness in 58 (1989 Oct.).

String and Plates," in Proc. 1994 Int. Computer Music [43] J. O. Smith, personal communication (Espoo,Conf. (Aarhus, Denmark, 1994 Sept. 12-17), pp. Finland, 1993 Aug.).407-410. [44] K. A. Legge and N. H. Fletcher, "Nonlinear

[31] J. O. Smith and S. Van Duyne, "Commuted Pi- Generation of Missing Modes on a Vibrating String," J.ano Synthesis," in Proc. 1995 Int. Computer Music Acoust. Soc. Am., vol. 76, pp. 5-12 (1984 July).Conf. (Banff, Alta., Canada, 1995 Sept. 3-7), pp. [45] K. Bradley, M. Cheng, and V. L. Stonick, "Au-335-342. tomated Analysis and Computationally Efficient Synthe-

[32] S. Van Duyne and J. O. Smith, "Developments sis of Acoustic Guitar Strings and Body," in Proc. 1995for the Commuted Piano," in Proc. 1995 Int. Computer IEEE ASSP Workshop on Applications of Signal Pro-Music Conf. (Banff, Alta., Canada, 1995 Sept. 3-7), cessing to Audio and Acoustics (New Paltz, NY, 1995pp.319-326. Oct.).

[33] A. Chaigne, A. Askenfelt, and E. Jansson, [46] J. Huopaniemi, Ed., "HUT Acoustics Labora-"Time Domain Simulations of String Instruments. A tory Sound Examples," URL: http://www.hut.fi/HUT/Link between Physical Modeling and Musical Percep- Acoustics/sounddemos/(1996).

tion," J. Phys. IV, Coll. C1, Suppl. to J Phys. III, vol. [47] M. Karjalainen, "Object-Oriented Programming2, pp. C1-51-C1-54 (1992 Apr.). of DSP Processors: A Case Study of QuickC30," in

[34] J. Huopaniemi, M. Karjalainen, V. Vfilim_iki, Proc. 1992 IEEE Int. Conf. on Acoustics, Speech, andand T. Huotilainen, "Virtual Instruments in Virtual Signal Processing (ICASSP'92) (San Francisco, CA,

Rooms--A Real-Time Binaural Room Simulation Envi- 1992 Mar. 23-26), vol. V, pp. 601-604.ronment for Physical Models of Musical Instruments," [48] J. O. Smith, "Digital Waveguide Modeling ofin Proc. 1994 Int. C.omputerMusic Conf. (Aarhus, Den- Musical Instruments," CCRMA, Stanford University

mark, 1994 Sept. 12-17), pp. 455-462. (Stanford, CA, 1993 July 24), unpublished.[35] M. Karjalainen and J. O. Smith, "Body Model- [49] Z. Ifinosy, M. Karjalainen, and V. V_ilim'fiki,

ing Techniques for String Instrument Synthesis," to be "Intelligent Synthesis Control with Applications to apresented at the 1996 Int. Computer Music Conf., Hong Physical Model of the Acoustic Guitar," in Proc. 1994

Kong (1996 Aug. 19-24). Int. Computer Music Conf. (Aarhus, Denmark, 1994[36] T. Takala, R. H_inninen, V. V_ilim_iki, L. Savi- Sept. 12-17), pp. 402-406.

352 J.AudioEng.Soc.,Vol.44,No.5,1996May

PAPERS PHYSICALMODELINGOFPLUCKEDSTRINGINSTRUMENTS

THE AUTHORS

!

V. Viilim'fiki J. Huopaniemi M. Karjalainen Z. Jlinosy

Vesa Viilimiiki was born 1968 May 12 in Kuorevesi, ber of the IEEE, the ICMA, the Acoustical Society ofFinland. He studied acoustics and digital signal pro- Finland, and the Polar Bear Club Finland.cessing and received the degrees of M.Sc. (1992), Li-centiate in Technology (1994), and Doctor of Technol- ·ogy (1995) from the Helsinki University of Technology, Matti Karjalainen was born in Hankasalmi, Finland,Espoo, Finland. He was with the Laboratory of Acous- in 1946. He received the M.Sc. and the Dr. Tech. de-tics and Audio Signal Processing of the university from grees in electrical engineering from the Tampere Uni-1990 to 1995 working first as a research assistant and versity of Technology in 1970 and 1978, respectively.later as a research scientist. His research activities in- His doctoral thesis dealt with speech synthesis by rulecluded sound synthesis using physical models and digital in Finnish.filter design. From 1980 he was an associate professor and from 1986

Dr. Viilimfiki is currently a postdoctoral research fel- he has been a professor in acoustics at the Helsinki Univer-low at the School of Electronic and Manufacturing Sys- sity of Technology in the faculty of Electrical Engineering.terns Engineering, University of Westminster, London, His research activities have covered speech synthesis,UK, where he is developing signal processing algo- analysis, and recognition, auditory modeling, DSP hard-rithms that employ nonuniform sampling and fractional ware, software, and programming environments, as wellsample delays. He is a member of the Audio Engineering as acoustics and modeling of musical instruments.Society, the Institute of Electrical and Electronics Engi- Professor Karjalainen is a member of the IEEE, theneers (IEEE), the International Computer Music Associ- ASA, the AES, the ICMA, the Acoustical Society ofation (ICMA), the Acoustical Society of Finland, and Finland, and the Polar Bear Club Finland.the Polar Bear Club Finland.

· Zoltfin Jfinosy was born in Budapest, Hungary, inJyri Huopaniemi was born in Helsinki, Finland, in 1967. He received his M.Sc. degree in electrical engi-

1968. He received an M.Sc. degree in electrical engineering from the Technical University of Budapest inneering from the Helsinki University of Technology 1992. His thesis dealt with computer processing of re-(HUT) in 1995. He is currently a Ph.D. student at the producing piano rolls.HUT majoring in acoustics, audio signal processing, From 1992 to 1995 he has been a Ph.D. student atand digital media. Mr. Huopaniemi has worked as a the Technical University of Budapest, working on intel-research assistant and a research scientist at the Labora- ligent control of physical model based sound synthesis.tory of Acoustics and Audio Signal Processing of the His main areas of interest are digital audio signal pro-HUT since 1993. His research activities include model- cessing, sound synthesis, and stereo photography.based sound synthesis and analysis techniques, spatial From 1996 Mr. J_inosy has been with Novotrade Soft-hearing and auralization (3-D sound), and room acous- ware Ltd. He is involved in 3-D computer graphics andtics modeling, game programming.He is a memberof the AES and

Mr. Huopaniemi is a member of the AES and the the IEEE Computer Society Technical Committee onsecretary of the AES Finnish Section. He is also a mem- Computer Generated Music.

J. AudioEng.Soc.,Vol.44, No.5, 1996May 353

physical modelingof plucked string instruments with application...

Documents