CONCEPTUAL METAPHOR IN THE PRACTICE OF COMPUTER MUSIC
Thesis
Submitted in partial fulfillment of the requirements for the
Degree of
Master of Fine Arts
in Electronic Music and Recording Media
Mills College, Spring 2011
by
Peter Ho-Kin Wong
Approved by:
Reading Committee
_________________________
Chris Brown
Director of Thesis
_________________________
James Fei
Reader of Thesis
_________________________
Chris Brown
Head of the Music Department
_________________________
Dr. Sandra C. Greer
Provost and Dean of the Faculty
CONTENTS
1. Introduction 4
2. Background 5
2.1. The mapping question 5
2.2. Conceptual metaphor 14
3. Interface strategies 21
3.1. Manifested metaphors: simple yet telling examples 22
3.1.1. A thought experiment: a prototype conforming to an alternate pitch metaphor 26
3.1.2. Our prototype as a methodological proposal: what really happened here? 29
3.2. Existing interfaces and conceptual coherence 33
3.2.1. Tangible user interfaces (TUIs) 36
3.2.2. The “CHOAM” fiducial ball controller 40
3.3. Proposed future work 42
3.3.1. Input: gestural imaging and marked objects 42
3.3.2. Mapping image–schemas to low or high level outputs 44
4. Informing the conceptual sphere 47
4.1. “Choices”: just pitch space, the conduit metaphor 47
5. Conclusion 53
6. Appendix: contents of accompanying media 55
7. Bibliography 56
1. Introduction
Arguably the quintessential musical instrument of the early 21st century,
the computer as a performance instrument has received generous
attention. Practitioners of computer music have devoted their energy to detailed
study of very narrow aspects of its use; by small changes in key places in the
chain of causation, it can be made into nearly any kind of instrument. How
then, within a nearly infinite realm of possibility with regard both to generable
sounds and to input mechanisms, can one decide what kind of instrument to
build into it? After some background on the “mapping question” as it pertains
to computer music, and on some particularly pertinent aspects of human
cognition, I will examine a crude controller prototype which will illustrate the
fundamentals behind a design procedure that maintains coherence between
mappings of gestural controls to sonic outputs and metaphorically based
cognitive structures. After a discussion of how this method pertains to existing
interfaces, both of others’ construction and of my own, I will propose a
direction for future work which maintains consistency with this methodology.
Finally, I will examine ways in which, beyond the more technical
correspondences in the previous sections, the cognitive structure discussed can
inform the conceptual and compositional grounding of musical works, using a
piece of my own as an example.
The ideas which I will bring up in this paper are incredibly simple.
However, their simplicity belies a subtlety which should not be discounted; the
structure of the language we must use inadvertently encourages eliding some
important distinctions.1
2. Background
There are two areas with which the reader will need to be familiar before
going further: the question of mapping as it pertains to computer music, and
the contemporary theory of conceptual metaphor.
2.1. The mapping question
The issue of input to output mapping through the computer as a musical
instrument is a vexing problem. Jon Drummond defines mapping as
“connecting gestures to processing and processing to response.”2 Thus at its
most general, it is little more than connecting what goes into the black box with
what comes out.3 (see Figure 1)
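At its simplest, the black box can be rendered in a few lines of code. The following sketch is my own illustration, not anything specified by Drummond; the particular mapping (one octave of pitch per unit of gesture) and all names are hypothetical.

```python
# A minimal, illustrative sketch of the black box:
# gesture -> processing -> response.

def processing(gesture):
    """Translate a normalized gesture datum (0.0-1.0) into a frequency."""
    return 220.0 * (2.0 ** gesture)  # one octave of range, hypothetically

def response(frequency):
    """Stand in for the sound-output stage: describe what would sound."""
    return f"sine at {frequency:.1f} Hz"

def black_box(gesture):
    """Connect what goes into the box with what comes out."""
    return response(processing(gesture))

print(black_box(0.0))  # sine at 220.0 Hz
print(black_box(1.0))  # sine at 440.0 Hz
```

Everything of interest in what follows concerns how freely the two arrows inside this box can be rewired.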
If one considers the case of traditional acoustic instruments, the
1 Michael J. Reddy, “The conduit metaphor: A case of frame conflict in our language about language,” in Metaphor and Thought, 2nd ed., ed. Andrew Ortony (Cambridge: Cambridge University Press, 1993).
2 Jon Drummond, “Understanding Interaction in Contemporary Digital Music: from instruments to behavioral objects,” Organised Sound 14/2 (2009): 131.
3 Drummond's paper goes into much further detail about the different ways to conceive of this metaphor, with varying degrees of complexity. This simpler conception will provide better clarity here.
relationship is straightforward: A performer's physical gestures, breath and
movement, for example, “go into
the box,” directly causing some
kind of resonance, which then
“comes out of the box” as sound
in response to the gestures of
that performer. The mapping
here is direct physical coupling and cannot be altered without an alteration of
the instrument itself, which may or may not be possible, but will always have
limitations. This relationship is further simplified in that there is a direct
correlation between the processing and the response; the resonance of the
instrument’s body is, in a real sense, both these things. In addition, as Bown, et
al. point out, for traditional acoustic instruments “a musician is adaptive
towards an instrument,”4 meaning that the instrument, by its synchronic
inalterability, forces change on the musician’s part during the interaction. In
order for the instrument to change, an instrument builder would need to
incorporate any modifications in the next iteration. Thus the mapping is largely
constant for a given instrument and performer pair.
The computer’s mapping is in no way as rigidly coupled. Because the
4 Oliver Bown, Alice Eldridge, and Jon McCormack, “Understanding Interactive Systems,” Organised Sound 14/2 (2009): 191.
Figure 1: The computer as a musical instrument is essentially a black box.
processing is, in contrast to acoustic instruments, something very open, its
connection to the sound output is arbitrary and can become so complicated
that it often is considered an integral part of the compositional process. Further,
especially with the current arsenal of human interface devices (HIDs),5 the
kinds of gestures that can be captured as input to the processing are even more
open-ended than the already huge realm of possibility in translating a single
gesture to a single datum for processing. Several strategies have surfaced to deal
in particular with how to map the input half of the black box metaphor: one-to-one, one-to-many, many-to-one, and many-to-many.6 A one-to-one mapping is
the most transparent, but as a proliferation of such mappings can affect either
system performance or the instrument’s performability, a one-to-many strategy
is often employed to reduce processing load on input mappings and mental
load on the performer. A possible manifestation of this strategy would be a
single control updating several synthesizer parameters each of which use a
differently scaled value from the control. To reduce output mappings while
keeping a greater number of input mappings, a many-to-one strategy is used.
This could be useful if, as one example, several performers each have separate
controls for the same parameter. Many-to-many combines the two in any
5 Commonly found on laptops at the time of writing are joysticks, trackballs, trackpads (many multi-touch), keyboards, cameras, accelerometers, photosensors, fingerprint readers, infrared sensors, bluetooth modems, wireless ethernet, and microphones, to name a few. This listing excludes any attachable peripherals, which only increase the possibilities.
6 Drummond, “Understanding Interaction,” 131.
number of ways, and is probably the most commonly used in practice.7 Notably,
with this freedom in gesture/processing/response mapping, and the clear notion
that the computer user is a programmer,8 the adaptive relationship of the
musician towards the instrument is then broken down, and one need not wait
for the next iteration of the instrument or for the builder’s whims to also allow
the adaptation of the instrument toward the musician.9 Furthermore, this
adaptation can take place immediately or even dynamically.
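The first three strategies can be sketched schematically as follows; the synthesizer parameter names and scalings here are hypothetical illustrations of my own, not drawn from Drummond.

```python
# Illustrative sketches of input-mapping strategies. All parameter
# names and scalings are hypothetical examples.

def one_to_one(fader):
    # One control drives one parameter: maximally transparent.
    return {"amplitude": fader}

def one_to_many(fader):
    # One control updates several parameters, each differently scaled.
    return {
        "amplitude": fader,
        "cutoff_hz": 200.0 + fader * 8000.0,
        "reverb_mix": fader * 0.5,
    }

def many_to_one(faders):
    # Several controls (e.g., one per performer) converge on a single
    # parameter; here they are simply averaged.
    return {"amplitude": sum(faders) / len(faders)}

print(one_to_many(0.5))
print(many_to_one([0.2, 0.8]))
```

A many-to-many mapping would simply combine the latter two patterns, routing any subset of controls to any subset of parameters.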
Given the degree of complexity in dealing with the mapping question,
electronic musicians have been known to approach the design of systems and
algorithms from a compositional standpoint. Even before the widespread use of
computers, for Gordon Mumma, his “designing and building of circuits is really
‘composing,’” and his “‘instruments’ are inseparable from the compositions
themselves.”10 In light of this attitude, which is by no means Mumma’s alone,11 it
seems unsurprising that a culture of behavioral objects12 has arisen in the
internet-connected virtual community that is inseparable from the computer
7 Ibid.
8 A programmer for these purposes can be anyone who causes a change in a computer's behavior through intentional manipulation of that behavior. This manipulation can be accomplished through writing original software or by manipulating pre-written software.
9 Bown et al., “Understanding Interactive Systems.”
10 Gordon Mumma, “Creative Aspects of Live-Performance Electronic Music Technology,” Papers of 33rd National Convention (1967): 1.
11 Among proponents are David Tudor (reported in John D.S. Adams, “Giant Oscillators,” Musicworks 69 (1996)), Chris Brown, and John Bischoff (Chris Brown and John Bischoff, Indigenous to the Net: Early Network Music Bands in the San Francisco Bay Area (2002) <http://crossfade.walkerart.org/brownbischoff/IndigenoustotheNetPrint.html> (15 April 2010)).
12 Bown et al., “Understanding Interactive Systems.”
music world. Interface systems and other components of the sound production
process, written by programmer/musicians for their own purposes, have been
and are being shared as code snippets and modular patches. These objects
(Bown, et al. use the term in both its material and its programmatic sense) can
take a number of different forms, and have varying degrees of utility. They may
be nearly whole programs that could almost be considered pieces in their own
right, or modules that require some manipulation to be usable at all. That these
objects “can be shared, modified and repurposed and are the currency and
building blocks both functionally and aesthetically in contemporary music
culture”13 also bespeaks an undercurrent of openness and a propensity for
hacking. Where the traditional instrumentalist is forced to be content being
handed a completed and largely unmodifiable object to produce sounds, the
computer musician feels compelled both to build most of the instrument she or
he will use, from new or modified components, and to then use the instrument
to produce interesting sounds.14
Another way in which the mapping question poses interesting problems
is brought to light when Thor Magnusson exhorts us to
take Michel Waisvisz’s The Hands instrument. Every sensor of the complex interface is mapped to a specific parameter in the software-based sound engine. A change in the engine will result in a new (or altered) instrument. Although
13 Ibid., 195.14 “Live coding” as practiced by a small community would be a notable counter-example where
the building of the instrument is done in real time. However, this type of interface is far fromintuitive.
the interface has not been altered by a change in the mapping algorithm, the instrument behaves differently. For Waisvisz, changing the algorithms that constitute the sound engine means learning a new instrument, which involves the re-incorporation of the conceptual understanding of the engine’s functionality into bodily memory.15
Despite an immutability that parallels an acoustic instrument, a physical
interface can, for the purposes of motor memory, effectively become a new
instrument purely by virtue of a change in the internal mapping: in this case at
a lower level than the gesture-to-processing as discussed above, perhaps even at
the level of processing itself. This example brings forth yet another level of
mapping involved in the overall question, though one not specific to the
computer-as-musical-instrument. At the level of human cognition,
representation can be thought of as manifested in the form of symbols16 or
signs,17 and in the act of living we develop a mapping schema between these
symbols and our perceptions of the outside world.18 When one learns an
instrument, as in the cases of The Hands or a traditional instrument, one
performs such a mapping of a symbolic relation to a sonic result and another of
a motor-memory symbol, or motor program, that maps to the gesture necessary
15 Thor Magnusson, “Of Epistemic Tools: musical instruments as cognitive extensions,” Organised Sound 14/2 (2009): 169 (footnote).
16 Douglas Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid (New York: Basic Books, 1999).
17 Charles Morris, Foundations of the Theory of Signs (Chicago: University of Chicago Press, 1938).
18 Gerald Edelman, Bright Air, Brilliant Fire: on the Matter of the Mind (New York: Basic Books, 1992), 81–98. (though Edelman and others in the school of embodied cognition might argue that these symbols are our perceptions, and that representation is obviated)
to produce that result.19 (see Figure 2)
It is interesting to note that this
implies a full circle from the black box
metaphor; the formerly implicitly
related input and output ends of the
box can now be seen as forming one
side of a feedback loop that the cognitive symbolic map now completes (Figure
3). Sensibly, a break in one part of that loop destroys it, and a new loop of
relationships must be built. Given this tenuous hold on connectivity and the
tendency of the computer music community toward rapid fluctuation in
technique and materials (it is a young field), and given also the rapid changes in
the hardware itself, it makes sense that one could fail to settle on a methodical
answer to the mapping question.20
Along with the internal mappings from gesture to sonification is the
issue of how to actually capture the gestures themselves. Most HIDs that come
with computers, though they can be used as such, are not intended to be
musical interfaces. The devices beyond the usual keyboard, monitor, and
19 Gerald Edelman refers to this pattern as a re-entrant activation in what he calls a global mapping, implying that the separation portrayed here is false. The motor response in a global mapping is co-occurrent with the perceptual response (and thus is part of the same mechanism). (see §2.2 below)
20 For an interesting discussion in this vein with a comparison of classically trained acoustic instrumentalists to live-coders, see: Nick Collins, “Live Coding Practice,” (paper presented at the International Conference on New Interfaces for Musical Expression, New York, USA, June 6–10, 2007).
Figure 2: the mapping schema from sound to motor program on the cognitive side
pointing device, though increasingly common on most laptops, are still not as
widely used for gesture-capture. Beyond
that, the integration of those standard
interfaces within the body of the machine
on laptops, and the resulting machine–centric body language on the part of the computer performer lead many to frown
upon them as musical control devices. To
combat this disengagement, many
performers are, at the very least, using
detached controllers, or, like Waisvisz, designing and building their own custom
interfaces. Dan Trueman built the Bowed-Sensor-Speaker-Array (BoSSA), a
multi-directional speaker clustered with various sensors that the performer
holds in his or her lap and bows much as an orchestral instrument would be
played. Trueman has also designed multi-directional speakers and has used various corded peripheral interfaces (e.g., tablets and drum pads) with laptops in the Princeton Laptop Orchestra (PLOrk). His aim here was to pull
the laptop performer away from the disengaging stance of integrated controller
use. He also challenged the problem of the disembodiment of the resultant
Figure 3: the feedback loop between the performer and the instrument in an interactive/reactive system
sound21 with spatial localization via the speaker clusters.22 These solutions show
that in addition to the problem of designing a mapping schema at the software
level to translate gesture data into processable data for sonification, a similar
bottleneck also exists at the level of performer/instrument interaction; the
computer musician/performer often needs to find her or his own solution to the
dilemma of gesture capture within a realm of near infinite possibility. This area
would, however, be one where thoughtful design could improve the ability to
capture musical gestures. With these factors in mind, perhaps the computer,
rather than being compared to “the saxophone” or “the dulcimer,” should be
likened to “an instance of all possible acoustic instruments.”
For these reasons, the question of mapping in the computer as a musical
instrument brings to light the strange fact that when it comes to the ability of
this instrument to be used in a live setting one is largely concerned with
questions that have traditionally had more to do with instrument building and
design than with performance. Because of its relative youth in the performance
world, furthermore, there exist few pre–designed systems that would allow
someone to approach it purely as a performer learning an instrument. The
question then becomes: how can one pare down the realm of possibility into a
design schema relevant to a musical performer?
21 Laptop concerts are nearly always amplified through a P.A. system of some kind, displacing the sound from its source of origin. See Powerbooks Unplugged (http://pbup.goto10.org/ (15 April 2010)) for a notable exception.
22 Dan Trueman, “Why a Laptop Orchestra?” Organised Sound 12/2 (2007).
2.2. Conceptual metaphor
Before we attempt an answer to that question, let’s pose another, more
basic question. What is relevant to a musical performer?
A musical performer, being a human, is also something of a black box.
However, it is one on which research in the last thirty years in cognitive science
and linguistics has shed some light. Of enormous explanatory potential and
central relevance to this essay is the contemporary theory of metaphor, which
builds upon its foundation in the theory of embodied cognition. This discipline
has given rise to a conception of cognitive organization that has direct relevance
to issues of music in general, but especially to music in relation to human–computer interaction (HCI).
The theories are grounded in recent research in neuroscience, in what
Gerald Edelman calls “neural Darwinism.”23 Edelman postulates that neural
development proceeds along lines explained by population thinking, as
understood in evolutionary biology, and that the units of selection are groups of
neurons which activate together when they receive a particular input stimulus.
These groupings become relevant to cognitive theories because they don’t
simply activate as a single group upon stimulation, but in networks of
interconnected groups24 which Edelman calls maps. Key to music is the fact that
23 Edelman, Bright Air, Brilliant Fire, ch. 9.
24 He refers to this interconnection as “reentry.”
these maps are also activated together with non-mapped parts of the brain25 and
with the motor behavior of the animal in question, in this case, the musician.26
These findings in neuroscience are relevant because they indicate a
physical basis for Mark Johnson’s image schemas, on which conceptual
metaphor27 depends. When Johnson says that “[r]ecurring adaptive patterns of
organism–environment interaction are the basis for our ability to survive and
flourish,”28 he is referring to just this type of neuro–motor cross–activation.
Johnson and Rohrer describe it succinctly:
The patterns of our ongoing interactions...define the contours of our world and make it possible for us to make sense of, reason about, and act reliably within this world. Thousands of times each day we see, manipulate and move into and out of containers, so containment is one of the most fundamental patterns of our experience. Because we have two legs and stand up within a gravitational field, we experience verticality and up–down orientation. Because the qualities (e.g., redness, softness, coolness, agitation, sharpness) of our experience vary continuously in intensity, there is a scalar vector in our world.29
The experiences they mention are by no means arbitrary; they are some of the
fundamental image schemas on which we base our cognition through the
process of metaphor, which George Lakoff defines as “a cross–domain mapping
in the conceptual system.”30 The plasticity of the human mind comes from an
25 i.e., specialized brain structures whose function is not mainly for cognition
26 Edelman, Bright Air, Brilliant Fire, 83–93. (cf. global mapping)
27 Hereafter, I will assume the reader understands that by “metaphor” I mean “conceptual metaphor” as defined by Lakoff and Johnson (1980), and not “metaphorical linguistic expression.”
28 Mark Johnson and Tim Rohrer, “We are live creatures: Embodiment, American Pragmatism and the cognitive organism,” in Cognitive Linguistics Research, 35.1: Body, Language, and Mind, Volume 1: Embodiment, eds. Tom Ziemke, Jordan Zlatev, Roslyn M. Frank (Berlin: Mouton de Gruyter, 2008): 32.
29 Ibid.
30 George Lakoff, “The contemporary theory of metaphor,” in Metaphor and Thought, 2nd ed., ed.
ability to extend inferences one can make through anticipation of and
understanding of the structure of these general physical experiences to
inferences about unrelated but correlated experiences that don’t necessarily
have such a physical manifestation.
As an illustration of the relationship between image schemas and
language, consider a part of speech: the preposition. These are familiar words
we use every day, and which are such a vexing problem to native speakers of
other languages learning English: words like in and into. However, even native
speakers are hard pressed to define them. The reason for this difficulty is that
they are linguistic representations of image schemas, which are so basic to
cognition as to be below the level of conscious attention. Johnson and Rohrer
have already introduced us to the container image schema, which is the basis of
the preposition in. At its most basic, the word can only be taken to represent the
most salient part of our idea of a container, which is the location of its
containment: a simple enough idea. With into, the situation is slightly more
complicated, as it demonstrates the fact that image schemas can be
compounded to aid in inference about situations or ideas that exhibit
characteristics which can map to attributes of more than one schema. The word
to is an expression of what Lakoff calls the SOURCE–PATH–GOAL31 schema, which
Andrew Ortony (Cambridge: Cambridge University Press, 1993): 203.
31 A convention of the literature is to use small caps to refer to a set of conceptual mappings. The normal admonition accompanying them is that they are not referents to any particular linguistic expression but to a static set of conceptual correspondences.
is an extension of the PATH schema. We understand into as superimposing the
container schema onto the goal/destination of the SOURCE–PATH–GOAL
schema.32 The interesting part, however, arises when people say things like,
“Falling in love is just getting yourself into trouble.” When interpreting this
sentence, we extend our inferences about the space inside containers to
reasoning about states of being, and about transitioning between states of being
in terms of moving along a path toward a destination. Inferences we can make
about being inside containers map to inferences we make about being in
certain states, and so we speak of being in and out of love, despite the fact that
there is no such clearly defined line between these states that corresponds to
the shell of a physical container. In this way metaphor can be both an
indispensable aid and an insidious hindrance to thought.
A second type of image schema comprises what are clumsily known as “vitality
affect contours.” Johnson and Rohrer describe them as
the swelling qualitative contour of a felt experience. We can experience an adrenaline rush, a rush of joy or anger, a drug–induced rush, or the rush of a hot–flash. Even though these rushes are felt in different sensory modalities, they are all characterizable as a rapid, forceful building up or swelling contour of the experience across time.33
Such abstract experiences are of extreme importance to art, and their time–based form understandably lends them relevance to time–based arts like music. Contours like the envelope of a rush are metaphorically extended to
32 Johnson and Rohrer, “We are live creatures,” 34.
33 Ibid., 36.
inferences about the course of actions, a means through which we generate
expectation about those actions.34 As Johnson and Rohrer note, “We crave the
emotional satisfaction that comes from pattern completion, and witnessing even
just a portion of the pattern is enough to set our affect contours in motion.”35
This kind of metaphor is the mechanism by which suspense, resolution,
cadence, and other such musical ideas work. In fact, Candace Brower brings
many of the metaphors discussed thus far into her analysis of Edgard Varèse’s
Density 21.5.36 She analyzes the first seventeen bars in a series of phrases, in
each of which the melody, seen as an agent whose will is a driving force toward
goal–directed motion, strains toward the boundary of a container defined by the
pitches in the given phrase. Each phrase builds tension by slowly expanding the
container’s boundaries as the agent battles both against those boundaries and
against opposing forces (another very basic image schema), encountering
barriers, and resting on the metaphorical platforms of stable pitches. Her
analysis differs from others’, but, by means of the metaphors she applies, a
coherent pattern emerges.
Despite the enormous descriptive power of this mode of thought,
metaphor can also serve to obscure reasoning in subtle ways, as I briefly alluded
34 Johnson and Rohrer give the example of a child quieting down as soon as it sees its parent begin to reach for the bottle. (Ibid., 34.)
35 Johnson and Rohrer, “We are live creatures,” 34.
36 Candace Brower, “Pathway, Blockage, and Containment in Density 21.5,” Theory and Practice 22–23 (1997–98): 35–54.
to above. Michael Reddy was an early pioneer of this discipline, and the parable he invents in his classic paper “The Conduit Metaphor” serves to illustrate one of the most pervasive ways in which it does so.
Reddy invites us to imagine a situation he
calls “the toolmaker’s paradigm”37 wherein a
number of people live alone in a wheel
structure in wedges separated by spoke–like
walls, the outer circumference, and a hub.
(see Figure 4) In this hypothetical world,
there are no possible means of communication except through the hub, and no
information can be gained in any other way about each neighbor’s space. Each
inhabitant can only pass notes through the hub, and when one invents a new
tool with which she improves her own life, she passes a note to the others with
instructions to build it. The instructions being fundamentally imperfect, and
there being a drastic difference in the environment and resources in each cell,
the tools are always manifested differently by each toolmaker, unless one or
more of them engage in a dialogue to figure out more about what was intended
to be built versus what was actually built. Reddy presents this paradigm as a
model of the way communication must actually work: as a cooperative effort
between speaker and listener, or else as only a shadow of the speaker’s intent.
37 Reddy, “The conduit metaphor,” 171–176.
Figure 4: the toolmaker's paradigm (after Reddy)
However, the conduit metaphor, which is the operative metaphor in common speech about communication (in which, for example, words are metaphorized as containers for ideas, handed over packaged and ready to the listener), obscures this cooperative effort, implying that the lion's share of the effort in communication lies with the speaker/packager. Reddy points to the conceptual
development of mathematical information theory as an illustration of the
insidiousness of the conduit metaphor. He first establishes that
“[i]nformation is defined as the ability to make nonrandom selections from some set of alternatives. Communication, which is the transfer of this ability from one place to another, is envisioned as occurring in the following manner. The set of alternatives and code relating these alternatives to physical signals are established, and a copy of each is placed at both the sending and receiving ends of the system. …The whole point of the system is that the alternatives themselves are not mobile, and cannot be sent, whereas the energy patterns, the ‘signals’ are mobile.”38
In light of this simple but incredibly subtle distinction, Reddy portrays the
English language, and the underlying conceptual structure, as an “evil
magician” who flies over the toolmaker’s world and modifies the hub such that
the toolmakers believe they are receiving the tools themselves instead of
instructions to build those tools.
With the power of metaphor both to bring conceptual richness and to
wreak conceptual havoc fresh in the reader’s mind, consider another of Reddy’s
well–worded cautions:
“A code is a relationship between two distinct systems. It does not ‘change’ anything into anything else. It merely preserves in the second system the pattern of organization present in the first system. Marks or sounds are not transmuted into electronic pulses. Nor are thoughts and emotions magically metamorphosed into words. ...Signals do something. They cannot contain anything.”39
38 Ibid., 181. (emphasis in original)
All of this information is relevant to any human, but as I will show below, it is of special relevance to the composer/performer of computer music once the question of interface design arises.
3. Interface strategies
The brief survey presented above suggests that a cognitive theory of metaphor can account simply and elegantly for a number of structures that we call upon in manifesting our language and our music. Given this frame of
organization, which, quite relevantly, is tied directly to bodily movement, one
should permit oneself to be informed by it when approaching the mapping
question in computer music performance.
After another brief review of relevant research, the reader will be
asked to consider an example of a simple controller which can create an
experience coherent with an image–schematic expectation. For the sake of
clarity and simplicity, the controller will be one that can be varied over a single
dimension, and which will then be mapped to control a single audible
parameter. The procedure followed in the conception, construction, and use of
this example controller will be proposed as a general methodology for
39 Ibid., 183–184. (emphasis in original)
metaphorically coherent interface design. A look at existing interfaces and
their relationship to metaphor, and at an interface I built for a performance
piece, then precedes a proposal for future work.
3.1. Manifested metaphors: simple yet telling examples
A pervasive metaphor in English, as well as in many other languages, is
MORE–IS–UP, an extension of the VERTICALITY image schema. Such metaphors
can be realized in physical objects, and the metaphors those objects manifest are reinforced in successive generations of users as they grow up with experience of them.40 Consider for a moment how a mercury bulb thermometer works: as the temperature of the metal increases, its density
decreases but its volume increases. These may be just two ways of thinking about
the same physical result, but it’s important to distinguish that the physical
result of higher temperature, which can be likened to the input to the
thermometer’s “processing,” doesn’t have an inherent vector; it has only a delta
in its physical state which is physically mapped to the output, the
thermometer’s gauge. Whichever way one thinks of it, as an increase in volume
or a decrease in density or any arbitrary change in another attribute entirely, is
irrelevant, as the important aspect is that when the temperature gets hotter, the
substance behaves in a predictable way which can be factored into the design of
40 Lakoff, “Contemporary theory of metaphor,” 241.
the thermometer such that it expresses this change by means of the MORE–IS–
UP metaphor. Such objects “exhibit a correlation between MORE and UP and are
much easier to read and understand than if they contradicted the metaphor.”41
It is key, though, that one realize that in this case one has no control over what
is taken here as the crux of the mapping question, because this mapping is
determined by physical laws. Though one could modify the scale on the output
to indicate Fahrenheit, Celsius, or even mood or DEFCON level for that matter,
the mapping from temperature (input gesture) to metal density/volume
(processing) remains constant and fixed in a one–to–one relationship.
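The contrast can be sketched in a few lines of Python (a hypothetical illustration; the function names and constants are mine and stand in for no real thermometer): the physical stage is a function fixed by nature, while a computer's mapping stage is a value that can be swapped at will.

```python
# Hypothetical sketch: a thermometer's "processing" is fixed by physics,
# while a computer's mapping stage can be replaced freely.

def mercury_column_height(temp_c):
    """Physical stage: column height grows with temperature. This mapping
    is fixed by the thermal expansion of mercury; we cannot choose a
    different one without different physics. Constants are illustrative."""
    return 50.0 + 0.9 * temp_c  # millimetres, made-up coefficients

# Computer stage: the mapping is just a function value, so it can be
# swapped. Both are valid "processing"; only one is coherent with the
# MORE-IS-UP metaphor.
more_is_up = lambda reading: reading           # hotter -> gauge reads higher
contrarian = lambda reading: 100.0 - reading   # hotter -> gauge reads lower

reading = mercury_column_height(30.0)
print(more_is_up(reading), contrarian(reading))
```

The physical stage admits no such swap, which is exactly the asymmetry the next paragraph turns on.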
Such an unbreakable mapping is not the case with computer input to
output. Take an example of a computer control which starts out analogously to
the thermometer example: the fader. (see Figure 5) The most common use for
this control is to change the amplitude of an audio signal, which, unless the
aim is to be contrarian or malicious, is now always mapped to the fader's
path in correspondence to the MORE–IS–UP
metaphor. Certainly, there have been manifestations of both mappings
historically, but it is perhaps testament to the influence of metaphor that the
current standard has been the one to persevere; we map loud to more in our
41 Ibid.
Figure 5: the author actuating a fader
thinking about amplitude in English, so one side of the fader, usually the
extreme that is farthest from the body of the actuating agent, is considered the
top of the fader’s throw.42 Thus, just like the thermometer example, a fader maps
control motion to delta in amplitude coherently with the MORE–IS–UP
metaphor. However, since the hypothetical fader in question controls a
computer output, the mapping is not limited to this one application. What if it
were a pitch slider? How would it succeed or fail at manifesting our
metaphorical understanding of pitch?
To an English speaker, who conceives of pitch in terms of the metaphor
HIGH–PITCH–IS–UP and is conditioned by the conventional mapping of a fader,
this remapping will likely seem so straightforward as to elicit a scoff. The
reader is invited to scoff away, but to know that in fact,
though they are less common, English also exhibits other metaphors for pitch;
low pitch can be deep and high pitch can be shrill, for example.43 Cross–
linguistically, however, HIGH–PITCH–IS–UP is not the only default metaphor used
conventionally to conceptualize pitch. In Kpelle, a language in the Mandé
family of the Niger–Congo macro–family spoken mainly in Liberia, speakers
distinguish wóo su kéte ("voice with a large inside") from wóo su kuro têi ("voice with a small inside"). … These concepts of large and small apply to singing voices, instrumental sounds, and speaking voices, and the idea incorporates both pitch and resonance attributes. A large voice is both lower in
42 I am indebted to James Fei (personal communication) for bringing the BBC–style fader to my attention.
43 The words used are not always nicely paired antonyms.
pitch and more resonant than a smaller voice.44
Also, Shayan et al. have reported a consistent use of a thick/thin metaphor in
Farsi, Turkish, and Zapotec, three unrelated languages.45 Further, Eitan and
Timmers presented an empirical study of a comprehensive set of pitch
metaphors tested against speakers of languages that did not necessarily
conventionally use them. They found that when asked to describe pitch,
subjects would consistently map pairs of antonyms to the pitch vector in the
same way as would native speakers of languages which did conventionally use
those antonym pairs.46 To complicate matters, the coincidence that exists in
English between the top of a slider being both louder and higher pitch
wouldn’t necessarily correspond in a language that uses a different metaphor, as
metaphorical attributes of a complicated concept such as pitch vary along
different vectors. Small sounds (our high sounds) are quieter in languages that
use the big/small metaphor, while big sounds (our low sounds) are louder.47
44 Ruth M. Stone, "Toward a Kpelle Conceptualization of Music Performance," Journal of American Folklore 94/372 (1981): 196. (Please note that for formatting reasons, some diacritics on the Kpelle transcriptions are not properly reflected in this quotation. Please refer to the original source for the authoritative orthography.)
45 Shakila Shayan, Ozge Ozturk, and Mark A. Sicoli, "The Thickness of Pitch: Crossmodal Metaphors in Farsi, Turkish, and Zapotec," Senses & Society 6/1 (2011). The authors note that though Farsi and Turkish are not related, there is a cultural exchange between the two speech communities. However, such is not the case between Zapotec and either of the other two.
46 Zohar Eitan and Renee Timmers, "Beethoven's last piano sonata and those who follow crocodiles: Cross–domain mappings of auditory pitch in a musical context," Cognition 114 (2010): 405–422. Subjects even mapped the Shona crocodile/those who follow crocodile (for low/high) with statistical consistency despite its unusualness for most speakers of other languages.
47 Ibid., 420.
However, following from the above discussion about metaphor and its
grounding in image schematic knowledge of our physical environment, the
finding that these ideas can be readily understood in novel situations is not
surprising; the metaphors exist because they coherently map to an underlying
image schematic structure which is, if not universal, then at least quickly
understandable by our common physical experience.
3.1.1. A thought experiment: a prototype conforming to an alternate pitch metaphor
Perhaps the simplest way to demonstrate my proposition for
metaphorical coherence with a computer interface is to offer a test case in
which the relevant principles are taken into account. The first step in designing
a coherent controller is to choose a metaphor that can be successfully realized.
Were we to choose, for example, the Shona pitch metaphor crocodile/those who
follow crocodile, the implementation might be nontrivial and the mechanism
would probably not be immediately transparent to a potential user.48 If, however,
we were to choose the thick/thin metaphor mentioned above, a physical
realization would be far more intuitive. A brief internet search revealed,
though, that practical acquisition of sensors capable of digitizing such a
change would be prohibitively expensive, and a feasible control interface would
48 No, I don’t actually have an idea how to do this.
be harder to implement than others. Furthermore, whether or not its physical
form would afford49 thinning and thickening would be debatable. For these
reasons, the current attempt will involve the big/small pitch metaphor used in
Kpelle and other languages.50
A good shape for a big/small pitch controller would be a squeezable foam
ball.51 Such a shape and material choice would encourage squeezing, which
would be consistent physically with the chosen metaphor; squishing the ball
smaller would correspond to higher pitch and releasing the ball into its
bigger, relaxed state would correspond to lower pitch.52 However, rather than
the ideal sphere, the prototype for the
controller (see Figure 6) is a cube the size of a small handful, composed of
49 On the notion of affordance, see: Orit Shaer and Eva Hornecker, "Tangible User Interfaces: Past, Present, and Future Directions," Foundations and Trends in Human–Computer Interaction 3/1–2 (2009): 62–63. For incisive clarification, see: Shaleph O'Neill, Interactive Media: The Semiotics of Embodied Interaction (London: Springer–Verlag, 2008): 49–65.
50 Zbikowski cites the use of this metaphor in Bali and Java in: Lawrence M. Zbikowski, "Metaphor and Music Theory: Reflections from Cognitive Science," Music Theory Online 4/1 (1998): note 12.
51 A nod must go to Andrew Mead who, unbeknownst to me until revision of this paper, proposed a very similar thought experiment in a footnote of his article: Andrew Mead, "Bodily Hearing: physiological metaphors and musical understanding," Journal of Music Theory 43/1 (1999): 17, note 13.
52 An added benefit to such a design is that the controller would also be coherent with a tense/relaxed opposition for other sonic parameters, although that application will not be explored here.
Figure 6: prototype control for big/small pitch metaphor
anti-static foam rectangles found in the packaging of integrated circuits, collected
together with two wired electrodes on opposing sides of the cube.53 This whole
assembly is a resistor which is attached to a simple inverter oscillator, such
as can be built with a 74C14 integrated circuit,54 as half of a voltage divider
which feeds the audio input to a
computer. (see Figure 7) In this way one can control the amplitude of the analog
oscillator circuit with the squeezable cube. Importantly, the effect of the
control’s variance on the output of this circuit is irrelevant, so long as it varies
in a scalar fashion between two differentiable extremes; in this case, however,
squeezing the controller causes the amplitude to increase because a shorter
distance between the electrodes yields less resistance. The computer tracks the
amplitude of the incoming signal, using that value to vary the frequency of
some sound generator. The prototype controls a simple oscillator written in
53 A version of this resistor design can be found in: Nicolas Collins, Handmade Electronic Music: the art of hardware hacking, 2nd ed. (New York: Routledge, 2009): 102. For the current demonstration, altered battery terminals were used instead of coins.
54 Ibid., 135.
Figure 7: circuit diagram for the squeeze ball big/small pitch controller. R1 and C1 can be varied to change the frequency of the oscillator (approximate values suggested here). A 2 kΩ resistor was added on the oscillator output to reduce the extremely hot signal.
SuperCollider:
Ndef(\hiamp_to_hipitch, {
	var in = SoundIn.ar(0), amptrack;
	// track the input amplitude, scaled and offset so the result can
	// serve directly as a frequency (amplitude * 1200 + 400 Hz)
	amptrack = Amplitude.kr(in, 0.01, 0.01, 1200, 400);
	// a stereo sine oscillator whose frequency follows the tracked amplitude
	SinOsc.ar(amptrack ! 2, 0, 0.5);
}).play;
Since, in this case, the amplitude increases as the cube is squeezed, the
amplitude tracker’s output can simply be plugged into the frequency argument
of the sine oscillator. We now have a working prototype of a controller that
causes a sound to move to a higher pitch when the controller gets smaller.
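The chain from squeeze to pitch can also be traced numerically. The sketch below is in Python rather than SuperCollider, and the component values and divider topology are my own assumptions for illustration, not measurements of the actual prototype; only the final linear map (amplitude × 1200 + 400) mirrors the code above.

```python
# Hypothetical numbers tracing squeeze -> resistance -> divider output ->
# tracked amplitude -> oscillator frequency.

def divider_output(r_squeeze_ohms, r_fixed_ohms=10_000, v_in=1.0):
    """Voltage divider with the foam resistor as the upper leg: less
    resistance (a harder squeeze) passes more of the signal through.
    Resistances are invented illustrative values."""
    return v_in * r_fixed_ohms / (r_squeeze_ohms + r_fixed_ohms)

def freq_from_amplitude(amp):
    """The same linear map as Amplitude.kr(in, 0.01, 0.01, 1200, 400)."""
    return amp * 1200 + 400

relaxed  = freq_from_amplitude(divider_output(90_000))  # big, relaxed cube
squeezed = freq_from_amplitude(divider_output(5_000))   # small, squeezed cube
assert squeezed > relaxed  # smaller -> higher pitch, per the metaphor
print(round(relaxed), round(squeezed))
```

Each stage here is a plain function, which makes the point of §3.1.2 concrete: the stages are separate, and only their composition yields the coherent whole.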
3.1.2. The prototype as methodological proposal: what really happened here?
Concisely, and in list form, what we did was:
1. pick a metaphor that could be realized physically.
2. build a controller that generates an arbitrary output that can correlate somehow to the chosen metaphor.
3. receive the controller output as an input signal to the computer.
4. map the input through some processing to a sonic parameter metaphorically coherent with the controller gesture.55
5. output the signal.
Following these steps is sufficient to yield a metaphorically coherent interface
both for other instances of simple metaphors and also for more complicated
mappings and metaphors. (see §3.3.2 below) If one were to build on this
example, however, some caveats should be kept in mind.
55 Though the processing in this example is fairly transparent (tracking amplitude and assigning to pitch) it certainly still counts.
What is most important to keep in mind is twofold. First, the
correspondence between the physical form of the controller and the intended
output in step two does not exist until the successful completion of step four,
and only then if step four is approached in keeping with the aim of
metaphorical coherence. Before step four, it is merely a potential correlation.
Second, as is completely obvious when laid out this way, steps one through four
are not one step. These points are stressed to lay out the areas where conduit
metaphor thinking can get in the way of conceptualizing control
interfaces and mapping. Though the simplicity of this example makes it easy to
grasp, one should remember that the signal is not contained within the controller.
Only once we built both the control interface and its chain of causation up to
step four could we generate a signal to transmit, and only when a listener
interprets the signal can a message then be of concern.
In keeping with the spirit of our attempt, we chose to map the
controller’s state to a signal that maintains consistency with our embodied
metaphorical understanding of the controller’s form. It would be entirely
possible to place the variable resistor, the squeezable foam, on the other side of
the voltage divider, thus causing a drop in amplitude from the output of the
analogue circuit when the cube is squeezed. Had we chosen to do it this way,
and then rewritten the computer code such that a drop in amplitude caused a
higher frequency output, then there would be no perceptible difference
between the two versions of the controller. However, had we instead chosen to
correlate the bigger, relaxed state to a higher frequency, by moving the resistor
and leaving the code the same, the controller would no longer be
metaphorically coherent. With such a simple one–to–one mapping, it would
clearly be no less performable, but it would no longer correlate to an attested
metaphorical understanding of pitch. Were we then to build a second controller
to vary the amplitude of the output, creating a foam–ball theremin of sorts, the
“reversed” mapping would correlate well to our metaphorical understanding of
loudness. Squeezing the resistor to create a drop in input amplitude could map
coherently to our output amplitude. Again, reversing the relationship would
destroy the coherence, but not necessarily the usability. The choice in this case
of correlating the compressed state to higher frequency or lower amplitude
demonstrates the crux of a metaphorically coherent control interface.
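The point can be made concrete in a small sketch (Python, with the electrical behavior idealized; the functions are hypothetical stand-ins, not the actual circuit): swapping the hardware and rewriting the code cancel out, while swapping only one of them reverses the mapping.

```python
# Two hardware variants of the squeeze controller, and two software maps.
# Variant A (as built): squeeze -> input amplitude rises.
# Variant B (resistor moved): squeeze -> input amplitude falls.

def input_amp_variant_a(squeeze):   # squeeze in 0.0 (relaxed) .. 1.0 (compressed)
    return squeeze                  # idealized: harder squeeze, hotter signal

def input_amp_variant_b(squeeze):
    return 1.0 - squeeze            # idealized: harder squeeze, quieter signal

def freq_rises_with_amp(amp):       # the code as written for variant A
    return 400 + 800 * amp

def freq_falls_with_amp(amp):       # the code rewritten to match variant B
    return 1200 - 800 * amp

# A with its code and B with its rewritten code are indistinguishable in use:
for s in (0.0, 0.25, 0.9):
    assert freq_rises_with_amp(input_amp_variant_a(s)) == \
           freq_falls_with_amp(input_amp_variant_b(s))

# But variant B driven by the ORIGINAL code reverses the mapping: the big,
# relaxed cube now yields the highest pitch, breaking metaphorical coherence.
assert freq_rises_with_amp(input_amp_variant_b(0.0)) == 1200
```

Either combination is equally playable as a one-to-one control; only the first pair is coherent with the big/small pitch metaphor.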
Along these lines, there has been a fair amount of research
demonstrating that computer interfaces, graphical or tangible, benefit in
usability when they cohere with our conceptual expectations of their behavior.
Jacob et al. have demonstrated this empirically, noting that a controller which
can simultaneously vary three parameters is a faster interface for completing a
task that requires the matching of three conceptually integral attributes, while
one that can only vary two at a time is faster for matching conceptually
separable attributes. Thus, their three–dimensional pointer provides a faster
means to match two shapes in X–Y position and size, while matching X–Y
position and color works better with a mouse; their experiment allows
positioning the shape with an unmodified mouse movement and changing the
color through mouse movement along one axis in conjunction with a button
depressed. They posit that this match between controller and task comes about
because of the conceptual separability of color versus the conceptual integrality
of size in our understanding of the shapes’ attributes.56 Wanderley and Orio57
have also applied these ideas specifically to musical tasks, and Antle, Corness
and Droumeva58 have approached the specific question of interface and
mapping in gesture capture that is of concern in this paper. However, all these
studies exhibit a particular focus; as Wanderley and Orio ask, “[w]hat is part of
the composition, and what is part of the technology? How can we rate the
usability of an input device if the only available tests were done by few–possibly
one–expert and motivated performers?”59 Their concern is more in empirically
testing the validity of these ideas, and applying them to the design of interfaces
56 Robert Jacob et al., "Integrality and Separability of Input Devices," ACM Transactions on Computer–Human Interaction 1/1 (1994): 3–26.
57 Marcelo Wanderley and Nicola Orio, "Evaluation of Input Devices for Musical Expression: Borrowing Tools from HCI," Computer Music Journal 26/3 (2002): 62–76.
58 Alissa Antle, et al., "Human–computer–intuition? Exploring the cognitive basis for intuition in embodied interaction," International Journal of Arts and Technology, 2/3 (2009): 235–254.
59 Ibid., 62.
for general musical use. However, on the strength of their work, as well as on
more performance–minded work such as that of Wessel and Wright,60 I am
advocating the adoption of image schemas and metaphor in the individual
practice of interface design, partly in answer to Wanderley and Orio’s question;
in computer music today, interface design and mapping choice, along with the
more traditional elements of structure, method, material, form and aesthetic are
all definitively part of the compositional process.
3.2. Existing interfaces and conceptual coherence
At this point, the question bears asking: is metaphorical coherence really
necessary? Palle Dahlstedt, in speaking about mapping schemas for live
synthesizer improvisation, offered that
It has been said that a good mapping should be intuitive, in the sense that you should immediately understand the internals of the system. But this is not true for most acoustic instruments. Many musicians do not know their instrument from a physics point of view. Some phenomena are extremely complex, e.g., multiphonics in wind instruments, but instrumentalists learn to master them.61
In his paper, Dahlstedt is advocating a mapping system that involves a large
degree of randomness, and thus raises this objection in order to defend his
position that a fundamentally incomprehensible mapping system can still yield
60 David Wessel and Matthew Wright, “Problems and Prospects for Intimate Musical Controlof Computers,” Computer Music Journal 26/3 (2002): 11–22.
61 Palle Dahlstedt, "Dynamic Mapping Strategies for Expressive Synthesis Performance and Improvisation," Computer Music Modeling and Retrieval: Genesis of Meaning in Sound and Music. 5th International Symposium, CMMR Revised Papers (2008): 237.
an expressive performance instrument. Along similar lines, Ian Whalley,
writing on the idea of software agents, states that one "should then
allow each interactive session to develop something of its own language.
Machine agency can then lead or follow in the interactive process with human
agency, acknowledging that not all conversations are symmetrical in terms of
knowledge and participation.”62 He advocates the other extreme, proposing to
use adaptive and semi-autonomous software-agents to perform the mapping on
the computer side.
These are completely valid points, and I would be sorry to see the approaches
and results they represent gone from the world of computer music. I would venture one
point of clarification, however, with regard to Dahlstedt’s characterization of
“intuitive” interfaces. The system of mapping proposed above does not suppose
or even advocate that the user of a metaphorically coherent interface
“immediately understand the internals of the system.” On the contrary,
concerning the point of mapping the computer’s input to modulation of an
output, as we saw with our input amplitude controlling the output pitch, it is
irrelevant to the user of such an interface what the internals of the system
actually are. It is precisely what is limited to the externals of the system that is
of concern for the use of such a controller. Indeed, as advocated by the IUUI
62 Ian Whalley, “Software Agents in Music and Sound Art Research/Creative Work: currentstate and possible direction,” Organised Sound, 14/2 (2009): 165.
research group (Intuitive Use of User Interfaces), the very notion of intuitive use
precludes any conscious understanding on the part of the user; they define
intuitive use as the unconscious use of pre–existing knowledge.63
Certainly, however, it is usually the case in computer music that the
builder of the interface and the mapper of the stimulus to the response is also
the composer and the performer, and so that person likely does “immediately
understand the internals of the system.” Indeed, in computer music, choosing
your mappings is part of the practice. However, the composer/performer, as a
human animal, is still beholden to image schemas, which “[b]ased on embodied
experience, … are learnt early in life, shared by most people and processed
automatically. Violating the metaphorical extensions results in increased
reaction times and error rates.”64 As a young person in the 21st century
composing and performing electronic music, and like many of my compatriots,
lacking confidence about my real–time musical performance skills while also
feeling the pressure to build and learn to play a new instrument for each
piece, I welcome the possibility of "decreasing reaction times and
error rates” so that I can focus on the music itself. That said, however, the music
itself, these days like no earlier time in history, is so incredibly varied that all
approaches should be welcome.
63 Anja Naumann, Jörn Hurtienne, Johann Israel, et al., "Intuitive Use of User Interfaces: Defining a Vague Concept," Engineering Psychology and Cognitive Ergonomics: Lecture Notes in Computer Science, 4562 (2007): 128–136.
64 Shaer and Hornecker, “Tangible User Interfaces,” 64.
Bearing these ideas in mind, let us now turn to examination of some
existing interfaces: the tangible user interface object in general, and then a
particular adaptation of that technology that I built for a piece of my own.
3.2.1. Tangible user interfaces (TUIs)
TUIs (Tangible User Interfaces) are a form of computer interface whose
focus places them in opposition to GUIs (Graphical User Interfaces), and they pertain to
all uses of computers. Shaer and Hornecker cite three basic types of TUIs:
interactive surfaces on which objects, usually marked somehow, are placed,
constructive assemblies of smaller interactive modules, and token and constraint
systems that limit the movements or positioning of objects by means of physical
constraints.65
One of the most powerful ideas with regard to TUIs is the notion of
“space–multiplexing,” which is given as a property of graspable user interfaces, a
subset of TUIs. Shaer and Hornecker describe it:
When only one input device is available, it is necessarily time–multiplexed: the user has to repeatedly select and deselect objects and functions. A graspable user interface on the other hand offers multiple input devices so that input and output are distributed over space, enabling the user to select an object or function with only one movement by reaching for its physical handle.66
TUI research also touts the “integration of physical representations and
control...which basically eliminates the distinction between input and output
65 Ibid., 49–50.
66 Ibid., 47.
devices.”67 User feedback is collocated with the input device, thus giving the
seductive illusion that the user is directly touching the digital information. The
degree to which this coupling takes place is measured in the literature on a
continuum called, problematically, embodiment, which axis “represents how
closely the input focus is tied to the output focus in a TUI application, or in
other words, to what extent does the user think of the state of computation as
being embodied within a particular physical housing.”68 In situations where the
“embodiment” index is high, the objects can help to extend memory during
reasoning or other kinds of thought:
Actions such as pointing at objects, changing their arrangement, turning them, occluding them, annotating, and counting all recruit external elements (which are not inside the mind) to decrease mental load.69
Perhaps because the TUI discipline is largely concerned with general
computing and comes in a large part from product design, an arena where
mappings are predetermined by the designer and usually not alterable by the
end user, the coupling between TUIs, which generate an input, and their output
is often discussed as effectively a direct connection. For example, Overbeeke and
Wensveen discuss action to function coupling in a way that implies that the
design process obviates the basic concern in this essay. In their paradigm,
although they cite six parameters that need to coincide for a “natural coupling,”
67 Ibid., 48.
68 Ibid., 52–53.
69 Ibid., 67.
all the parameters map directly to the output; the designer chooses and fixes the
entirety of the black box.70 Thus, much of the discussion about what is
essentially mapping conflates the first four steps above (in §3.1.2), and the
discourse is replete with conduit metaphor phrases implying that the TUI
objects contain the information read from the outputs with which they are
coupled.
Nonetheless, these interfaces are physical objects which were designed
by humans, for use by humans, and as such do exhibit a high degree of
metaphorical coherence despite any discussion about them. For a brief example,
consider the interactive surface class of TUIs. The surface
of interaction is, both in a physical and in a metaphorical
sense, a bounded space. The visual patterns recognized
by the computer, called fiducial markers (see Figure 8),
are objects in this bounded space, and are therefore
beholden to the “physical” laws that define that space. In the commercial
implementation of the reacTable, for example, some actions are triggered by
proximity between fiducials representing certain functions. These interactions
can be understood by the user in terms of attraction forces, as magnetism can
be, when inputs and outputs between objects are connected dynamically and
70 Kees Overbeeke and Stephan Wensveen, "From perception to experience, from affordances to irresistibles," Proceedings of DPPI03 (Designing Pleasurable Products and Interfaces) (New York: ACM, 2003): 95–6.
Figure 8: a fiducial marker (76)
automatically based on physical collocation. That these markers are objects in a
bounded space implies that they are also destinations on an indeterminate path
through this space,71 which lends such interfaces to exploration via the journey
metaphor as a point of departure.
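A minimal sketch of this kind of proximity patching follows (my own reconstruction of the idea in Python, not reacTable code; the marker ids, positions, and radius are invented, though 76 echoes the marker in Figure 8):

```python
import math

# Fiducial objects on the surface: an id, a position, and a role.
objects = {
    76: {"pos": (0.20, 0.30), "role": "oscillator"},
    12: {"pos": (0.25, 0.32), "role": "filter"},
    33: {"pos": (0.80, 0.70), "role": "filter"},
}

def connected_pairs(objs, radius=0.15):
    """Connect an oscillator to any filter within `radius`, so that moving
    markers together on the surface patches them dynamically, as if by an
    attractive force such as magnetism."""
    pairs = []
    for a, oa in objs.items():
        for b, ob in objs.items():
            if oa["role"] == "oscillator" and ob["role"] == "filter":
                dx = oa["pos"][0] - ob["pos"][0]
                dy = oa["pos"][1] - ob["pos"][1]
                if math.hypot(dx, dy) < radius:
                    pairs.append((a, b))
    return pairs

print(connected_pairs(objects))  # only the nearby filter is patched in
```

The connection exists only as a function of collocation in the bounded space, which is what lets the user read it as a "physical" law of that space.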
Beyond the interactive surface category, constructive assemblies are also open
to metaphorical interpretation, as the bounded space they occupy is the same
one which we also occupy. Objects in this category are often manifested as
sequencers in a way that corresponds well to our metaphorical understanding
of time in terms of space and movement along a path.72 However, these
interfaces are only intuitive so long as their chosen mappings pan out with
respect to our physical expectations.
In practice, while from a design perspective token and constraint systems
form a separate category, they can be seen with respect to the present
discussion to fall into subcategories of interactive surfaces or constructive
assemblies. Whether their plane of function is taken to be a surface one level
abstracted from our environment or to be part of our environment itself would
be the distinguishing factor in a given implementation.
With these ideas as background, I sought with the piece discussed below
71 See Lakoff's discussion of duality in metaphorical representation: Lakoff, "Contemporary theory of metaphor," 218–229.
72 Lakoff, "Contemporary theory of metaphor," 216–18. For examples of this kind of interface, see: Martin Kaltenbrunner, "Musical Building Blocks," Tangible Music, <http://modin.yuri.at/tangibles/?list=2> (15 Apr 2011).
to incorporate aspects of this type of interface in a simple and metaphorically
coherent controller.
3.2.2. The “CHOAM” fiducial ball controller
"Combine Honnette Ober Advancer Mercantiles"73 is a solo electronic
performance piece that I composed and performed in 2010. With many of the
ideas discussed thus far floating nebulously apart from the verbal level of my
mind, I attempted to build a new interface which would yield a more human–
controlled sound than I had been able to achieve to that point with other live
electronic pieces. I had been able to develop sounds which were expressively
modulated by the rotation and position data from fiducial markers, recognized
through the built–in camera on my laptop by means of the open–source
reacTIVision software and fed through Max/MSP to SuperCollider. The original
idea was to place them on large objects with which one or more performers
would interact in a rule–based game piece. However, as most of the time was
spent with sound design, and the concert was impending, that idea was
scrapped for a much simpler approach.
I duct–taped over the disagreeable pattern covering an inflatable rubber
ball, and placed the fiducial markers in various locations on its surface. Some
73 After the interstellar trade conglomerate in Frank Herbert's Dune series. (The piece was written for a science fiction–themed concert.)
were repeated, and they were grouped in clusters such that different areas on
the ball would have distinct sonic characters. The compositional process was
then the choice of sounds, parameters of variance, and the arrangement of the
markers on the sphere.
Though I didn’t realize it at the time, the layout of the controller
manifested the MUSIC–IS–A–JOURNEY metaphor, an extension of the SOURCE–
PATH–GOAL schema. Because I had arranged the markers as stopping points,
they fulfilled the entailment SOUNDS–ARE–DESTINATIONS, and the interface
afforded exploration of the sonic “landscape,” which exploration I embarked
upon semi–improvisationally. An important point to note is that this type of
interface allows tactile control over rather high–level aspects of the musical
performance. Indeed, a major concern with my inquiry into interface is finding
a way to resolve questions of form in real–time and by means of a physical
connection.
A shortcoming of this interface was the use of the built–in camera on the
laptop display; its position opposing me necessitated a mental remapping of the
image, as the metaphorical agent should have been navigating the surface from
my point of view rather than from the computer’s. Though similar visual
distortions have been shown to be tolerable and even to be elided
experimentally with habituation,74 the point with this interface was to be usable
74 George Stratton, “Vision without inversion of the retinal image,” Psychological Review 4/4
without any training. A vast improvement would be to use an external camera
mounted somewhere on my head such that the computer’s input image would
be from the same point of view as mine. Exploration of the surface between
sound–destinations would then be completely straightforward and require no
training or wasted compositional effort whatsoever.
3.3.Proposed future work
Along the lines of the ideas briefly skirted in “CHOAM,” my future work
will incorporate gestural imaging technology to more directly map gesture to
sound in conjunction with the use of TUI objects. This mapping, however, will
not be limited to the simplistic one–to–one correspondence of the thought
experiment in the above demonstration (in §3.1.1).
3.3.1. Gestural imaging and marked objects
Within the last year, affordable full–body gesture recognition through
three–dimensional depth imagers has finally become a reality. The cheapest
such imager is the Xbox Kinect controller, but open–source alternatives of
similar function exist, such as ofxStructuredLight, which projects a known
image onto a surface and, in combination with stereo imaging, calculates the
distance of everything in the field of view from distortions in the known
(1897): 341–360.
image and from triangulation between the two cameras. This technique can
acquire similar depth images without proprietary equipment. Of
particular relevance to this line of exploration is research by linguist Eve
Sweetser, who focuses on gesture as co–articulated with normal conversational
speech. Sweetser sees gesture of this nature, in TUI terminology, as space–
multiplexing metaphorical expression, often inconsistently, though never
incoherently, with the metaphors that are co–expressed verbally in the time–
multiplexed flow of speech.75 I plan to use this technology both to gesturally76 explore virtual representations of pitch space and of other multidimensional spaces, and to sonify dance along lines similar to Antle et al.77
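The geometry underlying the structured–light and stereo technique described above reduces to triangulation: a feature of the projected pattern that appears displaced between the two views (the disparity) lies at a depth inversely proportional to that displacement. A minimal sketch follows; the focal length and baseline values are illustrative, not those of any particular device:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched point from stereo disparity: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("point at infinity or unmatched")
    return focal_px * baseline_m / disparity_px

def depth_map(focal_px, baseline_m, disparities):
    """Per-pixel depths; unmatched points (d <= 0) map to infinity."""
    return [
        depth_from_disparity(focal_px, baseline_m, d) if d > 0 else float("inf")
        for d in disparities
    ]

# A pattern feature seen 40 px apart between the two views, with an
# illustrative 600 px focal length and 7.5 cm baseline, sits 1.125 m away.
z = depth_from_disparity(600.0, 0.075, 40.0)
```

The inverse relationship is why depth resolution degrades with distance: a one-pixel disparity error perturbs far points much more than near ones.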
Besides exploration of full–body gesture, I plan to continue development
of tangible objects. Of the many metaphors that can be applied to performance
and composition of music, I find MUSIC–IS–A–JOURNEY the most amenable both
to these objects and to realtime exploration of musical form. From this point of
view, I plan to expand on the “CHOAM” interface by building a larger ball
covered with velcro on which movable and recombinable fiducial badges and/or
barcodes can be placed. The new interface would use barcodes or smaller
fiducials to “zoom into” the fiducial destinations beside them, thus redefining
75 Eve Sweetser, “Looking at space to study mental spaces: Co–speech gesture as a crucial data source in cognitive linguistics,” in Methods in Cognitive Linguistics, eds. Monica Gonzalez–Marquez, Irene Mittelberg, and Seana Coulson (Amsterdam: John Benjamins Publishing, 2007).
76 That is, “metaphorically by means of physical movement.”
77 Antle, et al., “Human–computer–intuition?” 242.
the explorable space bounded by the object in hand. “Out–zooming” codes
would bring it back “up” a level, or physical zooming would shift focus to the
larger fiducials. One ball could thus effect multiple levels of control. The
compositional effort would then be in effectively systematizing the distribution
of markers to allow flexible and intuitive control with relatively constant
placement, keeping in mind that the markers would be movable.
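The zooming behavior described above can be modeled as navigation through a tree of marker spaces. The following sketch is hypothetical: the class names, the reserved out–zoom marker ID, and the dictionary layout are illustrative choices, not part of any existing “CHOAM” code.

```python
OUT_ZOOM = 0  # hypothetical reserved marker id for "zoom back out"

class FiducialSpace:
    """A level of the ball: markers either nest sub-spaces or trigger sounds."""
    def __init__(self, name, children=None, sound=None):
        self.name = name
        self.children = children or {}   # fiducial id -> FiducialSpace
        self.sound = sound               # payload for leaf destinations

class BallNavigator:
    """Tracks the current zoom level as a stack of spaces."""
    def __init__(self, root):
        self.stack = [root]

    @property
    def current(self):
        return self.stack[-1]

    def scan(self, fid):
        """React to a scanned fiducial id; returns a sound or None."""
        if fid == OUT_ZOOM and len(self.stack) > 1:
            self.stack.pop()             # back "up" a level
            return None
        child = self.current.children.get(fid)
        if child is None:
            return None                  # unknown marker: ignore
        if child.children:
            self.stack.append(child)     # "zoom into" a sub-space
            return None
        return child.sound               # leaf: trigger its sound

root = FiducialSpace("root", {
    1: FiducialSpace("rhythms", {2: FiducialSpace("pulse", sound="pulse-drone")}),
})
nav = BallNavigator(root)
nav.scan(1)              # zoom into the "rhythms" sub-space
sound = nav.scan(2)      # leaf marker: returns "pulse-drone"
```

Because the markers are movable, only the tree structure is fixed; re-velcroing a badge simply rebinds a marker ID to a different branch.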
Along the lines of the TUI constructive assembly model, marking physical
objects with symbols, either barcodes or fiducials, is also in the works. I am
planning small sculptures which bear barcodes that can be scanned to produce
sound, effectively making the sculptural object the score, though it may only be
machine–readable. First will come a simple long sheet of paper, with barcodes as sound destinations, along which a branching path will be defined; sounds will be both signified by and encoded in the barcodes along this path. The
piece would essentially be a graphical score, and would provide a smooth
conceptual segue from more traditional forms of representation into slightly
less traditional three–dimensional manifestations.
3.3.2. Mapping image schemas to low or high level outputs
Aside from these mainly technical directions, I plan also to look deeper
into the mapping question. While one can usefully think of mapping in
Drummond’s more mechanical terms, taking a step back and re–acknowledging
the end goal can also be of help. Wanderley and Orio do so by positing different
levels of musical control, which they refer to as “note–level” and “score–level”
control.78 However, rather than adopt their terminology here, I prefer to use the
more generic “low–level” versus “high–level” control, as these terms don’t invite
the possibility of drawing false distinctions in a style of music that favors
transgression of traditional demarcations. These terms bound a continuum along which an arbitrary level of complexity can be seen as possible, rather than arbitrarily privileging particular areas along it.
How can we approach this low– or high–level control? Johnson posited
several basic image schemas beyond the few discussed above, some of which
may be appropriate to inform the construction of interfaces in keeping with the
proposed methodology. Borrowed from Brower, a list of some promising
schemas follows:
1. containment
2. balance
3. blockage
4. diversion79
Low–level one–to–one mappings of these image schemas would manifest like
the pitch ball, and higher–level mappings like the “CHOAM” controller. As an
example, given the near ubiquity of three–axis accelerometers at the time of
78 Wanderley and Orio, “Evaluation of Input Devices for Musical Expression,” 69.
79 See Brower, “Pathway, Blockage, Containment,” 36.
writing, both in laptops and in mobile phones, the balance schema seems ripe
for appropriation in a control interface. A semi–fixed composition could be
guided through tension/resolution–based musical structures by tilting a control
device in and out of the level plane.80
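As a sketch of how the balance schema might be wired to such an accelerometer, the tilt of the device out of the level plane can be computed from the gravity vector it reports at rest and scaled into a single tension parameter. The function names and the linear mapping are illustrative assumptions, not a specification of any particular device API:

```python
import math

def tilt_angle(ax, ay, az):
    """Angle (radians) between the device's z axis and gravity,
    from a three-axis accelerometer reading at rest."""
    g = math.sqrt(ax * ax + ay * ay + az * az)
    if g == 0:
        raise ValueError("no gravity vector measured")
    # Clamp guards against floating-point drift outside acos's domain.
    return math.acos(max(-1.0, min(1.0, az / g)))

def tension(ax, ay, az, max_tilt=math.pi / 2):
    """Map tilt out of the level plane to a 0..1 tension parameter:
    level = full resolution (0), edge-on = maximum tension (1)."""
    return min(tilt_angle(ax, ay, az) / max_tilt, 1.0)

# Flat on the table: no tension; tipped fully on edge: maximum tension.
flat = tension(0.0, 0.0, 1.0)
```

The output could then drive whatever musical parameter encodes tension/resolution, with returns to the level plane enacting cadential resolution.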
The balance schema could be combined with the containment schema
using a balance scale as an interface. Tokens could be placed on the pans of the
scale, and then recognized by the computer as triggers for sound sources, either
by RFID or by a visual recognition scheme, while the tokens’ physical weight would offset the scale’s balance, and thus alter the composition of the musical structure. Because the scale is a free–swinging structure, the blockage and diversion schemas could be factored in by tracking the rotation of the torque arm and pans,
executing appropriate changes when reversal (on the X–Y plane) or deflection
(on the Z plane) of direction occurs. Force and agency in such an interface
would come directly from the performer.
These hypothetical forays into metaphorically coherent interface design
serve to illustrate the kind of possibilities that pursuit of this line of thought
can open up. However, the mechanics of generating computer music are not the only area that can benefit from the application of these ideas.
80 Along vaguely similar lines, as mentioned above, the pitch ball example could also be made to conform to a tense/relaxed concept by mapping to higher–level musical structures in a similar way.
4. Informing the conceptual sphere
Metaphor is also a rich conceptual domain that can be drawn upon for
compositional inspiration. The most elaborate piece I have produced to date serves as an example of a work informed by this area of inquiry, both in its musical and in its conceptual content.
4.1. “Choices:” just pitch space, the conduit metaphor
“Choices” incorporates much of what I have learned during my study at
Mills. It is a multi–modal dance and electronic music piece made in
collaboration with Rebecca Gilbert, a Bay Area modern dancer and the main
choreographer.81 In the piece, six dancers first bring plain, empty boxes, on
which are affixed two–dimensional Data Matrix barcodes bearing SuperCollider
code, to the musician, in the role of the checker, to be scanned for them,
resulting in an atmosphere which is polyrhythmic, pitched, and stable. Later
they take it upon themselves to scan codes on different parts of these boxes,
resulting in a much different sonic atmosphere, one that is chaotic, arrhythmic, and textural. The first six or so minutes, the pitched atmosphere, consists of more homogeneous group movement, from which some individual dancers show
a desire to break away, while the second half, about six minutes of the chaotic
81 Five other dancers, Kate Knuttel, Sergio Lobito, Mica Miro, Jeanne Platt, and Natalie Rael, also contributed to the development of the movement in the piece.
atmosphere, is characterized by highly social interactions between pairs of
dancers and socially motivated actions by individual dancers. The course of the
melody in the first half is indeterminate with respect to performance, varying
depending on the randomly selected box. The pitch material comes from a scale
composed of sixty–three pitches in a single octave in a seven–limit just–intoned
pitch space from which the harmonic progression is built by transposing a
single chord structure gradually to different center pitches within the octave.
The musical structure is complicated by the fact that the melodic line, derived
from a path moving from pitch to pitch within the chord (see Figure 11 below),
is separated into three parts, with rests holding places for pitches that appear in
another part. All of these parts played together with an identical rhythmic
structure would yield the underlying melody, but each part in practice has a
different number of beats, which lends a polyrhythmic texture and
systematically varying pitch contour when played together. Each box contains
codes for parts of the pattern around a particular tonal center, and so each time
a dancer brings up a box to be scanned, the chord goes through a transposition,
which happens in three stages by virtue of its split into three parts. The chaotic
material is mainly synthesized sounds modulated in various ways both by the
din of shopping carts rattling through a large warehouse and by the kernel of
my computer’s operating system read as a sound file.
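The splitting of the melody into three interlocking parts can be illustrated schematically. The round–robin assignment below is a simplification of the actual score logic, and the pitch numbers are arbitrary placeholders; the point is only that rests (here `None`) hold places for pitches sounding in another part, so the parts sum back to the underlying line:

```python
def split_melody(melody, n_parts=3):
    """Distribute a melody across interlocking parts, hocket-style:
    each part keeps every n-th pitch and rests (None) elsewhere."""
    parts = [[None] * len(melody) for _ in range(n_parts)]
    for i, pitch in enumerate(melody):
        parts[i % n_parts][i] = pitch    # round-robin assignment
    return parts

def recombine(parts):
    """Played with identical rhythm, the parts reconstruct the melody."""
    return [next(p for p in column if p is not None)
            for column in zip(*parts)]

melody = [0, 4, 7, 12, 7, 4]
parts = split_melody(melody)
assert recombine(parts) == melody        # the parts interlock losslessly
```

Giving each part a different loop length, as in the piece, is what turns this lossless interlocking into a polyrhythmic texture with a systematically varying pitch contour.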
The program notes were presented as follows:
Fredric Rzewski says of our choice to participate in the politics of our communities, “If people choose to ignore this fact [of their responsibility] (whether consciously and spontaneously, or because they are manipulated into doing so), if they choose to turn their minds away from politics, they give up their right to share in the political life of the community. They in fact abandon their duty to contribute to the collective organization of that community's future.”82 (emphasis added) It is of the utmost importance that we choose whether or not to be manipulated.
By this oblique reference to politics, the piece is meant to convey that as
members of a society, despite being pulled in many directions by many
subgroups with their own self–interests in mind, it is our responsibility to
participate in the course of our collective progress.83 However, many powerful
elements seek to manipulate the majority into believing they haven’t the right to
this participation, that only those elements’ own idea of a proper course should
be followed. The ritual in which the dancers take part in the first half, “the purchase,” is a familiar one to us, and is also a powerful form of manipulation, though by no means the only one.
The piece incorporates metaphor in two ways that demonstrate the
fruitfulness of conceptual metaphor in composition: firstly, in the use and
exploration of just–intoned pitch–space, and secondly in its discourse on the
conduit metaphor.
82 Fredric Rzewski, “Music and Political Ideals,” in Nonsequiturs: writings and lectures on improvisation, composition and interpretation, eds. Gisela Gronemeyer and Reinhard Oehlschlägel (Köln: MusikTexte, 2007): 188–200.
83 Progress in the sense of movement somewhere, not necessarily betterment. In other words, in our evolution, in its correctly construed sense.
The present discussion about just–intonation must necessarily assume
some background knowledge.84 The piece explores pitch in terms of the seven–limit just–intoned pitch space, which is a three–dimensional representation of pitch relationships in whole–number ratios involving up to three primes (one prime factor per XYZ axis). The portrayal of these relationships in what one can understand as three–dimensional space entails a number of correspondences which can carry over from that understanding and apply to reasoning about pitch relationships and movement within that space.
Chords can thus be represented as shapes within that space (see Figure 9), shapes which remain consistent in their internal relationships regardless of their transposition within that space (see Figure 10), purely by virtue of the system’s organization. With regard to this piece, the melody was built by treating these chord shapes as path descriptors, with the pitch sequence manifesting the course of travel along that path (see Figure 11). The compositional approach to transposition and melodic structure can therefore be seen as a metaphorical extension of the SOURCE–PATH–GOAL schema.

84 For the reader who would like more background, the author would suggest: David B. Doty, The Just Intonation Primer: an introduction to the theory and practice of just intonation, 3rd ed. (San Francisco: Other Music, Inc., 2002–6).

Figure 9: chord shapes pictured in a 7–limit pitch space: dominant 9th (top) and the "4th/6th" chord (bottom)
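These lattice relationships can be made concrete in a few lines. In the sketch below, a lattice point is a triple of exponents of the primes 3, 5, and 7 (octaves factored out by reduction into [1, 2)), a chord shape is a list of such points, and transposition is vector addition, which is exactly why a shape’s internal relationships survive transposition. The particular chord chosen is illustrative, not one from the piece:

```python
from fractions import Fraction

PRIMES = (3, 5, 7)   # one lattice axis per prime above 2

def ratio(coords):
    """Frequency ratio of a lattice point (exponents of 3, 5, 7),
    octave-reduced into the interval [1, 2)."""
    r = Fraction(1)
    for p, e in zip(PRIMES, coords):
        r *= Fraction(p) ** e
    while r >= 2:
        r /= 2
    while r < 1:
        r *= 2
    return r

def transpose(shape, offset):
    """Translate a chord shape (a list of lattice points) by a lattice
    vector; internal relationships are preserved by construction."""
    return [tuple(c + o for c, o in zip(point, offset)) for point in shape]

# A 4:5:6:7 seventh chord as a shape on the lattice:
chord = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
ratios = [ratio(p) for p in chord]        # 1, 5/4, 3/2, 7/4
shifted = transpose(chord, (1, 0, 0))     # the whole shape moved up a fifth
```

Because `Fraction` arithmetic is exact, every interval in the transposed shape is precisely the original interval multiplied by 3/2 (octave-reduced), mirroring the geometric invariance shown in Figure 10.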
Perhaps at a deeper level than the metaphors involved in the
composition of the pitch material is the conceptual relationship of the piece to
the conduit metaphor. Of central importance is the distinction Reddy raised
between the signal and the message.
During the course of the average person’s humdrum, everyday manipulation into complacency, conduit metaphorical messages are abundant. He is made to feel that his life is empty, that there are things that can be acquired that can fill it, or that his spirit is empty and that some beliefs can be imbibed that can fill it, or made to believe in any number of conceivable lacunae and offered their
purported remedies. Running throughout this type of discourse is the underlying assumption that there is an indefinable something that can be received to fill these lacunae, and the further, unspoken assumption that these things will be received, unpacked, and taken in without effort, as that effort has already been expended by the packager, the giver as he would be led to believe.

Figure 10: chord transpositions as movement in pitch space

In the piece, the checker/musician represents the mouthpiece of the
forces who bear this message, and the computer, the sound source, is the hub in
Reddy’s toolmaker’s paradigm. The music is the set of instructions which are to
be interpreted. In this case, the “fourth wall” of the toolmaker’s wheel is broken in that the instructions are passed between the performers and the audience, rather than between each other. In the first half, the signal, which is the sound itself, open for
interpretation, is constrained by the ritual in which the dancers must
participate. They present these empty boxes hoping for the solutions they are
purported to hold, and leave confused when the boxes are immediately
discarded. Once they realize that the information they seek is not contained
within these shells, they are freed from the group bondage, and interpersonal growth and relationships can ensue. Thus, the composition can be interpreted
as a loose portrayal and extension of Reddy’s toolmaker’s paradigm.
It should be noted that this piece was originally conceived of long before
my research into conceptual metaphor and embodied cognition began in
earnest. However, the ideas in Reddy’s article had been simmering below the
level of verbal cohesion at least since then. It is a testament to their coherence with life experience that this piece, on which I had expended so much thought, conformed so cleanly to interpretation through their lens.

Figure 11: melodic structure as path traversal along the chord shape (pitches are destinations)
Though the conduit metaphor can be a dangerous concept when
theorizing about many ideas that require conceptual precision, on an artistic
level it can provide rich and nuanced possibilities, as it underlies so much of
our thought about social mores and conventions. “Choices” is a telling
illustration of how it, and the theory of conceptual metaphor Reddy’s article
helped spark, have informed my work.
5. Conclusion
A theory as pervasive, and of such clear generative conceptual power, as the contemporary theory of metaphor cannot be ignored when considering the study or production of a symbolic system of such fundamental importance to the human as music. Building a system that takes
into account this fact of our status as animals in a physical world, one that is
informed by the linguistic clues which betray our understanding of that
situation, can open the door onto a world of expressiveness that might
otherwise remain hidden.
It is of paramount importance that one keep in mind the different stages
of mapping computer input to output, as this rapidly changing technology may
metamorphose into something that, possibly by virtue of our metaphorical conception of it, obscures these distinctions. Nonetheless, that metaphorical
understanding is a deep source domain which can be explored artistically to
great benefit.
6. Appendix: contents of accompanying media
video partition:
1. Combine Honnette Ober Advancer Merchantiles
2. Choices
data partition:
1. source code and documentation for “CHOAM”
2. source code and documentation for “Choices”
3. copies of open-source or free software used in performance
7. Bibliography
Adams, John D.S. “Giant Oscillators.” Musicworks 69 (1996): as reprinted at <http://www.emf.org/tudor/Articles/jdsa_giant.html> (15 April 2010).
Antle, Alissa N., Greg Corness, and Milena Droumeva. “Human–computer–intuition? Exploring the cognitive basis for intuition in embodied interaction.” International Journal of Arts and Technology 2/3 (2009): 235–254.
Bown, Oliver, Alice Eldridge, and Jon McCormack. “Understanding Interactive Systems.” Organised Sound 14/2 (2009): 188–196.
Brower, Candace. “Pathway, Blockage, and Containment in Density 21.5.” Theory and Practice 22–23 (1997–98): 35–54.
Brown, Chris and John Bischoff. “Indigenous to the Net: Early Network Music Bands in the San Francisco Bay Area.” 2002. <http://crossfade.walkerart.org/brownbischoff/IndigenoustotheNetPrint.html> (15 April 2010).
Collins, Nicolas. Handmade Electronic Music: the art of hardware hacking. 2nd ed. New York: Routledge, 2009.
Collins, Nick. “Live Coding Practice.” Paper presented at the International Conference on New Interfaces for Musical Expression, New York, USA, June 6–10, 2007.
Dahlstedt, Palle. “Dynamic Mapping Strategies for Expressive Synthesis Performance and Improvisation.” Computer Music Modeling and Retrieval: Genesis of Meaning in Sound and Music. 5th International Symposium, CMMR 2008 Revised Papers (2008): 227–242.
Doty, David B. The Just Intonation Primer: an introduction to the theory and practice of just intonation. 3rd ed. San Francisco: Other Music, Inc., 2002–6.
Drummond, Jon. “Understanding Interaction in Contemporary Digital Music: from instruments to behavioral objects.” Organised Sound 14/2 (2009): 124–133.
Eitan, Zohar and Renee Timmers. “Beethoven’s last piano sonata and those who follow crocodiles: Cross–domain mappings of auditory pitch in a musical context.” Cognition 114 (2010): 405–422.
Edelman, Gerald. Bright Air, Brilliant Fire: on the Matter of the Mind. New York: Basic Books, 1992.
Hofstadter, Douglas R. Gödel, Escher, Bach: an Eternal Golden Braid. 20th anniversary ed. New York: Basic Books, 1999.
Jacob, Robert J. K., Linda E. Sibert, Daniel C. McFarlane, and M. Preston Mullen, Jr. “Integrality and Separability of Input Devices.” ACM Transactions on Computer–Human Interaction 1/1 (1994): 3–26.
Johnson, Mark and Tim Rohrer. “We are living creatures: Embodiment, American Pragmatism and the cognitive organism.” in Cognitive Linguistics Research, 35.1: Body, Language, and Mind, Volume 1: Embodiment. Edited by Tom Ziemke, Jordan Zlatev, and Roslyn M. Frank. Berlin: Mouton de Gruyter, 2008.
Kaltenbrunner, Martin. “Musical Building Blocks.” Tangible Musical Interfaces. <http://modin.yuri.at/tangibles/?list=2> (15 April 2011).
Lakoff, George and Mark Johnson. Metaphors We Live By. Chicago: University of Chicago Press, 1980.
Lakoff, George. “The contemporary theory of metaphor.” in Metaphor and Thought, 2nd ed. Edited by Andrew Ortony. Cambridge: Cambridge University Press, 1993.
Magnusson, Thor. “Of Epistemic Tools: musical instruments as cognitive extensions.” Organised Sound 14/2 (2009): 168–176.
Mead, Andrew. “Bodily Hearing: physiological metaphors and musical understanding.” Journal of Music Theory 43/1 (1999): 1–19.
Morris, Charles. Foundations of the theory of signs. Chicago: University of Chicago Press, 1938.
Mumma, Gordon. “Creative Aspects of Live-Performance Electronic Music Technology.” Papers of 33rd National Convention. New York: Audio Engineering Society, 1967.
Naumann, Anja, Jörn Hurtienne, Johann Habakuk Israel, Carsten Mohs, Martin Christof Kindsmüller, Herbert A. Meyer, Steffi Hußlein, and the IUUI research group. Engineering Psychology and Cognitive Ergonomics: Lecture Notes in Computer Science 4562 (2007): 128–136.
O’Neill, Shaleph. Interactive Media: The Semiotics of Embodied Interaction. London: Springer–Verlag, 2008.
Overbeek, Kees and Stephan Wensveen. “From perception to experience, from affordances to irresistibles.” Proceedings of DPPI03 (Designing Pleasurable Products and Interfaces). New York: ACM, 2003.
Reddy, Michael J. “The conduit metaphor: A case of frame conflict in our language about language.” in Metaphor and Thought, 2nd ed. Edited by Andrew Ortony. Cambridge: Cambridge University Press, 1993.
Rzewski, Fredric. “Music and Political Ideals.” in Nonsequiturs: writings and lectures on improvisation, composition and interpretation. Edited by Gisela Gronemeyer and Reinhard Oehlschlägel. Köln: MusikTexte, 2007.
Shaer, Orit and Eva Hornecker. “Tangible User Interfaces: Past, Present, and Future Directions.” Foundations and Trends in Human–Computer Interaction 3/1–2 (2009): 1–137.
Shayan, Shakila, Ozge Ozturk, and Mark A. Sicoli. “The Thickness of Pitch: Crossmodal Metaphors in Farsi, Turkish, and Zapotec.” Senses & Society 6/1 (2011): 96–105.
Stone, Ruth M. “Toward a Kpelle Conceptualization of Music Performance.” Journal of American Folklore 94/372 (1981): 188–206.
Stratton, George. “Vision without inversion of the retinal image.” Psychological Review 4/4 (1897): 341–360.
Sweetser, Eve. “Looking at space to study mental spaces: Co–speech gesture as a crucial data source in cognitive linguistics.” in Methods in Cognitive Linguistics. Edited by Monica Gonzalez–Marquez, Irene Mittelberg, and Seana Coulson. Amsterdam: John Benjamins Publishing, 2007.
Trueman, Dan. “Why a laptop orchestra?” Organised Sound 12/2 (2007): 171–179.
Wanderley, Marcelo and Nicola Orio. “Evaluation of Input Devices for Musical Expression: Borrowing Tools from HCI.” Computer Music Journal 26/3 (2002): 62–76.
Wessel, David and Matthew Wright. “Problems and Prospects for Intimate Musical Control of Computers.” Computer Music Journal 26/3 (2002): 11–22.
Whalley, Ian. “Software Agents in Music and Sound Art Research/Creative Work: current state and possible direction.” Organised Sound 14/2 (2009): 156–167.
Zibkowski, Lawrence M. “Metaphor and Music Theory: Reflections from Cognitive Science.” Music Theory Online 4/1 (1998).