problems and solutions in game audio

1

Bits, Bytes and Beats: Problems and Solutions in Video Game Audio

Karen [email protected]

2

What Goes Into Game Audio?

3

Games = Gateway to Geekdom?

4

Video Games = Evil?

5

Games = Gateway to Innovation

6

Outline

2. History of Game Audio Innovations

3. Three Fundamental Open Problems:
• Mixing
• Variation
• Adaptability

– Directions of my research

4. Summary/Conclusions

7

2. A Brief History of Game Audio Innovations

(What’s so different about game audio??)

8

A Brief Historical Outline

2.1 8-Bit (1970-1990)

2.2 16-Bit (1985-1995)

2.3 64-Bit (1995-2000)

2.4 128-Bit (2000-2005)

2.5 Mobile/Handheld

2.6 Today

9

Computer Space!

(Nutting Associates 1971)

First game to have sound.

2.1 8-BIT

10

Pong (Atari 1972) (Al Alcorn)

2.1 8-BIT

11

Space Invaders (Midway 1978)

First use of continuous “background” music.

2.1 8-BIT

12

Rally X (Namco/Midway 1980)

The birth of the loop as response to memory constraints.

2.1 8-BIT

13

Making a “beep” in assembly (Time & memory constraints)

Beep PROC USES AX BX CX
    IN   AL, 61h        ; Save speaker state
    PUSH AX

    MOV  BX, 6818       ; 1193180/175 (timer divisor for a 175 Hz tone)
    MOV  AL, 0B6h       ; Select channel 2, write LSB/MSB, mode 3
    OUT  43h, AL
    MOV  AX, BX
    OUT  42h, AL        ; Send the LSB
    MOV  AL, AH
    OUT  42h, AL        ; Send the MSB
    IN   AL, 61h        ; Get the 8255 port contents
    OR   AL, 3h
    OUT  61h, AL        ; Enable speaker and use clock channel 2 for input
    MOV  CX, 03h        ; High-order wait value
    MOV  DX, 0D04h      ; Low-order wait value
    MOV  AH, 86h        ; Wait service
    INT  15h
    POP  AX             ; Restore speaker state
    OUT  61h, AL
    RET
Beep ENDP

From Using Assembly Language by Allen L Wyatt

2.1 8-BIT

14

Technological Constraints

Up ‘N Down (Sega 1983)

2.1 8-BIT

15

Atari VCS (2600)

Up ‘N Down (Sega 1984)

2.1 8-BIT

16

Captain Comic (Color Dreams 1988)

2.1 8-BIT

17

Captain Comic’s songs:

Borrowing from classical music.

(Skill constraints)

2.1 8-BIT

18

Working With Constraints: Nintendo NES

Metroid (Nintendo 1987) (Hip Tanaka)

vibrato (pitch modulation), tremolo (volume modulation), slides, portamento, echo effects

2.1 8-BIT

19

Ballblazer (LucasArts 1984) (Peter Langston)

• Algorithmic generation
• “Riffology” method (optimized randomness) by Peter Langston
• 32 eight-note melody fragments
• Algorithm chooses how fast, how loud, when to omit notes, when to insert a rhythmic break
• Developed based on a “lazy guitarist”
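The choices listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Langston's code: the riffs are random placeholders (the real 32 fragments are not given here), the probabilities and velocities are invented, and the rule of preferring a riff that starts close to the last note played is an assumed reading of the "lazy guitarist" idea.

```python
import random

def make_riffs(rng, count=32, length=8):
    """Hypothetical stand-in for the 32 eight-note fragments (MIDI note numbers)."""
    scale = [57, 60, 62, 64, 67, 69, 72]  # A minor pentatonic, an arbitrary choice
    return [[rng.choice(scale) for _ in range(length)] for _ in range(count)]

def next_riff(riffs, last_note, rng, candidates=4):
    """Sample a few riffs and keep the one whose first note is nearest last_note."""
    picks = rng.sample(riffs, candidates)
    return min(picks, key=lambda riff: abs(riff[0] - last_note))

def play(riffs, n_riffs, seed=0):
    """Return a list of (note or None for a rest, duration in ticks, velocity)."""
    rng = random.Random(seed)
    events = []
    last = riffs[0][0]
    for _ in range(n_riffs):
        riff = next_riff(riffs, last, rng)
        ticks = rng.choice([1, 2])             # how fast
        velocity = rng.choice([64, 96, 127])   # how loud
        for note in riff:
            if rng.random() < 0.15:            # when to omit a note
                events.append((None, ticks, 0))
            else:
                events.append((note, ticks, velocity))
        if rng.random() < 0.25:                # when to insert a rhythmic break
            events.append((None, 4 * ticks, 0))
        last = riff[-1]
    return events
```

Calling `play(make_riffs(random.Random(1)), 4)` yields between 32 and 36 events: eight per riff plus an optional break after each riff.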

2.1 8-BIT

20

Working With Constraints: Commodore 64

Rob Hubbard’s Module Format

Track 1 (e.g. “Level_One”)

Pattern 1 (sequence of notes)

Instrument 1 (envelope, waveform, effect filters, etc.)

Instrument 2 (envelope, waveform, effect filters, etc.)



2.1 8-BIT

21

; track format:
; high address of pattern to execute
; low address of pattern to execute
; max index in pattern to execute (FF=max, can be terminated by instruction)
; number of times to repeat the given pattern
track1:
.byte >pat00, <pat00, $3D, $02
.byte >pat00, <pat00, $51, $00
.byte >pat01, <pat01, $0A, $04
.byte >pat02, <pat02, $16, $08
.byte >pat03, <pat03, $10, $16
.byte >pat03, <pat03, $FF, $00
.byte >pat0b, <pat0b, $FF, $00
.byte >pat0c, <pat0c, $FF, $00
.byte >pat0d, <pat0d, $FF, $00
.byte >pat0e, <pat0e, $FF, $00
.byte >pat0f, <pat0f, $FF, $00
.byte >pat10, <pat10, $FF, $00
.byte >pat11, <pat11, $FF, $00
.byte >pat12, <pat12, $FF, $00
.byte >pat13, <pat13, $FF, $00
.byte >pat14, <pat14, $FF, $00
.byte >pat15, <pat15, $FF, $00
.byte >pat16, <pat16, $FF, $00
.byte >pat17, <pat17, $FF, $00
.byte >pat18, <pat18, $FF, $00
.byte >pat26, <pat26, $FF, $00
.byte >pat37, <pat37, $FF, $00
.byte >pat27, <pat27, $FF, $00
.byte $00, $00, $00, $00

;C900
pat38:
.byte $86, $0D, $00 ; set instrument
.byte $23, $D5
.byte $9B, $05, $23 ; play note
.byte $86, $0E, $00 ; set instrument
.byte $9B, $05, $5A ; play note
.byte $D6
.byte $9B, $05, $5A ; play note
.byte $86, $0D, $00 ; set instrument
.byte $9B, $05, $23
.byte $D5
.byte $9B, $05, $23 ; play note
.byte $86, $0E, $00 ; set instrument
.byte $9B, $05, $5A ; play note
.byte $D6
.byte $9B, $05, $5A ; play note
.byte $86, $0F, $00 ; set instrument
.byte $00, $D0
.byte $9B, $03 ; restore state
.byte $00
.byte $00, $00
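To make the four-byte track entry concrete, here is a small Python sketch that decodes such a table once the assembler has resolved the pattern addresses. The names `TrackEntry` and `decode_track` are hypothetical, and the reading of the all-zero entry as a terminator is an assumption based on the last row of the listing; how the player interprets a repeat count of zero is not stated, so the value is kept as stored.

```python
from typing import NamedTuple

class TrackEntry(NamedTuple):
    pattern_addr: int  # 16-bit address built from the high and low bytes
    max_index: int     # 0xFF means the pattern runs until it ends itself
    repeats: int       # repeat count exactly as stored in the table

def decode_track(data: bytes) -> list[TrackEntry]:
    """Decode 4-byte entries until an all-zero entry (assumed terminator)."""
    entries = []
    for i in range(0, len(data) - 3, 4):
        hi, lo, max_index, repeats = data[i:i + 4]
        if (hi, lo, max_index, repeats) == (0, 0, 0, 0):
            break
        entries.append(TrackEntry((hi << 8) | lo, max_index, repeats))
    return entries
```

For example, the bytes `C9 00 3D 02` decode to a pattern at address `$C900`, played up to index `$3D`, with a stored repeat count of 2.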

2.1 8-BIT

22

Shadow of the Beast 2 (Psygnosis 1989)(David Whittaker)

2.2 16-BIT

MOD/Tracker on Amiga

23

Combining modules (in MIDI) with control statements

MIDI and the Creation of iMUSE

2.2 16-BIT

Land, Michael Z. and Peter N. McConnell. Method and Apparatus for Dynamically Composing Music and Sound Effects Using a Computer Entertainment System. US Patent No. 5,315,057. 24 May 1994.

24

Super Mario World (Nintendo 1991) (Koji Kondo)

2.2 16-BIT

Musical layering technique: Mario jumps on Yoshi and gets an extra layer of music (SNES).

25

Legend of Zelda: Ocarina of Time (Nintendo 1999) (Koji Kondo) (N64)

Proximity-based algorithms control cross-fades
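As a rough illustration of a proximity-controlled cross-fade, the Python sketch below maps the player's distance from a location to gains for two music cues. The game's actual algorithm is not described here; the equal-power curve, the function name `crossfade_gains`, and the `near`/`far` thresholds are all assumptions.

```python
import math

def crossfade_gains(distance, near, far):
    """Return (gain_a, gain_b) for an equal-power cross-fade.

    At distance >= far only cue A (the surrounding area) is heard;
    at distance <= near only cue B (the approached location) is heard.
    """
    t = (distance - near) / (far - near)
    t = min(1.0, max(0.0, t))
    return math.sin(t * math.pi / 2), math.cos(t * math.pi / 2)
```

Because the two gains are a sine and cosine of the same angle, their squares always sum to one, so the perceived loudness stays roughly constant through the fade.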

2.3 64-BIT

26

The Sims (Maxis 2000) • Player-input/selectable music

2.4 128-BIT

27

Music driving gameplay elements.

New Super Mario Bros (Nintendo DS 2006)(Koji Kondo)

2.5 Mobile

28

State-of-the-Art Today

• 7.1 to 8.1 surround sound
• Combination of synth with orchestra, choir
• At least 512 channels of sound

• God of War (Gerard Marino, Sony 2006)
• Bioshock (Garry Schyman, 2K 2007)

2.6 TODAY

29

3. Three Fundamental Open Problems

30

Fundamental Problems

3.1 Mixing

3.2 Repetition. Repetition. Repetition (variability!)

3.3 Adaptability

Pathology: turning off sound/music, cognitive dissonance (failure of music to respond)

→ reduces immersiveness

31

3.1 Mixing

Who needs mixing?

Chicken Shift (Bally 1984)

32

Problem: Mixing: Unpredictability, Variability

3.1 Mixing

33

Problem: Mixing: current state of dynamic range

… in a popular film

… in a popular game

Graphics adapted from those supplied by Rob Bridgett of Swordfish Studios.

3.1 Mixing

34

Solutions: Real-time Weighted Mixing

Weighted permutations – Predict which sounds can recur without the repetition becoming obvious.

• Example:
– Dialogue, Sound FX A, Sound FX B, player sounds, music, ambience
– If dialogue = “run!”, set parameter to 1
– If gunshot is coming towards us, set parameter to 2
– If no action, fade out music and raise ambience

REQUIREMENT: “intelligent” Engine to predict and set weighting
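A minimal Python sketch of rule-based weighting, assuming a fixed set of buses and hand-picked weights. The bus names, event names and numbers are invented for illustration; a real engine would predict and set them, as the requirement above says.

```python
BUSES = ["dialogue", "sfx_a", "sfx_b", "player", "music", "ambience"]

def bus_weights(events):
    """Map a set of game events to per-bus weights that sum to 1."""
    w = {bus: 1.0 for bus in BUSES}
    if "dialogue_run" in events:       # an important line: dialogue dominates
        w["dialogue"] = 4.0
        w["music"] = 0.5
    if "incoming_gunshot" in events:   # a threat: effects bus A dominates
        w["sfx_a"] = 4.0
        w["ambience"] = 0.5
    if not events:                     # no action: fade out music, raise ambience
        w["music"] = 0.0
        w["ambience"] = 2.0
    total = sum(w.values())
    return {bus: value / total for bus, value in w.items()}
```

With no events the music weight drops to zero and ambience gets twice the share of any other bus; with `{"dialogue_run"}` the dialogue bus has the largest weight.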

3.1 Mixing

35

Solution: Location-Based Run-Time Mixing

• Real-time DSP to adjust sound
• E.g. bottle drop on hard floor of kitchen or in next carpeted room
• Factor in 5.1 surround to adjust real-time panning

• REQUIREMENT: audio engine to pass parameters from game and from player back and forth to engine.
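The bottle-drop example might be parameterized along these lines. This Python sketch is simplified to stereo rather than 5.1, and the surface table, rolloff rule and function name `impact_params` are assumptions made for illustration only.

```python
import math

# Hypothetical surface table: (gain, low-pass cutoff in Hz). Values are illustrative.
SURFACES = {"tile": (1.0, 16000.0), "carpet": (0.4, 3000.0)}

def impact_params(surface, source_xy, listener_xy):
    """Return left/right gains and a filter cutoff for one impact sound."""
    gain, cutoff = SURFACES[surface]
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    gain /= max(1.0, math.hypot(dx, dy))     # simple inverse-distance rolloff
    azimuth = math.atan2(dx, dy)             # 0 = straight ahead (+y), positive = right
    pan = max(-1.0, min(1.0, azimuth / (math.pi / 2)))
    angle = (pan + 1.0) * math.pi / 4        # constant-power pan law
    return {
        "left": gain * math.cos(angle),
        "right": gain * math.sin(angle),
        "cutoff_hz": cutoff,
    }
```

The game passes the surface and positions in; the engine returns the mix parameters. The same sample then sounds bright and loud on tile, and duller and quieter on carpet.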

3.1 Mixing

36

3.2 Variability

• Problem: Users get bored with hearing same sounds BUT sound designers can’t possibly record enough variations of sounds (time, budget)

• Problem: Users need a new experience every time they play the game (promised by LucasArts’ Euphoria technology)

• Problem: audio not responding to physics

37

Solution: Granular Synthesis

3.2 Variability

38

“Granular” Synthesis

• “acoustical quanta” (Dennis Gabor 1947, “Acoustical Quanta and the Theory of Hearing”, Nature 159 (4044): 591-594)

• “sonic quanta” (Abraham Moles 1968, “Information Theory and Esthetic Perception”, Urbana: University of Illinois Press)

• “particle audio” (Parker and Behm 2007, “Generating Audio Textures by Example”, Journal of Game Development)

3.2 Variability

39

Granular synthesis: Graphic Equivalent

3.2 Variability

Input Sample

Synthesized Result

"Texture Synthesis from Multiple Sources", by Li-Yi Wei. In SIGGRAPH 2003 Applications and Sketches.

40

Making a Sound Granular

3.2 Variability

Parker and Behm 2007, “Generating Audio Textures by Example”, Journal of Game Development
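For readers who want to see the basic idea in code, here is a minimal granular resynthesis sketch in Python. It is not the Parker and Behm method: it simply picks random grains from a source buffer, applies a Hann window, and overlap-adds them at half-grain spacing. The function name and parameters are assumptions.

```python
import math
import random

def granulate(samples, grain_len, n_grains, seed=0):
    """Rebuild a texture from randomly chosen, Hann-windowed grains.

    Grains are overlap-added with a hop of half the grain length.
    """
    rng = random.Random(seed)
    hop = grain_len // 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / grain_len)
              for i in range(grain_len)]
    out = [0.0] * (hop * (n_grains - 1) + grain_len)
    for g in range(n_grains):
        start = rng.randrange(0, len(samples) - grain_len + 1)
        for i in range(grain_len):
            out[g * hop + i] += samples[start + i] * window[i]
    return out
```

Each seed gives a different ordering of grains, so one short recording can produce many non-identical variations. With an even grain length the overlapping windows sum to one, which keeps the overall level steady.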

41

Granular Synthesis Examples

1. Crowd

2. Tennis

3. Speech

3.2 Variability

Crowd and speech examples borrowed from Leonard Paul at Vancouver Film School

42

Granular: Remaining Open Questions

• What elements in a sound effect can be varied while still maintaining the “meaning” of the sound?

• How can we create AI systems that are aware of these potential meanings, and make real-time adjustments to sounds in a game?

• How to develop an “audio physics engine”: e.g. footsteps change based on how much player is carrying, etc.

3.2 Variability

43

3.3 Adaptability

Problem: Games are non-linear, unpredictable and very long!

A to B: 16 units

A to all: 376 units

30 rooms: 11280 units

10 levels: 100K+ units

Transitional Units
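The scaling in these figures can be checked with a few lines of Python. The 376 units per room is taken from the slide as given; reading the totals as 30 rooms per level and 10 such levels is an assumption.

```python
def transitional_units(units_per_room, rooms, levels):
    """Return (units for one level, units for the whole game)."""
    per_level = units_per_room * rooms
    return per_level, per_level * levels

per_level, whole_game = transitional_units(376, 30, 10)
print(per_level, whole_game)  # prints 11280 112800
```

Under that reading, the "100K+" figure is 112,800 hand-written transitions, which is the motivation for generating them algorithmically.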

44

Solution: Game Audio Algorithms

By varying existing individual parameters, we can create algorithms to:
– Write transitions
– Vary compositions
– Create new compositions
– Allow user-generated content

3.3 Adaptability

45

Variable Musical Parameters

1. Variable tempo

2. Variable pitch

3. Variable rhythm/metre

4. Variable volume/dynamics

5. Variable DSP/timbres

6. Variable harmony (chordal arrangements, key or mode)

7. Variable mixing: from the speaker placement of certain sounds to run-time adjustments of orchestration mix

3.3 Adaptability

46

8. Variable form (open form): random structure

9. Variable form (branching parameter-based music)

10. Variable melodies: algorithmic generation

…in what follows we will focus on these last three

3.3 Adaptability

47

#8 Variable (Open) Form

• Random structure
• Songs are segmented into components whose order can be changed
• Used in “Hyrule Field” of Legend of Zelda: Ocarina of Time: the player spends a lot of time there, and the same sequence in the same order would get monotonous
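A bare-bones version of open-form sequencing in Python, assuming the only rule is that a segment never follows itself. The segment names are placeholders, and this is not a description of how the game itself orders its segments.

```python
import random

def open_form_order(segments, length, seed=None):
    """Return `length` segment names, never repeating one back to back."""
    rng = random.Random(seed)
    order = []
    for _ in range(length):
        choices = [s for s in segments if not order or s != order[-1]]
        order.append(rng.choice(choices))
    return order
```

For instance, `open_form_order(["A", "B", "C", "D"], 8)` returns eight segment names in a fresh order on each call. The function needs at least two segments to work.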

3.3 Adaptability

48

#8 Example: Variable (Open) Form

3.3 Adaptability

http://www.home.cs.utwente.nl/~zsofi/mozart/

Variations: 11^14 x 2^2 = 1 518 999 334 332 964
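The count quoted above factors as 11 to the 14th power times 2 squared, and the product is easy to verify in Python. The per-bar breakdown used here (14 bars with 11 alternatives each and 2 bars with 2 alternatives each) is an assumed reading of that factorization, and `pick_bars` is a hypothetical helper showing how one variation would be drawn.

```python
import random
from math import prod

# Assumed reading: 14 bars with 11 alternatives, 2 bars with 2 alternatives.
options_per_bar = [11] * 14 + [2] * 2
variations = prod(options_per_bar)

def pick_bars(options, seed=None):
    """Choose one alternative index per bar, as a dice-game player would."""
    rng = random.Random(seed)
    return [rng.randrange(n) for n in options]

print(variations)  # prints 1518999334332964
```

Each call to `pick_bars(options_per_bar)` selects one of those roughly 1.5 quadrillion orderings.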

49

#8 Variable Form: Non-linear Sequencing

• Musical control structures (repetitions, jumps, procedure calls) and grammars modelled on existing characteristics

• Music is to some extent already hierarchical (notes > phrases > sections > movements > pieces). How do we teach/learn to compose in this manner?

“Grammars as Representations for Music” C. Roads; Paul Wieneke, Computer Music Journal, Vol. 3, No. 1. (Mar., 1979), pp. 48-55.

• How can we create sequencing software to better prepare composers to write this type of music?

3.3 Adaptability

50

#9 Parameter Based Music: Parameters

• Number/action of non-playing characters
• Number/action of playing characters
• Actions
• Locations (place, time of day, etc.)
• Scripted or unscripted events
• Player health or enemy health
• Difficulty
• Timing
• Player properties (skills, endurance)
• Bonus objects
• Movement (speed, direction, rhythm)
• “Camera” angle

The transition matrix approach and the creation of transitional units

3.3 Adaptability

51

#9 Example: Parameter-Based Music

No One Lives Forever (Guy Whitmore 2000)

Six standard music states are based on number of NPC enemies:

1. Silence
2. Super ambient
3. Ambient
4. Suspense/sneak
5. Action/combat 1
6. Action/combat 2

3.3 Adaptability

52

#9 Example: No One Lives Forever

Earth Orbit: Ambush theme starts in music state 5 (combat 1), transitions to music state 2 (ambient: in elevator), then transitions to music state 6 (combat 2)
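A transition matrix for six music states can be sketched as a lookup keyed by (from, to) pairs. In this Python sketch the state names follow the six-state list for the game, while the cue names are hypothetical placeholders rather than the game's assets.

```python
STATES = {
    1: "silence", 2: "super ambient", 3: "ambient",
    4: "suspense/sneak", 5: "action/combat 1", 6: "action/combat 2",
}

# One transitional cue per ordered pair of distinct states (placeholder names).
TRANSITIONS = {(a, b): f"trans_{a}_to_{b}"
               for a in STATES for b in STATES if a != b}

def cue_sequence(path):
    """Interleave state cues with the transitional cue between each pair."""
    cues = [STATES[path[0]]]
    for a, b in zip(path, path[1:]):
        if a != b:
            cues.append(TRANSITIONS[(a, b)])
            cues.append(STATES[b])
    return cues
```

Six states need 6 x 5 = 30 transitional cues. The Earth Orbit sequence 5, 2, 6 would play state 5, the 5-to-2 transition, state 2, the 2-to-6 transition, then state 6.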

3.3 Adaptability

53

#10. Algorithmic Variations (ongoing research focus)

Problems:
• How do we create emotionally effective algorithmic adaptive audio?
• What aspects of audio carry meaning?
– How do these work individually and together?
– What universals (within the Western world) are there that can be codified?
• How generalized/simplified can/do the rules/grammar need to be?

3.3 Adaptability

54

Semiotics

• Sound/music as a symbolic language
• What (and how) does music/sound communicate?
• How can we study and break these down into a grammar to generate algorithms?
– What combinations are effective?
– What variations/substitutions can be made with and without changing meanings?

(For more info, see the work of Philip Tagg, Eero Tarasti, Jean-Jacques Nattiez, and Raymond Monelle; especially Philip Tagg’s “Ten Little Title Tunes”, Mass Media Music Scholars’ Press 2002)

3.3 Adaptability

55

Semiotics of sound: Why is it important? An example…

3.3 Adaptability

56

Revised…

3.3 Adaptability

57

Defining a Sound Semiotics Grammar

Problem: Can we codify a semiotic grammar of sound? How? How do we gather enough data?

One solution: distributed classification, or crowd sourcing

3.3 Adaptability

58

Distributed Classification Examples

3.3 Adaptability

59

What Does the User Get?

• Contribution to knowledge
• Feeling of being part of community
• Believe it or not -- fun!

• See Luis von Ahn, “Games with a Purpose”, IEEE Computer Magazine, or “Why do tagging systems work?”, Conference on Human Factors in Computing Systems (CHI 06)

3.3 Adaptability

60

ESP Game

3.3 Adaptability

Input: (image shown to both players)

Player 1            Player 2
GUESSING: KID       GUESSING: BOY
GUESSING: CAR
GUESSING: HAT       GUESSING: CAR

SUCCESS! Consensus on: CAR
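The consensus rule in the ESP Game example can be written as a short Python function: a label is accepted as soon as both players have typed it. This is a sketch of the matching logic only, with an invented function name and input format, and it is not taken from the actual game's implementation.

```python
def first_consensus(turns):
    """turns: iterable of (player, guess) with player 1 or 2.

    Return the first label that both players have guessed, or None.
    """
    seen = {1: set(), 2: set()}
    for player, guess in turns:
        label = guess.strip().upper()
        seen[player].add(label)
        other = 2 if player == 1 else 1
        if label in seen[other]:
            return label
    return None
```

With the guesses from the example, `first_consensus([(1, "kid"), (2, "boy"), (1, "car"), (1, "hat"), (2, "car")])` returns `"CAR"`. The same rule applies unchanged when the input is a sound clip instead of an image.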

61

Games for Audio Tagging: Interactively Building an Online Database

Three games under development:

1. Game like ESP game but for audio (PHP and Flash front end with MySQL backend)

2. Audio-visual game in which users select an image to match the audio

3. Audio-visual based game where users select appropriate audio content for visual image

3.3 Adaptability

62

Adapting the Algorithms For MIR

• MIR = Music Information Retrieval
• Retrieval based on bpm, harmonic content, melodic intervals, timbre, etc.
• How can we use MIR techniques to make better game audio?
– User-generated playlists + new algorithms = appropriate and new user-generated audio content

3.3 Adaptability

63

4. Summary/Conclusions

64

Unified Architecture

Routing, allocation and scheduling (includes system clocks)

Input: Game Data Parameters

Detection (beat tracking, phrase matching, pitch matching, harmony and key matching)

Prediction (neural nets, fuzzy logic)

Wave banks

Audio Data (MIDI)

Algorithmic composition/modelling

Samplers, synths, tone generators

Intelligent mixing engine

AI Audio Engine

65

Why CS needs Arts (and vice versa)

66

Thank-you to…

Further information: [email protected]

www.GamesSound.com (my web site)
www.algorithmic.net
www.granularsynthesis.com
www.audiokinetic.com
www.iasig.org
