Precomputed Wave Simulation for Real-Time Sound Propagation of Dynamic Sources in Complex Scenes
Nikunj Raghuvanshi†‡, John Snyder†, Ravish Mehra‡, Ming C. Lin‡, and Naga K. Govindaraju†
†Microsoft Research ‡University of North Carolina at Chapel Hill
Sound Propagation
• Essential for immersion
• Physically complex, perceivable effects
[Animation: pressure field; legend: positive pressure / negative pressure]
State-of-the-art: Games
Hand-tuned filters: difficult & tedious
Simulated propagation → automatic, detailed, scene-dependent acoustics
[Examples: direct path muffled, reflections muffled; direct path muffled, reflections clear; direct path clear, reflections muffled]
Images: Dmitry Gait © 2003
Previous Work
• Beam-tracing [Funkhouser et al. 2004]
• Frustum-tracing [Chandak et al. 2008]
• Image-source method [Tsingos, 2009]
• Geometric methods: diffraction/scattering and high-order reflections remain challenging
Our Approach
• Wave simulation
  – diffraction: attenuation behind obstructions, low-pass filtering
  – scattering from complex surfaces
  – dense reverberation
• Precompute & render
  – similar to [James et al. 2006]
  – but propagation, not radiation
Comparison with Half Life 2™
• “Train Station” scene
• Our engine vs. game’s default sound engine
Precompute & Render
[Diagram: precompute — an impulse response (pressure vs. time) capturing direct and reflected sound for each source/listener pair; render — convolve the source signal with the impulse response at runtime]
Moving Source & Listener
[Animation: the impulse response (pressure vs. time) changes as source and listener move]
Main Challenge
• Impulse responses vary spatially
• 7D sampling space
  – source (3D) x listener (3D) x time (1D)
• Can reduce to 6D
  – restrict listener to 2.5D surface
• Brute-force sampling still infeasible
Brute-force sampling
• Scene: “Citadel” (28m x 60m x 32m)
• Simulation grid (12 cm)
  – 200,000 gigabytes
• Sub-sampled (~1m)
  – 60 gigabytes
  – interpolation non-trivial
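The gap between the fine and sub-sampled grids can be sketched with a back-of-the-envelope estimator. This is illustrative only: the per-response byte count is an assumed parameter, and the slide's exact totals depend on the authors' actual IR sizes and sampling choices, so only the scaling ratio is meaningful here.

```python
# Back-of-the-envelope storage estimate for brute-force IR sampling.
# Grid spacings come from the slide; bytes_per_response is an assumed
# parameter, so this illustrates the scaling, not the slide's exact totals.

def ir_storage_bytes(dims_m, spacing_m, bytes_per_response):
    """Storage for one impulse response per source/listener pair, with
    sources on a 3D grid and listeners restricted to a 2D slice (the
    slide's 2.5D-surface reduction, simplified to one horizontal plane)."""
    nx, ny, nz = (int(d / spacing_m) for d in dims_m)
    n_sources = nx * ny * nz        # 3D source grid
    n_listeners = nx * nz           # listeners on a 2D slice
    return n_sources * n_listeners * bytes_per_response

citadel = (28.0, 60.0, 32.0)
fine = ir_storage_bytes(citadel, 0.12, 8_000)   # 12 cm simulation grid
coarse = ir_storage_bytes(citadel, 1.0, 8_000)  # ~1 m sub-sampled grid
print(f"fine/coarse ratio: {fine / coarse:.0f}x")
```

The ratio is independent of the assumed per-response size, which is why sub-sampling (plus a good interpolation scheme) is the only practical option.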
Main Contributions
• Compact representation for impulse responses
  – 100x reduction in memory usage
  – works with (band-limited) wave simulations
  – plausible high-frequency extrapolation
• Spatial interpolation allowing coarse (~1m) grid
• Real-time wave effects
  – automatic correspondence to scene geometry
Our approach: Overview
[Diagram: source and listener locations sampled on a coarse (~1–2 m) grid; wave simulation yields the response with the listener at the source location — encode this compactly]
• Simulator: ARD [Raghuvanshi et al. 2009]
• Fast and memory-efficient
• Band-limited (~1 kHz)
  – time/memory constraints during precomputation
[Simulation snapshot: Gaussian-pulse source; positive/negative pressure field]
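The talk's simulator is ARD; as a minimal stand-in, a textbook 1D finite-difference wave solver shows the basic idea of recording a response at a listener from a Gaussian probe pulse. Grid size, pulse width, source/listener positions, and step count below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal 1D FDTD wave solver recording a response at a listener cell.
# Textbook stand-in for illustration only; the talk uses the much faster,
# band-limited ARD solver in 3D.

def record_response(n=400, steps=300, c=343.0, dx=0.12, src=50, lst=300):
    dt = dx / c                                      # 1D stability (CFL) limit
    x = np.arange(n, dtype=float)
    p_prev = np.exp(-0.5 * ((x - src) / 2.0) ** 2)   # Gaussian probe pulse
    # first step for a zero-velocity start (exact at the CFL limit)
    p = np.zeros(n)
    p[1:-1] = 0.5 * (p_prev[2:] + p_prev[:-2])
    response = np.empty(steps)
    for t in range(steps):
        p_next = np.zeros(n)
        # second-order update; ends held at 0 (pressure-release walls)
        p_next[1:-1] = p[2:] + p[:-2] - p_prev[1:-1]
        p_prev, p = p, p_next
        response[t] = p[lst]                         # record at the listener
    return response, dt
```

Running this, the pulse's arrival shows up as a clean peak in the recorded response at the expected delay (distance / speed of sound).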
Psycho-acoustics
[Figure: impulse response (pressure vs. time), annotated with early reflections and late reverberation]
• Direct sound: sense of direction
• Early reflections: loudness, timbre; spatial variation; 50–100 ms
• Late reverberation: decay envelope; no spatial variation; few seconds
Reference: “Room Acoustics” by Heinrich Kuttruff
Demo
• Perceptual effect
  – only direct sound
  – direct + early reflections
  – direct + early reflections + late reverberation
Technique: Wave Simulation
[Diagram: probe source (Gaussian) emits a pulse; record the response (pressure vs. time) at the listener]
Technique: Peak Detection
• search for local extrema
• extract peak delays and amplitudes
• wide-band information
[Figure: impulse response split into Early Reflections (ER) and Late Reverberation (LR)]
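The local-extrema search above can be sketched in a few lines. The threshold parameter is an assumption (some noise floor is needed in practice); real IRs also need care with closely spaced peaks.

```python
import numpy as np

def detect_peaks(ir, dt, threshold=0.01):
    """Sketch of the peak-detection step: a sample is a peak if its
    magnitude exceeds both neighbours and a noise threshold (assumed).
    Returns (peak delays in seconds, signed peak amplitudes)."""
    mag = np.abs(ir)
    is_peak = (mag[1:-1] >= mag[:-2]) & (mag[1:-1] > mag[2:]) & (mag[1:-1] > threshold)
    idx = np.where(is_peak)[0] + 1      # +1: compensate for the slice offset
    return idx * dt, ir[idx]
```

Because only delays and amplitudes are kept, the peak list is wide-band even though the simulation itself is band-limited.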
Technique: ER/LR Decomposition
• separation based on echo density (500 peaks/sec)
• single LR filter per room (store one per room)
• 10x reduction in memory & precomputation time
• ER varies spatially
[Figure: time-domain split into Early Reflections (ER) and Late Reverberation (LR)]
Technique: Frequency Trend Extraction
• peak data does not capture the downward trend (diffraction)
• FFT the band-limited response and the peak sequence; divide to get the frequency trend
• extrapolate the trend beyond the simulated band
• store [peak times and amplitudes, frequency trend]
[Figure: ER/LR time-domain split; frequency-domain FFT, divide, and extrapolated trend]
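The FFT-and-divide step can be sketched as follows: the magnitude spectrum of the simulated (band-limited) response is divided by that of its peak train, and a coarse band average of the quotient is kept as the trend. The band count and the magnitude-only treatment are assumptions of this sketch.

```python
import numpy as np

def frequency_trend(ir, peak_idx, peak_amp, n_bands=8):
    """Sketch: divide the magnitude spectrum of the simulated IR by the
    spectrum of its detected peak train. The smoothed quotient captures
    what the peaks alone miss, e.g. the low-pass character of diffraction.
    n_bands (coarse band averaging) is an assumed parameter."""
    peaks = np.zeros_like(ir, dtype=float)
    peaks[np.asarray(peak_idx)] = peak_amp
    h_ir = np.abs(np.fft.rfft(ir))
    h_pk = np.abs(np.fft.rfft(peaks)) + 1e-12    # avoid division by zero
    trend = h_ir / h_pk
    return np.array([b.mean() for b in np.array_split(trend, n_bands)])
```

For an IR that really is just its peak train the trend is flat (≈1); any diffraction low-passing shows up as a downward slope across the bands, which is then extrapolated past the simulated band.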
Technique: ER Representation
[Figure: ER stored as peak times/amplitudes plus frequency trend; LR stored as one filter per room — this representation enables impulse response interpolation]
Runtime: Spatial Interpolation
• listener moves between sampled points P1 and P2: which response to use?
• linear interpolation of the waveforms causes peak aliasing
• ours: interpolate the stored peak data, so no aliasing
[Figure: pressure-vs.-time responses at P1 and P2; a linear crossfade shows peak aliasing, our interpolation does not]
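The contrast with a waveform crossfade can be sketched as follows: instead of blending pressure samples (which superimposes misaligned peaks — the "peak aliasing" above), each matched peak's delay and amplitude is blended. The one-to-one peak matching between P1 and P2 is a simplifying assumption of this sketch.

```python
import numpy as np

def interpolate_peaks(delays1, amps1, delays2, amps2, w):
    """Blend the peak lists of the responses at P1 and P2 for a listener
    at fractional position w in [0, 1]. Sketch only: assumes peaks are
    already matched one-to-one between the two sample points."""
    d = (1 - w) * np.asarray(delays1) + w * np.asarray(delays2)
    a = (1 - w) * np.asarray(amps1) + w * np.asarray(amps2)
    return d, a
```

Interpolating in peak space moves each arrival smoothly in time, which is why a coarse (~1 m) grid suffices.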
Runtime: Render
• Fast frequency-domain convolutions
  – bottleneck: FFT
  – performed using Intel MKL (single-threaded)
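Frequency-domain convolution of streamed audio is standardly done with overlap-add; a minimal block-at-a-time sketch is below (real engines additionally partition the IR for low latency, and the talk uses MKL rather than NumPy for the FFTs).

```python
import numpy as np

def fft_convolve_block(block, ir, tail):
    """One step of overlap-add frequency-domain convolution (sketch).
    `tail` carries the overlap from the previous block; the FFTs are the
    runtime bottleneck noted in the talk.
    Returns (output samples to play now, new tail to carry over)."""
    n = len(block) + len(ir) - 1
    nfft = 1 << (n - 1).bit_length()               # next power of two
    out = np.fft.irfft(np.fft.rfft(block, nfft) * np.fft.rfft(ir, nfft), nfft)[:n]
    out[: len(tail)] += tail                       # add previous overlap
    return out[: len(block)], out[len(block):]     # play now, carry rest
```

Feeding the input block by block and concatenating the outputs (plus the final tail) reproduces the full linear convolution.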
Results: Citadel Walkthrough
• Game environment from Half Life 2™
• Size: 60m x 28m x 22m
Results: Walkway
• Size: 19m x 19m x 8m
• Realistic attenuation behind walls
Results: Walkway
• Sound focusing
  – loudness increases beneath concave reflector
Results: Living Room
• Empty vs. furnished living room
• Scattering off furnishings changes acoustics
Integration with Half Life 2™
• Train Station scene: 36m x 83m x 32m
Cost
• Run-time: ~5% of a core per source
• Memory: few hundred MBs
• Precompute: few hours per scene
• Machine: 2.8GHz Quad-core Intel Xeon, 2.5GB RAM
Conclusion
• First real-time wave-based propagation engine
  – diffraction, occlusion/obstruction, late reverberation, focusing, complex 3D scenes
• Moving sources and listener
• Compact IR encoding: 100x compression
• Fast, high-quality IR interpolation on coarse (~1m) grid
Future Work
• Need further 10-100x reduction in memory
  – 4x: 32-bit peak/frequency-trend data → 8-bit on decibel scale
  – temporal IR compression: loudness, comb-filtering, etc.
  – spatial compression + adaptive sampling
• Scalability: cells and portals for sound
  – efficient boundary integral for portal transfers
• Limited dynamic geometry
  – precomputed frequency-dependent occlusion factors
• Runtime: Use GPU for convolution (FFT and IFFT)
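The first future-work bullet (32-bit → 8-bit on a decibel scale for a 4x saving) can be sketched as a simple quantizer. The dB floor and the magnitude-only handling are assumptions of this sketch; a real encoder would store signs separately.

```python
import numpy as np

def to_db8(x, db_floor=-96.0):
    """Quantize magnitudes to 8 bits on a decibel scale: the proposed
    4x reduction (32-bit float -> 8-bit dB index). Sketch: magnitudes
    only; db_floor is an assumed dynamic-range limit."""
    db = 20 * np.log10(np.maximum(np.abs(x), 1e-12))
    db = np.clip(db, db_floor, 0.0)
    return np.round((db - db_floor) / -db_floor * 255).astype(np.uint8)

def from_db8(q, db_floor=-96.0):
    """Decode an 8-bit dB index back to a linear magnitude."""
    db = q.astype(np.float64) / 255 * -db_floor + db_floor
    return 10 ** (db / 20)
```

With a 96 dB range the quantization step is under 0.4 dB, i.e. a round-trip relative error of a couple of percent, far below perceptual loudness thresholds.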
Thank you!!
• Go to: http://research.microsoft.com/en-us/um/people/nikunjr/siggraph2010/FastWave.avi
• Listen to the video over headphones
Runtime Processing: Rendering
[Diagram, per ear: input signal → short FFT; multiply element-wise with the pre-baked ER, LR, and frequency-trend spectra; sum element-wise over all sources (Source 2 … Source n); IFFT → output]
Overview
• State of the art
• Our approach
• Results
• Conclusion
State-of-the-art: Games
Image © Dmitry Gait 2003
Occlusion
Obstruction
Exclusion
Hand-tuned filters: tedious
Simulated propagation → automatic, detailed scene-dependent acoustics