XAudio2 High Performance Considerations US
Overview
XAudio2 overview
Voice & Graph optimization
xAPO optimization
Voice reuse
Compression
Streaming
Debugging / Performance analysis
What Is XAudio2
Low-level cross-platform game audio API
Play hundreds of sounds at once
Loop, start, stop, adjust sounds at any time
Volume, pitch, filter, reverb, DSP
Identical code on both platforms
Building block for higher-level sound design tools such as the XACT3 engine
Replaced XAudio1
Replaced DirectSound for gaming purposes
Features
Flexible channel routing
Any channel can be sent to any other channel with attenuation/amplification
Multistage submixing
For example, each car can have a submix (exhaust, transmission, engine, etc.), and each car's mix can then be fed into another submix for environmental effects
Advanced Features
Deferred commands
Most operations (Start, SetParameter, SetOutputVoice, SetEffectChain) can be grouped and applied as atomic, sample-accurate operations
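The grouping behaviour can be illustrated with a toy model. This is a minimal sketch, not XAudio2 code: ToyEngine, ToyVoice, and kCommitNow are hypothetical names, with kCommitNow playing the role of XAUDIO2_COMMIT_NOW and CommitChanges mirroring the idea behind IXAudio2::CommitChanges. Commands tagged with a non-zero operation set are queued and only take effect together when that set is committed.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Toy stand-in for a voice; not an XAudio2 type.
struct ToyVoice {
    float volume = 1.0f;
    bool  playing = false;
};

// Toy model of deferred commands (hypothetical, for illustration only).
class ToyEngine {
public:
    static constexpr std::uint32_t kCommitNow = 0;  // "apply immediately"

    void SetVolume(ToyVoice& v, float vol, std::uint32_t opSet = kCommitNow) {
        Queue(opSet, [&v, vol] { v.volume = vol; });
    }
    void Start(ToyVoice& v, std::uint32_t opSet = kCommitNow) {
        Queue(opSet, [&v] { v.playing = true; });
    }
    // Applies every command queued under opSet, in submission order.
    void CommitChanges(std::uint32_t opSet) {
        auto it = pending_.find(opSet);
        if (it == pending_.end()) return;
        for (auto& cmd : it->second) cmd();
        pending_.erase(it);
    }

private:
    void Queue(std::uint32_t opSet, std::function<void()> cmd) {
        if (opSet == kCommitNow) { cmd(); return; }   // immediate path
        pending_[opSet].push_back(std::move(cmd));    // deferred path
    }
    std::map<std::uint32_t, std::vector<std::function<void()>>> pending_;
};
```

In this model, queuing SetVolume and Start for several voices under the same set and then committing once changes all of them in a single step, which is the property the slide describes as atomic.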
xAPOs (DSPs)
In-box APOs (Reverb, notch, etc.)
Create custom equalizers, compressors, limiters, monitors, phase shifters, attenuators, delays, ...
And they can be cross-platform, like the in-box APOs.
XAudio2 makes use of XMA hardware-accelerated decode and VMX instructions on Xbox 360
Audio Flow
[Diagram labels: Source Voices (32k, Mono XMA2); Pitch/SRC + filter; Effect1 ... EffectN; 32k (5.1); Effect1 ... EffectN; Sample Rate Conv.; 48k (5.1)]
Graph Optimization
Apply FX to many voices at once for the price of one
Make use of lower-rate sub-graphs
SUBMIX!
Lower rate == fewer samples == less CPU
Run expensive global send FX at a lower sample rate and channel count than the final mix
Provides for more detailed control of performance characteristics
Allows for smooth crossfades between disparate FX
e.g. Environmental reverb crossfade
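The "lower rate == fewer samples" point can be made concrete with a little arithmetic. The sketch below assumes a 10 ms processing quantum (100 passes per second); SamplesPerQuantum and kQuantaPerSecond are hypothetical names for this illustration.

```cpp
#include <cstdint>

// Assumption for this sketch: the engine processes audio in 10 ms quanta.
constexpr std::uint32_t kQuantaPerSecond = 100;

// Samples an effect must touch per processing pass:
// frames per quantum multiplied by the channel count.
constexpr std::uint32_t SamplesPerQuantum(std::uint32_t sampleRate,
                                          std::uint32_t channels) {
    return (sampleRate / kQuantaPerSecond) * channels;
}
```

Under that assumption, an effect on a 48k 5.1 mix touches 2880 samples per pass, the same effect on a 32k 5.1 submix touches 1920, and on a 32k mono submix only 320.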
Source Voices
Stopped voices are not touched by the real-time processing thread
Voice Pooling
Much faster than repeated allocation/free
SetFrequencyRatio may be applied to reuse voices for data of a different sampling rate
Voice Pooling
Create pools of Voices
Each pool is keyed on source content format (xWMA, XMA, ADPCM) and channel count
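One possible shape for such a pool is sketched below. It is a model only: PooledVoice stands in for a real source voice, and VoicePool, Acquire, and Release are hypothetical names. Idle voices are kept per (format, channel count) key, and the frequency ratio is set to the data rate divided by the rate the voice was created with, which is the value one would pass to SetFrequencyRatio when reusing a voice for data at a different sampling rate.

```cpp
#include <map>
#include <memory>
#include <utility>
#include <vector>

enum class Codec { PCM, ADPCM, XMA, xWMA };

// Stand-in for a source voice (hypothetical; not an XAudio2 type).
struct PooledVoice {
    Codec    codec = Codec::PCM;
    unsigned channels = 0;
    unsigned nativeRate = 0;        // rate the voice was created with
    float    frequencyRatio = 1.0f; // value a title would pass to SetFrequencyRatio
};

class VoicePool {
public:
    // Returns an idle voice for (codec, channels); creates one only if none is idle.
    PooledVoice* Acquire(Codec codec, unsigned channels, unsigned dataRate) {
        std::vector<PooledVoice*>& idle = idle_[std::make_pair(codec, channels)];
        PooledVoice* v = nullptr;
        if (idle.empty()) {
            all_.emplace_back(new PooledVoice());
            v = all_.back().get();
            v->codec = codec;
            v->channels = channels;
            v->nativeRate = dataRate;
            ++created_;
        } else {
            v = idle.back();
            idle.pop_back();
        }
        // Compensate for data whose rate differs from the voice's native rate.
        v->frequencyRatio = static_cast<float>(dataRate) /
                            static_cast<float>(v->nativeRate);
        return v;
    }

    // Returns a stopped voice to its pool instead of destroying it.
    void Release(PooledVoice* v) {
        idle_[std::make_pair(v->codec, v->channels)].push_back(v);
    }

    unsigned CreatedCount() const { return created_; }

private:
    std::map<std::pair<Codec, unsigned>, std::vector<PooledVoice*>> idle_;
    std::vector<std::unique_ptr<PooledVoice>> all_;  // owns every voice
    unsigned created_ = 0;
};
```

A voice created for 48k mono XMA data and later reused for 24k mono XMA data would get a ratio of 0.5 in this model, and no second voice is created.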
FX Optimization
XAPO_BUFFER_SILENT
Use EnableEffect/DisableEffect
More convenient than destroying and recreating the
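How an effect might take advantage of XAPO_BUFFER_SILENT can be sketched with a simple gain stage. This is a self-contained model rather than a real xAPO: BufferFlags, ProcessBuffer, and ProcessGain are hypothetical names that mirror the idea of a per-buffer silent/valid flag. It also assumes an effect with no tail; an effect such as a reverb would still have to produce its decay after the input goes silent.

```cpp
#include <cstddef>

// Mirrors the idea of XAPO_BUFFER_SILENT / XAPO_BUFFER_VALID (hypothetical enum).
enum class BufferFlags { Silent, Valid };

// Minimal stand-in for a process buffer (hypothetical; not the XAPO struct).
struct ProcessBuffer {
    float*      samples;
    std::size_t count;
    BufferFlags flags;
};

// Gain effect that skips all per-sample work on silent input.
// Returns the number of samples actually processed.
std::size_t ProcessGain(const ProcessBuffer& in, ProcessBuffer& out, float gain) {
    if (in.flags == BufferFlags::Silent) {
        // Input contents are treated as undefined: do not read them,
        // and pass the silent flag downstream so later effects can skip too.
        out.flags = BufferFlags::Silent;
        return 0;
    }
    for (std::size_t i = 0; i < in.count; ++i) {
        out.samples[i] = in.samples[i] * gain;
    }
    out.flags = BufferFlags::Valid;
    return in.count;
}
```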
Remember this?
Memory is pooled for many things, including SRCs and Pitch Shifting
Compression
Always use compression to minimize disk/memory/cache footprint
Reduce XMA/xWMA quality per sound for optimal quality/size tradeoff
Seek tables:
Allow the caller to skip past unwanted packets without having to load the data itself
Compression - Tradeoffs
PCM
Not compressed, so highest fidelity
XMA2
xWMA
Software decode (Mono/Stereo ~= 0.6-1.2% of 360 core)
Excellent compression (~20+:1)
Good for voices/music, no seamless looping
Streaming
Cycle a circular queue of buffers to submit new data to XAudio2
Submit new data within the voice's OnBufferEnd callback
Increasing read-ahead before starting the voice decreases the chance of glitching, but can increase perceptible latency depending on implementation
Consider streaming several buffers into the engine before throttling
XMA2 block size should be a multiple of 32 KB to mirror DVD I/O patterns
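The buffer cycling described above can be modelled as a small ring. This is a sketch under stated assumptions: StreamRing, Prime, and RoundUpToBlock are hypothetical names, no disk I/O or XAudio2 call is made, and a real title would read the data and call SubmitSourceBuffer at the marked point, ideally doing the disk read off the audio callback thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockAlign = 32 * 1024;

// Rounds a requested buffer size up to the next multiple of 32 KB.
constexpr std::size_t RoundUpToBlock(std::size_t bytes) {
    return ((bytes + kBlockAlign - 1) / kBlockAlign) * kBlockAlign;
}

// Toy model of a circular queue of streaming buffers (hypothetical).
class StreamRing {
public:
    StreamRing(std::size_t bufferCount, std::size_t bufferBytes, std::size_t fileBytes)
        : buffers_(bufferCount, std::vector<std::uint8_t>(RoundUpToBlock(bufferBytes))),
          fileBytes_(fileBytes) {}

    // Read-ahead: queue every buffer before the voice is started.
    void Prime() {
        while (queued_ < buffers_.size() && SubmitNext() > 0) {}
    }

    // Models the OnBufferEnd callback: the oldest buffer is free again,
    // so refill it with the next part of the file and resubmit it.
    void OnBufferEnd() {
        assert(queued_ > 0);
        --queued_;
        SubmitNext();
    }

    std::size_t Queued() const { return queued_; }
    std::size_t Offset() const { return offset_; }

private:
    // Returns the number of bytes submitted (0 once the file is exhausted).
    std::size_t SubmitNext() {
        const std::size_t n = std::min(buffers_[next_].size(), fileBytes_ - offset_);
        if (n == 0) return 0;
        // A real title would fill buffers_[next_] and submit it to the voice here.
        offset_ += n;
        next_ = (next_ + 1) % buffers_.size();
        ++queued_;
        return n;
    }

    std::vector<std::vector<std::uint8_t>> buffers_;
    std::size_t fileBytes_;
    std::size_t offset_ = 0;  // next file byte to stream
    std::size_t next_ = 0;    // ring index of the next buffer to fill
    std::size_t queued_ = 0;  // buffers currently owned by the voice
};
```

More buffers in the ring means more read-ahead and fewer glitches, at the cost of memory and, depending on the design, start-up latency.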
xWMA Streaming
Each xWMA file contains a list of offsets (DPDS chunk)
Each submit needs a modified form of this list

Packet    DPDS Chunk    1st Submit    2nd Submit
1         1000          1000
2         2000          2000
3         3000          3000
4         5000                        2000 (5000 - 3000)
5         7000                        4000 (7000 - 3000)
6         12000                       9000 (12000 - 3000)
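The per-submit adjustment of the DPDS list can be sketched as a small helper. RebaseDpds is a hypothetical name; the sketch assumes the list holds cumulative decoded byte counts per packet, so each submit subtracts the cumulative value at the end of the previous submit. In a real title the result would feed the pDecodedPacketCumulativeBytes and PacketCount fields of XAUDIO2_BUFFER_WMA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the cumulative decoded-byte list for packets [first, first + count),
// rebased so that counting starts from zero at the beginning of this submit.
std::vector<std::uint32_t> RebaseDpds(const std::vector<std::uint32_t>& dpds,
                                      std::size_t first, std::size_t count) {
    assert(first + count <= dpds.size());
    // Decoded bytes already accounted for by earlier submits.
    const std::uint32_t base = (first == 0) ? 0 : dpds[first - 1];
    std::vector<std::uint32_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(dpds[first + i] - base);
    }
    return out;
}
```

With the slide's DPDS chunk of 1000, 2000, 3000, 5000, 7000, 12000 split into two submits of three packets each, the first submit list is 1000, 2000, 3000 and the second is 2000, 4000, 9000.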
XAudio2 callbacks
Your code can be blocked by any XAudio2 API call, waiting on internal realtime thread locks.
Debugging
Use the debug versions of XAudio2, X3DAudio, XAPOBase, etc.
SetDebugConfiguration may be used to control debug behavior for XAudio2
VolumeMeter xAPO useful for detecting clipping
PIX counters available to track CPU, memory, and voice statistics
Similar data available via IXAudio2::GetPerformanceData
Watch for other threads on the core that may be slowing down XAudio2
A Case Study
Sample Rate Conversion
[Diagram labels: Mono, Stereo, Quad, and 5.1 source voices, each with Pitch/SRC + filter and Effect1 ... EffectN; Filter; Reverb; Sample Rate Conv.]
PIX
Timing Capture
OnProcessingPassEnd Callback
Use callbacks to notify Hardware Thread 5 that it can resume execution
xbPerfView
w/ Sampling Capture
A Case Study
Adding submixes
[Diagram labels: Mono, Stereo, Quad, and 5.1 source voices, each with Pitch/SRC + filter and Effect1 ... EffectN; Filter; Reverb; Sample Rate Conv.]
xbPerfView
w/ Submixing
A Case Study
SRC & Reverb
[Diagram labels: Mono, Stereo, Quad, and 5.1 source voices, each with Pitch/SRC + filter and Effect1 ... EffectN; 32k and 48k stages]
xbPerfView
Final Numbers
Component     Start CPU%    Final CPU%    % Freed
MatrixMix     17.48%        4.25%         13.23%
Reverb        6.37%         4.94%         1.43%
Resampling    14.74%        11.41%        3.33%
Total         38.59%        20.60%        17.99%
Idle          27.95%        48.47%        20.52%
Summary
SUBMIX!
Use OnBufferEnd callbacks to stream data
Intentionally choose your compression methods
Carefully manage your voice interactions
Watch for Blocking Calls
Pool voices where possible
Use EnableEffect/DisableEffect
Profile your title to focus your efforts
www.microsoftgamefest.com