
Spatial Audio Processing: MPEG Surround and Other Applications

ISBN: 978-0-470-03350-0
224 pages
December 2007


This book collects a wealth of information about spatial audio coding into one comprehensible volume. It is a thorough reference to the 3GPP and MPEG Parametric Stereo standards and the MPEG Surround multi-channel audio coding standard. It describes key developments in coding techniques, an important factor in the optimization of advanced entertainment, communications and signal processing applications.

Until recently, technologies for coding audio signals, such as redundancy reduction and sophisticated source and receiver models, did not incorporate spatial characteristics of source and receiving ends. Spatial audio coding achieves much higher compression ratios than conventional coders. It does this by representing multi-channel audio signals as a downmix signal plus side information that describes the perceptually-relevant spatial information.
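
The idea of a downmix plus side information can be illustrated with a deliberately simplified sketch. It is not the Parametric Stereo or MPEG Surround algorithm: it works on whole time frames rather than time-frequency tiles, and it transmits only one cue, a per-frame level difference between the two channels. The function names `encode` and `decode` and the `frame` parameter are hypothetical and chosen for illustration only.

```python
import math

def encode(left, right, frame=4):
    """Return (mono downmix, per-frame level difference in dB)."""
    # One audio channel is kept instead of two.
    downmix = [(l + r) / 2.0 for l, r in zip(left, right)]
    eps = 1e-12  # avoids log(0) on silent frames (assumption of this sketch)
    level_diff_db = []
    for start in range(0, len(left), frame):
        e_left = sum(x * x for x in left[start:start + frame]) + eps
        e_right = sum(x * x for x in right[start:start + frame]) + eps
        # Side information: one number per frame instead of a second channel.
        level_diff_db.append(10.0 * math.log10(e_left / e_right))
    return downmix, level_diff_db

def decode(downmix, level_diff_db, frame=4):
    """Rebuild two channels by re-applying each frame's level difference."""
    left, right = [], []
    for i, m in enumerate(downmix):
        ratio = 10.0 ** (level_diff_db[i // frame] / 20.0)  # amplitude ratio L/R
        # Gains satisfy g_left / g_right == ratio and g_left + g_right == 2,
        # so the decoded pair averages back to the downmix sample.
        g_left = 2.0 * ratio / (1.0 + ratio)
        g_right = 2.0 / (1.0 + ratio)
        left.append(g_left * m)
        right.append(g_right * m)
    return left, right
```

For a single source panned between the two channels with positive gains, this sketch recovers both channels almost exactly from the mono signal and one number per frame. Signals that differ in more than level between channels need further cues, which this sketch omits.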

Written by experts in spatial audio coding, Spatial Audio Processing:

  • reviews psychoacoustics (the relationship between physical measures of sound and the corresponding percepts) and spatial audio sound formats and reproduction systems;
  • brings together the processing, acquisition, mixing, playback, and perception of spatial audio, with the latest coding techniques;
  • analyses algorithms for the efficient manipulation of multiple, discrete and combined spatial audio channels, including both MP3 and MPEG Surround;
  • shows how the same insights on source and receiver models can also be applied for manipulation of audio signals, such as the synthesis of virtual auditory scenes employing head-related transfer function (HRTF) processing and stereo to N-channel audio upmix.

Audio processing research engineers and audio coding research and implementation engineers will find this an insightful guide. Academic audio and psychoacoustic researchers, including post-graduate and third/fourth year students taking courses in signal processing, audio and speech processing, and telecommunications, will also benefit from the information inside.


Table of Contents

Author Biographies.



1 Introduction.

1.1 The human auditory system.

1.2 Spatial audio reproduction.

1.3 Spatial audio coding.

1.4 Book outline.

2 Background.

2.1 Introduction.

2.2 Spatial audio playback systems.

2.2.1 Stereo audio loudspeaker playback.

2.2.2 Headphone audio playback.

2.2.3 Multi-channel audio playback.

2.3 Audio coding.

2.3.1 Audio signal representation.

2.3.2 Lossless audio coding.

2.3.3 Perceptual audio coding.

2.3.4 Parametric audio coding.

2.3.5 Combining perceptual and parametric audio coding.

2.4 Matrix surround.

2.5 Conclusions.

3 Spatial Hearing.

3.1 Introduction.

3.2 Physiology of the human hearing system.

3.3 Spatial hearing basics.

3.3.1 Spatial hearing with one sound source.

3.3.2 Ear entrance signal properties and lateralization.

3.3.3 Sound source localization.

3.3.4 Two sound sources: summing localization.

3.3.5 Superposition of signals each evoking one auditory object.

3.4 Spatial hearing in rooms.

3.4.1 Source localization in the presence of reflections: the precedence effect.

3.4.2 Spatial impression.

3.5 Limitations of the human auditory system.

3.5.1 Just-noticeable differences in interaural cues.

3.5.2 Spectro-temporal decomposition.

3.5.3 Localization accuracy of single sources.

3.5.4 Localization accuracy of concurrent sources.

3.5.5 Localization accuracy when reflections are present.

3.6 Source localization in complex listening situations.

3.6.1 Cue selection model.

3.6.2 Simulation examples.

3.7 Conclusions.

4 Spatial Audio Coding.

4.1 Introduction.

4.2 Related techniques.

4.2.1 Pseudostereophonic processes.

4.2.2 Intensity stereo coding.

4.3 Binaural Cue Coding (BCC).

4.3.1 Time–frequency processing.

4.3.2 Down-mixing to one channel.

4.3.3 ‘Perceptually relevant differences’ between audio channels.

4.3.4 Estimation of spatial cues.

4.3.5 Synthesis of spatial cues.

4.4 Coding of low-frequency effects (LFE) audio channels.

4.5 Subjective performance.

4.6 Generalization to spatial audio coding.

5 Parametric Stereo.

5.1 Introduction.

5.1.1 Development and standardization.

5.1.2 AacPlus v2.

5.2 Interaction between core coder and spatial audio coding.

5.3 Relation to BCC.

5.4 Parametric stereo encoder.

5.4.1 Time/frequency decomposition.

5.4.2 Parameter extraction.

5.4.3 Down-mix.

5.4.4 Parameter quantization and coding.

5.5 Parametric stereo decoder.

5.5.1 Analysis filterbank.

5.5.2 Decorrelation.

5.5.3 Matrixing.

5.5.4 Interpolation.

5.5.5 Synthesis filterbanks.

5.5.6 Parametric stereo in enhanced aacPlus.

5.6 Conclusions.

6 MPEG Surround.

6.1 Introduction.

6.2 Spatial audio coding.

6.2.1 Concept.

6.2.2 Elementary building blocks.

6.3 MPEG Surround encoder.

6.3.1 Structure.

6.3.2 Pre- and post-gains.

6.3.3 Time–frequency decomposition.

6.3.4 Spatial encoder.

6.3.5 Parameter quantization and coding.

6.3.6 Coding of residual signals.

6.4 MPEG Surround decoder.

6.4.1 Structure.

6.4.2 Spatial decoder.

6.4.3 Enhanced matrix mode.

6.5 Subjective evaluation.

6.5.1 Test 1: operation using spatial parameters.

6.5.2 Test 2: operation using enhanced matrix mode.

6.6 Conclusions.

7 Binaural Cues for a Single Sound Source.

7.1 Introduction.

7.2 HRTF parameterization.

7.2.1 HRTF analysis.

7.2.2 HRTF synthesis.

7.3 Sound source position dependencies.

7.3.1 Experimental procedure.

7.3.2 Results and discussion.

7.4 HRTF set dependencies.

7.4.1 Experimental procedure.

7.4.2 Results and discussion.

7.5 Single ITD approximation.

7.5.1 Procedure.

7.5.2 Results and discussion.

7.6 Conclusions.

8 Binaural Cues for Multiple Sound Sources.

8.1 Introduction.

8.2 Binaural parameters.

8.3 Binaural parameter analysis.

8.3.1 Binaural parameters for a single sound source.

8.3.2 Binaural parameters for multiple independent sound sources.

8.3.3 Binaural parameters for multiple sound sources with varying degrees of mutual correlation.

8.4 Binaural parameter synthesis.

8.4.1 Mono down-mix.

8.4.2 Extension towards stereo down-mixes.

8.5 Application to MPEG Surround.

8.5.1 Binaural decoding mode.

8.5.2 Binaural parameter synthesis.

8.5.3 Binaural encoding mode.

8.5.4 Evaluation.

8.6 Conclusions.

9 Audio Coding with Mixing Flexibility at the Decoder Side.

9.1 Introduction.

9.2 Motivation and details.

9.2.1 ICTD, ICLD and ICC of the mixer output.

9.3 Side information.

9.3.1 Reconstructing the sources.

9.4 Using spatial audio decoders as mixers.

9.5 Transcoding to MPEG Surround.

9.6 Conclusions.

10 Multi-loudspeaker Playback of Stereo Signals.

10.1 Introduction.

10.2 Multi-channel stereo.

10.3 Spatial decomposition of stereo signals.

10.3.1 Estimating p_{s,b}, A_b and p_{n,b}.

10.3.2 Least-squares estimation of s_m, n_{1,m} and n_{2,m}.

10.3.3 Post-scaling.

10.3.4 Numerical examples.

10.4 Reproduction using different rendering setups.

10.4.1 Multiple loudspeakers in front of the listener.

10.4.2 Multiple front loudspeakers plus side loudspeakers.

10.4.3 Conventional 5.1 surround loudspeaker setup.

10.4.4 Wavefield synthesis playback system.

10.4.5 Modifying the decomposed audio signals.

10.5 Subjective evaluation.

10.5.1 Subjects and playback setup.

10.5.2 Stimuli.

10.5.3 Test method.

10.5.4 Results.

10.6 Conclusions.

10.7 Acknowledgement.

Frequently Used Terms, Abbreviations and Notation.

Terms and abbreviations.

Notation and variables.




Author Information

Jeroen Breebaart was born in the Netherlands in 1970. He studied biomedical engineering at the Technical University Eindhoven. He received his PhD degree in 2001 from the Institute for Perception Research (IPO) in the field of mathematical models of human spatial hearing. Currently, he is a senior scientist with Philips Research. His main fields of interest and expertise are spatial hearing, parametric stereo and multi-channel audio coding, automatic audio content analysis, and generic digital audio signal processing algorithms. He has published several papers on binaural hearing, binaural modeling, and spatial audio coding. His work is incorporated in several international audio compression standards such as MPEG-4, 3GPP, and MPEG Surround.

Christof Faller received an MS (Ing) degree in electrical engineering from ETH Zurich, Switzerland, in 2000, and a PhD degree for his work on parametric multi-channel audio coding from EPFL, Switzerland, in 2004. From 2000 to 2004 he worked in the Speech and Acoustics Research Department at Bell Laboratories, Lucent Technologies and Agere Systems (a Lucent Company), where he worked on audio coding for digital satellite radio, including parametric multi-channel audio coding. He is currently a part-time postdoctoral employee at EPFL. In 2006 he founded Illusonic LLC, an audio and acoustics research company. Dr Faller has won a number of awards for his contributions to spatial audio coding, MP3 Surround, and MPEG Surround. His main current research interests are spatial hearing and spatial sound capture, processing, and reproduction.

