The goal of this article is to provide an approach to decoding Ambisonic B-Format to 5.1 surround sound, with special attention to music content. If you're not familiar with Ambisonic surround sound technology, we suggest you read this article and this article or the Wikipedia entry first. You must also be familiar with Steinberg Nuendo.
The software suggested in this article is OS X 10.4 to 10.8, Steinberg Nuendo 3/4/5, David McGriffy's Visual Virtual Microphone VST plug-in, Dave Malham & Ambrose Field's Ambisonic VST plug-ins, the York University Ambisonic VST plug-ins, our B2G and 2B2G decoder plug-ins, Apple Compressor, ffmpegX, Vortex Surround Encoder, Apple DVD Studio Pro, Burn, Roxio Toast 8/9/10, Apple iTunes and VLC Media Player. Nuendo, ffmpeg, Vortex Surround Encoder, iTunes, VLC and most of the plug-ins are also available for Windows.
The information on this page is provided as is, without any guarantee regarding performance or end result.
Ambisonic B-Format files can be downloaded from the Ambisonia and the Sound of Space websites. The files available there have a «.amb» extension: if Nuendo can't recognize the files, replace the «.amb» with «.wav».
The Ambisonic B-Format is a multichannel surround sound format that is not intended to be played back directly on loudspeakers. The B-Format stream has to be decoded to a speaker-ready stream (historically referred to as «D-Format»). For various reasons, hardware decoders have always been hard to find, and only recently have software decoders become more widely available. Even then, decoding requires a computer interfaced with a home theater system or a dedicated multispeaker array. For these reasons, the past ten years have seen the emergence of proponents of pre-decoding B-Format so that it can be stored and distributed on widely available media, like DVD-Video. B-Format pre-decoded for 5.1 home theater surround has been dubbed «G-Format». Ambisonic G-Format on DVD-Video, and more generally any type of «surround sound» on DVD-Video, can use either Dolby Digital or DTS for storing the information (and supposedly uncompressed PCM, but we haven’t seen evidence of this claim). Our focus, though, is not on which encoder might be used when the audio stream is mastered for DVD or HD broadcast. We are interested in what comes before that.
What is the best way to decode B-Format to 5.1 surround? To answer, let’s compare what is needed for Ambisonic to perform well in playback with the typical home theater setup.
Ambisonic must then play the «5.1 game» to achieve the best possible result in a home theater setup. It’s a bit sad, but we must assume that the average home theater installation won’t be optimal: speakers of different sizes, with different frequency responses and directivities, installed at the wrong angles (let’s hope at least that the consumer has run the THX Optimizer found on many popular DVDs and that the speakers' relative positions and polarity are OK). In that context, any attempt at recreating a continuous, isotropic sound field is pointless. We must now think in terms of two axes: Left/Right and Front/Back. A «good» B-Format decoding for 5.1 will try to maximize separation, or difference of audio information, on these axes.
We must warn here that we proceed with the aesthetic of traditional stage-presented music in mind. This is the way most music is played and presented in real life, although exceptions can be found in every era and culture.
1st order B-Format to 5.1
The Left/Right axis should include all the stage performers so that their positioning will not rely on the surround channels. This means creating L/R virtual microphones pointing at the edges of the performing ensemble. The L/R virtual microphones have the same directivity and, as a starting point, we propose this width/directivity table that will maximize separation.
Based on the L/R virtual microphone width, the surround Ls/Rs will use the remaining horizontal arc, and the width of their virtual microphones is obtained with this equation: ((360 - Front L/R width) / 3 + Front L/R width / 2) * 2. This gives the Ls/Rs width relative to front-center. So for a L/R width of 120°, the Ls/Rs width will be (240 / 3 + 60) * 2 = 280°.
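The equation above can be checked with a few lines of Python. This is only a sketch: the function name `surround_width` is ours, chosen for illustration, and widths are taken as total included angles in degrees, centered on front-center, as in the text.

```python
def surround_width(front_width_deg):
    """Return the Ls/Rs virtual microphone width in degrees (relative to
    front-center) for a given front L/R width, using the equation in the text."""
    if not 0 <= front_width_deg <= 360:
        raise ValueError("front width must be between 0 and 360 degrees")
    remaining_arc = 360 - front_width_deg
    # The remaining arc is split into three equal gaps (L to Ls, Ls to Rs,
    # Rs to R), so each surround mic sits one gap beyond the front mic
    # on its own side.
    half_width = remaining_arc / 3 + front_width_deg / 2
    return half_width * 2


for front in (90, 120, 180):
    print(front, surround_width(front))
```

With a front width of 120°, each front microphone points 60° off center and each surround microphone 140° off center, which is the 280° width quoted above.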
The Front/Back axis can be judged by balancing the Center channel against the Ls/Rs channels. We usually end up with a Center virtual cardioid, with a -6 to -10 dB gain relative to the L/R. The Ls/Rs channels are also cardioid, 2 or 3 dB lower than the L/R channels. To augment separation on the Front/Back axis, we can introduce a delay, up to 35 ms, in the Ls/Rs. With a longer delay, the surround channels no longer integrate with the front. This kind of delay is actually quite similar to what has been used for years in Dolby Surround to increase the Front/Back difference.
In certain circumstances, a delay in the surround will be detrimental to the music. Case in point, the electro piece Colossus by Henry Walmsley (available on Ambisonia) has bass notes fed directly to the W channel: adding a delay noticeably lowers the amplitude of these notes. This sort of situation, signal in W only, would be quite rare (if not impossible?) in real acoustic sound recording.
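To make the chain described above concrete, here is a minimal Python sketch of a horizontal first order virtual microphone followed by a gain and a delay. It is not the code of any of the plug-ins mentioned in this article. The function names are ours, and two conventions are assumed: W is recorded 3 dB down (as in traditional B-Format), and azimuth is measured counterclockwise from front-center. The gain and delay values are only examples picked inside the ranges given above.

```python
import math


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def virtual_mic(w, x, y, azimuth_deg, p=0.5):
    """Horizontal first order virtual microphone.

    p = 1 is omni, p = 0.5 is cardioid, p = 0 is figure-of-eight.
    ASSUMPTION: W is 3 dB down (the sqrt(2) factor compensates) and
    azimuth is counterclockwise from front-center.
    """
    az = math.radians(azimuth_deg)
    cos_az, sin_az = math.cos(az), math.sin(az)
    return [
        p * math.sqrt(2) * wi + (1 - p) * (xi * cos_az + yi * sin_az)
        for wi, xi, yi in zip(w, x, y)
    ]


def gain_and_delay(signal, gain_db, delay_samples=0):
    """Scale a signal and delay it by prepending silence (output is longer)."""
    g = db_to_linear(gain_db)
    return [0.0] * delay_samples + [g * s for s in signal]


# Illustrative values only: a single click arriving from the front.
SAMPLE_RATE = 48000
w, x, y = [1 / math.sqrt(2), 0.0], [1.0, 0.0], [0.0, 0.0]

left = gain_and_delay(virtual_mic(w, x, y, 60), gain_db=0)
center = gain_and_delay(virtual_mic(w, x, y, 0), gain_db=-8)
left_surround = gain_and_delay(
    virtual_mic(w, x, y, 140),
    gain_db=-3,
    delay_samples=delay_in_samples(20, SAMPLE_RATE),
)
```

The right-hand channels follow by mirroring the azimuths (-60° and -140°). In a real session the same balance is of course set by ear with the plug-ins, not computed.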
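One plausible reading of this effect, which is our assumption and not something measured on the piece itself, is comb filtering: a signal present only in W comes out identically from every virtual microphone, so the delayed surround copy partly cancels the front copy at certain low frequencies. The sketch below models this as a simple sum of a tone and its delayed, attenuated copy. Real acoustic summation in a room is more complex, and the function names are ours.

```python
import cmath
import math


def summed_level_db(freq_hz, delay_ms, surround_gain_db=-3.0):
    """Level in dB, relative to the front signal alone, of a tone summed
    with a delayed and attenuated copy of itself (idealized model)."""
    g = 10 ** (surround_gain_db / 20)
    phase = 2 * math.pi * freq_hz * delay_ms / 1000
    return 20 * math.log10(abs(1 + g * cmath.exp(-1j * phase)))


def first_notch_hz(delay_ms):
    """Lowest frequency at which the delayed copy arrives in opposite phase."""
    return 1000 / (2 * delay_ms)


delay_ms = 35
print("first notch:", round(first_notch_hz(delay_ms), 1), "Hz")
for freq in (40, 60, 80, 100):
    print(freq, "Hz:", round(summed_level_db(freq, delay_ms), 1), "dB")
```

In this model the notches fall at odd multiples of the first notch frequency, so a 35 ms delay places several of them inside the bass range.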
Our recommendations for 1st order B-Format to 5.1 surround decoding can be implemented in Nuendo by inserting an instance of the Visual Virtual Mic plug-in on a 5.1 output bus, followed by an instance of Nuendo Mixer Delay. The Mixer Delay controls the delay in the surround channels, and can also be used to mute channels while balancing the Left/Right and Front/Back axes.
In Nuendo, it's also possible to create three groups with an instance of York's B-Mic in each group. These groups will be fed by three post-fader sends from a B-Format track. The first group will be used to generate the L/R signal, the second will be used for Center and the third for Ls/Rs. A delay plug-in is also inserted in the surround group. Each group is then sent to the output bus using child busses (L/R, C/Lfe, Ls/Rs).
If you're on OS X, we invite you to try our B2G plug-in: its design reflects our approach of B-Format to 5.1 decoding.
2nd order B-Format to 5.1
Broadly, the approach stays the same for 2nd order B-Format to 5.1 decoding.
The increased spatial resolution of 2nd order B-Format can yield greater speaker channel separation, in turn allowing a closer correspondence between the prescribed speaker positions and the orientation of the virtual microphones. It can also result in a more useful center channel, with less overlap with the Left/Right channels than in a first order only 5.1 decoding. The smaller overlap can also lead to a reduction of the surround channel delay, if not its omission.
But by choosing 2nd order B-Format, we must now consider some implications in the decoding stage, be it to 5.1 or to a dedicated Ambisonic speaker installation.
One possible recording and post-production scenario is using a first order B-Format main microphone (tetrahedral or native B-Format) with spot microphones recorded onto mono tracks. At the mixing stage, the spot mics are panned to the 2nd order. The resulting B-Format is now of mixed spatial resolution: some information is of 1st order and some is of 2nd order.
In decoding, when choosing the directivity of the virtual microphones by weighting the 0th, 1st and 2nd order, we must consider that some information won't be processed by the 2nd order stage of the decoder. For example, a very directional 2nd order cardioid, 0.5 + cos(A) + 0.5 cos(2A), with a minimal reversed polarity back lobe (-18 dB), will act on first order only material as a first order mic, 0.5 + cos(A), with a reversed polarity back lobe of -9.5 dB: is this what we want?
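The two back lobe figures can be verified numerically. The sketch below evaluates a pattern a0 + a1 cos(A) + a2 cos(2A), scans the half circle for the most negative value, and expresses it in dB relative to the on-axis response. The function names and the 0.1° scan step are our choices for illustration.

```python
import math


def pattern(angle_deg, weights):
    """Evaluate a0 + a1*cos(A) + a2*cos(2A) for weights = (a0, a1, a2)."""
    a = math.radians(angle_deg)
    return sum(wt * math.cos(order * a) for order, wt in enumerate(weights))


def back_lobe_db(weights, steps_per_degree=10):
    """Level of the largest reversed polarity lobe, in dB relative to the
    on-axis (A = 0) response. Returns -inf if the pattern never goes negative."""
    on_axis = pattern(0, weights)
    most_negative = min(
        pattern(i / steps_per_degree, weights)
        for i in range(180 * steps_per_degree + 1)
    )
    if most_negative >= 0:
        return float("-inf")
    return 20 * math.log10(-most_negative / on_axis)


second_order = (0.5, 1.0, 0.5)   # pattern applied to 2nd order material
first_order = (0.5, 1.0, 0.0)    # what 1st order only material goes through

print(round(back_lobe_db(second_order), 1))
print(round(back_lobe_db(first_order), 1))
```

The 2nd order pattern has its largest negative lobe at 120° off axis, while the truncated first order pattern has it directly behind, at 180°.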
Again, if you're on OS X, we invite you to try our 2B2G plug-in: its design reflects our approach of 2nd order B-Format to 5.1 decoding, with independent order weighting for each group of speakers (L/R, Center, Lfe/Sub and Ls/Rs).
G-Format in Dolby Digital and DTS
Once the B-Format stream is converted to 5.0 or 5.1 G-Format, a Dolby Digital encoded DVD-Video or a DTS encoded Audio-CD can be easily created.
With Compressor, the 48 kHz / 24 or 16 bit mono files (one per channel to be encoded) are used as the source. Suggested settings: set the dialogue normalization to -31 dB, set the Center and Surround channel downmix levels to -6 dB, and turn off the 90° phase shift in the Surround channels. The bitrate should be set to the maximum allowed on DVD-Video, 448 kbps. In the case of a 5.0 mix with source files at 24/48, the resulting AC3 (Dolby Digital) file will be about thirteen times smaller than the total source files (roughly 5.76 Mbps of PCM against 448 kbps).
With ffmpegX, the source is a 48 kHz / 16 bit six channel interleaved file (L C R Ls Rs Lfe). Unlike Apple Compressor, there are no settings other than selecting the "Audio file to AC3" preset in the Target format drop-down menu: one must hope that the reverse engineering behind this non-official AC3 encoder was adequately done.
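The size reduction follows directly from the bitrates. As a rough check, the sketch below compares the PCM bitrate of the source files with the AC3 bitrate; it ignores file headers, and the function names are ours. For a 5.0 mix at 24/48 encoded at 448 kbps it gives a ratio close to 12.9.

```python
def pcm_bitrate_bps(channels, sample_rate_hz, bit_depth):
    """Bitrate of uncompressed PCM audio, in bits per second."""
    return channels * sample_rate_hz * bit_depth


def size_ratio(channels, sample_rate_hz, bit_depth, encoded_bps):
    """How many times smaller the encoded stream is than the PCM source."""
    return pcm_bitrate_bps(channels, sample_rate_hz, bit_depth) / encoded_bps


# 5.0 mix, 48 kHz / 24 bit source, AC3 at the DVD-Video maximum of 448 kbps
print(round(size_ratio(5, 48000, 24, 448000), 2))
# Same mix from 16 bit source files
print(round(size_ratio(5, 48000, 16, 448000), 2))
```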
This AC3 file is then imported in a DVD Studio Pro project. With a very basic menu and a still image as video track, the resulting DVD-Video can be burned directly from DVD Studio Pro or exported as a DMG file that can be distributed and later burned with software like Apple Disk Utility, Burn or Roxio Toast.
As an alternative, Roxio Toast (version 8, 9 or 10) has a DVD authoring mode called Music DVD (Audio Menu) that allows easy creation of DVD-Video for audio-only playback. To prevent the multichannel AC3 files from being downmixed and re-encoded to Toast's default two channel AC3, the files have to be dragged to the Toast window while pressing the Option (Alt) key. Toast will create a menu for the DVD and a still image for each track.
It should be noted that AC3 files are directly playable with the VLC Media Player. One could then also include on a DVD-Video a non-multiplexed AC3 file, thus giving the end-user more playback options. Such a file, to be directly read by a software player, could use the maximum possible bitrate of 640 kbps allowed for a generic AC3 file and a 44.1 kHz sampling rate since it won't be tied in with a video stream.
For a DTS encoded Audio-CD, the 44.1 kHz / 24 or 16 bit mono files (one per channel to be encoded) are fed to Vortex Surround Encoder. Besides selecting DTS as the output format, there are no settings to adjust. In the case of a 5.0 mix with source files at 24/44.1, the size of the resulting DTS data, in a WAV file container, will be about 3.75 times smaller than the total source files.
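The 3.75 figure can be reproduced if we assume that the DTS-WAV file has the size of an ordinary Audio-CD stream, that is 44.1 kHz / 16 bit stereo, which is what a CD burning application expects. The sketch below compares file sizes for a given duration under that assumption, ignoring headers; the function names are ours.

```python
def source_size_bytes(duration_s, channels=5, sample_rate_hz=44100, bit_depth=24):
    """Total size of the mono PCM source files for a mix of the given length."""
    return duration_s * channels * sample_rate_hz * bit_depth // 8


def dts_wav_size_bytes(duration_s, sample_rate_hz=44100):
    """Size of the DTS data in its WAV container.

    ASSUMPTION: the container is laid out like CD audio (stereo, 16 bit),
    whatever the number of channels encoded inside the DTS stream.
    """
    return duration_s * 2 * sample_rate_hz * 16 // 8


duration = 300  # a five minute track, in seconds
ratio = source_size_bytes(duration) / dts_wav_size_bytes(duration)
print(source_size_bytes(duration), dts_wav_size_bytes(duration), ratio)
```

Under the same assumption, 16 bit source files would give a ratio of 2.5.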
The DTS-WAV file is then imported in a standard Audio-CD burning application like Apple iTunes, Burn or Roxio Toast. Once the DTS-CD is ready, care should be taken not to play the CD undecoded, since it will output continuous white noise. If placed in a standard CD player, the player must output the data through a digital link to a surround decoder such as the ones found in home theater installations. If placed in a DVD-Video player, the DTS data will be either internally decoded in the player or again externally decoded with home theater components.
DTS-WAV files are also directly playable with the VLC Media Player.
While looking for a universal way of decoding first and higher order B-Format to 5.1 surround is a valid endeavour in certain contexts (like the Ambisonia website, which offers 4.0 DTS from the B-Format files), we do think that every production should have an individually optimized decoding if the 5.1 stream is what's ultimately offered to the consumers.