Virtual reality has become a cornerstone of immersive technology, and it didn't take long for engineers to realize how much spatial audio could enhance the simulated experience. One of the main challenges for sound designers stepping into mixing for virtual reality is understanding the new standard of layering audio. Until now, audio engineers have mixed for static listeners, whether that meant a mono speaker, headphones, two bedroom speakers, or a 5.1 surround setup in a vehicle. Virtual reality instead combines head tracking with HRTFs (head-related transfer functions), so the orientation of the listener's head may, or may not, influence how a given sound is rendered.
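To see why head tracking matters, here is a minimal conceptual sketch, not an actual HRTF implementation, of how a source's direction changes relative to a turning head. All names here are hypothetical; a real spatializer would feed this head-relative angle into its HRTF filters.

```python
import math

def head_relative_azimuth(source_pos, listener_pos, head_yaw_deg):
    """Angle of a sound source relative to where the listener is facing.

    A real spatializer feeds this kind of head-relative direction into
    HRTF filters; here we only compute the angle, to show why turning
    your head changes the mix.
    """
    dx = source_pos[0] - listener_pos[0]
    dz = source_pos[1] - listener_pos[1]
    world_azimuth = math.degrees(math.atan2(dx, dz))  # 0 deg = straight ahead
    # Wrap into [-180, 180): negative = left of the nose, positive = right
    return (world_azimuth - head_yaw_deg + 180) % 360 - 180

# The same source lands on a different side as the head turns:
print(head_relative_azimuth((1.0, 1.0), (0.0, 0.0), 0.0))   # ~45 (front right)
print(head_relative_azimuth((1.0, 1.0), (0.0, 0.0), 90.0))  # ~-45 (front left)
```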
Immersive audio combines both static and dynamic mixes into one experience, and it is the sound designer’s job to decide whether to spatialize each sound. When in doubt, it’s best to spatialize sounds in VR. However, there are some elements, such as background music, narration voice over, and some UI elements, that you may not want to spatialize. Whatever the case, there are many tools out there for creating immersive sound design. These are just some of Part Time Evil’s favorites that help us achieve our audio goals.
One of our favorite sound effects libraries is a program called Soundly. It would be great if we could record our own sounds for every project. However, due to the size of our team and our project time constraints, we have to work fast and efficiently. Soundly allows us to find high-quality samples that we can further process in our digital audio workstation (DAW). We can layer multiple samples to create complex and original textures or add additional effects to better characterize the sounds for the worlds we wish to bring them into.
Not only does Soundly have its own professional library, but it has also teamed up with Freesound, so you can search both libraries through its app. Soundly also lets you organize sounds using Collections, so you can sort similar files before downloading your favorites. One of the best features of Soundly is that you can highlight a region of the waveform and simply click and drag the audio right into your DAW.
Splice is another sound library, but this one caters to music producers. If we ever need to create a tune from scratch, Splice is our go-to. It’s best to dive into Splice with a mood already in mind, or you may get lost and overwhelmed pretty quickly. We use Spotify and YouTube to find music that matches our project goals, then we search Splice for sound packs that replicate the tone of our references. You can search packs by title, genre, producer, or instrument, and you can download an entire pack or just a few of your favorite samples.
Recently, Splice has been expanding its library with ambiences, foley, and surround sound in order to compete with cinematic libraries. As with Soundly, you can click and drag files from the Splice desktop app straight into your DAW. If its cinematic library keeps growing, Splice might become the best one-stop shop for both music production and sound effects!
After we’ve gathered our sound effects and music samples, we bring them into Ableton Live 11 to further process our sounds or start writing tunes for our experiences. Ableton Live in conjunction with Ableton Push 2 is all we need to create drums, bass, pads, and melodies; we also have other instruments like guitars, pianos, bass guitars, and vocals ready to record if necessary. We find Ableton helpful because it ships with instruments and effects built for creating really interesting textures and ambiences. Honorable mentions bundled with Ableton include Corpus, Spectral Resonator, and the Phaser-Flanger. Third-party synthesis plugins that are essential to our creative sound design process include Serum and Massive X.
There are two ways of working in Ableton that separate the DAW from its competitors: the standard linear timeline (Arrangement View) and the non-linear workflow (Session View). In Session View, you can record multiple loops with each instrument and mix and match them with other loops to find the best-sounding results. When rendering our music, if it’s not going to be spatialized, we render it as a two-track stereo mix. Any sound effect that will be spatialized needs to be mixed and rendered as mono before it goes into Wwise, as in the sketch below.
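To make that last step concrete, here is a minimal sketch of a stereo-to-mono downmix, assuming the open-source soundfile library (`pip install soundfile`) and placeholder file names; in practice you would simply bounce the track as mono from the DAW itself.

```python
# Sketch: collapse a stereo bounce to mono before importing into Wwise.
# File names are placeholders for illustration only.
import soundfile as sf

data, sample_rate = sf.read("whoosh_stereo.wav")  # data shape: (frames, channels)
if data.ndim > 1:
    data = data.mean(axis=1)  # average the channels into a single mono signal
sf.write("whoosh_mono.wav", data, sample_rate)
```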
Ableton Live unfortunately cannot render beyond two channels, so Pro Tools is our DAW of choice when it comes to 360 audio. Pro Tools can export mono, stereo, 5.1 and 7.1 surround, and up to third-order ambisonics. We use ambisonics for sounds that are not tied to specific elements in the experience but instead immerse the user in the environment as a whole; examples include forest noises, wind ambiences, and rainfall. We have been using the Facebook Audio 360 plugins for our ambisonic mixing before importing the files into Audiokinetic’s Wwise for game implementation.
We also stick with Pro Tools for any post-production work because of its flexibility in bus routing and group management. We started recording voice over in Pro Tools because of how easily it comps multiple takes; now that Ableton Live 11 has added take comping, we bounce between the two for voice over recording.
When it’s time to start implementing sounds into Unity, we use a middleware tool called Wwise to manage our immersive audio. Up until this point, we have been producing and bouncing our sounds with their destinations in mind. There are three destinations for mixing audio in Wwise: Audio Objects, Ambisonics, and Passthrough elements.
Audio Objects are location-based sounds placed in a 3D environment. Their volume and panning depend on your distance from, and head orientation relative to, the audio object within the game engine. The majority of sounds within a VR experience will be spatialized as audio objects, following distance-attenuation logic like the sketch below.
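As a rough illustration of what the engine does with an audio object's distance, here is a hypothetical inverse-distance gain function. This is not Wwise's actual attenuation model; Wwise lets you draw attenuation curves per sound in its editor.

```python
import math

def object_gain(source_pos, listener_pos, min_dist=1.0, max_dist=30.0):
    """Rough inverse-distance attenuation for an audio object.

    Inside min_dist the sound plays at full volume; beyond max_dist it
    is culled; in between, gain falls about 6 dB per doubling of distance.
    """
    d = math.dist(source_pos, listener_pos)
    if d <= min_dist:
        return 1.0
    if d >= max_dist:
        return 0.0
    return min_dist / d

# Stepping away from a source: full volume at 1 m, half amplitude at 2 m.
print(object_gain((0, 0, 0), (1, 0, 0)))  # 1.0
print(object_gain((0, 0, 0), (2, 0, 0)))  # 0.5 (about -6 dB)
```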
Ambisonics are the 360 audio files that can be used to encapsulate the user. A first-order ambisonic file contains 4 channels, whereas third-order ambisonics are higher in resolution with 16 channels. They can be CPU-intensive on current VR headsets, so it is preferable to use only one or two at a time. Within Wwise, we use a plugin called Resonance Audio to binauralize the ambisonic file down to 2 channels so it can play on a VR headset.
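Those channel counts follow a simple rule: an order-N ambisonic stream carries (N + 1)² channels. A quick check:

```python
def ambisonic_channels(order):
    """An order-N ambisonic stream carries (N + 1)**2 channels."""
    return (order + 1) ** 2

for order in (1, 2, 3):
    print(f"order {order}: {ambisonic_channels(order)} channels")
# order 1: 4 channels, order 2: 9 channels, order 3: 16 channels
```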
Passthrough is for stereo audio that is not dependent on head tracking, such as background music, game VO not tied to an NPC, and some UI elements.
Wwise makes audio development much more efficient because you can begin building your sound structures before the game is playable. Once the game is up and running, you can QA test and mix your elements in Wwise alongside Unity to hear results in real time.
While this is Part Time Evil's workflow for achieving immersive audio, there are multiple ways to reach similar results. Some projects can be handled without middleware, with audio implemented directly in the game engine, and you may find other middleware such as FMOD useful for your project. There are countless SFX libraries worth digging through, and a growing crop of immersive audio plugins is beginning to overshadow the capabilities of Facebook 360 audio.
To grasp this new pipeline of immersive audio, it’s best to start small and solve one problem at a time. Find reference and tutorial videos that will guide you through the process, and don’t be afraid to reach out to other sound designers or developers in the industry. More often than not, we are willing to help!
If you want to know more about our sound design process or discuss an upcoming project or game, drop us a line for a chat or a free quote.