Being There: The Playground of VR Audio


My Big Bang Moment

My first experience with spatialized audio was in 2004, at an art installation. The Paradise Institute by Cardiff and Miller required me, as a gallery participant, to step into a scaled-down model of a movie theatre and put on headphones. The lights went down and the movie started. Then I heard other sounds, coming from the audience around me rather than from the film itself: a cellphone ringing, audience members shifting in their seats, hushed conversations. I turned around… but there was nobody there! The audio illusion had transported me into a virtual space that was completely believable.

The technology used to create that theatrical audio illusion is formally called binaural audio. It relies on clever filtering to create the experience of spatialized sound. It's a powerful tool, but until recently, implementations of binaural audio were static: the illusion only worked when the listener faced a fixed direction, usually a screen.


Then came current virtual reality (VR) technology, which tracks head position dynamically, allowing the audio illusion to be maintained while the listener moves.


So, what is VR exactly?

The VR audio experience uses binaural technology driven by sensor data from VR head-tracking units. The spatialization effect continually updates with real-time changes in head position, so the apparent direction and distance of sounds change realistically as the head moves. This technology is a huge step forward for creating immersive games and stories, and the games that fully exploit it stand out for their immersiveness. And yet VR audio remains underutilized, especially in the indie community. I would like to see it more fully embraced by indie game developers.


I recently participated in a VR panel at a game conference and was totally inspired by the enthusiasm of the indie game dev community for the potential of VR in all types of games, including simulations, experiences, and strategy games. In response, I’ve started a tech demo series in Unity to more fully explore the potential of VR audio. In future posts I’ll share videos to demonstrate my findings. Meanwhile, here are some basic points to know.


Why do you need it?

In VR, visuals are not "locked to the screen," so audio shouldn't be either. Spatialized audio feels right in the VR experience because it works on a subconscious level, going beyond simply hearing: you can suspend disbelief because you feel the presence of characters and environments as you would in the real world.


How does it work, technically?

Big question!

Real-time head-related transfer functions (HRTFs) imitate how we localize sound in the world. Headphones must be worn to decouple our natural HRTFs from the implemented ones. Head-tracking keeps sounds anchored to their virtual sources instead of to the screen; in VR game systems, the tracking data comes from the head-mounted display (HMD).
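To make the head-tracking bookkeeping concrete, here is a minimal sketch in Python (function names are my own, purely for illustration). It subtracts the head's yaw from the source direction, then derives left/right gains with equal-power panning. A real HRTF applies direction-dependent filters per ear, capturing pinna and head-shadow effects, not just gains; this shows only the geometric part.

```python
import math

def relative_azimuth(source_az_deg, head_yaw_deg):
    """Direction of the source relative to where the head is facing.
    0 = straight ahead, positive = to the listener's right.
    Wrapped into [-180, 180) so turning past a source flips its sign."""
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

def simple_pan_gains(rel_az_deg):
    """Crude stand-in for an HRTF: equal-power left/right gains.
    Maps [-90, +90] degrees onto a pan position in [0, 1]."""
    pan = (max(-90.0, min(90.0, rel_az_deg)) + 90.0) / 180.0
    left = math.cos(pan * math.pi / 2)
    right = math.sin(pan * math.pi / 2)
    return left, right

# A source straight ahead (azimuth 0) while the head turns 90° right
# ends up at -90°, i.e. hard left, so the left gain dominates:
rel = relative_azimuth(0.0, 90.0)
left, right = simple_pan_gains(rel)
```

Each frame, the HMD supplies a fresh `head_yaw_deg` and the gains (or, in a real system, the HRTF filter pair) are recomputed, which is what keeps the sound "attached" to its virtual position.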


How is it different from surround sound?

In surround sound, sources are panned around a circular 2D plane, are locked to the positions of the speakers, and are always relative to the screen. VR audio, on the other hand, describes a sphere: spatialization works on the up/down axis as well as front/back, making it far more realistic. This approach is called ambisonics. VR audio also updates with head movement.
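The "sphere" can be sketched with a few lines of math. The Python snippet below (function names are my own) encodes a mono sample into the four channels of first-order FuMa B-format and then rotates the whole sound field to compensate head yaw. Note that yaw rotation touches only the X and Y channels; this cheap whole-field rotation is one reason ambisonics pairs so naturally with head-tracking.

```python
import math

def encode_bformat(sample, az_deg, el_deg=0.0):
    """First-order ambisonic (FuMa B-format) encoding of a mono sample.
    Azimuth: 0 = front, positive = counter-clockwise (listener's left)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    w = sample / math.sqrt(2)                  # W: omnidirectional
    x = sample * math.cos(az) * math.cos(el)   # X: front/back
    y = sample * math.sin(az) * math.cos(el)   # Y: left/right
    z = sample * math.sin(el)                  # Z: up/down
    return (w, x, y, z)

def rotate_yaw(bformat, yaw_deg):
    """Rotate the sound field to compensate head yaw.
    Only X and Y mix; W and Z are unchanged."""
    w, x, y, z = bformat
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return (w, c * x + s * y, -s * x + c * y, z)

# A source dead ahead, then the listener turns 90° to the left:
# the rotated field places it at the listener's right (Y goes to -1).
front = encode_bformat(1.0, 0.0)
turned = rotate_yaw(front, 90.0)
```

After rotation, a decoder (binaural over headphones in VR) turns the four channels back into ear signals; the encode/rotate/decode split is what the plugins handle for you.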


What equipment do I need?

You need a spatialization plugin for your game engine; for example, the freely downloadable Oculus spatializer plugin for Unity works well. If you're using ambisonics, you will need either an ambisonic mic to capture your audio or a plugin to output B-format files from your digital audio workstation (DAW).


Why Now?

Binaural and ambisonic techniques have been around for some time in experimental circles, but not in the mainstream. In the past, an over-emphasis on the science and the requirement for headphones made them seem gimmicky. But now, with the popular acceptance of VR visuals and of head-mounted displays, VR audio can become the norm.


How does it affect storytelling?

VR audio excels at first-person experiences. The player is more connected than ever to the first-person character. While sound cues have always been used to direct the player’s attention, the directionality of VR audio provides much greater precision and control for story-driven games.  I’m especially interested in the layering of subtle first-person sounds as a way to embody character.  Although VR may dispense with some traditional audio conventions, such as 2D sounding musical scores, it brings opportunities for new narrative tools for game designers. The possibilities are exciting!


How does it affect the audio team?

Most audio clips will be rendered as mono. Environmental sounds such as traffic, crowds, and weather will need to be built from many sources and spatialized separately to create a multi-dimensional soundscape. Binaural and ambisonic plugins need to be used for monitoring when making mix decisions. Some sounds spatialize better than others, so trial and error will likely be required.
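As a sketch of what "spatialized separately" means in practice, the Python snippet below (names and rolloff choice are my own, for illustration) computes a per-source distance gain and relative direction for each mono clip in a soundscape; the engine's spatializer would then filter each one individually. Inverse-distance rolloff is a common default, but engines typically offer several attenuation curves.

```python
import math

def source_params(src_pos, listener_pos, listener_yaw_deg,
                  ref_dist=1.0, max_gain=1.0):
    """Distance gain and relative azimuth for one mono source.
    Positions are (x, z) on the ground plane; +z is 'forward'."""
    dx = src_pos[0] - listener_pos[0]
    dz = src_pos[1] - listener_pos[1]
    dist = math.hypot(dx, dz)
    # Inverse-distance rolloff, clamped so nearby sources don't blow up.
    gain = min(max_gain, ref_dist / max(dist, ref_dist))
    # World azimuth (0 = straight ahead), then made head-relative.
    az = math.degrees(math.atan2(dx, dz))
    rel_az = (az - listener_yaw_deg + 180.0) % 360.0 - 180.0
    return gain, rel_az

# Layering several mono sources into one soundscape description:
sources = {"traffic": (10.0, 0.0), "crowd": (0.0, 5.0), "rain": (0.0, 1.0)}
mix = {name: source_params(pos, (0.0, 0.0), 0.0)
       for name, pos in sources.items()}
```

Because every clip carries its own gain and direction, the traffic bed can sit quietly off to one side while the rain stays close and centered, which is exactly the multi-dimensional layering the mix stage is after.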


Is it perfect?

No. Environmental reverb is not quite there yet, and it's not clear how best to handle 2D sounds. There can also be issues when spatialized sounds move too fast. But things are improving quickly, thanks to growing interest and a lot of great ongoing research and development.


How do I get started?

Oculus offers a complete pipeline spanning production, middleware, and game engines, and it covers the popular audio and game tools. It also sounds very good!


Being there (right now)

Audio innovations are happening all the time, and it's exciting! Companies like Oculus are making the technology truly accessible. If you haven't experienced VR audio yet, there has never been a better time than right now.