What makes audio 'immersive'?

December 20, 2022

Let’s begin by defining the word ‘immersive’. The definition I like most, because it makes it easy to imagine how the word applies to audio, is:

“Providing, involving, or characterized by deep absorption or immersion in something (such as an activity or a real or artificial environment)”

Techniques for bringing immersive experiences to people have typically been visual, such as cinema and, more recently, Virtual Reality (VR); Star Trek and Avatar are wonderfully crafted examples of this.

However, creating immersive experiences solely through audio has only recently started to become popular in the mainstream and accepted as a form of virtual entertainment. That’s not to say the concept of immersive audio (also referred to as ‘Spatial’, ‘3D’ or ‘8D’) is new; far from it. In fact, one of the first attempts (to my knowledge) at creating a hyper-realistic, immersive audio experience was the Virtual Barber Shop, produced in 1996.

When listening to an immersive audio production such as this, the listener can’t help but be amazed by what they’re hearing through a set of headphones. The only other time they usually hear multi-dimensional sounds like this is in real life. Hence the immersive nature of this audio: it makes you feel like you’re actually in the scenario the sound is describing.

There are various levels to creating immersive audio. The most basic is simply moving the sounds from left to right (panning), which gives the listener a sense that the sound is alive and that they are in the middle of the movement.
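As a rough illustration of left-to-right panning, here is a minimal Python sketch using constant-power panning, one common pan law among several. The function names `constant_power_pan` and `pan_sweep` are made up for this example, and nothing here is meant to describe how any particular production was made.

```python
import math

def constant_power_pan(pan):
    """Return (left_gain, right_gain) for a pan position.

    pan = -1.0 is hard left, 0.0 is centre, +1.0 is hard right.
    The gains follow a quarter circle, so left**2 + right**2 stays at 1
    and the perceived loudness stays roughly even as the sound moves.
    """
    pan = max(-1.0, min(1.0, pan))       # clamp out-of-range input
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)

def pan_sweep(duration_s, steps):
    """Gains for a sound gliding from hard left to hard right.

    Returns a list of (time_s, left_gain, right_gain) tuples.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    frames = []
    for i in range(steps):
        fraction = i / (steps - 1)       # 0.0 at the start, 1.0 at the end
        pan = -1.0 + 2.0 * fraction
        left, right = constant_power_pan(pan)
        frames.append((duration_s * fraction, left, right))
    return frames

if __name__ == "__main__":
    for time_s, left, right in pan_sweep(duration_s=2.0, steps=5):
        print(f"t={time_s:.1f}s  left={left:.3f}  right={right:.3f}")
```

Multiplying each channel of a mono sound by these gains over time moves it across the stereo field. A plain linear crossfade would also work, but it tends to sound quieter in the middle, which is why the constant-power curve is assumed here.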

This can be further enhanced by varying the distance of the sounds from the listener’s ears, which can create a sense that the listener is right in the thick of the scenario being depicted: for example, a whisper in the ear followed by a gunshot in the distance. Combined with simple left-to-right panning, this can already create a sense of immersive wonder.
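One simple way to approximate distance is to make a sound quieter the further away it is. The sketch below assumes the inverse-distance law (level drops by about 6 dB each time the distance doubles); the names `distance_gain` and `gain_to_db`, the reference distance and the example distances are all illustrative. Convincing distance usually also relies on other cues, such as reverberation and duller high frequencies, which are left out here.

```python
import math

def distance_gain(distance_m, reference_m=1.0):
    """Gain for a source at distance_m, using the inverse-distance law.

    At or inside reference_m the gain is 1.0 (no boost for very close
    sounds); beyond it the gain falls off as reference_m / distance_m.
    """
    if reference_m <= 0:
        raise ValueError("reference_m must be positive")
    distance_m = max(distance_m, reference_m)
    return reference_m / distance_m

def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)

if __name__ == "__main__":
    # Hypothetical scene: a close whisper, then a far-off gunshot.
    for label, distance_m in [("whisper", 1.0), ("gunshot", 50.0)]:
        gain = distance_gain(distance_m)
        print(f"{label}: gain={gain:.3f} ({gain_to_db(gain):.1f} dB)")
```

The distance gain can be multiplied with the left and right panning gains, so a single sound can be placed both to one side and far away.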

The third level is sound rotation, where the sounds don’t just move from left to right but appear all around the listener: in front, behind, top left, top right and so on. This technique creates a sense that the listener is not stationary, that they have truly embodied the protagonist in the scenario being brought to life.
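To give a feel for rotation, the sketch below moves a source in a circle around the listener and estimates how much earlier the sound reaches one ear than the other, using Woodworth's spherical-head approximation. This is a simplification chosen for illustration, not a description of how any specific production works: the head radius, the speed of sound and the function names `orbit_azimuth` and `interaural_time_difference` are assumptions, and height (top left, top right) is ignored.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air

def orbit_azimuth(time_s, period_s):
    """Azimuth in radians of a source circling the listener clockwise.

    0 is straight ahead, pi/2 is directly right, pi is directly behind.
    One full circle takes period_s seconds.
    """
    return 2.0 * math.pi * ((time_s / period_s) % 1.0)

def interaural_time_difference(azimuth_rad):
    """Approximate arrival-time gap between the ears, in seconds.

    Positive values mean the sound reaches the right ear first.
    Front and back positions are folded onto the same lateral angle,
    so a source at 45 degrees and one at 135 degrees give the same value.
    """
    lateral = math.asin(math.sin(azimuth_rad))  # fold into [-pi/2, pi/2]
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (lateral + math.sin(lateral))

if __name__ == "__main__":
    for time_s in [0.0, 1.0, 2.0, 3.0, 4.0]:
        azimuth = orbit_azimuth(time_s, period_s=8.0)
        itd_us = interaural_time_difference(azimuth) * 1e6
        print(f"t={time_s:.0f}s  azimuth={math.degrees(azimuth):5.1f} deg  ITD={itd_us:7.1f} us")
```

Because the time gap alone is the same for a source in front and its mirror image behind, it cannot tell the two apart. Spatial audio tools typically deal with this by also filtering the sound the way the outer ear and head would (head-related transfer functions), which is well beyond this sketch.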

The final level is storyline and relatability: creating a narrative and taking the listener on a progressive journey through the immersive audio experience. This is the hardest aspect to get right, as it requires careful planning and pushing the boundaries of human imagination. Relatability is powerful because a listener who has already experienced the scenario for real has a greater chance of being immersed, as they know what to look for and expect. So it’s up to the creators to surprise them whilst at the same time meeting those expectations to create a sense of authenticity.