Question about imaging and width

In broadcast I’ve had to be very mindful of how the mix folds to mono.

Immersive audio is really only meant to be deployed on headphones. It would be rare for effects that call for a heightened sense of width to ever be played back from a mono source.

Does anyone see any reason not to just go ahead and make stuff like water-particle sound effects or rocks crumbling in an ancient tomb extra wide, even though it could create some minor phasing problems if folded to mono?
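To make the worst case concrete, here’s a minimal sketch (Python/numpy, purely illustrative) of what the phasing concern actually is: a pair widened with out-of-phase content sounds huge on headphones but can cancel almost completely when summed to mono.

```python
# Minimal sketch (assumes numpy) of why a decorrelated "extra wide" pair
# can lose energy when folded to mono. Everything here is illustrative.
import numpy as np

sr = 48_000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

# A crude widening trick: put an inverted copy in the opposite channel.
# Sounds enormous on headphones...
left = tone
right = -tone  # worst case: fully out of phase

mono = 0.5 * (left + right)  # simple mono fold-down

print("stereo RMS:", np.sqrt(np.mean(left**2)))  # ~0.707
print("mono RMS:  ", np.sqrt(np.mean(mono**2)))  # 0.0 -- the sound vanishes
```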


I can’t imagine immersive audio, VR, game audio and these types of sounds / soundtracks ever being played on a Bluetooth speaker! I had the occasion to play some VR games last year on a PS4, and while flying the Millennium Falcon(!), the thing that really got me going was the craziness of the sounds involved. I had explosions behind me, people talking to me in one ear, and all manner of cues coming from all directions. Having played a game like that, I can’t imagine why I would ever want to experience it any other way.

I know… I totally agree. So I’m wondering if there’s a compelling reason to be stingy with the HRTF implementation and spatial width if you don’t HAVE to worry about it folding to mono?

It really depends on how you go about doing it. With sounds that aren’t musical, you can get away with more aggressive widening techniques without the negative side effects.
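For example, a Haas-style widener (just one technique; this sketch is illustrative, not any engine’s actual implementation) delays one channel by a few milliseconds. Folded to mono, that delay becomes a comb filter, but on noise-like material such as water or crumbling rock the combing is much harder to hear than on tonal material.

```python
# Hypothetical Haas-style widener: a short interchannel delay widens the
# image on headphones; the mono fold-down turns it into a comb filter,
# which noise-like sources tolerate far better than tonal ones.
import numpy as np

def haas_widen(x, sr, delay_ms=12.0):
    """Return (left, right) with the right channel delayed by delay_ms."""
    d = int(sr * delay_ms / 1000.0)
    left = x
    right = np.concatenate([np.zeros(d), x[:-d]])
    return left, right

sr = 48_000
rubble = np.random.randn(sr)      # stand-in for a rocks-crumbling layer
left, right = haas_widen(rubble, sr)
mono = 0.5 * (left + right)       # the comb filtering lives here
```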

I personally think I would prefer to keep realism as high as possible in headphones, and let the people playing on crappy systems deal with the downsides. I don’t know much about video game audio engines, but don’t they sort of handle all the panning and spatial effects live?
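From what I understand (a big simplification, and these function names are made up, not any engine’s real API), the “live” part looks something like this: every frame, the engine derives an azimuth from the emitter’s position relative to the listener and computes pan gains on the fly.

```python
# Rough sketch of per-frame "live" panning. Real engines also do HRTF,
# distance attenuation, occlusion, etc. -- this only shows the basic
# equal-power idea.
import math

def equal_power_pan(listener_pos, listener_fwd, emitter_pos):
    """Return (gain_l, gain_r) from a simple horizontal azimuth.

    Positions and the forward vector are (x, z) pairs on the horizontal plane.
    """
    dx = emitter_pos[0] - listener_pos[0]
    dz = emitter_pos[1] - listener_pos[1]
    az = math.atan2(dx, dz) - math.atan2(listener_fwd[0], listener_fwd[1])
    pan = max(-1.0, min(1.0, math.sin(az)))  # -1 = hard left, +1 = hard right
    theta = (pan + 1.0) * math.pi / 4.0      # map to 0..pi/2
    return math.cos(theta), math.sin(theta)

# Emitter directly to the listener's right -> nearly all right channel
print(equal_power_pan((0.0, 0.0), (0.0, 1.0), (5.0, 0.0)))
```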

So in VR audio, I think you can oversimplify the dilemma by asserting that sounds exist in one of three contexts (see the quick sketch after this list). They’re either:

static - meaning they’re fixed to an object in the environment (if you kick in a door, you hear the effect of busting the lock, and that sound’s position stays fixed on the door)

dynamic - meaning they respond to your movement on the x/y/z axes (if you’re holding a torch while in your VR headset, it sounds different held high in the air vs. held behind your back)

or non-diegetic - meaning they’re not associated with anything in the playable environment (like the background music)
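Here’s a toy sketch of that three-way split as a routing decision (every name here is invented for illustration, not a real engine API):

```python
# Illustrative routing by sound context -- only sounds with a position in
# the world go through the engine's 3D panner / HRTF stage.
from enum import Enum, auto

class SoundContext(Enum):
    STATIC = auto()        # fixed to an object (the busted door lock)
    DYNAMIC = auto()       # follows a tracked object (the hand-held torch)
    NON_DIEGETIC = auto()  # no world position (the background music)

def needs_3d_panning(context: SoundContext) -> bool:
    return context is not SoundContext.NON_DIEGETIC

for ctx in SoundContext:
    print(ctx.name, "-> 3D panned:", needs_3d_panning(ctx))
```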

The game engine handles panning based on which of these three contexts the sound exists in. But here’s my question - even with music, if you make a string VI sound really, really super wide, to the point it would cause imaging issues on a mono Bluetooth speaker, who cares, so long as it sounds appropriate while your headphones are on? Since that’s the playback medium it’s designed for… (remember that the sound for most VR experiences is typically designed to be used with headphones)
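For what it’s worth, if the widening is done mid/side style, the mono fold-down is mathematically just the mid signal, so cranking the width doesn’t so much break mono as make the widened content disappear from it. A rough sketch (assuming numpy; not anyone’s actual plugin code):

```python
# Mid/side widening: width > 1 exaggerates the sides. The mono fold of the
# result is exactly `mid`, so the extra width simply vanishes in mono:
#   0.5 * ((mid + side) + (mid - side)) == mid
import numpy as np

def widen_ms(left, right, width=2.0):
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right) * width
    return mid + side, mid - side  # new (left, right)
```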

Again, I’m trying to figure out if you really even have to worry about whether it translates to mono or not.

I don’t know best practices, so my thoughts don’t carry much weight, but I would guess that the sound in the headphones trumps all when it comes to VR. Unless something was really breaking the mono mix, I wouldn’t care much about it. But I do think it’s worth making sure nothing important disappears in mono.


Jonathan, at this point you’re talking about “degrees of tolerance”. In manufacturing, they determine a measurable “tolerance” for quality control, so anyone working on the floor can take a ruler or micrometer and say “good” or “bad” (for, say, the length of a metal part used in a frame). See if there is some sort of quality standard in the industry, or create your own. If you create your own, you can always revise it later based on feedback. Define “percentage of mono degradation to the mix” by whatever metric or parameter you can. Then the decision kind of makes itself. :slightly_smiling_face:
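To make that concrete, here’s one possible metric (a sketch only; both the formula and the -3 dB tolerance are arbitrary starting points to be revised by ear): measure how many dB the mix loses when summed to mono, then pass/fail against the tolerance.

```python
# Hypothetical mono-degradation QC check. Numbers are placeholders.
import numpy as np

def mono_fold_loss_db(left, right):
    """dB of level lost when the stereo mix is summed to mono."""
    stereo_rms = np.sqrt(np.mean(left**2 + right**2) / 2.0)
    mono_rms = np.sqrt(np.mean((0.5 * (left + right))**2))
    return 20.0 * np.log10(mono_rms / stereo_rms + 1e-12)

def passes_qc(left, right, tolerance_db=-3.0):
    return mono_fold_loss_db(left, right) >= tolerance_db

x = np.random.randn(48_000)
print(passes_qc(x, x))   # correlated content: ~0 dB loss -> True
print(passes_qc(x, -x))  # fully out of phase: huge loss -> False
```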
