Over the past few weeks, I've published several posts about the development of my project, the literature review I've done, app examples, and so on. In some cases, I had to use concepts that were too complex to explain within a single blog post, so I summarised them heavily while promising "I'll write another post about that". Well, that post has arrived! I hope you could follow the idea of those posts even without an in-depth explanation of the concepts. However, if you felt you needed to know a bit more about the world of ambisonics, today's post is for you.
Obviously, it is impossible to explain all the background theory of ambisonics here, but I'll at least try to describe the concepts that I think matter most for understanding this particular project. Are you ready to go?
Ambisonics is a method for recording, mixing and playing back spatial audio that was invented in the 1970s. It has recently become more widely used with the popularisation of VR content, since it is flexible and format-agnostic (the same sound field can be decoded to different speaker layouts or to headphones). "Too much information in a couple of sentences!" Ok, no worries... to understand it better, it's worth starting from the basic spatial audio configuration: stereo.
Stereo is the fundamental spatial audio configuration, since it is a 2D representation of the frontal image. For a complete 2D representation, we would have to use a quadraphonic, 5.1 or 7.1 format, which also covers the back and the sides. All these configurations use as many channels as there are speakers: stereo has 2 channels and uses 2 speakers, 5.1 has 6 channels and uses 6 speakers, and so on. Also, each channel is reproduced by its own speaker. Ambisonics works differently. It uses spherical harmonics (a mathematical concept that I'm not going to explain now; feel free to Google it to learn more) to store the spatial information of a sound field. The basic ambisonics configuration uses 4 channels: W (an omnidirectional component) plus X, Y and Z (three figure-of-eight components along the front-back, left-right and up-down axes).
These four channels hold all the information needed to recreate a complete three-dimensional sound field. However, having 4 channels doesn't mean that each channel goes to its own speaker. We would need at least 4 speakers, but each speaker reproduces a combination of the four channels. That's why we need a decoder to generate the signal that each speaker has to play back.
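To make the "combination of the four channels" idea a bit more concrete, here is a minimal sketch of one very simple decoder. It is only an illustration under my own assumptions, not the decoder used in this project: W is taken as the omnidirectional channel with no extra scaling, X, Y and Z point front, left and up, and each speaker feed is computed as a virtual cardioid microphone aimed at that speaker. The function name decode_to_speakers and the square layout are hypothetical, and the square only covers the horizontal plane to keep the numbers easy to read.

```python
import math

def decode_to_speakers(w, x, y, z, speaker_directions):
    """Mix one first-order sample (w, x, y, z) into one feed per speaker.

    speaker_directions is a list of (azimuth, elevation) pairs in radians,
    with azimuth measured anticlockwise from straight ahead.
    """
    feeds = []
    for azimuth, elevation in speaker_directions:
        # Unit vector pointing from the listener towards this speaker.
        dx = math.cos(azimuth) * math.cos(elevation)
        dy = math.sin(azimuth) * math.cos(elevation)
        dz = math.sin(elevation)
        # Assumed decoding rule: a virtual cardioid aimed at the speaker.
        feeds.append(0.5 * (w + x * dx + y * dy + z * dz))
    return feeds

# Hypothetical square layout: front-left, rear-left, rear-right, front-right.
square = [(math.radians(a), 0.0) for a in (45, 135, -135, -45)]

# One sample of a sound field whose only source is straight ahead.
feeds = decode_to_speakers(1.0, 1.0, 0.0, 0.0, square)
print([round(f, 3) for f in feeds])  # [0.854, 0.146, 0.146, 0.854]
```

Every speaker receives a different mix of the same four channels: the two front speakers get most of the signal and the two rear ones only a little. Real decoders use more careful weightings, but the principle is the same.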
So, I've explained how ambisonics works once the sound field is already in ambisonics format, but what about placing a mono source in the ambisonics domain? In that case, we need an encoder, which uses mathematical formulas to add our mono source to each of the four channels. How much of the source goes into each channel always depends on the position of the mono source.
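As a rough illustration of what such an encoder does, here is a minimal first-order sketch. The conventions are assumptions on my side: azimuth is measured anticlockwise from straight ahead, elevation upwards from the horizontal plane, and W carries the unscaled source (some formats attenuate W or order the channels differently). The name encode_first_order is hypothetical.

```python
import math

def encode_first_order(sample, azimuth, elevation):
    """Encode one mono sample at (azimuth, elevation), both in radians.

    Returns the four first-order channels as a (w, x, y, z) tuple.
    """
    w = sample                                            # omnidirectional part
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front-back
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left-right
    z = sample * math.sin(elevation)                      # up-down
    return w, x, y, z

# The same sample placed at three different positions.
for name, az, el in [("front", 0, 0), ("left", 90, 0), ("above", 0, 90)]:
    channels = encode_first_order(1.0, math.radians(az), math.radians(el))
    print(name, [round(c, 3) for c in channels])
```

The omnidirectional channel always receives the full source, while the share that goes into the other three changes with the position: a source in front only feeds X, one on the left only feeds Y, and one overhead only feeds Z.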
"In a previous post you mention ambisonic orders... what does it mean?" The four-channel example that I just explained is called 1st order ambisonics, and it is the minimum order needed to obtain a 3D representation of the sound field. These four channels are the first spherical harmonics but, as we can see in the image above, there are much more spherical harmonics that we can add. To increase the ambisonic order, a further layer of the pyramidal structure must be added each time. Therefore, 2OA has 8ch, 3OA 16ch, etc. "But... why?" As you can see in the image, the higher the order, the more channels of information there are. Each channel contributes more information about the sound field, meaning that the encoding/decoding process will be more precise.
Do you need a recap? This video from Waves explains 1OA in a very simple way:
Virtual loudspeaker technique
Ambisonics can be reproduced using a speaker array, but the speakers must be distributed as evenly as possible, perfectly calibrated, placed in an acoustically controlled room, and so on. As you can imagine, this is a really difficult setup to have at home, and only some studios or research centres have one. The other way to listen to an ambisonics sound field is the virtual loudspeaker technique, where a speaker array is simulated and reproduced over headphones.
"Wait... did you say: virtual speakers through headphones?" Indeed, let's imagine that we have a real speaker array like the one behind me in the picture. Here we can see that there are some speakers almost evenly distributed in a sphere. Using this array, we could reproduce an ambisonics sound field, right? Ok, do you remember last week's post? When I was talking about the Dummy Head? In this picture, I'm with my friend KEMAR, another kind of Dummy Head that also includes the torso. As I said last week, if we reproduce a sound in a certain point of the space, we record it through the dummy head ears, and then we play it back through a pair of headphones, we perceive that the sound is coming from the original position. In addition, we can extract the filter of that position and create a Head Related Transfer Function (HRTF), which we can apply it later to any non-processed sound and place it at that point. The result is a 2ch file that must be listened to using headphones to perceive it correctly, and this is called binaural.
"And then... what relates the Dummy Head, the HRTFs and Ambisonics?" Let's have a look at the following diagram. To reproduce an Ambisonics sound field using a speaker array, we would need: the Ambisonics sound field, a decoder (to create the different speaker signals) and the speaker array to reproduce them. To reproduce an Ambisonics sound field through headphones, on the other hand, the first stages will be the same. We need to decode the Ambisonics sound field to speaker signals but, instead of directly reproduce these signals, we'll place them to the virtual position (where that speaker should be placed in the real world) using the corresponding HRTF. Finally, we'll add all the resulting binaural signals, and we'll get a 2ch file (to be listened to using headphones). While listening to this file, the perception of the sound field will be the same as we were in the middle of the speaker array: a virtual loudspeaker array.
I assume that, if you are entirely new to the topic, the information given in this post alone might not be enough. Ambisonics is a huge topic, and HRTFs, which I mentioned but did not explain in depth, are another huge one. Therefore, here are some links that might help you better understand spatial audio and ambisonics:
This blog aims to share the progress of my Master's thesis project with anyone who might be interested. Maybe you are interested in the whole project, perhaps in a post on a particular topic or, why not, maybe you are a friend of mine and you just want to keep in touch with me during these busy days (sorry about that... I'll be back soon!). Whatever the reason, you are very welcome to read as many posts as you like and, of course, feel free to comment, discuss or ask anything you want!
I’m currently researching new DAW-based tools for spatial audio creation, using gesture control with smartphones. This project has Abbey Road as an industry partner, and it is supervised by Dr Gavin Kearney and Prof Andy Hunt.
I am passionate about Audio and Music Technology, and I really love the world of sound, audio and music. After finishing a Bachelor's degree in Sonology and working for two years in the media industry, I decided to move towards a more scientific and engineering-based career. That is why I am currently studying for an MSc in Audio and Music Technology at the University of York, where I discovered my passion for academic research, especially in the field of Spatial Audio and Virtual Acoustics.