Apple's ARKit is a development platform that lets developers create highly detailed Augmented Reality (AR) experiences for iPhone and iPad. It was introduced with the release of iOS 11, and it runs on the iPhone 6s and later, and on iPad Pro models or iPads released in 2017 or later.
ARKit uses the iPhone's camera and motion sensors (a combination of sensors such as the gyroscope, the accelerometer and the magnetometer). By comparing the images received from the camera with the movements of the phone, it can map your environment, recognise where the walls and floor are, and establish a basic geometry of the space. With this, you can place virtual objects in your room and see them through the camera as if they were real. Do you remember Pokémon Go, where virtual Pokémon characters appear on your phone's camera, interacting with your environment? That is AR, and frameworks such as ARKit (and Google's equivalent for Android, of course...) make this kind of experience possible to develop.
In my first tests, I was trying to familiarise myself with the ARKit functions and get some quick results and ideas of what I could do for the project. Specifically, I was testing the world tracking feature, which identifies feature points in the environment and tracks them across frames to obtain position data. As a result, you can find out where your phone is at any moment using a built-in function that gives you the X, Y and Z coordinates (in metres) relative to a starting point.
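For readers who want to see what "getting the X, Y and Z coordinates" can look like, here is a minimal sketch in Python. It assumes the position is read from a 4x4 camera transform whose fourth column holds the translation, which is how ARKit's camera transform is laid out. The function name and the sample matrix are made up for illustration, and this is not necessarily the call used in the app.

```python
def position_from_transform(columns):
    """Return (x, y, z) in metres from a 4x4 transform given as four columns.

    Assumption: the matrix is column-major and the fourth column holds the
    translation relative to the point where tracking started.
    """
    if len(columns) != 4 or any(len(column) != 4 for column in columns):
        raise ValueError("expected four columns of four values each")
    x, y, z, _w = columns[3]
    return (x, y, z)


# Hypothetical transform: no rotation, phone moved 0.5 m right,
# 1.2 m up and 2 m forward (forward is -z in this convention).
sample_transform = [
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
    (0.5, 1.2, -2.0, 1.0),
]

print(position_from_transform(sample_transform))
```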
Just with a simple command to get this information, an OSC framework, and a bit of imagination (that's always necessary...), I created a Virtual Dummy Head app:
virtual dummy head
In order to test the AR capabilities that I explained before, I developed a test app to create a virtual dummy head. "Hold on... what do you mean by a dummy head?" If you aren't familiar with audio terminology, it might sound weird, that's true. A dummy head is a life-size model of a human head with a really accurate reproduction of the pinnae (the external part of the ears). There is a microphone inside each ear, so the resulting recording lets the listener perceive the sound as the mannequin would perceive it. This includes localisation in the vertical and horizontal planes, as well as distance perception.
What I wanted to do with this test app was to place a virtual dummy head somewhere in the room, and be able to move around it while tracking my position relative to it. To do this, I downloaded a mannequin head model from Sketchfab, and I used some basic trigonometry to get the azimuth and elevation angles. As you can see in the following video, the result is quite precise, even with fast movements.
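The "basic trigonometry" could look something like the sketch below. It is only an illustration under a few assumptions that are not taken from the app: positions are (x, y, z) in metres with y pointing up, the head is not rotated and looks along the negative z axis, and azimuth grows towards the listener's left (the usual Ambisonics convention). The function name is hypothetical.

```python
import math


def relative_polar(head_position, phone_position):
    """Return (azimuth_deg, elevation_deg, distance_m) of the phone seen from the head.

    Assumptions: y is up, the head faces -z and is not rotated,
    azimuth is 0 straight ahead and positive towards the left.
    """
    dx = phone_position[0] - head_position[0]
    dy = phone_position[1] - head_position[1]
    dz = phone_position[2] - head_position[2]

    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        # The angles are undefined when both positions coincide.
        return (0.0, 0.0, 0.0)

    horizontal = math.hypot(dx, dz)  # length of the projection on the floor plane
    azimuth = math.degrees(math.atan2(-dx, -dz))
    elevation = math.degrees(math.atan2(dy, horizontal))
    return (azimuth, elevation, distance)


# Phone one metre to the left of the head, at ear height.
print(relative_polar((0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)))
```

If the virtual head is rotated in the scene, the difference vector would first need to be expressed in the head's own coordinate frame before taking the angles.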
This Virtual Dummy Head looks cool, doesn't it? But how can we use these values to manipulate sound? And... how can we simulate the microphones that would be inside a real Dummy Head? As I said in the last post, a Max MSP package can be really useful for tests like this. Specifically, I used the Ambisonics Externals for Max MSP created by the Zurich University of the Arts.
The iPhone app sends Open Sound Control (OSC) messages via WiFi ten times per second. These messages contain the azimuth, elevation and distance of the phone at that precise moment. Then, a Max MSP patch receives these messages and passes the values to an Ambisonics encoder object, which places a sound source at that position. Finally, an Ambisonics decoder produces the feeds for 50 virtual loudspeakers, almost evenly distributed. "But you said that it would be a simulation of a real Dummy Head, and therefore, it would be reproduced through headphones, right?" Yes, keep reading...
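To give an idea of what travels over the network, here is a minimal sketch of how one OSC message with three float arguments is laid out in bytes, following the OSC 1.0 format (null-terminated strings padded to a multiple of four bytes, then big-endian 32-bit floats). The address `/head/position` and the argument order are assumptions for illustration, not the ones used by the app.

```python
import struct


def osc_string(text):
    """Encode text as an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_message(address, *floats):
    """Build an OSC message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(floats)
    payload = b"".join(struct.pack(">f", value) for value in floats)
    return osc_string(address) + osc_string(type_tags) + payload


# Hypothetical message: azimuth 90 degrees, elevation 0 degrees, distance 1.5 m.
message = osc_message("/head/position", 90.0, 0.0, 1.5)
print(len(message), message)
```

OSC packets like this one are typically sent as UDP datagrams, so sending ten of them per second is a very light load for a WiFi network.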
After the decoding process, the 50 virtual speaker feeds are convolved with the Head Related Transfer Functions (HRTFs) of a real KU100 dummy head, measured during the SADIE II project. I would need another post to explain properly what an HRTF is and how the virtual loudspeaker technique works. For now, just imagine that a sound is played from different angles and captured with a real KU100. Knowing how the dummy head perceives this sound at a particular angle, we can extract a filter and apply it to another sound, to simulate that this new sound is placed at the position of the first one. (No worries if you haven't understood anything in these last paragraphs, I'll publish another post soon explaining it slowly.)
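As a rough sketch of the virtual loudspeaker idea: each speaker feed is convolved with the left-ear and right-ear impulse responses measured for that speaker's direction, and the results are summed per ear. The toy example below uses two speakers and made-up impulse responses instead of 50 speakers and real KU100 measurements, and a slow direct convolution instead of the fast convolution a real-time patch would need. All names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct (slow) linear convolution of two lists of samples."""
    out = [0.0] * max(len(signal) + len(impulse_response) - 1, 0)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def mix(a, b):
    """Sum two signals sample by sample, padding the shorter one with zeros."""
    length = max(len(a), len(b))
    return [
        (a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
        for i in range(length)
    ]


def render_binaural(speaker_feeds, hrirs):
    """Return (left, right) headphone signals.

    speaker_feeds: one list of samples per virtual loudspeaker.
    hrirs: one (left_ir, right_ir) pair per virtual loudspeaker.
    """
    if len(speaker_feeds) != len(hrirs):
        raise ValueError("need one impulse response pair per virtual loudspeaker")
    left, right = [], []
    for feed, (left_ir, right_ir) in zip(speaker_feeds, hrirs):
        left = mix(left, convolve(feed, left_ir))
        right = mix(right, convolve(feed, right_ir))
    return left, right


# Toy data: two speakers, two-sample feeds, two-tap impulse responses.
feeds = [[1.0, 0.0], [0.0, 1.0]]
toy_hrirs = [([1.0, 0.5], [0.2, 0.0]), ([0.0, 1.0], [1.0, 0.5])]
print(render_binaural(feeds, toy_hrirs))
```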
To summarise: the iPhone app sends OSC messages to a Max patch, where an Ambisonics encoder encodes a sound source using the received position values. Then, the Ambisonics signals are decoded to virtual loudspeaker feeds and convolved with the HRTFs of a real KU100. Therefore, the resulting sound is virtually the same as if a KU100 were there!
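If you are curious about what "encoding a sound source" means, the simplest case is first-order Ambisonics, where one sample is weighted by a few trigonometric gains that depend on the direction. The sketch below uses the AmbiX convention (channel order W, Y, Z, X with SN3D normalisation) as an assumption. The encoder object in the Max patch may well use a higher order and a different convention, and distance is ignored here.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order Ambisonics (W, Y, Z, X).

    Assumptions: AmbiX channel order and SN3D normalisation, azimuth
    positive towards the left, elevation positive upwards.
    """
    azimuth = math.radians(azimuth_deg)
    elevation = math.radians(elevation_deg)
    w = sample
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    return (w, y, z, x)


# A source straight ahead only excites the W and X channels.
print(encode_first_order(1.0, 0.0, 0.0))
```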
Ready for the demo? Put your headphones on and have a look at the following video then!
This blog aims to share the progress of my Master's thesis project with anyone who might be interested. Maybe someone is interested in the whole project, perhaps someone else is interested in a post on a particular topic or, why not, maybe you are a friend of mine and you just want to keep in touch with me during these busy days (sorry about that... I'll be back soon!). Whatever the reason, you are very welcome to read as many posts as you like and, of course, feel free to comment, discuss or ask anything you want!
I'm currently researching new DAW-based tools for Spatial Audio creation, using gesture control with smartphones. This project has Abbey Road as an industry partner, and it is supervised by Dr Gavin Kearney and Prof Andy Hunt.
I am passionate about Audio and Music Technology, and I really love the world of sound, audio and music. After finishing a Bachelor's degree in Sonology and working for two years in the media industry, I decided to move towards a more scientific and engineering-based career. That is why I am currently studying for an MSc in Audio and Music Technology at the University of York, where I discovered my passion for academic research, especially in the field of Spatial Audio and Virtual Acoustics.