ADRIA M. CASSORLA


master thesis

DAW based spatial audio creation using mobile phones

Exploring existing techniques

10/5/2019


 
Since the last post, I've been immersed in the first stage of the project: research and literature review. Controlling spatial audio is not a new topic, so I wanted to analyse the methods already on the market. Over these days, I've reviewed several projects and products, from software to hardware, and even some that use mid-air gesture control or... smartphones!

Let's have a look at what I found:

software

Controlling spatial audio through software is probably the most widespread method at the moment. There are lots of open-source plug-ins as well as paid packages that offer a relatively easy way to create and control spatial audio. As my project aims to create a DAW-based tool, I've been focusing mainly on plug-ins. However, Max MSP packages will also be useful for testing purposes.

I've found almost 20 different products: some open source (usually created by universities), others under a commercial licence. All of them include a similar set of effects. Here is a selection:
  • Panning: Basic to any spatial audio tool, it is used to place a source somewhere in space. For instance, if we have a bird singing (the source) and we want to place it above us and to the left, this is the tool to use.
  • Encoding/Decoding: Most of the tools that I've been analysing use Ambisonics as the spatial audio technique. This is another topic that I will cover in a further post; for now, we just need to know that working with it properly requires encoders and decoders. Frequently the encoding process is linked to the panning, but in some cases it is presented as a separate tool.
  • Reverb / Virtual acoustics: In order to have a better perception of the space, most of the packages offer virtual acoustics tools, with which you can recreate the acoustics of a room and make it interact with your sources.
  • Visualisation: Another common tool is the visualisation, a graphical representation of where your sources are placed, and the amount of sound energy at each position of the space.
  • Rotator / head trackers: Head trackers are used to track the movements of your head and rotate the mix to compensate. This way, if a source is panned in front of you and you turn your head to the left, the source stays in the same place instead of following your movement.
  • Others: Finally, we can find other effects like dynamics processors, equalisers, creative delay units, etc...
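To make the panning, encoding, and rotation ideas above more concrete, here is a minimal Python sketch of first-order Ambisonics: a mono source at a given azimuth and elevation is encoded into the four B-format channels (W, X, Y, Z), and a yaw rotation of the whole sound field (the kind a head tracker would drive) is applied afterwards. Note that channel conventions vary (FuMa vs the more modern ACN/SN3D); this sketch uses the traditional FuMa-style W weighting.

```python
import math

def encode_foa(sample, azimuth, elevation):
    """Encode a mono sample into first-order Ambisonics (B-format).
    Channels ordered (W, X, Y, Z); W carries the FuMa 1/sqrt(2) gain.
    Azimuth is measured counter-clockwise from the front, in radians."""
    w = sample * (1.0 / math.sqrt(2.0))
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)

def rotate_yaw(bformat, angle):
    """Rotate the sound field around the vertical axis, e.g. to compensate
    head yaw from a head tracker. W and Z are unaffected by yaw rotation."""
    w, x, y, z = bformat
    xr = x * math.cos(angle) - y * math.sin(angle)
    yr = x * math.sin(angle) + y * math.cos(angle)
    return (w, xr, yr, z)
```

Encoding a source at the front and then rotating the field by 90° gives exactly the same four channels as encoding that source at 90° directly, which is what makes head-tracking compensation a simple per-sample matrix operation.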

As might be expected, the commercial plug-ins have better user interfaces, promotional videos, customer support, and so on. Nevertheless, the open-source ones offer the same audio quality (or even better in some cases). For example, AmbiX and SPARTA, both open source, offer Ambisonics encoding/decoding up to 7th order, whereas Waves 360, a commercial package, only offers 1st order (more about orders in a further post; for now, let's assume that the higher the order, the better the spatial audio quality). For an inexperienced user looking for something easy to install, easy to use, and easy to learn, Waves is probably the best choice, but for an experienced user, AmbiX or SPARTA will offer more options and flexibility.
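As a quick illustration of what "order" means in practice: a full-sphere Ambisonics signal of order N carries (N+1)² channels, which is why higher orders give finer spatial resolution at the cost of heavier processing and larger files.

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere (3D) Ambisonics signal
    of the given order: (N + 1) squared."""
    return (order + 1) ** 2

# 1st order (e.g. Waves 360): 4 channels; 7th order (AmbiX, SPARTA): 64 channels
for n in (1, 3, 7):
    print(n, ambisonic_channels(n))
```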

Hardware

In most cases, hardware options control the parameters of a software-based tool, so the spatial audio processing is still done in software. However, hardware control can make it easy to adjust several settings at the same time and, for most users, it is always better to have tactile controls than to use the mouse.
[Picture: the Avid S6 joystick module (image: Avid blogs)]
The Avid S6 is a customisable DAW controller where the user can choose between different modules to build their own mixer. One of these modules is the joystick module (pictured), which lets the user pan sources using its two joysticks. Since each joystick is a 2D controller, the first controls the horizontal plane while the other controls the vertical plane. Visual feedback of the position is shown on the small LCD, and other parameters, like distance or width, are controlled with knobs.
[Picture: the BBC's haptic 3D joystick [1]]
Exploring new hardware-based options, the BBC carried out research on the use of a haptic feedback device for sound source control in spatial audio systems [1]. In this study, they used a 3D joystick (pictured) with motors that could recall its position or guide the user through predefined movements. They explored several modes of controlling spatial audio with this device, which spatial audio parameters are best suited to hardware control, and the possibilities the motors could offer. It is a very interesting project, and I really encourage you to have a look at their paper.

mid-air gestural control

The area where I found the most projects is mid-air gestural control: controlling audio without touching any surface or device, just by moving our hands.

Several papers have explored different techniques to control audio using mid-air gestures [2, 3, 4], using a hand-tracking device called Leap Motion. This device can track the position of each finger and gives the user six degrees of freedom. One commercial product that includes this technology is the Fairlight 3DAW, which integrates a Leap Motion into a conventional mixing desk. Difficult to picture? It is probably easier to watch a demo video of the system in action.
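To illustrate the kind of mapping such gestural controllers need: a hand tracker reports Cartesian positions, which a panner typically converts into the azimuth, elevation, and distance a spatialiser expects. A conceptual Python sketch (the axis conventions here are my own assumption, not taken from the cited papers):

```python
import math

def hand_to_pan(x, y, z):
    """Map a tracked hand position (Cartesian, e.g. millimetres relative
    to the sensor) to panner parameters: azimuth and elevation in degrees
    plus distance. Axis conventions are illustrative assumptions:
    x points forward, y to the left, z up."""
    distance = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(z / distance)) if distance > 0 else 0.0
    return azimuth, elevation, distance
```

A hand held directly to the left of the sensor maps to 90° azimuth at zero elevation; a hand held straight above maps to 90° elevation, regardless of azimuth.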

others

Other approaches, like DearVR, use VR glasses and VR controllers to control sources in a virtual world (as in the video I embedded in the project overview post). In this case, you are immersed in the same virtual space where the mix happens, and you can move the sound sources remotely using the VR controllers.

Last but not least, I found a project that controls spatial audio using smartphones [5]. Their approach is quite similar to my original idea, using the gyroscope and accelerometer to move a sound. However, their project is server-based instead of DAW-based: the phone sends messages to a server, which processes the audio for speaker reproduction.
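As a rough illustration of how orientation sensors can drive panning (a conceptual sketch, not the actual mapping used in [5]): the phone can act as a pointer, with its yaw and pitch selecting a direction and the source placed at a chosen distance along that direction.

```python
import math

def pointing_to_position(yaw, pitch, distance):
    """Treat the phone as a laser pointer: device yaw and pitch (radians,
    as derived from gyroscope/accelerometer fusion) select a direction,
    and the source is placed 'distance' metres along it. Conventions here
    are illustrative: x forward, y left, z up."""
    x = distance * math.cos(pitch) * math.cos(yaw)
    y = distance * math.cos(pitch) * math.sin(yaw)
    z = distance * math.sin(pitch)
    return (x, y, z)
```

With zero yaw and pitch the source sits straight ahead; turning the phone 90° to the left moves it to the listener's left at the same distance.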

and my approach?

I'm also going to use smartphones to control spatial audio, but my approach will be slightly different. In my case, I'm going to explore the native AR capabilities that iOS offers, implementing and evaluating several control methods in an iPhone app. In addition, the app will control the parameters of an Ambisonics-based DAW plug-in, so the mix can be reproduced through speakers as well as headphones.

This week I've also been developing a simple app to try some of these AR capabilities, and next week I'll write a post sharing my first thoughts and tests. Stay tuned!

conclusions

After this extensive review (and believe me, I found much more information than I've posted here...), I realised that the audio processing part (the software) is already widely covered by lots of companies and university research groups. Thus, it doesn't make sense to try to create a new plug-in and hope to do better than the existing ones. At least, not within this four-month project...

However, I can focus my research on the control of spatial audio and evaluate different techniques using smartphones. The only existing project that uses them relies on just a few sensors and a single type of movement, so my project can explore more ways to control spatial audio, especially those based on iOS's native AR capabilities.
References:

[1] Melchior, Frank, Chris Pike, Matthew Brooks, and Stuart Grace. 2013. “On the Use of a Haptic Feedback Device for Sound Source Control in Spatial Audio Systems.” In Audio Engineering Society Convention 134. Audio Engineering Society. 

[2] Gelineck, Steven, and Dannie Korsgaard. 2015. “An Exploratory Evaluation of User Interfaces for 3d Audio Mixing.” In Audio Engineering Society Convention 138. Audio Engineering Society. 

[3] Quiroz, Diego. 2018. “A Mid-Air Gestural Controller for the Pyramix® 3D Panner.” In Audio Engineering Society Conference: 2018 AES International Conference on Spatial Reproduction-Aesthetics and Science. Audio Engineering Society. 

[4] Gelineck, Steven, and Dan Overholt. 2015. “Haptic and Visual Feedback in 3D Audio Mixing Interfaces.” In Proceedings of the Audio Mostly 2015 on Interaction With Sound, 14. ACM.

[5] Foss, Richard, and Sean Devonport. 2018. “An Immersive Audio Control System Using Mobile Devices and Ethernet AVB-Capable Speakers.” Journal of the Audio Engineering Society 66 (9): 724–33.




    blog:

    This blog aims to share the progress of my Master's thesis project with anyone who might be interested. Maybe you are interested in the whole project, perhaps only in a post on a particular topic or, why not, maybe you are a friend of mine who just wants to keep in touch with me during these busy days (sorry about that... I'll be back soon!). Whatever the reason, you are very welcome to read as many posts as you like and, of course, feel free to comment, discuss or ask anything you want!

    project:

    I'm currently researching new DAW-based tools for spatial audio creation, using gesture control with smartphones. This project has Abbey Road as an industry partner, and it is supervised by Dr Gavin Kearney and Prof Andy Hunt.

    (Click here to read the project overview)

    Author:

    I am passionate about audio and music technology, and I really love the world of sound, audio and music. After finishing my Bachelor's degree in Sonology and working for two years in the media industry, I decided to move towards a more scientific, engineering-based career. That is why I am currently studying an MSc in Audio and Music Technology at the University of York, where I discovered my passion for academic research, especially in the fields of spatial audio and virtual acoustics.

    View my profile on LinkedIn


    Archives

    May 2019
    April 2019

    Categories

    All
    Demo
    Overview
    Reviews

    back to portfolio

© 2020 Adrià Martínez i Cassorla