ImmerGo is a client-server system that communicates with networked speakers equipped with mixing and delay processing capabilities.
All audio channels (from a DAW or a multi-channel WAV file) are sent to all the speakers.
Control over 3D movement comes from the client, while the server does all the work needed to render the audio sources in 3D space.
There can be multiple clients, each running on a mobile device or a desktop. This allows several users to control the 3D movement of sound sources, with each user controlling different sources.
A user selects a sound track, then touch-moves a sound circle to move the chosen track in two dimensions.
The third dimension (height) is set by tilting the mobile device or, on a desktop, by moving a height slider. The client sends the track number plus the 3D coordinates to the server.
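The exact message format is not published here; the following is only a minimal sketch, assuming a JSON payload over UDP and hypothetical field names, of how a client might send a track number and its 3D coordinates to the server:

    # Illustrative only: the actual ImmerGo wire format is not documented here.
    # This sketch assumes a JSON payload sent over UDP; field names are hypothetical.
    import json
    import socket

    def send_position(sock, server_addr, track, x, y, z):
        """Send the chosen track number and its 3D position (in metres) to the server."""
        message = json.dumps({"track": track, "x": x, "y": y, "z": z})
        sock.sendto(message.encode("utf-8"), server_addr)

    if __name__ == "__main__":
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Example: move track 3 to the centre of the room at 1.5 m height.
        send_position(sock, ("192.168.1.10", 9000), track=3, x=2.0, y=2.5, z=1.5)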
How does the client know the actual 3D position of a sound source?
A user must specify the size of the room and the positions of the speakers within it. These values are entered graphically.
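As an illustration of the data this implies, here is a minimal sketch of a room and speaker layout; the actual representation and units used by ImmerGo are assumptions:

    # Illustrative sketch of the room and speaker data a client could hold;
    # ImmerGo's real data model and units are assumptions here.
    from dataclasses import dataclass

    @dataclass
    class Speaker:
        id: int
        x: float  # metres from the room origin
        y: float
        z: float

    @dataclass
    class Room:
        width: float   # metres
        depth: float
        height: float
        speakers: list

    # Example: a 6 m x 5 m x 3 m room with four corner speakers at 2 m height.
    room = Room(width=6.0, depth=5.0, height=3.0, speakers=[
        Speaker(1, 0.0, 0.0, 2.0),
        Speaker(2, 6.0, 0.0, 2.0),
        Speaker(3, 0.0, 5.0, 2.0),
        Speaker(4, 6.0, 5.0, 2.0),
    ])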
What does the server do when it receives 3D coordinates?
The server knows the position of each speaker in the room. Since it also knows the position of each sound track in the room's 3D space, it can calculate the mix and delay levels to send to each speaker according to our spatial audio algorithms.
Each speaker has a mixer and delay matrix, and applies the mix and delay levels sent to it.
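The spatial audio algorithms themselves are not described here. As a simple stand-in only, the sketch below derives a per-speaker gain from a 1/r distance roll-off between source and speaker, and a delay from the propagation time at the speed of sound:

    # Simple distance-based stand-in, not ImmerGo's actual algorithm.
    # Positions are (x, y, z) tuples in metres.
    import math

    SPEED_OF_SOUND = 343.0  # m/s

    def mix_and_delay(source, speaker, ref_distance=1.0):
        """Return (gain, delay_seconds) for one speaker given a source position."""
        dx = source[0] - speaker[0]
        dy = source[1] - speaker[1]
        dz = source[2] - speaker[2]
        distance = max(math.sqrt(dx * dx + dy * dy + dz * dz), ref_distance)
        gain = ref_distance / distance      # 1/r amplitude roll-off
        delay = distance / SPEED_OF_SOUND   # propagation delay
        return gain, delay

    # Example: source at the room centre, speaker in a corner at 2 m height.
    print(mix_and_delay(source=(3.0, 2.5, 1.5), speaker=(0.0, 0.0, 2.0)))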
If there are ambisonic tracks, each speaker decodes them using the appropriate ambisonic order and taking its own position into account.
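As an illustration only, here is a minimal first-order decode, assuming FuMa-style B-format (W, X, Y, Z) and a basic sampling decoder; the ambisonic order and normalisation ImmerGo actually uses may differ:

    # Minimal first-order sketch; conventions are assumptions, not ImmerGo's spec.
    import math

    def decode_first_order(w, x, y, z, azimuth, elevation, num_speakers):
        """Decode one B-format sample for a speaker at (azimuth, elevation) radians."""
        return (1.0 / num_speakers) * (
            w * math.sqrt(2.0)
            + x * math.cos(azimuth) * math.cos(elevation)
            + y * math.sin(azimuth) * math.cos(elevation)
            + z * math.sin(elevation)
        )

    # Example: decode a sample for a speaker 45 degrees to the left, at ear height,
    # in a four-speaker layout.
    sample = decode_first_order(0.5, 0.3, 0.1, 0.0,
                                azimuth=math.radians(45), elevation=0.0,
                                num_speakers=4)

The azimuth and elevation here would be derived from the speaker position that the user entered graphically.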
How can a user control DAW transport from a client?
The DAW sends MIDI time code messages to the server, and the server in turn forwards these to all the clients. The time code messages are used to time-stamp 3D movements during recording.
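As a sketch of the time-stamping step (the format in which the server forwards time code to clients is an assumption here):

    # Sketch only: stamp each 3D movement with the most recent DAW time code.
    from dataclasses import dataclass

    @dataclass
    class Movement:
        timecode: str   # e.g. "00:01:23:12" (hh:mm:ss:ff) from the latest time code message
        track: int
        x: float
        y: float
        z: float

    recorded_moves = []
    current_timecode = "00:00:00:00"

    def on_timecode(tc):
        # Called whenever a time code message arrives from the server.
        global current_timecode
        current_timecode = tc

    def on_move(track, x, y, z):
        # Called for each 3D movement while recording is active.
        recorded_moves.append(Movement(current_timecode, track, x, y, z))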
The client interface has transport control buttons: Rewind, Play, Pause, Stop, and Fast Forward. These button presses are sent to the server, which translates them into MIDI transport commands and sends them to the active DAW.
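Assuming the MIDI transport commands are MIDI Machine Control (MMC) system-exclusive messages, which is an assumption rather than a documented detail, the translation on the server might look like this:

    # Sketch of translating client transport buttons into MMC SysEx messages;
    # whether ImmerGo uses MMC or another MIDI transport mechanism is assumed here.
    MMC_COMMANDS = {
        "stop": 0x01,
        "play": 0x02,
        "fast_forward": 0x04,
        "rewind": 0x05,
        "pause": 0x09,
    }

    def mmc_message(button, device_id=0x7F):
        """Build an MMC SysEx message (device id 0x7F addresses all devices)."""
        return bytes([0xF0, 0x7F, device_id, 0x06, MMC_COMMANDS[button], 0xF7])

    # Example: the bytes the server would send to the DAW's MIDI input for Play.
    print(mmc_message("play").hex(" "))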