I’m up and running with the Alvin Lucier-inspired project I mentioned in the previous post. It’s in a pretty basic state at this point, but the good news is I can take in an audio file and process (convolve) it with an impulse response to get a first pass of the room acoustics.

For this project I’m going to follow a quasi-Scrum methodology – essentially adding features as we go, with each post documenting a somewhat finished state and deciding at the end what the next feature to implement will be. If you’re familiar with Scrum, you can think of each post as a sprint review with a backlog grooming and sprint planning session tacked on at the end. At the end of each sprint (usually a period of time like 1 or 2 weeks) you’re supposed to deliver a ‘potentially shippable product’, and a sprint review is held to show your work and let the product owner accept (or reject) the feature. Backlog grooming is the process of deciding the priority of features, and sprint planning is where you commit to the features to be implemented in the sprint ahead.

All of that is just to say that this project will be documented as a continuous work-in-progress. As a one-person team doing this in my spare time I won’t actually follow these rules, but it’ll be a guiding principle.

So down to the actual code and stuff –

One of the first steps was to set up a good old version control system. I use Git / GitHub for this, on their free account tier, so you can view all the code I write along the way. I spent a bit of time configuring everything to track just the files I’m modifying and to ignore all the auto-generated stuff as well as the framework files, which are a little larger. This is done via the .gitignore file, which lets you tell Git which types of files to ignore. I used the Swift.gitignore template for this and added the AudioKit.framework files to the ignore list, since you can download those from AudioKit directly instead.

So as mentioned I’m using AudioKit as the framework for processing the audio in this app.

AudioKit is an audio synthesis, processing, and analysis platform for iOS, macOS, and tvOS.

It’s an open-source project – so you can contribute if you feel like it. It both wraps and simplifies some elements of Core Audio, and extends it by allowing you to create complex processing graphs by chaining Nodes together. The current version is based on Swift 3, which requires Xcode 8 to build.

For this project, the processing needs are pretty simple. I just need to be able to get audio input, process it via convolution and output the result with multiple iterations. I decided as a first step that I could hardcode the audio input files to process a single iteration and play the resulting output. I dragged and dropped two files – sitting.wav (an excerpt from the original recording for now, which I’ll replace soon) and an impulse response file from a medium church hall – so pretty reverby to make the effect more noticeable.
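To make "process it via convolution" a bit more concrete, here is a minimal sketch of the textbook time-domain definition in plain Swift. It is not what the app runs (AudioKit does the real work, and real-time implementations typically use faster FFT-based methods); the `convolve` function and the tiny sample arrays are made up purely for illustration.

```swift
import Foundation

/// Direct-form (time-domain) convolution of a signal with an impulse response.
/// The output has signal.count + impulseResponse.count - 1 samples.
func convolve(_ signal: [Float], with impulseResponse: [Float]) -> [Float] {
    guard !signal.isEmpty, !impulseResponse.isEmpty else { return [] }
    var output = [Float](repeating: 0, count: signal.count + impulseResponse.count - 1)
    for (n, sample) in signal.enumerated() {
        for (k, tap) in impulseResponse.enumerated() {
            // Each input sample triggers a scaled, delayed copy of the impulse response.
            output[n + k] += sample * tap
        }
    }
    return output
}

// A single click followed by silence, and a made-up three-tap "room".
let dry: [Float] = [1, 0, 0, 0]
let room: [Float] = [1, 0.5, 0.25]
let wet = convolve(dry, with: room)
print(wet) // [1.0, 0.5, 0.25, 0.0, 0.0, 0.0]
```

Feeding in a single click gives back the impulse response itself, which is why a recording of a room's response to a click is enough to stamp that room onto any other audio.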

Xcode Project Bundle

First we load the files into the project as an AKAudioFile and a file URL respectively, as these are the parameters AKConvolve() looks for.
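For the file URL half of that, a rough sketch of looking up a bundled resource with Foundation might look like the following. The `bundledURL` helper, the `ResourceError` type and the "church-hall" file name are all placeholders of mine (the real impulse response file name isn't given here), and the AKAudioFile side is left out so the snippet only depends on Foundation.

```swift
import Foundation

/// Hypothetical error type for this sketch.
enum ResourceError: Error {
    case missing(String)
}

/// Looks up a bundled file and throws instead of crashing if it is absent.
func bundledURL(named name: String, ext: String, in bundle: Bundle = .main) throws -> URL {
    guard let url = bundle.url(forResource: name, withExtension: ext) else {
        throw ResourceError.missing("\(name).\(ext)")
    }
    return url
}

// "church-hall" is a placeholder name, not the actual file in the project.
do {
    let impulseURL = try bundledURL(named: "church-hall", ext: "wav")
    print("Impulse response at \(impulseURL.path)")
} catch {
    print("Could not find resource: \(error)")
}
```

Throwing here rather than force-unwrapping means a missing or misnamed file shows up as a readable error instead of a crash at launch.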

I then hooked up the player node to the convolution process, attached the output of the convolution to the engine output (which will eventually play the audio) and started the engine.

AudioKit.output = convolvedOutput!
AudioKit.start()

Then we simply need to start both the convolution and player to hear the audio output.

convolvedOutput!.start()
player!.start()

The UI for now is as basic as it gets. There is a single button to toggle the convolution ON/OFF. In order to achieve this the turnOffConvolution() function is attached to the UIButton with simple if/else statements to toggle start and stop. I had to declare convolvedOutput as a global before initializing later for this function to access the start/stop functions. I’m sure I’ll change that when I update the app to use the MVC (Model-View-Controller) design, but for now it works well enough.
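The toggle logic itself boils down to something like the sketch below. To keep it runnable without AudioKit or UIKit, I'm using a hypothetical `Toggleable` protocol and a `FakeConvolution` class as stand-ins for the real convolution node; in the app the same if/else would sit inside the button's action.

```swift
/// Stand-in for the parts of an audio node the button needs.
/// Hypothetical protocol; the real convolution node comes from AudioKit.
protocol Toggleable: AnyObject {
    var isStarted: Bool { get }
    func start()
    func stop()
}

/// Minimal fake node so the sketch runs without AudioKit.
final class FakeConvolution: Toggleable {
    private(set) var isStarted = false
    func start() { isStarted = true }
    func stop() { isStarted = false }
}

/// The if/else toggle: stop the node when it is running, start it otherwise.
/// Returns the new state so a caller could update the button title.
@discardableResult
func toggle(_ node: Toggleable) -> Bool {
    if node.isStarted {
        node.stop()
    } else {
        node.start()
    }
    return node.isStarted
}

let node = FakeConvolution()
toggle(node) // convolution is now on
toggle(node) // and off again
```

Passing the node in as a parameter is one way to avoid the global later on, which should make the move to MVC a little easier.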

So there it is: a very basic app that hooks up our input audio to the convolution node and outputs the result, with a button to toggle processing on/off. Up next we’ll add the processing iterations so we can completely destroy the actual speech with the room resonances, mimicking the original recording.