Java development with an iPhone touch pad for the Atari 2600 from an urban hip-hop perspective

Day: December 7, 2010

*Update*
In part I, I neglected to point out that you should uncomment #define _USE_SSE in config.h, as mentioned below. This preprocessor directive will allow you to run on a device. It was also mentioned that you could get more speed out of Speex if you #define FIXED_POINT instead of FLOATING_POINT. I have not verified this, and Speex runs acceptably in my implementation without it, but it’s worth mentioning. *Update*

You have a lot of vocal audio data. Maybe it needs to be stored on an iPhone. Maybe it needs to glide effortlessly over the network like a slice of dental floss blowing in the wind. Whatever the case, you need a good compressor and Speex is a great solution. Hi, I’m Cliff. You’re here because you’re looking for an answer to your audio compression needs. I’m here to deliver the secrets to decompressing audio with the Speex codec. That, for what it’s worth, is the only reason I’m still hanging around here. In any other event you’d probably find me on South Street sharing a soda with a cat. I digress . . .

In part I of this series I explained how to get Speex to compile. Today we’ll try to import the OGG container format into our project and move on to Speex decompression. Because not everyone may be aware, a brief explanation of codecs and containers is in order. Audio encoding is typically made of two distinct pieces: a container format and an encoding. The audio container holds the meta data, or descriptive information, for the actual audio file. This descriptive information includes things like how many channels are in the audio data, what sample rate it was recorded at, and the endianness (not Indianness) of the audio data. The container may hold other data as well, depending on the type of encoding you use. Finally, the meta data will have the location (offset) of the actual audio data in the file. The encoding is the actual audio data that is ultimately delivered to the output device or speakers. It can be a direct digital dump (that is, the actual audio samples taken over time as the audio was recorded) or it can be a compressed variant of the raw samples. It’s important to note that the encoding and the container are not usually dependent upon one another. That means you can mix and match Speex encoding with a wave container format just the same as you can put raw uncompressed samples in an OGG container. It’s just more common to find uncompressed data in a wave container and Speex compressed audio in an OGG container.
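To make the container idea concrete, here is a rough C sketch (not code from the project download) that pulls the descriptive data out of a wave container: channel count, sample rate, bits per sample, and the offset where the encoded audio starts. The names wav_info and wav_read_info are made up for illustration, and it only understands the plain RIFF/WAVE layout with a "fmt " chunk followed by a "data" chunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Descriptive (meta) data a wave container carries about its audio.
   Hypothetical type, invented for this sketch. */
typedef struct {
    uint16_t channels;        /* 1 = mono, 2 = stereo */
    uint32_t sample_rate;     /* samples per second */
    uint16_t bits_per_sample; /* size of one sample */
    size_t   data_offset;     /* where the encoded audio starts */
    uint32_t data_size;       /* bytes of encoded audio that follow */
} wav_info;

/* Wave files store multi-byte fields little-endian, whatever the host uses. */
static uint16_t wav_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t wav_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a wave file we understand. */
int wav_read_info(const uint8_t *buf, size_t len, wav_info *out)
{
    int have_fmt = 0;
    size_t pos = 12; /* first chunk follows "RIFF", a size, and "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (pos + 8 <= len) {
        const uint8_t *chunk = buf + pos;
        uint32_t size = wav_le32(chunk + 4);
        size_t body = pos + 8;

        if (memcmp(chunk, "data", 4) == 0) {
            if (!have_fmt)
                return -1;
            out->data_offset = body; /* the encoding lives from here on */
            out->data_size = size;
            return 0;
        }
        if (size > len - body)
            return -1; /* chunk claims more bytes than we were given */
        if (memcmp(chunk, "fmt ", 4) == 0) {
            if (size < 16)
                return -1;
            out->channels = wav_le16(buf + body + 2);
            out->sample_rate = wav_le32(buf + body + 4);
            out->bits_per_sample = wav_le16(buf + body + 14);
            have_fmt = 1;
        }
        pos = body + size + (size & 1); /* chunks are padded to even length */
    }
    return -1;
}
```

Notice that nothing in the sketch cares what the bytes after data_offset mean. That is the container/encoding split in practice.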

Let’s take a step back and try some TDD. Following best practices, we need to create a need for the Speex codec and the OGG container. I realize this is cart-before-the-horse style since we’ve already imported Speex, but bear with me as I’m doing this tutorial on my time off. Also, I’ve been out of the TDD habit for a while as I strive to work closely with others who are uncomfortable with the style. We start by creating a “Unit Test Bundle” target in the project. Create a new Objective-C class named “CCCSpeexDecoderTest” using the “New File…” dialog and deselect the “also create .h file” option. Include the following in your new Objective-C class file.

Running this tells us that we’re going to need some Speex data to operate on. (I’ve taken the liberty of generating a wav file using the “say” command and converting it to a Speex encoded file using the JSpeex API via Groovy. I’ll include both in a download of the project for this lesson.) Next we’ll create a structure to hold our unit tests and test resources. We will be following the “golden copy” testing pattern. You will later learn that using the pattern here is rather fragile. However, a more purist approach would take us through an exercise of rewriting the entire Speex project, which is outside the scope of my tutorial. Using Finder, I created a “Tests” and a “Resources” folder under the src folder in my project. Drag/drop these folders into Xcode to create the corresponding groups. Then drag/drop the sample wave and sample Speex files (named “sample.wav” and “sample.spx” respectively) into the “Resources” group in Xcode. The test will now pass when you run it.

We now work our way through creating the decoder. I’ll spare the individual steps in TDD as it would make this text overly verbose and I’ll try to summarize instead. We need an actual decoder instance, which we’ll be importing. TDD suggests we import what we don’t have, so add the import for a CCCSpeexDecoder type which does not exist. Build and fail. (The failure is important as it formalizes the class or feature you are about to add/change/delete.) We also need to be able to create this type and give it some audio to decode. It will also need a place to send the decoded audio data. I’m going to define an abstraction for providing/receiving the audio data so that we don’t necessarily need a file system. To that end I’m adding a test to demonstrate/document the need for an audio source, a test to demonstrate/document the need for an audio sink, and one other test that formalizes how we plug these two abstractions into the decoder.

We have now defined the ability to decode audio. We have to set our expectation for this method. (Test first begins with declaring or expressing a need for a feature or function, then setting an expectation for its behavior.) After invoking decodeAudio we would expect to have collected the decoded audio bytes somewhere. I’ll add a mutable data field in the test for this.

Here’s the Oogly part. We are calling a method with no return value. We’ve defined an abstraction around collecting data (an audio sink) and we’ve made our test case adopt the protocol for this abstraction. The protocol defines no methods. The test calls for data to magically arrive in the mutable data field. Indirectly, our test is stating that given a source and a sink, when the decodeAudio message is sent we should have accumulated data in the sink. Running the test fails because we haven’t added the functionality. We step into the decodeAudio implementation and fill in the simplest thing that works.
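For readers who think better in plain C than in Objective-C protocols, the source/sink idea can be sketched roughly like this. Everything here (audio_source, audio_sink, decode_audio, and the in-memory helpers) is a hypothetical stand-in, not the CCCSpeexDecoder code from the project, and the "decoder" just passes bytes through untouched, which is about as simple as a thing that works can get.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C analogue of the audio source abstraction. */
typedef struct audio_source {
    /* Copies up to max bytes into dst; returns the count, 0 when drained. */
    size_t (*read)(struct audio_source *self, uint8_t *dst, size_t max);
    void *context;
} audio_source;

/* Hypothetical C analogue of the audio sink abstraction. */
typedef struct audio_sink {
    void (*write)(struct audio_sink *self, const uint8_t *src, size_t len);
    void *context;
} audio_sink;

/* Pass-through placeholder: real Speex decoding would go inside this loop. */
void decode_audio(audio_source *source, audio_sink *sink)
{
    uint8_t chunk[256];
    size_t n;

    while ((n = source->read(source, chunk, sizeof chunk)) > 0)
        sink->write(sink, chunk, n);
}

/* In-memory source and sink, standing in for the test case itself. */
typedef struct { const uint8_t *bytes; size_t len; size_t pos; } memory_reader;
typedef struct { uint8_t bytes[1024]; size_t len; } memory_collector;

size_t memory_read(audio_source *self, uint8_t *dst, size_t max)
{
    memory_reader *r = self->context;
    size_t remaining = r->len - r->pos;
    size_t n = remaining < max ? remaining : max;

    memcpy(dst, r->bytes + r->pos, n);
    r->pos += n;
    return n;
}

void memory_write(audio_sink *self, const uint8_t *src, size_t len)
{
    memory_collector *c = self->context;
    size_t room = sizeof c->bytes - c->len;

    if (len > room)
        len = room; /* this sketch silently drops what does not fit */
    memcpy(c->bytes + c->len, src, len);
    c->len += len;
}
```

The decoder never learns where the bytes come from or where they go, so no file system is needed to exercise it.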

Let’s be more specific. When decoding audio we will want to discover the meta data or attributes of the audio. This information is usually the first group of bytes in a file and it explains what the rest of the file contains. We’ll declare an expectation to receive a callback in our sink which contains the meta data in an easily navigable NSDictionary.
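As a hedged illustration of what that meta data looks like on the wire, the sketch below hand-parses the first Speex packet into a small struct. The layout (an 8-byte "Speex   " marker, a 20-byte version string, then thirteen little-endian 32-bit integers) is my reading of the SpeexHeader struct in speex_header.h, so check it against your copy. In a real project the library's own speex_packet_to_header would normally do this job. The names parsed_speex_header and parse_speex_header are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header size: 8 + 20 + 13 four-byte integers. */
#define SKETCH_SPEEX_HEADER_BYTES 80

/* The handful of attributes a decoder cares about (hypothetical type). */
typedef struct {
    int32_t rate;              /* sample rate in Hz */
    int32_t mode;              /* 0 narrowband, 1 wideband, 2 ultra-wideband */
    int32_t channels;
    int32_t frame_size;        /* samples per frame */
    int32_t frames_per_packet;
} parsed_speex_header;

static int32_t speex_le32(const uint8_t *p)
{
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Returns 0 on success, -1 if the packet does not look like a Speex header. */
int parse_speex_header(const uint8_t *packet, size_t len, parsed_speex_header *out)
{
    if (len < SKETCH_SPEEX_HEADER_BYTES || memcmp(packet, "Speex   ", 8) != 0)
        return -1;

    /* Offsets assume the field order described in the lead-in. */
    out->rate = speex_le32(packet + 36);
    out->mode = speex_le32(packet + 40);
    out->channels = speex_le32(packet + 48);
    out->frame_size = speex_le32(packet + 56);
    out->frames_per_packet = speex_le32(packet + 64);
    return 0;
}
```

A struct like this plays the role the NSDictionary plays in the Objective-C version: one easily navigable bundle handed to the sink in a callback.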

Because I forget the attributes of the file provided, I’m going to use a discovery test technique. With this technique we use a dummy expected value in our assert and let the assertion error message tell us what the actual value is. I wouldn’t do this in normal testing. It’s only because I already have working code that I’m plugging in, and because this tutorial is getting wordy, that I’m going to take the cheap way out.
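The discovery trick boils down to an assertion helper that prints the actual value when it fails. Here is a minimal C sketch of the idea. The helper name check_equal and the message format are assumptions for illustration, since the real test uses the unit test framework's own assert macros.

```c
#include <stdio.h>

/* Compares expected and actual. On a mismatch it writes a message naming the
   actual value into message and returns 0; on a match it returns 1.
   Hypothetical helper, not part of any test framework. */
int check_equal(const char *name, long expected, long actual,
                char *message, size_t size)
{
    if (expected == actual) {
        if (size > 0)
            message[0] = '\0';
        return 1;
    }
    snprintf(message, size, "%s: expected %ld but was %ld", name, expected, actual);
    return 0;
}
```

You would call it with a throwaway expected value such as -1, read the real value off the failure message, and then paste that value back into the test as the documented expectation.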

Once we implement the actual parsing logic we will start to see the actual values reported in the assertion errors. (I am adapting existing working code rather than developing the code from test cases.) We will pull the values from the errors back into the asserts to make the test pass and document what our expectations actually are.

Now we need to actually start pulling audio from our audio source abstraction. Because we used protocols, our test can pose (using the self-shunt pattern) as the audio source and provide data for the decoder. We step into the decoder and start doing some actual parsing.

At this point we have to import OGG for decoding the container so we can read the file meta data. Download and unpack libogg (not liboggz) from the Xiph.org download site.
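To give a feel for what libogg will be chewing on, here is a rough sketch that reads the fixed part of a single OGG page header by hand, following the page layout in RFC 3533. It is for illustration only: it skips the CRC check, and libogg's sync and stream functions handle all of this (and more) for you. The names parsed_ogg_page and parse_ogg_page are invented so they don't collide with anything in the library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Summary of one OGG page header (hypothetical type for this sketch). */
typedef struct {
    int      is_first_page; /* beginning-of-stream flag */
    int      is_last_page;  /* end-of-stream flag */
    uint32_t serial;        /* identifies the logical stream */
    uint32_t sequence;      /* page counter within that stream */
    size_t   header_size;   /* 27 fixed bytes plus the segment table */
    size_t   body_size;     /* payload bytes that follow the header */
} parsed_ogg_page;

static uint32_t ogg_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf does not start with a complete page header.
   The checksum at bytes 22..25 is deliberately not verified here. */
int parse_ogg_page(const uint8_t *buf, size_t len, parsed_ogg_page *out)
{
    size_t segments, body = 0, i;

    if (len < 27 || memcmp(buf, "OggS", 4) != 0 || buf[4] != 0)
        return -1;

    segments = buf[26];
    if (len < 27 + segments)
        return -1;

    /* Each entry in the segment table is the length of one body segment. */
    for (i = 0; i < segments; i++)
        body += buf[27 + i];

    out->is_first_page = (buf[5] & 0x02) != 0;
    out->is_last_page = (buf[5] & 0x04) != 0;
    out->serial = ogg_le32(buf + 14);
    out->sequence = ogg_le32(buf + 18);
    out->header_size = 27 + segments;
    out->body_size = body;
    return 0;
}
```

In a Speex file the body of the first page is where you would expect to find the Speex header packet, which is why the container has to be read before any of the meta data can be.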

We need to add the ogg header files to the header search path, so drag/drop the ogg folder from the include folder in the root of the unpacked directory into your Xcode project. (/path/to/libogg-1.2.1/include/ogg) Choose to copy the files from the dialog and select your static lib target before accepting the dialog. Delete config_types.h.in, Makefile.am and Makefile.in from this folder and group. (Also move them to trash.) Double click the project icon in the left tree pane and select the “Build” tab. Type “header search” in the search box at the top to narrow the options to the header search path. You need to add “$(SRCROOT)” as one of your header search path values here. Create an Xcode group for the ogg source code and drag/drop the “bitwise.c” and “framing.c” files from the unpacked libogg source folder into it. (/path/to/libogg-1.2.1/src)

At this point, building the unit test target should leave you with errors from the latest round of header info asserts, which we will fix in the next part of the series. We have a fully configured project with access to both the Speex and OGG encoding/decoding APIs, which is exciting. In the next part of the series we will tackle calling into these APIs to decode the data. I’m going to upload my part II example project to my box account so it will be in the right-hand pane for your downloading pleasure. Until next time…

(Some of you will have noticed I accidentally published this post the other day before finishing it. This is why I’m publishing it half baked tonight. There’s a lot here and a lot more to cover. Keep checking back for updates!)
