Streaming Audio to Multiple Listeners via iOS' Multipeer Connectivity

Music has always been an important part of the iPhone and of Apple devices in
general. With iOS 7, Apple introduced a new framework called Multipeer
Connectivity, which lets us stream data between nearby devices with
NSOutputStream and NSInputStream. I wanted to use this framework to stream
audio to many listeners. However, there is no easy way to play audio from an
NSInputStream, so I set out on an adventure through Core Audio to make this
possible.

Multipeer Connectivity uses NSOutputStream to stream data to a connected peer.
This is what we’ll use to send the audio data. On the receiving end, Multipeer
Connectivity uses NSInputStream, which we’ll use to harvest the incoming data.
Using the Apple-provided Audio Queue Services, we’ll hand the data to the
system for playback. With Audio Queue Services, we can fill buffers with audio
data and then play them. That alone would be enough to play raw audio data, but
most audio files, such as MP3 and AAC files, are encoded to reduce file size.
Apple provides Audio File Stream Services, which can parse the encoded stream
and hand back packets of audio data along with the format information the audio
queue needs. The picture below shows the flow of data and initial state of the
proposed solution.

First, we start the audio stream and, as we receive data, we pass it into the
stream parser. The parser then sends out the audio packets we need. There are
three audio buffers in the audio queue, which will
be filled one by one with the data received from the parser. When full, a buffer
will be enqueued to the system. When the system is finished playing an audio
buffer, it is returned, refilled, then enqueued again in a loop until there is
no more data to play. The GIF below demonstrates how the audio data flows from
the code to the system hardware. The red and green squares represent the empty
and full buffers respectively.
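The fill, enqueue, and return cycle described above can be modeled in plain C. This is a simplified sketch, not Audio Queue Services code: `DemoQueue`, `DemoBuffer`, and both functions are hypothetical names, and the tiny buffer capacity is chosen only to keep the example readable.

```c
#include <stddef.h>
#include <string.h>

#define BUFFER_COUNT 3      /* three buffers, as in the text */
#define BUFFER_CAPACITY 8   /* tiny capacity, for illustration only */

typedef enum { BUFFER_EMPTY, BUFFER_ENQUEUED } BufferState;

typedef struct {
    unsigned char data[BUFFER_CAPACITY];
    size_t fill;            /* bytes written so far */
    BufferState state;
} DemoBuffer;

typedef struct {
    DemoBuffer buffers[BUFFER_COUNT];
    size_t fillIndex;       /* buffer currently being filled */
    size_t playIndex;       /* oldest buffer held by the "system" */
    size_t enqueuedCount;
} DemoQueue;

/* Zeroing the struct leaves every buffer empty (BUFFER_EMPTY is 0). */
void demo_queue_init(DemoQueue *queue)
{
    memset(queue, 0, sizeof *queue);
}

/* Copies bytes into the buffers one by one, marking each buffer as enqueued
   when it becomes full. Stops early when every buffer is enqueued.
   Returns the number of bytes consumed. */
size_t demo_queue_write(DemoQueue *queue, const unsigned char *bytes, size_t length)
{
    size_t consumed = 0;
    while (consumed < length) {
        DemoBuffer *buffer = &queue->buffers[queue->fillIndex];
        if (buffer->state == BUFFER_ENQUEUED) {
            break;  /* no empty buffer; wait for one to be returned */
        }
        size_t space = BUFFER_CAPACITY - buffer->fill;
        size_t remaining = length - consumed;
        size_t chunk = remaining < space ? remaining : space;
        memcpy(buffer->data + buffer->fill, bytes + consumed, chunk);
        buffer->fill += chunk;
        consumed += chunk;
        if (buffer->fill == BUFFER_CAPACITY) {
            buffer->state = BUFFER_ENQUEUED;
            queue->enqueuedCount++;
            queue->fillIndex = (queue->fillIndex + 1) % BUFFER_COUNT;
        }
    }
    return consumed;
}

/* Models the system finishing the oldest enqueued buffer and returning it
   empty. Returns 1 if a buffer was returned, 0 if none was enqueued. */
int demo_queue_buffer_finished(DemoQueue *queue)
{
    if (queue->enqueuedCount == 0) {
        return 0;
    }
    DemoBuffer *buffer = &queue->buffers[queue->playIndex];
    buffer->fill = 0;
    buffer->state = BUFFER_EMPTY;
    queue->enqueuedCount--;
    queue->playIndex = (queue->playIndex + 1) % BUFFER_COUNT;
    return 1;
}
```

When `demo_queue_write` returns fewer bytes than it was given, all three buffers are with the system, and the caller has to hold the rest until `demo_queue_buffer_finished` frees one.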

Now that we have some background on how streaming works, let’s play a song from
our iTunes library. We can use an MPMediaPickerController to allow the user to
pick a song to play. The picker’s delegate method
mediaPicker:didPickMediaItems: hands us an MPMediaItemCollection, whose items
property is an array of the selected MPMediaItems.

An MPMediaItem has many properties we can look at, such as song title or artist,
but we’re only interested in the MPMediaItemPropertyAssetURL property. We use
that to create an AVURLAsset from which we can read the file data by using
AVAssetReader and AVAssetReaderTrackOutput.

Here, we create the AVURLAsset from the media item. Then we use it to create an
AVAssetReader and AVAssetReaderTrackOutput. Finally, we add the output to
the reader and start reading. The method startReading will only open the
reader and make it ready for later when we request data from it.

Next, we’ll open our NSOutputStream and send the reader output data to it when
its delegate method is invoked with the event NSStreamEventHasSpaceAvailable.

First, we get the sample buffer from the reader output. Then we call the
function CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer to get a list
of audio buffers. Finally, we write each audio buffer to the output stream.
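One detail worth keeping in mind when writing to a stream is that a write call may accept fewer bytes than it was offered, so the sender has to track how much was actually sent. The sketch below models that in portable C. `WriteFn`, `write_all`, and `ToySink` are hypothetical names for illustration, with `ToySink` standing in for the real output stream.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stream's write call: returns the number of
   bytes accepted, 0 if no space is available right now, or -1 on error. */
typedef long (*WriteFn)(void *context, const uint8_t *bytes, size_t maxLength);

/* Keeps writing until every byte is accepted or the sink stops taking data.
   Returns the number of bytes actually sent so the caller can retry the rest. */
size_t write_all(WriteFn writeFn, void *context, const uint8_t *bytes, size_t length)
{
    size_t sent = 0;
    while (sent < length) {
        long accepted = writeFn(context, bytes + sent, length - sent);
        if (accepted <= 0) {
            break;  /* sink is full or failed; stop and report progress */
        }
        sent += (size_t)accepted;
    }
    return sent;
}

/* Toy sink that accepts at most maxPerCall bytes per call (assumption:
   this only imitates the partial writes a real stream can produce). */
typedef struct {
    uint8_t storage[64];
    size_t used;
    size_t maxPerCall;
} ToySink;

long toy_sink_write(void *context, const uint8_t *bytes, size_t maxLength)
{
    ToySink *sink = context;
    size_t space = sizeof sink->storage - sink->used;
    size_t count = maxLength;
    if (count > sink->maxPerCall) count = sink->maxPerCall;
    if (count > space) count = space;
    memcpy(sink->storage + sink->used, bytes, count);
    sink->used += count;
    return (long)count;
}
```

In a delegate-driven design, an alternative to looping is to keep the unsent remainder and write it the next time the stream reports NSStreamEventHasSpaceAvailable.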

We’re now streaming a song from our iTunes library. Now, let’s look at how to
receive this stream and play the audio.

Here, we set the delegate to the class so we can handle the stream events. Next,
we tell the stream to run in the current run loop, which could be on a separate
thread, and to use the default run loop mode. It is important to use the default
mode or our delegate methods will not be called. Finally, we open the stream to
start receiving data.

Our class should conform to the NSStreamDelegate protocol so we can handle
events from the NSInputStream.

Above, we use the delegate method to handle events received from the stream.
When the stream ends or has an error, we should notify the application so it can
decide what to do next. For now, we are only interested in the event that the
stream has data for us to process. We need to take this data and pass it into
the Audio File Stream Services.

We create the parser by passing it a reference to the class, a property changed
callback function, and a packets received callback function. The callbacks are
plain C functions, so each one uses the reference to the class it is handed to
call methods within the class.
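Audio File Stream Services is a C API: AudioFileStreamOpen takes a client data pointer and passes it back as the first argument of both callbacks. The pattern can be sketched without AudioToolbox as follows. Everything prefixed `Demo` or `demo_` is a hypothetical stand-in, and the "parser" here does no real parsing. It only shows how the pointer travels through the callbacks.

```c
#include <stddef.h>

/* Stand-in for "the class" that owns the parser. */
typedef struct {
    size_t propertyChanges;
    size_t bytesReceived;
} DemoListener;

typedef void (*PropertyCallback)(void *clientData, unsigned propertyID);
typedef void (*PacketsCallback)(void *clientData, const void *bytes, size_t byteCount);

enum { DEMO_PROPERTY_READY = 1 };

typedef struct {
    void *clientData;            /* handed back to every callback */
    PropertyCallback onProperty;
    PacketsCallback onPackets;
    int ready;
} DemoParser;

DemoParser demo_parser_open(void *clientData,
                            PropertyCallback onProperty,
                            PacketsCallback onPackets)
{
    DemoParser parser = { clientData, onProperty, onPackets, 0 };
    return parser;
}

/* Toy behavior: announce readiness once, then forward the bytes untouched. */
void demo_parser_parse(DemoParser *parser, const void *bytes, size_t byteCount)
{
    if (!parser->ready) {
        parser->ready = 1;
        parser->onProperty(parser->clientData, DEMO_PROPERTY_READY);
    }
    parser->onPackets(parser->clientData, bytes, byteCount);
}

/* The callbacks recover the owning object from the opaque pointer. */
void listener_property_changed(void *clientData, unsigned propertyID)
{
    DemoListener *listener = clientData;
    if (propertyID == DEMO_PROPERTY_READY) {
        listener->propertyChanges++;
    }
}

void listener_packets_received(void *clientData, const void *bytes, size_t byteCount)
{
    DemoListener *listener = clientData;
    (void)bytes;  /* a real handler would copy these into an audio buffer */
    listener->bytesReceived += byteCount;
}
```

In Objective-C the client data would typically be the object itself, cast to `void *` with a bridge cast, and the C callbacks would cast it back before sending messages such as didChangeProperty:flags:.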

Inside the didChangeProperty:flags: method, we are looking for the
kAudioFileStreamProperty_ReadyToProducePackets property which tells us that
all other properties have been set. Now we can retrieve the
AudioStreamBasicDescription from the parser. The AudioStreamBasicDescription
contains information about the audio such as sample rate, channels, and bytes
per packet and is necessary for creating our audio queue.

The file stream will continue to parse bytes until it has enough to decipher
the type of file. At this point, it invokes its property changed callback with
the property kAudioFileStreamProperty_ReadyToProducePackets. After this, it
invokes its packets received callback with nicely packaged packets of audio
data for us to use.

An audio queue, represented by the opaque AudioQueueRef type, allows us to
create audio buffers, fill them, and then enqueue them. It also gives us
playback controls like play, pause, and stop. Now, let’s create the queue and
its buffers.

To create the audio queue, we pass the AudioQueueNewOutput function the
AudioStreamBasicDescription we received from the parser, a callback function
that is invoked when the system is done with a buffer, and a reference to the
class. Next, we create an audio buffer using the AudioQueueAllocateBuffer
function, giving it the audio queue and the number of bytes the buffer can hold.
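The text does not say how the buffer size is chosen. One plausible approach for constant bitrate formats is to derive it from the stream description, as in the sketch below. `DemoFormat` is a hypothetical struct that copies three field names from AudioStreamBasicDescription, and the half-second duration in the test is an arbitrary assumption rather than a recommended value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of AudioStreamBasicDescription. */
typedef struct {
    double   mSampleRate;       /* frames per second */
    uint32_t mBytesPerPacket;   /* 0 when packet sizes vary (VBR) */
    uint32_t mFramesPerPacket;  /* 1 for uncompressed linear PCM */
} DemoFormat;

/* Bytes needed to hold `seconds` of audio in a constant bitrate format.
   Returns 0 when the size cannot be derived this way, e.g. for VBR data. */
size_t demo_buffer_size_for_duration(const DemoFormat *format, double seconds)
{
    if (format->mBytesPerPacket == 0 || format->mFramesPerPacket == 0 || seconds <= 0.0) {
        return 0;
    }
    uint64_t frames = (uint64_t)(format->mSampleRate * seconds);
    /* Round up so a partially needed packet still gets room. */
    uint64_t packets = (frames + format->mFramesPerPacket - 1) / format->mFramesPerPacket;
    return (size_t)(packets * format->mBytesPerPacket);
}
```

For 16-bit stereo PCM at 44,100 Hz (4 bytes per packet, 1 frame per packet), half a second works out to 22,050 packets, or 88,200 bytes. For VBR data a fixed size has to be picked some other way, since the bytes per packet are not known in advance.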

Now, we wait for the parser to invoke its packets received callback. Then, we
fill an empty buffer with the packets. There are two possible formats for the
data received from the parser: VBR or CBR. Variable Bitrate (VBR) means that
the bit rate can change from packet to packet, whereas Constant Bitrate (CBR)
means that it stays constant.

In the case of VBR, we can only fill the buffer with whole packets which
contain many bytes. This means that the buffer may not fill up before we have
to send it to the system. With CBR, we fill the buffer to the brim and then
send it along.

Here, we need to check whether the packet will fit in the space left in the
buffer. If the buffer can’t fit another packet of mDataByteSize bytes, we have
to get another buffer. We also need to hold on to our packet descriptions for
enqueueing.
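The fit check can be sketched in portable C as follows. `DemoPacketDescription` mirrors the three fields of Core Audio’s AudioStreamPacketDescription, while `DemoVBRBuffer`, the function name, and the small capacities are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAPACITY 32     /* bytes per buffer, tiny for illustration */
#define DEMO_MAX_PACKETS 8   /* packet descriptions kept per buffer */

/* Same three fields as AudioStreamPacketDescription. */
typedef struct {
    int64_t  mStartOffset;             /* where the packet starts, in bytes */
    uint32_t mVariableFramesInPacket;
    uint32_t mDataByteSize;            /* size of this packet, in bytes */
} DemoPacketDescription;

typedef struct {
    uint8_t data[DEMO_CAPACITY];
    size_t fill;                       /* bytes used so far */
    DemoPacketDescription descriptions[DEMO_MAX_PACKETS];
    size_t packetCount;
} DemoVBRBuffer;

/* Copies one whole packet into the buffer if it fits. Returns 1 on success,
   or 0 when the caller should enqueue this buffer and move to the next one. */
int demo_vbr_buffer_add_packet(DemoVBRBuffer *buffer,
                               const uint8_t *source,
                               DemoPacketDescription description)
{
    size_t size = description.mDataByteSize;
    if (size > DEMO_CAPACITY - buffer->fill || buffer->packetCount == DEMO_MAX_PACKETS) {
        return 0;  /* never split a packet across two buffers */
    }
    memcpy(buffer->data + buffer->fill, source + description.mStartOffset, size);
    /* The stored offset must be relative to this buffer, not to the source. */
    description.mStartOffset = (int64_t)buffer->fill;
    buffer->descriptions[buffer->packetCount++] = description;
    buffer->fill += size;
    return 1;
}
```

The stored descriptions are what would later accompany the buffer when it is enqueued, so that the system knows where each packet begins and how long it is.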

When the buffer is full, we enqueue it to the system with
AudioQueueEnqueueBuffer.

Those are the basics for streaming audio through Multipeer Connectivity. At the
end of this adventure I created an open source library that brings together
everything described here in a more organized and structured fashion. If you’d
like more detail, the complete code and examples are on GitHub at
tonyd256/TDAudioStreamer.