Streaming MP4 Videos + SRT Subtitles With AirPlay

Sep 2nd, 2013

A few months ago, while working on a video playback iOS app, I encountered an interesting problem. My data was a bunch of video files along with their English subtitles, stored on a web server. I wanted to be able to stream these videos with the subtitles to an iOS device as well as an Apple TV.

Streaming a video from a remote file is a piece of cake thanks to MPMoviePlayerController. But adding the subtitles turned out to be much more difficult than I expected. Having them play on the device was already a bit involved, but streaming them from the device to an Apple TV turned into an epic battle.

I thought that would make an interesting topic for my blog. In this article, I explain how I approached this problem, what solution I came up with, and what parts of the problem remain unsolved. Oh, and I even share some code ;-)

First, a few words about the context of this problem. The videos I was trying to play were H.264 encoded .mp4 files, while the subtitles were plain .srt files. I did not control the web server where these files were hosted, so directly altering the files was not an option.

Once I realized MPMoviePlayerController didn’t have any support for external subtitles, I thought the best thing to do was to download the .srt file to the device, parse it, and display the subtitles in a UILabel with the right timing.
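For reference, the parsing step is straightforward. Here is a minimal Python sketch (helper names are mine, and malformed cues are simply skipped) that turns .srt text into timed cues:

```python
def parse_timestamp(ts):
    """Convert an .srt timestamp 'HH:MM:SS,mmm' to seconds as a float."""
    h, m, rest = ts.split(":")
    s, ms = rest.split(",")
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0

def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) cues.

    An .srt cue is: an index line, a 'start --> end' timing line,
    then one or more text lines, separated by blank lines.
    """
    cues = []
    for block in text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed cues
        start, _, end = lines[1].partition(" --> ")
        cues.append((parse_timestamp(start), parse_timestamp(end),
                     "\n".join(lines[2:])))
    return cues
```

With the cues in hand, a timer can compare the player's current playback time against each cue's interval and update the label accordingly.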

It worked well on the device, but once I tried to stream to the Apple TV, only the video was streamed while the subtitles stayed on the device. This is because MPMoviePlayerController’s AirPlay mode sends the video data directly to the Apple TV. I tried using the screen mirroring feature and moving the video and subtitles to the external UIScreen, but the streaming was way too slow for the video to be watchable.

While MPMoviePlayerController doesn’t support subtitles from an external file, it can display subtitle tracks present in the video container. What if I could insert some kind of proxy between the device and the data server, one that would merge the video and the subtitles into a format readable by iOS?

I had already merged .mp4 and .srt files to make .m4v containers in the past, with an app called Subler. I just needed to figure out how to do that by code, on the fly, without having access to the video file locally.

MP4 files are structured as a tree of units called atoms (a structure inherited from the QuickTime file format). One atom was of particular interest for my problem: the Movie Atom (type moov), which contains metadata about the movie. This moov atom contains, among other things, a collection of trak atoms, each of which represents a media track (video track, audio track, subtitle track, alternate-language audio track, etc.).

The trak atoms in the moov atom hold only metadata, no actual data. For example, the subtitle track holds the font name, the font size, and the timing of each subtitle, but not the subtitle text itself. The data, be it video, audio, or text, is stored in mdat atoms at the top level of the file (i.e., outside the moov atom).
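To find the moov atom in the first place, one can walk the file's top-level atoms: every atom starts with a 4-byte big-endian size (which counts the 8-byte header itself) followed by a 4-byte ASCII type code. A minimal Python sketch, ignoring the special cases a real parser would need:

```python
import io
import struct

def walk_atoms(stream):
    """Yield (atom_type, size, payload_offset) for each atom at one level.

    Every MP4 atom begins with a 4-byte big-endian size (header
    included) followed by a 4-byte ASCII type code. A real parser
    would also handle size == 1 (64-bit extended size) and size == 0
    (atom extends to end of file); this sketch does not.
    """
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break
        size, atom_type = struct.unpack(">I4s", header)
        yield atom_type.decode("ascii"), size, stream.tell()
        stream.seek(size - 8, io.SEEK_CUR)  # skip the atom's payload
```

Running this over a file typically yields an ftyp atom, the moov atom, and one or more mdat atoms.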

In order to add the subtitles to the video, my proxy needed to do two things:

Build a trak atom to describe the subtitles, and insert it into the video’s moov atom.

Build an mdat atom to hold the text of the subtitles, and append it to the file.
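The second step can be sketched in a few lines of Python, assuming the 3GPP timed-text (tx3g) convention in which each subtitle sample is a 2-byte big-endian length prefix followed by UTF-8 text (the helper names are mine):

```python
import struct

def build_subtitle_samples(texts):
    """Serialize subtitle texts as 3GPP timed-text (tx3g) samples:
    each sample is a 2-byte big-endian length followed by UTF-8 text.
    Returns the concatenated payload and each sample's offset into it
    (the trak atom's sample tables need those offsets)."""
    payload = b""
    offsets = []
    for text in texts:
        encoded = text.encode("utf-8")
        offsets.append(len(payload))
        payload += struct.pack(">H", len(encoded)) + encoded
    return payload, offsets

def build_mdat(payload):
    """Wrap a payload in an mdat atom: a 4-byte big-endian size
    (header included) followed by the 'mdat' type code."""
    return struct.pack(">I4s", 8 + len(payload), b"mdat") + payload
```

The harder part, not shown here, is the first step: the trak atom contains the sample tables (sizes, timings, chunk offsets) that point into this mdat.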

I started experimenting and, after quite a bit of trial and error, I ended up with a nifty little HTTP server (written in Python) that did just what I needed. My MPMoviePlayerController could then call the proxy (with the URLs of the video and subtitles as parameters), and receive the video with the subtitles track. I could then turn on AirPlay and see the video appear on my TV, along with the subtitles, at normal speed. Hurray!
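To give an idea of the shape of such a proxy, here is a hypothetical skeleton in Python 3. The endpoint path and parameter names are made up for illustration, and the merging logic itself (fetching both files, patching the moov atom, appending the subtitle mdat) is elided:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

class ProxyHandler(BaseHTTPRequestHandler):
    """The player requests /?video=<mp4-url>&sub=<srt-url> and would
    receive the merged file in the response body."""

    def do_GET(self):
        params = parse_qs(urlparse(self.path).query)
        video_url = params.get("video", [None])[0]
        sub_url = params.get("sub", [None])[0]
        if not video_url or not sub_url:
            self.send_error(400, "missing 'video' or 'sub' parameter")
            return
        self.send_response(200)
        self.send_header("Content-Type", "video/mp4")
        self.end_headers()
        # ... fetch both files, merge, and stream the result here ...

# To serve:
# HTTPServer(("0.0.0.0", 8080), ProxyHandler).serve_forever()
```

The player then simply opens the proxy URL instead of the original video URL.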

If you think this proxy could be useful for you, the code is available on GitHub. Feel free to ask questions there if you need help making it work.

Although this tool is an honorable solution to the initial problem, it unfortunately requires a server to host the proxy. What if you could do all this directly on the device? Feeding an MPMoviePlayerController from a local HTTP server running on the device sounds a bit crazy, but hey, why not?

I experimented a bit with CocoaHTTPServer and it seems doable, though more involved than the Python script. I was able to serve a video through a local HTTP server without adding the subtitle track, which is a first step. Unfortunately, I don’t have time to focus on this project at the moment. If someone feels like finishing the job, feel free to contact me.