MJPEG Decoder

Description

Last year the Coding4Fun/Channel 9 guys asked me to work on a few things for MIX10. One of these items was a way to output a webcam stream to Windows Phone 7 for use with Clint's t-shirt cannon project you may have read about. I figured the easiest way to accomplish this was by using a network/IP camera capable of sending a Motion-JPEG stream, which can be easily decoded and displayed on any platform that can display a JPEG image. Thus, this library was born.

It has gone through quite a few changes and I have expanded it to easily display MJPEG streams on a variety of platforms. The developer just references the assembly appropriate to their platform, adds a few lines of code, and away it goes.

Usage

For those that are just interested in the usage, it's as simple as this:

Reference one of the following assemblies appropriate for your project:

In the event handler, take the Bitmap/BitmapImage and assign it to your image display control:

In the case of XNA, use the GetMjpegFrame method in the Update method, which will return a Texture2D you can use in your Draw method.

Call the ParseStream method with the Uri of the MJPEG "endpoint".

That's it! The source code and binaries above both include projects demonstrating how to use the library on each of these platforms. As long as you set the appropriate reference, you can just copy and paste the code in the sample to get your project running (changing the Uri, of course).

If that doesn't fit your needs, you can also access the Bitmap/BitmapImage properties directly from the MjpegDecoder object, or the CurrentFrame property, which will contain the raw JPEG data prior to being decoded.

A Word About Network/IP Cameras

I have tested this against several different cameras. Each device has its own quirks, but all of them seem to work with this library, with one exception: several cameras respond differently when an Internet Explorer user agent header is sent with the HTTP request. Instead of sending down an MJPEG stream, they send a single JPEG image, as Internet Explorer does not properly support MJPEG streams. Unfortunately, this causes the Silverlight version of the decoder to not work properly, as the header cannot be changed from the Internet Explorer default. When this happens, only a single frame will be sent, and the decoding will fail. The only fix I have found is to use a different camera that doesn't behave this way.
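
On platforms where the header can be changed, the same quirk suggests a workaround: send a User-Agent that does not look like Internet Explorer. The following is a minimal sketch in Python (used only for brevity; the library itself is C#/VB), assuming the camera keys its behavior on that header. The function name, the default user agent string, and the URL in the usage note are all made up for illustration.

```python
import urllib.request


def build_stream_request(url, user_agent="mjpeg-viewer/0.1"):
    """Build (but do not send) a GET request with an explicit User-Agent.

    Hypothetical helper: the idea is only that a non-IE User-Agent may
    keep a picky camera from falling back to a single JPEG image.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Passing the result to `urllib.request.urlopen`, for example `urlopen(build_stream_request("http://192.0.2.10/video.mjpg"))`, would then open the stream. This does not help the Silverlight case described above, where the header is fixed.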

What is MJPEG?

Pretty simply, it's a video format where each frame of video is sent as a separate, compressed JPEG image. A standard HTTP request is made to a specific URL, and a multipart response is sent. Parsing this multipart stream into separate images as they are sent results in a series of JPEG images. The viewer displays those JPEG images as quickly as they are sent, and that creates the video. It's not a well-documented format, nor is it perfectly standardized, but it does work. For more information, see the MJPEG article on Wikipedia.
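
In practice the multipart response usually announces itself with a Content-Type header of multipart/x-mixed-replace plus a boundary parameter naming the separator between images. Here is a small sketch in Python (chosen for brevity, not the library's own C#/VB code) that pulls both values out of the header using the standard library. The function name is hypothetical, and real cameras may format the header loosely.

```python
from email.message import Message


def parse_stream_type(content_type):
    """Return (media_type, boundary) for a Content-Type header value.

    boundary is None when the header carries no boundary parameter,
    which is what a plain single-image response would look like.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()   # lower-cased "type/subtype"
    boundary = msg.get_param("boundary")  # unquoted value, or None
    return media_type, boundary
```

A camera that replies with a single image instead of a stream would typically show up here as image/jpeg with no boundary, which makes this a cheap check before attempting to parse frames.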

How Do I Find the MJPEG URL of My Camera?

Excellent question. Not an excellent answer. The user manual may mention the URL. A quick internet search with the model number should get you a result. Or, you can also try this company's lookup tool.

How Does It Work?

Glad you asked. If you take a look at the project, you'll notice there isn't much code. One single file is used with a variety of compiler directives to compile certain portions based on the platform assembly being generated. The MjpegDecoder.cs/.vb file contains the entire implementation.

First, an asynchronous request is made to the provided MJPEG URL inside the ParseStream method. If we are in a Silverlight environment, the AllowReadStreamBuffering property must be set to false so the response returns immediately instead of being buffered. Additionally, we need to register the http:// prefix so that the client HTTP stack is used rather than the browser stack. Finally, the request is made using the BeginGetResponse method, specifying the OnGetResponse method as the callback. The callback is invoked as soon as data is sent from the camera in response to our request.

The callback then streams the response data, looks for a JPEG header marker, reads until it finds the boundary marker, copies the data into a buffer, decodes it, passes it on to whoever wants it via an event, and then starts over.
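
The loop described here can be sketched independently of any platform. The version below is in Python for brevity and is not the library's C#/VB implementation; the class and method names are invented for illustration. It assumes the boundary token from the Content-Type header never occurs inside the JPEG data, and it matches the token with any leading dashes removed, since some cameras declare the dashes in the header and others only send them on the wire.

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG


class FrameSplitter:
    """Incrementally cut JPEG frames out of a multipart byte stream."""

    def __init__(self, boundary):
        # Match the bare token so "frame" and "--frame" behave the same.
        self._token = boundary.lstrip("-").encode("ascii")
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add bytes read from the network; return any complete frames."""
        self._buffer += chunk
        frames = []
        while True:
            start = self._buffer.find(JPEG_SOI)
            if start < 0:
                # No image yet: keep one byte in case the two-byte
                # marker is split across chunks, drop the rest.
                del self._buffer[:-1]
                break
            end = self._buffer.find(self._token, start)
            if end < 0:
                break  # frame still incomplete, wait for more data
            # Drop the CRLF and dashes that precede the boundary token.
            frames.append(bytes(self._buffer[start:end]).rstrip(b"\r\n-"))
            del self._buffer[:end]
        return frames
```

A caller would create one FrameSplitter per stream, pass each network read to feed, and hand every returned frame to the decoder or raise an event with it. A frame is only emitted once the boundary after it has arrived, so the last image of a closed stream may be left in the buffer.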

The ProcessFrame method seen above takes the raw byte buffer, which contains an undecoded JPEG image, and decodes it based on the environment. However, this isn't called in the case of XNA, which we'll see in a moment:

In the case of Silverlight, the BitmapImage object has a SetSource method which takes a stream to be decoded and turned into the image. In WPF, BitmapImage works differently. In this case, BeginInit is called, then the StreamSource property is set to the stream of bytes, and finally EndInit is called. In WinForms, the library will return a Bitmap object which can be initialized with the stream right in the constructor.

In the code above, I look at the Application.Current property to determine if the library is being used by a WPF project. If that property is not null, it is assumed the library is being called from a WPF project.

When compiled as an XNA library, we have no use for a BitmapImage or a Bitmap... we need a Texture2D object. The GetMjpegFrame method seen below is what is called by an XNA application during the Update method to pull the current frame:

@_ivan: I have tested with the Cisco WVC210, TRENDnet IP110W, and a very old no-name brand camera. Practically any IP/network camera (i.e. not a USB webcam) should output MJPEG and work with this. I don't think I've seen an IP cam without some kind of MJPEG support, but that doesn't mean they don't exist.

Why not just use TcpClient instead of HttpWebRequest to get around the limitation on the IE User-Agent header? It's not like it's difficult to write an HTTP 1.0 client, especially if it only needs to GET a single resource.

But then I see TcpClient isn't supported in Silverlight. That really sucks.

@LukePuplett: You can only run this OOB because of the crossdomain.xml policy goodness in Silverlight. The cameras don't have this file on their internal webserver, so browser-based Silverlight won't work at all. That said, even OOB apps can't modify the User-Agent header, either via the Headers collection or the UserAgent property.

@nils: Definitely an option worth exploring. Anybody want to give it a try? I'm swamped for the next few weeks on some other projects. I'd be happy to add it in if anyone manages to make it work...

@Bojan: what kind of load are you seeing and what version of the decoder are you using? (wpf, xna, etc.) In my testing here, my test apps are cranking away at a 640x480 stream at 30fps and I'm only seeing 5-6% CPU at most, typically less.

I don't know exactly how much CPU load there is; I don't have access to the server. I am using the decoder (for WinForms) to display streams from 20 cameras. I made my own server which sends a multipart response with JPEGs, so I will test on it.

@Bojan: Well, if you're constantly reading the stream of 20 cameras simultaneously, there's a chance it could be putting undue load on the server. It has never been tested in that kind of environment. I wouldn't think you'd crush the server, but if each stream is taking 2-3% of CPU to read/decode, make it times 20 and you're taking a decent toll on the server.

Thanks a lot, Brian, for this very informative article. I used WebClient instead of HttpWebRequest to make an SL app after reading your article. I found it to be a little bit easier. It works well.

The SL x-domain access limitation almost defeats the purpose of the app which I made because of being tired of dealing with obsolete ActiveX controls one would have to install on every computer used to view an IP camera. I do not understand why SL does not allow in-browser X-domain access if it is allowed explicitly by the user for a specific uri.

I am not sure what benefits one would get by using System.Net.Sockets as suggested by Nils.

Great library, great examples. However, when switching between two cameras (programmatically, of course), the image object wants to show both streams at the same time. I have used the stopstream method but that doesn't seem to do anything and I can see both live feeds interlaced within each other. Help

Hi Brian, thanks for this! Really helpful. But I have a small weird issue: I can't start more than two decoders in an application (I'm trying to get the stream from the same camera; can that be the problem?), and if I stop the stream, I can't restart it. Could it be related to a keep-alive or an unclosed connection?