Introduction

Ever since I started using VLC Media Player, I have been impressed with its capabilities, especially its built-in codecs, which require no further installations. After exploring the VLC structure a little further, I found the libvlc.dll module, which is an API for the entire VLC engine and contains a rich set of rendering, streaming, and transcoding functionality. Libvlc is a native DLL which exposes hundreds of C method calls. The goal of this article is to provide a .NET API for the libvlc interface so that the vast majority of VLC functionality can be utilized in managed applications.

VLC 1.1.x introduced several improvements and fixes detailed here; the most compelling are GPU decoding and a simplified libVLC API with no exception handling. Version 1.1.1 also adds support for Google's WebM video format.

P/Invoke

In order to use libvlc in a managed application, it has to be wrapped by some kind of interoperability layer. There are three ways to accomplish this:

C++/CLI

COM Interop

P/Invoke

Since libvlc is a native library which exports pure C methods, P/Invoke is chosen here.

If you are planning to enrich your knowledge of P/Invoke, libvlc is a great place to start. It has a large number of structures, unions, and callback functions, and some methods require custom marshalling to handle double pointers and string conversions.

We have to download the VLC source code to better understand the libvlc interface. Please follow this link. After extracting the archive content, go to the <YOUR PATH>\vlc-1.1.4\include\vlc folder.

These are the header files for libvlc. In case you want to use them directly in a native (C/C++) application, there is an excellent article explaining that.

Custom marshalling

The entry point to the libvlc interface is the libvlc_new API defined in libvlc.h:
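As a sketch, libvlc_new can be imported with P/Invoke roughly as shown below, together with an example of a sequentially laid out structure (libvlc_log_message_t from the 1.1 headers). The entry point names come from libvlc.h; the wrapper class itself is illustrative:

```csharp
using System;
using System.Runtime.InteropServices;

internal static class LibVlcApi
{
    // libvlc_instance_t* libvlc_new(int argc, const char* const* argv);
    // In the 1.1 API there is no longer an exception parameter.
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr libvlc_new(int argc,
        [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.LPStr)] string[] argv);

    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    public static extern void libvlc_release(IntPtr instance);
}

// Sequential layout: members appear in native memory in declaration order.
[StructLayout(LayoutKind.Sequential)]
internal struct libvlc_log_message_t
{
    public uint sizeof_msg;     // must be set to the size of this structure
    public int i_severity;      // message severity
    public IntPtr psz_type;     // module type (char*)
    public IntPtr psz_name;     // module name (char*)
    public IntPtr psz_header;   // optional header (char*)
    public IntPtr psz_message;  // the message itself (char*)
}
```

An engine instance can then be created with `LibVlcApi.libvlc_new(0, new string[] { })` and released later with `libvlc_release`.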

LayoutKind.Sequential means that all the members of the structure are laid out sequentially in the native memory.

Unions

Unions are similar to structures, but all the members declared in the type definition begin at the same memory location. This means that the layout must be controlled explicitly rather than left to the runtime marshaler, and this is achieved using the FieldOffset attribute.

If you intend to extend the libvlc_event_t definition with additional values, they must all be decorated with the [FieldOffset(8)] attribute, since all of them begin at an offset of 8 bytes.
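As a sketch, the start of libvlc_event_t can be expressed with an explicit layout like this. The field names follow the C headers; the 8-byte offset assumes a 32-bit build, where the event type (4 bytes) and the sender pointer (4 bytes) occupy the first eight bytes:

```csharp
using System;
using System.Runtime.InteropServices;

// Explicit layout mimics the C union at the tail of libvlc_event_t:
// every union member starts at the same memory offset.
[StructLayout(LayoutKind.Explicit)]
internal struct libvlc_event_t
{
    [FieldOffset(0)] public int type;      // libvlc_event_type_t
    [FieldOffset(4)] public IntPtr p_obj;  // object that sent the event

    // Union members - all begin at offset 8:
    [FieldOffset(8)] public media_player_time_changed media_player_time_changed;
    [FieldOffset(8)] public media_player_position_changed media_player_position_changed;
}

[StructLayout(LayoutKind.Sequential)]
internal struct media_player_time_changed { public long new_time; }

[StructLayout(LayoutKind.Sequential)]
internal struct media_player_position_changed { public float new_position; }
```

Only the union member matching the current event type contains meaningful data, exactly as in C.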

Callback functions

When the underlying VLC engine has its internal state changed, it uses callback functions to notify whoever subscribed for this kind of change. Subscriptions are made using the libvlc_event_attach API defined in libvlc.h. The API has four parameters:

Pointer to the event manager object.

libvlc_event_type_t enum value specifying the event on which callbacks are required.

Callback function to invoke when the event occurs.

User-defined data that is passed back to the callback.

Please note that I marshal a reference to the libvlc_event_t structure so that its parameters can be accessed in the MediaPlayerEventOccured function, unlike other places where I simply use an IntPtr to pass the pointer among method calls.

.NET delegate types are managed versions of C callback functions, therefore the System.Runtime.InteropServices.Marshal class contains conversion routines to convert delegates to and from native method calls. After the delegate definition is marshaled to a native function pointer callable from native code, we have to maintain a reference for the managed delegate to prevent it from being deallocated by the GC, since native pointers cannot “hold” a reference to a managed resource.
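A sketch of this pattern is shown below: the delegate is stored in a field so it stays alive for as long as native code may call it. libvlc_event_attach is the real entry point; the wrapper class and the simplified libvlc_event_t are illustrative:

```csharp
using System;
using System.Runtime.InteropServices;

// Simplified event structure - only the fixed header, the union is omitted.
[StructLayout(LayoutKind.Sequential)]
internal struct libvlc_event_t
{
    public int type;
    public IntPtr p_obj;
}

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
internal delegate void VlcEventHandlerDelegate(ref libvlc_event_t e, IntPtr userData);

internal class EventSubscriber
{
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern int libvlc_event_attach(IntPtr eventManager, int eventType,
        VlcEventHandlerDelegate callback, IntPtr userData);

    // Keeping the delegate in a field prevents the GC from collecting it
    // while native code still holds the marshaled function pointer.
    private readonly VlcEventHandlerDelegate m_callback;

    public EventSubscriber(IntPtr eventManager, int eventType)
    {
        m_callback = MediaPlayerEventOccured;
        libvlc_event_attach(eventManager, eventType, m_callback, IntPtr.Zero);
    }

    private void MediaPlayerEventOccured(ref libvlc_event_t e, IntPtr userData)
    {
        // Inspect e.type here and raise the corresponding .NET event.
    }
}
```

Without the field, the GC would be free to collect the delegate after the attach call, and the next native callback would crash the process.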

nVLC API

IMediaPlayerFactory - Wraps the libvlc_instance_t handle and is used to create media objects and media player objects.

IPlayer - holds a libvlc_media_player_t handle and is used for basic playout when no audio or video output is needed, for example, streaming or transcoding of media.

IAudioPlayer – Extends IPlayer and is used to play and/or stream audio media.

IVideoPlayer – Extends IAudioPlayer and is used to render and/or stream audio and video media.

Memory management

Since each wrapper object holds a reference to native memory, we have to make sure this memory is released when the managed object is reclaimed by the garbage collector. This is done by implicitly or explicitly calling the Dispose method by user code, or by the finalizer when object is deallocated. I wrapped this functionality in the DisposableBase class:
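A sketch of the idea is shown below; the real DisposableBase may differ in details, but it follows the standard .NET dispose pattern:

```csharp
using System;

public abstract class DisposableBase : IDisposable
{
    private bool m_isDisposed;

    public void Dispose()
    {
        if (!m_isDisposed)
        {
            Dispose(true);              // invoked from user code
            GC.SuppressFinalize(this);  // the finalizer is no longer needed
            m_isDisposed = true;
        }
    }

    ~DisposableBase()
    {
        if (!m_isDisposed)
        {
            Dispose(false);             // invoked by the finalizer
            m_isDisposed = true;
        }
    }

    // true  -> release both managed and unmanaged resources
    // false -> release unmanaged resources only
    protected abstract void Dispose(bool disposing);
}
```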

Each class that inherits from DisposableBase must implement the Dispose(bool) method. It is called with true when invoked from user code, in which case both managed and unmanaged resources may be released, or with false when invoked by the finalizer, in which case only native resources may be released.

Logging

VLC implements logging logic in the form of a log iterator, so I decided to implement it also using the Iterator pattern, i.e., using a yield return statement:
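A sketch of such an iterator over the libvlc 1.1 log API is shown below. The entry points and the message structure come from libvlc.h; the wrapper class itself is illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

internal class VlcLogReader
{
    [StructLayout(LayoutKind.Sequential)]
    private struct libvlc_log_message_t
    {
        public uint sizeof_msg;
        public int i_severity;
        public IntPtr psz_type;
        public IntPtr psz_name;
        public IntPtr psz_header;
        public IntPtr psz_message;
    }

    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr libvlc_log_get_iterator(IntPtr log);
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern int libvlc_log_iterator_has_next(IntPtr iterator);
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr libvlc_log_iterator_next(IntPtr iterator, ref libvlc_log_message_t buffer);
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern void libvlc_log_iterator_free(IntPtr iterator);
    [DllImport("libvlc", CallingConvention = CallingConvention.Cdecl)]
    private static extern void libvlc_log_clear(IntPtr log);

    private readonly IntPtr m_log;  // handle obtained from libvlc_log_open

    public VlcLogReader(IntPtr log) { m_log = log; }

    public IEnumerable<string> GetMessages()
    {
        IntPtr iterator = libvlc_log_get_iterator(m_log);
        try
        {
            libvlc_log_message_t msg = new libvlc_log_message_t();
            msg.sizeof_msg = (uint)Marshal.SizeOf(typeof(libvlc_log_message_t));
            while (libvlc_log_iterator_has_next(iterator) != 0)
            {
                libvlc_log_iterator_next(iterator, ref msg);
                yield return Marshal.PtrToStringAnsi(msg.psz_message);
            }
        }
        finally
        {
            libvlc_log_iterator_free(iterator);
            libvlc_log_clear(m_log);  // clean up so messages are not re-read
        }
    }
}
```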

This code is called on each timeout (the default is 1 second), iterates over all existing log messages, and then cleans up the log. The actual writing to the log file (or any other target) is implemented using NLog, and you should add a custom NLog configuration section to your app.config for this to work.

Using the code

Before running any application that uses nVLC, you have to download VLC 1.1.x or a later version from here. After running the installer, go to C:\Program Files\VideoLAN\VLC and copy the following items to your executable path:

libvlc.dll

libvlccore.dll

plugins directory

If any of these is missing at runtime, a DllNotFoundException will be thrown.

In your code, add a reference to the Declarations and Implementation projects. The first object you have to construct is the MediaPlayerFactory, from which you can create media objects by calling CreateMedia and media player objects by calling CreatePlayer.
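Putting it together, a minimal playback sketch might look like the following. The generic CreateMedia/CreatePlayer signatures and the WindowHandle property are assumed from the description above:

```csharp
// A minimal playback sketch, assuming the nVLC assemblies are referenced.
IMediaPlayerFactory factory = new MediaPlayerFactory();

IMedia media = factory.CreateMedia<IMedia>(@"C:\Videos\movie.mp4");
IVideoPlayer player = factory.CreatePlayer<IVideoPlayer>();

player.Open(media);
player.WindowHandle = myPanel.Handle;  // handle of any WinForms control
player.Play();
```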

Playback DirectShow

VLC has built-in support for DirectShow capture source filters; that means that if you have a web cam or video acquisition card that has a DirectShow filter, it can be used seamlessly by using the libvlc API.

Note that the media path is always set to dshow:// and the actual video device is specified by the option parameter.
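For example (the device name below is hypothetical; :dshow-vdev and :dshow-adev are standard VLC options, and passing extra options to CreateMedia is assumed from the API description):

```csharp
// "dshow://" selects the DirectShow access module; the device itself
// is chosen through media options.
IMedia media = factory.CreateMedia<IMedia>("dshow://",
    ":dshow-vdev=USB Video Device",  // video capture device, by name
    ":dshow-adev=none");             // no audio capture

player.Open(media);
player.Play();
```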

Playback network stream

VLC supports a wide range of network protocols like UDP, RTP, HTTP, and others. By specifying a media path with a protocol name, IP address, and port, you can capture the stream and render it the same way as opening a local media file:
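For example, to receive a UDP stream arriving on local port 9000 (reusing the factory and player from the earlier sketch):

```csharp
// "udp://@:9000" means "listen on local port 9000"; the stream is
// then rendered exactly like a local media file.
IMedia media = factory.CreateMedia<IMedia>("udp://@:9000");

player.Open(media);
player.Play();
```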

Streaming

Beyond impressive playback capabilities, VLC also acts as a no less impressive streaming engine. Before we jump into the implementation details, I will shortly describe the streaming capabilities of the VLC Media Player.

After running VLC, go to Media -> Streaming; the "Open media" dialog opens, where you specify the media you want to broadcast over the network:

As shown above, you can stream a local file, disc, network stream, or capture device. In this case, I chose a local file, pressed "Stream", and on the next tab pressed "Next":

Now you can choose the destination of the previously selected stream. If the "File" option is selected and "Activate Transcoding" is checked, you are simply transcoding (or remultiplexing) the media to a different format. For the sake of simplicity, I chose UDP, pressed "Add", and then specified 127.0.0.1:9000, which means I want to stream the media locally on my machine to port 9000.

Make sure "Activate Transcoding" is checked, and press the "Edit Profile" button:

This dialog lets you choose the encapsulation, which is a media container format, a video codec, and an audio codec. The number of possibilities here is huge, and note that not every video and audio format is compatible with each container, but again, for the sake of simplicity, I chose to use the MP4 container with an h264 video encoder and an AAC audio encoder. After pressing "Next", you will have the final dialog with the "Generated stream output string".

This is the most important part as this string should be passed to the media object so you can simply copy it and use it in the API as follows:
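For example, a sout string similar to the one generated above can be passed as a media option. The transcode values below are an example; use the exact string your own dialog produced:

```csharp
// The generated stream output string is passed as a media option.
IMedia media = factory.CreateMedia<IMedia>(@"C:\Videos\movie.mp4",
    ":sout=#transcode{vcodec=h264,vb=800,acodec=mp4a,ab=128,channels=2}" +
    ":udp{dst=127.0.0.1:9000}");

player.Open(media);
player.Play();  // no video window - the media is being streamed
```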

This will open the selected movie file, transcode it to the desired format, and stream it over UDP.

Memory renderer

Normally, you would render your video on screen, passing some window handle on which the actual frames are displayed according to the media clock. LibVLC also allows you to render raw video (pixel data) to a pre-allocated memory buffer. This functionality is implemented by the libvlc_video_set_callbacks and libvlc_video_set_format APIs. IVideoPlayer has a property called CustomRenderer of type IMediaRenderer which wraps these two APIs.

By calling the SetCallback method, your callback will be invoked when a new frame is ready to be displayed. The System.Drawing.Bitmap object passed to the callback method is valid only inside a callback; afterwards it is disposed, so you have to clone it if you plan to use it elsewhere. Also note that the callback code must be extremely efficient; otherwise, the playback will be delayed and frames may be dropped. For instance, if you are rendering a 30 frames per second video, you have a time slot of approximately 33 ms between frames. You can test for performance degradation by comparing the values of IVideoPlayer.FPS and the IMemoryRenderer.ActualFrameRate. The following code snippet demonstrates rendering of 4CIF frames in RGB24 format:
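A sketch of that snippet is shown below. CustomRenderer, SetCallback, SetFormat, BitmapFormat, and ChromaType are named in the article; the exact ChromaType member for RGB24 and the ProcessFrame consumer are assumptions:

```csharp
// Render 4CIF (704x576) frames in RGB24 into memory instead of on screen.
IMemoryRenderer memRenderer = player.CustomRenderer;

memRenderer.SetCallback(delegate(Bitmap frame)
{
    // "frame" is valid only inside this callback; clone it if it must
    // outlive the call. Keep this code fast - at 30 FPS you have
    // roughly 33 ms per frame before playback falls behind.
    Bitmap copy = (Bitmap)frame.Clone();
    ProcessFrame(copy);  // hypothetical consumer; disposes the copy when done
});

// Bytes per pixel, frame size, and pitch are derived internally
// from the chroma type.
memRenderer.SetFormat(new BitmapFormat(704, 576, ChromaType.RV24));

player.Open(media);
player.Play();
```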

If you want to query for frames at your own pace, you should use the CurrentFrame property. It will return the latest frame that was scheduled for display. It is your own responsibility to free its resources after you are done with it.

The SetFormat method accepts a BitmapFormat object which encapsulates the frame size and pixel format. Bytes per pixel, size of the frame, and pitch (or stride) are calculated internally according to the ChromaType value.

The IVideoPlayer may operate either in on-screen rendering mode or in memory rendering mode. Once you switch it to memory rendering mode by accessing the CustomRenderer property, you will not see any video on screen.

Advanced memory renderer

Starting with libVLC 1.2.0, it is possible to use the VLC engine to output decoded audio and visual data for custom processing, i.e., input any kind of encoded and multiplexed media and output as decoded video frames and audio samples. The format of audio and video samples can be set before playback starts, as well as video size, pixel alignment, audio format, number of channels, and more. When playback starts, the appropriate callback function will be invoked for each video frame upon its display time and for a given number of audio samples by their playback time. This gives you, as a developer, great flexibility since you can apply different image and sound processing algorithms and, if needed, eventually render the audio visual data.

libVLC exposes this advanced functionality through the libvlc_video_set_*** and libvlc_audio_set_*** sets of APIs. In the nVLC project, the video functionality is exposed through the ICustomRendererEx interface.

To make the task of rendering video samples and playing audio samples easier, I developed a small library called Taygeta. It started as a testing application for the nVLC features, but I liked it so much that I decided to convert it into a standalone project. It uses Direct3D for hardware-accelerated video rendering and XAudio2 for audio playback. It also contains a sample application with all the previously described functionality.

Memory input

As explained in previous sections, VLC provides many access modules for your media. When none of those satisfies your requirements, and you need, for example, to capture a window's contents or stream a 3D scene to another machine, memory input will do the work, as it provides an interface for streaming media from a memory buffer. libVLC contains two modules for memory input: invmem and imem. The problem is that neither of them is exposed by the libVLC API, and one has to put some real effort into making them work, especially from managed code.

Invmem was deprecated in libVLC 1.2, so I will not describe it here. It is exposed via the IVideoInputMedia object, and you can search the "Comments and Discussions" forum for usage examples.

Imem, on the other hand, is still supported and is exposed by the IMemoryInputMedia object:
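A sketch of the shape described below is given here; apart from the three AddFrame overloads, the member names are hypothetical, and FrameData and StreamInfo stand for whatever types the library uses to describe a native-heap frame and the stream format:

```csharp
using System.Drawing;

public interface IMemoryInputMedia : IMedia
{
    // Frame data is copied internally, so the source buffer, array,
    // or bitmap may be released as soon as the call returns.
    void AddFrame(FrameData frame);                 // pointer on the native heap
    void AddFrame(byte[] data, long pts, long dts); // managed byte array
    void AddFrame(Bitmap bitmap, long pts, long dts);

    // Describe the stream format and the size of the frame queue.
    void Initialize(StreamInfo streamInfo, int maxItemsInQueue);

    // Number of frames currently waiting in the queue.
    int PendingFramesCount { get; }
}
```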

The interface provides three AddFrame overloads, which take frame data from a pointer on the native heap, a managed byte array, or a Bitmap object. Each method copies the data into an internal structure and stores it in the frame queue, so you can release the frame's resources right after calling AddFrame. Once you initialize the IMemoryInputMedia and call Play on the media player object, VLC launches a playback thread that runs an infinite loop; inside the loop, it fetches one frame at a time and pushes it as quickly as possible to the downstream modules.

To support this paradigm, I created a producer/consumer queue to hold media frames. The queue is a BlockingCollection, which perfectly suits the needs of this module: it blocks the producer thread when the queue is full and blocks the consumer thread when the queue is empty. The default queue size is 30, so it caches approximately 1 second of video, which allows smooth playback. Take into account that increasing the queue size will affect your memory usage: one frame of HD video (1920x1080) in BGR24 occupies about 5.93 MB. If you have frame rate control over your media source, you can periodically check the number of pending frames in the queue and increase or decrease the rate accordingly.

The DTS and PTS values are used to notify the libVLC engine when a frame should be handled by the decoder (decoding time stamp) and when it should be presented by the renderer (presentation time stamp). The default value for DTS is -1, which means it is ignored and only the PTS is used; this is appropriate for raw video frames such as BGR24 or I420, which go directly to the renderer and need no decoding. The PTS, however, is mandatory, and if your media frames do not carry one, it can easily be calculated from the FPS of your media source and a frame counter:
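A minimal sketch of that calculation, assuming libVLC timestamps expressed in microseconds and a 25 FPS source:

```csharp
// Derive PTS values from the source frame rate and a frame counter.
internal static class PtsClock
{
    private const double Fps = 25.0;  // frame rate of your media source
    private static long s_frameNumber;

    public static long NextPts()
    {
        // 1,000,000 microseconds per second divided by the frame rate
        // gives the duration of one frame.
        long pts = (long)(s_frameNumber * (1000000.0 / Fps));
        s_frameNumber++;
        return pts;
    }
}
```

Each frame is then submitted with something like `media.AddFrame(buffer, PtsClock.NextPts(), -1)`, where the DTS of -1 tells the engine to use the PTS only.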

The CaptureDeviceInfo is a helper I wrote using DirectShow to enumerate the available capture devices. Displaying the device directly works just fine, but streaming doesn't work at all; if I stream from the VLC application itself, everything works fine.

If I try to duplicate the transcoded stream to the display, nothing shows up. Maybe the transcoding is failing somewhere? I don't get any log information, because when the logging system is initialized it claims I need version 2.1.x to use the feature.

OK, so that works. Now the stream is sometimes black on startup, seemingly at random. I think this is because my capture device only works if it is set up to exactly match the screen resolution and refresh rate of my graphics card.

So now how do I specify these parameters to VLC? Capture width, height, refresh rate, etc.

You need to enumerate the audio output modules using IMediaPlayerFactory.AudioOutputModules. Each module has a list of audio output devices, returned by IMediaPlayerFactory.GetAudioOutputDevices. After you have found the audio output module and device, you can set them with IAudioPlayer.SetAudioOutputModuleAndDevice.