Many platforms offer access to dedicated hardware to perform a range of video-related tasks. Using such hardware allows some operations like decoding, encoding or filtering to be completed faster or using less of other resources (particularly CPU), but may give different or inferior results, or impose additional restrictions which are not present when using software only. On PC-like platforms, video hardware is typically integrated into a GPU (from AMD, Intel or NVIDIA), while on mobile SoC-type platforms it is generally an independent IP core (many different vendors).

Hardware decoders will generate equivalent output to software decoders, but may use less power and CPU to do so. Feature support varies – for more complex codecs with many different profiles, hardware decoders rarely implement all of them (for example, hardware decoders tend not to implement anything beyond YUV 4:2:0 at 8-bit depth for H.264). A common feature of many hardware decoders is the ability to generate output in hardware surfaces suitable for use by other components (with discrete graphics cards, this means surfaces in the memory on the card rather than in system memory). This is often useful for playback, as no further copying is required before rendering the output. In some cases it can also be combined with encoders that accept hardware surface input, avoiding any copying at all when transcoding.

Hardware encoders typically generate output of significantly lower quality than good software encoders like x264, but are generally faster and do not use much CPU resource. (That is, they require a higher bitrate to make output with the same perceptual quality, or they make output with a lower perceptual quality at the same bitrate.)

Systems with decode and/or encode capability may also offer access to other related filtering features. Scaling and deinterlacing are common; other postprocessing may be available depending on the system. Where hardware surfaces are usable, these filters will generally act on them rather than on normal frames in system memory.

There are a lot of different APIs of varying standardisation status available. FFmpeg offers access to many of these, with varying support.

Platform API Availability

Platform              Linux               Windows             Android  Apple       Other
API                   AMD  Intel  NVIDIA  AMD  Intel  NVIDIA           macOS  iOS  Raspberry Pi
AMF                   N    N      N       Y    N      N       N        N      N    N
CUDA / CUVID / NVENC  N    N      Y       N    N      Y       N        N      N    N
Direct3D 11           N    N      N       Y    Y      Y       N        N      N    N
Direct3D 9 (DXVA2)    N    N      N       Y    Y      Y       N        N      N    N
libmfx                N    Y      N       N    Y      N       N        N      N    N
MediaCodec            N    N      N       N    N      N       Y        N      N    N
Media Foundation      N    N      N       Y    Y      Y       N        N      N    N
MMAL                  N    N      N       N    N      N       N        N      N    Y
OpenCL                Y    Y      Y       Y    Y      Y       P        Y      N    N
OpenMAX               P    N      N       N    N      N       P        N      N    Y
V4L2 M2M              N    N      N       N    N      N       P        N      N    N
VAAPI                 P    Y      P       N    N      N       N        N      N    N
VDPAU                 P    N      Y       N    N      N       N        N      N    N
VideoToolbox          N    N      N       N    N      N       N        Y      Y    N

Key:

Y  Fully usable.
P  Partial support (some devices / some features).
N  Not possible.

FFmpeg API Implementation Status

Category              Decoder                                Encoder                     Other support
API                   Internal  Standalone  Hardware output  Standalone  Hardware input  Filtering  Hardware context  Usable from ffmpeg CLI
AMF                   N         N           N                Y           Y               N          Y                 Y
CUDA / CUVID / NVENC  N         Y           Y                Y           Y               Y          Y                 Y
Direct3D 11           Y         -           Y                -           -               F          Y                 Y
Direct3D 9 / DXVA2    Y         -           Y                -           -               N          Y                 Y
libmfx                -         Y           Y                Y           Y               Y          Y                 Y
MediaCodec            -         Y           Y                N           N               -          N                 N
Media Foundation      -         N           N                N           N               N          N                 N
MMAL                  -         Y           Y                N           N               -          N                 N
OpenCL                -         -           -                -           -               Y          Y                 Y
OpenMAX               -         N           N                Y           N               N          N                 Y
RockChip MPP          -         Y           Y                N           N               -          Y                 Y
V4L2 M2M              -         Y           N                Y           N               N          N                 Y
VAAPI                 Y         -           Y                Y           Y               Y          Y                 Y
VDPAU                 Y         -           Y                -           -               N          Y                 Y
VideoToolbox          Y         N           Y                Y           Y               -          Y                 Y

Key:

-  Not applicable to this API.
Y  Working.
N  Possible but not implemented.
F  Not yet integrated, but work is being done in this area.

Use with the ffmpeg command-line tool

Internal hwaccel decoders are enabled via the -hwaccel option. The software decoder starts normally, but if it detects a stream which is decodable in hardware then it will attempt to delegate all significant processing to that hardware. If the stream is not decodable in hardware (for example, it is an unsupported codec or profile) then it will still be decoded in software automatically. If the hardware requires a particular device to function (or needs to distinguish between multiple devices, say if several graphics cards are available) then one can be selected using -hwaccel_device.
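As a rough illustration of these two options, the sketch below assembles a test-decode command for an internal hwaccel. The helper name hwaccel_decode_cmd, the file name input.mkv and the device path /dev/dri/renderD128 are placeholders chosen for this example, not anything defined by FFmpeg. The helper only prints the command, so it can be tried on a machine without suitable hardware.

```shell
#!/bin/sh
# Print an ffmpeg command line that decodes INPUT through an internal
# hwaccel and discards the output (-f null -), which is useful for testing.
# hwaccel_decode_cmd is a hypothetical helper, not part of ffmpeg.
hwaccel_decode_cmd() {
    method=$1   # hwaccel name, e.g. vaapi, dxva2, vdpau
    input=$2    # input file (placeholder)
    device=$3   # optional; passed to -hwaccel_device when non-empty
    set -- ffmpeg -hwaccel "$method"
    if [ -n "$device" ]; then
        set -- "$@" -hwaccel_device "$device"
    fi
    set -- "$@" -i "$input" -f null -
    # Paths containing spaces would need quoting before pasting the result.
    printf '%s\n' "$*"
}

# Example: VAAPI on an assumed DRM render node.
hwaccel_decode_cmd vaapi input.mkv /dev/dri/renderD128
```

Running ffmpeg -hwaccels lists the hwaccel methods compiled into a particular build, which is a quick way to check what can be passed as the first argument.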

External wrapper decoders are used by setting a specific decoder with the -codec:v option. Typically they are named codec_api (for example: h264_cuvid). These decoders require the codec to be known in advance, and do not support any fallback to software if the stream is not supported.

Encoder wrappers are selected by -codec:v. Encoders generally have lots of options – look at the documentation for the particular encoder for details.

Hardware filters can be used in a filter graph like any other filter. Note, however, that they may not support any formats in common with software filters – in such cases it may be necessary to make use of hwupload and hwdownload filter instances to move frame data between hardware surfaces and normal memory.
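The upload/filter/download pattern can be sketched as follows, here with an OpenCL filter. This is an assumption-laden example: the helper name opencl_filter_cmd, the device name ocl, the file names and the choice of yuv420p as the software format are all illustrative, and the default device picked by -init_hw_device may not be the one you want. The helper prints the command rather than running it.

```shell
#!/bin/sh
# Print an ffmpeg command that uploads software frames to an OpenCL device,
# runs one hardware filter there, and downloads the result again.
# opencl_filter_cmd is a hypothetical helper, not part of ffmpeg.
opencl_filter_cmd() {
    input=$1
    output=$2
    filter=${3:-unsharp_opencl}   # any *_opencl filter available in the build
    # format= before hwupload fixes the software format of the uploaded
    # frames; format= after hwdownload selects the format to download into.
    graph="format=yuv420p,hwupload,${filter},hwdownload,format=yuv420p"
    printf 'ffmpeg -init_hw_device opencl=ocl -filter_hw_device ocl -i %s -vf %s %s\n' \
        "$input" "$graph" "$output"
}

opencl_filter_cmd input.mkv output.mkv
```

Here -init_hw_device creates a device named ocl and -filter_hw_device makes it available to hwupload; decoding and encoding stay in software, so the two copies are the price of using the hardware filter.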

VDPAU

Note that VDPAU cannot be used to decode frames in memory: the compressed frames are sent by libavcodec to the GPU device supported by VDPAU, and the decoded image can then be accessed using the VDPAU API. This is not done automatically by FFmpeg, but must be done at the application level (see, for example, the ffmpeg_vdpau.c file used by ffmpeg.c). Also note that with this API it is not possible to move the decoded frame back to RAM, for example if you need to re-encode the decoded frame (e.g. when transcoding on a server).

Several decoders are currently supported through VDPAU in libavcodec, in particular H.264, MPEG-1/2/4, and VC-1.

VAAPI

Video Acceleration API (VAAPI) is a non-proprietary and royalty-free open source software library ("libva") and API specification. It was initially developed by Intel, but can be used in combination with other devices.

It can be used to access the Quick Sync hardware in Intel GPUs and the UVD/VCE hardware in AMD GPUs. See VAAPI.

DXVA2

Several decoders are currently supported, in particular H.264, MPEG-2, VC-1 and WMV 3.

DXVA2 hardware acceleration only works on Windows. In order to build FFmpeg with DXVA2 support, you need to install the dxva2api.h header. For MinGW this can be done by downloading the header maintained by VLC and installing it in the include path (for example in /usr/include/).

For MinGW64, dxva2api.h is provided by default. One way to install mingw-w64 is through a pacman repository, using one of the two following commands, depending on the architecture:

pacman -S mingw-w64-i686-gcc
pacman -S mingw-w64-x86_64-gcc

To enable DXVA2, use the --enable-dxva2 ffmpeg configure switch.

To test decoding, use the following command:

ffmpeg -hwaccel dxva2 -threads 1 -i INPUT -f null - -benchmark

VideoToolbox

VideoToolbox is only supported on macOS. H.264 decoding is available in FFmpeg/libavcodec.

NVENC

NVENC is an API developed by NVIDIA which enables the use of NVIDIA GPU cards to perform H.264 and HEVC encoding. FFmpeg supports NVENC through the h264_nvenc and hevc_nvenc encoders. In order to enable it in FFmpeg you need:

You can see available presets, other options, and encoder info with ffmpeg -h encoder=h264_nvenc or ffmpeg -h encoder=hevc_nvenc.
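A minimal encode sketch using these encoders is shown below. The helper name nvenc_encode_cmd, the file names and the 5M default bitrate are assumptions made for illustration; real settings should come from the encoder help output mentioned above. The helper prints the command instead of executing it.

```shell
#!/bin/sh
# Print an ffmpeg command that encodes INPUT with one of the NVENC encoders.
# nvenc_encode_cmd is a hypothetical helper, not part of ffmpeg.
nvenc_encode_cmd() {
    codec=$1            # h264 or hevc
    input=$2
    output=$3
    bitrate=${4:-5M}    # illustrative default target bitrate
    case $codec in
        h264|hevc) ;;
        *) echo "unsupported codec: $codec" >&2; return 1 ;;
    esac
    # -codec:v after -i applies to the output, i.e. it selects the encoder.
    printf 'ffmpeg -i %s -codec:v %s_nvenc -b:v %s %s\n' \
        "$input" "$codec" "$bitrate" "$output"
}

nvenc_encode_cmd h264 input.mkv output.mkv
```

Decoding in this sketch is left to the software decoder; only the encode step is delegated to the GPU.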

Note: If you get the "No NVENC capable devices found" error, make sure you're encoding to a supported pixel format. See the encoder info as shown above.

Note: FFmpeg now uses its own slightly modified runtime-loader for NVIDIA's CUDA/NVENC/NVDEC-related libraries. If you get an error from configure complaining about missing ffnvcodec, the nv-codec-headers project is what you need. It has a working Makefile with an install target: make install PREFIX=/usr. FFmpeg will look for its pkg-config file, called ffnvcodec.pc. Make sure it is in your PKG_CONFIG_PATH.

CUDA/CUVID/NVDEC

CUVID, which is also called NVDEC by NVIDIA now, can be used for decoding on Windows and Linux.
In combination with NVENC, it offers full hardware transcoding.

CUVID offers decoders for H.264, HEVC, MJPEG, MPEG-1/2/4, VP8/VP9 and VC-1.
Codec support varies by hardware. The full set of codecs is available only on Pascal hardware, which adds VP9 and 10-bit support.

Sample decode using CUVID, the cuvid decoder copies the frames to system memory in this case:
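A minimal sketch of such a decode, assuming an H.264 input and placeholder file names (the helper name cuvid_decode_cmd is hypothetical and only prints the command):

```shell
#!/bin/sh
# Print an ffmpeg command that decodes with a CUVID wrapper decoder.
# No -hwaccel option is given, so decoded frames are returned in system
# memory and the output is produced by an ordinary (software) encoder.
# cuvid_decode_cmd is a hypothetical helper, not part of ffmpeg.
cuvid_decode_cmd() {
    input=$1
    output=$2
    decoder=${3:-h264_cuvid}   # must match the codec of the input stream
    # -codec:v before -i applies to the input, i.e. it selects the decoder.
    printf 'ffmpeg -codec:v %s -i %s %s\n' "$decoder" "$input" "$output"
}

cuvid_decode_cmd input.mp4 output.mkv
```

Because wrapper decoders have no software fallback, the decoder argument has to match the actual stream; for an HEVC input one would pass hevc_cuvid instead.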

The -hwaccel_device option can be used to specify the GPU to be used by the cuvid hwaccel in ffmpeg.
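Combining device selection with NVENC gives a full hardware transcode. The sketch below assumes an H.264 input, GPU index 0 and placeholder file names; cuvid_nvenc_transcode_cmd is a hypothetical helper that only prints the command. The hwaccel name cuvid follows this page; check ffmpeg -hwaccels for the names accepted by your build.

```shell
#!/bin/sh
# Print an ffmpeg command that decodes with CUVID and encodes with NVENC
# on a chosen GPU, keeping frames in GPU memory between the two steps.
# cuvid_nvenc_transcode_cmd is a hypothetical helper, not part of ffmpeg.
cuvid_nvenc_transcode_cmd() {
    gpu=$1      # GPU index passed to -hwaccel_device
    input=$2
    output=$3
    printf 'ffmpeg -hwaccel_device %s -hwaccel cuvid -codec:v h264_cuvid -i %s -codec:v h264_nvenc %s\n' \
        "$gpu" "$input" "$output"
}

cuvid_nvenc_transcode_cmd 0 input.mp4 output.mkv
```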

The note about missing ffnvcodec in the NVENC section applies to CUVID/NVDEC as well.

libmfx

libmfx is a proprietary library from Intel for use of Quick Sync hardware on both Linux and Windows. On Windows it is the primary way to use more advanced functions beyond those accessible via DXVA2/D3D11VA, particularly encode. On Linux it has a very restricted feature set and is hard to use, but may be helpful for some use-cases desiring maximum throughput.

OpenCL

OpenCL can be used for a number of filters. To build, OpenCL 1.2 or later headers are required, along with an ICD or ICD loader to link to. It is recommended (but not required) to link with the ICD loader, so that the implementation can be chosen at run-time rather than build-time. At run-time, an OpenCL 1.2 driver is required; most GPU manufacturers will provide one as part of their standard drivers. CPU implementations are also usable, but may be slower than using native filters in ffmpeg directly.

OpenCL can interoperate with other GPU APIs to avoid redundant copies between GPU and CPU memory. The supported methods are:

DXVA2: NV12 surfaces only, all platforms.

D3D11: NV12 textures on Intel only.

VAAPI: all surface types.

ARM Mali: all surface types, via DRM object sharing.

libmfx: NV12 surfaces only, via VAAPI or DXVA2.
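For the VAAPI case in the list above, the interop can be sketched roughly as below: an OpenCL device is derived from the VAAPI device, decoded surfaces are mapped (not copied) into OpenCL with hwmap, filtered, and mapped back for a VAAPI encode. Treat this as an untested outline under several assumptions: the render node path, the device names va and ocl, the choice of unsharp_opencl and the helper name vaapi_opencl_cmd are all illustrative, and whether a given filter accepts the mapped surface format depends on the driver and build. The helper prints the command only.

```shell
#!/bin/sh
# Print an ffmpeg command that decodes with VAAPI, filters with OpenCL via
# surface mapping, and encodes with VAAPI, without going through system memory.
# vaapi_opencl_cmd is a hypothetical helper, not part of ffmpeg.
vaapi_opencl_cmd() {
    input=$1
    output=$2
    node=${3:-/dev/dri/renderD128}   # assumed DRM render node
    set -- ffmpeg \
        -init_hw_device "vaapi=va:$node" \
        -init_hw_device opencl=ocl@va \
        -hwaccel vaapi -hwaccel_device va -hwaccel_output_format vaapi \
        -i "$input" \
        -filter_hw_device ocl \
        -vf "hwmap,unsharp_opencl,hwmap=derive_device=vaapi:reverse=1" \
        -codec:v h264_vaapi "$output"
    printf '%s\n' "$*"
}

vaapi_opencl_cmd input.mp4 output.mp4
```

The opencl=ocl@va form asks ffmpeg to derive the OpenCL device from the existing VAAPI one, which is what makes the mapping between the two APIs possible.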

AMD UVD/VCE

AMD UVD is usable for decode via VDPAU and VAAPI in Mesa on Linux. VCE also has some initial support for encode via VAAPI, but should be considered experimental.