Multi-Codec DASH Dataset: An Evaluation of AV1, AVC, HEVC and VP9

This scientific evaluation puts AV1 to the test against industry-standard codecs and shows that AV1 is able to outperform VP9 and even HEVC by up to 40%.

Introduction

For practical over-the-top (OTT) streaming applications, it is usually necessary to supply streams in several different video codec standards in order to reach a wide range of devices and platforms. The most commonly used video codecs in this scenario are AVC, VP9 and HEVC. With the standardization of AV1, another modern video coding standard is joining them. While AVC offers the best compatibility across devices and platforms, newer standards such as HEVC and AV1 offer much higher compression efficiency and thereby a better user experience. Another key difference between the codecs is that VP9 and AV1 were developed with the goal of being open source and freely available for anybody to implement and use without royalties, while AVC and HEVC require a royalty to be paid.

The Dataset

Since the main focus is on an HTTP Adaptive Streaming (HAS) dataset, we adopted a set of bitrate/resolution pairs (referred to as the bitrate ladder) ranging from very low bitrates and resolutions of 100 kbit/s at 256×144 pixels up to 4K resolution at 20 Mbit/s. This is a well-established approach for OTT streaming applications. We selected video sequences that cover a range of different properties. To this end, we calculated the spatial and temporal information of each sequence so that the set contains different amounts of motion and texture.
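To make the notion of a bitrate ladder concrete, here is a minimal Python sketch. It contains only the two endpoints named above, because the intermediate rungs are not listed in this text. The names `Rung`, `LADDER_ENDPOINTS` and `pick_rung` are hypothetical, and reading "4k" as 3840×2160 is an assumption. The selection rule shown is a common simple heuristic for HAS players, not something prescribed by the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    """One bitrate/resolution pair of a bitrate ladder."""
    bitrate_kbps: int
    width: int
    height: int


# Only the two endpoints stated in the text; intermediate rungs are omitted.
LADDER_ENDPOINTS = [
    Rung(bitrate_kbps=100, width=256, height=144),
    Rung(bitrate_kbps=20000, width=3840, height=2160),  # assumption: "4k" = 3840x2160
]


def pick_rung(ladder, throughput_kbps):
    """Return the highest-bitrate rung that fits the measured throughput.

    Falls back to the lowest rung when nothing fits.
    """
    fitting = [r for r in ladder if r.bitrate_kbps <= throughput_kbps]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    return min(ladder, key=lambda r: r.bitrate_kbps)
```

A client measuring 5 Mbit/s of throughput would stay on the 144p rung of this two-entry ladder, which illustrates why a real ladder needs many intermediate steps.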

For the adaptive streaming encoding, segment lengths of 2 and 4 seconds were used. For AV1 encoding, a snapshot of the reference software was used (v0.1.0-7691-g84dc6e9), with the cpu_used preset set to 2. The encoding for AVC, HEVC, and VP9 was performed with ffmpeg, using libx264, libx265, and libvpx-vp9 respectively. For these codecs, encoding was performed with the slow preset. For all codecs, a two-pass scheme was employed.
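The exact command lines used for the dataset are not given here. As a rough illustration of what a two-pass libx264 encode of one ladder rung with the slow preset can look like, the following Python sketch builds the two ffmpeg invocations. The function name, file names and the choice to force keyframes at segment boundaries are assumptions, and `/dev/null` assumes a Unix-like system. Packaging the result into DASH segments is not shown.

```python
def x264_two_pass_commands(source, output, width, height,
                           bitrate_kbps, segment_seconds):
    """Build ffmpeg argument lists for a two-pass libx264 encode.

    Illustrative only: not the command lines used to create the dataset.
    """
    common = [
        "ffmpeg", "-y", "-i", source,
        "-vf", f"scale={width}:{height}",      # scale to the rung's resolution
        "-c:v", "libx264", "-preset", "slow",
        "-b:v", f"{bitrate_kbps}k",            # target bitrate of the rung
        # Force a keyframe at every segment boundary so segments can be cut there.
        "-force_key_frames", f"expr:gte(t,n_forced*{segment_seconds})",
        "-an",                                 # video only
    ]
    first_pass = common + ["-pass", "1", "-f", "null", "/dev/null"]
    second_pass = common + ["-pass", "2", output]
    return first_pass, second_pass


if __name__ == "__main__":
    for cmd in x264_two_pass_commands("input.y4m", "144p_100k.mp4",
                                      256, 144, 100, 2):
        print(" ".join(cmd))
```

The lists can be passed to `subprocess.run` one after the other. The first pass only writes the statistics file, which the second pass reads to distribute the bitrate.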

Encoding of the AV1 bitstreams according to these specifications was performed by the Institute of Information Technology at the Alpen-Adria Universität Klagenfurt. Encodings using the other codecs (AVC, HEVC and VP9) were carried out by Bitmovin using the Bitmovin Video Encoding cloud infrastructure. All bitstreams were then collected and jointly evaluated.

The Evaluation

For evaluation, the reconstruction at lower resolutions was upscaled to the original resolution, and the weighted PSNR relative to the original source was calculated over the three planes as (6*Y+U+V)/8. From these values we calculated the corresponding Bjøntegaard-Delta bit-rate (BD-rate) values. Calculated over the entire bitrate ladder, AV1 showed an average bitrate reduction of 13% compared to VP9 and of 17% compared to HEVC. When we focus on the higher part of the bitrate ladder, the BD-rate reduction compared to VP9 increases to 22%-27%, while compared to HEVC the reduction increases to 30%-43%. It should be noted that, because of the fixed bitrate ladder, the overlap between the rate-distortion curves becomes rather small for the highest resolutions in some sequences, and the results should therefore be interpreted with some caution. This could be improved by adapting the bitrate ladder to the properties of the individual sequences.
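The two metrics can be sketched in a few lines of Python. `weighted_psnr` implements the (6*Y+U+V)/8 weighting directly. `bd_rate` is a simplified stand-in for the Bjøntegaard calculation: the usual method fits a cubic polynomial to log-bitrate as a function of PSNR, whereas this sketch uses piecewise-linear interpolation to stay dependency-free. It is therefore not the script used for the evaluation, and its numbers will differ slightly from a cubic fit. Function names and the `(bitrate, psnr)` input layout are assumptions.

```python
import math


def weighted_psnr(psnr_y, psnr_u, psnr_v):
    """Weighted PSNR over the Y, U and V planes: (6*Y + U + V) / 8."""
    return (6.0 * psnr_y + psnr_u + psnr_v) / 8.0


def _log_rate_at(psnr, curve):
    """Linearly interpolate log(bitrate) at a PSNR value.

    `curve` is a list of (psnr, log_bitrate) sorted by strictly increasing PSNR.
    """
    for (q0, r0), (q1, r1) in zip(curve, curve[1:]):
        if q0 <= psnr <= q1:
            t = (psnr - q0) / (q1 - q0)
            return r0 + t * (r1 - r0)
    raise ValueError("PSNR value lies outside the curve")


def bd_rate(ref_points, test_points, steps=1000):
    """Average bitrate difference (in percent) of test vs. reference.

    Each argument is a list of (bitrate, psnr) pairs. Negative results mean
    the test codec needs less bitrate for the same PSNR.
    """
    ref = sorted((q, math.log(r)) for r, q in ref_points)
    test = sorted((q, math.log(r)) for r, q in test_points)

    # Only the PSNR range covered by both curves can be compared.
    lo = max(ref[0][0], test[0][0])
    hi = min(ref[-1][0], test[-1][0])
    if hi <= lo:
        raise ValueError("rate-distortion curves do not overlap in PSNR")

    # Trapezoidal integration of the log-bitrate difference over [lo, hi].
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        q = min(lo + i * h, hi)
        diff = _log_rate_at(q, test) - _log_rate_at(q, ref)
        total += diff * (0.5 if i in (0, steps) else 1.0)
    avg_diff = total * h / (hi - lo)

    return (math.exp(avg_diff) - 1.0) * 100.0
```

The `ValueError` for non-overlapping curves corresponds to the caveat above: with a fixed bitrate ladder, the common PSNR range of two codecs can become very small, and the average is then computed over only a narrow interval.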

Conclusion

The dataset is meant to offer a first HAS test environment for the emerging video coding standard AV1 and for AVC, VP9 and HEVC, the codecs most frequently used in OTT applications. The coding performance results for this test set indicate that AV1 is able to outperform VP9 and even HEVC by up to 40%. Please note that this evaluation primarily targets HAS services and has a very specific setup. While it can give an indication of the coding performance of AV1, the results should be interpreted with caution.