Video Encoding by the Numbers


On January 1, 2017, the Streaming Learning Center and Jan Ozer announced the availability of a new full-color, 330-page book entitled Video Encoding by the Numbers: Eliminate the Guesswork from Your Streaming Video. The book teaches readers to optimize the quality and efficiency of their streaming video by objectively measuring the impact of critical configuration options with industry-standard quality metrics like PSNR and SSIMplus. This takes the guesswork out of most encoding decisions and allows readers to achieve the optimal quality/data rate tradeoff.

Since all videos encode differently, the tests detailed in the book involve eight different videos, including movie footage, animations, talking head footage, a music video, and PowerPoint and Camtasia-based videos. The book walks the reader through quality testing, basic encoding configurations, encoding with H.264, HEVC, and VP9, and encoding for adaptive streaming.

When appropriate, the chapters conclude with a section detailing how to configure the options discussed with FFmpeg, a preferred tool for high-volume video producers, including packaging into HLS and DASH formats (the latter with MP4Box). The book also details how to use key Apple HLS creation and checking tools like Media File Segmenter and Variant Playlist Creator.

The book is a complete rewrite of Ozer’s last book, Producing Streaming Video for Multiple Screen Delivery, going much deeper into choosing encoding configurations for H.264, HEVC, and VP9, and addressing new technologies like DASH, CMAF, and others. The new book does not go into many of the topics covered in the older book, focusing almost exclusively on VOD rather than live.

Learn to optimize the data rate of your streaming files

The book uses eight different files throughout, representing a cross-section from the business and entertainment sectors, and starts by demonstrating how to optimize the data rate for each file type using CRF encoding. For example, this table from Chapter 6 shows how dramatically different data rates deliver relatively homogeneous quality (particularly looking at SQM scores) for the different content types at 1080p resolution. The book also shows examples from Hollywood publishers that verify these data rates, with similar data for 720p and 360p resolutions.

Later chapters show multiple approaches to per-title encoding. Data rate largely determines streaming cost and quality, and this book will help you get it right. The book also provides a suggested encoding ladder for H.264 through 1080p, and encoding ladders up to 4K for VP9 and HEVC.

This table shows the data rate and quality level of the eight different files used throughout the book.
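
The CRF approach described above can be sketched with FFmpeg. This is a minimal illustration rather than the book's own command: it assumes an FFmpeg build that includes libx264, the file names and helper names are hypothetical, and CRF 23 is used only because it is x264's default. The first helper builds a single-pass CRF command; the second converts the size of the resulting file into an average data rate, which is the number you would compare across content types.

```python
import shlex


def build_crf_command(source, output, crf=23, preset="medium"):
    """Return an FFmpeg argument list for a single-pass x264 CRF encode.

    The helper name and defaults are illustrative assumptions.
    """
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-c:v", "libx264",   # x264 encoder as exposed by FFmpeg
        "-preset", preset,
        "-crf", str(crf),    # quality target; lower values mean higher quality
        "-an",               # drop audio so the file size reflects video only
        output,
    ]


def average_kbps(size_bytes, duration_seconds):
    """Average data rate in kilobits per second (1 kbps = 1000 bits/s)."""
    return size_bytes * 8 / duration_seconds / 1000


if __name__ == "__main__":
    # Hypothetical file names.
    cmd = build_crf_command("talking_head_1080p.mp4", "talking_head_crf23.mp4")
    print(shlex.join(cmd))
    # A 15,000,000-byte, 60-second encode averages 2000 kbps.
    print(average_kbps(15_000_000, 60))
```

Running the same CRF value against each test clip and comparing the resulting data rates is one way to see how differently the content types encode.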

Make informed compression-related decisions

Every configuration-related decision for H.264, VP9, and HEVC is tied to objective quality metrics, allowing readers to understand how different configuration options impact quality and/or encoding speed. As an example, this chart shows how quality and encoding time vary by x264 preset.

Readers interested in encoding speed/capacity can instantly see that the Faster preset delivers over 90% of the available quality at a fraction of the encoding time of the Very Slow preset. Those interested in ultimate quality can see that the Very Slow preset delivers better quality than the Placebo preset in a fraction of the time.

This chart shows how encoding time and quality vary by x264 preset.

The table below shows the impact of reference frames on encoding quality, with 10 reference frames delivering the highest quality, but only by about 0.41% over a single reference frame, a difference viewers would not notice. The following table in the book shows that encoding with 10 reference frames roughly doubles encoding time. Together, these stats show that those encoding with 10 reference frames can roughly double capacity with minimal impact on quality by changing to a single reference frame.

This table shows how quality varies with the number of reference frames, with red backgrounds indicating the worst PSNR score, and green backgrounds the best.
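
A comparison like the reference-frame test above can be set up by encoding the same source several times while changing one option. The following is a rough sketch, not the book's test script: the function names, output naming scheme, and the 3000 kbps target are assumptions, and the timings in the example are hypothetical. It relies on FFmpeg's `-preset` and `-refs` options for libx264.

```python
# The ten x264 presets, fastest to slowest.
X264_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
]


def build_test_encode(source, preset, refs, bitrate_kbps=3000):
    """Build one FFmpeg command for a preset/reference-frame test encode."""
    output = f"test_{preset}_refs{refs}.mp4"  # hypothetical naming scheme
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-c:v", "libx264",
        "-preset", preset,
        "-refs", str(refs),            # number of reference frames
        "-b:v", f"{bitrate_kbps}k",    # same target for every test encode
        "-an",
        output,
    ]


def capacity_gain(slow_seconds, fast_seconds):
    """How many fast encodes fit in the time of one slow encode."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    for refs in (1, 5, 10, 16):
        print(" ".join(build_test_encode("source.mp4", "medium", refs)))
    # Hypothetical timings: 200 s with 10 reference frames, 100 s with 1.
    print(capacity_gain(200.0, 100.0))  # 2.0, i.e. roughly double the capacity
```

Each output file would then be measured against the source with an objective metric, and the scores compared with the encoding times.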

Learn to build your own encoding ladder and encode and package for HLS and DASH

Three chapters cover how to choose an ABR technology and how to build, encode, and package your encoding ladder. Detailed analysis helps the reader make informed decisions on parameters like segment duration, or the initial stream offered for playback in the adaptive group.

This chart from Bitmovin (included with permission) shows how segment duration impacts throughput.
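
Segment duration interacts with the keyframe interval, because each segment has to start on a keyframe. As a hedged sketch of how this might look with FFmpeg's HLS muxer, the helper below derives a GOP size from the frame rate and keyframe interval and refuses segment durations that are not a multiple of that interval. The function names, file names, and the 2-second/6-second defaults are illustrative assumptions, not recommendations taken from the book.

```python
def gop_size(frame_rate, keyframe_seconds):
    """Number of frames between keyframes."""
    return round(frame_rate * keyframe_seconds)


def build_hls_command(source, playlist, frame_rate=30,
                      keyframe_seconds=2, segment_seconds=6):
    """Build an FFmpeg command that encodes and segments one HLS rendition."""
    if segment_seconds % keyframe_seconds != 0:
        raise ValueError(
            "segment duration should be a multiple of the keyframe interval"
        )
    gop = gop_size(frame_rate, keyframe_seconds)
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-c:v", "libx264",
        "-g", str(gop),             # maximum keyframe interval in frames
        "-keyint_min", str(gop),    # minimum keyframe interval in frames
        "-sc_threshold", "0",       # no extra keyframes at scene changes
        "-c:a", "aac",
        "-f", "hls",
        "-hls_time", str(segment_seconds),
        "-hls_playlist_type", "vod",
        playlist,
    ]


if __name__ == "__main__":
    # Hypothetical file names.
    print(" ".join(build_hls_command("source.mp4", "stream_1080p.m3u8")))
```

Using the same keyframe settings for every rung of the ladder keeps segments aligned across renditions, which adaptive players depend on when switching streams.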

Learn to benchmark your own footage

The book contains chapters on how to use the Moscow State University Video Quality Measurement Tool, and the SSIMwave Quality of Experience Monitor, as well as FFmpeg, to produce your own objective quality metrics. If you want to go the extra step and perform your own tests, this book will show you how, both through detailed instruction and through referenced videos like this one.
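
To make the idea of an objective metric concrete, here is a minimal sketch of PSNR, one of the metrics mentioned earlier, computed over two equal-length sequences of 8-bit sample values. It uses the standard definition, 10·log10(MAX² / MSE); the function name is an assumption, and real tools work on decoded video frames rather than short byte strings.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two sample sequences."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between corresponding samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = bytes([100, 100])
    encoded = bytes([110, 90])   # each sample is off by 10, so MSE = 100
    print(round(psnr(original, encoded), 2))
    print(psnr(original, original))
```

Higher scores mean the encoded samples are closer to the source, which is why the same source must be used as the reference for every encode being compared.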

This book will help you understand the streaming environment that you’re publishing within, and how to choose and use different ABR technologies like HTTP Live Streaming and DASH for the various targets of your streaming video.

Learn to use command line tools like FFmpeg, MP4Box, Apple Media File Segmenter, and Apple Variant Playlist Creator

Most chapters end with detailed instruction on how to encode the parameters discussed in that chapter using FFmpeg, making the book a great tool for those learning how to encode H.264, VP9, and HEVC with the command line tool. Later chapters cover ABR encoding and packaging with FFmpeg, MP4Box, and Apple tools like Apple Media File Segmenter and Variant Playlist Creator. The book contains comprehensive working samples that make it simple for those creating their own command line-based encoders to quickly get up to speed.

Chapter-by-Chapter Description

Section I: Introduction

Chapter 1: Technology Fundamentals. While the book is targeted towards intermediate to advanced compressionists, I cover encoding basics like codecs and container formats for newbies, or for a quick refresher for advanced users.

Chapter 2: Basic File Parameters. Sounds silly, but I logically couldn’t start talking about how to use objective quality metrics without describing file parameters like data rate, resolution, frame rate, and bits per pixel. Here I define them at a high level; in later chapters you’ll learn how to customize them with information gleaned from your objective quality metrics.

Chapter 3: Essential Tools. Beyond the objective metrics themselves, there are several tools that deliver file information that you simply can’t live without, like MediaInfo, Bitrate Viewer, and Telestream Switch. In this chapter, you’ll learn what these tools do and where to get them.

Chapter 4: Testing Overview. Like video compression itself, objective testing is a garbage-in/garbage-out medium. If you start with bad inputs, you’ll end up with worthless results. This chapter covers a range of testing procedures, from choosing a test clip to verifying your encodes before applying the metric.

Chapter 5: Working with MSU VQMT. This tool has quickly become absolutely essential to me and my encoding practice, and in this chapter you’ll learn why, and how to use it most efficiently.

Chapter 6: Working with SQM. SQM has several features that VQMT doesn’t offer, including the ability to rate quality by the playback device. It’s an expensive tool at over $3,000, but I’ve found it very valuable in my encoding practice.
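
As a small illustration of the file parameters listed for Chapter 2, bits per pixel relates the other three: it is the video data rate divided by the number of pixels delivered each second (width × height × frame rate). The sketch below uses that standard definition; the function name and the sample figures are assumptions chosen for the example.

```python
def bits_per_pixel(data_rate_kbps, width, height, frame_rate):
    """Bits per pixel = bits per second / pixels per second."""
    bits_per_second = data_rate_kbps * 1000
    pixels_per_second = width * height * frame_rate
    return bits_per_second / pixels_per_second


if __name__ == "__main__":
    # Hypothetical 1080p stream at 30 fps and 4500 kbps.
    print(round(bits_per_pixel(4500, 1920, 1080, 30), 4))
```

Because the value normalizes for resolution and frame rate, it offers a quick way to compare how aggressively two differently sized files are compressed.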

Section II: General Application

Chapter 7: Choosing Data Rate. Now that you know how to use multiple objective benchmarks, we’ll start to apply that knowledge to learn how to choose file data rates. You’ll also learn how to use constant rate factor (CRF) encoding, a mode in the x264 codec (and VP9/x265) to assist your data rate decisions.

Chapter 8: Bitrate Control. You choose between constant bitrate encoding (CBR) and variable bitrate encoding (VBR) based upon three factors: overall file quality, transient quality, and file deliverability. In this chapter, you’ll learn the impact of your bitrate control technique on all three. If you have only one chapter to read in this book, this should be it.

Chapter 9: I-, B-, P-, and Reference Frames. Rules for choosing key frame and B-frame interval, and the number of reference frames, by video type and by type of deployment (single file or adaptive).

Section III: Codec-specific Application

Chapter 10: Encoding H.264. This chapter covers H.264-specific topics like royalties, entropy encoding, and how quality varies among the various H.264 codecs.

Chapter 11: Encoding HEVC. This chapter covers the HEVC royalty situation; where you can consider using HEVC and where you shouldn’t; how the various HEVC codecs compare quality-wise; and how to encode with x265, a high-quality, open-source HEVC codec.

Chapter 12: Encoding VP9. This chapter details where VP9 plays (everywhere), how the quality compares with that of HEVC and H.264, and how to encode VP9 with FFmpeg. You’ll also learn that VP9 will soon be replaced by a codec called AV1, from the Alliance for Open Media—a response to the slow, muddled, and overly expensive royalty policies proffered by HEVC patent owners.

Section IV: Multiple-screen Adaptive Bitrate Delivery

Chapter 13: Choosing an ABR Technology. This chapter identifies the technologies, like HTTP Live Streaming (HLS), Dynamic Adaptive Streaming over HTTP (DASH), and others, that you’ll need to deliver to computers, mobile devices, OTT devices like Roku and Apple TV, and smart TVs. In particular, you’ll learn about the transition from Flash to HTML5 on computers, and the various components of HTML5, including Media Source Extensions and Encrypted Media Extensions.

Chapter 14: Configuring Your Encoding Ladder. This chapter walks you through how to choose the rungs on your adaptive bitrate ladder, and issues like whether to use one ladder for all targets, or a different ladder for each target.

Chapter 15: Encoding and Packaging ABR Streams. So you’ve chosen your ABR technology and created your encoding ladder; now it’s time to encode. In this chapter, you’ll learn how to use FFmpeg to create your encoding ladder, and create media and master playlists for your HLS streams. You’ll also get an extensive look at the Apple tools for HLS creation, Media File Segmenter, Variant Playlist Creator, and Media Stream Validator. For DASH, you’ll learn how to use open-source tool MP4Box to do the same. You’ll also learn when to consider workflows like dynamic packaging that can save big bucks in high-volume encoding operations.

Chapter 16: Per-title Encoding. You’ll learn early on that every video is different and really requires a unique encoding ladder. In December 2015, Netflix introduced its per-title encoding algorithm, which does just that. YouTube followed soon thereafter, as did one commercial application, Capella Systems Cambria. You’ll learn what per-title encoding is, how it works, and several techniques for applying it yourself—including capped CRF.
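
As a rough sketch of the capped CRF technique named in Chapter 16, the idea is to let CRF choose the data rate for each video while a maximum rate keeps hard-to-encode content from exceeding a ceiling. With FFmpeg and x264 this can be expressed by combining `-crf` with `-maxrate` and `-bufsize`. The helper name, file names, and the specific CRF, cap, and buffer values below are assumptions for illustration, not the book's recommended settings.

```python
def build_capped_crf_command(source, output, crf=23,
                             cap_kbps=4500, buffer_seconds=2):
    """Build an FFmpeg command for an x264 CRF encode with a bitrate cap."""
    bufsize_kbits = cap_kbps * buffer_seconds  # VBV buffer sized in kilobits
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-c:v", "libx264",
        "-crf", str(crf),                  # quality target for easy content
        "-maxrate", f"{cap_kbps}k",        # ceiling for hard content
        "-bufsize", f"{bufsize_kbits}k",   # buffer over which the cap applies
        "-c:a", "copy",                    # pass audio through unchanged
        output,
    ]


if __name__ == "__main__":
    # Hypothetical file names.
    print(" ".join(build_capped_crf_command("movie.mp4", "movie_capped.mp4")))
```

Easy-to-encode clips finish below the cap at the CRF's quality level, while complex clips are held to the cap, so the data rate ends up tailored to each title.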