Archive

As you know, I’m in several private beta programs for Adobe’s products. One of the latest has been Flash Player 10.1 Mobile for Android.
During the last two months I have played with the Motorola Droid and other devices, trying to help Adobe define a short and simple recommendation for encodings that target such devices.
The article has recently been published on Adobe’s DevNet and you can find it here.

As you know, Google has released VP8 as open source. After the acquisition of On2, several rumors spread across the web about the reasons for the deal. During Google I/O the secret was revealed: Google is opening the VP8 source code and offering its technology for free.

This is a very important move. Even if there are shadows over the operation because of possible patent infringements, the web needed an open source video codec. The whole Internet is founded on open source technologies that every company can use freely, but this was only partially true for video codecs.

On2 gave its VP3 technology to the open source community several years ago (the Theora project), but it is a “primitive” codec, very similar to H.263, that cannot compete with modern codecs like VC-1 and H.264.

But what exactly is VP8? It is the latest codec designed by On2, a company specialized in video codec design. The best-known codecs produced by On2 are VP3 (Theora), VP6 (licensed by Adobe for Flash Player 8 and beyond), VP7 (licensed by Skype) and now VP8. Only a few people knew the technical details of VP8 until now, and therefore the potential of this codec has always been a mystery, regardless of the fact that On2 declared it to be far superior to H.264.

But now it’s possible to compare the quality of VP8 and H.264 directly, because Google has provided the community with the technical specification and both the reference encoder and decoder. In this article I simply want to compare the VP8 technical specification briefly with H.264, the current “state of the art” in video encoding. The following table summarizes the most important points:

Frame, Block Transforms, color spaces and color depth

VP8 operates only on 4:2:0 YUV pictures with 8 bits per sample. H.264 can instead also operate on 4:2:2 and 4:4:4 pictures with 10 bits per sample (or more) in its most advanced profiles. Like H.264, VP8 subdivides the frame into blocks of 4×4 pixels and performs an integer-based transform with an additional transform of the DC coefficients. H.264 also offers an optional 8×8 transform, available in the High profile, which enhances compression in flat zones and gradients. An interesting feature of VP8 is the capability to handle different frame resolutions in the same stream.
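To make the difference between the sampling formats concrete, here is a minimal Python sketch that computes the size of one uncompressed frame. The function name raw_frame_bits and the 1280×720 example resolution are illustrative assumptions, not something taken from either specification.

```python
from fractions import Fraction

# Chroma samples per luma sample, counting both chroma planes together.
CHROMA_RATIO = {
    "4:2:0": Fraction(1, 2),  # two planes, each at quarter resolution
    "4:2:2": Fraction(1, 1),  # two planes, each at half resolution
    "4:4:4": Fraction(2, 1),  # two planes, each at full resolution
}

def raw_frame_bits(width, height, sampling="4:2:0", bit_depth=8):
    """Return the number of bits in one uncompressed YUV frame."""
    luma_samples = width * height
    total_samples = luma_samples * (1 + CHROMA_RATIO[sampling])
    return int(total_samples * bit_depth)

# Example resolution (assumed): 720p.
for sampling, depth in (("4:2:0", 8), ("4:2:2", 10), ("4:4:4", 10)):
    bits = raw_frame_bits(1280, 720, sampling, depth)
    print(f"{sampling} at {depth} bits: {bits / 8 / 1024:.0f} KiB per frame")
```

The 4:2:0 8-bit format that VP8 accepts carries the least raw data of the three, which is one reason it is the usual choice for web delivery.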

Intra frame compression

In intra frame compression, a picture is encoded only spatially, using intra prediction, quantization and entropy coding. The intra prediction of VP8 is almost identical to that of H.264, with several modes for 4×4 blocks and 16×16 macroblocks. VP8 intra coding lacks the adaptive 8×8 transform mode of H.264.
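As a rough illustration of what intra prediction does, the sketch below predicts a 4×4 block from its already decoded neighbours and picks the mode with the smallest error. It is a simplified subset (vertical, horizontal and DC only) written for this explanation; the function names are hypothetical and the real codecs define more modes and use more elaborate mode decisions.

```python
def predict_4x4(above, left, mode):
    """Predict a 4x4 block from the row above and the column to the left."""
    if mode == "V":   # vertical: repeat the row above in every row
        return [list(above) for _ in range(4)]
    if mode == "H":   # horizontal: repeat the left pixel across each row
        return [[left[r]] * 4 for r in range(4)]
    if mode == "DC":  # DC: fill with the rounded mean of all 8 neighbours
        dc = (sum(above) + sum(left) + 4) // 8
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")

def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p)
               for row_b, row_p in zip(block, pred)
               for b, p in zip(row_b, row_p))

def best_mode(block, above, left):
    """Pick the mode whose prediction leaves the smallest residual."""
    return min(("V", "H", "DC"),
               key=lambda m: sad(block, predict_4x4(above, left, m)))

# A block made of vertical stripes is predicted perfectly by the "V" mode.
stripes = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(stripes, above=[10, 20, 30, 40], left=[10, 10, 10, 10]))
```

Only the residual (block minus prediction) is then transformed, quantized and entropy coded, so a good prediction directly means fewer bits.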

Inter frame compression

In inter compression, frames are compressed by exploiting temporal redundancies between adjacent frames. VP8 has several macroblock configurations, very similar to the H.264 modes but somewhat more limited. Like H.264, VP8 supports pixel, half-pixel and quarter-pixel accuracy in motion estimation. It uses a slightly more accurate, but also slower, interpolation scheme. VP8 supports only P-frames (and a sort of disposable frame), with up to 3 reference frames in the past, far fewer than H.264, which can use P-frames with up to 16 reference frames and weighted prediction. H.264 can also leverage B-frames, which are interpolated between a frame in the past and a frame in the future.

Interpolated Frames

Probably the most important difference from H.264 is the lack of B-frames. This kind of frame is the most efficient for compression. Whenever the motion is “easy” to predict, a B-frame is inserted to exploit temporal redundancies. B-frames can also be dropped to keep audio and video in sync in difficult scenarios.
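The advantage of predicting from both directions can be shown with a toy example. The sketch below is a deliberate simplification (a plain average of a past and a future block, with no motion compensation), and its function names are hypothetical; it only shows why a bidirectional prediction can leave a smaller residual than a prediction from the past alone.

```python
def bidirectional_prediction(past, future):
    """Predict a block as the rounded average of a past and a future block."""
    return [[(p + f + 1) // 2 for p, f in zip(row_p, row_f)]
            for row_p, row_f in zip(past, future)]

def residual_energy(block, pred):
    """Sum of squared differences left to encode after prediction."""
    return sum((b - p) ** 2
               for row_b, row_p in zip(block, pred)
               for b, p in zip(row_b, row_p))

# A slow fade: the current block lies halfway between past and future.
past = [[100] * 4 for _ in range(4)]
current = [[110] * 4 for _ in range(4)]
future = [[120] * 4 for _ in range(4)]

print("past only:    ", residual_energy(current, past))
print("bidirectional:", residual_energy(current, bidirectional_prediction(past, future)))
```

In this fade the past-only prediction leaves a residual in every pixel, while the averaged prediction matches the current block exactly.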

Deblocking

VP8 supports an in-loop deblocking filter. It can be considered comparable to H.264 deblocking but slightly less flexible (its adaptivity is rougher).

On2’s codecs traditionally support various post-processing filters in the decoder stage. For example, VP6 supported many levels of deblocking and deringing. This is an interesting feature of VP8, but it is not clear from the documentation how it is applied. Standard H.264 does not support pre-defined post-processing filters.

Comparison

VP8 offers fewer encoding techniques than H.264, but this leads to lower complexity in both the encoding and decoding stages. VP8 seems to be designed to be simpler than H.264 and at the same time almost comparable. The lack of B-frames is probably the most important difference between VP8 and H.264. This can reduce efficiency by around 15-25%. Furthermore, the other small differences, like less articulate motion prediction, fewer reference frames and the absence of the adaptive 8×8 transform, can account for a further small difference, let’s say 5%.
In any case, VP8 seems to be comparable with H.264 baseline (in which the 8×8 transform, B-frames and CABAC are disabled) with the addition of a CABAC-like entropy coder, and thus probably 5-10% more efficient than standard H.264 baseline.
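The estimates above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes that an “efficiency loss of x%” means VP8 needs x% more bitrate for the same quality, and that the losses combine multiplicatively; both are interpretations made for this example, as is the function name.

```python
def vp8_bitrate_range(h264_kbps, b_frame_loss=(0.15, 0.25), other_loss=0.05):
    """Estimate the VP8 bitrate (kbit/s) matching a given H.264 bitrate.

    Assumes each loss is extra bitrate at equal quality and that the
    two losses multiply; these are illustrative assumptions.
    """
    low = h264_kbps * (1 + b_frame_loss[0]) * (1 + other_loss)
    high = h264_kbps * (1 + b_frame_loss[1]) * (1 + other_loss)
    return low, high

low, high = vp8_bitrate_range(1000)
print(f"1000 kbit/s of H.264 ~ {low:.0f} to {high:.0f} kbit/s of VP8")
```

Under these assumptions a stream that needs 1000 kbit/s with a full-featured H.264 encoder would need roughly 20-30% more with VP8.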

Conclusion

I think VP8 has a long and probably bright future. It is younger than H.264, so both encoders and decoders still have to be improved and optimized. On the other hand, H.264 has been on the market for years, but even now there is room for improvements and optimizations (see my last test of near HD quality @250Kbit/s).

In any case, we definitely needed an open source codec, because Theora performs too poorly. The difference in quality and efficiency may be outweighed by the advantage of using a royalty-free codec, especially in specific scenarios. Paying for the creation of encoders, decoders or H.264 videos is probably not a problem for a big company, but in some cases the presence of royalties can be a blocker even if they are very low.

Think about the case of the Flash Player: how could Adobe pay a fee for every Flash Player in order to include an H.264 encoder? I hope that the availability of a free VP8 could open the door to a renewed real-time video encoding API in the next release of Flash Player.

Despite all the fruitless discussions about the clash between H.264 and Flash (which, as you know, are simply FUD, since H.264 is a codec and Flash is a platform), I think that H.264 and Flash form a wonderful duo. Fortunately I’m not alone, if a giant like Hulu (over 500 million streams per month) has declared HTML5 to be too young to be used in scenarios where monetization is a key factor. This will probably change in the future, but for now only a plug-in like Flash (or possibly Silverlight) can assure the performance (dynamic streaming), the security (DRM or RTMPe) and the control (monetization and reporting tools) needed in video delivery.

Furthermore, the control over the streaming parameters and how the video is displayed on stage are two invaluable key factors for maximizing the QoS in video streaming.

With this article I want to show exactly this: Flash + H.264 = H.264 squared

You know that I like to push the limits of H.264, trying to find the best processing chain to obtain HD quality at a fraction of the usual bandwidth.
If you are a reader of my blog, you probably remember my several tests (here you find a list of the most popular) where I showed HD (720p) content encoded at very low bitrates like 500Kbit/s.

OK, this time I want to show you what is possible to achieve by exploiting the synergies between Flash and H.264.
In the video below you can see an HD* video encoded at … (drum roll please) … 250Kbit/s!

* In my next post I’ll explain how Flash can be used to enhance video quality and obtain a result like this. In the meantime, let me know what you think about the quality / bitrate ratio of the video. Remember that we are talking about a video that uses only 250Kbit/s, 8 times lower than what YouTube uses for a video like that and half the bitrate I used before in my extreme tests. So, despite the inevitable loss of detail, I would draw your attention to the reproduction of film grain and to the details of the text.
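To put 250Kbit/s in perspective, the following sketch computes the average number of bits available per pixel. The 1280×720 resolution follows the 720p figure mentioned earlier, while the 24 frames per second is purely an assumption for the example; the function name is hypothetical.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average number of encoded bits available for each pixel."""
    return bitrate_bps / (width * height * fps)

# Assumed parameters: 1280x720 at 24 fps (the frame rate is a guess).
low = bits_per_pixel(250_000, 1280, 720, 24)
youtube = bits_per_pixel(2_000_000, 1280, 720, 24)

print(f"250 Kbit/s:  {low:.4f} bits per pixel")
print(f"2000 Kbit/s: {youtube:.4f} bits per pixel")
print(f"ratio: {youtube / low:.0f}x")
```

With these assumptions the encoder has on average only about a hundredth of a bit for every pixel, which is why pre-processing and playback-side enhancement matter so much at this bitrate.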

Due to the post-processing, a fast computer is required, especially at full screen, where an upscaling scheme is applied. On Windows there’s no problem (I also tested it on a Pentium 4 2.4 GHz); on Mac I suggest updating both Flash Player and Safari.

Today La7, the Italian TV broadcaster owned by Telecom Italia, launched La7.tv, the first TV catch-up service entirely developed and delivered in Italy. In the last few months I collaborated with ValueTeam on the development of the encoding pipeline and the streaming optimizations used by the Telecom Italia Group for the project.

The service offers the TV programs of the last 7 days on demand (a BBC-style catch-up), with the most successful videos stored in a “cult” section.

The service is entirely Flash based (video player with Dynamic Streaming, Dynamic Buffering, Fast Protocol Negotiation and Auto-Recovery) and leverages an FMS 3.5 based CDN.

The custom encoding pipeline is integrated into the broadcaster’s publishing workflow, and only 15-20 minutes after the end of the program on TV, the encoded files (up to 3 hours long) are available for QA, inspection and meta-tagging.

The source, a high-quality SD 16:9 video, is encoded at two bitrates: 600 and 350Kbit/s. Despite the low bitrates, quality is very good and is optimized for full-screen playback (see screenshots below). Obviously, needing a lower bitrate to reach a target quality means lower costs and higher reach (see also my latest Adobe MAX presentation).
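As a rough idea of what two renditions imply, the sketch below picks a rendition for a measured bandwidth and estimates the size of a 3-hour file. It is not the logic used by the La7.tv player: the 80% headroom factor, the function names and the choice to ignore audio and container overhead are all assumptions made for the example.

```python
RENDITIONS_KBPS = (600, 350)  # the two video bitrates quoted above

def pick_rendition(bandwidth_kbps, headroom=0.8):
    """Pick the highest rendition that fits within a share of the bandwidth.

    The headroom factor is an assumed safety margin, not a documented value.
    """
    usable = bandwidth_kbps * headroom
    for rate in sorted(RENDITIONS_KBPS, reverse=True):
        if rate <= usable:
            return rate
    return min(RENDITIONS_KBPS)  # fall back to the lowest rendition

def file_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in megabytes, ignoring audio and overhead."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000

three_hours = 3 * 60 * 60
for rate in RENDITIONS_KBPS:
    print(f"{rate} Kbit/s for 3 hours: {file_size_mb(rate, three_hours):.1f} MB")
print("1 Mbit/s connection ->", pick_rendition(1000), "Kbit/s")
```

Every Kbit/s saved at equal quality reduces both the delivery cost per viewer and the minimum connection speed needed to watch, which is the point of the lower bitrates.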

A lot of you asked me how to get the PDF of my MAX presentation. The answer is simple: here you find the .pdf and here the entire session recorded during MAX. Thank you all for the support and the appreciation received before, during and after the conference.
Many thanks to Kevin Towes, Desiree Motamedi and all the Adobe staff for having made my presentation at MAX possible.

Only two weeks to go and the 2009 edition of Adobe MAX will begin.
This year in my presentation (Encoding Best Practices for H.264 Video Using Flash, 5 October, 2:00 pm, Room 506), I’ll focus on changing the usual encoding perspective by proposing a dynamic approach to choosing the best resolution and bitrate mix.
Dynamic is always better than static, you know, and it’s much better in video encoding: you reach a wider audience and save money!
I’ll also focus on playback best practices (full-screen acceleration, visual enhancements and so on).

Because a picture is worth a thousand words, today I want to show you the real potential of these approaches…

Me vs YouTube

In this test, I downloaded from YouTube the video clip of Bjork’s “All is Full of Love”. This clip is a High Definition video encoded by YouTube in 720p (1280×720) at around 2Mbit/s.

The same (already encoded) video was then re-encoded at a much lower bitrate of 500Kbit/s with a specific set of encoding parameters and pre-processing steps. I think the result is very good and, strangely enough, somewhat better than the original.

How can 500Kbit/s be better than 2000Kbit/s? How can the quality improve when re-encoding a damaged video at a quarter of the original bitrate? Well, you are all invited to MAX 2009, 4-7 October in Los Angeles, to discuss this and other topics.

About Me

Hello, my name is Fabio Sonnati. I'm an independent ICT consultant with a master's degree in Electronic Engineering.
My main expertise is the development and optimization of video delivery platforms and encoding solutions. I have contributed to several video projects on the Internet and collaborated with big names in the industry, start-ups and Fortune 500 companies.
This is the blog where I want to share my knowledge and best practices about streaming and video encoding.