HTML 5 and Web video: freeing rich media from plugin prison

DailyMotion and Google are both experimenting with the HTML 5 video element …

The expressive power of the Web is largely made possible by open standards. HTML, the vendor-neutral markup language that serves as the underlying foundation of the open Web, helped to foster the culture of interoperability and inclusiveness that has made the Internet a success. HTML 5, the next iteration of that standard, could bring the same degree of empowerment and interoperability to rich media and other kinds of Web content.

Although HTML 5 is still in the draft process and has not yet been ratified by W3C, the nascent standard is gaining significant traction. Browser makers are implementing key features of HTML 5 and bringing robust support for some of its most advanced capabilities to end users. A growing number of prominent companies that deliver content and services on the Web are putting their weight behind HTML 5 and touting it as the way forward for building interactive Web applications and deploying rich media in the browser.

Video is one of the most significant areas where this trend will have a major impact. Some of the giants of Internet video are exploring standards-based solutions as a means of breaking free from the constraints imposed by proprietary browser plugins. During the Google I/O conference last week, the search giant demonstrated a YouTube mockup built with HTML 5. In addition to using the HTML 5 video element, it also uses new HTML structural elements and other features introduced in the upcoming version of the standard. The demonstration illustrates how open technologies can be used to deliver a high-quality user experience for streaming video playback.

Another video titan that is fighting back against plugin prisons is DailyMotion. The popular streaming video website has launched an open video pilot program, providing a new beta version of its site that uses the HTML 5 video element to play content. As part of the pilot program, DailyMotion reencoded 300,000 videos with the open source Ogg Theora codec. Unlike many common video formats, Ogg Theora is not encumbered by known patents. It can be used and reimplemented freely without having to pay licensing costs.

Advantages of open video

For content providers like YouTube and DailyMotion, the HTML 5 video element offers numerous advantages. It integrates seamlessly with conventional HTML content and can be manipulated with JavaScript and CSS. This enables Web developers to build video player interfaces that are more consistent with the rest of their website. The ability to control playback with JavaScript allows video to be a more native part of the user experience in interactive Web applications.
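As a rough illustration of what controlling playback from script can look like, here is a minimal TypeScript sketch. The `paused`, `currentTime`, `duration`, `play()`, and `pause()` members are part of the standard media element API; the `PlayableMedia` interface and the helper names `togglePlayback` and `seekToFraction` are hypothetical and exist only so the example runs without a browser.

```typescript
// Minimal subset of the media element members used below. This interface is
// an assumption made for the sketch; a real <video> element satisfies it.
interface PlayableMedia {
  readonly paused: boolean;
  currentTime: number;
  readonly duration: number;
  play(): unknown;
  pause(): void;
}

// Start playback if the media is paused, otherwise pause it.
// Returns true when the call started playback.
function togglePlayback(media: PlayableMedia): boolean {
  if (media.paused) {
    media.play();
    return true;
  }
  media.pause();
  return false;
}

// Jump to a position given as a fraction (0..1) of the total length.
// The duration is not a finite number until metadata has loaded (or for
// live streams), in which case the position is left unchanged.
function seekToFraction(media: PlayableMedia, fraction: number): number {
  if (!isFinite(media.duration)) {
    return media.currentTime;
  }
  const clamped = Math.min(1, Math.max(0, fraction));
  media.currentTime = clamped * media.duration;
  return media.currentTime;
}
```

In a page, the element returned by `document.querySelector("video")` could be passed to these helpers from an ordinary click handler, which is what lets a site build its own player controls out of regular HTML and CSS.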

Mozilla has crafted some compelling demos to show how the HTML 5 video element can be used in innovative ways with other Web standards. When we looked at Firefox's first steps towards implementing the HTML 5 video element in 2007, we linked to a demo in which video is natively rendered on SVG elements that can be interactively rotated, moved, and resized within a page while the videos are playing.

The more recent demonstrations are even more impressive. At the Southern California Linux Expo (SCALE) in February, Mozilla's Chris Blizzard showed how to use JavaScript worker threads to programmatically detect and highlight motion in video as it is playing. The HTML 5 features required to implement these demos will all be available in the upcoming Firefox 3.5 release. This can all be done with real JavaScript—no browser plugins or third-party programming languages are required.
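The motion-detection idea can be approximated with simple frame differencing. The TypeScript sketch below is not Mozilla's demo code; it is an assumed, simplified approach in which two frames arrive as RGBA byte arrays of equal size, and the function name `motionMask` and the default threshold are hypothetical.

```typescript
// Compare two RGBA frames pixel by pixel and mark pixels whose color
// changed by more than `threshold` (sum of absolute R, G, B differences).
// Returns one entry per pixel: 1 where motion was detected, 0 elsewhere.
function motionMask(
  previous: ArrayLike<number>,
  current: ArrayLike<number>,
  threshold: number = 32,
): Uint8Array {
  if (previous.length !== current.length || current.length % 4 !== 0) {
    throw new Error("frames must be RGBA buffers of the same size");
  }
  const mask = new Uint8Array(current.length / 4);
  for (let pixel = 0; pixel < mask.length; pixel++) {
    const i = pixel * 4; // offset of the red channel; alpha is ignored
    const diff =
      Math.abs(current[i] - previous[i]) +
      Math.abs(current[i + 1] - previous[i + 1]) +
      Math.abs(current[i + 2] - previous[i + 2]);
    mask[pixel] = diff > threshold ? 1 : 0;
  }
  return mask;
}
```

In a browser, the pixel arrays could come from drawing the video onto a canvas and calling `getImageData`, and because the function touches no page state it could run inside a worker so the comparison does not block playback.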

In addition to the flexibility inherent in the technical benefits of the HTML 5 video element, open video also guarantees freedom from lock-in. The standard will be advanced collaboratively through inclusive processes, which means that all stakeholders have the ability to participate and are not beholden to any specific vendor. The availability of multiple interoperable implementations of the HTML 5 video standard, including some that are distributed under open source licenses, is paving the way for a more vibrant and healthy ecosystem in which no single company has complete dominance of the technology.

Video is becoming increasingly important on the Web, and it's becoming clear that content providers, browser vendors, and end users can no longer afford to have the primary delivery mechanisms for video locked up in an opaque binary blob that can't be improved or adapted to work in new environments. This is especially true in light of the growing relevance of mobile Web technology. It will be possible for anyone to adapt existing open source HTML 5 video implementations so that they will work optimally in environments with unusual resource constraints, form factors, or input mechanisms—the same cannot be said of Flash.

Flash's suboptimal performance and lack of reliability on Linux and Mac OS X (and arguably the absence of Flash on the iPhone) are emblematic of its limited portability. Browser and platform vendors can't fix Flash, but they do have the ability to ensure that standards-based open video technologies deliver the best possible experience in their own software. It's not hard to guess what approach they will favor for video delivery.

Challenges ahead

Although standards-based video solutions have the potential to deliver enormous benefits, the path to liberation will not be quick or easy. The current generation of open video technologies has a lot of limitations that will be difficult to address.

Microsoft's slowness to adopt emerging standards is probably the biggest hurdle to adoption of the HTML 5 video element. Microsoft is still struggling to implement long-standing Web standards, so it seems unlikely that the software giant will jump on board with a highly complex emerging standard that is still in the draft stage. Microsoft also has some competitive interests on the table that conflict with standards-based video efforts. Specifically, Microsoft is pushing its own Silverlight browser plugin as an alternative.

Aside from Microsoft, virtually every other browser vendor already has an HTML 5 video implementation or has publicly announced plans to develop one. Microsoft's dominant market share, however, largely deflates the value of that widespread support in the broader browser ecosystem. There are some factors that could potentially push Microsoft into action, but it's hard to imagine it happening any time in the immediate future.

If enough Web content developers adopt HTML 5 video and use it to deliver a better experience to users of competing browsers, Microsoft might have an incentive to join the party. One source of hope is the fact that Microsoft isn't totally ignoring all emerging Web standards. The Redmond behemoth has already implemented some nice HTML 5 features, such as persistent client-side storage, which is supported in Internet Explorer 8.

Another enormous problem with HTML 5 video is the lack of consensus on codecs. Attempts to enshrine Ogg as the default format of HTML 5 multimedia in the standard itself fell flat due to dissent from various participants in the standards process. Without the guarantee that at least one codec will work universally in all HTML 5 video environments, many content producers will be reluctant to adopt the standard. Mozilla and some other players are still pushing hard for Ogg. Firefox 3.5 will include fully functional Ogg codecs so that playback of Ogg video will work out of the box. It's unclear, however, if other browser vendors will follow that path.
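Until browsers agree on a codec, a site can publish the same video in more than one format and ask the browser which one it can handle. The TypeScript sketch below assumes a probe function with the same answers as the media element's `canPlayType` method (an empty string, "maybe", or "probably"); the `VideoSource` shape and the `pickPlayableSource` helper are hypothetical names introduced for illustration.

```typescript
// The three answers a media element gives when asked about a MIME type.
type CanPlayAnswer = "" | "maybe" | "probably";

// Hypothetical description of one encoding of a video.
interface VideoSource {
  url: string;
  mimeType: string;
}

// Return the first source the probe rates "probably"; failing that, the
// first one rated "maybe"; undefined when nothing is playable.
function pickPlayableSource(
  canPlayType: (mimeType: string) => CanPlayAnswer,
  sources: VideoSource[],
): VideoSource | undefined {
  let fallback: VideoSource | undefined;
  for (const source of sources) {
    const answer = canPlayType(source.mimeType);
    if (answer === "probably") {
      return source;
    }
    if (answer === "maybe" && fallback === undefined) {
      fallback = source;
    }
  }
  return fallback;
}
```

In a page, the probe could be `(type) => video.canPlayType(type)`, with MIME strings such as `video/ogg; codecs="theora, vorbis"` listed ahead of an H.264 alternative. When the function returns undefined, the site would have to fall back to a plugin-based player.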

Enthusiasm for Ogg among some content providers could eventually help unify the browser development community around the codec. DailyMotion clearly has a very strong commitment to the format. The Wikimedia Foundation, the organization behind the popular Wikipedia website, is also a vocal advocate of Ogg. It's worth noting that media content on Wikipedia is virtually all encoded in Ogg formats.

Although Ogg might eventually be able to gain sufficient traction to provide a universal target for HTML 5 video, the format itself still has a lot of deficiencies. Ogg Theora can't yet match the quality of competing patent-encumbered formats. In its announcement about the launch of its open video pilot program, DailyMotion acknowledges that its Ogg-encoded content has lower video quality and suffers from occasional audio crackles.

Providing additional resources to Ogg implementation developers is really the only solution to this problem. Theora reached 1.0 status last year, but still has a long way to go. Mozilla recently contributed $100,000 to fund Ogg development. This grant, which will be managed by the Wikimedia Foundation, will help pay Ogg-backer Xiph.org to continue pushing forward the codec. Status reports look promising—the experimental Thusnelda branch is becoming much stronger and fares well in some benchmarks.

Other unencumbered free software codecs, such as the BBC's wavelet-based Dirac, could deliver competitive high-quality video support in the future. The BBC's original reference implementation is said to be impractical for real-world use, but the BBC is also funding the development of a real-world open source version called Schrodinger that is becoming quite mature. According to some experts, Dirac has the potential to deliver encoding quality that is comparable (or maybe even superior) to H.264.

It's obviously going to take time for free codecs to catch up, and the proprietary options will likely move forward during that time. In order to avoid getting trapped in a perpetual following position, open video stakeholders are going to have to make a really strong effort to address the disparities in quality.

Conclusion

Open standards have the potential to reshape the way that video is used and deployed on the Web. Rising support for the HTML 5 video element is a promising sign that the major stakeholders recognize the rewards of unchaining multimedia on the Web. Google, Mozilla, Apple, DailyMotion, and others are taking steps to make open video a reality, but there are going to be many barriers to overcome along the way.

The broader HTML 5 specification draft has a lot to offer for Web developers. We took a look at some of the features last year when the W3C published a working draft. Many of these features will also help reduce dependency on browser plugins and will help make the Web a richer platform for application development.

The demonstrations at the I/O conference on Wednesday show that Google is very serious about bringing HTML 5 to the masses. The enormous popularity of the company's Web services makes Google's endorsement of HTML 5 deeply meaningful. Google has real leverage to put behind HTML 5 and will likely play an important role in making it ubiquitous.

well, when it comes out, there will be a simple solution to the ie problem:

YouTube just puts "Sorry, your browser does not support this site. Please take your pick of these fine browsers: Firefox, Chrome, Safari, Opera [generated in a random order each time to prevent antitrust issues]". I think even the threat of that would cause MS to conform to the new standard.

I like MS and everything, but we're done with the whole "standards compete against our own product so we are going to drag our feet..."

I'm using Firefox 3.5 and it appears as though that open video site is opting for flash first >_<

Also.. yes.. please, let's let adobe focus on photoshop et al and evolve beyond the garbage heap that is flash. It seems like mobile browsers would also immensely benefit here. Hopefully as more and more smart phones are hitting the web it will also push out flash.

Originally posted by Joshmx: I like MS and everything, but we're done with the whole "standards compete against our own product so we are going to drag our feet..."

Arrrgh! THIS kind of behavior is EXACTLY WHY YOU SHOULD NOT "LIKE" M$!!!!! M$ has been doing this forever with all sorts of standards (anyone remember DCE/RPC and MSRPC?), and they will continue to do it as long as they possibly can; it is an essential part of their business model.

"It integrates seamlessly with conventional HTML content and can be manipulated with JavaScript and CSS."

While I am majorly excited about the idea of video being treated as its own HTML element proper (img has been a tag for a while now, and video should really be getting just as special treatment), the grand sum total of what I got out of this sentence is this:

Video enclosed in blink tags.

It's not too late to stop it, I suppose, as blink really should've been deprecated a while ago. Nonetheless....this could be fun.

Eh. Dirac's quality is higher than Theora's but it requires more horsepower. For your average youtube, I think Theora can rule the day. However, there needs to be a way to add support for H.264 or $UBERCODEC_OF_TOMORROW.

While it would be nice to have an open format for all videos on the internet, it's probably not going to happen, and that horrible .NET Abortion "Silverlight" will probably end up being used more and more, after all, we all know most Windows users blindly answer any and all dialog boxes to the affirmative, especially if it means it's annoying/stopping them from watching the latest alcoholic teen doing something stupid.

Specifically, Microsoft is pushing its own Silverlight browser plugin as an alternative.

This is just ridiculous. The community puts forth an awesome, technically sound method to move beyond the Flash era. Meanwhile rather than get onboard Microsoft instead decides to reinvent Flash, just like it often does.

Way to be two steps behind, MSFT.

Now I am not anti-MS, I was beginning to have real hopes that IE8 was going to signal a real commitment to standards support. I guess the internet is just a big service bus for Windows.

Originally posted by cuvtixo: THIS kind of behavior is EXACTLY WHY YOU SHOULD NOT "LIKE" M$!!!!!

"M$" is so yesterday. As of the latest patent sabre-rattling, their name is spelled "MICROS~1". Get with the program.

Anyway, reading the messages, apart from possible submarine patents that may surround Theora, Nokia is concerned with the lack of a hardware solution, ie. a mobile device with HTML 5 compliant browser would need to have a faster CPU if Theora would become widespread whereas with H.264 they can simply put an extra chip to do the decoding.

Originally posted by Joshmx: I like MS and everything, but we're done with the whole "standards compete against our own product so we are going to drag our feet..."

Arrrgh! THIS kind of behavior is EXACTLY WHY YOU SHOULD NOT "LIKE" M$!!!!! M$ has been doing this forever with all sorts of standards (anyone remember DCE/RPC and MSRPC?), and they will continue to do it as long as they possibly can; it is an essential part of their business model.

Sites, especially the big ones, need to start aggressively pushing people to switch from IE, as long as majority of people use it MS isn't gonna do a damn thing about it. I'm fucking tired of building sites for 3 browsers - modern (i.e. FF, Safari, Chrome, Opera), IE7 and IE6.

I hope Google Wave becomes huge fast, not for the service alone (although that's fucking genius as well) but because it just won't fully work in IE and people will start switching browsers.

Actually, coding in JavaScript can be a major pain, even with Web development plugins in Firefox like Firebug. HTML is a mess, hence why we really needed CSS, which thanks to browser inconsistencies can also be a nightmare to get right. Who hasn't spent hours and hours trying to get a CSS layout working across browsers, or JavaScript, thanks to various issues with the DOM?

So while I don't feel as strongly as brownd4, I can see where he is coming from.

Originally posted by kingius: Actually, coding in JavaScript can be a major pain, even with Web development plugins in Firefox like Firebug. HTML is a mess, hence why we really needed CSS, which thanks to browser inconsistencies can also be a nightmare to get right. Who hasn't spent hours and hours trying to get a CSS layout working across browsers, or JavaScript, thanks to various issues with the DOM?

I'd rather program for 3 browsers than play silly ass games with Adobe's and Microsoft's plugins. Neither of them should be allowed to have any control whatsoever over content on the Web.

If, and only if, Theora can prove it can provide quality video with its current specification, I am all for it. But as far as I am concerned, that is not even technically possible.

And since Theora is patent-encumbered (there is a huge number of submarine patents, so you can never be sure you are not infringing), why not just use H.264? There is no such thing as software patents anywhere apart from the US, Japan, and the UK (?). H.264 already has a lot of hardware video acceleration around, and it provides MUCH better video quality than Theora can ever achieve. And wavelet codecs like Dirac and Snow will still need another few years. (That is, if people are still working on them, which they are not.)

Can't wait for the HTML5 video/audio element implementations in browsers to perform better. Currently, they require too much CPU (much more than Flash), which makes the audio sound like shit and the video stutter like crazy.

Originally posted by mrsteveman1: I'd rather program for 3 browsers than play silly ass games with Adobe's and Microsoft's plugins. Neither of them should be allowed to have any control whatsoever over content on the Web.

You're free to do that, meanwhile everyone with a brain will build it just once with a rich toolset.

Originally posted by mrsteveman1: I'd rather program for 3 browsers than play silly ass games with Adobe's and Microsoft's plugins. Neither of them should be allowed to have any control whatsoever over content on the Web.

You're free to do that, meanwhile everyone with a brain will build it just once with a rich toolset.

"rich toolset", perhaps you mean the security and performance clusterfuck that is Flash, and Microsoft's latest attempt to make everything dependent on them with Silverlight.

HTML5 video cannot come soon enough. Flash is unstable, underperforming, and doesn't know how to play nice with browsers that aren't Firefox. Firefox is becoming a behemoth that would rather write to its SQL database than actually render the page you're viewing or respond to user inputs. Theora, Dirac, whatever - as long as I have true options, I'm happy.

Lots of perspective. My long good post vanished into the ether, so here's my short version:

Codecs: Theora/Dirac/Snow/Vorbis are unlikely to get to even half the efficiency of the modern MPEG-LA codecs. So any savings in licensing costs are going to pale compared to the higher cost in bandwidth.

HTML5: People are getting a little prematurely excited here. The standard isn't done, and probably won't be complete for several years. And there's no shipping browser using any of it; there are some betas handling different subsets, but there's no video file that would even play in all the betas.

Are we likely to see Safari ever support Theora or Firefox ever support H.264?

Silverlight: Silverlight started well before HTML5; bear in mind Silverlight 3 is already in beta. It's better defined as a portable subset of .NET targeting rich internet applications.

And Silverlight is already capable of deeper media experiences than HTML5 is considering.

And Silverlight 3 will be able to host managed decoders, so the whole Ogg stack could run inside it in sandboxed bytecode.

Overall, I'd say it'd be easier to implement HTML5 wholesale in Silverlight than for HTML5 to be able to implement Silverlight-like experiences.

The browser model is great for lots of things, but media playback requires tight synchronization of multiple decoders, demuxers, rendering, and a network stack. It's really not anything like what browsers have been designed for.