Foundations of Open Media Software (FOMS) 2013

Last week was the 6th annual meeting of Foundations of Open Media Software (FOMS), a yearly unconference where engineers working on video-related software get together and discuss future standards and video technology. Topics include browser technology and specifications, video formats, and more.

The biggest discussions at FOMS this year were around captions and subtitles (WebVTT), adaptive streaming (through Media Source Extensions), DRM (through Encrypted Media Extensions), new codecs, and real-time communication (WebRTC).

More information and session notes can be found at foms-workshop.org; below are some of the notes compiled by Brightcove attendees.

Media Source Extensions

Media Source Extensions (MSE) will make adaptive streaming (HLS, DASH) possible in HTML5 video by letting developers control from JavaScript which bytes of video are fed to a video element. The API was first proposed at FOMS two years ago, and the Google Chrome team has continued to refine and implement the spec.
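To make the "adaptive" part concrete, here is a minimal sketch of the decision a player has to make before it hands bytes to the video element: given a measured throughput, which encoding of the video should the next segment come from? The `Rendition` shape, the `pickRendition` function, the 0.8 safety factor and the example URLs are all assumptions made up for illustration; they are not part of the MSE spec or of any player discussed at FOMS.

```typescript
// Hypothetical description of one encoding of the same video.
interface Rendition {
  url: string;         // made-up location of this rendition's segments
  bitrateKbps: number; // average encoded bitrate
}

// Pick the highest-bitrate rendition that fits within a fraction of the
// measured throughput. Falls back to the lowest bitrate if nothing fits.
function pickRendition(
  renditions: Rendition[],
  measuredKbps: number,
  safetyFactor: number = 0.8
): Rendition {
  if (renditions.length === 0) {
    throw new Error("at least one rendition is required");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = renditions.slice().sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  const budget = measuredKbps * safetyFactor;
  let choice = sorted[0];
  for (const r of sorted) {
    if (r.bitrateKbps <= budget) {
      choice = r;
    }
  }
  return choice;
}

// Example bitrate ladder (values are illustrative only).
const ladder: Rendition[] = [
  { url: "video-400k/", bitrateKbps: 400 },
  { url: "video-1200k/", bitrateKbps: 1200 },
  { url: "video-3000k/", bitrateKbps: 3000 },
];

// With roughly 2000 kbps measured, the budget is 1600 kbps.
const next = pickRendition(ladder, 2000);
console.log(next.url);
```

In a browser, the segment fetched from the chosen rendition would then be appended to a source buffer attached to the video element; that step is left out here because it depends on which version of the API the browser implements.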

Current status:

Chrome: Working implementation using a slightly older version of the API; supports WebM/VP8/VP9/Vorbis and H.264/AAC/MP3.

Internet Explorer: Working implementation in IE11; supports H.264/AAC.

Firefox: No support yet, but work is in progress.

Safari: No support. Apple engineers hinted that they are working on it, but for desktop only.

Most interesting notes:

YouTube is using MSE in production now for Chrome and IE11 users. It gives them greater control over buffering and bitrate adaptation, and has helped them increase “watch time”, their primary metric for A/B testing player changes.

MSE doesn’t care about mixing container formats, so you could play VP9 video alongside AAC audio, or H.264 video with Vorbis audio. This happens on YouTube today, depending on which files are cached.
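As a rough illustration of what mixing looks like from the page's side, each track is described by its own MIME type string, and the audio and video strings can name different containers. The sketch below is an assumption-laden example: the `Codec` type, the `MIME_BY_CODEC` table, `canPlayPair` and `browserIsTypeSupported` are names invented here, and the codec parameters are common example values rather than anything YouTube is known to use. The only real browser call involved is `MediaSource.isTypeSupported`, which is looked up at runtime so the code also loads outside a browser.

```typescript
type Codec = "vp9" | "h264" | "aac" | "vorbis";

// Illustrative MIME strings: container type plus a codecs parameter.
const MIME_BY_CODEC: Record<Codec, string> = {
  vp9: 'video/webm; codecs="vp9"',
  h264: 'video/mp4; codecs="avc1.42E01E"',
  aac: 'audio/mp4; codecs="mp4a.40.2"',
  vorbis: 'audio/webm; codecs="vorbis"',
};

// A video/audio pair is playable if each track type is supported on its own.
// The support check is passed in so it can be swapped out when testing.
function canPlayPair(
  video: Codec,
  audio: Codec,
  isTypeSupported: (mime: string) => boolean
): boolean {
  return isTypeSupported(MIME_BY_CODEC[video]) && isTypeSupported(MIME_BY_CODEC[audio]);
}

// Uses the browser's MediaSource.isTypeSupported when it exists;
// reports "unsupported" everywhere else (for example, in Node).
function browserIsTypeSupported(mime: string): boolean {
  const ms = (globalThis as any).MediaSource;
  if (typeof ms === "undefined" || typeof ms.isTypeSupported !== "function") {
    return false;
  }
  return Boolean(ms.isTypeSupported(mime));
}

// WebM video next to MP4 audio, as in the VP9 + AAC case.
console.log(canPlayPair("vp9", "aac", browserIsTypeSupported));
```

Passing the support check in as a parameter is a design choice for this sketch only; it keeps the pairing logic independent of whichever browser happens to be running it.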

Encrypted Media Extensions

Encrypted Media Extensions (EME) is the proposed method for providing content protection in HTML5 video.

Current status:

The spec is still being defined but is in “last call”; it is currently described as being in a “working spec state”.

It requires a key server that’s not available to the public yet.

Chrome: Supported (vendor prefixed, older spec). It’s available in millions of TVs via Chrome, and is also supported by Chromecast.

IE11: Supported (vendor prefixed, current spec)

Safari: Has an implementation; some parts of it are exposed in Mavericks.

Firefox: No word

Most interesting notes:

It requires the Common Key format, which allows a video to be encrypted once and decrypted by any DRM vendor. This does, however, make it incompatible with any existing encrypted files that do not use this method.

YouTube, Netflix, Chromecast and Google Play are all using EME in production. It’s replacing Flash Access in YouTube.

You can try it today (without the special server) using Clear Key: http://simpl.info/eme/clearkey/

WebVTT (Captions in HTML5 Video)

WebVTT is the format for providing captions (and other timed text) in HTML5 video. It is not yet up to FCC standards, and much of the discussion at FOMS was around how to get it there. There is also currently no provision for live captioning.
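For readers who have not seen the format, a WebVTT file is plain text: a `WEBVTT` header line, then cues separated by blank lines, each with a `start --> end` timing line followed by the caption text. The sketch below builds such a file from a list of cues. The `Cue` interface and the `formatTimestamp` and `toWebVtt` functions are hypothetical helpers written for this example, and the sketch deliberately ignores cue settings, styling and escaping.

```typescript
// Hypothetical in-memory representation of one caption cue.
interface Cue {
  startSeconds: number;
  endSeconds: number;
  text: string; // assumed to contain no blank lines and no "-->"
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let out = String(n);
  while (out.length < width) {
    out = "0" + out;
  }
  return out;
}

// Format a time in seconds as a WebVTT timestamp: hh:mm:ss.ttt
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Serialize cues into the text of a WebVTT file.
function toWebVtt(cues: Cue[]): string {
  const blocks = cues.map(
    (c) => formatTimestamp(c.startSeconds) + " --> " + formatTimestamp(c.endSeconds) + "\n" + c.text
  );
  return ["WEBVTT"].concat(blocks).join("\n\n") + "\n";
}

console.log(
  toWebVtt([
    { startSeconds: 1, endSeconds: 4, text: "Hello" },
    { startSeconds: 4.5, endSeconds: 7, text: "Welcome to FOMS" },
  ])
);
```

A file produced this way would be attached to a video with a `<track>` element; generating cues as they happen, as live captioning would need, is exactly the part the format did not yet address.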