Are video codecs written in JavaScript really the future?

Describing it as "the future," Mozilla has been showing off ORBX.js, a video codec roughly comparable to the industry-standard H.264 that can be decoded entirely in JavaScript.

ORBX.js was developed by a company called OTOY. OTOY's major product is Octane Render, a 3D renderer that works exclusively on NVIDIA cards using CUDA. Working with customers including Autodesk, OTOY has developed technology to allow applications such as Autodesk's 3ds Max modelling software to be accessed over the Web without using plugins.

3ds Max streamed to a browser

Central to this is the ORBX codec. The codec allows efficient, real-time encoding on GPUs, and can be decoded in JavaScript. The JavaScript decoder works in all modern browsers, both desktop and mobile, though with differences depending on the browser features available.

Like most video codecs, ORBX distinguishes between video frames that are encoded in their entirety—"I-frames"—and those encoded as deltas relative to another frame—"P-frames" and "B-frames". I-frames can be decoded independently of any other frame, but P- and B-frames can only be decoded once the frames they reference have been decoded. ORBX, again like most other video codecs, can be used to create video streams made up entirely of I-frames, or streams that use a mix of I-frames and P-frames. I-frame-only modes don't compress as efficiently, but are simpler to decode.
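The distinction can be sketched in a few lines of JavaScript. This uses a toy frame format invented for illustration, not ORBX's actual bitstream:

```javascript
// Toy I-frame/P-frame decoder (hypothetical frame format, not
// ORBX's real bitstream). An I-frame carries full pixel data;
// a P-frame carries only deltas against the previous decoded frame.
function decodeFrame(frame, previous) {
  if (frame.type === "I") {
    // Self-contained: decodable without reference to other frames.
    return frame.pixels.slice();
  }
  // P-frame: apply per-pixel deltas to the previously decoded frame.
  if (!previous) throw new Error("P-frame needs a reference frame");
  return previous.map((p, i) => p + frame.deltas[i]);
}

// Decode a stream mixing I- and P-frames; each P-frame is decoded
// relative to the frame immediately before it.
function decodeStream(frames) {
  const out = [];
  for (const frame of frames) {
    out.push(decodeFrame(frame, out[out.length - 1]));
  }
  return out;
}
```

This also shows why I-frame-only streams are simpler to decode: every frame takes the first branch, and no inter-frame state needs to be kept.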

For browsers including Internet Explorer 10 and Safari on iOS, ORBX is used in I-frame-only mode. For other browsers, including Firefox and Chrome, it uses a more conventional mixed mode. That's because the mixed mode depends on WebGL for part of its decoding. I-frames can be decoded entirely in JavaScript, but P-frames require the use of shader programs due to their greater complexity. Internet Explorer 10 and Safari on iOS don't support WebGL, and so can't run shader programs. As a result, they use about twice as much bandwidth for the same level of video quality.
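The capability check amounts to something like the following sketch. The function names and the exact fallback policy are hypothetical; only the WebGL-detection idiom and the roughly 2x bandwidth figure come from the text above:

```javascript
// Hypothetical sketch of the fallback described above: browsers
// with WebGL get the mixed I/P-frame stream (P-frame deltas decoded
// in shader programs); browsers without it fall back to the
// roughly twice-as-large I-frame-only stream, decoded in pure JS.
function detectWebGL(canvas) {
  try {
    return !!(canvas.getContext("webgl") ||
              canvas.getContext("experimental-webgl"));
  } catch (e) {
    return false; // context creation can throw in some browsers
  }
}

function chooseStreamMode(hasWebGL) {
  return hasWebGL
    ? { mode: "mixed",        relativeBandwidth: 1.0 }  // I + P frames
    : { mode: "i-frame-only", relativeBandwidth: 2.0 }; // pure JS decode
}
```

In a real page, `detectWebGL` would be called with a `<canvas>` element; the stub-friendly signature here is just to keep the sketch self-contained.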

OTOY plans to develop ORBX to make it suitable for a wider range of applications than existing codecs like H.264. The company wants to include features such as support for floating point data, to make it suitable for both high dynamic range imagery and video, and to make it usable for storing 3D data such as depth buffers.

Its encoding efficiency is broadly comparable to H.264—the company says that it's 25 percent "better" than H.264 in some tests, but in others H.264 pulls ahead. The new HEVC/H.265 codec has better encoding efficiency than both H.264 and ORBX (it produces smaller files for a given level of quality), but requires far more computational power to encode and decode.

Being designed for streaming live apps, the ORBX encoder boasts low latency. To prove this point, the company has demonstrated the game Left 4 Dead 2 being streamed to a browser.

Left 4 Dead 2 on a PC, playing in Firefox on a Mac.

Possibilities for ORBX

OTOY describes a number of future ideas for ORBX. One plan is for more powerful remote application support, not merely streaming a video of an application (similar to the way VNC provides remote application support), but streaming the graphical API calls that an application makes, using client-side WebGL to actually perform the drawing. This reduces the load on the server end, and can reduce bandwidth requirements substantially.
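A back-of-the-envelope sketch shows why shipping draw calls can beat shipping pixels. The wire format below is purely illustrative, not anything OTOY has published:

```javascript
// Illustrative comparison (hypothetical wire format): streaming raw
// frames vs. streaming the draw calls that produced them.

// One uncompressed 1280x720 RGBA frame is ~3.7 MB.
function frameBytes(width, height) {
  return width * height * 4; // RGBA, one byte per channel
}

// The same scene described as draw calls is usually far smaller:
// each command is just an opcode plus arguments, and the client
// replays them locally with WebGL.
function encodeDrawCalls(calls) {
  return JSON.stringify(calls);
}

const calls = [
  { op: "clear", color: [0, 0, 0, 1] },
  { op: "drawElements", mesh: "player", transform: [1, 0, 0, 1, 120, 80] },
];
```

Video compression narrows the gap considerably in practice, but for scenes built from a modest number of draw calls, the command stream can still be orders of magnitude smaller than even compressed video.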

Another idea is to use ORBX for DRM-free streaming video. OTOY believes that ORBX's efficient on-GPU encoding makes it practical to embed within each video stream a per-user watermark. Watermarked video wouldn't be DRM-protected, and so would be relatively easy to record and redistribute, but the watermarking would make it possible to determine who released a given video, in turn opening the door to pursuing the original pirates.
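As an illustration of the general idea only: the naive least-significant-bit scheme below is not OTOY's actual watermarking method, which is undisclosed, but it shows how a per-user ID can ride invisibly inside pixel data and later be recovered from a leaked copy:

```javascript
// Naive least-significant-bit watermark sketch -- NOT OTOY's actual
// scheme. A per-user ID is hidden in the lowest bit of the first
// few pixel values, changing each by at most 1 (invisible to the
// viewer) while remaining recoverable from a redistributed copy.
function embedWatermark(pixels, userId, bits = 16) {
  const out = pixels.slice();
  for (let i = 0; i < bits; i++) {
    const bit = (userId >> i) & 1;
    out[i] = (out[i] & ~1) | bit; // overwrite the lowest bit
  }
  return out;
}

function extractWatermark(pixels, bits = 16) {
  let userId = 0;
  for (let i = 0; i < bits; i++) {
    userId |= (pixels[i] & 1) << i;
  }
  return userId;
}
```

A production scheme would spread the mark across many frames and make it robust to re-encoding; the point of doing the embedding during on-GPU encoding is that it costs almost nothing per stream.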

Mozilla's interest in ORBX is obvious. Mozilla is aggressively promoting the Web as the platform to replace all other platforms. This is the entire point of its Firefox OS smartphone platform: that rather than developing apps tied to Android or iOS or Windows Phone or BlackBerry, developers should write apps for the Web, using JavaScript, HTML, and related technologies.

Demonstrations of video decoding in JavaScript serve to bolster Mozilla's argument that JavaScript performance is good enough to make it usable for almost every application—even those traditionally regarded as "compute-intensive." Mozilla itself has experimented with JavaScript-based video decoding with its Broadway decoder. This was fast enough to decode H.264 at 30 frames per second in JavaScript alone.

The alternative to DRM is also attractive to an organization that promotes the open Web. DRM capabilities are set to be included in browsers, a decision that isn't universally popular among open Web proponents. The use of watermarking instead of DRM has some precedent—iTunes Music Store, for example, made a successful transition from DRM to watermarks—and could be pushed as a demonstration of the lack of necessity for DRM in browsers.

But is this really "the future," as Mozilla CTO Brendan Eich hailed it? That's harder to see.

Challenges remain

ORBX looks very cool for certain applications, but as a demonstration of the power of "openness" it's currently a little lacking. Organizations such as Mozilla may not like H.264's royalty-bearing status, but it's nonetheless an open, documented, published standard. ORBX, however, is none of these things. It's a proprietary algorithm created by a commercial organization. We spoke to OTOY, and the company told us it has no immediate, specific plans to publish it openly. That's not to say that the company is necessarily opposed to doing so; rather, it's not sure how to do so without undermining its commercial products.

Even apart from this detail, using JavaScript for video decompression is problematic. After many years of refusal, Mozilla decided last year that it would start supporting H.264 compression in the HTML5 <video> tag.

The company was ideologically opposed to this, due to H.264's patents and royalties, but came to realize that it was a practical necessity, especially for Firefox OS. H.264 video is abundant, and is unambiguously "the winner" of all the current major video codecs. Consequently, H.264 is widely supported in hardware, enabling battery-efficient hardware-accelerated playback of H.264 video. For Firefox OS to be viable, it had to support this codec and this hardware.

H.264's abundance and hardware support are no less important this year than they were last. Video codecs written in JavaScript may be technically impressive, but they mean eschewing that hardware support. The use of WebGL and shader programs may improve the power efficiency somewhat, but they are still inferior to the dedicated motion video hardware found in all modern GPUs. It's unlikely that mobile users or video distributors are going to be willing to give up this power advantage any time soon.

The viability of watermarking as an alternative to DRM is also speculative conjecture rather than identification of a genuine industry trend. In spite of the changes in the audio market, the video market has remained firmly in favor of DRM. The audio and video markets have significant differences in terms of both production and usage model, and it's not clear that what works in one market will necessarily work in another.

Nonetheless, aspects of the OTOY demonstration probably are indications of what the future will hold. Sony bought cloud gaming company Gaikai last year, and there's speculation that it'll use Gaikai-powered cloud gaming as its backwards compatibility solution for the PlayStation 4. Cloud gaming pioneer OnLive almost went out of business last year, after failing to attract sufficient customers, but a comparable service that's built-in to an existing games platform could fare rather better—though latency problems are likely to be a permanent fixture.

Similarly, streaming applications are likely to remain important to certain niche users, but it's much less likely that they'll ever be the solution to mainstream application demands. Again, latency is one of the insurmountable difficulties.

The ORBX demo is clever technology, and it shows just how far JavaScript performance has come since browser developers started making it a priority. But as a vision of the future, it's less compelling. Using JavaScript for video is cool, but for most users, not using JavaScript for video will be better.

124 Reader Comments

I think we should focus less on writing everything in JavaScript and more on bringing more languages to the browser.

A proper standardized bytecode for browsers would (most likely) allow developers a broader range of languages to choose from as well as hiding the source code from the browser/viewer (if that's good or not is subjective of course).

Binaries are also often significantly smaller than source code, reducing bandwidth.

I sincerely feel the burden of support that these devs are under. Their code forks must keep them awake at night and managing something as complex as a js video codec must be daunting.

That said, I can't help but wonder why JavaScript is the de facto standard that everyone is aiming for. I understand that getting away from plugins is a good thing, but if every browser runs their own proprietary javascript engine, isn't that really just another shade of the same color? It thusly seems like we're aiming one level of abstraction too high with a JS codec, indicated by the additional processing power required by the JS abstraction engine. I know I'll get fragged for suggesting a compiled plugin framework might be better for this kind of thing, but until every browser is running the same JS engine, that's essentially what we're looking at anyway...right?

I think we should focus less on writing everything in JavaScript and more on bringing more languages to the browser.

A proper standardized bytecode for browsers would (most likely) allow developers a broader range of languages to choose from as well as hiding the source code from the browser/viewer (if that's good or not is subjective of course).

Binaries are also often significantly smaller than source code, reducing bandwidth.

++My thoughts exactly...however, when we step back and look at how well "standardized" has worked for us in the past, it doesn't paint a pretty picture -- we need not look any further than the differences in javascript implementations across today's most popular browsers to find evidence that "standardized" is a meaningless term.

More to your point though, I think you're right. A standardized bytecode would allow web developers the same freedom that .NET developers enjoy within their own framework, enabling a variety of languages that get interpreted the same way by a CIL. That'll be the day

I think we should focus less on writing everything in JavaScript and more on bringing more languages to the browser.

A while back there was an attempt to bring Python to Firefox. I don't think this ever made it to the mainstream browser.

I think something like this should go through W3C and it should be a bytecode. This enables developers to write compilers for numerous languages, and the responsibility for which languages are supported on the web is transferred from the browser developers to whoever wants to take it on.

Shaders to decode? Clever... but I can't help thinking that you wouldn't need to involve the GPU if a proper language (with access to SIMD primitives) had been used. Operating on 16 bytes at once really makes a difference. If we can get that language into the browser, so much the better. Just to throw a random idea out there: LLVM bytecode. That infrastructure already exists, and you get to use the ton of languages that already have a frontend for it (and more in the future, I'm sure).

I really don't like this idea. I love JS, but I know its limitations. I recently spent hundreds of hours on a new framework in JS where most of my efforts went into trying to simulate OOP features in a language that has no clue. (hopefully they'll let me open source it)

I want JS to be replaced by something better and I don't trust JS to decode video. I highly doubt that experience will have consistent performance results across all platforms.

The viability of watermarking as an alternative to DRM is also speculative conjecture rather than identification of a genuine industry trend.

I don't understand why people still think that is what DRM is for. DRM is not intended to prevent piracy, regardless of how much industry associations campaign to the general public as to that purpose. DRM is intended to control the use of content by the legitimate customer. Watermarking only serves to identify who leaked the content, and allow those individuals to be charged, combating the source of piracy directly, but it does nothing to replace the goals of DRM.

so everyone commenting wants a bytecode for the browser to support additional languages? Twenty years of internet history have proven that the best balance between security & usability is to have a restricted runtime integrated into the browser, executing some (far from flawless) standardized interpreted language

that is EXACTLY what mozilla's asm.js promises...

standardized special syntax of js that runs as fast as java bytecode, works with little modification to current javascript engines... only drawback: it might just be the ugliest bytecode ever

I like what Mozilla is trying to do with the open-web-as-a-platform strategy, but we have to fix JavaScript first, because it isn't the foundation we want to build our open platform on.

JavaScript is in the process of being "fixed". ECMAScript 6 is going to be finished by the end of the year and will fix most ugly parts of the language, it'll provide classes and modules etc. The future of JS looks bright.

so everyone commenting wants a bytecode for the browser to support additional languages? Twenty years of internet history have proven that the best balance between security & usability is to have a restricted runtime integrated into the browser, executing some (far from flawless) standardized interpreted language

that is EXACTLY what mozilla's asm.js promises...

standardized special syntax of js that runs as fast as java bytecode, works with little modification to current javascript engines... only drawback: it might just be the ugliest bytecode ever

As far as I know (but I'd love to be wrong about this), asm.js has no intention to support SIMD. It would merely be efficient scalar code, which is nice, but not quite enough.

"Even apart from this detail, using JavaScript for video decompression is problematic. After many years of refusal, Mozilla decided last year that it would start supporting H.264 compression in the HTML5 <video> tag."

You say javascript for decompression is problematic, but you don't say in what way it's problematic? I don't think you mean to say technically problematic, because you just wrote an entire article detailing how it's done. So in what other aspect is it problematic?

I like what Mozilla is trying to do with the open-web-as-a-platform strategy, but we have to fix JavaScript first, because it isn't the foundation we want to build our open platform on.

JavaScript is in the process of being "fixed". ECMAScript 6 is going to be finished by the end of the year and will fix most ugly parts of the language, it'll provide classes and modules etc. The future of JS looks bright.

Just how long do we have to wait until all browsers in use (not just current versions) support ECMAScript 6?

"Even apart from this detail, using JavaScript for video decompression is problematic. After many years of refusal, Mozilla decided last year that it would start supporting H.264 compression in the HTML5 <video> tag."

You say javascript for decompression is problematic, but you don't say in what way it's problematic? I don't think you mean to say technically problematic, because you just wrote an entire article detailing how it's done. So in what other aspect is it problematic?

I think we should focus less on writing everything in JavaScript and more on bringing more languages to the browser.

That is exactly what many of us are working on, and of course your contributions are welcome (you said "we") in projects like Emscripten, Mandreel, asm.js, CoffeeScript, GWT, pyjamas, and the long list of other languages that compile to JS.

There is no need for a bytecode format for any reason that I can see, but please tell me why you think one is necessary. I am doubtful because we can already compile entire games, like the Epic Citadel demo from last week, into JS, without any new bytecode format.

I do agree there is intuitive appeal to the concept of bytecode, but in practice, it doesn't seem to be necessary.

While it's cool in a "someone actually paid some people to do this, and it kind of works?" way, I don't see this as being a real solution to anything. I just don't see the advantage over a proper native code standardized video codec. The patents and licenses are the barriers to codec adoption, not the fact that they are native code.

I also despise javascript as a language and wish someone would hurry up replacing it with a bytecode so we can use decent languages again. The applications I'm writing for the browser, and that I'm seeing written by other companies, are far too large and complex for javascript's freewheeling scripted style where anything goes, until it just doesn't work. We need a better way to run code on the client side of a browser.

More to your point though, I think you're right. A standardized bytecode would allow web developers the same freedom that .NET developers enjoy within their own framework, enabling a variety of languages that get interpreted the same way by a CIL. That'll be the day

I agree, HTML6 should be something like Silverlight / Moonlight, with support for something XAML-like for creating UIs. The Document Object Model (DOM) is for document layout, but making a UI out of slightly fancier Word features is an ugly hack. JS can be used as an assembly language, but it is woefully inefficient at this.

I guess the main mobile (and also desktop) vendors now want developers to create native apps. It seems they deliberately want webapps to stay in the stone age and be as inconvenient as possible, both for users and developers. Apple surely wouldn't want a native-like webapp, but Microsoft also tries to preserve its OS monopoly.

For anyone complaining about js or wanting other languages in the browser. Go spend some time in the es-discuss mailing list. You'll see how difficult people and companies have made it to further js.

It's not that I'm opposed to other languages in the browser. It's that they'd never be implemented by all browsers like js has.

edit: And for everyone screaming "bytecode!" The complexities of creating a language for the browser extend past what this could achieve. So then you'd need a second standard defining what operations would/wouldn't be allowed.

I really like JS and I really don't understand all the negative comments.

I may not want to write a video codec in JS, but then I don't want to write a video codec in any language. I definitely don't want to write one in C.

Bytecode just makes things messier, and no one in their right mind wants to mix dozens of languages together in one project. Meta-languages like CoffeeScript and the ten other so-called "better" versions of JS mostly fail to improve anything in a meaningful way; it's just tinkering.

When it comes to big apps, I do like the option of being able to compile down to JS rather than embedding a more bloated language into the browser. I would rather see the core language adopt more features for this style of use case, both performance wise and features like native sets etc.

That said web apps need simpler layout options and more flexible widgets too.

Am I the only one having a hard time seeing the point to this demo? I've been able to watch video being rendered by the browser for quite awhile now thanks to HTML5 video support. What does this codec provide that the existing supported codecs don't?

This sounds similar to someone saying "I've written an HTML renderer in javascript that runs in a browser."

I fully understand why Mozilla would want to move more and more stuff into the browser, the browser being their core product. I don't agree with most of it though. While technically impressive, I don't really see a lot of benefits to a PDF viewer or a video decoder implemented in JavaScript and running in the browser. I mean, theoretically I can think of some advantages (easy to update, write-once-run-anywhere, plugs a few possible security holes), but those are mainly advantages for developers, not users, and are IMO far outweighed by disadvantages (less efficient, introduces *other* possible security holes, don't integrate with the OS as well, cannot make use of OS/hardware specific features, need a full browser environment to run, don't translate well to hardware solutions etc).

I really think Mozilla is trying too hard, and focussing on the wrong problems. Instead of trying to move _everything_ into the browser, trying to get the whole world into some form of cloud-based computing running in the browser, they should focus on those applications where cloud computing actually makes sense, and those that actually benefit from write-once-run-anywhere. There are plenty. PDF viewers, image and video decoders, etc. are not things Mozilla should be worrying about; every OS has these covered, and the OS will always provide a much better experience for such tasks.

Mozilla's whole premise is based on the pipe dream that in the future, we will all have dumb clients that only run or display stuff streamed from some central server or whatever. This idea has failed so many times in the past, it's really quite amazing great companies or organisations such as Mozilla are still clinging to it. Cloud-computing has so many useful applications, but the all-or-nothing drive or hunger or whatever it is to scale it to all kinds of tasks that don't actually suit it, kind of defeats the whole idea.

In the early 2000's, I thought the idea of playing video through Flash was absurd. (For those who don't remember, even a 200MHz-ish computer could play watchable video using the dedicated video-player plugins of the era; you needed a system three or four times faster to do the same chore in Flash.)

Obviously, it wasn't as absurd as I thought.

What I've learned: Given enough time, Moore's Law and widespread adoption are a more powerful force than efficiency.

it actually makes the most sense to port everything to Q-BASIC and then run it under that Q-BASIC emulator some chap wrote in javascript. you guys know nothing about coding efficiency and O(n) and technical things, jeez.

I really don't like this idea. I love JS, but I know its limitations. I recently spent hundreds of hours on a new framework in JS where most of my efforts went into trying to simulate OOP features in a language that has no clue. (hopefully they'll let me open source it)

I want JS to be replaced by something better and I don't trust JS to decode video. I highly doubt that experience will have consistent performance results across all platforms.

Then you could have saved yourself a lot of time by simply getting a clue and learning the "prototype" OOP model the language uses.

The "class" model is not the only OOP paradigm, it's merely the most popular. And quite frankly, for good reason too. Even though the prototype model is certainly able to handle most OOP constructs, class oriented languages tend to have an inherent syntactic organization that makes code easier to scale and maintain. That inherent organization only works if you follow the rules, of course. Nightmares can be coded in even the strictest class oriented languages.

i worry that someday, no one will actually remember how to write C, let alone assembler. everyone will have grown up writing "apps" and be all, "pointers? what?"

then the googapplamazon server will crash after 100 years of flawless operation, and no one will know how to fix it, since it wasn't written in javascript. the lights will go out, power plants will melt down, frozen peas will thaw, etc.

Shaders to decode? Clever... but I can't help thinking that you wouldn't need to involve the GPU if a proper language (with access to SIMD primitives) had been used. Operating on 16 bytes at once really makes a difference. If we can get that language into the browser, so much the better. Just to throw a random idea out there: LLVM bytecode. That infrastructure already exists, and you get to use the ton of languages that already have a frontend for it (and more in the future, I'm sure).

Silverlight has long had support for video and audio decoders in managed code (.NET bytecode), including pixel shader support. Silverlight pixel shaders just ran on the CPU, but got translated into SIMD code. As you anticipated, bytecode + SIMD wound up giving a huge performance boost even within the sandbox.

Overall, code ran about half as fast as a C++ implementation of the same encoder/decoder on the same hardware. That isn't amazing, but it's certainly better than what could be done with JS.

In the early 2000's, I thought the idea of playing video through Flash was absurd. (For those who don't remember, even a 200MHz-ish computer could play watchable video using the dedicated video-player plugins of the era; you needed a system three or four times faster to do the same chore in Flash.)

Obviously, it wasn't as absurd as I thought.

What I've learned: Given enough time, Moore's Law and widespread adoption are a more powerful force than efficiency.

That was true until Flash eventually added support for hardware decoding, which is even more efficient than doing it through a plugin.

I really don't like this idea. I love JS, but I know its limitations. I recently spent hundreds of hours on a new framework in JS where most of my efforts went into trying to simulate OOP features in a language that has no clue. (hopefully they'll let me open source it)

I want JS to be replaced by something better and I don't trust JS to decode video. I highly doubt that experience will have consistent performance results across all platforms.

Seems like a pretty heavy-handed remark. JS *is* object-oriented, albeit in a way that is much different from its C-like counterparts. The trick is to figure out how to do it the "javascript" way instead of the "C" way. As for JS replacements, there are some pretty kickass options available today. Dart and TypeScript are both pretty kickass options ("languages"?) that bridge the gap between JavaScript's prototypal nature and the terseness of C/C++/C#/etc. Worth taking a look the next time you approach a big web-app project.

In the early 2000's, I thought the idea of playing video through Flash was absurd. (For those who don't remember, even a 200MHz-ish computer could play watchable video using the dedicated video-player plugins of the era; you needed a system three or four times faster to do the same chore in Flash.)

Of course, Flash's codecs were always optimized binaries compiled into the .dll, not something that ran in ActionScript.

Quote:

What I've learned: Given enough time, Moore's Law and widespread adoption are a more powerful force than efficiency.

In the video world, Moore's Law generally gives us continuous improvements in encoding efficiency, higher definition playback, and about once every decade a new bitstream format. But more decoding on PCs is done with ASICs or fixed-function units on GPU than ever before.

Warren Schwader sent the game to Ken Williams, who was impressed with the logic and with the graphics, which gave a clear, sharp picture of each card dealt. What was even more amazing was that Schwader had done this on the limited Apple mini-assembler.

It was as if someone had sent Ken a beautifully crafted rocking chair, and then had told him that the craftsman had used no saw, lathe, or other conventional tools, but had built the chair with a penknife. Ken asked Warren if he wanted to work for On-Line. Live in the woods. Boot into Yosemite. Join the wild, crazy Summer Camp of a new-age company.

--Hackers, Steven Levy, Chapter 17.

in the context of things that aren't DOM munging, i feel javascript is that penknife.... just as the Apple II's mini-assembler was best used as a bootstrap system to load a real assembler. it's impressive to see the hacks people pull off with javascript, but they're not good as production solutions.

I think we should focus less on writing everything in JavaScript and more on bringing more languages to the browser.

A proper standardized bytecode for browsers would (most likely) allow developers a broader range of languages to choose from as well as hiding the source code from the browser/viewer (if that's good or not is subjective of course).

Binaries are also often significantly smaller than source code, reducing bandwidth.

The problem with standardising on a bytecode is that you are restricting how the browser optimises the JavaScript code (see e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=646597). It requires you to support the opcodes you publish and keep them stable (see the same bug 646597, which mentions Mozilla wanting to drop the GETGLOBAL opcode).

You also have the problem of what bytecode to standardise on -- each JavaScript engine will have a different set of bytecodes with different semantics. All engines will need to agree on the bytecode to use. This will cause issues as the opcodes and their semantics will be designed to help the optimisation paths on each engine (e.g. adding/recording type information for Mozilla's engine).

There are also other considerations as the string representation differs between engines (V8/Chrome has an ASCII string variant; Mozilla keeps them all in UTF-16) and type representation (e.g. Firefox has "fatvals" that are 64-bit value types with 32-bits for the type and 32-bits for the value; 64-bit doubles take advantage of the representation of NaN values -- see "Mozilla’s New JavaScript Value Representation" from http://evilpie.github.io/sayrer-fatval- ... e.aspx.htm).

If the bytecode is binary, you have endian issues, floating point representation issues, etc. You will end up with something like Flash -- after all, Flash has two JavaScript-like bytecodes (v1 [http://www.m2osw.com/swf_actions] and v3, which is based on Mozilla's Tamarin project), and you know how secure Flash is. Note also how Flash supports the old bytecode format for ActionScript alongside the newer (as of Flash v9) ActionScript 3, which is based on the old Mozilla JavaScript bytecode!

The Mozilla team have been working on asm.js which allows you to use a subset of JavaScript that has well defined semantics and behaviour, working like an assembly language. Tools like Emscripten can target this subset and the JavaScript engines can compile it to near-native performance. This is more likely where a "bytecode" will be specified as it does not constrain an engine to a specific bytecode format and the asm.js code is JavaScript, so it is engine neutral.
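A toy asm.js-style module illustrates the idea. This is a sketch only: real asm.js code is typically generated by compilers such as Emscripten rather than written by hand, and this fragment hasn't been run through an asm.js validator. Crucially, it is still ordinary JavaScript, so engines without asm.js support run it unchanged:

```javascript
// Minimal asm.js-style module: a statically typed subset of JS that
// engines can compile ahead of time. The |0 coercions pin values to
// 32-bit integers, which is what lets the compiler emit native code.
function MiniModule(stdlib) {
  "use asm";
  var imul = stdlib.Math.imul;
  function square(x) {
    x = x | 0;             // parameter coerced to a 32-bit int
    return imul(x, x) | 0; // 32-bit integer multiply, int result
  }
  return { square: square };
}

const mod = MiniModule({ Math: Math });
```

An asm.js-aware engine type-checks the whole module up front and skips the usual dynamic-type machinery; any other engine simply evaluates it as normal JavaScript and gets the same answers, which is the engine-neutrality argument made above.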