Chakra: Interoperability Means More Than Just Standards

August 26th, 2010

How do we decide whether to implement a feature that isn’t included in a standards specification? Like all browser providers, we often have to make this decision. In this post, I’ll use some real-world JavaScript examples to illustrate some of the principles we use to deliver an interoperable browser when the standards specification isn’t enough.

In an ideal world, a standards specification is the complete story. A JavaScript implementation would just include everything in the ECMAScript specification and nothing more. We believe that specifications published by standards organizations such as Ecma International or the W3C are essential for interoperability among web browsers. Ideally, such specifications tell the browser implementers what features they need to provide and tell web developers what features they should be able to use.

In the real world of the web, things are not so clear-cut. Specifications are seldom perfect and sometimes they are intentionally incomplete or ambiguous. Having served as editor for the ES5 specification, I know that there are always issues that don’t get fully resolved. The result is that there are widely implemented and used features that are not defined by any standard specification. If you are trying to build a browser that runs the existing web you have to implement many features that are not defined by a standards specification.

What’s the Real Regular Expression Grammar?

We built Chakra by carefully following the ECMAScript 5 (ES5) specification including the grammar for regular expressions. But when we started testing on actual web sites we started seeing pages not work because of syntax errors on some regular expression literals. For example, we failed on regular expressions containing a right square bracket such as:

next = /]/.exec(buffer);

The reason was that the grammar for regular expressions in the ECMAScript standard includes ] in a list of characters that you cannot directly use as a match character. The specification says that /]/ is illegal and instead you need to say /\]/. It turns out that it is important that [ is restricted in this manner because otherwise it wouldn’t be possible to parse literals like /axyz[a-d]qwer/. However a lone ] that isn’t preceded by a [ really doesn’t present a parsing problem. But the standard still says that it is illegal.

In practice, all browsers accept regular expressions such as /]/ and web developers write them.

Consensus feature is a term we use to describe such features that are not part of any standard but which are universally implemented in an interoperable manner. In general, IE9 and any other new JavaScript implementation need to implement such consensus features in order to work with actual web content. The real challenge is identifying them, as they are not included in the standards specifications.

We recognize that we have to support these sorts of consensus features and we will also be working to make sure that they get included in future editions of the appropriate web standards. However, not all JavaScript extensions are universally or uniformly implemented.

Why Doesn’t IE9 have those _ _ methods?

ES5 includes support for getter/setter methods. Similar features have been available in some web browsers for many years. Over that period, browser developers experimented with various syntaxes and APIs for defining such methods. Prior to ES5, the most widely implemented API for defining getter/setter methods used two methods named __defineGetter__ and __defineSetter__. Surrounding method names with two underscore characters is a convention that some browser developers use to identify methods that are either experimental features or which access unique low level capabilities of a specific JavaScript implementation.

ECMA TC39 members agreed that getter/setter methods should be included in the ES5 specification because their value had been clearly demonstrated. However, TC39 also chose to design a new API for defining them and to not standardize the __defineXXX__ APIs.

Why did TC39 do this? Because of significant differences in the behaviors of the browsers that had provided that API. Standardizing on a common semantics for these methods would mean that some or perhaps all existing browser implementations would have to change to conform to the new semantics. That would probably break some applications that were written to work with those specific browsers.

Another reason was to maintain consistent naming conventions. No other built-in names in ECMAScript begin or end with underscore characters. The association of such names with experimental or an implementation specific feature is widely known and using that convention for a standardized feature would be both misleading and would dilute the utility of the convention for future experiments.

TC39 developed a new API based upon the Object.defineProperty method. This API not only supports defining getter/setter properties but also support other new capabilities of ES5. The __defineXXX__ methods were left out of the ES5 specification.

Browsers that already provide the __defineXXX__ API are free to continue to supporting them unchanged for compatibility reasons. However, new code that is written to be interoperable among browsers supporting the ES5 standard should use Object.defineProperty to define getter/setter properties.

We still get requests that we add the __defineXXX__ APIs to IE9. We understand why. Some developers already have code that uses those APIs and they would like that code run on IE9. However, we believe that supporting these APIs would not be in the best interests of the interoperable web. TC39 has already considered this API and concluded that it should not be standardized. If IE9 added support for the __defineXXX__ APIs, that action would move that API closer to being a permanent consensus feature of the web and would be essentially overriding the decision of TC39.

Unfortunately, in the short term this creates a small amount of additional work for developers currently using the legacy API. However, it is trivial to create a simple compatibility library for your code that implements the __defineXXX__ APIs in terms of Object.defineProperty.

A Malfunctioning Consensus Feature

Within JavaScript code, a function can be referenced and called in code that precedes the actual function declaration. This is possible because JavaScript logically “hoists” all function declarations to the top of the containing code body. JavaScript also allows more than one function declaration to exist for the same function name. Consider a simple test function:

The ECMAScript specification has historically placed only one restriction upon the placement of function declaration. The ECMAScript standard does not include the ability to place a function declaration inside the body of a control structure statement. For example:

if (someCondition) { function f1() {alert("f1")}
}

The above code would produce a syntax error if parsed in according to the ECMAScript specification. However, if you try this in any browser, you will find that you do not get an error. Supporting function declarations anywhere a statement is permitted is a consensus feature of the web. If so, why wasn’t it included in ES5? Section 12 of the ES5 specification actually says something about this:

NOTE Several widely used implementations of ECMAScript are known to support the use of FunctionDeclaration as a Statement. However there are significant and irreconcilable variations among the implementations in the semantics applied to such FunctionDeclarations. Because of these irreconcilable difference, the use of a FunctionDeclaration as a Statement results in code that is not reliably portable among implementations. It is recommended that ECMAScript implementations either disallow this usage of FunctionDeclaration or issue a warning when such a usage is encountered. Future editions of ECMAScript may define alternative portable means for declaring functions in a Statement context.

What does this dense statement mean? Simply stated, the behavior across different browsers varied so widely that reconciling them was impractical.

So what does IE9 do? Does it disallow such statement level function declarations as recommended by ES5? No, it treats such declarations exactly like previous versions of IE.

Why did we make that decision? Because previous experiments by browser implementers found that rejecting such declarations broke many existing web pages.

This may seem surprising. How can a web page rely on features that behave different across browsers and still be interoperable among browsers? One possibility is that the code isn’t actually used. It may be in a function that is never called or whose result is not essential to the operation of the page. However, if the JavaScript implementation treated these functions as syntax errors the entire script would be rejected even though the code is never called.

We don’t like having to implement a feature that is such a hazard for developers. But its existence is enshrined in the web so we really had no choice.

So What About const?

We are sometimes asked whether we will support the const declaration in IE9. const is a non-standard JavaScript extension that provides a way to declare “named constant” values. A typical usage might be something like:

const pi=3.14159;

Several browsers support this feature but there are significant differences in their handling of unusual situations and in what they define to be errors. Here is an example that shows some of the differing behavior among the browsers that “support” const:

Alert: ‘1’, ‘2’, ‘2’. This means conflicting const declarations are allowed but the assignment was ignored.

Alert ‘1’, ‘2’, ‘3’. This means that const was treated just like var.

Basically, only very simple uses of the const declaration are interoperable among browsers that support the declaration. Many more elaborate uses, or scenarios that might trigger some error condition, are not interoperable among those browsers.

An actual JavaScript implementation can’t only support the trivial common use cases. A real implementation also has to do something for all the odd edge cases and error scenarios that exist on real web pages. Arguably, it’s mostly for situations like this that we have standardized specifications. It’s generally pretty obvious what should be done for the simple common use cases. It’s the edge cases that require a standards specification in order to have interoperable implementations.

TC39 seriously considered including const in ES5 and it was in early ES5 drafts. However, there were many issues about how it should be defined and how it interacts with other declarations. Ultimately, there was agreement within TC39 that standardization of const should wait for the “next” edition of the specification where perhaps some of these other problems could also be addressed. Basically, TC39 decided it did not want to standardize on a flawed feature. Instead, it chose to defer any standardization of const for a future where a single improved design could be incorporated into all browsers.

So, what is IE9 doing with const? So far, our decision has been to not support it. It isn’t yet a consensus feature as it has never been available on all browsers. There is no standard specification and there are significant semantic differences among all the existing browser implementation. In addition, we know that TC39 decided not to standardize any of the existing alternatives and wants to reconsider it in the next ECMAScript revisions.

We understand the desire of many web developers to have such a declaration. But we also don’t want to create another situation like conditional function declarations where the feature has to exist but web developers better not use it if they care about browser interoperability. In the end, it seems like the best long term solution for the web is to leave it out and to wait for standardization processes to run their course.

Principled Decision Making

These are just four JavaScript specific examples illustrating the sort of decisions we have to make in creating IE9. Many similar issues come up with regard to other web standards and consensus features. For each we perform a similar analysis. What does the standard say? How many websites actually require it? Is it a consensus feature with a common semantics? Are there multiple, incompatible variants of the feature? Is a standards committee working on it? Do any test suites exist? Would our adoption help or hinder the standards processes? In the end it is a judgment call, and we try hard to make principled consistent decisions.