Subclassing builtins in ECMAScript 6

In JavaScript, it is difficult to create sub-constructors of built-in constructors such as Array. This blog post explains the problem and possible solutions – including one that will probably be chosen by ECMAScript 6. The post is based on Allen Wirfs-Brock’s slides from a presentation he held on January 29, during a TC39 meeting.

The problem

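Creating sub-constructors of built-in constructors is difficult to impossible. The normal pattern for subclassing in JavaScript (ignoring the property constructor [1]) is to call the super-constructor on this inside the sub-constructor and to let the sub-constructor’s prototype object inherit from the super-constructor’s prototype object.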
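A minimal sketch of that pattern in ECMAScript 5 follows. The names Super, Sub and describe are hypothetical and only serve as illustration.

```javascript
// Hypothetical super-constructor
function Super(x) {
    this.x = x;
}
Super.prototype.describe = function () {
    return 'x=' + this.x;
};

// Hypothetical sub-constructor
function Sub(x, y) {
    Super.call(this, x); // let the super-constructor initialize `this`
    this.y = y;
}
// Instances of Sub inherit from Super.prototype.
// (The property `constructor` is not repaired here.)
Sub.prototype = Object.create(Super.prototype);

var sub = new Sub(1, 2);
```

Here, sub has the own properties x and y and inherits describe() from Super.prototype.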

If you invoke a constructor C via new C(arg1, arg2, ...), two steps happen (in the internal [[Construct]] method that every function has):

Allocation: create an instance inst, an object whose prototype is C.prototype (if that value is not an object, use Object.prototype).

Initialization: Initialize inst via C.call(inst, arg1, arg2, ...). If the result of that call is an object, return it. Otherwise, return inst.

There are obstacles to both steps when you try to subclass Array.

Allocation obstacle: MyArray allocates the wrong kind of object

Array instances are special – the ECMAScript 6 specification calls them exotic. Their handling of the property length can’t be replicated via normal JavaScript. If you invoke the constructor MyArray then an instance of MyArray is created, not an exotic object.
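The allocation obstacle can be observed with a naive sub-constructor. This is only a sketch; MyArray is written here in the normal subclassing pattern and deliberately does no initialization.

```javascript
function MyArray() {
    // `this` is an ordinary object whose prototype is MyArray.prototype,
    // not an exotic array object
}
MyArray.prototype = Object.create(Array.prototype);

var arr = new MyArray();
arr[0] = 'a';

var real = [];
real[0] = 'a';

// real.length has been updated by the assignment.
// arr.length is merely inherited from Array.prototype and stays unchanged.
var arrLength = arr.length;
var realLength = real.length;
```

Array.isArray(arr) is false, even though arr instanceof Array is true.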

Initialization obstacle: MyArray can’t use Array for initialization

It is impossible to hand an existing object to Array via this – it completely ignores its this and always creates a new instance.
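A short sketch of this behavior: the first part calls Array with an explicit this, the second part shows what that means for a MyArray that tries to delegate its initialization to Array.

```javascript
var obj = {};
var result = Array.call(obj, 'a', 'b');
// `result` is a freshly created array; `obj` has not been touched

function MyArray() {
    // No effect on `this`: Array returns a new array, which is discarded here
    Array.apply(this, arguments);
}
MyArray.prototype = Object.create(Array.prototype);

var arr = new MyArray('a', 'b');
// `arr` has no elements: the arguments ended up in the discarded array
```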

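One workaround is to create a real array inside MyArray, change its prototype to MyArray.prototype via __proto__ and return it. Apart from changing the prototype of an existing object being a relatively costly operation, the biggest disadvantage of this solution is that you can’t subclass MyArray in a normal manner, either.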

This is the only solution that works in current browsers (that support __proto__).
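A sketch of the __proto__-based workaround, assuming an engine that supports __proto__. The method last() is hypothetical and only there to show that methods of MyArray.prototype are reachable.

```javascript
function MyArray() {
    var inst = [];                       // a real (exotic) array
    inst.__proto__ = MyArray.prototype;  // change its prototype after the fact
    return inst;                         // returned instead of `this`
}
MyArray.prototype = Object.create(Array.prototype);

// Hypothetical method
MyArray.prototype.last = function () {
    return this[this.length - 1];
};

var arr = new MyArray();
arr.push('a', 'b');
arr[5] = 'x'; // length is updated, because arr is a real array
```

Because MyArray ignores its this in the same way that Array does, a sub-constructor that calls MyArray.call(this) runs into the initialization obstacle again.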

Non-solution: constructors make objects exotic

One could change the Array constructor so that it makes objects that are passed to it exotic. But then one faces difficulties: Some exotic objects have a special structure that you can’t add to an object after the fact. And it would allow one to add several exotic features to the same object (e.g. by first calling Array and then Date), which could lead to conflicts and other problems.

ECMAScript 6 solution: decouple allocation and initialization

Specification-wise, the new operator invokes the internal [[Construct]] method, which roughly looks like this:
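The following function is a rough approximation in plain JavaScript, following the two steps (allocation and initialization) described earlier. It is a sketch only: the real [[Construct]] is internal and cannot be written or called like this. Point and Boxed are hypothetical constructors.

```javascript
function Construct(Constr, args) {
    // Allocation: use Object.prototype if Constr.prototype is not an object
    var proto = Constr.prototype;
    if (Object(proto) !== proto) {
        proto = Object.prototype;
    }
    var inst = Object.create(proto);

    // Initialization: an object returned by the constructor wins over inst
    var result = Constr.apply(inst, args);
    return (Object(result) === result) ? result : inst;
}

// Hypothetical constructors
function Point(x, y) {
    this.x = x;
    this.y = y;
}
function Boxed(value) {
    return { value: value }; // explicitly returns an object
}

var p = Construct(Point, [3, 4]);  // like new Point(3, 4)
var b = Construct(Boxed, ['abc']); // like new Boxed('abc')
```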

Eliminating the allocation obstacle. In a subclass of Array, we’d like to reuse method [[Construct]] of Array. In ECMAScript 5, we can’t, because the prototype of a constructor is always Function.prototype and never its super-constructor. That is, it doesn’t inherit [[Construct]] from its super-constructor. However, in ECMAScript 6, constructor inheritance is the default.

Additionally, Wirfs-Brock proposes to handle allocation in a separate, publicly accessible method whose key is the well-known symbol @@create (that can be imported from some module). Array would only override that method and default [[Construct]] would look like this for all constructors:
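The idea can be simulated in plain JavaScript. This is a sketch under several assumptions: the symbol create below is a local stand-in for @@create (not a real built-in), and Construct, defaultCreate, Base, Sub and Plain are hypothetical names. Base fakes the allocation of an exotic object by changing the prototype of a real array, which an engine implementing the proposal would not need to do.

```javascript
// Local stand-in for the proposed well-known symbol @@create
const create = Symbol('@@create');

// Default allocation: an ordinary object whose prototype is this.prototype
function defaultCreate() {
    return Object.create(this.prototype);
}

// Simulated default [[Construct]]: allocation is delegated to Constr[create]
function Construct(Constr, args) {
    const allocate = Constr[create] || defaultCreate;
    const inst = allocate.call(Constr);
    const result = Constr.apply(inst, args);
    return Object(result) === result ? result : inst;
}

// Hypothetical base constructor that overrides only the allocation step
function Base() {}
Base.prototype = Object.create(Array.prototype);
Base[create] = function () {
    const inst = []; // stands in for "allocate an exotic array"
    Object.setPrototypeOf(inst, this.prototype);
    return inst;
};

// Constructor inheritance: Sub inherits Base[create] via its prototype chain
function Sub() {
    this.push('init');
}
Object.setPrototypeOf(Sub, Base);
Sub.prototype = Object.create(Base.prototype);

// A constructor without its own allocation method uses the default
function Plain(v) {
    this.v = v;
}

const s = Construct(Sub, []);
const q = Construct(Plain, [1]);
```

Sub never mentions allocation, yet s is a real array whose prototype is Sub.prototype, because the inherited allocation method is called with Sub as its this.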

You could also trigger the “function” case for instances of Foo that have already been initialized.

This solution will probably be adopted by ECMAScript 6. Its complexity will be largely hidden: You can either use the canonical way of subclassing shown above or you can use a class definition [4]:

class MyArray extends Array {
...
}
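In an engine that supports class definitions, such a subclass can be tried out as follows. The method last() is a hypothetical addition for illustration.

```javascript
class MyArray extends Array {
    // Hypothetical method
    last() {
        return this[this.length - 1];
    }
}

const arr = new MyArray();
arr.push('a', 'b');
arr[5] = 'x'; // length is updated, as for any array
```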

When new does not initialize

The following problem is independent of the “allocation versus initialization” problem discussed above: Some constructors, even when invoked directly via new, don’t initialize the instance that has been created; they throw it away. The following subsections describe when that happens.

Factory constructors

Use case. A factory constructor is “abstract”: it examines its arguments and, depending on their values, invokes one of its sub-constructors. Many class-based languages use static factory methods for this purpose.

Solution. You need to distinguish whether you are called directly via new or via a sub-constructor. It is conceivable to add language support for this. An alternative is to have a parameter calledFromSubConstructor whose default value is false. Sub-constructors set it to true. If it is true, you initialize. Otherwise, you return the result of a sub-constructor.
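A sketch of the parameter-based alternative. The constructors Shape, Circle and Square and the property kind are hypothetical; only the parameter name calledFromSubConstructor is taken from the text.

```javascript
function Shape(kind, calledFromSubConstructor = false) {
    if (!calledFromSubConstructor) {
        // Called directly via new: act as a factory.
        // The returned object replaces the instance that new allocated.
        switch (kind) {
            case 'circle': return new Circle();
            case 'square': return new Square();
            default: throw new Error('Unknown kind: ' + kind);
        }
    }
    // Called from a sub-constructor: initialize
    this.kind = kind;
}

function Circle() {
    Shape.call(this, 'circle', true);
}
Circle.prototype = Object.create(Shape.prototype);

function Square() {
    Shape.call(this, 'square', true);
}
Square.prototype = Object.create(Shape.prototype);

var viaFactory = new Shape('circle');
var direct = new Square();
```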

Cached instances

This use case is very similar to factory constructors. For example, a constructor might put every instance it creates in a cache. If it is told to create an instance that is similar to one in the cache, it returns the cached one instead. The solution is the same as for factory constructors. However, if you subclass, more work is probably needed, especially if the subclass has a different notion of similarity.
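A minimal sketch of a caching constructor, without subclassing. Color and colorCache are hypothetical, and “similar” is assumed to mean “same name”.

```javascript
// Hypothetical cache, keyed by color name
var colorCache = new Map();

function Color(name) {
    var cached = colorCache.get(name);
    if (cached) {
        // The instance that new allocated is thrown away
        return cached;
    }
    this.name = name;
    colorCache.set(name, this);
}

var red1 = new Color('red');
var red2 = new Color('red');
var blue = new Color('blue');
```

Here, red1 and red2 are the same object, while blue is a different one.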

Returning an argument

Some constructors return their argument if it fulfills certain criteria. For example, Object:

> var obj = {};
> new Object(obj) === obj
true

Again, this can be solved in the manner described for factory constructors.

A few observations

The constructor as a method

In ECMAScript 6, the constructor increasingly plays the role of a method that initializes an object. Take, for example, the following line from the [[Construct]] method:

let result = Constr.apply(inst, args);

It is equivalent to:

let result = inst.constructor(...args);

Furthermore, sub-constructors call super-constructors to help them with initialization. For example:
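A sketch with hypothetical constructors Point and ColorPoint, written in ECMAScript 5 style so that the constructor can also be invoked as a method of an already allocated instance:

```javascript
function Point(x, y) {
    this.x = x;
    this.y = y;
}

function ColorPoint(x, y, color) {
    Point.call(this, x, y); // the super-constructor helps with initialization
    this.color = color;
}
ColorPoint.prototype = Object.create(Point.prototype);
ColorPoint.prototype.constructor = ColorPoint;

// Allocation and initialization as two separate steps
var inst = Object.create(ColorPoint.prototype);
inst.constructor(1, 2, 'red'); // the constructor, invoked as a method
```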

A method for post-initialization?

One could introduce @@postConstruct, a method that is invoked after all constructors have been executed. It is the inverse of @@create. Use case for this method: freeze an instance or make it non-extensible.

How about ECMAScript 5?

There are a few tricks you can use to subclass builtins in ECMAScript 5 [5].

Acknowledgement

Thanks to Allen Wirfs-Brock for answering my questions about his slides.