Domenic's blog about coding and stuff

This post is part of a series on the byte sources underlying the readable streams in the Streams Standard. See the introductory post for more background and links to the rest of the series.

At the simplest level, sockets can be treated much the same as files. You can get a file descriptor for the socket, and while there are a number of APIs involved in setting it up to actually connect to a remote server, once you’ve done that you can read from it using the same read(2) interface as we discussed for files.

But! For sockets, there’s an advanced technique available. Instead of using the straightforward-but-blocking read(2) call, we can fine-tune our syscall usage to give us our first taste of non-blocking I/O. That is, there’s a way to arrange it so that—without spinning up any threads—we can continue doing work while the OS gets our data ready for us.

Non-Blocking Socket I/O

A quick aside. In higher-level languages, non-blocking I/O is often conflated with non-blocking APIs. These are actually distinct concepts, and for clarity we’ll refer to the latter as “asynchronous” instead. So: I/O can be blocking or non-blocking; APIs can be synchronous or asynchronous. An asynchronous API in a higher-level language might be backed by non-blocking I/O syscalls, or it might be backed by blocking I/O syscalls in a threadpool (as we showed in the file case).

What’s really interesting is what the APIs for non-blocking I/O look like in C. They’re nothing like what you might expect from working in a higher-level language, where concepts like “events” or “callbacks” are present to do the heavy lifting. Instead, it works something like this:

When creating the socket, you set it to non-blocking mode.

You go do some other work, and every once in a while, you come back and try to read some data from the socket.

If the OS has data ready for you, you get it instantly!

Otherwise, if there’s no data ready, the OS returns a special error code, saying to try again later.

(Of course, there’s always the possibility that something went wrong, and you’ll get a non-special error code.)

The devil is in the details of how you “go do some other work” and “every once in a while” come back to check on your socket. Or, more likely, sockets plural: what kind of self-respecting program will only be dealing with a single socket?

The usual solution consists of two parts. First, redesign your program to be centered around an event loop, which continually cycles through the various things it might have to do—computation, reacting to user input, checking on and trying to read from any non-blocking sockets, processing the resulting data once it gets read, etc. Second, take advantage of some advanced APIs like select(2) or epoll(4), to allow you to check on multiple sockets at once without needing to supply a buffer to each of them. In practice, the heavy lifting for both of these is usually provided by a library like libevent (the original) or libuv (the new hotness).
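To make the steps above concrete, here is a minimal single-threaded simulation in JavaScript. Nothing in it touches a real socket: FakeSocket, tryRead, and the EAGAIN marker are hypothetical stand-ins for a non-blocking file descriptor and the “try again later” error code, and the loop is only a toy version of what a library like libevent or libuv would do with select(2) or epoll.

```javascript
// Hypothetical stand-in for the "try again later" error code.
const EAGAIN = Symbol("EAGAIN");

// Hypothetical stand-in for a non-blocking socket: the "kernel buffer" is an
// array of bytes that the "network" appends to over time.
class FakeSocket {
  constructor() {
    this.kernelBuffer = [];
  }
  // Simulates data arriving from the remote end.
  deliver(bytes) {
    this.kernelBuffer.push(...bytes);
  }
  // Never blocks: returns up to `count` bytes, or EAGAIN if none are ready.
  tryRead(count) {
    if (this.kernelBuffer.length === 0) return EAGAIN;
    return this.kernelBuffer.splice(0, count);
  }
}

const socket = new FakeSocket();
const received = [];
let otherWorkDone = 0;

// A toy event loop: each tick does some other work, then checks the socket.
for (let tick = 0; tick < 5; tick++) {
  otherWorkDone++; // "go do some other work"
  if (tick === 2) socket.deliver([1, 2, 3]); // data shows up mid-loop

  const result = socket.tryRead(16);
  if (result === EAGAIN) continue; // nothing ready; try again next tick
  received.push(...result);
}

console.log(received, otherWorkDone);
```

The loop never waits on the socket: ticks where nothing is ready cost one cheap check, and the other work carries on regardless.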

JavaScript Translation

In the previous episode, we were able to do a fairly direct translation of read(2)-in-a-threadpool into a promise-returning file.readInto API. For sockets, we’re going to need to skip a few more steps: the jump from select(2) to evented programming is just too great to map out directly.

Let’s assume we’ve somehow integrated our host program’s event loop (e.g., the one provided by libuv) with the JavaScript event loop. We can then have some host environment code that, each time through the loop, uses select(2) or similar to check if the socket has any data available. If it does, it needs to communicate that to JavaScript somehow. An easy way to do this would be with an event:

socket.on("readable", () => { ... });

Once we know that the socket has some data available, what should the JavaScript API for reading it look like? Its general shape will be pretty similar to our file.readInto from before. But this time, we know the result is going to be synchronous. That means we don’t need to worry about observable data races, and so we can skip all the transfer stuff we had to do last time to avoid them. The end result ends up being:
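A rough sketch of that shape, assuming a hypothetical synchronous socket.readInto(buffer, offset, count) called from inside the "readable" handler. The FakeSocket class, with its tiny on method and simulated kernel buffer, is purely a stand-in so the example can run on its own; only the last few lines are meant to resemble the API under discussion.

```javascript
// Hypothetical stand-in for a host-provided socket object.
class FakeSocket {
  constructor() {
    this.kernelBuffer = []; // simulated kernel-side buffer
    this.listeners = [];
  }
  on(eventName, listener) {
    if (eventName === "readable") this.listeners.push(listener);
  }
  // Simulates the host noticing (e.g. via select(2)) that data has arrived.
  deliver(bytes) {
    this.kernelBuffer.push(...bytes);
    for (const listener of this.listeners) listener();
  }
  // Synchronously copies up to `count` ready bytes into `buffer` at `offset`.
  readInto(buffer, offset, count) {
    const ready = this.kernelBuffer.splice(0, count);
    new Uint8Array(buffer, offset, ready.length).set(ready);
    return ready.length;
  }
}

const socket = new FakeSocket();
const buffer = new ArrayBuffer(8);
let bytesRead = 0;

socket.on("readable", () => {
  // The result is available immediately: no promise, no transfer.
  bytesRead = socket.readInto(buffer, 0, 8);
});

socket.deliver([10, 20, 30]); // only 3 bytes were waiting, so bytesRead < count
console.log(bytesRead, new Uint8Array(buffer));
```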

Sockets vs. Files

With this JavaScript translation in hand, we can more easily probe the differences between non-blocking socket I/O, and file I/O. It turns out there are quite a few.

The first point to note is that we’re being proactively told: there is data ready for you. But where does that data live while it’s waiting for you to come pick it up? The answer is that the OS kernel maintains its own buffer of data for that socket, where the data accumulates until you read it. If you decline to read it, then the buffer will just keep filling up, until eventually it reaches a built-in limit. Once that happens, you’ll start losing data!

This is a big difference between sockets and the higher-level stream APIs you might be used to. Streams generally go to great pains to ensure you never lose any data. But this means that any streams wrapping a socket must be careful to always pull data out of the kernel buffer before it gets too full, and then keep it around in their own user-space buffer until it’s requested.

The second interesting difference is that we have much less control over how much data we’re going to read. When the socket tells us that there’s data available, it doesn’t tell us how much. That means that in the above code, we can easily end up in a situation where bytesRead < count: indeed, it will happen whenever the kernel buffer had fewer bytes available than we requested. This is in contrast with blocking file I/O, where the only time bytesRead < count occurs is when we’ve reached the end of the file.

Finally, I want to draw attention to the different way in which buffers are provided in the two scenarios. With files, since we are doing blocking I/O in a threadpool, we need to provide the buffer up front. Whereas with sockets, we can wait until the last minute to do so. This had a pretty drastic impact on the API surface when we tried to express the result in JavaScript. In particular, while you could imagine a way to wrap up our JavaScript socket API into something like our JavaScript file API, you can’t really do the other way around.

This post is part of a series on the byte sources underlying the readable streams in the Streams Standard. See the introductory post for more background and links to the rest of the series.

Once you have opened a file descriptor, you’ll use the read(2) function to read bytes from it. In C the signature is

ssize_t read(int fd, void *buf, size_t count);

Translated into JavaScript this might look something like

const bytesRead = file.readInto(buffer, offset, count);

which will attempt to read count bytes into the ArrayBuffer buffer, starting at position offset into the ArrayBuffer. The returned number of bytes, bytesRead, might be less than the desired count, usually because you’ve reached the end of the file.

The most interesting thing to note about read(2) is that it is blocking. So our above naive translation into JavaScript would actually lock up your browser or server for the amount of time the I/O happens. This is obviously a no-go if you’re trying to write a server that serves more than one user in parallel, or trying to create a responsive 60 fps web page.

But of course we know how to fix this. We’ll just turn it into a promise-returning function:

file.readInto(buffer, offset, count).then(bytesRead => { ... });

Not so fast. How exactly do we plan on translating a blocking POSIX API into a non-blocking JavaScript API? The obvious answer is to use another thread. That is, off in a background thread, we pass the memory represented by buffer into read(2), and when read(2) finishes, we go back to the main thread and fulfill the promise we previously vended with read(2)’s return value.

This solution has a major issue, however: data races. That is, it makes it possible to observe the memory in buffer changing out from under us, with code like the following:
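A sketch of such a program, assuming the hypothetical promise-returning file.readInto described above. The file object here is a single-threaded stand-in that fills the buffer from a timer, so this particular version always prints true; the hazard arises only when a real background thread writes into buffer between the two reads.

```javascript
// Hypothetical stand-in for the threadpool-backed file object. A real version
// would hand `buffer` to read(2) on another thread; this one fills it later
// on the same thread, so it cannot actually race.
const file = {
  readInto(buffer, offset, count) {
    return new Promise((resolve) => {
      setTimeout(() => {
        new Uint8Array(buffer, offset, count).fill(1);
        resolve(count);
      }, 0);
    });
  },
};

const buffer = new ArrayBuffer(1024);
const view = new Uint8Array(buffer);

file.readInto(buffer, 0, 1024); // the read is now "in flight"

// Two reads of the same byte within one run-to-completion step. With a real
// background thread writing into `buffer`, these could observe different values.
const first = view[0];
const second = view[0];
console.log(first === second);
```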

Because the memory in buffer is being filled in by read(2) in the background thread, it’s possible for this program to output false! Oh no!

In the io.js world, this is considered OK, and with some effort you can create situations like this using their native Buffer type. However, in the world of web browsers, and in general in any world where standards bodies need to get multiple vendors to agree, this is not going to fly. JavaScript’s execution model is strongly based around a run-to-completion single-threaded paradigm, and if we poke holes in that by letting other threads modify our variables out from under us between two execution steps, all hell can break loose. No specs, libraries, or optimizing compilers are written to accommodate such a world.

One proposed solution would be to transfer the backing memory of the ArrayBuffer into a new ArrayBuffer that is only accessible once the read(2) call has finished. In code, that might look something like this:

file.readInto(buffer, offset, count).then(({ result, bytesRead }) => {
  // `result` is backed by the same memory `buffer` used to be
  // backed by, but they are not equal:
  assert(result !== buffer);
});

// `buffer`'s backing memory has now been transferred, so trying to use
// `buffer` directly (or any views onto `buffer`) will throw:
assert.throws(() => buffer.byteLength);
assert.throws(() => new Uint8Array(buffer));

Note how once buffer has been transferred, the buffer instance itself is now useless: it is “detached” in spec terms.
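Today’s engines can approximate this transfer step with the standard structuredClone(value, { transfer }) function, which moves an ArrayBuffer’s contents into a new ArrayBuffer and detaches the original. The sketch below assumes a runtime that provides it (recent browsers and recent versions of Node.js); its readInto function is a hypothetical stand-in that fills the buffer synchronously rather than calling read(2) on another thread. One difference from the hypothetical API above: in current engines a detached buffer reports a byteLength of 0 instead of throwing, while creating a view onto it does throw.

```javascript
// Hypothetical stand-in for a transferring read: takes ownership of `buffer`,
// fills part of it, and hands back a new ArrayBuffer with the same contents.
function readInto(buffer, offset, count) {
  // Detaches `buffer`; `result` now owns its contents.
  const result = structuredClone(buffer, { transfer: [buffer] });
  // Pretend read(2) produced `count` bytes of 0xff.
  new Uint8Array(result, offset, count).fill(0xff);
  return { result, bytesRead: count };
}

const buffer = new ArrayBuffer(16);
const { result, bytesRead } = readInto(buffer, 4, 8);

console.log(result !== buffer); // a different object
console.log(buffer.byteLength); // the detached original is now empty
console.log(result.byteLength, bytesRead);

let viewThrew = false;
try {
  new Uint8Array(buffer); // views onto a detached buffer cannot be created
} catch (e) {
  viewThrew = e instanceof TypeError;
}
console.log(viewThrew);
```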

We could also imagine other ways of avoiding the data races. For example, if we had an API that allowed the background thread to first detach, then “reattach,” the backing memory to buffer, we wouldn’t need the separate buffer and result variables pointing to the same backing memory. Ideally such an API would allow us to detach and reattach sections of the ArrayBuffer, so that I could (for example) read multiple files in parallel into different sections of one large buffer. I proposed this on es-discuss, but nobody seemed to be interested.

Alternately, we could decide that for a low-level JavaScript API representing a file descriptor, data races are OK after all. In that case, Mozilla’s SharedArrayBuffer proposal would be a good fit—we’ll just write to the shared array buffer in the background thread, while still allowing reading in the main thread. As mentioned before, it might be hard to get such a primitive past multiple vendors and into the relevant standards. But the desire to transpile threaded C and C++ code into asm.js is proving to be a powerful motivator, which might push it into acceptance.

This post is the beginning of a series of posts regarding some of the more interesting issues I’ve encountered while working on the Streams Standard.

In the Streams Standard we have the concept of readable streams, which are an abstraction on top of the lower-level underlying sources. In an abstract sense an underlying source is “where the chunks of data come from.” The most basic underlying sources are things like files or HTTP connections. (More complicated ones could be e.g. an underlying source that randomly generates data in-memory for test purposes, or one that synthesizes data from multiple concrete locations.) These basic underlying sources are concerned with direct production of bytes.

The major goal of the Streams Standard is to provide an efficient abstraction specifically for I/O. Thus, to design a suitable readable stream abstraction, we can’t just think about general concepts of reactivity or async iterables or observables. We need to dig deeper into how, exactly, the underlying sources will work. Otherwise we might find ourselves scrambling to reform the API at the last minute when confronted with real-world implementation challenges. (Oops.)

The current revision of the standard describes underlying sources as belonging to two broad categories: push sources, where data is constantly flowing in, and pull sources, where you specifically request it. The prototypal examples of these categories are TCP sockets and file descriptors. Once a TCP connection is open, the remote server will begin pushing data to you. Whereas, with a file, until you ask the OS to do a read, no I/O happens.

This division is conceptually helpful, but it’s instructive to go deeper and look at the actual system APIs in play here. Given that concepts like “events” are way too high-level for an OS API, they end up being shaped quite differently than how you might imagine. We’ll assume a POSIX environment for most of this series, but I’d like to talk about some Windows specifics toward the end. Along the way we’ll continually be trying to bridge the gap between these C APIs and how they might manifest in JavaScript, both to give the less-C-inclined readers a chance, and to illustrate the issues we’ve been wrestling with in the Streams Standard.

I want to document an interesting pattern we’ve seen emerge in some recent web platform specs, including promises and streams. I’m calling it the revealing constructor pattern.

The Promises Example

Let’s take the case of promises first, since that may be familiar. You can construct a new promise like so:

var p = new Promise(function (resolve, reject) {
  // Use `resolve` to resolve `p`.
  // Use `reject` to reject `p`.
});

We see here that the Promise constructor takes a single function as its sole parameter (called the “executor function”). It then immediately calls that function with two arguments, resolve and reject. These arguments have the capability to manipulate the internal state of the newly-constructed Promise instance p.

I call this the revealing constructor pattern because the Promise constructor is revealing its internal capabilities, but only to the code that constructs the promise in question. The ability to resolve or reject the promise is only revealed to the constructing code, and is crucially not revealed to anyone using the promise. So if we hand off p to another consumer, say

doThingsWith(p);

then we can be sure that this consumer cannot mess with any of the internals that were revealed to us by the constructor. This is as opposed to, for example, putting resolve and reject methods on p, which anyone could call. (And no, adding underscores to the beginning of your method names won’t save you.)
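A small check of that claim, using only the standard Promise API: the resolving function exists solely as an argument inside the executor, and the promise object itself exposes no way to settle it. The variable names are illustrative.

```javascript
let revealedResolve; // captured only because the constructing code chooses to

const p = new Promise((resolve, reject) => {
  revealedResolve = resolve; // the capability is visible here...
});

// ...but not on the promise that gets handed to consumers.
const consumerCanResolve = typeof p.resolve === "function";
const consumerCanReject = typeof p.reject === "function";

console.log(typeof revealedResolve, consumerCanResolve, consumerCanReject);
```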

Historical Origins

The first place anyone can remember seeing this pattern is in the WinJS promise implementation. Before that, promise libraries used an awkward concept called a “deferred.” You would do something like this:

var deferred = Q.defer();
var p = deferred.promise;

// Use `deferred.resolve` to resolve `p`.
// Use `deferred.reject` to reject `p`.

doThingsWith(p);

This was strange in a few ways, but most prominently, it was strange because you were constructing an object without using a constructor. This is generally an antipattern in JavaScript: we want to be able to clearly conceptualize the relationship between instances, constructor functions, and prototypes.

In contrast, with the revealing constructor pattern, we get our nice constructor invariants back. Things like:
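For instance, the usual relationships between an instance, its constructor, and its prototype can be checked directly against a promise. These three checks are my choice of illustration; each holds for any promise created with the standard constructor.

```javascript
const p = new Promise((resolve) => resolve("done"));

const isInstance = p instanceof Promise;
const constructorMatches = p.constructor === Promise;
const prototypeMatches = Object.getPrototypeOf(p) === Promise.prototype;

console.log(isInstance, constructorMatches, prototypeMatches);
```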

The Streams Example

To produce a Node stream representing a specific resource—which is somewhat analogous to producing a promise representing a specific asynchronous operation—you don’t use the stream constructor. You don’t even use something like the deferred pattern. Instead, you subclass the appropriate stream class. And then you overwrite certain underscore-prefixed methods!

So for a simplified example, here is how you would create a file reader stream using the Node APIs. I’ll use ES6 class syntax for brevity, but that is just sugar over the usual ES5 incantations.

class FileReaderStream extends Readable {
  constructor(filename) {
    super(); // required before touching `this` in a subclass constructor
    this.filename = filename;
  }

  _read(size) {
    // Use `this.filename` to eventually call `this.push(chunk)`
    // with some data from the file, or `this.push(null)` to close
    // the stream, or `this.emit("error", e)` with an error.
  }
}

var myNodeStream = new FileReaderStream("/path/to/file.txt");

There are two interesting actors here:

_read, a method not meant to be called by users directly, but instead called by the internals of the stream when it’s time to read data from the underlying source.

push and emit("error", e), which have the capability to manipulate the stream’s internal buffer and state machine. They too are not meant to be called by users directly, but instead only by implementers, inside their _read method (or perhaps inside the constructor).

Interestingly, these are almost exactly analogous to the promise situation. _read is like the executor argument passed to the promise constructor, in that it consists of user code that does the actual work. And push/emit are capabilities, like resolve/reject, which can be used by the work-doing function to manipulate internal state.

In building the streams spec, we realized the Node pattern wasn’t the way we wanted to go. Requiring subclassing for every stream instance is not ergonomic. Using underscore-prefixed methods as the extension point isn’t realistic either. And letting any user access the capabilities involved is not tenable, in part because it means implementations can’t build invariants around who has access to the internal buffer.

In contrast, the revealing constructor pattern works out really well. To create a file reader stream with whatwg/streams, you do something like

function createFileReaderStream(filename) {
  return new ReadableStream({
    pull(enqueue, close, error) {
      // Use `filename` to eventually call `enqueue(chunk)`
      // with some data from the file, or `close()` to
      // close the stream, or `error(e)` with an error.
    }
  });
}

var myWhatwgStream = createFileReaderStream("/path/to/file.txt");

Notice the difference in the external API exposed. If you pass myNodeStream to another function, that function can mess with the stream’s internal state as much as it wants, calling push, emitting "error" events, or even (despite the underscore) calling _read. Whereas if you pass myWhatwgStream around, consumers will not be able to do any of those things: the integrity of its internal state will be preserved.

(Plus, no subclassing!)
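The pull(enqueue, close, error) signature above reflects the draft at the time of writing. In the ReadableStream that browsers and recent versions of Node.js eventually shipped, the same capabilities arrive bundled in a single controller argument, but the encapsulation property is unchanged. A minimal sketch, with a hypothetical in-memory chunks array standing in for the file:

```javascript
function createChunkStream(chunks) {
  let index = 0;
  return new ReadableStream({
    // `controller` carries the enqueue/close/error capabilities; it is only
    // ever handed to the underlying source methods defined here.
    pull(controller) {
      if (index < chunks.length) {
        controller.enqueue(chunks[index++]);
      } else {
        controller.close();
      }
    },
  });
}

const myStream = createChunkStream(["a", "b"]);

// Consumers holding `myStream` find no enqueue/close/error/push on it.
const exposed = ["enqueue", "close", "error", "push", "_read"].filter(
  (name) => name in myStream
);
console.log(exposed);
```

Consumers still read the data in the usual way, through myStream.getReader().read(); they just cannot write to the stream or fail it.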

When Would I Use This?

I admit that the revealing constructor pattern seems a bit unorthodox. The number of actors involved (viz. the constructor itself, the work-doing function to which capabilities are given, and the capability arguments) can be hard to get your head around, at least the first few times you see them.

That said, it is a pretty elegant solution to a tricky problem. You might not need this level of encapsulation in your home-grown code. And even more widespread libraries may be able to skate by, as Node does, with documentation strategies and an attitude of “don’t do anything dumb with the capabilities we leave lying around, or it’ll break.” But when writing platform-level libraries and abstractions, which need to maintain their integrity in the face of any environment, the revealing constructor pattern really proves its worth.

And besides, patterns become part of our vernacular. Many patterns that are commonplace today seemed just as strange when they were introduced as the revealing constructor pattern might to you now. After working with promises and streams for a while, you might encounter a situation where a revealing constructor is a natural fit for your library’s needs. Who knows!

The W3C Technical Architecture Group has made immeasurable progress this year since
the original wave of reformist thought swept through it last
election season. The extensible web agenda, which I’ve
spoken about previously, has been adopted into their vision for the
web’s foundations and informed recent spec work across the W3C. The TAG even moved its deliverables
onto GitHub, allowing better collaboration with and transparency to developers.

But there’s always more to do. The web is slowly but surely coming into its own as a serious modern development
platform—one which can compete with native apps across the board. New APIs, new primitives, and new tools are very much
necessary to make our open platform as attractive to developers and users as it could be. To lure them away from the
walled gardens of closed app stores and vendor-proprietary development platforms, we must provide something better.

The TAG is in a unique position to oversee these efforts, with its charter to steward the evolution of web architecture
and coordinate with other relevant groups like Ecma TC39 and the IETF. As such, I’m excited to be
running for TAG membership in this
newest election cycle.

Over the last year of my increasing involvement in web standards, I’ve found two things to be paramount: developer
involvement, and a focus on solid low-level primitives. Independent of any formal role in the process, I have and
will continue to champion these causes. My nomination by the jQuery Foundation to serve on the TAG only allows me to
advocate them in a more formal role.

As a web developer myself, I experience the joys and disappointments of our platform every day. Some of you might think
it’s all disappointments—and I can certainly sympathize, given our
day-to-day frustrations. But one of the more eye-opening
experiences of the last few months has been working alongside an experienced Java developer, new to the web platform,
and seeing his almost childlike glee at how easy it is to produce complex, interactive, and robust UIs. More
generally, when I think on what I actually do for a living at Lab49—produce complex financial trading and analysis
systems, built on the open web platform—it’s hard not to be amazed. We’ve come a long way from the time when only
desktop apps were considered for serious work. Now all our clients want cross-browser and cross-device web applications,
that they can access from any computer at any time, with shareable URLs and responsive experiences and all the other
things that come with the web.

To enable developers to build such increasingly powerful experiences, we need to listen to them. That’s why I spend a
lot of time speaking at and traveling to developer conferences, or being involved on Twitter, on IRC, and on GitHub,
with the community. I recently gave a talk specifically on
how to get involved in web standards, and have been working
constantly to get developer feedback on missing features or in-progress specs since then.

Developers are a tricky bunch, as many have been trained to ignore standards bodies and simply hack together their own
solutions. They’re used to being ignored themselves. But times are changing. The
extensible web manifesto guides us to supply the web with the low-level features
developers need, and then to listen to them and roll what they build back into the platform. The TAG’s role is helping
to guide this overall process, and I hope to bring along my experience listening to and learning from the developer
community.

You may have noticed I kept saying “developers” above, and never “web developers.” That’s because I strongly believe we
need to look outside our own community for inspiration. There are lessons to be learned everywhere across the software
development landscape, from other UI frameworks and standard libraries, to other languages whose features we need in our
platform’s lingua franca of JavaScript. Perhaps most importantly, I maintain strong ties with and involvement in the
Node.js community. They provide an excellent source of inspiration and advice, as a platform that takes JavaScript far
beyond where many of us would have envisioned it only a few years ago.

Which brings us to the issue of low-level primitives. Node’s great success comes in a large part from its focus on
providing such primitives: things like standard patterns for binary data, for asynchrony, or for streaming. On top of
these they’ve built a standard library that should be the envy of any platform in both its
small size and in its power.

Of course, the web platform must by necessity evolve via consensus, and so more slowly than Node. But this gives us the
benefit of watching them run out ahead of us, make mistakes, and then come back with field reports on how it went. As
such we are getting typed arrays instead of buffers; promises instead of error-first callbacks; and
intelligently-designed streams instead of backward-compatible evolved ones. And
it’s no coincidence that I’ve been involved in both the promises and streams efforts, as I’m very passionate about
ensuring that these foundational pieces of the platform are solid enough to build on and have learned from experiences
implementing them elsewhere.

But we’re still in our infancy when it comes to building on these primitives. We need to tie them together with the rest
of the web platform. In short, we need to get to the day when the

In my view, it’s the TAG’s job to get us there. The cross-group coordination issues necessary to make visions like this
a reality are a large part of the TAG’s charter. We can provide a high-level vision, fueled by our interaction with the
developer community, for extending the web forward. And all the while, I’ll be down in the trenches, both gathering
feedback to help shape this vision, and working on specifications and interfacing with implementers to make it happen.

If this sounds like progress to you, I’d appreciate your organization’s vote.

The web platform has, historically, been somewhat of a kludge. It’s grown, organically, into something with no real
sense of cohesion. Most of its APIs have been poorly designed, by C++ developers, via
a binding layer meant originally for CORBA.

Worse, there have been major gaps in what we can do compared to native apps. And for those things that we can do, we end
up accomplishing them by drowning ourselves in custom JavaScript functionality.

The problem is in the process. Generally, new things have been introduced into our web platform via months or years of
mailing-list standardization, writing something in prose and IDL, driven by scenario-solving—without much concern for
actual utility, much less usability. Implementers expose some fundamental capability in terms of a high-level API or
declarative form that burrows down directly to the C++ layer, giving you limited customizability. After all this time,
it eventually ends up in your hands, and you end up telling the standards bodies that it’s a huge mess, or
that it solves half of your problems half of the time.

Despite all this, we’ve somehow done OK. Actually, a bit more than OK, given that the web is the most successful
platform ever. How did we manage this?

Well, we wrap up APIs with horrible usability into ones that are quite pleasant, like jQuery. We “prolyfill,” creating
libraries like Sizzle to implement CSS selector matching, or libraries like Angular to implement custom elements, in the
hope that eventually native support will appear. We transpile from languages like CoffeeScript or SASS to add new
features to our authoring languages. And in one case, promises, we even
built an interoperable standard from the ground up.

We need our platform to be better, and so we make it better, by ourselves.

The Extensible Web Manifesto

The Extensible Web Manifesto is standards bodies saying they’re ready to do
their part. Until now, we, the developers, have been shouldering all the work, writing massive JavaScript libraries or
transpilers to reinvent basic functionality.

There’s a better way, where we work together toward the future.

What these standards bodies have realized is that the web platform is our language, but like all languages, it must
evolve.

Extending our Vocabulary

Extending our vocabulary means two things:

Explaining the features of the platform that are already there. Wouldn’t it be weird if we had compound words like
“scifi,” but didn’t have the words “science” or “fiction”? If some standards body, perhaps the
French Making Up Words Consortium, just handed us the
word “sandpaper,” but we had no way in our language to talk about “sand” or “paper” individually? The web is like
that today, and we’ll go over a few examples.

Giving you new low-level features that you can use. If you wanted to invent the word “scifi,” somebody had better
have come up with the words for “science” and “fiction”! Similarly, there’s lots of things we just don’t have “words”
for on the web, yet. That’s where native apps are hurting us.

So with this in mind, let’s look at some examples.

Custom Elements

The most fundamental unexplained gap in the platform is simply: how do those damn elements even work?

Somehow, you feed a string containing some angle brackets into the browser, and they get turned into these JS objects
with terrific APIs, which we call “the DOM.” How did that happen?

Custom elements explain this process,
saying that you register a mapping of tag names to element prototypes with the browser, and that’s what the HTML parser
is actually using under the hood. This is great! This is the democratization of HTML!

And better yet, this means no more crazy widget libraries with their own crazy semantics. No more jQuery UI with its
.option thing (sometimes a setter, sometimes a getter, sometimes a method call); no more Dojo dijits; no more
Bootstrap craziness; no more WinJS with its funky winControl property. Just tags, that turn into elements, which
behave like you’d expect: they have properties, getters, setters, methods, and all that.

The Shadow DOM

But what about the existing tags? Half of the reason these widget libraries exist is so that you can create your own
stupid <select> element, because the existing one isn’t styleable or customizable.

In general, think of all the “magic” tags that exist today, like <select>, or <input type="date">, or <details>,
or <video>, or even good old <li>, whose bullet seems to come out of nowhere. In all cases, there’s some extra
“stuff” the browser is creating, and allowing users to interact with, and sometimes even allowing you to style via
ridiculous vendor-prefixed pseudo-elements like ::-moz-placeholder. But where does this extra stuff live?

The answer is: in the shadow DOM. And what’s
great about the shadow DOM, is that once we actually have a realistic basis for these hidden parts of the DOM, in
reality instead of in C++ magic-land, you’ll be able to actually start hooking into them instead of rebuilding an entire
element just to customize its behavior and styling. That day is
almost here.

Web Audio

The web audio API is a good example of both
facets of the “new vocabulary” theme. You can do fundamentally new things with web audio, like positional audio or audio
synthesis or so many other cool possibilities.

But remember the <audio> tag, from way back in 2009? It’s kind of the quintessential instance of “here’s some C++
magic thrown over the wall to you web developers; have fun!” Well, from an extensible web perspective, the <audio> tag
should be explained in terms of web audio.

Etcetera

There are of course many other APIs which exist solely to expose a new low-level hardware or platform feature to the web
platform. One of the older examples on the hardware side is the
geolocation API. On the software side, good examples include the
notifications API and fullscreen API. But
more and more are popping up as we attempt to close all the gaps preventing full parity with native apps; one particular
driver of this is the work on Firefox OS and the related device APIs.

ES6 and ES7

Finally, I want to call out ECMAScript 6 (which is nearing finalization) and ECMAScript 7 (for which efforts are just
starting to ramp up). Extending the web’s programming language is adding new vocabulary at its most literal level, and
the TC39 committee driving the evolution of ECMAScript does not disappoint in their efforts here.

In ES6 we’ll be getting subclassable built-in objects, so that you can finally extend Array or Date or the new Map
and Set types, in a way that actually works. We’ll also be getting proxies, which allow an object almost-complete
control over the meta-object protocol underlying all interactions with it. And for ES7, the proposal for
Object.observe is starting to firm up. Plus there is talk of adding weak references to the language, now that some of
their trickier aspects have been worked out.
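
As a rough sketch of the first two of these features, here is what subclassing Array and wrapping an object in a proxy look like in the syntax ES6 ended up with. The names Stack, peek, target, and accessLog are made up for illustration.

```javascript
// A subclass of Array that keeps working like an array.
class Stack extends Array {
  // Hypothetical helper method, not part of any standard.
  peek() {
    return this[this.length - 1];
  }
}

const stack = new Stack();
stack.push(1, 2, 3);

// Built-in methods such as map produce instances of the subclass.
const doubled = stack.map((x) => x * 2);
console.log(doubled instanceof Stack, doubled.peek());

// A proxy intercepting property reads on a plain object.
const accessLog = [];
const target = { answer: 42 };
const observed = new Proxy(target, {
  get(obj, key, receiver) {
    accessLog.push("get " + String(key));
    return Reflect.get(obj, key, receiver);
  }
});

const answer = observed.answer;
console.log(answer, accessLog);
```

The proxy only sees operations that go through it; reading target.answer directly would not be logged.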

Incorporating Slang

The second half of the extensible web philosophy is that we need to tighten the feedback loop between developers and
standards bodies.

Think about it: you get all these neat new low-level tools, and you build great things out of them. But you end up
downloading megabytes of JavaScript, or transpiling your code, just to get the base platform in place. This is why
almost every web page uses jQuery: because the platform itself hasn’t stepped up to the plate and incorporated jQuery’s
innovations back in.

In short, we need to incorporate this kind of invented “slang” back into our shared language. Let’s take a look at some
of the examples of this so far.

<template>

The <template> element is a generalization of the
common <script type="text/x-template"> trick. Rolling it into the browser brings additional benefits: the template tree
can be treated as an inert version of a real DOM tree, and the element parsing and serialization rules can specifically
call out templating use cases.

<dialog>

The <dialog> element obviates all of the annoying
dialog or “lightbox” libraries we keep having to ship, each with their own strange semantics. Instead, it’s a simple
tag, with some imperative APIs, some declarative features, and a nice ::backdrop pseudo-element. Sweet!

CSS Improvements

CSS is slowly but surely starting to roll in innovations from SASS and elsewhere.
CSS hierarchies, still under development, brings SASS’s nested selectors to
the browser. CSS variables uses a clever trick to get something with the same
benefits as the variables in SASS and others, but fitting in well with CSS’s existing semantics. And
CSS cascade introduces the unset keyword, which reduces all those complicated
CSS reset stylesheet rules to virtually nothing.

Pointer Events

Pointer events finally unify mouse and touch
events into a single abstraction. In one stroke, this obviates many libraries built to work around this strange
dichotomy introduced by mobile Safari, and around other strangeness relating to trying to use mouse events on a touch
device. They will be a welcome addition to the web platform.

Promises

When a pattern is adopted by jQuery, Dojo, Angular, Ember, WinJS, and YUI, as well as many other popular dedicated
libraries, it’s time to put it into the platform. Promises are on
track for ES6, and are being added to browsers now.
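
For readers who have not met the pattern, here is a minimal sketch using the Promise constructor as it was eventually standardized. The delay helper is a made-up name, and the values are arbitrary.

```javascript
// Hypothetical helper: a promise that fulfills with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(value), ms);
  });
}

const result = delay(10, 20)
  .then((n) => n + 1) // transform the fulfillment value
  .then((n) => {
    // A thrown error skips ahead to the next rejection handler.
    throw new Error("failed at " + n);
  })
  .catch((err) => "recovered: " + err.message);

result.then((message) => console.log(message));
```

The point of the pattern is that return values and thrown errors flow through the chain much as they would in synchronous code.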

What’s Next?

The extensible web is an ongoing project, and several efforts are being headed up to expose even more capabilities to
developers, or roll even more common patterns into the platform. Here’s a brief taste of those I’m watching closely.

Streams

Node.js has led the way with their battle-tested implementations, but they’ve also learned some lessons we should be
sure to heed in order to design a good browser stream API. In the end, the goal is to be able to take various sources of
binary data (HTTP requests, camera data, payloads stored in IndexedDB, the output of a web audio graph, …) and pipe them
into various sinks (<img>, <video>, and <audio> tags; other windows, frames, or workers; filesystem or remote HTTP
endpoints; or completely custom consumption code). It’s going to be really cool, but we have some work to do before we
get something as well-designed as promises were.

Fetch

The basic act of doing an HTTP request has so much complexity on the web platform: cross-domain protection; redirect
following; deserialization from bytes; cookie jars; caches… We want to provide the basic building block, and then the
ability to layer and compose each of these features on top of it.

ZIP/ZLib

There’s active investigation going on into how to expose compression primitives to the web. This is clearly something
where native bindings will be more performant, and although there are
impressive polyfills, native APIs, preferably with asynchronous off-main-thread
compression, will enable new scenarios. This work is in its early stages, so if you want to get involved, reach out.

class Elements extends Array

My personal favorite new feature is the upcoming
Elements collection. It’s a proper array subclass, using the
aforementioned ES6 subclassable builtin support, to give you something where you can finally use forEach, reduce,
filter, and all your favorite methods.

As part of this effort we added two methods, query and queryAll, to both Element.prototype and to
Elements.prototype. They act as better versions of querySelector and querySelectorAll, in that they treat relative
selectors like "> div" the way you would expect instead of throwing an error. The versions on Elements.prototype act
as composite operations over all elements in the collection, just like in jQuery.

This is the beginning of a new, friendlier DOM, and I’m pretty excited about it.

What Else?

What do we need? What is preventing you from building the web apps of your dreams? You tell us! The extensible web is
waiting for your participation!

Getting Involved

The best thing you can do to get involved in the extensible web is prolyfill. There’s only so much standardization
bandwidth to go around, so if you can create a de-facto standard like jQuery, or an open specification with wide
implementer support like Promises/A+, the world is waiting.

For example, if you wanted to figure out what a zlib API for the browser should look like, the best thing you can do is:

Learn what the constraints and use cases are. (And not just your use cases, but everyone’s!)

Design an API and library to prolyfill this gap.

Evangelize its use among developers, so that everyone recognizes it as the clear solution that browsers should just
ship and be done with it.

More generally, if you want to be involved in helping the web succeed by guiding us toward better standards, then let’s
talk. It’s an area I’ve been diving into over the last year, stemming from my Promises/A+ work but expanding into many
other things. Finding the right approach and content is delicate, as these people are jaded by newbies coming out of the
woodwork to demand feature X. But if you approach in good faith and avoid a prideful demeanor, they’re often happy to
listen. I’ve had a few success stories in this area already, and by this time next year I want to have a lot more.

Another thing I wanted to note, before closing out, is that this extensible web philosophy has teeth. The W3C Technical
Architecture Group had four seats go up for reelection recently. Four “reformers” were elected at once: Yehuda Katz,
Alex Russell, Marcos Caceres, and Anne van Kesteren. The extensible web philosophy underlies their governance, as the
ultimate technical body which provides guidance and approval for all W3C specs. We’ve already seen fruit here with their
review of the web audio spec,
among others. They’ve been helping specs build on a solid grounding in
JavaScript fundamentals, and generally be less magic and more JavaScript. All their work is being done
on GitHub, as are more and more specifications. This is happening!

To close, I’d like to give a short message of hope. It’s easy to think about all these cool things that are coming, and
then get depressed about having to support IE8 or Android 2.3 at your job. But that’s the price we pay for an open,
interoperable web. We can’t just march to the tune of a single vendor, upgrading in lockstep. Instead we work through
this collaborative, cooperative process, to build our shared language. In the end, the future is longer than the past,
and I look forward not only to living in that future, but to helping shape it, together with you all.

I wrote up a quick guide to the terminology around ES6’s iteration-related concepts, plus some notes and other
resources.

Definitions

An iterator is an object with a next method that returns { done, value } tuples.

An iterable is an object which has an internal method, written in the current ES6 draft specs as
obj[@@iterator](), that returns an iterator.

A generator is a specific type of iterator whose next results are determined by the behavior of its corresponding
generator function. Generators also have a throw method, and their next method takes a parameter.

A generator function is a special type of function that acts as a constructor for generators. Generator function
bodies can use the contextual keyword yield, and you can send values or exceptions into the body, at the points where
yield appears, via the constructed generator’s next and throw methods. Generator functions are written with
function* syntax.

A generator comprehension is a shorthand expression for creating generators, e.g.
(for (x of a) for (y of b) x * y).
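
The first few definitions can be sketched in code as follows. This is only an illustration: countTo and echo are made-up names, and generator comprehensions are left out because the syntax above is draft-only.

```javascript
// An iterator: any object with a next method returning { done, value }.
function countTo(limit) {
  let current = 0;
  return {
    next() {
      current += 1;
      return current <= limit
        ? { done: false, value: current }
        : { done: true, value: undefined };
    }
  };
}

const counter = countTo(2);
const counted = [counter.next(), counter.next(), counter.next()];
console.log(counted);

// A generator function: calling it constructs a generator.
function* echo() {
  let message = "ready";
  while (true) {
    try {
      // The argument given to next() becomes the value of this yield.
      const received = yield message;
      message = "got " + received;
    } catch (err) {
      // An exception sent in via throw() surfaces at the yield.
      message = "caught " + err.message;
    }
  }
}

const generator = echo();
const first = generator.next().value;
const second = generator.next("hello").value;
const third = generator.throw(new Error("oops")).value;
console.log(first, second, third);
```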

Notes

for-of

The new for-of loop works on iterables, i.e. you do for (let x of iterable) { /* ... */ }. So for example, it
works on arrays by looking up their Array.prototype[@@iterator]() internal method, which is specified to return an
iterator that does what you’d expect. Similarly Map.prototype, Set.prototype, and others all have @@iterator
methods that help them work with for-of and other constructs in the language that consume iterators.

Note that for-in has nothing to do with iterables, or indeed any of the concepts discussed here. It still works as
it did before, looping through enumerable object properties, and it will be pretty useless when given an iterable of any
sort.
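
A quick sketch of the difference, with made-up variable names:

```javascript
const letters = ["a", "b"];
letters.extra = "not an element";

// for-of asks the array for an iterator and sees only the elements.
const viaForOf = [];
for (const value of letters) {
  viaForOf.push(value);
}

// for-in walks enumerable property names, including the stray one.
const viaForIn = [];
for (const key in letters) {
  viaForIn.push(key);
}

console.log(viaForOf); // elements only
console.log(viaForIn); // property names, as strings

// Maps are iterable, but have no enumerable properties for for-in to find.
const map = new Map([["x", 1], ["y", 2]]);
const mapEntries = [];
for (const [key, value] of map) {
  mapEntries.push(key + "=" + value);
}
const mapKeysViaForIn = [];
for (const key in map) {
  mapKeysViaForIn.push(key);
}
console.log(mapEntries, mapKeysViaForIn);
```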

Iterable Iterators

An iterator can also be iterable if it has an @@iterator() internal method. Most iterators in the ES6 draft spec
are also iterable, with the internal method just returning this. In particular, all generators created via generator
functions or generator comprehensions have this behavior. So you can do:
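
As a rough sketch of that behavior (digits and seen are names made up for illustration, and Symbol.iterator stands in for the spec’s @@iterator):

```javascript
function* digits() {
  yield 1;
  yield 2;
  yield 3;
}

const generator = digits();

// The generator is its own iterable: asking it for an iterator returns itself.
const returnsItself = generator[Symbol.iterator]() === generator;

// So it can be handed straight to for-of.
const seen = [];
for (const digit of generator) {
  seen.push(digit);
}

// The iterators returned by arrays behave the same way.
const arrayIterator = [10, 20][Symbol.iterator]();
const arrayIteratorReturnsItself = arrayIterator[Symbol.iterator]() === arrayIterator;

console.log(returnsItself, seen, arrayIteratorReturnsItself);
```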

Making Iterable Objects

To make a custom iterable object, you use the Symbol.iterator symbol, which is the inside-JavaScript way of referring
to the specification’s @@iterator. It should return an iterator, thus making your object iterable. The easiest way to
write the iterator-returning method is to use generator syntax. Putting this all together, it looks like
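
A minimal sketch of such an object follows; playlist and songs are made-up names, and the syntax is the one that eventually shipped.

```javascript
const playlist = {
  songs: ["intro", "verse", "outro"],

  // A generator method stored under Symbol.iterator makes the object iterable.
  *[Symbol.iterator]() {
    for (const song of this.songs) {
      yield song.toUpperCase();
    }
  }
};

const played = [];
for (const song of playlist) {
  played.push(song);
}
console.log(played);
```

Each for-of loop calls the method again, so the object can be iterated more than once.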

Generator Comprehension Desugaring

You can think of generator comprehensions as “sugar” for writing out and immediately invoking a generator function, with
yields inserted implicitly at certain points. For example, the comprehension (for (x of a) for (y of b) x * y)
desugars to

(function* () {
  for (x of a) {
    for (y of b) {
      yield x * y;
    }
  }
}())

A Weirdness

It’s not entirely clear why generator comprehensions create generators instead of simple iterable-iterators. In
particular, as you can see from the above desugaring, calling throw or giving a parameter to next is pretty useless:
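
Here is a sketch of what that means in practice. Since engines do not run comprehension syntax, it uses the desugared form, with made-up inputs a and b.

```javascript
const a = [1, 2];
const b = [10, 20];

// Desugared form of (for (x of a) for (y of b) x * y).
const products = (function* () {
  for (const x of a) {
    for (const y of b) {
      yield x * y;
    }
  }
}());

// Values sent in through next() are discarded: nothing reads the yield's result.
const first = products.next("ignored").value;
const second = products.next("also ignored").value;

// An exception sent in through throw() has no handler in the body,
// so it comes straight back out and finishes the generator.
let thrownBack = null;
try {
  products.throw(new Error("boom"));
} catch (err) {
  thrownBack = err.message;
}
const finished = products.next().done;

console.log(first, second, thrownBack, finished);
```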

It seems the arguments in favor of generators instead of iterable-iterators are largely that it makes implementers' jobs
easier, at least according to
this es-discuss thread I started.

Implementation Status

Due to the tireless work of Andy Wingo, V8 (Chrome) has support for generator functions, behind
the --harmony flag (or “Experimental JavaScript Features” in chrome://flags). It also has some form of for-of,
which only works with generators and with the custom iterables returned by Array.prototype.values and
Array.prototype.keys, but does not work with anything else (e.g. arrays themselves). I assume this is because the
iteration protocol has not been implemented yet. V8 does not have generator comprehensions.

SpiderMonkey (Firefox) has an old version of everything here, with outdated syntax and semantics. Over the last week or
so, Andy has submitted patches to update their generator implementation. He’s now working on for-of and the
iteration protocol in bug #907077; no word on generator
comprehensions. Note that SpiderMonkey, unlike V8, does not hide these features behind a default-off flag.

Chakra (Internet Explorer) is as always a complete mystery, with no transparency into their development cycle or even
priority list.

And I still haven’t forgiven JavaScriptCore (Safari) for its long period of forgetting to implement
Function.prototype.bind, so I haven’t even tried looking into their status.

Today we revealed The Extensible Web Manifesto, calling for a new approach to
web standards that prioritizes new low-level capabilities in order to explain and extend higher-level platform features.
I want to take a minute to talk about what this means in practice, and why it’s different from how we operate today.

Show Me the Magic!

The core of the extensible web manifesto says two things:

We should expose new low-level capabilities to JavaScript

We should explain existing high-level features in terms of JavaScript

The first of these is fairly obvious, but important. It’s this goal that has brought us things that were previously
restricted to native apps or Flash, e.g. webcam and geolocation access, or WebGL. Even something as simple as the page
visibility API is a nice, low-level capability that helps us build our apps in ways we couldn’t do before.

What all of these low-level APIs have in common is that they’re exposing “C++ magic” to us, the web developers. They’re
bridging the gap between native hardware and OS capabilities, into the realm of web applications. This is good. This is
where the magic belongs: where our technology fails us.

The second point is where things get subtle. Because it turns out there’s a lot more C++ magic going on in your browser
than just access to hardware, or to OpenGL bindings. Arguably the biggest amount of magic in the web platform is the
magic we take for granted: the magic that translates declarative HTML and CSS into what we see on the screen and
manipulate with JavaScript.

This is the more revolutionary part of the extensible web manifesto. It’s drawing a line in the sand and saying: the
C++ magic stops here. We need to stop building high-level features of the platform out of magic when it’s not
necessary to do so, and we need to explain the existing features in terms of JavaScript technology—not C++ magic—at
least until we bottom out at the low-level hardware capabilities discussed above. By taking this stand, we enable users
to extend the web platform without rebuilding it from scratch.

#extendthewebforward In Practice

Custom tags are a great example of this principle in action. If you stick to the pre-defined list from the W3C,
then you can use your declarative HTML all over the place. Each tag gets turned into its own JavaScript counterpart via
the parser, whether it be the humble transformation of <p> into HTMLParagraphElement or the complex wirings between
<img> and HTMLImageElement.

Everything the W3C doesn’t know about, however, gets turned into the featureless blob that is HTMLUnknownElement. Hrm.

There’s clearly a lot of magic going on here, most of it encapsulated in that “parser” thing. What if the parser was
extensible, and could be explained in terms of a JavaScript API? What if we pushed the magic further back? A good first
step might be allowing
the registration of custom elements. That way, we could
explain the inner workings of this magic parser in terms of how it looks up an element name in a registry, and then
derives instances from that registry. This has a wonderful emergent property as well: now that HTML elements are
explained by a C++ parser turning HTML into JavaScript, our JavaScript objects can use the usual mechanisms of the
language, like prototypal inheritance and constructors, to build on existing HTML elements.

The shadow DOM is another of my favorite examples. While <p> might be a relatively non-magical element, clearly
there’s a lot of crazy stuff going on with <select>! And don’t get me started on <audio> and <video>. It’s as if
there’s a whole, um, shadow DOM, hidden underneath the one we see, accessible only by C++. The goal of the shadow DOM
spec, going back to its earliest conception, has been to
bring that magic out of C++ and explain it in terms of
JavaScript primitives.

But it’s not just these large examples, attempting to explain HTML. What about something as simple as … parsing
URLs? Clearly, the platform has this capability:

But somehow, this capability got tied up inside the high-level abstraction of the <a> element, and isn’t accessible
to us directly as JavaScript programmers. We’re left reimplementing it, often
incorrectly, on our own. It’s this kind of travesty that work like
the URL spec in particular, and the extensible web movement in general, is
trying to prevent.
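
For a sense of what direct access looks like, here is a small sketch using the URL constructor defined by the URL spec. The example addresses are arbitrary.

```javascript
const url = new URL("https://example.com:8080/path/to/page?query=1#section");

// The same pieces an <a> element exposes, without needing an element.
const parts = {
  protocol: url.protocol,
  hostname: url.hostname,
  port: url.port,
  pathname: url.pathname,
  search: url.search,
  hash: url.hash
};
console.log(parts);

// Relative references are resolved against a base URL.
const resolved = new URL("../other?x=1", "https://example.com/a/b/c").href;
console.log(resolved);
```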

Let’s keep going. What does this do?

var els = document.querySelectorAll(":enabled");

Well, hmm, something about “an enabled state”. I wonder what
that means. Probably something to do with
the disabled attribute. How did the browser know that? More magic! We passed in a string that’s in a magic
list called “CSS3 selectors spec,” and it has some magic handler that turns that into “elements where
el.disabled === false.” This is a pretty high-level abstraction; could we explain it with technology instead of magic?
What about some kind of CSS selector registration?
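
To make the idea concrete, here is a purely hypothetical sketch. No such browser API exists; registerSelector, queryAll, and the plain objects standing in for elements are all invented for illustration.

```javascript
// Hypothetical registry mapping pseudo-class names to predicate functions.
const pseudoClasses = new Map();

function registerSelector(name, predicate) {
  pseudoClasses.set(name, predicate);
}

// Stand-in for querySelectorAll: filters a list of element-like objects.
function queryAll(elements, name) {
  const predicate = pseudoClasses.get(name);
  if (!predicate) {
    throw new SyntaxError("Unknown selector: " + name);
  }
  return elements.filter((el) => predicate(el));
}

// The "magic" behind :enabled, written as an ordinary function.
registerSelector(":enabled", (el) => el.disabled === false);

const fakeElements = [
  { id: "a", disabled: false },
  { id: "b", disabled: true },
  { id: "c" } // no disabled property at all, like a <p>
];

const enabledIds = queryAll(fakeElements, ":enabled").map((el) => el.id);
console.log(enabledIds);
```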

It’s of course more complicated than that. To make this performant and sensible, we’re going to need some way to ensure,
or at least assume, that the function passed is pure (i.e. side-effect free and gives the same answers to the same
inputs). But we need something like that anyway.
It’ll happen.

There’s so many more examples. Once you start seeing the web browser in this way, trying to pick apart and explain its
C++ magic in terms of JavaScript technology, you’ll have a hard time un-seeing how much work needs to be done. Here’s
just a brief scratchpad idea list:

What’s Next

The extensible web manifesto was a statement of intent, and of prioritization.

Some of the reaction to it has been along the lines of “Well … duh!?” To which I reply: exactly! This should be
obvious. But if you look at how the standards process has worked historically, that’s not what happened. Implementers
handed down new high-level solutions from their ivory towers, without knowing if they solved the problems we as
developers actually had. Or they knew we needed some primitive, but gave us a slightly messed-up version of it, due to
lack of attention to common use cases. It’s been somewhat tragic, actually.

The publication of the extensible web manifesto is taking a stand for a process that should help avoid these missteps,
by doing what we should have been doing all along. In short, prioritize efforts to expose low-level tools; then,
watch what the developer community converges on and fold that back into the web platform now that we know it solves
common high-level use cases. And this ideology has teeth: the W3C’s newly-reformed
Technical Architecture Group has taken on the case of overseeing this
effort, ensuring that new APIs are introduced to explain away the C++ magic in terms of idiomatic JavaScript.

Simply put, code is statically scoped if you can statically analyze it and determine what all the identifiers refer
to. In other words, you can statically determine where every variable was declared. As we’ll see, JavaScript’s sloppy
mode does not have this property, giving you yet one more reason to shun it in favor of "use strict".

Sloppy Mode = Dynamic Scoping

Most of the time, JavaScript scoping is fairly simple. You look up the scope chain, as declared in the source code; if
you can’t find something, it must be on the global object. But in sloppy mode, there are several situations that can
foil this algorithm.

Use of with

Using the with statement completely destroys the sanity of your scope chain:
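
A sketch of the kind of code in question follows. It is a reconstruction under assumptions, not a quoted listing: printFilename is a made-up name, and the Function constructor is used only because it always produces sloppy-mode code, even when this snippet is pasted into a strict file or a module.

```javascript
// Assumption: wrapping in new Function guarantees a sloppy-mode body.
var printFilename = new Function("anotherContext", `
  var __filename = "/my/cool/file.js";

  with (anotherContext) {
    // Variable or property of anotherContext? Only known at runtime.
    console.log(__filename);
    return __filename;
  }
`);

var fromVariable = printFilename({});
var fromProperty = printFilename({ __filename: "/another/file.js" });
```

The same line of source prints a different thing depending on which object happens to be passed in.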

In this example, we can’t statically determine if console.log(__filename) is referring to the free __filename
variable, set to "/my/cool/file.js", or if it’s referring to the property of anotherContext, set to
"/another/file.js".

Strict mode fixes this by banning with entirely. Thus, the above code would be a
syntax error if it were placed in a strict context.

Use of eval

Using eval in sloppy mode will introduce new variable bindings into your scope chain:
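
Again as a reconstruction under assumptions rather than a quoted listing: makeRequireStuff is a made-up wrapper, and the Function constructor is used only to guarantee sloppy-mode code regardless of where the snippet is pasted.

```javascript
// Assumption: wrapping in new Function guarantees a sloppy-mode body.
var makeRequireStuff = new Function(`
  function require(id) {
    return "outer require: " + id;
  }

  return function requireStuff(code) {
    // Direct eval in sloppy mode: declarations in the evaluated code
    // become bindings in requireStuff's own scope.
    eval(code);
    return require("fs");
  };
`);

var requireStuff = makeRequireStuff();

var withoutShadowing = requireStuff("");
var withShadowing = requireStuff("function require(id) { return 'eval require: ' + id; }");
console.log(withoutShadowing);
console.log(withShadowing);
```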

In this example, we have a similar problem as before: eval might have dynamically introduced a new variable binding.
Thus, require can refer either to the new function introduced by eval into requireStuff’s scope, or to the
function declared in the outer scope. We just can’t know, until runtime!

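Strict mode fixes this as well: code passed to eval gets its own scope, so it can no longer introduce new bindings into
the surrounding function.

But in sloppy mode, it is literally unknowable what a given variable refers to. At any time, an enclosing with or a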
nearby eval could come along, and really ruin your day. In such a setting, static analysis is doomed; as we’ve seen
in the above examples, the meaning of identifiers like require or __filename can’t be determined until runtime.

So? Just Don’t Use Those Things

A common refrain from people who can’t handle typing "use strict"
is that they’ll simply not use these features. And this indeed suffices: if you subset the language, perhaps using tools
like JSHint to enforce your subsetting rules, you can create a more sane programming environment.

Similar arguments are applied commonly in other languages, like prohibiting the use of exceptions or templates in C++.
Even telling people to not pass expressions to their require calls in Node.js modules falls under this category (with
the rationale that this breaks the popular browserify tool).

I don’t buy these arguments. A language should give its users built-in tools to use it correctly. In the case of
JavaScript, there is one very clear tool that has been given: a simple, backward-compatible "use strict" pragma at the
top of your source files. If you think that’s difficult, try being a C++ programmer and writing exception-safe code: the
techniques you need to use are a lot more involved than a single pragma.

Use Strict Mode

In the words of Mark Miller, ECMAScript 5 strict mode has
transitioned JavaScript into the “actually good” category of programming languages. Let’s use it. Opt in to static
scoping, and a saner language in general. Use strict.

npm is awesome as a package manager. In particular, it handles sub-dependencies very well: if my package depends on
request version 2 and some-other-library, but some-other-library depends on request version 1, the resulting
dependency graph looks like:

├── request@2.12.0
└─┬ some-other-library@1.2.3
  └── request@1.9.9

This is, generally, great: now some-other-library has its own copy of request v1 that it can use, while not
interfering with my package’s v2 copy. Everyone’s code works!

The Problem: Plugins

There’s one use case where this falls down, however: plugins. A plugin package is meant to be used with another “host”
package, even though it does not always directly use the host package. There are many examples of this pattern in the
Node.js package ecosystem already:

Even if you’re not familiar with any of those use cases, surely you recall “jQuery plugins” from back when you were a
client-side developer: little <script>s you would drop into your page that would attach things to jQuery.prototype
for your later convenience.

In essence, plugins are designed to be used with host packages. But more importantly, they’re designed to be used with
particular versions of host packages. For example, versions 1.x and 2.x of my chai-as-promised plugin work with
chai version 0.5, whereas versions 3.x work with chai 1.x. Or, in the faster-paced and less-semver–friendly world of
Grunt plugins, version 0.3.1 of grunt-contrib-stylus works with grunt 0.4.0rc4, but breaks when used with grunt
0.4.0rc5 due to removed APIs.

As a package manager, a large part of npm’s job when installing your dependencies is managing their versions. But its
usual model, with a "dependencies" hash in package.json, clearly falls down for plugins. Most plugins never actually
depend on their host package, e.g. grunt plugins never do require("grunt"), so even if plugins did put down their host
package as a dependency, the downloaded copy would never be used. So we’d be back to square one, with your application
possibly plugging in the plugin to a host package that it’s incompatible with.

Even for plugins that do have such direct dependencies, probably due to the host package supplying utility APIs,
specifying the dependency in the plugin’s package.json would result in a dependency tree with multiple copies of the
host package—not what you want. For example, let’s pretend that winston-mail 0.2.3 specified "winston": "0.5.x" in
its "dependencies" hash, since that’s the latest version it was tested against. As an app developer, you want the
latest and greatest stuff, so you look up the latest versions of winston and of winston-mail, putting them in your
package.json as

{
  "dependencies": {
    "winston": "0.6.2",
    "winston-mail": "0.2.3"
  }
}

But now, running npm install results in the unexpected dependency graph of

├── winston@0.6.2
└─┬ winston-mail@0.2.3
  └── winston@0.5.11

I’ll leave the subtle failures that come from the plugin using a different Winston API than the main application to
your imagination.

The Solution: Peer Dependencies

What we need is a way of expressing these “dependencies” between plugins and their host package. Some way of saying, “I
only work when plugged in to version 1.2.x of my host package, so if you install me, be sure that it’s alongside a
compatible host.” We call this relationship a peer dependency.

The peer dependency idea has been kicked around for literally years. After volunteering to get this done “over the
weekend” nine months ago, I finally found a free weekend, and now peer dependencies are in npm!

Specifically, they were introduced in a rudimentary form in npm 1.2.0, and refined over the next few releases into
something I’m actually happy with. Today Isaac packaged up npm 1.2.10 into
Node.js 0.8.19, so if you’ve installed the latest version of
Node, you should be ready to use peer dependencies!

As proof, I present you the results of trying to install jitsu 0.11.6 with npm
1.2.10:

As you can see, jitsu depends on two Flatiron-related packages, which themselves peer-depend on conflicting versions
of Flatiron. Good thing npm was around to help us figure out this conflict, so it could be fixed in version 0.11.7!

Using Peer Dependencies

Peer dependencies are pretty simple to use. When writing a plugin, figure out what version of the host package you
peer-depend on, and add it to your package.json:

{
  "name": "chai-as-promised",
  "peerDependencies": {
    "chai": "1.x"
  }
}

Now, when installing chai-as-promised, the chai package will come along with it. And if later you try to install
another Chai plugin that only works with 0.x versions of Chai, you’ll get an error. Nice!

One piece of advice: peer dependency requirements, unlike those for regular dependencies, should be lenient. You
should not lock your peer dependencies down to specific patch versions. It would be really annoying if one Chai plugin
peer-depended on Chai 1.4.1, while another depended on Chai 1.5.0, simply because the authors were lazy and didn’t spend
the time figuring out the actual minimum version of Chai they are compatible with.

The best way to determine what your peer dependency requirements should be is to actually follow
semver. Assume that only changes in the host package’s major version will break your plugin. Thus,
if you’ve worked with every 1.x version of the host package, use "~1" or "1.x" to express this. If you depend on
features introduced in 1.5.2, use ">= 1.5.2 < 2".
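
To see why lenient ranges matter, here is a deliberately simplified sketch. It is not npm’s real semver algorithm: the satisfies function handles only exact versions and "N.x"-style ranges, and all names and version lists are made up.

```javascript
// Simplified matcher: "x" in the range matches any value in that position.
// NOT npm's semver implementation; for illustration only.
function satisfies(version, range) {
  const versionParts = version.split(".");
  return range.split(".").every((part, i) => part === "x" || part === versionParts[i]);
}

// Host versions that satisfy every plugin's peer dependency range at once.
function findCompatible(hostVersions, peerRanges) {
  return hostVersions.filter((version) =>
    peerRanges.every((range) => satisfies(version, range)));
}

const publishedHostVersions = ["1.4.1", "1.5.0", "1.5.2"];

// Two plugins with lenient ranges can share any 1.x host.
const lenient = findCompatible(publishedHostVersions, ["1.x", "1.x"]);

// Two plugins pinned to different patch versions can never be installed together.
const pinned = findCompatible(publishedHostVersions, ["1.4.1", "1.5.0"]);

console.log(lenient, pinned);
```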