For Dave and David

Dave Herman jokingly accused me a couple of TC39 meetings ago of being an “advocate for JavaScript as we have it today”, and while he meant it in jest, I guess to an extent it’s true — I’m certainly not interested in solutions to problems I can’t observe in the wild. That tends to scope my thinking aggressively towards solutions that look like they’ll have good adoption characteristics. Fix things that are broken for real people in ways they can understand how to use.

This is why I get so exercised about WebIDL and the way it breaks the mental model of JS’s “it’s just extensible objects and callable functions”. It’s also why my discussions with folks at last year’s TPAC were so bleakly depressing. I’ve been meaning to write about TPAC ever since it happened, but the time and context never presented themselves. Now that I got some of my words out about layering in the platform, the time seems right.

Consider the disconnect: they’re not saying “oh, it sure would be nice if our types played better with JS”, they’re saying “you and what army are gonna make us?” Remember, WebIDL isn’t just a shorthand for describing JavaScript classes, it’s an entirely parallel type hierarchy.

Many of the Chapter 8 properties and operations are still in the realm of magic from JS today, and we’re working to open more of them up over time by giving them API — in particular I’m hopeful about Allen Wirfs-Brock’s work on making array accessors something that we can treat as a protocol — but it’s magic that DOM is appealing to and even specifying itself in terms of. Put this in the back of your brain: DOM’s authors have declared that they can and will do magic.

Ok, that’s regrettable, but you can sort of understand where it comes from. Browsers are largely C/C++ enterprises and DOM started in most of the successful runtimes as an FFI call from JS to an underlying set of objects which are owned by C/C++. The truth of the document’s state was not owned by the JS heap, meaning every API you expose is a conversation with a C++ object, not a call into a fellow JS traveler, and this has profound implications. While we have one type for strings in JS, your C++ side might have bstr, cstring, wstring, std::string, and/or some variant of string16.

JS, likewise, has Number while C++ has char, short int, int, long int, float, double, long double, long long int…you get the idea. If you’ve got storage, C++ has about 12 names for it. Don’t even get me started on Array.

It’s natural, then, for DOM to just make up its own types so long as its raison d’être is to front for C++ and not to be a standard library for JS. Not because it’s malicious, but because that’s just what one does in C++. Can’t count on a particular platform/endianness/compiler/stdlib? Macro that baby into submission. WTF, indeed.

This is the same dynamic that gives rise to the tussle over constructable constructors. To recap, there is no way in JS to create a function which cannot have new on its left-hand side. Yes, that might return something other than an instance of the function-object on the right-hand side. It might even throw an exception or do something entirely nonsensical, but because function is a JavaScript concept and because all JS classes are just functions, the idea of an unconstructable constructor is entirely alien. It’s not that you shouldn’t do it…the moment to have an opinion about that particular question never arises in JS. That’s not true if you’re using magic to front for a C/C++ object graph, though. You can have that moment of introspection, and you can choose to say “no, JS is wrong”. And they do, over and over.
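A minimal sketch of the three cases above, using plain function declarations. The names `Point`, `Impostor`, and `Grumpy` are made up for illustration and come from no spec:

```javascript
// Any ordinary function can appear to the right of `new`.
function Point(x, y) {
  this.x = x;
  this.y = y;
}
var p = new Point(1, 2);      // the usual case: p is an instance of Point

// A constructor may hand back some other object entirely.
function Impostor() {
  return { kind: "not an Impostor" };
}
var q = new Impostor();       // q is the returned literal, not an Impostor

// Or it may throw. `new` was still legal, and the function body still ran;
// the refusal is ordinary JS behavior, not a special kind of function.
function Grumpy() {
  throw new TypeError("Illegal constructor");
}
var threw = false;
try {
  new Grumpy();
} catch (e) {
  threw = e instanceof TypeError;
}
```

In every case the language let `new` through. What happens next is up to the function, which is exactly why "you may not even try" reads as foreign.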

What we’re witnessing here isn’t “right” or “wrong”-ness. It’s entirely conflicting world views that wind up in tension because from the perspective of some implementations and all spec authors, the world looks like this:

Not to go all Jeff Foxworthy on you, but if this looks reasonable to you, you might be a browser developer. In this worldview, JS is just a growth protruding from the side of an otherwise healthy platform. But that’s not how webdevs think of it. True or not, this is the mental model of someone scripting the browser:

The parser, DOM, and rendering system are browser-provided, but they’re just JS libraries in some sense. With <canvas>’s 2D and 3D contexts, we’re even punching all the way up to the rendering stack with JS, and it gets ever-more awkward the more our implementations look like the first diagram and not the second.

To get from parser to DOM in the layered world, you have to describe your objects as JS objects. This is the disconnect. Today’s spec hackers don’t think of their task as the work of describing the imperative bits of the platform in the platform’s imperative language. Instead, their mental model (when it includes JS at all) pushes it to the side as a mere consumer in an ecosystem that it is not a coherent part of. No wonder they’re unwilling to deploy the magic they hold dear to help get to better platform layering; it’s just not something that would ever occur to them.

Luckily, at least on the implementation side, this is changing. Mozilla’s work on dom.js is but one of several projects looking to move the source of truth for the rendering system out of the C++ heap and into the JS heap. Practice is moving on. It’s time for us to get our ritual lined up with the new reality.

The network is our bottleneck and markup is our lingua franca. To deny these facts is to design for failure. Because the network is our bottleneck, there is incredible power in growing the platform to cover our common use cases. To the extent possible, we should attempt to grow the platform through markup first, since markup provides the most value to the largest set of people and provides a coherent way to expose APIs via DOM.

Any place where you cannot draw a line from the browser-provided behavior of a tag to the JS API which describes it is magical. The job of Web API designers is first to introduce new power through markup and second to banish magic, replacing it with understanding. There may continue to be things which exist outside of our understanding, but that is a challenge to be met by cataloging and describing them in our language, not an excuse for why we cannot or should not.

The ground below our feet is moving and alignment throughout the platform, while not inevitable, is clearly desirable and absolutely doable in a portable and interoperable way. Time, then, to start making Chapter 8 excuses in the service of being more idiomatic and better layered. Not less and worse.

15 Comments

That was probably Jonas Sicking or Ian Hickson, not me. The reason you get this disconnect is because changing JavaScript (or ECMAScript, whatever) is hard. E.g. we asked TC39 about byte representation in 2006 or so and we are still not there. Instead we have the kludge that exposes endianness of the system from WebGL.
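Assuming the kludge in question is typed arrays (which came to the platform by way of WebGL), the leak can be sketched like this. Multi-byte typed arrays use the host's byte order, so reading a 16-bit value back through a byte view reveals it:

```javascript
// Write one 16-bit value, then look at the same buffer byte by byte.
var probe = new Uint16Array([0x0102]);
var bytes = new Uint8Array(probe.buffer);

// On little-endian hardware the low byte (0x02) comes first.
var littleEndian = bytes[0] === 0x02;

// DataView is the escape hatch: byte order is an explicit argument.
var view = new DataView(new ArrayBuffer(2));
view.setUint16(0, 0x0102, false);   // false = big-endian
var firstByte = view.getUint8(0);   // high byte first, whatever the host is
```

The value of `littleEndian` depends on the machine running the script, which is the whole complaint.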

As we have plenty of other things to work on, such as documenting a whole part of the platform not documented to date and adding new features requested by developers as we go, and we already had OMGIDL from our predecessors, Web IDL is not all that weird and actually solves a very large number of problems we have been having with behavior at the binding level not being well defined.

I agree with you that this could be prettier. Maybe dissolve the committee style development of JavaScript and do development jointly at the W3C or WHATWG instead? Everyone could still solve their own problems, but both parties would be exposed earlier to the ideas of the other and ideally that would benefit the synergy between the two (or some other buzzword that is appropriate here).

I guess the other thing I said last time and might be worth repeating, is that you cannot fix this by shouting from the sideline “fix this”. No army is needed, but like everyone we are constrained, and the ideas put forward so far have all had their share of issues and nobody so far has taken the time to address them. new HTMLDivElement() might be nice, but new HTMLElement() does not work the same way. You suggested somewhere we should make tagName writable then, but that would have to change the underlying class and has other implications that are not thought through.

Bringing JavaScript and DOM closer together definitely seems like a worthwhile goal, but also something that will take a lot of time. And if you want to get there someone will have to do all that research.

I’m a huge fan of getting everyone talking more. The world-view disconnect is strongest amongst the folks who are forced to sit in the same room least often. IIRC, I was one of only two or three folks from TC39 at TPAC at all. That’s pretty bad.

Anyway, I’m not trying to stone WebIDL to death, only to identify a waypoint on our journey that’s some distance beyond what it’s giving us. DOM prototype linearization is *huge*. But it’s just the start. And yes, I want DOM’s concerns to weigh ever more pressingly on TC39 (why don’t we have events/futures/promises/collection types/etc. in JS yet? “DOM needs them” should be a statement that carries serious weight).

First, you don’t need to specify behaviors for every constructor. I’m really flummoxed why we keep getting hung up on this. First pass, just specify what happens for all the concrete types and omit arguments; do the abstract types and constructor arguments second and third if it’s so hard (and it’s really not, just have that ctor be a no-op until we define the lifecycle). Making things callable by default is really just step one, and these things can be done in steps. Removing the magic one layer at a time is *doable*. But it’s only going to happen when it’s a priority.

As for tagName, we need some invariants preserved…but what are they, really? You need tagName to be fixed by the time you appendChild. That is to say, it needs to be a non-configurable, non-writable own property when the element is attached. That ties the identity of the object to the observable tag name, and that’s not terribly difficult to say in JS. Object.getOwnPropertyDescriptor(el, "tagName") can tell you everything you need to know.
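A minimal sketch of that invariant, using a plain object as a stand-in for an element. `lockTagName` and `hasFixedTagName` are hypothetical names invented here, not anything a DOM spec defines:

```javascript
// Hypothetical: what an attach step could do to pin the tag name.
function lockTagName(el) {
  Object.defineProperty(el, "tagName", {
    value: el.tagName,
    writable: false,
    configurable: false,
    enumerable: true
  });
  return el;
}

// Hypothetical: the check an appendChild-like operation could run.
function hasFixedTagName(el) {
  var desc = Object.getOwnPropertyDescriptor(el, "tagName");
  return !!desc && desc.writable === false && desc.configurable === false;
}

var el = { tagName: "X-WIDGET" };   // stand-in object, not a real DOM node
var before = hasFixedTagName(el);   // still an ordinary writable property
el.tagName = "X-GADGET";            // so renaming is allowed here
lockTagName(el);
var after = hasFixedTagName(el);    // now pinned to "X-GADGET"
```

Once locked, the property can be neither reassigned nor redefined, so the object's identity and its observable tag name stay tied together from that point on.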

As for the lifecycle, we’re actually defining most of what you’d need in Web Components. There’s stuff left to define about how attributes are actually stored (MSFT got this right in their DOM but lost the battle…sigh) and how types from markup will be coerced. But again, if you don’t take any arguments to the ctor to start with, you side-step the problem in the first phase of the transition.

Thanks for the thoughtful manifesto! I’m pretty much with you about avoiding magic in our APIs, though I can’t really get worked up about the constructor issue.

Where you lose me is the markup-first policy. I don’t see how that helps for APIs that are not related to document content. My view is that the open web platform has grown bigger than HTML. In my layer diagram, the bottom layer includes the HTML parser, but also the underlying hardware and OS, and has boxes for network, filesystem, audio, graphics, vibration, telephony, etc. There’s still a nice solid yellow band of JS above that layer, and then a layer of “HTML5” APIs above that, but only one of the boxes in that third layer is the DOM.

Basically, I’d say that JavaScript is our lingua franca, not HTML, and I think it would feel artificial to try to tie all Web platform APIs into the HTML document. The logical extension of my attitude is that it should be possible to write JS programs for the web platform that do not have or require an HTML document. And that is more or less what we have with SharedWorker. So I rest my case!

“As for tagName, we need some invariants preserved…but what are they, really? You need tagName to be fixed by the time you appendChild. That is to say, it needs to be a non-configurable, non-writable own property when the element is attached. That ties the identity of the object to the observable tag name, and that’s not terribly difficult to say in JS. Object.getOwnPropertyDescriptor(el, "tagName") can tell you everything you need to know.”

This is not true. tagName needs to be fixed when the object is created because it determines the interface.

Hrm, that interface guarantee might be needed for the built-in element types (which can fix their tagName early in their constructor), but I’m not sure I understand why that’s needed for other element types, particularly custom ones.

I guess it’s sort of a litmus test: caring about consistency with the language idioms is something you do or don’t have.

I’m with you on the “JavaScript is our lingua franca”, but only to the extent that it’s our analog to the JVM. HTML is our high-level language, and that’s where you describe the semantics you *want* to be writing with.

The way I phrase the question to our API designers here is “what was your thinking about HTML when you were designing this API?”. If they haven’t thought about it, go back and do it again. HTML isn’t something you can just put to one side, it’s how you get users. It can’t be an add-on.

I don’t want to derail too much from the main point of the post, which I fully agree with, but I did want to at least make a note.

I would say that HTML is strongly related to the user interface part of the platform, but the platform has grown to encompass a lot more than that. Having spent most of the last year primarily working with node, where the only DOM-related thing I’ve worked on was…dom.js, it seems kind of limited to put HTML at the center of any definition of JavaScript anymore. The DOM is what one deals with when dealing with GUI-related concerns, and most of the new APIs being implemented these days (most anything aside from webgl and audio) have no direct dependency on, or even reference to, DOM-related semantics.

This article seems to misunderstand the intention of these technologies. HTML is a data structure and nothing more. JavaScript is an interpreted language whose interpreter is supplied with many of the most common HTML parsers. That is as deep as that relationship goes and has little or nothing to do with DOM.

DOM was originally created during the Netscape/IE browser wars, but this stuff was non-standard and typically referred to as DOM Level 0. The standard, Level 1, was released by the W3C in 1998 about the same time CSS 2 and XML were released as standards. The original specification for DOM Level 1 was originally created for XML only and then simultaneously regressed and expanded for HTML support at recommendation time. I say regressed because HTML syntax has always been and continues to be far more sloppy than XML, so assumptions have to be made when HTML is parsed. I also say expanded because DOM support for HTML contains additional properties that have no relevance in XML.

It would be safe to say that DOM was created because of JavaScript, but standard DOM has little or nothing to do with JavaScript explicitly. Since the release of standard DOM it can be said that DOM is the primary means by which XML/HTML is parsed suggesting an intention to serve as a parse model more than a JavaScript helper.

Types in DOM have little or nothing to do with types in JavaScript. There is absolutely no relationship here and there shouldn’t be. These technologies serve different intentions that are entirely unrelated. It is only coincidental that they happen to work together, which is possibly the only relationship available. In DOM Level 1 types were broad and somewhat rough, because it’s based upon XML and XML is merely a syntax as opposed to a logical model. The logical model, XML Schema, came later. DOM Level 2 was heavily influenced by the work on XML Schema and was recommended in November 2000 while the work on Schema was ongoing until its recommendation in May 2001. The work is ongoing, with the recommendation of DOM Level 3 in 2004 and Schema Second Edition released at about the same time.

You cannot claim to understand the design intentions around DOM without experience working on either parsers or schema language design, but its operation and design have little or nothing to do with JavaScript. JavaScript is just an interconnecting technology like Java and this is specifically addressed in the specification in Appendix G and H respectively.

While you’ve described a relationship between HTML/DOM/JS that is *technically* correct in the context of today’s browsers (the very best kind of correct!), you’ve made the common but critical mistake of conflating what is with what should be. I know all of the history — I work on a browser for chrissake — but you’ve gone all schema-tron on this and forgotten that this isn’t about abstract descriptions, it’s about providing power for people who need it to get a job done. I could go on and on, but suffice to say, you’ve clearly articulated the antithesis of my views.

I will point people at your comment in future when they ask me what people who don’t want good layering in the platform could possibly be thinking.

While I admire your drive for consistency, and your take on the DOM seems accurate, I worry that changing JavaScript because of the DOM seems the wrong way around.

The DOM has – how can I put this kindly – never really been a “beloved” API. From its first specifications (maybe even before), JavaScript developers have found it clunky at best, weird and broken at worst. The DOM has always been something to work around, even if it was impossible to avoid. Even before the rise of frameworks, code for interacting with it was frequently encapsulated so that developers didn’t have to think about its peculiarities all the time.

I’ve been joking lately that jQuery is how the Web routes around broken API design. Take whatever the browser developers or the W3C came up with, figure out what that would look like in a more JavaScript-appropriate idiom, simplify where possible, and bake it into an ever-growing easily accessible monad. The DOM was practically test case number one for this approach, and I see a lot less DOM code as a result.

jQuery isn’t, of course, to everyone’s taste – but at the same time I can’t help wondering if a better approach to the DOM problem is studying how JavaScript developers have worked around it, and creating something new that does the same work without the excruciating legacy and cultural misfire headaches.

Providing access to the document tree using natural JavaScript idioms rather than a strange set of compromises with C++ and Java could be a huge simplification that I suspect would open new programming horizons.

Yeah, that’s difficult, probably harder than any of the current efforts to reimagine the web browser. It’s tearing down and rebuilding a key part of the stack rather than tacking on new pieces. Both APIs would have to coexist for a long time, as the DOM would not go quietly.

At the same time, though, I suspect that developers are already doing this work, and suffering because they have to do it as an extra layer on top of broken pieces.

The problem with today’s web is that it is so focused on empowering the people that it is forgetting the technology along the way. One school of thought suggests the people would be better empowered if their world were less abstract, cost radically less to build and maintain, and were generally more expressive. One way to achieve such objectives is to alter where costs exist in the current software life cycle of the web. If, for instance, the majority of costs were moved from maintenance to initial build then it could be argued that more time is spent being creative instead of maintaining.

I have found that when working in HTML alone I save incredible amounts of time when I develop only in XHTML 1.1, because the browser tells you where your errors are. There is no guesswork and most of the wisdom of my professional experience in trying to identify bad form is now automated by the browser. When I am done I can relax the document back to HTML 4 or XHTML 1 for compatibility, and many accessibility and semantics problems are solved already without trying. Challenges are removed and costs are reduced by pushing the largest cost challenges to the front of development.

What I find strange is that most companies and software projects think like this internally. They tend to solve the hardest problems first so that their software will be more stable over its lifetime. Strangely enough even applications dedicated for use on the web, such as web browsers, tend to think this way. Unfortunately this sort of disciplined business acumen is thrown away when speaking of web technologies. The typical wisdom is that people need to be empowered. If you would not think this way in your strategic software planning then why would it make sense to think this way about strategic application of web technologies?

In my mind people are generally better empowered when a given environment’s costs are reduced and complexity absolved.

Perhaps you missed the work that Erik Arvidsson did on the Dart DOM? An idiomatic standard library is the birthright of any language worth a damn. In that spirit we’re doing Web Components and have driven changes to DOM already (element.find(), prototype linearization, etc.).

The ugly truth is that only JS and Java constrain themselves to an IDL when starting on their APIs. WebIDL’s detractors are right to hold it up as a primary stumbling block to progress on this front. We certainly thought about an IDL um…never (except as a description shorthand) when designing dart:html.