Use-Case Zero

Some weeks back I lobbed an overly terse“noooooooooo!!!!!” at the W3C Web Application Security Working Group over revisions to the CSP 1.1 API; specifically a proposed reduction of the surface area to include only bits for which they could think of direct use-cases in a face-to-face meeting. At the time I didn’t have the bandwidth to fully justify my objections. Better late than never.

For those who aren’t following the minutiae of the CSP spec, it began life as a Mozilla-driven effort to enable page authors to control the runtime behavior of their documents via an HTTP header. It was brought to the W3C, polished, and shipped last year as CSP 1.0; a much better declarative spec than it started but without much in the way of API. This is a very good way for any spec to get off the ground. Having a high-level declarative form gives implementers something they can ship and prove interop with very quickly. The obvious next step is to add an API.

Late last year I started digging into CSP, both for a personal project to implement a “user CSP” extension for Chrome, and to work with the spec authors to see what state the proposed API was in and how it could be improved. The short form of my analysis of the original CSP proposal was that it was pretty good, but missed a few notes. The new proposal, however, is a reflection of the declarative machinery, not an explanation of that machine.

Not coincidentally, this is also the essential difference between thinking in terms of a welded-shut C++ implementation and a user-serviceable JavaScript design.

For example, the previously proposed API provided methods on a SecurityPolicy class like allowsConnectionTo(url) which outline an API that the browser might plausibly need to enforce a policy at runtime. The new API includes no such methods. As someone working with CSP, you suspect the browser indeed has such a method on just such an object, but the ability to use it yourself to compose new and useful behaviors is now entirely curtailed. This is the extreme version of the previous issues: an API that explains would make an attempt to show how the parser is invoked — presumably as a string argument to a constructor for SecurityPolicy. Similarly, showing how multiple policies combine to form a single effective policy would have lead away from document.securityPolicy as something that appears to be a single de-serialized SecurityPolicy and instead be written in terms of a list of SecurityPolicy instances which might have static methods that are one-liners for the .forEach(...) iteration that yeilds the aggregate answer.

So why should any WG bother with what I just described?

First, because they’ll have to eventually, and by showing only enough skin to claim to have an API in this round, the question will be raised: how does one implement new rules without waiting for the spec to evolve? The Extend The Web Forward idea that flows naturally from p(r)ollyfills has shown real power which this new design puts further from reach…but it won’t keep people from doing it. What about implementing something at runtime using other primitives like a Navigation Controller? Indeed, the spec might get easier to reason about if it considered itself a declarative layer on top of something like the Navigation Controller design for all of the aspects that interact with the network.

There are dozens of things that no over-worked spec author in a face-to-face will think to do with each of the platform primitives we create that are made either easier or harder for the amount of re-invention that’s needed to augment each layer. Consider CSS vs. HTML’s respective parsing and object models: both accept things they don’t understand, but CSS throws that data away. HTML, by contrast, keeps that data around and reflects it in attributes, meaning that it has been possible for more than a decade to write behavioral extensions to HTML that don’t require re-parsing documents, only looking at the results of parsing. CSS has resisted all such attempts at gradual runtime augmentation in part because of the practical difficulties in getting that parser-removed data back and it’s a poorer system for it. CSP can either enable these sorts of rich extensions (with obvious caveats!) or it can assume its committee knows best. This ends predictably: with people on the committee re-implementing large bits of the algorithms over and over and over again for lack of extension points, only to try to play with new variations. This robs CSP and its hoped-for user-base of momentum.

Next, the desire to reflect and not explain has helped the spec avoid reckoning with poor use of JavaScript types. The document.securityPolicy object doesn’t conceptually de-sugar to anything reasonable except a list of policy objects…but that more primitive SecurityPolicy object type doesn’t appear anywhere in the description. This means that if anyone wants to later extend or change the policy in a page, a new mechanism will need to be invented for showing how that happens: meta tags parsed via DOM, not objects created in script and added to a collection. All of which is objectionable on the basis that all that will happen is that some objects will be created and added to the collection that everyone suspects is back there anyway. This is like only having innerHTML and not being able to construct DOM objects any other way, and the right way to be presented with the need to go build idiomatic types for what will eventually be exposed one way or another is to try to design the API as though it was being used to implement the declarative form. JavaScript first gets you both good API and a solid explanation of the lifecycle of the system.

There is, of course, another option: CSP 1.1 could punt on an API entirely. That’s a coherent position that eliminates these frictions, and I’m not sure it’s a bad option given how badly the API has been mangled recently. But it’s not the best solution.

I’ve got huge hope for CSP; I think it’s one of the best things to happen to webappsec ever. What happens about the API will always be overshadowed by the value that it is already delivering. But as a design microcosm, its API is a petri-dish sized version of scenario-solving vs. layering, and a great example of how layering can deliver value over time; particularly to people who aren’t in the room when the design is being considered. An API that explains by showing how declarative layers on top of imperative is one that satisfies use case zero: show your work.