Saturday, February 15, 2014

HTTP 308 Incompetence Expected

Internet History

The Internet from every angle has always been a house of cards held together with defective duct tape. It's a miracle that anything works at all. Those who understand a lot of the technology involved generally hate it, but at the same time are astounded that for end users, things seem to usually work rather well.

Today I'm going to point out some proposed changes being made to HTTP, the standard which the World Wide Web runs on. We'll see how not even the people behind the standards really know what they're doing anymore.

The World Wide Web began in the early 90s in a state of flux. The Internet Engineering Task Force, as well as major players like Netscape, released a bunch of quasi-standards to quickly build up a set of design rules and techniques, which were used until HTTP v1.0 came out in 1995. Almost immediately after, HTTP v1.1 was being worked on, and despite not being standardized until 1999, it was pretty well supported in 1996. This is around the same time Internet Explorer started development, and a lot of its initial work was basically duplicating functionality and mechanics from Netscape Navigator.

Despite standards and sane ways of doing things, implementers always deviate from them, or come up with incorrect alternatives. Misunderstandings, and competing ideas on how things should work, are what shaped things in the early days.

Thankfully though, over the years, standards online are finally being more strictly adhered to, and bugs are being fixed. Precise specifications exist for many things, as well as unit tests to ensure adherence to standards. Things like Internet Explorer 6 are now a distant memory for most (unless you're in China).

Existing Practice

A key point which led to many standards coming into existence was existing practice. Some browser or server would invent something, and the others would jump on board, and a standard would be created. Those who deviated were told to fix their implementation to match either the majority, or what was correct and would cause the least amount of issues for the long term stability of the World Wide Web.

Now we'll see how today's engineers want to throw existing practice out the window, loosen up standards to the point of meaninglessness, and basically bust the technology you're currently using to view this article.

HTTP Responses

One of the central design elements of HTTP is that every response from a server carries a code which identifies what the result is, and servers and clients should understand how to work with each particular response. The more precise the definition, the better the online experience we'll all have.

HTTP v0.9 was in a constant state of fluctuation, but offered three basic kinds of page redirects: permanent, temporary, and one which was unclear and not fully specified. These were defined as status codes 301, 302, and 303 respectively:

Moved 301: The data requested has been assigned a new URI, the change is permanent.
Found 302: The data requested actually resides under a different URL, however, the
redirection may be altered on occasion.
Method 303: Note: This status code is to be specified in more detail. For
the moment it is for discussion only.
Like the found response, this suggests that the client go try another network
address. In this case, a different method may be used.

The explanation behind a permanent and a temporary redirect seems pretty straightforward. 303 is less clear, although it's the only one which mentions that the method used is allowed to change. It's even the name associated with the response code.

Several HTTP methods exist, for different kinds of activities. GET is a method to say, hey, I want a page. POST is a method to say, hey here's some data from me, like my name and my credit card number, go do something with it.

The idea with the different redirects essentially was that 303 should embody "your request was processed, please move on" (hence a POST request should now become a GET request), whereas 301 and 302 were to say "what you need to do is elsewhere (permanently or temporarily), please take your business there" (POST should remain POST).
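To make that distinction concrete, here is a minimal Python sketch of how a client following the original intent would pick the method for the redirected request. The function name `redirected_method` is made up for illustration, and it only covers the three codes discussed so far.

```python
def redirected_method(status: int, method: str) -> str:
    """Return the method a strict client would use when following a redirect.

    This models the intent described in the article, not any particular
    browser's behavior.
    """
    if status == 303:
        # "Your request was processed, please move on": fetch the result with GET.
        return "GET"
    if status in (301, 302):
        # "What you need to do is elsewhere": repeat the same request there.
        return method
    raise ValueError(f"status {status} is not one of the redirects covered here")
```

Under this reading, a POST answered with 303 is followed by a GET, while a POST answered with 301 or 302 is repeated as a POST at the new address.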

In any case, the text here was not as clear as it could be, and developers were doing all kinds of things in general. HTTP v1.0 came out to set the record straight.

301 Moved Permanently
The requested resource has been assigned a new permanent URL and
any future references to this resource should be done using that
URL. Clients with link editing capabilities should automatically
relink references to the Request-URI to the new reference returned
by the server, where possible.

Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing user agents will
erroneously change it into a GET request.

302 Moved Temporarily
The requested resource resides temporarily under a different URL.
Since the redirection may be altered on occasion, the client should
continue to use the Request-URI for future requests.

Note: When automatically redirecting a POST request after
receiving a 302 status code, some existing user agents will
erroneously change it into a GET request.

HTTP v1.0, however, did not define 303 at all. Some developers, not understanding what a temporary redirect is supposed to be, thought it meant "hey, this is processed, now move on, however if you need something similar in the future, come here again". We can hardly blame developers at that point for misusing 302 and wanting 303 semantics.

HTTP v1.1 decided to rectify this problem once and for all. 302 was renamed to Found and a new note was added:

Note: RFC 1945 and RFC 2068 specify that the client is not allowed
to change the method on the redirected request. However, most
existing user agent implementations treat 302 as if it were a 303
response, performing a GET on the Location field-value regardless
of the original request method. The status codes 303 and 307 have
been added for servers that wish to make unambiguously clear which
kind of reaction is expected of the client.

Since 302 was being used in two different ways, two new codes were created, one for each technique, to ensure proper use in the future. 302 retained its definition, but with so many incorrect implementations out there, 302 should essentially never be used if you want to ensure correct semantics are followed. Instead, use 303 See Other (processed, move on...) or 307 Temporary Redirect (the real version of 302).
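On the server side, this advice boils down to stating which of the two meanings you intend. Here is a rough sketch using Python's standard `http.server`. The paths `/submit`, `/thanks`, `/old-inbox`, and `/new-inbox`, as well as the helper `start_server`, are hypothetical names chosen for the example, not anything from a real site.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Consume the request body so the connection is left in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        if self.path == "/submit":
            # The data was processed here: 303 tells the client to GET the result.
            self._redirect(303, "/thanks")
        elif self.path == "/old-inbox":
            # Handling lives elsewhere for now: 307 tells the client to
            # repeat the same POST at the new location.
            self._redirect(307, "/new-inbox")
        else:
            self.send_error(404)

    def _redirect(self, status, location):
        self.send_response(status)
        self.send_header("Location", location)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    """Start the sketch server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server()` returns the server object; its `server_address` attribute holds the port that was picked, and `shutdown()` stops it. Note that 302 never appears: each response says exactly one thing.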

In all my experience working with HTTP over the past decade, I've found 301, 303, and 307 to be implemented and used correctly as defined in HTTP v1.1, with 302 still being used incorrectly as 303 (instead of 307 semantics), generally by PHP programmers. But as above, never use 302, as who knows what the browser will do with it.

Since existing practice today is that 301, 303, and 307 are used correctly pretty much everywhere, anyone who misuses them should be told to correct their usage or handling. 302 is still so misused to this day that it's a lost cause.

HTTP2 Responses

Note: For historical reasons, a user agent MAY change the request
method from POST to GET for the subsequent request. If this
behavior is undesired, the 307 (Temporary Redirect) status code
can be used instead.

Let me get this straight, you're now taking a situation which hasn't been a problem for over a decade now, and asking it to begin happening anew by now allowing 301 to act as a 303???

If you don't think that paragraph above was problematic, wait till you see this one:

301 is allowed to change the request method? Excuse me, I have to go vomit.

It was clear in the past that 301 was not allowed to change its method. But now, I don't even understand what this 301 is supposed to mean anymore. So I should permanently be using the new URI for GET requests. Where do my POSTs go? Are they processed? What the heck am I looking at?

To add insult to injury, they're adding the new 308 Permanent Redirect as the "I really, really mean I want true 301 semantics this time" code. So now you can use a new status code which older browsers won't know what to do with, or the old status code that you're now allowing new browsers to utterly butcher for reasons I cannot fathom.
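For reference, here is how I'd summarize the five redirect codes under the proposed wording, as a small Python table. This is my reading of the drafts rather than a quote from them, and the names `REDIRECTS` and `method_safe_code` are invented for the example.

```python
# status: (lifetime, what a client may do with a POST when following it)
REDIRECTS = {
    301: ("permanent", "may change POST to GET"),
    302: ("temporary", "may change POST to GET"),
    303: ("n/a (see other)", "always changes POST to GET"),
    307: ("temporary", "must keep the method"),
    308: ("permanent", "must keep the method"),
}


def method_safe_code(permanent: bool) -> int:
    """Pick the code whose definition forbids rewriting the method."""
    wanted = "permanent" if permanent else "temporary"
    for status, (lifetime, post_rule) in REDIRECTS.items():
        if lifetime == wanted and post_rule == "must keep the method":
            return status
    raise LookupError(wanted)
```

Laid out like this, the complaint is easy to see: 301 and 308 differ only in whether the client is permitted to drop your POST on the floor.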

This is how they want to alter things. Does this seem like a sane design to you?

If the new design decision of the HTTP2 team is to capitulate to rare mistakes made out there, where does it stop? I can see some newbie developers reading about how 307 and 308 are for redirects, misunderstanding them, and then misusing them too. So in five years we'll have 309 and 310 as "we really, really, really mean it this time"? This approach the HTTP2 team is taking is absurd. If you're going to invent new status codes each time you find an isolated instance of someone misusing one, where does it end?

HTTP 308 is already taken!

One last point. Remember how I mentioned earlier that a key point in the design of the Internet is to work with existing practice? 308 is in fact already used for something else: Resume Incomplete, for resumable uploading. This is used by Google, king of the Internet, and many others.

Conclusion

I'm now dubbing HTTP 308 as Incompetence Expected, as that's clearly the only meaning it has. Or maybe that should be the official name for HTTP2 and the team behind it, I'll let you decide.

Edit:
Thanks to those who read this article and sent in images. I added them where appropriate.

I'm perplexed as to how a status could mean both "all future requests should go to this other URL and replace any references to this URL with the new one" at the same time as "you can throw away the data from this current request".

I wonder how these HTTP 2 bozos would like it if the post office would just leave their letters and packages in the empty lot that is the remains of their now-demolished old house while sending nothing but a slip reading "someone sent you something" to their new forwarding address. Something tells me they wouldn't be too thrilled, but that's exactly what they're planning to do to the internet.

If something has moved permanently then the only fathomable explanation is that anything which was attempted to be sent to the old place should get forwarded on to the new one. Anything else and you are LITERALLY asking for undefined behavior. Period. End of story.