Saturday, February 15, 2014

HTTP 308 Incompetence Expected

Internet History

The Internet from every angle has always been a house of cards held together with defective duct tape. It's a miracle that anything works at all. Those who understand much of the technology involved generally hate it, yet are astounded that for end users, things usually seem to work rather well.

Today I'm going to point out some proposed changes being made to HTTP, the standard which the World Wide Web runs on. We'll see how not even the people behind the standards really know what they're doing anymore.

The World Wide Web began in the early 90s in a state of flux. The Internet Engineering Task Force, as well as major players like Netscape, released a bunch of quasi-standards to quickly build up a set of design rules and techniques, which were used until HTTP v1.0 came out in 1995. Almost immediately after, HTTP v1.1 was being worked on, and despite not being standardized until 1999, it was pretty well supported in 1996. This is around the same time Internet Explorer started development, and a lot of its initial work was basically duplicating functionality and mechanics from Netscape Navigator.

Despite standards and sane ways of doing things, implementers always deviate from them, or come up with incorrect alternatives. Misunderstandings, and competing ideas on how things should work, are what shaped things in the early days.

Thankfully though, over the years, standards online are finally being adhered to more strictly, and bugs are being fixed. Precise specifications exist for many things, as well as unit tests to ensure adherence to standards. Things like Internet Explorer 6 are now a distant memory for most (unless you're in China).

Existing Practice

A key point which led to many standards coming into existence was existing practice. Some browser or server would invent something, the others would jump on board, and a standard would be created. Those who deviated were told to fix their implementation to match either the majority, or what was correct and would cause the fewest issues for the long-term stability of the World Wide Web.

Now we'll see how today's engineers want to throw existing practice out the window, loosen up standards to the point of meaninglessness, and basically bust the technology you're currently using to view this article.

HTTP Responses

One of the central design principles of HTTP is that every response from a server carries a status code which identifies the result, and servers and clients should understand how to handle each particular code. The more precise the definition, the better online experience we'll all have.

HTTP v0.9 was in a constant state of flux, but offered three basic kinds of page redirects: permanent, temporary, and one which wasn't fully specified. These were defined as status codes 301, 302, and 303 respectively:

Moved 301: The data requested has been assigned a new URI, the change is permanent.
Found 302: The data requested actually resides under a different URL, however, the redirection may be altered on occasion.
Method 303: Note: This status code is to be specified in more detail. For the moment it is for discussion only. 
Like the found response, this suggests that the client go try another network address. In this case, a different method may be used.

The explanation behind a permanent and a temporary redirect seems pretty straightforward. 303 is less clear, although it's the only one which mentions that the method used is allowed to change; that's even the name associated with the response code.

Several HTTP methods exist, for different kinds of activities. GET is a method to say, hey, I want a page. POST is a method to say, hey, here's some data from me, like my name and my credit card number, go do something with it.

The idea with the different redirects essentially was that 303 should mean "your request was processed, please move on" (hence a POST request should now become a GET request), whereas 301 and 302 were to say "what you need to do is elsewhere (permanently or temporarily), please take your business there" (POST should remain POST).

In any case, the text here was not as clear as it could be, and developers were doing all kinds of things in general. HTTP v1.0 came out to set the record straight.

301 Moved Permanently

   The requested resource has been assigned a new permanent URL and
   any future references to this resource should be done using that
   URL. Clients with link editing capabilities should automatically
   relink references to the Request-URI to the new reference returned
   by the server, where possible. 
       Note: When automatically redirecting a POST request after
       receiving a 301 status code, some existing user agents will
       erroneously change it into a GET request.

302 Moved Temporarily

   The requested resource resides temporarily under a different URL.
   Since the redirection may be altered on occasion, the client should
   continue to use the Request-URI for future requests.
       Note: When automatically redirecting a POST request after
       receiving a 302 status code, some existing user agents will
       erroneously change it into a GET request.

HTTP v1.0, however, did not define 303 at all. Some developers, not understanding what a temporary redirect is supposed to be, thought it meant "hey, this is processed, now move on, but if you need something similar in the future, come here again". We can hardly blame developers at that point for misusing 302 when what they wanted was 303 semantics.

HTTP v1.1 decided to rectify this problem once and for all. 302 was renamed to Found and a new note was added:

      Note: RFC 1945 and RFC 2068 specify that the client is not allowed
      to change the method on the redirected request.  However, most
      existing user agent implementations treat 302 as if it were a 303
      response, performing a GET on the Location field-value regardless
      of the original request method. The status codes 303 and 307 have
      been added for servers that wish to make unambiguously clear which
      kind of reaction is expected of the client.

Since 302 was being used in two different ways, two new codes were created, one for each usage, to ensure proper use in the future. 302 retained its definition, but with so many incorrect implementations out there, it should essentially never be used if you want to ensure correct semantics are followed. Instead use 303 See Other (processed, move on...) or 307 Temporary Redirect (the real version of 302).
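To make that advice concrete from the server author's side, here is a minimal Python sketch. The function name choose_redirect and its two boolean parameters are made up for illustration; they are not part of any standard or library. It only encodes the rule of thumb above: 301, 303, or 307, and never 302.

```python
def choose_redirect(permanent: bool, keep_method: bool) -> int:
    """Pick a redirect status code following the HTTP v1.1 advice above.

    Hypothetical helper: never returns 302, since clients disagree on it.
    """
    if keep_method:
        # 301 and 307 both expect the client to repeat the original method.
        return 301 if permanent else 307
    if permanent:
        # HTTP v1.1 defines no code meaning "permanent, and switch to GET".
        raise ValueError("no HTTP v1.1 code means 'permanent, switch to GET'")
    # 303 See Other: the request was processed, fetch the result with GET.
    return 303


# A POST that was handled and should land on a results page:
print(choose_redirect(permanent=False, keep_method=False))
# A form endpoint that moved for good, where POST must stay POST:
print(choose_redirect(permanent=True, keep_method=True))
```

The deliberate gap, the ValueError branch, is the one combination HTTP v1.1 never offered a code for.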

In all my experience working with HTTP over the past decade, I've found 301, 303, and 307 to be implemented and used correctly as defined in HTTP v1.1, with 302 still being used incorrectly as 303 (instead of 307 semantics), generally by PHP programmers. But as above, never use 302, as who knows what the browser will do with it.

Since existing practice today is that 301, 303, and 307 are used correctly pretty much everywhere, anyone who misuses them should be told to correct their usage or handling. 302 is still so misused to this day that it's a lost cause.

HTTP2 Responses

Now, in their infinite wisdom, the new HTTP2 team has decided to create problems. The 301 status definition now brilliantly includes the following:

      Note: For historical reasons, a user agent MAY change the request
      method from POST to GET for the subsequent request.  If this
      behavior is undesired, the 307 (Temporary Redirect) status code
      can be used instead.

Let me get this straight: you're taking a situation which hasn't been a problem for over a decade, and asking for it to begin happening anew by allowing 301 to act as a 303???

If you don't think that paragraph above was problematic, wait till you see this one:

 +-------------------------------------------+-----------+-----------+
 |                                           | Permanent | Temporary |
 +-------------------------------------------+-----------+-----------+
 | Allows changing the request method from   | 301       | 302       |
 | POST to GET                               |           |           |
 | Does not allow changing the request       | -         | 307       |
 | method from POST to GET                   |           |           |
 +-------------------------------------------+-----------+-----------+

301 is allowed to change the request method? Excuse me, I have to go vomit.

It was clear in the past that 301 was not allowed to change its method. But now, I don't even understand what this 301 is supposed to mean anymore. So I should permanently be using the new URI for GET requests. Where do my POSTs go? Are they processed? What the heck am I looking at?

To add insult to injury, they're adding the new 308 Permanent Redirect as the "I really really mean I want true 301 semantics this time" code. So now you can use a new status code which older browsers won't know what to do with, or the old status code that you're now allowing new browsers to utterly butcher for reasons I cannot fathom.

Here's how the status codes work with HTTP 1.1:

+------+-------------------------------------+-----------+-----------------+
| Code | Meaning                             | Duration  | Method Change   |
+------+-------------------------------------+-----------+-----------------+
| 301  | Permanent Redirect.                 | Permanent | No              |
| 302  | Temporary Redirect, misused often.  | Temporary | Only by mistake |
| 303  | Process and move on.                | Temporary | Yes             |
| 307  | The true 302!                       | Temporary | No              |
| 308  | Resume Incomplete, see below.       | Temporary | No              |
+------+-------------------------------------+-----------+-----------------+
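Read from the client's side, the table above boils down to a few lines. This is only a sketch of my reading of it in Python; the function name and the lenient_302 flag are invented for illustration, with the flag modelling the browsers that mistakenly treat 302 as 303.

```python
def next_method_http11(status: int, method: str, lenient_302: bool = False) -> str:
    """Return the method to use when following a redirect, per the table above.

    Hypothetical helper. lenient_302=True mimics clients that handle 302
    as if it were 303, which the spec calls erroneous.
    """
    if status == 303:
        # Process and move on: always fetch the new location with GET.
        return "GET"
    if status == 302 and lenient_302:
        # The common mistake: 302 handled with 303 semantics.
        return "GET"
    if status in (301, 302, 307):
        # The method must not change.
        return method
    # Anything else (including 308 here) is not a redirect in this model.
    raise ValueError(f"{status} is not a redirect status in this sketch")


for code in (301, 302, 303, 307):
    print(code, next_method_http11(code, "POST"))
```

With the flag off, only 303 ever turns a POST into a GET, which is the whole point of the table.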

So here's how the status codes will work now with the HTTP2 updates:

+------+------------------------------+-----------+---------------+
| Code | Meaning                      | Duration  | Method Change |
+------+------------------------------+-----------+---------------+
| 301  | Who the heck knows.          | Permanent | Surprise Me   |
| 302  | Who the heck knows.          | Temporary | Surprise Me   |
| 303  | Process and move on.         | Temporary | Yes           |
| 307  | The true 302!                | Temporary | No            |
| 308  | The true 301!                | Permanent | No            |
+------+------------------------------+-----------+---------------+
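The "Surprise Me" entries can be made explicit by asking which methods a conforming client is now allowed to send after a redirected request. The Python sketch below is my own reading of the quoted note and tables, not anything taken from the draft itself, and the function name is hypothetical.

```python
def allowed_methods_after_redirect(status: int, method: str) -> set:
    """Return every method a client may use on the follow-up request.

    Hypothetical helper modelling the revised definitions quoted above.
    """
    if status == 303:
        return {"GET"}
    if status in (307, 308):
        # The method must be preserved.
        return {method}
    if status in (301, 302):
        if method == "POST":
            # The new note: a user agent MAY change POST to GET.
            return {"POST", "GET"}
        return {method}
    raise ValueError(f"{status} is not a redirect status in this sketch")


for code in (301, 302, 303, 307, 308):
    print(code, sorted(allowed_methods_after_redirect(code, "POST")))
```

Any code that yields more than one method for a POST is one where the server cannot know what request it will receive next.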



And here's how one will have to do a permanent redirect in the future:

 +------+----------------+----------------+
 | Code | Older Browsers | Newer Browsers |
 +------+----------------+----------------+
 | 301  | Correct.       | Who Knows?     |
 | 308  | Broken!!!      | Correct.       |
 +------+----------------+----------------+

This is how they want to alter things. Does this seem like a sane design to you?

If the new design decision of the HTTP2 team is to capitulate to rare mistakes made out there, where does it stop? I can see some newbie developers reading about how 307 and 308 are for redirects, misunderstanding them, and then misusing them too. So in five years we'll have 309 and 310 as the "we really really really mean it this time" codes? This approach the HTTP2 team is taking is absurd. If you're going to invent new status codes each time you find an isolated instance of someone misusing one, where does it end?

HTTP 308 is already taken!

One last point. Remember how earlier I mentioned that a key point in the design of the Internet is to work with existing practice? 308 is in fact already used for something else: Resume Incomplete, for resumable uploading, which is used by Google, king of the Internet, and many others.
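To show how different that existing meaning is, here is a rough Python sketch of how an upload client might react to a 308 Resume Incomplete. It assumes, as I understand Google's resumable upload protocol, that the server reports the bytes it has stored so far in a Range response header of the form bytes=0-N, and that a missing header means nothing has been stored. The function name is made up, and the header handling is an assumption rather than a quote from any specification.

```python
import re
from typing import Optional


def next_upload_offset(status: int, range_header: Optional[str]) -> int:
    """Return the byte offset at which an interrupted upload should resume.

    Hypothetical helper; assumes a "Range: bytes=0-N" header on a 308 reply.
    """
    if status != 308:
        raise ValueError("only 308 Resume Incomplete is handled in this sketch")
    if range_header is None:
        # Assumption: no Range header means the server has stored no bytes.
        return 0
    match = re.fullmatch(r"bytes=0-(\d+)", range_header.strip())
    if match is None:
        raise ValueError(f"unexpected Range header: {range_header!r}")
    # Bytes 0..N are stored, so the upload resumes at N + 1.
    return int(match.group(1)) + 1


print(next_upload_offset(308, "bytes=0-42"))  # prints 43
```

A client built this way, on receiving a 308 that is meant as a permanent redirect, would try to resume an upload instead of following a Location header.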


Conclusion

I'm now dubbing HTTP 308 as Incompetence Expected, as that's clearly the only meaning it has. Or maybe that should be the official name for HTTP2 and the team behind it, I'll let you decide.

Edit:
Thanks to those who read this article and sent in images. I added them where appropriate.