Tuesday, October 29, 2013

REST Web APIs - Part 1

We all understand hypermedia in the context of the Web. It’s just a fancy word for links. Web pages link to each other, and the result is the World Wide Web, driven by hypermedia. Hypermedia is the single most important aspect of REST, and the least understood.

We say that a URL is the URL of some thing: a product, a user, the home page. The technical term for the thing named by a URL is resource.

When a web browser sends an HTTP request for a resource, the server sends a document in response (usually an HTML document, but sometimes a binary image or something else). Whatever document the server sends, we call that document a representation of the resource.

A URL identifies one and only one resource. If a website has two conceptually different things on it, we expect the site to treat them as two resources with different URLs.

The principle of addressability just says that every resource should have its own URL.

Application state is kept on the client, but the server can manipulate it by sending representations — HTML documents, in this case — that describe the possible state transitions. Resource state is kept on the server, but the client can manipulate it by sending the server a representation — an HTML form submission, in this case — describing the desired new state.

You know what application state is — it’s which web page a client is on. Hypermedia is the general term for things like HTML links and forms: the techniques a server uses to explain to a client what it can do next. To say that hypermedia is the engine of application state is to say that we all navigate the Web by filling out forms and following links.
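As a rough illustration of links and forms telling a client what it can do next, the sketch below uses Python's standard html.parser module to list the transitions a page offers. The page content, the TransitionFinder class, and the possible_transitions function are all invented for this example.

```python
from html.parser import HTMLParser


class TransitionFinder(HTMLParser):
    """Collect the links and forms an HTML representation offers."""

    def __init__(self):
        super().__init__()
        self.transitions = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # A link: a transition the client may follow with GET.
            self.transitions.append(("GET", attrs["href"]))
        elif tag == "form":
            # A form: the server spells out the method and the target URL.
            method = (attrs.get("method") or "get").upper()
            self.transitions.append((method, attrs.get("action") or ""))


def possible_transitions(html):
    """Return (method, URL) pairs for every link and form in the document."""
    finder = TransitionFinder()
    finder.feed(html)
    finder.close()
    return finder.transitions


# Hypothetical page, invented for this example.
PAGE = """
<a href="/messages">See the latest messages</a>
<form action="/messages" method="post">
  <input type="text" name="message">
  <input type="submit" value="Post">
</form>
"""

print(possible_transitions(PAGE))
```

The client never needs a hardcoded list of URLs here: everything it can do next is read from the representation it was just sent.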

All successful post-Web protocols do something the Web can’t do; examples are peer-to-peer protocols like BitTorrent and real-time protocols like SSH. For most other purposes, HTTP is good enough.

The unprecedented flexibility of the Web comes from the principles of REST.

In REST terms, putting information about URL construction in separate human-readable documents violates the principles of connectedness and self-descriptive messages.

In REST terms, the website redesign is entirely encapsulated in the self-descriptive HTML documents served by the website. A client that could understand the old HTML documents can understand the new ones.

The HTTP standard says that a GET request is a request for a representation. It’s not intended to change any resource state on the server.

If you look up the media type application/vnd.collection+json, you’ll discover that it’s a media type registered for Collection+JSON. Collection+JSON is a standard for publishing a searchable list of resources over the Web. JSON puts constraints on plain text, and Collection+JSON puts constraints on JSON. A server can’t serve just any JSON document as application/vnd.collection+json. It can only serve a JSON object:

{}

But not just any object. The object has to have a property called collection, which maps to another object:

{"collection": {}}

The “collection” object ought to have a property called items that maps to a list:

{"collection": {"items": []}}

The items in the “items” list need to be objects:

{"collection": {"items": [{}, {}, {}]}}

And on and on, constraint after constraint. Eventually you get a highly formatted document.

Collection+JSON is a way of serving lists: not lists of data structures, which you can do with normal JSON, but lists that describe HTTP resources. The collection object has an href property, and its value is a JSON string. But it’s not just any string; it’s the URL I just sent a GET request to:

{ "collection": { "href" : "http://www.youtypeitwepostit.com/api/" } }

The Collection+JSON standard defines this string as “the address used to retrieve a representation of the document” (in other words, it’s the URL of the collection resource). Each object inside the collection’s items list has its own href property, and each value is a string containing a URL, like http://www.youtypeitwepostit.com/api/messages/21818525390699506 (in other words, each item in the list represents an HTTP resource with its own URL).

A document that doesn’t follow these rules isn’t a Collection+JSON document: it’s just some JSON. By allowing yourself to be bound by Collection+JSON’s constraints, you gain the ability to talk about concepts like resources and URLs. These concepts are not defined in JSON, which can only talk about simple things like strings and lists.
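The layered constraints can be sketched as a small checker. This is only a partial check covering the few rules quoted in these notes, not a validator for the full Collection+JSON standard, and the function name is_collection_json is made up for this example.

```python
import json


def is_collection_json(text):
    """Return True if text passes the structural rules quoted in these notes."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False  # not even JSON
    if not isinstance(doc, dict):
        return False  # must be a JSON object
    collection = doc.get("collection")
    if not isinstance(collection, dict):
        return False  # "collection" must map to another object
    # The notes say the collection "ought to" have items, so a missing
    # list is tolerated here (an assumption of this sketch).
    items = collection.get("items", [])
    if not isinstance(items, list):
        return False
    # Every entry in the list has to be an object.
    return all(isinstance(item, dict) for item in items)


print(is_collection_json('{"collection": {"items": [{}, {}, {}]}}'))
print(is_collection_json('["just", "some", "JSON"]'))
```

The second document is perfectly good JSON, but it fails the check because it does not accept the extra constraints.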

To create a new item in the collection, the client first uses the template object to compose a valid item representation and then uses HTTP POST to send that representation to the server for processing.

Collection+JSON works along the same lines as HTML. The server provides you with some kind of form (the template), which you fill out to create a document. Then you send that document to the server with a POST request.
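Filling out the template might look something like the sketch below. It assumes the template is an object with a data list of name/value fields, as in the Collection+JSON specification; the field name "text", its prompt, and the helper fill_template are hypothetical. The code only builds the request body and does not send it anywhere.

```python
import json


def fill_template(template, values):
    """Copy a template, filling in each data field by name."""
    filled = []
    for field in template["data"]:
        name = field["name"]
        # Fall back to the template's own default when no value is given.
        value = values.get(name, field.get("value", ""))
        filled.append({"name": name, "value": value})
    return {"template": {"data": filled}}


# Hypothetical template of the kind a server might publish.
TEMPLATE = {"data": [{"name": "text", "value": "", "prompt": "Text of message"}]}

# This string would be the body of the POST request.
body = json.dumps(fill_template(TEMPLATE, {"text": "Squid!"}))
print(body)
```

A real client would send body in a POST request to the collection's URL with a Content-Type of application/vnd.collection+json.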

The server responds:

HTTP/1.1 201 Created
Location: http://www.youtypeitwepostit.com/api/47210977342911065

The 201 response code (Created) is a little more specific than 200 (OK); it means that everything is OK and that a new resource was created in response to my request. The Location header gives the URL of the newborn resource.

REST is not a protocol, a file format, or a development framework. It’s a set of design constraints: statelessness, hypermedia as the engine of application state, and so on. Collectively, we call these the Fielding constraints, because they were first identified in Roy T. Fielding’s 2000 dissertation on software architecture, which gathered them together under the name “REST.”

A resource is anything that’s important enough to be referenced as a thing in itself. If your users might “want to create a hypertext link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it” (Architecture), you should make it a resource. Giving something a URL turns it into a resource.

When a client issues a GET request for a resource, the server should serve a document that captures the resource in a useful way. That’s a representation — a machine-readable explanation of the current state of a resource.

The server might describe a database row as an XML document, a JSON object, a set of comma-separated values, or as the SQL INSERT statement used to create it. These are all legitimate representations; it depends on what the client asks for. A representation can be any machine-readable document containing any information about a resource. We think of representations as something the server sends to the client. That’s because when we surf the Web, most of our requests are GET requests. We’re asking for representations. But in a POST, PUT, or PATCH request, the client sends a representation to the server. The server’s job is then to change the resource state so it reflects the incoming representation.
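To make the point about multiple representations concrete, here is a minimal sketch that renders one database row as JSON, CSV, or XML depending on the media type the client asks for. The row contents and the represent function are invented for this example, and a real server would read the media type from the request's Accept header.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def represent(row, media_type):
    """Render the same resource state as one of several representations."""
    if media_type == "application/json":
        return json.dumps(row, sort_keys=True)
    if media_type == "text/csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=sorted(row), lineterminator="\n")
        writer.writeheader()
        writer.writerow(row)
        return buffer.getvalue()
    if media_type == "application/xml":
        root = ET.Element("row")
        for key in sorted(row):
            ET.SubElement(root, key).text = str(row[key])
        return ET.tostring(root, encoding="unicode")
    # Nothing suitable to offer for this media type.
    raise ValueError("no representation available for " + media_type)


# Hypothetical database row.
ROW = {"id": 1, "text": "Squid!"}

for media_type in ("application/json", "text/csv", "application/xml"):
    print(represent(ROW, media_type))
```

The resource state is the same in all three cases; only the document describing it changes.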

The server sends a representation describing the state of a resource. The client sends a representation describing the state it would like the resource to have. That’s representational state transfer.

If a DELETE request succeeds, the possible status codes are 204 (No Content, i.e., “it’s deleted, and I don’t have anything more to say about it”), 200 (OK, i.e., “it’s deleted, and here’s a message about that”), and 202 (Accepted, i.e., “I’ll delete it later”).

If a client tries to GET a resource that has been DELETEd, the server will return an error response code, usually 404 (Not Found) or 410 (Gone): GET /api/

The DELETE method has another useful property: it’s idempotent. Once you delete a resource, it’s gone. The resource state has permanently changed. You can send another DELETE request, and you might get a 404 error, but the resource state is exactly as it was after the first request. The resource is still gone.
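Idempotence can be shown with a toy in-memory store standing in for the server. The MessageStore class and its URLs are hypothetical, and the choice of 204 and 404 is just one of the options described above.

```python
class MessageStore:
    """Hypothetical in-memory stand-in for a server's resource state."""

    def __init__(self, messages):
        self.messages = dict(messages)

    def get(self, url):
        if url in self.messages:
            return 200, self.messages[url]
        return 404, None

    def delete(self, url):
        if url in self.messages:
            del self.messages[url]
            return 204  # deleted, nothing more to say
        return 404  # already gone; the state does not change


store = MessageStore({"/api/messages/1": "Squid!"})

first = store.delete("/api/messages/1")
state_after_first = dict(store.messages)

second = store.delete("/api/messages/1")
state_after_second = dict(store.messages)

# The status codes differ, but the resource state is identical.
print(first, second, state_after_first == state_after_second)
```

The second response code is different, yet the server ends up in the same state as after the first request, which is what idempotence means here.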

A POST request to a resource creates a new resource underneath it; this usage is called POST-to-append. The most common response code to a POST-to-append request is 201 (Created). It lets the client know that a new resource was created. The Location header lets the client know the URL of this new resource. Another common response code is 202 (Accepted), which means that the server intends to create a new resource based on the given representation, but hasn’t actually created it yet.
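A minimal sketch of POST-to-append, again using an in-memory stand-in for the server. The Collection class, the URL scheme, and the counter used to mint new URLs are assumptions made for this example.

```python
import itertools


class Collection:
    """Hypothetical collection resource that accepts POST-to-append."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, representation):
        # The server, not the client, picks the URL of the new resource.
        url = "{}/{}".format(self.base_url, next(self._ids))
        self.items[url] = representation
        return 201, {"Location": url}


messages = Collection("/api/messages")
status1, headers1 = messages.post({"text": "Squid!"})
status2, headers2 = messages.post({"text": "Squid!"})

print(status1, headers1["Location"])
print(status2, headers2["Location"])
print(len(messages.items))
```

Sending the same representation twice creates two resources, so the client relies on the Location header to learn where each one lives.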

A PUT request is a request to modify resource state. The client takes the representation it got from a GET request, modifies it, and sends it back as the payload of a PUT request. If the server decides to accept a PUT request, the server changes the resource state to match what the client says in the representation, and usually sends either 200 (OK) or 204 (No Content). PUT is idempotent, just like DELETE. The client can also use PUT to create a new resource, if it knows the URL where the new resource should live. PUT is an idempotent operation even when you use it to create a new resource.
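The idempotence of PUT, including the create case, can be sketched with a plain dictionary as the resource store. The put function and the URL are hypothetical; returning 201 on creation follows the HTTP specification rather than anything stated in these notes.

```python
def put(store, url, representation):
    """Replace the state at url, or create it if nothing is there yet."""
    created = url not in store
    store[url] = dict(representation)
    # 201 when the request created the resource, 204 when it replaced one.
    return 201 if created else 204


store = {}
codes = [put(store, "/api/messages/1", {"text": "Squid!"}) for _ in range(3)]

print(codes)
print(store)
```

However many times the request is repeated, the store holds exactly one resource with exactly the state the client asked for.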

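Representations can be large, and sending a complete one just to change a small part of the resource state is wasteful. The PATCH method allows for partial updates: instead of PUTting a full representation, you create a special “diff” representation and send it to the server as the payload of a PATCH request.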
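These notes do not say which diff format to use. As one possibility, the sketch below applies a JSON Merge Patch (RFC 7396), where the patch lists only the properties to change and a null value removes a property. The merge_patch function and the sample document are written for this example.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target outright.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this property"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical resource state and a small diff against it.
document = {"text": "Squid!", "tags": {"color": "red", "size": "big"}}
patch = {"text": "Octopus!", "tags": {"size": None}}

print(merge_patch(document, patch))
```

The patch is much smaller than the full representation, which is the whole point of sending a diff instead of a PUT.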

The best response codes for a successful PATCH are the same as for PUT and DELETE: 200 (OK) if the server wants to send data (such as an updated representation of the resource) along with its response, and 204 (No Content) if the server just wants to indicate success. PATCH is neither safe nor idempotent.