Wednesday, November 16, 2011

RESTful Design - Benefits, Patterns

This captures odds-n-ends around RESTful design - why bother, what are the benefits, what are some patterns, etc. This is written in a terse "talking points" style, with most content either paraphrased or explicitly copied from the footnoted links at end of post; I've added some of my thoughts here and there.

An opening thought around what might be the benefit to understanding and leveraging aspects of RESTful design: since REST describes the way the web works, and the web is the single most scalable application ever known, we might do well to understand and embrace aspects of RESTful style.

REST describes a Resource-Oriented Architecture (ROA): the web is based on resource exchange, not on sending commands.

Selected excerpts from Roy Fielding's thesis[1]:

REST provides a set of architectural constraints that, when applied as a whole, emphasizes scalability of component interactions, generality of interfaces, independent deployment of components, and intermediary components to reduce interaction latency, enforce security, and encapsulate legacy systems.

The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components. By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved.

What makes HTTP significantly different from RPC is that the requests are directed to resources using a generic interface with standard semantics that can be interpreted by intermediaries almost as well as by the machines that originate services. The result is an application that allows for layers of transformation and indirection that are independent of the information origin, which is very useful for an Internet-scale, multi-organization, anarchically scalable information system. RPC mechanisms, in contrast, are defined in terms of language APIs, not network-based applications.

HTTP is not designed to be a transport protocol. It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources. It is possible to achieve a wide range of functionality using this very simple interface, but following the interface is required in order for HTTP semantics to remain visible to intermediaries.

scalability: can leverage (and not get burned by) existing web infrastructure such as proxies, gateways and crawlers.

reliability: GET and HEAD are safe; GET, HEAD, PUT and DELETE are idempotent, so clients can resend those requests as needed (POST is the exception, but see the patterns below for workarounds). Cache intermediaries can "determine the cacheability of a response because the interface is generic rather than specific to each resource. By default, the response to a retrieval request is cacheable and the responses to other requests are non-cacheable." (Fielding, sec. 5.2.2)

lowered dev costs: no need to invent a new protocol for every application (a la WS).

extensibility: re-use of testing tools/techniques, interoperability between new apps and existing clients, etc.

value-add: facilitates an intranet with searchable resources (i.e. crawlers can index GETs without deleting your database...).
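The reliability point can be sketched as a client helper that resends a request only when the method is idempotent. This is a minimal sketch, not a real HTTP client: the `send` callable, `send_with_retry` and its parameters are hypothetical names made up for illustration, and a dropped connection is modelled as Python's built-in `ConnectionError`.

```python
import time
from typing import Callable, Tuple

# Methods that HTTP defines as idempotent (GET and HEAD are also safe).
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE"}

Response = Tuple[int, str]  # (status code, body)


def send_with_retry(
    send: Callable[[str, str], Response],  # hypothetical transport function
    method: str,
    url: str,
    attempts: int = 3,
    delay: float = 0.0,
) -> Response:
    """Call send(method, url); retry on ConnectionError only for idempotent methods."""
    method = method.upper()
    # A non-idempotent method (POST) gets exactly one attempt.
    tries = max(1, attempts) if method in IDEMPOTENT_METHODS else 1
    last_error = None
    for attempt in range(tries):
        try:
            return send(method, url)
        except ConnectionError as exc:
            last_error = exc
            if attempt < tries - 1:
                time.sleep(delay)
    raise last_error
```

A POST that fails is surfaced to the caller straight away, since the client cannot know whether the server already acted on it.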

Multiple Representations:

ease of evolution: support versioning for backwards compatibility (e.g. application-custom MIME types)

flexibility: client references are not coupled to a particular representation
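One way to picture multiple representations is a server that picks a renderer from the Accept header, with a custom MIME type per version. The sketch below is only illustrative: the `application/vnd.example.order-*` media types, the order fields and the function names are assumptions, and the Accept parser is deliberately simplified (it handles q-values but ignores other parameters).

```python
import json
from typing import List, Optional


def render_v1(order: dict) -> str:
    # Older representation: total is a bare number.
    return json.dumps({"id": order["id"], "total": order["total"]})


def render_v2(order: dict) -> str:
    # Newer representation: total carries its currency.
    return json.dumps({"id": order["id"],
                       "total": {"amount": order["total"],
                                 "currency": order["currency"]}})


# Hypothetical application-custom media types, one per version.
RENDERERS = {
    "application/vnd.example.order-v1+json": render_v1,
    "application/vnd.example.order-v2+json": render_v2,
}
DEFAULT_TYPE = "application/vnd.example.order-v2+json"


def parse_accept(header: str) -> List[str]:
    """Return the media types of an Accept header, highest q first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((-q, position, media_type))
    # q=0 means "not acceptable", so those entries are dropped.
    return [media for neg_q, _, media in sorted(entries) if neg_q < 0]


def negotiate(accept_header: Optional[str]) -> Optional[str]:
    """Pick a media type to serve, or None (the caller would answer 406)."""
    if not accept_header:
        return DEFAULT_TYPE
    for media_type in parse_accept(accept_header):
        if media_type in RENDERERS:
            return media_type
        if media_type in ("*/*", "application/*"):
            return DEFAULT_TYPE
    return None
```

The resource URI stays the same for old and new clients; only the negotiated representation differs.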

If you must use cookies, store all app state on the client side (i.e. don't use a session ID pointing to data on the server) - otherwise every request has to reach a server that holds the session data, and you sacrifice scalability.
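A minimal sketch of keeping state in the cookie itself rather than behind a session ID. The HMAC signature is my addition, not something the sources prescribe: it is there so the server can detect a cookie value that was modified on the client. `SECRET_KEY`, `encode_state` and `decode_state` are hypothetical names, and the state is signed but not encrypted, so it should hold nothing secret.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Assumption: a server-side secret shared by every server instance.
SECRET_KEY = b"replace-with-a-real-secret"


def encode_state(state: dict) -> str:
    """Serialize app state into a signed cookie value: <payload>.<signature>."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(cookie_value: str) -> Optional[dict]:
    """Return the state if the signature checks out, else None."""
    payload, _, signature = cookie_value.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because any server holding the key can decode the cookie, no request is tied to the machine that issued it.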

Leverage intermediaries - using HTTP as intended in a RESTful style facilitates interoperation with network components that provide load balancing, caching, security policies, etc. As per Fielding: "Within REST, intermediary components can actively transform the content of messages because the messages are self-descriptive and their semantics are visible to intermediaries."

Design/Coding

New URIs are created by the server and returned to the client via the Location header after a POST creates the resource.
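As a rough illustration of server-assigned URIs, here is an in-memory collection whose `post` picks the new URI and reports it in a Location header. The `NoteCollection` class, the `/notes` path and the `(status, headers, body)` tuple shape are assumptions made for this sketch, not any framework's API.

```python
import itertools
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


class NoteCollection:
    """Toy in-memory collection resource (hypothetical)."""

    def __init__(self, base_path: str = "/notes") -> None:
        self.base_path = base_path
        self._ids = itertools.count(1)
        self._notes: Dict[str, str] = {}

    def post(self, body: str) -> Response:
        # The server, not the client, chooses the URI of the new resource.
        uri = f"{self.base_path}/{next(self._ids)}"
        self._notes[uri] = body
        return 201, {"Location": uri}, ""

    def get(self, uri: str) -> Response:
        if uri not in self._notes:
            return 404, {}, ""
        return 200, {"Content-Type": "text/plain"}, self._notes[uri]
```

The client never builds the URI itself; it follows whatever the Location header says.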

Version the service in the URI - /v1/service/resource - or even as part of the host: v1.myservice.twc.com, v2.myservice.twc.com, etc.

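Keep in mind that HTML forms only support GET and POST (PUT/DELETE support was proposed for HTML5 but did not make it into the final spec), and not all firewalls allow PUT/DELETE through. So tunnel the method through POST using a header or hidden form field, or just use XHR.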
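Method tunneling can be sketched as a small server-side function that works out the intended method of a POST. `X-HTTP-Method-Override` is a common convention rather than a standard, and the `_method` form field name is an assumption borrowed from what several web frameworks do; `effective_method` itself is a hypothetical helper.

```python
from typing import Mapping, Optional

OVERRIDE_HEADER = "X-HTTP-Method-Override"  # convention, not a standard
OVERRIDE_FIELD = "_method"                  # assumed hidden form field name
ALLOWED_OVERRIDES = {"PUT", "DELETE"}


def effective_method(
    method: str,
    headers: Mapping[str, str],
    form: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the method the client intended, honoring overrides only on POST."""
    method = method.upper()
    if method != "POST":
        # Never let a safe method such as GET be upgraded to PUT/DELETE.
        return method
    lowered = {name.lower(): value for name, value in headers.items()}
    override = lowered.get(OVERRIDE_HEADER.lower()) or (form or {}).get(OVERRIDE_FIELD)
    if override and override.upper() in ALLOWED_OVERRIDES:
        return override.upper()
    return "POST"
```

Restricting overrides to POST keeps GET safe, so crawlers and caches cannot trigger a tunneled DELETE by following a link.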

Post-Once-Exactly: to get around non-repeatable POSTs, a GET returns a server-generated link representing a resource not yet created; the client then POSTs to that URL to create the new resource. Subsequent POSTs to the same URL return 405 (Method Not Allowed), so a retried POST cannot create a duplicate.
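The Post-Once-Exactly flow can be sketched with an in-memory server object. Everything named here (`PoeOrders`, the `/orders/...` paths, returning the link in the response body) is an assumption for illustration; the point is only the one-shot behavior of the reserved URL.

```python
import uuid
from typing import Dict, Set, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


class PoeOrders:
    """Toy server that hands out one-shot URLs for creating orders."""

    def __init__(self) -> None:
        self._pending: Set[str] = set()     # reserved, not yet created
        self._created: Dict[str, str] = {}  # uri -> order body

    def get_poe_link(self) -> Response:
        # Reserve a URL for a resource that does not exist yet.
        uri = f"/orders/{uuid.uuid4().hex}"
        self._pending.add(uri)
        return 200, {"Content-Type": "text/plain"}, uri

    def post(self, uri: str, body: str) -> Response:
        if uri in self._pending:
            self._pending.remove(uri)
            self._created[uri] = body
            return 201, {"Location": uri}, ""
        if uri in self._created:
            # A repeat of an already-accepted POST: refuse, create nothing.
            return 405, {"Allow": "GET"}, ""
        return 404, {}, ""

    def count(self) -> int:
        return len(self._created)
```

A client that times out after its first POST can safely resend it: a 405 on the retry tells it the first attempt went through.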

"Conditional PUT (POST)": before submitting a large amount of information to server that might not be able to handle it at the moment - PUT without resource but include Content-Length and Expect headers. If response has same code as Expect value, client proceeds, else it does not.

Use POST to support large queries, to get around URI length limits imposed by servers, clients and proxies. However, this results in loss of cacheability if the response is sent synchronously in the POST response. Instead, treat the query as a resource: the server creates a new resource and returns its URI with a 201 response code, and the client then GETs the answer to the request from that URI.
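Here is a small sketch of the query-as-resource idea, under assumptions of my own: an in-memory `QueryStore`, a `/queries/<hash>` URI derived from the query criteria, and an arbitrary `Cache-Control` value on the result.

```python
import hashlib
import json
from typing import Dict, List, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


class QueryStore:
    """Toy server: POST a large query, then GET its (cacheable) result."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows
        self._queries: Dict[str, dict] = {}

    def post_query(self, criteria: dict) -> Response:
        # Same criteria -> same URI, so identical queries share one resource.
        canonical = json.dumps(criteria, sort_keys=True)
        query_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
        uri = f"/queries/{query_id}"
        self._queries[uri] = criteria
        return 201, {"Location": uri}, ""

    def get(self, uri: str) -> Response:
        criteria = self._queries.get(uri)
        if criteria is None:
            return 404, {}, ""
        wanted = set(criteria.get("ids", []))
        rows = [row for row in self._rows if row["id"] in wanted]
        headers = {"Content-Type": "application/json",
                   "Cache-Control": "max-age=60"}  # illustrative value
        return 200, headers, json.dumps(rows)
```

The long list of criteria travels in the POST body, while the answer is fetched with an ordinary GET that intermediaries are free to cache.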


Welcome to the Perimeter Sweep Blog

My blog is largely intended to be a placeholder for topics involving software development - architecture, technology drill-downs, best practices, various solutions, workarounds, gotchas and the like - things that will remind me what I've learned over time. If it helps you out also - all the better.
