Designing and Implementing Hypermedia APIs

This article (the second in the series) focuses on implementing a hypermedia server. The message design and problem domain description used in this implementation were covered in the previous article. We'll talk in general terms about the role of hypermedia servers (routing, evaluation and execution) and then walk through the basics of implementation, including the component layer, the representation layer, and the connector layer. Finally, we'll briefly cover client-side browsing of a hypermedia API; specifically, the limitations of relying on Web browsers and command-line tools and the advantages of the "explorer" approach.

This article walks through the high-level details of a fully functional server built using Node.js. To keep things relatively simple, this example implementation does not take advantage of many custom Node modules or frameworks and even handles storage with simple disk files. Because the implementation is kept 'bare bones' and basic, it does not contain all the features and safety of a real production-level server, but you'll still be able to get the main point of the implementation pattern. Also, even though this server is built using Node.js, you should have no problem transferring the ideas shown here into your preferred programming language, framework and/or platform.

NOTE: The full source code for this server implementation is available in GitHub.

Hypermedia API servers are basically Web servers with a bit more work added. Like a common Web server, Hypermedia API servers accept requests, process them, and return responses. However, Hypermedia API servers do some additional work, too. They act as translators. Each request is sent in a predetermined message format, translated into something the server's components understand (storage, database, business logic), processed, and then translated back into some predetermined format that includes information on possible "next steps" for the client that made the request. Typical RPC-style API servers don't include that "next step" information.

This additional information can include whether this client can see related resources, can execute a search operation, can modify the data stored on the server, etc. All this is communicated by adding hypermedia controls (links and forms) based on the media type design understood by the client making the request. That client request may be tied to a user identity; one that may (or may not) have additional privileges, which affects what "next steps" are valid at the time of the request. This context-driven modification of the response is one of the key "value-add" elements of hypermedia-style implementations.
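To make the idea of context-driven "next steps" concrete, here is a minimal sketch. The link names, URLs, and the canEdit flag are assumptions made up for illustration; they are not part of the media type designed in this series.

```javascript
// Hypothetical example: rel names, hrefs, and the "canEdit" flag are
// illustrative assumptions, not the series' actual design.
function nextSteps(user) {
  // Every client is offered the read-only transitions.
  const controls = [
    { rel: 'list', href: '/users/', method: 'GET' },
    { rel: 'search', href: '/users/?search={status}', method: 'GET' }
  ];
  // Only a privileged identity is offered the write transition.
  if (user && user.canEdit) {
    controls.push({ rel: 'add', href: '/users/', method: 'POST' });
  }
  return controls;
}

console.log(nextSteps({ name: 'guest' }).map((c) => c.rel));
console.log(nextSteps({ name: 'admin', canEdit: true }).map((c) => c.rel));
```

The same request from two different identities produces two different sets of controls, and the client only ever acts on the controls it receives.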

Routing

The first responsibility of an API server on the Web is to accept, parse, and route the incoming request. API servers on the Web are designed to use URIs as the primary means for routing requests once they arrive at the server. This is done by parsing the URI into path segments, the query string, etc. and, using that information, sending the details of that request (including possible request body data) to the right internal component for processing.

For example, an HTTP GET request for /users/?search=pending, sent to www.example.org with an Accept header naming the collection+json format and an Authorization header, tells the server that a client wants to do a "read" operation searching for pending users on the www.example.org server. The client request indicates that the response should be represented in the collection+json format (a registered hypermedia type). Finally, this request has been made by an authenticated user as identified by the encrypted value in the Authorization header.

The server would likely break down the URI into its parts:

users

search=pending

and then formulate a valid request to an internal component to handle the task:

results = Users.Search('pending');

The results would then be translated into the requested format and then returned to the client:

http.Response = Representation(results, 'collection+json');

The example here is just pseudo-code, but you get the basic idea. API servers accept, route and process requests then create representations of the results to return to the client.
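In Node.js, the parsing step can be sketched with the built-in URL class. This is a minimal illustration, not the series' actual routing code: parseTarget, route, and the stand-in Users object are hypothetical names.

```javascript
// Split a request target into the pieces a router needs.
// The base URL is only there so the URL class can parse a relative target.
function parseTarget(target) {
  const url = new URL(target, 'http://www.example.org');
  return {
    segments: url.pathname.split('/').filter(Boolean),
    query: Object.fromEntries(url.searchParams)
  };
}

// Stand-in for an internal component (hypothetical).
const Users = {
  search: (status) => [{ name: 'example user', status }]
};

// Map method + path + query onto a component call.
function route(method, target) {
  const { segments, query } = parseTarget(target);
  if (method === 'GET' && segments[0] === 'users' && query.search) {
    return Users.search(query.search);
  }
  return null; // no matching route
}

const parts = parseTarget('/users/?search=pending');
console.log(parts.segments, parts.query);
console.log(route('GET', '/users/?search=pending'));
```

A real connector would go on to hand the result to a representation routine, as in the pseudo-code above.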

Evaluation and Execution

The details of processing requests involve evaluating the request (not just the URI but also the protocol details such as the method, additional headers, and any payload) and determining which internal routines need to be executed to fulfill the request. In the previous example, the server "decided" that /users/?search=pending meant that the server should pass the "pending" query string argument to the Users module's Search function. The server also decided to format the reply as a collection+json representation based on the contents of the Accept header sent from the client.
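The Accept-header decision can be sketched as a small function. This is a simplified illustration under stated assumptions: the SUPPORTED list is made up for the example, and the parsing handles only plain media types, q-values, and the */* wildcard rather than every rule of HTTP content negotiation.

```javascript
// Formats this hypothetical server can produce, in order of preference.
const SUPPORTED = ['application/vnd.collection+json', 'application/xml'];

// Pick a response format from the Accept header, or null if none fits.
function chooseFormat(acceptHeader) {
  const ranges = (acceptHeader || '*/*').split(',').map((item) => {
    const [type, ...params] = item.trim().split(';').map((s) => s.trim());
    const q = params.find((p) => p.startsWith('q='));
    return { type: type.toLowerCase(), q: q ? parseFloat(q.slice(2)) : 1 };
  });
  // Highest q-value first; ties keep the order the client sent.
  ranges.sort((a, b) => b.q - a.q);
  for (const { type, q } of ranges) {
    if (q === 0) continue;                   // q=0 means "not acceptable"
    if (type === '*/*') return SUPPORTED[0]; // anything goes: use our default
    if (SUPPORTED.includes(type)) return type;
  }
  return null; // the caller would answer 406 Not Acceptable
}

console.log(chooseFormat('application/vnd.collection+json'));
console.log(chooseFormat('text/html'));
```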

The server acts as an intermediary between the outside world (which "talks" HTTP) and the internal components on the server (which talk whatever source code or local network language is in use). The server's role is to evaluate the request, convert that into "component-speak" and then format the reply appropriately. This makes the server's role that of a "connector" between the outside world and the internal components.

NOTE:
This "component-connector" model was referenced by Roy Fielding in his description of the Representational State Transfer (REST) architectural style for the Web.

In many implementations the responsibilities of components (internal) and connectors (external) are mixed together. Over the long term, this mixing of concerns can make maintenance and evolvability harder. For this reason, the implementation pattern shown in this article will emphasize the differences between components and connectors. You'll also see another responsibility identified as a separate concern: that of generating representations of internal data for responses.

The Component Layer

The component layer is where the work of solving problems for your domain happens; this is the stuff that no one else does the same way. It's also work that usually has nothing to do with HTTP or the Web. Reading and writing data into storage, calculating formulas related to your business, and enforcing business rules are all component-level activities.

Domain-Specific, Independent Implementation

The problem domain identified in the previous article in this series was a Class Scheduling system. It handles managing students, teachers, courses, and combining all three of those into class schedules. These are all domain-specific details that live in the component layer. For that reason, our implementation has a module (called component.js) that handles this work. We'll also use a simple file-based storage module (called storage.js) to handle the read and write actions for this implementation.

These two modules (storage.js and component.js) are implemented to be unaware of any connector details (e.g. HTTP, WebSockets, etc.). The examples here are small but, in large systems, the component layer contains the unique details of the target domain (here that's the Class Scheduling domain). This layer is most often the "value add" of your implementation; the parts that no one else does quite the same way.

Creating a separation of concerns (SoC) between the component layer and the rest of the system also improves the chances that new connectors (FTP, SMTP, etc.) can be added in the future with minimum disruption. It also means that optimizations at the connector layer (caching, scaling out with more servers, etc.) can be done without touching the components.

Storage.js

In this example, data storage is implemented as a simple file system. In production implementations this would likely be done using structured storage such as a document database (MongoDB, CouchDB, etc.), relational database (MySQL, Oracle, SQL Server, etc.), or some other storage system. It might even be done via a remote storage model using HTTP connectors!

Here's a snippet of code that shows how storage is implemented for our sample app:

Basically, JSON objects are stored on the disk with unique names created by the makeId() routine. We'll take a look at how this storage module is called when we review the component.js module in the next section.
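As a rough illustration of this idea (not the actual storage.js from the series), a file-based store can be sketched with Node's built-in fs module. The folder layout, the function names addItem, getItem and getList, and this version of makeId() are all assumptions; synchronous file calls are used only to keep the sketch short.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Assumption: one folder per object type under a temporary base folder.
const baseFolder = fs.mkdtempSync(path.join(os.tmpdir(), 'storage-'));

// Stand-in for the makeId() routine: a random 16-character hex name.
function makeId() {
  return crypto.randomBytes(8).toString('hex');
}

// Write one JSON object to disk under a newly generated id.
function addItem(type, item) {
  const id = makeId();
  const folder = path.join(baseFolder, type);
  fs.mkdirSync(folder, { recursive: true });
  fs.writeFileSync(path.join(folder, id + '.json'), JSON.stringify({ ...item, id }));
  return id;
}

// Read one object back, or null when no such file exists.
function getItem(type, id) {
  const file = path.join(baseFolder, type, id + '.json');
  return fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : null;
}

// Read every object of a given type.
function getList(type) {
  const folder = path.join(baseFolder, type);
  if (!fs.existsSync(folder)) return [];
  return fs.readdirSync(folder).map((name) => getItem(type, path.basename(name, '.json')));
}
```

Nothing in this module knows about HTTP or media types; callers pass plain objects in and get plain objects back.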

Component.js

In this example app, the component.js module handles all the domain-level details. It knows how to talk to storage and how to convert service requests such as "Add Student", "Assign Student to a class", etc. into storage operations. In a larger system, the component layer may include several modules but they would all still do the same basic type of work.

Along with storage handling, the component layer is responsible for the business logic for the solution. In our example the source code is in a single module (component.js) but in larger, more complete systems, you're likely to have multiple components; each handling different aspects of the business logic.

Below is some of the high-level code used to implement the business logic for handling schedule information:

Lastly, here is the routine that processes the list of one or more records from storage into an internal object graph. This is the format that is understood by all the component-level routines in this system.

Note that the component layer does not "talk" HTTP or XML; that's handled separately. The component layer only needs to be able to implement the internal business requirements and communicate with storage services (locally or remotely). The component layer does, however, include some links when appropriate. How they will be rendered (if at all) is left to the next element in our implementation: The Representation Service.
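A component routine of this kind might look like the sketch below. It is not the series' component.js: the in-memory storage stand-in, the assignStudent and toGraph names, the one-entry-per-student rule, and the shape of the object graph are all assumptions chosen to keep the example self-contained.

```javascript
// In-memory stand-in for the storage module so the sketch runs on its own.
const records = { student: new Map(), schedule: new Map() };
const storage = {
  get: (type, id) => records[type].get(id) || null,
  save: (type, item) => { records[type].set(item.id, item); return item; }
};

// Wrap raw records in an internal object graph (assumed shape) that
// carries data plus any links the component wants to suggest.
function toGraph(type, items) {
  return {
    type,
    items: items.map((item) => ({
      data: item,
      links: [{ rel: 'item', href: '/' + type + '/' + item.id }]
    }))
  };
}

// Assumed business rule: a student appears at most once per schedule.
function assignStudent(scheduleId, studentId) {
  const schedule = storage.get('schedule', scheduleId);
  const student = storage.get('student', studentId);
  if (!schedule || !student) {
    throw new Error('unknown schedule or student');
  }
  if (!schedule.students.includes(studentId)) {
    schedule.students.push(studentId);
    storage.save('schedule', schedule);
  }
  return toGraph('schedule', [schedule]);
}
```

The routine returns plain objects with link hints; whether and how those links are rendered is decided elsewhere.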

The Representation Service

HTTP is an unusual protocol because it is designed to allow the same data to be represented in different formats, called media types. These media types are well defined and (usually) registered with a standards body (the IANA). Clients and servers "share understanding" about how the data and transaction details are represented and that is what makes it possible for a client (e.g. an HTML-aware Web browser) to successfully communicate with a new-found server.

Media-Type Focus

It's not just the protocol semantics that the two parties share, but also the message semantics. For example, the A, LINK, FORM and INPUT elements in HTML all indicate transition details. In the previous article in this series a custom hypermedia type was designed (application/TK). That design has LINK, ACTION, and DATA elements that indicate transitions. The representation service is the place where the information from internal storage and the operations in the private component layer are translated into a public representation; one that both client and server understand.

This focus on using the message itself (the media type) as the primary "shared understanding" between client and server is one of the important features of hypermedia systems. The media type is the way clients and servers "talk" to each other without having to know what programming language (Ruby, Python, PHP, Node, etc.), coding style (Object Oriented, Functional, Procedural), or even operating system is used by either party.

Translating Domain-Specific Information

Representation services do very important work. They accept requests from the public connector (we'll see what that looks like in the next section), make requests to the private component layer and act as a translator between these two "worlds."

Representation.js

In this sample implementation, the representation layer is housed in a single module called representation.js. This module is able to "talk" application/TK.

Here is the high-level code that "walks" the internal object model supplied by the component layer and translates that into the public Class Scheduling hypermedia type:


The routine "knows" the layout of the internal object graph supplied via the object argument above. The routine also "knows" the layout of a valid Class Scheduling message. A private object graph goes in and a public hypermedia message comes out.

Here's the dataElement routine; the one that converts any data points in the private graph into valid data elements in the message.
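The sketch below shows the general shape of such a translation. It is not the series' representation.js: the element names (list, item, link, data), the attribute names, and the shape of the input graph are assumptions made for illustration, and only the dataElement name comes from the text above.

```javascript
// Escape the five characters that are unsafe in XML text and attributes.
function esc(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// One data point from the private graph becomes one data element.
function dataElement(name, value) {
  return '<data name="' + esc(name) + '">' + esc(value) + '</data>';
}

// One internal link hint becomes one link element.
function linkElement(link) {
  return '<link rel="' + esc(link.rel) + '" href="' + esc(link.href) + '" />';
}

// Walk the private object graph and emit the public message.
function represent(graph) {
  const items = graph.items.map((item) => {
    const links = item.links.map(linkElement).join('');
    const data = Object.entries(item.data)
      .map(([name, value]) => dataElement(name, value))
      .join('');
    return '<item>' + links + data + '</item>';
  });
  return '<list name="' + esc(graph.type) + '">' + items.join('') + '</list>';
}

console.log(represent({
  type: 'student',
  items: [{ data: { studentName: 'Marius Wingbat' }, links: [{ rel: 'item', href: '/student/1' }] }]
}));
```

Supporting a second public format would mean adding another routine like represent() without touching the component layer.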

You may notice that the layout of both the internal and public data is very similar. This is not a requirement to translate between private object graphs and public media types, but it does sometimes make things easier. However, this is not a common case; especially in systems that support more than one public message format. The similarities here were made to keep translation straightforward and the comparison relatively easy to view and analyze.

So, with a representation layer in place, the last step is to implement the Connector layer that converts incoming protocol requests (in this case, HTTP) into something the components can understand and then return the results of the representation layer's work back to the caller.

The Connector Layer

The Connector Layer is the layer that is exposed to the public Web. Connectors "speak" HTTP, DNS, etc. and are the gateways into which requests flow and responses return. Web server engines (Apache, IIS, nginx, etc.) are the best-known type of connector. Most of them support more than just blindly accepting requests and returning responses. They also support some level of routing and scripting. These make it possible to write code that inspects incoming requests, passes them to the proper component and provides the proper response once the component has completed its work.

Protocol-Level Interaction

Connectors focus on protocol-level interaction. An HTTP connector understands the details of the HTTP protocol and makes those details available for inspection and manipulation. Inspecting the URL on the incoming request, validating the headers to determine what the response format should be, and routing the request (and any arguments) to the proper component is the job of the connector.

For this article series, Node.js is the connector. It is rather simple to start up an HTTP connector using Node.js. Using Node.js to provide routing and manipulation of HTTP messages is also easy.
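As a minimal sketch of how little is needed, the built-in http module is enough to stand up a connector. The handler below is a placeholder rather than the series' app.js, and the start() wrapper is a hypothetical helper; port 1337 is used because it matches the command-line examples later in this article.

```javascript
const http = require('http');

// Request listener: inspect the method and URL, then write a response.
function handler(req, res) {
  if (req.method === 'GET' && req.url === '/') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('connector is running\n');
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('not found\n');
  }
}

const server = http.createServer(handler);

// Call start() to begin accepting requests.
function start(port = 1337) {
  server.listen(port, () => console.log('listening on port ' + port));
}
```

Calling start() and then requesting http://localhost:1337/ should return the plain-text message from the handler.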

Mediating Between Internal and External World

Since it is the connector that faces the external world, scripting the connector means deciding which requests are accepted, which URIs are valid, and what each of them returns when executed. This means mapping the internal operations of our problem domain (Class Scheduling) to the external limitations of (in our case) HTTP. Much of this was described in the documentation created as part of the previous article in this series. The Protocol Mapping section of the Media Type document maps domain actions to HTTP methods. The Problem Domain document sets out which data elements are supplied when making HTTP requests. This material provides the basis for implementing connector scripts for our server.

App.js

In this example, the connector coding resides in the app.js module. This is the place where HTTP requests arrive and where HTTP responses originate. To keep things easy to read, no installed external modules or frameworks are used in this example. While that means some of the code is a bit "wordy", that also means there are no "hidden" features and you don't need to know much about Node's external modules in order to understand what's going on in these examples.

You can see that the connector inspects the URL, checks the HTTP method, and then passes the work on to a local routine which processes any payload that was passed in and then passes things to the component layer.
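That flow can be sketched for a form-encoded POST as follows. This is an illustration, not the series' app.js: readBody, handleStudent, the stand-in component object, and the 201/Location response are assumptions about how such a routine might be written.

```javascript
// Collect the request body, then hand it to a callback as a string.
function readBody(req, done) {
  const chunks = [];
  req.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
  req.on('end', () => done(Buffer.concat(chunks).toString('utf8')));
}

// Stand-in for the component layer (hypothetical name and return shape).
const component = {
  addStudent: (fields) => ({ id: 'generated-id', ...fields })
};

// Local routine for the student resource: check the method, parse the
// payload, and pass plain fields to the component layer.
function handleStudent(req, res) {
  if (req.method !== 'POST') {
    res.writeHead(405, { Allow: 'POST' });
    res.end();
    return;
  }
  readBody(req, (body) => {
    // URLSearchParams decodes form-encoded pairs such as
    // studentName=Marius%20Wingbat&standing=junior
    const fields = Object.fromEntries(new URLSearchParams(body));
    const student = component.addStudent(fields);
    res.writeHead(201, { Location: '/student/' + student.id });
    res.end();
  });
}
```

The connector never touches storage directly; it only decodes the protocol-level details and forwards plain values.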

Here's the bit of connector code that calls the component module to handle assigning a student to an existing class:

And that's all there is to connector coding. The connector layer routes and parses the requests, passes them off to the appropriate component and, when the internal response is returned, passes that to the representation layer and then returns that to the caller.

Browsing the API

Once you have the server up and running, you want to browse that API, validate the various operations, and explore it a bit. For the typical Web application, this can be done using a common Web browser. That works because almost all Web applications limit themselves to a single hypermedia type (HTML) and a handful of other standardized media formats (CSS, JavaScript, binary images, etc.).

The common Web browser is also an incredibly fine-tuned application. By adhering closely to a handful of standards, browsers can successfully connect and interact with any Web server that also follows the same standards. We have discovery and interoperability working at the same time.

The Limitations of the Common Web Browser

When servers use uncommon registered media types such as Atom, HAL, Collection+JSON, Siren, etc. there is no assurance that common Web browsers will understand them and be able to interact successfully with the server. Browsers will not share understanding of the hypermedia controls (transitions) that appear in messages. Browsers may not "know" which HTTP method to apply for various transitions (GET, POST, PUT, DELETE, etc.). And browsers may not know which data element should be rendered locally (e.g. an image) or treated as a navigation (e.g. a link).

In the case of our XML-based hypermedia type created just for this article series, browsers can actually get us "part of the way" toward being able to browse our Server API. Since the format is XML, all common browsers will display the responses clearly. There are even some Web browser plug-ins that not only render XML well, they also parse it and allow users to click on links to navigate the API. Below is a screenshot from Google's Chrome browser with such a plug-in loaded and running while viewing a response from our Class Scheduling server.


This is helpful because we can move from one state to the next by simply clicking on the links in the responses. However, there is no support for executing transitions that support data parameters. In short, the plug-in does not know how to recognize and process the <template /> elements of our custom media type.

If we want to execute state transitions that include passing variables, we need to rely on other tools.

The Limitations of the Command Line

The most common approach to executing parameterized interactions on the Web is to use command-line tools like CURL and WGET. In fact, it's not unusual for API authors to claim they have a quality interface because "you can just CURL it!" For example, here is a command for creating a new student record for our server implementation using CURL.

First, a small file (post-student.txt) that contains the content to send to the server:

studentName=Marius%20Wingbat&standing=junior

and then the actual command line that uses CURL to send this content to the running server:

curl -X POST -d @post-student.txt http://localhost:1337/student/

Of course you can also retrieve data from servers using command-line tools:

curl http://localhost:1337/course/

but the resulting response is pretty much unusable (see screenshot below):


While it's possible to use command-line scripting tools to pipe these results to parsing tools in order to present a more understandable version of the response, the current crop of client-side tools does not easily support an interactive hypermedia experience, either.

What is needed is something that blends the interactive value of read-only browser-based tooling with the power of command-line style "write-able" interactions.

The Advantages of a Media Type Explorer

One way to achieve a more fully functional browser-style interactive experience using custom media types is to create an "explorer" interface. This interface leads humans through a hypermedia-style UI similar to today's common Web browsers and offers the ability to execute parameterized transitions, too. In fact, any hypermedia-style media type can support this kind of experience. For example, the Hypertext Application Language (HAL), an IANA-registered media type, offers just such an experience today with its HalTalk explorer (see screenshot below):


Explorers make it possible for humans to browse and interact with any hypermedia server that "speaks" the same language. This may not be at the level of supporting a stand-alone custom application platform for the media type, but it goes quite a ways toward making these hypermedia types available and "surf-able" in order to validate the functionality and inspect newly discovered servers and their APIs.

In the next article in this series, we'll build an explorer for our Class Scheduling media type as well as other more familiar clients including an automated bot to perform tasks without the direct intervention of humans at runtime.

Summary

In this article we looked at the details of building a server that supports a custom hypermedia format as the primary interface: a hypermedia API. Along the way, a general model for hypermedia server implementation was outlined, one based on a separation of concerns between private components and public connectors. Components handle the storage and business logic. Connectors handle the translation of the private data into a public format (in this example, the Class Scheduling media type) and the routing of requests to the proper internal components. The notion of a representation layer was introduced as a way to create a bridge between the private and public portions of the system and also as a way to allow for future support for multiple representation formats when needed. This combination of domain-specific components and domain-agnostic connectors provides a solid, scalable basis for hypermedia-style servers.

The next installment in this series will explore the details of coding various types of hypermedia clients: ones that provide a "faithful rendering" of the server's responses, ones that maintain their own "custom view" of server replies in order to establish their own application interface, and ones that act as automated robots that solve specific problems without the need for human intervention at runtime.

About the Author

Mike Amundsen is Principal API Architect for Layer 7 Technologies, helping people build great APIs for the Web. An internationally known author and lecturer, Mike travels throughout the US and Europe consulting and speaking on distributed network architecture, Web application development, Cloud computing, and other subjects. He has more than a dozen books to his credit.

REMINDER:

All the source code for the server implementation discussed here is available at the GitHub repository for this series. Readers are encouraged to download the code and provide contributions and comments in the public repo.
