Sam Ramji: Thoughts on Open Source, Software, and Cloud Computing

February 04, 2010

Why modern applications need an API proxy

Structures of control are spontaneously generated in every environment and every wave of computing.

Today on the web we have a model where browsers are the single point of control for much of what happens, not just at the level of applications, but at the meta-application level as well. Not simply usage (“point-click-type”), but things about usage – who is the user (browser cookie), what are they using the app through (user agent), where did they come from (referrer), what can we infer about their behavioral state, and so on – as well as modifications of usage (browser add-ins, content filters, security modes, local caching for performance). To be sure, some of these things can be and are performed using infrastructure between the browser and the website (such as content filtering, security, and caching), but the guaranteed component is the browser.
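The browser-level signals listed above (cookie, user agent, referrer) are all readable straight off an HTTP request. A minimal sketch in Python, with illustrative header values only:

```python
# Sketch: pull the meta-application signals described above out of a
# request's headers. The header names are standard HTTP; the dict shape
# and function name are assumptions for illustration.

def extract_signals(headers: dict) -> dict:
    """Return the who/what/where signals a control point can observe."""
    return {
        "user": headers.get("Cookie", ""),       # who is the user (browser cookie)
        "agent": headers.get("User-Agent", ""),  # what they are using the app through
        "referrer": headers.get("Referer", ""),  # where they came from (HTTP's spelling)
    }

signals = extract_signals({
    "Cookie": "session=abc123",
    "User-Agent": "Mozilla/5.0",
    "Referer": "https://example.com/home",
})
print(signals["referrer"])  # → https://example.com/home
```

Any component that sits on the request path – browser, proxy, or server – can derive the same record; the point is that today only the browser is guaranteed to be there.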

This is one of the reasons that Google Analytics is so popular and useful – you can rely on it to tell you useful things about your traffic because it can rely on the browser as a predictable point of control. Including an invisible piece of content on your web page makes the browser fetch data from Google, implicitly sending information that enables Google to report on your usage.
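The tracking-pixel pattern described here can be sketched in a few lines: the page embeds an invisible image whose URL carries the usage data, and fetching it reports the visit. The collector endpoint and parameter names below are hypothetical, not Google's actual API:

```python
# Sketch of the invisible-content beacon pattern. The collector URL and
# query parameter names ("page", "ref") are illustrative assumptions.
from urllib.parse import urlencode, urlparse, parse_qs

def beacon_url(collector: str, page: str, referrer: str) -> str:
    """Build the URL the browser will fetch for the invisible pixel."""
    return collector + "?" + urlencode({"page": page, "ref": referrer})

url = beacon_url("https://analytics.example/collect", "/pricing", "https://search.example")

# The collector just parses the query string back into a usage record:
record = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
print(record)  # → {'page': '/pricing', 'ref': 'https://search.example'}
```

The scheme works only because every browser reliably fetches embedded content – which is exactly the predictable control point the surrounding text describes.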

For web and cloud APIs, what is the equivalent structure of control?

Currently there is no single point like the browser. There are good reasons for this – APIs are all about reusing application or service logic and rendering it to different form factors: pure logic (built into an internal application computation), web UIs (part of a mashup), and most notably, client applications on a wide range of devices (from PCs to mobile phones, set-top boxes, and tablets like the iPad). These devices are in the early part of a boom that will see over 10 billion individual units in use, representing at least hundreds of unique hardware/software designs. The sheer utility of these internet-connected devices suggests that their usage will drive heavy demand for APIs rather than standard websites. There are initial specifications like BONDI that propose a standard contract across all of these for “mobile web applications”, including interaction with the features of the local device (such as a camera or GPS), but they are years from broad adoption and don’t attempt to unify all API access down to a common control point.

Given that APIs are to application logic what RSS is to content, we know they will be very important – at least as important as the visible web we use today, and possibly more so. This suggests that the other things spontaneously generated in value-exchange environments – user/customer management, behavior analysis, content filtering, caching, and security – will show up for APIs as well.

The web API equivalent of the browser’s control structure is an API proxy.

This is a control point which, unlike a web proxy, is fully aware of API content and communication patterns, and is able to drive the meta-application controls discussed above. An architecture like Google Analytics, which is founded on the browser as a predictable point of control, cannot work in an API setting. The same rule applies to add-ons that modify usage – they can’t rely on the local device if they are to be widely adopted. But an API proxy – a server or service on the internet, sitting between the client (regardless of type) and the API – is able to be that point of control. As traffic runs through it, meaningful data can be captured for immediate outcomes (block access, change the message, or respond from a cache) and later used for behavior analysis and business planning. Add-ons that modify usage of the API can be installed at this point (content filtering, adding new information such as advertising, or identity management). All of this can be done while adhering to the contracts of the APIs and supporting the web architecture and rules of HTTP-based applications, and without attempting to solve the combinatorially complex problem of modifying all the world’s clients.
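The per-request decision a proxy like this makes can be sketched as a single function: log the call, then block it, answer it from cache, or pass it through. Everything below – field names, the revocation set, the log shape – is an illustrative assumption, not any vendor's API:

```python
# Minimal sketch of an API proxy's per-request policy: capture data for
# later analysis, then block, serve from cache, or forward upstream.

def handle(request: dict, cache: dict, revoked_keys: set, log: list) -> dict:
    """Return a response dict and record the request for behavior analysis."""
    log.append({"key": request["api_key"], "path": request["path"]})  # capture traffic data
    if request["api_key"] in revoked_keys:
        return {"status": 403, "body": "access revoked"}              # block access
    if request["path"] in cache:
        return {"status": 200, "body": cache[request["path"]]}        # respond from cache
    return {"status": 200, "body": "<forwarded upstream>"}            # pass through to the API

log: list = []
resp = handle({"api_key": "k1", "path": "/v1/users"},
              cache={"/v1/users": "cached!"}, revoked_keys={"k9"}, log=log)
print(resp["body"])  # → cached!
```

A real proxy would also rewrite messages and inject add-on behavior at the same choke point, but the shape of the decision is the same.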

So API proxies are likely to be necessary for the sustained growth of web and cloud API usage. There are likely to be several nuances that end up differentiating the different implementations and providers of API proxies. The key is to start experimenting with them now in order to build better apps and stay ahead of the competition.

I think you understood very clearly, in fact. My perspective is not provider centric but app developer centric. Not every app will expose an API but most will use one. This is why we've built Apigee (http://www.apigee.com).

So yes, there's a growing market for API proxies. There are a number of players in this market, including Sonoa (my company), Mashery, 3Scale, Webservius, and I expect that there will be others soon. Each has its own focus areas and features. Apache Synapse can perform similar functions but was not designed with this use case in mind.

Additionally, not just webapp developers but any app developer – including iPhone app developers, for example – who uses third-party APIs will benefit from an API proxy.

Interesting points and I think the architectural changes APIs are driving will be profound. I'm not sure I really agree with the proxy control point though (unless I misunderstood).

There are really two interesting control structures involved for APIs – provider side and consumer side. On the provider side you certainly want a gateway of some kind which enforces security, rights, rate limits, etc. There are some API infrastructure vendors which solve this problem with cloud-hosted or on-premise proprietary gateways (labelled proxies) which provide traffic control. At 3scale (http://www.3scale.net) we solve it by providing control agents which you plug into different systems – either open source proxies such as Varnish, or most flavours of software stack. Either way, you are bringing traffic management to the data ingress point.
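One of the gateway policies mentioned above – rate limiting at the ingress point – can be sketched as a simple fixed-window counter per API key. This is a toy stand-in for real traffic-control policies; the class and parameter names are assumptions:

```python
# Sketch of provider-side rate limiting: each API key gets a fixed
# quota of calls per time window, enforced at the data ingress point.

class RateLimiter:
    def __init__(self, limit_per_window: int):
        self.limit = limit_per_window
        self.counts: dict = {}  # (api_key, window) -> calls seen

    def allow(self, api_key: str, window: int) -> bool:
        """Permit the call if this key still has quota in the current window."""
        n = self.counts.get((api_key, window), 0)
        if n >= self.limit:
            return False
        self.counts[(api_key, window)] = n + 1
        return True

rl = RateLimiter(limit_per_window=2)
print([rl.allow("k1", window=0) for _ in range(3)])  # → [True, True, False]
```

Production gateways use sliding windows or token buckets to avoid bursts at window boundaries, but the enforcement point – the ingress – is the same.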

On the API consumer side, however, as you point out, you need mechanisms to track the rights that you have on any given API. Currently this is very weak, since it depends on essentially keeping a list of keys and certs and hoping that some other system is tracking the rights those give you.
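At minimum, the "list of keys and certs" needs to become a registry mapping each credential to the rights it confers. A sketch, with entirely hypothetical keys, APIs, and scopes:

```python
# Sketch of consumer-side rights tracking: map each credential to the
# API it is for and the scopes it grants. All names are hypothetical.

RIGHTS = {
    "key-abc": {"api": "photos.example", "scopes": {"read"}},
    "key-def": {"api": "maps.example", "scopes": {"read", "geocode"}},
}

def can(key: str, api: str, scope: str) -> bool:
    """Check whether a credential grants a given scope on a given API."""
    entry = RIGHTS.get(key)
    return bool(entry) and entry["api"] == api and scope in entry["scopes"]

print(can("key-def", "maps.example", "geocode"))  # → True
print(can("key-abc", "photos.example", "write"))  # → False
```

This is the bookkeeping that today lives in spreadsheets and hope; whatever component ends up owning it becomes a consumer-side control point.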

You seem to suggest that the two sides will necessarily be unified in the middle, but I doubt this will happen broadly (it may for certain applications) – primarily because A) the way the web works at scale is point to point: traffic needs to go peer to peer, otherwise overwhelming volume will choke bottlenecks, and B) the problems you need to solve for APIs are varied (establishing identity, tracking rights, analytics, payments, monitoring) and it's not obvious they will all need to go through the same point. For example, Facebook has become a leading Web Identity provider and is used to track credentials/access to many sites – yet the content of those sites does not subsequently pass through Facebook.

Interesting debate!

There were recently some sessions on this at Gluecon and in one we had the chance to provide a bit of an overview on possible evolutions of the web - I think some of those topics are relevant here also! http://slidesha.re/KQltld
