A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away. ~Antoine de Saint-Exupery -- Note, the opinions stated here are mine alone and are not those of any past, present, or future employer. --

Wednesday, January 10, 2007

A Real eBay Architect Analyzes Part 3

In his ongoing interview, Duncan Cragg addresses business functions in Part 3. I have been stepping in for my imaginary friend but in this installment, I am going to switch to a different format. Rather than try to answer the questions directly, I will address specific aspects of the interview that I think are ripe for discussion.

Let me start by stating that, in general, I believe either the declarative or the imperative model can be applied. eBay is looking at both because there is merit to both approaches. My issues are instead with claims that I believe overstate the benefits of REST. Duncan makes assertions about common content types in a couple of places. The two quotes that I'll call out are:

We can read data at a URI with GET. We will usually understand that data when we get it, because it has a standard content type at a number of layers - perhaps from character set up to Microformat via XML and XHTML.

and

There's also the expectation of standard Content-Types, sub-types and schemas in GET and POST, rather than custom eBay WSDLs and schemas, that I mentioned before.

I've been on record for a while now with the assertion that as you move away from common concepts like media or messages, the availability of common formats will decrease. Duncan's examples cite the resources User, Item, Offer, and Feedback. I might expect to find a common type for User. Item and Offer are unique to auctions (they differ from product and sale). While Feedback might exist in other systems, the semantics vary in each of those systems, so expecting to find a common schema is a bit of a reach. I certainly wouldn't preclude eBay and the other auction sites working on a widely used format to represent an auction (which is not a product, so those formats don't work), but there isn't any current activity in that area.

A concrete example of my point is the current state of maps. There are at least two popular map interfaces that are completely incompatible. My assertion is that vendors will invent formats to serve their own needs, which causes divergence as you move away from common media. Roy Fielding has argued that consumers will drive the vendors back to standards, but I'm not sure I agree. Video is arguably a common media type with high consumer demand. Yet the pressure from consumers has been on clients to support the dozen or so video formats, not on the content producers to standardize.

Another point that Duncan continues to make is that REST offers better scalability than SOAP. From the article:

It's scalable because of all the reasons I mentioned before: the cacheability of the basic data operations and their parallelisability through partitioning.

Plus now we have parallelisability of the application of the business rules. There's nothing more parallelisable than a declarative system.

I don't believe the claim that declarative interactions yield better parallelism than process-oriented ones. Partitioning is about how you architect your implementation; it is not inherent in the interaction style. We have created a massively parallel system that implements SOAP interfaces and has the ability to scale horizontally to incredible levels of parallelism.

As a counterexample, state-style interactions can actually lead to lower levels of efficiency in the implementation. When a client makes an imperative statement like CompleteSale, we are completely clear on the intent of the operation. We can immediately go to work on the processing and manage it as efficiently as possible. But if the client passes back an Item (which consists of over 200 state elements) with some state changed, the first task we have to perform is determining the state transition. This will involve retrieving the item and potentially other state in the system. All of this is a necessary precursor to determining intent, and it certainly increases the resource requirements.
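To make the contrast concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the in-memory `STORE`, the three-field item (the real Item has over 200 elements), the `sale_status` field, and the handler names are hypothetical and are not eBay's actual API. It only shows that the state-style handler has to read stored state and diff it before it knows what the client wants.

```python
# Hypothetical in-memory store standing in for the item database.
STORE = {"item-1": {"title": "Lamp", "price": 20, "sale_status": "open"}}
COUNTERS = {"reads": 0}  # lookups performed before real work can start


def load_item(item_id):
    """Fetch a copy of the stored item and count the lookup."""
    COUNTERS["reads"] += 1
    return dict(STORE[item_id])


def handle_complete_sale(item_id):
    """Imperative call: the operation name already states the intent."""
    return "CompleteSale"


def handle_item_post(item_id, submitted):
    """State-style call: diff against stored state to recover the intent."""
    stored = load_item(item_id)
    changed = {k for k, v in submitted.items() if stored.get(k) != v}
    if changed == {"sale_status"} and submitted["sale_status"] == "complete":
        return "CompleteSale"
    return "ReviseItem" if changed else "NoOp"
```

In this sketch, `handle_complete_sale` never touches the store, while `handle_item_post` performs one lookup and one comparison pass per request just to classify it.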

We need to partition along functional as well as data lines. We have separate functional pools for revising an item and finalizing a sale because of their different load characteristics. Since I can't efficiently deduce intent from the REST POST, I have a new challenge: how to partition my functionality. So you can see that eliminating a clear statement of intent from the information passed by the client makes it more challenging to partition my architecture.
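The routing problem can be sketched the same way. The pool names, the `id` and `sale_status` fields, and the `load_stored` callback below are all hypothetical; the only thing taken from the text above is that revising an item and finalizing a sale run in separate pools. The point of the sketch is that the imperative router needs nothing but the operation name, while the state-style router needs access to stored item data before it can pick a pool.

```python
# Hypothetical pool names for the two functional pools mentioned above.
POOLS = {"ReviseItem": "revise-pool", "CompleteSale": "finalize-pool"}


def route_imperative(operation):
    """The operation name alone selects the pool; no item data is needed."""
    return POOLS[operation]


def route_state_post(submitted, load_stored):
    """The router must fetch stored state and diff it before choosing a pool."""
    stored = load_stored(submitted["id"])
    changed = {k for k in submitted if submitted[k] != stored.get(k)}
    operation = "CompleteSale" if changed == {"sale_status"} else "ReviseItem"
    return POOLS[operation]
```

Under these assumptions, the front-end tier for the state-style interface has to depend on the data tier, which is the coupling that functional partitioning is meant to avoid.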

Comments

The Item resource that's passed back via REST *could* contain only values that are changed.
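A rough sketch of what that might look like, assuming the client sends a small dictionary holding only the changed fields. The function name and field names are made up for illustration and are not part of any eBay interface.

```python
def apply_partial_update(stored, patch):
    """Merge a client patch that carries only the changed fields."""
    unknown = set(patch) - set(stored)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    # Fields absent from the patch keep their stored values.
    return {**stored, **patch}
```

With a patch like `{"sale_status": "complete"}`, the server can see which fields the client touched without comparing the full Item, although it still has to decide what operation that set of fields implies.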

I'm sure with a little thought we could come up with a scalable way to solve your example via REST, but I'm not terribly familiar with eBay's current web services (last eBay development I did was against Mr. Lister) so I don't understand the use case for CompleteSale exactly. Who normally runs CompleteSale and what changes does it make to a listing?

> REST isn't a protocol, it's an architectural style based around resources. Partition by those resources. The Web (pretty big) is nicely partitioned that way.

This is contrasting type partitioning versus functional partitioning. Both work and it remains to be seen which scales better. Either approach supports instance partitioning.

> I'd love to see more detail about this - as would many others.

I think there's a deck out there somewhere that goes through a lot of that! :-)

> Yes - so break things up into micro-resources! That's good practice.

Hmm...now I have my type space growing geometrically so I can expose subparts of my entities? The number of types could degenerate into the number of operations and I fail to see an improved interface for the developer when that happens.

Duncan, if you are listening: I wonder if you have actually tried to build a non-trivial system this way. A working example would be most useful, because I'm afraid you are oversimplifying your eBay examples to the point of meaninglessness.

Dan, on micro-resources: I do not think exposing them grows your type space. You have them anyway, as fields in a bigger resource; you are just exposing them through the interface. So instead of exposing an RPC CompleteSale method, you expose .../SaleTransactionState as a resource. I don't think it is a big gain or loss.
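As a sketch of that equivalence, the two dispatchers below reach the same handler. The path layout `<item id>/SaleTransactionState`, the `"complete"` body value, and all function names are assumptions made up for this example.

```python
def complete_sale(item_id):
    """Shared business logic, reached from either interface style."""
    return f"sale completed for {item_id}"


def rpc_dispatch(method_name, item_id):
    """RPC style: the method name selects the handler."""
    handlers = {"CompleteSale": complete_sale}
    return handlers[method_name](item_id)


def resource_dispatch(http_method, path, body):
    """Resource style: a PUT to the micro-resource selects the same handler."""
    # Assumed layout: the last two path segments are <item id>/<resource name>.
    *_, item_id, resource = path.strip("/").split("/")
    if (http_method, resource, body) == ("PUT", "SaleTransactionState", "complete"):
        return complete_sale(item_id)
    raise LookupError(f"no handler for {http_method} {path}")
```

Both `rpc_dispatch("CompleteSale", "item-1")` and `resource_dispatch("PUT", "items/item-1/SaleTransactionState", "complete")` end up calling `complete_sale("item-1")`, so in this sketch the intent is carried by the resource name rather than the method name.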

Frankly, with REST much remains the same. All you do is put OO on steroids and you have ROA, where some of the functions become resources. You can still have algorithms exposed through overloaded POST; that is to be avoided (i.e., if in doubt, create a resource rather than overloading POST), but it is not forbidden.

The big deal about REST is that it fits into the existing infrastructure (yes, caching is part of that), and that you do not need an elaborate, version-matching runtime on the clients.