SOAP versus REST (opinion piece)

SOAP versus REST. Which is better? The politically correct answer is of course, as always, “it depends”.

But that is not what I think – REST is best.

On the surface, SOAP has benefits: a SOAP request can carry a much more complex data structure than a typical REST request. REST normally takes a series of arguments with simple strings as values, whereas SOAP allows very complex data structures to be encoded (even cyclic graphs if you use RPC Encoding). A SOAP request (and response) can also be sent down different communication channels, not just over HTTP. However, you can HTTP POST XML requests without SOAP as well, so to me the benefit of SOAP is limited.

Digging a little deeper into SOAP, there are three encodings of the payload of a SOAP request:

RPC Encoding

RPC-Literal Encoding

Document-Literal Encoding

Which encoding is best? You will no doubt find a few posts that politely weigh the pros and cons of choosing one form or another. Let me give my completely personal opinion: RPC encoding is plain dumb.

I recall when SOAP first came out everyone was talking about “section 5” encoding (RPC Encoding). I was involved in the early rounds of interop testing back in 2001 for the spec. There were multiple teams racing off and implementing the spec, putting up servers, and testing for interoperability. (Gosh, the things Google can pull up: “Alan Kent SOAP Interoperability Testing” found this for example from Nov 2001.) I distinctly remember the crazy notion that you use XML Schema to describe the data model of the content to be transferred, but the schema was not defining the legal XML. Huh? You use XML Schema to describe what to send, but you cannot use it to validate the XML that is actually sent? Sorry, I don’t have the time here to explain.

I wondered at the time why Microsoft did not seem interested in RPC Encoding and went straight for Document-Literal encoding. Microsoft bashing was in vogue back then, so many wrote it off as Microsoft being silly. RPC encoding was complex, which meant it was cool (for developers). Later I realized Document-Literal encoding was exactly the right thing to do. It’s RPC encoding that is dumb. Months of haggling over the poorly conceived RPC encoding in the spec really was a waste of time. And the haggling did not get it fixed. You just had to learn which toolkits supported the subset that other toolkits also supported.

Why is RPC encoding dumb? One core reason is that there are many different ways to encode the same data structure. A coolish idea is that RPC Encoding can actually encode cyclic graphs. You can have XML markup with links between elements using ‘id’ and ‘href’ attributes. However, as cool as this is, you are pushing your luck if you rely on toolkits supporting cyclic graphs correctly. The original spec had so many options for encoding a data structure that interoperability was a major challenge. (OK, things may be better now, I have not checked the spec for years.)
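To make the ‘id’ and ‘href’ idea concrete, here is a minimal Python sketch of decoding a two-node cycle. The XML fragment and its element names (body, person, name, friend) are made up for illustration and are not taken from the spec or any toolkit; the only assumption carried over from section 5 is that an element can carry an ‘id’ and another element can point at it with href="#that-id".

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: two people who reference each other, forming a cycle.
ENCODED = """
<body>
  <person id="p1"><name>Ann</name><friend href="#p2"/></person>
  <person id="p2"><name>Bob</name><friend href="#p1"/></person>
</body>
"""


def resolve_references(xml_text):
    """Build one dict per identified element, then link the href pointers."""
    root = ET.fromstring(xml_text.strip())
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

    # First pass: create every node, so a cycle has something to point at.
    nodes = {key: {"name": el.findtext("name")} for key, el in by_id.items()}

    # Second pass: replace each href="#id" with the node it refers to.
    for key, el in by_id.items():
        friend = el.find("friend")
        if friend is not None and friend.get("href", "").startswith("#"):
            nodes[key]["friend"] = nodes[friend.get("href")[1:]]
    return nodes


if __name__ == "__main__":
    graph = resolve_references(ENCODED)
    # Following the links twice brings us back to where we started.
    print(graph["p1"]["friend"]["friend"]["name"])
```

Even this toy version needs two passes to cope with the cycle, and a real toolkit also has to handle every other way the same structure could have been written, which is where the interoperability trouble came from.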

Now RPC-Literal encoding is actually new to me. It was not in the early versions of the spec. My reading is that it’s RPC encoding, but with the stupid concept thrown out that the XML Schema defines a data model rather than the legal XML encoding you can use.

Document Literal encoding makes a lot more sense to me. It basically makes SOAP a transport mechanism for XML messages. You get to define the XML structure of requests (and responses). You just need an XML parser to decode the response into a tree and then use XPath expressions, JAXB, or similar to pull content out of the XML. But then you have to ask yourself why bother with the SOAP side of things?
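As a rough sketch of what “just an XML parser plus XPath” looks like, here is a Python example using the standard library. The response message, the urn:example:items namespace and the element names are all invented for illustration; only the SOAP 1.1 envelope namespace is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-literal response. The payload inside soap:Body is
# ordinary XML that we defined ourselves.
SOAP_RESPONSE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getItemResponse xmlns="urn:example:items">
      <item>
        <id>1234</id>
        <name>Widget</name>
      </item>
    </getItemResponse>
  </soap:Body>
</soap:Envelope>
"""

NAMESPACES = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "ex": "urn:example:items",  # assumed namespace for this example
}


def extract_item(xml_text):
    """Parse the envelope and pull the item fields out with XPath-style paths."""
    root = ET.fromstring(xml_text.strip())
    item = root.find("soap:Body/ex:getItemResponse/ex:item", NAMESPACES)
    if item is None:
        raise ValueError("no item element in response")
    return {
        "id": item.findtext("ex:id", namespaces=NAMESPACES),
        "name": item.findtext("ex:name", namespaces=NAMESPACES),
    }


if __name__ == "__main__":
    print(extract_item(SOAP_RESPONSE))
```

Notice that the only SOAP-specific part is stepping through Envelope and Body to reach the message.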

And this is in part why I like REST. You can HTTP POST the XML message directly at an endpoint and wait for a response. It’s quick and simple, and there is no need for the SOAP envelope wrapping elements.
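Here is a minimal sketch of that in Python, using only the standard library. The function names and the example URL are my own placeholders, and a real client would want error handling and probably authentication.

```python
import urllib.request


def build_xml_post(url, xml_text):
    """Create a POST request whose body is the raw XML message, no envelope."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def post_xml(url, xml_text, timeout=10):
    """Send the XML and return the response body as text."""
    request = build_xml_post(url, xml_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder endpoint; substitute a real one before running.
    req = build_xml_post("http://example.invalid/item", "<item><id>1234</id></item>")
    print(req.get_method(), req.full_url)
```

Building the request is split from sending it so the first half can be inspected or tested without a server.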

So what is and is not REST? You can argue that pretty well any URL structure is valid REST (if you squint hard enough). To me, however, REST is an approach for using HTTP to build an API. I believe in the idea of URLs having a semantic structure. For example, a path of /item/1234 identifies an item with ID 1234. Once you have formed a path to an entity, you can do operations on it (such as fetch it).

Toolkit support should be considered when designing a good REST API. There are many toolkits that work best with static templates for URLs. For example, /item/{itemid} and /item/{itemid}/variant/{variantid}. Query parameters can be used to pass secondary metadata or request data to the API. This is good for optional parameters or requests of the identified entity. I consider using the URL structure for optional parts to be generally against the grain of REST. (That is, don’t have a path such as /a/b/c where /b is optional.)
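To show why static templates are easy on toolkits, here is a small Python sketch of a template matcher. It is my own toy code, not how any particular toolkit works, and the route names and the ‘fields’ query parameter are made up for the example.

```python
import re
from urllib.parse import parse_qs, urlsplit


def compile_template(template):
    """Turn a template like /item/{itemid} into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)  # literal path text
        else:
            pattern += "(?P<%s>[^/]+)" % part  # placeholder: one path segment
    return re.compile("^" + pattern + "$")


# Every path segment is mandatory; optional data goes in the query string.
ROUTES = [
    ("item", compile_template("/item/{itemid}")),
    ("variant", compile_template("/item/{itemid}/variant/{variantid}")),
]


def route(url):
    """Return (route name, path parameters, query parameters) or None."""
    parts = urlsplit(url)
    for name, regex in ROUTES:
        match = regex.match(parts.path)
        if match:
            return name, match.groupdict(), parse_qs(parts.query)
    return None


if __name__ == "__main__":
    print(route("/item/1234/variant/7?fields=name"))
```

With fixed templates each URL matches at most one route. Once a middle segment becomes optional, the matcher has to guess which segment was left out.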

Wrapping up, SOAP and REST are both pretty commonly used. I think SOAP RPC Encoding was a badly conceived part of the spec that they refused to fix early on. I think SOAP Document Literal encoding is more sensible, but seems overkill. That leaves REST with XML over HTTP as the logical way to proceed.

Now if this is not a post to start a flame war I don’t know what is! Maybe I will just turn off comments now! 😉