Saturday, November 24, 2007

Congratulations on your election victory, Mr. Prime Minister! (I think I can safely address you as such even though you haven't been officially sworn in yet).

I would like to get a quick word in before you are deluged by work and a cacophony of voices representing the spectrum of Australian interests.

Your campaign promised Australia a world-class education system, with an emphasis on "digital schools" and a computer for every child in Years 9 to 12. I laud your vision and extend my support and best wishes towards its successful implementation.

I must however caution that to bureaucrats who haven't kept up to date with the latest developments in the computer industry, implementing your policy may seem to mean nothing more complicated than signing a multi-million dollar agreement with Microsoft, but there is a far more efficient use of taxpayers' money.

I would urge the Labor government to look closely at the Open Source movement not only to provide basic operating and application software for school computers, but also to provide access to courseware through an "educational commons." When you do the numbers, you will find that very significant savings can be obtained by substituting perfectly good Open Source equivalents for proprietary and expensive software such as Microsoft Windows Vista and Microsoft Office. I am talking about Linux, Firefox, OpenOffice and countless other cooperatively-developed software products. Not only are the Open Source products cheaper, they are more open and standards-compliant, which means they play nicer with software from other sources. Best of all, they run on less powerful hardware and require fewer hardware upgrades, reducing the amount of money that needs to be spent just to get the infrastructure up and running. Ongoing running costs are also likely to be cheaper. And that's just the hard (dollar) benefits.

An exposure to Open Source software and an "educational commons" will also help to create a new generation of technology- and community-savvy students who can more effectively participate in an increasingly collaborative world. Our educational investment is not in computers, after all, but in intellectual assets - our children. It's the Open Source way that our children need to learn in order to adapt and survive in tomorrow's world, and your government's policies can make that a reality.

I know that these revolutionary approaches and solutions will be bitterly resisted by the technology establishment, but if the election of your government is not a vote for change, what is?

As a voter, taxpayer and concerned citizen, I would like Australia to get the best for its investments in technology and education, and I believe the Open Source way is the best way forward for this country.

Wednesday, November 21, 2007

Actually, SOAP-RPC is not just simply evil but triply evil, and here's why.

The root of the whole evil was the original arrogant assumption behind the notion of distributed objects - that we can make remote objects appear as if they are local. It was only much later that Martin Fowler came out with his First Law of Distributed Objects ("Don't Distribute Your Objects").

Quick question: How can we make a remote object appear to be local?Correct answer: We can't, period.Naive answer: Serialise the remote object, transport it over the wire and deserialise it to create a local copy.

The reason the naive answer is so badly wrong is that copies behave very differently to the original objects. To be precise, when a copy is changed, the original is not changed. So "hiding" the remoteness of an object through some object serialisation sleight-of-hand doesn't work. The component or application that gets a reference to the copy must know that it is a copy, otherwise all kinds of application errors can occur.

So that's why RPC (Remote Procedure Call) is evil. It tries to make remote objects look local. If one is not careful, this can create serious errors in applications.

The other issue is the serialisation mechanism. Even assuming it's OK to serialise an object and recreate it elsewhere from the serialised representation, what is the mechanism used for serialisation? Java serialisation works because we have Java on both ends. In recent times, XML has become popular, and some bright spark must have thought of "XML serialisation". The concept is simple. Convert a Java object to XML format, transport it over the wire to a remote host, then convert the XML document back into a Java object to create a perfect copy, a "remote object". The problem with this assumption is that there is an impedance mismatch between Java and XML. We can do things with Java that we can't do with XML, and vice-versa.

So converting a Java object into XML isn't straightforward. Things get lost in the translation. Converting from XML back into Java isn't straightforward either. More things get lost in the translation. So this "serialisation/deserialisation" using XML as the transport format doesn't result in perfect copies at the other end. This is XML-RPC, and when it's used naively, it's easy to see why it's doubly evil.

As SOAP-RPC evolved beyond XML-RPC, the wire protocol itself began to be seen as a "contract" between systems. In other words, the SOAP message could be used to decouple implementations at either end. All of a sudden, the XML document going over the wire was not expected to be just a Java serialisation mechanism, it was a technology-neutral serialisation mechanism. So now it became fashionable to think of serialising a Java object into an XML document, wrapping it in a SOAP envelope and "exposing" this as a contract to another system. This other system could pull in the SOAP object, extract the XML document from within it, then deserialise it into...a C# object! Not only is the impedance mismatch alive and well here too, there is another subtle assumption made that is violated in the execution. Can you find it?

Let me sum up.

RPC is evil because it makes developers think they can make a remote object appear local.

XML-RPC is doubly evil because it makes developers think (i) they can make a remote object appear local and (ii) it is possible to convert a Java object to XML and back without error or ambiguity.

SOAP-RPC is triply evil because it makes developers think (i) they can make a remote object appear local, (ii) it is possible to convert a Java (or C#) object to XML and back without error or ambiguity and (iii) even though the SOAP message travelling between two systems is now a "contract" to be honoured regardless of changes to implementations, it can still be generated from implementation classes, and implementation classes can be generated from it.

What's the truth, then?

Nothing glamorous or counter-intuitive.

1. A copy is a copy, not the original.2. XML is XML and Java is Java (and C# is C#).3. Contracts are First Class Entities, just like Domain Objects in an OO system.

SOAP-based Web Services technology today is sadder and wiser because it has internalised these truths. Because of Truth 1, we don't pretend anymore that a SOAP message is a Remote Procedure Call. We know it's just a message. Because of Truths 2 and 3, we don't believe anymore that we can generate XML documents from Java/C# objects, or Java/C# objects from XML documents. We can only map data between these two forms. If you build your Web Services abiding by these principles, you can stay out of trouble, otherwise you will be left wondering why your systems are so brittle.

(Actually, there are still people who don't understand these Truths, and who still go on about SOAP-RPC. That's what makes me despair about SOAP's chances for success.)

Tuesday, November 20, 2007

I’ve only had time to skim through the book once, but I think I’ve understood enough to be able to post some initial thoughts on it. As I read it in greater detail, I may refine this post into a proper review.

What I like about the book:

The choice of PHP as an implementation language: This is a refreshing reminder that SOA is about hiding service implementation details from service consumers. The implementation language doesn’t always have to be Java or C#. What matters is what the service consumer sees (i.e., SOAP). There is however, a downside with PHP, which I'll cover in a moment.

The use of Open Source technologies: I believe that Open Source is the way of the future, and the use of Apache/PHP, Tomcat and ActiveBPEL to illustrate SOA concepts feels just right. Besides, readers can readily try out the examples in the book without having to buy expensive commercial software. (An exception is Oracle, which the author discusses in some detail in the context of data-centric services. Perhaps it’s the Oracle bias he has by virtue of having worked extensively with Oracle technology in the past ;-).

Copious examples: This is not a theoretical book, and there is plenty here for the reader to try out for themselves. Every concept that the author deals with is represented in code. [I can’t comment on how complete or correct the examples are and whether they work, because I haven’t tried them out yet.]

Lots of diagrams: Pictures are really worth thousands of words, and there are diagrams sprinkled liberally throughout the book to illustrate almost every concept discussed. I thought they were quite decent.

Emphasis on data: One of my pet peeves with many people’s approach to SOA is their relative neglect of the Data Interchange view of service interactions. SOAP-based web services is about exchanging XML documents that represent some structured data relating to the operation being performed. There is a lot of design that needs to go into these XML documents. Thankfully, the author spends a fair amount of time showing how to design the data payload of messages with XML schema and converting data into XML format (but a nit about that bit later!) I also liked the treatment of the data within the contract (importing the schema file into the WSDL file instead of defining the schema in place).

Introduction to WS-Security and to Qualities of Service (QoS): The book has a section on implementing secure messaging using WS-Security, and also makes the point that virtually all the WS-* specifications use SOAP headers to implement functionality. There’s not too much here, but enough to get the developer to understand the WS-* approach.

View of a process as a service in its turn: One of the value propositions of SOA is its ability to exhibit a “flat” landscape of services, regardless of how they were implemented. From a service consumer’s point of view, it doesn’t matter if a service was purpose-built in a programming language (e.g., PHP) or stitched together out of other services (using BPEL). Services of both types should look the same. The book shows how composite services can also be exposed as services in their turn, with the appropriate WSDL sections highlighted.

WS-BPEL treatment at the right level of detail: I thought the level of discussion and the examples of WS-BPEL were just right for a beginner. There is enough detail to be meaningful, but not so much as to overwhelm.

What I don’t understand or don’t like in the book:

The title: SOA and WS-BPEL are like fruit and oranges, not even apples and oranges. The first is an architectural approach; the second is a language used to implement processes. Considering that the book deals with building standalone web services in the first part, then composing them into processes in the second, perhaps it should have been called “SOAP and WS-BPEL” or“SOA: SOAP and WS-BPEL”.

The unquestioning acceptance of the RPC view of Web Services: I have religious feelings about RPC. It is the devil’s spawn. Many of the REST camp’s arguments about SOAP are actually directed against SOAP-RPC. The modern view of SOAP-based Web Services is based on messaging. Messaging, not RPC.

What’s the difference, and why is this important? RPC is architecturally dishonest. It is impossible to make a remote object behave like a local one, and I don’t mean the effect of network latency. A reference to a local object that is passed to an application carries with it the promise that any change made by the application using the reference will change the object. But with RPC, what is passed to the application is not a reference to the object, but a reference to a copy of the object. This is not an insignificant difference. When the application makes a modification using the reference, the local copy is changed, not the remote object. But the application thinks the actual object has been changed. That’s what is so dishonest about it.

Messaging turns this essentially hopeless exercise around. It makes local objects look remote, by always passing copies around, even if the actual object is accessible by a reference. The application is under no illusion. It knows that in order to make a change to the real object, it is not enough to make a change to the copy. Either the copy must be passed back to be synchronised in some sense with the original, or an independent operation to pass a Data Transfer Object is required. This is architecturally honest and clean. What it may lose in efficiency in some corner cases (local access), it more than regains in terms of robustness, flexibility and scalability.

That’s what SOAP messaging brings to the table. SOAP-RPC is evil and should have nothing to do with a book on modern Web Services. It’s a pity the author actually mentioned RPC by name when introducing SOAP messaging, because the actual examples do not assume RPC.

In this context, the use of PHP has a downside, as I indicated earlier. PHP is heavily tied to HTTP and by extension, to synchronous request/response semantics. SOAP-based Web Services technology does not inherently have this constraint and can work with asynchronous transports as well. The book could have illustrated this effectively using a message queue example. There are tools such as PHPMQ and Mantaray that make this possible, as this example shows.

Automatic data transformation to and from XML: This is another of my pet peeves. The service contract (of which the XML document forms a part) is a First Class Entity. So are the classes that make up the internal Domain Model. How can one ever be generated from the other using a tool? Code generation is an example of tight coupling, and if two First Class Entities are tightly coupled, then at least one of them is not a First Class Entity. QED. I’m not sure if there is an equivalent to TopLink or JiBX in the PHP world, but such a mapping tool is what is required to transform data between the PHP and XML worlds. To be fair, this book is not alone in propagating the code generation approach. Virtually the entire Java Web Services industry is consumed by JAXB disease.

Data-centric Web Services – Actually, I didn’t understand the point of this as a separate topic. It’s a special case of service implementation. In fact, I have a totally different view of Data-Centric Web Services. I call them REST.

No high-level view of process as an aspect of the business: For a book that purports to be on SOA, the treatment of composite services and processes is surprisingly low-level and technology-oriented. There should have been an introduction that focused on business processes and their decomposition into services. In fact, SOA best practice is all about business process modelling and re-engineering. That’s how architects and business analysts determine the services that are required and their granularity. Proceeding bottom-up from services and composing them into processes, as the book seems to suggest is the way to do SOA, is ingenuous.

Anaemic index: The index of the book doesn't list many of the things discussed inside. I tried going back a couple of times to look up something I had seen earlier, but the index was of no help.

Other comments:

ActiveBPEL Designer is not an Open Source product, merely free. This isn’t the fault of the author. It’s just something I’m personally sad about. I haven’t yet found a truly Open Source BPEL designer that is powerful and friendly, and generates full-featured BPEL.

Overall comments:

Actually, notwithstanding the negative comments I made (I'm a nitpicker, as my wife will attest), this is a pretty decent book on Web Services (SOAP and WS-BPEL). It’s got enough low-level detail to help developers get their hands dirty and understand the technology by actually building services and processes. The choice of PHP could turn out to be a masterstroke by reaching beyond Java or C# developers and appealing to the vastly more populous LAMP community. Time will tell. I thought the book was a bit light on architectural insight, but maybe it’s for the best. For a developer audience, such discussion might just cause eyes to glaze over.

It's worth repeating what we said in the Conclusion section of our paper:

"Although it seems presumptuous on our part to claim that we have “solved” the end-to-end integration problem, what is probably true is that recent paradigms and technology breakthroughs have brought the SOFEA model closer to conceptualisation, and someone or the other was bound to suggest it. It happened to be us."

Well, clearly not just us. Many others are saying the same thing.

Matt Raible wondered if there was any difference between the two frameworks (SOFEA and SOUI). SOUI proponents Nolan and Jeff haven't just proposed a theoretical architecture, they've actually created a set of tools to help developers build to this model, and what's more, this set of tools is available for Java, PHP, Ruby and .NET. It's called Appcelerator, and it's a pretty impressive piece of work.

Examining Appcelerator (and it would be a fair assumption that this is what Nolan and Jeff intend SOUI to look like), it appears to satisfy all the conditions that we proposed for SOFEA, except one.

SOFEA emphasises the use of XML for Data Interchange, because one of its guiding principles is to mesh seamlessly with the service tier. Services may be built using either the REST paradigm or the SOAP messaging paradigm, but XML plays a big role in either one. SOAP requires, and REST recommends, the use of XML for Data Interchange. XML has the essential characteristic of being able to enforce data integrity (i.e., conformance to specified data structures, data types and data constraints). That's why we make a big deal about XML support in SOFEA. We considered but rejected JSON because it's only slightly better than raw HTML-over-HTTP in these respects.

Appcelerator uses JSON, not XML. So that seems to be the big difference between the two models. SOFEA requires the use of XML for Data Interchange. SOUI prefers the more lightweight JSON.

I guess Nolan and Jeff have made a valid architectural choice favouring ease-of-use over data integrity, but better XML tooling may blunt that advantage in future.

Incidentally, Ye Zhang made a comment about SOFEA on Matt Raible's blog to the effect that we seemed to be chasing buzzwords and hence tacked on the "Service-Oriented" prefix to our model. Not true. Service-Orientation is at the heart of our model, because we approached the Presentation Tier from that angle. Our emphasis on XML as the Data Interchange mechanism for SOFEA was designed to enable the Presentation Tier to interoperate with the Service Tier with no impedance mismatch at all.

At any rate, the software industry seems to be at the start of a new era, and I'm happy we were able to make a contribution to the debate.

"To Analyse, Understand and Explain"

This site presents the technology-related opinions (wise or otherwise) of Ganesh Prasad - software architect, Java devotee and Open Source aficionado. (For my views on other topics, see this blog instead.)Disclaimer: Though I bear the name of the Hindu god of wisdom, Lord Ganesh, such cosmic wisdom is not always guaranteed to be transmitted through my writings. Reader beware!