Mark Pilgrim tells us why Protocol Buffers are so nice. Notice, though, that everything he writes focuses entirely on their form and structure as messages. If you focus only on that perspective, then sure, they’re better than what many could come up with if they were rolling their own. In fact, if Google had stopped there, I think Protocol Buffers could be a superb little package.

The Protocol Buffer library does not include an RPC implementation. However, it includes all of the tools you need to hook up a generated service class to any arbitrary RPC implementation of your choice. You need only provide implementations of RpcChannel and RpcController.

Why ruin a perfectly good messaging format by throwing this RPC junk into the package? What if I want to send these messages via some other means, such as message queuing, for example? Do I have to pay for this RPC code if I don’t need it? If my messages don’t include service definitions, do I avoid all that RPC machinery?

In my previous post I talked about the message tunneling problem, where data that don’t fit the distributed type system are forced through the system by packing them into a type such as string or sequence of octets. Since Protocol Buffers require you to “hook up a generated service class to any arbitrary RPC implementation of your choice,” it’s likely that you’re going to run into this tunneling problem. For example, if you want to send this stuff over IIOP, you’re likely going to send the marshaled protobufs as Common Data Representation (CDR) sequences of octet. You’re thus unavoidably paying for marshaling twice: once at the protobuf level for the protobuf itself, and then again at the CDR level to marshal the sequence of octet containing the protobuf. Any worthwhile IIOP/CDR implementation will be very fast at marshaling sequences of octet, but still, overhead is overhead.
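
The double payment is easy to see in a toy sketch (plain Python; `marshal_message` is a stand-in for protobuf serialization, and the CDR function models only the sequence-of-octet framing, not real CDR):

```python
import struct

def marshal_message(fields):
    # Stand-in for protobuf serialization: length-prefix each field.
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)

def marshal_cdr_octet_sequence(blob):
    # CDR sequence<octet>: an element count followed by the raw bytes.
    # The already-marshaled message gets marshaled a second time here.
    return struct.pack(">I", len(blob)) + blob

protobuf_bytes = marshal_message([b"hello", b"world"])   # first marshal
iiop_payload = marshal_cdr_octet_sequence(protobuf_bytes)  # second marshal
```

Even in this toy version, the inner payload is traversed and copied twice before it ever reaches the wire.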

But there are other problems too. What about errors? If something goes wrong with the RPC call, how do I figure that out? The answer appears to be that you call the RpcController to see if there was a failure, and if so, call it again to get a string indicating what the failure was. A string? This implies that I not only have to write code to convert exceptions or status codes from the underlying RPC implementation into strings, but also write code to convert them back again into some form of exception, assuming my RPC-calling code wants to throw exceptions to indicate problems to the code that calls it.
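
In code, the round trip looks something like this (a Python sketch; `Failed()` and `ErrorText()` mirror the methods the RpcController interface exposes, while `RpcError` and `FakeController` are hypothetical):

```python
class RpcError(Exception):
    """Reconstructed exception type -- the RPC layer only gave us a string."""

def raise_if_failed(controller):
    # The controller reports failure as a bare string; the caller has to
    # turn that back into a structured exception by hand.
    if controller.Failed():
        raise RpcError(controller.ErrorText())

class FakeController:
    # Minimal stand-in implementing just the two methods we use here.
    def Failed(self):
        return True

    def ErrorText(self):
        return "connection reset by peer"
```

Whatever structure the underlying failure had (status code, exception class, retry hints) has been flattened into that one string and must be parsed back out, if it survives at all.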

What about idempotency? If something goes wrong, how do I know how far the call got? Did it fail before it ever got out of my process, or off my host? Did it make it to the remote host? Did it make it into the remote process, but failed before it reached the service implementation? Or did it fail sometime after the service processed it, as the response made its way back to me? If the call I’m making is not idempotent, and I want to try it again if I hit a failure, then I absolutely need to know this sort of information. Unfortunately, Protocol Buffers supplies nothing whatsoever to help with this problem, instead apparently punting to the underlying RPC implementation.

Still more problems: the RpcController offers methods for canceling remote calls. What if the underlying RPC package doesn’t support this? Over the years I’ve seen many that don’t. Note that this capability impacts the idempotency problem as well.

Another question: what about service references? As far as I can see, the protobuf language doesn’t support such things. How can one service return a message that contains a reference to another service? I suspect the answer is, once again, data tunneling — you would encode your service reference using a form supported by the underlying RPC implementation, and then pass that back as a string or sequence of bytes. For example, if you were using CORBA underneath, you might represent the other service using a stringified object reference and return that as a string. Weak.

All in all, the Protocol Buffers service abstraction is very leaky. It doesn’t give us exceptions or any way of dealing with failure except a human-readable string. It doesn’t give us service references, so we have no way to let one service refer to another within a protobuf message. We are thus forced to work simultaneously at both the Protocol Buffers level and the underlying RPC implementation level if we have any hope of dealing with these very-real-world issues.

My advice to Google, then, is to just drop all the service and RPC stuff. Seriously. It causes way more problems than it’s worth, it sends people down a fundamentally flawed distributed computing path, and it takes away from what is otherwise a nice message format and structure. If Google can’t or won’t drop it, then they should either remove focus from this aspect by relegating this stuff to an appendix in the documentation, or if they choose to keep it all at the current level of focus, then they should clarify all the questions of the sort I’ve raised here, potentially modifying their APIs to address the issues.

Responses

This “RPC junk” powers most of Google’s internal and external properties, including every Google search you type every day.

There is room for both RPC and REST-based communications. Your non-stop bashing of RPC despite its obvious success in the connected (Internet) and local (COM) worlds undermines your credibility quite a bit.

@Dean: please explain to me how a package that supplies no RPC implementation of its own powers all of my Google searches. While you’re at it, perhaps you could answer all the questions I raised in this blog posting regarding the very real problems the PB RPC abstraction introduces.

Regarding credibility, are you trying to say that RPC does not have the fundamental flaws that many have detailed over the years? If so, please explain how those flaws don’t actually exist. If however you do agree that the flaws are there, then why do you have a problem with me pointing them out?

These are technical issues, Dean, not personal ones relating to popularity contests. I’m constantly looking for better technical ways to get the job done. Do you still listen to shellac 78 RPM records and get your news from the telegraph? Do you grow your own food planted with the help of a blade pulled by a farm animal? I mean, they all work too, right? Many a system has been built with RPC, that’s true — I’ve helped build quite a few myself — but that certainly doesn’t mean there aren’t better ways of doing things, with REST being only one of a number of such alternatives.

I’m curious about something. This “fundamentally flawed distributed computing path” seems to have adherents at Google, Facebook, Yahoo and Microsoft who have all built massive distributed computing systems based on it. Despite this you believe that they should “just drop” this approach which has successfully scaled to systems with tens of thousands of machines spread across the world.

My question is this: what would it take to convince you that, at least for systems of this size, an approach based on RPC and binary protocols works well?

@ade: for starters, how about answers to the questions I raised in this blog entry?

I have no issues with binary protocols whatsoever. That much should be obvious. But let’s be really clear: “RPC” stands for “remote procedure call,” and it’s a method of hiding network operations behind the abstraction of the local programming language procedure call. That is its proper definition, nothing more, nothing less. So, given this definition, are you saying that systems of this size need RPC? If so, I would really love to hear your explanation of exactly why that’s the case.

I think I understand. You’re using a definition of RPC that the people behind Thrift, Protocol Buffers, etc. do not subscribe to. Your description sounds like what we’d call a bad implementation of RPC.

So for argument’s sake let’s say that these large-scale systems do something we’ll call Network Oriented Programming and use PB/Thrift serialization/Hadoop’s Writables as the binary format.

One of the big requirements for NOP is extremely low latency and highly efficient use of bandwidth, CPU, and memory.

Message queues aren’t used because of the need for low latency. As much as I like AMQP et al they don’t solve our problems at a reasonable cost.

We can’t use HTTP mainly because when you have this many requests per second, the overhead of a text format like HTTP (in terms of bandwidth) translates into significant amounts of money. Compressed HTTP also doesn’t work for these scenarios because it would increase latency and CPU usage to unacceptable levels.
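
For what it’s worth, the per-request overhead argument can be illustrated with a toy comparison (both encodings below are made up for illustration; the binary frame is not Google’s, Thrift’s, or anyone’s actual wire format):

```python
# A minimal HTTP/1.1 request carrying one integer lookup key...
http_request = (
    b"GET /lookup?id=12345 HTTP/1.1\r\n"
    b"Host: backend.internal\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)

# ...versus a hypothetical fixed binary frame: a 4-byte method id plus
# the 4-byte key. At millions of requests per second, the difference in
# bytes on the wire (and in parsing cost) adds up.
binary_request = (1).to_bytes(4, "big") + (12345).to_bytes(4, "big")
```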

The way they use the Work class makes it fairly clear that this is not an ordinary method call. Note the use of InvalidOperation to indicate remote failure. Just as RFC 707 says, the programmer still has to have enough understanding of the system’s context to know what’s a local call and what’s remote, as well as the relative costs. All the system designer can do is make it obvious.

A lot of the questions you raise are meant to be answered by whoever implements the network transport layer in your distributed system. We haven’t open sourced our implementation, so I can’t talk about it. However, I can highly recommend looking at Thrift or even the RabbitMQ implementation of AMQP to see the sorts of solutions that exist for these problems. Solutions do exist; it’s just that they’re not in common usage.

@ade: I think if you were to search through this site, you’d find you’re preaching to the choir. (I used to work on AMQP, for example.) The approach you’re describing sounds sane enough, and I know well of its place in the distributed computing spectrum. (Anyone who thinks that I think RESTful HTTP is the answer to everything hasn’t done their homework on me, not even close.)

But, given that you don’t like “bad implementations of RPC,” then why did you put one in Protocol Buffers? What you put there is in fact worse than a bad implementation of RPC, as it doesn’t give enough of an API to do the job correctly, not even close, as I described. I still think you’d have been much better off leaving that part out of the package.

By the way, regarding RFC 707: I remember you saying that the RFC itself lays down the problems with RPC. I got time this weekend to sit down and read it (I hadn’t expected it to be so short, so I had been putting it off until “later”), and the only thing I could find was it saying that network calls are costly and thus the programmer should know about this.

Can you clarify where I can find the RPC implementation details mentioned in the RFC?

The Protocol Buffer RPC support is, in fact, used for practically all inter-machine communication at Google. We were unable to make our RPC system part of this release, but we have one, and it works on top of exactly the RPC stubs that are in this release. We certainly are not going to remove this support. We rely on it.

If you don’t want to use the RPC stuff that is in the release, don’t. The library contains only a couple of abstract interfaces, so there’s virtually zero overhead from having this support in the package.

Regarding exceptions: You aren’t supposed to encode exception information into the error string. If you need to return structured errors, then the right way to do it is to make your response type be able to represent that information. For example, your response type could be:
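
A sketch of what such a response type might look like in the protobuf language (message and field names here are hypothetical):

```proto
message LookupResponse {
  // At most one of these is set, depending on the outcome.
  optional LookupResult result = 1;

  // Structured error information, instead of a bare error string.
  optional ErrorDetails error = 2;
}

message LookupResult {
  optional string value = 1;
}

message ErrorDetails {
  optional int32 code = 1;
  optional string description = 2;
}
```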

We felt that supporting exceptions explicitly would add too much complication with little real gain.

Regarding idempotency: I don’t see what the RPC system can be expected to do here. The application is always going to have to be prepared for the case where the server machine dies in the middle of processing the request, in which case if the request is not idempotent at the application level then there’s not much you can do.

Regarding cancellation: This can be “correctly” supported purely on the client side by just calling the “done” callback immediately, without trying to inform the server. Informing the server of the cancellation is just an optimization that might allow the server to avoid some work. Idempotency issues are, again, up to the application.
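
A minimal sketch of that purely client-side cancellation, assuming a controller that holds the RPC’s done callback (Python; the class and its wiring are illustrative, not the actual protobuf API):

```python
class ClientSideCancelController:
    """Cancel purely on the client: fire the RPC's done callback at once
    and mark the call canceled, without telling the server anything."""

    def __init__(self, done_callback):
        self._done = done_callback
        self._canceled = False
        self._completed = False

    def StartCancel(self):
        if not self._completed:
            self._canceled = True
            self._completed = True
            self._done()  # the caller sees the call "finish" immediately

    def IsCanceled(self):
        return self._canceled
```

Note that the server may still be doing the work; the client has merely stopped waiting for it, which is exactly why the idempotency question doesn’t go away.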

I certainly agree that RPC is not the right solution for every network communications problem, but for many problems the simple solution works fine.

@anonymous: I believe I said RFC 707 lays down some of the problems with RPC, not all of them. It took years to figure them all out. See section 4c1 of the RFC. For implementation notes, see the appendices in the RFC, or better yet, see the classic Birrell and Nelson paper “Implementing Remote Procedure Calls.”

Regarding idempotency, I don’t see how you can ignore it; the fact that you do makes for even tighter coupling between caller and service, because whether a given operation is idempotent or not becomes a hidden bit of knowledge shared between them. You might want to go back to the late 80s / early 90s and look at the Apollo Network Computing System (NCS) and the OSF Distributed Computing Environment (DCE), for example — unless my memory is totally shot, both supported “idempotent” attributes you could attach to operation declarations within the IDL. We tried to get something similar into CORBA way back then in 1991-1992 or so, but we were unsuccessful; however, when system errors occur in CORBA, the ORB at least returns information in the system exception indicating whether the operation was actually invoked or not, thus helping the client decide whether or not the call can be retried. Some sort of support for dealing with idempotency issues, perhaps along these lines or perhaps in some other way, is absolutely critical IMO.
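
The NCS/DCE-style approach amounts to attaching an “is it safe to retry?” bit to the operation’s declaration and letting the runtime consult it. A minimal Python sketch of the principle (the decorator and retry loop are hypothetical, not NCS or DCE APIs):

```python
def idempotent(op):
    # Declaration-level marker, analogous to an [idempotent]
    # attribute on an operation in the IDL.
    op.idempotent = True
    return op

def invoke(op, *args, retries=2):
    # The runtime may only retry automatically when the declaration
    # says replaying the operation causes no additional side effects.
    for attempt in range(retries + 1):
        try:
            return op(*args)
        except ConnectionError:
            if not getattr(op, "idempotent", False) or attempt == retries:
                raise

attempts = []

@idempotent
def read_record(key):
    attempts.append(key)
    if len(attempts) == 1:
        raise ConnectionError("lost reply; safe to retry")
    return {"key": key}
```

Without that declared bit, the retry decision becomes hidden shared knowledge between caller and service, which is exactly the coupling problem.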

I was on the CORBA technical committee (representative from Taligent) back in those days of yore. Per my recollection, we didn’t include idempotence specifications because doing so essentially requires that you include distributed transactions to implement it in any useful way, and that was just not an appropriate feature to mandate in a messaging protocol.

(Yes, you can just have an “idempotent” specification that means ‘Pound the server until you get a reply,’ but that’s a trivial case. Very few real-life requests are truly idempotent in that sense; a partial failure is nearly always a possibility.)

Protocol Buffers strike me as far too low-level a mechanism to have direct support for transactional integrity. It’s the serialization and message-sending component; more sophisticated message and request-response semantics can be built on top of it.

I believe you are looking at serialization and network-layer code, not seeing an ORB, and coming away disappointed.

The first part of my comment is more appropriate for Steve’s previous post, but anyway, this issue was already touched on here.

In his post about PB, Ted Neward wrote: “Protocol Buffers, as with any binary protocol format and/or RPC mechanism (and I’m not going to go there; the weaknesses of RPC are another debate for another day), are great for those situations where performance is critical and both ends of the system are well-known and controlled.” http://blogs.tedneward.com/2008/07/11/So+You+Say+You+Want+To+Kill+XML.aspx

Dare Obasanjo made a similar point: “It is all about coupling and how much control you have over the distributed end points. On the Web where you have little to no control over who talks to your servers or what technology they use, you want to utilize flexible technologies that make no assumptions about either end of the communication. This is where RESTful XML-based Web services shine. However when you have tight control over the service end points (e.g. if they are all your servers running in your data center) then you can use more optimized communications technologies that add a layer of tight coupling to your system.” http://www.25hoursaday.com/weblog/2008/07/10/TheRevengeOfRPCGoogleProtocolBuffersAndFacebookThrift.aspx

I think this pretty well addresses the RPC vs. REST arguments. So, if I want to implement a simple client-server communication between internal components of my system, I’d go with “flawed but convenient and fast” RPC with a binary protocol. Just as nobody in their right mind would consider using RESTful HTTP inside Hadoop, for example.

Steve, I do my homework; I just wanted to make it clear for other readers that all technologies have their place, because your posts can be misunderstood as “forget about RPC, use REST everywhere.”

Concerning the use of PB for RPC, I agree that it’s a bad idea. It misses the fundamental features, already mentioned, of any mature middleware (ZeroC Ice is a good example here, and it supports far more languages, though I wish it were not under the GPL).

@Christophe: I disagree on multiple levels. First, the idempotent attribute was implemented properly in NCS, and I believe in DCE as well, and it’s not nearly as complicated as you seem to make it. It simply marks a remote call as one that can be repeated without fear of additional side effects. HTTP GET is idempotent, for example, and there are certainly no distributed transactions required to make it work.

I’m talking about fundamental networked computing requirements — idempotency is one, and what about timeouts? PB offers nothing for either, and by failing to do so it clearly falls into the convenience over correctness realm.

As for your ORB comment, I don’t see where that’s coming from. If you’ve read anything I’ve written over the past few years, I think you’d realize your comment misses the mark.

@Oleg: thanks for doing your homework. :-) Unfortunately I can’t summarize the entire body of all my publications and work from the past 15 years in each one of my blog postings for readers who happen by and are unfamiliar with it. My recent columns about RPC, REST, and different programming languages are intended to get people to step back and look at all the issues, rather than just think, “Oh, I’m using Java so I have to do things this way” or, “Everyone knows that HTTP couldn’t possibly work under these circumstances.” Canned solutions and idioms implicitly introduce a whole set of trade-offs, and developers often choose them without understanding those trade-offs. Challenging conventional thought, thinking hard about the problems, understanding and thinking hard about the trade-offs, and being willing to step outside the popular but narrow range of solutions that many typically limit themselves to can yield significant gains in the capabilities and usefulness of the systems we write as well as in developer productivity.

@steve: I know what you mean. Many people blindly use technologies without knowing previous lessons and trade-offs, relying instead on industry leaders, authoritative sources, or standards bodies. Sometimes they even neglect their own practical experience.
Thanks for writing about REST in your columns. Actually I’m about to use it in one of my projects which needs a flexible way for exposing services to clients.

Telling people to read up on everything you wrote these past years is not a very constructive way to participate in a debate.

Kenton’s answer is filled with practical considerations that show that he and his team have been thinking hard about the problem, and all you can answer is that idempotency is easy because [insert definition of idempotency]? That’s not very convincing.

Yes, RPC is hard, but the overall point is that there are proven and documented ways to overcome these difficulties. It’s sad that your contribution to the debate is just to tell Google and everybody else using it successfully to just “drop it because it doesn’t work”.

Also, as others have pointed out, I’m puzzled by your definition of RPC, which seems to be different from everybody else’s. I’m pretty sure everyone here has read the “Note on Distributed Computing” white paper, so why did you decide to pick your definition of RPC to be exactly what the paper says should be avoided at all costs?


@Dean: Are you just trolling, or are you actually here to contribute something meaningful? Lessee, first you tell me that I better stop saying bad things about RPC or else, and then you get on my case for raising idempotency as an issue. I also noticed that you simply dodged the questions I put to you. And now you have the gall to tell me that I’m not being constructive, and what’s worse, telling me that based on something I didn’t even say?

I said that it’s impossible for me to summarize 15 years worth of publications in every blog posting — what’s wrong with that? It’s a fact. Did I tell you that you had to go read everything I’ve written? If so, can you please provide the direct quote?

The definition of RPC that I use is the definition of RPC. RFC 707 defined it, right? That’s the definition I use. What other definition of RPC is there? Please cite the “official” definition so we can all learn.

And yes, Kenton provided some thoughtful answers. Did you miss this part of my response to him?

@Kenton: thanks for the answers, I appreciate it.

Should I have written some sort of long-winded tribute to Kenton instead for taking the time to answer?

And yes, toolkits from well in the past provided a solution for idempotency, but I’m wrong to point that out?

How about this for an idea, Dean: most of us are here to learn, including me. If you’re here to learn too, then how about contributing something meaningful, instead of preaching, scolding and generally being miserable?

@Tony: yep, that’s exactly what I was saying. The RPC parts they included in the PB package seem less than useful to me, for the reasons I’ve explained, and so I feel that if they had just kept those parts to themselves, just as they kept the whole RPC implementation to themselves, the package would have been better.

The Google folk should bear in mind that the rest of us have zero visibility into how they run this behind the scenes. We know none of the details of their RPC implementation or of their operational environment, and as Ade hinted above, they’re not going to give them to us. We are therefore naturally going to ask questions that neither their docs nor their code answer — I mean, if they don’t like us asking questions or giving our opinion, they shouldn’t have released PB in the first place! How do we know they don’t use extra APIs or interfaces or whatnot to address the issues I’ve raised with respect to idempotency, cancellations, and other issues like timeouts? Kenton says their implementation works with exactly the interfaces they provided, but as I’ve already explained, I can’t see how their system can be reliable and accurate in the face of network and system failures if that interface is all they use.

I certainly agree that RPC is not the right solution for every network communications problem, but for many problems the simple solution works fine.

I don’t get you there, Kenton. Are you saying that the RPC solution is wrong but it’s fine because it’s simple and works (that you are choosing convenience)? There is nothing wrong with that, but just clear it up …

I can’t see how their system can be reliable and accurate in the face of network and system failures

Exactly. Not everyone has the kind of money or the kind of hardware that Google has. In some sense, I can imagine that Google can afford to scale by just throwing machines at the problem, as the cost would always seem trivial compared to what they do daily.

It would be really weird, imho, for an SME to do what Google is doing.

Ade:

One of the big requirements for NOP is extremely low latency and highly efficient use of bandwidth, CPU, and memory.

Exactly. If the requirement is low latency and highly efficient use of bandwidth, then don’t use HTTP. Fine by me, and I don’t think even Steve would disagree with this line (he might disagree with finally choosing RPC, though).

But get your requirements in order and then design. That’s what Steve has been pushing for so long: understand your requirements, understand the tradeoffs inherent in a particular style, and then design.

Telling people to read up on everything you wrote these past years is not a very constructive way to participate in a debate.

wow … you just changed my belief system. I thought quoting references and telling people to read some stuff for background was nice …

Yes, RPC is hard

OK, you Googlers really need to sort this out … is it hard or is it simple? Maybe you and Kenton need to sit down together … we can’t have people preaching two different things, or maybe that’s how it works?

It’s sad that your contribution to the debate is just to tell Google and everybody else using it successfully to just “drop it because it doesn’t work”.

that’s not sad; your inability to read this post, or the one before, or the various columns he has written, or heck, even the about page: that’s what’s sad.

I’m puzzled by your definition of RPC, which seems to be different from everybody else’s.

we at Google have our own g-definitions for the g-network g-protocols and g-religiously g-follow g-them.

who cares about what everybody else thinks? Our “company of geniuses” will tell you what the definition is (I kid you not, I have videos of Googlers talking about themselves like that).

pick your definition of RPC

You see, that’s where you are wrong: NO ONE can PICK definitions. A definition is a definition is a definition. You can’t just pick and choose; he used the definitive RFC. Note: the DEFINITIVE RFC.

And the saddest thing is, NOT IN A SINGLE PLACE do you ever mention ANYTHING technical or useful … not a single point is in any way technical. What are you, from marketing? We actually like Google around here, but you managed to mess even that up!!

I’m no expert on networking theory. All I can really say is that Protocol Buffers and Protocol Buffer RPC have worked great inside Google. Our intent in releasing this is not to suggest that Protocol Buffers are the best solution for everyone; simply that they have worked well for us and that we’d like to share that success if possible.

I’m still unconvinced that it’s useful for the RPC system itself to know if a call is idempotent. In my experience, every app wants to do something a little different when retrying RPCs. I don’t see a lot of value in letting the RPC layer contain a built-in retry mechanism; inside Google, at least, most people would not use it. Of course, I can see how this is arguable.

It’s probably worth noting that a major part of the Protocol Buffers design philosophy is simplicity. We intentionally avoid adding features which we think will only be useful to a small minority of users, because each added feature increases the maintenance burden and makes it that much harder to provide a high-quality implementation, and also steepens the learning curve. There are obviously many arguments for and against this philosophy, but it has worked well for us.

FWIW, Dean Ness does not appear to be a Google employee, unless that’s not his real name.

soabloke asked the question central to many of the arguments that have been coming up: “why do you want yet another binary encoding format?” OK, I couldn’t help myself, took the bait, and responded on his blog. I thought I’d bring a copy back here because there’s more discussion going on here.

To a large extent this conversation spanning these blogs is related to enterprise systems; however, the requirement for communications is much wider than that. There’s a huge gap in the tools and technology to deal with the emerging smart devices. Look at the ZigBee technology stack, HomePlug, or other emerging transports. They need binary communication protocols, and the best we’ve got is a handful of ASN.1, XDR, etc.

Obviously, if Google is developing this solution, they’re not happy with all the current solutions out there now. All these so-called experts have decided that what is available solves all the problems. It doesn’t. More research and experimentation is required. The best way to do this is to release things like Protocol Buffers and let the part of the world that sees a need for them innovate and improve on them.

And yes, if you look at the link I posted, you’ll see that I’m another developer of alternative binary encoding formats. Mine does a job that Protocol Buffers, ASN.1, XDR, etc. don’t do, which is why I’m interested in researching it. The fact is that many of the concepts of loose coupling found in XML can be applied to binary using the right encoding formats. This is the area my format, Argot, is exploring.

We will need more encoding formats until we have a set that is as flexible and as well endorsed as TCP/IP is for transport. We are nowhere near that point now. We don’t just need one more encoding format; we need a whole lot more encoding formats. We need to let them breed and watch them evolve into the fittest for the job. Right now, all I see is everyone trying to shoot down anything new that appears, calling it “not invented here” syndrome or some such. The fact is, there are so few solutions in this space that businesses need to develop these new protocols to fill the huge gaps that exist.

Posted using all the preview love from Steve :)
omg, what’s with the yellow background?!

I’m no expert on networking theory. All I can really say is that Protocol Buffers and Protocol Buffer RPC have worked great inside Google.

fair enough.

Our intent in releasing this is not to suggest that Protocol Buffers are the best solution for everyone; simply that they have worked well for us and that we’d like to share that success if possible.

Aah … therein lies the problem: the g-worshippers will g-shout at anyone who has a brain and refuses to use this.

I’m still unconvinced that it’s useful for the RPC system itself to know if a call is idempotent.

My view is that it is absolutely fantastic to have something like GET (it’s safe; idempotency is an additional nice thing to have, but GET is really fantastic). As Don Box said:

GET is one of the most optimized pieces of distributed systems plumbing in the world

Isn’t it awesome being able to keep clicking refresh until the damn thing works? (You won’t realise its awesomeness if you never had dialup.)

Additionally, knowing which methods are idempotent is a nice thing to have; it makes the coding, imho, much easier. Maybe MOST of your methods just are not idempotent (like in the REAL world it’s just GET / POST), but it’s a nice feature to have, esp. GET. imho, 2c.
(Obviously you don’t agree.)

inside Google, at least, most people would not use it.

1. You are ONE employee at Google. You don’t matter in their plans for world domination; you might as well go home now.
2. Given that you agreed you are not an expert in networking theory, I don’t think you should be making such a statement (see pt. 1 for why).

BTW, you still didn’t answer my question on whether RPC is just you choosing convenience even though you know it’s fundamentally flawed! I am just curious. (I don’t agree with Steve that RPC is fundamentally flawed, btw.)

FWIW, Dean Ness does not appear to be a Google employee, unless that’s not his real name.

That was worth a lot. Now we can go back to God-oogle worshipping!! Ohh, thank you … :P

Note that protobuf RPC does not require that an RPC system implement *only* the methods listed in the RpcController interface. It simply requires that the system implement *at least* those methods. A specific RpcController implementation may very well provide a SetTimeout() method — Google’s internal implementation does exactly this. The set of methods of the RpcController base interface was chosen to be the minimal set that all RPC systems can reasonably be expected to implement (including the poor-man’s implementation of cancellation that I described above). Note in particular that it is possible to implement an RpcController which works correctly when making direct, local calls, without any stub in between the client and server.
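
In other words, the base interface is a floor, not a ceiling. A Python sketch of the idea (the method set mirrors part of the RpcController interface, but SetTimeout’s signature here is a guess, not Google’s actual API):

```python
class RpcController:
    # The minimal base interface every RPC system is expected to provide.
    def Reset(self): raise NotImplementedError
    def Failed(self): raise NotImplementedError
    def ErrorText(self): raise NotImplementedError
    def StartCancel(self): raise NotImplementedError

class MyRpcController(RpcController):
    # A concrete implementation is free to layer richer methods on top.
    def __init__(self):
        self._failed = False
        self._timeout_ms = None

    def Reset(self):
        self.__init__()

    def Failed(self):
        return self._failed

    def ErrorText(self):
        return ""

    def StartCancel(self):
        pass  # poor-man's cancellation could go here

    def SetTimeout(self, milliseconds):
        # Extension beyond the base interface, hypothetical signature.
        self._timeout_ms = milliseconds
```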

I think you and Dean are talking past one another. As far as I can make out, the disagreement seems to be about the definition of RPC.

You think of RPC as a mechanism that makes function calls across the network possible, transparently. So transparently that the programmer should not have to know whether an object is local or remote. As you rightly point out, that approach has some severe shortcomings, and two better ways of doing the same thing, which you have written about previously, are (1) messages, Erlang style, and (2) REST.

The kind of “strong” RPC where the “application is built using objects all the way down, in a proper object-oriented fashion,” and “the right ‘fault points’ at which to insert process or machine boundaries will emerge naturally” has been quite comprehensively discredited already.

Let’s assume a system is being designed with all the issues you have pointed out – errors, failures, idempotency, and others. There will be some interactions in the system that will be complex, but the vast majority will be simple request-response. These request-response interactions may be simple, but they still need to be implemented. Writing all the messaging plumbing needed for even the simple cases from scratch every time is tiresome (and error-prone). In my opinion, Thrift and the “RPC” extensions of protocol buffers are best thought of as an easy way to implement these request-response servers and clients.

This is a different definition of RPC than what you’ve been criticizing. It is intentionally leaky. Competent developers will consider it an easy way to send messages between servers, nothing more (and nothing less!). In fact, I would think that even in an environment where you had well implemented messaging primitives available, you may still want an easy way to write request-response patterns without having the full generality of messages.

Now, you may argue that it is still undesirable for these conveniences to exist because they lead developers down the wrong path. Maybe you are right about that. I think that we are still learning how to do distributed computing effectively, and primitives like these are useful. As comments from the Google folk demonstrate, these hooks are clearly not a complete waste…

@Kenton: thanks again for the comments and information. I do appreciate the contributions that you and Ade have made here; they’ve been very helpful.

As I explained earlier, I figured you were using something more than just the PB RpcController interface. Thanks for verifying that. Perhaps your documentation should mention that?

Back to your earlier comment regarding idempotency: having an indication of such isn’t to let the system know so that it can do built-in auto-retries and stuff like that; rather, it’s to let the clients know that they can safely do retries without side effects. Like anonymous said, this is a huge win.

@David: you say we need more encoding formats. Note that that’s one of the nice things about REST over most RPC packages: the media type is indicated in a header, allowing for content negotiation and allowing virtually any format understood by client and server to be exchanged. RPC systems, OTOH, tend to hard-wire the format, take it or leave it, because RPC is primarily geared toward getting data in and out of native programming language types. The types are primarily what matters to the application, not the wire format.
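For example, picking a representation from the Accept header can be as simple as this hand-rolled sketch (the media types and the JSON fallback are arbitrary choices for illustration):

```python
import json

def negotiate(accept_header, order):
    """Choose a representation for `order` based on the Accept header.

    Returns (content_type, body); falls back to JSON when nothing
    recognized is requested.
    """
    # Strip quality parameters like ";q=0.9" and whitespace.
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "application/xml" in accepted:
        items = "".join(f"<{k}>{v}</{k}>" for k, v in order.items())
        return "application/xml", f"<order>{items}</order>"
    if "text/plain" in accepted:
        return "text/plain", "\n".join(f"{k}: {v}" for k, v in order.items())
    return "application/json", json.dumps(order)
```

The point is that the wire format is negotiated per request, not hard-wired into the stubs, so client and server can exchange any format both understand.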

@anonymous: I really don’t see how anyone can argue that RPC isn’t fundamentally flawed, as I explained to Steve Jones last time. It’s fundamentally about treating network operations as if they were local operations, which as we all know is impossible. If that’s not a fundamental flaw, I’m not sure what is. (Oh, and the yellow preview background is to make sure you see the preview, but I’m also experimenting with a shade of green to match the rest of the blog.)

I understand what you’re saying. Again, the purpose of my recent publications has primarily been to focus attention on the trade-offs involved — that simplicity and ease is not free, after all — and bring some alternatives to light. The Google guys have posted useful comments here that explain their thinking and the trade-offs they’ve chosen, which I appreciate and I’m sure anyone reading here appreciates. I think all this matches what you’re saying too.

One specific comment: you said:

Writing all the messaging plumbing needed for even the simple cases from scratch every time is tiresome (and error prone). In my opinion, Thrift, and the “RPC” extensions of protocol buffers are best thought of as an easy way to implement these request-response servers and clients.

This kind of brings this whole conversation full circle, since I’ll now remind you that it all depends on the language you choose. For some languages, the plumbing is already there, so there’s really nothing tiresome or error-prone to worry about. By choosing the right language, you can get both convenience and correctness. And that’s really the key: when you choose a language such as the general-purpose C++ and Java languages, which really aren’t well suited for distributed computing compared to some other languages, you have to try to fill in the voids with generated code and wrappers and proxies and stubs and frameworks and so on. People get really caught up in these general-purpose languages and really don’t even think twice about alternatives, which is why my past four columns discuss not only REST and RPC but multilingual programming as well.

Some people come here and get pissed off at me because they see me “bashing RPC” or “pushing REST as the answer to everything.” If that’s what they see, then I would argue they’re not really reading what I’m writing. I’m really just trying to put some non-traditional alternatives out there and get people to think hard about their choices and whether those alternatives might work better for them. After all, they certainly work better for me, and while I understand they won’t work for everyone, they might work for others too, as I doubt my circumstances are all that unusual.

@David: but can you explain why, when we’re developing these new protocols, we need to forget fairly basic requirements for network programming such as dealing with timeouts, idempotency, etc.?

I think it’s not forgetting or ignoring these requirements, but rather choosing not to include them. Timeouts can be dealt with at a higher level in your code. Although, this is where I think we need to understand when RPC becomes synchronous request-response. If I take timeouts out of the RPC mechanism and put them in my application, am I a step closer to synchronous request-response?
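Dealing with the timeout at that higher level might look like this sketch (the blocking `call` here is a stand-in for whatever transport sits underneath):

```python
import concurrent.futures

def call_with_timeout(call, args=(), timeout_s=2.0):
    """Run a blocking request/response call, enforcing the timeout in
    application code rather than in the RPC mechanism itself."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(call, *args)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # The transport knows nothing of timeouts; the application
            # decides what a timeout means (retry, fail, fall back...).
            raise TimeoutError("no response within %.1fs" % timeout_s)
    finally:
        pool.shutdown(wait=False)  # note: the worker thread keeps running
```

Note that the abandoned worker keeps running after the timeout, which is exactly the cancellation problem: giving up on the client side does not relieve whatever is doing the work.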

For instance, I wrote Argot as purely an encoding system. I then built an RPC mechanism on top of it and called it Colony, just so that people wouldn’t assume it’s a monolithic architecture. I’d call Colony an RPC solution. However, I could also use Argot in a synchronous request-response mechanism and then wrap that up in a simple RPC mechanism. If I wrap that request-response mechanism in a single method, has it just become RPC again? Oops… off on a tangent again.

The same with idempotency; if your system doesn’t have any adverse side effects from methods being called multiple times, then idempotency isn’t a requirement of the RPC mechanism. Another reason not to include idempotency is that the hardware/software cost is too great. If I’m building an embedded device, I’m not going to bother with idempotency if I don’t need it. The final reason I can think of is economics. I haven’t implemented idempotency in Colony because I’m one guy building a full end-to-end system. That’s a feature that can be added when someone asks for it.

@David: you say we need more encoding formats. Note that that’s one of the nice things about REST over most RPC packages: the media type is indicated in a header, allowing for content negotiation and allowing virtually any format understood by client and server to be exchanged. RPC systems, OTOH, tend to hard-wire the format, take it or leave it, because RPC is primarily geared toward getting data in and out of native programming language types. The types are primarily what matters to the application, not the wire format.

I’ve been really trying to understand your position on REST. However, what I keep thinking about is that old OSI 7-layer model for comms. What I think REST does is provide the session layer of the communications model. It doesn’t actually do anything else. The developer then needs to choose a presentation layer for how to encode the information. Then, on top of this, the developer needs to work out how to get that information back into the application for processing.

I like the simplicity and elegance of REST, but it really seems to only deal with one portion of the stack.

I made some flippant comment in a previous post saying “Have we really moved into the post-modern deconstructionist era”. Actually, I was going to delete it before posting; phew, at least I’ve got preview now. :)

Now that I’ve said it… I think the point I was trying to make is that we’re pulling apart these monolithic RPC systems and starting to rebuild, piece by piece, all the bits of the distributed comms puzzle. I’d like to see how people would implement REST in a binary request-response mechanism to remove its perceived reliance on HTTP. How could I get REST into an embedded device?

As I’ve said to whoever will listen, the presentation layer is really not mature. The solutions date back to pre-RPC days, and little advancement has occurred. The problem is that presentation and encoding usually get mixed up in all those other REST/RPC arguments and never get the attention they deserve. I don’t know if you ever bothered looking at Argot, but I really think it’s got a solid base. With every new technology that comes out I jump straight to the encoding to see how flexible it is, and I’m always disappointed. PB was no different.

Finally, on to the impedance mismatch. I really think that all REST does is defer it to the developer. Let’s treat this as part of the application layer. If I use XML, I still have the problem of mapping XML into my target language. REST doesn’t actually remove that need.

My current job has me getting involved with Home Area Networks in the energy space. Think Smart Grid, smart home appliances, etc. The way these people build communication systems is archaic. Specify the individual request/response pairs of every function in a huge document, get agreement from all involved, go away and implement.

The fact is that distributed systems and communications have no cohesion across the industry. I’d really like a place like Lambda-the-ultimate.org for distributed systems, where we could work out all the things that have been discussed here and more. Is there such a place, or do we need to build it?

It’s probably worth noting that a major part of the Protocol Buffers design philosophy is simplicity.

What one person thinks is simple, another person thinks is simplistic. Some examples of this spring to mind.

1. Many people think RPC is simple, but some think it is simplistic.

2. XML proponents claim XML is simple, but some think XML is simplistic.

3. The binary number system is ideal for implementing in silicon. However, most humans find it much more difficult to use binary numbers than decimal numbers in day-to-day work. As far as humans are concerned, binary arithmetic is not simple (even though it has only 2 digits instead of the 10 digits in decimal); instead, binary arithmetic is simplistic.

Regarding cancellation: This can be “correctly” supported purely on the client side by just calling the “done” callback immediately, without trying to inform the server. Informing the server of the cancellation is just an optimization that might allow the server to avoid some work.

I think that approach is simplistic rather than simple. I will explain why with the following scenario.

A client specifies a timeout when making an RPC to a server. The server is a bit overworked and doesn’t manage to respond quickly enough, so the client times out, continues with its own work and, later on, ignores the reply that eventually comes from the server. So far, so good. But consider the following possibility…

The server is so overworked (or the client’s timeout is so unrealistically short) that the server reads the client’s incoming request, puts it on a queue of pending requests, and doesn’t even get around to starting to process the request by the time the client times out. In this case, being able to cancel the pending request in the server is incredibly important, because if the server is under a heavy load then wasting that CPU time needlessly puts it under an even heavier load.

Let’s assume the client retries the RPC whenever it gets a timeout exception, as in the following code:
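Something like this, where `do_rpc` and `RpcTimeoutError` are hypothetical stand-ins for whatever call and timeout exception the RPC system actually provides:

```python
class RpcTimeoutError(Exception):
    """Raised by the (hypothetical) RPC layer when no reply arrives in time."""

def call_with_retry(do_rpc, request, max_attempts=3):
    # Retry the RPC whenever it times out. This is safe for the client
    # only if the operation is idempotent -- and dangerous for the server
    # if timed-out requests are never canceled on its side, because every
    # retry adds yet another copy of the request to its backlog.
    for attempt in range(max_attempts):
        try:
            return do_rpc(request)
        except RpcTimeoutError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
```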

You may think that coding a client application to retry a timed-out RPC is stupid, but the client might be retrying the RPC in response to human interaction. (For example, if you are browsing a very slow website and your browser times out waiting for a web page, you are likely to hit the “reload” button.)

If the server is under a heavy load and the client’s requests keep timing out, but the timed-out requests are never canceled in the server, then the server quickly becomes unstable and will never be able to process a new request before it times out, because it is trying to process an ever-growing backlog of already-timed-out pending requests. Eventually the server will crash, because the ever-growing backlog of pending requests will consume an unbounded amount of memory.
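This instability is easy to see with a back-of-the-envelope simulation; the arrival and service rates below are made up purely for illustration:

```python
def backlog_after(ticks, arrivals_per_tick=3, served_per_tick=2,
                  cancel_timed_out=False, timeout_ticks=5):
    """Simulate a server's request queue; each entry is its arrival tick."""
    queue = []
    for now in range(ticks):
        queue.extend([now] * arrivals_per_tick)   # new requests arrive
        if cancel_timed_out:
            # drop requests the client has already given up on
            queue = [t for t in queue if now - t <= timeout_ticks]
        del queue[:served_per_tick]               # server serves the oldest
    return len(queue)
```

Without cancellation the backlog grows by one request per tick, without bound; with cancellation it stays bounded by roughly the number of requests that can arrive within one timeout window.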

To say “Informing the server of the cancellation is just an optimization” is to seriously understate the importance of a good cancellation mechanism. It’s almost like saying to a C++ programmer that “deleting memory you no longer need is just an optimization”. Such “optimizations” can make the difference between a stable application and one that crashes.

Funny thing, this timeout business. I was looking at ZeroC’s Ice to see how they have implemented cancel/retry/timeouts on a method call, and came across this in the docs [1]:

“You should also be aware that timeouts are considered fatal error conditions by the Ice run time and result in *connection closure on the client side*. Furthermore, *any other requests pending on the same connection also fail with an exception*. Timeouts are meant to be used to prevent a client from blocking indefinitely in case something has gone wrong with the server; they are not meant as a mechanism to routinely abort requests that take longer than intended.”

Interesting that they think a timed out RPC indicates that the server is already overworked and should ideally not accept any more connections.

The first one, which I’ll call RPC-Steve, is Steve Vinoski’s definition. In his own words:

“It’s fundamentally about treating network operations as if they were local operations”

Then we have another definition, let’s call it RPC, where RPC simply means “Remote Procedure Call” (or Remote Process Call). It’s the idea that a process can invoke functions in another process, where that process can be running on the same machine or on a computer in a different continent.

With that in mind, I think everybody agrees that RPC-Steve is evil, so I don’t see much point in discussing it further.

As for RPC, it’s alive and well, as shown by COM and hinted by Google’s Protocol Buffers. Some aspects requested by Steve, such as idempotency, are desirable, but certainly not mandatory to create scalable and maintainable systems.

@Ciaran: Obviously it’s better for an RPC implementation to actually communicate cancellations to the server. I was just responding to Steve’s point that not all RPC protocols provide a mechanism for this.

First, by calling it “RPC-Steve” you imply that I somehow made up that definition of RPC, but that of course is ridiculous. Not sure how many times I can say it, but I’ll try again: that definition comes from RFC 707, which defined RPC in the first place. Did you even bother reading it? I previously asked you that if you disagree that that’s the definition of RPC, to please cite another publication where you think it’s defined. Making up definitions just to suit your own purposes doesn’t help anything.

Second, if you look at the two definitions of RPC that you provided, they’re virtually the same. “Remote procedure call” is a remote procedure call, no matter how you cut it. You can try to reword the definition all you like, but it doesn’t change the fundamental nature of what RPC is, as defined in RFC 707. If you want to try to cheat by changing the meaning of the words themselves à la Humpty Dumpty (“‘When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’”), then again, that’s not very helpful and it certainly doesn’t do anything to add insight to this conversation.

Third, COM is alive and well? Don’t you mean DCOM anyway? Can you name me one DCOM-based system running today that has proven itself to be scalable and maintainable, as you claim? I doubt even Don could do that. Are you seriously trying to hold up DCOM as some sort of pinnacle of RPC excellence?

No, I mean COM, which has been powering Windows since 1995 and which allows pretty much any application running on Windows to perform RPC calls to any other process on that machine (running or not), regardless of the language used at both endpoints.

COM has gone through several metamorphoses and evolutions through the ages, but it’s still one of the most successful and most widespread examples of RPC technology today.

@Dean: similarly, I know of numerous very successful CORBA systems that have been in operation for years and will continue to be in operation for many more years. Numerous CORBA systems were successfully built and deployed in the telecommunications, financial, and manufacturing domains, for example, and many are still in operation. It’s probably still the case that when you make a phone call, you’re likely invoking CORBA code in some way or another that my old friends from IONA and I wrote. However, that doesn’t mean such approaches are not without problems, and it doesn’t mean that we can’t do better.

@anonymous: your ideology is flawed then. :-) Same answer as to Dean above: just because something works to some degree doesn’t mean it’s not flawed or even fundamentally flawed.

@Kenton: having idempotency indicated in a comment is essentially useless. Such an indicator needs to be available as metadata for the operation.

@David: impedance mismatch is a red herring in the REST case, since REST is not about trying to build abstractions that look like extensions of programming languages. Plus, again, no matter what approach you’re taking, if you choose the right programming language, then impedance mismatch is not really an issue. As for how you’d get REST into an embedded device, you would go to Roy’s thesis, look at the architectural constraints and properties of the REST style, and build yourself a system to suit. There’s nothing about REST that requires HTTP or text-based protocols. Also, REST might not be the best fit for your system, and if it isn’t, I’d still encourage you to follow the same design approach that Fielding did in his thesis, where he carefully weighed alternatives and imposed constraints that yield the desired system properties, as opposed to the RPC style of design, which is typically along the lines of “what language am I using, and how can I generate code to hide my network operations and make this stuff convenient to use in that language?”

@steve, OK. I sense you’re getting a bit bored with this topic; however, there still seems to be a lot of confusion. On the train this morning I went back, did some homework, and read RFC 707. I didn’t really find the nice clear definition of RPC that I had expected. The core of this RFC (to me) was the section titled “Specifications for the Command/Response Protocol” (3b).

The RFC mainly concentrates on the protocol. It doesn’t really discuss any of the stubs/skeletons or how best to implement the application level interfaces. I had expected a bit more of this as the way you’ve discussed RPC in a few places has seemed to include this aspect.

In terms of the protocol, the specification has eight requirements. I’ll go through them and describe how I interpret them:

The first requirement is pretty generic. It could easily be applied to the HTTP/REST approach, where named commands are actually URLs. Really, it just suggests that the request will cause some action on the server.

(2) Permit command outcomes to be reported in a way that aids both the program invoking the command and the user under whose control it may be executing.

This is also very generic. Responses should include an indication of whether there’s an error. This applies to HTTP/REST and any other asynchronous request-reply.

(3) Allow an arbitrary number of parameters to be supplied as arguments to a single command.

This is probably the only part that separates RPC from REST, although it’s a narrow definition if taken this way. I could argue that REST allows an arbitrary number of parameters to be passed in the content. I believe the author is just trying to make a distinction from the previous protocols, which were single-parameter commands.

(4) Provide representations for a variety of parameter types, including but not limited to character strings.

Once again, this is very generic. If you look in the appendix you see a couple of very simple encodings; however, I think it’s unfair to lock this requirement down to such simple encodings. You’ve argued that REST is not limited to a single representation. I don’t see anything here which requires a single representation.

(5) Permit commands to return parameters as results as well as accept them as arguments.

Again, this is generic. I think it applies to REST as well as to any other RPC. It does not say this needs to be a single representation.

(6) Permit the server process to invoke commands in the user process, that is, eliminate entirely the often inappropriate user/server distinction, and allow each process to invoke commands in the other.

This is more specific, but interestingly I don’t think it is implemented that often. Most RPC systems I’ve looked at don’t allow the server to respond with a command. Two peers can call each other, but in most situations each is acting as a client to the other; the protocol doesn’t directly support server->client calls.

(7) Permit a process to accept two or more commands for concurrent execution.

Generic again. Applies to all modern asynchronous request-reply systems.

(8) Allow the process issuing a command to suppress the response the command would otherwise elicit.

This is the optional one-way features that some RPC systems have.

There is an implementation of these requirements in the RFC; however, I don’t think that single implementation is what defines the concept of RPC. I think the requirements above are what define the meaning of RPC.

My guess is that you’ve taken a different meaning from this RFC than I have. Can you elaborate on how you either read these requirements differently, or on which section you take as being the definitive definition?

@David: keep in mind you are reading this RFC in 2008, so you have the benefit of over 32 years of experience, research, insights, and experimentation of numerous people who have explored this area since James White wrote the RFC. I don’t think anyone has ever suggested that the RFC is some sort of “RPC bible” explaining everything anyone would ever need to know to completely understand and implement the abstraction, so that’s not what you should expect.

There’s more to this document than the eight requirements. You have to read section 4, but first, consider for example paragraph 3a3, paying special attention to item 3:

The thesis of the present paper is that one of the keys to facilitating network resource sharing lies in (1) isolating as a separate protocol the command/response discipline common to a large class of applications protocols; (2) making this new, application-independent protocol flexible and efficient; and (3) constructing at each installation a RTE that employs it to give the applications programmer easy and high-level access to remote resources.

Item 4 after paragraph 3d1:

This protocol would make possible the construction at each installation of an application-independent, network run-time environment making remote resources accessible at the functional level and thus encouraging their use by the applications programmer.

The beginning of section 4a7:

But even though it has no substantive effect upon the Protocol, the selection of a model–command/response, request/reply, and so on–is an important task because it determines the way in which both applications and systems programmers perceive the network environment. If the network environment is made to appear foreign to him, the applications programmer may be discouraged from using it.

Paragraph 4a8:

In this final section of the paper, the author suggests a network model (hereafter termed the Model) that he believes will both encourage the use of remote resources by the applications programmer and suggest to the systems programmer a wide variety of useful Protocol extensions. Unlike the substance of the Protocol, however, the Model has already proven quite controversial within the ARPANET community.

Everything I’ve quoted above pretty much hints at the RPC model. Section 4b, however, is where it starts to get really clear. Consider paragraph 4b1:

Ideally, the goal of both the Protocol and its accompanying RTE is to make remote resources as easy to use as local ones. Since local resources usually take the form of resident and/or library subroutines, the possibility of modeling remote commands as “procedures” immediately suggests itself. The Model is further confirmed by the similarity that exists between local procedures and the remote commands to which the Protocol provides access. Both carry out arbitrarily complex, named operations on behalf of the requesting program (the caller); are governed by arguments supplied by the caller; and return to it results that reflect the outcome of the operation. The procedure call model thus acknowledges that, in a network environment, programs must sometimes call subroutines in machines other than their own.

And the beginning of paragraph 4b4:

“The procedure call model would elevate the task of creating applications protocols to that of defining procedures and their calling sequences. It would also provide the foundation for a true distributed programming system (DPS) that encourages and facilitates the work of the applications programmer by gracefully extending the local programming environment, via the RTE, to embrace modules on other machines.”

Does it get any clearer than that?

I’ll also note, as I have before, that section 4c contains warnings that the model might have some fundamental flaws.

it’s almost like saying to a C++ programmer that “deleting memory you no longer need is just an optimization”. Such “optimizations” can make the difference between a stable application and one that crashes.

roflmao

It could easily be applied to the HTTP/REST approach, where named commands are actually URLs. Really, it just suggests that the request will cause some action on the server.

hahahahhaa…hahahahah …. hahahaha

Here I was thinking URLs were resources, trying to think of how to split my stuff into resources… and there it was right in front of me: just write my code and then make all the function names resources. AWESOME!!! Thanks a lot!!! I am free!!!

I could argue that REST allows an arbitrary number of parameters to be passed in the content.

You could argue that, but you would lose. And you would be wrong… of course, seeing how you do URI design (which I generally discourage wasting time on, but you changed that), I can imagine you would use POST everywhere and pass arguments each time.

That’s just RPC over HTTP.

Once again, this is very generic. If you look in the appendix you see a couple of very simple encodings; however, I think it’s unfair to lock this requirement down to such simple encodings.

Do you know when this was written? In ’76. Do you know when they came up with the IDEA of using the IMG tag in HTML? ’93. Do you have any idea of the context in which they were talking?!? (Neither do I, really, but at least I don’t go around saying “they should have supported more encodings”.)

Two peers can call each other, but in most situations each is acting as a client to the other; the protocol doesn’t directly support server->client calls.

OK, now I am serious. I haven’t used COM much, but isn’t this possible simply by having the server and client each implement IDispatch and give these pointers to each other? Although COM isn’t really over the wire, it still seems like RPC to me.

(7) Permit a process to accept two or more commands for concurrent execution.

That’s the point. REST is an interface model… you don’t know what the process is, whether there is a process at all, or whether you are sending the request to the same process… you have no idea of anything at all in REST. This allows the server much more flexibility.

You started off by saying you would tell us your interpretation and then went off somewhere else. What’s up??

The Protocol Buffer library does not include an RPC implementation. However, it also does not include any of the tools you need to hook up a generated service class to any arbitrary RPC implementation of your choice.

Would that make you feel pleased?

If you are not particularly impressed with RPC and are unlikely to be using it, how does it really matter whether the Protocol Buffers library can work with it or not? You are simply not any better off or worse off. If you have a rant on RPC, make it an RPC rant, not a Protocol Buffers rant.

I came to this post assuming I would get a different perspective on Protocol Buffers… instead I got an opinion on RPC. It was still interesting reading, but I am just unhappy that the title seemed misleading to me. Perhaps it should’ve been “Protocol Buffers, when used with RPC, do not mask RPC problems”.

@Dhananjay: not sure I follow your logic, and I don’t really see how your alternative blog entry title differs substantially from the actual title. In this posting I asked a few questions related to how PB relates to RPC, and raised some issues that I’ve learned the hard way from years of working on similar systems. I don’t really see what’s wrong with that. Some Google folks were kind enough to wade in here and provide some thoughtful answers to my questions, so I think if you add up the posting and the comments there’s some really useful information here.

Take a look at Kenton Varda’s reply a few comments above. It goes like this:

“The Protocol Buffer RPC support is, in fact, used for practically all inter-machine communication at Google. We were unable to make our RPC system part of this release, but we have one, and it works on top of exactly the RPC stubs that are in this release. We certainly are not going to remove this support. We rely on it.”

@steve, I’ve taken a day to mull over what you’ve written and read back over some of the arguments that have been presented in these comments and in other posts. I still have a fundamental problem: I disagree on what RPC means, and I think others have similar issues. However, that disagreement is not really getting to the core of the issue. This would probably get solved faster over a few beers (even if a few people had a bag over their head with “anonymous” written on it). I really want to get past the word RPC and make sure we agree on the meaning.

I’ll try and link this problem back to a concrete example to explain my position better. Let’s say I’m a developer in an organisation that has been given a library that has a single method, int createOrder( Order ). It takes a complex object as the argument and returns an order number. I’ve been told to go and expose this as a client interface so that it can be used across the organisation. They want low latency, so it means I can’t use any queuing and the client will block waiting for the response. The other developers in my team want to use whatever I build on the client. They want the convenience of calling a single method to perform the action. It could be int sendOrderToServer( Order ) throws RemoteException; or something similar. The other developers in my team know full well that this is not a local operation. They know that they need to handle any exceptions that are due to timeouts, etc.; however, they still want me to deal with those implementation issues. They will build the rest of the software to recognise any exceptions and deal with them appropriately.

As far as I’m concerned, it doesn’t matter how you implement sendOrderToServer; the result is that sendOrderToServer is a remote procedure call. It is bound by blocking synchronous request/reply semantics. It doesn’t matter whether sendOrderToServer is implemented using REST, CORBA, Protocol Buffers or Colony. The end result is that my manager wants me to deliver a convenient interface to the remote library. So my first question to you is: have I already been bitten by the RPC bug and chosen convenience over correctness? Second, if this were done with REST, is it actually just RPC over HTTP and not REST?

It’s my view that all the code and protocols from sendOrderToServer all the way to the library method createOrder are inclusively labelled a Remote Procedure Call. It doesn’t matter whether that is hand coded, generated, or uses reflection; the full solution is part and parcel of the RPC. As long as it meets the blocking synchronous request/reply semantics, it’s RPC.

It’s my view that CORBA and similar products are “monolithic RPC” solutions. They attempt to provide a convenient and quick solution for every aspect of the problem for the developer. I view REST as a “decomposed RPC” solution. For REST, I choose the encoding, the location and various other aspects. It’s up to me as a developer to go and implement many parts of the RPC. I’m pretty sure you’ll say I’m wrong on these definitions, so if you’ve got a better way to describe them… I’m all ears.

In my experience idempotency is not important because it’s the *default* status for an RPC call. Every RPC call used in the software I work on is idempotent (this is a major consumer website). Requests for an ad to display, searches, news queries, and checking for new RSS stories are all idempotent operations. Non-idempotent operations are handled via async message queues and direct SQL access.

Google, Facebook, et al. are an existence proof that RPC is a successful solution for certain problems in large-scale systems. The more interesting question is which kinds of problems it is useful for and which it is not.

@Adam: I can assure you that a purely idempotent RPC system is quite unusual. This is no exaggeration: you’re the very first person I’ve encountered in 20 years of building such systems who has ever made such a claim. I’m not saying that your description is not accurate — I’m simply saying that it’s very unusual, especially in enterprise integration scenarios.

@David: RPC is RPC. RFC 707 is pretty darn clear on why RPC was conceived. I don’t see how continuing to try to Humpty-Dumpty its definition is going to provide any useful insights.

In your most recent comment you are making the very mistake that I normally point out at the start of any RPC explanation, which is that blocking synchronous request/reply == RPC. No, no, a thousand times no. RPC is not purely about networking operations, but rather, it’s a view of such operations from the programming language perspective. As RFC 707 says, it’s a model “that encourages and facilitates the work of the applications programmer by gracefully extending the local programming environment…this integration of local and network programming environments can even be carried as far as modifying compilers to provide minor variants of their normal procedure-calling constructs for addressing remote procedures.” That ‘P’ in the middle of “RPC” refers to programming language procedures/functions/methods, not to some generic use of the term “procedure” in the sense of “a series of actions conducted in a certain order or manner” (definition courtesy of the dictionary on my Mac).

REST is absolutely, definitely not RPC. REST is a well-defined architectural style that has nothing at all to do with programming languages or programming environments. The fact that REST promotes in part a client-server request-reply approach does not make it RPC. @anonymous took you to task for your previous comment because it got some fundamental aspects of REST very wrong, and calling it RPC is very wrong as well, so I’m not sure how s/he will react to this latest gaffe! ;-) Please do yourself a favor and sit down and read Fielding’s thesis — it will change your understanding of all this stuff for the better. It is one of the very best documents on distributed systems I’ve ever read.

I could go on and on, but I won’t. I don’t know how many times I can cite the same sources and explain the same things over again.

Alright, I’ll go off and read Fielding’s thesis. I already hunted it out earlier today. Honestly, I am just trying to find a language to describe my views and am not trying to continue a Humpty-Dumpty approach of calling things whatever I want. I see what you’re saying with regard to the P in RPC. I’ll stop using the word RPC for fear of inciting violence.

As I said, if you’ve got a better way to describe the general concept of blocking synchronous request-reply semantics, then I’m all ears. Can I say that RPC and REST both contain the ability to remotely call methods? How do I talk about these approaches at the level I described above? I have a method on the client, int sendOrderToServer( Order ), and on the server it calls a method, int createOrder( Order ). I want a general way of talking about this end-to-end process without bringing up discussions of nursery rhymes.

Geez, you know a discussion is going bad when people start quoting scripture at each other :-)

Regardless of the definition you use for RPC, it is fundamentally an abstraction devised to assist in distributed computing. There are many abstractions we use every day in IT, and some of them work well – others do not. Object-oriented programming is one such abstraction which works well in domain modeling and in most everyday applications. But in some areas, such as user interfaces or distributed systems, objects don’t work well as an abstraction.

For many years Power Programming with RPC was my bible and I think I’ve got a pretty good handle on what it was about. RPC was originally devised as an abstraction of procedural programming as applied to distributed systems. Later with DCOM and especially CORBA the distributed systems abstraction became “method calls” so that OO programmers would feel at home.

The problem with RPC when used as an abstraction is that it promotes tightly coupled systems which are difficult to scale and maintain. That is the lesson of 20 years of distributed systems development. One problem with “out of the box” Web Services is that they continue the RPC abstraction.

Other abstractions have been more successful in building distributed systems. One such abstraction is message queueing where systems communicate with each other by passing messages through a distributed queue. REST is another completely different abstraction based around the concept of a “Resource”. Message queuing can be used to simulate RPC-type calls (request/reply) and REST might commonly use a request/reply protocol (HTTP) but they are fundamentally different from RPC as most people conceive it.

So my point is that RPC is generally frowned upon because of its architectural implications. I try to avoid it in my line of work. There are some cases where it is useful, but like many of these things – caveat emptor.

When we introduce an interface, we want to separate the concerns of the Consumer from the Provider. This is good software engineering, in that it enables interfaces to be oriented towards broad classes of consumers, and enables substitution in the provider’s location, implementation, or even the organization that provides the service.

(italics/bold mine)

The whole idea behind having an interface model is that you can do whatever you want. Of course, a dumb person (by mapping all function calls to URIs, for example (sound familiar?)) can screw it all up and make an ass hat of a system. But REST at least allows the smart person not to screw it all up. The smart person with REST can make a kick-ass application. REST at least gives you the choice; RPC doesn’t.

Read the whole post from where I flicked it: Stu’s Blog.
It has all the cool words that people use in companies (Zachman something-something, for example), so maybe you will understand it.

Every RPC call used in the software I work on is idempotent (this is a major consumer website). Requests for an ad to display, searches, news queries, and checking for new RSS stories are all idempotent operations. Non-idempotent operations are handled via async message queues and direct SQL access.

I always wondered how people got insane performance increases by just using HTTP and designing URIs correctly.

Thanks for clearing it up for me!

Please do yourself a favor and sit down and read Fielding’s thesis

And when you think you have understood it, read it again. The first time I read it, I was like “Hmm, interesting”… only after some rereads did the whole “WOW!! OMFG” thing come in.

Honestly, I am just trying to find a language to describe my views

(cheap attempt at humour)
No no no, you got it all wrong, REST requires you to use a uniform interface .. English is fine by us. You can’t just go about finding and using languages you dig up.
(end cheap attempt at humour)

blocking synchronous request-reply semantics then I’m all ears.

What’s that gotta do with REST? You can do such dumb things with REST too. Block on every HTTP request till you get the response… have an awesome time with that application!! Hell, while you’re at it, change your OS to “synchronously” block for every I/O request. That would be damn easy to program for!

I want a general way of talking about this end to end process without bringing up discussions of nursery rhymes.

@soabloke, thanks for stepping in and providing a higher-level view. Between you and the first sixty pages (all I’ve read so far) of Fielding’s dissertation, I think I’m finally starting to see Steve’s/REST’s point of view. I also watched Steve’s video presentation he did on InfoQ. It’s also worth a watch.

I’ve been buried in the detail of how to implement these distributed computing problems and couldn’t see the forest for the trees. (Steve, notice I did not say RPC. :) It’s interesting that I’ve never separated “architectural styles” from “guiding principles” in my head. By guiding principles I mean all the good ideas that guide how I build distributed systems; many of the decisions are buried in some of the most basic detail. What I think Fielding is doing is taking those guiding principles that people like me have been taking for granted and creating a set of explicit rules for hypermedia systems. As long as people don’t break these rules, they should end up with a good solution (as Anonymous said; I wrote this before reading his response). As I said, I’m only sixty pages in so far, so there’s still a way to go.

@steve, just to see if I understand your position on RPC, I’ll go back to my concrete example. What I think you’ve been saying is that RPCguy (using the same idea as your presentation) receives some library for creating and working with orders. The library has been designed for local use, and RPCguy has been told to make it available to a client. RPCguy writes some IDL that looks as close to the library as possible, churns it through an IDL-to-code generator and announces how easy it was. He hasn’t taken into consideration any of the real issues that distributed computing raises.

Now, ORBgirl (I shouldn’t be sexist) comes along and says to RPCguy, “You’ve got no clue! That won’t work!” She says you need to do at least some OO design and worry about the life-cycle of the objects. She changes the IDL to introduce the concept of creating an object to represent an order on the server. She also has a bit of a think about the interfaces to the Order object. She writes some IDL, passes it through a CORBA vendor’s solution, and assumes the container will look after everything.

Next comes along SOAguy (btw, nice picture for the SOAguy in your presentation). He says, “But CORBA won’t work for the Internet! You have fifty methods on that Order object… didn’t you even think about the latency issues?” SOAguy gets onto designing a much simpler interface to the library. He also designs an XML Schema that allows passing full Order objects back and forth between systems. He writes some WSDL, churns it through a code generator, sits back and says, “See, that’s how it’s done!”

Finally(?), along comes RESTguy, who says, “Yo fool, this is da Internet Age! You’re still living with RPCguy… Get with the hypermedia, dude! And btw, you’ve got no style, Architectural Style!” He sets about creating a set of URLs which allows doing a GET to find an Order. The URL can be used as a reference in other queries. He still uses basically the same XML Schema as SOAguy, but can return PDFs or anything else. He’s thinking about the problem from a completely different perspective.

I’ve been looking at all four people and thinking, “I’ve been looking at the guts of all of this, and in the end, for computer-to-computer communications it all ends up in code somewhere.” It all has to map back to calling a method. But I am starting to see where the REST idea is coming from now. It’s the architectural style that, if followed correctly, should end up with a system that scales with the Internet.

There’s still a hole in my thinking which is how does REST get mapped back to OO systems when that’s the requirement. I understand how documents and hypermedia work so well together, but I’ve got this legacy of OO libraries and thinking that I need to map to hypermedia.

As I said, I’ll continue on with reading Fielding’s dissertation… I may even read it twice, as suggested by Anonymous. Thankfully it’s a good read, as Steve has said numerous times.

But the big question is… am I on the right track and heading in the right direction?

Sorry, I should have read Stu’s blog on “Understanding hypermedia as the engine of application state” before asking the question of how OO systems link back to hypermedia. Thanks anonymous, that filled another gap!

I also watched Steve’s video presentation he did on InfoQ. It’s also worth a watch.

yay .. video for the weekend ..

What I think Fielding is doing is taking those guiding principles that people like me have been taking for granted and creating a set of explicit rules for hypermedia systems.

I hope to God this is because you have only read 60 pages. If you still get this impression after reading it through fully, please read it again. Please , I beg of you … cos right now I feel really sorry for you.

@David: I’m glad to hear that you’re reading the REST thesis. There are a bunch of folks out there who continually whine/whinge about REST and write reams of prose telling us how terrible it is, and yet they’ve never even bothered to read even the title page. Congratulations on not being one of them. :-)

BTW, today someone tried to post a comment here suggesting that you take your questions elsewhere because I am nothing but a “REST zealot” and so apparently I would be unable to answer them. I didn’t post it because not only is it wildly inaccurate, but it was negative and adds no value. Now, some of anonymous’s comments are also a bit rough and arguably negative, but I’ve passed them through because overall s/he is contributing some useful observations, insights, and links, plus some of the things s/he says are kinda funny. :-)

So yes, I’d say you’re on the right track, anonymous’s latest comment notwithstanding. :-) But like anonymous said earlier, it generally takes a few passes through the thesis, and I’ll add that it also takes some experimentation, before it really sinks in.

[...] Steve Vinoski has been busy trying to convince the world at large that RPC is “fundamentally flawed”. I think it is interesting to take a look at RPC and see what those fundamental flaws are (and whether there are flaws, for that matter). Doing this will definitely take more than one post, so don’t expect the answers all at once. I will deal with various aspects of the topic over a number of posts over the next few weeks, so please bear with me. [...]

[...] Protocol Buffers: Leaky RPC Steve has a very tight definition for RPC as per Note On Distributed Computing and RFC 707. Unfortunately most real-world RPC mechanisms do not fit with this definition. Yet he still critiques them as if they did fit. (tags: rpc) [...]

@Steve & Anonymous. I hope you didn’t think I was finished here. I’ve been off reading the Fielding dissertation on REST. I also read the “Tao of Pooh”. Both were a great read. I suggest you both read the second one sometime.

Before I start (or repeat some of my earlier arguments), I wanted to address what Steve said about me taking my questions elsewhere. I do tend to agree with the remarks of the person whose comment you didn’t post. You and Anonymous are coming across as REST zealots. That’s ok though. You’ve got your opinion, and I’ve got mine. The first step to actually having a real discussion with people with strong opinions is to understand where they’re standing. To do that, I’ve had to deal with Anonymous standing on his pedestal throwing belittling remarks. Those remarks didn’t achieve much and show his zealot nature. However, in between those useless remarks I’ve got a glimpse into his point of view and some useful information. If I’ve had to look foolish to come to a better understanding of these topics, then I’m ok with that.

Also, it’s good to see Michi joining the discussion. :)

OK, first off, let me repeat that I really enjoyed reading Fielding’s dissertation. The ability to evolve the architectural style of the web was ingenious. His method of creating an architectural style is a step above systems architecture. I think a lot of architects do develop architectural styles in their projects; however, most of the time they are implicit in designs. Fielding does a great job of making the concepts explicit. Obviously an easy pot shot at every other solution to distributed computing is that they have not been developed this way. However, this is not to say they can’t be retrofitted in the same way REST was.

Next, REST is fundamentally not RPC. REST is an architectural style that is designed to ensure that the web’s hypermedia solution to distributed computing will not be ruined by future changes. REST is not a design pattern or an implementation. You could look at the actions of REST and loosely suggest as I’ve done in the past, and Michi has, that they have some similarities to RPC. I don’t think it is an argument worth pursuing. This does not mean that the REST architecture doesn’t look like RPC on the client, but I’ll get to that later. REST is as different from RPC as it is from Message Queuing or Publish Subscribe systems.

An important point is that REST is an architectural style for an open hypermedia system. REST is designed to ensure that new additions to the Web architecture don’t remove any of the positive traits that were carefully designed into the system. There is no claim that REST provides any of the architectural style required for point-to-point, system-to-system communications. REST also isn’t designed for the security-conscious world where point-to-point solutions are required and caches provide no benefit. It is designed for one purpose: to define the open web hypermedia system.

Hypermedia and the web is a fantastic solution for human/browser to computer communications. However, it offers only part of the solution in the area of computer to computer communications.

I’ll go back to one of my earlier points. A blocking synchronous request/response interaction starts when a local procedure is called on a client. It finishes when that local procedure call returns with a valid result or exception information. To the developer building these systems, RPC and REST look the same: the entry point is a local procedure call.
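That sameness at the call site can be shown directly. In this sketch (every name here is made up for illustration, and both “transports” are in-process stand-ins), the caller cannot tell which of two very different implementations sits behind the interface:

```java
// Illustrative only: comments mark where real HTTP or ORB machinery would live.
public class CallSiteDemo {

    interface OrderService {
        int sendOrderToServer(String order); // blocking synchronous request/reply
    }

    static class RestishService implements OrderService {
        public int sendOrderToServer(String order) {
            // imagine: POST the order representation over HTTP, block for the response
            return order.length();
        }
    }

    static class OrbishService implements OrderService {
        public int sendOrderToServer(String order) {
            // imagine: marshal the order through an ORB, block for the reply
            return order.length();
        }
    }

    // The call site: a plain local method call, identical for either implementation.
    static int placeOrder(OrderService service) {
        return service.sendOrderToServer("<order/>");
    }
}
```

Whether this uniformity is a virtue (convenience) or a trap (hiding the network) is, of course, exactly the point being argued in this thread.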

Adapting REST to the problem of system-to-system interactions provides little help to the developer. REST provides an architecture for the Session layer of the communications system. The developer must choose the Presentation layer (MIME type), and then must encode the information they need into the Presentation to create their own Application layer. In system-to-system distributed computing problems, all the same issues that RPC/SOA/ORBs have are present, without the mature tools to assist the developer.

The problem is that there’s a disconnect between a developer working in an environment where they are making local method/function calls and REST which offers a hypermedia solution. This creates a situation where a developer is trying to provide a local library which is OO based and bolt it onto a hypermedia solution.

The REST approach to architectural style can obviously benefit system-to-system communications. There are a couple of things REST provides which I’ll probably take away and explore. The first is a separation of the request parameters from the underlying request structure, to allow any data to be sent to the server. I’d actually already thought about doing this previously but hadn’t got around to it. I was going to put the data structures here to show how I’d make my Colony solution behave more REST-like, but that’s probably a little too far off topic. I’ll do that on my blog sometime.

The other part of REST I’d like to bring back to Colony is the ability to specify caching behaviour in the response. Colony is all binary, so this would need Colony-specific caches built; however, the design is a good one. Once again, I’ll put the structures up on my blog sometime to show that.

So, I’m still looking for words to describe the group of all blocking synchronous request/response calls that the client makes in system to system communications. In these situations it doesn’t matter what implementation is used, the client still makes a blocking local functional call which blocks and eventually returns with a result from a server or cache. I’m told I can’t call it RPC because that has an ideology of ignorance; I don’t want to call it SOA because that has its own ideology. It’s definitely not an ORB. What is it?

Just to throw in another example: in Colony I’ve built two different implementations of calling a server. One uses a simple data structure with a location, method id and parameters. The second uses a mini virtual machine with a simplistic byte code. The VM has a heap, stack, byte code and program counter, and can be used to call multiple methods over multiple machines. This is obviously not RPC, so what do I call it? I can build the same interface and call it in exactly the same way, yet the actions performed on the client and server are completely different. It still fits into the class of blocking synchronous request/response semantics, yet I haven’t got a generic name for it.

Steve, one of the things you’ve said you’re trying to achieve is make people aware of other solutions to distributed computing. I’m now aware of what REST is, and more importantly, what it isn’t. However, to have some more meaningful conversations I’d like to have the words to categorise and dissect the various solutions to distributed computing. If you’re saying I’m using the wrong words to describe things, please give me the right words. In particular how do we talk about the class of problems associated with system to system communications which involve blocking synchronous request/response semantics?

Seriously, great stuff Steve. I wrote a blog entry about taking into account the human element in software abstractions – i.e. the impact on productivity and development cost by making it easier for people to do things like distributed computing. I would argue that sometimes this is more important than purely technical considerations such as performance or clean code.

But I completely agree about the main point – which as I understand it is being sure to use the right tool for the right job, and not just blindly using RPC because it’s convenient.

I have a couple of other more detailed comments that I thought I’d add to the discussion.

On the definition of RPC – using RFC 707 is a bit like using the SOAP specification to define SOAP. I mean that there’s often a difference between theory and practice, and what we complain about with SOAP (myself included, despite the fact people are getting value from it) is that the spec isn’t implemented correctly. If you read the SOAP and WSDL specifications – the latest versions, especially, you would think they are very RESTful. But no one implements the RESTful bits – they tend to focus on the RPC style (as you have said, IONA has been as guilty of this as anyone despite our efforts to lobby for implementing the document oriented style).

I don’t want to go into a huge digression on this, but to me this is a great example of a kind of innovator’s dilemma, or a side effect of one. When we talk with our customers about SOAP, WSDL, etc. they tend to say something like “if you want me to use that, it has to be just as performant, reliable, secure etc. as what we already have.”

This of course entirely misses the point, since what’s important are the application requirements, not the technology used to develop the application. As you have pointed out many times, a RESTful approach could as easily meet many if not all enterprise application requirements. But (as you have also pointed out) this would require a change in thinking that a lot of people seem unwilling to tackle.

Another minor comment – the inability of the industry to solve the data type mapping problem does not in itself mean that the RPC mechanism is useless. It is true that interoperability decreases in proportion to the complexity and number of data types involved, but that doesn’t mean RPCs aren’t useful.

It is interesting to read about Erlang, REST, and explicit programming for distributed computing as a kind of historical advance. When RPCs first came out, we viewed them as a technological advance over the dominant style of the day, which was P2P (LU6.2 was the leader – and man, no one wanted to program that if they could avoid it).

But I also take the point that you are going to get better results for many types of distributed applications by explicitly programming. And I am also very impressed by what I’ve been reading about your Erlang work.

I don’t really think of RPC = transparent distribution. I think of RPC as a programming model. By definition people know they are doing remote calls because they have to create some kind of interface definition, compile proxies and stubs, link them into their applications, etc.

When I think about the big picture of distributed computing, it seems like there will always be some number of applications for which RPC is a better fit, and some number of applications for which asynchronous messaging is a better fit. One of the big problems I think we all have is that there is so much overlap in what can be done using either approach. I have a rule of thumb for this that depends on the significance of the reply. If the reply to a message needs to indicate whether or not the database update was performed, for example, RPC is a better fit. If the reply simply needs to indicate that the message was received, then asynchronous messaging is a better fit.

I know the above is kind of impressionistic, and not very precise. I suppose the point is to think about the tradeoffs and not just try to use one or the other for everything.

To do that, I’ve had to deal with Anonymous standing on his pedestal throwing belittling remarks.

Standing on a pedestal – yes I do that sometimes …

belittling – they were intended to be fun; sorry if you found them belittling…

Those remarks didn’t achieve much and shows his zealot nature. However, in between those useless remarks

talk about belittling… but I don’t really care… although I would disagree with you calling me a zealot, I wouldn’t care to argue why…

His method of creating an architectural style is a step above systems architecture

Style (not his method) is a step above architecture – read the original paper documenting what style is, by Garlan and Shaw (those guys were the first to study software architecture).

Next, REST is fundamentally not RPC

You had earlier argued exactly the opposite, iirc, so are we zealots for asking you to just RTFM?!?

Hypermedia and the web is a fantastic solution for human/browser to computer communications. However, it offers only part of the solution in the area of computer to computer communications.

That’s how Fielding intended it. When he first came out with REST, Fielding only intended it for information services, which when you think about it are a HUGE part of services and are VERY important – if your information services are hidden or a pain in the ass to use, your employees are going to be straitjacketed. So first, please don’t “belittle” information services.

Second, it was only when everyone got us into the WS-MESS that saner heads came together and said, “Look at that REST thing out there – it seems to work nicely; I am going to use it.” This is not to say REST is complete and ready to use NOW. But as Martin Fowler/Jim Webber presented at InfoQ, it is more like using the Agile, evolutionary approach to distributed systems rather than the “intelligent design” that WS-* seems to favour.

The whole OSI 7-layer model is to me a pain to understand and use – I much prefer the TCP/IP 4-layer model. I can’t really reply to what you have written, as I long ago forgot what these Session and Presentation layers are – and I don’t see much point in the argument even if I found out.

RPC/SOA/ORBS have, without the mature tools to assist the developer.

As far as I can see, the only tools that developers really use (in WS-*) are the WSDL (with SOAP, of course) ones. That’s ALL. No one uses the other junk load of WS-* specs and tools that vendors have come up with and are trying to sell. And as a result, all those most certainly aren’t mature.

The problem I have with Protocol Buffers is that they mislead people about XML’s performance characteristics.
XML doesn’t have a performance issue; the issue belongs to XML parsers. I have written an article on this…