
Problem Statement

Some broker implementations require creating a copy of the message before forwarding it to the backend. The broker might also slightly modify parts of the message, such as addressing headers, for proper routing within the DMZ. The problem is that we see a very high CPU cost in creating this message copy, which also results in lower throughput. Note: Streamed transfer mode is not in scope for this article.

Simulation

For all performance issues we need to measure and profile. To investigate this issue we first try to simulate the broker pattern by just copying the message and then forwarding it to a dummy backend service. We then profile this to understand how much the copy actually costs.

An 8% cost for copying seems acceptable considering the value of making the copy and being able to do other things with it if required. But this is not what was being observed. In the profiles from the actual broker we notice about a 40% cost for creating a copy. This means that almost half the time is spent creating a message copy, so effectively your throughput would drop to almost half when the broker is configured to create a copy of the message. This excludes costs like logging.

Evidently our simulation is not accurate, so we need to isolate this further. We take in more functionality from the broker so that we hit this expensive path. One of the key observations was that the message is copied just before it is forwarded. This also means that a number of manipulations have already been done on the message, whereas in our simulation we didn't perform any manipulation. So to get closer to the real broker we need to change some things on the message.

To keep it simple we removed one header and added another header to the message, since most brokers modify headers before forwarding.
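The simulation step described above can be sketched roughly as follows. This is only an illustration of the "modify, then copy" order that hits the expensive path; the header names, the namespace, and the `BrokerSimulation` class are hypothetical and not taken from the actual broker.

```csharp
using System.ServiceModel.Channels;

static class BrokerSimulation
{
    // Hypothetical namespace and header names, used only for illustration.
    const string HeaderNs = "urn:example:broker";

    // Mimics a broker that touches headers BEFORE taking the buffered copy.
    public static Message ModifyThenCopy(Message incoming)
    {
        // Remove one header if it is present...
        int index = incoming.Headers.FindHeader("RouteHint", HeaderNs);
        if (index >= 0)
        {
            incoming.Headers.RemoveAt(index);
        }

        // ...and add another one, as most brokers do before forwarding.
        incoming.Headers.Add(
            MessageHeader.CreateHeader("ForwardedBy", HeaderNs, "broker"));

        // The copy is taken only after the headers have been modified.
        MessageBuffer buffer = incoming.CreateBufferedCopy(int.MaxValue);
        return buffer.CreateMessage();
    }
}
```

Profiling a loop over `ModifyThenCopy` against a loop that only calls `CreateBufferedCopy` is one way to compare the two costs.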

Now that we have identified the root cause we also need to identify the solution so that the broker can achieve the functionality without taking up so much CPU.

Solution

The solution is actually quite simple: “Modify your message after you create the buffered copy”. I wanted to give the solution before the analysis since most of you would probably not be interested in the analysis, but if you are, the rest should be interesting.
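A minimal sketch of that ordering, assuming the broker only needs to replace a single header. The `MessageForwarding` class, the header name, and the namespace are hypothetical; only the order of operations is the point.

```csharp
using System.ServiceModel.Channels;

static class MessageForwarding
{
    // Hypothetical namespace and header name, used only for illustration.
    const string HeaderNs = "urn:example:broker";

    public static Message CopyThenModify(Message incoming, string forwardedBy)
    {
        // Take the buffered copy first, while the incoming headers
        // are still untouched.
        MessageBuffer buffer = incoming.CreateBufferedCopy(int.MaxValue);

        // Each CreateMessage call hands back an independent message
        // with its own header collection.
        Message outgoing = buffer.CreateMessage();

        // Modify only the copy that will be forwarded.
        int index = outgoing.Headers.FindHeader("ForwardedBy", HeaderNs);
        if (index >= 0)
        {
            outgoing.Headers.RemoveAt(index);
        }
        outgoing.Headers.Add(
            MessageHeader.CreateHeader("ForwardedBy", HeaderNs, forwardedBy));

        return outgoing;
    }
}
```

The original `incoming` message is consumed by `CreateBufferedCopy`, so any further local work should use another message created from the same buffer.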

BufferedMessage headers haven’t been captured (this happens, for example, when the user inserts a header in the first location).

If headers have been modified then CreateBufferedMessage takes an alternative path using the DefaultMessageBuffer. The reason is that a full copy of the message has to be created if any buffered header has been modified. An internal property called headers.ContainsOnlyBufferedMessageHeaders is used to decide whether the faster BufferedMessageBuffer can be used to create the buffered copy. If there are any modified headers, we need to ensure that the message is fully marshaled over, and the buffer itself cannot simply be copied (e.g. the user can add a reference type to the header). So we fall back to a path that fully reparses the message and creates a fully deserialized copy of the modified message.

The main point here is that a copy should always be a deep copy, and no modification should result in a message with shallow-copied parts. When you create a message from a copy of the original, your message object gets its own copy of the headers that it can play around with without affecting the original incoming message. Message copy by itself is a fast operation, as the profiles show, but copying a modified message can be very CPU intensive when using buffered transfer mode.

For greater flexibility our router can be something like a pass-through router. If we are just calling a backend service then we can use a generic contract to receive and forward messages to the backend service.
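A generic contract of this kind is usually written against the untyped `Message` class with wildcard actions. The sketch below is an assumption about what such a router could look like, not the original listing; `IGenericContract`, `PassThroughRouter`, and the way the backend channel is supplied are all hypothetical.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// Untyped contract: Action = "*" accepts any incoming action, and
// ReplyAction = "*" leaves the reply action untouched.
[ServiceContract]
public interface IGenericContract
{
    [OperationContract(Action = "*", ReplyAction = "*")]
    Message ProcessMessage(Message request);
}

public class PassThroughRouter : IGenericContract
{
    // Hypothetical: a channel to the backend, e.g. created from a
    // ChannelFactory<IGenericContract> and handed to the router
    // (this requires a singleton instance or a custom instance provider).
    private readonly IGenericContract backend;

    public PassThroughRouter(IGenericContract backend)
    {
        this.backend = backend;
    }

    public Message ProcessMessage(Message request)
    {
        // Buffer once, before touching any header.
        MessageBuffer buffer = request.CreateBufferedCopy(int.MaxValue);
        try
        {
            // Local copy for validation or logging on the broker.
            using (Message local = buffer.CreateMessage())
            {
                Console.WriteLine("Routing action: " + local.Headers.Action);
            }

            // Second copy is forwarded synchronously to the backend.
            return backend.ProcessMessage(buffer.CreateMessage());
        }
        finally
        {
            buffer.Close();
        }
    }
}
```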

Here we create a copy of the message to consume locally on the broker in case we want to validate some parts of the message, log it, and so on. Ideally the fastest option would be to forward the message directly, but applications sometimes require all incoming messages to be logged or validated at the entry point of the DMZ.

Pros

Loose coupling.

Potentially can avoid a lot of serialization and deserialization cost. (Encoders need to match for this)

Changes in the backend usually do not require changes in the Broker provided the broker uses generic client contracts.

Cons

The synchronous pattern has its overhead and does not scale well.

The router has to be of very high capacity to support high loads even though backend machines may block on IO.

Best Practices

Match your encoders: Encoder mismatches can cause heavy serialization and deserialization on the router. This is because any difference between the backend and frontend encoders on the binding requires re-encoding the message, and hence a full read and write at the encoder layer.

Avoid having to create a message copy. If the backend can create the copy or do the validation, you save the router's CPU for more messages per second.

Make sure you use the fast message copy: Buffered messages have a highly optimized message copy path that WCF takes provided you haven’t changed parts of the message. I will talk about how to hit this fast path next.

I had never thought that I would actually write about a topic like this, but sometimes you want to organize your thoughts and have an opinion on things. Being on the performance team for WCF has got me used to a plethora of message exchange patterns, which we lovingly refer to as MEPs. There is a broad spectrum of coding and implementation styles which we see day in and day out. There are those that are extreme, elaborate and overwhelmingly flexible, and also those that are so convoluted and rigid that it's almost like writing assembly.

It's good to know which message exchange patterns are best suited to your needs. I think it's overkill to adopt a strategy where your application forces itself to use only a single message exchange pattern, for example an ideology like “We will do only REST-style request-reply throughout our system”. The number of layers we would need to add in order to align ourselves with a philosophy like this would probably outweigh the benefits it provides, specifically in scenarios that aren’t suited to such patterns.

There is a really nice article on MSDN listing six message exchange patterns: http://msdn.microsoft.com/en-us/library/aa751829.aspx. To quickly reiterate, they are Datagram, Request-Reply and Duplex, plus each of these three with sessions on top. You can think of a session as a logical abstraction that says the message is part of a conversation. This has nothing to do with ASP.NET sessions; it is a way for WCF to correlate messages.

In my experience it is generally more helpful to classify your problem, see which pattern really helps, and then slap on the contracts and protocols, rather than fixing on the protocol or MEP first and then forming a solution around it. I choose not to be an advocate of any particular style, but I am against protocol fanatics who are inflexible and who believe that there is a fixed set of choices for certain types of scenario.

Systems are organic, so it's hard to freeze implementations. The same is true of patterns. Today you might be OK with TCP, but there is nothing stopping you from switching to queues. As the system grows, you put solutions in place to facilitate this kind of change. Layers get added and MEPs change too.

If you have any questions on how to make a choice I would gladly try to help out.

To modify properties that are not exposed on the standard binding we can create a CustomBinding from the provided standard binding. We can then find the required element on that CustomBinding and tweak it. Another option would be to handcraft the full binding if you know exactly how to stack up the elements. One property that can be tweaked this way is the IdleTimeout.
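A minimal sketch of this, assuming the IdleTimeout in question is the TCP connection pool setting on the transport element of a NetTcpBinding. The `BindingTweaks` class and method name are hypothetical.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

static class BindingTweaks
{
    // Assumption: "IdleTimeout" refers to the connection pool idle timeout
    // on the TCP transport element, which NetTcpBinding does not expose.
    public static CustomBinding WithIdleTimeout(
        NetTcpBinding standard, TimeSpan idleTimeout)
    {
        // Wrap the standard binding so its elements become editable.
        var custom = new CustomBinding(standard);

        // Locate the transport element in the binding stack and tweak it.
        var transport = custom.Elements.Find<TcpTransportBindingElement>();
        transport.ConnectionPoolSettings.IdleTimeout = idleTimeout;

        return custom;
    }
}
```

It could then be used as `var binding = BindingTweaks.WithIdleTimeout(new NetTcpBinding(), TimeSpan.FromMinutes(5));` when creating an endpoint or a channel factory.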

WCF provides a very rich set of standard bindings that you can use for your endpoints. However, we might need to tweak properties that are not exposed on the standard bindings. You can handcraft the whole binding or you can start with a standard binding as a template. Here are two ways to do the latter.

Create the BindingElementCollection from the StandardBinding and update properties on it and use the BindingElementCollection to create the CustomBinding. This creates a full clone of binding elements and you can reset values on the elements using Find<BindingElement> on the collection.

Create the CustomBinding directly from the StandardBinding by passing the binding into the CustomBinding constructor. The first approach has an issue: properties like SendTimeout and ReceiveTimeout don’t get copied onto the CustomBinding, since they are held by the Binding and not by its child elements. For this simple reason I would recommend the second approach.
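The difference between the two approaches can be illustrated with a small console sketch. NetTcpBinding and the five-minute timeout are arbitrary choices for the example, and the `BindingComparison` class is hypothetical.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

static class BindingComparison
{
    static void Main()
    {
        var standard = new NetTcpBinding
        {
            SendTimeout = TimeSpan.FromMinutes(5)
        };

        // Approach 1: build the CustomBinding from a cloned element collection.
        // Binding-level settings such as SendTimeout live on the Binding itself,
        // so, as described above, they are not carried over.
        BindingElementCollection elements = standard.CreateBindingElements();
        var fromElements = new CustomBinding(elements);

        // Approach 2: pass the standard binding to the constructor, which is
        // expected to copy the binding-level timeouts as well.
        var fromBinding = new CustomBinding(standard);

        Console.WriteLine("From elements: " + fromElements.SendTimeout);
        Console.WriteLine("From binding:  " + fromBinding.SendTimeout);
    }
}
```

If the first approach is needed anyway, the timeouts can be assigned on the CustomBinding by hand after it is constructed.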