The interesting part is the second reference (in bold): the signature signs the KeyInfo (#KeyInfo001), which is part of the signature element itself. The regular API for adding a reference to a signature is this:

var reference = new Reference();
reference.Uri = "#KeyInfo001";

However, this will not work here, because it looks for an element with this ID in the signed document (e.g. the soap envelope). The ID is inside the signature element itself, which is not yet part of the document because it has not been created yet.
Frederic found a nice trick: inherit from SignedXml and override the default logic that resolves references, so that it searches in the signature itself (which is the base class).
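
Here is a minimal sketch of how such a subclass might look. It is not necessarily Frederic's exact code. The class name is hypothetical; SignedXml.GetIdElement, the KeyInfo property, KeyInfo.Id and KeyInfo.GetXml() are the real .NET members it relies on.

```csharp
using System.Security.Cryptography.Xml;
using System.Xml;

// Hypothetical class name, for illustration only.
public class SignedXmlWithKeyInfoLookup : SignedXml
{
    public SignedXmlWithKeyInfoLookup(XmlDocument document) : base(document) { }

    public override XmlElement GetIdElement(XmlDocument document, string idValue)
    {
        // First try the default lookup in the signed document.
        XmlElement element = base.GetIdElement(document, idValue);
        if (element != null)
            return element;

        // Fall back to the signature's own KeyInfo when its Id matches.
        if (KeyInfo != null && KeyInfo.Id == idValue)
            return KeyInfo.GetXml();

        return null;
    }
}
```

The class would be used in place of SignedXml, with KeyInfo.Id set to "KeyInfo001" before the reference is added. One caveat I am assuming here: KeyInfo.GetXml() builds a detached element, so its namespace context may differ from the final document, which can matter for canonicalization.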

Saturday, January 29, 2011

One common pattern with web services is the router service. It can hide routing logic from the client or help with load balancing. Wcf 4 ships with libraries and samples that help build such a router very quickly.

Stephen Liedig recently notified me of a problem that occurs when a gSoap client calls a Wcf router. gSoap is a very popular C++ web services stack.

When the router called the service over the http endpoint, everything worked. When it used the tcp endpoint, it could send the message, but the service implementation was never called and the service reported this exception:

The formatter threw an exception while trying to deserialize the message: There was an error while trying to deserialize parameter http://tempuri.org/:user. The InnerException message was 'Element 'http://tempuri.org/:user' contains data of the ':User' data contract. The deserializer has no knowledge of any type that maps to this contract. Add the type corresponding to 'User' to the list of known types - for example, by using the KnownTypeAttribute attribute or by adding it to the list of known types passed to DataContractSerializer.'. Please see InnerException for more details.

Analysis

We need to ask two questions: Why does this error happen? And why doesn't it happen with Http?

First we must take a look at the message that the router sent to the service. Here is how it appears in the Wcf log:

Something is missing: the “p” prefix declaration (xmlns:p="http://myNs/"), which was present in the message from the client to the router, does not appear. This makes the message invalid, as it references the undeclared prefix "p" in the "p:User" derived type attribute. But why did the router send an invalid message when the client sent it a good one? And why not with Http?

Let’s take a look at how Wcf routes messages. We can use Reflector to inspect System.ServiceModel.Routing.SoapProcessingBehavior+SoapProcessingInspector.MarshalMessage():

If the input message (from client) and output message (to server) have the same version, and no security is involved, then take the input as is and send it. This explains why the Http case works.

Otherwise take the input message body and copy it to a new message. Then copy custom user headers.

Since the “p” prefix is not defined under the body but under the root “envelope” element, this prefix is not sent which makes the message invalid. This explains the netTcp bug.
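
The effect can be reproduced outside Wcf. The sketch below is only an analogy using XmlDocument, not the router's actual code path, and the sample envelope is made up for illustration. It serializes the body content on its own and checks whether the "p" declaration survives. Prefixes used in element and attribute names are redeclared by the writer, but "p" appears only inside an attribute value, so nothing redeclares it.

```csharp
using System;
using System.Xml;

public static class PrefixLossDemo
{
    public static void Main()
    {
        // Hypothetical message: the "p" prefix is declared on the Envelope only.
        const string envelope =
            "<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\"" +
            " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"" +
            " xmlns:p=\"http://myNs/\">" +
            "<s:Body><user xmlns=\"http://tempuri.org/\" xsi:type=\"p:User\" /></s:Body>" +
            "</s:Envelope>";

        var document = new XmlDocument();
        document.LoadXml(envelope);

        // Serialize only the Body content, as a body-only copy would.
        XmlNode body = document.DocumentElement.FirstChild;
        string copiedBody = body.InnerXml;

        Console.WriteLine(copiedBody);
        Console.WriteLine("declares p: " + copiedBody.Contains("xmlns:p"));
    }
}
```

I would expect the second line to report that the declaration is missing, which is the same kind of invalid message the service received.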

Why does Wcf behave this way?

Dismissing this behavior as a bug would miss an important discussion. Consider the naïve fix: parse each attribute under the body and search for the something:something pattern. Since the router has no information about the message semantics, it has no way to know whether the pattern refers to a prefix or is just a string that happens to look like one. Moreover, doing this would be a huge performance hit, especially with big messages.

What should be the correct behavior?

In order to take both the functional and performance requirements into account, I would recommend the following approach:

Copy all namespace declarations from the root Envelope and Body elements of the original message into the corresponding elements of the new one (including the default namespace).

Make sure the new definitions do not override prefixes that are already declared on either element.
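
The two steps above can be sketched with LINQ to XML. This is an illustration under stated assumptions, not how the router is implemented: a real fix would work on Wcf Message objects, and the class and method names here are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceCopyDemo
{
    // Copies xmlns declarations from source to target,
    // keeping any prefix that target already declares.
    public static void CopyNamespaceDeclarations(XElement source, XElement target)
    {
        foreach (XAttribute declaration in source.Attributes().Where(a => a.IsNamespaceDeclaration))
        {
            if (target.Attribute(declaration.Name) == null)
                target.SetAttributeValue(declaration.Name, declaration.Value);
        }
    }

    public static void Main()
    {
        XNamespace soap = "http://schemas.xmlsoap.org/soap/envelope/";

        // Hypothetical original message with the extra "p" declaration on the root.
        XElement original = XElement.Parse(
            "<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:p=\"http://myNs/\">" +
            "<s:Body /></s:Envelope>");

        // Hypothetical new message as the router would build it, without "p".
        XElement copy = XElement.Parse(
            "<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\"><s:Body /></s:Envelope>");

        CopyNamespaceDeclarations(original, copy);
        CopyNamespaceDeclarations(original.Element(soap + "Body"), copy.Element(soap + "Body"));

        Console.WriteLine(copy);
    }
}
```

Only the attributes of two elements are inspected, so the cost does not grow with the size of the body, which is the point of the trade-off.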

Yes, I can think of a few edge cases where this scheme fails. But a more comprehensive solution would cost in performance.

Conclusion

As it stands now, the default gSoap client fails to call the default Wcf router when protocol bridging is used, which is a real interoperability problem.

And just to clarify, this web service does not define any derived (= inherited = known) types. For some reason the default gSoap message generation logic emits xsi:type even for base types. This is not forbidden, since it uses the correct type name, although there is no real reason to do it. Either way, it makes this case quite common and not limited to services with derived types.

If the transforms element has children (as in the last snippet before the above code), the Read() call puts the cursor on the first child element (“transform” in singular). The while loop ends at the closing “transforms” element, which lines up with the following ReadEndElement() call.

If there are no children but there is a separate closing element:

<Reference URI="#_0">
  <Transforms>
  </Transforms>
  <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
</Reference>

then Read() puts us on the closing element, and since we do not enter the loop, the ReadEndElement() call is correct again.

In case of an empty element:

<Reference URI="#_0">
  <Transforms />
  <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
</Reference>

the Read() call advances us past the Transforms tag entirely! The next tag is the opening DigestMethod, which of course makes ReadEndElement() fail.

So the case of an empty transforms element is responsible for the interoperability bug.
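
The three cases can be reproduced with a plain XmlReader. The sketch below is not the Wcf source. NaiveRead only mimics the Read() / loop / ReadEndElement() pattern described above, and SafeRead shows one possible guard using IsEmptyElement. All method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class TransformsReaderDemo
{
    // Mimics the described pattern; the reader starts on <Transforms>.
    static void NaiveRead(XmlReader reader)
    {
        reader.Read();                                  // step past the start tag
        while (reader.IsStartElement("Transform"))
            reader.Skip();                              // consume each child
        reader.ReadEndElement();                        // expects </Transforms>
    }

    // Same logic, but treats <Transforms /> as having no end tag to read.
    static void SafeRead(XmlReader reader)
    {
        if (reader.IsEmptyElement)
        {
            reader.Read();
            return;
        }
        NaiveRead(reader);
    }

    // Returns true when the reader ends up on DigestMethod without an exception.
    static bool Try(Action<XmlReader> read, string transforms)
    {
        string xml = "<Reference URI=\"#_0\">" + transforms +
                     "<DigestMethod Algorithm=\"http://www.w3.org/2000/09/xmldsig#sha1\"/></Reference>";
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("Transforms");
            try
            {
                read(reader);
                return reader.IsStartElement("DigestMethod");
            }
            catch (XmlException)
            {
                return false;
            }
        }
    }

    public static void Main()
    {
        string[] cases = { "<Transforms><Transform/></Transforms>", "<Transforms></Transforms>", "<Transforms />" };
        foreach (string transforms in cases)
        {
            Console.WriteLine(transforms + " naive=" + Try(NaiveRead, transforms) +
                              " safe=" + Try(SafeRead, transforms));
        }
    }
}
```

With the empty element, the naive version should be the only combination that fails, since ReadEndElement() finds the DigestMethod start tag instead of an end tag.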

Workaround

If you are in a position to change the output of the Java server then various workarounds may apply. But what if you can’t?

The only real solution seems to be implementing the relevant ws-security portion yourself. Here I show how to do the bare minimum of such a scheme: use the wcf security stack for sending the message, and decrypt the response ourselves. The missing piece is signature validation, which is very important in production, but development can continue without it since we at least get a readable response.

My solution involves a custom message encoder. I used the Wcf SDK message encoder sample as a template. When the response comes back to the encoder, it decrypts it and removes the security from it before passing it to the upstream channel. To make sure the security channel does not fail (due to the now unsecured response), we verify that our binding has EnableUnsecuredResponse set to true.
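
Here is a sketch of one way to set that flag in code, assuming the binding is built or wrapped as a CustomBinding. The helper class and method names are hypothetical; SecurityBindingElement.EnableUnsecuredResponse is the real property, available from .NET 4.

```csharp
using System.ServiceModel.Channels;

public static class BindingHelper
{
    // Wraps the given binding and turns on EnableUnsecuredResponse
    // on its security binding element, if it has one.
    public static Binding WithUnsecuredResponse(Binding original)
    {
        var custom = new CustomBinding(original);
        SecurityBindingElement security = custom.Elements.Find<SecurityBindingElement>();
        if (security != null)
            security.EnableUnsecuredResponse = true;
        return custom;
    }
}
```

The same setting can also be made in configuration on the security element of a custom binding.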

So we find the ciphered portion of the message and decrypt it. We then use it to construct a new unsecured message which we will pass along in the pipeline. Here this is done in a dirty way which hard codes the soap11 envelope.

We cannot decrypt it since it is encrypted with the server public key and we do not have the private one. However, the security channel must keep a reference to this key somewhere so that it can decrypt the response (which it would do anyway, were it not for the empty transforms bug). We need some reflection tricks to get the session key: