When importing a node from another context, XmlNode and friends resolve it against all namespace declarations in scope. So, when importing such a header, we shouldn't get a duplicate namespace declaration.

The problem is that we don't actually get a duplicate namespace declaration: XmlSerializer inserts a normal XML attribute into the Header element, which is why it looks like another namespace declaration. It's not a declaration but a plain old attribute. It's even visible (in this case in XmlElement.Attributes), and it definitely shouldn't be there.

So if you hit this special case, remove all attributes before importing the node into your core document.
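A minimal sketch of the workaround, assuming header is the XmlElement produced by XmlSerializer and coreDocument is the target document:

using System.Xml;

class HeaderImport
{
    public static XmlNode ImportHeader(XmlDocument coreDocument, XmlElement header)
    {
        // Strip the stray attribute(s) XmlSerializer injected, then import.
        header.RemoveAllAttributes();
        return coreDocument.ImportNode(header, true);
    }
}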

Since we are serializing against a specific XML schema (XSD), we have the option of schema compilation:

xsd /c President.xsd

This, obviously, yields a programmatic type-system result in the form of a C# class. All well and good.

Now.

If we serialize the populated class instance back to XML, we get a valid XML instance - valid against President.xsd.

There is a case where your schema changes ever so slightly - read: the namespaces change - and you don't want to recompile the entire solution to support this, but you still want to use XML serialization. Who doesn't? So what do you do?

This works even if you only have a compiled version of your object graph and no sources. The System.Xml.Serialization.XmlAttributeOverrides class allows you to adorn any XML-serializable class with your own XML syntax - element names, attribute names, namespaces and types.

Remember - you can override them all and still serialize your angle brackets.
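A hedged sketch of the technique - overriding the root element namespace of an already compiled type without touching its source. The President class shape and the target namespace are illustrative, not the actual types from the post:

using System;
using System.Xml.Serialization;

public class President
{
    public string Name;
}

class OverrideDemo
{
    static void Main()
    {
        // Redefine how the root element of President is serialized,
        // without recompiling the President class itself.
        XmlAttributes attributes = new XmlAttributes();
        attributes.XmlRoot = new XmlRootAttribute("President");
        attributes.XmlRoot.Namespace = "urn:example:president:v2";

        XmlAttributeOverrides overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(President), attributes);

        President president = new President();
        president.Name = "Sample";

        XmlSerializer serializer = new XmlSerializer(typeof(President), overrides);
        serializer.Serialize(Console.Out, president);
    }
}

Overrides can also be added per member via XmlAttributeOverrides.Add(type, memberName, attributes), so element and attribute names deeper in the graph can be renamed the same way.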

This document represents data and an enveloped digital signature over the complete XML document. The completeness of the digital signature is defined in the Reference element, which has its URI attribute set to the empty string (Reference URI="").

Checking the Signature

The following should always be applied during signature validation:

1. Validating the digital signature

2. Validating the certificate(s) used to create the signature

3. Validating the certificate(s) chain(s)

Note: In most situations this is the optimal validation sequence. Why? Signatures are broken far more frequently than certificates are revoked/expired. And certificates are revoked/expired far more frequently than their chains.
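A minimal sketch of that 1-2-3 sequence, assuming the signed document is already loaded into an XmlDocument and the signing certificate has been extracted from KeyInfo; SignedXml covers step 1 and X509Chain covers steps 2 and 3:

using System.Security.Cryptography.X509Certificates;
using System.Security.Cryptography.Xml;
using System.Xml;

public class SignatureChecker
{
    public static bool Validate(XmlDocument signedDocument, X509Certificate2 signingCertificate)
    {
        // 1. Validate the digital signature itself.
        SignedXml signedXml = new SignedXml(signedDocument);
        XmlNodeList signatures = signedDocument.GetElementsByTagName("Signature", SignedXml.XmlDsigNamespaceUrl);
        if (signatures.Count == 0)
            return false;
        signedXml.LoadXml((XmlElement)signatures[0]);
        if (!signedXml.CheckSignature(signingCertificate, true))
            return false;

        // 2. + 3. Validate the certificate and its chain (expiry and revocation).
        X509Chain chain = new X509Chain();
        chain.ChainPolicy.RevocationMode = X509RevocationMode.Online;
        return chain.Build(signingCertificate);
    }
}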

[3] There can be multiple X509Certificate elements qualified with the http://www.w3.org/2000/09/xmldsig# namespace in there. The XML Digital Signature specification allows serializing the complete certificate chain of the certificate used to sign the document. Normally, the signing certificate should be serialized first.

This normally means that either the certificate is not valid (CRLed or expired) [4], or one of the certificates in the chain is invalid or expired.

[4] The premise is that one checked the signature according to the 1-2-3 sequence described above.

The Question

Is a digital signature valid even if the CA revoked the certificate after the signature was made? Is it valid even after the certificate expires? If the signature is valid and the certificate has been revoked, what is the legal validity of the signature?

In legal terms, the signature would be invalid under both of the assertions above, 1 and 2.

This means that once the generator of the signature is dead, or one of its predecessors is dead, all of its children die too.

Timestamps to the Rescue

According to most countries' digital signature laws, the signature is valid only during the validity of the signing certificate and of the signing certificate's chain, both checked for revocation and expiry ... if you don't timestamp it.

If the source document has another signature from a trusted authority, and that authority is a timestamp authority, it would look like this:

The second signature would be performed by an out-of-band authority, normally a TSA authority. It would only sign a hash value (in this case a SHA1 hash), constructed by hashing the original document and the included digital signature.

This (second) signature should be checked using the same 1, 2, 3 steps. For the purpose of this thought experiment, let's say the check produces a booTimestampValid boolean.

In this case, even though the signature's certificate (or its chain) is invalid, the signature would pass legal validity if the timestamp's signature is valid, together with its certificate and certificate chain. Note that the TSA signature is generated with a different set of keys than the original digital signature.
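Putting the rule together, a minimal sketch of the combined check, with booleans standing in for the outcomes of the 1-2-3 checks above (names are illustrative; booTimestampValid follows the post):

class LegalValidity
{
    // The signature must always verify cryptographically; a valid timestamp
    // (the TSA signature plus its certificate and chain) stands in for an
    // expired or revoked signing certificate chain.
    public static bool IsLegallyValid(bool booSignatureValid, bool booCertAndChainValid, bool booTimestampValid)
    {
        return booSignatureValid && (booCertAndChainValid || booTimestampValid);
    }
}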

This week XML is ten years old. The core XML 1.0 specification was released in February 1998.

It's a nice anniversary to have.

The XML + Namespaces specification has a built-in namespace declaration of http://www.w3.org/XML/1998/namespace. That's an implicit namespace declaration, a special one, governing all others. One namespace declaration to rule them all. Bound to the xml: prefix.

XML was born and published as a W3C Recommendation on the 10th of February 1998.

This contract defines a simple method, called Process, which processes the input document. The idea is to define the document schema and validate inbound XML documents, throwing exceptions on validation errors. The processing semantics are arbitrary and can support any kind of action, depending on the defined invoke document schema.

A simple instance document which validates against a version 1.0 processing schema could look like this:

Note that the default XML namespace changed, but that is not mandatory. It only allows you to automate schema retrieval using the schema repository (think System.Xml.Schema.XmlSchemaSet), load all supported schemas and validate automatically.

The main benefit of this approach is decoupling the parameter model and method processing version from the communication contract. A service maintainer has an option to change the terms of processing over time, while supporting older version-aware document instances.

This notion is of course most beneficial in situations where your processing syntax changes frequently and has complex validation schemas. The simple case presented here is for illustration only.

So, how do we validate?

We need to check the instance document version first. This is especially true in cases where the document is not qualified with a different namespace when the version changes.

We grab the appropriate schema or schema set

We validate the inbound XML document, throw a typed XmlInvalidException if invalid

We process the call

The service side is quite straightforward.
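A hedged sketch of those steps on the service side; XmlInvalidException is the typed exception mentioned above, and ResolveSchemaSet stands in for however the schema repository lookup is implemented:

using System;
using System.Xml;
using System.Xml.Schema;

public class XmlInvalidException : Exception
{
    public XmlInvalidException(string message) : base(message) { }
}

public class ProcessService
{
    public void Process(XmlDocument invokeDocument)
    {
        // 1. Check the instance document version (here, simply by root namespace).
        string version = invokeDocument.DocumentElement.NamespaceURI;

        // 2. Grab the appropriate schema or schema set for that version.
        XmlSchemaSet schemas = ResolveSchemaSet(version);

        // 3. Validate the inbound document; throw a typed exception if invalid.
        invokeDocument.Schemas = schemas;
        invokeDocument.Validate(delegate(object sender, ValidationEventArgs e)
        {
            throw new XmlInvalidException(e.Message);
        });

        // 4. Process the call.
    }

    private XmlSchemaSet ResolveSchemaSet(string targetNamespace)
    {
        // Placeholder: load the schema(s) for the given namespace from your
        // schema repository, e.g. schemas.Add(targetNamespace, "process-v1.xsd").
        XmlSchemaSet schemas = new XmlSchemaSet();
        return schemas;
    }
}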

Let's look at the client and the options for painless generation of service calls using this mechanism.

Generally, one can always produce an instance invoke document by hand on the client. By hand, meaning with System.Xml classes and DOM concepts. Since this is highly error-prone and gets tedious with increasing complexity, there is the notion of a schema compiler, which automatically translates your XML Schema into the CLR type system. Xsd.exe and XmlSerializer are your friends.

If your schema requires parts of the instance document to be digitally signed or encrypted, you will need to adorn the serializer output with some manual DOM work. This might also be a reason to use the third option.

The third, and easiest, option for the general developer is to provide a local object model which serializes the requests on the client.
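A hedged sketch of such a local object model; the type, member names and namespace are illustrative, not the contract from the post:

using System.IO;
using System.Xml.Serialization;

[XmlRoot("Process", Namespace = "urn:example:processing:v1")]
public class ProcessRequest
{
    public string DocumentId;
    public byte[] Payload;

    // Serialize this request into the XML invoke document the service expects.
    public string ToInvokeDocument()
    {
        XmlSerializer serializer = new XmlSerializer(typeof(ProcessRequest));
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, this);
        return writer.ToString();
    }
}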

The main benefit of this approach comes down to having an option on the server and the client. Client developers have three different levels of complexity for generating service calls. The model allows them to be as close to the wire as they see fit. Or they can be abstracted completely from the wire representation if you provide a local object model to access your services.

This schema describes a problem, which is defined by a name (typed as string), severity (typed as integer), definition (typed as byte array) and description (typed as string). The schema also says that the definition of a problem has an Id attribute, which we will use when digitally signing a specific problem definition. This Id attribute is defined as a GUID, via the simple type GUIDType.

Instance documents validating against this schema would look like this:

Only a few of you out there are still generating XML documents by hand, since there exists a notion of schema compilers. In the .NET Framework world, there is xsd.exe, which bridges the gap between the XML type system and the CLR type system.

xsd.exe /c problem.xsd

The tool compiles the problem.xsd schema into the CLR type system. This allows you to use the in-schema-defined classes and serialize them later on with the XmlSerializer class. The serialization program for the second instance document (exhibit A) would look roughly like the sketch below.
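The original listing isn't reproduced here; this is a self-contained sketch standing in for the xsd.exe output and the serialization program. The class shape and the urn:example:problem namespace are assumptions, not the actual generated code. Serializing with a bare XmlSerializer like this emits the extra xsi and xsd prefix declarations discussed next:

using System;
using System.Xml.Serialization;

[XmlRoot("Problem", Namespace = "urn:example:problem")]
public class Problem
{
    public string Name;
    public int Severity;
    public ProblemDefinition Definition;
    public string Description;
}

public class ProblemDefinition
{
    [XmlAttribute]
    public string Id;                                // GUIDType in the schema

    [XmlText(DataType = "base64Binary")]
    public byte[] Value;
}

class SerializeProblem
{
    static void Main()
    {
        Problem problem = new Problem();
        problem.Name = "Sample problem";
        problem.Severity = 2;
        problem.Description = "Illustrative instance only";
        problem.Definition = new ProblemDefinition();
        problem.Definition.Id = Guid.NewGuid().ToString();
        problem.Definition.Value = new byte[] { 1, 2, 3 };

        // A bare XmlSerializer emits the xsi and xsd prefix declarations.
        XmlSerializer serializer = new XmlSerializer(typeof(Problem));
        serializer.Serialize(Console.Out, problem);
    }
}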

If you look closely, you will notice two additional prefixed namespace declarations in exhibit B, bound to the xsi and xsd prefixes, compared to exhibit A.

The fact is that both documents (exhibit B and exhibit A) are valid against the problem.xsd schema.

<theory>

Prefixed namespaces are part of the XML Infoset. All XML processing is done at the XML Infoset level. Since exhibit B only adds declarations (look at the xsi and xsd prefixes) without actually using them, the document itself is not semantically different from exhibit A. That stated, the instance documents are equivalent and should validate against the same schema.

</theory>

What happens if we sign the Definition element of exhibit B (XmlSerializer generated, prefixed namespaces present)?

This document is the same as exhibit B, but has the Definition element digitally signed. Note the /Problem/Signature/SignedInfo/Reference[@URI] value. The digital signature is performed only over the Definition element, not the complete document.

Now, if one were to verify the signature over the same document without the prefixed namespace declarations, as in:

<theory>

As said earlier, all XML processing is done at the XML Infoset level. Since ambient prefixed namespace declarations are visible in all child elements of the declaring element, exhibits C and D are different. Explicitly, the element context of the Definition element differs, since exhibit C does not have the ambient declarations present and exhibit D does. The signature verification therefore fails.

</theory>

Solution?

Much simpler than what's written above. Force the XmlSerializer class to serialize what should have been serialized in the first place. We need to declare the namespace of the serialized document and prevent XmlSerializer from being too smart. The .NET Framework serialization mechanism contains an XmlSerializerNamespaces class which can be specified during the serialization process.

Since we know the only (and, by the way, default) namespace of the serialized document, this makes things work out OK; a sketch follows.
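The original snippet isn't reproduced here; this is a minimal sketch of the fix, reusing the Problem type and the assumed urn:example:problem namespace from the earlier sketch:

using System;
using System.Xml.Serialization;

class SerializeProblemClean
{
    static void Main()
    {
        Problem problem = new Problem();
        problem.Name = "Sample problem";
        problem.Definition = new ProblemDefinition();
        problem.Definition.Id = Guid.NewGuid().ToString();

        // Declare the document's default namespace up front so XmlSerializer
        // does not add the xsi and xsd prefix declarations on its own.
        XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
        namespaces.Add(string.Empty, "urn:example:problem");

        XmlSerializer serializer = new XmlSerializer(typeof(Problem));
        serializer.Serialize(Console.Out, problem, namespaces);
    }
}

With the default namespace declared up front, the output matches exhibit A, and the signature over the Definition element survives.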

It's a .NET Framework 2.0 application which can be used as a simple raw XML editor. It's got XSL support, XML diffing, XML Schema validation, entity name intellisense, and, as the name suggests, it's as simple as notepad.exe. Superb performance on large documents, too.

Great. Tune it up, change the icons and layout then ship it with Vista, I say.

I find it quite attractive, since nowadays I don't spend as much time looking at angle brackets anymore.

A fellow MVP, Daniel Cazzulino, has a post titled AJAX may be the biggest waste of time for the web. While I agree with most of the points there, one should think about what Microsoft is doing to lower the barrier to entry for AJAX development.

Having to deal with JavaScript, raw (D)HTML and XML is definitely not going to scale from the developer-penetration perspective. Nobody wants to do this in 2006. Therefore, if the Atlas guys make their magic happen, this would actually not be necessary. If they achieve what they started, one would be abstracted from client-side programming in most situations.

<atlas:UpdatePanel/> and <atlas:ScriptManager/> are your friends. And they could go a long way.

If this actually happens, then we are really discussing whether rich web-based apps are more appropriate for the future web. There are scenarios that benefit from all these technologies, obviously. And if the industry concludes that DHTML with XmlHttpRequests is not powerful enough, who would stop the same model from emitting rich WPF/E code out of an Atlas-enabled app?

We have, for the most part, been able to abstract away the plumbing going on behind the scenes. If it's server-side generated code that should run on the client, and if that code is JavaScript because all browsers run it, so be it.

We have swallowed the pill on the SOAP stacks already. We don't care if the communication starts with SCT request/response messages, followed by the key exchange. We do not care that a simple request-response model produces 15 messages while starting up. We do not care that there is raw XML being transferred. After all, it is all a fog, doing what it is supposed to do best - hiding the details behind our beautiful SOAP/Services stack API.

Does Matevz really believe that the lack of a Microsoft editor on the XQuery spec is the reason it's taken so long?

No (that's why there's a 'maybe' and 'helps' in there). But it doesn't help either. From my point of view there are three real reasons why XQuery has been limping along for more than 6 years (1998 workshop, 1999 working group formed):

Competitive corporate agendas

Becoming tightly coupled with other XML specs

Ambitious spec in the first place

In that order. Microsoft's reasons right now are completely transparent. They would be more than thankful if the spec reached Recommendation status. Including partial support in SQL Server 2005 is a bit of a gamble with development dollars. But holding it back, on the contrary, can backfire too.

Going back to my statement:

I'm wondering why the XQuery spec isn't moving anywhere. Maybe the lack of Microsoft editor in the editor list helps ignoring the importance of this technology in the soon-to-be-released products. Current editors don't seem to be bothered with the decisions Microsoft has to take. I'm sure though, that Jonathan Robie (DataDirect Technologies) is pushing hard on Microsoft's behalf.

From Jonathan's response, I gather he doesn't agree with the editor part, nor with the suggestion that he is pushing on Microsoft's behalf.

From my perception, major mainstream platform support for XQuery would do well both for the vendors and for XQuery in general. It's been cooking for so long that it needs solid support before it becomes overcooked, like XML Schema. And yes, I agree that there are some wonderful implementations out in the wild already. Developer penetration is what this technology still has to achieve.

I'm sure, Jonathan, that Paul Cotton & Co. would be more than willing to wrap up if things aligned. Looking forward to your viewpoint on why it's taking so long; the last one I found is already a bit stale.

I just received an email from Stylus Studio creators asking me to sign a petition on the lack of XQuery support in .NET Framework 2.0.

I'm sorry. I cannot do that. It's just the rule I have.

Implementing a working-draft version of an XML-based technology in a widespread product like .NET Framework 2.0 is just out of the question. It has been done before with the XSL implementation in IE5, which then split into XSLT and XSL-FO, causing havoc for Microsoft.

On the other hand, implementing a stable subset of XQuery in SQL Server 2005 is another thing. While I don't necessarily agree with the necessity, I do agree that SQL 2005 and .NET Framework are two completely different beasts having different life cycle characteristics and flop-survival methods.

I'm wondering why the XQuery spec isn't moving anywhere. Maybe the lack of Microsoft editor in the editor list, helps ignoring the importance of this technology in the soon-to-be-released products. Current editors don't seem to be bothered with the decisions Microsoft has to take. I'm sure though, that Jonathan Robie (DataDirect Technologies) is pushing hard on Microsoft's behalf.

The problem lies in number 1. XML serialization stacks produce a nasty angle-bracket wire format which needs to be parsed into the XML Infoset before it can be exposed by any programmatic XML technology - DOM, SAX or what have you. On the way out, the same work gets done in the opposite direction.

The question remains whether the XML industry has reached a sweet spot in the (non)complexity of the serialization syntax to allow fast processing in the future. It is my belief that we will not see wide adoption of any binary XML serialization format, like XOP or BinaryXML, outside the MTOM area, which pushes XOP into SOAP. That stated, one should recognize the significance of the main vendors failing to reach an agreement for quite some time. Even if they do reach one some time in the future, the processing-time gap will long be gone, squashed by Moore's law. This will essentially kill the push behind binary serialization advantages outside the transport mechanisms (read: SOAP). Actually, a 33% penalty on base64-encoded data is not something the industry really needs to be concerned about.

There are numerous limiting factors in designing an interoperable serialization syntax for binary XML. It all comes down to optimization space. What do we want to optimize? Parsing speed? Transfer speed? Wire size? Generation speed? Even though those seem related, it turns out they often pull against each other. You cannot optimize for generation speed and expect a small wire size.

We will, instead, see a lot more XML Infoset binary representations that are vendor-centric and only compatible in intra-vendor-technology scenarios. Microsoft's Indigo is one such technology, which will allow proprietary binary XML encoding (see the System.ServiceModel.Channels.BinaryMessageEncoderFactory class) for all SOAP envelopes traveling between Indigo endpoints, whether on the same or on different machines.

If this thing continues, and adds another stupidity on top of a base stack, we'll be back in the 70s.

Processing power and network throughput will handle the load of cross boundary XML being serialized as XML 1.0 + Namespaces. We do not need XML 1.1, which is a flop anyway, and for sure, we don't need another Infoset.

Let the major vendors deliver binary Infoset for intra-firewall scenarios. Every other form of communication mechanism should use the d*mn angle brackets, if it chooses the XML dialect for the payload.

All you need to do is replace one line in SubscriptionManagerFactory.cs:

return new XmlSubscriptionManager() as ISubscriptionManager;

With:

return new SqlSubscriptionManager() as ISubscriptionManager;

or

return new MemorySubscriptionManager() as ISubscriptionManager;

Since some members of the workspace are already working on configuration application block integration, all config data should go in there someday.

My implementation now uses SQL Server as the subscription storage for durable WS-Eventing subscriptions. A System.Collections.Hashtable is used in the memory-based persistence model. Complete support includes:

I especially like the availability of arbitrary positioning of XPathNavigator to only serialize the bits you are interested in.
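To make that concrete, here is a minimal sketch of the idea outside any web service plumbing - position an XPathNavigator on the node you care about and write out just that subtree (the file name and XPath expression are illustrative):

using System;
using System.Xml;
using System.Xml.XPath;

class NavigatorSlice
{
    static void Main()
    {
        XPathDocument document = new XPathDocument("orders.xml");
        XPathNavigator navigator = document.CreateNavigator();

        // Serialize only the first matching fragment, not the whole document.
        XPathNavigator fragment = navigator.SelectSingleNode("/Orders/Order[1]");
        if (fragment != null)
        {
            XmlWriter writer = XmlWriter.Create(Console.Out);
            fragment.WriteSubtree(writer);
            writer.Flush();
        }
    }
}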

The only limitation of this solution is that it does not ship with FX 1.0/1.1 and you have to be a master in XML to fully grok it. But hey, if you don't, you can still use XmlDocument as a return type. :)

There are special cases when one would like to bypass the other approach (passing XML as XmlDocument) on the server side. If you have all the data ready and want to pass it as quickly as humanly possible, without rehydrating a full-blown DOM-capable object, you would use System.String (xsd:string in the XML world) and System.Text.StringBuilder to concatenate it.
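A hedged sketch of that string-building path; element names are illustrative, and real code must also escape any markup in the values:

using System.Text;

class RawXmlBuilder
{
    public static string BuildOrderXml(string orderId, string customer)
    {
        // Concatenate the already-available data and return it as xsd:string,
        // skipping the DOM entirely. Values must be XML-escaped in real code.
        StringBuilder xml = new StringBuilder();
        xml.Append("<Order id=\"").Append(orderId).Append("\">");
        xml.Append("<Customer>").Append(customer).Append("</Customer>");
        xml.Append("</Order>");
        return xml.ToString();
    }
}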

If you don't know what to choose I propose this:

It is the year 2004, so platform and tool support is mature enough that XML processing is not a limitation on the XSD type system -> platform type system conversion side. Therefore, choose XmlDocument.

Choose XmlDocument.

Choose the string way if and only if you are expecting clients which have no other way to bridge/decouple the raw SOAP XML string into something programmatic inside your platform.

I wrote about a bug in the validation engine of .NET Framework 1.0/1.1 a couple of weeks ago. There were a lot of posts/discussions/emails about this issue later on.

As Dare, Web Data Team PM, points out, it turns out that this anomaly manifests itself through System.Xml.XmlValidatingReader because the System.Uri class has a problem. And System.Uri has a problem because RFC 2396 does not allow empty values in the BNF notation of a URI.

So, what I propose is this: if you end up in a similar situation to the one we hit in a production environment, want to validate XML instances or XML digital signatures (which are likely prone to this problem too, depending on the generation engine), and the current Whidbey release is not your cup of tea, THEN CHANGE THE SPECIFICATION/SCHEMA.

Simply replace xsd:anyURI with xsd:string. It will help. :)

I know this is architecturally a bad idea. But there is no other way to get around this bug until Whidbey ships (unless you want to change platforms).

I'm glad that usability is driving the ambiguity choice in this case. I'm glad that decision has been made to support empty strings in System.Uri even though the spec is not clear. Some things are just more natural than others.

There is currently no workaround for .NET FX 1.0/1.1. Actually Whidbey is the only patch that fixes this. :)

The problem is even more troublesome when one does not have direct control over instance document syntax/serialization - for example, in the case of XML auto-generated by Microsoft Office InfoPath during digital signature insertion. The attribute /Signature/SignedInfo/Reference/@URI is (according to the XML Signature schema) typed as xs:anyURI.

Ultimately, I think the question is really a distraction. One of the great strengths of XML is that the instances exist independently not only from individual schema definitions, but also independently from the schema language of the day. [From: Don Box]

True indeed. If the industry shifted to RelaxNG (very unlikely), instance documents would survive nicely. As long as there is a schema that describes an instance document, everything is fine. When that connection is lost somehow, we can't talk about instances any more.

XML without a defined schema is no better than CSV. It's not even easier to parse.

We'll see how this works out. I was at PDC, saw Doug's talk, and I agree that this allows a schema to be versioned over time. What bothers me is the structural extension of the schema itself, just to support versioning.

And yes, I know this is the only way, since W3C didn't pay attention to versioning in the first place. It still bothers me, since I like my content models clean.

Can't get one of my solutions to work on a Windows Server 2003 based server. The client works fine, but server-side X509-based decryption fails with an error that should not happen (Cannot find the certificate and private key for decryption).

Everything is installed and correctly set up. Even permissions. :)

Since even the official Microsoft newsgroup didn't help, I'm really stuck. The funny thing is that if I disallow access to the private key and/or remove the certificate, the error message changes, giving me a clue that WSE does look at the cert, just unsuccessfully.

It's not that I'm opposed to changing beta-time namespaces, but all my documents saved as XML in Office 2003 Beta 2 won't open properly in Office 2003 RTM. I have to change them by hand.

Another thing that raises a question: what if Microsoft releases two Word versions within a year that need different namespaces? That has not happened yet, but this kind of namespace naming convention is not as flexible as a standard year/month, W3C-like one.

This just came through: Finally, Whidbey will provide increased power for performing common tasks involving the manipulation of XML. In addition to delivering increased performance when accessing and managing XML, Whidbey will include support for XML-based data processing using XML Query Language (XQuery).

It just seems strange to me, since XQuery is still in working draft status and will probably stay that way for quite some time.

One can also create an XmlNode instead of XPathNavigator, but I prefer the XPath data model.

This seems a much more scalable solution than using "FOR XML RAW/AUTO/EXPLICIT" and populating an XmlReader with SqlCommand.ExecuteXmlReader. "FOR XML RAW/AUTO/EXPLICIT" is slow and requires an XML serialization/deserialization pair.

There are two methods. One returns an RSS URL based on a UDDI service key; the other returns an RSS URL based on a UDDI service name. If multiple versions of RSS feeds are found, the service looks for an RSS 2.0 feed first, then RSS 1.0, then RSS 0.9x. It returns '-1' if no feed is found.

What seemed to be no obstacle at all is turning out to complicate my architectural designs lately.

Microsoft Message Queuing (MSMQ) has this strange limitation (at least for year 2003) which prevents you from having messages longer than 4MB. Since most .NET architects are dumping object instances into MSMQ, which get serialized into XML, we all have a problem with binary data. The problem lies in binary XML serialization - XML Schema and its base64Binary datatype, which is used in the encoding. We do not get a 4MB but a ~3MB message content limitation, due to the well-known 1.333 factor of base64 encoding.
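A back-of-the-envelope check of that limit (base64 turns every 3 bytes into 4 characters):

class MsmqPayloadLimit
{
    static void Main()
    {
        const long messageLimit = 4L * 1024 * 1024;   // MSMQ 4MB message cap
        long binaryLimit = messageLimit / 4 * 3;      // strip the base64 overhead
        System.Console.WriteLine(binaryLimit);        // 3,145,728 bytes, i.e. ~3MB
    }
}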

Architectural design is now vastly different, since I have to split the binary documents while allowing them to be linked appropriately with their parent messages. And since I'm building a document management system which will push .doc, .xls and friends onto an MSMQ stack, 4MB is often not enough.

Sure enough, it was its 8th reincarnation. It finished yesterday and I gave a talk on Wednesday. Talked about the XML versus CLR type system, dived into XML Schema specifics and compared early programmatic type systems with modern ones, including the JVM and CLR.

Later on, I joined in and answered questions at an e-business-related roundtable. The conference room was half full (~100 folks), which wasn't that bad. See you next year, Maribor guys...