Hello,
I realize that we are past the official last call period for infoset, but as
the document has still not proceeded beyond working draft and as I am now
responsible for trying to produce a canonicalization specification that, at
a minimum, can be used by the digital signature working group to advance the
XML Digital Signature specification, I would like to request consideration
of a modification.
I would like to request that you add to the element and attribute properties
the namespace prefix that appeared in the input document. It is erroneous
to assert that namespace prefixes do not carry information value within an
XML document in light of the W3C's XPath recommendation. This is because
XPaths can A) appear in attribute values and element character content and
B) refer to namespace prefixes. I have posted a detailed discussion of this
topic, including two theorems which show that namespace rewriting harms
documents containing such XPaths [1]. As a corollary to those theorems,
namespace prefixes carry information value, so the information set is not
completely represented.
[1]
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000AprJun/0159.html
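To make the point concrete, here is a minimal sketch in Python. The sample document, the "select" attribute, and the helper names are all hypothetical and chosen only for illustration; they are not taken from any specification. The sketch renames a prefix in the declaration and element tags, as a namespace-rewriting step might, and then tries to evaluate the XPath stored in the attribute value against the prefixes in scope.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical document: the "select" attribute holds an XPath that
# refers to the namespace prefix "a".
ORIGINAL = (
    '<doc xmlns:a="http://example.com/ns">'
    '<a:item>42</a:item>'
    '<ref select="a:item"/>'
    '</doc>'
)


def rewrite_prefix(xml_text, old, new):
    """Rename a prefix in its declaration and in element tags only.

    Hypothetical stand-in for namespace rewriting: attribute values and
    character content are left untouched because the rewriter cannot
    know that they contain XPaths.
    """
    text = xml_text.replace(f'xmlns:{old}=', f'xmlns:{new}=')
    return re.sub(rf'(</?){old}:', rf'\g<1>{new}:', text)


def declared_prefixes(xml_text):
    """Collect the prefix -> URI bindings declared in the document."""
    source = io.BytesIO(xml_text.encode('utf-8'))
    return {prefix: uri
            for _, (prefix, uri) in ET.iterparse(source, events=('start-ns',))}


def evaluate_embedded_xpath(xml_text):
    """Evaluate the XPath held in <ref select="..."> against the document.

    Returns the text of the matched elements, or None when the
    expression uses a prefix that the document no longer declares.
    """
    root = ET.fromstring(xml_text)
    expression = root.find('ref').get('select')
    try:
        return [e.text for e in root.findall(expression, declared_prefixes(xml_text))]
    except SyntaxError:
        # ElementTree reports an unknown prefix in a path as SyntaxError.
        return None


before = evaluate_embedded_xpath(ORIGINAL)
after = evaluate_embedded_xpath(rewrite_prefix(ORIGINAL, 'a', 'n0'))
```

Under these assumptions, the embedded expression selects the element in the original document but can no longer be resolved once the prefix has been renamed, even though the element names still map to the same namespace URI.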
I need this property added because I plan to remove namespace rewriting from
canonicalization (precisely because it can change the logical meaning of a
document, which contradicts the meaning of canonicalization). Instead, c14n
will write the prefix occurring in the original document, so I cannot
continue to refer to the XML InfoSet document unless it is changed to
include this information. The alternative is to rewrite the
specification as an application of XPath, which allows the application to
decide whether the reported QName includes the original prefix. Note that
this may occur anyway because I think it is a suitable way to deal with
canonicalizing partial XML documents.
By the way, I realize that writing the original namespace prefix means that
XML documents that are logically equivalent but differ in the particular
namespace prefixes used will not be considered logically equivalent by
simple byte comparison of their canonical forms. However, as explained in
[1], namespace rewriting does not fix this problem. Moreover, the next
draft of c14n will argue that simple byte comparison for logical
equivalence is not a realistic goal. Aside from
application-specific knowledge of insignificant whitespace and
application-specific equivalencies, there are other equivalencies that can
result from the rules of RDF, the meaning of XPath expressions, character
models, XML Schemas, and other specifications that have not even been
written yet.
In order to have a c14n spec in time for DSig to use it, we need to focus on
canonicalizing XML 1.0 because the logical equivalencies it allows represent
the largest set of changes that existing applications are likely to make
with the expectation of impunity. Going forward, we need to impose stricter
guidelines on applications to ensure that information in the document is
preserved as it was given. For example, character content of "0.10" should
not be changed to "0.1" by the argument that the numbers are equivalent. As
far as XML is concerned, they aren't equal. They are only equal due to some
rules beyond those given in XML 1.0.
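As a small illustration of the "0.10" example, the Python sketch below compares the two strings as numbers and as bytes. The variable names are hypothetical, and SHA-1 is used only as a convenient stand-in for whatever digest a signature would apply.

```python
import hashlib
from decimal import Decimal

original = "0.10"    # character content as given in the document
rewritten = "0.1"    # the same number after an application "normalizes" it

# Equal only under numeric rules that lie outside XML 1.0.
numerically_equal = Decimal(original) == Decimal(rewritten)

# Different character data, hence different bytes and different digests.
bytes_equal = original.encode("utf-8") == rewritten.encode("utf-8")
digest_original = hashlib.sha1(original.encode("utf-8")).hexdigest()
digest_rewritten = hashlib.sha1(rewritten.encode("utf-8")).hexdigest()
```

The two values compare equal as decimals, but their bytes and digests differ, so a signature computed over the original content would not verify against the rewritten content.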
As for namespaces, the c14n spec will recognize their existence, but it will
not try to formally canonicalize them because this cannot be done without
changes to either the Namespace recommendation or the XML 1.0
recommendation.
John Boyer
Software Development Manager
PureEdge Solutions Inc. (formerly UWI.Com)
Creating Binding E-Commerce
jboyer@PureEdge.com