For the three XML documents below (Examples 1, 2 and 3), here is what I think each should canonicalize to with prefixRewrite according to the current spec: 1a, 2a and 3a. I have also given an alternative for each: 1b, 2b and 3b.
Example 1: with namespace redefinition
<a:foo xmlns:a="http://a1">
<a:bar xmlns:a="http://a2"/>
</a:foo>
In the above example, prefix "a" is first bound to namespace http://a1 and then rebound to http://a2.
With prefix rewriting, should the two bindings of "a" be rewritten to two different prefixes, ns0 and ns1? Or should there be a single prefix ns0 that is redeclared? I.e. should it be
1a)
<ns0:foo xmlns:ns0="http://a1" >
<ns1:bar xmlns:ns1="http://a2"/>
</ns0:foo>
Or should it be
1b)
<ns0:foo xmlns:ns0="http://a1" >
<ns0:bar xmlns:ns0="http://a2"/>
</ns0:foo>
I think the spec says it should be 1a).
Example 2: with two prefixes bound to the same namespace
<a:foo xmlns:a="http://a1">
<b:bar xmlns:b="http://a1"/>
</a:foo>
In this example both "a" and "b" are bound to the same namespace, http://a1.
So should they be rewritten to two different prefixes or to the same prefix?
I.e. should it be canonicalized to
2a)
<ns0:foo xmlns:ns0="http://a1">
<ns1:bar xmlns:ns1="http://a1"/>
</ns0:foo>
Or should it be
2b)
<ns0:foo xmlns:ns0="http://a1">
<ns0:bar/>
</ns0:foo>
Again, I think the spec says it should be 2a). But maybe 2b) makes more sense, i.e. should we say that each URI is mapped to exactly one prefix?
Example 3: with prefixes being pushed down
<a:foo xmlns:a="http://a1" xmlns:b="http://a2" >
<b:bar/>
<b:bar/>
</a:foo>
Should each use of "b" get mapped to a different prefix? I.e. should it be
3a)
<ns0:foo xmlns:ns0="http://a1" >
<ns1:bar xmlns:ns1="http://a2"/>
<ns2:bar xmlns:ns2="http://a2"/>
</ns0:foo>
Or should it be
3b)
<ns0:foo xmlns:ns0="http://a1" >
<ns1:bar xmlns:ns1="http://a2"/>
<ns1:bar xmlns:ns1="http://a2"/>
</ns0:foo>
Again, according to the spec, it should be 3a).
I am thinking that we should change the definition of prefixRewrite so that we go by URI and not by prefix, i.e. each visibly utilized namespace URI gets mapped to a new prefix. That gives a 1:1 mapping between URIs and new prefixes, but not a 1:1 mapping between original prefixes and new prefixes. With this definition we would get 1a), 2b) and 3b), which I think makes more sense.
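To make the URI-based proposal concrete, here is a rough Python sketch of it. This is not spec text and not a real canonicalizer: the function name rewrite_by_uri is made up for illustration, it only looks at element names (attributes, text and unprefixed names are ignored), it assumes the two b:bar elements in Example 3 are empty siblings, and it prints empty elements in the same shorthand as the examples above.

```python
import xml.etree.ElementTree as ET


def rewrite_by_uri(xml_text):
    """Illustrative only: assign one new prefix per namespace URI."""
    prefix_for_uri = {}  # namespace URI -> new prefix, in document order

    def emit(elem, in_scope):
        # ElementTree stores qualified names as "{uri}local".
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = None, elem.tag
        name, decl = local, ""
        if uri is not None:
            # Reuse the prefix already chosen for this URI, else mint nsN.
            prefix = prefix_for_uri.setdefault(uri, "ns%d" % len(prefix_for_uri))
            name = prefix + ":" + local
            if prefix not in in_scope:
                # Declare the prefix where it is visibly utilized and no
                # output ancestor has declared it already.
                decl = ' xmlns:%s="%s"' % (prefix, uri)
                in_scope = in_scope | {prefix}
        children = "".join(emit(child, in_scope) for child in elem)
        if children:
            return "<%s%s>%s</%s>" % (name, decl, children, name)
        return "<%s%s/>" % (name, decl)

    return emit(ET.fromstring(xml_text), frozenset())


example_2 = '<a:foo xmlns:a="http://a1"><b:bar xmlns:b="http://a1"/></a:foo>'
print(rewrite_by_uri(example_2))
```

Because the lookup table is keyed by URI, the rebinding of "a" in Example 1 yields two prefixes, while "a" and "b" in Example 2 collapse into one, which is the 1a), 2b), 3b) outcome described above.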
Pratik