I am trying to run XPath queries on a document with multiple namespaces and need to define a NamespaceContext. The only way I can think of to retrieve all namespaces is to iterate over all nodes and attributes and collect their namespace URIs and prefixes explicitly. This seems very inefficient, and I doubt that it is best practice.

Why do you need to retrieve all namespace URIs? When you construct the query, you had better know the namespaces beforehand (the actual prefixes used in the document are irrelevant); otherwise, what use is the query?
– forty-two, Feb 13 '12 at 16:01

@forty-two I wanted to do this emulating the internal XPathAPI, which seems to handle namespaces transparently. So I built a Util to eliminate the need to pass the namespace data, but am not sure about the best approach.
– kostja, Feb 13 '12 at 16:07

1 Answer

Because namespaces can be declared on the root element as well as on any element nested inside it, you are correct: if you need EVERY namespace used in a doc, you have to process the entire doc (including all imports, if you want to be 100% complete).

If you want people to be able to query your doc like a database with XPath-esque queries, you'll likely want to load the doc into an in-memory representation that can be queried quickly anyway.

Given that you have to process the whole thing into memory anyway, that would be your opportunity to parse all the namespaces.
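A minimal sketch of that single pass, assuming the doc is loaded as a W3C DOM tree with the JDK's built-in parser. The class name `NamespaceCollector` and its methods `parse`, `collect` and `toContext` are hypothetical helpers made up for illustration, not part of any library. If the same prefix is bound to different URIs in different subtrees, this sketch simply keeps the first binding it sees, which may not be what you want.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: gathers namespace declarations from a DOM tree.
final class NamespaceCollector {

    /** Parses XML text with namespace awareness switched on (it is off by default). */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Walks the tree once and records every xmlns / xmlns:prefix declaration. */
    static Map<String, String> collect(Node node, Map<String, String> prefixToUri) {
        NamedNodeMap attrs = node.getAttributes(); // null for anything but elements
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                String name = attr.getNodeName();
                if (name.equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                    // Plain "xmlns" declares the default namespace (empty prefix).
                    prefixToUri.putIfAbsent(XMLConstants.DEFAULT_NS_PREFIX, attr.getNodeValue());
                } else if (name.startsWith("xmlns:")) {
                    // First binding wins; later rebindings of the prefix are ignored.
                    prefixToUri.putIfAbsent(name.substring("xmlns:".length()), attr.getNodeValue());
                }
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, prefixToUri);
        }
        return prefixToUri;
    }

    /** Wraps the collected prefix-to-URI map in a NamespaceContext for XPath. */
    static NamespaceContext toContext(Map<String, String> prefixToUri) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix must not be null");
                }
                if (XMLConstants.XML_NS_PREFIX.equals(prefix)) {
                    return XMLConstants.XML_NS_URI;
                }
                return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public String getPrefix(String namespaceURI) {
                Iterator<String> prefixes = getPrefixes(namespaceURI);
                return prefixes.hasNext() ? prefixes.next() : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                List<String> result = new ArrayList<>();
                for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
                    if (entry.getValue().equals(namespaceURI)) {
                        result.add(entry.getKey());
                    }
                }
                return result.iterator();
            }
        };
    }
}
```

To use it, call `collect(doc, new LinkedHashMap<>())` on the parsed document and pass `toContext(...)` of the result to `XPath.setNamespaceContext`. Note that XPath 1.0 has no notion of a default namespace, so a declaration recorded under the empty prefix still has to be mapped to some prefix of your choosing before it can appear in a query.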

NOTE: I am ignoring why you would need all the namespaces; I just assume you need them, in which case your assumption about processing the whole doc is correct.

Thank you. Confronted with the whys, I am not so sure I do need them; I thought that was how I had to do it. But I'll check and see what results I can achieve without namespace-awareness. If I don't need them after all, good riddance.
– kostja, Feb 13 '12 at 19:02