I like what I have seen of Cypher; it looks very similar to XQuery. Neo justify it in their talks with "Developers don't get SPARQL", which I find weird (as well as shocking), as Cypher looks similar to SPARQL to me. The example where they did date comparisons by comparing date, month and year suggests an impoverished type system. There is also a potential pitfall in the ability to attach properties to both nodes and relationships: if you are not careful how that is modelled, you can suddenly find yourself effectively stuck with a schema that is cumbersome to unravel, and you are nowhere near as agile as you thought you were.

I don't need an intermediate GraphML step: since Cypher looks so agreeable, I can just generate it directly from the XML. In fact, the quickest way to a graph would be to hack the original text files into the CSV format that Cypher's LOAD CSV expects. However, the XML is still valuable as a source of denormalized master data, so you won't need to pay the price of graph traversal to service such requirements.
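
To make the "generate Cypher direct from the XML" idea concrete, here is a minimal sketch in Python. The XML layout (`person` and `link` elements, their attributes), the `Person` label and the function names are all made up for illustration; the real data will look different, and a real import would more likely pass values as query parameters than paste them into statement text.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real element and attribute names will differ.
SAMPLE_XML = """
<records>
  <person id="p1" name="Ann O'Neil"/>
  <person id="p2" name="Bob Smith"/>
  <link from="p1" to="p2" type="KNOWS"/>
</records>
"""


def cypher_string(value):
    """Quote a Python string as a single-quoted Cypher string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def xml_to_cypher(xml_text):
    """Return Cypher statements: one MERGE per node, then one per link."""
    root = ET.fromstring(xml_text.strip())
    statements = []

    # Nodes first, so the MATCH clauses for relationships can find them.
    for person in root.iter("person"):
        statements.append(
            "MERGE (:Person {id: %s, name: %s});"
            % (cypher_string(person.get("id")), cypher_string(person.get("name")))
        )

    for link in root.iter("link"):
        rel_type = link.get("type")
        # Relationship types are not quoted here, so refuse anything odd.
        if not rel_type.isidentifier():
            raise ValueError("unsafe relationship type: %r" % rel_type)
        statements.append(
            "MATCH (a:Person {id: %s}), (b:Person {id: %s}) MERGE (a)-[:%s]->(b);"
            % (cypher_string(link.get("from")), cypher_string(link.get("to")), rel_type)
        )
    return statements


if __name__ == "__main__":
    print("\n".join(xml_to_cypher(SAMPLE_XML)))
```

Using MERGE rather than CREATE means the generated script can be re-run without duplicating nodes, which is handy while the XML-to-graph mapping is still in flux.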

So for now I press on with that as a dual solution. Neo looks a very good fit for this sort of application. Thank you.

Note that you can do things like annotate any graph edge with a weight (or other property), so you can express confidence levels directly. You can then write queries that retrieve only nodes linked by a confidence greater than some value. There are differences in how each database supports edge properties. They can also be proxied by building new nodes to connect other nodes, but that gets ugly fast if you need to do any amount of it.
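
As a rough illustration of filtering on an edge weight, here is a small in-memory sketch in Python rather than a real graph database. The `Edge` tuple, the `confidence` property name and the sample values are invented for the example; the comment shows roughly what the same filter might look like in Cypher under the same naming assumption.

```python
from collections import namedtuple

# Each edge carries its own property, here a confidence level (assumed name).
Edge = namedtuple("Edge", ["source", "target", "confidence"])

# Made-up sample data.
EDGES = [
    Edge("A", "B", 0.9),
    Edge("B", "C", 0.4),
    Edge("A", "D", 0.75),
]


def linked_above(edges, threshold):
    """Return (source, target) pairs whose edge confidence exceeds threshold."""
    # Roughly equivalent Cypher, assuming a `confidence` relationship property:
    #   MATCH (a)-[r]->(b) WHERE r.confidence > 0.7 RETURN a, b
    return [(e.source, e.target) for e in edges if e.confidence > threshold]


if __name__ == "__main__":
    print(linked_above(EDGES, 0.7))
```

The point is only that the weight lives on the edge itself, so the filter is a single predicate and needs no intermediate "link" nodes.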

In your case, the problem will be building the edges (relationships) in the first place if you don't already have them. Once you have some basic relationships built out, both languages will give you capabilities for things like cluster analysis.