[M] Schema namespace correction

We cannot guarantee that Data Providers will put the right namespace declaration in their discovery records, so we will have to try to accept whatever records they give us.
We will have to replace any namespace declarations with a standard one that we know works in NDG, and remove the schemaLocation attributes so that eXist doesn't try to validate the records and reject them.
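
A minimal sketch of that correction step, using only the Python standard library. It assumes each record is a single XML document whose root element carries the (possibly wrong) namespace. The function name `normalise_record` and the `STANDARD_NS` value are placeholders for illustration, not the real NDG names.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Hypothetical placeholder: substitute the namespace that is known to work in NDG.
STANDARD_NS = "http://example.org/ndg/standard"


def _split(tag):
    """Split an ElementTree tag '{ns}local' into (ns, local); ns is None if unqualified."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return None, tag


def normalise_record(xml_text, standard_ns=STANDARD_NS):
    """Move the record into standard_ns and strip schema location hints."""
    root = ET.fromstring(xml_text)
    supplied_ns, _ = _split(root.tag)  # whatever the provider declared, or None
    for elem in root.iter():
        ns, local = _split(elem.tag)
        # Only re-home elements that share the root's namespace, so content
        # embedded from other vocabularies is left alone.
        if ns == supplied_ns:
            elem.tag = "{%s}%s" % (standard_ns, local)
        # Drop the hints that would make a validating store fetch a schema.
        for attr in ("schemaLocation", "noNamespaceSchemaLocation"):
            elem.attrib.pop("{%s}%s" % (XSI_NS, attr), None)
    return ET.tostring(root, encoding="unicode")
```

ElementTree picks its own prefix for the output (typically `ns0:`), which is still namespace-correct XML. A record with no namespace declaration at all is treated the same way: its unqualified elements are placed in the standard namespace.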

Why don't we just keep copies of the schemas (MOLES, CSML etc.) in eXist? Documents could then be validated while in eXist or during ingestion.

In the ingest script, the schemaLocation attribute would need to be corrected to point to the version in eXist rather than removed entirely. You would still need to remove rogue namespace declarations.
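
As a rough illustration of correcting rather than removing the attribute, the sketch below replaces any existing `xsi:schemaLocation` with a single one on the root element. The function name and the example URI are assumptions for illustration only; the real location of the schemas in eXist is not specified here.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_LOCATION = "{%s}schemaLocation" % XSI_NS


def point_schema_at_store(xml_text, namespace, schema_uri):
    """Replace every xsi:schemaLocation with one namespace/URI pair on the root.

    namespace  -- the namespace the schema defines
    schema_uri -- where the stored copy of the schema lives (hypothetical value
                  in the usage below; use the real path of the copy in eXist)
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.attrib.pop(SCHEMA_LOCATION, None)  # discard provider-supplied hints
    # schemaLocation is a whitespace-separated list of namespace/location pairs
    root.set(SCHEMA_LOCATION, "%s %s" % (namespace, schema_uri))
    return ET.tostring(root, encoding="unicode")


# Usage with made-up values:
# point_schema_at_store(text, "http://example.org/ndg/standard",
#                       "/db/schemas/example.xsd")
```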

I think having the ability to validate documents in situ in eXist will prove useful, e.g. if we start editing documents in eXist (either by hand or with software),
but this validation is only possible if we store the schemas in eXist.

We would probably have to be pretty strict on versioning and only update the eXist schema when there is an svn-tagged change in version number.

We could do this. In discovery ingest we do need a stage where we validate records against the 'proper' schema and reject invalid ones, with some sort of report going back to Data Providers. This could utilise eXist validation, or I might want to do it in a separate script; I haven't decided yet.
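
One possible shape for the separate-script option, sketched with the standard library only. The validator is passed in as a callable so that either eXist validation or an external schema validator could be plugged in later. The stand-in `well_formed` check below only tests that the XML parses; it is not schema validation. All names here are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def well_formed(xml_text):
    """Stand-in validator: return a list of error strings, empty if the XML parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [str(err)]
    return []


def ingest_with_report(records, validate=well_formed):
    """Split records into accepted ones and a per-provider rejection report.

    records  -- iterable of (provider, record_id, xml_text) tuples
    validate -- callable returning a list of error strings for one record
    """
    accepted = []
    report = defaultdict(list)
    for provider, record_id, xml_text in records:
        errors = validate(xml_text)
        if errors:
            # Keep the reasons so they can be sent back to the provider.
            report[provider].append((record_id, errors))
        else:
            accepted.append((provider, record_id, xml_text))
    return accepted, dict(report)
```

Grouping rejections by provider means each Data Provider can be sent only the failures for its own records.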

Harvested records from data providers are now ingested into Postgres, so there is no problem with eXist trying to validate (and in any case validation had been turned off in eXist on the discovery service). Effectively fixed, I think.