Integration Level 2: detect XML encoding for internal XML files

These files shouldn't really need special characters, since they are technical descriptors (plexus.xml and so on). But this change is useful on non-ASCII platforms (z/OS with EBCDIC), where even simple ASCII characters can't be read with the platform encoding.

The Java API documentation is explicit about this fact (if you read it carefully: look at the class description, not the constructor comments), but it is not obvious when using the API: developers tend to forget that they implicitly chose an encoding by using it.

Once you have replaced your FileReader/FileWriter constructor with this API, which makes the encoding choice explicit, you realize that if the file being read or written is XML, platform encoding is the wrong choice: you need XML encoding detection, which is the purpose of ReaderFactory.newXmlReader(File) and WriterFactory.newXmlWriter(File)...
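To make "XML encoding detection" concrete, here is a minimal sketch of the idea using only the JDK. It is not the plexus-utils implementation: the class name XmlEncodingSniffer and its detect method are hypothetical, and the sketch only looks at a byte order mark and at the encoding pseudo-attribute of the XML declaration, falling back to UTF-8. It assumes an ASCII-compatible file encoding, so it does not cover UTF-16 without a BOM or EBCDIC-encoded files.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of plexus-utils).
class XmlEncodingSniffer {
    // Matches the encoding pseudo-attribute inside a leading <?xml ... ?> declaration.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Guesses the encoding from the first bytes of an XML file. */
    static String detect(byte[] head) {
        // 1. A byte order mark, if present, decides.
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        }
        // 2. Otherwise read the declaration. ISO-8859-1 maps every byte to one
        //    char, so decoding never fails and ASCII text stays intact.
        String prolog = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(prolog);
        // 3. No declared encoding: UTF-8 is the XML default.
        return m.find() ? m.group(1) : "UTF-8";
    }

    public static void main(String[] args) {
        byte[] head = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><project/>"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(head)); // prints ISO-8859-1
    }
}
```

The point of the sketch is that the file itself says how it must be decoded, so neither the platform encoding nor a hard-coded charset is needed once detection is available.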

Integrating XML encoding detection in Maven plugins

A lot of Maven plugins read and write XML files, and they currently do it with platform encoding (i.e. FileReader/FileWriter): they should be changed to use ReaderFactory.newXmlReader and WriterFactory.newXmlWriter.

But there is a problem with Maven versions earlier than 2.0.6: in Maven 2.0.5 and earlier, the plexus-utils version is forced by Maven Core and cannot be overridden by a plugin. MNG-2892 (released in Maven 2.0.6) fixed this limitation. So Maven 2.0.6 is a prerequisite for fixing plugins...

What can be done?

In maven-site-plugin, the XML encoding classes from plexus-utils were copied into the plugin's sources (MSITE-242 tracks their removal): this plugin reads a lot of XML files and has a strong need for encoding support, so this bad solution was really the best one available. But it would not be good to make such a copy in every plugin.

A lighter solution is to replace new FileReader( File ) with new InputStreamReader( new FileInputStream( File ), "UTF-8" ): if XML encoding detection is not supported, at least reading the file with the default XML encoding, UTF-8, is both more robust and more consistent (the lack of detection is then a missing feature, not a bug).
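As a minimal sketch of this lighter solution, the following reads a file as UTF-8 whatever the platform encoding is. The class name Utf8XmlRead and its helper methods are hypothetical names chosen for the example, and try-with-resources is used for brevity, which assumes a more recent Java than the plugins of that time could target.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, for illustration only.
class Utf8XmlRead {
    /** Opens the file as UTF-8 instead of relying on the platform encoding. */
    static Reader newUtf8Reader(File file) throws IOException {
        // Replaces: new FileReader(file)
        return new BufferedReader(
            new InputStreamReader(new FileInputStream(file), "UTF-8"));
    }

    /** Reads the whole file into a String through the UTF-8 reader. */
    static String readAll(File file) throws IOException {
        StringBuilder content = new StringBuilder();
        try (Reader reader = newUtf8Reader(file)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                content.append(buffer, 0, count);
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a small descriptor with a non-ASCII character, then read it back.
        File file = File.createTempFile("descriptor", ".xml");
        file.deleteOnExit();
        Files.write(file.toPath(),
            "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(file));
    }
}
```

A file that declares another encoding in its XML declaration would still be misread this way, which is exactly the missing feature mentioned above.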

Another solution would be to have XML encoding classes in another library than plexus-utils...

In plugins that read and write POM files (install, deploy and release), there is no choice: XML encoding support must be the same as in Maven Core, so the classes will be copied into those plugins. In the assembly plugin, on the other hand, the assembly.xml file is now simply read as UTF-8.

Here are the Jira issues that track where the classes have been copied, so that their removal can be scheduled when the prerequisite is upgraded to Maven 2.0.6+: