I have added a -n flag to osis2mod.
The flag is only available when osis2mod is compiled with ICU support.
-n stands for "normalize to NFC", the agreed-upon Unicode normalization
form for UTF-8 text.
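For anyone unfamiliar with NFC, here is a small illustration in Python. It is not part of osis2mod and uses only the standard unicodedata module. The sample strings are my own: the same accented letter written once as a single precomposed code point and once as a base letter plus a combining accent.

```python
import unicodedata

# "e with acute" written two ways: one precomposed code point,
# or a plain "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

# They look the same on screen but are different UTF-8 byte sequences.
print(composed.encode("utf-8"))    # b'\xc3\xa9'
print(decomposed.encode("utf-8"))  # b'e\xcc\x81'

# NFC folds the decomposed sequence into the precomposed form,
# so after normalization the two strings compare equal.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)             # True
```

Text that mixes the two spellings will not match in searches or byte comparisons until it is normalized, which is why a single form matters.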
When should this flag be used?
1) When the input is UTF-8
and
2) It is not known to be NFC
How to verify that the input is NFC?
Build a raw module with and without the flag and compare the output.
If cmp prints nothing, the two builds are identical and the input was
already NFC. The following should work on Linux and Mac OS X:
mkdir m n
osis2mod m input.xml
osis2mod n input.xml -n
cmp m/nt n/nt
cmp m/ot n/ot
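If you would rather check the input file directly, without building two modules, something along these lines might do. This is only a sketch using Python's standard library, not anything osis2mod provides, and the function name first_non_nfc_line is made up for the example. It assumes the file really is UTF-8; any other encoding will normally raise UnicodeDecodeError.

```python
import unicodedata


def first_non_nfc_line(path):
    """Return the 1-based number of the first line that is not NFC, or None."""
    # Opening with encoding="utf-8" makes non-UTF-8 input fail loudly
    # (UnicodeDecodeError) instead of being silently misread.
    with open(path, encoding="utf-8", newline="") as f:
        for lineno, line in enumerate(f, start=1):
            # A line is already NFC exactly when normalizing leaves it unchanged.
            if unicodedata.normalize("NFC", line) != line:
                return lineno
    return None
```

Calling first_non_nfc_line("input.xml") returns None when the whole file is already NFC, in which case -n is not needed. Otherwise it returns the number of the first line that normalization would change.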
When not to use it:
1) The input is already NFC
or
2) The input is cp1252 (Windows Latin-1).
Serving Him,
DM