Note that this does not violate the Unicode specification (from http://unicode.org/faq/utf_bom.html#BOM): "Can
a UTF-8 data stream contain the BOM character (in UTF-8 form)? If yes, then can
I still assume the remaining UTF-8 bytes are in big-endian order?

A: Yes, UTF-8 can contain a BOM. However, it makes no difference as to the
endianness of the byte stream. UTF-8 always has the same byte order. An initial
BOM is only used as a signature — an indication that an otherwise unmarked text
file is in UTF-8. Note that some recipients of UTF-8 encoded data do not expect
a BOM. Where UTF-8 is used transparently in 8-bit environments, the use of a
BOM will interfere with any protocol or file format that expects specific ASCII
characters at the beginning, such as the use of "#!" of at the beginning of
Unix shell scripts."

Subtitleeditor opens such a file just fine, but interprets the BOM as "ZERO WIDTH
NON-BREAKING SPACE (ZWNBSP)"*, which seems to be correct when such a character is
in the middle of the file. However, it treats the character the same way when it is at
the beginning of the file, which seems to be a bug. The problem shows up especially
when one needs to convert the file to a different encoding. The save fails
with "Save Document Failed.

Could not convert the text to the character coding 'WINDOWS-1250'" (which is not very helpful: like gedit, it could tell me that there are bad characters, and even better it could list them, but I digress).
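To illustrate the conversion failure described above, and the kind of report I would find more helpful, here is a minimal Python sketch. It is not Subtitleeditor code; the function name `unencodable_chars` and the sample text are made up for illustration. It only shows that U+FEFF has no representation in WINDOWS-1250 (`cp1250` in Python) and how the offending characters could be listed.

```python
import unicodedata

def unencodable_chars(text, encoding):
    """Return (index, char, Unicode name) for every character in `text`
    that cannot be represented in `encoding`."""
    bad = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append((i, ch, unicodedata.name(ch, "UNKNOWN")))
    return bad

# Hypothetical .srt content that still carries the BOM as its first character.
sample = "\ufeff1\n00:00:01,000 --> 00:00:02,000\nAhoj\n"

for index, ch, name in unencodable_chars(sample, "cp1250"):
    print(f"position {index}: U+{ord(ch):04X} {name}")
```

With a report like this, the user would see at once that the only problem is the invisible character at position 0.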

When saving in .srt format, it also moves the BOM character into the middle of the file, as the file usually starts with something else (like the subtitle number in .srt format).

I think that when importing Unicode text, the first character should be checked to see whether it is a BOM, and if so it should be treated as a signature rather than as part of the text.
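The check suggested above could look roughly like the following Python sketch. This is only an assumption about how an importer might behave, not how Subtitleeditor is implemented; `decode_subtitle_bytes` is a hypothetical helper name. Only a BOM at the very start is dropped, while a U+FEFF in the middle of the text is left alone.

```python
import codecs

BOM = "\ufeff"  # U+FEFF, encoded in UTF-8 as the bytes EF BB BF

def decode_subtitle_bytes(data: bytes) -> str:
    """Decode UTF-8 data, dropping a single leading BOM if present.

    A U+FEFF anywhere else in the text is kept, because there it is
    an ordinary character rather than a signature.
    """
    text = data.decode("utf-8")
    if text.startswith(BOM):
        text = text[len(BOM):]
    return text

with_bom = codecs.BOM_UTF8 + b"1\n00:00:01,000 --> 00:00:02,000\nAhoj\n"
print(repr(decode_subtitle_bytes(with_bom)[:2]))
```

In Python the same effect is available through the built-in `utf-8-sig` codec, which also strips only a leading BOM when decoding.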

In practice, when the cursor is to the right of such a character, pressing the left arrow seemingly does nothing, but it actually moves the cursor to the left of the character.
Backspace seems to do nothing as well, but it deletes the character. The character is also counted in CPS, the number of characters and so on, which does not make much sense, but that is a very narrow corner case.

Copyright (C) 2004-2006, the Gna! people. Posted items are owned by whoever posted them.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.