ASCII is 7-bit, not 8. A UTF-8 decoder will fail if it encounters a byte > 127 in HEAD that is not part of a valid multi-byte sequence, because in UTF-8 such a value indicates that the character occupies more than 1 octet. This could very easily happen with NAME, CORP, ADDR etc. in HEAD, before the CHAR tag tells you which character set is actually in use.

The 8 bit character sets you will normally see are ANSI with one of the various codepages.
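To illustrate the failure described above, here is a minimal Python sketch. The NAME line and the name in it are made up for the example, and Latin-1 merely stands in for whichever 8-bit code page a real file might use.

```python
# Hypothetical header line; "ó" is byte 0xF3 in ISO-8859-1, ISO-8859-2 and Windows-1250.
raw = "1 NAME Józef /Eliasz/".encode("latin-1")

# ASCII is 7-bit, so any byte above 127 is rejected.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# A strict UTF-8 decoder rejects it too: 0xF3 announces a multi-byte
# sequence, but the bytes that follow are not continuation bytes.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the 8-bit code page that was actually used recovers the text.
text = raw.decode("latin-1")
print(ascii_ok, utf8_ok, text)
```

A reader that has to handle such files would presumably need to read HEAD leniently until it has seen the CHAR tag, and only then decode the file with the declared character set.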

[Editor –
David,
Yes, you are correct. Thanks for helping me out. ASCII is indeed a 7-bit (0-127 decimal) ANSI standard. When I wrote "8-bit ASCII", I meant the ISO-8859-1/2 code pages (or Windows-1250), which provide the diacriticals by using the 8th bit to map the characters 128-255 decimal.

–Stanczyk ]
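As a small illustration of why the code page matters for bytes 128-255, the Python sketch below decodes one and the same byte with three of the code pages mentioned above. The byte 0xB1 is just a sample value picked for the demonstration.

```python
# One byte above 127, decoded with three different 8-bit code pages.
sample = b"\xb1"
decoded = {codec: sample.decode(codec)
           for codec in ("iso8859_1", "iso8859_2", "cp1250")}
for codec, char in decoded.items():
    print(codec, char)

# The Polish letter "ą" lands on a different byte in each
# Central European code page.
a_ogonek_iso = "ą".encode("iso8859_2")
a_ogonek_win = "ą".encode("cp1250")
print(a_ogonek_iso, a_ogonek_win)
```

Only the ISO-8859-2 decoding turns this byte into "ą"; Windows-1250 stores that letter at a different position, so guessing the wrong code page silently produces the wrong character instead of an error.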

By: Louis Kessler (@louiskessler)
https://mikeeliasz.wordpress.com/2012/02/28/dying-for-diacriticals-beyond-ascii-howto-genealogy-polish/#comment-798
Wed, 29 Feb 2012 00:54:44 +0000

“UTF-8 can have that byte-order-mark (BOM) at the front of our gedcom or not and it is still UTF-8” – I didn’t know that. Thanks.
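The point about the optional BOM can be checked with a short Python sketch. The two-line header here is a made-up fragment, not a complete GEDCOM file.

```python
import codecs

# Hypothetical minimal header fragment.
header = "0 HEAD\n1 CHAR UTF-8\n"

with_bom = codecs.BOM_UTF8 + header.encode("utf-8")   # EF BB BF prefix
without_bom = header.encode("utf-8")

# The "utf-8-sig" codec strips a leading BOM if one is present and
# otherwise behaves like plain "utf-8".
from_bom = with_bom.decode("utf-8-sig")
from_plain = without_bom.decode("utf-8-sig")

# Plain "utf-8" also accepts the BOM, but keeps it as the character U+FEFF.
kept = with_bom.decode("utf-8")
print(from_bom == from_plain, kept.startswith("\ufeff"))
```

Both byte strings are valid UTF-8; the only difference is whether the reader has to discard a leading U+FEFF before looking for `0 HEAD`.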