>According to the HTML I18N spec, all that is needed in this case is to
>specify CHARSET=CP1251, and the text would be correctly converted to the
>equivalent Unicodes.

You are correct about the coded content of the document, but that is not
the issue. The issue is numeric character references of the form &#nnnn;.
Some HTML documents today use numeric references in the C1 range (&#128;
through &#159;), assuming they select the extra characters that cp1251
places at those byte values. This is contrary to the i18n spec, which
states that all numeric character references refer to Unicode, regardless
of the charset the document is encoded in. Unicode has only control codes
at those positions, so all references in the C1 range are illegal
according to the spec.
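
To make the distinction concrete, here is a rough Python sketch of the two
interpretations of the value 147. The function names are made up for this
illustration and do not come from the spec or from any browser, and the
sketch assumes Python's built-in "cp1251" codec as a stand-in for the
CHARSET=CP1251 conversion.

```python
import unicodedata


def decode_coded_byte(byte_value: int, charset: str = "cp1251") -> str:
    """Decode one byte of coded document content using the declared charset.

    Hypothetical helper: this models the CHARSET=CP1251 conversion only.
    """
    return bytes([byte_value]).decode(charset)


def resolve_numeric_reference(number: int) -> str:
    """Resolve &#number; the way the i18n spec describes it.

    The number is always taken as a Unicode code point; the charset of the
    document plays no part. Hypothetical helper for illustration.
    """
    return chr(number)


def is_c1_reference(number: int) -> bool:
    """True if the number falls in the C1 range, 128 through 159."""
    return 0x80 <= number <= 0x9F


# Byte 147 (0x93) in the coded content: converted through cp1251.
coded = decode_coded_byte(147)

# &#147; in the markup: a Unicode code point, not a cp1251 byte.
referenced = resolve_numeric_reference(147)

print(unicodedata.name(coded))           # LEFT DOUBLE QUOTATION MARK
print(unicodedata.category(referenced))  # Cc (a control character)
print(is_c1_reference(147))              # True
```

An author who wants the quotation mark through a numeric reference would
have to write its Unicode value, &#8220;, rather than &#147;.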