Re: LYNX-DEV ISO-8859-2 SGML entites - second version

From: Klaus Weide
Subject: Re: LYNX-DEV ISO-8859-2 SGML entites - second version
Date: Sat, 15 Mar 1997 23:55:03 -0600 (CST)

On Sun, 16 Mar 1997, Hynek Med wrote:
> Here goes my patch to add all ISO-8859-2 entities to HTMLDTD.c. The
> patch is relative to lynx2-7 subdirectory.
Ok, I'll put them in the code so they'll be there when I upload a new
version. Then we can see whether they create any problems.
Btw, you didn't answer my question whether you had encountered any real
sites with pages that use these entities. Just curious.
> Klaus, I think it's ready to go to the new version of chartrans code. I
> would also welcome the swapping of behaviour when the assumed charset is
> the same as the display charset, as we discussed earlier.
I haven't forgotten it; it just takes a while.
> (Is there
> anybody else than me and Klaus to discuss this?)
>
>
> I tried it on your Latin 2 test page, it works fine save for:
>
> - soft hyphen as &#173; works but &shy; doesn't display anything
Neither should display anything unless at the end of a line. That's
how soft hyphen is supposed to work; Fote probably put much work into
getting that right.
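
The soft-hyphen rule above can be sketched in a few lines of C. This is
only an illustration under simplifying assumptions, not Lynx's actual
code: render_line() and SOFT_HYPHEN are hypothetical names, and the
caller is assumed to have already split the text into display lines.
Byte 0xAD (decimal 173, hence &#173;) is the soft hyphen in both
ISO-8859-1 and ISO-8859-2.

```c
#include <stddef.h>

#define SOFT_HYPHEN 0xAD  /* soft hyphen in ISO-8859-1 and ISO-8859-2 */

/*
 * Hypothetical helper, not taken from Lynx: copy one display line from
 * `src` (length `len`) into `dst`, which must hold at least len + 1 bytes.
 * A soft hyphen is emitted as '-' only when it is the last byte of the
 * line (i.e. the line was broken there); everywhere else it is dropped.
 * Returns the number of bytes written, not counting the terminating NUL.
 */
size_t render_line(const unsigned char *src, size_t len, char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == SOFT_HYPHEN) {
            if (i == len - 1)
                dst[out++] = '-';   /* line was broken at the soft hyphen */
            continue;               /* otherwise the soft hyphen is invisible */
        }
        dst[out++] = (char)src[i];
    }
    dst[out] = '\0';
    return out;
}
```

With this sketch, a line holding "co", a soft hyphen and "op" comes out
as "coop", while a line that ends right at the soft hyphen comes out as
"co-".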
> - small n with acute is displayed as 'n', &ntilde; you have on the same
> line (why, BTW?) is displayed as 'n', too
An oversight I had not noticed before. I changed it now in
<URL:http://www.tezcat.com/~kweide/lynx-chartrans/test/>, so you should
now see three n's on that line, each and every one of them with an acute
accent.
> The errors above are the same as in lynx without chartrans.
>
> - capital D with stroke as &#272; works but &Dstrok; is displayed as DH,
> though small d with stroke works as it should
That is from the old-style ISO_Latin2[] table in LYCharSets.c (as Dave has
also observed). The line
    "DH", /* capital Eth, Icelandic - Dstrok */
there should become
    "\320", /* capital D with stroke - Dstrok */
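
For readers unfamiliar with octal escapes: "\320" is a one-byte string
holding 0xD0 (decimal 208), the ISO-8859-2 position of capital D with
stroke (U+0110, i.e. &#272;). The small sketch below shows this with an
illustrative two-entry array. The names latin2_sample and sample_byte
are hypothetical; the real ISO_Latin2[] table is much longer and its
layout is not reproduced here.

```c
/*
 * Illustrative two-entry array only, not the real ISO_Latin2[] table.
 * Each entry is the string the display gets for one entity.
 */
static const char *const latin2_sample[] = {
    "\320",  /* capital D with stroke - Dstrok (0xD0 in ISO-8859-2) */
    "\360",  /* small d with stroke   - dstrok (0xF0 in ISO-8859-2) */
};

/* Return the byte value stored in the first position of sample entry `i`. */
unsigned int sample_byte(int i)
{
    return (unsigned char)latin2_sample[i][0];
}
```

Here sample_byte(0) yields 0xD0 and sample_byte(1) yields 0xF0, whereas
the old "DH" entry would have sent the two ASCII letters D and H to the
display instead of a single Latin 2 character.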
> - capital N with acute displays &Nacute; - I don't know why
And you missed that "capital L, stroke" gets displayed as N with acute :)
At least that is what I see.
That seems to be a long-standing error in the Linux kbd package (from
where I took some of the tables). You can reproduce it with
    echo -e '\033%G' # UTF-8 mode on
    less -r /usr/doc/keytables/utflist # or wherever you have that file
    echo -e '\033%@' # UTF-8 mode off
if you have the utflist file from the kbd package.
If you want to correct it yourself for now, change the second occurrence
of U+0141 in iso02_uni.tbl to U+0143.
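
A mistake of this kind (one Unicode value assigned to two different
bytes) can be caught mechanically. The following is a minimal sketch,
assuming the table has already been read into an array of 256 code
points; first_duplicate() is a hypothetical helper, not part of Lynx or
of the kbd package. In ISO-8859-2, L with stroke (U+0141) sits at 0xA3
and N with acute (U+0143) at 0xD1.

```c
/*
 * Hypothetical checker: `map` holds the Unicode code point assigned to
 * each of the 256 byte values of an 8-bit charset. Returns the index of
 * the first byte whose code point was already used by a lower byte, or
 * -1 if all entries are distinct.
 */
int first_duplicate(const unsigned int map[256])
{
    for (int i = 1; i < 256; i++)
        for (int j = 0; j < i; j++)
            if (map[i] == map[j])
                return i;
    return -1;
}
```

Fed a table in which both 0xA3 and 0xD1 map to U+0141, this returns
0xD1, the entry that has to be changed to U+0143.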
> - new entities don't work at all on the ALT test page
Work in progress.
Thanks for checking!
Klaus