All the Perl that's Practical to Extract and Report


I had a long reply using Text::Unidecode here, but use.perl.org *really* doesn't want to format things the way I want it to (half the time it seems to double-encode my unicode, and never do multiline code or pre tags!), so I'll try using words instead of pictures to explain what I'm trying to talk about.
First, the easy question: How are those alternate readings sorted? It doesn't seem to be by first ms with that reading, nor by number of readings -- is it just hash order?
Second, the hard question -- wh

Put posts and comments through “encode 'us-ascii', $your_post, Encode::HTMLCREF”. That will make them come out as intended.
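For illustration, here is a minimal sketch of that call. The sample text is made up (three Armenian letters followed by plain ASCII), and it uses Encode::FB_HTMLCREF rather than the bare Encode::HTMLCREF flag quoted above. That choice is mine: FB_HTMLCREF is the same flag combined with LEAVE_SRC, so the input variable is left untouched after the call.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical post text: three Armenian letters (U+0540, U+0561, U+0575)
# followed by plain ASCII.
my $your_post = "\x{540}\x{561}\x{575} in ASCII";

# Anything outside US-ASCII becomes a decimal character reference.
# FB_HTMLCREF is HTMLCREF plus LEAVE_SRC, so $your_post is not modified.
my $ascii = encode('us-ascii', $your_post, Encode::FB_HTMLCREF());

print "$ascii\n";    # &#1344;&#1377;&#1397; in ASCII
```

Note that this only replaces characters that do not fit in ASCII; ampersands and angle brackets already in the text pass through unchanged.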

and never do multiline code or pre tags!

That’s on purpose; Slashcode has its own special <ecode> tag for that purpose (whose distinguishing features are: 1. you can write raw angle brackets and ampersands inside, and Slash will turn them into entities for you; 2. it doesn’t use <pre>, so very long lines will wrap properly (something that you can achieve in modern browsers via CSS by saying white-space: pre-wrap)).

(IMO it should nowadays just use Markdown. (Slash is older than Markdown, mind.) But since I have global shortcuts to translate the clipboard from Markdown to HTML, I don’t personally care either way.)

Slashcode has its own special <ecode> tag for that purpose (whose distinguishing features are: 1. you can write raw angle brackets and ampersands inside, and Slash will turn them into entities for you;

This is the part that doesn't play nicely with UTF-8, actually, although the <ecode> tag is almost always what I want - the Armenian characters get converted into entities upon comment submit, and those entities themselves have their ampersands turned into entities upon ecode conversion.

The conversion to entities is your browser’s doing, actually. It sees that the form should be submitted in ISO-Latin1, so it turns all the non-Latin1 characters into entities. Slashcode can’t actually know that you didn’t mean to send them that way. There is therefore no way to get around this.
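The two steps can be simulated in a few lines of Perl to see where the doubled ampersand comes from. This is only a sketch: browser_submit_latin1 and ecode_escape are hypothetical stand-ins, not the browser's or Slash's actual code, and the sample line is invented.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for the browser submitting a Latin-1 form:
# characters that do not fit in ISO-8859-1 become decimal character references.
sub browser_submit_latin1 {
    my ($text) = @_;
    return encode('iso-8859-1', $text, Encode::FB_HTMLCREF());
}

# Hypothetical stand-in for <ecode> processing: escape markup characters,
# including the ampersands the browser has just introduced.
sub ecode_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# A line of code containing one Armenian letter (U+0540).
my $typed     = "my \$x = '\x{540}';";
my $submitted = browser_submit_latin1($typed);    # my $x = '&#1344;';
my $stored    = ecode_escape($submitted);         # my $x = '&amp;#1344;';

print "$stored\n";
```

By the time the second step runs, the text contains only ASCII, so nothing distinguishes an entity the browser generated from one the poster typed on purpose.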

All you can do is use plain <code> tags with <br> tags for linebreaks, sequences of &nbsp; for tabs, and manual escaping for ampersands and less-thans. It’s a pain to do manuall