What does Red Hat say on Perl/UTF-8 problems?

From: "M. Fioretti" <m fioretti inwind it>

To: Psyche List <psyche-list redhat com>

Subject: What does Red Hat say on Perl/UTF-8 problems?

Date: Sat, 22 Mar 2003 09:34:59 +0100

Hello,
Please try the short script attached; it comes straight from the Perl
Cookbook FTP page at www.ora.com. It gives a tree view of the output
of the du command. On a standard xterm in RH8 it also prints a bunch
of these errors:
Malformed UTF-8 character (unexpected end of string) at
./SOFT_TMP/cookbook.examples/ch05/dutree line 17, <> line 773.
Malformed UTF-8 character (unexpected end of string) at
./SOFT_TMP/cookbook.examples/ch05/dutree line 19, <> line 773.
The errors disappear after running "export LANG=en_US", much like the
man page issue. The script mb2md, which converts mailboxes to Maildir
format, shows an almost identical problem (download it at
http://batleth.sapienti-sat.org/projects/mb2md/ ).
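
As a minimal sketch of that workaround, the override can be scoped to
a single command instead of being exported for the whole session. The
function name run_legacy_locale is hypothetical (it comes from neither
Red Hat nor the Cookbook), and the sketch assumes that LANG alone
selects the UTF-8 locale; if LC_ALL or LC_CTYPE is also set, it would
have to be overridden too.

```shell
#!/bin/sh
# Hypothetical helper: run one command under a non-UTF-8 locale
# without changing the caller's environment.
run_legacy_locale() {
    # env applies LANG=en_US only to the child process it starts.
    # en_US is the value that silenced the warnings in this report;
    # it is an assumption that the same value suits other scripts.
    env LANG=en_US "$@"
}

# Possible use (the /usr argument is only an example):
#   run_legacy_locale ./dutree /usr
```

Because the variable is set through env, the calling shell keeps its
UTF-8 default once the command returns.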
After these two simple test cases, the actual question:
Red Hat did a Very Good Thing moving to Unicode/UTF-8 defaults in
Psyche. Somebody had to start the mass migration, and I am glad that
they did. In spite of this, it cannot be denied that {old, 3rd party}
Perl scripts working on {old, randomly encoded} text files break
when run in the default Perl/shell environment of Psyche.
Is there a Red Hat page specifying:
  - what must be changed in scripts so that one does NOT need to
    alter environment variables before and after running Perl
    programs, maybe just one-liners
  - what kind of shell wrappers one must use when no solution of
    the kind above is possible
If there is no such page, why not?
Keep in mind that I'm perfectly aware that this is not only Red Hat's
responsibility. I am posting almost the same message to the Perl/UTF-8
list. What I'm asking of Red Hat is something that clearly says:
  - "With our default environment and Perl packages, do this, this
    and this to make your scripts work again"
  - "This and that specific behaviors are bugs in Perl, and we
    have to wait for them to be fixed"
  - "This and that specific behaviors mean that the *script* is
    hopelessly broken, and should be rewritten (ditto for specific
    Perl modules)"
Any feedback is welcome!
Ciao,
Marco Fioretti
--
Marco Fioretti m.fioretti, at the server inwind.it
Red Hat for low memory http://www.rule-project.org/en/
The three most dangerous things are a programmer with a soldering
iron, a manager who codes, and a user who gets ideas.