Quick Tip #7: Texas and Unicode things

[“Texas” as a cute name has been changed to the less offensive and more prosaic “ASCII”]

Perl 6 is more than Unicode “aware”: it reaches into the Unicode Character Database to use appropriate and meaningful characters. There used to be a joke that the only thing that kept Perl 5 from expanding was the lack of unused punctuation keys. That’s not a problem anymore.

But for most Unicode thingys there is an ASCII version. That version is usually several characters long, so it’s larger than the Unicode version. And since it’s a larger, super-sized version, we’ll call it the “Texas” version. The Perl 6 docs list the ASCII equivalent for each Unicode version.
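
To make the pairing concrete, here is a small sketch with a few Unicode operators next to their ASCII spellings. The variable names and sample values are made up for illustration; the operators themselves are ones I'm sure have both forms.

```perl6
# Sample data, invented for this example
my @small = 1, 2;
my @big   = 1, 2, 3;

# Set membership: Unicode first, then the ASCII spelling
say 1 ∈ @big;
say 1 (elem) @big;

# Subset-or-equal
say @small ⊆ @big;
say @small (<=) @big;

# Hyper operator: add 1 to every element
say @big »+» 1;
say @big >>+>> 1;

# Some terms have both spellings too
say π == pi;
```

Each pair of lines should print the same thing, since the ASCII form is just another spelling of the same operator.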

I wrote a small command-line program to convert between the two. It’s not very sophisticated, and I plan to improve it later. I’d especially like to look things up based on what they do: for instance, search for “subset” and get the subset operators. Another data column with a description would be nice, and so would reading all of this from a file. Although I’ve taken the data from a single page in the Perl 6 docs, there are many other things I could add (such as the quoting stuff). I’ve also saved this in my unicode2texas.p6 gist.
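
For a feel of what such a converter involves, here is a minimal sketch. It is not the code from the gist: the hash names and the handful of entries are my own picks for illustration, and the real table in the docs is much longer.

```perl6
# Hypothetical, hand-picked subset of the Unicode-to-ASCII table
my %ascii-for =
    '«' => '<<',
    '»' => '>>',
    '∈' => '(elem)',
    '∪' => '(|)',
    '∩' => '(&)',
    '⊆' => '(<=)',
    'π' => 'pi',
    '∞' => 'Inf',
    '≤' => '<=',
    '≥' => '>=',
    '≠' => '!=';

# Flip keys and values to look things up in the other direction
my %unicode-for = %ascii-for.invert;

sub MAIN ( Str $thing ) {
    # Try Unicode to ASCII first, then ASCII to Unicode
    say %ascii-for{$thing}
        // %unicode-for{$thing}
        // "No mapping known for $thing";
}
```

Saved under some name, say convert.p6 (a made-up file name), running it with '∈' as the argument should print (elem), and running it with '(elem)' should print ∈. Quote the argument so the shell doesn't interpret characters such as < or (.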