
First success and failure with Ruby program (ruby pt.3)

So if you haven't been keeping up, I'm trying to write my first Ruby program to teach myself the Ruby programming language. I coded in BASIC in high school and did a little C coding in college. Since then I've put that part of my brain to sleep with a diet of Froot Loops and Keebler elf cookies. The program I'm writing takes a text file (in Hebrew) and reverses the order of the characters in each line, but not the order of the lines.

So, status update: I got my program running this morning. Yesterday as I was going to bed (after the smoke detector woke me up -- long story, I'll tell you later) I picked up my Ruby book and had an idea.

This morning I put it together. And now the program works (sort of, but more on that later). It searches the directory for files that need to be reversed, reverses each line in order, and outputs a file! Yay! But alas, like many programming victories, it was Pyrrhic.

And here is the failure (which I was expecting): it mangles Hebrew. Here are the input and output files:

This sucks quite a bit, since my research shows that the culprit is the reverse! method I'm using. Ruby assumes you're using ANSI text. Why? Are we in the 386 era again? Wasn't this written by a Japanese guy? It seems to be a memory issue. ANSI text is 8 bytes (or bits?) while Unicode is 16 bytes (or bits, I don't remember which at the moment). So ANSI text occupies half the memory that Unicode does. However, this is Beautiful Sunshine (or B.S. for short) because Java uses Unicode. OK, so now to find out how to handle Unicode in Ruby. [ed. ANSI text is 1 byte (8 bits) per character. Unicode is not a single size: UTF-16, which Java uses, takes 2 bytes (16 bits) for most characters, and UTF-8 takes 1 to 4 bytes, with 2 for each Hebrew letter.]
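For what it's worth, here is a minimal sketch of why a byte-by-byte reverse would mangle Hebrew. It assumes the input is UTF-8 and that you are on Ruby 1.9 or later, where strings know their encoding. The sample word and the variable names are made up for illustration; this is not the code from my program.

```ruby
# A sketch under assumptions (UTF-8 input, Ruby 1.9+), not the original program.

# "shalom", spelled with escapes so the letter order is unambiguous
word = "\u05E9\u05DC\u05D5\u05DD"

# Each Hebrew letter takes two bytes in UTF-8.
puts word.length    # 4 characters
puts word.bytesize  # 8 bytes

# Reversing the raw bytes puts the second byte of every letter before the first,
# which is not valid UTF-8 anymore.
byte_reversed = word.bytes.to_a.reverse.pack("C*").force_encoding("UTF-8")
puts byte_reversed.valid_encoding?  # false

# Reversing by character keeps each letter's two bytes together.
char_reversed = word.reverse
puts char_reversed
```

Text with vowel points (niqqud) is a further wrinkle, since those are separate combining characters that a plain character reverse would detach from their letters.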

But first the overview of how the program works. The program has three parts now.

1. It fetches the files it needs to reverse (in order!) and puts the whole file contents into the variable text.

2. It goes through each line of the text, reverses it, and appends each line (one at a time) to the output variable (which is declared outside the loop, so it remains after the loop is done).

3. It writes the output variable to a file.

All good except step two, which I'll have to figure out how to do differently.
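The three parts above can be sketched roughly like this. It is a guess at the shape of such a program, not my actual code: the method names, the "*.txt" pattern, and the ".rev" output suffix are all hypothetical, and it assumes UTF-8 files on Ruby 2.0 or later.

```ruby
# Hypothetical sketch of the three parts; all names here are made up.

# Part 2: reverse the characters of each line, keeping the line order.
def reverse_lines(text)
  output = ""  # defined outside the loop so it is still there afterwards
  text.each_line do |line|
    ending = line[/\r?\n\z/].to_s        # the line break, if any
    body = line.sub(/\r?\n\z/, "")       # the line without its line break
    output += body.reverse + ending      # line break stays at the end
  end
  output
end

# Parts 1 and 3: fetch the files in sorted order, then write each result.
def reverse_files(dir, pattern = "*.txt")
  Dir.glob(File.join(dir, pattern)).sort.each do |path|
    text = File.read(path, encoding: "UTF-8")
    File.write(path + ".rev", reverse_lines(text), encoding: "UTF-8")
  end
end
```

Calling reverse_files(".") would leave a file such as notes.txt untouched and write the reversed lines to notes.txt.rev next to it.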


