"To be competitive at Perl golf, you have to be a Perl expert, with years of language experience" -- Yorkey

"Rubbish! In TPR 3, Mickut of Finland was on the leaderboard two hours after he learnt Perl! And Perl expert Petdance, despite many years of Perl experience, finished more than 80 strokes behind the leaders in the Fonality golf challenge" -- me

-- Yorkey and me arguing during a lunchtime walk to Kirribilli

Were it not for this chance argument during a lunchtime walk with my workmate
and good friend, Yorkey, I probably would have given up the
Roman to Decimal golf game after a few weeks.
After all, I was completely stuck at that time with my Perl solution
and had no fresh ideas to try out.
However, Yorkey's stubborn refusal to change his point of view provoked me into
proving my point to him by playing this golf in languages I hardly knew,
namely Ruby, Python and PHP.

I decided to start with Ruby because I at least had a passing familiarity with that
language after audreyt stayed for a few days at my home while my wife was out of town.
During her stay, we went through the library section of
the Pickaxe book
together because Audrey felt it would make a great model for documenting the Perl 6 libraries.
Unfortunately, the absent-minded Audrey left one of her earrings behind on our bedside
table and I assumed it belonged to my wife, so just left it there. On her return, my wife
saw the earring sitting on her bedside table and freaked. I had previously told my wife
that Perl hacker "autrijus" was coming to stay for a few days while she was away.
Luckily, when I hastily explained that autrijus had become audrey, my wife judged it
unlikely that I would invent such a story and quickly calmed down. :-)

After a full month of play, the Perl leader was robin on 60 strokes, with Ruby languishing
far behind on 73. So I naturally thought it "impossible" for Ruby to overtake Perl
in this game -- and ludicrous to suggest that I might be able to beat my Perl
solution in Ruby.
My expectations were much lower than that; I simply wanted to be competitive
in Ruby (anywhere in the 70s would be fine) so as to shut Yorkey up.

Taking the Lead in Ruby

Since I'd already worked out some basic magic formulae by that time,
I naturally started converting these to work with Ruby. Unfortunately,
in addition to mapping M -> 1000, D -> 500, C -> 100, L -> 50, X -> 10, V -> 5, I -> 1,
I needed to further map the trailing newline to zero because I could
find no short way of removing it in Ruby. In Perl, the trailing newline was
easily removed via /./g. This extra newline mapping invalidated
most of the Perl magic formulae I had previously found, so I had to adjust my
magic formula searcher and start searching all over again.

This program might contain the fastest known existing implementation of full forward crypt

My new and improved search program found a Ruby-friendly magic formula easily enough,
and I was flabbergasted when my first Ruby approach, despite using the nine character
each_byte method, was equal leader on 73 strokes!

As you can see, this is just a straightforward translation of the original
algorithm I was using in Perl, albeit with a magic formula replacing the
Perl regex-based lookup table. As I was quick to point out to Yorkey, I didn't
need to be a Ruby expert to do this, just needed to know the core parts
of the language and, more importantly, find a good algorithm.

I started with each_byte only because I couldn't get the shorter
getc function to work. For example, this attempt:

and it worked! The screen was littered with "warning: already initialized constant C"
messages (written to stderr), but these don't matter to codegolf, which only cares
about what is written to stdout. Combining with a well-known Ruby golfing
trick of replacing the three-char 238 with the two-char
?ascii-char-with-ord-of-238, shortened my solution to 65.
As you might expect, I felt elated at leading the Ruby experts by eight strokes!
And, more importantly, forcing Yorkey to eat his words.

Choking on my breakfast cereal

Complacency is dangerous in golf and I had become complacent.
If anyone had told me at this time that I could reduce my Ruby solution
from 65 strokes all the way down to 53, I would have declared them insane.

After basking for months in my newly acquired Ruby fame, I almost choked
on my breakfast cereal when I checked the codegolf leaderboard one morning and noticed that
Python golfing god, Mark Byers,
had posted a 59 stroke Ruby solution.
This was intolerable! Back to work.

After experimenting some more with Ruby's evaluation order, I came up with
a weird spaceship operator 60 stroke solution:

I've left the 238 above for readability, but my submitted solution
naturally used the ?ascii-char-with-ord-of-238 dirty trick mentioned earlier.
This solution introduces another dirty Ruby golfing trick, namely using
a .* "method call" for "multiplies" rather than *(...), thus
saving a stroke by eliminating the parens. You can try this trick routinely
when golfing in Ruby whenever you need to change operator precedence -- though
it doesn't always work, Ruby's parsing being pretty quirky, in my experience.
By the way, it was this Ruby solution that inspired my weird 62 stroke Perl spaceship operator
solution, mentioned in the previous article, an example of transferring ideas from
one language to another. Often the hard part in golf is generating new ideas to try,
and using multiple languages is a fertile source of fresh ideas.

Alas, I couldn't improve this solution further, so switched to Python, hoping to
take revenge on the "Python golfing god" there.

Python Baby Steps

As you might expect by now, my first Python attempt was the same ol' same ol':

89 strokes! This solution bears a close resemblance to the earlier Ruby ones.
Notice that Python, like Perl, but unlike Ruby, does not need to map the
trailing newline because the Python raw_input function removes it.

Notice too that in Python, alone among the four languages, assignment is not an operator. This proved a chronic nuisance in this game because I couldn't see any opportunity
to exploit evaluation order to eliminate the "previous value" variable (p
in the Python solution above).
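
For readers without the golfed listing to hand, the role of the "previous value" variable can be shown ungolfed. This is only a minimal Python 3 sketch, assuming the usual left-to-right scan with a plain lookup table in place of a magic formula; the names ROMAN and roman_to_decimal are made up for illustration and are not taken from any submitted solution.

```python
# Ungolfed sketch of a previous-value scan (Python 3, illustrative only).
ROMAN = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

def roman_to_decimal(s):
    t = 0  # running total
    p = 0  # value of the previous numeral
    for c in s:
        n = ROMAN[c]
        # A smaller previous numeral was a subtractive prefix (the I in IV).
        # It has already been added once, so take it off twice.
        t += n - (2 * p if p < n else 0)
        p = n  # a separate statement: Python assignment is not an expression
    return t

print(roman_to_decimal("MCMXCIV"))
```

The p = n line is the nuisance described above: in this sketch it has to be its own statement, whereas the other three languages can fold the assignment into the arithmetic.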

Another generally applicable golfing tip is to study every single built-in function the
language has to offer, especially the short ones. When I did that, the Python
hash function caught my eye. Could it be used in
a magic formula? It seemed to have better properties for this
purpose than ord and is only one stroke longer.
Definitely worth a try. It did indeed improve things:

... but only by three strokes.
86 strokes now, but still a gaping eight strokes behind the Python golfing god.

Going for the Outright Lead

Necessity is the mother of invention

The Python solutions are different to the Ruby and Perl ones in that
you have to either map the hash/ord functions,
or assign them to a variable, as in x=84169%ord(c), because
all the magic formulae seen so far use the character twice.
It therefore occurred to me that if I could find a magic formula that
used each character in the input string only once, that would be a big
saving in Python. How to find such a formula? I had no idea, but I
played around one afternoon, just trying stuff, and stumbled on a gem:

Generally, formulae that map M -> 3, C -> 2, X -> 1 and I -> 0 are
highly effective: with m as the mapped value, 10**m already yields
1000, 100, 10 and 1, and applying "%NNNN", where NNNN > 1000, does
not mangle those values. So instead of requiring seven
lucky hits, you now need only three (D, L and V).
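
The "three lucky hits" idea can be tried out with a small brute-force search. What follows is a minimal Python 3 sketch under stated assumptions, not a reconstruction of any particular submitted formula: the formula shape 10**(a % ord(c) % b) % MODULUS, the choice MODULUS = 9995 and the helper names are all illustrative. 9995 is convenient because 10**4, 10**5 and 10**6 leave remainders 5, 50 and 500, so V, L and D only need to land on exponents 4, 5 and 6.

```python
# Brute-force search for a "magic" constant (Python 3, illustrative sketch).
# Exponents 0..3 give 1, 10, 100, 1000 untouched because MODULUS > 1000;
# exponents 4..6 give 5, 50, 500 because 10**4 % 9995 == 5, and so on.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
MODULUS = 9995  # assumed for illustration

def value(a, b, c):
    """Candidate formula: 10**(a % ord(c) % b) % MODULUS."""
    return 10 ** (a % ord(c) % b) % MODULUS

def search(limit=300000, b=7):
    """Return the smallest a below limit that maps all seven numerals, else None."""
    for a in range(1, limit):
        if all(value(a, b, c) == v for c, v in VALUES.items()):
            return a
    return None

a = search()
print(a, {c: value(a, 7, c) for c in "IVXLCDM"})
```

Note that each character appears only once in the candidate formula, which is the property that matters for Python.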

Combining this new formula with the same modulo trick I used to move my Perl
solution from 62 to 60 strokes reduced my Python
solution to 78 strokes and tied for the lead with Mark Byers!

Code Golf is 10% strategy, 90% tactics

Actually, I've found many different 78 stroke Python solutions, but none shorter.
Here are some more variations in the middle line:

The last one is noteworthy in that it uses a different mapping, namely
M -> 2000, D -> 1000, C -> 200, L -> 100, X -> 20, V -> 10, I -> 2.
Also noteworthy is that, because it divides by two (n/2),
it also works with a:

initialization. This observation will allow us later to exploit a Ruby built-in variable ($.),
which is initialized to one.
Note that this second alternative mapping is available, without penalty, in Ruby and Python,
but not Perl and PHP, for various complicated tactical reasons.
These are the sort of tactical tricks that are crucial when fighting for the lead in golf.

Incredibly, applying what I learnt in my Python diversion to Ruby, plus yet another
dirty Ruby trick (using the Perl-inspired Ruby built-in variable $. to
eliminate the t=0), enabled me to reduce my Ruby solution from 60 strokes
all the way down to 53 and so steal the outright lead from "primo":

Of course, I can't prove that I've found the optimal magic formula.
It's also likely that further language or algorithmic golfing improvements will be found,
especially given my relative inexperience in Ruby and Python.

In the next installment of this series, I'll show off my PHP solutions.

you should think about compiling all of your wonderful PM articles in a book :)

I second that :) There is this PerlGolf History Book thing but no real book with a friendly language AFAICT. It'll be nice to have one and I'll definitely add that to my bookshelf. I liked the tone of Perl Hacks, this one will be even better perhaps :)

You know, it doesn't really surprise me that perl isn't always optimal for golf. I've seen it in action in that fibonacci golf tingy at Re: Fibo Golf challenge on 3 monkeys and (OT) Fibonacci numbers in Ruby - final shot - 24 chars etc. Also, when I write one-liners, which I do a lot, and I don't golf on purpose, the ruby ones always turn out to be shorter than the perl ones, only the ruby ones get hard to manage earlier when they grow. Ruby just has a larger, better standard library for basic things, but cpan wins when you need more complicated stuff like xml parsing.

Though I'll give a more detailed analysis in my next article, I just did a quick count of the last ten codegolf games. Ruby and Perl always finished first and second in code length: Perl won 6 times, Ruby 4 times. As for the longest solutions, it was: PHP 7 times, Python 3 times. That agrees with my impression when playing these games: Perl and Ruby are (often quite excitingly) vying for the Silver Cup, while Python and PHP are battling for the wooden spoon. What surprised me enormously is popularity, measured by number of entries: Python 5, Ruby 4, Perl 1. It seems that golf is more popular in Python than Perl nowadays. That shocked me, because four years ago the top google hit for "Python golf" was a lovely pair of Python skin ladies golf shoes. :)

In golf, there are many more failed attempts than successful breakthroughs.
That's the nature of the game.
To keep the articles reasonably short, I focused on the breakthroughs,
omitting many of the failures.
To add a dose of reality, and while I'm still able to remember, I thought
I'd describe some of my more interesting "heroic failures" in this game.
You never know, someone may find an improvement and transform one of these failures into a winner.

Mapping to Different Number Bases

Mapping to different number bases is a generally useful golfing technique.
While I couldn't find a useful number base mapping for the original M -> 1000, D -> 500,
C -> 100, L -> 50, X -> 10, V -> 5, I -> 1, I spied an opportunity later
in the game after uncovering the alternative M -> 2000, D -> 1000, C -> 200, L -> 100,
X -> 20, V -> 10, I -> 2 mapping, illustrated in the following table.

Roman   octal   decimal   n in 1<<n   n in 2<<n
M       2000    1024      10          9
D       1000    512       9           8
C       200     128       7           6
L       100     64        6           5
X       20      16        4           3
V       10      8         3           2
I       2       2         1           0

Noticing this cute mapping led directly to the following
Python 82 stroker:

As you can see, this solution was not competitive in this game only because of the five stroke
Python string to int conversion penalty of having to call int() -- which wouldn't
have been required in Perl or PHP. This idea was similarly foiled in Ruby by the
need for string to int conversion.
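
The table can be checked mechanically. This is a minimal Python 3 sketch, not the 82 stroke solution itself: the SHIFT table and the doubled_value helper are illustrative names, and format(x, "o") is used because Python 3's oct() adds a "0o" prefix that the Python 2 of the time did not.

```python
# Python 3 sketch: the doubled Roman values are the octal spellings of 2 << n.
SHIFT = {"M": 9, "D": 8, "C": 6, "L": 5, "X": 3, "V": 2, "I": 0}

def doubled_value(c):
    # format(x, "o") gives the octal digits as a string without a prefix.
    # int() then re-reads those digits as decimal; this is the string to int
    # conversion whose cost in strokes is lamented above.
    return int(format(2 << SHIFT[c], "o"))

for c in "MDCLXVI":
    print(c, doubled_value(c), doubled_value(c) // 2)
```

Halving the doubled value recovers the ordinary Roman value for each numeral.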

Equally cruelly, in a language where the string to int conversion is
not required, namely PHP, this idea was crushed by the arbitrariness
of PHP's internal function names. You see, PHP calls this function
not oct, like any reasonable language, but decoct,
costing three precious strokes. Aaargh.
Finally, in Perl, the oct function converts the other way and I
couldn't find any very short way to convert from decimal to octal.

Eliminating Exponentiation

Though exponentiation is "natural" for Roman numerals,
it does cost around six strokes (10**()),
leaving the door open for shorter, if less natural, alternatives --
such as the PHP md5 lucky hit found in the third article of this series.

Though less ideal than PHP's md5 function, Python's built-in hash function
seems the best hope of eliminating exponentiation among the
other three languages.
Indeed, this 79 stroker proved only one stroke too long:

Shortly after posting this article, I (again) had difficulty swallowing my breakfast
cereal when "hallvabo" -- who had been lurking just one stroke
off the pace -- sprinted past both Mark Byers and myself,
snatching a six stroke lead on a remarkable 72 strokes.
What on Earth was hallvabo up to? I might add that when one golfer
tries to guess what another is doing, they are almost always wrong.

My first thought was that he'd performed some fancy functional
footwork on my magic formula. So I tried:

I was doing this in the middle of the night while lying in bed, without
access to a computer. I could not help but notice though, that the program
was now precisely 72 strokes in length! Hope! A straw to clutch.
So I ran a few tests by hand and couldn't find an example that broke this abbreviated
new algorithm. And it did indeed pass all 3999 tests
when I finally ran it on a computer the next day.

Because I'd been pondering a functional approach, I also noticed
that this neat trick of using the "running total" t for state,
rather than the ugly "previous value" p, made this
new algorithm a perfect fit for "reduce"!
You see, unlike Guido,
I very much like the reduce function, so, for fun (not golf, because reduce
never wins at golf), I put this to the proof.
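
To make the running-total idea concrete, here is a minimal Python 3 sketch using functools.reduce. It assumes the update rule t + n - 2 * (t % n), which is my reading of the trick, and it uses a plain lookup table rather than a magic formula; the names are illustrative and this is not the golfed entry.

```python
from functools import reduce

# Plain lookup table standing in for a magic formula (illustrative).
ROMAN = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

def roman_to_decimal(s):
    # Every larger numeral is a multiple of n, so t % n is exactly the value
    # of any smaller numeral sitting in front of n as a subtractive prefix.
    # That prefix was already added once, hence it is subtracted twice.
    return reduce(lambda t, n: t + n - 2 * (t % n),
                  (ROMAN[c] for c in s),
                  0)

print(roman_to_decimal("MCMXCIV"))
```

The only state carried from one numeral to the next is the total itself, which is what makes the fold a natural fit.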

As far as I'm aware, the simple functional algorithm shown below, to convert
from Roman to Arabic, has never been published before. Early in this game, I did
google around for functional algorithms and didn't find anything like it.
So, given that golf is sometimes accused of never producing anything useful,
I hereby lay claim to be the inventor of this golf-inspired algorithm and even give it
a name: the PGA-TRAM algorithm, or, in long-hand,
"Pure-functional Golfic Algorithm - Transform Roman to Arabic Magically".
Here's a sample implementation in Perl:

I chose the "19&" version of a magic formula here only because the bitwise and & operator is typically
faster than the modulo % operator. BTW, an interesting (non-golf) problem is to find the most
efficient magic formula: for that, I expect you'd try to use "fast" operators (such
as &, ^, |, >>), while avoiding "slow" ones (such as % and /).

Your criteria probably make sense if you are writing C code. I doubt they make sense when writing in any of the languages you are actually writing in.

In particular, in Perl your fastest formula is going to be the one with the fewest operations. Dispatching a Perl opnode is going to be (I'm guessing) two or three orders of magnitude slower than even something "slow" like a division machine-language instruction.
