ICFP Programming Contest 2002

The scoring process for the normal division

How did we pick a winner among the ~168 submitted entries, you wonder?
How did we decide which robot was the best?
Below, we describe our judging procedure.

A robot is considered good if it exhibits some degree of "intelligent"
behavior, i.e. it performs well in a wide variety of scenarios. The
robots were thus tested in a number of single-submission scenarios (a
submitted player on its own, or possibly meeting one of our
instrumented players) and a number of multi-player scenarios. The
best robot was the one with the highest final score, obtained by
taking a weighted sum of the scores from the different scenarios.
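
As a rough illustration of the weighted sum, the Python sketch below combines
per-scenario scores into one final score. The function name final_score, the
weights and the score values are all hypothetical; the weights actually used
by the judges are not stated here. Only the scenario names are taken from
this page.

```python
def final_score(scores, weights):
    """Return the weighted sum of a robot's per-scenario scores.

    scores  -- dict mapping scenario name to the robot's score in it
    weights -- dict mapping scenario name to its weight (hypothetical values)
    A scenario that is missing from `scores` contributes 0.
    """
    return sum(weight * scores.get(name, 0) for name, weight in weights.items())


# Hypothetical weights and scores, for illustration only.
weights = {"bridge": 1.0, "maze": 1.0, "sokoban": 2.0}
robot_scores = {"bridge": 10, "maze": 5, "sokoban": 9}
total = final_score(robot_scores, weights)
```

Treating a missing scenario as 0 is a choice made for this sketch, not a
documented rule of the contest.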

There are at least two sources of randomness that introduce "noise" in
the scores, and we have been careful to limit the effect of this noise:

The selection of scenarios is somewhat arbitrary. To avoid favoring
particular submissions (e.g. the ones that were written in the
judges' favorite programming language :-), the set of scenarios to test
was outlined before the testing procedure began, and
before we looked at any of the submissions.

The rules of the game have a random element: when two or more
robots issue moves with the same bid, their moves are executed in a
random order. To reduce the amount of noise introduced by this randomness,
the scores for multi-player scenarios are averages from a number of runs.
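
The averaging step can be sketched as follows, assuming that each run of a
multi-player scenario yields one numeric score per robot. The function name
averaged_score and the sample scores are hypothetical, and the number of runs
the judges actually used is not given here.

```python
from statistics import mean


def averaged_score(run_scores):
    """Average one robot's scores over repeated runs of the same scenario."""
    if not run_scores:
        raise ValueError("need at least one run")
    return mean(run_scores)


# Hypothetical scores from four runs of one scenario.
runs = [12, 8, 10, 10]
avg = averaged_score(runs)
```

More runs reduce the influence of any single lucky or unlucky ordering of
equal-bid moves on the reported score.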

We concentrated on finding just the two best robots
and refrained from trying to give a complete ranking of all robots.

We started by running all robots in a variety of single-submission
scenarios (round 0). We then ran all robots in random groups of 8 in the
symmetry scenario (round 1).
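
One way to form the random groups for round 1 is to shuffle the list of
submissions and cut it into chunks of 8. This is only a sketch: the function
name random_groups, the submission ids and the seed are hypothetical, and how
the judges handled a last group of fewer than 8 robots is not described here.

```python
import random


def random_groups(robot_ids, group_size=8, rng=None):
    """Shuffle the robot ids and split them into groups of at most group_size."""
    rng = rng or random.Random()
    shuffled = list(robot_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size]
            for i in range(0, len(shuffled), group_size)]


# Hypothetical ids 1..168; a fixed seed makes the grouping reproducible.
groups = random_groups(range(1, 169), rng=random.Random(2002))
```

With exactly 168 robots this gives 21 full groups; with any other count the
last group is simply smaller.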

Time would not allow extensive testing of all robots, so after round 1,
we picked the 20 best robots so far, confident that the best two would be
among these. We made a number of runs in two additional multi-player
scenarios, where all 20 robots participated in each run,
and summed up the scores so far (round 2).

After round 2, we limited further testing to the 8 best robots so far, again
under the assumption that the two best would be found among these.
The 8 robots all participated in a number of runs of two multi-player
scenarios (round 3).

The top two robots after round 3 were then run in a number of two-player
games (round 4). The total score after this determined the winners.
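
The successive narrowing of the field (all robots, then 20, then 8, then 2)
can be sketched as a cumulative total with a cut after each round. The
function names top_n and add_round and the tiny data set are hypothetical,
and how ties at a cut-off were resolved is not stated here; this sketch just
keeps the earlier entry.

```python
def top_n(totals, n):
    """Return the n robot ids with the highest cumulative scores, best first."""
    return sorted(totals, key=totals.get, reverse=True)[:n]


def add_round(totals, round_scores, survivors):
    """Add one round's scores to the running totals of the surviving robots."""
    return {r: totals[r] + round_scores.get(r, 0) for r in survivors}


# Hypothetical miniature contest: four robots, keep three, then pick two.
totals = {"a": 50, "b": 40, "c": 30, "d": 20}
survivors = top_n(totals, 3)
totals = add_round(totals, {"a": 0, "b": 15, "c": 5}, survivors)
finalists = top_n(totals, 2)
```

Because scores accumulate across rounds, a robot that trails after an early
round can still overtake the leader later, as "b" does in this example.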

Errata

2002-10-25 15:45 PDT

We reran submission 243, which got around 200 points.

2002-10-10 12:00 PDT

It was discovered that the log file for submission 130 in the sokoban
scenario was incorrect. The test was rerun and the log file was replaced.
As a result, submission 130 lost 31 points.
(Thanks to Thomas G. Rokicki for pointing this out.)

2002-10-15 16:15 PDT

For no good reason,
scores for submission 257 were missing for the bridge, maze and sokoban
scenarios. These were added, and submission 257 gained 33 points.