Grand Champion Standings: A Short Elixir Program

I’ve been learning Elixir over the past few weeks, and I decided that it
was time to write a slightly less-than-trivial program. While Elixir is
based on Erlang, this program doesn’t play to Erlang’s strengths (massive
scaling, message passing, etc.). Instead, it was more of a voyage of discovery
for me, and this article is my way of taking you along on the tour.

The Problem at Hand

I work with an amateur sports association that holds several tournaments
throughout the year. At each of these tournaments, competitors can get
points towards a “grand champion” award presented at the end of the year.

Here’s how the scoring works: for local tournaments, first place earns three
points, second place is worth two points, and third place gets one point.
For a state-level tournament, first place gets five points, second place
gets four points, third place gets three points, fourth place gets two points,
and fifth through eighth place get one point.

The placing data is stored in a spreadsheet, a sample
of which looks like this table. (I have excluded the
first line of the file, which gives the age group for the data.)

First | Last | Team | Hollister | Santa Cruz | Oak Grove | Open State
Mark | Arnhelm | Knights | 1
William | Alvarez | Woodside | 1 | 3 | -1
Ross | Carter | Alliance | 1 | 2 | 2
Shohei | Takamura | Athlete Nation | 2 | 1 | 2 | -2

I already have a Perl program that will read directly from the spreadsheet
file and construct an HTML table of the standings. The output shows point
values rather than placings, and it is sorted in descending order of total
points:

First | Last | Team | Total | Hollister | Santa Cruz | Oak Grove | Open State
Shohei | Takamura | Athlete Nation | 11 | 2 | 3 | 2 | 4
William | Alvarez | Woodside | 9 | 3 | 1 | 5
Ross | Carter | Alliance | 7 | 3 | 2 | 2
Mark | Arnhelm | Knights | 3 | 3

As part of learning Elixir, I decided to re-implement this program. Instead of
trying to parse an OpenDocument file, I exported the spreadsheet as a
CSV file with \t (TAB) as the column separator.

Data Design

I store the data for each competitor in an Elixir record, defined as
follows. The points list contains the number of points the competitor
gained at each tournament.
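For readers on a current Elixir release, where structs have largely taken over this role from records, a rough equivalent might look like the sketch below. The field names given_name and surname are the ones used later in this article; team, total, and points are assumed names, and the struct itself is a stand-in rather than the original definition.

```elixir
defmodule Competitor do
  # A struct standing in for the article's record (an assumption, as are
  # the field names team, total, and points).
  defstruct given_name: "", surname: "", team: "", total: 0, points: []
end

defmodule CompetitorDemo do
  # Builds one competitor from the sample output table above.
  def sample do
    %Competitor{
      given_name: "Shohei",
      surname: "Takamura",
      team: "Athlete Nation",
      total: 11,
      points: [2, 3, 2, 4]
    }
  end
end

shohei = CompetitorDemo.sample()
IO.puts("#{shohei.given_name} #{shohei.surname}: #{Enum.sum(shohei.points)} points")
```

The points list keeps one entry per tournament in column order, so the total can always be recomputed from it.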

If the file doesn’t exist, File.open! raises an exception giving a
meaningful explanation of the error.

When Elixir reads a line, the line includes the ending newline character
(\r\n for Windows, \n for Linux and macOS, and \r for classic Mac OS). The
chomp/1 function deletes the trailing newline, and String.split/2 separates
the line into a list of strings.

When you refer to a function in Elixir, you give its name and its number
of arguments, called its arity. So chomp/1 refers to a function named chomp
that takes one argument, and String.split/2 refers to the split function
that resides in the String module and takes two arguments.
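As a small illustration of the name/arity notation, using only standard library functions: the capture operator & turns a name/arity pair into a value you can call, and is_function/2 checks the arity of such a value. The sample strings are made up for the example.

```elixir
# &String.split/2 captures the two-argument version of String.split.
split2 = &String.split/2
IO.inspect(split2.("Mark\tArnhelm\tKnights", "\t"))

# String.split/1 is a different function: it splits on whitespace.
split1 = &String.split/1
IO.inspect(split1.("Mark Arnhelm Knights"))

# is_function/2 tells the two apart by arity.
IO.inspect(is_function(split2, 2))
IO.inspect(is_function(split1, 2))
```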

This is the return value: a tuple with the headings and the result of
processing the input file.

Here is the chomp/1 function, which uses a regular expression to eliminate
any newline character(s) appearing at the end of the line. The end of the
string is matched with \z rather than the $ used in many other regular
expression engines.
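A minimal sketch of such a function, assuming a single Regex.replace/3 call does the work (the module name ChompSketch is hypothetical, and the original may differ in detail):

```elixir
defmodule ChompSketch do
  # Removes any run of \r and \n characters at the very end of the string.
  # \z matches only at the end of the input.
  def chomp(line) do
    Regex.replace(~r/[\r\n]+\z/, line, "")
  end
end

IO.inspect(ChompSketch.chomp("Ross\tCarter\tAlliance\r\n"))
IO.inspect(ChompSketch.chomp("line one\nline two\n"))
```

Only the newline characters at the end are removed. A newline in the middle of the string stays, and so do trailing tabs, which matter here because they mark empty cells.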

The most interesting part here is Enum.sort/2, which takes a
list as its first argument and a function (the “sorting function”) as its
second argument. In Elixir, functions are on an equal footing with strings,
integers, and other types of data. You can assign a function to a variable,
you can pass it as an argument (as in this code), and you can even have a
function that returns another function as its value. Treating
functions as “first class citizens” is a very powerful feature, and once you
understand how to take advantage of it, it can make your code clearer and more
flexible.

Every time Enum.sort/2 needs to compare two items, it will pass those
items to the sorting function. The sorting function
returns true if the first item belongs before the second item, false
otherwise.
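To make this concrete, here is a small sketch with made-up data: one sorting function stored in a variable, and one function that returns a sorting function. The names descending and by_key are hypothetical.

```elixir
# A sorting function stored in a variable: true when a should come before b.
descending = fn a, b -> a >= b end
IO.inspect(Enum.sort([3, 11, 7, 9], descending))

# A function that returns another function: compare two maps on one key.
by_key = fn key ->
  fn a, b -> Map.fetch!(a, key) <= Map.fetch!(b, key) end
end

teams = [%{team: "Woodside"}, %{team: "Alliance"}, %{team: "Knights"}]
IO.inspect(Enum.sort(teams, by_key.(:team)))
```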

Here is the by_points/2 function. It first compares the total points; if
they are equal, then it orders by surname. If those are the same, it
orders by given name, and if those are the same, it uses the team name
to break the tie.
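One way to write a comparison like this is with cond, as in the sketch below. It assumes that names and teams sort in ascending order (the text only states that totals are descending), it uses plain maps instead of records, and the field names team and total are assumed.

```elixir
defmodule SortSketch do
  # Returns true when a belongs before b: higher total first, then
  # surname, given name, and team in ascending order.
  def by_points(a, b) do
    cond do
      a.total != b.total -> a.total > b.total
      a.surname != b.surname -> a.surname < b.surname
      a.given_name != b.given_name -> a.given_name < b.given_name
      true -> a.team <= b.team
    end
  end
end

people = [
  %{given_name: "Ross", surname: "Carter", team: "Alliance", total: 7},
  %{given_name: "Shohei", surname: "Takamura", team: "Athlete Nation", total: 11},
  %{given_name: "Mark", surname: "Arnhelm", team: "Knights", total: 3},
  %{given_name: "William", surname: "Alvarez", team: "Woodside", total: 9}
]

people
|> Enum.sort(&SortSketch.by_points/2)
|> Enum.each(fn p -> IO.puts("#{p.total}\t#{p.given_name} #{p.surname}") end)
```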

Processing a Row

A row is processed by splitting it on \t (TAB). The person’s
name and team are separated from the placings. A call
to Enum.map_reduce/3 can convert the placings (first, second, third)
to the appropriate number of points and get the total points all in
one shot.

The |> operator takes the output of the first function and uses it
as the first argument of the second function. The code is the equivalent of
String.split(chomp(row), "\t").
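The equivalence is easy to check. In this sketch, String.trim_trailing/2 stands in for the article's chomp/1 so that the example runs on its own; the row is the last line of the sample data.

```elixir
row = "Shohei\tTakamura\tAthlete Nation\t2\t1\t2\t-2\n"

# Nested calls read inside-out...
nested = String.split(String.trim_trailing(row, "\n"), "\t")

# ...while the pipeline reads left to right.
piped = row |> String.trim_trailing("\n") |> String.split("\t")

IO.inspect(piped)
IO.inspect(piped == nested)
```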

Enum.map_reduce/3 takes a list as its first argument, an “accumulator” as
its second argument, and a function for the third argument. Enum.map_reduce/3
passes each item and the accumulator in turn to the function, which returns a
tuple giving the converted item and the new value of the accumulator.
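A tiny example, unrelated to the tournament data, shows the shape of the call: each item is doubled while the accumulator keeps a running sum of the original values.

```elixir
# The function returns {converted_item, new_accumulator} for each item.
{doubled, sum} =
  Enum.map_reduce([1, 2, 3], 0, fn item, acc -> {item * 2, acc + item} end)

IO.inspect(doubled)  # [2, 4, 6]
IO.inspect(sum)      # 6
```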

In my place_points/2 function, I used pattern matching to handle the cases
of an empty entry or an integer in the CSV file.
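A sketch of how such a function might look follows. It assumes that state-level placings are stored as negative numbers, which is an inference from the sample data (where -1 becomes 5 points and -2 becomes 4), and that any placing outside the scoring ranges earns nothing. The helper points_for/1 and the module name are hypothetical.

```elixir
defmodule PointsSketch do
  # An empty cell means the competitor did not place: zero points.
  def place_points("", total), do: {0, total}

  # Anything else is assumed to be an integer placing in string form.
  def place_points(entry, total) do
    points = entry |> String.to_integer() |> points_for()
    {points, total + points}
  end

  # Local tournaments (positive placings): 3, 2, 1 points.
  defp points_for(place) when place in 1..3, do: 4 - place
  # State-level tournaments (assumed negative placings): 5, 4, 3, 2 points...
  defp points_for(place) when place in -4..-1, do: 6 + place
  # ...and one point for fifth through eighth place.
  defp points_for(place) when place in -8..-5, do: 1
  # Assumption: any other placing earns no points.
  defp points_for(_), do: 0
end

IO.inspect(Enum.map_reduce(["2", "1", "2", "-2"], 0, &PointsSketch.place_points/2))
```

Run on the last row of the sample data, this produces the points 2, 3, 2, and 4 and a total of 11, matching the first row of the output table.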

Creating the HTML

The hard work is done; read_csv/1 gives its caller a list of Competitor
records that are in the proper order. The following html_output/1 function
takes a CSV file name as its argument, passes that file name to read_csv/1,
and uses the return value to create the HTML file.

If the input file name ends with .csv, replace it with .html; otherwise
add .html at the end. The expression input_filename =~ ends_with_csv is shorthand
for Regex.match?(ends_with_csv, input_filename).
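The file name logic can be sketched on its own as follows. The variable names input_filename and ends_with_csv come from the text; the function name output_name/1, the exact pattern, and the sample file names are assumptions.

```elixir
defmodule OutputNameSketch do
  def output_name(input_filename) do
    # Matches a literal ".csv" at the very end of the name.
    ends_with_csv = ~r/\.csv\z/

    if input_filename =~ ends_with_csv do
      Regex.replace(ends_with_csv, input_filename, ".html")
    else
      input_filename <> ".html"
    end
  end
end

IO.puts(OutputNameSketch.output_name("standings.csv"))
IO.puts(OutputNameSketch.output_name("standings.txt"))
```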

This part of the code relies heavily on heredocs to conveniently output
multi-line strings. The ending """ must be on a line of its own.

The construction #{variable} interpolates the value of the variable
into a string.

But wait…there’s more! You can interpolate a function call, and its
return value is inserted into the string.
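Both kinds of interpolation fit in one short heredoc. This is an illustrative sketch, not the article's page template; the markup and the function name page_top/1 are made up.

```elixir
defmodule HeredocSketch do
  def page_top(title) do
    # Leading whitespace up to the column of the closing """ is stripped.
    """
    <html>
    <head><title>#{title}</title></head>
    <body>
    <h1>#{String.upcase(title)}</h1>
    """
  end
end

IO.write(HeredocSketch.page_top("Grand Champion Standings"))
```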

I could have interpolated the entire table body as a huge string.
However, because there are often nearly a hundred competitors from fifteen
tournaments in a CSV file, I
felt it was better to use IO.puts/1 to write it to the output file one
line at a time.

The Table Header

To create the table header, make_header_row/1 gets the list of headings.
It uses pattern matching to isolate the first three items (given name, surname,
and team) from the tournament names. The function then reconstructs
the list of headings, adding "Total" as it goes,
and uses Enum.join/2 to add closing and opening
table data tags between the items. This string is sandwiched between the tags
that open and close the table row, and the <> operator concatenates them
all together. Notice the use of defp to make this a private function; there
is no reason for any other module to call this function.
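The description above might translate into something like this sketch. The function is public here only so that it can be called from outside the module for demonstration, and the exact markup is an assumption.

```elixir
defmodule HeaderSketch do
  # The pattern splits off the three name columns; the rest are tournaments.
  def make_header_row([first, last, team | tournaments]) do
    cells = [first, last, team, "Total" | tournaments]
    "<tr><td>" <> Enum.join(cells, "</td><td>") <> "</td></tr>"
  end
end

headings = ["First", "Last", "Team", "Hollister", "Santa Cruz", "Oak Grove", "Open State"]
IO.puts(HeaderSketch.make_header_row(headings))
```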

When the list of competitors is not empty, emit_html_rows/2 separates the
first person in the list from the remainder and uses bracket notation to
assign the record’s components to individual variables.

By using the bracket notation, I can refer to first instead of
person.given_name, last instead of person.surname, and so forth.

Deep breath here. Enum.map/2 takes as its first argument a list with the
total points first and the tournament points after it. It passes each item
in turn to html_cell/1, which converts the value to a string. Then,
Enum.join/2 puts closing and opening table cell tags between those strings.

The finished table row goes to the output file, and emit_html_rows/2
is called again to process the remaining competitors.
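Putting these steps together, a recursive row writer could look like the sketch below. It makes several assumptions: the argument order (output device first), maps with pattern matching in place of records with bracket notation, the exact markup, and a StringIO device standing in for the output file so the example has no side effects.

```elixir
defmodule RowsSketch do
  # No competitors left: nothing more to write.
  def emit_html_rows(_device, []), do: :ok

  # Otherwise, write a row for the first competitor, then recurse on the rest.
  def emit_html_rows(device, [person | rest]) do
    %{given_name: first, surname: last, team: team, total: total, points: points} = person

    cells =
      [total | points]
      |> Enum.map(&html_cell/1)
      |> Enum.join("</td><td>")

    IO.puts(device, "<tr><td>#{first}</td><td>#{last}</td><td>#{team}</td><td>#{cells}</td></tr>")
    emit_html_rows(device, rest)
  end

  # Zero points become <br /> so the empty cell still shows its borders.
  defp html_cell(0), do: "<br />"
  defp html_cell(item), do: to_string(item)
end

{:ok, device} = StringIO.open("")

RowsSketch.emit_html_rows(device, [
  %{given_name: "Ross", surname: "Carter", team: "Alliance", total: 7, points: [3, 2, 2, 0]}
])

{_input, output} = StringIO.contents(device)
IO.write(output)
```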

Here’s html_cell/1. If someone has zero points (which means they didn’t
place in the tournament), the cell becomes a <br /> element. This is
necessary to ensure that the cell’s borders are visible in older browsers.
Otherwise, the number is converted to a string.

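# Render the contents of one table cell.
defp html_cell(item) do
  if item == 0 do
    "<br />"
  else
    # to_binary/1 in early Elixir releases; current releases use to_string/1
    to_string(item)
  end
end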

Conclusion

There you have it. I was able to use Elixir to
perform a relatively mundane task: open
a file, read it, do some calculations with the data, and write the data
out in a new format. I learned quite a few interesting features of
Elixir along the way, and I hope you did too.