Want to receive a weekly email containing
the scoop on our new titles along with the
occasional special offer? Just click the button.
(You can always unsubscribe later by editing your
account information).

Give us an email and a password (you can use the password later to log in and
change your preferences). We'll send you a newsletter roughly once a week.

Programming Elixir

Part 2: Functions and Pipelines

by Dave Thomas

Elixir is a modern, functional programming language designed
for high availability and concurrency. It has Ruby-like syntax
married to the power and reliability of the Erlang VM. If you
wanted to get into functional programming but were put off by
the academic feel, now’s the time to jump in.

Last month, we
looked at the basics of pattern matching, and saw how it is
universal in Elixir—it is the only way to bind a value to a
variable or parameter. But pattern matching really shines when
we apply it to functions.

Anonymous Functions

Elixir has a nice set of built-in modules. One of these,
Enum, lets you work on enumerable collections. One
of its most commonly used functions is map, which
applies a transformation function to a collection, producing a
new collection. Let’s fire up the Elixir interactive shell, iex,
and try it.

iex> Enum.map [2,4,6], fn val -> val * val end

[4,16,36]

The first argument we pass to map is the collection; in this
case a list of three integers. The second argument is an
anonymous function.

Anonymous functions (I’m going to call them fns
from now on) are defined between the keywords fn and
end. A right-facing arrow, -> separates a
list of zero or more parameters on the left from the body of the
function on the right.

We passed map the function fn val -> val*val
end. This fn takes a single parameter, val, and
the body of the function multiplies that value by itself,
implicitly returning the result.

A fn is just another Elixir value, so we also have written this
code as:

iex> square = fn val -> val * val end

#Function<erl_eval.6.17052888>

iex> Enum.map [2,4,6], square

[4,16,36]

You can call fns using something like a regular function call:

iex> square.(5)

25

The period and the parentheses are required.

Boy, That’s Way Too Much Typing!

No, it’s not, and we don’t reward whining around these parts
anyway.

But, that said, Elixir does have a shortcut. (What follows is
for Elixir 0.9. The syntax is likely to change in the near
future.) Let’s jump straight to the code.

iex> Enum.map [2,4,6], &1*&1

[4,16,36]

When Elixir sees &[1-9], it knows that
it needs to generate an anonymous function. The function will
have from 1 to n parameters, where n is the highest
numbered &n value. The body of the
function is basically the expression containing the ampersands,
with each ampersand value being mapped to the corresponding
parameter. So, &1*&1 is logically the same as
fn p1 -> p1*p1 end, and rem(&1,&2)
becomes fn p1,p2 -> rem(p1,p2) end.

Because fns are just values, you can even write the code as:

iex> square = &1 * &1

#Function<erl_eval.6.17052888>

iex> Enum.map [2,4,6], square

[4,16,36]

This is a fantastically useful shortcut, but there is a
gotcha. When deciding what code to make the body of the fn,
Elixir starts at the ampersand term and looks up the parse tree
for the immediately enclosing expression. With
&1*&1, it’s the multiplication operator. With
rem(&1,&2), it’s the function call to
rem.

But things go a little haywire if you write
&1+&2+&3. That’s because, internally, this
is represented as (&1+&2)+&3. Elixir
translates the expression containing &1 and
&2, so now we have (fn p1,p2 ->
p1+p2)+&3. Elixir then complains that &3 can’t
exist without a preceding &1 and &2. (And this is why
the syntax is likely to be overhauled).

Anyway, the moral is: use the ampersand syntax for simple
expressions, and regular fn…end for the rest.

Named Functions

Anonymous functions tend be be used for callback work—they are
typically short and localized in their use. Named functions,
however, are where the real work gets done.

Named functions can only exist inside Elixir modules. Here’s an
example:

defmodule AsciiDigit do

def valid?(character) do

character in ?0..?9

end

end

IO.inspect AsciiDigit.valid? ?4 # => true

IO.inspect AsciiDigit.valid? ?a # => false

To follow this code, you first have to know that the syntax
?x returns the integer character code for
x (so ?0 is 48).

Our example defines a module called AsciiDigit
containing a single function, valid?. This takes a
character code and returns true if it is a digit from 0
to 9. We use the range operator .. to define the first
and last valid character, and the in operator to check
for inclusion.

Elixir supports pattern matching when determining which function
to run. You can use def multiple times for the same
function, each with a different pattern of parameters. Elixir
will dynamically choose the one where the parameters match the
arguments passed.

An oldie-but-goodie shows this nicely. The specification for the
Fibonacci sequence (Brian Hogan groans) is

fib(0) = 1

fib(1) = 1

fib(n) = fib(n-2) + fib(n-1)

This isn’t code—it’s a mathematical definition. However, we can
turn it into code with almost no effort:

defmodule Fibonacci do

def fib(0), do: 1

def fib(1), do: 1

def fib(n), do: fib(n-2)+fib(n-1)

end

Enum.map 0..10, Fibonacci.fib(&1) #=> [1,1,2,3,5,8,13,21,34,55,89]

(I know, I know—this is not an efficient way of calculating the
series. I want to show Elixir, and not worry about faster
algorithms.)

Despite appearances, there’s just one definition of the
fib function in there. It just has three
heads—three patterns of arguments that select
different bodies.

The first two heads select the cases where the argument is 0 or
1. They use the abbreviated form of the body, do: expr
to return 1. The third form is the recursive step. If neither of
the first two match, the third one executes.

What happens if we pass our function a negative argument. Right
now, it will loop until we run out of stack or
patience—subtracting 1 or 2 from a negative number will never
reach 0. Fortunately, Elixir has guard clauses, which
allow us to put additional constraints on pattern matching.

Now, when we call fib with a negative number, Elixir
can’t find a function clause that matches, and so it raises an
exception. If you really wanted to, you could handle this case
in code, giving a more application-specific error:

defmodule Fibonacci do

def fib(0), do: 1

def fib(1), do: 1

def fib(n) when is_integer(n) and n > 1, do: fib(n-2)+fib(n-1)

def fib(x), do: raise"Can't find fib(#{x})"

end

Fibonacci.fib(10) #=> 89

Fibonacci.fib(-10) #=> ** (RuntimeError) Can't find fib(-10)

Fibonacci.fib("cat") #=> ** (RuntimeError) Can't find fib(cat)

We extended our guard clause to check that the parameter is an
integer, and then added a 4th function head that accepts any
parameter and reports an appropriate error.

Verifing Credit Card Numbers

Most of the long number strings we deal with every day (credit
card numbers, IMEI numbers in your phone, and so on) have a check
digit. This is normally the final digit of the number, and it is
calculated using some algorithm that combines all the previous
digits. So, when you enter your credit card number, the web page
can recalculate the check digit, and verify that it is the same
as the last digit in number you gave. It isn’t a check against
fraud; it’s simply a quick way of picking up typos.

Probably the most widely used technique is the Luhn
Algorithm. It reverses the digits in the number, and splits
them into two sets: digits at odd positions in the string, and
digits at even positions. It sums the odd digits. For the even
digits, if multiplies each by two. If the result is 10 or more,
it subtracts 9. It then sums all the results. Adding the sum of
odd and even positions will yield a result that’s divisible by
10 for valid numbers.

When I first started with Elixir, my head was still full of
conventional ways of doing things. As a result, I’d write
something like the following:

The Enum module has lots of functions for dealing with
collections. We’ll be using many of them in this code, so we
import the module. This means we can write map instead of
Enum.map.

Our valid? function is passed a list of UTF-8
digits. By coincidence, that’s exactly what the single quoted
string literal generates.

Using the description of the Luhn algorithm, we reverse the
digits, and then convert the UTF representation to the actual
integer value (so ?1, which is 41, gets mapped to 1). At this
point, given the argument '123', we’d have a list of
integers [3, 2, 1].

Now it gets messy. We need to partition the digits into those on
an even position and those at an odd position. To prepare to do that, we
use map, passing it the function fn number, index
-> {number, index} end. This function takes the actual
digit value, along with its index in the list, and maps it to a
tuple containing each.

At this point, alarm bells should be ringing. This is just too
damn hard. But we plow on, because that’s what programmers do.

The partition function takes a collection and a
function. It returns a tuple where the first element is a list of
values for which the function returned true, and the second
element is the rest of the values.

Now we have to sum the odd values. Whenever you need to reduce a
collection to a single value, you’ll probably want to use the
reduce function. It takes the collection, an initial
value, and a function. This function receives each element of
the collection in turn, along with the current value. Whatever
the function returns becomes the next current value. So, summing
a list of numbers can be done with

Enum.reduce list, fn val, sum => val + sum end

# or

Enum.reduce list, &1 + &2

But what we have is a list of
{value, index} tuples. This means we need to use
pattern matching on the first parameter of the function to
extract just the value. (The underscore in front of
_index means we’re ignoring this field.

Summing the even numbers is similar, but we have to do the
doubling, and the conversion of numbers ten or above.

At the end of all this, we can test this in iex. I’m using a standard
Visa test credit card number here, so don’t go booking a trip to
Tahiti using it.

$ iex validate_cc.exs

iex> CheckDigit.valid? '4012888888881881'

true

iex> CheckDigit.valid? '0412888888881881'

false

Tidying Up

Our solution works, but the style isn’t very functional. (That’s a
polite way of saying it’s butt ugly). To tidy it up, I look for
places where there’s clearly something wrong, and see if I can
fix them.

The first problem I see is the first three lines. I’m
transforming the given number into a reversed set of digits,
each with an associated index.

The word transform is the clue. Functional
programming is all about transforming data. It’s so important
that Elixir has a special operator, |>. This lets us
build pipelines of functions, where each transforms the results
of the previous. It lets us compose functionality.

Using the pipeline operator, we can rewrite the first three
lines as

numbers

|> reverse

|> map(fn char -> char - ?0 end)

|> map(fn digit, index -> {digit, index} end)

We take the original list, transform it by reversing it, then by
converting character codes to integers, and then by adding the
index.

The pipeline operator looks like magic, but it’s actually quite
simple. It takes the value of the expression on its left, and
inserts it as the first argument of the function call on its
right, shifting all the other arguments down.

So the second ugliness is all this partitioning and summing. Our
problem is that we’re thinking imperatively, not
functionally. We are telling Elixir each step of what to do,
when instead we should be thinking of the specification of what
we want, and let it work out the details.

Think back to our Fibonacci example. There we implemented our
specification as three function heads, which matched the two
special cases and the one general case. Can we do the same here?

Rather than processing our list of digits one element at a time,
what if we process it in twos? This means we’re working with a
pair of digits—the first will be the one at an odd position, the
second at the even position. We know how to calculate the Luhn
number for these two digits, and then we can add the result for
them into the result for the remainder of the list. That’s our
recursive step.

When we finally empty the list, we will have calculated the
required sum, so we can simply return it.

There’s one other case to consider. If the list has an odd
number of digits, then when we get to the end, we’ll only have a
single element. But we know that element is at an odd position,
so we can simply add it to the accumulated sum and return the
result.

So, here’s the new version of our code:

defmodule CheckDigit do

import Enum

@doc """

Determine if a sequence of digits is valid, assuming the last digit is

a Luhn checksum. (http://en.wikipedia.org/wiki/Luhn_algorithm)

"""

def valid?(numbers) when is_list(numbers) do

numbers

|> reverse

|> map(&1 - ?0)

|> sum

|> rem(10) == 0

end

defp sum(digits), do: _sum(digits, 0)

defp _sum([], sum), do: sum

defp _sum([odd], sum), do: sum + odd

defp _sum([odd, even | tail], sum) when even < 5 do

_sum(tail, sum + odd + even*2)

end

defp _sum([odd, even | tail], sum) do

_sum(tail, sum + odd + even*2 - 9)

end

end

The pipeline at the top is now a lot simpler—there’s no messing
with indexes, and no temporary variables. It reads like a code
version of the spec.

The sum function is an example of a common pattern. We
need to set the initial value of the value we’re summing, but we
don’t want the code that calls us to know about that detail, so
we write a version of sum that just takes the numbers,
and then calls the actual implementation, passing in the list
and a zero for the initial value. We could give the helper
functions the same name, but I prefer using _sum to
differentiate them. (Many Elixir programmers would have called
them do_sum, but that always strikes me as being too
imperative.)

The _sum function has 4 heads:

If the list is empty, we return the sum that we’ve been
accumulating in the second parameter.

If the list has one element, add its value to the sum so far
and return it. This is the terminating condition for a list
with an odd number of elements.

Otherwise, we extract the first two elements from the
list. This uses the pattern [odd,even|tail]. The
first element is bound to odd, the second to
even, and the remainder of the list is bound to
tail.

Looking back at the Luhn algorithm, we have two cases to
consider. If the result of multiplying the even number by
two is less than ten, then that’s the number we add into the
sum. We use a guard class to check for this.

Otherwise we have to subtract nine from the product. That’s
what the 4th function body does.

Notice how we’re passing the updated sum around as the second
parameter to the function—this is a universal pattern when you
want to accumulate a value or values across a set of recursive
function calls.

What’s Different About this Code

When you write in a language such as Java, C#, or Ruby, you’re
working at a number of levels simultaneously. Part of your brain
is thinking about the specification—what has to get done. The
other part is thinking about the implementation—the nuts and
bolts of how to do it. And that’s where things often get bogged
down.

Look at the last example. We’re iterating over a set of
digits. We’re selecting those with odd or even positions. We’re
performing conditional calculations. We’re summing the
result. And there isn’t a single control structure in the
program. No ifs, no loops. The code pretty much reflects
the specification of what we want to happen.

And that’s one of the reasons I’m a fan of functional
programming in general, and Elixir in particular.

Next Month

Let’s start getting parallel. We’ll see how we can use Elixir to
run hundreds of thousands of processes, and how to coordinate
their work.

While you’re waiting, check out the screencasts available on my
Elixir book’s home
page.

And if you want the real scoop, you can always buy the book. :)

(Fibonacci image from http://www.maths-rometus.org/)

Dave Thomas is a programmer who likes to evangelize cool
stuff. He cowrote The Pragmatic Programmer, and
was one of the creators of the Agile Manifesto. His book
Programming Ruby introduced the Ruby language to
the world, and Agile Web Development with Rails
helped kickstart the Rails revolution.