From math to machine: translating a function to machine code

In this post I'm going to explore how a mathematical concept can be redefined
in progressively more computer-oriented terms, all the way from high level
languages down to machine code, ready for direct execution by a computer. To
that end, I'm going to define the same logic in several different but related
formats:

Math - pure mathy goodness

Haskell - a functional programming language

C - an imperative programming language

Assembly - a more readable representation of machine code

x86-64 machine code - the real deal

If you're interested in how language styles can differ or curious about what
your code might look like after being compiled, keep reading!

Factorials in math

The factorial of a positive integer n, written n!, is the product of n and all
smaller integers greater than 0. There are lots of ways to write a definition
like this. One such way is as follows:

$$n! = \prod_{k=1}^n k$$

This definition states that n! is the product of all integers from 1 to n.
For example, the factorial of 5 is:

5! = 1 * 2 * 3 * 4 * 5 = 120

Here are the first few factorials:

| n | n!  |
|---|-----|
| 0 | 1   |
| 1 | 1   |
| 2 | 2   |
| 3 | 6   |
| 4 | 24  |
| 5 | 120 |
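One more way to write it, offered as a supplementary formulation rather than something the later sections depend on, is as a recurrence. It also accounts for the 0! entry above: multiplying no numbers at all is taken to give 1.

$$0! = 1, \qquad n! = n \cdot (n-1)! \quad \text{for } n \ge 1$$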

One important use of factorials is calculating the total number of permutations
of a set. For example, the string "cat" can be rearranged in 6 possible ways:
"cat", "act", "atc", "tac", "tca", and "cta". This string has 3 letters
and 3! = 6.

The string "a", which has one character, can only be arranged in that one
way. You can't reorder the string "a", so it has only one permutation:
1! = 1.
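If you'd rather check that claim by brute force than take it on faith, here is a rough sketch in C that counts the orderings of a short string by actually generating them. The function names are made up for this illustration, and it assumes every character in the string is different (as in "cat").

```c
#include <stddef.h>
#include <string.h>

/* Swap two characters in place. */
static void swap_chars(char *a, char *b)
{
    char tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Count the orderings of s[start..len-1] by trying every remaining
 * character in position `start`, then recursing on the rest.
 * Assumes all characters are distinct. */
static unsigned long count_from(char *s, size_t start, size_t len)
{
    if (start + 1 >= len)
        return 1;                       /* zero or one character left */

    unsigned long total = 0;
    for (size_t i = start; i < len; i++) {
        swap_chars(&s[start], &s[i]);
        total += count_from(s, start + 1, len);
        swap_chars(&s[start], &s[i]);   /* undo the swap */
    }
    return total;
}

/* Hypothetical helper: number of permutations of a short string. */
unsigned long count_permutations(const char *text)
{
    char buf[16];
    size_t len = strlen(text);

    if (len >= sizeof buf)
        return 0;                       /* too long for this sketch */
    memcpy(buf, text, len + 1);         /* work on a private copy */
    return count_from(buf, 0, len);
}
```

Calling count_permutations("cat") gives 6 and count_permutations("a") gives 1, matching 3! and 1!.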

This comes up a lot in algorithm analysis. An algorithm which has to consider
every possible permutation of its input is said to run in factorial time. In
Big O notation that looks like this: O(n!). Algorithms of this type scale
very poorly, so it's useful to be able to recognize these kinds of algorithms,
if only so you know to try to find a faster way to solve the problem.
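To get a feel for how poorly "very poorly" is, here is a small sketch in C. The function name and the idea of an "operation budget" are my own framing for illustration: given a number of steps you can afford, it finds the largest input size n whose factorial still fits.

```c
#include <stdint.h>

/* Largest n such that n! <= budget.
 * Hypothetical helper for illustration; expects budget >= 1. */
unsigned max_n_within_budget(uint64_t budget)
{
    uint64_t fact = 1;  /* running value of n! */
    unsigned n = 1;

    /* Grow n while (n + 1)! still fits in the budget. Dividing instead
     * of multiplying keeps the 64-bit value from overflowing. */
    while (fact <= budget / (n + 1)) {
        n++;
        fact *= n;
    }
    return n;
}
```

With a budget of one billion steps this returns 12, because 13! is already over six billion. Even a budget as large as the biggest 64-bit unsigned number only gets you to n = 20.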

Factorials in a functional language

Just like there are lots of ways to describe something mathematically, there
are also lots of ways to describe things to computers. Let's start with
Haskell, which among other useful features, happens to have a pretty cool
looking logo:

Haskell is a purely functional language. In broad terms, this means that
instead of telling the computer what to do, a Haskell program tells the
computer what things are. Once a program has been written in Haskell, it's
up to the Haskell compiler to figure out how to translate those definitions
into instructions which a computer can understand.

Take a look at the following Haskell function, which calculates the factorial
of the number provided to it:

factorial :: Int -> Int
factorial n = product [1..n]

If you haven't played around with functional languages yet, this probably looks
pretty strange.

The first line says that factorial is a function which takes an integer and
returns another integer. Here's an oddly-formatted version of that first line,
spaced out so you can see roughly which parts of the syntax mean what:

-- factorial is a function which takes an integer and returns an integer
   factorial ::                  Int              ->          Int

This first line is technically optional but it's usually good practice to
include it. Haskell is pretty smart so it can figure out type signatures on its
own most of the time, but it's still useful to document the function signature
for other programmers or your future self.

The second line defines the function body, which can be read as "the factorial
of n is equal to the product of all integers from 1 to n". Here's another
spaced out version to show which parts of the syntax mean what:

-- The factorial of n is equal to the product of all integers from 1 to n
       factorial    n =               product                     [1 .. n]

Notice we're not telling Haskell how to calculate a factorial, we're defining
what a factorial is. This is one of the more important differences between
functional languages and imperative languages.

Let's break this definition down further. When this function is called with
some number n, the part after the equal sign is evaluated and returned:

product [1..n]

First, let's look at the part in square brackets:

[1..n]

This is a list range. A list in Haskell is kind of like an array in other
languages. That is, it's an ordered collection of values, all with the same
type. You can have lists of ints, floats, strings, custom types, or even lists
of lists.

The .. makes this particular list a range. This creates a list of all
integers from 1 to n. So if n is equal to 5, this will make a list with 5
values:

[1, 2, 3, 4, 5]

Once the list range is evaluated, it's passed to the product function, like
so:

product [1, 2, 3, 4, 5]

The product function takes a list of numbers, multiplies them all together,
and returns the result. So this will evaluate to:

1*2*3*4*5

The answer turns out to be 120, which just so happens to be 5!, the
very number we were looking for. What a lucky coincidence!

Once the factorial function above is defined, you can get the factorial of a
number by calling it like this:

factorial 5

Factorials in an imperative language

We've seen how the mathematical idea of factorials can be expressed in the
style of a functional programming language. Now we'll go another level deeper
and see the same thing in an imperative language called C, which I like quite a
bit, even though its logo is unfortunately not as cool as Haskell's:

Programming in functional languages like Haskell generally works by defining
what things are and letting the language work out how to arrive at the answer.
Programming in an imperative language involves explaining to the computer how
to perform the calculations yourself, as a series of steps the computer can
follow.
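Pieced together from the fragments examined line by line below, the complete C function would look like this:

```c
/* Returns n! for n >= 0. The result has to fit in an int, so this is
 * only meaningful for small n. */
int factorial(int n)
{
    int ret = 1;

    while (n > 1)
    {
        ret *= n;
        n--;
    }

    return ret;
}
```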

Fundamentally, this is exactly the same logic as in the Haskell version; it's
just specified in a different way.

At a high level, this function does the following:

1. Set ret to 1. This is going to be the return value.
2. Multiply n by ret.
3. Subtract 1 from n.
4. Repeat steps 2-3 as long as n is greater than 1.
5. Return the value in ret.

Let's step through this line by line to see how this is done.

int factorial(int n)
{

This marks the beginning of the factorial function. It states that
factorial is a function which takes an integer called n and returns another
integer. This doesn't map to English quite as naturally as the Haskell type
signature, but we can try:

// Returning an int, factorial is a function which takes an int named n
                int  factorial (                            int       n)

The ordering of the syntax makes it a bit more awkward, but this line means
the same thing as the Haskell function signature.

int ret = 1;

This declares a new integer called ret and gives it the value 1. This is
going to be the return value. We'll repeatedly multiply n against this
variable and return it when we're done.

while (n > 1)
{

This starts a loop. The while (n > 1) part means to run the code inside
the curly braces { ... } over and over as long as n is greater than 1.
If n is 0 or 1 at the start of the function, this loop will never run at all.

ret *= n;

Each time the loop runs, we multiply n by ret and store the result in
ret.

n--;

Then we subtract 1 from n. This way, n will keep going down each time the
loop runs.

}

This is the end of the loop body. When execution reaches this point, it will
jump back to the beginning of the loop and run it again, assuming the
condition in the while line is still true.

return ret;
}

Once the loop is finished, we return the value in ret and end the function.

This factorial function can be called like this:

factorial(5);

When it's called with a 5, the following steps happen:

ret is set to 1.

n is 5 and 5 > 1, so the loop body runs.

ret is multiplied by 5, changing it to 5.

n is decremented, changing it to 4.

n is 4 and 4 > 1, so the loop body runs again.

ret is multiplied by 4, changing it to 20.

n is decremented, changing it to 3.

n is 3 and 3 > 1, so the loop body runs again.

ret is multiplied by 3, changing it to 60.

n is decremented, changing it to 2.

n is 2 and 2 > 1, so the loop body runs again.

ret is multiplied by 2, changing it to 120.

n is decremented, changing it to 1.

n is 1 and 1 is not greater than 1, so the loop ends.

ret, with a value of 120, is returned to the caller.
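If you want to watch those steps happen rather than read them, one option is to sprinkle printf calls through the function. This is just a sketch: factorial_traced is a name made up for this example, and the messages are my own wording.

```c
#include <stdio.h>

/* Same loop as factorial(), with a printf after each step.
 * factorial_traced is a name made up for this sketch. */
int factorial_traced(int n)
{
    int ret = 1;
    printf("ret is set to %d\n", ret);

    while (n > 1)
    {
        ret *= n;
        printf("ret is multiplied by %d, changing it to %d\n", n, ret);
        n--;
        printf("n is decremented, changing it to %d\n", n);
    }

    printf("returning %d\n", ret);
    return ret;
}
```

Calling factorial_traced(5) walks through the same multiplications as the list above and returns 120.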

Factorials in assembly language

Despite differences in style, the C and Haskell functions are both relatively
high-level. That means you don't need to bother yourself much with the
particulars of the machine you're writing the code for: both the C and Haskell
compilers can handle turning code into something appropriate for the computer
being targeted. But what does the code they generate look like?

This gets us into assembly language. Assembly language is a symbolic form of
machine code. For the most part, instructions in assembly language map directly
to machine code instructions.

Because of this, we can't state things in quite the same terms in assembly as
we did in C and Haskell: higher level languages do a lot to adapt their syntax
to how humans think, but in assembly we have to do some of that work ourselves
and adapt our thinking to the particulars of the hardware.

There are actually lots of assembly language syntaxes. In this case we'll be
using the Netwide Assembler, also known as nasm. Before we move
on, let's get the truly important stuff out of the way. Here's nasm's logo:

I'm afraid Haskell still wins, but this one isn't bad at all.

Here's a factorial function written in nasm syntax for an x86-64 computer:

factorial:
    mov rdi, 1
.loop:
    cmp rax, 1
    jle .done
    imul rdi, rax
    dec rax
    jmp .loop
.done:
    ret

Okay, what happened? If you can make any sense of this at all you're either
already familiar with assembly or you're a lot smarter than I am. The C and
Haskell versions at least have some complete words and some familiar-ish
expressions. However, even though the style of code has changed substantially,
the same logic is here, in more or less the same order.

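Take a look at this version with comments added to show roughly what each line
does in C:

factorial:              ; int factorial(int n) {
    mov rdi, 1          ;     int ret = 1;
.loop:
    cmp rax, 1          ;     while (n > 1)
    jle .done           ;     {
    imul rdi, rax       ;         ret *= n;
    dec rax             ;         n--;
    jmp .loop           ;     }
.done:
    ret                 ;     return ret; }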

The mapping isn't perfect, but with the comments in place, this code should
look a little less odd.

Even though it's described differently, the basic logic is the same as the C
version. This is no accident: C is a fairly thin layer over assembly and most
of its constructs can map pretty directly to a wide variety of machines.

One critical difference between the assembly version and the C/Haskell
versions is that there is no type signature in assembly. Nowhere does the
assembly version define what inputs the factorial function accepts or what
outputs it returns. Instead, it expects that the input value n has been
loaded into a register called rax before the function was called. It
leaves its return value in rdi when it exits, again assuming that the caller
will know where to find that answer. Nowhere in the code is this expressed in
concrete terms: to use this function you basically have to already know how it
works. Ideally there would be comments in the code or external documentation
containing this information. If not, you'd have to read the function's code to
try and work out how to use it.

When the function starts, it sets rdi to 1, which will be the return value.
Next, it repeatedly multiplies that return value by the value in rax,
subtracting 1 from rax each time. Once rax reaches 0 or 1, the function
ends and the return value is left in rdi for the caller to use. Assuming the
caller put an integer in rax before calling the factorial function, it will
find the factorial of that integer in rdi when the function returns.

If you're curious what these instructions really do beyond just seeing how
they could map to C-style syntax, read on!

More detail on the assembly version

First, a quick primer. The CPU doesn't really think in terms of variables like
int ret = 1; or expressions like ret *= n;. Instead, it has a
number of registers. Each register can store a fixed amount of data. On a
64-bit processor, the general-purpose registers store 64 bits each.

By executing instructions, a program can load data into these registers and
then do math on that data.

Since registers are tiny chunks of memory inside the CPU hardware, performing
operations on registers is lightning-fast because the CPU doesn't need to wait
on data to move to or from system memory.

A few of the more commonly-used registers are:

| Register | Description |
|----------|-------------|
| rax | General-purpose |
| rbx | General-purpose |
| rcx | General-purpose |
| rdx | General-purpose |
| rdi | General-purpose |
| rsi | General-purpose |
| rbp | Often used to keep track of the start of a stack call frame |
| rsp | Always points to the top of the stack |
| rip | Always points to the next instruction to be executed |

The general-purpose registers can mostly be used however you want. Other
registers have specific purposes with rules about how they can be used or
modified.

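For the most part, the CPU performs calculations on data held in registers.
(Many x86-64 instructions can also read one operand straight from memory, but
the register-only picture is the simplest way to think about it.) So in order
to add two numbers together, you first tell the computer to load each number
into a register, and then you tell it to add the values in those registers
together.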

Let's go through the assembly function line by line to see how it works in
more detail.

factorial:

This marks the beginning of the factorial function. A name followed by a
colon is called a label. We can tell the computer to jump to this label
whenever we want the code after the label to run.

At the beginning of the function, we assume that the caller set rax to
some integer n.

mov rdi, 1

We're going to return the result of the function in a register called rdi.
This instruction sets rdi's initial value to 1. We have no idea what this
register is set to at the beginning of the function because registers aren't
automatically cleared when functions are called. We have to set it to
something before we use it.

.loop:

This is another label, marking the beginning of the loop. The dot at the front
makes it a local label, making it local to the function we're in. We can jump
to .loop: anytime we want the loop to run.

cmp rax, 1

Each time the loop runs, the first thing we need to do is check if the loop
should end yet.

Remember n is stored in rax. This instruction compares the value in rax
to 1. It doesn't act on the result itself; it records the outcome in the CPU's
flags register so that a later instruction can act on it.

jle .done

This instruction acts on the previous compare instruction. jle means to
jump if less than or equal to. So if rax is less than or equal to
1, execution will skip ahead to the .done: label, ending the loop. Otherwise,
execution will continue to the next instruction, which will run the loop body.

imul rdi, rax

If the program didn't jump out of the loop to the .done: label, we know that
n must be 2 or higher. This instruction multiplies rdi by rax and stores
the result in rdi.

dec rax

This instruction decrements rax, which means to subtract 1. If rax is 5,
this instruction will set it to 4.

jmp .loop

This instruction jumps back to the .loop: label, which starts the loop over
again.

.done:

Once the loop is finished running, execution will jump here.

ret

The factorial result should now be sitting in rdi. This instruction ends the
function, causing execution to jump back to wherever it left off when this
function was called. The return value of n! will be left in the rdi
register for the caller to use.

So you can see that the logic is pretty similar to the C version. Implementing
higher-level constructs like C's while loop requires jumping around between
labels and separate comparison instructions, but it works the same. The function
is much less self-documenting since it has no formal type definition, but
otherwise it takes the same input and provides the same output.
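One way to see the correspondence is to go the other direction and write C that deliberately mimics the assembly, one statement per instruction, using goto in place of the jumps. This is only a sketch: rax and rdi here are ordinary C variables standing in for the registers, and factorial_goto is a name made up for this example.

```c
#include <stdint.h>

/* C rendering of the assembly function, one statement per instruction.
 * rax and rdi are plain variables standing in for the registers. */
int64_t factorial_goto(int64_t rax)
{
    int64_t rdi;

    rdi = 1;                    /* mov rdi, 1             */
loop:
    if (rax <= 1) goto done;    /* cmp rax, 1 / jle .done */
    rdi *= rax;                 /* imul rdi, rax          */
    rax--;                      /* dec rax                */
    goto loop;                  /* jmp .loop              */
done:
    return rdi;                 /* ret                    */
}
```

Nobody would write C this way on purpose, but it gives the same answers as the while loop version: factorial_goto(5) is 120.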

Factorials in machine code

We've seen the assembly language version of a factorial function, but can a
computer run that directly? The answer is... almost. Assembly language is a
mnemonic for machine code, meaning that each instruction maps to a machine
code instruction. However, in assembly, the instructions are specified using
bits of English words and numbers in decimal notation in order to be easier for
humans like me and (presumably) you to read and write.

We can assemble code by hand using the convenient reference at
ref.x86asm.net. A detailed look at hand-assembling
code is probably a topic for another day, but just for fun, let's take a quick
look at how the assembly function could map to machine code.

This is the factorial function in machine code. Makes sense, right? I'm glad
you understand, thanks for reading!

...

Yeah I can't read this very well either, but this is kind of how a computer
sees machine code. It's a big slab of bytes sitting somewhere in memory. The
rip register stores an address to one of those bytes. When the computer runs
an instruction it checks the value of rip to see where it's pointing and it
decodes the data it finds there. That means that according to a bunch of
rules it sorts out what instruction is meant by a series of bytes.

Once decoded, the CPU does whatever the instruction tells it to do. By default
rip is advanced to point to the next instruction after the one being run.
This way, the next time the CPU runs an instruction, rip will be pointing at
the next instruction in memory. This causes instructions to run in sequence.
However, in some cases (such as jumps) the instruction modifies the rip
register to point somewhere else, causing execution to jump around.

The code above is represented as hex. Each pair of hex digits is one byte. This
could just as easily be represented as a series of 0s and 1s (8 per byte) or in
decimal (a number from 0-255 for each byte). For example:

| Decimal | Hex | Binary   |
|---------|-----|----------|
| 72      | 48  | 01001000 |

The way the data is represented isn't really important. You could make up your
own encoding format if you wanted, even though nobody else would know how to
read it.
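As a quick illustration that these are all the same byte wearing different outfits, here is a sketch in C that writes out a byte's bits as text. byte_to_binary is a name made up for this example; C's printf can already print decimal and hex, so binary is the only one that needs help.

```c
#include <stdint.h>

/* Write the 8 bits of `byte` into `out` as '0'/'1' characters, most
 * significant bit first, and return `out`. The buffer needs room for
 * 9 chars (8 digits plus the terminating NUL).
 * byte_to_binary is a name made up for this sketch. */
char *byte_to_binary(uint8_t byte, char out[9])
{
    for (int bit = 7; bit >= 0; bit--)
        out[7 - bit] = ((byte >> bit) & 1) ? '1' : '0';
    out[8] = '\0';
    return out;
}
```

Passing in 0x48 (which C treats as exactly the same value as 72) fills the buffer with 01001000.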

Since this slab of hex bytes isn't very helpful, let's break it up into
instructions:

Probably the biggest difference between this and the assembly version (other
than vaguely English-inspired words turning into a soup of hex digits) is the
lack of labels. That's because labels like factorial: and .done: are a
convenience provided by assemblers. In machine code, jumps work by changing
the value in rip to point somewhere else.

Take a look at the assembled version of jle .done:

7e 0c jle .done ; Jump ahead 12 bytes

In this instruction, each byte has a meaning:

| Hex | Role    | Value | Meaning                       |
|-----|---------|-------|-------------------------------|
| 7e  | Opcode  | jle   | Jump if less than or equal to |
| 0c  | Operand | 12    | Jump 12 bytes forward         |

So the 7e tells the computer to jump depending on a previous cmp
instruction. 0c tells it exactly where to jump, assuming the jump happens.
0c is hex for 12. All together, this means to jump forward 12 bytes from
the current position.

When an instruction is executed, rip will be pointing to the next byte after
that instruction. So when 7e 0c (jle .done) is executed, rip will be
pointing to 48 0f af f8 (imul rdi, rax). If the jump occurs, rip will
be increased by a value of 12, making it point to the c3 (ret) all the
way at the end. The next instruction to run will therefore be either
48 0f af f8 (imul rdi, rax) or c3 (ret), depending on the outcome of the
comparison being performed.
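The arithmetic behind that is small enough to sketch in C. The function and parameter names below are made up for illustration, and the byte positions in the example are counted from the start of cmp rax, 1 (position 0), which is my own choice of origin.

```c
#include <stdint.h>

/* Where a relative jump lands: the address of the instruction after the
 * jump (which is what rip holds while the jump executes) plus the signed
 * offset stored in the jump instruction.
 * jump_target and its parameters are names made up for this sketch. */
int64_t jump_target(int64_t jump_addr, int64_t jump_len, int64_t offset)
{
    int64_t next_instruction = jump_addr + jump_len;
    return next_instruction + offset;
}
```

With cmp rax, 1 occupying positions 0 through 5, jle .done sits at position 6 and is 2 bytes long. jump_target(6, 2, 12) comes out to 20, which is 12 bytes past imul rdi, rax at position 8.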

How about jumping backwards? It works the same, except it uses a negative
offset. Take a look at the code for jmp .loop, which jumps back to the
start of the loop:

e9 ec ff ff ff jmp .loop ; Jump back 20 bytes

This instruction can be broken into two pieces like the previous one:

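| Hex         | Role    | Value | Meaning                                  |
|-------------|---------|-------|------------------------------------------|
| e9          | Opcode  | jmp   | Jump no matter what (unconditional jump) |
| ec ff ff ff | Operand | -20   | Jump 20 bytes backward                   |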

So e9 tells the CPU to jump and ec ff ff ff tells it to jump
backward 20 bytes. When this instruction is executed, rip will be pointing to
c3 (ret) at the end of the function. Applying a delta of -20 to rip
will cause execution to jump back 20 bytes to
48 3d 01 00 00 00 (cmp rax, 1), which will run the loop again.
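If it's not obvious why ec ff ff ff means -20: x86 stores multi-byte numbers least significant byte first, so those four bytes spell the 32-bit value 0xffffffec, and in two's complement that bit pattern is -20. Here is a sketch of the decoding in C. decode_rel32 is a name made up for this example.

```c
#include <stdint.h>

/* Reassemble four little-endian bytes into a signed 32-bit offset.
 * decode_rel32 is a name made up for this sketch. */
int32_t decode_rel32(const uint8_t bytes[4])
{
    /* Least significant byte comes first in memory. */
    uint32_t raw = (uint32_t)bytes[0]
                 | ((uint32_t)bytes[1] << 8)
                 | ((uint32_t)bytes[2] << 16)
                 | ((uint32_t)bytes[3] << 24);

    /* Two's complement: if the top bit is set the value is negative.
     * Flipping the bits and subtracting 1 avoids relying on how the
     * compiler converts out-of-range unsigned values. */
    if (raw & 0x80000000u)
        return -(int32_t)(~raw) - 1;
    return (int32_t)raw;
}
```

Feeding it the bytes ec ff ff ff gives -20. Feeding it 0c 00 00 00 gives 12, the same forward offset the shorter jle encoding squeezed into a single byte.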

Providing labels and letting us jump to labels instead of offsets is a very
convenient feature provided by assemblers. Without it, you'd have to count
bytes to implement control structures like conditionals and loops. Every time
you added, removed, or even changed instructions, you'd have to recalculate all
your jump offsets. So assemblers help out a lot more than just translating
pseudo-English like jmp or rax to their binary equivalents.
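The bookkeeping an assembler does for each jump boils down to one subtraction. Here is a rough sketch in C. The names are made up for illustration, positions are again counted from the start of cmp rax, 1, and a real assembler also has to pick between short and long jump encodings, which this ignores.

```c
#include <stdint.h>

/* Offset to store in a relative jump, given where the jump sits, how
 * many bytes it occupies, and where its label points.
 * relative_offset and its parameters are names made up for this sketch. */
int64_t relative_offset(int64_t jump_addr, int64_t jump_len, int64_t label_addr)
{
    /* Offsets are measured from the end of the jump instruction. */
    return label_addr - (jump_addr + jump_len);
}
```

For jle .done (position 6, 2 bytes, label at 20) this gives 12, and for jmp .loop (position 15, 5 bytes, label at 0) it gives -20. Insert one extra instruction inside the loop and both numbers change, which is exactly the chore labels save you from.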

Other than the labels being replaced by relative offsets and everything being
converted into a binary format, it's the same logic as the assembly version,
which is nearly the same as the C version. It's almost like all these languages
and formats are somehow related to each other. Spooky!

Conclusion

We've seen an idea translated gradually down through several languages, ending
up with machine code. Hopefully this has been interesting and possibly even
enlightening. If you enjoyed this, feel free to send me millions of dollars.
Thanks for reading!