ARM Assembler

ARM processors are RISC (Reduced Instruction Set Computer) chips used widely in mobile phones and other embedded devices. The use of an ARM processor in the Raspberry Pi will promote the study of RISC processors in schools. The ARM registers and instruction sets are different from those we have described for x86 Intel processors. This section will compare ARM assembler with Intel syntax, so you should be familiar with the material in our in-line assembler tutorial. It is our intention to write assembler code for the Raspberry Pi, but because of its restricted availability we will start by testing our code in a simulator.

We installed Cygwin (which gives us some Unix-type functionality within Windows) then GNUARM. From the GNUARM home page we selected the "FILES" page and executed the GCC-4.1 toolchain setup file labelled

You can choose how you use the registers, but you may need to preserve the contents of certain registers by pushing them before use and popping them afterwards. By convention, registers r0 to r3 and r12 may legitimately be "corrupted" by a routine. The following table shows conventional uses of registers.

Table 1. Uses of Registers

| Name | Alternative name | Description |
| --- | --- | --- |
| r0 - r3 | | Used to hold arguments for procedures and as scratch registers (for temporary storage). R0 is used to return the result of a function. |
| r4 - r9 | v1 - v6 | General purpose or storage of variables |
| r10 | sl, v7 | Stack limit pointer, used by assemblers for stack checking when this option is selected by the user. |
| r11 | fp, v8 | Frame pointer. From Jack Crenshaw's section on local variables when translating procedures, "Formal parameters are addressed as positive offsets from the frame pointer, and locals as negative offsets". |
| r12 | ip | Intra-Procedure-call scratch. Used with r0 - r3 for temporary storage and original contents do not need to be preserved. |
| r13 | sp | Stack pointer |
| r14 | lr | Link register, holding the return address from a function |
| r15 | pc | Program counter, holding the address of the next instruction |
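To make the argument and result conventions in Table 1 more concrete, here is a minimal sketch in C. The function name sum5 is hypothetical and does not come from this tutorial. The register assignments in the comments describe what a compiler following the standard ARM procedure call convention would typically do, and the exact code generated depends on the compiler and its options.

```c
#include <stdint.h>

/* Hypothetical example function (not part of the original tutorial).
 * Under the usual ARM procedure call convention, a, b, c and d are
 * passed in r0, r1, r2 and r3. The fifth argument, e, does not fit
 * in r0 - r3 and is passed on the stack instead. */
int32_t sum5(int32_t a, int32_t b, int32_t c, int32_t d, int32_t e)
{
    int32_t total = a + b;  /* r0 - r3 are scratch, so they may be overwritten */
    total += c;
    total += d;
    total += e;             /* e typically has to be loaded from the stack first */
    return total;           /* the result is handed back in r0 */
}
```

Calling sum5(1, 2, 3, 4, 5) returns 15. Compiling a file containing this function with the -S option lets you check how your own compiler allocates the registers.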

The instruction set is summarised in a quick reference card. We tabulate below selected
instructions that you are most likely to use at first, together with their equivalents in Intel syntax.

Table 2. Selected Operations

| ARM Mnemonic | Intel Mnemonic | Function |
| --- | --- | --- |
| ADD | ADD, INC | Addition |
| SUB | SUB, DEC | Subtraction |
| RSB | | Reverse subtraction |
| MUL | IMUL | Multiplication |
| AND | AND | Bitwise AND |
| ORR | OR | Bitwise OR |
| EOR | XOR | Bitwise exclusive OR |
| MVN | NOT | Bitwise NOT |
| TST | TEST | Test (performs bitwise AND and sets flags according to the result) |
| CMP | CMP | Compare |
| B | JMP | Unconditional jump/branch |
| BEQ | JE | Jump/branch if equal |
| PUSH | PUSH | Push onto stack |
| POP | POP | Pop from stack |
| MOV, LDR, STR | MOV | Transfer (copy) |
| MOVEQ | CMOVE | Copy if equal |
| ADR | LEA | Load address |
| BL | CALL/RET | Call a subroutine then return |
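Some entries in Table 2 are less familiar to Intel programmers than others, in particular RSB, MVN and TST. The following C sketch models what these three operations compute on 32-bit values. The function names are hypothetical helpers invented for this illustration, and the sketch models only the arithmetic, not the full flag behaviour of the processor.

```c
#include <stdint.h>

/* Model of RSB rd, rn, op2: the subtraction runs in reverse order,
 * so the result is op2 - rn (with 32-bit wrap-around). */
uint32_t arm_rsb(uint32_t rn, uint32_t op2)
{
    return op2 - rn;
}

/* Model of MVN rd, op2: the result is the bitwise NOT of op2. */
uint32_t arm_mvn(uint32_t op2)
{
    return ~op2;
}

/* Model of the Z (zero) flag after TST rn, op2:
 * returns 1 when the bitwise AND of the operands is zero, else 0. */
int arm_tst_zero_flag(uint32_t rn, uint32_t op2)
{
    return (rn & op2) == 0;
}
```

For example, arm_rsb(3, 10) gives 7, which corresponds to RSB with a first operand of 3 and a second operand of 10. A common use of RSB is negation, where the second operand is the immediate value 0.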

The following bullet points highlight features of ARM assembly language that are strikingly different from Intel syntax.

- The result register (the first register following the mnemonic) can be different from the two operands. For example, the instruction ADD r0, r1, r2 adds r2 to r1 and stores the result in r0.
- The suffix "S" added to a mnemonic such as MOV (giving MOVS) makes the operation update the condition flags.
- A conditional suffix such as EQ (if equal), NE (if not equal), GT (if greater than) or GE (if greater than or equal) can be added to most mnemonics.
- If you need to operate on data in a memory location, you must load it into a register first (ARM processors have a load/store architecture).
- For storing (mnemonic STR), the source precedes the destination.
- When loading, you can load directly from memory (e.g. ldr r1, num1), but when saving, you must put the destination address in a register and use indirect addressing with square brackets (e.g. str r0, [r3]).
- There are restrictions on the immediate values that you can load directly with MOV, but you can use the syntax ldr rx, =immediate value, e.g. ldr r0, =625.
- You can apply shifts to the second operand "cheaply" as part of an operation.
- Only recent ARM processors have a division instruction. On other processors you need to branch to a library routine instead. You can also use library routines such as puts for outputting a string and printf for outputting a string with formatted parameters.
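The restriction on MOV immediates mentioned above can be made concrete. In the classic 32-bit ARM instruction set, an immediate operand is stored as an 8-bit value rotated right by an even number of bit positions. The C sketch below tests whether a value fits that pattern. The function names are hypothetical, and the sketch ignores later additions such as Thumb-2 encodings and the assembler's habit of substituting MVN when the complement of the value fits.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by 0..31 bits. */
static uint32_t rotate_left(uint32_t value, unsigned bits)
{
    bits &= 31u;
    if (bits == 0) {
        return value;  /* avoid an undefined shift by 32 below */
    }
    return (value << bits) | (value >> (32u - bits));
}

/* Returns 1 if value can be written as an 8-bit constant rotated right
 * by an even amount (0, 2, ..., 30), otherwise 0.
 * Rotating left by the same amount undoes the rotation, so it is enough
 * to look for an even rotation that leaves only the low 8 bits set. */
int arm_is_mov_immediate(uint32_t value)
{
    unsigned rot;
    for (rot = 0; rot < 32; rot += 2) {
        if (rotate_left(value, rot) <= 0xFFu) {
            return 1;
        }
    }
    return 0;
}
```

With this check, 255 and 256 both qualify, but 625 (0x271) does not because its set bits span ten positions. That is why the example above uses ldr r0, =625 rather than MOV.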

Sample code follows the table of commands below that are used to process it. The arm-elf commands require the bin folders of Cygwin and GNUARM to be listed in the PATH environment variable. Change the working directory to that of your source files, e.g. C:\ARM.

Table 3. Useful Commands

| Command at Cygwin prompt | Result |
| --- | --- |
| cd C:\ARM | Changes current directory to C:\ARM |
| arm-elf-gcc -o temp.elf temp.s | Assembles temp.s |
| arm-elf-run temp.elf | Simulates the running of temp.elf |
| arm-elf-gcc -S -o div.s div.c | Compiles div.c as far as the ARM assembler div.s (showing how compiler uses registers and routines) |

The above code would not assemble on the Raspberry Pi. The amended code below assembles and runs on the simulator and on the Raspberry Pi. Instead of loading a variable directly with ldr r4, num1, we need to load the address of the variable into another register (ldr r6, =num1) and then use indirect addressing (ldr r4, [r6]). We have added .data before the data declarations and .text to mark the start of the code section.

Commands for the Raspberry Pi

You can transfer files between a PC and the Pi without networking by using either the SD card or a USB memory stick. The boot directory /boot is visible in Windows, which makes it useful for transferring files between a Windows computer and the Pi using the SD card.

cd /usr/bin changes the current directory to the one containing gcc.

gcc -o ~/temp /boot/temp.s compiles the assembler file temp.s in the boot directory to the executable temp in your home directory.

cd changes the current directory to your home directory.

./temp executes the file temp in the current directory.

This screenshot shows a way of compiling without changing the current directory. The 80 KB program scrot is for capturing screenshots. You can install it on your Pi with the command sudo apt-get install scrot. Our screenshot is saved to a USB memory stick for cropping on a PC. (We installed usbmount with the command sudo apt-get install usbmount so that it detects our flash drive, mounts it to /media/usb, then unmounts it upon removal.)

The next version (for the simulator and Pi) shows how you can save values to memory. The ARM processor is well supplied with registers for storing variables, but if you run out of registers you can load the address of a variable into a register (r3 in the code below) and then use indirect addressing to save data. Using memory locations instead of registers will slow down execution.
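As a rough analogy in C rather than assembler, saving through an address held in a register corresponds to storing through a pointer. The sketch below is an illustration only. The names result and save_sum are hypothetical, and the instructions in the comments show the kind of ARM code each step resembles, not the output of any particular compiler.

```c
/* Hypothetical variable standing in for a word declared in the .data section. */
static int result;

/* Adds two values in "registers", saves the sum to memory through
 * an address, then reads it back through the same address. */
int save_sum(int a, int b)
{
    int *address = &result;  /* like loading the address of the variable into r3 */
    int sum = a + b;         /* like add r0, r0, r1: the work is done in registers */
    *address = sum;          /* like str r0, [r3]: indirect store to memory */
    return *address;         /* like ldr r0, [r3]: indirect load from memory */
}
```

Calling save_sum(2, 3) stores 5 in result and returns 5. Each pointer access stands for a trip to memory, which is the extra cost that keeping values in registers avoids.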

See how ARM assembler code for the simulator is generated from TINY source code using programs TINY11ELF and TINY14ELF.
These programs generate assembler code that arm-elf-gcc assembles and links to create .elf executables that run in the arm-elf-run simulator. We need to make minor modifications to them so that their output code will assemble using the gcc translator on the Raspberry Pi.