Accessing the Raspberry Pi’s 1MHz timer

A fixed-rate timer is not part of the ARM specification, but most ARM-based SoCs have such a timer. The Raspberry Pi is no exception. However, reading its timer in Linux takes a Unix hacker’s understanding.

According to the official Broadcom 2835 documentation, the free-running 1MHz timer resides at ARM physical address 0x20003004(*), which breaks down thus: hardware I/O base at 0x20000000, system timer control at offset 0x3000, 1MHz counter at offset 0x0004. That’s all well and good, but how can a program read the timer?
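The address breakdown can be written out as constants. This is only a sketch: ST_BASE and TIMER_OFFSET are names used later in this article, while IO_BASE, ST_OFFSET and TIMER_ADDR are made up here for illustration.

```c
#include <stdint.h>

/* Values from the paragraph above. IO_BASE, ST_OFFSET and TIMER_ADDR are
 * illustrative names, not taken from any header. */
#define IO_BASE      UINT32_C(0x20000000)  /* hardware I/O base, ARM side */
#define ST_OFFSET    UINT32_C(0x3000)      /* system timer control block */
#define TIMER_OFFSET UINT32_C(0x0004)      /* 1MHz counter within the block */

#define ST_BASE      (IO_BASE + ST_OFFSET)      /* 0x20003000 */
#define TIMER_ADDR   (ST_BASE + TIMER_OFFSET)   /* 0x20003004 */

/* Compile-time check that the pieces add up to the documented address. */
_Static_assert(TIMER_ADDR == UINT32_C(0x20003004), "timer address mismatch");
```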

As it turns out, ancient Unix design provides a means for a program to read physical memory, via /dev/mem. According to the mem(4) man page, byte addresses in the /dev/mem device file are interpreted as physical memory addresses. Thus, seeking to offset 0x20003004 in /dev/mem and reading from there reads the Raspberry Pi’s system timer. So how can a program access that tiny space in /dev/mem?

POSIX provides a means to map a small region of a file into a process’s address space, via mmap(). The catch is, the system timer is read-only, so any attempt at writing to it is “discouraged” (and may result in a catastrophic failure). The parameters to mmap() must be set so that a program can’t write to the mmap()’ed region. Passing O_RDONLY to open(), and PROT_READ to mmap(), ensures that any attempt by the program to write to its mapped memory will fail.

Because such “raw” access to memory can reveal security information, and the ability to write to memory can alter any aspect of the running system, the /dev/mem device is restricted to the root user. Even if a program is setuid-root (which sets the effective UID), if the invoking user is not root (real UID 0), then any attempt to open /dev/mem will fail. Even if a non-root user is a member of the kmem group, which has read-only privileges on /dev/mem, opening it read-only still fails.
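When the open fails, it helps to see which error the kernel reported and under which user IDs the program was running. A small sketch of such a check follows; the function name check_readable() is made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open `path` read-only. Returns 0 on success, or the errno value
 * reported by open() on failure, after printing a diagnostic that includes
 * the real and effective user IDs. */
int check_readable(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int err = errno;  /* save before any other call can change it */
        fprintf(stderr, "open(%s) failed: %s (real uid %d, effective uid %d)\n",
                path, strerror(err), (int)getuid(), (int)geteuid());
        return err;
    }
    close(fd);
    return 0;
}
```

Calling check_readable("/dev/mem") as an ordinary user would be expected to print a permission error along with both UIDs.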

Here is a simple demonstration program, which prints timing information of a loop in microseconds:
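The sketch below is one way such a program could look, assembled from the description in this article rather than copied from a definitive listing. ST_BASE, TIMER_OFFSET and st_base are names used in the text; map_timer() and time_one_second() are made up for illustration, and the code is split into two functions instead of one main().

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define ST_BASE      0x20003000  /* system timer block (BCM2835, ARM side) */
#define TIMER_OFFSET 4           /* 1MHz counter within that block */

/* Map one page of `path` at file offset `base` and return a pointer to the
 * 64-bit counter inside it, or NULL on failure. */
volatile long long int *map_timer(const char *path, off_t base)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open");
        return NULL;
    }

    void *st_base = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, base);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (st_base == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    /* Add the byte offset first, then cast, as the text describes.
     * volatile makes the compiler re-read the counter on every access. */
    return (volatile long long int *)((char *)st_base + TIMER_OFFSET);
}

/* Read the counter, sleep for one second, read it again, and return the
 * difference in timer ticks (microseconds on the real hardware). */
long long int time_one_second(volatile long long int *timer)
{
    long long int before = *timer;
    sleep(1);
    long long int after = *timer;
    return after - before;
}
```

A main() would call map_timer("/dev/mem", ST_BASE) as root and, if the result is not NULL, print time_one_second() of it with printf("%lld\n", ...).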

The primary steps to access the timer are the open(), the mmap(), and setting up the timer pointer. The call to open() is straightforward, but the call to mmap() is somewhat more involved. The call looks like this:

mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, ST_BASE)

The first parameter is the address within the calling process’s space where the mapped file’s contents should be made visible, or NULL to let the kernel determine an arbitrary address. 4096 is the number of bytes to map, one page of the ARM’s virtual memory. PROT_READ sets the page attribute to read-only in the memory manager, so that any attempt to write will cause a protection error. MAP_SHARED means that any read of the mapped area gets the actual, current contents of the file it’s mapped to. Since the timer is updating continuously, this is the model we want. The file descriptor fd is the handle to the open file /dev/mem. ST_BASE is the page-aligned(**) offset into /dev/mem that we want. If the mapping succeeds, the address returned is the virtual address in the user space process, stored in st_base. Thus, taking into account the memory paging system, the virtual address st_base (variable) points to the ARM address ST_BASE (0x20003000).

Adding TIMER_OFFSET to st_base, then casting the result to a pointer to “long long int” (64 bits), gets a pointer to the complete 1MHz timer. After that, it’s simply a matter of de-referencing the pointer to find out how many microseconds the timer has been running. (Exercise for the reader: why not cast st_base to “long long int *” before adding the timer offset? What could possibly go wrong?)

The program is a proof-of-concept for reading the timer. It starts by reading an initial timer value, sleeping for 1 second, then reading another timer value and printing the difference. On my RPi, clocked at 600 MHz, the typical difference is 1,000,225 microseconds, and at 900 MHz, it’s typically 1,000,210 microseconds. Sleeping for 1 second accounts for 1,000,000 of those microseconds; why the extra 0.2 milliseconds?

With the system clocked at 600 MHz, each timer tick is actually 600 CPU clock cycles. So, 600×225=135,000 CPU clock cycles, for calling glibc’s printf(), which does a binary-to-decimal ASCII conversion, then calls the kernel write() function; and calling glibc’s sleep(), which in turn makes three different kernel calls. In addition, the kernel’s nanosleep() doesn’t automatically resume a process after its sleep interval finishes; the kernel merely marks the process as Runnable, so that the scheduler may resume the process according to the scheduling policy. All figured, accomplishing all that non-sleeping stuff in 135,000 cycles is pretty impressive. (At 900 MHz, it figures to 189,000 extra cycles. I’m not sure why the extra 54,000 cycles, unless invariant RAM speed creates a conflict with the pipeline.)
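The cycle counts above come from multiplying the CPU clock rate in MHz by the number of extra microseconds, since one tick of the 1MHz timer lasts that many CPU cycles. A tiny helper, with a made-up name, makes the arithmetic explicit:

```c
/* CPU cycles that elapse during `extra_us` ticks of the 1MHz timer when
 * the CPU runs at `cpu_mhz` MHz. One tick lasts cpu_mhz CPU cycles. */
long overhead_cycles(long cpu_mhz, long extra_us)
{
    return cpu_mhz * extra_us;
}
```

With the figures measured above, overhead_cycles(600, 225) gives 135,000 and overhead_cycles(900, 210) gives 189,000, a difference of 54,000 cycles.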

A 1MHz timer resolution won’t provide cycle-level insight, but it can give a general overview of task duration. After all, even a 30 FPS video recording can show the difference when two instructions are compressed into one in a timing loop.

NOTES

(*) The actual VideoCore base address of the system timer is 0x7E003000, but the VC/ARM MMU translates the I/O space to 0x20000000 on the ARM CPU side. See the BCM2835 datasheet, particularly pages 5 and 172.

(**) Page alignment of the file offset is a requirement of glibc. The mmap() user-space call actually translates to the kernel’s mmap2(), which takes a page number (the offset in 4096-byte units) as the last parameter, rather than a byte offset. On 32-bit systems, this allows access to 16-terabyte offsets.
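The relationship between byte offsets and mmap2() page numbers can be sketched as follows, assuming 4096-byte units as on 32-bit ARM. The function names are made up for illustration and are not part of glibc or the kernel.

```c
#include <stdint.h>

#define MMAP2_SHIFT 12  /* mmap2() counts offsets in 4096-byte units */

/* Returns 1 if `offset` is a multiple of 4096, 0 otherwise. */
int is_page_aligned(uint64_t offset)
{
    return (offset & ((UINT64_C(1) << MMAP2_SHIFT) - 1)) == 0;
}

/* Page number that would be passed to mmap2() for an aligned byte offset. */
uint32_t mmap2_page_number(uint64_t offset)
{
    return (uint32_t)(offset >> MMAP2_SHIFT);
}

/* Bytes addressable with a 32-bit page number: 2^32 pages of 2^12 bytes
 * each, which is 2^44 bytes, or 16 terabytes. */
uint64_t mmap2_max_span(void)
{
    return UINT64_C(1) << (32 + MMAP2_SHIFT);
}
```

For ST_BASE (0x20003000), the page number works out to 0x20003, while the timer address 0x20003004 itself is not page-aligned, which is why the offset of 4 is applied after mapping.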

Technically speaking, the RPi2 uses the Broadcom 2836, which has the memory-mapped I/O space at a different base address than the 2835. However, even taking into account the new mapping, I wasn’t able to get this to work on my own RPi2, in Raspbian or Slackware ARM. Same goes for the kernel driver in my later article.

This does not seem to work correctly on the Raspberry Pi Zero (W). It does indeed count approximately 1000000 counts when using sleep(1); however, whenever sleep() is not used (which it never will be for most real-life scenarios), the *timer value never changes. Thus it seems that only sleep() changes the timer value?