If you have been following the Feabhas blog for some time, you may remember that in April of last year I posted about my experiences of using the MQTT protocol. The demonstration code was ran the ARM Cortex-M3 based mbed platform.

For those that are not familiar with the mbed, it is an “Arduino-like” development platform for small microcontroller embedded systems. The variant I’m using is built using an NXP LPC1768 Cortex-M3 device, which offers a plethora of connection options, ranging from simple GPIO, through I2C and SPI, right up to CAN, USB and Ethernet. With a similar conceptual model to Arduino’s, the drivers for all these drivers are supplied in a well-tested (C++) library. The mbed is connect to a PC via a USB cable (which also powers it), so allows the mbed to act as a great rapid prototyping platform. [I have never been a big fan of the 8-bit Arduino (personal choice no need to flame me ) and have never used the newer ARM Cortex-M based Arduino’s, such as the Due.]

However, in its early guise, there were two limitations when targeting an mbed (say compared to the Arduino).

First was the development environment; initially all software development was done through a web-based IDE. This is great for cross-platform support; especially for me being an Apple fanboy. Personally I never had a problem using the online IDE, especially as I am used to using offline environments such as Keil’s uVision, IAR’s Embedded Workbench and Eclipse. Over the years the mbed IDE has evolved and makes life very easy for importing other mbed developers libraries, creating your own libraries and even have an integrated distributed version control feature. But the need Internet connection inhibit the ability to develop code on a long flight or train journey for example.

Second, the output from the build process is a “.bin” (binary) file, which you save on to the mbed (the PC sees the mbed as a USB storage device). You then press the reset button on the mbed to execute your program. I guessing you’re well ahead of me here, but of course that means there is no on-target debug capabilities (breakpoints, single-step, variable and memory viewing, etc.). Now of course one could argue, as we have a well-defined set of driver libraries and if we followed aTest-Driven-Development (TDD) process that we don’t need target debugging (there is support for printf style debugging via the USB support serial mode); but that is a discussion/debate for another session! I would hazard a guess most embedded developers would prefer at least the option of target based source code debugging? Read more »

An optional part of the ARMv7-M architecture is the support of a Memory Protection Unit (MPU). This is a fairly simplistic device (compared to a fully blow Memory Management Unit (MMU) as found on the Cortex-A family), but if available can be programmed to help capture illegal or dangerous memory accesses.
When first looking at programming the MPU it may seem rather daunting, but in reality it is very straightforward. The added benefit of the ARMv7-M family is the well-defined memory map.
All example code is based around an NXP LPC1768 and Keil uVision v4.70 development environment. However as all examples are built using CMSIS, then they should work on an Cortex-M3/4 supporting the MPU.
First, let’s take four types of memory access we may want to capture or inhibit:

Tying to read at an address that is reserved in the memory map (i.e. no physical memory of any type there)

Trying to write to Flash/ROM

Stopping areas of memory being accessible

Disable running code located in SRAM (eliminating potential exploit)

Before we start we need to understand the microcontrollers memory map, so here we can look at the memory map of the NXP LPC1768 as defined in chapter 2 of the LPC17xx User Manual (UM10360).

512kB FLASH @ 0x0000 0000 – 0x0007 FFFF

32kB on-chip SRAM @ 0x1000 0000 – 0x1000 7FFFF

8kB boot ROM @ 0x1FFF 0000 – 0x1FFF 1FFF

32kB on-chip SRAM @ 0x2007 C000 [AHB SRAM]

GPIO @ 0x2009C000 – 0x2009 FFFF

APB Peripherals @ 0x4000 0000 – 0x400F FFFF

AHB Peripheral @ 0x5000 0000 – 0x501F FFFF

Private Peripheral Bus @ 0xE000 0000 – 0xE00F FFFF

Based on the above map we can set up four tests:

Read from location 0x0008 0000 – this is beyond Flash in a reserved area of memory

Write to location 0x0000 4000 – some random loaction in the flash region

Read the boot ROM at 0x1FFF 0000

Construct a function in SRAM and execute it

The first three tests are pretty easy to set up using pointer indirection, e.g.:

Setting up the MPU

There are a lot of options when setting up the MPU, but 90% of the time a core set are sufficient. The ARMv7-M MPU supports up to 8 different regions (an address range) that can be individually configured. For each region the core choices are:

the start address (e.g. 0x10000000)

the size (e.g. 32kB)

Access permissions (e.g. Read/Write access)

Memory type (here we’ll limit to either Normal for Flash/SRAM, Device for NXP peripherals, and Strongly Ordered for the private peripherals)

Executable or not (refereed to a Execute Never [XN] in MPU speak)

Both access permissions and memory types have many more options than those covered here, but for the majority of cases these will suffice. Here I’m not intending to cover privileged/non-privileged options (don’t worry if that doesn’t make sense, I shall cover it in a later posting).
Based on our previous LPC1768 memory map we could define as region map thus:

Not that the boot ROM has not been explicitly mapped. This means any access to that region once the MPU has been initialized will get caught as a memory access violation.
To program a region, we need to write to two registers in order:

MPU Region Base Address Register (CMSIS: SCB->RBAR)

MPU Region Attribute and Size Register (CMSIS: SCB->RASR)

MPU Region Base Address Register

Bits 0..3 specify the region number
Bit 4 needs to be set to make the region valid
bits 5..31 have the base address of the region (note the bottom 5 bits are ignored – base address must also be on a natural boundary, i.e. for a 32kB region the base address must be a multiple of 32kB).

After the MPU has been enabled, ISB and DSB barrier calls have been added to ensure that the pipeline is flushed and no further operations are executed until the memory access that enables the MPU completes.

Using the Keil environment, we can examine the MPU configuration:

Rerunning the tests with MPU enabled

To get useful output we can develop a memory fault handler, building on the Hard Fault handler, e.g.

Test 1 – Reading undefined region

The SCB->MMFAR contains the address of the memory that caused the access violation, and the PC guides us towards the offending instruction

Test 2 – Writing to RO defined region

Test 3 – Reading Undefined Region (where memory exists)

Test 4 – Executing code in XN marked Region

The PC gives us the location of the code (in SRAM) that tried to be executed

The LR indicates the code where the branch was executed

So, we can see with a small amount of programming we can (a) simplify debugging by quickly being able to establish the offending opcode/memory access, and (b) better defend our code against accidental/malicious access.

Optimizing the MPU programming.

Once useful feature of the Cortex-M3/4 MPU is that the Region Base Address Register and Region Attribute and Size Register are aliased three further times. This means up to 4 regions can be programmed at once using a memcpy. So instead of the repeated writes to RBAR and RASR, we can create configuration tables and initialize the MPU using a simple memcpy, thus:

Divide by zero error

You would expect a hardware “divide-by-zero” (div0) error. Possibly surprisingly, by default the Cortex-M3 will not report the error but return zero.

Configuration and Control Register (CCR)

To enable hardware reporting of div0 errors we need to configure the CCR. The CCR is part of the Cortex-M’s System Control Block (SCB) and controls entry trapping of divide by zero and unaligned accesses among other things. The CCR bit assignment for div0 is:

[4]

DIV_0_TRP

Enables faulting or halting when the processor executes an SDIV or UDIV instruction with a divisor of 0:0 = do not trap divide by 01 = trap divide by 0. When this bit is set to 0, a divide by zero returns a quotient of 0.

So to enable DIV_0_TRP, we can use CMSIS definitions for the SCB and CCR, as in:

SCB->CCR |= 0x10;

If we now build and run the project you will need to stop execution as it will appear to run forever. When execution is stopped you’ll find debugger stopped in the file startup_ARMCM3.s in the CMSIS default Hard Fault exception handler:

Override The Default Hard Fault_Handler

As all the exceptions handlers are build with “Weak” linkage in CMSIS, it is very easy to create your own Hard Fault handler. Simply define a function with the name “HardFault_Handler”, as in:

void HardFault_Handler(void)
{
while(1);
}

If we now build, run and then stop the project, we’ll find the debugger will be looping in our new handler rather than the CMSIS default one (alternatively we could put a breakpoint at the while(1) line in the debugger).

Rather than having to enter breakpoints via your IDE, I like to force the processor to enter debug state automatically if a certain instruction is reached (a sort of debug based assert). Inserting the BKPT (breakpoint) ARM instruction in our code will cause the processor to enter debug state. The immediate following the opcode normally doesn’t matter (but always check) except it shouldn’t be 0xAB (which is used for semihosting).

If we now build and run, the program execution should break automatically at the BKPT instruction.

Error Message Output

The next step in developing the fault handler is the ability to report the fault. One option is, of course, to use stdio (stderr) and semihosting. However, as the support for semihosting can vary from compiler to compiler, I prefer to use Instrumented Trace Macrocell (ITM) utilizing the CMSIS wrapper function ITM_SendChar, e.g.

Fault Reporting

Now that we have a framework for the Hard Fault handler, we can start reporting on the actual fault details. Within the Cortex-M3’s System Control Block(SCB) is theHardFault Status Register (SCB->HFSR). Luckily again for use, CMSIS has defined symbols allowing us to access these register:

Building and running the application should now result in the following output:

By examining the HFSR bit configuration, we can see that the FORCED bit is set.

When this bit is set to 1, the HardFault handler must read the other fault status registers to find the cause of the fault.

Configurable Fault Status Register (SCB->CFSR)

A forced hard fault may be caused by a bus fault, a memory fault, or as in our case, a usage fault. For brevity, here I am only going to focus on the Usage Fault and cover the Bus and Memory faults in a later posting (as these have additional registers to access for details).

Register dump

One final thing we can do as part of any fault handler is to dump out known register contents as they were at the time of the exception. One really useful feature of the Cortex-M architecture is that a core set of registers are automatically stacked (by the hardware) as part of the exception handling mechanism. The set of stacked registers is shown below:

Using this knowledge in conjunction with AAPCS we can get access to these register values. First we modify our original HardFault handler by:

modifying it’s name to “Hard_Fault_Handler”

adding a parameter declared as an array of 32–bit unsigned integers.

void Hard_Fault_Handler(uint32_t stack[])

Based on AAPCS rules, we know that the parameter label (stack) will map onto register r0. We now implement the actual HardFault_Handler. This function simply copies the current Main Stack Pointer (MSP) into r0 and then branches to our Hard_Fault_Handler (this is based on ARM/Keil syntax):

Finally, if you’re going to use the privilege/non–privilege model, you’ll need to modify the HardFault_Handler to detect whether the exception happened in Thread mode or Handler mode. This can be achieved by checking bit 3 of the HardFault_Handler’s Link Register (lr) value. Bit 3 determines whether on return from the exception, the Main Stack Pointer (MSP) or Process Stack Pointer (PSP) is used.

When linking C programs there are (in general) only a couple of errors you’re likely to see. If, for example, you have two functions in different files, both with external linkage, then the files will compile okay, but when you link you’ll likely see an error along these lines:

Most of the time this makes sense and is as expected; however there is a particular instance where it gets in the way.

If we need to supply a code framework where we need placeholders (stubs) for someone else to fill in at a later date, it can sometimes mean developing complex makefiles and/or conditional compilation to allow new code to be introduced as seamlessly as possible.

However, there is a hidden gem supported by most linkers called “weak linkage”. The principle of weak linkage is that you can define a function and tag it as (surprisingly) weak, e.g.

This year I have honour of being invited to present at the Embedded Systems Conference in Bengaluru (Bangalore), India. Based on previous visits these classes are very well attended and always generate a lot of post-class discussions.

The other class is one I have presented at ESC in the US before but not in India; Understanding Mutual Exclusion: The Semaphores vs. the Mutex. This presentation is based around much of the material in some of my previous postings (see RTOS related blog postings ). I still find this class really interesting. In general, whenever you bring up the mutex/semaphore discussion most people jump to a per-conceived idea that they know the differences; by the end of this class most understand that their mental model was incorrect.

Back in February, at the embeddedworld exhibition and conference in Nuremberg, Germany, ARM announced the latest version (version 3) of the Cortex(tm) Microcontroller Software Interface Standard (CMSIS). The major addition is the introduction of an abstraction layer for Real-Time Operating Systems (RTOS).

The presentation I’m giving explains; what the abstraction layer offers, how it maps on to an underlying RTOSs API (e.g. ARM/Keil’s RTX), and what is required to re-target another RTOS.

If you can’t make the event (I would recommend it if you can, the last one had a lot of very useful information), then I plan to make the presentation available as a slide deck and a narrated video after the event, so watch this space.

As I’m sure you’re well aware, the ash cloud from the Eyjafjallajokull volcano (which began erupting in mid-March) pretty much brought much of European airspace to a standstill over last weekend and into this week. Now that UK airspace has been reopened, it appears I can resume plans for my visit to the Embedded Systems Conference in San Jose next week.

The sessions I’m directly involved with are:

Examining ARM’s Cortex Microcontroller Software Interface Standard – CMSISDate/Time: Tuesday (April 27, 2010) 3:15pm — 4:15pm
In this session I shall be examining ARM’s Cortex Microcontroller Software Interface Standard (CMSIS – typically pronounced C-M-Sis) . CMSIS aims to provide a framework of reusable components for software developers, silicon and compiler vendors to utilize. Aspects such a power-up boot management and interrupt control have been abstracted and defined by ARM for their Cortex-M family of microcontroller cores. I’ll spend time explaining the CMSIS framework, examining the code from a compiler and chip perspective, and finish off by discussing the pros and cons I see in such an approach.

Understanding Mutual Exclusion: The Semaphores vs. the MutexDate/Time: Wednesday (April 28, 2010) 12:00pm — 12:50pmLocation (room): ESC Theater 2
This presentation is based around much of the material in some of my previous postings (see RTOS related blog postings ). One of the things I like about ESC events is that some of the sessions are free. So if you happen to be attending just the exhibition, then you can still attend this session.

The State of EmbeddedDate/Time: Wednesday (April 28, 2010) 4:00pm — 4:50pmLocation (room): ESC Theater 1
Finally, I have been invited to participate in an informal session along with Jack Ganssle and Dan Saks (moderated by Rich Nass of Embedded.com and EE Times) to discuss some of the latest trends in embedded systems. We did a similar session at ESC UK back October 2009, which was very well received. It shall be interested to see how being the token European with a US audience differs from the UK event. This is also a free session.