Raspberry Pi and real-time Linux

Let's have a look at the OSADL QA Farm data

The first (single-core) version of the Raspberry Pi* (BCM2708) has been monitored at the OSADL QA Farm for a long time. It is located in rack #7/slot #3. Unfortunately, its network connection is not very stable, its real-time capabilities are mediocre, and the system needs to be rebooted from time to time.

We were therefore curious to see how the next version, the four-core Raspberry Pi* (BCM2709), would behave. First attempts to run a PREEMPT_RT-patched Linux kernel on it were frustrating, since the board did not boot at all. The best we could achieve was a patched kernel configured with CONFIG_PREEMPT_RT_BASE=y but with CONFIG_PREEMPT_RT_FULL unset.

A workaround

Recently, however, it was found that the boot failure was related to the FIQ exception handler that the driver of the Synopsys DWC host USB controller uses by default. Fortunately, this driver may be instructed not to use FIQ with the two additional kernel parameters

dwc_otg.fiq_enable=0 dwc_otg.fiq_fsm_enable=0

which works around the problem, so the board could be added to the OSADL QA Farm for further testing. It is located in rack #b/slot #3 and under continuous monitoring of all QA Farm standard variables. These are our initial findings:
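Regarding the two kernel parameters of the workaround: on typical Raspberry Pi images the kernel command line is taken from cmdline.txt on the boot partition, and the running kernel exposes it in /proc/cmdline. The following is a minimal sketch of how one might verify that both parameters actually reached the kernel. The function names are hypothetical helpers written for this illustration and are not part of any driver or tool.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if the whitespace-separated kernel command line contains
 * the exact token `param`, e.g. "dwc_otg.fiq_enable=0". */
bool cmdline_has_param(const char *cmdline, const char *param)
{
    size_t plen = strlen(param);
    const char *p = cmdline;

    while (*p != '\0') {
        /* Skip separators in front of the next token. */
        p += strspn(p, " \t\n");
        /* Measure the token and compare it as a whole. */
        size_t tlen = strcspn(p, " \t\n");
        if (tlen == plen && tlen > 0 && strncmp(p, param, plen) == 0)
            return true;
        p += tlen;
    }
    return false;
}

/* True only if both FIQ-related parameters from the text are present. */
bool fiq_workaround_active(const char *cmdline)
{
    return cmdline_has_param(cmdline, "dwc_otg.fiq_enable=0") &&
           cmdline_has_param(cmdline, "dwc_otg.fiq_fsm_enable=0");
}

/* Check the running system: 1 = workaround active, 0 = not active,
 * -1 = /proc/cmdline could not be read. */
int check_running_kernel(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");

    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return fiq_workaround_active(buf) ? 1 : 0;
}
```

Tokens are compared as a whole, so a value such as dwc_otg.fiq_enable=01 is not mistaken for the workaround setting. Calling check_running_kernel() on the board itself reports whether both parameters are in effect.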

A final solution instead of a workaround

The proper solution, of course, is to understand the crashes of the FIQ exception handler and to fix the underlying problem. We found that the lockup occurs when the IRQ handler thread is preempted while it holds the FIQ spin lock. The fix, therefore, is to disable IRQs while the FIQ spin lock is held, irrespective of whether the interrupt handler is threaded or not. The patch that was created for this purpose introduces two new macros

fiq_fsm_spin_lock_irqsave
fiq_fsm_spin_unlock_irqrestore

to facilitate the implementation. The complete patch is available here. To better compare the two kernel versions – original kernel with FIQ disabled vs. patched kernel with FIQ enabled – a shadow system was installed at rack #b, slot #3 that runs the patched kernel. So far, everything runs well.
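The locking order described above can be modelled in user space. The following is a minimal sketch and not the actual patch: only the two macro names come from the text, while the macro bodies, the model_* helpers, the model_lock_t type and the trace buffer are hypothetical stand-ins for the kernel's IRQ masking and the driver's FIQ spin lock. It is meant only to show the assumed ordering: IRQs are masked before the lock is taken and restored after it is released, so the lock holder cannot be preempted in between.

```c
#include <stdio.h>
#include <string.h>

/* ---- User-space stand-ins (assumptions, NOT kernel or driver API) ---- */
static char trace[128];       /* records the order of operations */
static int irq_enabled = 1;   /* models the CPU's IRQ mask state */
typedef struct { int held; } model_lock_t;

static void log_step(const char *s)
{
    strcat(trace, s);
    strcat(trace, ";");
}

/* Mask IRQs and hand back the previous state. */
static unsigned long model_irq_save(void)
{
    unsigned long flags = (unsigned long)irq_enabled;
    irq_enabled = 0;
    log_step("irq_off");
    return flags;
}

/* Put the IRQ mask back to the saved state. */
static void model_irq_restore(unsigned long flags)
{
    irq_enabled = (int)flags;
    log_step("irq_restore");
}

static void model_fiq_lock(model_lock_t *l)   { l->held = 1; log_step("lock"); }
static void model_fiq_unlock(model_lock_t *l) { l->held = 0; log_step("unlock"); }

/* Sketch of the two macros named in the text (bodies are assumed). */
#define fiq_fsm_spin_lock_irqsave(lock, flags) \
    do { (flags) = model_irq_save(); model_fiq_lock(lock); } while (0)
#define fiq_fsm_spin_unlock_irqrestore(lock, flags) \
    do { model_fiq_unlock(lock); model_irq_restore(flags); } while (0)

/* Run one critical section and return the recorded order of steps. */
const char *run_critical_section(void)
{
    model_lock_t lock = { 0 };
    unsigned long flags;

    trace[0] = '\0';
    irq_enabled = 1;

    fiq_fsm_spin_lock_irqsave(&lock, flags);
    /* While the lock is held, IRQs are masked in this model. */
    log_step(irq_enabled ? "work(irq_on)" : "work(irq_off)");
    fiq_fsm_spin_unlock_irqrestore(&lock, flags);

    return trace;
}
```

In this model, run_critical_section() records the sequence irq_off;lock;work(irq_off);unlock;irq_restore; and leaves the IRQ state as it was before the call. The real macros operate on the driver's lock and the CPU's interrupt state, which this sketch does not attempt to reproduce.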