Porting RTOS Device Drivers to Embedded Linux

Linux has taken the embedded marketplace by storm. According to
industry analysts, one-third to one-half of new embedded 32- and
64-bit designs employ Linux. Embedded Linux already dominates
multiple application spaces, including SOHO networking and
imaging/multifunction peripherals, and it is now making great strides in
storage (NAS/SAN), digital home entertainment (HDTV/PVR/DVR/STB)
and handheld/wireless, especially in digital mobile phones.

New embedded Linux applications do not spring, Minerva-like, from
the heads of developers; a majority of projects must accommodate
thousands, even millions of lines of legacy source code. Although
hundreds of embedded projects have successfully ported existing code
from such platforms as Wind River's VxWorks and pSOS, VRTX, Nucleus and
other RTOSes to Linux, the exercise is still nontrivial.

To date, the majority of literature on migration from legacy RTOS
applications to embedded Linux has focused on RTOS APIs, tasking and
scheduling models and how they map to Linux user-space equivalents.
Equally important in the I/O-intensive sphere of embedded programming
is porting RTOS application hardware interface code to the more
formal Linux device driver model.

This article surveys several common approaches to memory-mapped
I/O frequently found in legacy embedded applications. These range
from ad hoc use of interrupt service routines (ISRs) and user-thread
hardware access to the semi-formal driver models found in some RTOS
repertoires. It also presents heuristics and methodologies for
transforming RTOS code into well-formed Linux device drivers. In
particular, the article focuses on memory mapping in RTOS code
vs. Linux, porting queue-based I/O schemes and redefining RTOS
I/O for native Linux drivers and dæmons.

RTOS I/O Concepts

The word that best describes most I/O in RTOS-based systems is
informal. Most RTOSes were designed for older MMU-less CPUs,
so they ignore memory management even when an MMU is present and
make no distinction between logical and physical addressing. Most
RTOSes also execute entirely in privileged state (system mode),
ostensibly to enhance performance. As such, all RTOS application
and system code has access to the entire machine address space,
memory-mapped devices and I/O instructions. Indeed, it is very
difficult to distinguish RTOS application code from driver code
even when such distinctions exist.

This informal architecture leads to ad hoc implementations of I/O
and, in many cases, the complete absence of a recognizable device
driver model. In light of this egalitarian non-partitioning of work,
it is instructive to review a few key concepts and practices as
they apply to RTOS-based software.

In-Line Memory-Mapped Access

When commercial RTOS products became available in the mid-1980s,
most embedded software consisted of big mainline loops with polled
I/O and ISRs for time-critical operations. Developers designed RTOSes
and executives into their projects mostly to enhance concurrency
and aid in synchronization of multitasking, but they eschewed any
other constructs that got in the way. As such, even when an RTOS
offered I/O formalisms, embedded programmers continued to perform
I/O in-line.
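
A minimal sketch of this in-line style is shown here, assuming a
hypothetical UART with a status register and a transmit register. The
register offsets, the ready bit and the function names are illustrative
assumptions, not taken from any particular RTOS or board. In real RTOS
code, the base pointer would typically be a physical address supplied
by a #define.

```c
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
#define UART_STATUS_OFF 0x0u   /* status register offset */
#define UART_TXDATA_OFF 0x4u   /* transmit data register offset */
#define UART_TX_READY   0x01u  /* "transmitter ready" bit in status */

/* Read a 32-bit register at a byte offset from the device base. */
static inline uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4];
}

/* Write a 32-bit register at a byte offset from the device base. */
static inline void reg_write(volatile uint32_t *base, uint32_t off,
                             uint32_t val)
{
    base[off / 4] = val;
}

/*
 * Polled, in-line transmit of one character.
 * Returns 0 on success, or -1 if the device never became ready
 * within max_spins extra polls.
 */
int uart_putc_polled(volatile uint32_t *base, char c, unsigned max_spins)
{
    while (!(reg_read(base, UART_STATUS_OFF) & UART_TX_READY)) {
        if (max_spins-- == 0)
            return -1;
    }
    reg_write(base, UART_TXDATA_OFF, (uint32_t)(unsigned char)c);
    return 0;
}
```

In an RTOS application, a call such as
uart_putc_polled((volatile uint32_t *)0x40001000, 'A', 1000) could
appear anywhere in task code (the address here is made up), which is
exactly what makes such code hard to separate out later.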

More disciplined developers usually segregate all such in-line I/O
code from hardware-independent code, but I have encountered plenty
of I/O spaghetti as well.

When faced with pervasive in-line memory-mapped I/O usage, embedded
developers who are new to Linux often face the temptation to
port all such code as-is to user space, converting each #define of a
register address into a call to mmap(). This approach works fine for
some types of prototyping, but it cannot support interrupt processing,
has limited real-time responsiveness, is not particularly secure
and is not suitable for commercial deployment.
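
For prototyping only, the user-space mapping described above might be
sketched as follows. The helper names map_regs() and unmap_regs() are
hypothetical. On a real target the path would typically be /dev/mem,
which normally requires root privileges, and the physical base address
must be a multiple of the page size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes starting at phys_base of the given file (typically
 * "/dev/mem") into this process. Returns NULL on any failure.
 */
volatile uint32_t *map_regs(const char *path, off_t phys_base, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, phys_base);
    close(fd);  /* the mapping remains valid after the descriptor closes */

    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}

/* Release a mapping obtained from map_regs(). Returns 0 on success. */
int unmap_regs(volatile uint32_t *regs, size_t len)
{
    return munmap((void *)regs, len);
}
```

A legacy definition such as a #define of a register base at a fixed
physical address would then become a pointer returned by
map_regs("/dev/mem", base, 4096), with each register reached as an
index into that pointer. Nothing in this sketch gives the process a
way to service the device's interrupts.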

RTOS ISRs

In Linux, interrupt service is exclusively the domain of the kernel.
With an RTOS, ISR code is free-form and often indistinguishable from
application code, other than in the return sequence. Many RTOSes
offer a system call or macro that lets code detect its own context,
such as the Wind River VxWorks intContext(). Also common is the
use of standard libraries by ISRs, with the accompanying reentrancy
and portability challenges.

Most RTOSes support the registration of ISR code and handle interrupt
arbitration and ISR dispatch. Some primitive embedded
executives, however, support only direct insertion of ISR start
addresses into hardware vector tables.

Even if you attempt to perform read and write operations in-line
in user space, you still have to put your Linux ISR into kernel space.
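
The registration-and-dispatch service that most RTOSes provide can be
modeled in a few lines of plain C. The following is a user-space model
for illustration, not any vendor's API: isr_connect(), isr_dispatch(),
count_irq() and NUM_VECTORS are hypothetical names. A primitive
executive without such a service would instead have the application
write the handler address straight into the hardware vector table.

```c
#include <stddef.h>

#define NUM_VECTORS 16  /* hypothetical number of interrupt vectors */

typedef void (*isr_fn)(void *arg);

struct isr_slot {
    isr_fn fn;   /* registered handler, or NULL if the vector is free */
    void  *arg;  /* argument passed back to the handler */
};

static struct isr_slot isr_table[NUM_VECTORS];

/*
 * Register a handler for a vector.
 * Returns 0 on success, -1 if the vector is out of range, the
 * handler is NULL or the vector is already claimed.
 */
int isr_connect(unsigned vec, isr_fn fn, void *arg)
{
    if (vec >= NUM_VECTORS || fn == NULL || isr_table[vec].fn != NULL)
        return -1;
    isr_table[vec].fn = fn;
    isr_table[vec].arg = arg;
    return 0;
}

/*
 * Called from the low-level interrupt entry code.
 * Returns 1 if a handler ran, 0 if the interrupt was spurious.
 */
int isr_dispatch(unsigned vec)
{
    if (vec >= NUM_VECTORS || isr_table[vec].fn == NULL)
        return 0;
    isr_table[vec].fn(isr_table[vec].arg);
    return 1;
}

/* Example handler: counts how many times its interrupt fired. */
void count_irq(void *arg)
{
    ++*(unsigned *)arg;
}
```

Under Linux, the equivalent registration happens inside the kernel
(for example, through request_irq() in a driver), so a table like this
one never appears in application code.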

Comments

As the author and a few commenters rightly noted, you can take the easy path by mapping all VxWorks tasks to Linux user-mode processes/threads. The downside is that the performance hit can be huge (see the comment about mmap overhead).

Fortunately, it looks like there is now a solution: www.femtolinux.com lets you run user processes in kernel mode, removing the user/kernel barrier. FemtoLinux processes are pretty much identical to VxWorks tasks.

15 years as a VxWorks developer, now doing the Linux side of the game. 99% of the time, the "real driver" approach is the preferred one: you get protection, etc. (I've ported almost all of my old VxWorks drivers to Linux that way.) But there is the odd case, and I'm dealing with one now, where mmap() may buy you the real-time response you need, where even the interrupts are too slow.
(Porting from Linux to VxWorks is the easy direction: you're going from protected to unprotected, so life is easy, aside from a few calls that aren't allowed.)
My catch at the moment, though, on the architecture that I'm working with, is that the mmap call is expensive, more than you might think. Each access, by the time it has rolled up and unrolled the various page tables, appears to take 700ns, dropping memory bandwidth to less than 14MB/s. And that bytes. Pun intended.
Like everything, you've got to evaluate what you're doing, and why.