The Benefits of RTOSes in the Embedded IoT

RTOSes will play a greater role in embedded microcontroller designs for the Internet of Things as developers migrate from 8-bit and 16-bit to 32-bit MCUs to meet requirements for enhanced device functionality, low cost and higher performance.

Industry estimates predict there will be 25 billion Internet of Things (IoT) devices shipping by the year 2019 (source: Business Insider). Since an IoT device will require some type of network connectivity (e.g., Wi-Fi, Bluetooth LE, ZigBee, Ethernet), and many will also include a graphical user interface (GUI), these new IoT devices will generally require 32-bit microprocessors to provide the address space and processing power needed to support such connectivity and a responsive GUI.

There is already a strong migration away from 8-bit and 16-bit microprocessors due to requirements for enhanced device functionality as well as the attractive cost/performance attributes of new 32-bit microprocessors. The predicted IoT explosion promises to sharply accelerate this migration.

On the hardware front, the migration to 32-bit processors is clear, but what about the software side? The increased connectivity requirements alone necessitate the execution of communication protocol stacks on the 32-bit embedded microprocessor, which in turn necessitates the use of a real-time operating system (RTOS).

GUI design and runtime software from third parties typically rely on RTOS services as well. With the rapid growth of the IoT and the new devices being developed to take advantage of it, it’s becoming more likely that you will consider using an RTOS in the near future.

There are many benefits to using an RTOS, which is why more and more embedded devices are using them. In fact, according to the most recent UBM Embedded Developer survey released in May 2015, over 60 percent of current projects include real-time capabilities, over a third include a GUI, and over 70 percent report using an RTOS or scheduler of some kind. Of those who selected a commercial RTOS, 53 percent cited the number one reason as "real-time capability." To decide whether your application might benefit from the use of an RTOS, here are some benefits to consider.

Benefit #1: Fast, Guaranteed Real-Time Responsiveness
An RTOS invisibly handles allocation of the processor to the threads that perform the various duties of the embedded device. With the proper assignment of thread priorities, the application software does not have to concern itself with how much processing time each individual thread takes. Even better, changes in the processing responsibilities of a given thread do not impact the processor allocation (responsiveness) of higher-priority threads.

Many legacy 8/16-bit devices employ a polling-loop software architecture to distribute processing time among their various threads. Although easy to understand and appropriate in very simple devices, this approach suffers when used in more complex devices. The fundamental problem of the polling loop is that the responsiveness of any thread in the loop is determined by the processing in the rest of the polling loop.

The worst-case responsiveness is effectively the worst-case processing time through the polling loop. If processing in the polling loop changes dynamically, so does the responsiveness of each thread. As more complexity is added to the polling loop, it becomes more difficult to predict and meet real-time requirements. Take, for example, an existing device that must sample an external sensor in real time, using a simple polling loop that examines the sensor and then performs three additional functions.

This simple polling loop examines a real-time sensor and then performs three additional functions.

The developer has complete access to and understanding of this closed, simple system, such that the real-time interrogation of the external sensor can be guaranteed. However, if this example device then needs to be connected to the cloud, the polling loop would need to be expanded to also invoke a software protocol stack for the cloud communication.

Adding this extra processing to the polling loop might reduce the responsiveness to a level where the external sensor is no longer sampled properly. In addition, communication protocol stacks often range from 50KB to 100KB in size, which indicates a level of complexity that might make it difficult to calculate their worst-case processing requirements.

Example of polling loop that makes an additional call to the cloud. Just adding this one call can reduce responsiveness enough that the external sensor would no longer be correctly sampled.
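The effect can be sketched in C. The model below is a minimal illustration, not code from any real device: the stage names and cycle counts are hypothetical, and each stage is represented only by an assumed worst-case execution time. In a run-to-completion polling loop, the worst-case wait before the sensor is examined again is the sum of the worst cases of every other stage.

```c
#include <stddef.h>
#include <stdint.h>

/* One stage of a run-to-completion polling loop. The names and cycle
 * counts used below are hypothetical, chosen only for illustration. */
typedef struct {
    const char *name;
    uint32_t    worst_case_cycles;
} poll_stage_t;

/* Worst-case delay between two consecutive visits to stage `target`:
 * every other stage in the loop may run for its full worst case first. */
uint32_t worst_case_latency(const poll_stage_t *stages, size_t count,
                            size_t target)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (i != target) {
            total += stages[i].worst_case_cycles;
        }
    }
    return total;
}

/* Original loop: the sensor check plus three additional functions. */
const poll_stage_t basic_loop[] = {
    { "sample_sensor",     200 },
    { "function_1",       1500 },
    { "function_2",       2500 },
    { "function_3",       1000 },
};

/* The same loop after adding a call into a cloud protocol stack. */
const poll_stage_t cloud_loop[] = {
    { "sample_sensor",     200 },
    { "function_1",       1500 },
    { "function_2",       2500 },
    { "function_3",       1000 },
    { "cloud_stack_poll", 40000 },
};
```

With these assumed numbers, worst_case_latency(basic_loop, 4, 0) is 5,000 cycles and worst_case_latency(cloud_loop, 5, 0) is 45,000 cycles. The sensor code itself is unchanged, yet its worst-case sampling interval grows ninefold.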

In sharp contrast, the response time using an RTOS is constant. In the same example, the protocol stack processing could be placed in (or called from) a lower-priority thread, while the real-time sensor sampling processing could be placed in a higher-priority thread. In this case, the RTOS would ensure that the real-time sensor sampling thread always takes precedence over the communication protocol stack.

This results in faster and completely deterministic response times (on the order of 120 cycles on a Cortex-M) for the external sensor sampling thread. And of course, this is all done invisibly to the application software.
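A small simulation can make the priority argument concrete. This is a sketch of fixed-priority preemptive scheduling in general, not the scheduler of ThreadX or any other RTOS. The type and function names, the tick-based timing, and the workload figures are all assumptions made for illustration, and the fixed context-switch cost is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a thread under a fixed-priority preemptive scheduler. */
typedef struct {
    uint8_t  priority;          /* lower value = more urgent (assumed convention) */
    uint32_t period;            /* ticks between releases; 0 = released once, at tick 0 */
    uint32_t work;              /* ticks of processing needed per release */
    uint32_t remaining;         /* ticks of work still outstanding */
    uint32_t released_at;       /* tick of the most recent release */
    uint32_t worst_start_delay; /* longest release-to-first-run delay seen */
    int      started;           /* nonzero once the current release has run */
} sim_thread_t;

/* Advance the model one tick at a time. On every tick the ready thread
 * with the most urgent priority runs, preempting anything less urgent. */
void simulate(sim_thread_t *t, size_t n, uint32_t ticks)
{
    for (uint32_t now = 0; now < ticks; now++) {
        /* Release any thread whose period has elapsed. */
        for (size_t i = 0; i < n; i++) {
            int release = (t[i].period == 0) ? (now == 0)
                                             : (now % t[i].period == 0);
            if (release) {
                t[i].remaining   = t[i].work;
                t[i].released_at = now;
                t[i].started     = 0;
            }
        }
        /* Scheduling decision: most urgent thread that still has work. */
        sim_thread_t *run = NULL;
        for (size_t i = 0; i < n; i++) {
            if (t[i].remaining > 0 &&
                (run == NULL || t[i].priority < run->priority)) {
                run = &t[i];
            }
        }
        if (run == NULL) {
            continue; /* idle tick */
        }
        if (!run->started) {
            uint32_t delay = now - run->released_at;
            if (delay > run->worst_start_delay) {
                run->worst_start_delay = delay;
            }
            run->started = 1;
        }
        run->remaining--;
    }
}

/* Worst start delay of the sensor thread for a given protocol-stack workload. */
uint32_t sensor_delay_with_comm_work(uint32_t comm_work)
{
    sim_thread_t threads[2] = {
        { .priority = 0, .period = 100, .work = 5 },         /* sensor sampling */
        { .priority = 1, .period = 0,   .work = comm_work }, /* protocol stack  */
    };
    simulate(threads, 2, 10000);
    return threads[0].worst_start_delay;
}
```

With these assumed figures, the sensor thread's worst start delay is 0 ticks whether the protocol stack needs 50 ticks or 5,000. Only the lower-priority thread waits.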

Benefit #2: Reduced Overhead
A possible remedy for the polling-loop responsiveness problem is to simply add more polling. By placing more polling calls throughout the code, some of the responsiveness problems can be solved. However, this approach increases overhead, which drains power and decreases throughput. Worse, it requires the developer to tackle more and more issues of context switching and preemption, both of which are already solved by an RTOS.

Although an RTOS does introduce some overhead (in API calls and context switching), the amount of overhead is small and constant. Once a simple polling loop becomes more complex, it’s likely the overhead in the polling loop will exceed that of an RTOS.

RTOS context-switching on a Cortex-M typically takes less than 120 cycles (this can vary from architecture to architecture and RTOS to RTOS). In order for a polling loop (or other alternative to an RTOS) to do better, it would have to be able to predict and guarantee that the worst-case delay in activating a thread with something to do would be less than 120 cycles (which is about 50-60 lines of C code).

That means that the time taken to check each thread in the loop, and to execute the code for any thread that has something to do, would have to be less than 60 instructions total. Even if no other thread had anything to do, just checking each thread would consume cycles, until the thread that requires servicing is checked.

An RTOS context switch is often more efficient than a polling loop. With an RTOS such as ThreadX, the polling loop would need to improve on 120 cycles, or fewer than 60 instructions total, which is a real challenge when every thread in the loop must be checked and executed.

For loops with 5-10 threads, this might work (again, assuming that no intervening thread has work to do). Once the loop gets longer, those 60 instructions become inadequate to get all the way around. And, in a worst-case analysis, all the intervening threads must be given processing time, making real-time response virtually impossible even for a minimal 2-thread loop.
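The budget arithmetic can be written out as two small C helpers. This is only a back-of-the-envelope sketch: the 120-cycle budget comes from the figure quoted above, while the per-check and per-service cycle costs are hypothetical values supplied by the caller.

```c
#include <stdint.h>

/* Largest number of threads a polling loop can check within `budget_cycles`,
 * assuming each check costs `cycles_per_check` and no checked thread has
 * any work to do (the best case for the polling loop). */
uint32_t max_threads_in_budget(uint32_t budget_cycles,
                               uint32_t cycles_per_check)
{
    if (cycles_per_check == 0) {
        return 0; /* meaningless input; report that nothing fits */
    }
    return budget_cycles / cycles_per_check;
}

/* Worst-case cycles before the last thread in the loop is reached when
 * every earlier thread must first be given its full processing time. */
uint32_t worst_case_wait(uint32_t threads, uint32_t cycles_per_check,
                         uint32_t cycles_per_service)
{
    if (threads == 0) {
        return 0;
    }
    return threads * cycles_per_check + (threads - 1) * cycles_per_service;
}
```

Assuming a check costs between 12 and 24 cycles, max_threads_in_budget(120, 12) gives 10 threads and max_threads_in_budget(120, 24) gives 5. For a 2-thread loop in which the other thread needs an assumed 500 cycles of processing, worst_case_wait(2, 12, 500) is 524 cycles, already more than four times the 120-cycle budget.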

Benefit #3: Simplified Development
In legacy small-device development, each firmware developer on the team is required to have intimate knowledge of the run-time behavior and requirements of the device as a whole, because the processor allocation logic is dispersed through the application code. Small device firmware projects (typically less than 32KB of total memory) can be reasonably well-handled by one or two firmware developers.

Given the relatively small code size, the developers have a fighting chance to understand the processing requirements of all the firmware. However, as functionality of the device increases (e.g., cloud communication protocols, etc.), the development team must be expanded and the knowledge of all the firmware processing requirements naturally becomes less well-known. Communication among the code modules developed by each team member must then be designed and implemented, to allow for inter-thread synchronization and information exchange.

An RTOS eases development as the device functionality increases by eliminating the need to fully understand the processing requirements of every firmware component. With an RTOS, firmware developers can concentrate on their specific piece of the firmware and not have to worry about the processing requirements of the other firmware in the device. Moreover, they have inter-thread communication services (e.g., messaging, semaphores, etc.) that are efficient, consistent, and well-defined.
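To give a feel for what a messaging service provides, here is a minimal fixed-depth message queue in C. It is an illustrative sketch, not the API of any particular RTOS: the type and function names, the queue depth, and the message layout are all hypothetical. It is also not thread-safe as written. A real RTOS queue would add the locking and the blocking and wake-up of waiting threads that this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_QUEUE_DEPTH 4u /* hypothetical depth */

/* Hypothetical message passed from a sampling thread to a reporting thread. */
typedef struct {
    uint16_t sensor_id;
    int32_t  reading;
} sensor_msg_t;

/* Fixed-size ring buffer: no dynamic allocation, constant-time operations. */
typedef struct {
    sensor_msg_t slots[MSG_QUEUE_DEPTH];
    size_t       head;  /* index of the oldest message */
    size_t       count; /* number of messages currently stored */
} msg_queue_t;

void msg_queue_init(msg_queue_t *q)
{
    q->head  = 0;
    q->count = 0;
}

/* Copy a message into the queue; returns false if the queue is full. */
bool msg_queue_send(msg_queue_t *q, const sensor_msg_t *msg)
{
    if (q->count == MSG_QUEUE_DEPTH) {
        return false;
    }
    size_t tail = (q->head + q->count) % MSG_QUEUE_DEPTH;
    q->slots[tail] = *msg;
    q->count++;
    return true;
}

/* Remove the oldest message; returns false if the queue is empty. */
bool msg_queue_receive(msg_queue_t *q, sensor_msg_t *out)
{
    if (q->count == 0) {
        return false;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1) % MSG_QUEUE_DEPTH;
    q->count--;
    return true;
}
```

The sender and receiver share only the message format and the queue handle, so the developer of one module does not need to know how or when the other module does its processing.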

Benefit #4: Ease of Feature Upgrade
An RTOS invisibly handles the processor allocation logic, such that real-time performance of a high-priority thread can easily be guaranteed – regardless of whether the firmware is 32KB or 1MB in size, and regardless of the number of threads in the application. This alone makes it easier to maintain the application and easier to add new features to a device.

In addition, most commercial RTOS offerings also have an extensive set of middleware that is pre-integrated and ready to be deployed. Having the ability to easily add networking, file systems, USB, and graphical user interfaces makes adding new features to a device that much easier.

By integrating the RTOS with the middleware, it is possible to reduce the time to market for developers and optimize system performance.

Benefit #5: Easier Application Portability
Applications that use an RTOS access its service functions through an application programming interface (API). The API is platform-independent, meaning it's the same regardless of which processor the RTOS is running on.

That makes switching processors easier, since none of the application's service references have to be changed. The application will run anywhere the RTOS can run. With most popular commercial and open source RTOSes, that means virtually any 32-bit processor architecture. This gives developers the benefit of application portability with minimal changes to their code.

Benefit #6: Safety Certification
Many safety-critical systems require certification by responsible government or other regulatory authorities. This generally requires that the system developer provide artifacts for all of the software in the system. Many commercial RTOSes are pre-certified by these regulatory authorities, and that offers huge benefits to developers.

By using a pre-certified RTOS, no artifacts need be provided for the RTOS, speeding the process of certification of the system. Without a pre-certified RTOS, developers must either qualify their own scheduling code or provide artifacts for the RTOS software they are using, often at additional cost or time.

When Is an RTOS Overkill?
While it’s hard to precisely state the criteria for needing an RTOS, as a rough guideline when the total memory (ROM and RAM combined) of a device is less than 16KB, there is a decent chance that using an RTOS is overkill or impossible. Such devices typically have a dedicated purpose and most often use an 8-bit or 16-bit microprocessor.

Once device firmware exceeds 32KB of total memory and/or utilizes a GUI or communication protocol (as all IoT devices will), it will almost certainly need an RTOS. Even a small 32KB, dedicated-purpose, non-IoT device could benefit from an RTOS simply by isolating foreground and background processing into two separate threads. This configuration would only cost an additional 3KB in total memory for the RTOS, while making the device firmware much simpler and much easier to enhance in the future.

With the continued migration to 32-bit microprocessors and billions of new embedded IoT devices coming to market in the next several years, there is a strong case for using a commercial RTOS. Coupled with the relatively small cost of using a commercial RTOS, it is practically a foregone conclusion that an RTOS is in your near future – if you aren’t using one already.

-- William E. Lamie, co-founder and CEO of Express Logic, Inc., is the architect of the ThreadX RTOS. Prior to founding Express Logic, he authored the Nucleus RTOS and co-founded Accelerated Technology.