Introduction

This page is intended to serve as a collecting point for presentations, documents, results, links and descriptions
about testing the realtime performance of Linux systems. In the first section, please upload or place links to presentations
or documentation on the subject of RT testing for Linux.

Terminology

This document uses the definitions for real time terminology found in: Real Time Terms

Test programs

RT Measurement programs

Here is a list of programs that have been used for realtime testing:

lpptest

lpptest - included in the RT-preempt patch

It consists of a

1. driver in the Linux kernel, which toggles a bit on the parallel port and watches for a response toggle back

RealFeel

This program is a very simple test of how well a periodic interrupt is processed.
It uses /dev/rtc to set up a periodic interrupt that fires at a fixed interval,
measures the time from one interrupt to the next, and compares this to the
expected interval. It then prints the deviation from the expected value for each
interval, forever.

This program uses the TSC in user space for timestamps.
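
The measurement described above can be sketched in C as follows. This is a minimal illustration and not RealFeel's actual source: the function names now_ns and interval_deviations are hypothetical, and clock_gettime(CLOCK_MONOTONIC) is swapped in for the TSC read so that the sketch is portable. The /dev/rtc setup is omitted; the timestamps are assumed to be taken once per interrupt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the TSC read: monotonic time in nanoseconds. */
int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Given n timestamps taken at successive interrupts, store the n-1
 * deviations (actual interval minus expected interval) in dev[].
 * Returns the number of deviations written.
 */
size_t interval_deviations(const int64_t *stamp, size_t n,
                           int64_t expected_ns, int64_t *dev)
{
    assert(stamp != NULL && dev != NULL);
    if (n < 2)
        return 0;
    for (size_t i = 1; i < n; i++)
        dev[i - 1] = (stamp[i] - stamp[i - 1]) - expected_ns;
    return n - 1;
}
```

A positive deviation means the interrupt was seen later than expected, a negative one means the interval was shorter than expected (usually because the previous interrupt was handled late).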

RealFeel (ETRI version rf-etri)

This program (latency.c) extends RealFeel in several ways:

it adds command line arguments to allow runtime control of most parameters

it adds a histogram feature to dump the results to a histogram

it can do both linear and logarithmic histograms

it locks the process pages in memory (very important)

it changes the scheduling priority to SCHED_FIFO, at highest priority (very important)

it adds conditional code to trigger output to a parallel port pin (for capture to an external probe or logic analyzer)

it abstracts the routine to get the timestamp, with the function: getticks()

it handles the interrupt signal and does a clean exit of the main loop (on user break?)

it tracks min, max and average latency for the whole run, and for every 1000 cycles of the loop

it adds a timestamp to the /dev/rtc driver, and reads this as part of the rtc data
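
Several of the features listed above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the rf-etri source: all function and type names (rt_setup, lat_stats, bucket_linear, bucket_log2) are hypothetical, and the bucket layout of the histograms is an assumption, since the page does not describe the one latency.c uses. The system calls (mlockall, sched_setscheduler, sched_get_priority_max) are the standard Linux interfaces for page locking and SCHED_FIFO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <sys/mman.h>

/* Lock all pages in memory and switch to SCHED_FIFO at the highest
 * priority. Returns 0 on success, -1 if either step fails (typically
 * because the process lacks the required privileges). */
int rt_setup(void)
{
    struct sched_param sp = { 0 };

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;
    return 0;
}

/* Running min, max and sum (for the average) of latency samples. */
struct lat_stats {
    uint64_t count, sum, min, max;
};

void stats_init(struct lat_stats *s)
{
    s->count = 0;
    s->sum = 0;
    s->min = UINT64_MAX;
    s->max = 0;
}

void stats_add(struct lat_stats *s, uint64_t lat)
{
    s->count++;
    s->sum += lat;
    if (lat < s->min)
        s->min = lat;
    if (lat > s->max)
        s->max = lat;
}

/* Linear histogram: fixed-width buckets, overflow lands in the last one. */
unsigned bucket_linear(uint64_t lat, uint64_t width, unsigned nbuckets)
{
    assert(width > 0 && nbuckets > 0);
    uint64_t b = lat / width;
    return b >= nbuckets ? nbuckets - 1 : (unsigned)b;
}

/* Logarithmic histogram: bucket 0 holds the value 0, bucket k (k >= 1)
 * holds [2^(k-1), 2^k); overflow lands in the last bucket. */
unsigned bucket_log2(uint64_t lat, unsigned nbuckets)
{
    unsigned b = 0;

    assert(nbuckets > 0);
    while (lat > 0) {
        lat >>= 1;
        b++;
    }
    return b >= nbuckets ? nbuckets - 1 : b;
}
```

A per-1000-cycle summary can be obtained by keeping a second lat_stats that is printed and reset with stats_init every 1000 samples, alongside one that runs for the whole test.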

Woerner test

Trevor Woerner wrote an interesting test which received an interrupt on the
serial port, and pushed data through several processes, before sending
back out the serial port. This test requires an external machine for triggering
the test and measuring the results.

Issues and Techniques

This is a list of issues and techniques for dealing with them, having to do with
testing realtime performance in Linux.

Ping flood isn't a good stress test

At one of the sessions at ELC 2007, Nicholas McGuire stated that a ping flood test
is actually a poor test of RT performance, since it causes locality in the networking
code rather than stressing the system.

Here is a list of issues that have to be dealt with:

What tests are available on all platforms?

Is special clock hardware or registers required for a test (e.g. realfeel, which only supports i386?)

Does the program cross-compile?

Does generation of the test conditions perturb the test results?

Is special external hardware required?

How is the system stressed?

How to stress memory (cause cache-flushes and swapping)

How to stress bad code paths (long error paths, fault injection?)

How is performance measured?

Using the LATENCY_TRACE option

Quote about latency-test from Ingo:

I'm seeing roughly half of that worst-case IRQ latency on similar
hardware (2GHz Athlon64), so i believe your system has some hardware
latency that masks the capabilities of the underlying RTOS. It would be
interesting to see IRQSOFF_TIMING + LATENCY_TRACE critical path
information from the -RT tree. Just enable those two options in the
.config (on the host side), and do:

    echo 0 > /proc/sys/kernel/preempt_max_latency

and the kernel will begin measuring and tracing worst-case latency
paths. Then put some load on the host. When you see a 50+ usec latency
reported to the syslog, send me the /proc/latency_trace. It should be a
matter of a few minutes to capture this information.

Number of samples recommended

Ingo wrote:

also, i'm wondering why you tested with only 1,000,000 samples. I
routinely do 100,000,000 sample tests, and i did one overnight test with
more than 1 billion samples, and the latency difference is quite
significant between say 1,000,000 samples and 100,000,000 samples. All
you need to do is to increase the rate of interrupts generated by the
logger - e.g. my testbox can handle 80,000 irqs/sec with only 15% CPU
overhead.
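
To put the sample counts in the quote above in terms of run time, the arithmetic can be written as a small C helper. The function name run_seconds is hypothetical; the only inputs are the sample counts and the 80,000 irqs/sec rate mentioned in the quote.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to collect `samples` measurements when the logger
 * generates `irqs_per_sec` interrupts per second. */
uint64_t run_seconds(uint64_t samples, uint64_t irqs_per_sec)
{
    assert(irqs_per_sec > 0);
    /* Round up so that a partial final second is counted. */
    return (samples + irqs_per_sec - 1) / irqs_per_sec;
}
```

At 80,000 irqs/sec, 1,000,000 samples take about 13 seconds, 100,000,000 samples take 1,250 seconds (roughly 21 minutes), and 1 billion samples take 12,500 seconds (roughly 3.5 hours), which is consistent with the "overnight test" described.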

Things to watch for in testing

> First things first, we want to report back that our setup is validated
> before we go onto this one. So we've modified LRTBF to do the
> busy-wait thing.
here's another bug in the way you are testing PREEMPT_RT irq latencies.
Right now you are doing this in lrtbf-0.1a/drivers/par-test.c:

    if (request_irq ( PAR_TEST_IRQ,
                      &par_test_irq_handler,
    #if CONFIG_PREEMPT_RT
                      SA_NODELAY,
    #else //!CONFIG_PREEMPT_RT
                      SA_INTERRUPT,
    #endif //PREEMPT_RT

you should set the SA_INTERRUPT flag in the PREEMPT_RT case too! I.e.
the relevant line above should be:

    SA_NODELAY | SA_INTERRUPT,

otherwise par_test_irq_handler will run with interrupts enabled, opening
the window for other interrupts to be injected and increasing the
worst-case latency! Take a look at drivers/char/lpptest.c how to do this
properly. Also, double-check that there is no IRQ 7 thread running on
the PREEMPT_RT kernel, to make sure you are measuring irq latencies.

Initial results were that Linux 2.4.20 was 3X faster for best-case interrupt latency.

After reviewing the code and finding that the interrupt code path was almost identical, a different, more lightweight tracer (Zoom-in tracer) was used, showing that latencies were almost the same between the 2.4 kernel and the 2.6 kernel.

This describes RT features and how they evolved from 2.4.20 to 2.6.16. Test results are shown for the preemptible kernel (2.4.20), voluntary preemption, RT-preempt, and the hybrid kernel approach (RTAI). The platforms tested were an SH4 board and an EDEN board with a VIA processor (i386 clone). RT-preempt is shown to have good RT characteristics for later kernel versions.

Notes on ineffective tests

Nicholas McGuire wrote:

The tests noted in the LKML post on this page are very problematic,
ping -f is not testing RT at all, it keeps the kernel in a very small active
page set thus reducing page related penalties, the while loop using dd
is also not too helpful as it will de-facto run only in memory and cause
absolutely no disk/mass-storage related interaction (try the same with
mount -o remount,sync / first and it will be devastating! (limited to ext2/ext3/ufs))
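
One way to make a write load leave the page cache, in the spirit of the remount,sync suggestion above, is to call fsync() after every write. The following C sketch is an assumption about how such a load could look, not a tool referenced by the quote; the function name synced_write_load and the scratch path in the usage note are hypothetical. Whether the data reaches a physical device still depends on the filesystem (a tmpfs mount, for example, never touches storage).

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write `count` blocks of `block_size` bytes to `path`, calling fsync()
 * after each block so the kernel has to push the data out instead of
 * leaving it in memory. Returns total bytes written, or -1 on error. */
long synced_write_load(const char *path, int count, size_t block_size)
{
    char buf[4096];
    long total = 0;
    int fd;

    assert(block_size > 0 && block_size <= sizeof(buf));
    memset(buf, 0xA5, sizeof(buf));

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        ssize_t n = write(fd, buf, block_size);
        if (n < 0 || fsync(fd) != 0) {
            total = -1;
            break;
        }
        total += n;
    }
    close(fd);
    unlink(path);   /* remove the scratch file again */
    return total;
}
```

For example, synced_write_load("/tmp/rt_sync_load.tmp", 1000, 4096) issues 1000 write/fsync pairs; running it in a loop in the background while the latency test runs gives a load with real mass-storage interaction on a disk-backed filesystem.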

Notes on test requirements - need to test kernel error paths

Nicholas McGuire wrote:

The big problem with RT tests published is that they are all looking at the good case,
they are loading the system but assuming successful operations. The worst cases pop
up when you run in the error paths of the kernel - then a trivial application can
induce very large jitter in the system (run crashme in the background and rerun
the tests...)
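
As a much milder illustration of the same idea, a background load can be built from system calls that are expected to fail, so that the kernel keeps running its error returns rather than its success paths. This C sketch is not crashme and exercises only two short error paths; the function name error_path_load and the missing path are hypothetical, and it assumes that /nonexistent-rt-test-dir does not exist on the test system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Issue `iterations` rounds of system calls that are expected to fail.
 * Returns the number of calls that failed in the expected way. */
long error_path_load(long iterations)
{
    long failed = 0;
    char byte;

    assert(iterations >= 0);
    for (long i = 0; i < iterations; i++) {
        /* read() on an invalid descriptor fails with EBADF. */
        if (read(-1, &byte, 1) == -1 && errno == EBADF)
            failed++;

        /* open() of a missing path fails with ENOENT. */
        int fd = open("/nonexistent-rt-test-dir/file", O_RDONLY);
        if (fd == -1 && errno == ENOENT)
            failed++;
        else if (fd >= 0)
            close(fd);
    }
    return failed;
}
```

A real error-path test would need to reach much longer paths (failed allocations, I/O errors, faults), which is what tools such as crashme or fault injection aim at.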

Notes on test requirements - need for usage profile

Also lmbench can give a statistical view of things (and not even that very precisely
in some cases, i.e. context switch measurements are flawed) so this is not of much
help for decision makers choosing which variant to use - it does not help if the average
performance is good but the mobile phone or mp3 clicks at 1s intervals
"deterministically" - so I guess RT benchmarks need a notion of usage-profile
to be of value.
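
The point about averages can be made concrete with a small C sketch that evaluates a set of latency samples against a deadline taken from a usage profile. The names profile_result and check_deadline are hypothetical, and the deadline value is an assumption that would come from the application (for example, the audio buffer length), not from the benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct profile_result {
    uint64_t sum_us;    /* sum of all latencies, for the average */
    uint64_t max_us;    /* worst observed latency */
    size_t   misses;    /* samples that exceeded the deadline */
};

/* Summarize n latency samples (in microseconds) against a deadline. */
struct profile_result check_deadline(const uint32_t *lat_us, size_t n,
                                     uint32_t deadline_us)
{
    struct profile_result r = { 0, 0, 0 };

    assert(lat_us != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        r.sum_us += lat_us[i];
        if (lat_us[i] > r.max_us)
            r.max_us = lat_us[i];
        if (lat_us[i] > deadline_us)
            r.misses++;
    }
    return r;
}
```

With 999 samples of 10 us and a single sample of 5000 us, the average is about 15 us, which looks fine, while a 100 us deadline is missed once. For a profile in which every miss is an audible click, the miss count and the maximum are the figures that matter.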