IRIX Binary Compatibility, Part 5: Reverse Engineering Threading

Welcome back to our series on IRIX binary compatibility. In this part, we
will study IRIX and NetBSD threading models. We will also examine how it is
possible to emulate IRIX native threads on NetBSD, though NetBSD does not
support a similar feature for its native binaries.

Introduction to Multithreading

Traditional UNIX systems implement multitasking by using multiple
processes. A process is the unit of execution. It has an execution context and
a virtual memory space, both private and specific to the process.

This scheme has some drawbacks. Consider the need for efficient
interprocess communication. Traditionally, the programmer needs to arrange for
an area of memory to be shared between processes. This makes the program more
complex. Another problem is that switching virtual memory spaces on process
switches can be expensive.

The solution to these problems is multithreading. In UNIX terminology,
multithreading is the ability to run multiple execution contexts within the
same virtual address space. Each execution context is known as a thread. The
set of threads sharing the same address space is usually known as the process.
Since memory is shared among the different threads of a process, communication
between threads is transparent and efficient.
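
The difference between the two models can be sketched in a few lines of C.
This is only an illustration, not code from the NetBSD or IRIX sources: the
names shared_value, writer_thread, value_after_thread and value_after_fork are
made up for the example, and POSIX threads are used simply because they are
widely available. A write made by a thread is visible to its creator, while a
write made by a forked child only touches the child's private copy.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One copy of this variable exists per virtual address space. */
static int shared_value;

/* Thread body: runs in the same address space as its creator. */
static void *writer_thread(void *arg)
{
    (void)arg;
    shared_value = 42;
    return NULL;
}

/* Returns the value seen by the creator after a thread wrote to it. */
int value_after_thread(void)
{
    pthread_t t;
    int rc;

    shared_value = 0;
    rc = pthread_create(&t, NULL, writer_thread, NULL);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
    return shared_value;    /* the thread's write is visible here */
}

/* Returns the value seen by the parent after a forked child wrote to it. */
int value_after_fork(void)
{
    pid_t pid;

    shared_value = 0;
    pid = fork();
    assert(pid != -1);
    if (pid == 0) {
        shared_value = 42;  /* modifies the child's private copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return shared_value;    /* the parent's copy is unchanged */
}
```

Called from a main function, value_after_thread() returns 42 and
value_after_fork() returns 0. To get the same result with processes, the
programmer would have to set up a shared memory area explicitly.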

There are two popular ways of implementing threads on a UNIX system: kernel
assisted threads and user threads.

With user threads, the kernel is not aware that there are multiple
execution contexts in the process. Everything is done in userland, by timely
switching between the threads. The advantage of this method is that there is no
need for kernel support, which means that it is easy to implement user threads
on any UNIX system. Additionally, there is an existing standard for the
programming interface: POSIX.1c. Because of these two advantages, user threads
are often referred to as portable threads or POSIX threads.
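
To make the idea of switching between threads in userland more concrete, here
is a minimal sketch built on the getcontext(), makecontext() and swapcontext()
functions from <ucontext.h>. It is an assumption chosen for illustration, not
how any particular thread library is implemented, and the names user_thread,
run_user_thread_demo and trace are hypothetical. The kernel sees a single
process; every switch between the two execution contexts is done by the
program itself.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, thread_ctx;
static char thread_stack[64 * 1024];  /* private stack for the user thread */
static int trace[4];                  /* records the order of execution */
static int steps;

/* Body of the user thread: runs, yields to main, then finishes. */
static void user_thread(void)
{
    trace[steps++] = 1;
    swapcontext(&thread_ctx, &main_ctx);  /* voluntary yield */
    trace[steps++] = 3;
    /* Returning here resumes uc_link, that is, main_ctx. */
}

/* Interleaves main and the user thread; returns the order as a number. */
int run_user_thread_demo(void)
{
    int rc;

    steps = 0;
    rc = getcontext(&thread_ctx);
    assert(rc == 0);
    (void)rc;
    thread_ctx.uc_stack.ss_sp = thread_stack;
    thread_ctx.uc_stack.ss_size = sizeof(thread_stack);
    thread_ctx.uc_link = &main_ctx;       /* where to go when it returns */
    makecontext(&thread_ctx, user_thread, 0);

    swapcontext(&main_ctx, &thread_ctx);  /* run the thread until it yields */
    trace[steps++] = 2;
    swapcontext(&main_ctx, &thread_ctx);  /* resume it until it finishes */
    trace[steps++] = 4;

    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

run_user_thread_demo() returns 1234, showing that control alternated between
the two contexts without any help from the kernel scheduler. A real user
thread library adds a scheduler and a run queue on top of this mechanism.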

The big disadvantage of user threads is that because the kernel has no
knowledge of the existence of the threads, when one thread makes a blocking
system call, the process, and therefore all of the threads in that process, are
blocked. Also, because the kernel only sees one process, user threads cannot
take advantage of multiple processors.

Kernel assisted threads are designed to address these two drawbacks. With
this scheme, the kernel is responsible for scheduling the different threads
within a process. Because it is aware of their existence, the kernel can
schedule other threads of a process when a thread makes a blocking system call.
The kernel is also able to balance multiple threads of a single process among
multiple processors. One big disadvantage of kernel assisted threads is that
they require kernel support, which means that some UNIX kernels will not
support them. Kernel assisted threads are therefore not easily portable, which
is why they are also known as native threads.

Another big drawback is that there is no standardized programming interface
for handling native threads: each OS has its own system calls, with their own
syntax and semantics. For instance, Linux creates native threads using clone(2)
and IRIX uses sproc(2).

A side note about kernel processes: these are UNIX processes that run
entirely in kernel mode and never execute in user mode. One example is the
pagedaemon(8) on NetBSD, whose job is to swap out pages that have been unused
for some time. Because it only manipulates kernel structures, pagedaemon does
not need to run in user mode, which is why it is called a kernel process. The
problem is that kernel processes are sometimes referred to as kernel threads,
because they are different execution contexts all sharing the kernel virtual
memory space. This creates confusion with native threads, which do exactly the
same thing except that they share a user virtual memory space. To avoid
confusion, this document consistently uses the terms kernel processes and
kernel assisted threads (or native threads).