Re: On SMP

:Hi Matt,
:
:based on this post and the one about X.org running with threads, there is
:something I don't understand.
:
:DFly has light-weight kernel threads, right? So more than one thread
:(process?) can be in the kernel at once, right?
:
:What is the issue with userland threads, pthreads, libc_r, and so on? What
:is the 1:1 threading library you mentioned?
:
:Sorry for the naive questions, but I guess I've never harmonized my generic
:knowledge of threads (Win32, python, Java) with POSIX/*nix/(DFly)BSD
:threading primitives.
:
:Jonathon

It's a good question. I'll try to give a summary.

First, don't confuse user threads with kernel threads.
They are wholly different entities. The fact that DragonFly has
kernel threads and kernel processes does not help us deal with
the userland threading problem much at all.

There are two major facets involved with userland threading:

(1) The concept of a user 'thread'. Every user thread has an execution
    context and its own stack, among other things.

(2) The concept of a kernel context, used when the userland thread
    performs a system call. A kernel context needs its own stack.

Now, in a non-threaded program there is only one user 'thread' and
only one kernel context (the kernel process). It's ok for the
one user thread to make a system call which might block in the kernel
because, well, there is only one thread anyway.

In a multi-threaded userland program you have M user threads rather
than just one.

The problem the threading library faces is how to handle the case
when the kernel blocks. If a threading library is trying to manage,
say, 100 userland threads with only one kernel context, then it cannot
allow the kernel to block on any system call because that would wind
up blocking all 100 threads instead of just the one that made the system
call.

There are many ways to solve the problem; here are four of them:

* The threading library implements an M:1 model. This is the 'select'
  or 'kqueue' method of dealing with blocking conditions. The threading
  library is written so as to never make a system call which might block.
  It uses select() and non-blocking I/O to determine when one of its
  user threads needs to 'block', without actually blocking in the
  kernel. This was how the original BSD threading library worked.
  This model can also be used to implement M:N, at least to a degree.

  The problem with this model is that most I/O related system calls that
  might be just one system call in a normal program wind up being three
  or four using this model. The overhead can get nasty. Plus there are
  other problems: since there is only one real kernel process, POSIX
  signal sharing requirements can cause trouble.

* The threading library implements a 1:1 model. That is, for every
  user thread created the threading library rfork()s a new process.
  That way any given thread can 'block' in the kernel without blocking
  the other threads. This is basically the model that Linux uses.

* The threading library implements an M:N model where M is the number of
  user threads and N is a dynamic number of kernel contexts. There is
  still only one process but temporary kernel contexts are created
  whenever a system call might block. When the system call blocks,
  the kernel performs an 'upcall' to userland to allow it to continue
  running with a new kernel context while the other one is blocked.
  This is the KSE model that FreeBSD implements.

* The threading library implements an M:NCPUS model where M is the
  number of user threads and NCPUS is the number of cpus in the system.
  The library creates a kernel context for each cpu and manages any
  number of threads using that fixed number of contexts. This requires
  asynchronous system call support, namely the syscall messaging that
  we are slowly implementing in DragonFly.

  This is theoretically the most efficient threading model, barring
  discussions on messaged syscall overhead. This is the model I would
  eventually like to be using in DragonFly.

There are other models.

In any case, the core problem with all of these models is how to handle
the case where a system call blocks.

In the M:1 kqueue/select model the thread library tries to maintain
total control in order to avoid the case where the kernel might block
unexpectedly on it.
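
To make the M:1 approach concrete, here is a minimal sketch in C of how
a threading library might wrap read() so that it never sleeps in the
kernel. It is an illustration only, not the code of any actual BSD
threading library: the names make_nonblocking(), thread_read() and
WOULD_BLOCK are hypothetical, and a real library would switch to another
user thread where this sketch just returns WOULD_BLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical return code: "park this user thread, run another". */
#define WOULD_BLOCK (-2)

/* Put a descriptor into non-blocking mode (two fcntl() calls). */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical read() replacement for an M:1 library.  It asks the
 * kernel whether data is ready and only then reads, so the single
 * kernel context never goes to sleep inside this call.
 */
ssize_t thread_read(int fd, void *buf, size_t len)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: select() only polls */
    int ready;
    ssize_t n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return WOULD_BLOCK;         /* nothing to read yet */

    n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return WOULD_BLOCK;         /* lost a race; still must not sleep */
    return n;
}
```

Counting the calls shows where the overhead comes from: what would be a
single read() in a normal program becomes a select() plus a read() here,
on top of the fcntl() calls needed to set up the descriptor.
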
In the 1:1 model the threading library doesn't try to control blockages,
it just lets them happen and gives each user thread a kernel context so
it can block without affecting other threads.
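
A small sketch of the 1:1 idea, under some loud assumptions: rfork() is
BSD-specific, so this example uses plain fork() instead. That gives each
worker its own kernel context, which is the point being illustrated, but
unlike a memory-sharing rfork() the workers do not share an address
space. The function name first_worker_to_finish() is made up for the
example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: start two workers, each with its own kernel
 * context.  Worker 'A' blocks in read(); worker 'B' does not.
 * Returns the tag of the first worker to report, or 0 on error.
 */
char first_worker_to_finish(void)
{
    int gate[2];    /* parent -> A: A sleeps in the kernel reading this */
    int done[2];    /* workers -> parent: one tag byte per worker */
    char first = 0, second = 0, c;
    pid_t a, b;

    if (pipe(gate) != 0 || pipe(done) != 0)
        return 0;

    a = fork();
    if (a == 0) {
        /* Worker A: blocks until the parent writes to the gate. */
        close(gate[1]);
        if (read(gate[0], &c, 1) == 1 && write(done[1], "A", 1) == 1)
            _exit(0);
        _exit(1);
    }

    b = fork();
    if (b == 0) {
        /* Worker B: never blocks, reports straight away. */
        _exit(write(done[1], "B", 1) == 1 ? 0 : 1);
    }

    if (a > 0 && b > 0) {
        /* A is still asleep in read(), yet B's report arrives. */
        if (read(done[0], &first, 1) != 1)
            first = 0;

        /* Release A and collect its report. */
        if (write(gate[1], "x", 1) == 1 && read(done[0], &second, 1) == 1)
            assert(second != first);   /* each worker reports once */
    }

    /* Closing the gate also wakes A with EOF on the error path. */
    close(gate[0]); close(gate[1]);
    close(done[0]); close(done[1]);
    if (a > 0) waitpid(a, NULL, 0);
    if (b > 0) waitpid(b, NULL, 0);
    return first;
}
```

The library does nothing clever here. Worker A simply sleeps in the
kernel, and because worker B has a kernel context of its own it keeps
running and reports first.
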
In the M:N model where N is dynamic (the KSE model), the kernel tries to
return control to userland when it would otherwise block, to allow userland
to run another thread.

In the M:NCPUS messaged syscall model, the kernel provides an asynchronous
system call interface so the userland threading library does not have
to do any fancy workarounds to maintain control.

The biggest problem we face in dealing with these models is simply the
fact that the kernel codebase was not originally designed with massive
resource sharing, or with multiple contexts referencing the same process
at the same time, in mind. For example, a typical system call mostly
assumes that elements of the process structure will not be ripped out
from under it in the middle of the system call if it happens to block.

-Matt
Matthew Dillon
<dillon@xxxxxxxxxxxxx>