Thread Pool (part 1)

published:
Thu, 1-Dec-2005
|
updated: Tue, 7-Mar-2006

So a reader commented about my WaitableThread
class. "Couldn’t you use a custom thread pool for this?" and
encouraged me to take a look at a "smart" thread pool on
Code Project. As it happened,
the one he was recommending was one I'd taken a look at previously.
Yes, you see the developer who wanted to launch lots of threads (see
my previous post) had discovered this implementation and was all for
using it. I took a look at it and decided that I could see enough
problems on a cursory examination that I didn't want to use it for
real code. Hence, the effort to write a WaitableThread class.

(Honestly I don't want to list the issues I found with this thread
pool, but a lot of them had to do with the class model design and how
well it was tested -- wishy-washy issues, if you like. Nevertheless, I
would advise you not to use it with .NET 2.0, especially not on a 32-bit
Pentium, since it does have
at
least one bug that may show up on a 64-bit processor.)

So my correspondent replied, "I hadn’t looked at the code, but I do
like the idea of using my own reusable pool of threads (or other
objects). Done right, it seems a good way to avoid overloading the
framework’s thread pool, control the number of threads, and prevent
garbage collector abuse. I haven’t taken this route on anything, but
I’ve considered it a few times."

Reading between the lines, I took this as a gauntlet contemptuously
thrown down: "Well, then, Mr SmartyPants, write a better one!" (Well,
OK, my reader didn’t really mean this, but it sounds good. A defiant
act to stir the emotions, eh?)

So, let's see what we can do. My proposal is this: I'll design and
write a thread pool class for .NET in C#, using the best TDD I know. A
little like Ron Jeffries does
on his site. This way, you get
a decent thread pool you can use, an understanding of what it does
(and what it doesn't do), and also an idea of what it's like to write
something non-trivial TDD style. I'll be taking many days to do this,
by the way; obviously I still have a real job to do, so don't take the
elapsed calendar time as an indication of how much work TDD is.

If you like, email me as I
go along to proffer other requirements, to give me encouragement, or
to lay into me about my obvious dearth of programming and design
skills. I'll try and incorporate suggestions and advice into
successive posts.

So, without further ado, here's my initial list of requirements for a
thread pool.

The user should be able to queue tasks for the thread pool to execute

The thread pool, at its discretion, should dequeue a task and execute it

The tasks should have a priority (one of a limited number)

Higher-priority tasks are dequeued before lower-priority tasks

Higher-priority tasks do not pre-empt lower-priority tasks that
are already executing

A task should be "waitable", that is, the user should be able to
block/wait on the task until it completes (in .NET terms, the task
should be a WaitHandle). This means the user code can queue up a bunch
of tasks and then wait for the first to finish (or all of them to
finish).

The thread pool should execute each task in a separate thread

The thread pool should be thread-safe (that is, callable from many
user threads in a thread-safe manner)

The thread pool should solely be responsible for managing its threads,
the user should have no contact with them

The user should be able to abort a task

If the task has not yet started, it is merely dequeued and thrown away

If the task is executing, the thread pool will make a best effort to
abort the thread

First it will signal a flag to ask the task to abort as quickly and as
safely as possible

Second, if a given timeout expires, the thread (and hence the task) is
simply aborted

If the task has already completed or cannot be found (that is, the
thread pool has never seen it), nothing happens

The thread pool, at its discretion, can add one or more threads so
that it can execute more tasks in parallel

Once a thread has completed a task, it should wait until another task
is available

There should be a user-defined millisecond timeout on this behavior

0 means "don’t wait, just terminate"

If the timeout occurs, the thread should terminate

This behavior allows the number of threads to decrease

The user can specify the maximum number of threads the thread pool can
use

The thread pool should maintain some kind of statistics on its
performance

Details to be decided, but may include number of tasks waiting,
executing, completed, some timing data, etc

The statistics should be queryable as the thread pool is running

The thread pool should be able to log exception, warning,
informational, etc events

Informational events could include

A task is enqueued, dequeued, completed

A new thread is started

An existing thread terminates

Warning events could include

A task is aborted

Exception events should include

Any exception thrown inside the thread pool and its related classes

The thread pool should not be a singleton (that is, it should be a
normal instance class)

On completion (whether the task completed normally, or was aborted),
the user should be able to query some statistics about the task (e.g.,
time spent queued, time spent executing, completion status).

Wow, I think that'll do for a start.

Already I can see several possible classes making themselves known
from the requirements as written. The thread pool class for a
kick-off, a task class, maybe a thread class,a priority queue (mmm,
how did that get in? funny that), some kind of log. However, at this
stage, that's as far as I want to go: maybe I'm wrong about some of
them. We'll start exploring the domain model with TDD next time.