Introduction

Multithreading and the synchronization it requires can be
quite challenging. Windows 2000 brings some new concepts into
play, which does not make the topic much easier to cope with.
Processes, threads, fibers, jobs and more are available to "help"
you write applications that do many things at the same time.

Less experienced developers often get lost while trying to figure
out what to use and when. Once you have finally figured out
how to multithread, the joy of debugging a multithreaded application
starts. Things get really interesting when you finally have a
multithreaded solution up and running and it deadlocks on a multiprocessor
machine. Never assume that your multithreaded application behaves
correctly until it runs successfully on a multiprocessor machine.

There are many factors that can cause your concept
to fail: compiler optimizations due to missing or incomplete
variable declarations (e.g. missing volatile), deadlocks
because of unexpected execution behavior, simple logic errors and many others.

Job-based Multithreading

In many cases a solution that creates one thread
for each job can overload the system and achieve the
opposite effect: a slowdown due to too many threads running.
Imagine a class that enumerates and processes files and directories. A simple
implementation could just launch one thread for each directory
found to speed up processing.

The idea is good, but imagine what happens
if your customer has limited disk I/O throughput but a
disk controller that caches well. The result is an
explosion of threads: directory and file enumeration
is fast because of the disk cache, but processing the file contents
is slow because of the limited disk I/O.

A better solution in most cases is a limited set of threads
(known as a thread pool). A queue of jobs feeds the threads with work.
While Windows 2000 and Windows XP support thread pooling and
jobs, NT4 and Win9x do not, which makes it a bit more difficult
to write applications that run on all Windows versions.
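The idea of a fixed set of threads fed by a job queue can be sketched in portable C++11. This is only an illustration under stated assumptions: SimplePool and CountJobs are hypothetical names, the sketch uses the standard library instead of Win32 calls, and it is not the implementation of CJobManager.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed number of workers drain one shared queue.
class SimplePool {
public:
    explicit SimplePool(std::size_t threads) {
        assert(threads > 0);
        for (std::size_t i = 0; i < threads; ++i)
            workers_.emplace_back([this] { Run(); });
    }

    // Lets the workers finish all queued jobs, then joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    SimplePool(const SimplePool&) = delete;
    SimplePool& operator=(const SimplePool&) = delete;

    // Queues a job and wakes one idle worker.
    void AddJob(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage example: runs `jobs` trivial jobs on `threads` workers and
// returns how many of them were executed.
int CountJobs(int jobs, std::size_t threads) {
    std::atomic<int> counter{0};
    {
        SimplePool pool(threads);
        for (int i = 0; i < jobs; ++i)
            pool.AddJob([&counter] { ++counter; });
    }  // the destructor waits for all queued jobs
    return counter.load();
}
```

However many jobs are queued, the number of running threads never exceeds the pool size, which is what avoids the thread explosion described above.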

The CJobManager class brings the advantages of a
thread pool and jobs down to the C++ level and works across Windows versions
(although I did not actually test it under Win9x).

CJobManager provides a flexible interface intended
to be of use in most applications that "require" multithreading
while still supporting GUI processing. In one of the coming articles
I will present a more sophisticated approach with guarded execution
and secure multithreaded object orientation.

But for now, let's stick with the simple
implementation of CJobManager.

The Class

CJobManager is designed to be used as a
singleton and thus has no public constructor or destructor.
Your application gets a CJobManager object by
calling the static member function
CJobManager::GetJobManager(BOOL bNewManager, int nThreads).

By specifying TRUE as the first
argument, a new instance of CJobManager is allocated
and initialized. Be sure to save the returned pointer, as the function
does not track created instances.
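The accessor pattern described here can be sketched as follows. Manager, GetManager, Release and Threads are hypothetical stand-ins, not the real CJobManager. The sketch assumes that passing FALSE returns one shared instance, and it invents a Release function because the article does not say how a separately allocated instance is freed.

```cpp
#include <cassert>

// Hypothetical stand-in for a manager with no public constructor/destructor.
class Manager {
public:
    // bNewManager == false: return the shared instance (created on first use).
    // bNewManager == true:  allocate a separate instance owned by the caller.
    static Manager* GetManager(bool bNewManager = false, int nThreads = 8) {
        assert(nThreads > 0);
        if (bNewManager)
            return new Manager(nThreads);  // not tracked: the caller keeps the pointer
        static Manager shared(nThreads);   // nThreads only matters on the first call
        return &shared;
    }

    // Assumed cleanup path for instances obtained with bNewManager == true.
    static void Release(Manager* p) { delete p; }

    int Threads() const { return threads_; }

private:
    explicit Manager(int nThreads) : threads_(nThreads) {}
    ~Manager() = default;
    int threads_;
};
```

Because untracked instances are reachable only through the returned pointer, losing that pointer means the instance can never be released.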

The most important function is
CJobManager::AddJob(LPTHREADEDJOB pJob).
The argument pJob points to a simple structure which carries
all information needed to process a job. This structure is the point
where you can extend the implementation of a job to accommodate your needs.

By default, the class does not terminate
its threads. This means any
instance of CJobManager will wait for completion
of the running jobs without interrupting the threads. You can,
however, specify that you definitely want the threads to be
terminated when the application needs it, by calling SetTerminate(TRUE).

You can at any time increase or decrease the number
of threads (default is eight) in an instance
of CJobManager by calling SetThreads(nThreads).
When in non-terminating mode, the function will wait for
each thread to finish its job before removing it from the pool.

If your application wants to wait for completion of the jobs
it can call CJobManager::WaitCompletion(BOOL bCurrentJobsOnly).
WaitCompletion returns only after all jobs have been processed.
It does allow message processing by calling the
CJobManager::ProcessMessages() function.

When your application needs to poll for
completion of the jobs it calls CJobManager::IsCompleted().
Together with CJobManager::Wait(DWORD dwMillisec, BOOL bGUIprocessing)
you can construct wait loops to do some processing while
waiting for completion of the jobs.
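Such a wait loop might look like the sketch below. PendingJobs is a hypothetical stand-in that only mimics the IsCompleted/Wait pair with a counter. Its Wait simply sleeps and, unlike the real function, takes no GUI processing flag.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in: counts outstanding jobs instead of managing threads.
class PendingJobs {
public:
    explicit PendingJobs(int n) : remaining_(n) { assert(n >= 0); }

    void JobDone() { --remaining_; }  // called by a worker when a job ends
    bool IsCompleted() const { return remaining_.load() == 0; }

    // Blocks the caller for the given time (no message processing here).
    void Wait(unsigned milliseconds) const {
        std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
    }

private:
    std::atomic<int> remaining_;
};

// Polls until all jobs are done and returns the number of poll rounds.
int PollUntilDone(const PendingJobs& jobs, unsigned stepMs) {
    int polls = 0;
    while (!jobs.IsCompleted()) {
        ++polls;            // this is where a progress display could be updated
        jobs.Wait(stepMs);  // give the workers time before checking again
    }
    return polls;
}
```

The loop body is the place for the extra processing mentioned above, such as refreshing a counter in a dialog.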

The CJobManager class also supports
a feedback function that can be called from within a job's
processing function. The current job provides this function
through Feedback(LPVOID pFeedback). This function
returns TRUE when no feedback function is provided; otherwise
it calls the feedback function and returns its return value. The intention of
this feedback function is, on the one hand, to allow real feedback on
the data actually processed, which can be passed to the feedback function
through the LPVOID argument, and on the other hand to allow ending the
current job by returning FALSE.

Call CJobManager::SetFeedback(PJOBMANAGER_FEEDBACK pfn)
to set a feedback function.
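The feedback contract can be illustrated with a small portable sketch. Job, FeedbackFn, ProcessItems and StopOnNegative are hypothetical names chosen for this example; the real structure and the PJOBMANAGER_FEEDBACK signature may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical signature: receives caller-defined data, returns false to stop.
using FeedbackFn = bool (*)(void* pFeedback);

struct Job {
    FeedbackFn feedback = nullptr;

    // Returns true when no feedback function is set; otherwise forwards
    // the call and returns whatever the feedback function returns.
    bool Feedback(void* pFeedback) const {
        return feedback ? feedback(pFeedback) : true;
    }
};

// Example job body: processes items until all are done or until the
// feedback function asks to stop. Returns the number of items touched.
std::size_t ProcessItems(const Job& job, std::vector<int>& items) {
    std::size_t processed = 0;
    for (int& item : items) {
        ++processed;
        if (!job.Feedback(&item)) break;  // false means: end the current job
    }
    return processed;
}

// Example feedback function: stops the job at the first negative value.
bool StopOnNegative(void* pFeedback) {
    assert(pFeedback != nullptr);
    return *static_cast<int*>(pFeedback) >= 0;
}
```

A "Stop" button could be wired up the same way, with a feedback function that returns false once the user has asked to cancel.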

The Sample

The included sample uses a modified CFileInfo class
from Antonio Tejada Lacaci to enumerate all files on a
chosen drive. The sample starts processing at the root of
the drive and adds a job for every directory found. These jobs
are processed by the default eight threads. The processing can be
interrupted at any time with the "Stop" button. Every 250 milliseconds
the dialog updates the count of found files.

Note: the sample leaks memory when you close the dialog while
it's still running because some objects are not freed when
terminating the CJobManager.

Compatibility

Tested under Windows XP, Windows 2000 and NT4 SP6 on single
and dual processor machines. The class is UNICODE compatible,
but a UNICODE build is not included in this sample. Built with VC6.

Comments and Discussions

Hi Andreas,
Could you please help me with using this class from an ASP page? I have already made the component that is called from a simple ASP page. The component executes all right, but control does not return to the next piece of code to be executed in the ASP page.
Thanks for the help in advance, Peter

Could you guide me on how I could use the CManager class and the structure in an ATL component that I already have? My ATL component works with the ADO library. Thanks, I would be really grateful if you could reply at the earliest, Peter

I was just wondering what should be used in place of the CSingleLock
object that is used in MFC. I'm not sure if there is a standard WTL class that
is compatible. I tried using one of your classes like this:

I was surprised that Suspend/Resume is used to handle threads that are not busy. The recommended solution is to wait for an event, or use sleep(), and that's what I've always used. In this case, each thread suspends itself, which seems dangerous.

From the Platform SDK :
The SuspendThread function is not particularly useful for synchronization because it does not control the point in the code at which the thread's execution is suspended. However, you might want to suspend a thread in a situation where you are waiting for user input that could cancel the work the thread is performing. If the user input cancels the work, have the thread exit; otherwise, call ResumeThread.

Yes, using an event would be the common way to do it. But it adds a lot of complexity which isn't necessary here.
The thread just needs to know if it has something to do or not. If not, it suspends itself. There is no need to run if there is nothing to do.

The class is designed to get lots of jobs done as fast as possible, without much dependency or requirement.

That's why I decided to go the simplest way. When a thread isn't running, there is no job to do, and there is no need to check the thread's status either. If it runs, it's busy; if it sleeps, it's idle.

The whole synchronization can be done through the job queue itself.

As for your quote of the SDK: the thread IS suspending itself because it knows its own state and has well-defined control of the point of execution.

Suspending the threads isn't really a good idea.
If the thread was executing an ODBC API function, for instance, a window message queue has been implicitly added to this thread by the underlying driver (especially MS Access).
If this thread is then suspended, it can't process any messages. When another application uses DDE or COM, messages are sent to each application and its threads. This causes the sending application to freeze immediately.
For example, if you use the TextPad context menu extension in Explorer to open a file, the TextPad application will freeze.
This happens to many other applications too; this is only one example.

"Inactive" threads should be woken up from time to time if they deal with APIs that create message queues implicitly.
It would be a good idea to wait on one (or more) events using "MsgWaitForMultipleObjects(...)". This allows the thread to wake up and handle message processing internally when it is required.

Thank you it works now.
I have a question..
I am working on a program that would receive web pages and save them to disk. When I call the function that should receive the page, my application freezes. Could I prevent that with CJobManager? (I don't really understand the thread thing well.) I am using AmHttpUtils; this is how I receive the page:
CAmHttpSocket http;
char *s = http.GetPage(_T("http://www.google.com/")); // when this is called the application freezes for a while
Could I do this with the JobManager so it won't freeze and gives me the 's' after it has received the page?
Thank you!

I always wondered if using multithreading to scan a directory hierarchy is wise.

I mean, the hard disk can access only one sector at a time, right? And if one thread asks to see one sector (cluster) in the inner cylinders, and another wants to see one in the outer cylinders, the head will move back and forth and lose time. Or not?
Maybe the disk cache can help here, I am not sure how. My knowledge of the hardware is still at the level of i486 computers.

What do you think about this issue?

Anyway, thank you for sharing your knowledge on this difficult subject.

Philippe Lhoste wrote: I always wondered if using multithreading to scan a directory hierarchy is wise.

That is hard to answer in a general way. What works well on one system may fail completely on another because of the different hardware used. Current hard disks may read one sector at a time, or a bunch of sectors (one for each head) and cache them, use caching internally or on an external cache controller, and so on. What is true today may not be true tomorrow. So we can only do what works best on most systems.
And that depends very strongly on the functionality of the application or even on a specific function. Whenever you think of using multithreading, try to make test cases and varying implementations to check what it gains you.
There is NO WAY to say: that's the way to do it right.
Always play around with different thread counts on different system configurations to see the effects.