MsgWaitForMultipleObjects and the queue state

One danger of
the MsgWaitForMultipleObjects function
is calling it when there are already messages waiting
to be processed, because MsgWaitForMultipleObjects
returns only when there is a new event in the queue.

In other words, consider the following scenario:

PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE)
returns TRUE indicating that there is a message.

Instead of processing the message, you ignore it and call
MsgWaitForMultipleObjects.

This wait will not return immediately, even though
there is a message in the queue. That's because the call to
PeekMessage told you that a message was ready,
and you willfully ignored it. The
MsgWaitForMultipleObjects function tells you only
when there are new messages; any message that you already knew
about doesn't count.

Even if it so happens that there were two messages
in your queue, MsgWaitForMultipleObjects does not
return immediately, because there are no new messages; there are only
old messages that you willfully ignored.

When MsgWaitForMultipleObjects tells you that there is
a message in your message queue, you have to process all
of the messages until PeekMessage returns FALSE,
indicating that there are no more messages.

Note, however, that this sequence is not a problem:

PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE)
returns FALSE indicating that there is no message.

A message is posted into your queue.

You call MsgWaitForMultipleObjects and include
the QS_ALLPOSTMESSAGE flag.

This wait does return immediately, because the incoming posted
message sets the "There is a new message in the queue that nobody
knows about" flag, which QS_ALLPOSTMESSAGE matches
and therefore causes MsgWaitForMultipleObjects to
return immediately.
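
The two sequences above can be sketched with a toy model of the queue state. This is only an illustration of the behavior described in this article, not how the window manager is actually implemented; the ToyQueue type and every toy_ function are hypothetical names made up for the sketch. The model assumes a single "new message" flag that posting sets and peeking clears.

```c
#include <stdbool.h>

/* Toy model only: illustrates the described behavior, not the real queue. */
typedef struct {
    int  pending;   /* messages sitting in the queue */
    bool has_new;   /* "there is a new message nobody knows about" */
} ToyQueue;

/* Posting adds a message and marks the queue as having new input. */
static void toy_post(ToyQueue *q)
{
    q->pending++;
    q->has_new = true;
}

/* Peeking tells the caller about everything queued, so nothing counts as
   new afterwards. Returns true if a message is available and removes it
   when asked to. */
static bool toy_peek(ToyQueue *q, bool remove)
{
    q->has_new = false;
    if (q->pending == 0) return false;
    if (remove) q->pending--;
    return true;
}

/* Would a message wait return right away? Only if there is new input. */
static bool toy_wait_returns_immediately(const ToyQueue *q)
{
    return q->has_new;
}

/* The safe pattern: keep processing until peek reports nothing left.
   Returns the number of messages handled. */
static int toy_drain(ToyQueue *q)
{
    int handled = 0;
    while (toy_peek(q, true)) handled++;
    return handled;
}
```

In this model, posting two messages and peeking once without removing leaves toy_wait_returns_immediately reporting false even though two messages are still queued, which is the first sequence. Posting after a peek that found nothing makes it report true, which is the second.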

Armed with this knowledge, explain why the observed behavior with the
following code is "Sometimes my program gets stuck and reports one
fewer record than it should. I have to jiggle the mouse to get the
value to update. After a while longer, it falls two behind, then three..."

No, that was Raymond’s whole point! If you peek, then wait, it fails as Raymond pointed out. While it is possible that you could write it as a PeekMessage loop, just writing while(PeekMessage()) isn’t enough.

If while’s used to replace the current if, that should be enough, I think…

Raymond, in your example, is there actually any work done in the windows proc for WM_NEWRECORD? I currently don’t understand why the falling behind would increase unless the counter was actually increased inside the window proc.

I would rather have had the ability to open a read-only handle to a live event object created by the message queue. It’d be way more flexible than trying to kludge MsgWaitForMultipleObjects into whatever scheme or framework I’m working with.

Ray: reentrancy is a general problem in code that uses Windows message queues. This is just something you have to deal with – the system can’t protect you from all possible scenarios.

Joshua: simply exposing an event would be very inefficient if you are only interested in a certain type of messages. You’d need some mechanism to tell the system when to signal the event. And then you’d have to deal with cases where two or more threads are waiting on the same event (should this be illegal? or should each thread get its own event object?)

Jiggling the mouse to get a response is a common occurrence in the Windows 2000 Start menu and the Windows XP Classic Start menu. It is so common that it doesn’t even take thinking. But this base note does make me wonder. You know, click on the Start button, move up to Programs, move to the right and locate the folder containing the link you really want to click on, but that folder doesn’t expand. You have to move the mouse to hover over another folder and then move back to the one you really wanted to expand. So does Start menu processing contain the bug described here?

ATL includes a function "AtlWaitWithMessageLoop". Back a while ago on the ATL mailing lists, there was discussion of some deadlock possibilities with it (in part due to the issue you raise here), and how it might be improved. Here’s the version that I came up with and currently use. I’ve often wondered if the approach I use has any possible problems. What improvements could be made to this version? (the formatting might get messed up)

Ah… This is exactly the problem that I encountered during my serial port application development. I was totally puzzled by the random loss of records. We (the development team) did try to figure out the pattern behind the problem but never succeeded, so the problem hung there until a new colleague told us about this…

Gads… this function is so florked up that several articles have appeared in Windows Developer/MSDN magazine about it (I wrote one of them, so it stuck in my mind).

A much more fundamental danger of this function is that it can break the synchronization of the mutex if you’re not careful how you use it (recursive waits in particular are problematic).

Consider the case where you have a Lock() function that properly first calls MsgWaitForMultipleObjects, then processes the pending messages in a loop as Raymond suggests.

Now consider two different message handlers that need to access the protected resource:

case WM_PAINT:
    Lock();
    if (pProtected) {
        // Call some function that calls some other
        // function that does the following:
        Lock(); // This recursive Lock is broken
        pProtected->ProcessPaint();
        Unlock();
    }
    Unlock();
    break;

case WM_KILLFOCUS:
    Lock();
    if (pProtected) {
        delete pProtected;
        pProtected = NULL;
    }
    Unlock();
    break;

The Lock above that’s labelled "This recursive Lock is broken" can process a WM_KILLFOCUS message which will successfully delete the protected resource (the thread already holds the mutex, so the wait will succeed). This, of course, causes a crash in the next line.

This all seems to be due to the fact that MsgWaitForMultipleObjects checks the message queue *first* rather than checking the state of the waited-for object first.

So, effectively, the necessity for processing the messages with a Peek() loop has generated a little preemptive operating system of its own, and since it’s in the same thread that has the locked resource, the mutex doesn’t protect.

The only solution I was able to come up with is to always put an "if (WaitForSingleObject(hMutex, 0) != WAIT_OBJECT_0)" check before you try to use MsgWaitForMultipleObjects, so that messages are pumped only when the mutex cannot be acquired immediately.
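
To make the difference concrete, here is a rough single-threaded model of the two Lock shapes. It is a sketch under heavy assumptions: ToyState and the toy_ functions are hypothetical names, toy_try_acquire stands in for WaitForSingleObject(hMutex, 0), and toy_pump stands in for dispatching a pending WM_KILLFOCUS whose handler deletes the resource. No real mutex or message queue is involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy single-threaded model; all names here are hypothetical. */
typedef struct {
    int   lock_depth;     /* how many times this thread holds the mutex */
    bool  held_by_other;  /* true if another thread owns the mutex */
    bool  kill_pending;   /* a queued message whose handler frees the resource */
    int  *resource;       /* stands in for pProtected */
} ToyState;

/* Models dispatching pending messages: the queued handler deletes the resource. */
static void toy_pump(ToyState *s)
{
    if (s->kill_pending) {
        s->kill_pending = false;
        s->resource = NULL;
    }
}

/* Models WaitForSingleObject(hMutex, 0): succeeds at once if the mutex is
   free or already owned by this thread, and never dispatches messages. */
static bool toy_try_acquire(ToyState *s)
{
    if (s->held_by_other) return false;
    s->lock_depth++;
    return true;
}

/* Broken shape: messages are dispatched before the mutex is even checked. */
static bool toy_lock_pumping(ToyState *s)
{
    toy_pump(s);
    return toy_try_acquire(s);
}

/* Guarded shape: try the zero-timeout acquire first, pump only if it fails. */
static bool toy_lock_guarded(ToyState *s)
{
    if (toy_try_acquire(s)) return true;
    toy_pump(s);                 /* a real loop would keep waiting and pumping */
    return toy_try_acquire(s);
}
```

With a kill message pending, a recursive toy_lock_pumping call clears the resource out from under the caller, while a recursive toy_lock_guarded call leaves it alone and the message stays queued for the ordinary message loop to handle later.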

Can someone tell me if the following would be acceptable in the message loop for a window thread, or if I am still using the MsgWaitForMultipleObjects the wrong way. Sorry, in advance, if the code shows up incorrectly, I have never posted here before.

In my opinion, situations that justify MsgWaitForMultipleObjects are quite rare nowadays. In the past, threads were expensive and developers needed to scratch their heads over how to cram a gazillion things into one thread. Today it just isn’t worth it. Besides, it is annoying when you have message loops splattered here and there in the code. If I need to process some data which involves synchronization issues, then I don’t even want to think about mixing it with the GUI. Just stash it in a worker thread, do the job, then notify the GUI in some flexible and lazy way (say, with PostMessage).

That is a completely unrelated issue. The matter of the Start menu, at random times not expanding folders unless the mouse is moved to a different folder and moved back, occurs even when all applications come from Microsoft. It is intermittently reproducible both on real PCs and virtual PCs. Now I resume thinking that your base note here looks like a likely explanation. The symptom really resembles the example you gave.

Is there a function that tells Windows "from now on, pretend I don’t know about any messages in the queue"? Basically, one that undoes the strangeness that PeekMessage does.

Does MsgWaitForMultipleObjects also tag queued messages as "read" like PeekMessage does? I.e., once I "realize" there is a message in the queue with MsgWaitForMultipleObjects, am I then forced to process all messages in the queue?

"In my opinion, situations that justify MsgWaitForMultipleObjects are quite rare nowadays. In the past, threads were expensive and developers needed to scratch their heads over how to cram a gazillion things into one thread."

Actually MsgWaitForMultipleObjects turns out to be very useful in multithread situations.

It’s the only way for you to add a user-defined window message that works like WM_PAINT — a very low priority message, and there’s only one of them in the queue no matter how many times you post it.

I used a message loop based on MsgWaitForMultipleObjects when I had a program with a worker thread that did computation and wrote output text. There was also a main UI thread which controlled the window, so the worker thread should not be allowed to change the window contents itself.

When the worker thread (which you can imagine as printing a million digits of Pi) has some text to output, it must post a message to the UI thread to get the text displayed. But if you used ordinary user-defined messages, you could get thousands of tiny WM_USERs clogging up the message queue.

The solution is to have a text buffer that holds the new output. Any time new text is added to the buffer, the worker thread posts a WM_PAINT-type message to the UI thread. There’s only one WM_PAINT at any time, and it’s always handled after more critical interactive messages (like scroll-bar manipulation). When the UI thread services the message, it outputs all the text in the buffer at once, just as a WM_PAINT handler updates the whole invalidated region of the window, regardless of how many smaller invalidations composed it.

But since you can’t define your own WM_PAINT-type message, you can get the same effect by using MsgWaitForMultipleObjects and replacing "post a message" with "flip on the event object." If you write the loop correctly (as described by Raymond) you get the same behavior.
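
As a rough illustration of the coalescing idea, here is a single-threaded sketch. It assumes a flag that behaves like an auto-reset event and a fixed-size buffer; ToyOutput, toy_worker_write and toy_ui_service are hypothetical names, and a real two-thread version would also need a lock around the buffer and a real event handle passed to MsgWaitForMultipleObjects.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_BUFFER_SIZE 256

/* Toy model; all names here are hypothetical. */
typedef struct {
    char   buffer[TOY_BUFFER_SIZE];  /* text produced but not yet displayed */
    size_t used;                     /* bytes currently in the buffer */
    bool   event_set;                /* models an auto-reset event */
} ToyOutput;

/* Worker side: append text, then "flip on the event object".
   Setting an event that is already set changes nothing, which is what
   collapses many writes into one wake-up. */
static bool toy_worker_write(ToyOutput *o, const char *text)
{
    size_t n = strlen(text);
    if (o->used + n >= TOY_BUFFER_SIZE) return false;  /* buffer full */
    memcpy(o->buffer + o->used, text, n);
    o->used += n;
    o->buffer[o->used] = '\0';
    o->event_set = true;
    return true;
}

/* UI side: called when the wait reports the event. Copies out everything
   accumulated so far in one go and returns the number of bytes flushed,
   or 0 if the event was not signaled. */
static size_t toy_ui_service(ToyOutput *o, char out[TOY_BUFFER_SIZE])
{
    size_t n;
    if (!o->event_set) return 0;
    o->event_set = false;
    n = o->used;
    memcpy(out, o->buffer, n);
    out[n] = '\0';
    o->used = 0;
    return n;
}
```

Three writes followed by one service call deliver all of the text at once, and a second service call finds nothing to do, much as one WM_PAINT covers however many invalidations came before it.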

(Raymond and others: Are there any potential gotchas to this approach?)

"Actually MsgWaitForMultipleObjects turns out to be very useful in multithread situations.

It’s the only way for you to add a user-defined window message that works like WM_PAINT — a very low priority message, and there’s only one of them in the queue no matter how many times you post it."

I agree. That’s the example of MsgWaitForMultipleObjects where it’s handy. However, with two reservations:

1. You use it to "improve" the regular main message loop, not to implement a bastard message loop somewhere in the program.

2. While less elegant, a simple WM_TIMER that polls the text buffer is much less confusing than MsgWaitForMultipleObjects. Unless you really need to update the GUI ASAP, updating it 3-5 times a second will be indistinguishable from the user’s point of view. Consider how much less explanation WM_TIMER requires for a maintenance programmer than the esoteric gotchas of MsgWaitForMultipleObjects.

This is, of course, true. And as I mentioned there are numerous ways you can screw yourself, by yourself, if you’re not careful.

However, this function is the only one I know of where the action is entirely implicit. I.e. you’re not calling something that you can know is going to be doing a SendMessage in the context of the current thread. You, in fact, can’t know what it will do at all.

As a result, when you’re talking about a mutex that’s grabbed by message handlers (which is one of the main uses of this kind of function… to avoid the inevitable deadlock that will happen if you ever do anything that requires message processing in such a handler), the only safe thing to do is to always return without processing any pending messages if the mutex is already held.

Not only does this function not do that by default (which obviates almost its entire point), there’s not even an option for it… you have to code that additional check by hand yourself or you’re looking at a complete maintenance nightmare.

It’s not the end of the world, but it is an extremely obscure pitfall.