Richard B. Johnson wrote:
> The actual problem in the production machine involves two absolutely
> independent tasks that end up using the same shared 'C' runtime
> library. There should be no interaction between them, none
> whatsoever. However, when they both execute rand(), they interact in
> bad ways. This interaction occurs on random days at monthly
> intervals.

On Linux (unlike Windows), there is _no_ interaction between the libraries of different tasks. Neither of them sees changes to the other's memory space.

If you are seeing a fault, then there might well be a bug, even a kernel bug, but your test program does not illustrate the same problem.

What is the "bad interaction" that you observed at monthly intervals? Was that also a SIGSEGV?

> This is likely caused by the failure to use "-s" in the compilation
> of a shared library function, fixed in subsequent releases.

No, this has nothing to do with it. Unlike Windows and some embedded environments, Linux shared libraries do not have "shared writable data" sections.

> So, I allowed rand() to be "interrupted" just as it would be in a
> context-switch. I simply used a signal handler, knowing quite well
> that the "interrupt" could occur at any time. [...] What I brought
> to light was a SIGSEGV that can occur when the shared-library rand()
> function is "interrupted".

You have made a mistake. Your program shows a different problem from the one which you noticed every month or so.

Calling a function from a signal handler while that same function is being interrupted by the handler is _very_ different from tasks context switching. They are not similar at all! (Yes, signals can be used to simulate context switches, but not like this!)

Your code interrupts one call to rand() and calls rand() _within_ the interrupt handler. The inner call and outer call interfere, in a very similar way to calling it twice from two threads (note: threads, not tasks). The memory state becomes corrupted.
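To make the interference concrete, here is a minimal sketch. It is _not_ the C library's rand(): toy_rand(), pending_handler and the other names are hypothetical, and the "signal" is simulated by an ordinary function call at a fixed point so that the result is repeatable. The outer call reads the shared state, the nested call updates it, and the outer call then overwrites that update.

```c
#include <stddef.h>

/* Hypothetical toy generator with static state; NOT the libc rand(). */
static unsigned toy_state = 1;

/* Stands in for a signal that arrives in the middle of toy_rand(). */
static void (*pending_handler)(void) = NULL;
static unsigned inner_result;

unsigned toy_rand(void)
{
    unsigned tmp = toy_state;            /* step 1: read the shared state */

    if (pending_handler != NULL) {       /* simulated "interrupt" point */
        void (*h)(void) = pending_handler;
        pending_handler = NULL;          /* fire only once */
        h();                             /* handler re-enters toy_rand() */
    }

    tmp = tmp * 1103515245u + 12345u;    /* step 2: compute the next value */
    toy_state = tmp;                     /* step 3: write back, clobbering
                                            whatever the nested call stored */
    return tmp;
}

static void handler(void)
{
    inner_result = toy_rand();           /* the nested (inner) call */
}

/* Returns 1 if the outer and inner calls produced the same value,
   i.e. the nested call's update to the state was lost. */
int nested_call_duplicates(void)
{
    toy_state = 1;
    pending_handler = handler;
    unsigned outer = toy_rand();
    return outer == inner_result;
}
```

Here the damage is only a repeated value because the toy state is a single word. A generator that keeps a table plus indices or pointers into it can be left half-updated in the same way, which is where a SIGSEGV could come from.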

This is _very_ different from two independent tasks context switching. Independent tasks do not share the same memory space, not even when they share the same libraries, so this type of corruption isn't possible.
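You can check the separation of tasks yourself with something like the following sketch. It assumes a POSIX system with fork(); the function name parent_sequence_unaffected() is made up for this example. The child calls rand() many times, and the parent then confirms that its own generator state has not moved.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's rand() sequence is unaffected by the child's
   calls to rand(), 0 if it was affected, -1 if fork/waitpid failed. */
int parent_sequence_unaffected(unsigned seed)
{
    srand(seed);
    int expected = rand();        /* first value produced for this seed */

    srand(seed);                  /* rewind the generator to the same point */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: a separate task, using the same shared C library. */
        for (int i = 0; i < 1000; i++)
            (void)rand();         /* changes only the child's private copy */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Parent: its state is still "freshly seeded", so the next value
       must be the first value of the sequence again. */
    return rand() == expected;
}
```

The library's code pages are shared between the two tasks, but its writable data is private to each one (copied on write after the fork), so the child's 1000 calls are invisible to the parent.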

Summary: your monthly "bad interaction" is not illustrated in this test program. It's a different problem.