On 24 Aug 2008, at 01:26, Brandon S. Allbery KF8NH wrote:
> On 2008 Aug 23, at 18:34, Krzysztof Skrzętnicki wrote:
>> Recently I wrote a computation-intensive program that could easily
>> utilize both cores. However, there was overhead just from compiling
>> with -threaded and making some forkIO's. Still, the overhead was not
>> larger than 50% and with 4 cores I would probably still get the
>> results faster - I didn't experience an order of magnitude slowdown.
>> Perhaps it's the issue with OS X.
>
> All that's needed for multicore to be a *lot* slower is doing it
> wrong. Make sure you're forcing the right things in the right
> places, or you could quietly be building up thunks on both cores
> that will cause lots of cross-core signaling or locking. And, well,
> make sure the generated code isn't stupid. Quite possibly the PPC
> code is an order of magnitude worse than the better-tested Intel code.
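
To make the "forcing the right things in the right places" point concrete,
here is a minimal sketch, not anything from the program under discussion.
The names sumRange, forkStrict and parallelSum are made up for illustration.
The idea is that each worker evaluates its result itself before handing it
over, so the parent receives a value rather than a thunk it then has to
evaluate on its own core. It assumes building with -threaded and running
with +RTS -N2.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.List (foldl')

-- Hypothetical workload: a strict left fold over a range.
sumRange :: Int -> Int -> Int
sumRange lo hi = foldl' (+) 0 [lo .. hi]

-- Fork a worker that forces its argument before publishing it.
-- For an Int, weak head normal form is the fully evaluated value.
forkStrict :: Int -> IO (MVar Int)
forkStrict x = do
  box <- newEmptyMVar
  _ <- forkIO $ do
    r <- evaluate x   -- the work happens here, in the worker thread
    putMVar box r
  return box

-- Split the range in two and sum each half in its own thread.
parallelSum :: Int -> IO Int
parallelSum n = do
  let mid = n `div` 2
  a <- forkStrict (sumRange 1 mid)
  b <- forkStrict (sumRange (mid + 1) n)
  x <- takeMVar a
  y <- takeMVar b
  return (x + y)

main :: IO ()
main = parallelSum 1000000 >>= print
```

Dropping the evaluate (putMVar box x directly) would still give the right
answer, but the sum would then be computed by whichever thread takes the
value out of the MVar, which defeats the point of forking.
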
Except that the test was running on a Core2Duo, and it runs very fast
when ghc does the threading on one core. My personal guess is that
doing it properly threaded requires *lots* of kernel boundary crossings
for the locking etc. on OS X (which is nearly a microkernel). The test
program was almost 100% thread locking code.
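
For anyone who wants to try this on their own machine, a rough sketch of
the kind of test where nearly all the time goes into locking follows. It
is not the original test program; pingPong is a hypothetical name, and
the round count is arbitrary. Two threads bounce a counter back and forth
through a pair of MVars, so every step is a blocking handoff. Comparing a
build without -threaded against one with -threaded run under +RTS -N2
should show how much the handoffs cost on a given OS.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)

-- Bounce a counter between two threads n times and return its final value.
-- Almost no computation happens; the cost is the MVar handoffs.
pingPong :: Int -> IO Int
pingPong n = do
  ping <- newEmptyMVar
  pong <- newEmptyMVar
  -- Echo thread: take a value, add one (strictly), send it back.
  _ <- forkIO $ replicateM_ n $ do
    v <- takeMVar ping
    putMVar pong $! v + 1
  -- Calling thread: send the counter, block until the reply arrives.
  let loop :: Int -> Int -> IO Int
      loop 0 acc = return acc
      loop k acc = do
        putMVar ping acc
        acc' <- takeMVar pong
        loop (k - 1) acc'
  loop n 0

main :: IO ()
main = pingPong 100000 >>= print
```
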
Bob