I am not really sure running with these high core counts is such a great idea, because you're taking a program designed for one kind of environment and dropping it into something quite different.

There are going to be some inefficiencies that prevent linear scaling. Some engines may do better than others in this respect. But these environments may test scaling more than they test search or eval, and the results won't give a very good indicator of relative strength to people who don't have $20,000+ computer systems.

--Jon

True, but sooner or later 90 threads will become "normal" in home computers, so it's just a matter of time... engines WILL have to adapt, or be relegated.

NPS is probably 20% or so higher, depending on the engine. But I am not sure about the effective speed-up from 45 threads on 45 physical cores to 90 threads on 48 physical cores with HT under Lazy SMP. In the old days, with YBW, this was a clear NO, but nowadays I am not sure. They would probably have done better leaving it at 45 threads on 45 cores: fewer engines would have problems, and there would be less heat and less risk of throttling from using almost all CPU resources. The gain, if any, is small even for a well-scaling SMP implementation, and some engines might even perform worse.
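To put a rough number on that "small gain", here is a toy model, not a measurement: assume effective search speed grows as threads**alpha for some sublinear exponent alpha. Both the model and the alpha = 0.7 value below are illustrative assumptions; any real engine would have to be benchmarked.

```python
# Toy model of sublinear SMP scaling: effective speed ~ threads**alpha.
# alpha = 0.7 is a made-up illustrative exponent, not a measured value.

def effective_speedup(threads: int, alpha: float = 0.7) -> float:
    """Hypothetical effective speedup over a single thread."""
    return threads ** alpha

def marginal_gain(t1: int, t2: int, alpha: float = 0.7) -> float:
    """Multiplier in effective speed when going from t1 to t2 threads."""
    return effective_speedup(t2, alpha) / effective_speedup(t1, alpha)

# Doubling 45 -> 90 threads under this assumption yields well under 2x,
# and since the extra threads here are hyperthreads rather than real
# cores, the true gain would be smaller still.
print(f"45 -> 90 threads: x{marginal_gain(45, 90):.2f} effective speed")
```

Under this assumption the doubling is worth about a 1.6x effective speedup at best; the HT half of it would contribute far less than the physical-core half.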

Yes, one of the points is that after a while you don't get much gain from more cores.

The scaling does vary, though. I am noticing right now that Houdini is apparently getting something over 100M nodes/second on the CCC hardware, which is quite an astonishing number, while Arasan is getting about 30M, a bit more in the endgame. I did give them a Windows version that is aware of processor groups, but I am not sure that is what they are running. I don't have hardware anywhere close to this to test on.

Indeed. This will not be so much a chess contest as a stability contest: those that don't crash win. Most engines aren't tested for this kind of use case. And even those that are flawless, without any SMP bug (assuming that is even possible, which I doubt), will show very little gain on so many threads. You're probably better off stopping at 8 or 16 threads. After that you are just wasting electricity and crashing engines.
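The "stop at 8 or 16" intuition can be sketched with an equally toy diminishing-returns model: suppose the first doubling of threads is worth some fixed Elo and each further doubling is worth half as much. Both the 70 Elo figure and the halving rule below are illustrative assumptions, not benchmarks of any engine.

```python
# Toy diminishing-returns model: each doubling of threads is assumed to
# be worth half the Elo of the previous doubling. The 70 Elo figure for
# the first doubling is illustrative, not a measured result.

def elo_gain(threads: int, first_doubling_elo: float = 70.0) -> float:
    """Cumulative hypothetical Elo gain over running on 1 thread."""
    gain, step, t = 0.0, first_doubling_elo, 1
    while t * 2 <= threads:
        gain += step
        step /= 2
        t *= 2
    return gain

for t in (2, 8, 16, 64):
    print(f"{t:3d} threads: +{elo_gain(t):.1f} Elo (toy model)")
```

Under these assumptions, going from 16 to 64 threads adds only a handful of Elo while quadrupling the hardware, which is the "wasting electricity" point above.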

Theory and practice sometimes clash. And when that happens, theory loses. Every single time.