Topic: Threadripper 2990 Performance Issues

It is all about render speed on our Threadripper node: we have used a TR 2990/Cinema 4D setup as a render node for about a year. At the beginning (I think it was when we used Corona V3), everything felt very fast, and renderings were calculated in no time. This changed last fall, when we compared the results of the Threadripper against one of our iMac Pro workstations that has an 8-core Xeon (in Corona 4 hotfix 2 & 3): the results were roughly at the same level.

Then I built an additional render node based on the new Ryzen 3950. Now we are rendering a movie clip, and while the results of the Ryzen 3950 (with a Cinebench score of around 9000) are pleasing at 14 minutes per frame, the Threadripper 2990 node (with a Cinebench score of around 11000) takes more than double the time, 32 minutes per frame.
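As a rough sanity check (my own arithmetic, using only the figures quoted above), the Cinebench scores predict the opposite of what the frame times show:

```python
# Cinebench scores and per-frame times as quoted in the post above.
cb_tr2990 = 11000   # Threadripper 2990 score
cb_r3950 = 9000     # Ryzen 3950 score

t_tr2990 = 32.0     # minutes per frame observed on the 2990
t_r3950 = 14.0      # minutes per frame observed on the 3950

# Expected from Cinebench: the 2990 should be ~22% faster per frame...
expected_speedup = cb_tr2990 / cb_r3950
# ...but observed: it is actually ~2.3x slower.
observed_slowdown = t_tr2990 / t_r3950

print(f"expected speedup: {expected_speedup:.2f}x")
print(f"observed slowdown: {observed_slowdown:.2f}x")
```

So the gap to explain is not a few percent but roughly a factor of three between expectation and reality, which points at something systemic (memory, scheduling, or precompute) rather than raw CPU throughput.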

To make sure there is no «Cinebench error», we rendered an image with the good old physical renderer. The result was as we expected: the Threadripper is around 20% faster than the Ryzen.

While researching, I found several entries in this forum about RAM and correct setup. We use 2 x 16 GB Corsair LPX at 3200 MHz (our motherboard is a Gigabyte Designare X399). Could it be that we need to add more RAM modules to get better results (at least two channels instead of only one)? Or are other RAM clock speeds better than 3200 MHz? What RAM setup would you recommend?

If there is an underlying hardware setup issue, it's odd that it would only manifest with the latest Corona version. It would have to have always been there; otherwise it would be Corona's fault (which it can be!).

Make sure:

- You are running the latest BIOS for the board.
- You are running the latest Windows 10 update. There were a lot of improvements for the Zen architecture.
- You are running the latest AMD chipset drivers, in case Windows Update doesn't install them for some reason.
- Your memory is running at its designed speed, by loading the XMP/DOCP profile in the BIOS or by manually setting the 3200 speed.
- You are running the High Performance or Ultimate Performance power plan. (You can enable the Ultimate plan with an easy command-line option; just google it.)

To get full quad-channel bandwidth, you need 4 memory DIMMs, or 8. They have to be placed in the correct slots if you only go for 4. The 2 DIMMs you have at the moment give you dual-channel only.
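To put numbers on the channel question, here is a back-of-the-envelope sketch of peak theoretical bandwidth, assuming the standard 64-bit (8-byte) DDR4 channel width:

```python
def ddr4_bandwidth_gbs(mt_per_s, channels, bus_width_bytes=8):
    """Peak theoretical bandwidth: transfers/s x 8 bytes per 64-bit channel."""
    return mt_per_s * 1e6 * bus_width_bytes * channels / 1e9

dual = ddr4_bandwidth_gbs(3200, channels=2)   # the current 2-DIMM setup
quad = ddr4_bandwidth_gbs(3200, channels=4)   # with 4 (or 8) correctly placed DIMMs
print(f"dual-channel DDR4-3200: {dual:.1f} GB/s")
print(f"quad-channel DDR4-3200: {quad:.1f} GB/s")
```

Going from 2 to 4 DIMMs doubles the peak feeding a 32-core chip, which matters a lot more on the 2990WX than on an 8- or 16-core part.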

Try to compare rays/s during those renders, to see if the rendering speed is actually slower, or if the longer render time is due to a slow precomputing period (which for some reason often runs single-threaded in some parts). I have seen reports with animations and Backburner where the nodes were too slow to start and precompute, slowing the total rendering time even though the rendering speed itself was fine.
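One way to back out how much of a frame goes to precompute rather than tracing (all numbers here are made up for illustration, not measurements from the thread):

```python
def precompute_minutes(total_minutes, rays_per_s, total_rays):
    """Estimate how much of the frame time was NOT spent tracing rays.
    total_rays would come from passes x resolution x samples per pixel;
    the value below is purely hypothetical."""
    trace_minutes = total_rays / rays_per_s / 60
    return total_minutes - trace_minutes

# Hypothetical frame: 32 min total, 3.2 Mrays/s sustained, 4.6e9 rays traced.
overhead = precompute_minutes(32.0, 3.2e6, 4.6e9)
print(f"~{overhead:.1f} min spent outside ray tracing")
```

If that overhead term dominates, the node's rays/s can look healthy while per-frame times are still terrible, which is exactly the Backburner symptom described above.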

I have two 2990WX machines in our office and I don't have any of these issues, but people report them very often. The 2990WX is a very peculiar chip (due to its memory controller setup, only two of the dies have direct memory access, so half the chip reaches memory only indirectly, and there are NUMA nodes on top), and Corona is very sensitive to latency, something the Zen architecture is worse at; an incorrect setup can easily break it.

Good tips Juraj, I will try to go through them on our machines too. We have 3 workstations, all with the 2990WX, and find render performance sluggish. The same scene renders on these workstations at around 3.2 Mrays/s, while on our one older workstation with an i9-7980XE it jumps to around 4.3 Mrays/s.

I have tried disabling the Windows flow-guard thing and running the CorePrio program as suggested by others, but I'm not seeing any difference. Also, I have never had a workstation crash as much as this one...

That is interesting... I wonder if some specific scene setup could run very poorly on the 2990WX.

I still have a 7980XE, a 2990WX and various dual Xeons like the 2698 v4, all three of which have very similar performance (from 4500 to 5200 Cinebench R15 scores). And when I run distributed rendering, they all contribute equal amounts of passes every single time (the 2990WX does provide around 10% more on average).

So this really is a mystery... I have two such builds, one with an MSI MEG and one with a Zenith Alpha, and neither even runs memory at 2933 because it wasn't stable (2800 CL16 on both). Both have 128 GB of memory with swapping disabled in Windows.

I can't comment on sluggishness because one is Veronika's workstation and the other is a node, so I don't use them myself, but she did frequently comment on such behavior. That is always hard to judge, though, since I find 3ds Max, Photoshop and Chrome to all be equally slow software.

You can also try downloading a Spectre disabler and disabling both protections, which mainly affect just Intel but still get installed automatically by Windows.

Also, with my 2016 3ds Max, none of my scenes ever crash by themselves. Like, ever; I had maybe 5-6 crashes last year, and they were all Corona related (reloading textures in CoronaBitmap and importing stuff from Copitor, both during Interactive, which... doesn't work, of course).

I really wonder what this comes down to... Also, at some point did you try to reinstall Windows from the ground up and see if that changes anything? Some things are just... a mystery.

Cheers, I will try those suggestions too! (Though what is the memory swapping you mention?)

We run Max 2020 and I get a lot of crashes, plus very annoying viewport glitches (everything in the viewport very often disappears until I click with the mouse). But I also get straight-up PC crashes/reboots on render start, plus blue screens. Then it runs for days on the same scene without crashing. I wonder if I should play a bit with the RAM timings and power setup in the BIOS, as everything is just running at stock now. The workstation is brand new, only a couple of months old, so I can't see a reinstall fixing much.

I'm curious how rays/s works. Is it just an overall view of how complex the scene is to render? In the Corona benchmark I get upwards of 12-14 Mrays/s for that scene, while our i9 is around 7.5 Mrays/s.
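My understanding (a sketch, not an official definition): rays/s is a throughput figure, not a measure of scene complexity. For the same scene, the total number of rays needed is roughly fixed, so render time scales inversely with rays/s. The rays/s values below are the benchmark figures quoted above; the total-ray workload is made up:

```python
# Hypothetical fixed workload for one frame of a given scene.
total_rays = 1.0e10

# Benchmark throughput figures from the post (rays per second).
minutes = {label: total_rays / rps / 60
           for label, rps in [("2990WX", 12e6), ("i9-7980XE", 7.5e6)]}

for label, t in minutes.items():
    print(f"{label}: {t:.1f} min for the same frame")
```

So a higher rays/s on the same scene should translate directly into proportionally shorter frame times, which is why comparing rays/s between machines is a cleaner test than comparing wall-clock times that include precompute.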

If you get blue screens, then there is definitely something wrong at the hardware level, or at best at the driver level.

Do you have a good PSU? And what does running the memory at "standard" mean? Did you load the XMP profile? What capacity, frequency and timings is it, and on which board?

Swapping is virtual memory: Windows offloads data from system memory onto your (usually) main drive. With large enough memory this isn't strictly necessary anymore, and while Windows should be efficient in deciding when to do it... that doesn't mean it can be trusted.

The Corona benchmark is around 43 secs, but the real action happened in my scenes from yesterday regarding rays/s:

- 4K scene rendered yesterday: 2.1 Mrays/s total. Today, after the tweaks: 4.7 Mrays/s!
- The 1800 px scene from yesterday was at 3.2 Mrays/s, and today 5.7 Mrays/s!

Overall system responsiveness also seemed better while rendering, and I was very happy. But then I wanted to open Photoshop to see how responsive that would be (as we have had very laggy performance in there while rendering), and when I opened PS: blue screen with a "sorry, Windows ran into a problem and needs to reboot :("
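For the record, quantifying those before/after figures (my arithmetic on the numbers above):

```python
# Before/after throughput from the post, in Mrays/s.
scenes = {"4K scene": (2.1, 4.7), "1800 px scene": (3.2, 5.7)}

speedups = {name: after / before for name, (before, after) in scenes.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.2f}x faster after the tweaks")
```

Roughly a 1.8x to 2.2x throughput gain just from setup changes, which is consistent with the earlier point that a misconfigured 2990WX can lose a large fraction of its performance.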

Temps seemed okay, hovering around 68 degrees when rendering. Do you think it could be the memory? I've just put it down to 3000 MHz now and will try rendering the same scene and opening PS again.

The correct multiplier for Ryzen near 3000 is 2933. It will run at both, but it's better to set it precisely to 2933.
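As a gloss on why 2933 rather than 3000 (my own explanation, not from the post): standard DDR4 data rates advance in steps of roughly 266.67 MT/s (133.33 MHz memory-clock steps, doubled), so 3000 is not a native ratio and the platform effectively snaps to the nearest step:

```python
STEP = 800 / 3  # ~266.67 MT/s: the granularity of standard DDR4 data rates

def nearest_native_rate(mt_per_s):
    """Snap a nominal rate such as 3000 to the nearest native DDR4 step."""
    return round(mt_per_s / STEP) * STEP

print(round(nearest_native_rate(3000)))  # why the advice is 2933, not 3000
print(round(nearest_native_rate(3200)))  # 3200 is already a native step
```

That is also why the standard speed ladder reads 2133, 2400, 2666, 2933, 3200: each rung is one STEP apart.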

But here's the thing: 3000 CL16 is a low-quality memory bin from Hynix, and there is no guarantee it will be super stable at 2933 in an 8 x 16 GB configuration; it's a lottery. (This applies to 1st and 2nd gen Zen chips; it's not an issue for 3rd gen.)

Also, don't overclock anything on the CPU. Don't even turn on PBO. Keep it stock. No overclock is necessary, and it is absolutely not going to be stable on your Prime X399 board with 8 power phases. This board was never meant for 250 W CPUs. It's a miracle it runs at all.

Compare that to something like the MSI MEG, which has 16 phases for the CPU and 3 phases for the memory. That and the Zenith Alpha are the only boards on which a 2990WX can be run safely, long-term, and overclocked.

(EDIT: I might add why I advise against overclocking even though the best overclocker on YouTube, Roman, does the opposite in his video. He is using an open-air setup on an Asus Zenith, not a Prime. He only populated 4 DIMM slots with 32 GB of memory in total, and he used the best Samsung B-die memory. The more memory you put on the Zen architecture, the higher the stress on the CPU, even without any overclock. Also, any meager gains you get in all-core performance, you lose in single-thread boost: your system will render slightly faster, but it's also going to be slower everywhere else. Unless you have an extreme setup, it's not worth it in any way whatsoever.)

I love this forum. This is stuff not really showing up on my initial googling :)

So you would recommend just loading the default profile in the BIOS, setting the AI overclock tuner to D.O.C.P., and setting the memory to 2933 MHz? Leave everything else alone, no changes to power? Or only set the memory to 2933 MHz and nothing else, not even the timings etc.? There are a bunch of other "enhancing" settings it seems I can enable or turn on. Not recommending those either?

No "enhancing"! All that semi-auto-overclock crap just makes things worse, especially when your board is insufficient for the CPU.

DOCP is XMP for AMD, so yes, you can turn that on. If it sets the memory automatically to 2933, great; if not, set it yourself. Each board has different BIOS options, but memory often has simple presets like "2933 CL16" which will automatically set all the voltages, subtimings, etc. You never have to do this by hand; that's a stupid hobby for Reddit dwarves. Or there will be a preset for speed (2933) and a preset for the timings group (16/18/18/38), etc.

Google/Reddit/YouTube rarely help much, because even the power users and pro reviewers are often enthusiasts who have never worked on these PCs outside of editing a Premiere video. They will run 32 GB of RAM on professional workstations instead of 128-256, test some random Blender scene for 20 minutes, and then go write "useful" stuff. The experience and advice can be useful, or not at all.

And people wonder who would need more than 128 GB :-). Some process has already reserved that space as committed memory (which would otherwise also include the pagefile, when allowed), and that's probably Corona/3ds Max as well.