PhysX SDK 2.8.4: Optimized CPU Cloth Simulation

The new PhysX SDK 2.8.4 comes with an optimized CPU cloth simulation path and is compiled with the SSE2 option. Optimized CPU cloth simulation? According to the tests I ran, this is true. The cloth sample shipped with the PhysX SDK clearly shows the gain in performance. I tested on my dev system with a GTX 460 (R260.63) + Quad Core X 9650 @ 3.2GHz:

The higher GPU usage could be because the framerate quadrupled. I mean, the GPU has to render 4 times as many frames per second, no?
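To make that reasoning concrete, here is a rough sketch of the idea. It assumes the GPU spends a fixed amount of time on each frame, so its usage grows in proportion to the framerate until it saturates. The function name and the numbers (30 FPS, 120 FPS, 2 ms per frame) are hypothetical and are not measurements from the test above.

```python
def gpu_usage(fps, gpu_ms_per_frame):
    """Fraction of each second the GPU is busy rendering, capped at 1.0.

    Assumes a constant GPU cost per frame (a simplification).
    """
    return min(1.0, fps * gpu_ms_per_frame / 1000.0)


# Hypothetical figures for illustration only.
before = gpu_usage(fps=30, gpu_ms_per_frame=2.0)
after = gpu_usage(fps=120, gpu_ms_per_frame=2.0)

print(f"GPU usage before: {before:.0%}, after: {after:.0%}")
print(f"Usage ratio: {after / before:.1f}x")
```

Under this simple model, a 4x framerate means roughly 4x the GPU usage, even though the GPU itself is not doing any of the cloth simulation.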

I guess the same thing would happen in your FluidMark with the CPU test and an ATI card. If you used multithreading, the framerate would jump, so the card would have to render faster and the GPU usage would increase as well.

So what’s the verdict really? Did the performance gain come from SSE2 alone or something else? Have you pinpointed it?

A Geeks3D friend told me that you cannot gain more than 20% with SSE2. I had read in that article about the x87 PhysX instructions that you can get 100% more performance with SSE2. Now we see 400%. I am confused! 😛
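Part of the confusion may simply be how the percentages are counted. A small sketch, assuming "400%" refers to the quadrupled framerate mentioned earlier in this thread (the helper names are made up for illustration):

```python
def percent_gain(speedup):
    """Convert a speedup factor (e.g. 4.0 for 4x) into 'percent more' performance."""
    return (speedup - 1.0) * 100.0


def speedup_from_gain(percent):
    """Convert 'percent more' performance into a speedup factor."""
    return 1.0 + percent / 100.0


# The three figures quoted in this thread, put on the same scale.
print(f"20% more  -> {speedup_from_gain(20):.1f}x")
print(f"100% more -> {speedup_from_gain(100):.1f}x")
print(f"4x faster -> {percent_gain(4.0):.0f}% more (400% of the original framerate)")
```

So a 4x framerate is a 300% gain, compared with the 20% and 100% figures. The arithmetic does not say where the gain comes from, only that it is far larger than either estimate.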