If I use the mass stored in bj.w, the simulation produces NaNs, even after the very first step. The particle positions themselves are correct, because the simulation works when I take the particle weight from a cbuffer or from a global constant instead. I initialize all particle weights to the same number, equal to the g_fParticleMass constant in the shader.

The funny thing is that if I do the same thing in the MS example I mentioned above, the result is exactly the same: I get no output and the buffer contains NaNs. Why am I unable to use the 4th vector component from shared memory in this case? It is initialized properly on the CPU side and then copied to the GPU (verified).
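
For anyone who wants to repeat the host-side check, here is a minimal sketch of what verifying the weights could look like. The `Float4` and `Particle` structs and the `firstBadMass` helper are hypothetical names used only for illustration; they are not taken from the MS example, and the assumption that `pos.w` holds the mass simply mirrors the setup described above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side mirror of the GPU particle layout (an assumption,
// not copied from the sample): xyz is the position, w is the particle mass.
struct Float4 { float x, y, z, w; };
struct Particle { Float4 pos; Float4 velo; };

// Returns the index of the first particle whose mass (pos.w) is not a
// finite, strictly positive number, or -1 if every mass looks usable.
long firstBadMass(const std::vector<Particle>& particles) {
    for (std::size_t i = 0; i < particles.size(); ++i) {
        const float m = particles[i].pos.w;
        if (!std::isfinite(m) || m <= 0.0f) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

The same function can be run twice: once on the array before it is uploaded, and once on data read back from the GPU buffer after the first step, to see whether `w` is already wrong on input or only becomes NaN after the shader has written it.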