Parallel execution of CUDAfy code

Is it possible to run CUDA code on a single GPU from inside a parallel CPU loop (e.g. Parallel.For)? I've tried something like that, but an exception was thrown saying that the .cu file was locked and in use by another process.

It works fine in a single-threaded scenario, but my problem is that this function may be called in a multithreaded context, so I would like to adapt it to work in that case as well.

If the user chooses to enable CPU parallel processing, this function will be called indirectly twice in parallel by "DW_CalcKvalue" ("DW_CalcFugCoeff" => "CalcLnFug" => "CalcLnFugGPU"), located in the base property package class:

I found some posts in the forums about EnableMultithreading(), SetCurrentContext(), Lock() and Unlock(), and also took a look at the Unit Tests sample, but couldn't get it to work. I kept getting an "ErrorInvalidContext" exception.

I should probably move the GPU instance outside the function and make it 'global' in DWSIM, but I haven't had time to try that yet.

I have another question: in the example you posted above, if I were to run a Sub inside the function (as in my case), where should I put the LoadModule() call? In the main thread or inside the function? Before or after the Lock() call? Also, is the CudafyModule instance created per GPU or per thread?