Since I suspect that the "black box" (the GPU) is not shutting down cleanly in some larger code (and perhaps in others too), I would like to call cudaDeviceReset() at the end of main(). But wait! Wouldn't this cause a segmentation fault for every instance of a class statically created in main() that has non-trivial CUDA code in its destructor? E.g.

cudaDeviceReset explicitly destroys any context on the active device that is held by the process or thread that calls it. But that is all it does. If your destructor code makes CUDA API calls that need a context to work, then those calls cannot run after the context has already been destroyed (and the runtime will destroy it automatically anyway when the process terminates).
– talonmies, Jul 23 '12 at 9:15

Hmm... this did not solve the problem for me...
– JackOLantern, May 30 '13 at 20:33

You could try putting the declaration of "t" inside an extra pair of braces, and then calling cudaDeviceReset after the closing brace. That should force "t" to be destroyed before the device reset: `int main(..) { { A t; t.someoperation(); } cudaDeviceReset(); }`
– phoad, Jun 3 '13 at 20:33
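
To make the ordering argument in the comments above concrete, here is a minimal host-only sketch in plain C++. It does not call the CUDA runtime. `FakeDevice`, `fake_device_reset()` and `fake_free()` are hypothetical stand-ins introduced only for illustration: `fake_device_reset()` plays the role of cudaDeviceReset(), and `fake_free()` plays the role of any CUDA call in a destructor that needs a live context. The class `A` and `someoperation()` follow the names used in the comment. The sketch only records the order of events. How a real destructor fails after a reset depends on the calls it makes.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the device context; NOT part of the CUDA API.
struct FakeDevice {
    bool context_alive = true;
    std::vector<std::string> log;  // order of events, for inspection
};

FakeDevice g_device;

// Stand-in for cudaDeviceReset(): destroys the context.
void fake_device_reset() {
    g_device.context_alive = false;
    g_device.log.push_back("reset");
}

// Stand-in for a CUDA call (e.g. a free) that needs a live context.
bool fake_free() {
    const bool ok = g_device.context_alive;
    g_device.log.push_back(ok ? "free ok" : "free failed: no context");
    return ok;
}

// Plays the role of class A from the comment: releases device
// resources in its destructor.
struct A {
    void someoperation() { g_device.log.push_back("someoperation"); }
    ~A() { fake_free(); }
};

// Ordering from the question: the reset is the last statement, but the
// destructor of t still runs after it, when the function body ends.
void unscoped_body() {
    A t;
    t.someoperation();
    fake_device_reset();
}  // t is destroyed here, after the reset

// Ordering from the comment: the inner braces end the lifetime of t
// before the reset is issued.
void scoped_body() {
    {
        A t;
        t.someoperation();
    }  // t is destroyed here, before the reset
    fake_device_reset();
}

// Runs one variant against a fresh fake device and returns the event log.
std::vector<std::string> run(void (*body)()) {
    g_device = FakeDevice{};
    body();
    return g_device.log;
}
```

Calling `run(unscoped_body)` from a `main()` records the destructor's release attempt after the reset, while `run(scoped_body)` records it before. This is the standard C++ rule that automatic objects are destroyed at the closing brace of their enclosing block.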

Thanks. I think it is a good point. I will check as soon as I can and let you know.
– JackOLantern, Jun 4 '13 at 20:18