Hi, I'm having a problem that I haven't been able to find documented. I'm on a brand new MacBook Pro with Retina display (NVIDIA GeForce GT 650M), and my CULA codes crash in culaInitialize(). The examples shipped with CULA show the same symptom: abort() in culaInitialize(). The NVIDIA SDK examples all run fine. It seems that I have to use the CUDA 5 RC libraries in place of the CUDA 4.2 libraries bundled with CULA R15. This makes sense, but I wanted to make sure that it is correct. When I do this, though, I get "Insufficient memory" errors.

Backtrace from the debugger gives:

    Program received signal SIGABRT, Aborted.
    0x00007fff86addd46 in __kill ()
    (gdb) bt
    #0 0x00007fff86addd46 in __kill ()
    #1 0x00007fff88cb4eec in __abort ()
    #2 0x00007fff88cb5d43 in __stack_chk_fail ()
    #3 0x00000001001dd1d2 in cudalib::GetDeviceCount ()
    #4 0x00000001001f4828 in culaInitialize ()
    #5 0x0000000100000e4d in main ()

It seems to be grabbing the CULA 4.2 CUDA runtime and BLAS:

    % otool -L gesvd
    gesvd:
        libcula_core.dylib (compatibility version 0.0.0, current version 0.0.0)
        libcula_lapack.dylib (compatibility version 0.0.0, current version 0.0.0)
        @rpath/libcublas.dylib (compatibility version 1.1.0, current version 4.2.0)
        @rpath/libcudart.dylib (compatibility version 1.1.0, current version 4.2.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
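For comparing several binaries quickly, the CUDA entries can be pulled out of the `otool -L` listing with a small filter. The following is only a sketch: the function name `cuda_lib_versions` is made up for illustration, and it assumes the listing has the same layout as the output shown in this thread (library path first, "current version X.Y.Z)" last).

```shell
# Hypothetical helper (not part of CULA or CUDA): reads `otool -L` output on
# stdin and prints "<library> <current version>" for libcublas and libcudart.
# Usage: otool -L ./gesvd | cuda_lib_versions
cuda_lib_versions() {
  awk '/libcudart\.dylib|libcublas\.dylib/ {
    n = split($1, parts, "/")      # drop any @rpath/ or directory prefix
    ver = $NF                      # last field looks like "4.2.0)"
    sub(/\)$/, "", ver)            # strip the trailing parenthesis
    print parts[n], ver
  }'
}
```

Note that `otool -L` reports the versions recorded when the binary was linked, so it shows which libraries the build picked up rather than anything about the installed driver.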

Recompiling with a pointer to the NVIDIA libraries:

    % gcc -m64 -o gesvd gesvd.c -DNDEBUG -O3 -I/usr/local/cula/include -L/usr/local/cuda/lib -L/usr/local/cula/lib64 -lcula_core -lcula_lapack -lcublas -lcudart -pthread

yields the following:

    % otool -L gesvd
    gesvd:
        libcula_core.dylib (compatibility version 0.0.0, current version 0.0.0)
        libcula_lapack.dylib (compatibility version 0.0.0, current version 0.0.0)
        @rpath/libcublas.dylib (compatibility version 1.1.0, current version 5.0.0)
        @rpath/libcudart.dylib (compatibility version 1.1.0, current version 5.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    % ./gesvd
    --------------------------------------------------------------------------------
    This example demonstrates using CULA to implement an image compression
    algorithm. Two images will be generated:

So, no crash, but I don't know if the failure is indicative of another problem. The same happens with the CULA "systemSolve" example:

    % otool -L systemSolve
    systemSolve:
        libcula_core.dylib (compatibility version 0.0.0, current version 0.0.0)
        libcula_lapack.dylib (compatibility version 0.0.0, current version 0.0.0)
        @rpath/libcublas.dylib (compatibility version 1.1.0, current version 5.0.0)
        @rpath/libcudart.dylib (compatibility version 1.1.0, current version 5.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    % ./systemSolve
    -------------------
     SGESV
    -------------------
    Allocating Matrices
    Initializing CULA
    Calling culaSgesv
    Insufficient memory to complete this operation

You definitely want to stick with the CUDA 4.2 runtime for this, preferably the one that came with CULA. In theory the driver can be 4.2 or 5.0, but the 4.2 driver is closer to our test configuration.
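One way to guard against accidentally linking the wrong runtime is to check the `otool -L` listing before running a binary. This is a rough sketch rather than anything shipped with CULA: the function name `require_cuda_version` is hypothetical, and it assumes the listing format shown earlier in this thread.

```shell
# Hypothetical helper (not part of CULA or CUDA): reads `otool -L` output on
# stdin and succeeds only if every libcublas/libcudart entry has a current
# version beginning with the expected major.minor (for example 4.2).
# Usage: otool -L ./systemSolve | require_cuda_version 4.2
require_cuda_version() {
  awk -v want="$1" '
    /libcudart\.dylib|libcublas\.dylib/ {
      ver = $NF
      sub(/\)$/, "", ver)                    # "4.2.0)" -> "4.2.0"
      seen = 1
      if (index(ver, want ".") != 1) {       # version must start with "4.2."
        print "mismatch: " $1 " is " ver
        bad = 1
      }
    }
    END {
      if (!seen) { print "no CUDA libraries found"; exit 2 }
      exit (bad ? 1 : 0)
    }
  '
}
```

The exit status is 0 when all CUDA entries match, 1 on a mismatch, and 2 when no CUDA libraries appear in the listing, so the check can be chained in front of a run with `&&`.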

This could also be a Mountain Lion compatibility issue. Mountain Lion is not on our supported OS list at this time.