I am a graduate student doing an independent study in a subject I am not that good at (linear algebra). In our group we test several of the MAGMA library kernels for research projects in formal verification. To do so, we need to find correct argument values for each kernel we formally verify, and it takes too long to figure that out (especially for me). Can anyone help me figure out correct argument values that I can plug into "magmablas_chemv_200_L_special", and possibly other kernels in the same file, "magma-1.2.0/magmablas/chemv_fermi.cu"?

I have spent more than 3 days on this and still haven't figured out the correct parameters.

I am pasting the code I am trying to get working to start with. As you will see from it, I am trying to "completely" isolate each kernel so that I can instrument the code later, after first making it work with nvcc, etc. This is why I have commented out all the includes and copied the code needed from them into a single file.

//----------------------------------------------------
// Utility custom functions to help generate the data
//----------------------------------------------------
void fillVector(cuFloatComplex * x, magma_int_t n, cuFloatComplex aValue){
    /**
      Author: ...
      Fills a complex numbers vector with 'aValue'.
      This function is modifiable so that it fills random complex numbers instead of just a single value. However, since this function is just for generating regular tests, that feature was left out.
    */
    int i;
    for (i = 0; i < n; i++) {
        // TODO fill in this part
        x[i]= aValue;
    }
}

A - COMPLEX*16 array of DIMENSION ( LDA, n ). Before entry with UPLO = 'U' or 'u', the leading n by n upper triangular part of the array A must contain the upper triangular part of the hermitian matrix and the strictly lower triangular part of A is not referenced. Before entry with UPLO = 'L' or 'l', the leading n by n lower triangular part of the array A must contain the lower triangular part of the hermitian matrix and the strictly upper triangular part of A is not referenced. Note that the imaginary parts of the diagonal elements need not be set and are assumed to be zero. Unchanged on exit.
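For the UPLO = 'L' case described above, only the diagonal and the part below it need to hold meaningful data. A minimal host-side sketch of that storage, assuming column-major layout and using C99 `float complex` as a stand-in for `cuFloatComplex`; the helper name `fill_lower` and its parameters are made up for illustration and are not part of MAGMA:

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper (not a MAGMA routine).
   Writes the diagonal and the strictly lower triangle of the leading
   n-by-n block of a column-major array with leading dimension lda.
   Element (i, j), zero-based, lives at A[j * lda + i].
   The strictly upper triangle and any padding rows are left untouched,
   since they are not referenced when UPLO = 'L'. */
void fill_lower(float complex *A, int n, int lda,
                float complex below, float diag)
{
    for (int j = 0; j < n; j++) {
        A[(size_t)j * lda + j] = diag;          /* real diagonal entry */
        for (int i = j + 1; i < n; i++)
            A[(size_t)j * lda + i] = below;     /* entries below the diagonal */
    }
}
```

The diagonal is taken as a real `float` here because the imaginary parts of the diagonal are assumed to be zero anyway.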

LDA - INTEGER. On entry, LDA specifies the first dimension of A as declared in the calling (sub) program. LDA must be at least max( 1, n ). Unchanged on exit. It is recommended that lda is multiple of 16. Otherwise performance would be deteriorated as the memory accesses would not be fully coalescent.
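To make the LDA requirement concrete, here is a small sketch of how a padded leading dimension and the matching allocation size could be computed on the host. The function names are hypothetical helpers, not MAGMA API, and the rounding to 16 simply follows the recommendation quoted above:

```c
#include <stddef.h>

/* Hypothetical helpers (not part of MAGMA). */

/* Smallest multiple of 16 that is >= n, and at least 1,
   so that lda >= max(1, n) holds. */
int padded_lda(int n)
{
    int lda = (n + 15) / 16 * 16;
    return lda < 1 ? 1 : lda;
}

/* Zero-based offset of element (i, j) in a column-major array. */
size_t elem_offset(int i, int j, int lda)
{
    return (size_t)j * (size_t)lda + (size_t)i;
}

/* Number of elements to allocate for an (lda, n) array. */
size_t matrix_elems(int lda, int n)
{
    return (size_t)lda * (size_t)n;
}
```

For n = 200 this gives lda = 208, so the buffer for A would hold 208 * 200 elements rather than 200 * 200, and the same lda has to be passed to the kernel.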

// changed on exit, need copy back to host
cuFloatComplex *A; // A is an 'n' by 'n' hermitian matrix as a 1D array of (LDA,n) dimensions
A = (cuFloatComplex *) malloc(MEM_SIZE_A);
fillHermitianMatrix(A, DEGREE_A, make_cuFloatComplex(1,0)); // n is degree for an n by n matrix, diagonal is 1's

I agree with John; you should look at the magmablas_chemv_200 and magmablas_chemv2_200 routines, which are the high-level routines that call the appropriate lower-level routine based on the input parameters. Most of the parameters are documented there.

WC is a workspace, which is normally allocated in magmablas_chemv_200, or passed to magmablas_chemv2_200. It doesn't appear that you allocate the correct size for the workspace, which will lead to memory errors on the GPU.

The arrays A, X, Y are all on the GPU. The scalars alpha, beta are on the CPU. You appear to be allocating GPU pointers for alpha and beta, then dereferencing GPU pointers on the CPU in your chemv call. This will lead to a segfault.

You don't need to malloc() the alpha and beta scalars. Just declare them as scalars (on the CPU) and use them directly.
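Once the arguments are sorted out, one way to check the GPU result is to compare it against a plain host computation. The following is only a sketch in C99, not MAGMA code: the name `ref_chemv_lower` is made up here, `float complex` stands in for `cuFloatComplex`, and unit strides are assumed for x and y. Note that alpha and beta are ordinary by-value scalars, with no allocation involved:

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical host reference (not a MAGMA routine).
   Computes y := alpha*A*x + beta*y for a Hermitian, column-major
   matrix A, reading only its diagonal and lower triangle. */
void ref_chemv_lower(int n, float complex alpha,
                     const float complex *A, int lda,
                     const float complex *x,
                     float complex beta, float complex *y)
{
    for (int i = 0; i < n; i++) {
        float complex sum = 0.0f;
        for (int j = 0; j < n; j++) {
            float complex a;
            if (i == j)
                a = crealf(A[(size_t)j * lda + j]);   /* diagonal: imaginary part ignored */
            else if (i > j)
                a = A[(size_t)j * lda + i];           /* stored lower element A(i,j) */
            else
                a = conjf(A[(size_t)i * lda + j]);    /* A(i,j) = conj(A(j,i)) */
            sum += a * x[j];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Called as `ref_chemv_lower(n, alpha, A, lda, x, beta, y)` on host copies of the data, it gives a result to compare element by element (within a floating-point tolerance) against the Y copied back from the GPU.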