This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters

[in]

uplo

magma_uplo_t

= MagmaUpper: Upper triangle of A is stored;

= MagmaLower: Lower triangle of A is stored.

[in]

n

INTEGER The order of the matrix A. n >= 0.

[in]

nb

INTEGER The inner blocking. nb >= 0.

[in,out]

A

DOUBLE PRECISION array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.

INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.

[out]

dT

DOUBLE PRECISION array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.

[out]

info

INTEGER

= 0: successful exit

< 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters

[in]

uplo

magma_uplo_t

= MagmaUpper: Upper triangle of A is stored;

= MagmaLower: Lower triangle of A is stored.

[in]

n

INTEGER The order of the matrix A. N >= 0.

[in]

nb

INTEGER The inner blocking. nb >= 0.

[in,out]

A

DOUBLE PRECISION array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.

INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.

[in,out]

dAmgpu

DOUBLE PRECISION array of pointer, dimension (ngpu) Each point to a DOUBLE PRECISION array, dimension (LDDA, nlocal) which hold the local matrix on each GPU.

[in]

ldda

INTEGER The leading dimension of the array dAmgpu. ldda >= max(1,n).

[in,out]

dTmgpu

DOUBLE PRECISION array of pointer, dimension (ngpu) Each point to a DOUBLE PRECISION array on the GPU, dimension n*nb, where nb is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.

[in]

lddt

INTEGER The leading dimension of each array dT. lddt >= max(1,nb).

[in]

ngpu

INTEGER The number of GPUs.

[in]

distblk

INTEGER Internal parameter for performance tuning. The size of the distribution/computation.

[in]

queues

Array of magma_queue_t that point to the queues to be used in execution/communications. Dimension >= max(3, ngpu+1) Queue to execute in.

[in]

nqueue

INTEGER The number of queues should be >= max(3, ngpu+1).

[out]

info

INTEGER

= 0: successful exit

< 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters

[in]

uplo

magma_uplo_t

= MagmaUpper: Upper triangle of A is stored;

= MagmaLower: Lower triangle of A is stored.

[in]

n

INTEGER The order of the matrix A. n >= 0.

[in]

nb

INTEGER The inner blocking. nb >= 0.

[in,out]

A

REAL array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.

INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.

[out]

dT

REAL array on the GPU, dimension N*NB, where NB is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.

[out]

info

INTEGER

= 0: successful exit

< 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5:

This version stores the triangular matrices T used in the accumulated Householder transformations (I - V T V').

Parameters

[in]

uplo

magma_uplo_t

= MagmaUpper: Upper triangle of A is stored;

= MagmaLower: Lower triangle of A is stored.

[in]

n

INTEGER The order of the matrix A. N >= 0.

[in]

nb

INTEGER The inner blocking. nb >= 0.

[in,out]

A

REAL array, dimension (LDA,N) On entry, the symmetric matrix A. If UPLO = MagmaUpper, the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = MagmaLower, the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. On exit, if UPLO = MagmaUpper, the Upper band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements above the band diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors; if UPLO = MagmaLower, the the Lower band-diagonal of A is overwritten by the corresponding elements of the band-diagonal matrix T, and the elements below the band-diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors. See Further Details.

INTEGER The dimension of the array WORK. LWORK >= 1. For optimum performance LWORK >= N*NB, where NB is the optimal blocksize.
If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.

[in,out]

dAmgpu

REAL array of pointer, dimension (ngpu) Each point to a REAL array, dimension (LDDA, nlocal) which hold the local matrix on each GPU.

[in]

ldda

INTEGER The leading dimension of the array dAmgpu. ldda >= max(1,n).

[in,out]

dTmgpu

REAL array of pointer, dimension (ngpu) Each point to a REAL array on the GPU, dimension n*nb, where nb is the optimal blocksize. On exit dT holds the upper triangular matrices T from the accumulated Householder transformations (I - V T V') used in the factorization. The nb x nb matrices T are ordered consecutively in memory one after another.

[in]

lddt

INTEGER The leading dimension of each array dT. lddt >= max(1,nb).

[in]

ngpu

INTEGER The number of GPUs.

[in]

distblk

INTEGER Internal parameter for performance tuning. The size of the distribution/computation.

[in]

queues

Array of magma_queue_t that point to the queues to be used in execution/communications. Dimension >= max(3, ngpu+1) Queue to execute in.

[in]

nqueue

INTEGER The number of queues should be >= max(3, ngpu+1).

[out]

info

INTEGER

= 0: successful exit

< 0: if INFO = -i, the i-th argument had an illegal value

Further Details

If UPLO = MagmaUpper, the matrix Q is represented as a product of elementary reflectors

Q = H(n-1) . . . H(2) H(1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in A(1:i-1,i+1), and tau in TAU(i).

If UPLO = MagmaLower, the matrix Q is represented as a product of elementary reflectors

Q = H(1) H(2) . . . H(n-1).

Each H(i) has the form

H(i) = I - tau * v * v'

where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), and tau in TAU(i).

The contents of A on exit are illustrated by the following examples with n = 5: