m_slk/compute_eigen1 [ Functions ]

INPUTS

comm= MPI communicator
cplex=1 if the matrix is real, 2 if it is complex
nbli_global= number of rows of the global matrix
nbco_global= number of columns of the global matrix
matrix= the matrix to process
istwf_k= option parameter that describes the storage of the wavefunctions

OUTPUT

vector= eigenvalues of the matrix

SIDE EFFECTS

results= ScaLAPACK matrix coming out of the operation
eigen= eigenvalues of the matrix

m_slk/matrix_set_local_cplx [ Functions ]

Sets a local matrix coefficient of complex type, addressing the component through its local indices.

m_slk/my_locc [ Functions ]

Let K be the number of rows or columns of a distributed matrix, and assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process would receive if K were distributed over the p
processes of its process column.
Similarly, LOCc( K ) denotes the number of elements of K that a process would receive if K were distributed over
the q processes of its process row.
The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
An upper bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
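The NUMROC logic and the bound above can be sketched in plain Python (a simplified, zero-based model of the ScaLAPACK tool function, for illustration only; `numroc` and `locr_upper_bound` are hypothetical names):

```python
import math

def numroc(n, nb, iproc, isrc, nprocs):
    """Number of rows/columns of a dimension of size n, distributed
    block-cyclically with block size nb, owned by process iproc when
    the first block resides on process isrc (simplified NUMROC model)."""
    # distance of this process from the one holding the first block
    mydist = (nprocs + iproc - isrc) % nprocs
    nblocks = n // nb                  # number of full blocks
    nrc = (nblocks // nprocs) * nb     # whole block cycles owned by everyone
    extra = nblocks % nprocs           # leftover full blocks
    if mydist < extra:
        nrc += nb                      # this process gets one more full block
    elif mydist == extra:
        nrc += n % nb                  # this process gets the trailing partial block
    return nrc

def locr_upper_bound(m, mb, nprow):
    """Upper bound quoted above: LOCr(M) <= ceil(ceil(M/MB_A)/NPROW)*MB_A."""
    return math.ceil(math.ceil(m / mb) / nprow) * mb
```

For example, a dimension of 10 split in blocks of 2 over 2 processes gives 6 and 4 local elements, both below the bound of 6.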

m_slk/my_locr [ Functions ]

Let K be the number of rows or columns of a distributed matrix, and assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process would receive if K were distributed over the p
processes of its process column.
Similarly, LOCc( K ) denotes the number of elements of K that a process would receive if K were distributed over
the q processes of its process row.
The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
An upper bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A

glob_pmat(n*(n+1)/2)=One-dimensional array containing the global matrix A packed columnwise in a linear array.
The j-th column of A is stored in the array glob_pmat as follows:
if uplo = "U", glob_pmat(i + (j-1)*j/2) = A(i,j) for 1<=i<=j;
if uplo = "L", glob_pmat(i + (j-1)*(2*n-j)/2) = A(i,j) for j<=i<=n.
where n is the number of rows or columns in the global matrix.
uplo=String specifying whether only the upper or lower triangular part of the global matrix is used:
= "U": Upper triangular
= "L": Lower triangular

Slk_mat<matrix_scalapack>=ScaLAPACK matrix (matrix A)
Slk_vec<matrix_scalapack>=The distributed eigenvectors X. Not referenced if JOBZ="N"
JOBZ (global input) CHARACTER*1
Specifies whether or not to compute the eigenvectors:
= "N": Compute eigenvalues only.
= "V": Compute eigenvalues and eigenvectors.
RANGE (global input) CHARACTER*1
= "A": all eigenvalues will be found.
= "V": all eigenvalues in the interval [VL,VU] will be found.
= "I": the IL-th through IU-th eigenvalues will be found.
UPLO (global input) CHARACTER*1
Specifies whether the upper or lower triangular part of the Hermitian matrix A is stored:
= "U": Upper triangular
= "L": Lower triangular
VL (global input) DOUBLE PRECISION
If RANGE="V",the lowerbound of the interval to be searched for eigenvalues. Not referenced if RANGE =
"A" or "I"
VU (global input) DOUBLE PRECISION
If RANGE="V", the upperbound of the interval to be searched for eigenvalues. Not referenced if RANGE =
"A" or "I".
IL (global input) integer
If RANGE="I", the index (from smallest to largest) of the smallest eigenvalue to be returned. IL >= 1.
Not referenced if RANGE = "A" or "V".
IU (global input) integer
If RANGE="I", the index (from smallest to largest) of the largest eigenvalue to be returned. min(IL,N) <=
IU <= N. Not referenced if RANGE = "A" or "V"
ABSTOL (global input) DOUBLE PRECISION
If JOBZ="V", setting ABSTOL to PDLAMCH( CONTEXT, "U") yields the most orthogonal eigenvectors.
The absolute error tolerance for the eigenvalues. An approximate eigenvalue is accepted as converged when
it is determined to lie in an interval [a,b] of width less than or equal to
ABSTOL + EPS * max( |a|,|b| ) ,
where EPS is the machine precision. If ABSTOL is less than or equal to zero, then EPS*norm(T) will be used
in its place, where norm(T) is the 1-norm of the tridiagonal matrix obtained by reducing A to tridiagonal form.
Eigenvalues will be computed most accurately when ABSTOL is set to twice the underflow threshold
2*PDLAMCH("S") not zero. If this routine returns with ((MOD(INFO,2).NE.0) .OR. (MOD(INFO/8,2).NE.0)),
indicating that some eigenvalues or eigenvectors did not converge, try setting ABSTOL to 2*PDLAMCH("S").
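The acceptance criterion for an approximate eigenvalue can be sketched as follows (Python; `eigenvalue_converged` is a hypothetical helper mirroring the formula above, not a routine of the library):

```python
import sys

def eigenvalue_converged(a, b, abstol, norm_t, eps=sys.float_info.epsilon):
    """Acceptance test described above: an approximate eigenvalue
    bracketed in [a, b] is accepted when the interval width is at most
    ABSTOL + EPS*max(|a|, |b|); a non-positive ABSTOL is replaced by
    EPS*norm(T), the 1-norm of the tridiagonal reduction of A."""
    tol = abstol if abstol > 0.0 else eps * norm_t
    return (b - a) <= tol + eps * max(abs(a), abs(b))
```

A wide bracket such as [0, 1] is rejected for a small tolerance, while a bracket narrower than ABSTOL is accepted.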

OUTPUT

mene_found= (global output) Total number of eigenvalues found. 0 <= mene_found <= N.
eigen(N)= (global output) Eigenvalues of A, where N is the dimension of the matrix.
On normal exit, the first mene_found entries contain the selected eigenvalues in ascending order.

SIDE EFFECTS

If JOBZ="V", the local buffer Slk_vec%buffer_cplx will contain part of the distributed eigenvectors.
Slk_mat<matrix_scalapack>=
%buffer_cplx is destroyed when the routine returns.

Slk_matA<matrix_scalapack>=ScaLAPACK matrix (matrix A)
Slk_matB<matrix_scalapack>=ScaLAPACK matrix (matrix B)
Slk_vec<matrix_scalapack>=The distributed eigenvectors X. Not referenced if JOBZ="N"
IBtype (global input) integer
Specifies the problem type to be solved:
= 1: sub( A )*x = (lambda)*sub( B )*x
= 2: sub( A )*sub( B )*x = (lambda)*x
= 3: sub( B )*sub( A )*x = (lambda)*x
JOBZ (global input) CHARACTER*1
Specifies whether or not to compute the eigenvectors:
= "N": Compute eigenvalues only.
= "V": Compute eigenvalues and eigenvectors.
RANGE (global input) CHARACTER*1
= "A": all eigenvalues will be found.
= "V": all eigenvalues in the interval [VL,VU] will be found.
= "I": the IL-th through IU-th eigenvalues will be found.
UPLO (global input) CHARACTER*1
Specifies whether the upper or lower triangular part of the Hermitian matrix sub(A) and sub(B) is stored:
= "U": Upper triangular
= "L": Lower triangular
VL (global input) DOUBLE PRECISION
If RANGE="V",the lowerbound of the interval to be searched for eigenvalues. Not referenced if RANGE =
"A" or "I"
VU (global input) DOUBLE PRECISION
If RANGE="V", the upperbound of the interval to be searched for eigenvalues. Not referenced if RANGE =
"A" or "I".
IL (global input) integer
If RANGE="I", the index (from smallest to largest) of the smallest eigenvalue to be returned. IL >= 1.
Not referenced if RANGE = "A" or "V".
IU (global input) integer
If RANGE="I", the index (from smallest to largest) of the largest eigenvalue to be returned. min(IL,N) <=
IU <= N. Not referenced if RANGE = "A" or "V"
ABSTOL (global input) DOUBLE PRECISION
If JOBZ="V", setting ABSTOL to PDLAMCH( CONTEXT, "U") yields the most orthogonal eigenvectors.
The absolute error tolerance for the eigenvalues. An approximate eigenvalue is accepted as converged when
it is determined to lie in an interval [a,b] of width less than or equal to
ABSTOL + EPS * max( |a|,|b| ) ,
where EPS is the machine precision. If ABSTOL is less than or equal to zero, then EPS*norm(T) will be used
in its place, where norm(T) is the 1-norm of the tridiagonal matrix obtained by reducing A to tridiagonal form.
Eigenvalues will be computed most accurately when ABSTOL is set to twice the underflow threshold
2*PDLAMCH("S") not zero. If this routine returns with ((MOD(INFO,2).NE.0) .OR. (MOD(INFO/8,2).NE.0)),
indicating that some eigenvalues or eigenvectors did not converge, try setting ABSTOL to 2*PDLAMCH("S").

OUTPUT

mene_found= (global output) Total number of eigenvalues found. 0 <= mene_found <= N.
eigen(N)= (global output) Eigenvalues of A, where N is the dimension of the matrix.
On normal exit, the first mene_found entries contain the selected eigenvalues in ascending order.

SIDE EFFECTS

Slk_vec<matrix_scalapack>:
%buffer_cplx local output (global dimension (N,N))
If JOBZ = 'V', then on normal exit the first M columns of Z
contain the orthonormal eigenvectors of the matrix
corresponding to the selected eigenvalues.
If JOBZ = 'N', then Z is not referenced.
Slk_matA<matrix_scalapack>:
%buffer_cplx
(local input/local output) complex(DPC) pointer into the
local memory to an array of dimension (LLD_A, LOCc(JA+N-1)).
On entry, this array contains the local pieces of the
N-by-N Hermitian distributed matrix sub( A ). If UPLO = 'U',
the leading N-by-N upper triangular part of sub( A ) contains
the upper triangular part of the matrix. If UPLO = 'L', the
leading N-by-N lower triangular part of sub( A ) contains
the lower triangular part of the matrix.
On exit, if JOBZ = 'V', then if INFO = 0, sub( A ) contains
the distributed matrix Z of eigenvectors. The eigenvectors
are normalized as follows:
if IBtype = 1 or 2, Z**H*sub( B )*Z = I;
if IBtype = 3, Z**H*inv( sub( B ) )*Z = I.
If JOBZ = 'N', then on exit the upper triangle (if UPLO='U')
or the lower triangle (if UPLO='L') of sub( A ), including
the diagonal, is destroyed.
Slk_matB=
%buffer_cplx
(local input/local output) complex(DPC) pointer into the
local memory to an array of dimension (LLD_B, LOCc(JB+N-1)).
On entry, this array contains the local pieces of the
N-by-N Hermitian distributed matrix sub( B ). If UPLO = 'U',
the leading N-by-N upper triangular part of sub( B ) contains
the upper triangular part of the matrix. If UPLO = 'L', the
leading N-by-N lower triangular part of sub( B ) contains
the lower triangular part of the matrix.
On exit, if INFO <= N, the part of sub( B ) containing the
matrix is overwritten by the triangular factor U or L from
the Cholesky factorization sub( B ) = U**H*U or
sub( B ) = L*L**H.

m_slk/slk_read [ Functions ]

Routine to read a square scaLAPACK distributed matrix from an external file using MPI-IO.

INPUTS

uplo=String specifying whether only the upper or lower triangular part of the global matrix is stored on disk:
= "U": Upper triangular is stored
= "L": Lower triangular is stored
= "A": Full matrix (used for general complex matrices)
symtype=Symmetry type of the matrix stored on disk (used only if uplo = "L" or "A").
= "H" for Hermitian matrix
= "S" for symmetric matrix.
= "N" if matrix has no symmetry (not compatible with uplo="L" or uplo="U".
is_fortran_file=.FALSE. if a C stream is used, .TRUE. for Fortran binary files.
[fname]= Mutually exclusive with mpi_fh. The name of the external file from which the matrix will be read.
The file is opened and closed inside the routine with the MPI flags specified by flags.
[mpi_fh]=File handler associated to the file (already open in the caller). Not compatible with fname.
[flags]=MPI-IO flags used to open the file in MPI_FILE_OPEN. Default is MPI_MODE_RDONLY. Referenced only when fname is used.

SIDE EFFECTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK distribution with the local buffer
supposed to be allocated.
%buffer_cplx=Local buffer containing the distributed matrix read from the external file.
If fname is present then the file is opened and closed inside the routine. Any exception is fatal.
[offset]=
input: Offset used to access the content of the file. Default is zero.
output: New offset incremented with the byte size of the matrix that has been read (Fortran
markers are included if is_fortran_file=.TRUE.)
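The offset bookkeeping can be sketched as follows (Python; `advance_offset` is a hypothetical helper, and the single-record layout with 4-byte markers is an assumption made for illustration, the real record structure is fixed by the writing routine):

```python
def advance_offset(offset, nrows, ncols, bsize_elm, is_fortran_file,
                   bsize_frm=4):
    """Sketch of the offset update described above: the offset grows
    by the byte size of the matrix, plus the leading and trailing
    Fortran record markers when the file is a Fortran binary
    (assuming one record for the whole matrix and 4-byte markers)."""
    nbytes = nrows * ncols * bsize_elm
    if is_fortran_file:
        nbytes += 2 * bsize_frm   # opening + closing record marker
    return offset + nbytes
```

For a 10x10 double-complex matrix (16 bytes per element) the offset advances by 1600 bytes for a C stream and by 1608 bytes for a Fortran binary under these assumptions.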

TODO

* Generalize the implementation adding the reading of the real buffer.

m_slk/slk_single_fview_read [ Functions ]

Return an MPI datatype that can be used to read a scaLAPACK distributed matrix from
a binary file using MPI-IO.

INPUTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK distribution with the local buffer.
uplo=String specifying whether only the upper or lower triangular part of the global matrix is stored on disk:
= "U": Upper triangular is stored
= "L": Lower triangular is stored
= "A": Full matrix (used for general complex matrices)
[is_fortran_file]=.FALSE. if a C stream is used, .TRUE. for Fortran binary files
with record markers. In this case etype is set to xmpio_mpi_type_frm provided that
the mpi_type of the matrix element is commensurate with xmpio_mpi_type_frm. Defaults to .TRUE.

OUTPUT

etype=Elementary data type (handle) defining the elementary unit used to access the file.
slk_type=New MPI type that can be used to instantiate the MPI-IO view for the Fortran file.
Note that the view assumes that the file pointer points to the FIRST Fortran record marker.
offset_err=Error code. A non-zero value signals that the global matrix is too large
for a single MPI-IO access (see notes below).

NOTES

With (signed) Fortran integers, the maximum size of the file that
can be read in one shot is around 2 GB when etype is set to byte.
Using a larger etype might create portability problems (real data on machines using
integer*16 for the marker) since etype must be a multiple of the Fortran record marker.
Due to the above reason, block_displ is given in bytes and must be stored in an integer
of kind XMPI_ADDRESS_KIND. If the displacement is too large, the routine returns
offset_err=1 so that the caller will know that several MPI-IO reads are needed to
read the local buffer.
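The overflow condition behind offset_err can be sketched as follows (Python; assuming XMPI_ADDRESS_KIND corresponds to a signed 32-bit integer and counting payload bytes only, record markers excluded):

```python
INT32_MAX = 2**31 - 1  # largest signed 32-bit displacement

def view_offset_fits(nrows_glob, ncols_glob, bsize_elm):
    """Sketch of the overflow test implied above: the largest byte
    displacement needed by the view must fit in a signed 32-bit
    integer, otherwise the caller must split the access into
    several MPI-IO reads (offset_err=1)."""
    max_displ = nrows_glob * ncols_glob * bsize_elm
    return max_displ <= INT32_MAX
```

A 10000x10000 double-complex matrix (1.6 GB) still fits, while 20000x20000 (6.4 GB) does not.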

m_slk/slk_single_fview_read_mask [ Functions ]

Return an MPI datatype that can be used to read a scaLAPACK distributed matrix from
a binary file using MPI-IO. The view is created using the user-defined mask function
mask_of_glob. The storage of the data on file is described via the user-defined function offset_of_glob.

INPUTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK matrix.
mask_of_glob(row_glob,col_glob,size_glob) is an integer function that receives the
global indices of the matrix element; size_glob(1:2) are the global dimensions.
It returns 0 if (row_glob,col_glob) should not be read.
offset_of_glob(row_glob,col_glob,size_glob,nsblocks,sub_block,bsize_elm,bsize_frm)
nsblocks=Number of sub-blocks (will be passed to offset_of_glob)
sub_block(2,2,nsblocks)=Global coordinates of the extremal points delimiting the sub-blocks,
e.g. sub_block(:,1,1) gives the coordinates of the left upper corner of the first block.
sub_block(:,2,1) gives the coordinates of the right lower corner of the first block.
[is_fortran_file]=.FALSE. if a C stream is used. Set to .TRUE. for Fortran binary files
with record markers.
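A user-supplied pair of functions might look like this sketch (Python; both names and the simplified offset signature are illustrative, the real offset_of_glob also receives nsblocks, sub_block, bsize_elm and bsize_frm):

```python
def lower_triangle_mask(row_glob, col_glob, size_glob):
    """Example of a user-defined mask_of_glob: keep only the lower
    triangle of the global matrix. Returns 0 for elements that must
    not be read, nonzero otherwise."""
    return 1 if row_glob >= col_glob else 0

def column_major_offset(row_glob, col_glob, size_glob, bsize_elm):
    """Hypothetical offset function for a full column-major matrix
    stored without record markers: byte offset of element
    (row_glob, col_glob), with 1-based global indices."""
    nrow = size_glob[0]
    return ((col_glob - 1) * nrow + (row_glob - 1)) * bsize_elm
```

The mask selects which global elements a node reads into its local buffer, while the offset function tells the view where each selected element lives on disk.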

OUTPUT

my_nel=Number of elements that will be read by this node.
etype=Elementary data type (handle) defining the elementary unit used to access the file.
This is the elementary type that must be used to create the view (MPI_BYTE is used).
slk_type=New MPI type that can be used to instantiate the MPI-IO view for the Fortran file.
Note that the view assumes that the file pointer points to the FIRST Fortran record marker.
offset_err=Error code. A returned non-zero value signals that the global matrix is too large
for a single MPI-IO access. See notes in other slk_single_fview_* routines.

m_slk/slk_single_fview_write [ Functions ]

Returns an MPI datatype that can be used to write a scaLAPACK distributed matrix to
a binary file using MPI-IO.

INPUTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK distribution with the local buffer.
uplo=String specifying whether only the upper or lower triangular part of the global matrix is stored on disk:
= "U": Upper triangular is stored
= "L": Lower triangular is stored
= "A": Full matrix (used for general complex matrices)
[is_fortran_file]=.FALSE. if a C stream is used, .TRUE. for writing Fortran binary files
with record markers. In this case etype is set to xmpio_mpi_type_frm provided that
the mpi_type of the matrix element is commensurate with xmpio_mpi_type_frm. Defaults to .TRUE.
glob_subarray(2,2) = Used to select the subarray of the global matrix. Used only when uplo="All"
glob_subarray(:,1)=starting global coordinates of the subarray in each dimension (array of nonnegative integers >=1, <=array_of_sizes)
glob_subarray(:,2)=Number of elements in each dimension of the subarray (array of positive integers)

OUTPUT

nelw=Number of elements to be written.
etype=Elementary data type (handle) defining the elementary unit used to access the file.
slk_type=New MPI type that can be used to instantiate the MPI-IO view for the Fortran file.
Note that the view assumes that the file pointer points to the FIRST Fortran record marker.
offset_err=Error code. A non-zero value signals that the global matrix is too large
for a single MPI-IO access (see notes below).

SIDE EFFECTS

elw2slk(:,:) =
input: pointer to null().
output: elw2slk(2,nelw) contains the local coordinates of the matrix elements to be written.
(useful only if the upper or lower triangle of the global matrix has to be written, or when
uplo="all" but a global subarray is written).

NOTES

With (signed) Fortran integers, the maximum size of the file that
can be written in one shot is around 2 GB when etype is set to byte.
Using a larger etype might create portability problems (real data on machines using
integer*16 for the marker) since etype must be a multiple of the Fortran record marker.
Due to the above reason, block_displ is given in bytes and must be stored in a Fortran
integer of kind XMPI_ADDRESS_KIND. If the displacement is too large, the routine returns
offset_err=1 so that the caller will know that several MPI-IO writes are needed to
write the local buffer.

m_slk/slk_symmetrize [ Functions ]

uplo=String specifying whether only the upper or lower triangular part of the global matrix has been read
= "U": Upper triangular has been read.
= "L": Lower triangular has been read.
= "A": Full matrix (used for general complex matrices)
symtype=Symmetry type of the matrix (used only if uplo = "L" or "A").
= "H" for Hermitian matrix
= "S" for symmetric matrix.
= "N" if matrix has no symmetry (not compatible with uplo="L" or uplo="U".

SIDE EFFECTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK distribution with the local buffer
supposed to be allocated.
%buffer_cplx=Local buffer containing the distributed matrix read from the external file.

m_slk/slk_write [ Functions ]

Routine to write a square scaLAPACK distributed matrix to an external file using MPI-IO.

INPUTS

Slk_mat<matrix_scalapack>=Structured datatype defining the scaLAPACK distribution with the local buffer
containing the distributed matrix.
uplo=String specifying whether only the upper or lower triangular part of the global matrix is used:
= "U": Upper triangular
= "L": Lower triangular
= "A": Full matrix (used for general complex matrices)
is_fortran_file=.FALSE. if a C stream is used, .TRUE. for writing Fortran binary files.
[fname]= Mutually exclusive with mpi_fh. The name of the external file on which the matrix will be written.
The file is opened and closed inside the routine with the MPI flags specified by flags.
[mpi_fh]=File handler associated to the file (already open in the caller). Not compatible with fname.
[flags]=MPI-IO flags used to open the file in MPI_FILE_OPEN.
Default is MPI_MODE_CREATE + MPI_MODE_WRONLY + MPI_MODE_EXCL.
[glob_subarray(2,2)] = Used to select the subarray of the global matrix. Used only when uplo="All"
NOTE that each node should call the routine with the same value.
glob_subarray(:,1)=starting global coordinates of the subarray in each dimension (array of nonnegative integers >=1, <=array_of_sizes)
glob_subarray(:,2)=Number of elements in each dimension of the subarray (array of positive integers)

OUTPUT

Only writing is performed: the global scaLAPACK matrix is written to the file fname.
If fname is present, the file is opened and closed inside the routine. Any exception is fatal.

SIDE EFFECTS

[offset]=
input: Offset used to access the content of the file. Default is zero.
output: New offset incremented with the byte size of the matrix that has been written (Fortran
markers are included if is_fortran_file=.TRUE.)

Slk_mat<type(matrix_scalapack)>=The object storing the local buffer, the array descriptor, the context
and other quantities needed to call ScaLAPACK routines.
On entry, this array contains the local pieces of the N-by-N Hermitian distributed matrix sub( A ) to be factored.
If UPLO = 'U', the leading N-by-N upper triangular part of sub( A ) contains the upper triangular part of the matrix,
and its strictly lower triangular part is not referenced.
If UPLO = 'L', the leading N-by-N lower triangular part of sub( A ) contains the lower triangular part of the
distributed matrix, and its strictly upper triangular part is not referenced.
On exit, the local pieces of the upper or lower triangle of the (Hermitian) inverse of sub( A ).

m_slk/slk_zinvert [ Functions ]

slk_zinvert provides an object-oriented interface to the ScaLAPACK routines used to compute
the inverse of a complex matrix in double precision.

SIDE EFFECTS

Slk_mat<type(matrix_scalapack)>=The object storing the local buffer, the array descriptor, the context,
and other quantities needed to call ScaLAPACK routines. On input, the matrix to invert; on output,
the inverted matrix distributed among the nodes.