On OS X, successfully running said programs typically requires setting the
environment variable DYLD_LIBRARY_PATH to include the location of the
Elemental shared library, e.g., /usr/local/lib. On Unix systems, the
LD_LIBRARY_PATH variable should be set instead.

In order to use Elemental’s python interface, one must also set the PYTHONPATH variable to include the python/ subfolder of the installation, e.g.,
/usr/local/python. The program

Elemental supports a variety of different floating-point types; the typical
order of increasing cost is float, double, El::DoubleDouble,
El::QuadDouble, El::Quad, El::BigFloat. El::DoubleDouble and
El::QuadDouble are essentially concatenations of the mantissas of two and
four doubles (respectively), while El::Quad is an IEEE 128-bit format that
is typically implemented in software, and El::BigFloat is a wrapper for
GNU MPFR’s arbitrary-precision floating-point arithmetic.

The arbitrary-precision floating-point type in Elemental, El::BigFloat,
which currently builds on top of GNU MPFR, defaults to 256-bit but is
configurable at runtime with the routine
voidEl::mpfr::SetPrecision(mpfr_prec_tprecision). With the default
choice of 256 bits, the output of our QueryLimits function is:

A main function for calling this should ideally separately handle exceptions
for each datatype, as it turns out that neither single-precision nor
double-precision can successfully factor 20 x 20 Hilbert matrices:

A cursory comparison of the residual and solution errors relative to machine epsilon reveals that the relative residuals are near optimal but the relative
solution error loses about 26 digits of accuracy. Because the relative residuals
are on the order of machine epsilon, iterative refinement cannot be expected to
yield any improvement.

template<typenameField>// 'Field' can be complex, e.g., El::Complex<double>voidSolveUniform(El::Intn,constEl::Grid&grid){// Since 'Field' could be real or complex, Elemental has a convenience// template for extracting the real 'base' field.typedefEl::Base<Field>Real;if(grid.Rank()==0)El::Output("Attempting to solve uniform system with ",El::TypeName<Field>());// Form an n x n random matrix with i.i.d. entries sampled from the// uniform distribution over the ball of radius 'radius' centered atconstFieldcenter=Field(0);// center samples at the originconstRealradius=Real(1);// sample over ball of radius 1El::DistMatrix<Field>A(grid);El::Uniform(A,n,n,center,radius);// Form a uniform random vectorEl::DistMatrix<Field>x(grid);El::Uniform(x,n,1);constRealxFrob=El::FrobeniusNorm(x);// Form b := A x using a GEneral Matrix Vector (GEMV) product.// The terminology is derived from the BLAS naming convention.El::DistMatrix<Field>b(grid);El::Gemv(El::NORMAL,Field(1),A,x,b);constRealbFrob=El::FrobeniusNorm(b);// Form xComp := inv(A) bEl::DistMatrix<Field>xComp(b);El::LinearSolve(A,xComp);// Form r := b - A xEl::DistMatrix<Field>r(b);El::Gemv(El::NORMAL,Field(-1),A,xComp,Field(1),r);constRealrFrob=El::FrobeniusNorm(r);if(grid.Rank()==0)El::Output("|| r ||_2 / || b ||_2 = ",rFrob/bFrob);// Form e := x - xCompEl::DistMatrix<Field>e(x);e-=xComp;constRealeFrob=El::FrobeniusNorm(e);if(grid.Rank()==0){El::Output("|| e ||_2 / || x ||_2 = ",eFrob/xFrob);El::Output("");}}

An extremely basic (especially in the sense of not configuring the process grid)
example of driving the above routine for the suite of datatypes is given below.
Unlike the previous example, there is no need to wrap every single test with an
try/catch block since we are not attempting to solve pathologically
ill-conditioned systems.

intmain(intargc,char*argv[]){El::Environmentenv(argc,argv);try{constEl::Intn=El::Input("--n","matrix width",20);#ifdef EL_HAVE_MPCmpfr_prec_tprec=El::Input("--prec","MPFR precision",256);#endifEl::ProcessInput();#ifdef EL_HAVE_MPCEl::mpfr::SetPrecision(prec);#endif// While this is equivalent to the default constructor, it is a// middle-of-the-road example of building a process grid. If// performance is important, passing in a good height for the process// grid (and ensuring a good mapping of processes to your network)// is highly recommended.constEl::Gridgrid(El::mpi::COMM_WORLD);SolveUniform<float>(n,grid);SolveUniform<El::Complex<float>>(n,grid);SolveUniform<double>(n,grid);SolveUniform<El::Complex<double>>(n,grid);#ifdef EL_HAVE_QDSolveUniform<El::DoubleDouble>(n,grid);SolveUniform<El::Complex<El::DoubleDouble>>(n,grid);SolveUniform<El::QuadDouble>(n,grid);SolveUniform<El::Complex<El::QuadDouble>>(n,grid);#endif#ifdef EL_HAVE_QUADSolveUniform<El::Quad>(n,grid);SolveUniform<El::Complex<El::Quad>>(n,grid);#endif#ifdef EL_HAVE_MPCSolveUniform<El::BigFloat>(n,grid);SolveUniform<El::Complex<El::BigFloat>>(n,grid);#endif}catch(std::exception&e){El::ReportException(e);}return0;}

Elemental provides sequential and distributed implementations (for its suite
of datatypes) for Cholesky factorizations and quadratic-time low-rank
factorization updates and downdates. The following shows a generic routine
for exercising these interfaces in a sequential environment:

Elemental provides sequential and distributed implementations of Gaussian
Elimination with no pivoting, partial pivoting, and full pivoting, as well as
quadratic-time algorithms for performing low-rank modifications of an existing
factorization.

Note

The 0.87.5 release only supported a rank-one LU update interface, while
0.87.6 supports a higher-rank interfaces that calls the rank-one interface
in a loop. This approach will be replaced with a (much faster) single-pass
low-rank updating scheme in future releases.

Running this for float and double on a single core should for 4000 x 4000
matrices and rank-two updates on a recent Macbook Pro with Accelerate and
VECLIB_MAXIMUM_THREADS=1 should yield results similar to:

template<typenameField>voidTestSVD(El::Intm,El::Intn,El::Intrank){El::Output("Testing SVD with ",El::TypeName<Field>());typedefEl::Base<Field>Real;// Form a random rank-r m x n matrix A := X YEl::Matrix<Field>A;El::Matrix<Field>X,Y;El::Uniform(X,m,rank);El::Uniform(Y,n,rank);El::Gemm(El::NORMAL,El::NORMAL,Field(1),X,Y,A);// Decide on the details behind the SVD solverEl::SVDCtrl<Real>ctrl;ctrl.bidiagSVDCtrl.useQR=false;// This is the default option// We can choose between El::THIN_SVD, El::COMPACT_SVD, El::FULL_SVD,// and El::PRODUCT_SVD. A 'compact' SVD is most appropriate when the rank// is less than the smallest dimension. 'Thin' is the default.ctrl.bidiagSVDCtrl.approach=El::COMPACT_SVD;// We can optionally request more accurate (but slower) internal QR// iterations at the base of our Divide and Conquer tree. The default is// El::RELATIVE_TO_MAX_SING_VAL_TOL.ctrl.bidiagSVDCtrl.tolType=El::RELATIVE_TO_SELF_SING_VAL_TOL;El::Matrix<Real>s;El::Matrix<Field>U,V;El::SVDInfoinfo=El::SVD(A,U,s,V,ctrl);El::Output("Number of secular equation iterations: ",info.bidiagSVDInfo.dcInfo.secularInfo.numIterations);El::Output("Number of secular equation deflations: ",info.bidiagSVDInfo.dcInfo.secularInfo.numDeflations);El::Output("Number of internal QR iterations: ",info.bidiagSVDInfo.qrInfo.numIterations);}

but perhaps the simplest is via Elemental’s Python interface. For example,
the following Python script computes and visualizes the pseudospectra of the
famous Fox-Li/Landau matrices (of dimension 50, 100, and 300) over a 300 x 300
uniform grid.

importmath,time,ElifEl.havePyPlot:El.plt.set_cmap('bone')# pyplot.set_cmap seems to open an empty figure (which can be detected by# running pyplot.get_fignums()), and so we manually close it. Unfortunately,# some versions of pyplot generate a spurious warning of:## can't invoke "event" command: application has been destroyed## when calling pyplot.close().El.plt.close(1)realRes=imagRes=300# grid resolution# Display an instance of the Fox-Li/Landau matrixA=El.DistMatrix(El.zTag)nList=(50,100,300)forninnList:El.FoxLi(A,n)# Show the Real part of the matrixAReal=El.DistMatrix(El.dTag)El.RealPart(A,AReal)El.Display(AReal,'Real part of Fox-Li matrix (n={})'.format(n))# Compute the Schur decomposition (overwriting A with the Schur factor)schurStart=time.time()w=El.Schur(A)schurStop=time.time()ifA.Grid().Rank()is0:print('Schur decomp for n={}: {} [sec]'.format(n,schurStop-schurStart,))# Compute the spectral portrait of the Schur factorportraitStart=time.time()portrait,box=El.TriangularSpectralPortrait(A,realRes,imagRes)portraitStop=time.time()ifA.Grid().Rank()is0:print('Portrait for n={}: {} [sec]'.format(n,portraitStop-portraitStart,))# Display the eigenvalues on top of log10 plot of portraitEl.DisplayPortrait(portrait,box,title='Fox-Li portrait (n={})'.format(n),eigvals=w)El.Finalize()

An equivalent C++ program should should call the routines El::Schur and El::TriangularSpectralPortrait.

Some of the output of the above script includes the visualization of the real
part of the 50 x 50 discretization:

as well as its pseudospectra:

Similar pseudospectra are produced for the 100 x 100 and 300 x 300
discretizations:

where \(\mathcal{K}\) is a product of Second-Order and Positive Orthant
cones. Since Elemental supports the gamut of floating-point types from float
up to El::BigFloat, one can solve Second-Order Cone Programs to 1000 digits
of accuracy (should one so desire).

Elemental has fairly unique support for solving complex-valued Shortest Vector
Problems. For example we can search for a short vector (via BKZ(15)) in a
knapsack-type basis of dimension 20 with a bottom row uniformly sampled within
the unit ball of radius one million via:

One can also easily solve
SVPChallenge 40 via Elemental’s BKZ.
(Though it is important to note that Elemental does not yet support the
important aggressive precision-dropping optimization of libraries such as
NTL and
FPLLL).

Note

There is no agreed-upon convention for whether lattices should be represented
as collections of column or row vectors. Elemental adopts the convention of
column vectors, whereas the SVP Challenge uses row vectors. For this reason,
in addition to [ and ] symbols needing to be removed from the above link,
the matrix is transposed in the code below.

El::Matrix<Real>BTrans;El::Read(BTrans,"SVPChallenge-40.txt");El::Matrix<Real>B;El::Transpose(BTrans,B);El::Matrix<Real>R;// Will hold the 'R' of the QR factorization of final BEl::BKZCtrl<Real>ctrl;// One can modify the members to customize BKZctrl.blocksize=20;// Enumerate over windows of 20 vectors at a timeautoinfo=El::BKZ(B,R,ctrl);