Singular Value Decomposition

Let A be a general real m-by-n matrix. The singular value
decomposition (SVD) of A is the factorization $A = U \Sigma V^T$,
where U and V are orthogonal, and
$\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, $r = \min(m, n)$,
with $\sigma_1 \geq \cdots \geq \sigma_r \geq 0$.
If A is complex, then its SVD is $A = U \Sigma V^H$,
where U and V are unitary, and $\Sigma$ is as before with real
diagonal elements.
The $\sigma_i$ are called the singular values,
the first r columns of V the right singular vectors, and
the first r columns of U the left singular vectors.
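
The definition can be checked numerically. The following is a minimal sketch in Python with NumPy (an assumption of this example, not something the routines in this section require); the matrix size and the random seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# full_matrices=True returns the full orthogonal factors:
# U is m-by-m, Vt (that is, V^T) is n-by-n, s holds the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = min(m, n)

# Embed the r singular values in an m-by-n matrix Sigma.
Sigma = np.zeros((m, n))
Sigma[:r, :r] = np.diag(s)

# A = U Sigma V^T, with U and V orthogonal.
reconstruction_error = np.linalg.norm(A - U @ Sigma @ Vt)
orthogonality_error_U = np.linalg.norm(U.T @ U - np.eye(m))
orthogonality_error_V = np.linalg.norm(Vt @ Vt.T - np.eye(n))

# The singular values are nonnegative and sorted in nonincreasing order.
is_sorted = bool(np.all(s[:-1] >= s[1:]))
```

All three error norms should be at the level of rounding error, and the first r columns of U and of V (the rows of Vt) are the left and right singular vectors.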

The routines described in this section, and listed in Table 2.12,
are used to compute this decomposition.
The computation proceeds in the following stages:

1.

The matrix A is reduced to bidiagonal
form: $A = U_1 B V_1^T$ if
A is real ($A = U_1 B V_1^H$ if A is complex), where $U_1$ and $V_1$
are orthogonal (unitary if A is complex), and B is real and
upper bidiagonal when $m \geq n$
and lower bidiagonal when $m < n$, so
that B is nonzero only on the main diagonal and either on the first
superdiagonal (if $m \geq n$)
or the first subdiagonal (if $m < n$).

2.

The SVD of the bidiagonal matrix B is computed:
$B = U_2 \Sigma V_2^T$,
where $U_2$ and $V_2$ are orthogonal and
$\Sigma$ is diagonal as described
above. The singular vectors of A are then $U = U_1 U_2$ and
$V = V_1 V_2$.
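
The two stages can be sketched as follows for a real matrix with m >= n. This is an illustrative Python/NumPy sketch, not the LAPACK implementation: the helper names `householder` and `bidiagonalize` are hypothetical, the reflectors are accumulated into explicit matrices for clarity, and `numpy.linalg.svd` stands in for the bidiagonal SVD routine of stage 2.

```python
import numpy as np

def householder(x):
    """Return (v, tau) with (I - tau v v^T) x = alpha e_1 and |alpha| = ||x||."""
    v = x.astype(float).copy()
    norm_x = np.linalg.norm(x)
    if norm_x == 0.0:
        return v, 0.0                      # nothing to annihilate
    alpha = -np.copysign(norm_x, x[0])     # sign chosen to avoid cancellation
    v[0] -= alpha
    return v, 2.0 / (v @ v)

def bidiagonalize(A):
    """Stage 1 (m >= n): return U1, B, V1 with A = U1 @ B @ V1.T, B upper bidiagonal."""
    m, n = A.shape
    B = A.astype(float).copy()
    U1, V1 = np.eye(m), np.eye(n)
    for k in range(n):
        # Left reflector: zero the entries below the diagonal in column k.
        v, tau = householder(B[k:, k])
        B[k:, :] -= tau * np.outer(v, v @ B[k:, :])
        U1[:, k:] -= tau * np.outer(U1[:, k:] @ v, v)
        if k < n - 2:
            # Right reflector: zero the entries right of the superdiagonal in row k.
            v, tau = householder(B[k, k + 1:])
            B[:, k + 1:] -= tau * np.outer(B[:, k + 1:] @ v, v)
            V1[:, k + 1:] -= tau * np.outer(V1[:, k + 1:] @ v, v)
    return U1, B, V1

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Stage 1: reduction to bidiagonal form.
U1, B, V1 = bidiagonalize(A)

# Stage 2: SVD of B (NumPy's general SVD used here as a stand-in).
U2, s, V2t = np.linalg.svd(B, full_matrices=True)

# Singular vectors of A: U = U1 U2 and V = V1 V2.
U = U1 @ U2
V = V1 @ V2t.T

Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
residual = np.linalg.norm(A - U @ Sigma @ V.T)
```

The residual should be at rounding-error level, and the entries of B outside the main diagonal and first superdiagonal should be negligible.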

The reduction to bidiagonal form is performed by the subroutine xGEBRD,
or by xGBBRD
for a band matrix.

The routine xGEBRD represents
$U_1$ and $V_1$ in factored form as products of elementary reflectors,
as described in section 5.4.
If A is real,
the matrices $U_1$ and $V_1$ may be computed explicitly using routine xORGBR,
or multiplied by other matrices without forming $U_1$ and $V_1$ using routine xORMBR.
If A is complex, one instead uses xUNGBR
and xUNMBR, respectively.
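
The difference between forming an orthogonal factor explicitly and applying it in factored form can be illustrated with a short Python/NumPy sketch. It assumes reflectors of the form H_i = I - tau_i v_i v_i^T stored as full-length vectors; the function names are hypothetical, and the storage scheme is a simplification rather than the layout used by xGEBRD.

```python
import numpy as np

def apply_reflectors(vs, taus, C):
    """Compute (H_1 H_2 ... H_k) @ C without forming the product of reflectors."""
    C = C.astype(float).copy()
    # The rightmost reflector acts on C first, so loop in reverse order.
    for v, tau in zip(reversed(vs), reversed(taus)):
        C -= tau * np.outer(v, v @ C)      # H C = C - tau v (v^T C)
    return C

def form_explicitly(vs, taus, m):
    """Form Q = H_1 H_2 ... H_k as a dense m-by-m matrix."""
    Q = np.eye(m)
    for v, tau in zip(vs, taus):
        Q = Q @ (np.eye(m) - tau * np.outer(v, v))
    return Q

rng = np.random.default_rng(0)
m, k = 5, 3
vs = [rng.standard_normal(m) for _ in range(k)]
taus = [2.0 / (v @ v) for v in vs]         # makes each H_i orthogonal
C = rng.standard_normal((m, 2))

QC_factored = apply_reflectors(vs, taus, C)
QC_explicit = form_explicitly(vs, taus, m) @ C
difference = np.linalg.norm(QC_factored - QC_explicit)
```

Both routes give the same product up to rounding error; the factored route never stores an m-by-m matrix, which is the reason for applying the factors directly when only the product is needed.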

If A is banded and xGBBRD is used to reduce it to bidiagonal form,
$U_1$ and $V_1$ are determined as products of Givens rotations,
rather than
as products of elementary reflectors. If $U_1$ or $V_1$ is required, it
must be formed explicitly by xGBBRD. xGBBRD uses a vectorizable
algorithm, similar to that used by xSBTRD (see Kaufman [77]).
xGBBRD may be much faster than xGEBRD when the bandwidth is narrow.

The SVD of the bidiagonal matrix is computed either by subroutine
xBDSQR
or by subroutine
xBDSDC.
xBDSQR uses QR iteration when singular vectors are desired
[32,23],
and otherwise uses the dqds algorithm [51].
xBDSQR is more accurate than its counterparts in LINPACK and EISPACK:
barring underflow and overflow, it computes all the singular values of
B to nearly full relative precision, independent of their magnitudes.
It also computes the singular vectors much more accurately. See
section 4.9 and
[32,23,51] for details.

xBDSDC uses a variation of Cuppen's divide and conquer algorithm to find
singular values and singular vectors [69,58].
It is much faster than xBDSQR if singular vectors of large matrices are desired.
When singular values only are desired, it uses the same dqds algorithm as xBDSQR
[51].
Divide-and-conquer is not guaranteed to compute singular values to nearly
full relative precision, but in practice xBDSDC is often at least as
accurate as xBDSQR.
xBDSDC represents the singular vector matrices $U_2$ and $V_2$
in a compressed format requiring only
$O(n \log n)$ space instead
of $n^2$. $U_2$ and $V_2$ may subsequently be generated explicitly
using routine xLASDQ, or multiplied by vectors without forming
them explicitly using routine xLASD0.

If $m \gg n$,
it may be more efficient to first perform a QR factorization
of A, using the routine xGEQRF,
and then to compute the SVD of the n-by-n matrix R, since
if A = QR and
$R = U \Sigma V^T$,
then the SVD of A is given by
$A = (QU) \Sigma V^T$.
Similarly, if $m \ll n$,
it may be more efficient to first perform an
LQ factorization of A, using xGELQF. These preliminary QR and LQ
factorizations are performed by the drivers
xGESVD
and
xGESDD.
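
The QR preprocessing step for a tall matrix can be sketched in Python with NumPy as follows. The dimensions and seed are illustrative assumptions, and the NumPy calls stand in for xGEQRF and the SVD routines.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 4                              # tall matrix, m much larger than n
A = rng.standard_normal((m, n))

# Preliminary QR factorization: Q is m-by-n with orthonormal columns, R is n-by-n.
Q, R = np.linalg.qr(A, mode="reduced")

# SVD of the small n-by-n triangular factor: R = U_R diag(s) V^T.
U_R, s, Vt = np.linalg.svd(R)

# Left singular vectors of A: the product Q U_R.
U = Q @ U_R

# Reference: singular values computed directly from A.
s_direct = np.linalg.svd(A, compute_uv=False)
residual = np.linalg.norm(A - U @ np.diag(s) @ Vt)
```

The singular values of R agree with those of A, and the expensive SVD step operates on an n-by-n matrix instead of an m-by-n one.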

The SVD may be used to find a minimum norm solution to a (possibly)
rank-deficient linear least squares
problem (2.1). The effective rank, k, of A can be determined as the
number of singular values which exceed a suitable threshold.
Let $\hat{\Sigma}$
be the leading k-by-k submatrix of $\Sigma$,
and $\hat{V}$
be the matrix consisting of the first k columns of V.
Then the solution is given by:

$$x = \hat{V} \hat{\Sigma}^{-1} \hat{c}_1$$

where $\hat{c}_1$
consists of the first k elements of
$c = U^T b = U_2^T U_1^T b$.
$U_1^T b$ can be computed using xORMBR, and
xBDSQR has an option to multiply a vector by $U_2^T$.
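
The minimum norm solution can be sketched in Python with NumPy as follows. The function name `min_norm_lstsq` and the relative threshold `rcond` are hypothetical choices made for this example; the text above leaves the choice of a suitable threshold open.

```python
import numpy as np

def min_norm_lstsq(A, b, rcond=1e-12):
    """Minimum norm least squares solution via a truncated SVD.

    rcond is an assumed relative threshold: singular values not exceeding
    rcond * sigma_1 are treated as zero when determining the effective rank k.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = int(np.sum(s > rcond * s[0]))      # effective rank
    c = U.T @ b                            # c = U^T b
    # x = V_hat Sigma_hat^{-1} c_hat, using the first k singular triplets.
    x = Vt[:k].T @ (c[:k] / s[:k])
    return x, k

# Rank-deficient example: the third column is the sum of the first two.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2))
A = np.column_stack([A, A[:, 0] + A[:, 1]])
b = rng.standard_normal(6)

x, k = min_norm_lstsq(A, b)

# Reference: NumPy's least squares solver with the same relative threshold.
x_ref = np.linalg.lstsq(A, b, rcond=1e-12)[0]
```

Here the effective rank is 2 although A has three columns, and the truncated SVD solution agrees with the reference solver to rounding error.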