
SVD Decomposition

SVD:

  • Powerful set of techniques for dealing with sets of equations that are singular or numerically near-singular
  • Can diagnose problems with matrix systems where LU or Gaussian elimination fails.
  • Method of choice for solving linear least-squares problems

Basic theorem of linear algebra: any M x N matrix A can be written as the product of three matrices, A = U W V^T, where:

  • U is an M x N column-orthogonal matrix
  • W is an N x N diagonal matrix with positive or zero elements (the singular values)
  • V^T is the transpose of an N x N orthogonal matrix V

M > N corresponds to the overdetermined situation (more equations than unknowns)

M < N corresponds to the underdetermined situation (fewer equations than unknowns)

V is orthogonal in the sense that its columns are orthonormal, meaning sum_k V_kj V_kj' = delta_jj' for 1 <= j, j' <= N.

In linear algebra terms, V^T V = V V^T = I (the N x N identity).

If M >= N, then U is also column-orthonormal, U^T U = I.

If M < N, two things happen:

  • The singular values w_j for j = M+1, ..., N are all zero
  • The corresponding columns of U are also zero

The decomposition A = U W V^T can always be done, no matter how singular the matrix is.

W contains the singular values; it is conventional to return them sorted in descending order.

SVD routines are complicated; this is one of the few cases where it is OK to treat a library routine as a black box.
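As a concrete illustration, here is a minimal sketch using numpy's black-box routine (numpy follows the same convention: with full_matrices=False and M >= N, U comes back M x N, W as a length-N vector w, and V as its transpose Vt):

```python
import numpy as np

M, N = 6, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((M, N))            # arbitrary M x N test matrix

# "Thin" SVD: U is M x N, w holds the N singular values, Vt is V^T (N x N)
U, w, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, w.shape, Vt.shape)          # (6, 4) (4,) (4, 4)
print(np.allclose(U.T @ U, np.eye(N)))     # U is column-orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(N)))   # V is orthogonal
print(np.all(np.diff(w) <= 0))             # singular values returned in descending order
print(np.allclose(A, U @ np.diag(w) @ Vt)) # A = U W V^T
```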

Range and Nullspace

Suppose you have an M x N matrix A, an N-dimensional vector x, and an M-dimensional vector b, in a linear system Ax = b

The matrix A is a linear mapping from an N-dimensional vector space into an M-dimensional vector space.

The map might only be able to reach a lower-dimensional subspace of the full M-dimensional vector space; that subspace is the range of A.

The dimension of the range of the map is the rank of A.

There may be nonzero vectors x that are mapped to zero by the matrix A, i.e. Ax = 0. The set of all such vectors composes the nullspace of A, and the dimension of this space is the nullity of A.

Rank-nullity theorem: for a matrix A, rank + nullity = N (number of columns)

If A has rank less than N (i.e., nullity greater than zero), two things happen:

  • Most RHS vectors b yield no solution
  • Some RHS vectors b yield multiple solutions (a subspace of them)

SVD constructs the orthogonal bases for the nullspace and range of a matrix:

  • Columns of U whose same-numbered elements w_j are NONZERO are orthonormal set of basis vectors spanning the range.
  • Columns of V whose same-numbered elements w_j are ZERO are an orthonormal basis for the nullspace.

An SVD implementation should have methods that return the rank and nullity (as integers), and the range and nullspace (each packaged as a matrix of bundled vectors, whose columns form the orthonormal basis of either the range or the nullspace). The implementation is basically just finding the nonzero and zero elements w_j and extracting the corresponding columns of U or V (respectively).
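A minimal sketch of such an interface, assuming numpy and a simple tolerance for deciding which w_j count as zero (the function name and threshold convention here are illustrative, not from any particular library):

```python
import numpy as np

def svd_subspaces(A, tol=None):
    """Return (rank, nullity, range basis, nullspace basis) of A via SVD."""
    U, w, Vt = np.linalg.svd(A)
    if tol is None:
        # one common convention: largest singular value * machine eps * max dimension
        tol = max(A.shape) * np.finfo(A.dtype).eps * (w[0] if w.size else 0.0)
    rank = int(np.count_nonzero(w > tol))
    nullity = A.shape[1] - rank
    range_basis = U[:, :rank]          # columns of U with nonzero w_j
    nullspace_basis = Vt[rank:, :].T   # columns of V with zero w_j
    return rank, nullity, range_basis, nullspace_basis

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])        # rank 1, so nullity = 3 - 1 = 2
rank, nullity, R, K = svd_subspaces(A)
print(rank, nullity)                   # 1 2
print(np.allclose(A @ K, 0.0))         # nullspace vectors are mapped to zero
```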

SVD of Square Matrix

For N x N square matrices, U, V, and W are all square matrices of the same size.

U and V are orthogonal, so their inverses are equal to their transposes.

W is diagonal, so its inverse is just the diagonal matrix whose elements are the reciprocals 1/w_j.

If any of the w_j values are zero (exactly zero, or numerically indistinguishable from zero), the matrix is singular. SVD can thus give a diagnosis of a singular matrix.

The condition number of a matrix is defined as the ratio of the largest of the w_j's to the smallest of the w_j's. Singular means the condition number is infinite; ill-conditioned means the condition number is so large that its reciprocal approaches the machine's floating-point precision.
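For example (a small numpy sketch; the nearly singular matrix is made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0 + 1e-12],   # row 2 is almost exactly 2 * row 1
              [0.0, 1.0, 1.0]])

w = np.linalg.svd(A, compute_uv=False)   # singular values only, sorted descending
cond = w[0] / w[-1]
print("condition number:", cond)
# ill-conditioned when 1/cond gets down toward the machine's floating-point precision
print("1/cond =", 1.0 / cond, " machine eps =", np.finfo(A.dtype).eps)
```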

Homogeneous equations (RHS = 0) are solved immediately by SVD: the solution is any linear combination of the columns returned by the nullspace method.

If the RHS is not zero, the question is whether it lies in the range of A or not. If it does, the singular set of equations does have a solution x (in fact, more than one). To single out one solution in particular, pick the one with the smallest magnitude. To find that vector using SVD, simply replace 1/w_j with 0 wherever w_j is 0; the resulting inverse is referred to as the Moore-Penrose inverse (pseudoinverse) of A.

Zeroing out a singular value is equivalent to throwing away one linear combination of the set of equations - but we end up throwing away precisely the one equation that is most corrupted by roundoff error. Ignoring it is better than keeping it, since a roundoff-dominated equation tends to throw off the solution to all the other equations.
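A sketch of that procedure for a square (and here exactly singular) system, assuming numpy; the relative threshold on w_j is one common choice, not the only one:

```python
import numpy as np

def svd_solve(A, b, tol_factor=None):
    """Solve A x = b via SVD, replacing 1/w_j by 0 for (near-)zero singular values."""
    U, w, Vt = np.linalg.svd(A)
    if tol_factor is None:
        tol_factor = max(A.shape) * np.finfo(A.dtype).eps
    keep = w > tol_factor * w[0]
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]
    # x = V diag(1/w_j) U^T b  -- the pseudoinverse applied to b
    return Vt.T @ (w_inv * (U.T @ b))

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])     # singular: second row is twice the first
b = np.array([3.0, 6.0])       # lies in the range of A
x = svd_solve(A, b)
print(x)                       # the smallest-magnitude solution
print(np.allclose(A @ x, b))   # True, since b is in the range
```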

Solving Underdetermined Systems with SVD

Fewer equations than unknowns - use the solve method to get the shortest (minimum-norm) solution, then use the nullspace method to get a set of basis vectors for the nullspace.

The full set of solutions is the former plus any linear combination of the latter.
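A short numpy sketch of that recipe (the particular solution here is the minimum-norm/pseudoinverse solution, and the nullspace basis is taken from the rows of V^T with zero singular values):

```python
import numpy as np

# Underdetermined system: 1 equation, 3 unknowns
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])

U, w, Vt = np.linalg.svd(A)
rank = int(np.count_nonzero(w > 1e-12 * w[0]))

# Shortest (minimum-norm) particular solution, via the pseudoinverse
x_particular = Vt[:rank, :].T @ ((U.T @ b)[:rank] / w[:rank])

# Orthonormal basis for the nullspace (columns of V with zero w_j)
nullspace = Vt[rank:, :].T

# Every solution = particular solution + some linear combination of nullspace columns
coeffs = np.array([2.0, -1.0])   # arbitrary coefficients
x = x_particular + nullspace @ coeffs
print(np.allclose(A @ x, b))     # True for any choice of coeffs
```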

Solving Overdetermined Systems with SVD

This is the situation for least squares. The least-squares solution vector is given by applying the pseudoinverse to the linear system: x = V [diag(1/w_j)] U^T b.

In general, none of the elements of W will need to be set to zero. However, sometimes there are column degeneracies in the original matrix A, which means a particular value of w_j will need to be set to 0.

These cases correspond to linear combinations of x that are particularly insensitive to the data - in terms of linear regression, combinations of the model parameters about which the observations contribute NO INFORMATION. These combinations can be "thrown out" (their w_j zeroed) because the data cannot determine them; doing so helps reduce the number of free parameters.
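As a small illustration of the least-squares use (a sketch with made-up data, assuming numpy; np.linalg.lstsq is shown only as a cross-check, since it is also SVD-based):

```python
import numpy as np

# Overdetermined system: fit y ~ c0 + c1*t to five data points (made-up data)
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
A = np.column_stack([np.ones_like(t), t])   # 5 x 2 design matrix

U, w, Vt = np.linalg.svd(A, full_matrices=False)
w_inv = np.zeros_like(w)
keep = w > 1e-12 * w[0]         # zero out 1/w_j for any degenerate columns
w_inv[keep] = 1.0 / w[keep]
x = Vt.T @ (w_inv * (U.T @ y))  # x = V diag(1/w_j) U^T b

print(x)                                                     # [c0, c1]
print(np.allclose(x, np.linalg.lstsq(A, y, rcond=None)[0]))  # cross-check: True
```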