From charlesreid1

LU Decomposition

This is based on decomposing the matrix A into the product of two matrices:

\[ A = L U \]

where L is a lower triangular matrix and U is an upper triangular matrix.

To solve the linear system

\[ A x = (L U) x = L (U x) = b \]

we proceed in two steps:

\[ L y = b \qquad (1) \]
\[ U x = y \qquad (2) \]

Equation 1 is solved by forward substitution:

\[ y_1 = \frac{b_1}{l_{11}}, \qquad y_i = \frac{1}{l_{ii}} \left( b_i - \sum_{j=1}^{i-1} l_{ij} y_j \right), \quad i = 2, \ldots, N \]

and the second equation is solved by back substitution:

\[ x_N = \frac{y_N}{u_{NN}}, \qquad x_i = \frac{1}{u_{ii}} \left( y_i - \sum_{j=i+1}^{N} u_{ij} x_j \right), \quad i = N-1, \ldots, 1 \]
The advantage is that solving a linear system with a triangular matrix is much easier.
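As an illustration of the two triangular solves, here is a minimal sketch in plain Python (the function names are my own; L and U are dense row-major lists of lists):

```python
def forward_substitution(L, b):
    """Solve L y = b for y, where L is lower triangular."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # subtract the already-known terms, then divide by the diagonal
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y

def back_substitution(U, y):
    """Solve U x = y for x, where U is upper triangular."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x
```

With A = L U, solving A x = b is forward_substitution(L, b) followed by back_substitution(U, y); each routine only visits the roughly N²/2 nonzero elements of its triangular matrix.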

The back and forward substitutions each execute \( \sim N^2/2 \) inner-loop operations (one multiply and one add per pass). If there are N RHS vectors (i.e., if we're explicitly inverting a matrix), and we take into account the reduced number of operations due to all but one element in each RHS being zero, the operation counts are:

For forward substitution: \( \tfrac{1}{6} N^3 \)

For back substitution: \( \tfrac{1}{2} N^3 \)

BUT, there is also the cost of performing the decomposition itself, which is not free. This can be done with Crout's Algorithm.

Adding the cost of the LU decomposition itself (\( \tfrac{1}{3} N^3 \) operations) brings the total cost of inverting a matrix to \( N^3 \), which is the same complexity class as Gaussian elimination. This highlights an important fact: LU decomposition is simply a modified version of Gaussian elimination.

The advantage of LU decomposition is that, once A has been decomposed into L and U, it remains unchanged and can be used to solve as many RHS as desired, one at a time. (Gauss-Jordan and Gaussian elimination both depend on manipulating the left and right sides simultaneously.)


Crout's Decomposition Algorithm

To decompose the matrix A into the matrices L and U, there are \( N^2 \) equations (one per component of A) relating the \( N^2 + N \) unknown components of L and U. Crout's Algorithm fixes the extra N degrees of freedom by setting the diagonal elements of L to 1, and provides a way to solve for the remaining components by rearranging the equations in a particular order.

Crout's Algorithm does not have any extra storage requirements: it overwrites A in place, storing the elements of U on and above the diagonal and the sub-diagonal elements of L below it. (The unit diagonal of L does not need to be stored.)

Pivoting is essential for Crout's Algorithm to be stable. Partial pivoting (interchanging rows only) can be implemented efficiently, and it is enough to keep the algorithm stable. The implementation requires a loop over three indices, so there are six possible permutations of the indices i, j, and k. Two particular permutations work best for row-based storage: kij and ikj. Pivoting is easiest with kij indexing. This method is called "outer product Gaussian elimination" (Golub and Van Loan).

Overall, LU decomposition using Crout's algorithm can solve a single RHS vector by executing \( \tfrac{1}{3} N^3 \) operations in inner loops (one multiply and one add per operation). This is a factor of 3 better than the \( N^3 \) cost of Gauss-Jordan elimination.

To invert a matrix using LU decomposition, the forward and back substitution steps for all N unit RHS vectors must be included, bringing the total cost to \( N^3 \), which is the same as Gauss-Jordan elimination.
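A minimal in-place sketch of the decomposition with partial pivoting, in plain Python (the names are my own, not from any particular library). It overwrites a with U on and above the diagonal and the sub-diagonal part of L below it (unit diagonal implied), and tracks the row permutation and its parity:

```python
def lu_decompose(a):
    """In-place LU decomposition with partial pivoting (kij ordering,
    i.e. outer-product Gaussian elimination). Returns (perm, parity)."""
    n = len(a)
    perm = list(range(n))
    parity = 1
    for k in range(n):
        # partial pivoting: bring the largest |entry| in column k to the pivot row
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if p != k:
            a[k], a[p] = a[p], a[k]
            perm[k], perm[p] = perm[p], perm[k]
            parity = -parity
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]              # multiplier: sub-diagonal element of L
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return perm, parity

def lu_solve(a, perm, b):
    """Solve A x = b using the in-place decomposition produced above."""
    n = len(a)
    x = [b[p] for p in perm]                # apply the row permutation to b
    for i in range(n):                      # forward substitution (unit-diagonal L)
        x[i] -= sum(a[i][j] * x[j] for j in range(i))
    for i in range(n - 1, -1, -1):          # back substitution
        x[i] = (x[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x
```

Once lu_decompose has run, lu_solve can be called repeatedly with different RHS vectors at only ~N² cost each, which is the reuse advantage described above.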

Pseudocode

Pseudocode to illustrate how to solve a matrix system Ax=b using LU decomposition and a class called "LUdcmp" is given below:

const Int n = ...;
MatrixDouble a(n,n);
VectorDouble b(n), x(n);

// initialize a, b, and x

LUdcmp alu(a);      // perform the LU decomposition of a once
alu.solve(b,x);     // solve a*x = b by forward and back substitution
                    // (alu can be reused to solve additional RHS vectors)

Determinant

To calculate the determinant of an LU decomposed matrix, take the product of the diagonal elements of the decomposed matrix LU. (Recall Crout's Algorithm computes this decomposed matrix in place, replacing the elements of A with the elements of the decomposed matrix.)

The only complication is that Crout's algorithm performs a row-wise permutation of the matrix. If the number of row permutations was even (positive parity), the determinant is the product of the diagonal elements. If the number of row permutations was odd (negative parity), the determinant is the negative of the product of the diagonal elements.
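As a sketch in plain Python (assuming the in-place, unit-diagonal-L storage convention described above, plus the parity of the row swaps):

```python
def lu_determinant(lu, parity):
    """Determinant from an LU-decomposed matrix: the product of the
    stored diagonal (the diagonal of U), times the permutation parity."""
    d = float(parity)
    for i in range(len(lu)):
        d *= lu[i][i]
    return d
```

For example, decomposing [[4, 3], [6, 3]] with partial pivoting involves one row swap (parity -1) and stores the diagonal 6, 1, giving a determinant of -6, which matches 4*3 - 3*6.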

Complex Matrix Systems

If the matrix A is real but the RHS vector is complex, say \( u + i v \), the matrix equation becomes:

\[ A (x + i y) = u + i v \]

In this case:

  • Perform LU decomposition as usual
  • Forward and back substitute the real component of the RHS vector to get the real part of the solution
  • Forward and back substitute the imaginary component of the RHS vector to get the imaginary part of the solution
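The steps above can be sketched in plain Python; here solve_real stands for any routine that solves a real system (in practice it would decompose A once and reuse the decomposition for both solves):

```python
def solve_complex_rhs(solve_real, A, b):
    """Solve A z = b for real A and complex b, via two real solves."""
    u = [z.real for z in b]     # real part of the RHS
    v = [z.imag for z in b]     # imaginary part of the RHS
    xr = solve_real(A, u)       # real part of the solution
    xi = solve_real(A, v)       # imaginary part of the solution
    return [complex(r, m) for r, m in zip(xr, xi)]
```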

If the matrix A itself is complex, say \( A + i C \), the matrix equation becomes:

\[ (A + i C)(x + i y) = u + i v \]

If A is complex, two options:

  • Rewrite decomposition routine to use complex arithmetic (more complicated code, fewer operations)
  • Solve real and imaginary parts of matrix separately (utilizes same code, costs twice as many operations/storage space)

The second approach writes the real and imaginary parts of the linear system separately, leading to a block system of twice the size: a 2x2 block matrix acting on 2x1 block vectors, where each block is itself an N x N matrix or a length-N vector. The 2x2 block matrix stores A and C twice, so the method costs twice as much storage. It also costs twice as much time: a complex multiply costs 4 real multiplies, so complex arithmetic decomposes the system in about \( \tfrac{4}{3} N^3 \) real multiplies, while decomposing the doubled real system takes \( \tfrac{(2N)^3}{3} = \tfrac{8}{3} N^3 \).
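Collecting real and imaginary parts (A x - C y = u and C x + A y = v), the second approach turns the complex system into a real block system of twice the size:

```latex
(A + iC)(x + iy) = u + iv
\quad\Longleftrightarrow\quad
\begin{pmatrix} A & -C \\ C & A \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} u \\ v \end{pmatrix}
```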