Linear systems
Eigenvalues
Sparse matrices
Preliminaries
Triangular and tridiagonal systems
LU decomposition
Cholesky decomposition
tr(AB)=tr(BA)
tr(ABC)=tr(BCA)=tr(CAB)
tr(ABC)≠tr(ACB)
tr(A′B)=vec(A)′vec(B)=vec(B)′vec(A)
det(AB)=det(A)det(B)
det(A−1)=[det(A)]−1
Weinstein-Aronszajn identity: det(Im+Am×nBn×m)=det(In+BA)
(A+UCV)−1=A−1−A−1U(C−1+VA−1U)−1VA−1
Special cases:
If D−CA−1B is invertible:
[ABCD]−1=[A−1+A−1B(D−CA−1B)−1CA−1−A−1B(D−CA−1B)−1−(D−CA−1B)−1CA−1(D−CA−1B)−1]
If A−BD−1C is invertible:
[ABCD]−1=[(A−BD−1C)−1−(A−BD−1C)−1BD−1−D−1C(A−BD−1C)−1D−1+D−1C(A−BD−1C)−1BD−1]
If A is invertible:
det(ABCD)=det(A)det(D−CA−1B)
For any A,B∈Rn×n:
det(ABBA)=det(A+B)det(A−B)
L=[×××⋮⋮⋱××⋯×],U=[×⋯××⋱⋮⋮×××]
Triangular matrices are important building blocks of numerical linear algebra
Cheap multiplication, determinant, and linear system solving
Many matrix decompositions are in the form of triangular matrices
A triangular matrix is invertible if and only if all its diagonal entries are non-zero
Invertible lower (upper) triangular matrices are closed under multiplication and inversion
The determinant of a triangular matrix is the product of its diagonal entries
det[λ1×λ2⋮⋮⋱××⋯λm]=λ1det[λ2⋮⋱×⋯λm]=⋯=λ1⋯λm
Question: What is the computational complexity of multiplying a triangular matrix with a vector?
O(n2)
We are interested in solving the linear systems of equations Lx=b L=[l11l21l22⋮⋮⋱ln1ln2⋯lnn]
The algorithm is called forward substitution
Let lk,1:k=(lk1,…,lkk)′, x1:k=(x1,…,xk)′, k=1,…,n
The k-th equation gives l′k,1:kx1:k=k−1∑i=1lkixi+lkkxk=bk
Start with x1=b1/l11
For k=2,…,n, iterate xk=(bk−k−1∑i=1lkixi)/lkk
Start with x1=b1/l11
For k=2,…,n, iterate xk=(bk−k−1∑i=1lkixi)/lkk
Computational complexity is O(n2)
Similarly, we can solve the linear systems of equations Ux=b with an upper triangular matrix U=[u11u12⋯u1nu22⋯u2n⋱⋮unn]
The algorithm is called backward substitution
Similarly, we can solve the linear systems of equations Ux=b with an upper triangular matrix U=[u11u12⋯u1nu22⋯u2n⋱⋮unn]
The algorithm is called backward substitution
Please derive this algorithm by yourself
set.seed(123)n = 2000b = rnorm(n)M = 0.1 * matrix(rnorm(n^2), n)diag(M) = abs(diag(M)) + 1# Get lower and upper triangular partsL = M * lower.tri(M, diag = TRUE)U = M * upper.tri(M, diag = TRUE)
forwardsolve()
and backsolve()
functions to implement the two algorithmsx1 = forwardsolve(L, b)max(abs(L %*% x1 - b)) # Verify L*x1 = b
## [1] 4.193368e-12
x2 = backsolve(U, b)max(abs(U %*% x2 - b)) # Verify U*x2 = b
## [1] 5.541345e-12
forwardsolve()
/backsolve()
x1 = forwardsolve(M, b)max(abs(L %*% x1 - b)) # Verify L*x1 = b
## [1] 4.193368e-12
x2 = backsolve(M, b)max(abs(U %*% x2 - b)) # Verify U*x2 = b
## [1] 5.541345e-12
forwardsolve()
/backsolve()
instead of solve()
library(dplyr)bench::mark( solve(L, b), forwardsolve(L, b), backsolve(U, b), max_iterations = 10, check = FALSE) %>% select(expression, min, median)
## # A tibble: 3 × 3## expression min median## <bch:expr> <bch:tm> <bch:tm>## 1 solve(L, b) 1.42s 1.42s## 2 forwardsolve(L, b) 1.33ms 1.42ms## 3 backsolve(U, b) 1.31ms 1.43ms
Implement the backward substitution algorithm
Experiment on simulated matrices
Use backsolve()
to verify the result
T=[b1c1a2b2c2a3b3⋱⋱⋱cn−1anbn]
Can be represented by three vectors
Storage requirement is linear in n, i.e., O(n)
Cheap multiplication, determinant, and linear system solving
Question: What is the computational complexity of multiplying a tridiagonal matrix with a vector?
Can be represented by three vectors
Storage requirement is linear in n, i.e., O(n)
Cheap multiplication, determinant, and linear system solving
Question: What is the computational complexity of multiplying a tridiagonal matrix with a vector?
O(n)
[b1c1a2b2c2a3b3⋱⋱⋱cn−1anbn][x1x2x3⋮xn]=[d1d2d3⋮dn]
c′i={cibi,i=1cibi−aic′i−1,i=2,…,n−1,d′i={dibi,i=1di−aid′i−1bi−aic′i−1,i=2,…,n
xn=d′n,xi=d′i−c′ixi+1,i=n−1,…,1
set.seed(123)n = 2000diagonal = rnorm(n)lsubdiag = rnorm(n - 1)usubdiag = rnorm(n - 1)b = rnorm(n)# Explicitly form the matrixM = diag(diagonal)diag(M[2:n, ]) = lsubdiagdiag(M[, 2:n]) = usubdiagM[1:5, 1:5]
## [,1] [,2] [,3] [,4] [,5]## [1,] -0.5604756 1.1210219 0.0000000 0.00000000 0.0000000## [2,] -0.5116037 -0.2301775 0.1965498 0.00000000 0.0000000## [3,] 0.0000000 0.2369379 1.5587083 0.65011319 0.0000000## [4,] 0.0000000 0.0000000 -0.5415892 0.07050839 0.6710042## [5,] 0.0000000 0.0000000 0.0000000 1.21922765 0.1292877
tridiagmatrix()
function in the cmna R package implements the described algorithmlibrary(cmna)x = tridiagmatrix(lsubdiag, diagonal, usubdiag, b)max(abs(M %*% x - b)) # Verify M*x = b
## [1] 2.298162e-13
solve()
bench::mark( solve(M, b), tridiagmatrix(lsubdiag, diagonal, usubdiag, b), max_iterations = 10, check = FALSE) %>% select(expression, min, median)
## # A tibble: 2 × 3## expression min median## <bch:expr> <bch:tm> <bch:tm>## 1 solve(M, b) 1.37s 1.37s## 2 tridiagmatrix(lsubdiag, diagonal, usubdiag, b) 614.1µs 650.25µs
LU decomposition is an important method to solve general linear systems
Factorizes an n×n matrix A into the product of two triangular matrices, A=LU
L lower triangular, with ones on the diagonal
U upper triangular
Let A be an n×n matrix such that all of its diagonal submatrices are nonsingular. Then there exist a lower triangular matrix L with unit diagonal elements (i.e., lii=1), and an upper triangular matrix U, such that A=LU.
Suppose the decomposition A=LU is given
If we wish to solve Ax=b, then:
First, solve Ly=b to get y (forward substitution)
Second, solve Ux=y to get x (backward substitution)
Suppose the decomposition A=LU is given
Question: How to compute det(A)?
Suppose the decomposition A=LU is given
Question: How to compute det(A)?
det(A)=det(L)det(U)=det(U)=u11⋯unn
Based on the relation
In practice, LU decomposition is performed with a permutation matrix Q, which is typically called pivoting: QA=LU
This is used to enhance the numerical stability
Q has the effect of interchanging rows of A
Q satisfies Q′Q=QQ′=I
When using this factorization to solve linear systems Ax=b
First, compute ˜b=Qb
Second, solve Ly=˜b to get y
Third, solve Ux=y to get x
Decomposition part is O(n3)
Linear system solving part is O(n2)
lu()
function in the Matrix package can be used to compute the LU decomposition of a square matrixlibrary(Matrix)
## Warning: package 'Matrix' was built under R version 4.2.3
set.seed(123)n = 5M = matrix(rnorm(n^2), n)lu_obj = lu(M)decomp = expand(lu_obj)names(decomp) # A named list
## [1] "L" "U" "P"
decomp$L
## 5 x 5 Matrix of class "dtrMatrix" (unitriangular)## [,1] [,2] [,3] [,4] [,5] ## [1,] 1.00000000 . . . .## [2,] -0.35957699 1.00000000 . . .## [3,] 0.04523514 -0.49963385 1.00000000 . .## [4,] 0.08294543 -0.27038315 -0.28235315 1.00000000 .## [5,] -0.14767195 0.21751065 0.15641432 -0.65806034 1.00000000
decomp$U
## 5 x 5 Matrix of class "dtrMatrix"## [,1] [,2] [,3] [,4] [,5] ## [1,] 1.5587083 -1.2650612 0.4007715 -1.9666172 -1.0260044## [2,] . 1.2601781 1.3681900 1.0797629 -1.4367513## [3,] . . 0.7761478 1.3298022 -1.4003294## [4,] . . . 0.3577540 -1.3237976## [5,] . . . . -0.7090854
decomp$P
## 5 x 5 sparse Matrix of class "pMatrix"## ## [1,] . | . . .## [2,] . . . . |## [3,] | . . . .## [4,] . . | . .## [5,] . . . | .
# Verify P*L*U = Amax(abs(decomp$P %*% decomp$L %*% decomp$U - M))
## [1] 2.220446e-16
lu_solve = function(decomp, b){ bt = t(decomp$P) %*% b y = forwardsolve(as.matrix(decomp$L), bt) # L*y = b x = backsolve(as.matrix(decomp$U), y) # U*x = y x}
set.seed(123)n = 2000M = matrix(rnorm(n^2), n)b = rnorm(n)decomp = expand(lu(M))x = lu_solve(decomp, b)max(abs(M %*% x - b)) # Verify M*x = b
## [1] 1.933176e-11
bench::mark( solve(M, b), expand(lu(M)), lu_solve(decomp, b), max_iterations = 10, check = FALSE) %>% select(expression, min, median)
## # A tibble: 3 × 3## expression min median## <bch:expr> <bch:tm> <bch:tm>## 1 solve(M, b) 1.4s 1.4s## 2 expand(lu(M)) 1.37s 1.37s## 3 lu_solve(decomp, b) 17.63ms 18.85ms
If Ax=b only needs to be solved once:
solve()
is more convenientsolve()
internally uses algorithms similar to LU decompositionIf a sequence of linear systems Ax(k)=b(k) need to be solved iteratively
lu_solve()
to get each solution x(k)We will encounter the second case in many optimization problems
LU decomposition applies to general square matrices
If we can guarantee the matrix A to be positive definite
Example: A=X′X+λI, λ>0
Then the Cholesky decomposition A=LL′ can be computed
L is a lower triangular matrix
Let A be an n×n real symmetric positive definite matrix. Then there exists a unique lower triangular matrix L with positive diagonal elements, such that A=LL′.
Based on the relation ljj=√ajj−j−1∑k=1l2jk lij=(aij−j−1∑k=1likljk)/ljj,i=j+1,…,n
Suppose the decomposition A=LL′ is given
If we wish to solve Ax=b, then:
First, solve Ly=b to get y (forward substitution)
Second, solve L′x=y to get x (backward substitution)
Suppose the decomposition A=LL′ is given
Question: How to compute det(A)?
Suppose the decomposition A=LL′ is given
Question: How to compute det(A)?
det(A)=[det(L)]2=l211⋯l2nn
Decomposition part is O(n3)
Linear system solving part is O(n2)
Same orders as LU decomposition, but in general faster
chol()
function can be used to compute Cholesky decompositionset.seed(123)n = 5M = matrix(rnorm(n^2), n)A = crossprod(M) # positive definiteR = chol(A)R
## [,1] [,2] [,3] [,4] [,5]## [1,] 1.678801 -1.873512 -0.1240544 -2.4977185 -0.6449736## [2,] 0.000000 1.383693 1.2267898 0.6009298 -0.7682812## [3,] 0.000000 0.000000 0.7676374 1.1355032 -0.8695156## [4,] 0.000000 0.000000 0.0000000 0.3672150 -1.0253122## [5,] 0.000000 0.000000 0.0000000 0.0000000 0.5906208
chol_solve = function(R, b){ y = backsolve(R, b, transpose = TRUE) # R'*y = b x = backsolve(R, y) # R*x = y x}
set.seed(123)n = 2000M = matrix(rnorm(n^2), n)A = crossprod(M)b = rnorm(n)R = chol(A)max(abs(crossprod(R) - A)) # Verify R'R = A
## [1] 1.364242e-12
x = chol_solve(R, b)max(abs(A %*% x - b)) # Verify A*x = b
## [1] 4.437666e-09
bench::mark( solve(A, b), chol(A), chol_solve(R, b), max_iterations = 10, check = FALSE) %>% select(expression, min, median)
## # A tibble: 3 × 3## expression min median## <bch:expr> <bch:tm> <bch:tm>## 1 solve(A, b) 1.38s 1.38s## 2 chol(A) 1.31s 1.31s## 3 chol_solve(R, b) 3.62ms 5.04ms
As long as you can guarantee A is positive definite, prefer to use Cholesky decomposition
solve()
does not know the special property of A, so it will use the slightly slower LU decomposition
Cholesky decomposition utilizes this "prior information"
Cholesky decomposition can also be used when a sequence of linear systems Ax(k)=b(k) need to be solved iteratively
[1] Folkmar Bornemann (2018). Numerical Linear Algebra. Springer.
[2] Grégoire Allaire and Sidi Mahmoud Kaber (2008). Numerical linear algebra. Springer.
[3] Åke Björck (2015). Numerical methods in matrix computations. Springer.
Keyboard shortcuts
↑, ←, Pg Up, k | Go to previous slide |
↓, →, Pg Dn, Space, j | Go to next slide |
Home | Go to first slide |
End | Go to last slide |
Number + Return | Go to specific slide |
b / m / f | Toggle blackout / mirrored / fullscreen mode |
c | Clone slideshow |
p | Toggle presenter mode |
t | Restart the presentation timer |
?, h | Toggle this help |
Esc | Back to slideshow |