





Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Iterative methods for linear systems, Banach Fixed point, theorem Jacobi and Gauss-Siedel, Iterations , Relaxation
Typology: Study notes
1 / 9
This page cannot be seen from the preview
Don't miss anything!






We have seen that a direct algorithm for solving Ax = b requires O(n^3 ) work. This amount of work becomes inpractical quite quickly. The basic aim of an iterative method is to produce a method of approximating the action of the inverse of the matrix such that the amount of work required is better than O(n^3 ). To understand the general concept, let us write the matrix A as A = A − B + B with an invertible matrix B at our disposal. Then, the equation Ax = b can be reformulated as b = Ax = (A − B)x + Bx and hence as x = B−^1 (B − A)x + B−^1 b =: Cx + c =: F (x), so that x is a fixed point of the mapping F. To calculate this fixed point, we can use the following simple iterative process. We first pick a starting point x 0 and then form xi+1 := F (xi), i = 1, 2 , 3.... (13) If this sequence converges and if F is continuous, the limit has to be a fixed point of F.
We will now derive a general convergence result for the iteration process (13). Definition 10.1 A mapping F : Rn^ → Rn^ is called a contraction mapping with respect to a norm ‖ · ‖ on Rn^ if there is a constant 0 < q < 1 such that ‖F (x) − F (y)‖ ≤ q‖x − y‖ for all x, y ∈ Rn. A contraction mapping is Lipschitz-continuous with Lipschtiz-constant q < 1. Theorem 10.2 (Banach) If F : Rn^ → Rn^ is a contraction mapping then F has exactly one fixed point x∗. The sequence xj+1 := F (xj ) converges for every starting point x 0 ∈ Rn. Furthermore, we have the error estimates ‖x∗^ − xj ‖ ≤ q 1 − q ‖xj − xj− 1 ‖ (a posteriori), ‖x∗^ − xj ‖ ≤ qj 1 − q ‖x 1 − x 0 ‖ (a priori).
If we apply this theorem to our special iteration function F (x) = Cx + c, where C is the iteration matrix, we see that ‖F (x) − F (y)‖ = ‖Cx + c − (Cy + c)‖ = ‖C(x − y)‖ ≤ ‖C‖‖x − y‖, so that we have convergence if ‖C‖ < 1. Unfortunately, this depends on the chosen vector and hence matrix norm, while, since all norms on Rn^ are equivalent, the fact that the sequence converges does not depend on the norm. In other words, having an induced matrix norm with ‖C‖ < 1 is sufficient for convergence but not necessary. A sufficient and necessary condition can be stated using the spectral radius of the iteration matrix. Definition 10.3 Let A ∈ Rn×n^ with eigenvalues λ 1 , λ 2 ,... , λn ordered so that |λ 1 | ≥ |λ 2 | ≥ · · · ≥ |λn|. The spectral radius of A is given by ρ(A) := |λ 1 |. Note that, if λ is an eigenvalue of A with eigenvector x, then λr^ is an eigenvalue of Ar, r = 1, 2 , 3 ,... with eigenvector x. Hence, ρ(Ar) = ρ(A)r. Theorem 10.4 (1) If ‖·‖ is a compatible matrix-norm then ρ(A) ≤ ‖A‖ for all matrices A ∈ Rn×n. (2) For any # > 0 there is an induced norm, ‖ · ‖ such that ρ(A) ≤ ‖A‖ ≤ ρ(A) + #. This allows us to state and prove our main convergence result for iterative processes. Theorem 10.5 The iteration xj+1 = Cxj + c converges for every starting point if and only if ρ(C) < 1. Proof: Assume first that ρ(C) < 1. Then, we can pick an # > 0 such that ρ(C)+# < 1 and, by Theorem 10.4, we can find an induced matrix norm ‖ · ‖ such that ‖C‖ ≤ ρ(C) + # < 1, which gives convergence. Assume now that the iteration converges to x∗^ for every starting point x 0. If we pick the starting point such that x = x 0 − x∗^ is an eigenvector of C with eigenvalue λ, then xj − x∗^ = F (xj− 1 ) − F (x∗) = C(xj− 1 − x∗) =... = Cj^ (x 0 − x∗) = λj^ (x 0 − x∗). Since the expression on the left hand side tends to zero for j → ∞, so does the expression on the right hand side. This, however, is only possible if |λ| < 1. Since λ was an arbitrary eigenvalue of C, this shows that ρ(C) < 1.!
Theorem 10.8 The Jacobi method converges for every starting point if the matrix A is strongly row diagonally dominant. Proof: We use the row sum norm to calculate the norm of the iteration matrix C: ‖C‖∞ = max 1 ≤i≤n ∑^ n k= |cik| = max 1 ≤i≤n ∑^ n k= k !=i |aik| |aii|
Hence, we have convergence.! A closer inspection of the method (15) shows that the computation of x (j+1) i is independent of any other x (j+1) !. This means that, on a parallel or vector computer all components of the new iteration x(j+1)^ can be computed simultaneously. However, it also gives us the possibility to improve the process. For example, to calculate x( 2 j +1)we could already employ the newly computed x( 1 j +1). Then, for computing x( 3 j +1)we could use x( 1 j +1)and x( 2 j +1)and so on. This leads to the following iteration scheme. Definition 10.9 The Gauss-Seidel method is given by the iteration scheme x (j+1) i =^
aii
bi − ∑^ i−^1 k= aikx (j+1) k − ∑^ n k=i+ aikx (j) k
, 1 ≤ i ≤ n. (16) To analyse the convergence of this scheme, we have to find the iteration matrix C = I − B−^1 A. To this end, we rewrite (16) as aiix (j+1) i + ∑^ i−^1 k= aikx (j+1) k =^ bi^ − ∑^ n k=i+ aikx (j) k , which translates into (L + D)x(j+1)^ = −Rx(j)^ + b. Thus, the iteration matrix of the Gauss-Seidel method is given by CG = −(L + D)−^1 R. Later on, we will prove a more general version of the following theorem. Theorem 10.10 If A = AT^ is positive definite then the Gauss-Seidel method converges.
A further improvement of both methods can be achieved by Relaxation. We start by looking at the Jacobi method. Here, the iterations can be written as x(j+1)^ = D−^1 b − D−^1 (L + R)x(j) = x(j)^ + D−^1 b − D−^1 (L + R + D)x(j) = x(j)^ + D−^1 (b − Ax(j)). The latter equality shows that the new iteration x(j+1)^ is given by the old iteration x(j) corrected by the D−^1 -multiple of the residual b − Ax. In practice, one often notice that the correction term is off the correct correction term by a fixed factor. Hence, it makes sense to introduce a relaxation parameter ω and to form the new iteration as x(j+1)^ = x(j)^ + ωD−^1 (b − Ax(j)), (17) which gives the following component-wise scheme: Definition 10.11 The Jacobi Relaxation is given by x (j+1) i =^ x (j) i +^ ω aii
bi − ∑^ n k= aikx (j) k
, 1 ≤ i ≤ n. Of course, the relaxation parameter should be chosen such that the convergence improves compared to the original Jacobi method. The iteration matrix follows from x(j+1)^ = x(j)^ + ωD−^1 b − ωD−^1 (L + D + R)x(j) = [(1 − ω)I − ωD−^1 (L + R)]x(j)^ + ωD−^1 b to be CJ (ω) = [(1 − ω)I − ωD−^1 (L + R)] = (1 − ω)I + ωCJ , which shows that CJ (1) = CJ corresponds to the classical Jacobi method. Theorem 10.12 Assume that CJ = −D−^1 (L + R) has only real eigenvalues λ 1 ≤ λ 2 ≤
... ≤ λn < 1 with corresponding eigenvectors z(1),... , z(n). Then, C(ω) has the same eigenvectors z(1),... , z(n), but with eigenvalues μj = 1 − ω + ωλj for 1 ≤ j ≤ n. The spectral radius of C(ω) is minimised by choosing ω∗^ =
2 − λ 1 − λn
In the case of λ 1 (= −λn Relaxation converges faster then the Jacobi method.
This idea can be used to introduce relaxation for the Gauss-Seidel method as well. We start by looking at Dx(j+1)^ = b − Lx(j+1)^ − Rx(j+1)^ and replace the iteration on the left hand side by z(j+1)^ and then use linear interpolation again. Hence, we set Dz(j+1)^ = b − Lx(j+1)^ − Rx(j+1), x(j+1)^ = (1 − ω)x(j)^ + ωz(j+1). Multiplying the second equation with D and inserting the first one yields Dx(j+1)^ = (1 − ω)Dx(j)^ + ωb − ωLx(j+1)^ − ωRx(j) and hence (D + ωL)x(j+1)^ = [(1 − ω)D − ωR] x(j)^ + ωb. Thus, the iteration matrix of the relaxed Gauss-Seidel method is given by CG(ω) = (D + ωL)−^1 [(1 − ω)D − ωR]. We can rewrite this component-wise. Definition 10.13 The Gauss-Seidel Relaxation or SOR (successive over-relaxation) method is given by x (j+1) i =^ x (j) i +^ ω aii
bi − ∑^ i−^1 k= aikx (j+1) k − ∑^ n k=i aikx (j) k
, 1 ≤ i ≤ n. Again, we have to deal with the question on how to choose the relaxation parameter. Theorem 10.14 The spectral radius of the iteration matrix CG(ω) of SOR satisfies ρ(CG(ω)) ≥ |ω − 1 |. Hence, convergence is only possible if ω ∈ (0, 2). Proof: The iteration matrix CG(ω) can be written in the form CG(ω) = (I + ωD−^1 L)−^1 [(1 − ω)I − ωD−^1 R]. The first matrix in this product is a normalised lower triangular matrix and the second matrix is an upper triangular matrix with diagonal entries all equal to 1 − ω. Since the determinant of a matrix equals the product of its eigenvalues, we have | 1 − ω|n^ = | det CG(ω)| ≤ ρ(CG(ω))n, which gives the result.! We will now show that for a positive definite matrix ω ∈ (0, 2) is also sufficient for conver- gence. Since ω = 1 gives the classical Gauss-Seidel method, we also cover Theorem 10.10.
Theorem 10.15 Let A ∈ Rn×n^ be symmetric and positive definite. Then, the SOR method converges for every relaxation parameter ω ∈ (0, 2). Proof: We have to show that ρ(CG(ω)) < 1. To this end, we rewrite the iteration matrix CG(ω) in the form CG(ω) = (D + ωL)−^1 [D + ωL − ω(L + D + R)] = I − ω(D + ωL)−^1 A = I −
ω
with B = (^) ω^1 D + L. Let λ ∈ C be an eigenvalue of CG(ω) with corresponding eigenvector x ∈ Cn, which we assume to be normalised by ‖x‖ 2 = 1. Then, we have CG(ω)x = (I − B−^1 A)x = λx or Ax = (1 − λ)Bx. Since A is positive definite, we must have λ (= 1 such that we can conclude 1 1 − λ
xT^ Bx xT^ Ax
Since A is symmetric, we can conclude that B + BT^ = ( (^) ω^2 − 1)D + A, such that the real part of 1/(1 − λ) satisfies )
1 − λ
xT^ (B + BT^ )x xT^ Ax
ω
xT^ Dx xT^ Ax
because, on account of ω ∈ (0, 2), the expression 2/ω−1 is positive, as well as xT^ Dx/xT^ Ax. The latter follows since the diagonal entries of a positive definite matrix have to be positive. If we write λ = u + iv then we can conclude that 1 2
1 − λ
1 − u (1 − u)^2 + v^2 and hence |λ|^2 = u^2 + v^2 < 1.! Example 10.16 Suppose we wish to solve the system Ax = b where A =
,^ b^ =
Note that A is symmetric and diagonally dominant.