Numerical Linear Algebra, Lecture Notes - Mathematics - 5, Study notes of Mathematical Methods for Numerical Analysis and Optimization

Iterative methods for linear systems, Banach Fixed point, theorem Jacobi and Gauss-Siedel, Iterations , Relaxation

Typology: Study notes

2010/2011

Uploaded on 09/09/2011

luber-1
luber-1 🇬🇧

4.8

(12)

293 documents

1 / 9

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
10 Iterative Methods for Linear Systems
10.1 Introduction
We have seen that a direct algorithm for solving
Ax=b
requires O(n3) work. This amount of work becomes inpractical quite quickly. The basic
aim of an iterative method is to produce a method of approximating the action of the
inverse of the matrix such that the amount of work required is better than O(n3).
To understand the general concept, let us write the matrix Aas A=AB+Bwith an
invertible matrix Bat our disposal. Then, the equation Ax=bcan be reformulated as
b=Ax=(AB)x+Bxand hence as
x=B1(BA)x+B1b=: Cx+c=: F(x),
so that xis a fixed point of the mapping F. To calculate this fixed point, we can use the
following simple iterative process. We first pick a starting point x0and then form
xi+1 := F(xi),i=1,2,3.... (13)
If this sequence converges and if Fis continuous, the limit has to be a fixed point of F.
10.2 Banach’s Fixed Point Theorem
We will now derive a general convergence result for the iteration process (13).
Definition 10.1 AmappingF:RnRnis called a contraction mapping with respect to
a norm #·#on Rnif there is a constant 0<q<1such that
#F(x)F(y)#≤q#xy#
for all x,yRn.
A contraction mapping is Lipschitz-continuous with Lipschtiz-constant q<1.
Theorem 10.2 (Banach) If F:RnRnis a contraction mapping then Fhas exactly
one fixed point x.Thesequencexj+1 := F(xj)converges for every starting point x0Rn.
Furthermore, we have the error estimates
#xxj#≤ q
1q#xjxj1#(a posteriori),
#xxj#≤ qj
1q#x1x0#(a priori).
33
pf3
pf4
pf5
pf8
pf9

Partial preview of the text

Download Numerical Linear Algebra, Lecture Notes - Mathematics - 5 and more Study notes Mathematical Methods for Numerical Analysis and Optimization in PDF only on Docsity!

10 Iterative Methods for Linear Systems

10.1 Introduction

We have seen that a direct algorithm for solving Ax = b requires O(n^3 ) work. This amount of work becomes inpractical quite quickly. The basic aim of an iterative method is to produce a method of approximating the action of the inverse of the matrix such that the amount of work required is better than O(n^3 ). To understand the general concept, let us write the matrix A as A = A − B + B with an invertible matrix B at our disposal. Then, the equation Ax = b can be reformulated as b = Ax = (A − B)x + Bx and hence as x = B−^1 (B − A)x + B−^1 b =: Cx + c =: F (x), so that x is a fixed point of the mapping F. To calculate this fixed point, we can use the following simple iterative process. We first pick a starting point x 0 and then form xi+1 := F (xi), i = 1, 2 , 3.... (13) If this sequence converges and if F is continuous, the limit has to be a fixed point of F.

10.2 Banach’s Fixed Point Theorem

We will now derive a general convergence result for the iteration process (13). Definition 10.1 A mapping F : Rn^ → Rn^ is called a contraction mapping with respect to a norm ‖ · ‖ on Rn^ if there is a constant 0 < q < 1 such that ‖F (x) − F (y)‖ ≤ q‖x − y‖ for all x, y ∈ Rn. A contraction mapping is Lipschitz-continuous with Lipschtiz-constant q < 1. Theorem 10.2 (Banach) If F : Rn^ → Rn^ is a contraction mapping then F has exactly one fixed point x∗. The sequence xj+1 := F (xj ) converges for every starting point x 0 ∈ Rn. Furthermore, we have the error estimates ‖x∗^ − xj ‖ ≤ q 1 − q ‖xj − xj− 1 ‖ (a posteriori), ‖x∗^ − xj ‖ ≤ qj 1 − q ‖x 1 − x 0 ‖ (a priori).

If we apply this theorem to our special iteration function F (x) = Cx + c, where C is the iteration matrix, we see that ‖F (x) − F (y)‖ = ‖Cx + c − (Cy + c)‖ = ‖C(x − y)‖ ≤ ‖C‖‖x − y‖, so that we have convergence if ‖C‖ < 1. Unfortunately, this depends on the chosen vector and hence matrix norm, while, since all norms on Rn^ are equivalent, the fact that the sequence converges does not depend on the norm. In other words, having an induced matrix norm with ‖C‖ < 1 is sufficient for convergence but not necessary. A sufficient and necessary condition can be stated using the spectral radius of the iteration matrix. Definition 10.3 Let A ∈ Rn×n^ with eigenvalues λ 1 , λ 2 ,... , λn ordered so that |λ 1 | ≥ |λ 2 | ≥ · · · ≥ |λn|. The spectral radius of A is given by ρ(A) := |λ 1 |. Note that, if λ is an eigenvalue of A with eigenvector x, then λr^ is an eigenvalue of Ar, r = 1, 2 , 3 ,... with eigenvector x. Hence, ρ(Ar) = ρ(A)r. Theorem 10.4 (1) If ‖·‖ is a compatible matrix-norm then ρ(A) ≤ ‖A‖ for all matrices A ∈ Rn×n. (2) For any # > 0 there is an induced norm, ‖ · ‖ such that ρ(A) ≤ ‖A‖ ≤ ρ(A) + #. This allows us to state and prove our main convergence result for iterative processes. Theorem 10.5 The iteration xj+1 = Cxj + c converges for every starting point if and only if ρ(C) < 1. Proof: Assume first that ρ(C) < 1. Then, we can pick an # > 0 such that ρ(C)+# < 1 and, by Theorem 10.4, we can find an induced matrix norm ‖ · ‖ such that ‖C‖ ≤ ρ(C) + # < 1, which gives convergence. Assume now that the iteration converges to x∗^ for every starting point x 0. If we pick the starting point such that x = x 0 − x∗^ is an eigenvector of C with eigenvalue λ, then xj − x∗^ = F (xj− 1 ) − F (x∗) = C(xj− 1 − x∗) =... = Cj^ (x 0 − x∗) = λj^ (x 0 − x∗). Since the expression on the left hand side tends to zero for j → ∞, so does the expression on the right hand side. This, however, is only possible if |λ| < 1. Since λ was an arbitrary eigenvalue of C, this shows that ρ(C) < 1.!

Theorem 10.8 The Jacobi method converges for every starting point if the matrix A is strongly row diagonally dominant. Proof: We use the row sum norm to calculate the norm of the iteration matrix C: ‖C‖∞ = max 1 ≤i≤n ∑^ n k= |cik| = max 1 ≤i≤n ∑^ n k= k !=i |aik| |aii|

Hence, we have convergence.! A closer inspection of the method (15) shows that the computation of x (j+1) i is independent of any other x (j+1) !. This means that, on a parallel or vector computer all components of the new iteration x(j+1)^ can be computed simultaneously. However, it also gives us the possibility to improve the process. For example, to calculate x( 2 j +1)we could already employ the newly computed x( 1 j +1). Then, for computing x( 3 j +1)we could use x( 1 j +1)and x( 2 j +1)and so on. This leads to the following iteration scheme. Definition 10.9 The Gauss-Seidel method is given by the iteration scheme x (j+1) i =^

aii

bi − ∑^ i−^1 k= aikx (j+1) k − ∑^ n k=i+ aikx (j) k

, 1 ≤ i ≤ n. (16) To analyse the convergence of this scheme, we have to find the iteration matrix C = I − B−^1 A. To this end, we rewrite (16) as aiix (j+1) i + ∑^ i−^1 k= aikx (j+1) k =^ bi^ − ∑^ n k=i+ aikx (j) k , which translates into (L + D)x(j+1)^ = −Rx(j)^ + b. Thus, the iteration matrix of the Gauss-Seidel method is given by CG = −(L + D)−^1 R. Later on, we will prove a more general version of the following theorem. Theorem 10.10 If A = AT^ is positive definite then the Gauss-Seidel method converges.

10.4 Relaxation

A further improvement of both methods can be achieved by Relaxation. We start by looking at the Jacobi method. Here, the iterations can be written as x(j+1)^ = D−^1 b − D−^1 (L + R)x(j) = x(j)^ + D−^1 b − D−^1 (L + R + D)x(j) = x(j)^ + D−^1 (b − Ax(j)). The latter equality shows that the new iteration x(j+1)^ is given by the old iteration x(j) corrected by the D−^1 -multiple of the residual b − Ax. In practice, one often notice that the correction term is off the correct correction term by a fixed factor. Hence, it makes sense to introduce a relaxation parameter ω and to form the new iteration as x(j+1)^ = x(j)^ + ωD−^1 (b − Ax(j)), (17) which gives the following component-wise scheme: Definition 10.11 The Jacobi Relaxation is given by x (j+1) i =^ x (j) i +^ ω aii

bi − ∑^ n k= aikx (j) k

, 1 ≤ i ≤ n. Of course, the relaxation parameter should be chosen such that the convergence improves compared to the original Jacobi method. The iteration matrix follows from x(j+1)^ = x(j)^ + ωD−^1 b − ωD−^1 (L + D + R)x(j) = [(1 − ω)I − ωD−^1 (L + R)]x(j)^ + ωD−^1 b to be CJ (ω) = [(1 − ω)I − ωD−^1 (L + R)] = (1 − ω)I + ωCJ , which shows that CJ (1) = CJ corresponds to the classical Jacobi method. Theorem 10.12 Assume that CJ = −D−^1 (L + R) has only real eigenvalues λ 1 ≤ λ 2 ≤

... ≤ λn < 1 with corresponding eigenvectors z(1),... , z(n). Then, C(ω) has the same eigenvectors z(1),... , z(n), but with eigenvalues μj = 1 − ω + ωλj for 1 ≤ j ≤ n. The spectral radius of C(ω) is minimised by choosing ω∗^ =

2 − λ 1 − λn

In the case of λ 1 (= −λn Relaxation converges faster then the Jacobi method.

This idea can be used to introduce relaxation for the Gauss-Seidel method as well. We start by looking at Dx(j+1)^ = b − Lx(j+1)^ − Rx(j+1)^ and replace the iteration on the left hand side by z(j+1)^ and then use linear interpolation again. Hence, we set Dz(j+1)^ = b − Lx(j+1)^ − Rx(j+1), x(j+1)^ = (1 − ω)x(j)^ + ωz(j+1). Multiplying the second equation with D and inserting the first one yields Dx(j+1)^ = (1 − ω)Dx(j)^ + ωb − ωLx(j+1)^ − ωRx(j) and hence (D + ωL)x(j+1)^ = [(1 − ω)D − ωR] x(j)^ + ωb. Thus, the iteration matrix of the relaxed Gauss-Seidel method is given by CG(ω) = (D + ωL)−^1 [(1 − ω)D − ωR]. We can rewrite this component-wise. Definition 10.13 The Gauss-Seidel Relaxation or SOR (successive over-relaxation) method is given by x (j+1) i =^ x (j) i +^ ω aii

bi − ∑^ i−^1 k= aikx (j+1) k − ∑^ n k=i aikx (j) k

, 1 ≤ i ≤ n. Again, we have to deal with the question on how to choose the relaxation parameter. Theorem 10.14 The spectral radius of the iteration matrix CG(ω) of SOR satisfies ρ(CG(ω)) ≥ |ω − 1 |. Hence, convergence is only possible if ω ∈ (0, 2). Proof: The iteration matrix CG(ω) can be written in the form CG(ω) = (I + ωD−^1 L)−^1 [(1 − ω)I − ωD−^1 R]. The first matrix in this product is a normalised lower triangular matrix and the second matrix is an upper triangular matrix with diagonal entries all equal to 1 − ω. Since the determinant of a matrix equals the product of its eigenvalues, we have | 1 − ω|n^ = | det CG(ω)| ≤ ρ(CG(ω))n, which gives the result.! We will now show that for a positive definite matrix ω ∈ (0, 2) is also sufficient for conver- gence. Since ω = 1 gives the classical Gauss-Seidel method, we also cover Theorem 10.10.

Theorem 10.15 Let A ∈ Rn×n^ be symmetric and positive definite. Then, the SOR method converges for every relaxation parameter ω ∈ (0, 2). Proof: We have to show that ρ(CG(ω)) < 1. To this end, we rewrite the iteration matrix CG(ω) in the form CG(ω) = (D + ωL)−^1 [D + ωL − ω(L + D + R)] = I − ω(D + ωL)−^1 A = I −

ω

D + L

A

= I − B−^1 A,

with B = (^) ω^1 D + L. Let λ ∈ C be an eigenvalue of CG(ω) with corresponding eigenvector x ∈ Cn, which we assume to be normalised by ‖x‖ 2 = 1. Then, we have CG(ω)x = (I − B−^1 A)x = λx or Ax = (1 − λ)Bx. Since A is positive definite, we must have λ (= 1 such that we can conclude 1 1 − λ

xT^ Bx xT^ Ax

Since A is symmetric, we can conclude that B + BT^ = ( (^) ω^2 − 1)D + A, such that the real part of 1/(1 − λ) satisfies )

1 − λ

xT^ (B + BT^ )x xT^ Ax

ω

xT^ Dx xT^ Ax

because, on account of ω ∈ (0, 2), the expression 2/ω−1 is positive, as well as xT^ Dx/xT^ Ax. The latter follows since the diagonal entries of a positive definite matrix have to be positive. If we write λ = u + iv then we can conclude that 1 2

1 − λ

1 − u (1 − u)^2 + v^2 and hence |λ|^2 = u^2 + v^2 < 1.! Example 10.16 Suppose we wish to solve the system Ax = b where A =

 ,^ b^ =

Note that A is symmetric and diagonally dominant.