Conjugate Gradient Methods
Optimization Problems
- The solution of many equations can be formulated as an optimization problem, e.g., density functional theory in electronic structure or the conformation of proteins.
- Minimization with constraints – operations research (linear programming, optimal conditions in management science, traveling salesman problem, etc.)
Local & Global Extremum
Bracketing and Search in 1D
Bracketing a minimum means that, for given a < b < c, we have f(b) < f(a) and f(b) < f(c). For continuous f, there is then a minimum in the interval (a, c).
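The bracketing condition can be searched for numerically. The following is a minimal sketch, not a routine from the slides: the function name `bracket_minimum` and the parameters `grow` and `max_steps` are illustrative assumptions, and the naive downhill stepping can fail on ties or on functions without a minimum.

```python
def bracket_minimum(f, a, b, grow=2.0, max_steps=50):
    """Return (a, b, c) with a < b < c, f(b) < f(a) and f(b) < f(c)."""
    fa, fb = f(a), f(b)
    if fb > fa:
        # Swap so that the step from a to b goes downhill.
        a, b, fa, fb = b, a, fb, fa
    c = b + grow * (b - a)
    fc = f(c)
    for _ in range(max_steps):
        if fb < fa and fb < fc:
            # Bracket found; return it in increasing order.
            return (a, b, c) if a < c else (c, b, a)
        # Still going downhill: shift the triple and take a larger step.
        a, b, fa, fb = b, c, fb, fc
        c = b + grow * (b - a)
        fc = f(c)
    raise RuntimeError("no bracket found")
```

For example, `bracket_minimum(lambda x: (x - 3.0) ** 2, 0.0, 1.0)` walks downhill from 0 until the function starts to rise again.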
Golden Section Search
- Choose x in (a, b) such that the ratio of the interval lengths [a, b] to [b, c] is the same as [a, x] to [x, b]. Remove [a, x] if f(x) > f(b), or remove [b, c] if f(x) < f(b).
- The asymptotic limit of the ratio is the golden mean, $(\sqrt{5} - 1)/2 \approx 0.618$.
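The search can be sketched as follows. This is a common two-interior-point variant of golden section search rather than the exact (a, b, c) bookkeeping of the slide; the names `GOLD` and `golden_section_search` and the tolerance `tol` are assumptions made for illustration, and f is assumed to have a single minimum in [a, c].

```python
import math

GOLD = (math.sqrt(5) - 1) / 2   # golden mean, about 0.618

def golden_section_search(f, a, c, tol=1e-8):
    """Shrink [a, c] around the minimum of a unimodal f; return the midpoint."""
    x1 = c - GOLD * (c - a)
    x2 = a + GOLD * (c - a)
    f1, f2 = f(x1), f(x2)
    while c - a > tol:
        if f1 < f2:
            # Minimum lies in [a, x2]: drop (x2, c] and reuse x1 as the new x2.
            c, x2, f2 = x2, x1, f1
            x1 = c - GOLD * (c - a)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, c]: drop [a, x1) and reuse x2 as the new x1.
            a, x1, f1 = x1, x2, f2
            x2 = a + GOLD * (c - a)
            f2 = f(x2)
    return 0.5 * (a + c)
```

Each pass shrinks the interval by the factor 0.618 and needs only one new function evaluation, because one interior point is reused.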
Parabolic Interpolation & Brent’s Method
$$x = b - \frac{1}{2}\,\frac{(b-a)^2\,[f(b)-f(c)] - (b-c)^2\,[f(b)-f(a)]}{(b-a)\,[f(b)-f(c)] - (b-c)\,[f(b)-f(a)]}$$
Brent’s method combines parabolic interpolation with golden section search, with some complicated bookkeeping. See NR, pages 404-405, for details.
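A single parabolic interpolation step, written out as a minimal sketch. It assumes the standard three-point formula for the abscissa of the minimum of the parabola through (a, f(a)), (b, f(b)), (c, f(c)); the function name `parabolic_step` is illustrative, and none of the safeguards of Brent's method are included.

```python
def parabolic_step(a, b, c, fa, fb, fc):
    """Abscissa of the extremum of the parabola through three points."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        # The three points lie on a straight line: no parabola extremum.
        raise ZeroDivisionError("points are collinear")
    return b - 0.5 * num / den
```

For an exactly quadratic f the step lands on the minimum in one move, which is why the method converges quickly near a smooth minimum.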
Local Properties near Minimum
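- Let P be some point of interest, taken as the origin x = 0. Taylor expansion gives
$$f(\mathbf{x}) = f(P) + \sum_i \frac{\partial f}{\partial x_i}\, x_i + \frac{1}{2} \sum_{i,j} \frac{\partial^2 f}{\partial x_i\, \partial x_j}\, x_i x_j + \cdots \approx c - \mathbf{b}^T \mathbf{x} + \frac{1}{2}\, \mathbf{x}^T A\, \mathbf{x}$$
with the derivatives evaluated at P.
- Minimizing f is the same as solving the equation
$$A \mathbf{x} = \mathbf{b}$$
Here the superscript T denotes the transpose of a matrix.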
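The step from the quadratic form to the linear system can be made explicit. The short derivation below assumes that A is symmetric (as a matrix of second derivatives is) and positive definite, so that the stationary point is a minimum.

```latex
f(\mathbf{x}) \approx c - \mathbf{b}^T \mathbf{x} + \tfrac{1}{2}\, \mathbf{x}^T A\, \mathbf{x}
\quad\Longrightarrow\quad
\nabla f(\mathbf{x}) = A\mathbf{x} - \mathbf{b},
\qquad
\nabla f(\mathbf{x}) = 0 \iff A\mathbf{x} = \mathbf{b}.
```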
Search along Coordinate Directions
Search for the minimum along the x direction, then along the y direction, and so on. Such a method can take a very large number of steps to converge. The curved loops represent contours f(x, y) = const.
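The slow convergence can be seen on a quadratic with a narrow, tilted valley. The sketch below is an illustration under stated assumptions: f is taken to be 0.5 x^T A x - b^T x with A symmetric positive definite, each 1D search is done exactly, and the name `coordinate_search` and its parameters are hypothetical.

```python
def coordinate_search(A, b, x, tol=1e-10, max_sweeps=10000):
    """Minimize 0.5*x^T A x - b^T x by exact 1D minimization along each axis in turn."""
    n = len(b)
    x = list(x)
    for sweep in range(1, max_sweeps + 1):
        largest_move = 0.0
        for i in range(n):
            # Exact minimizer along axis i with the other coordinates held fixed.
            rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - rest) / A[i][i]
            largest_move = max(largest_move, abs(new_xi - x[i]))
            x[i] = new_xi
        if largest_move < tol:
            return x, sweep
    return x, max_sweeps
```

With A = [[1, 0.99], [0.99, 1]] the contours are long thin ellipses tilted against the axes, and the search needs several hundred sweeps; with the identity matrix it stops after two.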
Simulated Annealing
- To minimize f(x), we make random changes to x according to the following rules:
- Set T to a large value and decrease it as we go.
- Metropolis algorithm: make a local change from x to x’. If f decreases, accept the change; otherwise, accept it only with a small probability r = exp[−(f(x’) − f(x))/T]. This is done by comparing r with a random number 0 < ξ < 1 and accepting the change when ξ < r.
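The rules above can be sketched for a function of one variable. The cooling schedule (multiplying T by a constant factor each step), the uniform proposal, and all parameter values and names are assumptions chosen for illustration; the slides do not specify them.

```python
import math
import random

def simulated_annealing(f, x, T=10.0, cooling=0.999, steps=20000,
                        step_size=0.5, seed=0):
    """Minimize f starting from x; return the best point seen and its value."""
    rng = random.Random(seed)
    fx = f(x)
    best_x, best_f = x, fx
    for _ in range(steps):
        # Local random change x -> x'.
        x_new = x + rng.uniform(-step_size, step_size)
        f_new = f(x_new)
        # Metropolis rule: always accept a decrease; accept an increase
        # only if a random number in [0, 1) is below r = exp(-(f' - f)/T).
        if f_new <= fx or rng.random() < math.exp(-(f_new - fx) / T):
            x, fx = x_new, f_new
            if fx < best_f:
                best_x, best_f = x, fx
        T *= cooling   # decrease T as we go
    return best_x, best_f
```

At large T almost every move is accepted, so the walker can cross barriers between local minima; as T falls, uphill moves become rare and the walker settles into a low-lying minimum.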
Traveling Salesman Problem
Cities on the map: Singapore, Kuala Lumpur, Hong Kong, Taipei, Shanghai, Beijing, Tokyo
Find the shortest closed tour that visits each city exactly once.
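For a handful of cities the problem can be solved by trying every tour, as in the sketch below. The function names are illustrative, and the points in the usage note are made-up planar coordinates, not positions of the cities on the slide. With n cities there are (n-1)! orderings to check, so exhaustive search becomes impractical quickly, which is why heuristic methods such as simulated annealing are used instead.

```python
import itertools
import math

def tour_length(points, order):
    """Length of the closed tour visiting the points in the given order."""
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )

def shortest_tour(points):
    """Exhaustive search; city 0 is fixed as the start to skip rotations."""
    n = len(points)
    best = min(
        ((0,) + perm for perm in itertools.permutations(range(1, n))),
        key=lambda order: tour_length(points, order),
    )
    return best, tour_length(points, best)
```

For the four corners of a unit square, `shortest_tour([(0, 0), (1, 1), (1, 0), (0, 1)])` returns a tour that follows the perimeter.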
cgs
Conjugate Gradients Squared method
Syntax
x = cgs(A,b)
cgs(A,b,tol)
cgs(A,b,tol,maxit)
cgs(A,b,tol,maxit,M)
cgs(A,b,tol,maxit,M1,M2)
cgs(A,b,tol,maxit,M1,M2,x0)
cgs(afun,b,tol,maxit,m1fun,m2fun,x0,p1,p2,...)
[x,flag] = cgs(A,b,...)
[x,flag,relres] = cgs(A,b,...)
[x,flag,relres,iter] = cgs(A,b,...)
[x,flag,relres,iter,resvec] = cgs(A,b,...)
Conjugate Gradient Method
- Introduction, Notation and Basic Terms
- Eigenvalues and Eigenvectors
- The Method of Steepest Descent
- Convergence Analysis of the Method of Steepest Descent
- The Method of Conjugate Directions
- The Method of Conjugate Gradients
- Convergence Analysis of Conjugate Gradient Method
- Complexity of the Conjugate Gradient Method
- Preconditioning Techniques
- Conjugate Gradient Type Algorithms for Non-Symmetric Matrices (CGS, Bi-CGSTAB Method)
1. Introduction, Notation and Basic Terms
- The conjugate gradient (CG) method is one of the most popular iterative methods for solving large systems of linear equations Ax = b, which arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, circuit analysis, etc.
- It is suited for use with sparse matrices. If A is dense, the best choice is to factor A and solve the equation by backsubstitution.
- There is a fundamental underlying structure for almost all descent algorithms: (1) one starts with an initial point; (2) determines, according to a fixed rule, a direction of movement; (3) moves in that direction to a relative minimum of the objective function; (4) at the new point, a new direction is determined and the process is repeated. The algorithms differ in the rule by which successive directions of movement are selected.
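The four steps can be sketched for the quadratic objective f(x) = 0.5 x^T A x - b^T x, whose minimum solves Ax = b. The sketch uses the steepest descent rule (move along the negative gradient r = b - Ax) for step (2) as one possible choice; it is not the conjugate gradient rule. A is assumed symmetric positive definite, and the function name and parameters are illustrative.

```python
def steepest_descent(A, b, x, tol=1e-10, max_iter=10000):
    """Solve A x = b for symmetric positive definite A by steepest descent."""
    n = len(b)
    x = list(x)                                        # (1) initial point
    for it in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        r = [b[i] - Ax[i] for i in range(n)]           # (2) direction r = b - A x = -grad f
        rr = sum(ri * ri for ri in r)
        if rr ** 0.5 < tol:
            return x, it                               # residual small enough: done
        Ar = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(r[i] * Ar[i] for i in range(n))   # (3) exact minimum along r
        x = [x[i] + alpha * r[i] for i in range(n)]        # (4) new point, then repeat
    return x, max_iter
```

Replacing only the direction rule in step (2) while keeping the rest of the loop is what distinguishes one descent algorithm from another.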
1. Introduction, Notation and Basic Terms (Cont’d)
- A matrix is a rectangular array of numbers, called elements.
- The transpose of an m × n matrix A is the n × m matrix $A^T$ with elements $a^T_{ij} = a_{ji}$.
- Two square n × n matrices A and B are similar if there is a nonsingular matrix S such that $B = S^{-1} A S$.
- Matrices having a single row are called row vectors; matrices having a single column are called column vectors. Row vector: $a = [a_1, a_2, \ldots, a_n]$. Column vector: $a = (a_1, a_2, \ldots, a_n)$.
- The inner product of two vectors is written as
$$x^T y = \sum_{i=1}^{n} x_i y_i$$
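The transpose and the inner product translate directly into code. This is a minimal sketch using plain Python lists, with matrices stored as lists of rows; the helper names `transpose` and `inner` are illustrative.

```python
def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def inner(x, y):
    """Inner product x^T y = sum_i x_i * y_i of two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))
```

For example, the transpose of the 2 × 3 matrix [[1, 2, 3], [4, 5, 6]] is the 3 × 2 matrix [[1, 4], [2, 5], [3, 6]], and the inner product of [1, 2, 3] and [4, 5, 6] is 1·4 + 2·5 + 3·6 = 32.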