Conjugate Gradient Method-Optimization Methods-Lecture Slides, Slides of Convex Optimization

Dr. Hanumant Chawd delivered this lecture at Alagappa University for Convex Optimization course. Its main points are: Conjugate, Gradient, Method, Krylov, Sequence, Linear, Systems, Analysis, Preconditioning, Sparse, Direct

Typology: Slides

2011/2012

Uploaded on 07/15/2012

saginala
saginala 🇮🇳

4.5

(2)

80 documents

1 / 30

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Conjugate Gradient Method
direct and indirect methods
positive definite linear systems
Krylov sequence
spectral analysis of Krylov sequence
preconditioning
Prof. S. Boyd, EE364b, Stanford University
docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e

Partial preview of the text

Download Conjugate Gradient Method-Optimization Methods-Lecture Slides and more Slides Convex Optimization in PDF only on Docsity!

Conjugate Gradient Method

direct and indirect methods

positive definite linear systems

Krylov sequence

spectral analysis of Krylov sequence

preconditioning

Prof. S. Boyd, EE364b, Stanford University

docsity.com

Three classes of methods for linear equations

methods to solve linear system

Ax

b

A

R

n

×

n

dense direct

(factor-solve methods)

runtime depends only on size; independent of data, structure, orsparsity

work well for

n

up to a few thousand

sparse direct

(factor-solve methods)

runtime depends on size, sparsity pattern; (almost) independent ofdata

can work well for

n

up to

4

or

5

(or more)

requires good heuristic for ordering

Prof. S. Boyd, EE364b, Stanford University

1 docsity.com

Symmetric positive definite linear systems

SPD system of equations

Ax

b,

A

R

n

×

n

A

A

T

examples

Newton/interior-point search direction:

2

φ

x

x

φ

x

least-squares normal equations:

A

T

A

x

A

T

b

regularized least-squares:

A

T

A

μI

x

A

T

b

minimization of convex quadratic function

x

T

Ax

b

T

x

solving (discretized) elliptic PDE (

e.g.

, Poisson equation)

Prof. S. Boyd, EE364b, Stanford University

3 docsity.com

analysis of resistor circuit:

Gv

i

v

is node voltage (vector),

i

is (given) source current

G

is circuit conductance matrix

G

ij

total conductance incident on node

i

i

j

conductance between nodes

i

and

j

i

j

Prof. S. Boyd, EE364b, Stanford University

4 docsity.com

Solution and error

x

A

1

b

is solution

x

minimizes (convex function)

f

x

x

T

Ax

b

T

x

f

x

Ax

b

is gradient of

f

with

f

f

x

, we have

f

x

f

x

T

Ax

b

T

x

x

⋆T

Ax

b

T

x

x

x

T

A

x

x

x

x

2 A

i.e.

f

x

f

is half of squared

A

-norm of error

x

x

Prof. S. Boyd, EE364b, Stanford University

6 docsity.com

a relative measure (comparing

x

to

τ

f

x

f

f

f

x

x

2 A

x

2 A

(fraction of maximum possible reduction in

f

, compared to

x

Prof. S. Boyd, EE364b, Stanford University

7 docsity.com

Krylov subspace

(a.k.a. controllability subspace)

K

k

span

b, Ab,... , A

k

1

b

p

A

b

p

polynomial

deg

p < k

we define the

Krylov sequence

x

(1)

, x

(2)

as

x

(

k

)

= argmin

x

∈K

k

f

x

) = argmin

x

∈K

k

x

x

2 A

the CG algorithm (among others) generates the Krylov sequence Prof. S. Boyd, EE364b, Stanford University

9 docsity.com

Properties of Krylov sequence

f

x

(

k

+1)

f

x

(

k

)

(but

r

can increase)

x

(

n

)

x

i.e.

x

∈ K

n

even when

K

n

R

n

x

(

k

)

p

k

A

b

, where

p

k

is a polynomial with

deg

p

k

< k

less obvious: there is a

two-term recurrence

x

(

k

+1)

x

(

k

)

α

k

r

(

k

)

β

k

x

(

k

)

x

(

k

for some

α

k

β

k

(basis of CG algorithm)

Prof. S. Boyd, EE364b, Stanford University

10 docsity.com

Spectral analysis of Krylov sequence

A

Q

Q

T

Q

orthogonal,

diag

λ

1

,... , λ

n

define

y

Q

T

x

¯b

Q

T

b

y

Q

T

x

in terms of

y

, we have f

x

f

y

x

T

Q

Q

T

x

b

T

QQ

T

x

y

T

y

¯b

T

y

n

∑^ i

=

λ

i

y

(^2) i

¯b

i

y

i

so

y

⋆i

¯b

i

i

f

n i

=

¯b

2 i

i

Prof. S. Boyd, EE364b, Stanford University

12 docsity.com

Krylov sequence in terms of

y

y

(

k

)

= argmin

y

¯ K

k

f

y

K

k

= span

¯b,

¯b,... ,

k

1

¯b

y

(

k

)

i

p

k

λ

i

¯b

i

deg

p

k

< k

p

k

= argmin

deg

p<k

n

∑^ i

=

¯b

2 i

λ

i

p

λ

i

2

p

λ

i

Prof. S. Boyd, EE364b, Stanford University

13 docsity.com

τ

k

min

deg

q

k, q

(0)=

n i

=

¯y

2

i

λ

i

q

λ

i

2

n i

=

y

2

i

λ

i

min

deg

q

k, q

(0)=

max

i

=

,...,n

q

λ

i

2

if there is a polynomial

q

of degree

k

, with

q

, that is small on

the spectrum of

A

, then

f

x

(

k

)

f

is small

if eigenvalues are clustered in

k

groups, then

y

(

k

)

is a good approximate

solution

if solution

x

is approximately a linear combination of

k

eigenvectors of

A

, then

y

(

k

)

is a good approximate solution

Prof. S. Boyd, EE364b, Stanford University

15 docsity.com

A bound on convergence rate

taking

q

as Chebyshev polynomial of degree

k

, that is small on interval

[

λ

min

, λ

max

]

, we get

τ

k

κ

κ

k

κ

λ

max

min

convergence can be much faster than this, if spectrum of

A

is spread

but clustered

Prof. S. Boyd, EE364b, Stanford University

16 docsity.com

Convergence

0

1

2

3

4

5

6

7

0

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.

1

k

τ

k

Prof. S. Boyd, EE364b, Stanford University

18 docsity.com

Residual convergence

0

1

2

3

4

5

6

7

0

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.

1

k

η

k

Prof. S. Boyd, EE364b, Stanford University

19 docsity.com