Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Lecture Notes on Numerical Optimization | MATH 693A, Study notes of Mathematics

San Diego State University (SDSU)Mathematics

Material Type: Notes; Subject: Mathematics; University: San Diego State University; Term: Fall 2009;

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-f91 🇺🇸

8 documents

1 / 19

This page cannot be seen from the preview

Don't miss anything!

Recap

Trust-Region Newton

Numerical Optimization

Lecture Notes #15

Practical Newton Methods — Trust-Region Newton Methods

Peter Blomgren,

h[email protected]i

Department of Mathematics and Statistics

Dynamical Systems Group

Computational Sciences Research Center

San Diego State University

San Diego, CA 92182-7720

http://terminus.sdsu.edu/

Fall 2009

Peter Blomgren, h[email protected]iTrust-Region Newton Methods — (1/19)

Discover Study notes of Mathematics San Diego State University (SDSU)

Partial preview of the text

Download Lecture Notes on Numerical Optimization | MATH 693A and more Study notes Mathematics in PDF only on Docsity!

Recap Trust-Region Newton

Numerical Optimization

Lecture Notes

Practical Newton Methods — Trust-Region Newton Methods

Peter Blomgren,

〈[email protected]〉

Department of Mathematics and Statistics Dynamical Systems Group Computational Sciences Research Center San Diego State University San Diego, CA 92182- http://terminus.sdsu.edu/

Fall 2009

Recap Trust-Region Newton

Outline

(^1) Recap Hessian Modifications Trust Region Algorithm

(^2) Trust-Region Newton Newton-Dogleg Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Recap Trust-Region Newton Hessian Modifications Trust Region Algorithm

Recall: The Trust Region Algorithm

Algorithm: Trust Region

[ 1] Set k = 1, ∆b > 0 , ∆ 0 ∈ (0, ∆)b, and η ∈ [0, 14 ] [ 2] While optimality condition not satisfied [ 3] Get ¯pk (approximate solution, Today’s Discussion) [ 4] Evaluate ρk [ 5] if ρk < (^14) [ 6] ∆k+1 = 14 ∆k [ 7] else [ 8] if ρk > 34 and ‖¯pk ‖ = ∆k [ 9] ∆k+1 = min(2∆k , ∆)b [10] else [11] ∆k+1 = ∆k [12] endif [13] endif [14] if ρk > η [15] ¯xk+1 = ¯xk + ¯pk [16] else [17] ¯xk+1 = ¯xk [18] endif [19] k = k + 1 [20] End-While

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Methods: Bk not Positive Definite is OK(?)

Trust-region methods do not require that the model Hessian is positive definite. It is possible to use the exact Hessian Bk = ∇^2 f (¯xk ) directly and find the search direction ¯pk by solving the trust-region subproblem

min p ¯∈Rn^

f (¯xk ) + ∇f (¯xk )T^ ¯p +

¯pT^ Bk ¯p, ‖p¯‖ ≤ ∆k

Some of the techniques we discussed, e.g. dogleg, require that Bk is positive definite. We have seen quite few ideas floating around, lets review what we have seen in the context of our methods: (i) the dogleg method, (ii) 2D-subspace minimization, (iii) nearly exact solution, and (iv) the CG method.

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Newton-Dogleg 2 of 2

In order to make the dogleg method work for general (non-positive definite) Bk s we can use the Hessian modification from last time to replace Bk → (Bk + Ek ) ︸︷︷︸ Pos.Def and use this matrix in the dogleg solution. There is a price to pay. When the matrix Bk is modified, the importance of different directions are changed in different ways. This may negatively impact the benefits of the trust-region approach. Modifications of the type Ek = τ I behave somewhat more predictably than modifications of the type Ek = diag(τ 1 , τ 2 ,... , τn). Usage of the dogleg method for non-convex problems is somewhat dicey, and even though it may work it is not the preferred method.

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Newton-2D-Subspace-Minimization

In much the same way we modified the dogleg method, we can

adapt the 2D-subspace minimization subproblem to work in the

case of indefinite Bk

¯p^ min∈Rn^ f^ (¯xk^ ) +^ ∇f^ (¯xk^ )T^ ¯p^ +

1 2 ¯pT^ Bk ¯p, ‖¯p‖ ≤ ∆k , p¯ ∈ span(∇f (¯xk ), ¯pB^ )

can be applied when Bk is positive definite, and with a modified

B˜k = (Bk + Ek ) which is positive definite in the case when Bk is

not positive definite:

min ¯p∈Rn^ f (¯xk ) + ∇f (¯xk )T^ ¯p +^1 2 ¯pT^ B˜k ¯p, ‖¯p‖ ≤ ∆k , p¯ ∈ span(∇f (¯xk ), ¯pB˜^ )

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-CG 1 of 4

The trust-region subproblem

min p ¯∈Rn^ f (¯xk ) + ∇f (¯xk )T^ ¯p +

¯pT^ Bk ¯p, ‖p¯‖ ≤ ∆k

can be solved using the Conjugate Gradient (CG) method, with two additional termination criteria (one of which we have seen already). For each subproblem we must solve

Bk ¯pk = −∇f (¯xk )

We apply CG with the following stopping criteria (standard) The system has been solved to desired accuracy. (previous) Negative curvature encountered. (new) Size of the approximate solution exceeds the trust-region radius.

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-CG 2 of 4

In the case of negative curvature we follow the direction to the boundary of the trust region; we get Steihaug’s Method

Algorithm: CG-Steihaug Given ǫ > 0 ; set ¯p 0 = 0, ¯r 0 = ∇f (¯xk ), ¯d 0 = −¯r 0 if( ‖¯r 0 ‖ < ǫ ) return(¯p 0 ) while( TRUE ) if( ¯dTj B¯dj ≤ 0 ) % Negative Curvature Find τ ≥ 0 such that ¯p = ¯pj + τ ¯dj satisfies ‖¯p‖ = ∆ return(¯p) endif αj = ¯rTj ¯rj /¯dTj B¯dj , ¯pj+1 = ¯pj + αj ¯dj if( ‖¯pj+1‖ ≥ ∆ ) % Step outside trust region Find τ ≥ 0 such that ¯p = ¯pj + τ ¯dj satisfies ‖¯p‖ = ∆ return(¯p) endif ¯rj+1 = ¯rj + αj B¯dj if( ‖¯rj+1‖ ≤ ǫ‖¯r 0 ‖ ) return(¯pj+1) βj+1 = ¯rTj+1¯rj+1/¯rTj ¯rj , d¯j+1 = −¯rj+1 + βj+1d¯j end-while

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-CG 4 of 4

Less-than-good properties of TR-Newton-CG: Any direction of

negative curvature is accepted — the accepted direction can give

an insignificant reduction in the model.

There is an extension of CG known as Lanczos method, and it is

possible to build a TR-Newton-Lanczos algorithm which does not

terminate when encountering the first direction of curvature, but

continues to search for a direction of sufficient negative curvature.

TR-Newton-Lanczos is more robust, but comes at a cost of a more

expensive solution of the subproblem.

We leave the discussion of the Lanczos algorithm to Math 643 (to

be offered in ∼Spring 2049).

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-PCG(M) 1 of 5

As we have seen in other (very similar) settings, adding preconditioning to the CG-solver can cut the number of iterations quite drastically. It would seem like a good (and natural) idea to add preconditioning to the Trust-Region Newton-CG scheme. We have to be a little careful... For the standard CG-Steihaug, the following is true

Theorem The sequence of vectors generated by CG -Steihaug satisfies

0 = ‖¯p 0 ‖ 2 < ‖¯p 1 ‖ 2 < · · · < ‖¯pj ‖ 2 < ‖¯pj+1‖ 2 <... ‖¯p‖ 2 ≤ ∆

This does not hold for preconditioned PCG(M)-Steihaug. This means that the sequence can leave the trust region, and then come back!

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-PCG(M) 3 of 5

As usual, we never make this change of variables explicitly. Instead the CG-Steihaug algorithm is modified so that the wherever we have a multiplication by D−^1 or D−T^ we solve the appropriate linear system. Note, if D−T^ Bk D−^1 = I the preconditioning is perfect. Usually

D−T^ Bk D−^1 = I + E

and if we multiply by DT^ from the left and D from the right we see

Bk = D︸︷︷T^ D ︸ M

+ D︸ T︷︷^ RD ︸

So that M ≈ Bk , and R captures the “inexactness” of the preconditioning.

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Trust-Region Newton-PCG(M) 4 of 5

We can get a good general-purpose preconditioner by using a variant of the Cholesky factorization, LLT^ = Bk. We have discussed two ideas in connection with the Cholesky factorization — last time, we talked about how to modify it to get an approximate factorization of an indefinite matrix, i.e.

[L, LT^ ] =

choldecomp(Bk ) = cholesky(Bk + diag(τ 1 , τ 2 ,... , τn)) modelhess(Bk ) = cholesky(Bk + λI )

We have also (in general terms) talked about the incomplete Cholesky factorization, which preserves the sparsity pattern of Bk by not allowing fill-ins. Putting the two together we get something like the algorithm on the next slide... (do not implement this one!)

Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG

Comments

We have looked at Newton methods (with quadratic convergence, if and only if we implement and solve all the subproblems in the right way) for both the linesearch and trust-region approach, and have developed quite a powerful framework of algorithms that are suitable and quite stable for large problems. Are we done??? — Not quite! We have 4 main topics left

Estimation of derivatives — how to proceed if the gradient and/or the Hessian is not available in analytic form.
Quasi-Newton methods — how to proceed if the Hessian is not available (too expensive).
Application to Nonlinear Least Squares problems.
Application to Nonlinear Equations. — If we can minimize, we can also solve ¯F(¯x) = ¯ 0.

Lecture Notes on Numerical Optimization | MATH 693A, Study notes of Mathematics

Related documents

Partial preview of the text

Download Lecture Notes on Numerical Optimization | MATH 693A and more Study notes Mathematics in PDF only on Docsity!

Numerical Optimization

Lecture Notes

Practical Newton Methods — Trust-Region Newton Methods

Peter Blomgren,

〈[email protected]〉

Fall 2009

Outline

Recall: The Trust Region Algorithm

Algorithm: Trust Region

Trust-Region Methods: Bk not Positive Definite is OK(?)

Newton-Dogleg 2 of 2

Newton-2D-Subspace-Minimization

In much the same way we modified the dogleg method, we can

adapt the 2D-subspace minimization subproblem to work in the

case of indefinite Bk

can be applied when Bk is positive definite, and with a modified

B˜k = (Bk + Ek ) which is positive definite in the case when Bk is

not positive definite:

Trust-Region Newton-CG 1 of 4

Trust-Region Newton-CG 2 of 4

Trust-Region Newton-CG 4 of 4

Less-than-good properties of TR-Newton-CG: Any direction of

negative curvature is accepted — the accepted direction can give

an insignificant reduction in the model.

There is an extension of CG known as Lanczos method, and it is

possible to build a TR-Newton-Lanczos algorithm which does not

terminate when encountering the first direction of curvature, but

continues to search for a direction of sufficient negative curvature.

TR-Newton-Lanczos is more robust, but comes at a cost of a more

expensive solution of the subproblem.

We leave the discussion of the Lanczos algorithm to Math 643 (to

be offered in ∼Spring 2049).

Trust-Region Newton-PCG(M) 1 of 5

Trust-Region Newton-PCG(M) 3 of 5

+ D︸ T︷︷^ RD ︸

Trust-Region Newton-PCG(M) 4 of 5

[L, LT^ ] =

Comments