











Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Notes; Subject: Mathematics; University: San Diego State University; Term: Fall 2009;
Typology: Study notes
1 / 19
This page cannot be seen from the preview
Don't miss anything!












Recap Trust-Region Newton
Department of Mathematics and Statistics Dynamical Systems Group Computational Sciences Research Center San Diego State University San Diego, CA 92182- http://terminus.sdsu.edu/
Recap Trust-Region Newton
(^1) Recap Hessian Modifications Trust Region Algorithm
(^2) Trust-Region Newton Newton-Dogleg Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
Recap Trust-Region Newton Hessian Modifications Trust Region Algorithm
[ 1] Set k = 1, ∆b > 0 , ∆ 0 ∈ (0, ∆)b, and η ∈ [0, 14 ] [ 2] While optimality condition not satisfied [ 3] Get ¯pk (approximate solution, Today’s Discussion) [ 4] Evaluate ρk [ 5] if ρk < (^14) [ 6] ∆k+1 = 14 ∆k [ 7] else [ 8] if ρk > 34 and ‖¯pk ‖ = ∆k [ 9] ∆k+1 = min(2∆k , ∆)b [10] else [11] ∆k+1 = ∆k [12] endif [13] endif [14] if ρk > η [15] ¯xk+1 = ¯xk + ¯pk [16] else [17] ¯xk+1 = ¯xk [18] endif [19] k = k + 1 [20] End-While
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
Trust-region methods do not require that the model Hessian is positive definite. It is possible to use the exact Hessian Bk = ∇^2 f (¯xk ) directly and find the search direction ¯pk by solving the trust-region subproblem
min p ¯∈Rn^
f (¯xk ) + ∇f (¯xk )T^ ¯p +
¯pT^ Bk ¯p, ‖p¯‖ ≤ ∆k
Some of the techniques we discussed, e.g. dogleg, require that Bk is positive definite. We have seen quite few ideas floating around, lets review what we have seen in the context of our methods: (i) the dogleg method, (ii) 2D-subspace minimization, (iii) nearly exact solution, and (iv) the CG method.
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
In order to make the dogleg method work for general (non-positive definite) Bk s we can use the Hessian modification from last time to replace Bk → (Bk + Ek ) ︸ ︷︷ ︸ Pos.Def and use this matrix in the dogleg solution. There is a price to pay. When the matrix Bk is modified, the importance of different directions are changed in different ways. This may negatively impact the benefits of the trust-region approach. Modifications of the type Ek = τ I behave somewhat more predictably than modifications of the type Ek = diag(τ 1 , τ 2 ,... , τn). Usage of the dogleg method for non-convex problems is somewhat dicey, and even though it may work it is not the preferred method.
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
¯p^ min∈Rn^ f^ (¯xk^ ) +^ ∇f^ (¯xk^ )T^ ¯p^ +
1 2 ¯pT^ Bk ¯p, ‖¯p‖ ≤ ∆k , p¯ ∈ span(∇f (¯xk ), ¯pB^ )
min ¯p∈Rn^ f (¯xk ) + ∇f (¯xk )T^ ¯p +^1 2 ¯pT^ B˜k ¯p, ‖¯p‖ ≤ ∆k , p¯ ∈ span(∇f (¯xk ), ¯pB˜^ )
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
The trust-region subproblem
min p ¯∈Rn^ f (¯xk ) + ∇f (¯xk )T^ ¯p +
¯pT^ Bk ¯p, ‖p¯‖ ≤ ∆k
can be solved using the Conjugate Gradient (CG) method, with two additional termination criteria (one of which we have seen already). For each subproblem we must solve
Bk ¯pk = −∇f (¯xk )
We apply CG with the following stopping criteria (standard) The system has been solved to desired accuracy. (previous) Negative curvature encountered. (new) Size of the approximate solution exceeds the trust-region radius.
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
In the case of negative curvature we follow the direction to the boundary of the trust region; we get Steihaug’s Method
Algorithm: CG-Steihaug Given ǫ > 0 ; set ¯p 0 = 0, ¯r 0 = ∇f (¯xk ), ¯d 0 = −¯r 0 if( ‖¯r 0 ‖ < ǫ ) return(¯p 0 ) while( TRUE ) if( ¯dTj B¯dj ≤ 0 ) % Negative Curvature Find τ ≥ 0 such that ¯p = ¯pj + τ ¯dj satisfies ‖¯p‖ = ∆ return(¯p) endif αj = ¯rTj ¯rj /¯dTj B¯dj , ¯pj+1 = ¯pj + αj ¯dj if( ‖¯pj+1‖ ≥ ∆ ) % Step outside trust region Find τ ≥ 0 such that ¯p = ¯pj + τ ¯dj satisfies ‖¯p‖ = ∆ return(¯p) endif ¯rj+1 = ¯rj + αj B¯dj if( ‖¯rj+1‖ ≤ ǫ‖¯r 0 ‖ ) return(¯pj+1) βj+1 = ¯rTj+1¯rj+1/¯rTj ¯rj , d¯j+1 = −¯rj+1 + βj+1d¯j end-while
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
As we have seen in other (very similar) settings, adding preconditioning to the CG-solver can cut the number of iterations quite drastically. It would seem like a good (and natural) idea to add preconditioning to the Trust-Region Newton-CG scheme. We have to be a little careful... For the standard CG-Steihaug, the following is true
Theorem The sequence of vectors generated by CG -Steihaug satisfies
0 = ‖¯p 0 ‖ 2 < ‖¯p 1 ‖ 2 < · · · < ‖¯pj ‖ 2 < ‖¯pj+1‖ 2 <... ‖¯p‖ 2 ≤ ∆
This does not hold for preconditioned PCG(M)-Steihaug. This means that the sequence can leave the trust region, and then come back!
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
As usual, we never make this change of variables explicitly. Instead the CG-Steihaug algorithm is modified so that the wherever we have a multiplication by D−^1 or D−T^ we solve the appropriate linear system. Note, if D−T^ Bk D−^1 = I the preconditioning is perfect. Usually
D−T^ Bk D−^1 = I + E
and if we multiply by DT^ from the left and D from the right we see
Bk = D︸ ︷︷T^ D ︸ M
R
So that M ≈ Bk , and R captures the “inexactness” of the preconditioning.
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
We can get a good general-purpose preconditioner by using a variant of the Cholesky factorization, LLT^ = Bk. We have discussed two ideas in connection with the Cholesky factorization — last time, we talked about how to modify it to get an approximate factorization of an indefinite matrix, i.e.
choldecomp(Bk ) = cholesky(Bk + diag(τ 1 , τ 2 ,... , τn)) modelhess(Bk ) = cholesky(Bk + λI )
We have also (in general terms) talked about the incomplete Cholesky factorization, which preserves the sparsity pattern of Bk by not allowing fill-ins. Putting the two together we get something like the algorithm on the next slide... (do not implement this one!)
Recap Trust-Region Newton Newton-2D-Subspace-Minimization Newton-Nearly Exact Solution Trust-Region Newton-(P)CG
We have looked at Newton methods (with quadratic convergence, if and only if we implement and solve all the subproblems in the right way) for both the linesearch and trust-region approach, and have developed quite a powerful framework of algorithms that are suitable and quite stable for large problems. Are we done??? — Not quite! We have 4 main topics left