


Augmented Lagrangian Method, Merit Function, Algorithm



RAPHAEL HAUSER MATHEMATICAL INSTITUTE, UNIVERSITY OF OXFORD
$$
\text{(NLP)} \qquad \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad g_{\mathcal{E}}(x) = 0, \quad g_{\mathcal{I}}(x) \ge 0,
$$
the iterates $x_k$ still converge to $x^*$ if the LICQ and the second-order sufficient optimality conditions hold at this point. In fact, $\mu$ can even be held constant after a while and the convergence of $x_k$ continues!
1.1. Motivation. The method is motivated by the observation that if we knew the Lagrange multipliers $\lambda^*$ such that $(x^*, \lambda^*)$ is a KKT point for (NLP), then we could find $x^*$ by solving the unconstrained problem
$$
\min_{x \in \mathbb{R}^n} L(x, \lambda^*). \tag{1.1}
$$
Indeed, as already remarked in Lemma 1.2 i) of Lecture 12, the first set of KKT conditions $\nabla_x L(x^*, \lambda^*) = 0$ amounts to the first-order necessary optimality conditions for (1.1). Of course, $\lambda^*$ is not known, but we know from Lecture 13 that one can obtain estimates $\lambda^{[k]}$ which can be used to set up the problem
$$
\min_{x \in \mathbb{R}^n} L(x, \lambda^{[k]})
$$
as an approximation of (1.1). If the estimates $\lambda^{[k]}$ can be iteratively improved and made to converge to $\lambda^*$, then this can form the basis of an algorithmic framework for solving (NLP).
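As a small illustration (a toy problem chosen here, not taken from the lecture), consider minimising $f(x) = x_1^2 + x_2^2$ subject to the single equality constraint $g(x) = x_1 + x_2 - 1 = 0$. With the convention $L(x, \lambda) = f(x) - \lambda g(x)$ used in these notes, the KKT conditions give $x^* = (1/2, 1/2)$ and $\lambda^* = 1$, and the unconstrained problem (1.1) recovers $x^*$:

$$
L(x, \lambda^*) = x_1^2 + x_2^2 - (x_1 + x_2 - 1), \qquad \nabla_x L(x, \lambda^*) = \begin{pmatrix} 2x_1 - 1 \\ 2x_2 - 1 \end{pmatrix} = 0 \iff x = (1/2, 1/2) = x^*.
$$

In this example $L(\cdot, \lambda^*)$ happens to be strictly convex, so the stationary point is also its global minimiser.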
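1.2. The Merit Function. The merit function used by this algorithm is the augmented Lagrangian of (NLP), defined as follows,
$$
\begin{aligned}
L_A(x, \lambda, \mu) &= L(x, \lambda) + \frac{1}{2\mu} \sum_{i \in \mathcal{I} \cup \mathcal{E}} \tilde g_i^2(x) \\
&= f(x) - \sum_{i \in \mathcal{I} \cup \mathcal{E}} \lambda_i g_i(x) + \sum_{i \in \mathcal{I} \cup \mathcal{E}} \frac{\tilde g_i(x)}{2\mu} g_i(x) \\
&= f(x) + \sum_{i \in \mathcal{I} \cup \mathcal{E}} \left( \frac{\tilde g_i(x)}{2\mu} - \lambda_i \right) g_i(x),
\end{aligned}
$$
where $\tilde g_i$ is defined as in Lecture 13,
$$
\tilde g_i(x) = \begin{cases} g_i(x) & (i \in \mathcal{E}), \\ \min(g_i(x), 0) & (i \in \mathcal{I}). \end{cases}
$$
$L_A$ is thus nothing else but the Lagrangian "augmented" by the quadratic penalty term introduced in Lecture 13, ensuring that $x$ becomes gradually more feasible as the homotopy parameter $\mu$ is reduced.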
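The second equality in the definition of $L_A$ replaces $\tilde g_i^2(x)$ by $\tilde g_i(x) g_i(x)$. A short check of why this is legitimate, written out here as a reading aid rather than quoted from the lecture:

$$
\tilde g_i(x)\, g_i(x) =
\begin{cases}
g_i(x)^2 = \tilde g_i(x)^2 & (i \in \mathcal{E}, \text{ or } i \in \mathcal{I} \text{ with } g_i(x) < 0), \\
0 = \tilde g_i(x)^2 & (i \in \mathcal{I} \text{ with } g_i(x) \ge 0).
\end{cases}
$$

The same case distinction shows that the penalty term is continuously differentiable, with

$$
\nabla_x \left( \frac{1}{2\mu} \tilde g_i^2(x) \right) = \frac{\tilde g_i(x)}{\mu} \nabla g_i(x),
$$

since both sides vanish wherever an inequality constraint is satisfied.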
1.3. The Algorithm.
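Algorithm 1.1 (AL).
S0 Initialisation: choose the following,
    $x_0 \in \mathbb{R}^n$ (starting point, not necessarily feasible)
    $\lambda^{[0]} \in \mathbb{R}^{|\mathcal{E} \cup \mathcal{I}|}$ (initial "guesstimate" of Lagrange multiplier vector)
    $\mu_0 > 0$ (initial value of homotopy parameter)
    $(\tau_k)_{\mathbb{N}_0} \searrow 0$ (error tolerance)
S1 For $k = 0, 1, 2, \dots$ repeat
    $y^{[0]} := x_k$, $l := 0$
    until $\|\nabla_x L_A(y^{[l]}, \lambda^{[k]}, \mu_k)\| \le \tau_k$ repeat
        compute $y^{[l+1]}$ such that $L_A(y^{[l+1]}, \lambda^{[k]}, \mu_k) < L_A(y^{[l]}, \lambda^{[k]}, \mu_k)$ (using unconstrained minimisation method)
        $l \leftarrow l + 1$
    end
    $x_{k+1} := y^{[l]}$
    $\lambda_i^{[k+1]} := \lambda_i^{[k]} - \dfrac{\tilde g_i(x_{k+1})}{\mu_k}$, $(i \in \mathcal{E} \cup \mathcal{I})$,
    $\lambda_i^{[k+1]} \leftarrow \max(0, \lambda_i^{[k+1]})$, $(i \in \mathcal{I})$
    choose $\mu_{k+1} \in (0, \mu_k)$
end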
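The steps of Algorithm 1.1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: the function names (`g_tilde`, `merit`, `merit_grad`, `inner_minimise`, `augmented_lagrangian`), the representation of each constraint as a triple `(g, grad_g, is_equality)`, the tolerance schedule, the shrink factor for $\mu$ and the toy problem are all hypothetical choices made here. Steepest descent with Armijo backtracking stands in for the unspecified "unconstrained minimisation method".

```python
import math

def g_tilde(g_val, is_eq):
    """Violation measure from the notes: g for equalities, min(g, 0) for inequalities."""
    return g_val if is_eq else min(g_val, 0.0)

def merit(f, cons, x, lam, mu):
    """L_A(x, lam, mu) = f(x) - sum_i lam_i g_i(x) + (1 / (2 mu)) sum_i g~_i(x)^2."""
    value = f(x)
    for (g, _grad_g, is_eq), lam_i in zip(cons, lam):
        g_val = g(x)
        value += -lam_i * g_val + g_tilde(g_val, is_eq) ** 2 / (2.0 * mu)
    return value

def merit_grad(grad_f, cons, x, lam, mu):
    """grad_x L_A = grad f(x) - sum_i (lam_i - g~_i(x) / mu) grad g_i(x)."""
    grad = list(grad_f(x))
    for (g, grad_g, is_eq), lam_i in zip(cons, lam):
        coeff = lam_i - g_tilde(g(x), is_eq) / mu
        for j, dg in enumerate(grad_g(x)):
            grad[j] -= coeff * dg
    return grad

def inner_minimise(f, grad_f, cons, x, lam, mu, tol, max_iter=20000):
    """Assumed inner solver: steepest descent with Armijo backtracking."""
    for _ in range(max_iter):
        grad = merit_grad(grad_f, cons, x, lam, mu)
        sq_norm = sum(c * c for c in grad)
        if math.sqrt(sq_norm) <= tol:  # stopping test ||grad L_A|| <= tau_k
            break
        value, step = merit(f, cons, x, lam, mu), 1.0
        for _halving in range(60):  # halve the step until sufficient decrease holds
            trial = [xi - step * gi for xi, gi in zip(x, grad)]
            if merit(f, cons, trial, lam, mu) <= value - 1e-4 * step * sq_norm:
                break
            step *= 0.5
        x = trial
    return x

def augmented_lagrangian(f, grad_f, cons, x0, lam0, mu0, n_outer=12, shrink=0.8):
    """Outer loop S1: inexact minimisation of L_A, multiplier update, reduction of mu."""
    x, lam, mu = list(x0), list(lam0), mu0
    for k in range(n_outer):
        tol = max(10.0 ** -(k + 1), 1e-6)  # assumed schedule tau_k, floored for float arithmetic
        x = inner_minimise(f, grad_f, cons, x, lam, mu, tol)
        for i, (g, _grad_g, is_eq) in enumerate(cons):
            lam[i] -= g_tilde(g(x), is_eq) / mu
            if not is_eq:
                lam[i] = max(0.0, lam[i])  # keep inequality multipliers nonnegative
        mu *= shrink  # mu_{k+1} in (0, mu_k)
    return x, lam

# Toy problem (an assumption for illustration):
# minimise x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0  and  x1 - 0.2 >= 0.
f = lambda x: x[0] ** 2 + x[1] ** 2
grad_f = lambda x: [2.0 * x[0], 2.0 * x[1]]
cons = [
    (lambda x: x[0] + x[1] - 1.0, lambda x: [1.0, 1.0], True),   # equality constraint
    (lambda x: x[0] - 0.2,        lambda x: [1.0, 0.0], False),  # inequality constraint
]
x_star, lam_star = augmented_lagrangian(f, grad_f, cons, [0.0, 0.0], [0.0, 0.0], mu0=0.5)
print(x_star, lam_star)
```

For this toy problem the KKT point is $x^* = (1/2, 1/2)$ with multiplier $1$ for the equality constraint and $0$ for the inactive inequality, so the printed values should lie close to these. A quasi-Newton or Newton-type inner solver would usually be preferred in practice; steepest descent is used only to keep the sketch free of dependencies.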
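A quick argument gives insight into why this method can be expected to converge before $\mu_k$ reaches very small values. We have
$$
\nabla_x L_A(x_{k+1}, \lambda^{[k]}, \mu_k) = \nabla f(x_{k+1}) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \left( \lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \right) \nabla g_i(x_{k+1}).
$$
Using $\|\nabla_x L_A(x_{k+1}, \lambda^{[k]}, \mu_k)\| \le \tau_k$, we find
$$
\sum_i \left( \lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \right) \nabla g_i(x_{k+1}) = \nabla f(x_{k+1}) + O(\tau_k).
$$
Arguments similar to those given in the proof of Theorem 2.2 in Lecture 13 show that
$$
\lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \simeq \lambda_i^*, \qquad (i \in \mathcal{E} \cup \mathcal{I}).
$$
Therefore, we have
$$
\tilde g_i(x_{k+1}) \simeq \mu_k \left( \lambda_i^{[k]} - \lambda_i^* \right), \qquad (i \in \mathcal{E} \cup \mathcal{I}),
$$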
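This relation can be checked by hand on a toy problem (chosen here for illustration, not an example from the lecture): minimise $x_1^2 + x_2^2$ subject to $g(x) = x_1 + x_2 - 1 = 0$, for which $x^* = (1/2, 1/2)$ and $\lambda^* = 1$. Solving $\nabla_x L_A(x, \lambda, \mu) = 0$ exactly, that is with $\tau_k = 0$, gives $x_1 = x_2 = t$ with

$$
2t - \lambda + \frac{2t - 1}{\mu} = 0 \;\Longrightarrow\; t = \frac{\lambda \mu + 1}{2(\mu + 1)}, \qquad g(x) = 2t - 1 = \frac{\mu}{1 + \mu} \left( \lambda - \lambda^* \right).
$$

The infeasibility is therefore close to $\mu (\lambda - \lambda^*)$ for small $\mu$, and the multiplier update satisfies

$$
\left( \lambda - \frac{g(x)}{\mu} \right) - \lambda^* = \frac{\mu}{1 + \mu} \left( \lambda - \lambda^* \right),
$$

so in this example the multiplier error contracts for every fixed $\mu > 0$, without $\mu$ having to tend to zero.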
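$$
x_{k+1} \in B_\varepsilon(x^*),
$$
$$
\|\lambda^{[k+1]} - \lambda^*\| \stackrel{(1.2),(1.5)}{\le} M \mu_k \frac{\delta}{\mu_k} = \delta M < \frac{\delta}{\bar\mu} \le \frac{\delta}{\mu_{k+1}},
$$
where the last inequality follows from $\mu_{k+1} \le \mu_k$. Thus, the same conditions hold again, and by induction they hold for all subsequent iterations.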
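$$
\|\lambda^{[k]} - \lambda^*\|, \ \|x_k - x^*\| \le (M \bar\mu)^{k - k_0} \|\lambda^{[k_0]} - \lambda^*\| \le \frac{1}{2^{k - k_0}} \|\lambda^{[k_0]} - \lambda^*\|.
$$
This shows that $x_k \to x^*$ and $\lambda^{[k]} \to \lambda^*$ at a Q-linear rate if $\mu \le \bar\mu$ is held fixed.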
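The linear rate with fixed $\mu$ can be observed numerically on a toy problem. The sketch below is an illustration under assumptions made here, not part of the lecture: it uses the problem of minimising $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$ (so $\lambda^* = 1$), for which setting $\nabla_x L_A = 0$ gives the exact minimiser $x_1 = x_2 = (\lambda \mu + 1) / (2\mu + 2)$ in closed form. The function name `multiplier_errors` is hypothetical.

```python
def multiplier_errors(lam0, mu, n_iter):
    """Run the multiplier update with fixed mu and exact inner solves.

    Returns |lam_k - lam*| for k = 1..n_iter on the toy problem
    min x1^2 + x2^2 s.t. x1 + x2 - 1 = 0, where lam* = 1.
    """
    lam, errors = lam0, []
    for _ in range(n_iter):
        t = (lam * mu + 1.0) / (2.0 * mu + 2.0)  # exact minimiser of L_A: x1 = x2 = t
        g = 2.0 * t - 1.0                        # constraint value at the minimiser
        lam = lam - g / mu                       # multiplier update of Algorithm 1.1
        errors.append(abs(lam - 1.0))
    return errors

errors = multiplier_errors(lam0=0.0, mu=0.5, n_iter=8)
ratios = [later / earlier for earlier, later in zip(errors, errors[1:])]
print(ratios)
```

Each ratio of successive errors should be close to $\mu / (1 + \mu) = 1/3$ for $\mu = 0.5$, which is the constant contraction factor that a Q-linear rate requires.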
Additional Recommended Reading: Section 17.4, Nocedal–Wright.