Augmented Lagrangian Method, Lecture Notes - Mathematics, Study notes of Mathematical Methods

Augmented Lagrangian Method, Merit Function, Algorithm

Typology: Study notes

2010/2011

Uploaded on 09/09/2011

C12.1B: CONTINUOUS OPTIMISATION
LECTURE 14: THE AUGMENTED LAGRANGIAN METHOD
RAPHAEL HAUSER
MATHEMATICAL INSTITUTE, UNIVERSITY OF OXFORD

1. The Augmented Lagrangian Method. In Lecture 13 we saw that the quadratic penalty method has the disadvantage that the penalty parameter $\mu$ has to be reduced to very small values before $x_k$ becomes feasible to high accuracy. Moreover, we pointed out that reducing $\mu$ to very small values can lead to numerical instabilities if the method is not implemented very carefully. We will now see a related method that does not require $\mu_k$ to converge to zero, and yet in a neighbourhood of a KKT point $x^*$ of the nonlinear optimisation problem

$$\text{(NLP)} \qquad \begin{aligned} \min_{x \in \mathbb{R}^n} \ & f(x) \\ \text{s.t.} \ & g_{\mathcal{E}}(x) = 0, \\ & g_{\mathcal{I}}(x) \geq 0, \end{aligned}$$

the iterates $x_k$ still converge to $x^*$ if the LICQ and the second order sufficient optimality conditions hold at this point. In fact, $\mu$ can even be held constant after a while and the convergence of $x_k$ continues!

1.1. Motivation. The method is motivated by the observation that if we knew the Lagrange multipliers $\lambda^*$ such that $(x^*, \lambda^*)$ is a KKT point for (NLP), then we could find $x^*$ by solving the unconstrained problem

$$\min_{x \in \mathbb{R}^n} L(x, \lambda^*). \tag{1.1}$$

Indeed, as already remarked in Lemma 1.2 i) of Lecture 12, the first set of KKT conditions $\nabla_x L(x^*, \lambda^*) = 0$ amount to the first order necessary optimality conditions for (1.1). Of course, $\lambda^*$ is not known, but we know from Lecture 13 that one can obtain estimates $\lambda^{[k]}$ which can be used to set up the problem

$$\min_{x \in \mathbb{R}^n} L(x, \lambda^{[k]})$$

as an approximation of (1.1). If the estimates $\lambda^{[k]}$ can be iteratively improved and made to converge to $\lambda^*$, then this can form the basis of an algorithmic framework for solving (NLP).

1.2. The Merit Function. The merit function used by this algorithm is the augmented Lagrangian of (NLP), defined as follows,

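$$\begin{aligned}
L_A(x, \lambda, \mu) &= L(x, \lambda) + \frac{1}{2\mu} \sum_{i \in \mathcal{I} \cup \mathcal{E}} \tilde g_i^2(x) \\
&= f(x) - \sum_{i \in \mathcal{I} \cup \mathcal{E}} \lambda_i g_i(x) + \sum_{i \in \mathcal{I} \cup \mathcal{E}} \frac{\tilde g_i(x)}{2\mu} g_i(x) \\
&= f(x) + \sum_{i \in \mathcal{I} \cup \mathcal{E}} \left( \frac{\tilde g_i(x)}{2\mu} - \lambda_i \right) g_i(x),
\end{aligned}$$

where $\tilde g_i$ is defined as in Lecture 13,

$$\tilde g_i(x) = \begin{cases} g_i(x) & (i \in \mathcal{E}), \\ \min(g_i(x), 0) & (i \in \mathcal{I}). \end{cases}$$

$L_A$ is thus nothing else but the Lagrangian “augmented” by the quadratic penalty term introduced in Lecture 13, ensuring that $x$ becomes gradually more feasible as the homotopy parameter $\mu$ is reduced.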
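As a minimal sketch of the definition above, the following Python function evaluates $L_A$ for user-supplied callables. The names `g_tilde` and `augmented_lagrangian`, the boolean list `is_eq` (marking which constraints belong to $\mathcal{E}$) and the toy problem are illustrative assumptions, not part of the lecture notes.

```python
def g_tilde(g_values, is_eq):
    """Return g~_i: g_i itself for i in E, min(g_i, 0) for i in I."""
    return [gi if eq else min(gi, 0.0) for gi, eq in zip(g_values, is_eq)]


def augmented_lagrangian(x, lam, mu, f, g, is_eq):
    """Evaluate L_A = f(x) - sum_i lam_i g_i(x) + (1/(2 mu)) sum_i g~_i(x)^2."""
    g_values = g(x)
    lagrangian = f(x) - sum(l * gi for l, gi in zip(lam, g_values))
    penalty = sum(t * t for t in g_tilde(g_values, is_eq)) / (2.0 * mu)
    return lagrangian + penalty


# Hypothetical toy problem (an assumption for illustration):
# f(x) = x1^2 + x2^2, g_1(x) = x1 + x2 - 1 = 0, g_2(x) = x1 >= 0.
f = lambda x: x[0] ** 2 + x[1] ** 2
g = lambda x: [x[0] + x[1] - 1.0, x[0]]
is_eq = [True, False]

# Point violating both constraints: both enter the penalty term.
print(augmented_lagrangian([-1.0, 1.0], [1.0, 2.0], 0.5, f, g, is_eq))
# Point with g_2 > 0: only the violated equality is penalised.
print(augmented_lagrangian([1.0, 1.0], [1.0, 2.0], 0.5, f, g, is_eq))
```

Note that a satisfied inequality contributes nothing to the penalty term but still appears in the Lagrangian part through $-\lambda_i g_i(x)$.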

1.3. The Algorithm.

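Algorithm 1.1 (AL).

S0 Initialisation: choose the following,
    $x_0 \in \mathbb{R}^n$ (starting point, not necessarily feasible)
    $\lambda^{[0]} \in \mathbb{R}^{|\mathcal{E} \cup \mathcal{I}|}$ (initial "guesstimate" of Lagrange multiplier vector)
    $\mu_0 > 0$ (initial value of homotopy parameter)
    $(\tau_k)_{\mathbb{N}_0} \searrow 0$ (error tolerance)

S1 For $k = 0, 1, 2, \dots$ repeat
    $y^{[0]} := x_k$, $l := 0$
    until $\|\nabla_x L_A(y^{[l]}, \lambda^{[k]}, \mu_k)\| \leq \tau_k$ repeat
        compute $y^{[l+1]}$ such that $L_A(y^{[l+1]}, \lambda^{[k]}, \mu_k) < L_A(y^{[l]}, \lambda^{[k]}, \mu_k)$ (using unconstrained minimisation method)
        $l \leftarrow l + 1$
    end
    $x_{k+1} := y^{[l]}$
    $\lambda_i^{[k+1]} := \lambda_i^{[k]} - \dfrac{\tilde g_i(x_{k+1})}{\mu_k}$, $(i \in \mathcal{E} \cup \mathcal{I})$,
    $\lambda_i^{[k+1]} \leftarrow \max(0, \lambda_i^{[k+1]})$, $(i \in \mathcal{I})$
    choose $\mu_{k+1} \in (0, \mu_k)$
end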
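The steps of Algorithm 1.1 can be sketched in plain Python as follows. Several choices here are assumptions made only for illustration: the inner solver is gradient descent with Armijo backtracking (the algorithm merely asks for some unconstrained descent method), the tolerances are $\tau_k = 0.1 \cdot 2^{-k}$, and $\mu_k$ is halved until it reaches a floor `mu_floor` and then held constant, in the spirit of the remark that $\mu$ can be held constant after a while. All function names and the equality-constrained toy problem are hypothetical.

```python
import math


def g_tilde(gv, is_eq):
    # g~_i = g_i for equalities, min(g_i, 0) for inequalities g_i >= 0
    return [gi if eq else min(gi, 0.0) for gi, eq in zip(gv, is_eq)]


def merit(x, lam, mu, f, g, is_eq):
    gv = g(x)
    return (f(x) - sum(l * gi for l, gi in zip(lam, gv))
            + sum(t * t for t in g_tilde(gv, is_eq)) / (2.0 * mu))


def merit_grad(x, lam, mu, grad_f, g, jac_g, is_eq):
    # grad f - sum_i (lam_i - g~_i / mu) grad g_i
    grad = list(grad_f(x))
    for l, t, row in zip(lam, g_tilde(g(x), is_eq), jac_g(x)):
        for j, dj in enumerate(row):
            grad[j] -= (l - t / mu) * dj
    return grad


def al_method(f, grad_f, g, jac_g, is_eq, x0, lam0, mu0,
              outer=15, mu_floor=0.1):
    x, lam, mu = list(x0), list(lam0), mu0
    for k in range(outer):
        tau = 0.1 * 0.5 ** k  # assumed tolerance sequence
        for _ in range(10000):  # inner loop, capped for safety
            d = merit_grad(x, lam, mu, grad_f, g, jac_g, is_eq)
            gn2 = sum(di * di for di in d)
            if math.sqrt(gn2) <= tau:
                break
            phi, step = merit(x, lam, mu, f, g, is_eq), 1.0
            while step > 1e-12:  # Armijo backtracking along -d
                y = [xi - step * di for xi, di in zip(x, d)]
                if merit(y, lam, mu, f, g, is_eq) <= phi - 1e-4 * step * gn2:
                    break
                step *= 0.5
            x = y
        # multiplier update, then projection for inequality multipliers
        lam = [l - t / mu for l, t in zip(lam, g_tilde(g(x), is_eq))]
        lam = [l if eq else max(0.0, l) for l, eq in zip(lam, is_eq)]
        mu = max(mu_floor, 0.5 * mu)  # assumed schedule with a floor
    return x, lam


# Toy problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0,
# with solution x* = (0.5, 0.5) and multiplier lambda* = 1.
x_opt, lam_opt = al_method(
    f=lambda x: x[0] ** 2 + x[1] ** 2,
    grad_f=lambda x: [2.0 * x[0], 2.0 * x[1]],
    g=lambda x: [x[0] + x[1] - 1.0],
    jac_g=lambda x: [[1.0, 1.0]],
    is_eq=[True],
    x0=[0.0, 0.0], lam0=[0.0], mu0=0.5,
)
print(x_opt, lam_opt)
```

The example exercises only an equality constraint; the inequality branch follows the update rule of Algorithm 1.1 literally and is not tuned further. In this run $\mu_k$ never drops below 0.1, yet the iterates approach $x^*$ and $\lambda^*$.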

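A quick argument gives insight into why this method can be expected to converge before $\mu_k$ reaches very small values. We have

$$\nabla_x L_A(x_{k+1}, \lambda^{[k]}, \mu_k) = \nabla f(x_{k+1}) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \left( \lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \right) \nabla g_i(x_{k+1}).$$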
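This gradient formula follows term by term from the definition of $L_A$ in Section 1.2. A short derivation, assuming $\nabla(\tilde g_i^2) = 2 \tilde g_i \nabla g_i$ (trivial for $i \in \mathcal{E}$, and valid for $i \in \mathcal{I}$ because $\min(g_i, 0)^2$ is continuously differentiable when $g_i$ is):

$$\begin{aligned}
\nabla_x L_A(x, \lambda, \mu) &= \nabla f(x) - \sum_{i} \lambda_i \nabla g_i(x) + \frac{1}{2\mu} \sum_{i} \nabla\big(\tilde g_i^2(x)\big) \\
&= \nabla f(x) - \sum_{i} \lambda_i \nabla g_i(x) + \frac{1}{\mu} \sum_{i} \tilde g_i(x) \nabla g_i(x) \\
&= \nabla f(x) - \sum_{i} \left( \lambda_i - \frac{\tilde g_i(x)}{\mu} \right) \nabla g_i(x),
\end{aligned}$$

with all sums over $i \in \mathcal{E} \cup \mathcal{I}$, evaluated at $(x_{k+1}, \lambda^{[k]}, \mu_k)$.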

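Using $\|\nabla_x L_A(x_{k+1}, \lambda^{[k]}, \mu_k)\| \leq \tau_k$, we find

$$\sum_{i} \left( \lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \right) \nabla g_i(x_{k+1}) = \nabla f(x_{k+1}) + O(\tau_k).$$

Arguments similar to those given in the proof of Theorem 2.2 in Lecture 13 show that

$$\lambda_i^{[k]} - \frac{\tilde g_i(x_{k+1})}{\mu_k} \simeq \lambda_i^*, \quad (i \in \mathcal{E} \cup \mathcal{I}).$$

Therefore, we have

$$\tilde g_i(x_{k+1}) \simeq \mu_k \left( \lambda_i^{[k]} - \lambda_i^* \right), \quad (i \in \mathcal{E} \cup \mathcal{I}),$$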

  • Without loss of generality, we may assume that $\bar\mu \leq (2M)^{-1}$. Note that if $(\lambda^{[k]}, \mu_k)$ satisfy the conditions of part i) of the theorem and if $x_k \in B_\varepsilon(x^*)$, then $x_k$ is a good starting point for solving the problem (1.3) and we have

$$x_{k+1} \in B_\varepsilon(x^*),$$

$$\|\lambda^{[k+1]} - \lambda^*\| \overset{(1.2),(1.5)}{\leq} M \mu_k \frac{\delta}{\mu_k} = \delta M < \frac{\delta}{\bar\mu} \leq \frac{\delta}{\mu_{k+1}},$$

where the last inequality follows from $\mu_{k+1} \leq \mu_k$. Thus, the same conditions hold again, and by induction they hold for all subsequent iterations.

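  • Let $k_0$ be the iteration where (1.4) and (1.5) first hold. Induction on $k$ shows that

$$\|\lambda^{[k]} - \lambda^*\|, \ \|x_k - x^*\| \leq (M \bar\mu)^{k - k_0} \|\lambda^{[k_0]} - \lambda^*\| \leq \frac{1}{2^{k - k_0}} \|\lambda^{[k_0]} - \lambda^*\|.$$

This shows that $x_k \to x^*$ and $\lambda^{[k]} \to \lambda^*$ at a Q-linear rate if $\mu \leq \bar\mu$ is held fixed.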
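The linear rate with fixed $\mu$ can be observed on a small example. The sketch below is an illustration under assumptions, not part of the notes: it uses the hypothetical toy problem $\min x_1^2 + x_2^2$ s.t. $x_1 + x_2 - 1 = 0$ (solution $x^* = (0.5, 0.5)$, $\lambda^* = 1$), for which the inner minimiser of $L_A(\cdot, \lambda, \mu)$ is available in closed form as $x_1 = x_2 = (\lambda\mu + 1)/(2\mu + 2)$, so each outer iteration is exact.

```python
def exact_al_step(lam, mu):
    """One outer AL iteration for the toy problem, with exact inner solve."""
    t = (lam * mu + 1.0) / (2.0 * mu + 2.0)  # minimiser x1 = x2 = t
    g = 2.0 * t - 1.0                        # constraint value at minimiser
    return t, lam - g / mu                   # multiplier update


mu, lam = 0.5, 0.0  # mu is held fixed throughout
errors = []
for k in range(8):
    errors.append(abs(lam - 1.0))  # |lambda^[k] - lambda*|
    t, lam = exact_al_step(lam, mu)

# Successive error ratios; constant ratios indicate a Q-linear rate.
ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
print(ratios)
```

For this problem a direct calculation gives the contraction factor $\mu/(1+\mu)$ per iteration, so a smaller fixed $\mu$ yields a faster linear rate without $\mu$ ever tending to zero.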

Additional Recommended Reading: Section 17.4, Nocedal–Wright.