Understanding Duality & Subgradient Method in Nonlinear Programming, Slides of Computer Science

These slides explore dual computational methods in nonlinear programming. They cover the concept of duality, the role of Lagrangian relaxation, and the structure of dual problems, then turn to dual derivatives, subgradients, and the key property of the subgradient method. They also discuss the nondifferentiable dual and the subgradient method for solving nondifferentiable optimization problems.

Typology: Slides

2012/2013

Uploaded on 03/27/2013

ekana

NONLINEAR PROGRAMMING

LECTURE 21: DUAL COMPUTATIONAL METHODS

LECTURE OUTLINE

• Dual Methods

• Nondifferentiable Optimization

• Consider the primal problem

$$\text{minimize } f(x) \quad \text{subject to } x \in X,\ g_j(x) \le 0,\ j = 1, \dots, r,$$

assuming $-\infty < f^* < \infty$.

• Dual problem: Maximize

$$q(\mu) = \inf_{x \in X} L(x, \mu) = \inf_{x \in X} \{ f(x) + \mu' g(x) \}$$

subject to $\mu \ge 0$.
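To make the definition concrete, here is a minimal sketch that evaluates $q(\mu)$ by brute force when $X$ is a small finite set. The toy problem ($X = \{0,1,2,3\}$, $f(x) = -x$, $g(x) = x - 1.5$) and the helper name `dual_value` are illustrative assumptions, not part of the lecture.

```python
def dual_value(f, g, X, mu):
    """Return (q(mu), x_mu): the dual value and one minimizer of L(x, mu) over X.

    X is assumed to be a finite list, so the infimum is found by enumeration.
    """
    def lagrangian(x):
        # L(x, mu) = f(x) + mu' g(x)
        return f(x) + sum(m * gj for m, gj in zip(mu, g(x)))

    x_mu = min(X, key=lagrangian)
    return lagrangian(x_mu), x_mu


# Hypothetical toy problem (not from the lecture):
# minimize -x  subject to  x in {0, 1, 2, 3},  x - 1.5 <= 0.
X = [0, 1, 2, 3]
f = lambda x: -x
g = lambda x: [x - 1.5]  # one constraint, so r = 1

for mu in ([0.0], [0.5], [1.0], [2.0]):
    q, x_mu = dual_value(f, g, X, mu)
    print(f"mu = {mu[0]:.1f}: q(mu) = {q:.2f}, minimizer x_mu = {x_mu}")
```

On this toy problem the dual values at $\mu = 0, 0.5, 1, 2$ are $-3$, $-2.25$, $-1.5$ and $-3$: $q$ rises to a peak at $\mu = 1$ and then falls, as expected of a concave function.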

PROS AND CONS FOR SOLVING THE DUAL

• The dual is concave.

• The dual may have smaller dimension and/or simpler constraints.

• If there is no duality gap and the dual is solved exactly for a Lagrange multiplier $\mu^*$, all optimal primal solutions can be obtained by minimizing the Lagrangian $L(x, \mu^*)$ over $x \in X$.

• Even if there is a duality gap, $q(\mu)$ is a lower bound to the optimal primal value for every $\mu \ge 0$.

• Evaluating $q(\mu)$ requires minimization of $L(x, \mu)$ over $x \in X$.

• The dual function is often nondifferentiable.

• Even if we find an optimal dual solution $\mu^*$, it may be difficult to obtain a primal optimal solution.
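Two of the points above, the lower-bound property and the difficulty of recovering a primal solution, can be checked numerically. The sketch below uses a hypothetical toy problem ($X = \{0,1,2,3\}$, $f(x) = -x$, $g(x) = x - 1.5$) and a coarse grid of multipliers; both are assumptions chosen for illustration only.

```python
X = [0, 1, 2, 3]
f = lambda x: -x
g = lambda x: x - 1.5  # single constraint g(x) <= 0 (hypothetical)


def q(mu):
    """Dual function by enumeration over the finite set X."""
    return min(f(x) + mu * g(x) for x in X)


# Optimal primal value f*: best cost among feasible points.
f_star = min(f(x) for x in X if g(x) <= 0)

# Evaluate q on a grid of multipliers mu >= 0 and keep the best one.
grid = [k / 10 for k in range(31)]
q_best, mu_best = max((q(mu), mu) for mu in grid)

# Weak duality: q(mu) <= f* for every mu >= 0 on the grid.
assert all(q(mu) <= f_star for mu in grid)

# Minimizers of L(x, mu_best) over X; some of them may be infeasible.
tol = 1e-12
minimizers = [x for x in X if f(x) + mu_best * g(x) <= q_best + tol]

print("f* =", f_star, " best q on grid =", q_best, "at mu =", mu_best)
print("minimizers of L(x, mu_best):", minimizers)
```

Here $f^* = -1$ while the best dual value is $-1.5$ at $\mu = 1$, a duality gap of $0.5$. Every point of $X$ minimizes $L(\cdot, 1)$, including the infeasible points $2$ and $3$, so minimizing the Lagrangian alone does not single out a primal optimal solution.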


DUAL DERIVATIVES

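• Let

$$x_\mu = \arg\min_{x \in X} L(x, \mu) = \arg\min_{x \in X} \{ f(x) + \mu' g(x) \}.$$

Then for all $\tilde{\mu} \in \mathbb{R}^r$,

$$\begin{aligned}
q(\tilde{\mu}) &= \inf_{x \in X} \{ f(x) + \tilde{\mu}' g(x) \} \\
&\le f(x_\mu) + \tilde{\mu}' g(x_\mu) \\
&= f(x_\mu) + \mu' g(x_\mu) + (\tilde{\mu} - \mu)' g(x_\mu) \\
&= q(\mu) + (\tilde{\mu} - \mu)' g(x_\mu).
\end{aligned}$$

• Thus $g(x_\mu)$ is a subgradient of $q$ at $\mu$.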
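The subgradient inequality $q(\tilde{\mu}) \le q(\mu) + (\tilde{\mu} - \mu)' g(x_\mu)$ can be spot-checked on random pairs of multipliers. The two-constraint problem below is a hypothetical example on a finite grid, chosen only so that $x_\mu$ can be found by enumeration.

```python
import itertools
import random

# Hypothetical problem: X = {0,1,2}^2 with two inequality constraints.
X = list(itertools.product(range(3), repeat=2))


def f(x):
    return -x[0] - 2 * x[1]


def g(x):
    return [x[0] + x[1] - 2.5, x[0] - x[1] - 0.5]


def dual(mu):
    """Return q(mu) and the subgradient g(x_mu), by enumeration over X."""
    def lagrangian(x):
        return f(x) + sum(m * gj for m, gj in zip(mu, g(x)))

    x_mu = min(X, key=lagrangian)
    return lagrangian(x_mu), g(x_mu)


rng = random.Random(0)
worst = float("-inf")  # largest observed value of q(mu_t) - bound
for _ in range(1000):
    mu = [rng.uniform(-3, 3) for _ in range(2)]
    mu_t = [rng.uniform(-3, 3) for _ in range(2)]
    q_mu, sub = dual(mu)
    q_t, _ = dual(mu_t)
    bound = q_mu + sum((a - b) * s for a, b, s in zip(mu_t, mu, sub))
    worst = max(worst, q_t - bound)

# The inequality holds if the worst violation is (numerically) nonpositive.
print("inequality holds on all samples:", worst <= 1e-9)
```

The check samples multipliers of both signs because the inequality is stated for all $\tilde{\mu} \in \mathbb{R}^r$, not only for $\tilde{\mu} \ge 0$.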

• Proposition: Let $X$ be compact, and let $f$ and $g$ be continuous over $X$. Assume also that for every $\mu$, $L(x, \mu)$ is minimized over $x \in X$ at a unique point $x_\mu$. Then, $q$ is everywhere continuously differentiable and

$$\nabla q(\mu) = g(x_\mu), \quad \forall\, \mu \in \mathbb{R}^r.$$
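As a sanity check of the proposition, the sketch below compares $g(x_\mu)$ with a central finite difference of $q$. The problem is a hypothetical one-dimensional example: $X = [-2, 2]$, $f(x) = x^2$, $g(x) = 1 - x$, for which $L(x, \mu)$ is strictly convex in $x$ and so has a unique minimizer.

```python
def x_mu(mu):
    """Unique minimizer of L(x, mu) = x**2 + mu * (1 - x) over X = [-2, 2].

    The unconstrained minimizer is mu / 2; clipping it to the interval gives
    the constrained minimizer because L is convex in the scalar x.
    """
    return max(-2.0, min(2.0, mu / 2.0))


def g(x):
    return 1.0 - x


def q(mu):
    x = x_mu(mu)
    return x * x + mu * g(x)


h = 1e-6
max_error = 0.0
for mu in (-1.0, 0.0, 1.0, 3.0, 6.0):
    finite_diff = (q(mu + h) - q(mu - h)) / (2 * h)
    gradient = g(x_mu(mu))  # the proposition's formula for the gradient
    max_error = max(max_error, abs(finite_diff - gradient))
    print(f"mu = {mu:4.1f}: g(x_mu) = {gradient:.4f}, finite difference = {finite_diff:.4f}")
```

The test points include a negative multiplier and one large enough that $x_\mu$ sits on the boundary of $X$, since the proposition covers every $\mu \in \mathbb{R}^r$.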

NONDIFFERENTIABLE DUAL

• If there exists a duality gap, the dual function is

nondifferentiable at every dual optimal solution.

• Important nondifferentiable case: When $q$ is polyhedral, that is,

$$q(\mu) = \min_{i \in I} \{ a_i' \mu + b_i \},$$

where $I$ is a finite index set, and $a_i \in \mathbb{R}^r$ and $b_i$ are given (arises when $X$ is a discrete set, as in integer programming).

• Proposition: Let $q$ be polyhedral as above, and let $I_\mu$ be the set of indices attaining the minimum

$$I_\mu = \{ i \in I \mid a_i' \mu + b_i = q(\mu) \}.$$

The set of all subgradients of $q$ at $\mu$ is

$$\partial q(\mu) = \Big\{ g \;\Big|\; g = \sum_{i \in I_\mu} \xi_i a_i,\ \xi_i \ge 0,\ \sum_{i \in I_\mu} \xi_i = 1 \Big\}.$$
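The proposition translates directly into a small computation: find the active index set $I_\mu$, then form convex combinations of the corresponding $a_i$. The sketch below assumes hypothetical data $a_i = g(x_i)$, $b_i = f(x_i)$ from a discrete toy problem ($X = \{0,1,2,3\}$, $f(x) = -x$, $g(x) = x - 1.5$); the function names and the tolerance are likewise illustrative.

```python
def active_set(a, b, mu, tol=1e-9):
    """Return (q(mu), I_mu) for the polyhedral function q(mu) = min_i (a_i' mu + b_i)."""
    values = [sum(aij * mj for aij, mj in zip(ai, mu)) + bi for ai, bi in zip(a, b)]
    q = min(values)
    return q, [i for i, v in enumerate(values) if v <= q + tol]


def convex_combination(a, indices, weights):
    """Form sum_i xi_i a_i over the given indices; weights must be >= 0 and sum to 1."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-12
    r = len(a[0])
    return [sum(w * a[i][j] for i, w in zip(indices, weights)) for j in range(r)]


# Hypothetical data with r = 1: one affine piece per point of X = {0, 1, 2, 3}.
a = [[-1.5], [-0.5], [0.5], [1.5]]
b = [0.0, -1.0, -2.0, -3.0]

q_half, I_half = active_set(a, b, [0.5])  # a single active piece: q is differentiable here
q_one, I_one = active_set(a, b, [1.0])    # all pieces active: q has a kink here

print("mu = 0.5:", q_half, I_half)
print("mu = 1.0:", q_one, I_one)
print("a subgradient at mu = 1.0:", convex_combination(a, I_one, [0.25] * 4))
```

At $\mu = 0.5$ only one index is active, so the subdifferential is the single vector $a_3$. At $\mu = 1$ all four indices are active and equal weights give the subgradient $0$, which confirms that $\mu = 1$ maximizes this $q$.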

KEY SUBGRADIENT METHOD PROPERTY

• For a small stepsize it reduces the Euclidean

distance to the optimum.

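[Figure: contours of $q$ over the constraint set $M$, showing the subgradient $g^k$ at $\mu^k$, the point $\mu^k + s^k g^k$, its projection $\mu^{k+1} = [\mu^k + s^k g^k]^+$, and the dual optimum $\mu^*$; the angle between $g^k$ and $\mu^* - \mu^k$ is less than $90^\circ$.]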

• Proposition: For any dual optimal solution $\mu^*$, we have

$$\|\mu^{k+1} - \mu^*\| < \|\mu^k - \mu^*\|,$$

for all stepsizes $s^k$ such that

$$0 < s^k < \frac{2 \big( q(\mu^*) - q(\mu^k) \big)}{\|g^k\|^2}.$$
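The stepsize bound can be illustrated with one projected step $\mu^{k+1} = [\mu^k + s^k g^k]^+$. The sketch below assumes a hypothetical toy problem ($X = \{0,1,2,3\}$, $f(x) = -x$, $g(x) = x - 1.5$) whose dual optimum $\mu^* = 1$, $q(\mu^*) = -1.5$ was computed by hand; the starting point $\mu^k = 0.2$ is arbitrary.

```python
X = [0, 1, 2, 3]
f = lambda x: -x
g = lambda x: x - 1.5  # single constraint, so r = 1 and norms are absolute values


def dual(mu):
    """Return q(mu) and the subgradient g(x_mu), by enumeration over X."""
    x_mu = min(X, key=lambda x: f(x) + mu * g(x))
    return f(x_mu) + mu * g(x_mu), g(x_mu)


def step(mu, s, subgradient):
    """One projected subgradient step: [mu + s * g]^+."""
    return max(0.0, mu + s * subgradient)


mu_star, q_star = 1.0, -1.5  # dual optimum of the toy problem (hand-computed assumption)
mu_k = 0.2
q_k, g_k = dual(mu_k)

# Upper limit of the stepsize range in the proposition.
s_max = 2.0 * (q_star - q_k) / g_k ** 2

old_dist = abs(mu_k - mu_star)
inside = [abs(step(mu_k, t * s_max, g_k) - mu_star) for t in (0.1, 0.5, 0.9)]
outside = abs(step(mu_k, 1.5 * s_max, g_k) - mu_star)

print("distance before the step:", old_dist)
print("distances for stepsizes inside the range:", inside)
print("distance for a stepsize beyond the range:", outside)
```

Each stepsize inside the range brings the iterate closer to $\mu^*$ even though $q$ need not increase at every step. The stepsize $1.5$ times the limit overshoots and ends up farther away than before.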

STEPSIZE RULES

• Diminishing stepsize is one possibility.

• More common method:

$$s^k = \frac{\alpha^k \big( q^k - q(\mu^k) \big)}{\|g^k\|^2},$$

where $q^k \approx q^*$ and $0 < \alpha^k < 2$.

• Some possibilities:

− $q^k$ is the best known upper bound to $q^*$; $\alpha^0 = 1$ and $\alpha^k$ is decreased by a certain factor every few iterations.

− $\alpha^k = 1$ for all $k$ and

$$q^k = \big( 1 + \beta(k) \big) \hat{q}^k,$$

where $\hat{q}^k = \max_{0 \le i \le k} q(\mu^i)$, and $\beta(k) > 0$ is adjusted depending on the progress of the algorithm.
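The first of these possibilities can be sketched as a complete projected subgradient loop. Everything specific below is an assumption made for illustration: the toy problem ($X = \{0,1,2,3\}$, $f(x) = -x$, $g(x) = x - 1.5$), the target $q^k = -1$ (the cost of the feasible point $x = 1$, an upper bound on $q^*$ by weak duality), and the schedule that halves $\alpha^k$ every five iterations.

```python
X = [0, 1, 2, 3]
f = lambda x: -x
g = lambda x: x - 1.5  # single constraint (hypothetical toy problem)


def dual(mu):
    """Return q(mu) and the subgradient g(x_mu), by enumeration over X."""
    x_mu = min(X, key=lambda x: f(x) + mu * g(x))
    return f(x_mu) + mu * g(x_mu), g(x_mu)


def subgradient_method(mu0, q_target, iterations=60, shrink=0.5, every=5):
    """Projected subgradient ascent on q with s^k = alpha^k (q^k - q(mu^k)) / |g^k|^2.

    q_target plays the role of q^k (a fixed upper bound on q*); alpha starts at 1
    and is multiplied by `shrink` every `every` iterations (assumed schedule).
    """
    mu, alpha = mu0, 1.0
    best_q, best_mu = float("-inf"), mu0
    for k in range(iterations):
        q_k, g_k = dual(mu)
        if q_k > best_q:  # the method is not monotone, so track the best iterate
            best_q, best_mu = q_k, mu
        if g_k == 0:
            break  # a zero subgradient means mu maximizes q
        if k > 0 and k % every == 0:
            alpha *= shrink
        s_k = alpha * (q_target - q_k) / g_k ** 2
        mu = max(0.0, mu + s_k * g_k)  # projection onto mu >= 0
    return best_mu, best_q


best_mu, best_q = subgradient_method(mu0=0.0, q_target=-1.0)
print("best multiplier:", best_mu, " best dual value:", best_q)
```

Because the target $-1$ overestimates $q^* = -1.5$ on this problem, the iterates first oscillate around $\mu = 1$ with $\alpha^k = 1$ and settle only as $\alpha^k$ shrinks, which is why the rule pairs an upper bound with a decreasing $\alpha^k$.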