NONLINEAR PROGRAMMING
LECTURE 21: DUAL COMPUTATIONAL METHODS
LECTURE OUTLINE
• Dual Methods
• Nondifferentiable Optimization
• Consider the primal problem
$$\text{minimize } f(x) \quad \text{subject to } x \in X,\ g_j(x) \le 0,\ j = 1, \ldots, r,$$
assuming $-\infty < f^* < \infty$.
• Dual problem: Maximize
$$q(\mu) = \inf_{x \in X} L(x, \mu) = \inf_{x \in X} \{ f(x) + \mu' g(x) \}$$
subject to $\mu \ge 0$.
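To make the definition of $q$ concrete, the following sketch evaluates the dual function by brute force when $X$ is a small finite set. The toy problem ($f(x) = x^2$, $g(x) = 1 - x$, $X = \{0, 1, 2, 3\}$) and the helper name `dual_value` are assumptions chosen for illustration; they are not part of the lecture.

```python
def dual_value(mu, X, f, g):
    """Return (q(mu), x_mu) for a finite set X.

    mu is a sequence of r multipliers; g(x) returns the r constraint values.
    """
    def lagrangian(x):
        return f(x) + sum(m * gj for m, gj in zip(mu, g(x)))

    x_mu = min(X, key=lagrangian)  # brute-force minimization of L(x, mu) over X
    return lagrangian(x_mu), x_mu


# Hypothetical toy problem: minimize x^2 subject to 1 - x <= 0, x in {0, 1, 2, 3}.
X = [0, 1, 2, 3]
f = lambda x: x * x
g = lambda x: (1 - x,)

for mu in (0.0, 1.0, 2.0, 4.0):
    q_mu, x_mu = dual_value((mu,), X, f, g)
    print(f"mu = {mu}: q(mu) = {q_mu}, minimizer x_mu = {x_mu}")
```

For this instance the primal optimal value is $f^* = 1$ (attained at $x = 1$), and each printed value of $q(\mu)$ is at most 1.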
PROS AND CONS FOR SOLVING THE DUAL
• The dual is concave.
• The dual may have smaller dimension and/or
simpler constraints.
• If there is no duality gap and the dual is solved
exactly for a Lagrange multiplier $\mu^*$, all optimal
primal solutions can be obtained by minimizing the
Lagrangian $L(x, \mu^*)$ over $x \in X$.
• Even if there is a duality gap, q(μ) is a lower
bound to the optimal primal value for every μ ≥ 0.
• Evaluating q(μ) requires minimization of L(x, μ)
over x ∈ X.
• The dual function is often nondifferentiable.
• Even if we find an optimal dual solution $\mu^*$, it may
be difficult to obtain a primal optimal solution.
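The lower-bound property holds even when there is a duality gap, which the following sketch checks on a small integer problem. The problem data (minimize $-x$ subject to $x - 0.5 \le 0$, $x \in \{0, 1\}$) and the grid of multipliers are assumptions made for illustration.

```python
# Hypothetical integer problem: minimize -x subject to x - 0.5 <= 0, x in {0, 1}.
X = [0, 1]
f = lambda x: -x
g = lambda x: x - 0.5


def q(mu):
    """Dual function q(mu) = min over x in X of f(x) + mu * g(x)."""
    return min(f(x) + mu * g(x) for x in X)


# Primal optimal value: best objective over the feasible points of X.
f_star = min(f(x) for x in X if g(x) <= 0)

# Evaluate q on a grid of multipliers mu >= 0 and keep the best lower bound.
grid = [i / 10 for i in range(31)]
q_best = max(q(mu) for mu in grid)

print("f* =", f_star)
print("best dual value on grid =", q_best)
print("every q(mu) <= f*:", all(q(mu) <= f_star for mu in grid))
```

On this instance the best dual value on the grid is $-0.5$ while $f^* = 0$, so the gap is positive, yet every $q(\mu)$ remains a valid lower bound.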
DUAL DERIVATIVES
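• Let
$$x_\mu = \arg\min_{x \in X} L(x, \mu) = \arg\min_{x \in X} \{ f(x) + \mu' g(x) \}.$$
Then for all $\tilde\mu \in \Re^r$,
$$\begin{aligned}
q(\tilde\mu) &= \inf_{x \in X} \{ f(x) + \tilde\mu' g(x) \} \\
&\le f(x_\mu) + \tilde\mu' g(x_\mu) \\
&= f(x_\mu) + \mu' g(x_\mu) + (\tilde\mu - \mu)' g(x_\mu) \\
&= q(\mu) + (\tilde\mu - \mu)' g(x_\mu).
\end{aligned}$$
• Thus $g(x_\mu)$ is a subgradient of $q$ at $\mu$.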
• Proposition: Let $X$ be compact, and let $f$ and $g$
be continuous over $X$. Assume also that for every
$\mu$, $L(x, \mu)$ is minimized over $x \in X$ at a unique point
$x_\mu$. Then, $q$ is everywhere continuously differentiable and
$$\nabla q(\mu) = g(x_\mu), \quad \forall\ \mu \in \Re^r.$$
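A quick numerical check of the identity $\nabla q(\mu) = g(x_\mu)$ can be sketched as follows. The instance is an assumption chosen so that the Lagrangian has a unique minimizer in closed form: $X = [-2, 2]$, $f(x) = x^2$, $g(x) = 1 - x$, with a single multiplier. The function names are illustrative.

```python
def x_min(mu):
    """Unique minimizer of L(x, mu) = x^2 + mu * (1 - x) over X = [-2, 2]."""
    return max(-2.0, min(2.0, mu / 2.0))  # stationary point mu/2, clipped to X


def g(x):
    return 1.0 - x


def q(mu):
    x = x_min(mu)
    return x * x + mu * g(x)


def finite_difference(mu, h=1e-6):
    """Central-difference estimate of the derivative of q at mu."""
    return (q(mu + h) - q(mu - h)) / (2.0 * h)


# Compare the numerical derivative of q with the constraint value g(x_mu).
for mu in (0.0, 1.0, 3.0):
    print(mu, finite_difference(mu), g(x_min(mu)))
```

The two printed columns should agree up to rounding error. The same $g(x_\mu)$ also satisfies the subgradient inequality $q(\tilde\mu) \le q(\mu) + (\tilde\mu - \mu)' g(x_\mu)$ at every pair of points.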
NONDIFFERENTIABLE DUAL
• If there exists a duality gap, the dual function is
nondifferentiable at every dual optimal solution.
• Important nondifferentiable case: When $q$ is
polyhedral, that is,
$$q(\mu) = \min_{i \in I} \{ a_i' \mu + b_i \},$$
where $I$ is a finite index set, and $a_i \in \Re^r$ and $b_i$
are given (arises when $X$ is a discrete set, as in
integer programming).
• Proposition: Let $q$ be polyhedral as above, and
let $I_\mu$ be the set of indices attaining the minimum
$$I_\mu = \{ i \in I \mid a_i' \mu + b_i = q(\mu) \}.$$
The set of all subgradients of $q$ at $\mu$ is
$$\partial q(\mu) = \Big\{ g \;\Big|\; g = \sum_{i \in I_\mu} \xi_i a_i,\ \xi_i \ge 0,\ \sum_{i \in I_\mu} \xi_i = 1 \Big\}.$$
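The proposition translates directly into a small routine: find the active index set $I_\mu$, then form any convex combination of the corresponding $a_i$. The sketch below assumes the pieces are given as lists `A` and `b`, uses a tolerance `tol` to decide which pieces are active, and defaults to equal weights $\xi_i$. All of these names and choices are illustrative assumptions.

```python
def q(mu, A, b):
    """Polyhedral concave function q(mu) = min_i (a_i' mu + b_i)."""
    return min(sum(aij * mj for aij, mj in zip(ai, mu)) + bi
               for ai, bi in zip(A, b))


def active_set(mu, A, b, tol=1e-9):
    """Indices i whose affine piece attains the minimum at mu (up to tol)."""
    q_mu = q(mu, A, b)
    return [i for i, (ai, bi) in enumerate(zip(A, b))
            if sum(aij * mj for aij, mj in zip(ai, mu)) + bi <= q_mu + tol]


def subgradient(mu, A, b, weights=None):
    """Convex combination of the active a_i; equal weights unless given."""
    idx = active_set(mu, A, b)
    if weights is None:
        weights = [1.0 / len(idx)] * len(idx)
    return tuple(sum(w * A[i][j] for w, i in zip(weights, idx))
                 for j in range(len(mu)))


# Hypothetical data: q(mu) = min(mu, 2 - mu), with a kink at mu = 1.
A = [(1.0,), (-1.0,)]
b = [0.0, 2.0]
print(active_set((0.0,), A, b), subgradient((0.0,), A, b))
print(active_set((1.0,), A, b), subgradient((1.0,), A, b))
```

Away from the kink only one piece is active and the subgradient is unique. At $\mu = 1$ both pieces are active, so any value in $[-1, 1]$ is a subgradient, and passing explicit `weights` selects a particular one.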
KEY SUBGRADIENT METHOD PROPERTY
• For a small stepsize it reduces the Euclidean
distance to the optimum.
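[Figure: contours of $q$ over the constraint set $M$. From $\mu^k$, the step $\mu^k + s^k g^k$ along the subgradient $g^k$ is projected to give $\mu^{k+1} = [\mu^k + s^k g^k]^+$; the angle between $g^k$ and $\mu^* - \mu^k$ is less than 90°.]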
• Proposition: For any dual optimal solution $\mu^*$
we have
$$\| \mu^{k+1} - \mu^* \| < \| \mu^k - \mu^* \|,$$
for all stepsizes $s^k$ such that
$$0 < s^k < \frac{2 \big( q(\mu^*) - q(\mu^k) \big)}{\| g^k \|^2}.$$
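The distance-reduction property can be observed on a single projected subgradient step $\mu^{k+1} = [\mu^k + s^k g^k]^+$. The sketch below uses a hypothetical concave function $q(\mu) = -|\mu_1 - 1| - |\mu_2 - 2|$, whose maximizer $\mu^* = (1, 2)$ is known, and tries several stepsizes inside the admissible interval. The function and starting point are assumptions for illustration.

```python
import math

TARGET = (1.0, 2.0)  # maximizer mu* of the toy dual function (hypothetical data)


def q(mu):
    return -sum(abs(m - t) for m, t in zip(mu, TARGET))


def subgradient(mu):
    """One subgradient of the concave q at mu (0 is a valid choice at a kink)."""
    return tuple(0.0 if m == t else (-1.0 if m > t else 1.0)
                 for m, t in zip(mu, TARGET))


def projected_step(mu, s):
    """mu^{k+1} = [mu^k + s g^k]^+, the projection onto mu >= 0."""
    g = subgradient(mu)
    return tuple(max(0.0, m + s * gi) for m, gi in zip(mu, g))


def max_stepsize(mu):
    """Upper end of the interval 0 < s < 2 (q(mu*) - q(mu)) / ||g||^2."""
    g = subgradient(mu)
    return 2.0 * (q(TARGET) - q(mu)) / sum(gi * gi for gi in g)


mu_k = (4.0, 0.0)
bound = max_stepsize(mu_k)
for frac in (0.1, 0.5, 0.9):
    mu_next = projected_step(mu_k, frac * bound)
    print(frac, math.dist(mu_k, TARGET), math.dist(mu_next, TARGET))
```

For each fraction of the bound, the second distance printed is smaller than the first, even though $q(\mu^{k+1})$ itself need not exceed $q(\mu^k)$ in general.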
STEPSIZE RULES
• Diminishing stepsize is one possibility.
• More common method:
$$s^k = \frac{\alpha^k \big( q^k - q(\mu^k) \big)}{\| g^k \|^2},$$
where $q^k \approx q^*$ and $0 < \alpha^k < 2$.
• Some possibilities:
− $q^k$ is the best known upper bound to $q^*$;
$\alpha^0 = 1$ and $\alpha^k$ decreased by a certain factor every
few iterations.
− $\alpha^k = 1$ for all $k$ and
$$q^k = \big( 1 + \beta(k) \big) \hat q^k,$$
where $\hat q^k = \max_{0 \le i \le k} q(\mu^i)$, and $\beta(k) > 0$ is
adjusted depending on the progress of the algorithm.
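The first possibility can be sketched as a complete projected subgradient loop. The toy function $q(\mu) = -|\mu_1 - 1| - |\mu_2 - 2|$, the value `q_upper = 0.5` standing in for a known upper bound on $q^*$, and the schedule parameters (`factor`, `every`, `iters`) are all assumptions for illustration; the lecture does not prescribe specific values.

```python
TARGET = (1.0, 2.0)  # maximizer of the toy dual function (hypothetical data)


def q(mu):
    return -sum(abs(m - t) for m, t in zip(mu, TARGET))


def subgradient(mu):
    """One subgradient of the concave q at mu (0 is a valid choice at a kink)."""
    return tuple(0.0 if m == t else (-1.0 if m > t else 1.0)
                 for m, t in zip(mu, TARGET))


def subgradient_method(mu0, q_upper, alpha0=1.0, factor=0.5, every=5, iters=50):
    """Projected subgradient ascent with
    s^k = alpha^k (q_upper - q(mu^k)) / ||g^k||^2."""
    mu = tuple(mu0)
    best_mu, best_q = mu, q(mu)
    for k in range(iters):
        g = subgradient(mu)
        norm_sq = sum(gi * gi for gi in g)
        if norm_sq == 0.0:  # zero subgradient: mu maximizes q
            break
        alpha = alpha0 * factor ** (k // every)  # shrink alpha every few iterations
        s = alpha * (q_upper - q(mu)) / norm_sq
        # Subgradient step followed by projection onto mu >= 0.
        mu = tuple(max(0.0, m + s * gi) for m, gi in zip(mu, g))
        if q(mu) > best_q:  # the iteration is not monotone, so track the best point
            best_mu, best_q = mu, q(mu)
    return best_mu, best_q


best_mu, best_q = subgradient_method((4.0, 0.0), q_upper=0.5)
print(best_mu, best_q)
```

Because `q_upper` overestimates $q^*$, the early steps overshoot and the iterates oscillate around the optimum until $\alpha^k$ is reduced, which is the reason for decreasing $\alpha^k$ over time.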