Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Primal Dual, Lecture Notes - Mathematics -, Study notes of Mathematical Methods

University of Oxford Mathematical Methods

Primal-Dual path-following for linear programming

Typology: Study notes

2010/2011

Uploaded on 09/09/2011

luber-1 🇬🇧

4.8

(12)

293 documents

1 / 10

This page cannot be seen from the preview

Don't miss anything!

C12.1B: CONTINUOUS OPTIMISATION

LECTURE 16: PRIMAL-DUAL PATH-FOLLOWING FOR LINEAR

PROGRAMMING

RAPHAEL HAUSER

MATHEMATICAL INSTITUTE, UNIVERSITY OF OXFORD

1. Interior-Point Methods for Linear Programming. The barrier method

we studied in Lecture 15 is an example of a so-called interior-point method: all inter-

mediate solutions are in the interior of the admissible domain. Although originally

devised for nonlinear programming, it was discovered in the mid 1980-ies that such

algorithms can also solve linear programming problems very efficiently.

Crucially, these algorithms can be designed so that they run in polynomial time,

that is, an upper bound on the number of bit operations performed by the algorithm

until completion can be given as a polynomial in the bit-length of the input data

(A, b, c) of a LP instance

(P) min

x∈RncTx

s.t. Ax =b,

x≥0.

The simplex algorithm on the other hand is known to take a number of bit oper-

ations that is exponential in the bit-length of (A, b, c) on certain LP instances. The

emergence of polynomial-time nterior-point methods for LP has therefore been hailed

as a great success, and it has sparked a huge research effort into improving these

methods and extending them to other convex problem classes.

Today, interior-point methods are often the best approach to solving very large

scale LP problems, and they are the method of choice for other important classes

of convex programming. We cannot enter the discussion of methods that underlie

industrial-strength software here, but we will use this lecture to outline the design

and analysis of the simplest case of such an algorithm.

2. Perturbations of LP problems. Recall the primal and dual linear pro-

gramming problems in standard form:

(P) min

x∈RncTx

s.t. Ax =b,

x≥0.

(D) max

y∈RmbTy

s.t. ATy+s=c,

s≥0.

For the purposes of the developing a lean theory, we will make the following reg-

ularity assumptions:

Definition 2.1. We say that (P) and (D) satisfy the standard LP regularity

assumption if the following conditions are met:

1

Discover Study notes of Mathematical Methods University of Oxford

Partial preview of the text

Download Primal Dual, Lecture Notes - Mathematics - and more Study notes Mathematical Methods in PDF only on Docsity!

C12.1B: CONTINUOUS OPTIMISATION

LECTURE 16: PRIMAL-DUAL PATH-FOLLOWING FOR LINEAR

PROGRAMMING

RAPHAEL HAUSER MATHEMATICAL INSTITUTE, UNIVERSITY OF OXFORD

Interior-Point Methods for Linear Programming. The barrier method we studied in Lecture 15 is an example of a so-called interior-point method: all inter- mediate solutions are in the interior of the admissible domain. Although originally devised for nonlinear programming, it was discovered in the mid 1980-ies that such algorithms can also solve linear programming problems very efficiently. Crucially, these algorithms can be designed so that they run in polynomial time, that is, an upper bound on the number of bit operations performed by the algorithm until completion can be given as a polynomial in the bit-length of the input data (A, b, c) of a LP instance

(P) min x∈Rn^

cTx

s.t. Ax = b, x ≥ 0.

The simplex algorithm on the other hand is known to take a number of bit oper- ations that is exponential in the bit-length of (A, b, c) on certain LP instances. The emergence of polynomial-time nterior-point methods for LP has therefore been hailed as a great success, and it has sparked a huge research effort into improving these methods and extending them to other convex problem classes. Today, interior-point methods are often the best approach to solving very large scale LP problems, and they are the method of choice for other important classes of convex programming. We cannot enter the discussion of methods that underlie industrial-strength software here, but we will use this lecture to outline the design and analysis of the simplest case of such an algorithm.

Perturbations of LP problems. Recall the primal and dual linear pro- gramming problems in standard form:

(P) min x∈Rn cTx

s.t. Ax = b, x ≥ 0.

(D) max y∈Rm

bTy

s.t. ATy + s = c, s ≥ 0.

For the purposes of the developing a lean theory, we will make the following reg- ularity assumptions:

Definition 2.1. We say that (P) and (D) satisfy the standard LP regularity assumption if the following conditions are met:

i) A has linearly independent row vectors, that is, rank(A) = m, ii) (P) is strictly feasible, that is, there exists a point x ∈ Rn^ such that Ax = b and x > 0 componentwise, iii) (D) is strictly feasible, that is, there exist points (y, s) ∈ Rm^ × Rn^ such that ATy + s = c and s > 0 componentwise.

Note that these regularity assumptions are nothing else but the Slater constraint qualification both for (P) and (D). Whenever the standard LP regularity assumption does not hold, we can preprocess the input data A, b, c of the problems (P),(D) and obtain equivalent problems (P’),(D’) that satisfy the assumption. Thus, there is no loss of generality in assuming that (P) and (D) are regular. The following notation will subsequently be used for the primal, dual and primal- dual feasible domains:

FP = {x : Ax = b, x ≥ 0 }, FD = {(y, s) : ATy + s = c, s ≥ 0 }, F P◦ = {x : Ax = b, x > 0 }, F D◦ = {(y, s) : ATy + s = c, s > 0 }, F◦=F P◦ × F D◦. For μ > 0 we consider the following perturbations of (P) and (D):

(P)μ min x∈Rn cTx + μf (x)

s.t. Ax = b x > 0.

(D)μ max y∈Rm bTy − μf (s)

s.t. ATy + s = c, s > 0.

In both problems

f : Rn ++ → R

x 7 → −

∑^ n

j=

log(xj )

is the logarithmic barrier function. In Problem Set 6 we studied the duality/optimality theory of problems (P)μ and (D)μ and found the following result:

Theorem 2.2. Let (P),(D) satisfy the standard LP regularity assumption. Then x(μ) ∈ Rn^ and

y(μ), s(μ)

∈ Rm^ × Rn^ are optimal for (P)μ and (D)μ respectively if and only if the following system holds true:

ATy + s = c Ax = b XSe = μe x, s > 0 ,

Now note that since ˆx, ˆs > 0, there exist vectors l, u > 0 such that

ˆsTx + μf (x) ≤ sˆT^ xˆ + μf (ˆx) ⇒ l ≤ x ≤ u.

Therefore, (P)μ is equivalent to

(P’)μ min ˆsTx + μf (x) s.t. Ax = b, l ≤ x ≤ u.

Since {x : Ax = b, l ≤ x ≤ u} is a compact subset of Rn ++ and ˆsTx + μf (x) is contin- uous, the minimum of (P’)μ is attained.

Definition 3.2. For μ > 0 let us write

x(μ), y(μ), s(μ)

for the unique solution of the central path equations (2.1). Then the set {x(μ) : μ > 0 } is called the primal central path, {(y(μ), s(μ) : μ > 0 } is the dual central path, and {(x(μ), y(μ), s(μ)) : μ > 0 } is the primal-dual central path.

The central paths are smooth curves leading to a primal-dual optimal pair of solutions for the original problems (P),(D):

Theorem 3.3. The map

μ 7 →

x(μ), y(μ), s(μ)

is continuously differentiable. Furthermore, there exist x∗^ and (y∗, s∗) which are op- timal solutions to (P) and (D) respectively such that

lim μ↓ 0

x(μ), y(μ), s(μ)

= (x∗, y∗, s∗).

A primal-dual interior-point algorithm. The critical step of our primal- dual interior-point algorithm will be the following: given an approximate solution (x, y, s) to the central path equations (2.1), find a better approximation. An obvious approach is to apply Newton’s method to find a zero of the map  

x y s

ATy + s − c Ax − b XSe − μe

that is, we solve the system  

0 AT^ I

A 0 0

S 0 X

∆x ∆y ∆s

ATy + s − c Ax − b XSe − μe

for (∆x, ∆y, ∆s) and set  

x+ y+ s+

x y s

∆x ∆y ∆s

Note that we have neglected the positivity constraints x, s > 0 of the central path equations. We could enforce these by taking a damped Newton step α(∆x, ∆y, ∆s)T rather than the full step (∆x, ∆y, ∆s)T^ when this takes (¯x+, y¯+, ¯s+) outside of the domain x, s > 0, just as we did in our barrier methods for nonlinear programming. However, a nice feature of our algorithm will be that this issue is dealt with automat- ically through the notion of centrality developed below. We will now describe and analyse an algorithm that iterates over points that satisfy the constraints

ATy + s = c Ax = b x, s > 0

but not necessarily the equation XSe = μe. This requires a starting point (x, y, s) that satisfies (4.2). This issue can be dealt with via a phase I type auxiliary problem. Thus, we may simply assume that such a point is available.

4.1. Centrality and the Duality Gap. In order to be able to assure that the iterates of our algorithm stay well inside the domain x, s > 0, we need a measure of centrality, or of “nearness” to the central path.

Definition 4.1. For all ω = (x, y, s) ∈ F◦^ =

(x, y, s) : ATy + s = c, Ax = b, x, s > 0

we define

μ(ω) :=

∑n j=1 xj^ sj n

Recall that LP duality showed that any feasible solution of (P) yields an upper bound on the optimal solution of (D), and any feasible solution of (D) yields a lower bound on the optimal solution of (P).

Definition 4.2. Let x and (y, s) be primal and dual feasible points. The duality gap associated with these solutions is defined as cTx − bTy.

Strong LP duality shows that the duality gap becomes zero at a primal-dual optimal point ω∗^ = (x∗, y∗, s∗). The number μ(ω) is useful in monitoring the progress of an algorithm because it is proportional to the duality gap: if ω = (x, y, s) is primal- dual feasible, then

cTx − bTy = xT(c − ATy) = xTs = nμ(ω).

It is thus reasonable to fix a number σ ∈ (0, 1) and to set μ = σμ(ω) in the system (4.1). That is to say, we are aiming to reduce the duality gap by a constant factor in each iteration. Another interesting observation is that ω = (x, y, s) ∈ F◦^ lies on the primal-dual central path if and only if XSe = μ(ω)e. This can be used to define a neighbourhood of the central path:

Definition 4.3. For θ ∈ (0, 1), let

N 2 (θ) :=

ω = (x, y, s) ∈ F◦^ : ‖XSe − μ(ω)e‖ 2 ≤ θμ(ω)

ω 0

ω 1

ω∗

ωμ

F◦

N 2 (θ)

Fig. 4.2. Choosing σ too small and aiming at a point ωμ too far away makes the update leave the narrow neighbourhood N 2 (θ).

Algorithm 4.4 (SPF). S0 Choose θ, δ ∈ (0, 1) be such that

θ^2 + δ^2 23 /^2 (1 − θ)

δ √ n

θ.

Set σ := 1 − √δn and choose ω 0 = (x 0 , y 0 , s 0 ) ∈ N 2 (θ). S1 For k = 0, 1 ,... repeat solve (4.3) with ω = ωk for ∆ω := (∆x, ∆y, ∆s) compute ωk+1 = ωk + ∆ω end

Convergence Analysis of Algorithm SPF. Let us now analyse the con- vergence of the primal-dual short-step path-following method SPF. Our main theorem shows that θ and σ were chosen so that the iterates never leave the narrow neighbour- hood N 2 (θ) of the central path, and that the duality gap shrinks at a geometric rate:

Theorem 5.1. The sequence (ωk)N generated by Algorithm SPF satisfies ωk ∈ N 2 (θ) for all k ∈ N, and

μ(ωk) =

δ √ n

)k μ(ω 0 ).

An immediate consequence of Theorem 5.1 is that it takes only logarithmically many iterations to reduce the duality gap below a desired threshold ǫ > 0:

Corollary 5.2. After at most k = O

n log n×μ ǫ(ω^0 )

iterations Algorithm SPF produces a point ωk = (xk, yk, sk) ∈ F◦^ such that

cTxk − bTyk ≤ ǫ.

Theorem 5.1 readily follows from the following result:

Lemma 5.3. Let ω = (x, y, s) ∈ N 2 (θ) and let ω+ = (x+, y+, s+) = ω + ∆ω, where ∆ω = (∆x, ∆y, ∆s) solves the system  

0 AT^ I

A 0 0

S 0 X

∆x ∆y ∆s

XSe − σμ(ω)e

with σ = 1 − √δn and θ, δ chosen as in the initialisation step of Algorithm SPF. Then

ω+ ∈ N 2 (θ) and μ(ω+) = σμ(ω).

Proof. Let μ+ =

1 − √δn

μ(ω). We claim that the following three relations hold true:

μ+ =

eTX+S+e n

‖X+S+e − μ+e‖ ≤ θμ+, (5.3) x+, s+ > 0. (5.4)

Clearly, these relations imply that the lemma holds true.

We will establish the validity of Claims (5.2), (5.3), and (5.4) separately after proving the following two technical lemmas:

Lemma 5.4. ∆xT∆s = 0.

Proof. This follows readily from the first two blocks of equations in (5.1).

Lemma 5.5. Let u, v ∈ Rn^ be such that uTv ≥ 0 and let U = Diag(u), V = Diag(v). Then

‖U V e‖ ≤

‖u + v‖^2 23 /^2

Proof. First note that for all α, β ∈ R such that αβ ≥ 0 it is the case that

|αβ| ≤ (α + β)^2 4

Consider the index sets I := {j : uj vj ≥ 0 } and J := {j : uj vj < 0 }. Using this notation we can write uTv =

j∈I |uj^ vj^ | −^

j∈J |uj^ vj^ | ≥^ 0. Therefore,

‖U V e‖^2 =

j∈I

(uj vj )^2 +

j∈J

(uj vj )^2 = ‖uvI ‖^22 + ‖uvJ ‖^22

≤ ‖uvI ‖^21 + ‖uvJ ‖^21 ≤ 2 ‖uvI ‖^21 ,

where uvI is a vector of length |I| with components {uj vj : j ∈ I} and similarly for uvJ , and where we used the well-known fact that ‖ · ‖ 2 ≤ ‖ · ‖ 1. Finally, this proves that

‖U V e‖ ≤

2 ‖uvI ‖ 1

(5.5) ≤

j∈I

(uj + vj )^2 ≤

‖u + v‖^2 23 /^2

shows that XSe − μe ⊥ e.

Lemma 5.8. Claim (5.4) is true.

Proof. We begin with the case θ ≤ 1 /2 which is easier to understand. Relation (5.3) shows that (x+)j (s+)j ≥ (1 − θ)μ+ > 0 for all j. So, if (x+)j < 0 for some j then (s+)j is negative too, and then

∆xj ∆sj ≥ (x+)j (s+)j > (1 − θ)μ+. (5.12)

On the other hand,

∆xj ∆sj ≤ ‖∆X∆Se‖

(5.3),(5.7) ≤ θμ+. (5.13)

The combination of (5.12) and (5.13) yields

(1 − θ)μ+ ≤ ∆xj ∆sj < θμ+

which implies the contradiction 2θ > 1 and proves the claim. If we do not assume θ ≤ 1 /2, we can proceed as follows: let

ω(α) :=

x(α), y(α), s(α)

:= (x, y, s) + α(∆x, ∆y, ∆s).

Proceeding as in the proof of (5.2) and (5.3), one can establish that for all α ∈ [0, 1],

μ

ω(α)

eTX(α)S(α)e n

αδ √ n

μ(ω),

and ‖X(α)S(α)e − μ(ω(α))e‖ ≤ θμ(ω(α)). Thus, xj (α)sj (α) ≥ (1 − θ)μ(ω(α)) for all α ∈ [0, 1]. This implies xj (α), sj (α) > 0 for all α ∈ [0, 1] by continuity of x(α), s(α) and by the fact that (1 − θ)μ(ω(α)) > 0. In particular, x+ = x(1) > 0 and s+ = s(1) > 0, which proves the claim.

Primal Dual, Lecture Notes - Mathematics -, Study notes of Mathematical Methods

Related documents

Partial preview of the text

Download Primal Dual, Lecture Notes - Mathematics - and more Study notes Mathematical Methods in PDF only on Docsity!

C12.1B: CONTINUOUS OPTIMISATION

LECTURE 16: PRIMAL-DUAL PATH-FOLLOWING FOR LINEAR

PROGRAMMING

0 AT^ I

A 0 0

S 0 X

0 AT^ I

A 0 0

S 0 X