Benders' Decomposition for Linear Programming with Block-Ladder Structure, Study notes of Linux skills

Benders' decomposition for solving linear programming problems with block-ladder structure. The method involves decomposing the problem into smaller subproblems and iteratively solving them to update the master problem. The document also covers delayed constraint generation and its implementation for problems with block-ladder structure.

Typology: Study notes

Pre 2010

Uploaded on 03/18/2009

koofers-user-h3g
koofers-user-h3g 🇺🇸

5

(1)

10 documents

1 / 23

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Benders’ Decomposition Methods for Structured
Optimization, including Stochastic Optimization
Robert M. Freund
April 29, 2004
c
2004 Massachusetts Institute of Technology.
1
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17

Partial preview of the text

Download Benders' Decomposition for Linear Programming with Block-Ladder Structure and more Study notes Linux skills in PDF only on Docsity!

Benders’ Decomposition Methods for Structured

Optimization, including Stochastic Optimization

Robert M. Freund

April 29, 2004

© c 2004 Massachusetts Institute of Technology.

1 Block Ladder Structure

We consider the solution of a linear optimization model of the basic format:

T minimize x,y c x + f

T y

s.t. Ax = b

Bx + Dy = d

x ≥ 0 y ≥ 0.

Here the variables x can be thought of as stage-1 decisions, governed by

constraints Ax = b, x ≥ 0 and with cost function c

T x. Once the x variables

have been chosen, the y variables are chosen next, subject to the constraints

Dy = d − Bx, y ≥ 0 and with cost function f

T y.

We also consider the more complex format associated with two-stage

optimization under uncertainty:

T minimize c x + α 1 f 1

T y 1 + α 2 f 2

T y 2 + · · · + αK f

T

K

yK

x, y 1 ,... , y K

s.t. Ax

B

1 x + D 1 y 1

B

2 x

= b

= d 1

+ D

2 y 2 = d 2

..^.

B

K x + D K y K = d K

x, y 1 , y 2 ,... , y K

Actually, each of these formats can be thought of as a special case of the

other format. The more complex format above is known as “block-ladder”,

and is illustrated in Figure 1.

whose optimal objective value is denoted by VAL. We re-write this as:

VAL = minimumx c

T x + z(x)

s.t. Ax = b

x ≥ 0 ,

where:

P2 : z(x) = minimum y f

T y

s.t. Dy = d − Bx

y ≥ 0.

We call this problem “P2” because it represents the stage-2 decision problem,

once the stage-1 variables x have been chosen. Applying duality, we can also

construct z(x) through the dual of P2, which we denote by D2:

D2 : z(x) = maximum p p

T (d − Bx)

s.t. D

T p ≤ f.

The feasible region of D2 is the set:

D

2 := p | D

T p ≤ f

whose extreme points and extreme rays can be enumerated:

1 I p ,... , p

are the extreme points of P 2 , and

1 J r ,... , r

are the extreme rays of P 2

If we use an algorithm to solve D2, exactly one of two cases will arise:

D2 is unbounded from above, or D2 has an optimal solution. In the first

case, the algorithm will find that D2 is unbounded from above, and it will

return one of the extreme rays ¯ = r

j r for some j, with the property that

(r

j )

T (d − Bx) > 0

in which case

z(x) = +∞.

In the second case, the algorithm will solve D2, and it will return one of the

extreme points p¯ = p

i for some i as well as the optimal objective function

value z(x) which will satisfy:

z(x) = (p

i )

T (d − Bx) = max (p

k )

T (d − Bx).

k=1,...,I

Therefore we can re-write D2 again as the following problem:

D2 : z(x) = minimum z z

s.t. (p

i )

T (d − Bx) ≤ z i = 1,... , I

(r

j )

T (d − Bx) ≤ 0 j = 1,... , J.

This problem simply states that the value of D2 is just the dual objective

function evaluated at the best extreme point of the dual feasible region, pro-

vided that the objective function makes an obtuse angle with every extreme

ray of the dual feasible region.

We solve this problem, obtaining the optimal objective value VAL

k and

the solution ¯x, z¯ that is optimal for this problem. We first observe that

VAL

k is a lower bound on VAL, i.e.:

VAL

k ≤ VAL.

In order to check whether the solution x, z¯ ¯ is optimal for the full mas-

ter problem, we must check whether x, z¯ ¯ violates any of the non-included

constraints. We check for this by solving the linear optimization problem:

x) : = maximum p p

T Q(¯ (d − B ¯x) = minimum y f

T y

s.t. D

T p ≤ f s.t. Dy = d − B ¯

y ≥ 0.

If Q(¯ x) is unbounded, the algorithm will return an extreme ray ¯r = r

j

for some j, where x¯ will satisfy:

(r

j )

T (d − B x¯) > 0.

We therefore have found that ¯x has violated the constraint:

(r

j )

T (d − Bx) ≤ 0 ,

and so we add this constraint to RMP and re-solve RMP.

If Q(¯x) has an optimal solution, the algorithm will return an optimal

i extreme point p¯ = p for some i, as well as the optimal solution y¯ to the

minimization problem of Q(¯x).

x

Now notice first from Q(¯x ) that (¯x, y¯) is feasible for the original problem.

Therefore, if UB is a previously computed upper bound on VAL, we can

update the upper bound and our “best” feasible solution as follows:

T If c ¯x + f x, ¯

T y¯ ≤ UB, then BESTSOL ← (¯ y).

T UB ← min{UB, c x¯ + f

T y¯}.

Also, notice that if:

(p

i )

T (d − B ¯x ) > z ,¯

then we have found that x, z¯ ¯ has violated the constraint:

(p

i )

T (d − Bx) ≤ z ,

and so we add this constraint to RMP and re-solve RMP.

If (p

i )

T (d − B ¯x) ≤ z¯, then we have:

max (p

k )

T (d − B ¯x ) = (p x) ≤ ¯

i )

T (d − B ¯ z ,

k=1,...,I

and so x, z¯ ¯ is feasible and hence optimal for FMP, and so (¯x, y¯) is optimal

for the original problem, and so we terminate the algorithm.

By our construction of upper and lower bounds on VAL, we can also

terminate the algorithm whenever:

UB − VAL

k ≤ ε

for a pre-specified tolerance ε.

T If c ¯x + f x, y¯)

T y¯ ≤ UB, then BESTSOL ← (¯

T UB ← min{UB, c x¯ + f

T y¯}

If UB − LB ≤ ε, then terminate. Otherwise, return to step 1.

  1. If Q(¯x ) is unbounded from above, let r¯ be the extreme ray generated

by the algorithm that solves Q(¯x). Add the constraint:

T (d − Bx) ≤ 0

to the restricted master problem RMP, and return to step 1.

This algorithm is known formally as Benders’ decomposition. The Ben-

ders’ decomposition method was developed in 1962 [2], and is described in

many sources on large-scale optimization and stochastic programming. A

general treatment of this method can be found in [3, 4].

The algorithm can be initialized by first computing any x¯ that is feasible

for the first stage, that is, Ax¯ = b, x¯ ≥ 0, and by choosing the value of z¯

to be −∞ (or any suitably small number). This will force (¯x, z¯) to violate

some constraints of FMP.

Notice that the linear optimization problems solved in Benders’ decom-

position are “small” relative to the size of the original problem. We need

to solve RMP

k once every outer iteration and we need to solve Q(¯x) once

every outer iteration. Previous optimal solutions of RMP

k and Q(¯x) can

serve as a starting basis for the current versions of RMP

k and Q(¯x) if we

use the simplex method. Because interior-point methods do not have the

capability of starting from a previous optimal solution, we should think of

Benders’ decomposition as a method to use in conjunction with the simplex

method primarily.

Incidentally, one can show that Benders’ decomposition method is the

same as Dantzig-Wolfe decomposition applied to the dual problem.

4 Benders’ Decomposition for Problems with Block-

Ladder Structure

Suppose that our problem has the complex block-ladder structure of the

following optimization model that we see for two-stage optimization under

uncertainty:

T VAL = minimize c x + α 1 f 1

T y 1

  • α 2 f 2

T y 2

  • · · · + α K f

T

K

y K

x, y 1 ,... , yK

s.t. Ax

B

1 x + D 1 y 1

B

2 x

= b

= d 1

+ D

2 y 2 = d 2

..^.

B

K x + D K y K = d K

x, y 1 , y 2 ,... , y K

whose optimal objective value is denoted by VAL. We re-write this as:

K T VAL = minimum x c x + α ω z ω (x)

ω=

s.t. Ax = b

x ≥ 0 ,

where:

(r

j

ω

T (d ω

− B

ω x) > 0

in which case

z ω (x) = +∞.

In the second case, the algorithm will solve D ω , and it will return one of the

extreme points p¯ω = p

i for some i as well as the optimal objective function ω

value z ω (x) which will satisfy:

i k zω (x) = (p ω

T (dω − Bω x) = max (p ω

T (dω − Bω x).

k=1,...,Iω

Therefore we can re-write D ω again as the following problem:

D

ω : z ω (x) = minimum zω z ω

i s.t. (p ω

T (d ω

− B

ω x) ≤ z ω i = 1,... , I ω

j (r ω

T (d ω

− B

ω x) ≤ 0 j = 1,... , J ω

This problem simply states that the value of D ω is just the dual objective

function evaluated at the best extreme point of the dual feasible region, pro-

vided that the objective function makes an obtuse angle with every extreme

ray of the dual feasible region.

We now take this reformulation of z ω (x) and place it in the orginal

problem:

K T FMP : VAL = minimum c x + α ω z ω

ω=

x, z 1 ,... , zK

s.t. Ax = b

x ≥ 0

i (p ω

T (d ω

− B

ω x) ≤ z ω i = 1,... , I ω , ω = 1,... , K

j (r ω

T (d ω

− B

ω x) ≤ 0 j = 1,... , J ω , ω = 1,... , K.

We call this problem FMP for the full master problem. Comparing

FMP to the original version of the problem, we see that we have eliminated

the vector variables y 1 ,... , yK from the problem, we have added K scalar

variables z ω for ω = 1,... , K, and we have also added a generically extremely

large number of constraints.

As in the basic model, we consider the restricted master problem RMP

composed of only k of the extreme point / extreme ray constraints from

FMP:

K T RMP

k : VAL

k = minimum c x + α ω z ω

ω=

x, z 1 ,... , z K

s.t. Ax = b

x ≥ 0

i (p ω

T (d ω

− B

ω x) ≤ z ω for some i and ω

j (r ω

T (d ω

− B

ω x) ≤ 0 for some i and ω ,

where there are a total of k of the inequality constraints. We solve this prob-

lem, obtaining the optimal objective value VAL

k and the solution ¯ z 1 ,... , z¯ K x, ¯

then we have found that x,¯ z¯ω has violated the constraint:

i (p ω

T (d ω

− B

ω x) ≤ z ω

and so we add this constraint to RMP.

If Qω (¯x) has finite optimal objective function values for all ω = 1,... , K,

then x,¯ y¯ 1 ,... , y¯ K satisfies all of the constraints of the original problem.

Therefore, if UB is a previously computed upper bound on VAL, we can

update the upper bound and our “best” feasible solution as follows:

K

T If c x¯ + α x, ¯ ω f ω

T y¯ ω ≤ UB, then BESTSOL ← (¯ y 1 ,... , y¯ K

ω=

K

T UB ← min{UB, c x¯ + αω fω

T y¯ω }.

ω=

i If (p ω

T (d ω

− B

ω x) ≤ z¯ ω ¯ for all ω = 1,... , K, then we have:

l i max (p ¯ ¯ ω

T (d ω

− B

ω x) = (p ω

T (d ω

− B

ω x) ≤ z¯ ω for all ω = 1,... , K ,

l=1,...,Iω

and so x,¯ z¯ x, ¯ 1 ,... , z¯ K satisfies all of the constraints in FMP. Therefore, ¯ z 1 ,... , z¯ K

is feasible and hence optimal for FMP, and so (¯ y 1 ,... , y¯ K x, ¯ ) is optimal for

the original problem, and we can terminate the algorithm.

Also, by our construction of upper and lower bounds on VAL, we can

also terminate the algorithm whenever:

UB − VAL

k ≤ ε

for a pre-specified tolerance ε.

4.1 The Delayed Constraint Generation Algorithm for Prob-

lems with Block-Ladder Structure

Here is a formal description of the algorithm just described for problems

with block-ladder structure.

  1. Set LB= −∞ and UB= +∞.
  2. A typical iteration starts with the relaxed master problem RMP

k , in

which only k constraints of the full master problem FMP are included.

An optimal solution x,¯ z¯ 1 ,... , z¯ K to the relaxed master problem is

computed. We update the lower bound on VAL:

LB ← VAL

k .

  1. For ω = 1,... , K, solve the subproblem Qω (¯x):

T x) : = maximum pω p ω

(d ω

− B

ω x) = minimum yω f

T Q ω

ω

y ω

s.t. D

T ¯ ω

p ω ≤ f ω s.t. D ω y ω = d ω

− B

ω x

y ω

  • If Qω (¯x) is unbounded from above, let r¯ω be the extreme ray

generated by the algorithm that solves Q(¯x). Add the constraint:

T r¯ ω

(d ω

− B

ω x) ≤ 0

to the restricted master problem RMP.

  • If Q ω (¯ p ω x) has an optimal solution, let ¯ and y¯ ω be the primal and

i x). If (p ω

T (d ω

− B

ω x) > z¯ ω dual solutions of Q , then we add ω

the constraint:

4 125 5 5 15

min c i x i

  • α ω
  1. 01 f i (ω)h j y ijkω x,y

i=1 ω=1 i=1 j=1 k=

4

s.t. c i x i ≤ 10 , 000 (Budget constraint)

i=

x 4 ≤ 5. 0 (Hydroelectric constraint)

yijkω ≤ xi for i = 1,... , 4 , all j, k, ω (Capacity constraints)

5

yijkω ≥ Djkω for all j, k, ω (Demand constraints)

i=

x ≥ 0 , y ≥ 0

where αω is the probability of scenario ω, and Djkω is the power demand in

block j and year k under scenario ω.

This model has 4 stage-1 variables, and 46,875 stage-2 variables. Further-

more, other than nonnegativity constraints, there are 2 stage-1 constraints,

and 375 constraints for each of the 125 scenarios in stage-2. This yields

a total of 46 , 877 constraints. It is typical for problems of this type to be

solved by Benders’ decomposition, which exploits the problem structure by

decomposing it into smaller problems.

Consider a fixed x which is feasible (i.e., x ≥ 0 and satisfies the budget

constraint and hydroelectric constraint). Then the optimal second-stage

variables y ijkω can be determined by solving for each ω the problem:

5 5 15

z ω (x) � min y

i=1 j=1 k=

  1. 01 f i (ω)h j y ijkω

subject to y ijkω ≤ x i for i = 1,... , 4 , all j, k, (Capacity constraints)

5 �

y ijkω

≥ D

jkω for all j, k (Demand constraints)

i=

y ≥ 0.

Although this means solving 125 problems (one for each scenario), each

subproblem has only 375 variables and 375 constraints. (Furthermore, the

structure of this problem lends itself to a simple greedy optimal solution.)

The dual of the subproblem (2) is:

4 5 15 5 15 � � � � �

z ω (x) � max

p,q i=1 j=1 (^) k=

x i p ijkω

j=1 (^) k=

D

jkω q jkω

subject to p ijkω

  • q jkω ≤ 0. 01 f i (ω)h j i = 1,... , 4 , for all j, k

p ijkω

qjkω ≥ 0.

6 Computational Results

All models were solved in OPLStudio on a Sony Viao Laptop with a Pentium

III 750 MHz Processor Running Windows 2000. Table 1 shows the number

of iterations taken by the simplex method to solve the original model.

Original Model CPT Time

Iterations (minutes:seconds)

Simplex 41,334 5:

Table 1: Iterations to solve the original model.

We next solved the problem using Benders’ decomposition, using a du-

ality gap tolerance of ε = 10

− 2

. We first solved the model treating all

stage-2 variables and constraints as one large block, using the basic Ben-

ders’ decomposition algorithm. We next solved the model treating each of

the 125 stage-2 scenarios as a separate block, using the more advanced ver-

sion of Benders’ decomposition that takes full advantage of the block-ladder

structure. The number of master iterations and the number of constraints