






These notes review the basics of linear programming and the simplex method, focusing on underdetermined systems of linear equations, linear programs, basic solutions, and the simplex method. The notes assume a strong background in linear algebra and exclude numerical implementation, stability, and geometric considerations.







The purpose of these notes is to review the basics of linear programming and the simplex method in a clear, concise, and comprehensive way. The book contains all of this material, but it is unfortunately spread across several chapters and, in my opinion, confusing in part.
These notes are a bit more demanding than the book; if you can read and thoroughly understand them, then you are doing very well in the course. The difficulty that most students will encounter in these notes is that they assume a thorough knowledge of linear algebra. I have said several times in class and will repeat now: there is, in my experience, almost no subject with a better reward/effort ratio than linear algebra (statistics being perhaps more rewarding still per unit of effort). A thorough, complete knowledge of elementary linear algebra will serve anyone in a technical field well indeed.
Note that this discussion is not comprehensive; in particular, I have omitted any discussion of numerical implementation and stability, as well as any discussion of the geometry underlying linear programs.
We begin by reviewing underdetermined systems of linear equations. Let A be an m × n matrix with n > m. Then the system of equations
(1) Ax = b
is underdetermined — it has more variables than equations. Suppose we were to row reduce the augmented matrix
[ A | b ]
in order to solve the system (1). Because the matrix A has more columns than rows we will necessarily have free variables (if this isn’t clear to you, work out an example now). So we cannot expect a unique solution for this system.
Of course, it is also possible that (1) has no solutions at all (if this isn’t clear, stop and find an example). By making an additional assumption on the matrix A, we can ensure that the system (1) has a solution for every b ∈ R^m. In particular, we will generally assume that the matrix A has rank m.
Recall that the rank of a matrix is the dimension of its column and row spaces (these two dimensions are equal). So the assumption that A has rank m means that there are no redundant equations in the system (1), or, equivalently, that the column space of A must be all of R^m; in other words, for every b there is some solution of the system (1). It is also equivalent to saying that some set of m columns of the matrix A forms a basis for R^m.
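We will be particularly interested in a distinguished class of solutions of (1). We say that a solution x ∈ R^n of the system (1) is a basic solution if there is a set of indices {i_1, ..., i_m} ⊂ {1, 2, ..., n} such that:
(a) the columns A_{i_1}, ..., A_{i_m} of A form a basis for R^m, and
(b) x_j = 0 for every j ∉ {i_1, ..., i_m}.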
Note that the columns of the matrix corresponding to a basic solution form a submatrix B of A which is invertible; indeed, a set of m vectors v_1, ..., v_m in R^m forms a basis if and only if the matrix whose columns are v_1, ..., v_m is invertible (check this!).
Basic solutions for (1) are (relatively) easy to find: simply find a submatrix B consisting of columns A_{i_1}, ..., A_{i_m} of A which form a basis (perhaps via Gram-Schmidt orthogonalization), solve the m × m system Bz = b for z, and form the basic solution x with entries
\[
x_j =
\begin{cases}
z_k & \text{if } j = i_k \text{ for some } k \in \{1, \ldots, m\}, \\
0 & \text{otherwise.}
\end{cases}
\]
Since the matrix B is invertible, there is one and only one basic solution x associated with the columns i_1, ..., i_m. We will call that solution the basic solution associated with the columns i_1, ..., i_m. Alternatively, we will say that x is the basic solution corresponding to the invertible submatrix B.
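The recipe above (pick m columns, solve Bz = b, pad with zeros) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `solve` and `basic_solution` are hypothetical, column indices are 0-based, exact rational arithmetic is used to sidestep rounding, and the data are the constraints of a small example with m = 2 and n = 3.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    # Augmented matrix [M | rhs] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Look for a nonzero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the columns of M are linearly dependent
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def basic_solution(A, b, cols):
    """Basic solution of A x = b for the chosen columns, or None if they are not a basis."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in cols] for i in range(m)]  # submatrix of chosen columns
    z = solve(B, b)
    if z is None:
        return None
    x = [Fraction(0)] * n          # entries outside the chosen columns are zero
    for k, j in enumerate(cols):
        x[j] = z[k]
    return x

# Illustrative data: x1 + x3 = -1 and x2 - x3 = -1.
A = [[1, 0, 1],
     [0, 1, -1]]
b = [-1, -1]
x = basic_solution(A, b, (0, 2))                              # x = (-2, 0, 1)
singular = basic_solution([[1, 2, 0], [2, 4, 1]], [1, 1], (0, 1))  # None: columns dependent
```

Note that a basic solution need not be nonnegative; here the first entry is negative.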
These facts are sufficiently important that we will repeat them in Lemma form; we have proved the following:
LEMMA 1.1. Suppose that A is an m × n matrix, n ≥ m, of rank m and further suppose that the columns i_1, ..., i_m of A form a basis for R^m. Then there is a unique vector x, which we will call the basic vector for the columns i_1, ..., i_m, such that Ax = b and x_j = 0 for every j ∉ {i_1, ..., i_m}.
A linear program is any optimization problem of the form
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\tag{2}
\]
where A is an m × n matrix, n > m, of rank m, c is a given column vector of length n, b is a given column vector of length m, and x is a column vector of unknowns of length n.
Remark 2.1. Note that the assumption that A has rank m ensures that there are no redundant constraints in (2). It differs from the usual definition of linear program only in that it excludes certain infeasible problems from consideration; e.g., the problem
\[
\begin{aligned}
\text{max:} \quad & x_1 + x_2 \\
\text{subject to:} \quad & x_1 - x_2 = -1 \\
& -x_1 + x_2 = -1 \\
& x_1, x_2 \ge 0
\end{aligned}
\]
is excluded.
We will call a vector x which satisfies the constraints Ax = b and x ≥ 0 a feasible vector for the program (2). Moreover, a feasible vector x at which cx attains its maximum over the set of all feasible vectors is called an optimal feasible vector for the program (2), or, more simply, a solution to (2). The linear function cx is called the objective function for the program (2).
The first thing we should note about the problem (2) is that the constraint equations Ax = b are underdetermined. Because of our assumption about the rank of A, we expect there to be an infinite number of solutions for this system of equations. That only stands to reason, since the problem is asking us to find, from among the infinite number of x's satisfying the constraints, a single x such that cx is maximized.
Our second immediate observation is that, despite the fact that the system Ax = b has an infinite number of solutions, it is not necessarily true that there exists even a single x satisfying all of the constraints. For example, this is the case for the linear program
\[
\begin{aligned}
\text{max:} \quad & x_1 + x_2 + x_3 \\
\text{subject to:} \quad & x_1 + x_3 = -1 \\
& x_2 - x_3 = -1 \\
& x_1, x_2, x_3 \ge 0.
\end{aligned}
\tag{3}
\]
We call a linear program for which there are no vectors x satisfying both Ax = b and x ≥ 0 infeasible.
Clearly, infeasible linear programs do not admit a solution. There is another way in which a linear program can fail to have a solution. There might not be a maximum value of the objective function cx. In such cases, we say that the program (2) is unbounded.
We close this section with a final observation about (2): if the set of x which satisfy the constraints is bounded (as a set in R^n), then the problem (2) cannot be unbounded. As with all sufficiently elementary facts, there are many different incantations we can invoke to see that this is so (e.g., a continuous function on a compact set in R^n attains its maximum). This implies, for example, that a linear program of the
ways of choosing m indices from a set of n possible indices, and upon termination η will either be −∞, in which case the problem is infeasible, or η will be the maximum value of the objective function and x_0 will be a basic solution.
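The statement of the enumeration algorithm itself does not appear in this copy of the notes, so the following Python sketch is a reconstruction from the sentence above, not the author's listing. It tries every choice of m columns, skips choices that are not a basis or whose basic vector has a negative entry, and keeps the best objective value η (starting from −∞) together with the vector x_0 that attains it. The names `solve` and `enumerate_bases` are hypothetical, and the sketch assumes the program is not unbounded.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def enumerate_bases(A, b, c):
    """Exhaustive search over all choices of m columns (assumes the LP is not unbounded)."""
    m, n = len(A), len(A[0])
    eta, x0 = float("-inf"), None
    for cols in combinations(range(n), m):
        B = [[A[i][j] for j in cols] for i in range(m)]
        z = solve(B, b)
        if z is None or any(v < 0 for v in z):
            continue  # not a basis, or the basic vector is infeasible
        value = sum(Fraction(c[j]) * z[k] for k, j in enumerate(cols))
        if value > eta:
            eta = value
            x0 = [Fraction(0)] * n
            for k, j in enumerate(cols):
                x0[j] = z[k]
    return eta, x0

# The infeasible program (3): no basic vector is nonnegative, so eta stays at -infinity.
eta3, x3 = enumerate_bases([[1, 0, 1], [0, 1, -1]], [-1, -1], [1, 1, 1])

# An illustrative bounded program: max 3*x1 + 2*x2 with two equality constraints.
eta, x0 = enumerate_bases([[1, 1, 1, 0], [1, 3, 0, 1]], [4, 6], [3, 2, 0, 0])
# eta = 12, attained at x0 = (4, 0, 0, 2)
```

The loop visits every one of the C(n, m) index sets, which is why this approach is only practical for very small problems.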
Consider the linear program
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\tag{5}
\]
where A is an m × n matrix, n > m, of rank m, and suppose that x̃ is a basic solution to the constraint equation Ax = b. Note: we are not assuming that it is a solution to the entire linear program, just the constraint equation; indeed, we are not even assuming that it is feasible.
In the future, we will call such vectors basic vectors for the linear program (5) to avoid any confusion over the word “solution.” A basic vector which is feasible will be called, of course, a basic feasible vector and a basic vector which is feasible and optimal will be called a basic solution for the LP.
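We can, by rearranging the columns of A and the rows of x, ensure that the basis associated with x̃ consists of the first m columns of A. We can then write the LP (5) as
\[
\begin{aligned}
\text{max:} \quad & c_1^t x_B + c_2^t x_N \\
\text{subject to:} \quad & \begin{pmatrix} B & N \end{pmatrix} \begin{pmatrix} x_B \\ x_N \end{pmatrix} = b \\
& x_B, x_N \ge 0,
\end{aligned}
\tag{6}
\]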
where we have partitioned the variables x_1, ..., x_n into two sets: the basic variables x_B corresponding to the first m columns of A and the nonbasic variables x_N. This leads to a corresponding partitioning of the constraint matrix A into the m × m invertible matrix B and the m × (n − m) matrix N.
Remember that we started with some basic vector x̃ for the constraint equations Ax = b. We can also partition its entries. We will let x̃_B denote the values of the basic entries of x̃ and x̃_N denote the nonbasic entries. Because x̃ is a basic solution, its nonbasic entries are zero; i.e., x̃_N = 0.
The constraint equation in (6) is of the form
\[
B x_B + N x_N = b.
\]
Since B is invertible, we can multiply both sides of this equation by B^{-1}, which yields
\[
x_B + B^{-1} N x_N = B^{-1} b.
\tag{7}
\]
We will make two observations about the equation (7). First, plugging the vector x̃ into (7), we get:
\[
\tilde{x}_B + B^{-1} N \tilde{x}_N = B^{-1} b,
\]
or (since x̃_N = 0), B^{-1} b = x̃_B.
Moreover, (7) allows us to substitute x_B = x̃_B − B^{-1} N x_N and rewrite the objective function in the form
\[
c^t x = c_1^t x_B + c_2^t x_N = c_1^t \left( \tilde{x}_B - B^{-1} N x_N \right) + c_2^t x_N = \tilde{\xi} + \tilde{c}^t x_N,
\]
where ξ̃ = c_1^t x̃_B is a constant and c̃ = c_2 − (B^{-1} N)^t c_1 is a column vector of length n − m.
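Thus we can rewrite the LP (5) as
\[
\begin{aligned}
\text{max:} \quad & \tilde{\xi} + \tilde{c}^t x_N \\
\text{subject to:} \quad & \begin{pmatrix} I & B^{-1} N \end{pmatrix} \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \tilde{x}_B \\
& x_B, x_N \ge 0.
\end{aligned}
\tag{8}
\]
We will call the form of the linear program (8) the tableau associated with the basic vector x̃. We say that (8) is a feasible tableau if x̃_B ≥ 0 and it is an optimal tableau if x̃ is an optimal feasible vector.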
This form has several useful properties:
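Remark 4.1. We are used to writing down the tableau form of a linear program in a table in the following manner:
\[
\begin{array}{cc|c}
\text{nonbasic} & \text{basic} & \\ \hline
B^{-1} N & I_m & \tilde{x}_B \\ \hline
-\tilde{c} & 0 & \tilde{\xi}
\end{array}
\]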
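To make the tableau concrete, here is a small Python sketch that computes the blocks B^{-1}N and x̃_B, the coefficient vector c̃ and the constant ξ̃ for a given basis, using the formulas x̃_B = B^{-1}b, ξ̃ = c_1^t x̃_B and c̃ = c_2 − (B^{-1}N)^t c_1. The function names `solve` and `tableau`, the 0-based indexing, and the numerical data are assumptions made for illustration; the sketch also assumes the chosen columns really are a basis.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def tableau(A, b, c, basis):
    """Return (B^{-1}N, x_B, c_tilde, xi) for a basis given as a list of column indices."""
    m, n = len(A), len(A[0])
    nonbasis = [j for j in range(n) if j not in basis]
    B = [[A[i][j] for j in basis] for i in range(m)]
    x_B = solve(B, b)                                   # values of the basic variables
    # Column k of B^{-1}N solves B d = (k-th nonbasic column of A).
    cols = [solve(B, [A[i][j] for i in range(m)]) for j in nonbasis]
    BinvN = [[cols[k][i] for k in range(len(nonbasis))] for i in range(m)]
    c_B = [Fraction(c[j]) for j in basis]
    xi = sum(cb * xb for cb, xb in zip(c_B, x_B))       # objective value at the basic vector
    c_tilde = [Fraction(c[j]) - sum(c_B[i] * cols[k][i] for i in range(m))
               for k, j in enumerate(nonbasis)]
    return BinvN, x_B, c_tilde, xi

# Illustrative data: max 3*x1 + 2*x2, with x3 and x4 acting as slack variables.
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
b = [4, 6]
c = [3, 2, 0, 0]
BinvN, x_B, c_tilde, xi = tableau(A, b, c, [0, 3])
# BinvN = [[1, 1], [2, -1]], x_B = [4, 2], c_tilde = [-1, -3], xi = 12
```

In this example x̃_B ≥ 0, so the tableau is feasible, and every entry of c̃ is nonpositive, so no nonnegative choice of the nonbasic variables can raise the objective above ξ̃ = 12.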
In this section we introduce the simplex method, an improved algorithm for solving linear programs
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\tag{9}
\]
where A is an m × n matrix, n > m, of rank m.
Suppose that x̃ and ỹ are basic vectors for the LP (9), and further suppose that x̃ is associated with the columns {i_1, ..., i_m} of A and ỹ with the columns {j_1, ..., j_m}. Then we say that x̃ and ỹ are adjacent if the sets {i_1, ..., i_m} and {j_1, ..., j_m} have m − 1 elements in common. In other words, two basic vectors x̃ and ỹ are adjacent if their associated bases differ by one element.
The idea behind the simplex method is quite simple: it is an iterative method which, starting with an initial basic feasible vector, moves from one basic feasible vector to an adjacent one in an effort to increase the value of the objective function. For each iteration j, we form the tableau
\[
\begin{aligned}
\text{max:} \quad & \tilde{\xi}_j + \tilde{c}_j^t x_N \\
\text{subject to:} \quad & \begin{pmatrix} I & B^{-1} N \end{pmatrix} \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \tilde{x}_j \\
& x_B, x_N \ge 0,
\end{aligned}
\tag{10}
\]
for the associated basic feasible vector x̃_j (there is a slight abuse of notation here: we are identifying the basic part of x̃_j with x̃_j). If x̃_j is a solution for the linear program, it is evident from the coefficients of c̃_j (see the observations about the tableaus above). Otherwise, we can choose a nonbasic variable in x_N whose coefficient in c̃_j is positive, called the entering variable, which we will move into the basis. Of course, we must swap this variable with a properly chosen basic variable from x_B, called the leaving variable, in order to maintain a basis. In most cases, this results in an increase in the objective function.
Remark 5.1. It is very important to understand what having an initial basic feasible vector entails. It means not only that we have a vector x̃ with no more than m nonzero entries which is feasible (x̃ ≥ 0 and Ax̃ = b), but also that the associated columns of the constraint matrix A form a basis. Not any old basis of columns of A will do: we need a basis consisting of columns of A such that the associated basic solution is nonnegative!
That is, we increase t until one or more of the variables in x̂_B becomes zero. We now choose a leaving variable from among the entries of x̂_B that have become zero, that is, a variable to move out of the basic set x_B and into the nonbasic set x_N at the next iteration.
We have now computed the value of the basic feasible vector at the next iteration: x̂_B and x̂_N. It only remains to do bookkeeping: to update the list of variables that are basic and nonbasic for the next iteration.
Remark 6.1. Note that it is never necessary while performing the simplex method algorithm to actually compute the form of the constraint matrix which appears in (12). Indeed, it is more convenient to leave the constraint matrix in its original form (11), and compute the direction ∆x by solving a system of equations.
Remark 6.2. Assuming that we do not run into unboundedness, the new set of basic variables does indeed correspond to a set of columns of A which forms a basis. To see this, note that the vector ∆x records the coefficients of the entering column with respect to the current basis. Because the entry of ∆x corresponding to the leaving variable is nonzero (by definition), the entering column is not in the span of the basis columns excluding the leaving column. This implies that the resulting set of columns is a basis.
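The displayed equations for the iteration are not reproduced in this copy, so the following Python sketch is an illustration assembled from the surrounding description rather than the author's own listing. It picks an entering variable with a positive coefficient in c̃, computes the direction by solving a system with the original constraint matrix (in the spirit of Remark 6.1), and increases t until a basic variable reaches zero. Several choices are assumptions: the names `solve` and `simplex`, the sign convention x_B(t) = x̃_B − t·dx with dx = B^{-1}A_entering, and the smallest-index tie-breaking (Bland's rule), which is used here only to rule out cycling.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def simplex(A, b, c, basis):
    """Maximize c^t x subject to A x = b, x >= 0, starting from a basic feasible basis."""
    m, n = len(A), len(A[0])
    basis = list(basis)
    while True:
        B = [[A[i][j] for j in basis] for i in range(m)]
        x_B = solve(B, b)
        # Solve B^t y = c_B so that the coefficient of x_j in c_tilde is c_j - y^t A_j.
        y = solve([list(r) for r in zip(*B)], [c[j] for j in basis])
        reduced = {j: Fraction(c[j]) - sum(y[i] * A[i][j] for i in range(m))
                   for j in range(n) if j not in basis}
        entering = next((j for j in sorted(reduced) if reduced[j] > 0), None)
        if entering is None:                      # no positive coefficient: optimal
            x = [Fraction(0)] * n
            for k, j in enumerate(basis):
                x[j] = x_B[k]
            return "optimal", x, basis
        dx = solve(B, [A[i][entering] for i in range(m)])
        # Ratio test: the largest step t keeping x_B - t*dx nonnegative.
        ratios = [(x_B[k] / dx[k], basis[k], k) for k in range(m) if dx[k] > 0]
        if not ratios:
            return "unbounded", None, basis       # t can grow without limit
        _, _, leave = min(ratios)                 # ties go to the smallest variable index
        basis[leave] = entering                   # bookkeeping for the next iteration

# Illustrative data: start from the slack basis (columns 2 and 3).
status, x, final_basis = simplex([[1, 1, 1, 0], [1, 3, 0, 1]], [4, 6], [3, 2, 0, 0], [2, 3])
# status = "optimal", x = (4, 0, 0, 2), final_basis = [0, 3]
unbounded_status = simplex([[1, -1]], [0], [1, 0], [0])[0]   # "unbounded"
```

The starting basis must already be feasible; finding one is a separate problem.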
There is an obvious unresolved difficulty with the simplex method: we need to find a basic feasible vector in order to start the simplex method (if we start with an infeasible vector, we might never find a solution to the LP). As was remarked upon above, this is a nontrivial task which entails more than finding an invertible submatrix of the constraint matrix A.
We have been spoiled by the textbook’s practice of writing linear programs in the form:
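\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax \le b \\
& x \ge 0.
\end{aligned}
\tag{14}
\]
This form is particularly nice for initialization, because once m slack variables have been introduced (14) takes on the form
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & \begin{pmatrix} A & I \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} = b \\
& x, w \ge 0.
\end{aligned}
\]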
This makes it trivial to find an initial basis for the constraint matrix (we simply pick the identity submatrix I). Of course, the resulting basic vector is feasible only if b ≥ 0, so there is still work to be done even in this case, but nonetheless it is easier than the initialization of a general LP.
Instead of considering this simple case, we will once again consider the general LP
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\tag{15}
\]
where A is an m × n matrix, n > m, of rank m. Without loss of generality, we may assume that b ≥ 0 (if b_i < 0 for some i then we can multiply the corresponding constraint by −1). In order to initialize (15), we introduce the auxiliary linear program
\[
\begin{aligned}
\text{min:} \quad & y_1 + y_2 + \cdots + y_m \\
\text{subject to:} \quad & Ax + y = b \\
& x, y \ge 0.
\end{aligned}
\tag{16}
\]
It should be clear that the original problem (15) is feasible if and only if (16) has a solution such that
\[
y_1 = y_2 = \cdots = y_m = 0.
\]
Moreover, it is easy to find an initial feasible vector for (16): we simply let y = b and x = 0. Then the corresponding submatrix of the constraint matrix is the identity (and so clearly invertible) and because b can be assumed to satisfy b ≥ 0, this vector is feasible.
Thus we initialize the simplex method by attempting to solve the auxiliary LP (16). If, for the resulting solution (x, y), we have y_1 = y_2 = ... = y_m = 0, then we have found a basic feasible vector for the original LP.
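As a small worked instance (an illustration, not taken from the notes), consider applying the auxiliary program to the infeasible example (3). Multiplying both constraints by −1 to make the right-hand side nonnegative and then adding the variables y_1, y_2 gives the following; adding the two constraints shows that the optimal value is strictly positive, which confirms that (3) has no feasible vector.

```latex
\[
\begin{aligned}
\text{min:} \quad & y_1 + y_2 \\
\text{subject to:} \quad & -x_1 - x_3 + y_1 = 1 \\
& -x_2 + x_3 + y_2 = 1 \\
& x_1, x_2, x_3, y_1, y_2 \ge 0.
\end{aligned}
\]
% Summing the two constraints:
\[
y_1 + y_2 = 2 + x_1 + x_2 \ge 2 ,
\]
% with equality at x = 0, y = (1, 1), which is feasible for the auxiliary program.
% The minimum is therefore 2 > 0, so no feasible vector of the auxiliary
% program has y = 0, and the original program (3) is infeasible.
```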
There is one complication, however. Finding a basic solution of (16) such that y_1 = y_2 = ... = y_m = 0 means that we have found a basic feasible vector for (15), but it does not mean that we can necessarily identify the basis! In particular, if all of the basic variables for the solution of (16) are x's then it is obvious which columns of A to choose: if the basic variables for the auxiliary solution are
x_{i_1}, x_{i_2}, ..., x_{i_m}
then the columns A_{i_1}, ..., A_{i_m} are a basis for R^m and the associated solution of Ax = b is feasible. However, if one or more of the y_j are included in the basic variables for the solution of (16), then more work must be done to find a basis of columns of A.
We could finish the initialization process by completing our basis with a set of additional columns from A, but in most cases, it is easier and more elegant to simply use a foolproof initialization scheme that we will discuss below.
Remark 8.1. Note that we can only have a basic solution for (16) with one or more yj as basic variables if the solution has fewer than m nonzero entries.
The dual of the linear program
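\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax \le b \\
& x \ge 0,
\end{aligned}
\tag{17}
\]
is the linear program
\[
\begin{aligned}
\text{min:} \quad & b^t y \\
\text{subject to:} \quad & A^t y \ge c \\
& y \ge 0.
\end{aligned}
\tag{18}
\]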
It is easy to see that the dual of the dual program (18) is again the original problem (17), which we refer to as the primal problem. This definition is motivated by the search for an upper bound for the objective function of (17) using the constraint equations. Your book does a good job of explaining this in the beginning of Chapter 5. Note that there is one constraint in the dual problem for every variable in the primal problem and one variable in the dual for each constraint in the primal.
Since (17) is not the usual form for linear programs, our first task is to find the dual for a program in the usual form:
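\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax = b \\
& x \ge 0.
\end{aligned}
\tag{19}
\]
We proceed by rewriting the constraints in (19) as inequalities:
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax \le b \\
& -Ax \le -b \\
& x \ge 0.
\end{aligned}
\tag{20}
\]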
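It is now clear that the dual of (19) is the linear program
\[
\begin{aligned}
\text{min:} \quad & b^t y_1 - b^t y_2 \\
\text{subject to:} \quad & \begin{pmatrix} A^t & -A^t \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \ge c \\
& y_1, y_2 \ge 0.
\end{aligned}
\tag{21}
\]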
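The step that presumably follows is not reproduced in this copy; the short derivation below is a sketch of the standard simplification, offered under that assumption. Both the objective and the constraints of (21) depend only on the difference y_1 − y_2, and the right-hand side of the dual constraints is the objective vector c of (19), so the pair of nonnegative vectors can be replaced by a single unrestricted vector y.

```latex
% Objective and constraints of (21) in terms of the difference y_1 - y_2:
\[
b^t y_1 - b^t y_2 = b^t (y_1 - y_2), \qquad
A^t y_1 - A^t y_2 = A^t (y_1 - y_2) \ge c .
\]
% Every y in R^m can be written as y = y_1 - y_2 with y_1, y_2 >= 0
% (take the positive and negative parts of y), and conversely.
% Hence the dual of the equality-form program (19) is
\[
\begin{aligned}
\text{min:} \quad & b^t y \\
\text{subject to:} \quad & A^t y \ge c \\
& y \ \text{free.}
\end{aligned}
\]
```

In other words, equality constraints in the primal correspond to free variables in the dual.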
The primary observation of this section, and the most important single fact in the theory of duality, is that the dual of a tableau for the primal problem is itself a tableau for the dual problem. This fact and the form of the two corresponding tableaus imply all sorts of nice results (e.g., strong duality).
To see that this is so, suppose that
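\[
\begin{aligned}
\text{max:} \quad & \tilde{\xi} + \tilde{c}^t x_N \\
\text{subject to:} \quad & \begin{pmatrix} I & B^{-1} N \end{pmatrix} \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \tilde{x} \\
& x_B, x_N \ge 0,
\end{aligned}
\tag{27}
\]
is a tableau associated with a basic feasible vector x̃ for a primal problem. Using what we learned in the last section (check this!), we find that the dual of (27) is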
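\[
\begin{aligned}
\text{min:} \quad & \tilde{\xi} + \tilde{x}^t y \\
\text{subject to:} \quad & \begin{pmatrix} I \\ (B^{-1} N)^t \end{pmatrix} y \ge \begin{pmatrix} 0 \\ \tilde{c} \end{pmatrix} \\
& y \ \text{free.}
\end{aligned}
\]
Of course, since the first block of constraints says exactly that y ≥ 0, we can rewrite this as
\[
\begin{aligned}
\text{min:} \quad & \tilde{\xi} + \tilde{x}^t y \\
\text{subject to:} \quad & (B^{-1} N)^t y \ge \tilde{c} \\
& y \ge 0.
\end{aligned}
\tag{29}
\]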
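If we introduce slack variables y_N into (29), and rename the variables already present y_B, then we arrive at the linear program
\[
\begin{aligned}
\text{min:} \quad & \tilde{\xi} + \tilde{x}^t y_B \\
\text{subject to:} \quad & \begin{pmatrix} I & -(B^{-1} N)^t \end{pmatrix} \begin{pmatrix} y_N \\ y_B \end{pmatrix} = -\tilde{c} \\
& y_B, y_N \ge 0.
\end{aligned}
\tag{30}
\]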
We call this linear program the dual tableau corresponding to the primal tableau (27). We can immediately make the following observations:
We will call the basic vector ỹ associated with the dual tableau (30) the dual basic vector of x̃. This is a very important idea: we associate with each basic vector of the primal a basic vector of the dual.
We close this section by proving the Strong Duality Theorem. Thanks to our discussion of tableaus and dual tableaus, the proof is trivial.
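THEOREM 10.1. (Strong Duality) Suppose that x̃ is a basic solution for the primal problem (i.e., a basic optimal feasible vector)
\[
\begin{aligned}
\text{max:} \quad & c^t x \\
\text{subject to:} \quad & Ax \le b \\
& x \ge 0.
\end{aligned}
\tag{31}
\]
Then the corresponding basic vector ỹ for the dual
\[
\begin{aligned}
\text{min:} \quad & b^t y \\
\text{subject to:} \quad & A^t y \ge c \\
& y \ge 0
\end{aligned}
\tag{32}
\]
of (31) is a basic solution (i.e., a basic optimal feasible vector for the dual). Moreover,
\[
c^t \tilde{x} = b^t \tilde{y}.
\tag{33}
\]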
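Proof: Write the tableau
\[
\begin{aligned}
\text{max:} \quad & \tilde{\xi} + \tilde{c}^t x_N \\
\text{subject to:} \quad & \begin{pmatrix} I & B^{-1} N \end{pmatrix} \begin{pmatrix} x_B \\ x_N \end{pmatrix} = \tilde{x}_B \\
& x_B, x_N \ge 0,
\end{aligned}
\tag{34}
\]
for the basic vector x̃. That x̃ is feasible means that x̃_B ≥ 0 and that it is optimal means that c̃ ≤ 0. Now the tableau
\[
\begin{aligned}
\text{min:} \quad & \tilde{\xi} + \tilde{x}_B^t y_B \\
\text{subject to:} \quad & \begin{pmatrix} I & -(B^{-1} N)^t \end{pmatrix} \begin{pmatrix} y_N \\ y_B \end{pmatrix} = -\tilde{c} \\
& y_B, y_N \ge 0
\end{aligned}
\tag{35}
\]
is the corresponding dual tableau, the one associated with the basic vector ỹ. That c̃ ≤ 0 means that ỹ is feasible, and x̃_B ≥ 0 implies that it is optimal (since the dual is a minimization problem). So ỹ is a basic optimal vector for the dual. Equation (33) now follows because we have y_B = 0 and x_N = 0 for the pair of vectors x̃ and ỹ, so both objective values equal ξ̃. QED.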
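A small numerical check may help fix ideas. The example below is an illustration chosen for these notes' primal form (31) and dual form (32), not one taken from the text; the bound used to certify optimality follows directly from the two sets of constraints.

```latex
% Primal (31) with c = (3, 2), b = (4, 6):
\[
\begin{aligned}
\text{max:} \quad & 3x_1 + 2x_2 \\
\text{subject to:} \quad & x_1 + x_2 \le 4 \\
& x_1 + 3x_2 \le 6 \\
& x_1, x_2 \ge 0.
\end{aligned}
\]
% Dual (32):
\[
\begin{aligned}
\text{min:} \quad & 4y_1 + 6y_2 \\
\text{subject to:} \quad & y_1 + y_2 \ge 3 \\
& y_1 + 3y_2 \ge 2 \\
& y_1, y_2 \ge 0.
\end{aligned}
\]
% For any primal feasible x and dual feasible y,
\[
c^t x \le (A^t y)^t x = y^t A x \le y^t b = b^t y .
\]
% The vector x = (4, 0) is primal feasible with c^t x = 12, and y = (3, 0)
% is dual feasible with b^t y = 12. Since the two values agree, the bound
% above shows that both vectors are optimal, as in (33).
```

With slack variables added, the basis for this primal optimum consists of x_1 and the second slack variable, and the coefficients c̃ = (−1, −3) of the nonbasic variables are the negatives of the dual slacks (1, 3) at y = (3, 0), in agreement with the right-hand side −c̃ of the dual tableau.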
We now discuss the use of the dual problem to effect initialization.
max: c^t x subject to: Ax = b, l ≤ x ≤ u?
Is it possible for the dual of this LP to be infeasible? What does that say about the primal, assuming it is feasible?
max: c^t x subject to: Ax = b, x free?