





















Vectors and matrices are notational conveniences for dealing with systems of linear equations and inequalities. In particular, they are useful for compactly representing and discussing the linear programming problem:
$$\text{Maximize } \sum_{j=1}^{n} c_j x_j,$$
subject to:
$$\sum_{j=1}^{n} a_{ij} x_j = b_i \quad (i = 1, 2, \ldots, m),$$
$$x_j \ge 0 \quad (j = 1, 2, \ldots, n).$$
This appendix reviews several properties of vectors and matrices that are especially relevant to this problem. We should note, however, that the material contained here is more technical than is required for understanding the rest of this book. It is included for completeness rather than for background.
A.1 VECTORS
We begin by defining vectors, relations among vectors, and elementary vector operations.
Definition. A k-dimensional vector $y$ is an ordered collection of $k$ real numbers $y_1, y_2, \ldots, y_k$, and is written as $y = (y_1, y_2, \ldots, y_k)$. The numbers $y_j$ ($j = 1, 2, \ldots, k$) are called the components of the vector $y$.
Each of the following is an example of a vector:
i) $(1, -3, 0, 5)$ is a four-dimensional vector. Its first component is 1, its second component is −3, and its third and fourth components are 0 and 5, respectively.
ii) The coefficients $c_1, c_2, \ldots, c_n$ of the linear-programming objective function determine the n-dimensional vector $c = (c_1, c_2, \ldots, c_n)$.
iii) The activity levels $x_1, x_2, \ldots, x_n$ of a linear program define the n-dimensional vector $x = (x_1, x_2, \ldots, x_n)$.
iv) The coefficients $a_{i1}, a_{i2}, \ldots, a_{in}$ of the decision variables in the i-th equation of a linear program determine an n-dimensional vector $A^i = (a_{i1}, a_{i2}, \ldots, a_{in})$.
v) The coefficients $a_{1j}, a_{2j}, \ldots, a_{mj}$ of the decision variable $x_j$ in constraints 1 through m of a linear program define an m-dimensional vector, which we denote as $A_j = (a_{1j}, a_{2j}, \ldots, a_{mj})$.
Equality and ordering of vectors are defined by comparing the vectors' individual components. Formally, let $y = (y_1, y_2, \ldots, y_k)$ and $z = (z_1, z_2, \ldots, z_k)$ be two k-dimensional vectors. We write:
$y = z$ when $y_j = z_j$ ($j = 1, 2, \ldots, k$),
$y \ge z$ or $z \le y$ when $y_j \ge z_j$ ($j = 1, 2, \ldots, k$),
$y > z$ or $z < y$ when $y_j > z_j$ ($j = 1, 2, \ldots, k$),
and say, respectively, that y equals z, y is greater than or equal to z, and y is greater than z. In the last two cases, we also say that z is less than or equal to y and less than y. It should be emphasized that not all vectors are ordered. For example, if $y = (3, 1, -2)$ and $x = (1, 1, 1)$, then the first two components of y are greater than or equal to the first two components of x, but the third component of y is less than the corresponding component of x.
A final note: 0 is used to denote the null vector $(0, 0, \ldots, 0)$, where the dimension of the vector is understood from context. Thus, if x is a k-dimensional vector, $x \ge 0$ means that each component $x_j$ of the vector x is nonnegative.
We also define scalar multiplication and addition in terms of the components of the vectors.
Definition. Scalar multiplication of a vector $y = (y_1, y_2, \ldots, y_k)$ and a scalar $\alpha$ is defined to be a new vector $z = (z_1, z_2, \ldots, z_k)$, written $z = \alpha y$ or $z = y\alpha$, whose components are given by $z_j = \alpha y_j$.
Definition. Vector addition of two k-dimensional vectors $x = (x_1, x_2, \ldots, x_k)$ and $y = (y_1, y_2, \ldots, y_k)$ is defined as a new vector $z = (z_1, z_2, \ldots, z_k)$, denoted $z = x + y$, with components given by $z_j = x_j + y_j$.
As an example of scalar multiplication, consider
$$4(3, 0, -1, 8) = (12, 0, -4, 32),$$
and for vector addition,
$$(3, 4, 1, -3) + (1, 3, -2, 5) = (4, 7, -1, 2).$$
Using both operations, we can make the following type of calculation:
$$(1, 0)x_1 + (0, 1)x_2 + (-3, -8)x_3 = (x_1, 0) + (0, x_2) + (-3x_3, -8x_3) = (x_1 - 3x_3,\; x_2 - 8x_3).$$
It is important to note that y and z must have the same dimensions for vector addition and vector comparisons. Thus $(6, 2, -1) + (4, 0)$ is not defined, and $(4, 0, -1) = (4, 0)$ makes no sense at all.
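The componentwise definitions of this section translate directly into code. The following is a minimal Python sketch, assuming vectors are stored as plain tuples; the helper names `scale`, `add`, and `geq` are illustrative choices and do not come from the text.

```python
def scale(alpha, y):
    """Scalar multiplication: z_j = alpha * y_j."""
    return tuple(alpha * yj for yj in y)

def add(x, y):
    """Vector addition: z_j = x_j + y_j; undefined for unequal dimensions."""
    if len(x) != len(y):
        raise ValueError("vector addition requires equal dimensions")
    return tuple(xj + yj for xj, yj in zip(x, y))

def geq(y, z):
    """True when y >= z, i.e. y_j >= z_j for every component j."""
    if len(y) != len(z):
        raise ValueError("vector comparison requires equal dimensions")
    return all(yj >= zj for yj, zj in zip(y, z))

print(scale(4, (3, 0, -1, 8)))            # (12, 0, -4, 32)
print(add((3, 4, 1, -3), (1, 3, -2, 5)))  # (4, 7, -1, 2)

# (3, 1, -2) and (1, 1, 1) are not ordered: neither y >= x nor x >= y holds.
y, x = (3, 1, -2), (1, 1, 1)
print(geq(y, x), geq(x, y))               # False False
```

Raising an error on mismatched dimensions mirrors the remark that such sums and comparisons are simply not defined.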
A.2 MATRICES
We can now extend these ideas to any rectangular array of numbers, which we call a matrix.
Definition. A matrix is defined to be a rectangular array of numbers
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$
whose dimension is m by n. A is called square if m = n. The numbers $a_{ij}$ are referred to as the elements of A.
The tableau of a linear programming problem is an example of a matrix. We define equality of two matrices in terms of their elements just as in the case of vectors.
of matrix multiplication is sometimes referred to as an inner product. It can be visualized by placing the elements of $\pi$ next to those of $q$ and adding, as follows:
$$\begin{aligned} \pi_1 \times q_1 &= \pi_1 q_1, \\ \pi_2 \times q_2 &= \pi_2 q_2, \\ &\;\;\vdots \\ \pi_m \times q_m &= \pi_m q_m, \end{aligned}$$
$$\pi q = \sum_{i=1}^{m} \pi_i q_i.$$
In these terms, the elements $c_{ij}$ of matrix $C = AB$ are found by taking the inner product of $A^i$ (the i-th row of A) with $B_j$ (the j-th column of B); that is, $c_{ij} = A^i B_j$. The following properties of matrices can be seen easily by writing out the appropriate expressions in each instance and rearranging the terms:
$A + B = B + A$ (Commutative law)
$A + (B + C) = (A + B) + C$ (Associative law)
$A(BC) = (AB)C$ (Associative law)
$A(B + C) = AB + AC$ (Distributive law)
As a result, A + B + C or ABC is well defined, since the evaluations can be performed in any order. There are a few special matrices that will be useful in our discussion, so we define them here.
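The rule $c_{ij} = A^i B_j$ and the laws listed above can be checked numerically. The sketch below is a small Python illustration, assuming matrices are stored as lists of rows; the function names and the matrices `A`, `B`, `C`, `D` are hypothetical examples, not taken from the text.

```python
def inner(p, q):
    """Inner product of a row vector p with a column vector q."""
    if len(p) != len(q):
        raise ValueError("inner product requires equal dimensions")
    return sum(pi * qi for pi, qi in zip(p, q))

def column(B, j):
    """The j-th column B_j of a matrix stored as a list of rows."""
    return [row[j] for row in B]

def matmul(A, B):
    """C = AB with c_ij = (row i of A) . (column j of B)."""
    return [[inner(row, column(B, j)) for j in range(len(B[0]))] for row in A]

def matadd(A, B):
    """Elementwise sum of two matrices of the same dimension."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2], [0, 1], [3, -1]]   # 3-by-2
B = [[2, 0, 1], [1, 4, -2]]     # 2-by-3
C = [[0, 1, 1], [5, -3, 2]]     # 2-by-3
D = [[1, 0], [2, 1], [0, 3]]    # 3-by-2

print(matmul(A, B))  # [[4, 8, -3], [1, 4, -2], [5, -4, 5]]

# Distributive law: A(B + C) = AB + AC
print(matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C)))  # True
# Associative law: A(BD) = (AB)D
print(matmul(A, matmul(B, D)) == matmul(matmul(A, B), D))             # True
```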
Definition. The identity matrix of order m, written $I_m$ (or simply $I$, when no confusion arises), is a square m-by-m matrix with ones along the diagonal and zeros elsewhere.
For example,
$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
It is important to note that for any m-by-m matrix B, $B I_m = I_m B = B$. In particular, $I_m I_m = I_m$ or $II = I$.
Definition. The transpose of a matrix A, denoted $A^t$, is formed by interchanging the rows and columns of A; that is, $a^t_{ij} = a_{ji}$.
If A =
then the transpose of A is given by:
At^ =
We can show that $(AB)^t = B^t A^t$, since the ij-th element of both sides of the equality is $\sum_k a_{jk} b_{ki}$.
Definition. An elementary matrix is a square matrix with one arbitrary column, but otherwise ones along the diagonal and zeros elsewhere (i.e., an identity matrix with the exception of one column).
For example,
is an elementary matrix.
A.3 LINEAR PROGRAMMING IN MATRIX FORM
The linear-programming problem
$$\text{Maximize } c_1 x_1 + c_2 x_2 + \cdots + c_n x_n,$$
subject to:
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &\le b_1, \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &\le b_2, \\ &\;\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &\le b_m, \\ x_1 \ge 0,\; x_2 \ge 0,\; \ldots,\; x_n &\ge 0, \end{aligned}$$
can now be written in matrix form in a straightforward manner. If we let
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
be column vectors, the linear system of inequalities is written in matrix form as $Ax \le b$. Letting $c = (c_1, c_2, \ldots, c_n)$ be a row vector, the objective function is written as $cx$. Hence, the linear program assumes the following compact form:
$$\text{Maximize } cx,$$
subject to:
$$Ax \le b, \quad x \ge 0.$$
The same problem can also be written in terms of the column vectors $A_j$ of the matrix A as:
$$\text{Maximize } c_1 x_1 + c_2 x_2 + \cdots + c_n x_n,$$
subject to:
$$A_1 x_1 + A_2 x_2 + \cdots + A_n x_n \le b, \quad x_j \ge 0 \quad (j = 1, 2, \ldots, n).$$
At various times it is convenient to use either of these forms. The appropriate dual linear program is given by:
$$\text{Minimize } b_1 y_1 + b_2 y_2 + \cdots + b_m y_m,$$
subject to:
$$\begin{aligned} a_{11} y_1 + a_{21} y_2 + \cdots + a_{m1} y_m &\ge c_1, \\ a_{12} y_1 + a_{22} y_2 + \cdots + a_{m2} y_m &\ge c_2, \\ &\;\;\vdots \\ a_{1n} y_1 + a_{2n} y_2 + \cdots + a_{mn} y_m &\ge c_n, \\ y_1 \ge 0,\; y_2 \ge 0,\; \ldots,\; y_m &\ge 0. \end{aligned}$$
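To make the primal and dual forms concrete, here is a minimal Python sketch that stores A as a list of rows and checks feasibility of both problems. The data `A`, `b`, `c` and the points `x`, `y` are hypothetical, and the final comparison uses the standard weak-duality inequality $cx \le yb$, which is a known fact about this primal-dual pair rather than something derived in this appendix.

```python
def dot(u, v):
    """Inner product of two vectors of equal dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))

def transpose(A):
    """Rows of the result are the columns A_j of A."""
    return [list(col) for col in zip(*A)]

def primal_feasible(A, b, x):
    """Checks Ax <= b and x >= 0."""
    return (all(dot(row, x) <= bi for row, bi in zip(A, b))
            and all(xj >= 0 for xj in x))

def dual_feasible(A, c, y):
    """Checks a_1j y_1 + ... + a_mj y_m >= c_j for every j, and y >= 0."""
    return (all(dot(col, y) >= cj for col, cj in zip(transpose(A), c))
            and all(yi >= 0 for yi in y))

# Hypothetical data: m = 2 constraints, n = 3 variables.
A = [[1, 2, 1], [3, 0, 2]]
b = [4, 6]
c = [2, 1, 1]

x = [1, 1, 1]   # Ax = (4, 5) <= (4, 6)
y = [1, 1]      # column sums (4, 2, 3) >= (2, 1, 1)

print(primal_feasible(A, b, x), dual_feasible(A, c, y))  # True True
print(dot(c, x), dot(y, b))                              # 4 10
```

The dual constraints use the columns of A where the primal constraints use its rows, which is why a single `transpose` call is enough to move between the two forms.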
A.4 THE INVERSE OF A MATRIX
i) The inverse of a matrix B is unique if it exists.
Proof. Suppose that $B^{-1}$ and $A$ are both inverses of B. Then
$$B^{-1} = I B^{-1} = (AB) B^{-1} = A (B B^{-1}) = A.$$
ii) $I^{-1} = I$, since $II = I$.
iii) If the inverses of A and B exist, then the inverse of AB exists and is given by $(AB)^{-1} = B^{-1} A^{-1}$.
Proof. $(AB)(B^{-1} A^{-1}) = A (B B^{-1}) A^{-1} = A I A^{-1} = A A^{-1} = I$.
iv) If the inverse of B exists, then the inverse of $B^{-1}$ exists and is given by $(B^{-1})^{-1} = B$.
Proof. $I = I^{-1} = (B^{-1} B)^{-1} = B^{-1} (B^{-1})^{-1}$.
v) If the inverse of B exists, then the inverse of $B^t$ exists and is given by $(B^t)^{-1} = (B^{-1})^t$.
Proof. $I = I^t = (B^{-1} B)^t = B^t (B^{-1})^t$.
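Properties (iii) through (v) can be spot-checked on small matrices. The Python sketch below uses exact rational arithmetic and, purely as a convenience, the familiar closed form for a 2-by-2 inverse; that closed form and the matrices `A` and `B` are assumptions of this example, not the method developed in the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a 2-by-2 matrix via the closed form; fails if ad - bc = 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

I = [[1, 0], [0, 1]]
A = [[2, 1], [1, 1]]
B = [[1, 3], [0, 2]]

print(matmul(B, inv2(B)) == I)                              # True
print(inv2(matmul(A, B)) == matmul(inv2(B), inv2(A)))       # (iii) True
print(inv2(inv2(B)) == B)                                   # (iv)  True
print(inv2(transpose(B)) == transpose(inv2(B)))             # (v)   True
```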
The natural question that arises is: Under what circumstances does the inverse of a matrix exist? Consider the square system of equations given by:
$$Bx = Iy = y.$$
If B has an inverse, then multiplying on the left by $B^{-1}$ yields
$$Ix = B^{-1} y,$$
which "solves" the original square system of equations for any choice of y. The second system of equations has a unique solution in terms of x for any choice of y, since one variable $x_j$ is isolated in each equation. The first system of equations can be derived from the second by multiplying on the left by B; hence, the two systems are identical in the sense that any $\bar{x}$, $\bar{y}$ that satisfies one system will also satisfy the other. We can now show that a square matrix B has an inverse if the square system of equations $Bx = y$ has a unique solution x for an arbitrary choice of y.
The solution to this system of equations can be obtained by successively isolating one variable in each equation by a procedure known as Gauss–Jordan elimination, which is just the method for solving square systems of equations learned in high-school algebra. Assuming $b_{11} \ne 0$, we can use the first equation to eliminate $x_1$ from the other equations, giving:
$$\begin{aligned} x_1 + \frac{b_{12}}{b_{11}} x_2 + \cdots + \frac{b_{1m}}{b_{11}} x_m &= \frac{1}{b_{11}} y_1, \\ \left( b_{22} - b_{21} \frac{b_{12}}{b_{11}} \right) x_2 + \cdots + \left( b_{2m} - b_{21} \frac{b_{1m}}{b_{11}} \right) x_m &= -\frac{b_{21}}{b_{11}} y_1 + y_2, \\ &\;\;\vdots \\ \left( b_{m2} - b_{m1} \frac{b_{12}}{b_{11}} \right) x_2 + \cdots + \left( b_{mm} - b_{m1} \frac{b_{1m}}{b_{11}} \right) x_m &= -\frac{b_{m1}}{b_{11}} y_1 + y_m. \end{aligned}$$
If $b_{11} = 0$, we merely choose some other variable to isolate in the first equation. In matrix form, the new matrices of the x and y coefficients are given respectively by $E_1 B$ and $E_1 I$, where $E_1$ is an elementary matrix of the form:
$$E_1 = \begin{bmatrix} k_1 & 0 & 0 & \cdots & 0 \\ k_2 & 1 & 0 & \cdots & 0 \\ k_3 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ k_m & 0 & 0 & \cdots & 1 \end{bmatrix}, \qquad k_1 = \frac{1}{b_{11}}, \quad k_i = -\frac{b_{i1}}{b_{11}} \quad (i = 2, 3, \ldots, m).$$
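Further, since $b_{11}$ is chosen to be nonzero, $E_1$ has an inverse given by:
$$E_1^{-1} = \begin{bmatrix} 1/k_1 & 0 & 0 & \cdots & 0 \\ -k_2/k_1 & 1 & 0 & \cdots & 0 \\ -k_3/k_1 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ -k_m/k_1 & 0 & 0 & \cdots & 1 \end{bmatrix}.$$
Since $1/k_1 = b_{11}$ and $-k_i/k_1 = b_{i1}$, the first column of $E_1^{-1}$ is simply the first column of B.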
Thus by property (iii) above, if B has an inverse, then $E_1 B$ has an inverse and the procedure may be repeated. Some $x_j$ coefficient in the second row of the updated system must be nonzero, or no variable can be isolated in the second row, implying that the inverse does not exist. The procedure may be repeated by eliminating this $x_j$ from the other equations. Thus, a new elementary matrix $E_2$ is defined, and the new system
$$(E_2 E_1 B) x = (E_2 E_1) y$$
has $x_1$ isolated in equation 1 and $x_2$ in equation 2. Repeating the procedure finally gives:
$$(E_m E_{m-1} \cdots E_2 E_1 B) x = (E_m E_{m-1} \cdots E_2 E_1) y$$
with one variable isolated in each equation. If variable $x_j$ is isolated in equation j, the final system reads:
$$\begin{aligned} x_1 &= \beta_{11} y_1 + \beta_{12} y_2 + \cdots + \beta_{1m} y_m, \\ x_2 &= \beta_{21} y_1 + \beta_{22} y_2 + \cdots + \beta_{2m} y_m, \\ &\;\;\vdots \\ x_m &= \beta_{m1} y_1 + \beta_{m2} y_2 + \cdots + \beta_{mm} y_m, \end{aligned}$$
and
$$B^{-1} = \begin{bmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1m} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2m} \\ \vdots & & & \vdots \\ \beta_{m1} & \beta_{m2} & \cdots & \beta_{mm} \end{bmatrix}.$$
Equivalently, $B^{-1} = E_m E_{m-1} \cdots E_2 E_1$ is expressed in product form as the matrix product of elementary matrices. If, at any stage in the procedure, it is not possible to isolate a variable in the row under consideration, then the inverse of the original matrix does not exist. If $x_j$ has not been isolated in the j-th equation, the equations may have to be permuted to determine $B^{-1}$. This point is illustrated by the following example:
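The product-form construction can be sketched in a few lines of Python. This is a simplified illustration, assuming that $x_j$ can always be isolated in equation j (so no permutation of the equations is needed); the function names and the matrix `B` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def elementary(m, j, column):
    """Identity matrix of order m with column j replaced by `column`."""
    E = identity(m)
    for i in range(m):
        E[i][j] = column[i]
    return E

def product_form_inverse(B):
    """Return (B_inverse, factors) with B_inverse = E_m ... E_2 E_1.

    Simplifying assumption: the pivot in row j, column j is nonzero at
    every step. Otherwise the equations would first have to be permuted,
    or B has no inverse, and this sketch raises an error.
    """
    m = len(B)
    current = [[Fraction(v) for v in row] for row in B]
    inverse = identity(m)
    factors = []
    for j in range(m):
        pivot = current[j][j]
        if pivot == 0:
            raise ValueError(f"zero pivot in equation {j + 1}")
        # k_j = 1 / pivot and k_i = -(entry in row i) / pivot for i != j
        k = [-current[i][j] / pivot for i in range(m)]
        k[j] = 1 / pivot
        E = elementary(m, j, k)
        current = matmul(E, current)   # isolates x_j in equation j
        inverse = matmul(E, inverse)   # accumulates E_j ... E_1
        factors.append(E)
    return inverse, factors

B = [[2, 1, 0], [4, 3, 1], [0, 1, 2]]
B_inv, factors = product_form_inverse(B)
print(matmul(B_inv, B) == identity(3))  # True
```

Keeping the list `factors` rather than only the accumulated product reflects the product-form idea: each $E_j$ differs from the identity in a single column, so it can be stored compactly.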
multiplication of two partitioned matrices
(^) , and B =
results in
AB =
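assuming the indicated products are defined; i.e., the matrices $A_{ij}$ and $B_{jk}$ have the appropriate dimensions. To illustrate that partitioned matrices may be helpful in computing inverses, consider the following example. Let
$$M = \begin{bmatrix} I & Q \\ 0 & R \end{bmatrix},$$
where 0 denotes a matrix with all zero entries. Then
$$M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
satisfies
$$M M^{-1} = I \quad \text{or} \quad \begin{bmatrix} I & Q \\ 0 & R \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix},$$
which implies the following matrix equations:
$$A + QC = I, \quad B + QD = 0, \quad RC = 0, \quad RD = I.$$
Solving these simultaneous equations gives
$$C = 0, \quad A = I, \quad D = R^{-1}, \quad \text{and} \quad B = -Q R^{-1};$$
or, equivalently,
$$M^{-1} = \begin{bmatrix} I & -Q R^{-1} \\ 0 & R^{-1} \end{bmatrix}.$$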
Note that we need only compute R −^1 in order to determine M −^1 easily. This type of use of partitioned matrices is the essence of many schemes for handling large-scale linear programs with special structures.
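The block formula implied by the equations above, with M having blocks I, Q in its first row and 0, R in its second, so that $M^{-1}$ has blocks I, $-QR^{-1}$ and 0, $R^{-1}$, can be verified numerically. The Python sketch below assumes 2-by-2 blocks and uses a closed-form 2-by-2 inverse for R only as a shortcut; the blocks `Q` and `R` and all helper names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(M):
    """Closed-form inverse of a 2-by-2 matrix (an assumption of this sketch)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a matrix from four blocks with compatible dimensions."""
    top = [l + r for l, r in zip(top_left, top_right)]
    bottom = [l + r for l, r in zip(bottom_left, bottom_right)]
    return top + bottom

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
Q = [[1, 2], [3, 4]]
R = [[2, 1], [1, 1]]

R_inv = inv2(R)                                   # the only inverse computed
minus_Q_R_inv = [[-v for v in row] for row in matmul(Q, R_inv)]

M = block(I2, Q, Z2, R)
M_inv = block(I2, minus_Q_R_inv, Z2, R_inv)
I4 = [[int(i == j) for j in range(4)] for i in range(4)]

print(matmul(M, M_inv) == I4)  # True
```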
A.5 BASES AND REPRESENTATIONS
In Chapters 2, 3, and 4, the concept of a basis plays an important role in developing the computational procedures and fundamental properties of linear programming. In this section, we present the algebraic foundations of this concept.
Definition. m-dimensional real space $R^m$ is defined as the collection of all m-dimensional vectors $y = (y_1, y_2, \ldots, y_m)$.
Definition. A set of m-dimensional vectors $A_1, A_2, \ldots, A_k$ is linearly dependent if there exist real numbers $\alpha_1, \alpha_2, \ldots, \alpha_k$, not all zero, such that
$$\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_k A_k = 0. \qquad (1)$$
If the only set of $\alpha_j$'s for which (1) holds is $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$, then the m-vectors $A_1, A_2, \ldots, A_k$ are said to be linearly independent.
For example, the vectors $(4, 1, 0, -1)$, $(3, 1, 1, -2)$, and $(1, 1, 3, -4)$ are linearly dependent, since
$$2(4, 1, 0, -1) - 3(3, 1, 1, -2) + 1(1, 1, 3, -4) = 0.$$
Further, the unit m-dimensional vectors $u_j = (0, \ldots, 0, 1, 0, \ldots, 0)$ for $j = 1, 2, \ldots, m$, with a plus one in the j-th component and zeros elsewhere, are linearly independent, since
$$\sum_{j=1}^{m} \alpha_j u_j = 0$$
implies that
$$\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0.$$
If any of the vectors $A_1, A_2, \ldots, A_k$, say $A_r$, is the 0 vector (i.e., has all zero components), then taking $\alpha_r = 1$ and all other $\alpha_j = 0$ shows that the vectors are linearly dependent. Hence, the null vector is linearly dependent on any set of vectors.
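Linear dependence can be tested mechanically by eliminating variables, in the same spirit as the Gauss–Jordan procedure of Section A.4. The Python sketch below counts how many of the given vectors survive elimination; the term "rank" and the function names are conveniences of this example and are not used in the text.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, by Gaussian elimination."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def linearly_independent(vectors):
    return rank(vectors) == len(vectors)

# The dependent example from the text, then the unit vectors of R^3.
print(linearly_independent([(4, 1, 0, -1), (3, 1, 1, -2), (1, 1, 3, -4)]))  # False
print(linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))              # True
# Any set containing the 0 vector is dependent.
print(linearly_independent([(1, 2), (0, 0)]))                                # False
```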
Definition. An m-dimensional vector Q is said to be dependent on the set of m-dimensional vectors $A_1, A_2, \ldots, A_k$ if Q can be written as a linear combination of these vectors; that is,
$$Q = \lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_k A_k$$
for some real numbers $\lambda_1, \lambda_2, \ldots, \lambda_k$. The k-dimensional vector $(\lambda_1, \lambda_2, \ldots, \lambda_k)$ is said to be the representation of Q in terms of $A_1, A_2, \ldots, A_k$.
Note that $(1, 1, 0)$ is not dependent upon $(0, 4, 2)$ and $(0, -1, 3)$, since
$$\lambda_1 (0, 4, 2) + \lambda_2 (0, -1, 3) = (0,\; 4\lambda_1 - \lambda_2,\; 2\lambda_1 + 3\lambda_2)$$
can never have 1 as its first component. The m-dimensional vector $(\lambda_1, \lambda_2, \ldots, \lambda_m)$ is dependent upon the m-dimensional unit vectors $u_1, u_2, \ldots, u_m$, since
$$(\lambda_1, \lambda_2, \ldots, \lambda_m) = \sum_{j=1}^{m} \lambda_j u_j.$$
Thus, any m -dimensional vector is dependent on the m -dimensional unit vectors. This suggests the following important definition.
Definition. A basis of $R^m$ is a set of linearly independent m-dimensional vectors with the property that every vector of $R^m$ is dependent upon these vectors.
Note that the m-dimensional unit vectors $u_1, u_2, \ldots, u_m$ are a basis for $R^m$, since they are linearly independent and any m-dimensional vector is dependent on them. We now sketch the proofs of a number of important properties relating bases of real spaces, representations of vectors in terms of bases, changes of bases, and inverses of basis matrices.
Property 1. A set of m-dimensional vectors $A_1, A_2, \ldots, A_r$ is linearly dependent if and only if one of these vectors is dependent upon the others.
Proof. First, suppose that
$$A_r = \sum_{j=1}^{r-1} \lambda_j A_j,$$
Proof. Suppose that $\lambda_m \ne 0$. First, we show that the vectors $A_1, A_2, \ldots, A_{m-1}, Q$ are linearly independent. Let $\alpha_j$ for $j = 1, 2, \ldots, m - 1$ and $\alpha_Q$ be any real numbers satisfying:
$$\sum_{j=1}^{m-1} \alpha_j A_j + \alpha_Q Q = 0. \qquad (3)$$
If $\alpha_Q \ne 0$, then
$$Q = \sum_{j=1}^{m-1} \left( -\frac{\alpha_j}{\alpha_Q} \right) A_j,$$
which with (2) gives two representations of Q in terms of the basis $A_1, A_2, \ldots, A_m$. By Property 2, this is impossible, so $\alpha_Q = 0$. But then, $\alpha_1 = \alpha_2 = \cdots = \alpha_{m-1} = 0$, since $A_1, A_2, \ldots, A_{m-1}$ are linearly independent. Thus, as required, $\alpha_1 = \alpha_2 = \cdots = \alpha_{m-1} = \alpha_Q = 0$ is the only solution to (3).
Second, we show that any m-dimensional vector P can be represented in terms of the vectors $A_1, A_2, \ldots, A_{m-1}, Q$. Since $A_1, A_2, \ldots, A_m$ is a basis, there are constants $\alpha_1, \alpha_2, \ldots, \alpha_m$ such that
$$P = \sum_{j=1}^{m} \alpha_j A_j.$$
Using expression (2) to eliminate $A_m$, we find that
$$P = \sum_{j=1}^{m-1} \left( \alpha_j - \alpha_m \frac{\lambda_j}{\lambda_m} \right) A_j + \frac{\alpha_m}{\lambda_m} Q,$$
which by definition shows that $A_1, A_2, \ldots, A_{m-1}, Q$ is a basis.
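The change of representation in this proof is easy to check numerically. The Python sketch below assumes that Q has representation $\lambda$ and P has representation $\alpha$ in a basis $A_1, \ldots, A_m$ with $\lambda_m \ne 0$, and computes the coefficients of P in terms of $A_1, \ldots, A_{m-1}, Q$, namely $\alpha_j - \alpha_m \lambda_j / \lambda_m$ for the $A_j$ and $\alpha_m / \lambda_m$ for Q. The basis and the numbers are hypothetical.

```python
from fractions import Fraction

def combine(coeffs, vectors):
    """Linear combination sum_j coeffs[j] * vectors[j]."""
    dim = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(dim)]

def exchange_representation(alpha, lam):
    """Coefficients of P in terms of A_1, ..., A_{m-1}, Q.

    alpha: representation of P in the basis A_1, ..., A_m.
    lam:   representation of Q in the same basis, with lam[-1] != 0.
    """
    lam_m = Fraction(lam[-1])
    if lam_m == 0:
        raise ValueError("Q cannot replace A_m when lambda_m = 0")
    ratio = Fraction(alpha[-1]) / lam_m            # alpha_m / lambda_m
    return [a - ratio * l for a, l in zip(alpha[:-1], lam[:-1])] + [ratio]

basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]   # hypothetical basis A_1, A_2, A_3
lam = [2, -1, 4]                            # Q = 2 A_1 - A_2 + 4 A_3
alpha = [5, 3, 8]                           # P = 5 A_1 + 3 A_2 + 8 A_3

Q = combine(lam, basis)
P = combine(alpha, basis)
new_coeffs = exchange_representation(alpha, lam)
new_basis = basis[:-1] + [tuple(Q)]

print(new_coeffs == [1, 5, 2])                # True
print(combine(new_coeffs, new_basis) == P)    # True
```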
Property 4. Let $Q_1, Q_2, \ldots, Q_k$ be a collection of linearly independent m-dimensional vectors, and let $A_1, A_2, \ldots, A_r$ be a basis for $R^m$. Then $Q_1, Q_2, \ldots, Q_k$ can replace k vectors from $A_1, A_2, \ldots, A_r$ to form a new basis.
Proof. First recall that the 0 vector is not one of the vectors $Q_j$, since the 0 vector is dependent on any set of vectors. For k = 1, the result is a consequence of Property 3. The proof is by induction. Suppose, by reindexing if necessary, that $Q_1, Q_2, \ldots, Q_j, A_{j+1}, A_{j+2}, \ldots, A_r$ is a basis. By definition of basis, there are real numbers $\lambda_1, \lambda_2, \ldots, \lambda_r$ such that
$$Q_{j+1} = \lambda_1 Q_1 + \lambda_2 Q_2 + \cdots + \lambda_j Q_j + \lambda_{j+1} A_{j+1} + \lambda_{j+2} A_{j+2} + \cdots + \lambda_r A_r.$$
If $\lambda_i = 0$ for $i = j + 1, j + 2, \ldots, r$, then $Q_{j+1}$ is represented in terms of $Q_1, Q_2, \ldots, Q_j$, which, by Property 1, contradicts the linear independence of $Q_1, Q_2, \ldots, Q_k$. Thus some $\lambda_i \ne 0$ for $i = j + 1, j + 2, \ldots, r$, say, $\lambda_{j+1} \ne 0$. By Property 3, then, $Q_1, Q_2, \ldots, Q_{j+1}, A_{j+2}, \ldots, A_r$ is also a basis. Consequently, whenever $j < k$ of the vectors $Q_i$ can replace j vectors from $A_1, A_2, \ldots, A_r$ to form a basis, $(j + 1)$ of them can be used as well, and eventually $Q_1, Q_2, \ldots, Q_k$ can replace k vectors from $A_1, A_2, \ldots, A_r$ to form a basis.
Property 5. Every basis for $R^m$ contains m vectors.
Proof. If $Q_1, Q_2, \ldots, Q_k$ and $A_1, A_2, \ldots, A_r$ are two bases, then Property 4 implies that $k \le r$. By reversing the roles of the $Q_j$ and $A_i$, we also have $r \le k$ and thus $k = r$, and every two bases contain the same number of vectors. But the unit m-dimensional vectors $u_1, u_2, \ldots, u_m$ constitute a basis with m vectors, and consequently, every basis of $R^m$ must contain m vectors.
Property 6. Every collection $Q_1, Q_2, \ldots, Q_k$ of linearly independent m-dimensional vectors is contained in a basis.
Proof. Apply Property 4 with $A_1, A_2, \ldots, A_m$ the unit m-dimensional vectors.
Property 7. Every m linearly independent vectors of $R^m$ form a basis. Every collection of $(m + 1)$ or more vectors in $R^m$ is linearly dependent.
Proof. Immediate, from Properties 5 and 6.
If a matrix B is constructed with m linearly independent column vectors $B_1, B_2, \ldots, B_m$, the properties just developed for vectors are directly related to the concept of a basis inverse introduced previously. We will show the relationships by defining the concept of a nonsingular matrix in terms of the independence of its vectors. The usual definition of a nonsingular matrix is that the determinant of the matrix is nonzero. However, this definition stems historically from calculating inverses by the method of cofactors, which is of little computational interest for our purposes and will not be pursued.
Definition. An m-by-m matrix B is said to be nonsingular if both its column vectors $B_1, B_2, \ldots, B_m$ and row vectors $B^1, B^2, \ldots, B^m$ are linearly independent.
Although we will not establish the property here, defining nonsingularity of B merely in terms of linear independence of either its column vectors or row vectors is equivalent to this definition. That is, linear independence of either its column or row vectors automatically implies linear independence of the other vectors.
Property 8. An m-by-m matrix B has an inverse if and only if it is nonsingular.
Proof. First, suppose that B has an inverse and that
$$B_1 \alpha_1 + B_2 \alpha_2 + \cdots + B_m \alpha_m = 0.$$
Letting $\alpha = \langle \alpha_1, \alpha_2, \ldots, \alpha_m \rangle$, in matrix form, this expression says that
$$B\alpha = 0.$$
Thus $(B^{-1})(B\alpha) = B^{-1}(0) = 0$ or $(B^{-1} B)\alpha = I\alpha = \alpha = 0$. That is, $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$, so that vectors $B_1, B_2, \ldots, B_m$ are linearly independent. Similarly, $\alpha B = 0$ implies that
$$\alpha = \alpha (B B^{-1}) = (\alpha B) B^{-1} = 0 B^{-1} = 0,$$
so that the rows $B^1, B^2, \ldots, B^m$ are linearly independent.
Figure A.
Proof. By reindexing if necessary, we may assume that only the first r components of y are positive; that is,
$$y_1 > 0,\; y_2 > 0,\; \ldots,\; y_r > 0, \quad y_{r+1} = y_{r+2} = \cdots = y_n = 0.$$
We must show that any vector y solving $Ay = b$, $y \ge 0$, is an extreme point if and only if the first r columns $A_1, A_2, \ldots, A_r$ of A are linearly independent.
First, suppose that these columns are not linearly independent, so that
$$A_1 \alpha_1 + A_2 \alpha_2 + \cdots + A_r \alpha_r = 0 \qquad (5)$$
for some real numbers $\alpha_1, \alpha_2, \ldots, \alpha_r$ not all zero. If we let x denote the vector $x = (\alpha_1, \alpha_2, \ldots, \alpha_r, 0, \ldots, 0)$, then expression (5) can be written as $Ax = 0$. Now let $w = y + \lambda x$ and $\bar{w} = y - \lambda x$. Then, as long as $\lambda > 0$ is chosen small enough to satisfy $\lambda |\alpha_j| \le y_j$ for each component $j = 1, 2, \ldots, r$, both $w \ge 0$ and $\bar{w} \ge 0$. But then, both $w$ and $\bar{w}$ are contained in S, since
$$A(y + \lambda x) = Ay + \lambda Ax = Ay + \lambda(0) = b,$$
and, similarly, $A(y - \lambda x) = b$. However, since $y = \tfrac{1}{2}(w + \bar{w})$, we see that y is not an extreme point of S in this case. Consequently, every extreme point of S satisfies the linear independence requirement.
Conversely, suppose that $A_1, A_2, \ldots, A_r$ are linearly independent. If $y = \lambda w + (1 - \lambda) x$ for some points w and x of S and some $0 < \lambda < 1$, then $y_j = \lambda w_j + (1 - \lambda) x_j$. Since $y_j = 0$ for $j \ge r + 1$ and $w_j \ge 0$, $x_j \ge 0$, then necessarily $w_j = x_j = 0$ for $j \ge r + 1$. Therefore,
$$A_1 y_1 + A_2 y_2 + \cdots + A_r y_r = A_1 w_1 + A_2 w_2 + \cdots + A_r w_r = A_1 x_1 + A_2 x_2 + \cdots + A_r x_r = b.$$
Since, by Property 2 in Section A.5, the representation of the vector b in terms of the linearly independent vectors $A_1, A_2, \ldots, A_r$ is unique, then $y_j = w_j = x_j$. Thus the two points w and x cannot be distinct and therefore y is an extreme point of S.
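The criterion just proved, that a feasible y is an extreme point exactly when the columns $A_j$ with $y_j > 0$ are linearly independent, can be turned into a direct test. The Python sketch below is one possible illustration with hypothetical data `A` and `b`; it checks independence by Gaussian elimination with exact fractions.

```python
from fractions import Fraction

def independent(vectors):
    """True when the given vectors are linearly independent."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r == len(rows)

def is_extreme_point(A, b, y):
    """For y with Ay = b, y >= 0: extreme iff columns with y_j > 0 are independent."""
    feasible = all(yj >= 0 for yj in y) and all(
        sum(a * yj for a, yj in zip(row, y)) == bi for row, bi in zip(A, b))
    if not feasible:
        return False
    columns = [[row[j] for row in A] for j, yj in enumerate(y) if yj > 0]
    return independent(columns)

A = [[1, 0, 1], [0, 1, 1]]
b = [2, 2]
print(is_extreme_point(A, b, [2, 2, 0]))  # True: columns (1,0) and (0,1)
print(is_extreme_point(A, b, [1, 1, 1]))  # False: three columns in R^2
print(is_extreme_point(A, b, [0, 0, 2]))  # True: the single column (1,1)
```

In this example the non-extreme point (1, 1, 1) is the midpoint of the two extreme points (2, 2, 0) and (0, 0, 2), matching the construction $y = \tfrac{1}{2}(w + \bar{w})$ in the proof.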
If A contains a basis (i.e., the rows of A are linearly independent), then, by Property 6, any collection $A_1, A_2, \ldots, A_r$ of linearly independent vectors can be extended to a basis $A_1, A_2, \ldots, A_m$. The extreme-point theorem shows, in this case, that every extreme point y can be associated with a basic feasible solution, i.e., with a solution satisfying $y_j = 0$ for nonbasic variables $y_j$, for $j = m + 1, m + 2, \ldots, n$.
Chapter 2 shows that optimal solutions to linear programs can be found at basic feasible solutions or equivalently, now, at extreme points of the feasible region. At this point, let us use the linear-algebra tools of this appendix to derive this result independently. This will motivate the simplex method for solving linear programs algebraically. Suppose that y is a feasible solution to the linear program
$$\text{Maximize } cx,$$
subject to:
$$Ax = b, \quad x \ge 0, \qquad (6)$$
and, by reindexing variables if necessary, that $y_1 > 0, y_2 > 0, \ldots, y_{r+1} > 0$ and $y_{r+2} = y_{r+3} = \cdots = y_n = 0$. If the column $A_{r+1}$ is linearly dependent upon columns $A_1, A_2, \ldots, A_r$, then
$$A_{r+1} = A_1 \alpha_1 + A_2 \alpha_2 + \cdots + A_r \alpha_r, \qquad (7)$$
with at least one of the constants $\alpha_j$ nonzero for $j = 1, 2, \ldots, r$. Multiplying both sides of this expression by $\theta$ gives
$$A_{r+1} \theta = A_1 (\alpha_1 \theta) + A_2 (\alpha_2 \theta) + \cdots + A_r (\alpha_r \theta), \qquad (8)$$
which states that we may simulate the effect of setting $x_{r+1} = \theta$ in (6) by setting $x_1, x_2, \ldots, x_r$, respectively, to $(\alpha_1 \theta), (\alpha_2 \theta), \ldots, (\alpha_r \theta)$. Taking $\theta = 1$ gives:
$$\tilde{c}_{r+1} = \alpha_1 c_1 + \alpha_2 c_2 + \cdots + \alpha_r c_r$$
as the per-unit profit from the simulated activity of using $\alpha_1$ units of $x_1$, $\alpha_2$ units of $x_2$, through $\alpha_r$ units of $x_r$, in place of 1 unit of $x_{r+1}$. Letting $\bar{x} = (-\alpha_1, -\alpha_2, \ldots, -\alpha_r, +1, 0, \ldots, 0)$, Eq. (8) is rewritten as $A(\theta \bar{x}) = \theta A \bar{x} = 0$. Here $\bar{x}$ is interpreted as setting $x_{r+1}$ to 1 and decreasing the simulated activity to compensate. Thus,
$$A(y + \theta \bar{x}) = Ay + \theta A \bar{x} = Ay + 0 = b,$$
so that $y + \theta \bar{x}$ is feasible as long as $y + \theta \bar{x} \ge 0$ (this condition is satisfied if $\theta$ is chosen so that $|\theta \alpha_j| \le y_j$ for every component $j = 1, 2, \ldots, r$ and $\theta \ge -y_{r+1}$). The return from $y + \theta \bar{x}$ is given by:
$$c(y + \theta \bar{x}) = cy + \theta c \bar{x} = cy + \theta (c_{r+1} - \tilde{c}_{r+1}).$$
Consequently, if $\tilde{c}_{r+1} < c_{r+1}$, the simulated activity is less profitable than the $(r + 1)$st activity itself, and return improves by increasing $\theta$. If $\tilde{c}_{r+1} > c_{r+1}$, return increases by decreasing $\theta$ (i.e., decreasing $y_{r+1}$ and increasing the simulated activity). If $\tilde{c}_{r+1} = c_{r+1}$, return is unaffected by $\theta$.
These observations imply that, if the objective function is bounded from above over the feasible region, then by increasing the simulated activity and decreasing activity $y_{r+1}$, or vice versa, we can find a new feasible solution whose objective value is at least as large as $cy$ but which contains at least one more zero component than y.
For, suppose that $\tilde{c}_{r+1} \ge c_{r+1}$. Then by decreasing $\theta$ from $\theta = 0$, $c(y + \theta \bar{x}) \ge cy$; eventually $y_j + \theta \bar{x}_j = 0$ for some component $j = 1, 2, \ldots, r + 1$ (possibly $y_{r+1} + \theta \bar{x}_{r+1} = y_{r+1} + \theta = 0$). On the other hand, if $\tilde{c}_{r+1} < c_{r+1}$, then $c(y + \theta \bar{x}) > cy$ as $\theta$ increases from $\theta = 0$; if some $\alpha_j$ from (7) is positive, then eventually $y_j + \theta \bar{x}_j = y_j - \theta \alpha_j$ reaches 0 as $\theta$ increases. (If every $\alpha_j \le 0$, then we may increase $\theta$ indefinitely, $c(y + \theta \bar{x}) \to +\infty$, and the objective value is unbounded over the constraints, contrary to our assumption.) Therefore, whether $\tilde{c}_{r+1} \ge c_{r+1}$ or $\tilde{c}_{r+1} < c_{r+1}$, we can find a value for $\theta$ such that at least one component of $y_j + \theta \bar{x}_j$ becomes zero for $j = 1, 2, \ldots, r + 1$. Since $y_j = 0$ and $\bar{x}_j = 0$ for $j > r + 1$, $y_j + \theta \bar{x}_j$ remains at 0 for $j > r + 1$. Thus, the entire vector $y + \theta \bar{x}$ contains at least one more zero component than y and $c(y + \theta \bar{x}) \ge cy$. With a little more argument, we can use this result to show that there must be an optimal extreme-point solution to a linear program.
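One step of this argument can be sketched in Python. The function below assumes a feasible y whose first r + 1 components are positive, coefficients $\alpha_1, \ldots, \alpha_r$ with $A_{r+1} = \sum_j \alpha_j A_j$, and an objective that is bounded above; it moves along $\bar{x} = (-\alpha_1, \ldots, -\alpha_r, 1, 0, \ldots, 0)$ or its negative until a component reaches zero. The name `reduce_support` and the data are hypothetical.

```python
from fractions import Fraction

def reduce_support(c, y, alpha):
    """Return a point with at least one more zero component than y and an
    objective value at least as large as c*y.

    y:     feasible point with y_1..y_{r+1} > 0 and zeros afterwards.
    alpha: (alpha_1..alpha_r) with A_{r+1} = sum_j alpha_j A_j.
    """
    r = len(alpha)
    x_bar = ([-Fraction(a) for a in alpha] + [Fraction(1)]
             + [Fraction(0)] * (len(y) - r - 1))
    gain = sum(cj * xj for cj, xj in zip(c, x_bar))   # c_{r+1} - c~_{r+1}
    # Increase theta when that strictly improves the return; otherwise decrease it.
    d = x_bar if gain > 0 else [-xj for xj in x_bar]
    # Largest step that keeps every component nonnegative.
    ratios = [Fraction(yj) / -dj for yj, dj in zip(y, d) if dj < 0]
    if not ratios:
        raise ValueError("objective is unbounded over the feasible region")
    step = min(ratios)
    return [yj + step * dj for yj, dj in zip(y, d)]

A = [[1, 0, 1], [0, 1, 1]]      # A_3 = 1*A_1 + 1*A_2, so alpha = (1, 1)
b = [2, 2]
y = [1, 1, 1]                   # feasible: Ay = (2, 2)

print(reduce_support([1, 1, 3], y, [1, 1]) == [0, 0, 2])  # True: return rises from 5 to 6
print(reduce_support([1, 1, 1], y, [1, 1]) == [2, 2, 0])  # True: return rises from 3 to 4
```

Because $A\bar{x} = 0$, the new point still satisfies $Ax = b$; the step length only has to protect nonnegativity.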