






Material Type: Notes; Class: Algorithms; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Unknown 1989;
Typology: Study notes







The greatest flood has the soonest ebb; the sorest tempest the most sudden calm; the hottest love the coldest end; and from the deepest desire oftentimes ensues the deadliest hate.
(Socrates)

Th’ extremes of glory and of shame,
Like east and west, become the same.
(Samuel Butler, Hudibras Part II, Canto I, c. 1670)

Extremes meet, and there is no better example than the haughtiness of humility.
(Ralph Waldo Emerson, “Greatness”, in Letters and Social Aims, 1876)
The maximum flow/minimum cut problem is a special case of a very general class of problems called linear programming. Many other optimization problems fall into this class, including minimum spanning trees and shortest paths, as well as several common problems in scheduling, logistics, and economics.

Linear programming was used implicitly by Fourier in the early 1800s, but it was first formalized and applied to problems in economics in the 1930s by Leonid Kantorovich. Kantorovich’s work was hidden behind the Iron Curtain (where it was largely ignored) and therefore unknown in the West. Linear programming was rediscovered and applied to shipping problems in the early 1940s by Tjalling Koopmans. The first complete algorithm to solve linear programming problems, called the simplex method, was published by George Dantzig in 1947. Koopmans first proposed the name “linear programming” in a discussion with Dantzig in 1948. Kantorovich and Koopmans shared the 1975 Nobel Prize in Economics “for their contributions to the theory of optimum allocation of resources”. Dantzig did not; his work was apparently too pure. Koopmans wrote to Kantorovich suggesting that they refuse the prize in protest of Dantzig’s exclusion, but Kantorovich saw the prize as a vindication of his use of mathematics in economics, which had been written off as “a means for apologists of capitalism”.

A linear programming problem asks for a vector $x \in \mathbb{R}^d$ that maximizes (or equivalently, minimizes) a given linear function, among all vectors $x$ that satisfy a given set of linear inequalities. The general form of a linear programming problem is the following:
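\[
\begin{array}{lll}
\text{maximize} & \displaystyle\sum_{j=1}^{d} c_j x_j & \\[2ex]
\text{subject to} & \displaystyle\sum_{j=1}^{d} a_{ij} x_j \le b_i & \text{for each } i = 1 \,..\, p \\[2ex]
 & \displaystyle\sum_{j=1}^{d} a_{ij} x_j = b_i & \text{for each } i = p+1 \,..\, p+q \\[2ex]
 & \displaystyle\sum_{j=1}^{d} a_{ij} x_j \ge b_i & \text{for each } i = p+q+1 \,..\, n
\end{array}
\]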
Here, the input consists of a matrix $A = (a_{ij}) \in \mathbb{R}^{n \times d}$, a column vector $b \in \mathbb{R}^n$, and a row vector $c \in \mathbb{R}^d$. Each coordinate of the vector $x$ is called a variable. Each of the linear inequalities is called a constraint. The function $x \mapsto c \cdot x$ is called the objective function. I will always use $d$ to denote the number of variables, also known as the dimension of the problem. The number of constraints is usually denoted $n$.
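A linear programming problem is said to be in canonical form^1 if it has the following structure:

\[
\begin{array}{lll}
\text{maximize} & \displaystyle\sum_{j=1}^{d} c_j x_j & \\[2ex]
\text{subject to} & \displaystyle\sum_{j=1}^{d} a_{ij} x_j \le b_i & \text{for each } i = 1 \,..\, n \\[2ex]
 & x_j \ge 0 & \text{for each } j = 1 \,..\, d
\end{array}
\]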
We can express this canonical form more compactly as follows. For two vectors $x = (x_1, x_2, \dots, x_d)$ and $y = (y_1, y_2, \dots, y_d)$, the expression $x \ge y$ means that $x_i \ge y_i$ for every index $i$.

\[
\begin{array}{rl}
\max & c \cdot x \\
\text{s.t.} & Ax \le b \\
 & x \ge 0
\end{array}
\]
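Any linear programming problem can be converted into canonical form as follows:

• For each variable $x_j$ that is not already required to be non-negative, substitute $x_j = x_j^+ - x_j^-$, where $x_j^+ \ge 0$ and $x_j^- \ge 0$ are two new variables.
• Replace each equality constraint $\sum_j a_{ij} x_j = b_i$ with two inequality constraints $\sum_j a_{ij} x_j \ge b_i$ and $\sum_j a_{ij} x_j \le b_i$.
• Replace each lower bound $\sum_j a_{ij} x_j \ge b_i$ with the equivalent upper bound $\sum_j -a_{ij} x_j \le -b_i$.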
This conversion potentially doubles the number of variables and the number of constraints; fortunately, it is rarely necessary in practice. Another useful format for linear programming problems is slack form^2, in which every inequality is of the form $x_j \ge 0$:

\[
\begin{array}{rl}
\max & c \cdot x \\
\text{s.t.} & Ax = b \\
 & x \ge 0
\end{array}
\]
It’s fairly easy to convert any linear programming problem into slack form. Slack form is especially useful in executing the simplex algorithm (which we’ll see in the next lecture).
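The conversion to canonical form can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `to_canonical`, the `(coefficients, operator, bound)` encoding of constraints, and the decision to treat every input variable as unrestricted in sign are all hypothetical choices, not part of the notes.

```python
def to_canonical(c, constraints):
    """Convert  max c.x  subject to mixed constraints on free variables x
    into  max c2.z  subject to  A2 z <= b2,  z >= 0.

    `constraints` is a list of (a, op, rhs) with op in {"<=", "=", ">="}.
    Each free variable x_j is replaced by z_{2j} - z_{2j+1}.
    """
    def split(row):
        # The coefficient a of x_j becomes a for x_j^+ and -a for x_j^-.
        out = []
        for a in row:
            out += [a, -a]
        return out

    A2, b2 = [], []
    for a, op, rhs in constraints:
        if op in ("<=", "="):
            A2.append(split(a))
            b2.append(rhs)
        if op in (">=", "="):
            # a.x >= rhs  is the same as  (-a).x <= -rhs
            A2.append(split([-v for v in a]))
            b2.append(-rhs)
    return split(c), A2, b2


# Example: max x1 + 2 x2  s.t.  x1 + x2 = 3,  x1 - x2 >= -1,  x1 <= 2
c2, A2, b2 = to_canonical(
    [1, 2],
    [([1, 1], "=", 3), ([1, -1], ">=", -1), ([1, 0], "<=", 2)],
)
```

In this example two variables become four and three constraints become four, which is the doubling effect mentioned above. When a variable is already known to be non-negative, the split is unnecessary.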
A point $x \in \mathbb{R}^d$ is feasible with respect to some linear programming problem if it satisfies all the linear constraints. The set of all feasible points is called the feasible region for that linear program. The feasible region has a particularly nice geometric structure that lends some useful intuition to the linear programming algorithms we’ll see later. Any linear equation in $d$ variables defines a hyperplane in $\mathbb{R}^d$; think of a line when $d = 2$, or a plane when $d = 3$. This hyperplane divides $\mathbb{R}^d$ into two halfspaces; each halfspace is the set of points that satisfy some linear inequality. Thus, the set of feasible points is the intersection of several hyperplanes (one for each equality constraint) and halfspaces (one for each inequality constraint).
(^1) Confusingly, some authors call this standard form.
(^2) Confusingly, some authors call this standard form.
We can compute the length of the shortest path from s to t in a weighted directed graph by solving the following very simple linear programming problem.
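\[
\begin{array}{lll}
\text{maximize} & d_t & \\
\text{subject to} & d_s = 0 & \\
 & d_v - d_u \le \ell_{u \to v} & \text{for every edge } u \to v
\end{array}
\]

Here $\ell_{u \to v}$ denotes the length of the edge $u \to v$, and each variable $d_v$ represents a tentative shortest-path distance from $s$ to $v$. The constraints mirror the requirement that every edge in the graph must be relaxed. These relaxation constraints imply that in any feasible solution, $d_v$ is at most the shortest path distance from $s$ to $v$. Thus, somewhat counterintuitively, we are correctly maximizing the objective function to compute the shortest path! In the optimal solution, the objective function $d_t$ is the actual shortest-path distance from $s$ to $t$, but for any vertex $v$ that is not on the shortest path from $s$ to $t$, $d_v$ may be an underestimate of the true distance from $s$ to $v$. However, we can obtain the true distances from $s$ to every other vertex by modifying the objective function:

\[
\begin{array}{lll}
\text{maximize} & \displaystyle\sum_{v} d_v & \\[2ex]
\text{subject to} & d_s = 0 & \\
 & d_v - d_u \le \ell_{u \to v} & \text{for every edge } u \to v
\end{array}
\]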
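To make the constraints concrete, here is a small Python sketch that computes true shortest-path distances with Bellman-Ford (a stand-in for an LP solver, chosen only to keep the example self-contained) and then checks them against the relaxation constraints. The graph, the function names, and the assumptions that edge lengths are non-negative and every vertex is reachable from s are mine.

```python
def shortest_distances(vertices, edges, s):
    """Bellman-Ford. `edges` maps (u, v) -> length; assumes no negative cycles."""
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for (u, v), length in edges.items():
            if d[u] + length < d[v]:
                d[v] = d[u] + length  # relax the edge u -> v
    return d


def is_feasible(d, edges, s):
    """LP constraints: d_s = 0 and d_v - d_u <= length(u -> v) for every edge."""
    return d[s] == 0 and all(d[v] - d[u] <= l for (u, v), l in edges.items())


vertices = ["s", "a", "b", "t"]
edges = {("s", "a"): 1, ("s", "b"): 4, ("a", "b"): 2, ("a", "t"): 6, ("b", "t"): 1}

d = shortest_distances(vertices, edges, "s")   # true distances
zeros = {v: 0 for v in vertices}               # a gross underestimate
too_big = dict(d, t=d["t"] + 1)                # overestimates the distance to t

print(is_feasible(d, edges, "s"))        # the true distances satisfy every constraint
print(is_feasible(zeros, edges, "s"))    # so do underestimates such as all zeros
print(is_feasible(too_big, edges, "s"))  # but any overestimate violates some edge
```

The true distances are feasible, underestimates can be feasible too, and raising any distance above its true value breaks a relaxation constraint. That is why maximizing picks out the shortest-path distance.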
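There is another formulation of shortest paths as an LP minimization problem using an indicator variable $x_{u \to v}$ for each edge $u \to v$.

\[
\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{u \to v} \ell_{u \to v} \cdot x_{u \to v} & \\[2ex]
\text{subject to} & \displaystyle\sum_{u} x_{u \to s} - \sum_{w} x_{s \to w} = -1 & \\[2ex]
 & \displaystyle\sum_{u} x_{u \to t} - \sum_{w} x_{t \to w} = 1 & \\[2ex]
 & \displaystyle\sum_{u} x_{u \to v} - \sum_{w} x_{v \to w} = 0 & \text{for every vertex } v \ne s, t \\[2ex]
 & x_{u \to v} \ge 0 & \text{for every edge } u \to v
\end{array}
\]

Intuitively, $x_{u \to v} = 1$ means that the edge $u \to v$ lies on the shortest path from $s$ to $t$, and $x_{u \to v} = 0$ means that it does not lie on this shortest path. The constraints merely state that the path should start at $s$, end at $t$, and either pass through or avoid every other vertex $v$. Any path from $s$ to $t$ (in particular, the shortest path) clearly implies a feasible point for this linear program. However, there are other feasible solutions, possibly even optimal solutions, with non-integral values that do not represent paths. Nevertheless, there is always an optimal solution in which every $x_e$ is either 0 or 1 and the edges $e$ with $x_e = 1$ comprise the shortest path. (This fact is by no means obvious, but a proof is beyond the scope of these notes.) Moreover, in any optimal solution, even if not every $x_e$ is an integer, the objective function gives the shortest path distance!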
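The remark about non-integral solutions is easy to see on a tiny example. The sketch below uses a hypothetical four-vertex graph with two shortest paths of equal length. It checks that one unit leaves s, one unit enters t, and every other vertex is balanced. The graph and all function names are illustrative assumptions.

```python
from fractions import Fraction


def net_outflow(x, v):
    """Total x on edges leaving v minus total x on edges entering v."""
    out = sum(val for (u, w), val in x.items() if u == v)
    inn = sum(val for (u, w), val in x.items() if w == v)
    return out - inn


def is_feasible(x, vertices, s, t):
    """x >= 0; one unit leaves s, one unit enters t, all other vertices balance."""
    if any(val < 0 for val in x.values()):
        return False
    for v in vertices:
        want = 1 if v == s else -1 if v == t else 0
        if net_outflow(x, v) != want:
            return False
    return True


def cost(x, length):
    return sum(length[e] * val for e, val in x.items())


vertices = ["s", "a", "b", "t"]
length = {("s", "a"): 1, ("a", "t"): 2, ("s", "b"): 2, ("b", "t"): 1}

# Indicator vector of the path s -> a -> t.
path_x = {("s", "a"): 1, ("a", "t"): 1, ("s", "b"): 0, ("b", "t"): 0}
# Half a unit along each of the two shortest paths: feasible but not a path.
half_x = {e: Fraction(1, 2) for e in length}
# A single edge is not a path from s to t, and is infeasible.
stub_x = {("s", "a"): 1, ("a", "t"): 0, ("s", "b"): 0, ("b", "t"): 0}

print(is_feasible(path_x, vertices, "s", "t"), cost(path_x, length))
print(is_feasible(half_x, vertices, "s", "t"), cost(half_x, length))
print(is_feasible(stub_x, vertices, "s", "t"))
```

Both the integral and the fractional point are feasible and have the same cost, which here equals the shortest-path distance 3.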
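Recall that the input to the maximum $(s, t)$-flow problem consists of a weighted directed graph $G = (V, E)$, two special vertices $s$ and $t$, and a function assigning a non-negative capacity $c_e$ to each edge $e$. Our task is to choose the flow $f_e$ across each edge $e$, as follows:

\[
\begin{array}{lll}
\text{maximize} & \displaystyle\sum_{w} f_{s \to w} - \sum_{u} f_{u \to s} & \\[2ex]
\text{subject to} & \displaystyle\sum_{w} f_{v \to w} - \sum_{u} f_{u \to v} = 0 & \text{for every vertex } v \ne s, t \\[2ex]
 & f_{u \to v} \le c_{u \to v} & \text{for every edge } u \to v \\
 & f_{u \to v} \ge 0 & \text{for every edge } u \to v
\end{array}
\]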
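Similarly, the minimum cut problem can be formulated using ‘indicator’ variables, just like the shortest path problem. We have a variable $S_v$ for each vertex $v$, indicating whether $v \in S$ or $v \in T$, and a variable $X_{u \to v}$ for each edge $u \to v$, indicating whether $u \in S$ and $v \in T$.^3

\[
\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{u \to v} c_{u \to v} \cdot X_{u \to v} & \\[2ex]
\text{subject to} & X_{u \to v} + S_v - S_u \ge 0 & \text{for every edge } u \to v \\
 & X_{u \to v} \ge 0 & \text{for every edge } u \to v \\
 & S_s = 1 & \\
 & S_t = 0 &
\end{array}
\]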
Like the minimization LP for shortest paths, there can be optimal solutions that assign fractional values to the variables. Nevertheless, the minimum value for the objective function is the cost of the minimum cut, and there is an optimal solution for which every variable is either 0 or 1, representing an actual minimum cut. No, this is not obvious; in particular, my claim is not a proof!
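As a sanity check on the two formulations, the following Python sketch takes a small hypothetical network, verifies that a given flow satisfies the conservation and capacity constraints, builds the 0/1 point of the cut LP that corresponds to a vertex set S, and compares the two objective values. The network, the flow, and the function names are illustrative assumptions. The cut constraints are written as X + S_v - S_u >= 0 with S_s = 1 and S_t = 0.

```python
def flow_is_feasible(f, cap, vertices, s, t):
    """0 <= f_e <= c_e on every edge, and conservation at every v other than s, t."""
    if any(not 0 <= f[e] <= cap[e] for e in cap):
        return False
    for v in vertices:
        if v in (s, t):
            continue
        out = sum(f[u, w] for (u, w) in cap if u == v)
        inn = sum(f[u, w] for (u, w) in cap if w == v)
        if out != inn:
            return False
    return True


def flow_value(f, s):
    """Objective of the flow LP: flow out of s minus flow into s."""
    out = sum(val for (u, w), val in f.items() if u == s)
    inn = sum(val for (u, w), val in f.items() if w == s)
    return out - inn


def cut_point(S, cap, vertices):
    """0/1 point of the cut LP: S_v = 1 iff v in S; X_uv = 1 iff u in S, v not in S."""
    Sv = {v: int(v in S) for v in vertices}
    X = {(u, w): int(u in S and w not in S) for (u, w) in cap}
    return Sv, X


def cut_is_feasible(Sv, X, s, t):
    return Sv[s] == 1 and Sv[t] == 0 and all(
        X[u, w] >= 0 and X[u, w] + Sv[w] - Sv[u] >= 0 for (u, w) in X
    )


def cut_cost(X, cap):
    return sum(cap[e] * X[e] for e in cap)


vertices = ["s", "a", "b", "t"]
cap = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}

Sv, X = cut_point({"s"}, cap, vertices)
print(flow_is_feasible(flow, cap, vertices, "s", "t"), flow_value(flow, "s"))
print(cut_is_feasible(Sv, X, "s", "t"), cut_cost(X, cap))
```

Here the flow value and the cut cost coincide, so by the maxflow-mincut theorem this flow is maximum and this cut is minimum.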
Each of these pairs of linear programming problems is related by a transformation called duality. For any linear programming problem, there is a corresponding dual linear program that can be obtained by a mechanical translation, essentially by swapping the constraints and the variables. The translation is simplest when the LP is in canonical form:
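\[
\begin{array}{ll}
\text{Primal } (\Pi): & \max\ c \cdot x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0 \\[1ex]
\text{Dual } (\amalg): & \min\ y \cdot b \quad \text{s.t.} \quad yA \ge c, \quad y \ge 0
\end{array}
\]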
We can also write the dual linear program in exactly the same canonical form as the primal, by swapping the objective vector $c$ and the constraint vector $b$, negating both vectors, and replacing the constraint matrix $A$ with its negative transpose.^4
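\[
\begin{array}{ll}
\text{Primal } (\Pi): & \max\ c \cdot x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0 \\[1ex]
\text{Dual } (\amalg): & \max\ -b^\top \cdot y^\top \quad \text{s.t.} \quad -A^\top y^\top \le -c^\top, \quad y^\top \ge 0
\end{array}
\]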
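The mechanical nature of this translation is perhaps clearest in code. The sketch below (function name and numbers are my own, purely illustrative) builds the dual of a canonical LP in canonical form and confirms that applying the translation twice returns the original program.

```python
def dual_in_canonical_form(A, b, c):
    """Given  max c.x  s.t.  A x <= b, x >= 0,  return (A2, b2, c2) describing
    the dual as  max c2.y  s.t.  A2 y <= b2, y >= 0,
    where A2 = -A^T, b2 = -c, and c2 = -b.
    """
    rows, cols = len(A), len(A[0])
    neg_transpose = [[-A[i][j] for i in range(rows)] for j in range(cols)]
    return neg_transpose, [-v for v in c], [-v for v in b]


# A small made-up LP with 3 constraints and 2 variables.
A = [[1, 2], [3, 4], [5, 6]]
b = [7, 8, 9]
c = [10, 11]

dA, db, dc = dual_in_canonical_form(A, b, c)        # 2 constraints, 3 variables
ddA, ddb, ddc = dual_in_canonical_form(dA, db, dc)  # dual of the dual

print(dA, db, dc)
print((ddA, ddb, ddc) == (A, b, c))
```

Constraints and variables trade places: the primal has three constraints on two variables, and the dual has two constraints on three variables.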
(^3) These two linear programs are not quite syntactic duals; I’ve added two redundant variables $S_s$ and $S_t$ to the min-cut program to increase readability.
(^4) For the notational purists: In these formulations, $x$ and $b$ are column vectors, and $y$ and $c$ are row vectors. This is a somewhat nonstandard choice. Yes, that means the dot in $c \cdot x$ is redundant. Sue me.
Now suppose that, for each variable $x_i$, the combination of $y_1$ and $y_2$ that multiplies it is at least the $i$th coefficient of the objective function:

\[
y_1 + 3y_2 \ge 4, \qquad 4y_1 - y_2 \ge 1, \qquad y_2 \ge 3.
\]

This assumption lets us derive an upper bound on the objective value of any feasible solution:

\[
4x_1 + x_2 + 3x_3 \;\le\; (y_1 + 3y_2)\,x_1 + (4y_1 - y_2)\,x_2 + y_2\,x_3 \;\le\; 2y_1 + 4y_2. \tag{$*$}
\]

In particular, by plugging in the optimal solution $(x_1^*, x_2^*, x_3^*)$ for the original LP, we obtain the following upper bound on $\sigma^*$:

\[
\sigma^* = 4x_1^* + x_2^* + 3x_3^* \;\le\; 2y_1 + 4y_2.
\]

Now it’s natural to ask how tight we can make this upper bound. How small can we make the expression $2y_1 + 4y_2$ without violating any of the inequalities we used to prove the upper bound? This is just another linear programming problem.

\[
\begin{array}{ll}
\text{minimize} & 2y_1 + 4y_2 \\
\text{subject to} & y_1 + 3y_2 \ge 4 \\
 & 4y_1 - y_2 \ge 1 \\
 & y_2 \ge 3 \\
 & y_1, y_2 \ge 0
\end{array}
\]
In fact, this is precisely the dual of our original linear program! Moreover, inequality (∗) is just an instantiation of the Weak Duality Theorem.
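The bound can be checked numerically. The sketch below assumes that the original LP is: maximize 4x1 + x2 + 3x3 subject to x1 + 4x2 ≤ 2, 3x1 − x2 + x3 ≤ 4, and x ≥ 0. That program is my reading of the coefficients in (∗) and of the dual above, so treat it as an assumption. All function names are illustrative, and exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction as F

# Assumed primal, read off from the coefficients in (*): max c.x, A x <= b, x >= 0.
A = [[1, 4, 0],
     [3, -1, 1]]
b = [2, 4]
c = [4, 1, 3]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def primal_feasible(x):
    return all(xi >= 0 for xi in x) and all(dot(row, x) <= bi for row, bi in zip(A, b))


def dual_feasible(y):
    columns = list(zip(*A))  # column j of A holds the coefficients multiplying x_j
    return all(yi >= 0 for yi in y) and all(dot(y, col) >= cj for col, cj in zip(columns, c))


# Weak duality: any feasible x is bounded by any feasible y.
x, y = [1, 0, 0], [2, 7]
print(primal_feasible(x), dual_feasible(y), dot(c, x), dot(y, b))

# A feasible pair with equal objective values certifies that both are optimal.
x_star = [F(0), F(1, 2), F(9, 2)]
y_star = [F(1), F(3)]
print(primal_feasible(x_star), dual_feasible(y_star), dot(c, x_star), dot(y_star, b))
```

The first pair gives the loose bound 4 ≤ 32. The second pair meets at 14, so under the assumption above neither objective value can be improved.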
The Fundamental Theorem can be rephrased in the following form:
Strong Duality Theorem. If $x^*$ is an optimal solution for a canonical linear program $\Pi$, then there is an optimal solution $y^*$ for its dual $\amalg$, such that $c \cdot x^* = y^* A x^* = y^* \cdot b$.
Proof (Sketch): I’ll prove the theorem only for non-degenerate linear programs, in which (a) the optimal solution (if one exists) is a unique vertex of the feasible region, and (b) at most $d$ constraint planes pass through any point. These non-degeneracy assumptions are relatively easy to enforce in practice and can be removed from the proof at the expense of some technical detail. I will also prove the theorem only for the case $n \ge d$; the argument for under-constrained LPs is similar (if not simpler).

Let $x^*$ be the optimal solution for the linear program $\Pi$; non-degeneracy implies that this solution is unique, and that exactly $d$ of the $n$ linear constraints are satisfied with equality. Without loss of generality (by permuting the rows of $A$), we can assume that these are the first $d$ constraints. So let $A_\bullet$ be the $d \times d$ matrix containing the first $d$ rows of $A$, and let $A_\circ$ denote the other $n - d$ rows. Similarly, partition $b$ into its first $d$ coordinates $b_\bullet$ and everything else $b_\circ$. Thus, we have partitioned the inequality $Ax^* \le b$ into a system of equations $A_\bullet x^* = b_\bullet$ and a system of strict inequalities $A_\circ x^* < b_\circ$.

Now let $y^* = (y^*_\bullet, y^*_\circ)$ where $y^*_\bullet = c A_\bullet^{-1}$ and $y^*_\circ = 0$. We easily verify that $y^* \cdot b = c \cdot x^*$:

\[
y^* \cdot b = y^*_\bullet \cdot b_\bullet + y^*_\circ \cdot b_\circ = y^*_\bullet \cdot b_\bullet = c A_\bullet^{-1} b_\bullet = c \cdot x^*.
\]

(The existence of the inverse matrix $A_\bullet^{-1}$ follows from our non-degeneracy assumption.) Similarly, it’s easy to verify that $y^* A \ge c$:

\[
y^* A = y^*_\bullet A_\bullet + y^*_\circ A_\circ = y^*_\bullet A_\bullet = c.
\]

Once we prove that $y^*$ is non-negative, and therefore feasible, the Weak Duality Theorem implies the result. We chose $y^*_\circ = 0$. As we will see below, the optimality of $x^*$ implies the strict inequality $y^*_\bullet > 0$; we had to use optimality somewhere! This is the hardest part of the proof.

The key insight is to give a geometric interpretation to the vector $y^*_\bullet = c A_\bullet^{-1}$. Each row of the linear system $A_\bullet x^* = b_\bullet$ describes a hyperplane $a_i \cdot x = b_i$ in $\mathbb{R}^d$. The vector $a_i$ is normal to this hyperplane and points out of the feasible region. The vectors $a_1, \dots, a_d$ are linearly independent (by non-degeneracy) and thus describe a coordinate frame for the vector space $\mathbb{R}^d$. The definition of $y^*_\bullet$ can be rewritten as follows:

\[
c = y^*_\bullet A_\bullet = \sum_{i=1}^{d} y^*_i a_i.
\]

In other words, $y^*_\bullet$ lists the coefficients of the objective vector $c$ in the coordinate frame $a_1, \dots, a_d$.

The point $x^*$ lies on exactly $d$ constraint hyperplanes; any $d - 1$ of these hyperplanes determine a line through $x^*$. For each $1 \le i \le d$, let $\ell_i$ denote the line that lies on all but the $i$th constraint plane, and let $v_i$ denote a vector based at $x^*$ that points into the halfspace $a_i \cdot x \le b_i$ along the line $\ell_i$. This vector lies along an edge of the feasible polytope. For all $j \ne i$, we have $a_j \cdot v_i = 0$. Thus, we can write

\[
A_\bullet v_i = (0, \dots, 0,\ a_i \cdot v_i,\ 0, \dots, 0)^\top
\]

where the scalar $a_i \cdot v_i$ appears in the $i$th coordinate. It follows that

\[
c \cdot v_i = y^*_\bullet A_\bullet v_i = y^*_i \,(a_i \cdot v_i).
\]

The optimality of $x^*$ implies that $c \cdot v_i < 0$, and because $v_i$ points into the feasible region while $a_i$ points out, we have $a_i \cdot v_i < 0$. We conclude that $y^*_i > 0$. We’re done! $\square$
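The construction in the proof can be carried out by hand on a small instance. The sketch below assumes the three-variable example LP is: maximize 4x1 + x2 + 3x3 subject to x1 + 4x2 ≤ 2, 3x1 − x2 + x3 ≤ 4, and x ≥ 0 (my reconstruction of the example, not stated explicitly here). It also treats the non-negativity constraint −x1 ≤ 0 as one more row of A, since that constraint is tight at the optimum. The helper names are hypothetical, and the solver assumes a square invertible matrix.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination.
    Assumes M is square and invertible (the non-degenerate case)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def transpose(M):
    return [list(col) for col in zip(*M)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


# The d = 3 constraints assumed tight at the optimum, written as rows a_i . x <= b_i.
A_tight = [[1, 4, 0],     # x1 + 4 x2 <= 2
           [3, -1, 1],    # 3 x1 - x2 + x3 <= 4
           [-1, 0, 0]]    # -x1 <= 0
b_tight = [2, 4, 0]
c = [4, 1, 3]

x_star = solve(A_tight, b_tight)        # the vertex where all three are tight
y_tight = solve(transpose(A_tight), c)  # y A_tight = c, that is, y = c A_tight^{-1}

print(x_star, y_tight)
print(dot(c, x_star), dot(y_tight, b_tight))
```

Every entry of `y_tight` comes out strictly positive and the two objective values agree, as the proof predicts for an optimal non-degenerate vertex.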
(b) Prove that finding the optimal feasible solution to an integer program is NP-hard.
[Hint: Almost any NP-hard decision problem can be formulated as an integer program. Pick your favorite.]
⋆5. Helly’s theorem states that for any collection of convex bodies in $\mathbb{R}^d$, if every $d + 1$ of them intersect, then there is a point lying in the intersection of all of them. Prove Helly’s theorem for the special case where the convex bodies are halfspaces. Equivalently, show that if a system of linear inequalities $Ax \le b$ does not have a solution, then we can select $d + 1$ of the inequalities such that the resulting subsystem also does not have a solution. [Hint: Construct a dual LP from the system by choosing a 0 cost vector.]
© Copyright 2008 Jeff Erickson. Released under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License (http://creativecommons.org/licenses/by-nc-sa/3.0/). Free distribution is strongly encouraged; commercial distribution is expressly forbidden. See http://www.cs.uiuc.edu/~jeffe/teaching/algorithms for the most recent revision.