









Material Type: Notes; Class: Fundamental Algorithms; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Spring 2005;










Given a directed graph (a network) with two special vertices, the source s and the sink t, and with a capacity associated with each edge, the problem is to compute a maximum flow from s to t. This is the simplest of a class of graph problems that has many applications and has received a great deal of attention.
Formally, the input consists of a directed graph G = (V, E), with the special vertices s (source) and t (sink), and with a capacity function on all pairs of vertices $c : V^2 \to \mathbb{R}_{\ge 0}$. E is precisely the set of vertex pairs for which c > 0 (but it is convenient to think of c as defined on the set of all pairs of vertices, $V^2$). A flow in G is another function $f : V^2 \to \mathbb{R}$ such that
(i) f (u, v) = −f (v, u), (skew-symmetry)
(ii) f (u, v) ≤ c(u, v) for all (u, v) (or f ≤ c, in a more compact form),
(iii) for each $v \ne s, t$, $\sum_{u \in V} f(u, v) = 0$ (flow conservation: the total flow entering v is zero).
Figure VII.1: Network with capacities (left) and a flow (right).
Because of (i), (iii) is equivalent to $\sum_{u \in V} f(v, u) = 0$ (the total flow leaving v is zero). The value of the flow f, denoted |f|, is $\sum_{v \in V} f(s, v)$, that is, the flow out of the source s. Cuts will play an important role: a cut of G is a pair (S, T) with S, T ⊆ V, S ∩ T = ∅, S ∪ T = V, s ∈ S, t ∈ T. The capacity of and the flow through the cut (S, T) are defined by
$$c(S, T) = \sum_{u \in S,\, v \in T} c(u, v), \qquad f(S, T) = \sum_{u \in S,\, v \in T} f(u, v).$$
Clearly, f (S, T ) ≤ c(S, T ) for any cut (S, T ).
Observation 1. For any flow f in G and any cut (S, T ), |f | = f (S, T ). Hence, for any cut (S, T ), |f | ≤ c(S, T ).
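Proof. By induction on the size of S. If |S| = 1 then S = {s} and the claim follows by definition. For the inductive step, move a vertex u ≠ t from T to S; the change is
$$f(S \cup \{u\}, T - \{u\}) - f(S, T) = \sum_{w \in T - \{u\}} f(u, w) - \sum_{v \in S} f(v, u) = \sum_{w \in V} f(u, w) = 0,$$
where the second equality uses skew-symmetry (−f(v, u) = f(u, v), and f(u, u) = 0) and the last one is flow conservation at u.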
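As a sanity check of conditions (i)-(iii) and of Observation 1, the following minimal sketch verifies a flow on a small hand-made network and compares |f| with f(S, T) over every cut. The network, the dictionary representation, and all function names are illustrative assumptions, not part of the notes.

```python
from itertools import combinations

def lookup(d, u, v):
    # Missing pairs have capacity / flow 0.
    return d.get((u, v), 0)

def is_flow(V, c, f, s, t):
    """Check skew-symmetry (i), capacity (ii) and conservation (iii)."""
    for u in V:
        for v in V:
            if lookup(f, u, v) != -lookup(f, v, u):   # (i)
                return False
            if lookup(f, u, v) > lookup(c, u, v):     # (ii)
                return False
    return all(sum(lookup(f, u, v) for u in V) == 0   # (iii)
               for v in V if v not in (s, t))

def flow_value(V, f, s):
    return sum(lookup(f, s, v) for v in V)

def across(d, S, T):
    """Sum of d(u, v) over u in S, v in T: gives c(S, T) or f(S, T)."""
    return sum(lookup(d, u, v) for u in S for v in T)

def all_cuts(V, s, t):
    inner = [v for v in V if v not in (s, t)]
    for k in range(len(inner) + 1):
        for extra in combinations(inner, k):
            S = [s, *extra]
            yield S, [v for v in V if v not in S]

# Hypothetical example network; the flow saturates every edge.
V = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
f = {}
for (u, v), x in c.items():
    f[(u, v)] = x       # forward value
    f[(v, u)] = -x      # skew-symmetric counterpart

cut_flows = [across(f, S, T) for S, T in all_cuts(V, "s", "t")]
cut_caps = [across(c, S, T) for S, T in all_cuts(V, "s", "t")]
```

Every entry of `cut_flows` equals the flow value, while the entries of `cut_caps` are only upper bounds on it.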
Most algorithms for this problem work incrementally: flow is added through some subgraph of G, usually a path from s to t. This must be done with care, as the example in the figure shows: if a flow of 10 units is sent from s to t as shown, then s and t become disconnected and no more flow can be sent, although the maximum flow is actually 20. So what we need is the ability to "correct" the mistake of having sent 10 units of flow through the middle edge, by sending up to 10 units of flow in the opposite direction, as shown on the far right. That flow cancels the flow already sent in the forward direction and effectively makes the flow zero on that edge. This motivates the definition of the residual graph $G_f$ of G for flow f: its vertex set is also V and its capacity function is $c_f = c - f$. This determines the set of edges $E_f$ (those for which $c_f > 0$). Note then that $|E_f| \le 2|E|$.
The idea is then to look for a flow in $G_f$ and use it to augment the current flow f in G. The following lemma formalizes this. (Notation: f + g indicates the edge function (f + g)(u, v) = f(u, v) + g(u, v).)
Lemma 2. Let f be a flow in G and let $G_f$ be its residual graph. Then
(a) f′ is a flow in $G_f$ iff f + f′ is a flow in G.
(b) |f + f′| = |f| + |f′|.
(c) f′ is a max flow in $G_f$ iff f + f′ is a max flow in G.
(d) If f is any flow in G and f∗ is a max flow in G, then the max flow value in $G_f$ is |f∗| − |f|.
Proof. (a) (⇒) If f′ is a flow in $G_f$, then it satisfies (i)-(iii) there, and so f + f′ satisfies (i)-(iii) in G: (i) (f + f′)(u, v) = f(u, v) + f′(u, v) = −f(v, u) − f′(v, u) = −(f + f′)(v, u); (ii) $f' \le c_f = c - f$ implies $f + f' \le c$; (iii) for each $u \ne s, t$, $\sum_v (f + f')(u, v) = \sum_v f(u, v) + \sum_v f'(u, v) = 0$. (⇐) Essentially the same argument shows this direction.
(b) $|f + f'| = \sum_v (f + f')(s, v) = \sum_v f(s, v) + \sum_v f'(s, v) = |f| + |f'|$.
(c) (⇒) Suppose f′ is a max flow in $G_f$ but f + f′ is not a max flow in G. Let f∗ be a max flow in G. Since f∗ = f + (f∗ − f) is a flow in G, by (a) f∗ − f is a flow in $G_f$, and by (b) its value is |f∗ − f| = |f∗| − |f|, which is larger than (|f| + |f′|) − |f| = |f′|. This is a contradiction since f′ is a max flow in $G_f$. (⇐) Suppose f + f′ is a max flow in
Proof. (1 ⇒ 2) Suppose $G_f$ contains an augmenting path p, and let $f_p$ be the corresponding augmenting path flow. Then $f + f_p$ has larger value than f and hence f is not a max flow.
(2 ⇒ 3) Let $S = \{v : s \leadsto_{G_f} v\}$ be the set of vertices reachable from s in $G_f$. Then, by hypothesis, t ∉ S and so (S, T), where T = V − S, is a cut. Furthermore, for any (u, v) with u ∈ S and v ∈ T, (u, v) is not in $G_f$ and so f(u, v) = c(u, v). Thus |f| = f(S, T) = c(S, T). Since |f| ≤ c(S′, T′) for any cut (S′, T′), (S, T) is a minimum capacity cut. (3 ⇒ 1) Again, because |f| = c(S, T), f must be a maximum flow; otherwise the max flow would have larger value than c(S, T).
The max flow theorem is the basis for a generic algorithm that finds a maximum flow: iterate finding an augmenting path in the residual graph. If this procedure halts, then the theorem guarantees that it has computed a maximum flow.
GenericMaxFlow(G, s, t, c)
    f ← 0
    G_f ← G
    while there is an augmenting path p in G_f
        let f_p be the bottleneck flow through p
        f ← f + f_p
        update G_f
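The generic procedure can be sketched in Python as follows. This is one possible reading, assuming integer capacities stored in a dictionary keyed by vertex pairs; the depth-first path search, the function name, and the example network (modelled on the 10-unit example discussed earlier) are illustrative choices, not the notes' own implementation. The set of vertices still reachable from s when the loop stops is returned as well, since it is the source side of a minimum cut.

```python
from collections import defaultdict

def generic_max_flow(cap, s, t):
    """cap: {(u, v): integer capacity}. Returns (flow value, source side S)."""
    res = defaultdict(int)            # residual capacities c_f = c - f
    nbrs = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        nbrs[u].add(v)
        nbrs[v].add(u)                # the reverse edge may appear in G_f

    def search():
        """Depth-first search in G_f from s; returns the parent map."""
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            for v in nbrs[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    stack.append(v)
        return parent

    total = 0
    while True:
        parent = search()
        if t not in parent:           # no augmenting path: f is maximum
            return total, set(parent)
        path, v = [], t
        while parent[v] is not None:  # walk back from t to s
            path.append((parent[v], v))
            v = parent[v]
        b = min(res[e] for e in path)   # bottleneck of the path
        for u, v in path:
            res[(u, v)] -= b            # f(u, v) increases by b
            res[(v, u)] += b            # f(v, u) decreases by b
        total += b

# Hypothetical network: two s-t paths plus a middle edge a -> b.
cap = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 10,
       ("a", "t"): 10, ("b", "t"): 10}
value, S = generic_max_flow(cap, "s", "t")
cut_capacity = sum(c for (u, v), c in cap.items() if u in S and v not in S)
```

Whatever path the search happens to pick first, including the one through the middle edge, the residual reverse edge lets later augmentations undo it, and the value found equals the capacity of the cut (S, V − S).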
Unfortunately, the algorithm need not halt unless we make special assumptions on how the augmenting path is selected, or on the values of the capacities. The following construction shows that for arbitrary (irrational) capacities, the algorithm may not halt for some augmenting path choices. In that example, the max flow is $1 + r + r^2$, but this is reached only in the limit as n goes to ∞. The network could easily be modified so that the convergence is not even to the max flow.
Figure VII.3: The original network is on the left. All capacities are +∞ (including three backward edges in the middle, not shown for simplicity) except the three capacities shown: 1, r, $r^2$, where $r = (-1 + \sqrt{5})/2 = 0.618\ldots$ satisfies the equation $r^2 = 1 - r$, and so also $r^{n+2} = r^n - r^{n+1}$. The flow is augmented using the path shown on the left. This brings the situation to the generic one shown in the middle for n = 1: the middle edges have capacities 0, $r^n$, $r^{n+1}$. Using then the augmenting path shown, we obtain a residual graph as shown on the right (using $r^{n+2} = r^n - r^{n+1}$), the generic situation for the next integer (the edges are permuted, though, with respect to the middle picture).
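The identity used in the caption, and the value of the max flow in this example, follow directly from $r^2 = 1 - r$. A short worked derivation (added here as a check, not taken from the notes):

```latex
\begin{align*}
r^2 &= 1 - r, \\
r^{n+2} &= r^n \cdot r^2 = r^n (1 - r) = r^n - r^{n+1}, \\
1 + r + r^2 &= 1 + r + (1 - r) = 2.
\end{align*}
```

So the max flow value $1 + r + r^2$ in the example is exactly 2, even though the individual capacities are irrational.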
In the particular case of integer capacities, the algorithm is guaranteed to halt: in each step, the flow value is incremented by at least 1, so it must reach the maximum in at most |f∗| steps, where f∗ is a maximum flow. Furthermore, since every bottleneck value is then an integer, it follows from the algorithm that there is a maximum flow that is integer valued (all edge flows are integers). This is summarized in the following:
Theorem 4. If the capacity function c is integer valued, then there is a maximum flow that is also integer valued (and its value is also integer).
Since it takes time O(m) to find an augmenting path in $G_f$, where m is the number of edges, the running time of the generic algorithm, for integer capacities, is O(m|f∗|). We are interested in improving this, especially in making the running time independent of |f∗|. Before we continue with algorithms, we present two applications of the previous theorems to graph theory.
VII.1 Application of Flow Theorem to Graph Theory
Let G = (V, E) be a bipartite graph with V = A ∪ B, with A, B disjoint and E ⊆ A × B. A complete matching for A is a subset M ⊆ E of the edges such that each vertex in A is incident to exactly one edge of M and each vertex in B is incident to at most one. Flow arguments can be used to prove the following well-known condition (Hall's marriage theorem) for the existence of a complete matching. For W ⊆ A, N(W) is the set of neighbors of W in B: N(W) = {v ∈ B : (u, v) ∈ E for some u ∈ W}.
Theorem 5. There is a complete matching for A in G iff for all W ⊆ A, |N(W )| ≥ |W |.
Proof. One direction is trivial. For the other, we form a flow network by adding vertices s and t, and edges from s to A and from B to t (all edges directed) as shown in the figure. All edges have capacity 1. Let n = |A|. We want to show that the capacity of a minimum cut is n. Consider any cut (S, T) (so s ∈ S, t ∈ T) (some cases are illustrated in the figure). Let S = {s} ∪ W ∪ X where W ⊆ A and X ⊆ B. Let w = |W| and x = |X|. Then the cut (S, T) includes n − w edges between s and A, at least |N(W) − X| ≥ w − x edges between A and B (using the hypothesis |N(W)| ≥ w), and x edges between B and t. Thus, the capacity of the cut (S, T) is at least (n − w) + (w − x) + x = n, and the cut with S = {s} has capacity exactly n. Now, the max-flow min-cut theorem gives a flow of value n, and the integrality theorem (Theorem 4) allows us to take it integer valued; the edges between A and B that carry one unit of flow then form a complete matching.
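The flow network in the proof has unit capacities, so augmenting paths in it reduce to the familiar alternating-path search for bipartite matching. The sketch below uses that reduction rather than an explicit flow network, and compares the result with a brute-force check of Hall's condition; the function names and the two example graphs are assumptions made for illustration.

```python
from itertools import combinations

def max_matching(A, adj):
    """adj[a] lists the neighbors of a in B. Returns {b: a} for matched pairs."""
    match = {}

    def augment(a, seen):
        # Look for an augmenting path starting at the unmatched vertex a.
        for b in adj.get(a, ()):
            if b in seen:
                continue
            seen.add(b)
            # b is free, or its current partner can be re-matched elsewhere
            if b not in match or augment(match[b], seen):
                match[b] = a
                return True
        return False

    for a in A:
        augment(a, set())
    return match

def hall_condition(A, adj):
    """Brute-force check that |N(W)| >= |W| for every nonempty W (exponential)."""
    A = list(A)
    for k in range(1, len(A) + 1):
        for W in combinations(A, k):
            N = set().union(*(adj.get(a, ()) for a in W))
            if len(N) < len(W):
                return False
    return True

A = [1, 2, 3]
adj_ok = {1: ["x", "y"], 2: ["x"], 3: ["y", "z"]}     # Hall's condition holds
adj_bad = {1: ["x"], 2: ["x"], 3: ["x", "y"]}         # W = {1, 2} has N(W) = {x}
```

On `adj_ok` the matching covers all of A, while on `adj_bad` it cannot, in agreement with the theorem.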
Theorem 6. Let G = (V, E) be a directed graph with s, t ∈ V and k ∈ N. Then there are k edge-disjoint s−t paths iff after deleting any k − 1 edges from G, t is still reachable from s.
For the total running time, we need to multiply this by the time needed to compute a maximum bottleneck capacity flow. This is an improvement over the generic algorithm without particular choice of augmenting path for some values of |f ∗|. Still, we would like to obtain a running time independent of |f ∗|.
− Augment using a path of minimum length (number of edges)
To implement and analyze this heuristic, it is convenient to consider the level graph $L(G_f)$ of the residual graph. It is determined by a breadth-first search (BFS) of $G_f$: vertices in the k-th level are those at distance k from s; edges in the level graph go from vertices in one level to the next level. Note that we do not keep levels beyond the one containing t (and keep only t in that level). Edges of $G_f$ between vertices in the same level (inner), or edges from a level to a previous one (backward), are disregarded; note that there are no edges in $G_f$ (hence none in $L(G_f)$) that jump forward more than one level. Once the level graph is constructed, a path of minimum length from s to t can be found by restricting the search to level graph edges, and it is then used as the augmenting path. Then the residual graph is updated, and the level graph is updated by removing the edges that are saturated. As long as t is reachable from s in this level graph, we do not need to recompute it. The real level graph for the current residual graph would look different from the "degrading" one we are using, but it would still include the same minimum length paths that the degrading level graph has, and no others nor shorter ones (using old or new backward edges in the degrading level graph can only make an s−t path longer). After a number of iterations, t will become unreachable from s in the level graph; then we compute a fresh level graph for the current residual graph and start another phase.

Figure VII.4: On the left is a level graph (backward and inner edges are dashed green); an augmenting path is shown (dashed red). On the right is the updated level graph: a saturated edge is removed (dashed blue) as it is not relevant for a minimum length path in the current phase.
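The construction of the level graph can be sketched as a plain breadth-first search. The representation (a dictionary of residual capacities) and the example are assumptions for illustration; the filter at the end implements the rule of keeping only edges that advance exactly one level and, in the level of t, only t itself.

```python
from collections import deque

def level_graph(res, s, t):
    """res: {(u, v): residual capacity}. Returns (levels, edges of L(G_f))."""
    adj = {}
    for (u, v), c in res.items():
        if c > 0:                          # only edges present in G_f
            adj.setdefault(u, []).append(v)
    level = {s: 0}
    queue = deque([s])
    while queue:                           # BFS from s
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    if t not in level:
        return level, []                   # t unreachable: no level graph edges
    d = level[t]
    edges = [(u, v) for u in adj for v in adj[u]
             if u in level and v in level
             and level[v] == level[u] + 1      # forward by exactly one level
             and (level[v] < d or v == t)]     # nothing beyond t's level but t
    return level, edges

# Hypothetical residual graph with an inner edge (a, b) and a backward edge (t, a).
res = {("s", "a"): 2, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 1,
       ("b", "c"): 1, ("c", "t"): 1, ("t", "a"): 3}
level, edges = level_graph(res, "s", "t")
```

Here t sits at level 2, so the edge into c (also at level 2) and the edge from c to t are dropped along with the inner and backward edges.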
We summarize a phase in the following pseudocode. The first phase starts with f = 0 and $G_f = G$.
Min-Length-Heuristic Phase
    L ← L(G_f)
    while there is a path p from s to t in L
        f ← f + f_p
        update L: remove saturated edges
    update G_f
    if t is still reachable from s in G_f
        go to next phase
    else halt
Running Time: Let n = |V| and m = |E|. As observed, the distance from s to t increases from one phase to the next by at least 1, so there are at most n phases (the distance from s to t is at most n − 1 in any residual graph of G). Within each phase, there are at most m augmentations because each saturates at least one edge. So, the total number of augmentations is at most mn. Finally, the time to find an augmenting path is O(m), so the total running time is $O(nm^2)$.
In the previous analysis, we have assumed that every augmentation starts from scratch and takes time O(m). A better running time is obtained with a more careful implementation, which we now describe and analyze.
Min-Length-Heuristic Phase (revised implementation)
Initialization:
    L ← L(G_f)
    p ← s    /* current path */
    u ← s    /* last vertex in current path */
    Advance
Advance:
    if there is an edge (u, v) in L then
        p ← p · v
        u ← v
        if u ≠ t then Advance else Augment
    else Retreat
Retreat:
    if u ≠ s then
        remove u and incident edges from L
        drop u from p
        u ← last vertex of p
        Advance
    else End
Augment:
    f ← f + f_p    /* f_p is bottleneck flow through p */
    remove saturated edges from L
    u ← last vertex in p still reachable
    p ← subpath of p from s to u
    Advance
End:
    update G_f
    if t is still reachable from s in G_f
        go to next phase
    else halt
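A compact Python rendering of the Advance / Retreat / Augment loop might look as follows. It is a sketch under stated assumptions: integer capacities in a dictionary, per-vertex edge lists consumed from the end to model deletions from L, and levels beyond that of t simply left in place (their vertices are dead ends and are discarded by Retreat). The function name and both example networks are hypothetical.

```python
from collections import defaultdict, deque

def max_flow_min_length(cap, s, t):
    res = defaultdict(int)                 # residual capacities
    nbrs = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        nbrs[u].add(v)
        nbrs[v].add(u)

    def bfs_levels():
        level = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in nbrs[u]:
                if v not in level and res[(u, v)] > 0:
                    level[v] = level[u] + 1
                    q.append(v)
        return level

    total = 0
    while True:                            # one iteration per phase
        level = bfs_levels()
        if t not in level:
            return total
        # out[u]: heads of the level graph edges leaving u
        out = {u: [v for v in nbrs[u]
                   if res[(u, v)] > 0 and level.get(v) == level[u] + 1]
               for u in level}
        path = [s]
        while True:
            u = path[-1]
            if u == t:                     # Augment
                edges = list(zip(path, path[1:]))
                b = min(res[e] for e in edges)
                for x, y in edges:
                    res[(x, y)] -= b
                    res[(y, x)] += b
                total += b
                for i, e in enumerate(edges):
                    if res[e] == 0:        # keep prefix before first saturated edge
                        path = path[:i + 1]
                        break
            elif out[u]:                   # Advance
                v = out[u][-1]
                if res[(u, v)] > 0:
                    path.append(v)
                else:
                    out[u].pop()           # saturated edge leaves L
            elif u != s:                   # Retreat
                path.pop()
                out[path[-1]].pop()        # drop the edge into the dead end
            else:
                break                      # End of phase

diamond = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 10,
           ("a", "t"): 10, ("b", "t"): 10}
ten_edges = {("s", "v1"): 16, ("s", "v2"): 13, ("v1", "v2"): 10, ("v2", "v1"): 4,
             ("v1", "v3"): 12, ("v3", "v2"): 9, ("v2", "v4"): 14, ("v4", "v3"): 7,
             ("v3", "t"): 20, ("v4", "t"): 4}
```

Each edge of the level graph is popped at most once per phase, which is what the amortized analysis of the revised implementation relies on.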
Push-Pull Phase
Push to sink: The outgoing edges of v are saturated in order, leaving at most one partially saturated edge. All edges that become saturated during the process are deleted. This process is then repeated on each vertex that received flow during the saturation of the edges out of v, and so on all the way to t.
Pull from source: The incoming edges of v are saturated in order, leaving at most one partially saturated edge. All edges that become saturated by this process are deleted. This process is then repeated on each vertex from which flow was taken during the saturation of the edges into v, and so on all the way back to s.
Delete v and all its remaining incident edges from L, and update the capacities of the neighboring vertices. Go to step 1.
Further details of the implementation, and its analysis are left as an exercise.
A still better running time, namely O(mn log n), can be achieved with an approach called Push-Relabel, after the basic operations it uses. The approach is described in [CLRS], but only with an implementation that results in a running time of $O(n^3)$ (so no better than the push-pull algorithm). The improvement to O(mn log n) needs a complicated data structure for maintaining dynamic trees.
VII.3 Minimum Cut Problem
Given an (undirected) graph G = (V, E), a cut is a pair (S, T ), where S, T ⊆ V form a partition of V (S and T are disjoint and V = S ∪ T ). The size c(S, T ) of the cut (S, T ) is the number of edges between the vertex sets S and T. More precisely:
c(S, T ) = |{{u, v} ∈ E : u ∈ S, v ∈ T }|
The problem is to find a cut (S, T) of minimum size, with S and T both nonempty.^1 The min cut problem is solved "automatically" by network flow algorithms. More precisely, n − 1 s−t max flows suffice; even with the faster algorithm above, the running time would be $O(n^4)$ for dense graphs. The purpose here, though, is to present a simpler and somewhat faster randomized algorithm.
^1 A more interesting problem is to find a minimum balanced cut, that is, one in which |S|, |T| ≤ α|V| for some positive fraction α. Such balanced cuts would be of interest as part of a divide-and-conquer algorithm for some other problem. The min balanced cut problem is NP-hard.
The simple randomized algorithm is based on edge contractions. In this context, it is better to consider multigraphs: there can be multiple edges between the same pair of vertices. Given a multigraph G and an edge e = {u, v} in E, the graph G/e resulting from the contraction of e is the multigraph obtained by identifying the vertices u and v of e into a new vertex uv (while discarding the self-loops). We do not write the new edge set formally, but rather illustrate it with the example in the figure, where a sequence of three edge contractions is performed. The following observation is the basis for the algorithm.
Figure VII.5: Sequence of edge contractions: {a, e}, {c, d}, {b, cd}. The last one is an edge in the minimum cut, and the resulting graph does not have a cut of size two anymore.
Observation 8. (i) For any cut in G/e, there is a corresponding cut in G with equal size.
(ii) If the edge e = {u, v} does not belong to some minimum cut of G, then a minimum cut of G/e corresponds to a minimum cut of G.
The contraction can be implemented in time O(n) so that edges in the original graph corresponding to edges in the contracted graph can be easily identified (so that at the end, the algorithm can report the cut in the original graph). This is left as an exercise. The idea is to perform edge contractions on the graph until it is reduced to 2 vertices, at which point the minimum cut is simply the set of edges between those 2 vertices. If we manage never to contract edges in a minimum cut, then a minimum cut of the original graph is obtained. The edge to be contracted is chosen at random among all the edges every time (this can also be implemented in time O(n)). At the center of the algorithms is then the following:
Contract(G, ℓ)
    while G has more than ℓ vertices
        pick a random edge e in G
        G ← G/e
which contracts edges until a specified number of vertices remain.
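One simple way to realize the contraction loop is with a union-find structure over the original vertices, keeping every surviving edge with its original endpoints so that the final cut can be reported in the original graph. The sketch below is an assumption-laden illustration: it filters self-loops in O(m) per contraction rather than achieving the O(n) bound mentioned above, and the names `contract`, `target`, and the example graph are hypothetical.

```python
import random

def contract(n_vertices, edges, target, rng=random):
    """Contract random edges of a multigraph until `target` vertices remain.

    Vertices are 0..n_vertices-1; edges is a list of (u, v) pairs.
    Returns (label of each original vertex, surviving original edges).
    """
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = list(edges)
    remaining = n_vertices
    while remaining > target and edges:
        u, v = rng.choice(edges)            # uniform over current edges
        parent[find(u)] = find(v)           # identify the two endpoints
        remaining -= 1
        # discard the self-loops created by the merge
        edges = [(a, b) for a, b in edges if find(a) != find(b)]
    return [find(x) for x in range(n_vertices)], edges

# Hypothetical example: two triangles joined by a single bridge (min cut = 1).
graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(1)
runs = [contract(6, graph, 2, rng) for _ in range(200)]
best = min(len(crossing) for _, crossing in runs)
```

A single run may well contract the bridge and report a larger cut; taking the minimum over many independent runs is what makes the minimum cut likely to be found.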
What is the probability that the chosen edge does not belong to a fixed minimum cut of the original graph? Let us fix a minimum cut in the original graph, let C be the set of corresponding edges, and let k be its size. Let $n_i$ be the number of vertices after i contractions (so $n_i = n - i$), and let $m_i$ be the number of edges after i contractions. Given that no edge in C has been contracted in the previous i contractions, the probability that an edge in C is contracted is exactly $p_i = k/m_i$. Since the minimum cut has size k, each vertex of the contracted graph has degree at least k (otherwise there would be a smaller cut), therefore $m_i \ge k n_i / 2$ and $p_i \le 2/n_i$.
Let P(n) denote the probability of success for the algorithm running on a graph with n vertices. From the previous observation, we have
$$1 - P(n) \le \left(1 - \tfrac{1}{2} P(n/\sqrt{2})\right)^2.$$
Then, we get the recurrence, with P(2) = 1:
$$P(n) \ge P(n/\sqrt{2}) - \tfrac{1}{4} P^2(n/\sqrt{2}).$$
How to solve such a recurrence? We will be content with listening to the little birdie and suggest verifying by induction that P(n) ≥ 1/log n (left as an exercise). The running time satisfies the recurrence $T(n) = O(n^2) + 2T(n/\sqrt{2})$, which has solution $T(n) = O(n^2 \log n)$. To reduce the probability of error to polynomially small, we repeat this algorithm $\Theta(\log^2 n)$ times and return the smallest cut found. The total running time is $O(n^2 \log^3 n)$.
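Before attempting the induction, the bound can be checked numerically. The sketch below iterates the recurrence with equality (the worst case, since p − p²/4 is increasing for p ≤ 2) starting from P(2) = 1, with n growing by a factor of √2 at each level. Taking the logarithm in base 2 is an assumption, chosen so that the bound is tight at n = 2.

```python
import math

def success_lower_bound(levels):
    """Iterate p <- p - p^2/4 from P(2) = 1; returns a list of (n, p) pairs."""
    p, n = 1.0, 2.0
    rows = [(n, p)]
    for _ in range(levels):
        n *= math.sqrt(2)          # next problem size up the recursion
        p = p - p * p / 4          # worst case allowed by the recurrence
        rows.append((n, p))
    return rows

rows = success_lower_bound(40)
# Small tolerance only to absorb floating-point error in log2(n).
bound_holds = all(p >= 1 / math.log2(n) - 1e-12 for n, p in rows)
```

After one level the bound reads 0.75 ≥ 1/1.5, and the margin stays positive as n grows, which is consistent with the claimed P(n) ≥ 1/log n.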
VII.4 Some Examples of Applications
A town has n residents, n′ clubs, and n′′ political parties, with each resident belonging to at least one club and to exactly one political party. The question is whether there is a way to choose a representative to the town council from each club so that the i-th party has at most $c_i$ representatives chosen. This problem can be formulated as a max-flow problem, as illustrated in the figure. There are three groups R, C, P of vertices corresponding to residents, clubs and parties, plus a source s and a sink t. Edges with capacity 1 connect s to each of the clubs, each club to each of its members, and each resident to its party, and an edge with capacity $c_i$ connects the vertex for the i-th party to t. The original question has a solution iff the max flow in this network has value n′ (the number of clubs). Furthermore, an integer valued max flow directly gives a solution to the problem.
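The reduction can be sketched end to end in Python. The network is built exactly as described (capacity 1 on source-club, club-member and resident-party edges, capacity c_i into the sink), and solved with a small shortest-augmenting-path routine. All names, the data layout, and the three-club example are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Augment along BFS paths; returns (value, residual capacities)."""
    res = defaultdict(int)
    nbrs = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        nbrs[u].add(v)
        nbrs[v].add(u)
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in nbrs[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total, res
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(res[e] for e in path)
        for u, v in path:
            res[(u, v)] -= b
            res[(v, u)] += b
        total += b

def council(members, party, limit):
    """members[club] = residents, party[resident] = party, limit[party] = c_i."""
    cap = {}
    for club, people in members.items():
        cap[("s", ("club", club))] = 1
        for r in people:
            cap[(("club", club), ("res", r))] = 1
    for r, p in party.items():
        cap[(("res", r), ("party", p))] = 1
    for p, c in limit.items():
        cap[(("party", p), "t")] = c
    value, res = max_flow(cap, "s", "t")
    if value < len(members):
        return None                       # no valid choice of representatives
    # the representative of a club is the member on its saturated edge
    return {club: r for club, people in members.items() for r in people
            if res[(("club", club), ("res", r))] == 0}

members = {"chess": ["ann", "bob"], "golf": ["ann"], "book": ["bob", "cat"]}
party = {"ann": "P", "bob": "P", "cat": "Q"}
reps = council(members, party, {"P": 2, "Q": 1})
no_reps = council(members, party, {"P": 1, "Q": 1})
```

In this instance golf can only send ann, which forces chess to send bob and book to send cat; lowering the limit for party P to 1 makes the instance infeasible.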
Given a matrix with fractional entries, it is desired to round each entry up or down so that the result is consistent with rounding each of the row and column sums up or down. An example instance is shown below in the table on the left: a 3 × 3 matrix together with the row and column sums; the table on the right shows the possible results of rounding up or down for the entries and the row and column sums. This problem can be posed as a feasibility question for a network with upper and lower bounds. The network consists of s, t and two groups of vertices $C = \{c_1, \ldots, c_n\}$ and $R = \{r_1, \ldots, r_m\}$ corresponding to the columns and rows. Each edge has an upper and a lower bound corresponding to the two possibilities for rounding an entry or a column/row sum. An edge $(s, c_i)$ corresponds to the i-th column sum, an edge $(r_j, t)$ corresponds to the j-th row sum, and an edge $(c_i, r_j)$ corresponds to the entry in column i and row j of the matrix. This network with upper and lower bounds on capacities has a feasible flow iff the original matrix has a consistent rounding.
             col 1   col 2   col 3 | row sum
row 1          3.1     6.8     7.3 |    17.2
row 2          9.6     2.4     0.7 |    12.7
row 3          3.6     1.2     6.5 |    11.3
column sum    16.3    10.4    14.5 |
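The lower and upper bounds attached to the edges of the rounding network are just the floor and ceiling of the corresponding entry or sum. A small sketch computing them for the example matrix (variable names are illustrative; sums are rounded to one decimal only to suppress floating-point noise):

```python
import math

matrix = [[3.1, 6.8, 7.3],
          [9.6, 2.4, 0.7],
          [3.6, 1.2, 6.5]]

def bounds(x):
    """Lower and upper bound (round down, round up) for one edge."""
    return math.floor(x), math.ceil(x)

row_sums = [round(sum(row), 1) for row in matrix]
col_sums = [round(sum(col), 1) for col in zip(*matrix)]

entry_bounds = [[bounds(x) for x in row] for row in matrix]   # edges (c_i, r_j)
row_bounds = [bounds(x) for x in row_sums]                    # edges (r_j, t)
col_bounds = [bounds(x) for x in col_sums]                    # edges (s, c_i)
```

For instance, the first row sum 17.2 gives the bounds (17, 18) on the edge from the first row vertex to t.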
As usual, let G = (V, E) be a network, and s, t ∈ V. But now we have two edge capacity functions $c_l$ and $c_u$, with $c_l \le c_u$, which for each edge set a lower and an upper bound on the flow through that edge. More precisely, we are interested in a flow function f that satisfies skew-symmetry and conservation at vertices other than s, t, and such that
$$c_l(u, v) \le f(u, v) \le c_u(u, v).$$
Unlike for the standard network flow problem, where f = 0 satisfies the problem constraints, it is not obvious whether or not there is a flow that satisfies these conditions. Any such flow is called a feasible flow, and the problem of whether one exists is a feasibility problem.
This feasibility problem can be reduced to a standard max-flow problem. The constraints on the flow can be rewritten as
$$0 \le f(u, v) - c_l(u, v) \le c_u(u, v) - c_l(u, v).$$
This looks like the standard form. We would like to define a capacity function $c = c_u - c_l$, obtain a flow f′ for this standard problem, and extract from it a flow f for the network with upper and lower bounds: $f = f' + c_l$.
However, this would in general violate conservation at vertices v by the amount
$$\mathrm{exc}(v) = \sum_{(u,v) \in E} c_l(u, v) - \sum_{(v,w) \in E} c_l(v, w).$$
So, we need this much flow to leave vertex v through an extra edge. This discussion leads to the right construction: the network is G′ = (V′, E′), where V′ = V ∪ {s′, t′}, s′ and t′ are new vertices, and
E′ = E ∪ {(s′, v) : v ∈ V, exc(v) < 0} ∪ {(v, t′) : v ∈ V, exc(v) > 0} ∪ {(t, s)}