Depth-First Search and Strong Components: Finding Strongly Connected Components in Graphs

These notes cover the depth-first search (DFS) algorithm and its application to finding strongly connected components in directed graphs. They explain how DFS classifies edges as tree edges, forward edges, cross edges, and back edges, introduce the concept of strong components, and show how strong components can be identified using an augmented DFS algorithm. From Carnegie Mellon University, course 15-451, Spring 2003.



Depth First Search and Strong Components

CMU 15-451 Spring 2003 D. Sleator

1. Introduction

Depth first search is a very useful technique for analyzing graphs. For example, it can be used to:

• Determine the connected components of a graph.
• Find cycles in a directed or undirected graph.
• Find the biconnected components of an undirected graph.
• Topologically sort a directed graph.
• Determine whether a graph is planar, and find an embedding of it if it is.
• Find the strong components of a directed graph.

If the graph has n vertices and m edges, then depth first search can be used to solve all of these problems in time O(n + m), that is, linear in the size of the graph.

2. Depth First Search in Directed Graphs

We assume that the graph is represented as an adjacency structure, that is, for every vertex v there is a set adj(v), which is the set of vertices reachable by following one edge out of v. Let V be the set of vertices in the graph, and let E be the set of edges. To do a depth first search we keep two pieces of information associated with each vertex v. One is the depth first search numbering, num(v), and the other is mark(v), which indicates that v is currently on the recursion stack.

Here is the depth first search procedure:

    i ← 0
    for all v ∈ V do num(v) ← 0
    for all v ∈ V do mark(v) ← 0
    for all v ∈ V do
        if num(v) = 0 then DFS(v)

    DFS(v)
        i ← i + 1
        num(v) ← i
        mark(v) ← 1
        for all w ∈ adj(v) do
            if num(w) = 0 then DFS(w)          [(v, w) is a tree edge]
            else if num(w) > num(v) then       [(v, w) is a forward edge]
            else if mark(w) = 0 then           [(v, w) is a cross edge]
            else                               [(v, w) is a back edge]
        mark(v) ← 0
    end DFS
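
The procedure above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `classify_edges`, the dictionary-based graph representation, and the small example graph are all assumptions made for the sketch. The fields `num` and `mark` play the same roles as in the pseudocode.

```python
def classify_edges(vertices, adj):
    """Run DFS over a directed graph and classify every edge.

    vertices: iterable of vertex names (hypothetical representation)
    adj: dict mapping each vertex to a list of its out-neighbors
    Returns (num, kinds) where kinds maps (v, w) to an edge class.
    """
    num = {v: 0 for v in vertices}   # depth-first number; 0 means unvisited
    mark = {v: 0 for v in vertices}  # 1 while the vertex is on the recursion stack
    kinds = {}
    counter = 0

    def dfs(v):
        nonlocal counter
        counter += 1
        num[v] = counter
        mark[v] = 1
        for w in adj.get(v, []):
            if num[w] == 0:
                kinds[(v, w)] = "tree"
                dfs(w)
            elif num[w] > num[v]:
                kinds[(v, w)] = "forward"
            elif mark[w] == 0:
                kinds[(v, w)] = "cross"
            else:
                kinds[(v, w)] = "back"
        mark[v] = 0

    for v in vertices:
        if num[v] == 0:
            dfs(v)
    return num, kinds


# Example graph (made up for illustration).
vertices = ["a", "b", "c", "d"]
adj = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
num, kinds = classify_edges(vertices, adj)
for edge, kind in kinds.items():
    print(edge, kind)
```

On this example the search starts at a, so (a, b) and (b, c) become tree edges, (c, a) is a back edge, (a, c) is a forward edge, and (d, c) is a cross edge. Reordering `vertices` or the lists in `adj` can change the classification.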

This process examines all edges and vertices. The call DFS(v) is made exactly once for each vertex of the graph. Each edge is placed into exactly one of four classes by the algorithm: tree edges, forward edges, cross edges, and back edges.

This classification of the edges is not a property of the graph alone. It also depends on the ordering of the vertices in adj(v) and on the ordering of the vertices in the loop that calls the DFS procedure. The num and mark fields are not actually necessary to accomplish a complete search of the graph. All that is needed to do that is a single bit for each vertex that indicates whether or not that vertex has already been searched. (This bit is zero for vertex v if and only if num(v) = 0.) The depth first labeling (the num field) has some very useful properties that we shall make use of.

The tree edges have the property that either zero or one of them points to a given vertex. Therefore, they define a collection of trees, called the depth-first spanning forest of the graph. The root of each tree is the lowest numbered vertex in it (the one that was searched first). These rooted trees allow us to define the ancestor and descendant relations among vertices. The four types of edges are related to the spanning forest as follows:

• The forward edges are edges from a vertex to a descendant of it that are not tree edges. This is because the test num(w) > num(v) indicates that w was explored after the call to DFS(v). Since the call to DFS(v) is not yet complete, w must be a descendant of v.

• The cross edges are edges from a vertex v to a vertex w such that the subtrees rooted at v and w are disjoint. This follows because mark(w) = 0, so the exploration of w is complete, and was complete before the call to DFS(v). Therefore v is not in the subtree rooted at w. Vertex w is not in the subtree rooted at v because num(w) < num(v).

• The back edges are edges from a vertex to an ancestor of it. The fact that mark(w) = 1 indicates that w is on the recursion stack and is thus an ancestor of v.

Below is an example of a graph and a corresponding depth first spanning forest.

We shall repeatedly make use of one very general property of the depth-first numbering of the graph. This is embodied in the following lemma.

Lemma 1 Let T_v be the subtree of the spanning forest rooted at v, and define T_w similarly. Suppose that T_v and T_w are disjoint. Then either all the depth-first numbers in T_v are greater than all of those in T_w, or all the depth-first numbers in T_v are less than all of those in T_w. Furthermore, if there is an edge from a vertex in T_v to a vertex in T_w, then all of the depth-first search numbers in T_w are less than all of them in T_v.

Definition: A vertex is called a base of a strong component if it has the lowest depth-first search number (num field) of any vertex in the strong component.

Lemma 2 Let b be the base of a strong component X. Then for all v ∈ X, v is a descendant of b, and all vertices on the path from b to v are in X.

Proof. First we prove the first part. Let v be any vertex besides b in X. We know that either (1) v descends from b, or (2) b descends from v, or (3) neither of the above. (2) is impossible, because if b descended from v, then its depth-first number would be greater than v's, which contradicts the assumption that b is a base.

Suppose (3). There must be a path from b to v, because they are in the same strong component. Consider one such path, p, and let r be the least common ancestor of all the vertices on the path. (In other words, r is the vertex such that all the vertices on the path descend from it, but this property does not hold for any of its descendants. Since there is only one tree in the spanning forest there must be such a vertex.)

We claim that r must be on the path p. To prove this seems to require two cases.

Case 1: b and v descend from different children of r.

Let T_b be the subtree containing b and let T_v be the subtree containing v. Since num(b) < num(v) and T_b and T_v are disjoint, there cannot be any edge from a vertex in T_b to one in T_v. This follows from Lemma 1. Therefore the only way path p can get from b to v is by going through r.

Case 2: b and v descend from the same child of r.

Suppose that the path p does not contain r. Then the path must touch the subtrees of at least two distinct children of r. (If not, then a child of r would be a lower common ancestor than r of the path p.) Call the subtrees rooted at the children of r T_1, T_2, ..., T_k. Assume without loss of generality that T_1 contains b and v. Path p starts in T_1, goes through a sequence of subtrees of r, then returns to T_1. Each time the path changes from one subtree to another, all of the numbers in the new subtree must be less than all those of the old subtree, by Lemma 1. So it is impossible for such a path to return to T_1. The conclusion is that the path must go through r.

OK, so the path goes through r. So what? Well, since r is an ancestor of b, its depth-first search number is less than that of b. It is also in the same strong component as b because there is a path from b to r and a path (along tree edges) from r to b. But b was supposed to be the lowest numbered vertex in the strong component. This shows that (3) is impossible.

Of the three original alternatives, only (1) is left. Therefore v is a descendant of b.

The second part of the lemma is now trivial. Let x be a vertex on the tree path from b to v. There is a path from b to x (via tree edges). There is also a path from x to b, by first going from x to v, then from v to b. Therefore x is in the same strong component as b. □

Lemma 3 Let b be a base vertex. Let b_1, b_2, ..., b_k be all of the base vertices that descend from b. Then b's strong component is the set of vertices descending from b but not descending from any of b_1, b_2, ..., b_k.

Proof. Assume the contrary, i.e. that there is a vertex, v, in the same strong component as b and which descends from b and b_i. There must be a path from v to b. There is also a path from b to b_i to v (following tree edges). Therefore b and b_i are in the same strong component, which contradicts the assumption that b and b_i are bases of distinct strong components. □
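
One way to read Lemma 3 is that every vertex belongs to the strong component of its nearest base ancestor, counting a vertex as its own ancestor. The Python sketch below illustrates this reading under stated assumptions: the function name `components_from_bases`, the `parent` map (tree parent of each vertex, `None` for a root), and the example data are hypothetical, and the set of bases is taken as given rather than computed. It also assumes every tree root is a base, which holds because a root has the lowest number in its tree.

```python
def components_from_bases(parent, bases):
    """Group vertices by their nearest base ancestor.

    parent: dict mapping each vertex to its tree parent (None for a root)
    bases: set of base vertices; every root is assumed to be in it
    """
    components = {b: set() for b in bases}
    for v in parent:
        x = v
        while x not in bases:
            x = parent[x]  # climb one tree edge toward the root
        components[x].add(v)
    return components


# Hypothetical example: graph edges 1->2, 2->3, 3->2, 1->4, searched from 1.
# The spanning tree has 1 as root, 2 and 4 as children of 1, and 3 as child of 2.
parent = {1: None, 2: 1, 3: 2, 4: 1}
bases = {1, 2, 4}
components = components_from_bases(parent, bases)
print(components)
```

This climbs the tree separately for each vertex, so it is meant only to make the statement of the lemma concrete, not to be a linear-time method.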

Definition: Let lowlink(v) be the depth-first number of the minimum numbered vertex in the same strong component as v that can be reached from v by following zero or more tree edges followed by at most one back or cross edge.

Lemma 4 A vertex v is a base if and only if num(v) = lowlink(v).

Proof. We first assume that num(v) > lowlink(v) and prove that v is not a base. By definition of lowlink, there is a vertex w in the same strong component as v such that lowlink(v) = num(w). Therefore num(v) > num(w), and v cannot be a base.

To prove the converse, assume that v is not a base. Let b be the base of the strong component containing v. Then there must be a path p from v to b. By Lemma 2, b must be an ancestor of v. Let y be the first vertex along the path p that is not in the subtree rooted at v, and let x be the vertex before y on the path. If y is an ancestor of v, then num(y) < num(v) immediately. Otherwise the subtree rooted at y is disjoint from that rooted at v, and since there is an edge from a descendant of v (namely x) to y, we can apply Lemma 1 and conclude that num(y) < num(v). This shows that lowlink(v) ≤ num(y), since y is in the same strong component as v and is reachable by following tree edges, then one back or cross edge. Combining this with the fact that num(y) < num(v) gives lowlink(v) < num(v). □

4. The Strong Components Algorithm

We can now present the algorithm for computing strong components. A stack S containing a

particular set of vertices is maintained by the algorithm.
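
As a rough illustration of a stack-based strong components search of this kind, here is a minimal Python sketch in the style of Tarjan's algorithm. It is a reconstruction under assumptions, not the listing from these notes: the function names `strong_components` and `strong`, the `on_stack` set used to test membership in S, and the example graph are all hypothetical. It uses the update lowlink(v) ← min(lowlink(v), num(w)) for an edge (v, w) with num(w) < num(v) and w ∈ S, and outputs a component when lowlink(v) = num(v).

```python
def strong_components(vertices, adj):
    """Return the strong components of a directed graph as lists of vertices."""
    num = {v: 0 for v in vertices}  # depth-first number; 0 means unvisited
    lowlink = {}
    stack = []        # the stack S
    on_stack = set()  # supports the test "w in S" in constant time
    components = []
    counter = 0

    def strong(v):
        nonlocal counter
        counter += 1
        num[v] = lowlink[v] = counter
        stack.append(v)
        on_stack.add(v)
        for w in adj.get(v, []):
            if num[w] == 0:
                # Tree edge: search w, then inherit its lowlink.
                strong(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif num[w] < num[v] and w in on_stack:
                # Back or cross edge into the current strong component.
                lowlink[v] = min(lowlink[v], num[w])
        if lowlink[v] == num[v]:
            # v is a base: pop its component off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in vertices:
        if num[v] == 0:
            strong(v)
    return components


# Example graph (made up): a 3-cycle feeding into a 2-cycle.
vertices = [1, 2, 3, 4, 5]
adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
result = strong_components(vertices, adj)
print(result)
```

The recursion mirrors the DFS procedure of Section 2, so deep graphs may need an explicit stack or a raised recursion limit in Python.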

Clearly lowlink(v) is computed correctly assuming one thing: the test w ∈ S is satisfied exactly when w is in the same strong component as v. If w ∈ S then by Lemma 5 there exists a path from w to v. This, combined with edge (v, w), ensures that v and w are in the same strong component. If w ∉ S then since num(w) < num(v) the recursive call on w must have been completed and the strong component containing w has been output. This strong component has been correctly computed by induction hypothesis (2).

We now show (2). By the proof just given, when the output phase is reached, lowlink(v) has been correctly computed. By Lemma 4, v is a base vertex.

The ensuing loop pops everything from the end of the stack (including v) into a strong component. These vertices are exactly those that are descendants of v but not in another strong component (here we're using induction hypothesis (2)). By Lemma 3, this is the strong component with v as a base. □

We would like to withdraw our assumptions that the graph has a vertex R connected to all other vertices, and that R is searched first by the algorithm. Compare the following two alternatives: (1) run the above algorithm on a graph G, or (2) add a new vertex R to G with an edge to all other vertices, and run the search above by calling STRONG(R). By syntactic analysis of the program above, it is easy to see that these two processes will differ only in that (2) will output one extra strong component at the end (the one consisting of R). This is because while running (2), inside the call to STRONG(R), the loop through the vertices adjacent to R will behave exactly like the loop on the outside running in (1).

It is tempting to consider what happens to the algorithm if the line:

    "lowlink(v) ← min(lowlink(v), num(w))"

is replaced by:

    "lowlink(v) ← min(lowlink(v), lowlink(w))"

This new lowlink function is not the same as the old one, yet the algorithm will still work.

Lemma 6 The modified strong components algorithm works.

Proof. Let ll(v) be the new lowlink function. (That is, change the line indicated above, then replace lowlink by ll everywhere in the program.) Since lowlink(x) ≤ num(x), this change can only decrease the function, that is, ll(x) ≤ lowlink(x). Our goal is to show that the test lowlink(v) = num(v) is true in the running of the original algorithm exactly when ll(v) = num(v) in the running of the new algorithm.

One way is easy. If the test ll(v) = num(v) is satisfied, then it must be the case that in the running of the original algorithm the test lowlink(v) = num(v) is also satisfied, because lowlink(v) is trapped between ll(v) and num(v).

Conversely, suppose that lowlink(v) = num(v), that is, that v is a base vertex. Our goal is to show that ll(v) = lowlink(v). The reason is that even though ll(x) can be smaller than lowlink(x), its value is still the depth first number (num field) of another vertex in the same strong component as x. If v is a base vertex there is no other vertex in the same strong component as v with a smaller num field. Therefore in this case ll(v) = lowlink(v), which completes the proof. □
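
The claim of Lemma 6 can be spot-checked empirically. The Python sketch below is an illustration under assumptions rather than anything taken from the notes: it re-implements a stack-based search with a hypothetical flag `modified` that switches between the two update rules, and compares both variants with a brute-force computation based on mutual reachability on small random graphs. All function names and the choice of random test sizes are invented for the sketch.

```python
import random


def tarjan(n, adj, modified):
    """Strong components of a graph on vertices 0..n-1, as a set of frozensets."""
    num = [0] * n
    low = [0] * n
    stack, on_stack, comps = [], [False] * n, []
    counter = 0

    def strong(v):
        nonlocal counter
        counter += 1
        num[v] = low[v] = counter
        stack.append(v)
        on_stack[v] = True
        for w in adj[v]:
            if num[w] == 0:
                strong(w)
                low[v] = min(low[v], low[w])
            elif num[w] < num[v] and on_stack[w]:
                # Original rule uses num(w); modified rule uses lowlink(w).
                low[v] = min(low[v], low[w] if modified else num[w])
        if low[v] == num[v]:
            comp = set()
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in range(n):
        if num[v] == 0:
            strong(v)
    return set(comps)


def brute_force(n, adj):
    """Strong components from the definition: mutual reachability."""
    reach = []
    for s in range(n):
        seen, todo = {s}, [s]
        while todo:
            u = todo.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        reach.append(seen)
    return {
        frozenset(w for w in range(n) if w in reach[v] and v in reach[w])
        for v in range(n)
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
all_agree = True
for _ in range(200):
    n = rng.randint(1, 8)
    adj = [[] for _ in range(n)]
    for _ in range(rng.randint(0, 16)):
        adj[rng.randrange(n)].append(rng.randrange(n))
    expected = brute_force(n, adj)
    if tarjan(n, adj, False) != expected or tarjan(n, adj, True) != expected:
        all_agree = False
print(all_agree)
```

Agreement on random inputs is only a sanity check and does not replace the proof above.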