Data Structures and Algorithms: Topological Sort and Strongly Connected Components, Lecture notes of Algorithms and Programming

These notes review topological sorting of directed acyclic graphs (DAGs) and strongly connected components (SCCs) of directed graphs. They give an overview of two algorithms for topological sorting, Kahn's algorithm and Tarjan's algorithm, and explain how Kosaraju's algorithm can be used to compute the SCCs of a graph. They accompany the lecture notes and readings for CIS 121, a course on data structures and algorithms, and are relevant for students of computer science and related fields.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023

salim



CIS 121—Data Structures and Algorithms—Fall 2022

Topological Sort / Strongly Connected Components—Monday, October 24 / Tuesday, October 25

Readings

  • Lecture Notes Chapter 18: DAGs and Topological Sort
  • Lecture Notes Chapter 19: Strongly Connected Components

Review: Topological Sort

A topological sort of a directed acyclic graph (DAG) G = (V, E) is an ordering of the vertices such that for each directed edge (u, v) ∈ E, u appears before v in the ordering. As the algorithms below show, topologically sorting a DAG takes only O(m + n) time, where n = |V| and m = |E|. Since most graph algorithms take Ω(m + n) time anyway, the sort is usually a free step: even when your algorithm does not strictly need it, it can make the problem easier to reason about, because it lays the vertices out in a line with every edge pointing forward. Below are two algorithms to find a topological sort:
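
The definition can be turned directly into a small checker. The following is a minimal Python sketch, not part of the original notes; the function name `is_topological_order` and the edge-list representation are assumptions chosen for illustration.

```python
def is_topological_order(order, edges):
    """Return True if `order` lists each vertex at most once and, for every
    edge (u, v), u appears before v. `edges` is a list of (u, v) pairs."""
    position = {v: i for i, v in enumerate(order)}
    if len(position) != len(order):
        return False  # some vertex appears more than once
    return all(
        u in position and v in position and position[u] < position[v]
        for u, v in edges
    )


# Hypothetical example DAG: a -> b, a -> c, b -> d, c -> d
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(is_topological_order(["a", "b", "c", "d"], edges))  # True
print(is_topological_order(["a", "c", "b", "d"], edges))  # True
print(is_topological_order(["b", "a", "c", "d"], edges))  # False
```

The checker runs in O(m + n) time: one pass to record positions and one pass over the edges.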

Kahn’s Algorithm

Every DAG has a source node, that is, a node with no incoming edges. Kahn’s algorithm relies on this fact: at a high level, it repeatedly finds a source node, places it next in the topological sort, and removes the node and all of its incident edges from the graph. What remains is still a DAG, so it again has a source node and the process can repeat.

Kahn’s algorithm runs in O(m + n) time. As seen in the pseudocode, the first step, computing the in-degree of each node, takes O(m + n) time since for each node we scan through its neighbors; the second step, populating the queue, takes O(n) time since we iterate through all of the vertices. In the while loop, each node is enqueued and dequeued exactly once, and when it is dequeued we scan through each of its neighbors, performing constant work for each, which takes O(m + n) time in total.
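
The three steps analyzed above can be sketched in Python as follows. This is an illustrative sketch rather than the course's pseudocode: the name `kahn_topological_sort` and the adjacency-list dictionary (every vertex is a key mapping to a list of out-neighbors) are assumptions. Instead of physically deleting edges, it decrements in-degree counters, which has the same effect.

```python
from collections import deque


def kahn_topological_sort(graph):
    """graph: dict mapping every vertex to a list of its out-neighbors."""
    # Step 1: compute the in-degree of each node, O(m + n).
    in_degree = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1

    # Step 2: populate the queue with all source nodes, O(n).
    queue = deque(v for v in graph if in_degree[v] == 0)

    # Step 3: repeatedly remove a source and update its neighbors, O(m + n).
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1  # "remove" the edge (u, v)
            if in_degree[v] == 0:
                queue.append(v)  # v has become a source

    if len(order) != len(graph):
        raise ValueError("graph contains a cycle, so no topological sort exists")
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(kahn_topological_sort(graph))  # ['a', 'b', 'c', 'd']
```

The final length check is an addition for robustness: if the input has a cycle, the nodes on it never reach in-degree zero and are never enqueued.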

Tarjan’s Algorithm

Tarjan’s algorithm leverages the finishing times of DFS: as shown in the pseudocode, it runs DFS and then returns the nodes in decreasing order of finishing time. Since the work is a single DFS, it also runs in O(m + n) time. This use of finishing times is the key insight behind the following algorithm, which finds the strongly connected components of a directed graph.
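
A minimal Python sketch of this DFS-based approach is shown below. It is not the course's pseudocode; the name `dfs_topological_sort` and the adjacency-list dictionary are assumptions, and the sketch assumes the input really is a DAG (it does not detect cycles). Rather than storing numeric finishing times, it appends each node to a list when it finishes, so the list ends up in increasing order of finishing time.

```python
def dfs_topological_sort(graph):
    """graph: dict mapping every vertex to a list of its out-neighbors.
    Assumes the graph is a DAG."""
    visited = set()
    finished = []  # nodes in increasing order of finishing time

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finished.append(u)  # u finishes after all of its descendants

    for u in graph:
        if u not in visited:
            visit(u)

    # Decreasing order of finishing time is a topological order.
    return finished[::-1]


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_topological_sort(graph))  # ['a', 'c', 'b', 'd']
```

Because the sketch is recursive, very deep graphs can exceed Python's default recursion limit; an explicit stack would avoid that at the cost of slightly longer code.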

Review: Kosaraju’s Algorithm

Given a directed graph G = (V, E), a strongly connected component (SCC) is a maximal set S ⊆ V such that for all u, v ∈ S, there exists a path u ⇝ v and a path v ⇝ u. We can therefore decompose a directed graph G into its SCCs, yielding G^SCC, also called the kernel graph. Formally, G^SCC = (V^SCC, E^SCC). Each vertex v_i in G^SCC represents a single SCC C_i in G, and an edge (v_i, v_j) with i ≠ j exists in G^SCC if G contains a directed edge (x, y) where x is in SCC C_i and y is in SCC C_j. Observe that G^SCC is a DAG: a cycle through two distinct SCCs would make their vertices mutually reachable, contradicting maximality. This means we can topologically sort G^SCC to make the problem easier to think about.
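
To make the construction of the kernel graph concrete, here is a small Python sketch that builds it once each vertex's SCC is already known. The function name `condensation` and the `component_of` mapping (vertex to SCC label) are hypothetical names introduced for illustration; computing the labels themselves is the job of an SCC algorithm.

```python
def condensation(graph, component_of):
    """Build the kernel graph from `graph` (dict: vertex -> out-neighbors)
    and `component_of` (dict: vertex -> label of the SCC containing it).
    Returns a dict mapping each SCC label to the set of SCC labels it points to."""
    scc_graph = {c: set() for c in component_of.values()}
    for u in graph:
        for v in graph[u]:
            cu, cv = component_of[u], component_of[v]
            if cu != cv:  # edges inside one SCC do not appear in the kernel graph
                scc_graph[cu].add(cv)
    return scc_graph


# Hypothetical example: {1, 2, 3} form a cycle, {4, 5} form a cycle, and 3 -> 4.
graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
component_of = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
print(condensation(graph, component_of))  # {0: {1}, 1: set()}
```

Using sets for the neighbor lists collapses parallel edges between the same pair of SCCs into one.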

Kosaraju’s algorithm computes the SCCs of a graph and, by extension, G^SCC. It operates by running two DFS traversals, one on G and another on G^T, the transposed graph obtained by reversing the direction of every edge in G; in the latter, we consider vertices in order of decreasing finishing time from the first traversal.

Kosaraju’s algorithm therefore runs in O(m + n) time. As seen in the pseudocode, the first step is just DFS, which takes O(m + n) time; the second step is computing G^T, which can be done in O(m + n) time since we just reverse the direction of each edge; and the third step is also just DFS, which takes O(m + n) time.
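
The three steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the course's pseudocode: the name `kosaraju_scc` and the adjacency-list dictionary (every vertex is a key) are choices made here, and the recursion assumes graphs small enough to stay within Python's recursion limit.

```python
def kosaraju_scc(graph):
    """graph: dict mapping every vertex to a list of its out-neighbors.
    Returns a list of SCCs, each given as a list of vertices."""
    # Step 1: DFS on the original graph, recording finishing order.
    visited, finished = set(), []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finished.append(u)  # increasing order of finishing time

    for u in graph:
        if u not in visited:
            visit(u)

    # Step 2: build the transposed graph by reversing every edge.
    transpose = {u: [] for u in graph}
    for u in graph:
        for v in graph[u]:
            transpose[v].append(u)

    # Step 3: DFS on the transposed graph, starting from vertices in
    # decreasing order of finishing time; each DFS tree is one SCC.
    assigned, components = set(), []

    def collect(u, component):
        assigned.add(u)
        component.append(u)
        for v in transpose[u]:
            if v not in assigned:
                collect(v, component)

    for u in reversed(finished):
        if u not in assigned:
            component = []
            collect(u, component)
            components.append(component)
    return components


graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
print([sorted(c) for c in kosaraju_scc(graph)])  # [[1, 2, 3], [4, 5]]
```

Each step touches every vertex and edge a constant number of times, matching the O(m + n) bound above.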

Problems

Problem 1: True or False

  1. Every DAG has exactly one topological sort.
  2. If a graph has a topological sort, then a DFS traversal of the graph will not find any back edges.
  3. The finishing times of all vertices in an SCC s must be greater than the finishing times of other SCCs reachable from s during the first DFS.

Problem 2

  1. How does the number of SCCs of a graph change if a new edge is added?
  2. (Adapted from CLRS 22.5) Consider a “simpler” version of Kosaraju’s algorithm, where we use the original (instead of the transposed) graph in the second DFS traversal but process vertices in order of increasing finishing times. Is this algorithm always correct?

Problem 3

A graph G = (V, E) is “almost strongly connected” if adding a single edge makes the graph strongly connected. Design an O(|V | + |E|) algorithm to determine whether a graph is almost strongly connected.