


Introduction and Greedy Algorithms, Approximation Algorithms, NP-hard Problems, Greedy Algorithms, Interval Scheduling, Minimum Spanning Tree, Kruskal’s Algorithm, Prim’s Algorithm, Reverse-delete, Set Cover, Advanced Algorithms, Shuchi Chawla, Lecture Notes, University of Wisconsin, United States of America



CS787: Advanced Algorithms
Scribe: Shuchi Chawla
Lecturer: Shuchi Chawla
Topic: Introduction and Greedy Algorithms
Date: Sept 5, 2007
In this course we will study techniques for designing and analyzing algorithms. Undergraduate algorithms courses typically cover techniques for designing exact, efficient (polynomial-time) algorithms. The focus of this course is different. We will consider problems for which polynomial-time exact algorithms are not known, problems under stringent resource constraints, as well as problems for which the notion of optimality is not well defined. In each case, our emphasis will be on designing efficient algorithms with provable guarantees on their performance. Some topics that we will cover are as follows:
We begin with a quick revision of basic algorithmic techniques including greedy algorithms, divide & conquer, dynamic programming, network flow and basic randomized algorithms. Students are expected to have seen this material before in a basic algorithms course.
Note that sometimes we will not explicitly analyze the running times of the algorithms we discuss. However, this is an important part of algorithm analysis, and readers are highly encouraged to work out the asymptotic running times themselves.
As the name suggests, greedy algorithms solve problems by making a series of myopic decisions, each of which by itself solves some subproblem optimally, but that altogether may or may not be optimal for the problem as a whole. As a result, these algorithms are usually very easy to design but may be tricky to analyze, and don’t always lead to the optimal solution. Nevertheless, there are a few broad arguments that can be used to argue their correctness. We will demonstrate two such techniques through a few examples.
Given: $n$ jobs, each with a start and finish time $(s_i, f_i)$.
Goal: Schedule the maximum number of (non-overlapping) jobs on a single machine.
To apply the greedy approach to this problem, we will schedule jobs successively, while ensuring that no picked job overlaps with those previously scheduled. The key design element is to decide the order in which we consider jobs. There are several ways of doing so. Suppose, for example, that we pick jobs in increasing order of size (that is, shortest job first). It is easy to see that this does not necessarily lead to the optimal solution (see the figure below for a counter-example). Likewise, scheduling jobs in order of their arrivals (start times), or in increasing order of the number of conflicts that they have, also does not work.
[Figure: (a) Bad example for the shortest job first algorithm. (b) Bad example for the earliest start first algorithm. (c) Bad example for the fewest conflicts first algorithm.]
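The failure of the first two orderings can be checked on small instances. The sketch below is only illustrative: the job lists are made-up examples (not the ones drawn in the figure), the helper name `greedy_schedule` is hypothetical, and jobs are assumed to be half-open intervals, so a job may start exactly when another finishes.

```python
def greedy_schedule(jobs, key):
    """Consider jobs in the order given by `key`; keep a job only if it
    overlaps none of the jobs kept so far."""
    chosen = []
    for start, finish in sorted(jobs, key=key):
        # Two jobs are compatible if one ends before the other begins.
        if all(finish <= s or start >= f for s, f in chosen):
            chosen.append((start, finish))
    return chosen


# Shortest job first: the short middle job blocks both long jobs,
# although the two long jobs are compatible with each other.
jobs_a = [(0, 5), (4, 7), (6, 11)]
shortest_first = greedy_schedule(jobs_a, key=lambda job: job[1] - job[0])

# Earliest start first: the job that starts first spans the whole
# horizon and blocks two short compatible jobs.
jobs_b = [(0, 10), (1, 2), (3, 4)]
earliest_start = greedy_schedule(jobs_b, key=lambda job: job[0])

print(shortest_first)  # [(4, 7)]
print(earliest_start)  # [(0, 10)]
```

In both instances the ordering rule schedules one job while two non-overlapping jobs could have been scheduled.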
We will now show that picking jobs in increasing order of finish times gives the optimal solution. At a high level, our proof will employ induction to show that at any point in time the greedy solution is no worse than any partial optimal solution up to that point in time. In short, we will show that greedy always stays ahead.
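The earliest-finish-time rule can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than code from the lecture: jobs are (start, finish) pairs treated as half-open intervals, and the function name and sample data are hypothetical. Sorting dominates the cost, so the sketch runs in O(n log n) time.

```python
def earliest_finish_first(jobs):
    """Return a maximum-size set of non-overlapping jobs.

    jobs: iterable of (start, finish) pairs.
    """
    schedule = []
    last_finish = float("-inf")
    # Consider jobs in increasing order of finish time.
    for start, finish in sorted(jobs, key=lambda job: job[1]):
        # Keep the job if it starts no earlier than the last kept job ends.
        if start >= last_finish:
            schedule.append((start, finish))
            last_finish = finish
    return schedule


jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
        (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
schedule = earliest_finish_first(jobs)
print(schedule)  # [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Because the jobs are processed in order of finish time, only the finish time of the most recently kept job needs to be remembered to test for overlap.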
Theorem 1.2.1 The “earliest finish time first” algorithm described above generates an optimal schedule for the interval scheduling problem.
Proof: Consider any solution S with at least k jobs. We claim by induction on k that the greedy algorithm schedules at least k jobs and that the first k jobs in the greedy schedule finish no later
Proof: Consider any optimal solution, $T^*$, to the problem. As described above, we will transform this solution into the greedy solution $T$ produced by Kruskal’s algorithm, without increasing its length. Consider the first edge in increasing order of length, say $e$, that is in one of the trees $T$ and $T^*$ but not in the other. Then $e \in T \setminus T^*$ (convince yourself that the other case, $e \in T^* \setminus T$, is not possible). Now consider adding $e$ to the tree $T^*$, forming a unique cycle $C$. Naturally $T$ does not contain $C$, so consider the most expensive edge $e' \in C$ that is not in $T$. It is immediate that $e'$ is at least as long as $e$, by our choice of $e$, and because $e'$ belongs to one of the trees and not the other. Let $T_1^*$ be the tree $T^*$ minus the edge $e'$ plus the edge $e$. Then $T_1^*$ has total length no more than $T^*$, and is closer (in Hamming distance^1) to $T$ than $T^*$ is. Continuing in this manner, we can obtain a sequence of trees that are increasingly closer to $T$ in Hamming distance, and no worse than $T^*$ in terms of length; the last tree in this sequence is $T$ itself.
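For reference, the greedy tree $T$ in the proof can be computed as follows. This is a sketch assuming the usual formulation of Kruskal’s algorithm (scan edges in increasing order of length and keep an edge whenever it does not close a cycle). The union-find structure used for cycle detection is an implementation choice made here, and the function name, edge format and sample graph are hypothetical.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning forest.

    edges: list of (length, u, v) with vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent pointers to the component representative,
        # shortening the path along the way (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for length, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # u and v lie in different components, so the edge closes no cycle.
            parent[root_u] = root_v
            tree.append((length, u, v))
    return tree


edges = [(3, 0, 2), (1, 0, 1), (5, 1, 3), (2, 1, 2), (4, 2, 3)]
mst = kruskal(4, edges)
print(mst)                       # [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
print(sum(e[0] for e in mst))    # 7
```

On a connected graph the returned list has exactly one edge fewer than the number of vertices.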
As we mentioned earlier, greedy algorithms don’t always lead to globally optimal solutions. In the following lecture, we will discuss one such example, namely the set cover problem. Following the techniques introduced above, we will show that the greedy algorithm nevertheless produces a near-optimal solution. The set cover problem is defined as follows:
Given: A universe $U$ of $n$ elements. A collection of subsets $S_1, \dots, S_k$ of $U$.
Goal: Find the smallest collection $C$ of subsets that covers $U$, that is, $\bigcup_{S \in C} S = U$.
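This section does not spell out the greedy rule for set cover. The sketch below assumes the standard one, which repeatedly picks the subset covering the most still-uncovered elements. The function name and the sample instance are hypothetical, chosen so that the greedy answer is larger than the optimum.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of chosen subsets whose union is `universe`.

    Assumed greedy rule: always take the subset that covers the largest
    number of elements not yet covered.
    """
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(range(len(subsets)),
                   key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        cover.append(best)
        uncovered -= subsets[best]
    return cover


universe = {1, 2, 3, 4, 5, 6}
subsets = [{1, 2, 5}, {3, 4, 6}, {1, 2, 3, 4}]
cover = greedy_set_cover(universe, subsets)
print(cover)  # [2, 0, 1]
```

Here the greedy rule first takes the four-element subset and then needs both remaining subsets, for a cover of size three, while the first two subsets alone already cover the universe.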
^1 We define the Hamming distance between two trees to be the number of edges that are contained in one of the trees and not the other.