Greedy Approximations-Approximations Algorithms-Lecture 03 Notes-Computer Science, Study notes of Approximation Algorithms

Greedy Approximations: Set Cover and Min Makespan, Set Cover problem, Min Makespan Scheduling, Min Makespan Problem, Graham’s List Scheduling, Approximations Algorithms, Shuchi Chawla, Lecture Notes, University of Wisconsin, United States of America

CS880: Approximations Algorithms
Scribe: Matt Elder Lecturer: Shuchi Chawla
Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06

3.1 Set Cover

The Set Cover problem is: Given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a set of $m$ subsets of $E$, $S = \{S_1, S_2, \ldots, S_m\}$, find a “least cost” collection $C$ of sets from $S$ such that $C$ covers all elements in $E$. That is, $\bigcup_{S_i \in C} S_i = E$.

Set Cover comes in two flavors, unweighted and weighted. In unweighted Set Cover, the cost of a collection $C$ is the number of sets contained in it. In weighted Set Cover, there is a nonnegative weight function $w : S \to \mathbb{R}$, and the cost of $C$ is defined to be its total weight, i.e., $\sum_{S_i \in C} w(S_i)$.

First, we will deal with the unweighted Set Cover problem. The following algorithm is an extension of the greedy vertex cover algorithm that we discussed in Lecture 1.

Algorithm 3.1.1 Set Cover(E, S):

  1. $C \leftarrow \emptyset$.
  2. While $E$ contains elements not covered by $C$:

(a) Pick an element $e \in E$ not covered by $C$.
(b) Add all sets $S_i$ containing $e$ to $C$.
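
As a rough illustration, Algorithm 3.1.1 can be sketched in Python as follows. The function name `frequency_set_cover`, the representation of $S$ as a list of Python sets, and the scan order used to pick the next uncovered element are assumptions made for this sketch; the algorithm itself allows any uncovered element to be picked.

```python
def frequency_set_cover(elements, sets):
    """Sketch of Algorithm 3.1.1 (names and data layout are illustrative).

    elements: iterable of hashable items (the ground set E)
    sets:     list of Python sets (the family S)
    Returns (chosen, picked): indices of the sets placed in the cover C,
    and the elements e picked in step 2(a).
    """
    covered = set()
    chosen = []
    picked = []
    for e in elements:              # assumption: scan E in the given order
        if e in covered:
            continue                # e is already covered by C
        containing = [i for i, s in enumerate(sets) if e in s]
        if not containing:
            raise ValueError(f"no set contains element {e!r}")
        picked.append(e)
        for i in containing:        # step 2(b): add every set containing e
            chosen.append(i)
            covered |= sets[i]
    return chosen, picked


example_elements = [1, 2, 3, 4, 5, 6]
example_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen, picked = frequency_set_cover(example_elements, example_sets)
print(chosen, picked)
```

On this small instance the sketch returns all four sets, while the two sets {1, 2, 3} and {4, 5, 6} already cover $E$.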

To analyze Algorithm 3.1.1, we will need the following definition:

Definition 3.1.2 A set $E'$ of elements in $E$ is independent if, for all distinct $e_1, e_2 \in E'$, there is no $S_i \in S$ such that $e_1, e_2 \in S_i$.

Now, we shall determine how strong an approximation Algorithm 3.1.1 is. Say that the frequency of an element is the number of sets that contain that element. Let $F$ denote the maximum frequency across all elements. Thus, $F$ is the largest number of sets from $S$ that we might add to our cover $C$ at any step in the algorithm. The elements selected by the algorithm form an independent set: once an element $e$ is picked, every set containing $e$ is added to $C$, so any element sharing a set with $e$ is covered and is never picked later. Hence the algorithm selects no more than $F|E'|$ sets, where $E'$ is the set of elements picked in Step 2(a). That is, $\mathrm{ALG} \le F|E'|$. Every element of $E'$ is covered by some set in an optimal set cover, and no set contains two elements of an independent set, so $|E'| \le \mathrm{OPT}$ for any independent set $E'$. Thus, $\mathrm{ALG} \le F \cdot \mathrm{OPT}$, and Algorithm 3.1.1 is therefore an $F$–approximation.

Theorem 3.1.3 Algorithm 3.1.1 is an $F$–approximation to Set Cover.

Algorithm 3.1.1 is a good approximation if F is guaranteed to be small. In general, however, there could be some element contained in every set of S, and Algorithm 3.1.1 would be a very poor approximation. So, we consider a different unweighted Set Cover approximation algorithm which uses the greedy strategy to yield a ln n–approximation.

Algorithm 3.1.4 Set Cover(E, S):

  1. $C \leftarrow \emptyset$.
  2. While $E$ contains elements not covered by $C$:

(a) Find the set $S_i$ containing the greatest number of uncovered elements.
(b) Add $S_i$ to $C$.
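
The greedy rule of Algorithm 3.1.4 can be sketched in a few lines of Python. The function name `greedy_set_cover`, the list-of-sets input format, and the tie-breaking rule (the first set with maximum coverage wins) are assumptions of this sketch, not part of the algorithm as stated.

```python
def greedy_set_cover(elements, sets):
    """Sketch of Algorithm 3.1.4: repeatedly take the set that covers
    the most still-uncovered elements. Returns indices into `sets`."""
    uncovered = set(elements)
    chosen = []
    while uncovered:
        # step 2(a): index of the set with the largest uncovered overlap
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the sets do not cover every element")
        chosen.append(best)          # step 2(b)
        uncovered -= sets[best]
    return chosen


greedy_elements = [1, 2, 3, 4, 5, 6]
greedy_sets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
greedy_choice = greedy_set_cover(greedy_elements, greedy_sets)
print(greedy_choice)
```

Here the first step takes the four-element set and the second step takes {5, 6}, which is the only set covering both remaining elements.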

Theorem 3.1.5 Algorithm 3.1.4 is a $\ln \frac{n}{\mathrm{OPT}}$–approximation.

Proof: Let $k = \mathrm{OPT}$, and let $E_t$ be the set of elements not yet covered after step $t$, with $E_0 = E$. OPT covers every $E_t$ with no more than $k$ sets. ALG always picks the set covering the most elements of $E_t$ in step $t + 1$. This set must cover at least $|E_t|/k$ elements of $E_t$; if it covered fewer, no way of picking sets would be able to cover $E_t$ with $k$ sets, which contradicts the existence of OPT. So, $|E_{t+1}| \le |E_t| - |E_t|/k$, and, inductively, $|E_t| \le n(1 - 1/k)^t$.

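When $|E_t| < 1$, we know we are done, so we solve for this $t$:

$$\left(1 - \frac{1}{k}\right)^t < \frac{1}{n} \;\Rightarrow\; n < \left(\frac{k}{k-1}\right)^t \;\Rightarrow\; \ln n < t \ln \frac{k}{k-1}.$$

Since $\ln \frac{k}{k-1} > \frac{1}{k}$, this holds as soon as $t \ge k \ln n$, so at most $k \ln n = \mathrm{OPT} \ln n$ steps are needed (up to rounding).

Algorithm 3.1.4 therefore finishes within $\mathrm{OPT} \ln n$ steps, so it uses no more than that many sets. We can get a better analysis for this approximation by considering when $|E_t| \le k$, as follows: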

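$$n\left(1 - \frac{1}{k}\right)^t = k \;\Rightarrow\; n \cdot \frac{1}{e^{t/k}} = k \quad \left(\text{because } (1 - x)^{1/x} \le \frac{1}{e} \text{ for all } x \in (0, 1]\right)$$

$$\Rightarrow\; e^{t/k} = \frac{n}{k} \;\Rightarrow\; t = k \ln \frac{n}{k}.$$

Thus, after $k \ln \frac{n}{k}$ steps there remain at most $k$ elements. Each subsequent step removes at least one element, so $\mathrm{ALG} \le k \ln \frac{n}{k} + k = \mathrm{OPT} \left( \ln \frac{n}{\mathrm{OPT}} + 1 \right)$.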
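
To get a feel for the two bounds, the short Python sketch below compares $k \ln n$ with $k(\ln \frac{n}{k} + 1)$ and simulates the slowest progress the recurrence $|E_{t+1}| \le |E_t| - |E_t|/k$ allows. The function names and the values $n = 1000$, $k = 10$ are arbitrary choices for illustration, and the simulation follows the recurrence only; it does not run the algorithm on an actual Set Cover instance.

```python
import math


def coarse_bound(n, k):
    """The first bound from the proof: k * ln(n)."""
    return k * math.log(n)


def refined_bound(n, k):
    """The refined bound from the proof: k * (ln(n / k) + 1)."""
    return k * (math.log(n / k) + 1)


def worst_case_steps(n, k):
    """Steps needed if every step covers only ceil(|E_t| / k) elements,
    the least amount of progress the proof guarantees."""
    remaining, steps = n, 0
    while remaining > 0:
        remaining -= -(-remaining // k)   # integer ceil(remaining / k)
        steps += 1
    return steps


n, k = 1000, 10
print(worst_case_steps(n, k), refined_bound(n, k), coarse_bound(n, k))
```

For these values the simulated step count stays below the refined bound, which in turn is smaller than the coarse one.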

Theorem 3.1.6 If all sets are of size $\le B$, then there exists a $(\ln B + 1)$–approximation to unweighted Set Cover.

Proof: If all sets have size no greater than $B$, then $k \ge n/B$. So, $B \ge n/k$, hence $\ln \frac{n}{k} \le \ln B$, and Algorithm 3.1.4 gives a $(\ln B + 1)$–approximation.

The dots are elements, and the loops represent the sets of $S$. Each set has weight 1. The optimal solution is to take the two long sets, with a total cost of 2. If Algorithm 3.1.7 instead selects the leftmost thick set at first, then it will take at least 5 sets. This example generalizes to a family of examples each with $2^k$ elements, and shows that no analysis of Algorithm 3.1.7 will make it better than an $O(\ln n)$–approximation.
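
One standard way to build such a family is sketched below in Python; it may differ in detail from the example described above. The construction (two "long" rows plus $k$ disjoint "thick" sets, where thick set $j$ takes $2^{j-1}$ elements from each row) and all names are assumptions of this sketch. Because every set has weight 1, the sketch simply applies the rule "take the set covering the most uncovered elements"; treating that as the behavior of Algorithm 3.1.7 on unit weights is also an assumption.

```python
def tight_instance(k):
    """Hypothetical lower-bound family: returns (elements, sets) where
    sets[0] and sets[1] are the two long rows and sets[2:] are the
    thick sets of sizes 2, 4, ..., 2**k."""
    top, bottom, thick = set(), set(), []
    next_id = 0
    for j in range(1, k + 1):
        half = 2 ** (j - 1)
        t = set(range(next_id, next_id + half))             # part in top row
        b = set(range(next_id + half, next_id + 2 * half))  # part in bottom row
        next_id += 2 * half
        top |= t
        bottom |= b
        thick.append(t | b)
    return top | bottom, [top, bottom] + thick


def greedy_cover_size(elements, sets):
    """Number of sets used by the max-coverage greedy rule."""
    uncovered = set(elements)
    count = 0
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("the sets do not cover every element")
        uncovered -= best
        count += 1
    return count


tight_elements, tight_sets = tight_instance(5)
print(len(tight_elements), greedy_cover_size(tight_elements, tight_sets))
```

In this construction the largest thick set is always one element bigger than what remains of either row, so greedy takes all $k$ thick sets while the two rows alone form a cover.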

A ln n–approximation to Set Cover can also be obtained by other techniques, including LP-rounding. However, Feige showed that no improvement, even by a constant factor, is likely:

Theorem 3.1.9 There is no $(1 - \epsilon) \ln n$–approximation to Weighted Set Cover unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{\log \log n})$. [1]

3.2 Min Makespan Scheduling

The Min Makespan Problem is: given $n$ jobs to schedule on $m$ machines, where job $i$ has size $s_i$, schedule the jobs to minimize their makespan.

Definition 3.2.1 The makespan of a schedule is the earliest time when all machines have stopped doing work.

This problem is NP-hard, as can be seen by a reduction from Partition. The following algorithm due to Ron Graham yields a 2–approximation.

Algorithm 3.2.2 (Graham’s List Scheduling) [2] Given a set of n jobs and a set of m empty machine queues,

  1. Order the jobs arbitrarily.
  2. Until the job list is empty, move the next job in the list to the end of the shortest machine queue.
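
A minimal Python sketch of Algorithm 3.2.2 follows. The function name `list_schedule`, the use of a heap to find the shortest queue, and the tie-breaking by machine index are choices made for this sketch; the algorithm only requires that each job go to a currently shortest queue.

```python
import heapq


def list_schedule(sizes, m):
    """Sketch of Graham's List Scheduling.

    sizes: job sizes in list order; m: number of machines.
    Returns (makespan, assignment), where assignment[j] is the machine
    that receives job j.
    """
    loads = [(0, machine) for machine in range(m)]   # (queue length, machine)
    heapq.heapify(loads)
    assignment = []
    for size in sizes:
        load, machine = heapq.heappop(loads)          # shortest queue
        assignment.append(machine)
        heapq.heappush(loads, (load + size, machine))
    makespan = max(load for load, _ in loads)
    return makespan, assignment


makespan, assignment = list_schedule([2, 3, 4, 6, 2, 2], 3)
print(makespan, assignment)
```

On this input the three queues end with lengths 8, 5 and 6, so the schedule has makespan 8.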

Theorem 3.2.3 Graham’s List Scheduling is a 2–approximation.

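Proof: Let $s_j$ denote the size of job $j$. Suppose job $i$ is the last job to finish in a Graham’s List schedule, and let $t_i$ be the time it starts. When job $i$ was placed, its queue was no longer than any other queue, so every machine is busy until $t_i$ with jobs other than $i$, which gives $m \, t_i \le \left(\sum_{j=1}^n s_j\right) - s_i$. Thus,

$$\mathrm{ALG} = s_i + t_i \le s_i + \frac{\left(\sum_{j=1}^n s_j\right) - s_i}{m} = \frac{1}{m} \sum_{j=1}^n s_j + \left(1 - \frac{1}{m}\right) s_i.$$

It’s easy to see that $s_i \le \mathrm{OPT}$ and that $\frac{1}{m} \sum_{j=1}^n s_j \le \mathrm{OPT}$. So, we conclude that $\mathrm{ALG} \le (2 - 1/m)\,\mathrm{OPT}$, which yields a 2–approximation.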

This analysis is tight. Suppose that after the jobs are arbitrarily ordered, the job list contains m(m−1) unit-length jobs, followed by one m-length job. The algorithm yields a schedule completing in 2m − 1 units while the optimal schedule has length m.
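
The tight instance can be checked numerically with a short Python sketch. The helper names below are illustrative, and the helper tracks only queue lengths, which is enough to compute the makespan.

```python
import heapq


def list_schedule_makespan(sizes, m):
    """Makespan of list scheduling: each job goes to a shortest queue."""
    loads = [0] * m
    heapq.heapify(loads)
    for size in sizes:
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + size)
    return max(loads)


def tight_jobs(m):
    """m(m-1) unit-length jobs followed by one job of length m."""
    return [1] * (m * (m - 1)) + [m]


for m in range(2, 7):
    print(m, list_schedule_makespan(tight_jobs(m), m))
```

For each $m$ the unit jobs fill every queue to length $m - 1$ before the long job arrives, so the printed makespan is $2m - 1$.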

This algorithm can be improved. For example, by ordering the job list by decreasing duration instead of arbitrarily, we get a (4/3)–approximation, a result proved in [3]. Also, this problem has a poly-time approximation scheme (PTAS), given in [4]. However, a notable property of Algorithm 3.2.2 is that it is an online algorithm, i.e., even if the jobs arrive one after another, and we have no information about what jobs may arrive in the future, we can still use this algorithm to obtain a 2–approximation.
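
The effect of sorting can be seen in a small Python sketch that runs list scheduling on the job list sorted by decreasing size. The names `list_schedule_makespan` and `sorted_makespan`, and the two sample instances, are assumptions chosen for illustration.

```python
import heapq


def list_schedule_makespan(sizes, m):
    """Makespan of list scheduling on the jobs in the given order."""
    loads = [0] * m
    heapq.heapify(loads)
    for size in sizes:
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + size)
    return max(loads)


def sorted_makespan(sizes, m):
    """List scheduling after sorting the jobs by decreasing size."""
    return list_schedule_makespan(sorted(sizes, reverse=True), m)


unit_then_long = [1] * 6 + [3]
print(list_schedule_makespan(unit_then_long, 3), sorted_makespan(unit_then_long, 3))
print(sorted_makespan([3, 3, 2, 2, 2], 2))
```

On the first instance sorting brings the makespan from 5 down to 3. On the second instance the sorted schedule has makespan 7 while splitting the jobs as {3, 3} and {2, 2, 2} gives 6, so sorting alone does not always find the optimum. Sorting requires knowing all jobs in advance, so this variant is no longer online.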

References

[1] Uriel Feige. A Threshold of ln n for Approximating Set Cover. In J. ACM 45(4), pp 634-652. (1998)

[2] Ronald L. Graham. Bounds for Certain Multiprocessing Anomalies. In Bell System Tech. J., 45, pp 1563-1581. (1966)

[3] Ronald L. Graham. Bounds on Multiprocessing Timing Anomalies. In SIAM Journal on Applied Mathematics, 17(2), pp 416-429. (1969)

[4] Dorit S. Hochbaum, David B. Shmoys. A Polynomial Approximation Scheme for Scheduling on Uniform Processors: Using the Dual Approximation Approach. In SIAM J. Comput. 17(3), pp 539-551. (1988)

[5] D. S. Johnson. Approximation Algorithms for Combinatorial Problems. In Journal of Computer and System Sciences, 9, pp 256-278. (1974) Preliminary version in Proc. of the 5th Ann. ACM Symp. on Theory of Computing, pp 36-49. (1973)