









This lecture was delivered by Dr. Ameet Shashank at B R Ambedkar National Institute of Technology. It relates to the Data Representation and Algorithm Design course. Its main points are: Approximation Algorithms, NP-hard, Load Balancing, Makespan, List Scheduling, Analysis
Typology: Slides










Q. Suppose I need to solve an NP-hard problem. What should I do?
A. Theory says you're unlikely to find a poly-time algorithm.
Must sacrifice one of three desired features.
Solve problem to optimality.
Solve problem in poly-time.
Solve arbitrary instances of the problem.
ρ-approximation algorithm.
Guaranteed to run in poly-time.
Guaranteed to solve arbitrary instance of the problem.
Guaranteed to find solution within ratio ρ of true optimum.
Challenge. Need to prove a solution's value is close to optimum, without
even knowing what optimum value is!
Input. m identical machines; n jobs, job j has processing time $t_j$.
Job j must run contiguously on one machine.
A machine can process at most one job at a time.
Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.
Def. The makespan is the maximum load on any machine, $L = \max_i L_i$.
Load balancing. Assign each job to a machine to minimize makespan.
List-scheduling algorithm.
Consider n jobs in some fixed order.
Assign job j to machine whose load is smallest so far.
Implementation. O(n log m) using a priority queue.
List-Scheduling(m, n, t_1, t_2, …, t_n) {
   for i = 1 to m {
      L_i ← 0                 // load on machine i
      J(i) ← ∅                // jobs assigned to machine i
   }
   for j = 1 to n {
      i = argmin_k L_k        // machine i has smallest load
      J(i) ← J(i) ∪ {j}       // assign job j to machine i
      L_i ← L_i + t_j         // update load of machine i
   }
   return J(1), …, J(m)
}
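The list-scheduling pseudocode can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `list_schedule`, the 0-based machine and job indices, and the sample processing times are assumptions. A binary heap stands in for the priority queue, which gives the O(n log m) running time.

```python
import heapq

def list_schedule(m, times):
    """Greedy list scheduling: give each job, in the given order,
    to the machine whose current load is smallest."""
    # Heap entries are (load, machine index); ties go to the lower index.
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]   # J(i): jobs assigned to machine i
    loads = [0] * m                       # L_i: load on machine i
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)     # machine i has smallest load
        assignment[i].append(j)           # assign job j to machine i
        loads[i] = load + t               # update load of machine i
        heapq.heappush(heap, (loads[i], i))
    return assignment, loads

# Hypothetical instance: 3 machines, 6 jobs.
assignment, loads = list_schedule(3, [2, 3, 4, 6, 2, 2])
makespan = max(loads)
print(assignment, loads, makespan)
```

Each job costs one pop and one push on a heap of size m, so the heap is the only data structure the algorithm needs.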
Theorem. [Graham, 1966] Greedy algorithm is a 2-approximation.
First worst-case analysis of an approximation algorithm.
Need to compare resulting solution with optimal makespan L*.
Lemma 1. The optimal makespan $L^* \ge \max_j t_j$.
Pf. Some machine must process the most time-consuming job. ▪
Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.
Pf.
The total processing time is $\sum_j t_j$.
One of m machines must do at least a 1/m fraction of total work. ▪
Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load $L_i$ of bottleneck machine i.
Let j be last job scheduled on machine i.
When job j assigned to machine i, i had smallest load. Its load before assignment is $L_i - t_j$, so $L_i - t_j \le L_k$ for all 1 ≤ k ≤ m.
[Figure: timeline of machine i from 0 to $L = L_i$; the blue jobs scheduled before j end at $L_i - t_j$, then job j runs.]
Sum inequalities over all k and divide by m:
$$L_i - t_j \le \frac{1}{m} \sum_k L_k = \frac{1}{m} \sum_k t_k \le L^* \qquad \text{(Lemma 2)}$$
Now
$$L_i = (L_i - t_j) + t_j \le L^* + L^* = 2L^* \qquad \text{(Lemma 1 gives } t_j \le L^* \text{)}$$
▪
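A small numeric check of the argument: the greedy makespan never exceeds twice the larger of the two lower bounds on L*. The instance (six unit jobs followed by one job of length 3, on m = 3 machines) and the function names are illustrative assumptions.

```python
def greedy_makespan(m, times):
    """Makespan of list scheduling with jobs taken in the given order."""
    loads = [0] * m
    for t in times:
        i = loads.index(min(loads))   # machine with smallest load so far
        loads[i] += t
    return max(loads)

def lower_bound(m, times):
    """Larger of the two lower bounds on the optimal makespan L*:
    the longest job, and the average load per machine."""
    return max(max(times), sum(times) / m)

m = 3
times = [1, 1, 1, 1, 1, 1, 3]
greedy = greedy_makespan(m, times)
bound = lower_bound(m, times)
print(greedy, bound, greedy <= 2 * bound)
```

On this instance the optimum equals the lower bound (the long job alone on one machine, three unit jobs on each of the others), so the greedy schedule is worse than optimal by a factor of 5/3, still within the factor of 2.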
Q. Is our 3/2 analysis tight?
A. No.
Theorem. [Graham, 1969] LPT rule is a 4/3-approximation.
Pf. More sophisticated analysis of same algorithm.
Q. Is Graham's 4/3 analysis tight?
A. Essentially yes.
Ex: m machines, n = 2m+1 jobs: 2 jobs of each length m, m+1, …, 2m-1 and one more job of length m.
LPT yields makespan 4m-1, while the optimal makespan is 3m.
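The LPT rule (sort jobs by decreasing processing time, then list-schedule) can be tried on this family of instances. The sketch below is an illustration under assumptions: the names `lpt_makespan` and `graham_instance` are hypothetical, and the instance builder uses two jobs of each length m, …, 2m-1 plus one extra job of length m, which gives exactly 2m+1 jobs.

```python
import heapq

def lpt_makespan(m, times):
    """LPT rule: longest processing time first, then greedy list scheduling."""
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    for t in sorted(times, reverse=True):
        load, i = heapq.heappop(heap)        # least loaded machine
        heapq.heappush(heap, (load + t, i))
    return max(load for load, _ in heap)

def graham_instance(m):
    """Two jobs of each length m, ..., 2m-1 plus one more job of length m."""
    return [length for length in range(m, 2 * m) for _ in range(2)] + [m]

m = 3
jobs = graham_instance(m)        # 2m + 1 = 7 jobs
lpt = lpt_makespan(m, jobs)
optimal = 3 * m                  # e.g. (5, 4), (5, 4), (3, 3, 3) for m = 3
print(jobs, lpt, optimal)
```

The ratio (4m-1)/(3m) approaches 4/3 as m grows, which is why the 4/3 bound is essentially tight.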
Input. Set of n sites $s_1, \ldots, s_n$ and integer k > 0.
Center selection problem. Select k centers C so that maximum distance from a site to nearest center is minimized.
[Figure: example with k = 4 showing sites, centers, and the covering radius r(C).]
Notation.
dist(x, y) = distance between x and y.
$\mathrm{dist}(s_i, C) = \min_{c \in C} \mathrm{dist}(s_i, c)$ = distance from $s_i$ to closest center.
$r(C) = \max_i \mathrm{dist}(s_i, C)$ = smallest covering radius.
Goal. Find set of centers C that minimizes r(C), subject to |C| = k.
Distance function properties.
dist(x, x) = 0 (identity)
dist(x, y) = dist(y, x) (symmetry)
dist(x, y) ≤ dist(x, z) + dist(z, y) (triangle inequality)
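The notation translates directly into code. The sketch below assumes sites are points in the plane under Euclidean distance, which is one example of a valid distance function; the helper names and the sample data are hypothetical.

```python
import math

def dist_to_centers(site, centers):
    """dist(s, C): distance from a site to its closest center."""
    return min(math.dist(site, c) for c in centers)

def covering_radius(sites, centers):
    """r(C): the largest distance from any site to its closest center."""
    return max(dist_to_centers(s, centers) for s in sites)

sites = [(0, 0), (1, 0), (4, 0), (5, 0)]
centers = [(0, 0), (5, 0)]
r = covering_radius(sites, centers)
print(r)
```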
Ex: each site is a point in the plane, a center can be any point in the plane, dist(x, y) = Euclidean distance.
Remark: search can be infinite!
[Figure: sites and centers in the plane, with covering radius r(C).]
Greedy algorithm. Put the first center at the best possible location
for a single center, and then keep adding centers so as to reduce the
covering radius each time by as much as possible.
Remark: arbitrarily bad!
[Figure: example with k = 2 centers, showing the sites and the location of greedy center 1.]
Greedy algorithm. Repeatedly choose the next center to be the site
farthest from any existing center.
Observation. Upon termination all centers in C are pairwise at least r(C)
apart.
Pf. By construction of algorithm.
Greedy-Center-Selection(k, n, s_1, s_2, …, s_n) {
   C = ∅
   repeat k times {
      Select a site s_i with maximum dist(s_i, C)    // site farthest from any center
      Add s_i to C
   }
   return C
}
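A Python sketch of the farthest-site greedy rule follows. It rests on assumptions: Euclidean points as sites, the first center taken to be the first site (dist to an empty C is undefined, so any starting site works), and 1 ≤ k ≤ n. The name `greedy_centers` is hypothetical.

```python
import math

def greedy_centers(sites, k):
    """Repeatedly add the site farthest from the current centers.
    Returns the chosen centers and the covering radius r(C)."""
    centers = [sites[0]]                                  # arbitrary first center
    nearest = [math.dist(s, centers[0]) for s in sites]   # dist(s_i, C)
    while len(centers) < k:
        # Index of the site farthest from any center chosen so far.
        i = max(range(len(sites)), key=nearest.__getitem__)
        centers.append(sites[i])
        nearest = [min(d, math.dist(s, sites[i]))
                   for s, d in zip(sites, nearest)]
    return centers, max(nearest)

sites = [(0, 0), (1, 0), (4, 0), (5, 0), (10, 0)]
centers, radius = greedy_centers(sites, 3)
print(centers, radius)
```

Keeping the `nearest` list up to date means each round costs O(n) distance evaluations, for O(nk) in total.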
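Theorem. Let C* be an optimal set of centers. Then r(C) ≤ 2r(C*).
Pf. (by contradiction) Assume r(C*) < ½ r(C).
For each site $c_i$ in C, consider ball of radius ½ r(C) around it.
Exactly one $c_i^*$ in each ball; let $c_i$ be the site paired with $c_i^*$.
Consider any site s and its closest center $c_i^*$ in C*.
$$\mathrm{dist}(s, C) \le \mathrm{dist}(s, c_i) \le \mathrm{dist}(s, c_i^*) + \mathrm{dist}(c_i^*, c_i) \le 2r(C^*)$$
The middle step is the Δ-inequality; each of the two terms is ≤ r(C*) since $c_i^*$ is the closest center.
Thus r(C) ≤ 2r(C*). ▪
[Figure: sites, the optimal centers C*, and balls of radius ½ r(C) around each $c_i$, each containing one $c_i^*$; a site s is shown with its closest optimal center.]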
Pricing method. Set prices and find vertex cover simultaneously.
Weighted-Vertex-Cover-Approx(G, w) {
   foreach e in E
      p_e = 0
   while (∃ edge i-j such that neither i nor j are tight) {
      select such an edge e
      increase p_e as much as possible until i or j tight
   }
   S ← set of all tight nodes
   return S
}
Node i is tight when $\sum_{e = (i, j)} p_e = w_i$.
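The pricing method can be sketched in Python as below. This is an illustration under assumptions: integer weights (so the tightness test is exact), a graph given as a weight dictionary plus an edge list, and hypothetical names throughout. A single pass over the edges is enough here, because once a node is tight it stays tight, so no edge ever needs a second visit.

```python
def pricing_vertex_cover(weights, edges):
    """weights: dict node -> nonnegative integer weight; edges: list of (i, j)."""
    paid = {v: 0 for v in weights}   # sum of prices on edges incident to v
    price = {}
    for i, j in edges:
        if paid[i] == weights[i] or paid[j] == weights[j]:
            price[(i, j)] = 0        # an endpoint is already tight
            continue
        # Raise p_e as much as possible: until i or j becomes tight.
        delta = min(weights[i] - paid[i], weights[j] - paid[j])
        price[(i, j)] = delta
        paid[i] += delta
        paid[j] += delta
    cover = {v for v in weights if paid[v] == weights[v]}   # all tight nodes
    return cover, price

# Hypothetical 4-cycle a-b-c-d-a.
weights = {"a": 2, "b": 3, "c": 4, "d": 1}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
cover, price = pricing_vertex_cover(weights, edges)
cover_weight = sum(weights[v] for v in cover)
print(cover, price, cover_weight)
```

On this instance the cover weighs 6 while the cheapest cover, {b, d}, weighs 4; the total price paid is 4, so the cover weight is within twice the prices.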
[Figure 11. Pricing method example; labels mark the vertex weight and the price of edge a-b.]
Theorem. Pricing method is a 2-approximation.
Pf.
Algorithm terminates since at least one new node becomes tight
after each iteration of while loop.
Let S = set of all tight nodes upon termination of algorithm. S is a
vertex cover: if some edge i-j is uncovered, then neither i nor j is
tight. But then while loop would not terminate.
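Let S* be optimal vertex cover. We show w(S) ≤ 2 w(S*).
$$w(S) = \sum_{i \in S} w_i = \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in V} \sum_{e = (i, j)} p_e = 2 \sum_{e \in E} p_e \le 2\, w(S^*)$$
The second equality holds because all nodes in S are tight; the first inequality because S ⊆ V and prices ≥ 0; the last equality because each edge is counted twice; the final inequality by the fairness lemma. ▪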
Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.
[Figure: example graph on nodes A through J with vertex weights; total weight = 55.]
Integer programming formulation.
Model inclusion of each vertex i using a 0/1 variable $x_i$:
$x_i = 0$ if vertex i is not in vertex cover, $x_i = 1$ if vertex i is in vertex cover.
Vertex covers in 1-1 correspondence with 0/1 assignments: $S = \{i \in V : x_i = 1\}$.
Objective function: minimize $\sum_i w_i x_i$.
Must take either i or j: $x_i + x_j \ge 1$.
Weighted vertex cover. Integer programming formulation.
Observation. If x* is optimal solution to (ILP), then $S = \{i \in V : x^*_i = 1\}$ is a min weight vertex cover.
$$
\begin{aligned}
(ILP) \quad \min \; & \sum_{i \in V} w_i x_i \\
\text{s.t.} \; & x_i + x_j \ge 1 && (i, j) \in E \\
& x_i \in \{0, 1\} && i \in V
\end{aligned}
$$
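To make the correspondence between 0/1 assignments and vertex covers concrete, the sketch below solves (ILP) by brute force on a tiny graph. It is illustrative only: enumeration takes 2^|V| steps, and the function name and the instance are assumptions.

```python
from itertools import product

def min_weight_vertex_cover(weights, edges):
    """Try every 0/1 assignment x, keep the feasible one of least weight."""
    nodes = list(weights)
    best_value, best_cover = None, None
    for bits in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, bits))
        if all(x[i] + x[j] >= 1 for i, j in edges):         # x_i + x_j >= 1
            value = sum(weights[i] * x[i] for i in nodes)   # sum of w_i x_i
            if best_value is None or value < best_value:
                best_value = value
                best_cover = {i for i in nodes if x[i] == 1}
    return best_value, best_cover

# Hypothetical 4-cycle a-b-c-d-a.
weights = {"a": 2, "b": 3, "c": 4, "d": 1}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
value, cover = min_weight_vertex_cover(weights, edges)
print(value, cover)
```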
INTEGER-PROGRAMMING. Given integers $a_{ij}$ and $b_i$, find integers $x_j$ that satisfy:
$$
\begin{aligned}
\sum_{j=1}^{n} a_{ij} x_j &\ge b_i && 1 \le i \le m \\
x_j &\ge 0 && 1 \le j \le n \\
x_j &\ \text{integral} && 1 \le j \le n
\end{aligned}
$$
In matrix form, with an objective: max $c^t x$ s.t. $Ax \ge b$, x integral.
Observation. Vertex cover formulation proves that integer programming is NP-hard search problem, even if all coefficients are 0/1 and at most two variables per inequality.
Theorem. 2-approximation algorithm for weighted vertex cover.
Theorem. [Dinur-Safra 2001] If P ≠ NP, then no ρ-approximation for ρ < 1.3607 (= 10√5 - 21), even with unit weights.
Open research problem. Close the gap.
Input. Set of m machines M; set of n jobs J.
Job j must run contiguously on an authorized machine in $M_j \subseteq M$.
Job j has processing time $t_j$.
Each machine can process at most one job at a time.
Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.
Def. The makespan is the maximum load on any machine, $L = \max_i L_i$.
Generalized load balancing. Assign each job to an authorized machine
to minimize makespan.
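ILP formulation. $x_{ij}$ = time machine i spends processing job j.
$$
\begin{aligned}
(IP) \quad \min \; & L \\
\text{s.t.} \; & \sum_i x_{ij} = t_j && \text{for all } j \in J \\
& \sum_j x_{ij} \le L && \text{for all } i \in M \\
& x_{ij} \in \{0, t_j\} && \text{for all } j \in J \text{ and } i \in M_j \\
& x_{ij} = 0 && \text{for all } j \in J \text{ and } i \notin M_j
\end{aligned}
$$
LP relaxation.
$$
\begin{aligned}
(LP) \quad \min \; & L \\
\text{s.t.} \; & \sum_i x_{ij} = t_j && \text{for all } j \in J \\
& \sum_j x_{ij} \le L && \text{for all } i \in M \\
& x_{ij} \ge 0 && \text{for all } j \in J \text{ and } i \in M_j \\
& x_{ij} = 0 && \text{for all } j \in J \text{ and } i \notin M_j
\end{aligned}
$$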
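For intuition about what (IP) optimizes, the sketch below enumerates every integral solution of a tiny instance: each job goes whole to one of its authorized machines, which corresponds to setting $x_{ij} = t_j$ for the chosen machine and 0 elsewhere. It is exponential-time and purely illustrative; the function name and the instance data are assumptions.

```python
from itertools import product

def best_authorized_assignment(times, authorized):
    """times[j] = t_j; authorized[j] = list of machines in M_j.
    Returns (optimal makespan, dict job -> machine)."""
    jobs = list(times)
    best = None
    for choice in product(*(authorized[j] for j in jobs)):
        loads = {}
        for j, i in zip(jobs, choice):
            loads[i] = loads.get(i, 0) + times[j]   # job j runs wholly on machine i
        makespan = max(loads.values())
        if best is None or makespan < best[0]:
            best = (makespan, dict(zip(jobs, choice)))
    return best

times = {"a": 2, "b": 3, "c": 4}
authorized = {"a": [0, 1], "b": [0], "c": [0, 1]}
makespan, assignment = best_authorized_assignment(times, authorized)
print(makespan, assignment)
```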
Lemma 1. Let L be the optimal value to the LP. Then, the optimal makespan L* ≥ L.
Pf. LP has fewer constraints than IP formulation. ▪
Lemma 2. The optimal makespan $L^* \ge \max_j t_j$.
Pf. Some machine must process the most time-consuming job. ▪
Lemma 3. Let x be solution to LP. Let G(x) be the graph with an edge from machine i to job j if $x_{ij} > 0$. Then G(x) is acyclic.
Pf. (deferred)
Note: can transform x into another LP solution where G(x) is acyclic if LP solver doesn't return such an x.
[Figure: two job-machine graphs, one with G(x) acyclic and one with G(x) cyclic; an edge means $x_{ij} > 0$.]
Rounded solution. Find LP solution x where G(x) is a forest. Root
forest G(x) at some arbitrary machine node r.
If job j is a leaf node, assign j to its parent machine i.
If job j is not a leaf node, assign j to one of its children.
Lemma 4. Rounded solution only assigns jobs to authorized machines.
Pf. If job j is assigned to machine i, then $x_{ij} > 0$. LP solution can only assign positive value to authorized machines. ▪
[Figure: forest G(x) with job and machine nodes.]
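The rounding rule can be sketched as a traversal of the forest G(x). The code below is an illustration under assumptions: the LP solution is given as a dictionary keyed by (machine, job), its support is already acyclic, every job has at least one positive entry, and each tree is rooted at a machine node. The function name and the sample solution are hypothetical.

```python
def round_forest_solution(x, machines):
    """x: dict (machine, job) -> value whose support graph G(x) is a forest.
    Returns dict job -> machine following the leaf / non-leaf rule."""
    job_nbrs, machine_nbrs = {}, {}
    for (i, j), value in x.items():
        if value > 0:
            machine_nbrs.setdefault(i, []).append(j)
            job_nbrs.setdefault(j, []).append(i)
    assignment, seen = {}, set()
    for root in machines:                 # root each tree at a machine node
        if root in seen:
            continue
        seen.add(root)
        stack = [root]                    # the stack holds machine nodes only
        while stack:
            i = stack.pop()
            for j in machine_nbrs.get(i, []):
                if j in assignment:       # j is the parent job of machine i
                    continue
                children = [k for k in job_nbrs[j] if k != i]
                # Leaf job: parent machine. Non-leaf job: one of its children.
                assignment[j] = children[0] if children else i
                seen.update(children)
                stack.extend(children)
    return assignment

# Hypothetical fractional solution: job b is split between machines 0 and 1.
x = {(0, "a"): 2, (0, "b"): 1, (1, "b"): 2, (1, "c"): 3, (2, "d"): 4}
assignment = round_forest_solution(x, [0, 1, 2])
print(assignment)
```

Here job b is the only non-leaf job; it goes to its child machine 1, which also receives the leaf job c.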
Lemma 5. If job j is a leaf node and machine i = parent(j), then $x_{ij} = t_j$.
Pf. Since j is a leaf, $x_{kj} = 0$ for all machines k ≠ parent(j). LP constraint guarantees $\sum_i x_{ij} = t_j$. ▪
Lemma 6. At most one non-leaf job is assigned to a machine.
Pf. The only possible non-leaf job assigned to machine i is parent(i). ▪
[Figure: rooted forest G(x) with job and machine nodes.]
PTAS. (1 + ε)-approximation algorithm for any constant ε > 0.
Load balancing. [Hochbaum-Shmoys 1987]
Euclidean TSP. [Arora 1996]
Consequence. PTAS produces arbitrarily high quality solution, but trades
off accuracy for time.
This section. PTAS for knapsack problem via rounding and scaling.
Knapsack problem.
Given n objects and a "knapsack."
Item i has value $v_i > 0$ and weighs $w_i$ (we'll assume $w_i \le W$).
Knapsack can carry weight up to W.
Goal: fill knapsack so as to maximize total value.
Ex: { 3, 4 } has value 40.

W = 11
Item   Value   Weight
1      1       1
2      6       2
3      18      5
4      22      6
5      28      7
KNAPSACK: Given a finite set X, nonnegative weights $w_i$, nonnegative values $v_i$, a weight limit W, and a target value V, is there a subset S ⊆ X such that:
$$\sum_{i \in S} w_i \le W, \qquad \sum_{i \in S} v_i \ge V$$
SUBSET-SUM: Given a finite set X, nonnegative values $u_i$, and an integer U, is there a subset S ⊆ X whose elements sum to exactly U?
Claim. SUBSET-SUM $\le_P$ KNAPSACK.
Pf. Given instance $(u_1, \ldots, u_n, U)$ of SUBSET-SUM, create KNAPSACK instance:
$$v_i = w_i = u_i, \qquad V = W = U, \qquad \text{so that} \quad \sum_{i \in S} u_i \le U \;\text{ and }\; \sum_{i \in S} u_i \ge U$$
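The reduction is a one-line mapping, shown below together with a brute-force KNAPSACK decision procedure used only to check it on small inputs. The function names and the sample numbers are assumptions made for illustration.

```python
from itertools import combinations

def subset_sum_to_knapsack(u, U):
    """Map a SUBSET-SUM instance (u, U) to a KNAPSACK instance (w, v, W, V)."""
    return list(u), list(u), U, U        # v_i = w_i = u_i and V = W = U

def knapsack_decision(w, v, W, V):
    """Brute force: is there S with total weight <= W and total value >= V?"""
    n = len(w)
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if sum(w[i] for i in S) <= W and sum(v[i] for i in S) >= V:
                return True
    return False

u = [3, 34, 4, 12, 5, 2]
yes = knapsack_decision(*subset_sum_to_knapsack(u, 9))    # 4 + 5 = 9
no = knapsack_decision(*subset_sum_to_knapsack(u, 30))    # no subset sums to 30
print(yes, no)
```

Because weights and values coincide, the two knapsack constraints together force the chosen elements to sum to exactly U.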
Def. OPT(i, w) = max value subset of items 1, ..., i with weight limit w.
Case 1: OPT does not select item i.
Case 2: OPT selects item i; new weight limit = $w - w_i$.
$$
OPT(i, w) =
\begin{cases}
0 & \text{if } i = 0 \\
OPT(i-1, w) & \text{if } w_i > w \\
\max\{\, OPT(i-1, w),\; v_i + OPT(i-1, w - w_i) \,\} & \text{otherwise}
\end{cases}
$$
Running time. O(n W).
W = weight limit.
Not polynomial in input size!
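The recurrence fills an (n+1) by (W+1) table. A minimal Python sketch, assuming nonnegative integer weights; the function name is hypothetical, and the instance is the five-item example with W = 11 given earlier.

```python
def knapsack_by_weight(values, weights, W):
    """opt[i][w] = OPT(i, w): best value using items 1..i with weight limit w."""
    n = len(values)
    opt = [[0] * (W + 1) for _ in range(n + 1)]     # row i = 0 is all zeros
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            if w_i > w:                             # item i does not fit
                opt[i][w] = opt[i - 1][w]
            else:                                   # skip item i, or take it
                opt[i][w] = max(opt[i - 1][w], v_i + opt[i - 1][w - w_i])
    return opt[n][W]

best = knapsack_by_weight([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11)
print(best)
```

The table has n·W entries, so the running time grows with the numeric value of W rather than with the number of bits needed to write it.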
Def. OPT(i, v) = min weight subset of items 1, …, i that yields value exactly v.
Case 1: OPT does not select item i.
Case 2: OPT selects item i; consumes weight $w_i$, new value needed = $v - v_i$.
$$
OPT(i, v) =
\begin{cases}
0 & \text{if } v = 0 \\
\infty & \text{if } i = 0,\; v > 0 \\
OPT(i-1, v) & \text{if } v_i > v \\
\min\{\, OPT(i-1, v),\; w_i + OPT(i-1, v - v_i) \,\} & \text{otherwise}
\end{cases}
$$
Running time. $O(n V^*) = O(n^2 v_{\max})$, since $V^* \le n\, v_{\max}$.
V* = optimal value = maximum v such that OPT(n, v) ≤ W.
Not polynomial in input size!
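The value-indexed recurrence can be sketched the same way. This version assumes positive integer values and keeps a single row that is updated in place, with values scanned downward so each item is used at most once; the function name is hypothetical.

```python
import math

def knapsack_by_value(values, weights, W):
    """opt[v] = min weight of a subset with value exactly v; returns V*."""
    n = len(values)
    v_cap = n * max(values)                   # V* <= n * v_max
    opt = [0] + [math.inf] * v_cap            # base row: OPT(0, v)
    for v_i, w_i in zip(values, weights):
        # Descending v: opt[v - v_i] still refers to the previous item row.
        for v in range(v_cap, v_i - 1, -1):
            opt[v] = min(opt[v], w_i + opt[v - v_i])
    # V* = maximum v such that OPT(n, v) <= W.
    return max(v for v in range(v_cap + 1) if opt[v] <= W)

best = knapsack_by_value([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11)
print(best)
```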
Intuition for approximation algorithm.
Round all values up to lie in smaller range.
Run dynamic programming algorithm on rounded instance.
Return optimal items in rounded instance.

Original instance (W = 11):
Item   Value       Weight
1      934,…       1
2      5,956,…     2
3      17,810,…    5
4      21,217,…    6
5      27,343,…    7

Rounded instance (W = 11):
Item   Value   Weight
1      1       1
2      6       2
3      18      5
4      22      6
5      28      7
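Knapsack FPTAS. Round up all values:
$$\bar{v}_i = \left\lceil \frac{v_i}{\theta} \right\rceil \theta, \qquad \hat{v}_i = \left\lceil \frac{v_i}{\theta} \right\rceil$$
$v_{\max}$ = largest value in original instance
ε = precision parameter
θ = scaling factor = $\varepsilon\, v_{\max} / n$
Observation. Optimal solution to problems with $\bar{v}$ or $\hat{v}$ are equivalent.
Intuition. $\bar{v}$ close to v so optimal solution using $\bar{v}$ is nearly optimal; $\hat{v}$ small and integral so dynamic programming algorithm is fast.
Running time. $O(n^3 / \varepsilon)$.
Dynamic program II running time is $O(n^2 \hat{v}_{\max})$, where $\hat{v}_{\max} = \lceil v_{\max} / \theta \rceil = \lceil n / \varepsilon \rceil$.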
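Putting the pieces together, the sketch below scales the values by θ, runs the value-indexed dynamic program on the scaled values, and returns the chosen items with their original total value. It is a minimal illustration under assumptions: integer values and weights, exact arithmetic via `Fraction` to avoid floating-point rounding in the ceiling, and a hypothetical function name and test instance.

```python
import math
from fractions import Fraction

def knapsack_fptas(values, weights, W, eps):
    """Round values up to multiples of theta, then solve the rounded instance."""
    n = len(values)
    theta = Fraction(eps) * max(values) / n              # scaling factor
    scaled = [math.ceil(Fraction(v) / theta) for v in values]   # v-hat
    v_cap = sum(scaled)
    # best[v] = (min weight, items) achieving scaled value exactly v.
    best = [(0, ())] + [(math.inf, ())] * v_cap
    for i in range(n):
        for v in range(v_cap, scaled[i] - 1, -1):        # descending: item used once
            weight, items = best[v - scaled[i]]
            if weight + weights[i] < best[v][0]:
                best[v] = (weight + weights[i], items + (i,))
    top = max(v for v in range(v_cap + 1) if best[v][0] <= W)
    chosen = best[top][1]
    return chosen, sum(values[i] for i in chosen)

chosen, value = knapsack_fptas([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11,
                               Fraction(1, 2))
print(chosen, value)
```

With ε = 1/2 the scaled values are 1, 3, 7, 8, 10, and the rounded instance selects the items at 0-based indices 2 and 3, which here is also the true optimum of value 40.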