Load Balancing-Constructing Algorithms and Representing Data-Lecture Slides, Slides of Data Representation and Algorithm Design

This lecture was delivered by Dr. Ameet Shashank at B R Ambedkar National Institute of Technology. It relates to the Data Representation and Algorithm Design course. Its main points are: approximation algorithms, NP-hardness, load balancing, makespan, list scheduling, and analysis.

Typology: Slides

2011/2012

Uploaded on 07/15/2012

Chapter 11: Approximation Algorithms
Approximation Algorithms
Q. Suppose I need to solve an NP-hard problem. What should I do?
A. Theory says you're unlikely to find a poly-time algorithm.
Must sacrifice one of three desired features.
• Solve problem to optimality.
• Solve problem in poly-time.
• Solve arbitrary instances of the problem.
ρ-approximation algorithm.
• Guaranteed to run in poly-time.
• Guaranteed to solve arbitrary instance of the problem.
• Guaranteed to find solution within ratio ρ of true optimum.
Challenge. Need to prove a solution's value is close to optimum, without even knowing what the optimum value is!
11.1 Load Balancing
Load Balancing
Input. m identical machines; n jobs, job j has processing time $t_j$.
• Job j must run contiguously on one machine.
• A machine can process at most one job at a time.
Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.
Def. The makespan is the maximum load on any machine: $L = \max_i L_i$.
Load balancing. Assign each job to a machine to minimize makespan.
Load Balancing: List Scheduling

List-scheduling algorithm.
• Consider n jobs in some fixed order.
• Assign job j to machine whose load is smallest so far.

List-Scheduling(m, n, t_1, t_2, …, t_n) {
   for i = 1 to m {
      L_i ← 0                  (load on machine i)
      J(i) ← ∅                 (jobs assigned to machine i)
   }
   for j = 1 to n {
      i = argmin_k L_k         (machine i has smallest load)
      J(i) ← J(i) ∪ {j}        (assign job j to machine i)
      L_i ← L_i + t_j          (update load of machine i)
   }
   return J(1), …, J(m)
}

Implementation. O(n log m) using a priority queue.
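The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the lecturer's code: the function name `list_schedule`, the 0-based machine and job indices, and the sample processing times are assumptions made for the example. The priority queue is Python's `heapq`, which gives the O(n log m) bound mentioned on the slide.

```python
import heapq

def list_schedule(m, times):
    """Greedy list scheduling: give each job to the currently least-loaded machine.

    m     -- number of identical machines
    times -- processing times t_j, in the fixed order the jobs are considered
    Returns (assignment, loads); assignment[i] lists the job indices on machine i.
    """
    assignment = [[] for _ in range(m)]
    loads = [0] * m
    # Min-heap of (load, machine); ties go to the lower machine index.
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)        # machine i has smallest load
        assignment[i].append(j)              # assign job j to machine i
        loads[i] = load + t                  # update load of machine i
        heapq.heappush(heap, (loads[i], i))
    return assignment, loads

# Hypothetical instance: 3 machines, 6 jobs.
assignment, loads = list_schedule(3, [2, 3, 4, 6, 2, 2])
makespan = max(loads)   # 8 for this order; a better schedule with makespan 7 exists
```

The result depends on the order in which jobs are listed, which is why the analysis that follows only claims a worst-case ratio.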

Load Balancing: List Scheduling Analysis

Theorem. [Graham, 1966] Greedy algorithm is a 2-approximation.
• First worst-case analysis of an approximation algorithm.
• Need to compare resulting solution with optimal makespan L*.

Lemma 1. The optimal makespan $L^* \ge \max_j t_j$.
Pf. Some machine must process the most time-consuming job. ▪

Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.
Pf.
• The total processing time is $\sum_j t_j$.
• One of m machines must do at least a 1/m fraction of total work. ▪

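Load Balancing: List Scheduling Analysis

Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load $L_i$ of bottleneck machine i.
• Let j be last job scheduled on machine i.
• When job j assigned to machine i, i had smallest load. Its load before assignment is $L_i - t_j$ ⇒ $L_i - t_j \le L_k$ for all 1 ≤ k ≤ m.
• Sum inequalities over all k and divide by m:

$$L_i - t_j \;\le\; \frac{1}{m} \sum_k L_k \;=\; \frac{1}{m} \sum_k t_k \;\le\; L^* \qquad \text{(Lemma 2)}$$

• Now $L_i = (L_i - t_j) + t_j \le L^* + t_j \le 2L^*$, where $t_j \le L^*$ by Lemma 1. ▪

[Figure: timeline of machine i from 0 to L = $L_i$; the blue jobs scheduled before j end at $L_i - t_j$, followed by job j.]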
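As a quick check of the chain of inequalities, consider a hypothetical instance (not from the slides) with m = 2 machines and jobs of length 1, 1, 2 considered in that order. Greedy puts the first two jobs on separate machines and the third on machine 1, so the bottleneck machine has $L_i = 3$ and its last job has $t_j = 2$:

```latex
\begin{align*}
L_i - t_j &= 3 - 2 = 1 \;\le\; \tfrac{1}{m}\sum_k t_k = \tfrac{1}{2}(1 + 1 + 2) = 2 \;\le\; L^* = 2, \\
L_i &= (L_i - t_j) + t_j = 1 + 2 = 3 \;\le\; 2L^* = 4.
\end{align*}
```

Here the optimum $L^* = 2$ places the two unit jobs together, so greedy is off by a factor of 3/2 on this instance, within the bound of 2.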

Load Balancing: LPT Rule

Q. Is our 3/2 analysis tight?
A. No.

Theorem. [Graham, 1969] LPT rule is a 4/3-approximation.
Pf. More sophisticated analysis of same algorithm.

Q. Is Graham's 4/3 analysis tight?
A. Essentially yes.

Ex: m machines, n = 2m+1 jobs: 2 jobs of each length m, m+1, …, 2m-1 and one more job of length m.
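The tight example can be checked numerically. The sketch below assumes the usual definition of the LPT rule (sort jobs in decreasing order of processing time, then run list scheduling), since the slide defining it is not part of this preview. It also assumes the instance has two jobs of each length m, …, 2m-1 plus one extra job of length m, which is what makes the count n = 2m+1 work out. The names `lpt_makespan` and `tight_instance` are hypothetical.

```python
import heapq

def lpt_makespan(m, times):
    """Makespan of list scheduling after sorting jobs longest-first (LPT rule)."""
    loads = [0] * m                       # a list of equal values is already a valid heap
    for t in sorted(times, reverse=True):
        smallest = heapq.heappop(loads)   # least-loaded machine
        heapq.heappush(loads, smallest + t)
    return max(loads)

def tight_instance(m):
    """Two jobs of each length m, ..., 2m-1, plus one more job of length m."""
    return [length for length in range(m, 2 * m) for _ in range(2)] + [m]

m = 3
jobs = tight_instance(m)      # [3, 3, 4, 4, 5, 5, 3]
lpt = lpt_makespan(m, jobs)   # 11 = 4m - 1
opt = 3 * m                   # 9, e.g. {5,4}, {5,4}, {3,3,3}
```

The ratio (4m - 1) / (3m) approaches 4/3 as m grows, which is the sense in which the 4/3 analysis is essentially tight.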

11.2 Center Selection

Center Selection Problem

Input. Set of n sites $s_1, \dots, s_n$ and integer k > 0.

Center selection problem. Select k centers C so that maximum distance from a site to nearest center is minimized.

Notation.
• dist(x, y) = distance between x and y.
• dist($s_i$, C) = $\min_{c \in C}$ dist($s_i$, c) = distance from $s_i$ to closest center.
• r(C) = $\max_i$ dist($s_i$, C) = smallest covering radius.

Goal. Find set of centers C that minimizes r(C), subject to |C| = k.

Distance function properties.
• dist(x, x) = 0 (identity)
• dist(x, y) = dist(y, x) (symmetry)
• dist(x, y) ≤ dist(x, z) + dist(z, y) (triangle inequality)

Center Selection Example

Ex: each site is a point in the plane, a center can be any point in the plane, dist(x, y) = Euclidean distance.

Remark: search can be infinite!

[Figure: sites and centers in the plane, with covering radius r(C).]

Greedy Algorithm: A False Start

Greedy algorithm. Put the first center at the best possible location for a single center, and then keep adding centers so as to reduce the covering radius each time by as much as possible.

Remark: arbitrarily bad!

[Figure: sites, centers, and the position of greedy center 1 for k = 2 centers.]

Center Selection: Greedy Algorithm

Greedy algorithm. Repeatedly choose the next center to be the site farthest from any existing center.

Greedy-Center-Selection(k, n, s_1, s_2, …, s_n) {
   C = ∅
   repeat k times {
      Select a site s_i with maximum dist(s_i, C)     (site farthest from any center)
      Add s_i to C
   }
   return C
}

Observation. Upon termination all centers in C are pairwise at least r(C) apart.
Pf. By construction of algorithm.
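A minimal Python sketch of the greedy rule, assuming sites are points in the plane with Euclidean distance and 1 ≤ k ≤ n. The function name `greedy_centers`, the choice of the first site as the first center (any site is farthest from the empty set), and the sample coordinates are assumptions made for illustration.

```python
import math

def greedy_centers(sites, k, dist=math.dist):
    """Pick k centers among the sites; each new center is the site farthest from C."""
    centers = [sites[0]]                 # every site is equally far from the empty set
    # d[i] = dist(s_i, C): distance from site i to its closest chosen center
    d = [dist(s, centers[0]) for s in sites]
    while len(centers) < k:
        far = max(range(len(sites)), key=lambda i: d[i])   # site farthest from any center
        centers.append(sites[far])
        d = [min(d[i], dist(s, sites[far])) for i, s in enumerate(sites)]
    return centers, max(d)               # max(d) is the covering radius r(C)

# Hypothetical sites: two pairs on a line and one point above them.
sites = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
centers, radius = greedy_centers(sites, 3)
# centers == [(0, 0), (11, 0), (5, 8)], radius == 1.0
```

Keeping the array `d` up to date makes each round O(n), so the whole selection takes O(nk) distance evaluations.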

Center Selection: Analysis of Greedy Algorithm

Theorem. Let C* be an optimal set of centers. Then r(C) ≤ 2r(C*).
Pf. (by contradiction) Assume r(C*) < ½ r(C).
• For each site $c_i$ in C, consider ball of radius ½ r(C) around it.
• Exactly one $c_i^*$ in each ball; let $c_i$ be the site paired with $c_i^*$.
• Consider any site s and its closest center $c_i^*$ in C*.
• dist(s, C) ≤ dist(s, $c_i$) ≤ dist(s, $c_i^*$) + dist($c_i^*$, $c_i$) ≤ 2r(C*). The middle step is the Δ-inequality; each of the two terms is ≤ r(C*) since $c_i^*$ is the closest center.
• Thus r(C) ≤ 2r(C*). ▪

[Figure: sites and the centers of C*, with a ball of radius ½ r(C) around each $c_i$; each ball contains one $c_i^*$, and s is a site served by $c_i^*$.]

Pricing Method

Pricing method. Set prices and find vertex cover simultaneously.

Weighted-Vertex-Cover-Approx(G, w) {
   foreach e in E
      p_e = 0
   while (∃ edge i-j such that neither i nor j are tight) {
      select such an edge e
      increase p_e as much as possible until i or j tight
   }
   S ← set of all tight nodes
   return S
}

Node i is tight when $\sum_{e = (i, j)} p_e = w_i$.
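The loop can be sketched in Python as below. This is an illustration under stated assumptions rather than the slides' own code: a node i counts as tight when the prices on its incident edges sum to $w_i$, weights are integers so that the equality test is exact, and the names `pricing_vertex_cover`, `paid`, and the sample graph are hypothetical. A single pass over the edges is enough because a tight node stays tight once prices only increase.

```python
def pricing_vertex_cover(weights, edges):
    """Weighted vertex cover via the pricing method.

    weights -- dict node -> integer weight w_i >= 0
    edges   -- list of (i, j) pairs
    Returns (cover, price) where price maps each edge to p_e.
    """
    price = {e: 0 for e in edges}
    paid = {v: 0 for v in weights}       # sum of prices on edges incident to v
    for e in edges:                      # one pass suffices: tight nodes stay tight
        i, j = e
        if paid[i] < weights[i] and paid[j] < weights[j]:   # neither endpoint is tight
            raise_by = min(weights[i] - paid[i], weights[j] - paid[j])
            price[e] += raise_by         # increase p_e until i or j becomes tight
            paid[i] += raise_by
            paid[j] += raise_by
    cover = {v for v in weights if paid[v] == weights[v]}   # all tight nodes
    return cover, price

# Hypothetical graph with four nodes and five edges.
weights = {"a": 4, "b": 3, "c": 3, "d": 5}
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "d")]
cover, price = pricing_vertex_cover(weights, edges)
# cover == {"a", "b", "c"} with weight 10; the cheapest cover {"a", "c"} weighs 7
```

On this instance the prices sum to 6, which is below the optimal weight 7, and the returned cover weighs 10, below twice the optimum.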

Pricing Method

[Figure 11. Example run of the pricing method, showing vertex weights and the price of edge a-b.]

Pricing Method: Analysis

Theorem. Pricing method is a 2-approximation.
Pf.
• Algorithm terminates since at least one new node becomes tight after each iteration of while loop.
• Let S = set of all tight nodes upon termination of algorithm. S is a vertex cover: if some edge i-j is uncovered, then neither i nor j is tight. But then while loop would not terminate.
• Let S* be optimal vertex cover. We show w(S) ≤ 2 w(S*).

$$w(S) = \sum_{i \in S} w_i = \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in V} \sum_{e = (i, j)} p_e = 2 \sum_{e \in E} p_e \le 2\, w(S^*).$$

The second equality holds because all nodes in S are tight; the first inequality because S ⊆ V and prices ≥ 0; the next equality because each edge is counted twice; the last inequality by the fairness lemma. ▪

11.6 LP Rounding: Vertex Cover

Weighted Vertex Cover

Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.

[Figure: example graph on nodes A to J with vertex weights; the highlighted cover has total weight = 55.]

Weighted Vertex Cover: IP Formulation

Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.

Integer programming formulation.
• Model inclusion of each vertex i using a 0/1 variable $x_i$:
   $x_i = 0$ if vertex i is not in vertex cover
   $x_i = 1$ if vertex i is in vertex cover
  Vertex covers in 1-1 correspondence with 0/1 assignments: S = {i ∈ V : $x_i = 1$}.
• Objective function: minimize $\sum_i w_i x_i$.
• Must take either i or j: $x_i + x_j \ge 1$.

Weighted Vertex Cover: IP Formulation

Weighted vertex cover. Integer programming formulation.

$$
\begin{array}{lll}
(ILP) \quad \min & \sum_{i \in V} w_i x_i \\
\text{s.t.} & x_i + x_j \ge 1 & (i, j) \in E \\
& x_i \in \{0, 1\} & i \in V
\end{array}
$$

Observation. If x* is optimal solution to (ILP), then S = {i ∈ V : $x^*_i = 1$} is a min weight vertex cover.
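To make the correspondence between 0/1 assignments and vertex covers concrete, the sketch below solves (ILP) on a tiny hypothetical graph by enumerating every assignment. It is exponential in |V| and is only meant to illustrate the formulation; the function name `min_weight_vertex_cover` and the sample graph are assumptions, not part of the slides.

```python
from itertools import product

def min_weight_vertex_cover(weights, edges):
    """Solve (ILP) by brute force over all 0/1 assignments (tiny graphs only)."""
    nodes = list(weights)
    best_value, best_cover = None, None
    for bits in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, bits))
        if all(x[i] + x[j] >= 1 for i, j in edges):          # covering constraints
            value = sum(weights[i] * x[i] for i in nodes)    # objective value
            if best_value is None or value < best_value:
                best_value = value
                best_cover = {i for i in nodes if x[i] == 1}
    return best_value, best_cover

# Hypothetical graph with four nodes and five edges.
weights = {"a": 4, "b": 3, "c": 3, "d": 5}
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "d")]
value, cover = min_weight_vertex_cover(weights, edges)
# value == 7, cover == {"a", "c"}
```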

Integer Programming

INTEGER-PROGRAMMING. Given integers $a_{ij}$ and $b_i$, find integers $x_j$ that satisfy:

$$
\begin{array}{ll}
\sum_{j=1}^{n} a_{ij} x_j \ge b_i & 1 \le i \le m \\
x_j \ge 0 & 1 \le j \le n \\
x_j \text{ integral} & 1 \le j \le n
\end{array}
$$

In matrix form: max $c^t x$ s.t. $Ax \ge b$, x integral.

Observation. Vertex cover formulation proves that integer programming is an NP-hard search problem, even if all coefficients are 0/1 and at most two variables per inequality.

Weighted Vertex Cover

Theorem. 2-approximation algorithm for weighted vertex cover.

Theorem. [Dinur-Safra 2001] If P ≠ NP, then no ρ-approximation for ρ < 1.3607 (= 10√5 - 21), even with unit weights.

Open research problem. Close the gap.

* 11.7 Load Balancing Reloaded

Generalized Load Balancing

Input. Set of m machines M; set of n jobs J.
• Job j must run contiguously on an authorized machine in $M_j \subseteq M$.
• Job j has processing time $t_j$.
• Each machine can process at most one job at a time.

Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.

Def. The makespan is the maximum load on any machine = $\max_i L_i$.

Generalized load balancing. Assign each job to an authorized machine to minimize makespan.

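Generalized Load Balancing: Integer Linear Program and Relaxation

ILP formulation. $x_{ij}$ = time machine i spends processing job j.

$$
\begin{array}{llll}
(IP) & \min & L \\
& \text{s.t.} & \sum_i x_{ij} = t_j & \text{for all } j \in J \\
& & \sum_j x_{ij} \le L & \text{for all } i \in M \\
& & x_{ij} \in \{0, t_j\} & \text{for all } j \in J \text{ and } i \in M_j \\
& & x_{ij} = 0 & \text{for all } j \in J \text{ and } i \notin M_j
\end{array}
$$

LP relaxation.

$$
\begin{array}{llll}
(LP) & \min & L \\
& \text{s.t.} & \sum_i x_{ij} = t_j & \text{for all } j \in J \\
& & \sum_j x_{ij} \le L & \text{for all } i \in M \\
& & x_{ij} \ge 0 & \text{for all } j \in J \text{ and } i \in M_j \\
& & x_{ij} = 0 & \text{for all } j \in J \text{ and } i \notin M_j
\end{array}
$$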

Generalized Load Balancing: Lower Bounds

Lemma 1. Let L be the optimal value to the LP. Then, the optimal makespan L* ≥ L.
Pf. LP has fewer constraints than IP formulation. ▪

Lemma 2. The optimal makespan $L^* \ge \max_j t_j$.
Pf. Some machine must process the most time-consuming job. ▪

Generalized Load Balancing: Structure of LP Solution

Lemma 3. Let x be solution to LP. Let G(x) be the graph with an edge from machine i to job j if $x_{ij} > 0$. Then G(x) is acyclic.
Pf. (deferred)

Note. Can transform x into another LP solution where G(x) is acyclic if LP solver doesn't return such an x.

[Figure: two bipartite graphs on job and machine nodes, with an edge wherever $x_{ij} > 0$; one G(x) is acyclic, the other cyclic.]

Generalized Load Balancing: Rounding

Rounded solution. Find LP solution x where G(x) is a forest. Root forest G(x) at some arbitrary machine node r.
• If job j is a leaf node, assign j to its parent machine i.
• If job j is not a leaf node, assign j to one of its children.

Lemma 4. Rounded solution only assigns jobs to authorized machines.
Pf. If job j is assigned to machine i, then $x_{ij} > 0$. LP solution can only assign positive value to authorized machines. ▪

[Figure: rooted forest G(x) with job and machine nodes.]

Generalized Load Balancing: Analysis

Lemma 5. If job j is a leaf node and machine i = parent(j), then $x_{ij} = t_j$.
Pf. Since j is a leaf, $x_{kj} = 0$ for all machines k ≠ parent(j). LP constraint guarantees $\sum_k x_{kj} = t_j$, so $x_{ij} = t_j$. ▪

Lemma 6. At most one non-leaf job is assigned to a machine.
Pf. The only possible non-leaf job assigned to machine i is parent(i). ▪

[Figure: rooted forest G(x) with job and machine nodes.]

11.8 Knapsack Problem

Polynomial Time Approximation Scheme

PTAS. (1 + ε)-approximation algorithm for any constant ε > 0.
• Load balancing. [Hochbaum-Shmoys 1987]
• Euclidean TSP. [Arora 1996]

Consequence. PTAS produces arbitrarily high quality solution, but trades off accuracy for time.

This section. PTAS for knapsack problem via rounding and scaling.

Knapsack Problem

Knapsack problem.
• Given n objects and a "knapsack."
• Item i has value $v_i > 0$ and weighs $w_i$.
• Knapsack can carry weight up to W.
• Goal: fill knapsack so as to maximize total value.

Ex: { 3, 4 } has value 40.

W = 11 (we'll assume $w_i \le W$)

Item   Value   Weight
1      1       1
2      6       2
3      18      5
4      22      6
5      28      7

Knapsack is NP-Complete

KNAPSACK: Given a finite set X, nonnegative weights $w_i$, nonnegative values $v_i$, a weight limit W, and a target value V, is there a subset S ⊆ X such that:

$$\sum_{i \in S} w_i \le W, \qquad \sum_{i \in S} v_i \ge V$$

SUBSET-SUM: Given a finite set X, nonnegative values $u_i$, and an integer U, is there a subset S ⊆ X whose elements sum to exactly U?

Claim. SUBSET-SUM $\le_P$ KNAPSACK.
Pf. Given instance ($u_1, \dots, u_n$, U) of SUBSET-SUM, create KNAPSACK instance:

$$v_i = w_i = u_i, \qquad V = W = U$$

so that the two KNAPSACK conditions become

$$\sum_{i \in S} u_i \le U, \qquad \sum_{i \in S} u_i \ge U.$$
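The reduction is short enough to write out. In the sketch below, `subset_sum_to_knapsack` builds the KNAPSACK instance exactly as in the proof, and `knapsack_decision` is a brute-force checker added only so the example can be run; both names and the sample numbers are hypothetical.

```python
from itertools import combinations

def subset_sum_to_knapsack(u, U):
    """Build the KNAPSACK instance (weights, values, W, V) with w_i = v_i = u_i, W = V = U."""
    return list(u), list(u), U, U

def knapsack_decision(weights, values, W, V):
    """Brute force: is there a subset S with total weight <= W and total value >= V?"""
    n = len(weights)
    for r in range(n + 1):
        for S in combinations(range(n), r):
            if sum(weights[i] for i in S) <= W and sum(values[i] for i in S) >= V:
                return True
    return False

# 5 + 7 = 12, so the answer is yes; no subset of {3, 5, 7} sums to 11.
yes = knapsack_decision(*subset_sum_to_knapsack([3, 5, 7], 12))   # True
no = knapsack_decision(*subset_sum_to_knapsack([3, 5, 7], 11))    # False
```

Because weights and values coincide, "weight at most U" and "value at least U" together force the chosen elements to sum to exactly U.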


Knapsack Problem: Dynamic Programming I

Def. OPT(i, w) = max value subset of items 1, …, i with weight limit w.
• Case 1: OPT does not select item i.
   - OPT selects best of 1, …, i-1 using up to weight limit w
• Case 2: OPT selects item i.
   - new weight limit = w - $w_i$
   - OPT selects best of 1, …, i-1 using up to weight limit w - $w_i$

$$
OPT(i, w) =
\begin{cases}
0 & \text{if } i = 0 \\
OPT(i-1, w) & \text{if } w_i > w \\
\max \{\, OPT(i-1, w),\; v_i + OPT(i-1, w - w_i) \,\} & \text{otherwise}
\end{cases}
$$

Running time. O(n W).
• W = weight limit.
• Not polynomial in input size!
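The recurrence translates directly into a table-filling loop. The sketch below assumes integer weights and uses the five-item instance from the Knapsack Problem slide (values 1, 6, 18, 22, 28; weights 1, 2, 5, 6, 7; W = 11); the function name `knapsack_by_weight` is hypothetical.

```python
def knapsack_by_weight(values, weights, W):
    """Fill the OPT(i, w) table and return the max value within weight limit W."""
    n = len(values)
    opt = [[0] * (W + 1) for _ in range(n + 1)]       # row 0: OPT(0, w) = 0
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            if w_i > w:                               # item i does not fit
                opt[i][w] = opt[i - 1][w]
            else:                                     # skip item i, or take it
                opt[i][w] = max(opt[i - 1][w], v_i + opt[i - 1][w - w_i])
    return opt[n][W]

best = knapsack_by_weight([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11)
# best == 40, achieved by items {3, 4}
```

The table has (n + 1)(W + 1) entries and each takes constant time, which is the O(n W) bound; it is pseudo-polynomial because W is written with only log W bits.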

Knapsack Problem: Dynamic Programming II

Def. OPT(i, v) = min weight subset of items 1, …, i that yields value exactly v.
• Case 1: OPT does not select item i.
   - OPT selects best of 1, …, i-1 that achieves exactly value v
• Case 2: OPT selects item i.
   - consumes weight $w_i$, new value needed = v - $v_i$
   - OPT selects best of 1, …, i-1 that achieves exactly value v - $v_i$

$$
OPT(i, v) =
\begin{cases}
0 & \text{if } v = 0 \\
\infty & \text{if } i = 0,\ v > 0 \\
OPT(i-1, v) & \text{if } v_i > v \\
\min \{\, OPT(i-1, v),\; w_i + OPT(i-1, v - v_i) \,\} & \text{otherwise}
\end{cases}
$$

Running time. O(n V*) = O($n^2 v_{max}$), since V* ≤ n $v_{max}$.
• V* = optimal value = maximum v such that OPT(n, v) ≤ W.
• Not polynomial in input size!
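A sketch of the value-indexed table, assuming integer values. Since V* is not known in advance, this version sizes the table by the sum of all values, which is at most n times the largest value; that choice, the function name `knapsack_by_value`, and the use of the five-item instance from the Knapsack Problem slide are assumptions made for the example.

```python
import math

def knapsack_by_value(values, weights, W):
    """OPT(i, v) = min weight of a subset of items 1..i with value exactly v."""
    n = len(values)
    v_total = sum(values)                  # upper bound on V*, at most n * v_max
    opt = [[math.inf] * (v_total + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        opt[i][0] = 0                      # value 0 needs no items
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for v in range(1, v_total + 1):
            if v_i > v:                    # item i alone already exceeds v
                opt[i][v] = opt[i - 1][v]
            else:                          # skip item i, or take it
                opt[i][v] = min(opt[i - 1][v], w_i + opt[i - 1][v - v_i])
    # V* = maximum v such that OPT(n, v) <= W
    return max(v for v in range(v_total + 1) if opt[n][v] <= W)

best = knapsack_by_value([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11)
# best == 40, matching the weight-indexed dynamic program
```

The running time now depends on the magnitude of the values rather than on W, which is what the rounding step of the FPTAS exploits.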

Knapsack: FPTAS

Intuition for approximation algorithm.
• Round all values up to lie in smaller range.
• Run dynamic programming algorithm on rounded instance.
• Return optimal items in rounded instance.

Original instance (W = 11):

Item   Value      Weight
1      934,…      1
2      5,956,…    2
3      17,810,…   5
4      21,217,…   6
5      27,343,…   7

Rounded instance (W = 11):

Item   Value   Weight
1      1       1
2      6       2
3      18      5
4      22      6
5      28      7

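Knapsack: FPTAS

Knapsack FPTAS. Round up all values:

$$\bar{v}_i = \left\lceil \frac{v_i}{\theta} \right\rceil \theta, \qquad \hat{v}_i = \left\lceil \frac{v_i}{\theta} \right\rceil$$

   - $v_{max}$ = largest value in original instance
   - ε = precision parameter
   - θ = scaling factor = ε $v_{max}$ / n

Observation. Optimal solution to problems with $\bar{v}$ or $\hat{v}$ are equivalent.

Intuition. $\bar{v}$ close to v so optimal solution using $\bar{v}$ is nearly optimal; $\hat{v}$ small and integral so dynamic programming algorithm is fast.

Running time. O($n^3$ / ε).
• Dynamic program II running time is O($n^2 \hat{v}_{max}$), where

$$\hat{v}_{max} = \left\lceil \frac{v_{max}}{\theta} \right\rceil = \left\lceil \frac{n}{\varepsilon} \right\rceil$$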
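Putting the rounding and the value-indexed dynamic program together gives the following sketch. It is an illustration under assumptions, not the slides' code: the name `knapsack_fptas` is hypothetical, the scaling factor is held as an exact `Fraction` so that the ceilings are not disturbed by floating-point error, and the table is kept as a single row that stores the chosen items alongside each minimum weight.

```python
import math
from fractions import Fraction

def knapsack_fptas(values, weights, W, eps):
    """Round values up, scale by theta, and solve the rounded instance exactly.

    Returns (chosen item indices, their total ORIGINAL value).
    """
    n = len(values)
    theta = Fraction(eps) * max(values) / n              # scaling factor eps * v_max / n
    v_hat = [math.ceil(v / theta) for v in values]       # small integral values
    total = sum(v_hat)
    # best[v] = (min weight, items) over subsets with rounded value exactly v
    best = [(0, [])] + [(math.inf, [])] * total
    for i in range(n):
        for v in range(total, v_hat[i] - 1, -1):         # downward so item i is used once
            w_prev, items_prev = best[v - v_hat[i]]
            if w_prev + weights[i] < best[v][0]:
                best[v] = (w_prev + weights[i], items_prev + [i])
    v_star = max(v for v in range(total + 1) if best[v][0] <= W)
    chosen = best[v_star][1]
    return chosen, sum(values[i] for i in chosen)

items, value = knapsack_fptas([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11, 0.5)
# items == [2, 3] (0-based, i.e. items 3 and 4), value == 40
```

With ε = 0.5 the rounded values here are 1, 3, 7, 8, 10, so the table has 30 columns instead of one per unit of original value; a smaller ε makes the table larger and the rounding finer.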