Greedy Algorithms - Data Structures - Lecture Slides, Slides of Data Structures and Algorithms


Typology: Slides

2012/2013

Uploaded on 04/30/2013

dinpal
Greedy Algorithms

Greedy Algorithms:

Many real-world problems are optimization problems: they attempt to find an optimal solution among many possible candidate solutions. A familiar scenario is the change-making problem that we often encounter at a cash register: receiving the fewest number of coins as change after paying the bill for a purchase. For example, if the purchase is worth $5.27 and the customer pays with $6, how many coins, and which coins, should the cash register return?

The Make-Change algorithm:

For a given amount (e.g. $0.73), use as many quarters ($0.25) as possible without exceeding the amount. Use as many dimes ($0.10) as possible for the remainder, then as many nickels ($0.05) as possible. Finally, use pennies ($0.01) for the rest.
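The Make-Change steps can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name make_change and the choice to work in integer cents (to avoid floating-point rounding) are assumptions.

```python
def make_change(cents, denominations=(25, 10, 5, 1)):
    """Greedy change-making: take as many of the largest coin as fit, then move on.

    `cents` is the amount in integer cents; `denominations` must be listed
    from largest to smallest. Returns a dict mapping coin value -> count.
    """
    change = {}
    remaining = cents
    for coin in denominations:
        change[coin] = remaining // coin   # as many of this coin as possible
        remaining %= coin                  # what is left for smaller coins
    return change


# Change for a $5.27 purchase paid with $6.00 is 73 cents.
result = make_change(600 - 527)
```

For 73 cents this yields two quarters, two dimes, no nickels, and three pennies, which is seven coins in total.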

A Generic Greedy Algorithm:

(1) Initialize C to be the set of candidate solutions.
(2) Initialize a set S = ∅ (the set is to be the optimal solution we are constructing).
(3) While C ≠ ∅ and S is (still) not a solution do
    (3.1) select x from set C using a greedy strategy
    (3.2) delete x from C
    (3.3) if {x} ∪ S is a feasible solution, then S = S ∪ {x} (i.e., add x to set S)
(4) If S is a solution then return S
(5) else return failure
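One way to express this generic scheme in Python is sketched below. The function generic_greedy and its three callback parameters (select, is_feasible, is_solution) are hypothetical names chosen for illustration; S is kept as a list rather than a set so that repeated coins can appear in it. The usage instantiates the scheme for change-making with an assumed drawer of coins.

```python
def generic_greedy(candidates, select, is_feasible, is_solution):
    """Generic greedy loop following steps (1) to (5) of the scheme."""
    c = list(candidates)              # (1) candidate set C
    s = []                            # (2) partial solution S, initially empty
    while c and not is_solution(s):   # (3)
        x = select(c)                 # (3.1) greedy choice
        c.remove(x)                   # (3.2) x is never reconsidered
        if is_feasible(s + [x]):      # (3.3) keep x only if S stays feasible
            s.append(x)
    return s if is_solution(s) else None   # (4) / (5), None signals failure


# Change-making for 73 cents from a hypothetical drawer of available coins.
amount = 73
drawer = [25, 25, 25, 10, 10, 10, 5, 1, 1, 1, 1]
coins = generic_greedy(
    drawer,
    select=max,                                # greedy: largest coin first
    is_feasible=lambda s: sum(s) <= amount,    # never exceed the amount
    is_solution=lambda s: sum(s) == amount,
)
```

Because a rejected candidate is deleted from C and never revisited, the loop runs at most once per candidate.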

In general, a greedy algorithm is efficient because it makes a sequence of (local) decisions and never backtracks. The solution is not always optimal, however.
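A small experiment illustrates the last point. The coin system {1, 3, 4} below is a hypothetical example, not taken from the slides, and both function names are assumptions; the dynamic-programming routine is included only as a reference for the true optimum.

```python
def greedy_coin_count(amount, denominations):
    """Number of coins used by the greedy strategy (largest coin first)."""
    count = 0
    for coin in sorted(denominations, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None


def optimal_coin_count(amount, denominations):
    """True minimum number of coins, computed by dynamic programming."""
    inf = float("inf")
    best = [0] + [inf] * amount        # best[v] = fewest coins that sum to v
    for value in range(1, amount + 1):
        for coin in denominations:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount] if best[amount] != inf else None


# With coins {1, 3, 4} and amount 6, greedy picks 4 + 1 + 1 (three coins),
# while 3 + 3 (two coins) is optimal.
greedy = greedy_coin_count(6, [1, 3, 4])
optimal = optimal_coin_count(6, [1, 3, 4])
```

For the quarter, dime, nickel, penny system the two counts agree on 73 cents, so whether greedy is optimal depends on the coin system.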

The Knapsack Problem:

Given n objects, each with a weight w_i and a value v_i, and a knapsack of total capacity W, the problem is to pack the knapsack with these objects so as to maximize the total value of the objects packed without exceeding the knapsack's capacity. More formally, let x_i denote the fraction of object i to be included in the knapsack, 0 ≤ x_i ≤ 1, for 1 ≤ i ≤ n. The problem is to find values for the x_i such that

\[ \sum_{i=1}^{n} x_i w_i \le W \quad \text{and} \quad \sum_{i=1}^{n} x_i v_i \ \text{is maximized.} \]

Note that we may assume

\[ \sum_{i=1}^{n} w_i > W \]

because otherwise we would choose x_i = 1 for each i, which would be an obvious optimal solution.

The Optimal Knapsack Algorithm:

Input: an integer n, positive values w_i and v_i, for 1 ≤ i ≤ n, and another positive value W.
Output: n values x_i such that 0 ≤ x_i ≤ 1,

\[ \sum_{i=1}^{n} x_i w_i \le W \quad \text{and} \quad \sum_{i=1}^{n} x_i v_i \ \text{is maximized.} \]

Algorithm (of time complexity O(n lg n)):
(1) Sort the n objects from large to small based on the ratios v_i / w_i. We assume the arrays w[1..n] and v[1..n] store the respective weights and values after sorting.
(2) Initialize array x[1..n] to zeros.
(3) weight = 0; i = 1
(4) while (i ≤ n and weight < W) do
    (4.1) if weight + w[i] ≤ W then x[i] = 1
    (4.2) else x[i] = (W – weight) / w[i]
    (4.3) weight = weight + x[i] * w[i]
    (4.4) i++
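The knapsack algorithm above might be written in Python as follows. This is a sketch under assumptions: the name fractional_knapsack is hypothetical, the function sorts an index list instead of reordering the arrays so that x is reported in the original object order, and the three-object instance is made-up data for illustration.

```python
def fractional_knapsack(weights, values, capacity):
    """Greedy fractional knapsack: fill by decreasing value-to-weight ratio.

    Returns the list x of fractions (0 <= x[i] <= 1), indexed like the inputs.
    """
    n = len(weights)
    # Step (1): order objects from largest to smallest ratio v_i / w_i.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    x = [0.0] * n                      # Step (2)
    weight = 0.0                       # Step (3)
    for i in order:                    # Step (4)
        if weight >= capacity:
            break
        if weight + weights[i] <= capacity:
            x[i] = 1.0                                   # (4.1) whole object fits
        else:
            x[i] = (capacity - weight) / weights[i]      # (4.2) take a fraction
        weight += x[i] * weights[i]                      # (4.3)
    return x


# Hypothetical instance: ratios are 6, 5, and 4, and the capacity is 50.
w = [10, 20, 30]
v = [60, 100, 120]
x = fractional_knapsack(w, v, 50)
total_value = sum(xi * vi for xi, vi in zip(x, v))
```

In this instance the first two objects are taken whole and two thirds of the third, which fills the knapsack exactly. The sort dominates the running time, matching the O(n lg n) bound.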

Optimal 2-way Merge patterns and Huffman Codes:

Example. Suppose there are 3 sorted lists L_1, L_2, and L_3, of sizes 30, 20, and 10, respectively, which need to be merged into a combined sorted list, but we can merge only two at a time. We intend to find an optimal merge pattern which minimizes the total number of comparisons. For example, we can merge L_1 and L_2, which uses 30 + 20 = 50 comparisons and results in a list of size 50. We can then merge this list with L_3, using another 50 + 10 = 60 comparisons, so the total number of comparisons is 50 + 60 = 110. Alternatively, we can merge L_2 and L_3, using 20 + 10 = 30 comparisons; the resulting list (size 30) can then be merged with L_1 for another 30 + 30 = 60 comparisons, so the total number of comparisons is 30 + 60 = 90. It doesn't take long to see that this latter merge pattern is the optimal one.

Optimal Binary Merge Tree Algorithm:

Input: n leaf nodes, each with an integer size, n ≥ 2.
Output: a binary tree with the given leaf nodes which has the minimum total weighted external path length.
Algorithm:
(1) Create a min-heap T[1..n] based on the n initial sizes.
(2) While (the heap size ≥ 2) do
    (2.1) delete from the heap the two smallest values, call them a and b, and create a parent node of size a + b for the nodes corresponding to these two values
    (2.2) insert the value (a + b) into the heap, corresponding to the node created in Step (2.1)

When the algorithm terminates, there is a single value left in the heap whose corresponding node is the root of the optimal binary merge tree. The algorithm’s time complexity is O( n lg n ) because Step (1) takes O( n ) time; Step (2) runs O( n ) iterations, in which each iteration takes O(lg n ) time.
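A compact way to try the algorithm is with Python's standard heapq module. The sketch below is illustrative only: the function name optimal_merge_cost is an assumption, and it tracks just the total cost (the sum of all parent node sizes, which equals the total weighted external path length) rather than building the tree.

```python
import heapq


def optimal_merge_cost(sizes):
    """Total cost of the optimal 2-way merge pattern for the given list sizes."""
    heap = list(sizes)
    heapq.heapify(heap)                # Step (1): O(n) heap construction
    total = 0
    while len(heap) >= 2:              # Step (2)
        a = heapq.heappop(heap)        # (2.1) the two smallest sizes
        b = heapq.heappop(heap)
        total += a + b                 # cost of merging these two lists
        heapq.heappush(heap, a + b)    # (2.2) the merged list goes back in
    return total


# The three-list example from the text: sizes 30, 20, and 10.
cost = optimal_merge_cost([30, 20, 10])
```

For sizes 30, 20, and 10 this gives 90, the cost of the better of the two merge patterns discussed earlier.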

Example of the optimal merge tree algorithm:

Initially, 5 leaf nodes with sizes 2, 3, 5, 7, 9.

Iteration 1: merge 2 and 3 into 5 (remaining sizes: 5, 5, 7, 9)

Iteration 2: merge 5 and 5 into 10 (remaining sizes: 7, 9, 10)

Iteration 3: merge 7 and 9 (chosen among 7, 9, and 10) into 16

Iteration 4: merge 10 and 16 into 26

In the final tree, leaves 2 and 3 are at depth 3, and leaves 5, 7, and 9 are at depth 2.

Cost = 2*3 + 3*3 + 5*2 + 7*2 + 9*2 = 57.

Huffman Codes:

Suppose we wish to save a text (ASCII) file on the disk or to transmit it through a network using an encoding scheme that minimizes the number of bits required. Without compression, characters are typically encoded by their ASCII codes with 8 bits per character. We can do better if we have the freedom to design our own encoding.

Example. Consider a text file that uses only 5 different letters (a, e, i, s, t), the space character, and the newline character. Since there are 7 different characters, we could use 3 bits per character because that allows 8 bit patterns ranging from 000 through 111 (so we still have one pattern to spare). The following table shows the encoding of characters, their frequencies, and the size of the encoded (compressed) file.

                          Fixed-length encoding    Variable-length encoding
Character    Frequency    Code    Total bits       Code     Total bits
a            10           000     30               001      30
e            15           001     45               01       30
i            12           010     36               10       24
s            3            011     9                00000    15
t            4            100     12               0001     16
space        13           101     39               11       26
newline      1            110     3                00001    5
Total        58                   174                       146

If we can use variable lengths for the codes, we can actually compress more, as shown above. However, the codes must satisfy the property that no code is the prefix of another code; such a code is called a prefix code.

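We note that the encoded file size is equal to the total weighted external path length if we assign the frequency to each leaf node. For example,

[Figure: the code tree for the variable-length encoding. Leaves 's' (frequency 3) and newline (1) are at depth 5, 't' (4) is at depth 4, 'a' (10) is at depth 3, and 'e' (15), 'i' (12), and space (13) are at depth 2.]

Total file size = 3*5 + 1*5 + 4*4 + 10*3 + 15*2 + 12*2 + 13*2 = 146, which is exactly the total weighted external path length.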

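We also note that in an optimal prefix code, each node in the tree has either no children or two children. If a node x had only one child y, we could merge x and y into a single node, which shortens every code below it and reduces the total size. Thus, the optimal binary merge tree algorithm finds the optimal code (Huffman code).

[Figure: node x has only one child y; merging x and y reduces the total size.]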
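To make the connection with the merge tree algorithm concrete, here is a minimal Huffman coding sketch in Python using the frequencies from the table. The function name huffman_codes and the tie-breaking counter are assumptions made for illustration. The individual bit patterns it produces may differ from those in the table, since ties and the 0/1 labelling can be resolved differently, but the total encoded size is the same.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a prefix code from a dict mapping symbol -> frequency."""
    tie = count()   # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) >= 2:
        fa, _, a = heapq.heappop(heap)     # the two least frequent subtrees
        fb, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (fa + fb, next(tie), (a, b)))   # their parent
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                              # leaf: a symbol
            codes[node] = prefix or "0"    # single-symbol input gets "0"

    walk(root, "")
    return codes


freq = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, " ": 13, "\n": 1}
codes = huffman_codes(freq)
total_bits = sum(freq[ch] * len(codes[ch]) for ch in freq)
```

Here total_bits comes out to 146, matching the variable-length column of the table.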

Greedy Strategies Applied to Graph problems:

We first review some notations and terms about graphs. A graph consists of vertices (nodes) and edges (arcs, links), in which each edge “connects” two vertices (not necessarily distinct). More formally, a graph G = ( V , E ), where V and E denote the sets of vertices and edges, respectively.

[Figure: a graph with vertices 1, 2, 3, 4 and edges a, b, c, d, e.]

In this example, V = {1, 2, 3, 4}, E = { a , b , c , d , e }. Edges c and d are parallel edges; edge e is a self-loop. A path is a sequence of “adjacent” edges, e.g., path abeb , path acdab.

Both directed and undirected graphs appear often and naturally in many scientific (call graphs in program analysis), business (query trees, entity-relation diagrams in databases), and engineering (CAD design) applications. The simplest data structure for representing graphs and digraphs is a 2-dimensional array. Suppose G = (V, E), and |V| = n. Declare an array T[1..n][1..n] so that T[i][j] = 1 if there is an edge (i, j) ∈ E, and 0 otherwise. (Note that in an undirected graph, edges (i, j) and (j, i) refer to the same edge.)

[Figure: a digraph on vertices 1, 2, 3, 4.]

          j = 1  2  3  4
i = 1         0  1  0  0
i = 2         1  0  1  0
i = 3         0  0  0  1
i = 4         0  1  0  0

A 2-dimensional array for the digraph, called the adjacency matrix (row i, column j).
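The adjacency matrix can be built from an edge list as sketched below. The function name adjacency_matrix is an assumption, and the edge list is read off the nonzero entries of the matrix shown for the example digraph. Python lists are 0-indexed, so vertex i is stored in row i - 1.

```python
def adjacency_matrix(n, edges):
    """Return the n-by-n 0/1 adjacency matrix of a digraph on vertices 1..n."""
    t = [[0] * n for _ in range(n)]
    for i, j in edges:
        t[i - 1][j - 1] = 1     # edge (i, j): row i, column j
    return t


# Edges of the example digraph, read from the matrix in the text.
edges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 2)]
T = adjacency_matrix(4, edges)
```

Testing whether an edge (i, j) exists is then a single array lookup, at the price of n * n entries of storage regardless of how many edges there are.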

Sometimes, edges of a graph or digraph are given a positive weight or cost value. In that case, the adjacency matrix can easily be modified so that T[i][j] = the weight of edge (i, j), or 0 if there is no edge (i, j). Since the adjacency matrix may contain many zeros (when the graph has few edges, known as sparse), a space-efficient representation uses linked lists to represent the edges, known as the adjacency list representation.

[Figure: the digraph on vertices 1, 2, 3, 4.]

1: 2
2: 1, 3
3: 4
4: 2

The adjacency lists for the digraph, which can store edge weights by adding another field in the list nodes.
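A possible Python rendering of the adjacency list representation is sketched below. It uses a dict of Python lists in place of linked lists, and each list entry is a (neighbor, weight) pair to play the role of the extra weight field. The function name adjacency_lists is an assumption, and the weight of 1 on every edge is placeholder data, since the example digraph is unweighted.

```python
def adjacency_lists(n, weighted_edges):
    """Return a dict mapping each vertex 1..n to a list of (neighbor, weight)."""
    adj = {v: [] for v in range(1, n + 1)}
    for i, j, w in weighted_edges:
        adj[i].append((j, w))      # one list node per outgoing edge of i
    return adj


# The example digraph; every edge is given a placeholder weight of 1.
weighted_edges = [(1, 2, 1), (2, 1, 1), (2, 3, 1), (3, 4, 1), (4, 2, 1)]
adj = adjacency_lists(4, weighted_edges)
neighbors_of_2 = [j for j, _ in adj[2]]
```

The storage used is proportional to the number of vertices plus the number of edges, which is the advantage over the matrix for sparse graphs.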