




















Detailed Lecture Notes with Description





















Introduction - Greedy: Huffman Coding - Knapsack Problem - Minimum Spanning Tree (Kruskal's Algorithm). Dynamic Programming: 0/1 Knapsack Problem - Travelling Salesman Problem - Multistage Graph - Forward path and backward path.
Greedy Method
A greedy algorithm follows the problem-solving heuristic of making the locally optimal choice at each stage in the hope of finding a global optimum. The greedy strategy works well on optimization problems with the following characteristics:
Huffman codes are a very effective and widely used technique for compressing data. The Huffman encoding problem is to find the minimum-length bit string that can be used to encode a string of symbols. The algorithm uses a table of the frequencies of occurrence of each character to represent each character as a binary string, optimally. It uses a simple heap-based priority queue. Each leaf is labeled with a character and its frequency of occurrence. Each internal node is labeled with the sum of the weights of the leaves in its subtree.
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, with the lengths of the assigned codes based on the frequencies of the corresponding characters. The most frequent character gets the shortest code and the least frequent character gets the longest code. The variable-length codes assigned to input characters are prefix codes: the codes (bit sequences) are assigned in such a way that the code assigned to one character is never a prefix of the code assigned to any other character. This is how Huffman coding makes sure that there is no ambiguity when decoding the generated bit stream. Huffman's greedy algorithm looks at the frequency of occurrence of each character and represents it as a binary string in an optimal way.
There are mainly two major parts in Huffman Coding
Fixed Length Code: A fixed-length code needs 3 bits to represent six (6) characters.
Character           A       B       C       D       E      F
Frequency           45,000  13,000  12,000  16,000  9,000  5,
Fixed-length code   000     001     010     011     100    101
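The heap-based construction described above can be sketched roughly as follows. This is a minimal illustration, not the notes' own implementation: the function name `huffman_codes` is hypothetical, frequencies are given in thousands, and the value 5 for F is an assumption because that entry is cut off in the table.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return {symbol: bit string} for a dict of symbol frequencies."""
    tie = count()  # tie-breaker so the heap never has to compare trees
    # Heap entries are (weight, tie, tree); a tree is a symbol or a (left, right) pair.
    heap = [(w, next(tie), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Greedy step: merge the two lightest trees into one internal node.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):      # internal node: recurse left (0) and right (1)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                            # leaf: record the code for this symbol
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes


# Frequencies in thousands; F = 5 is an assumed value (truncated in the notes).
freq = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
codes = huffman_codes(freq)
total_bits = sum(freq[s] * len(codes[s]) for s in freq)   # in thousands of bits
fixed_bits = 3 * sum(freq.values())                       # 3 bits per character
```

Under these assumed frequencies the variable-length code needs fewer bits in total than the 3-bit fixed-length code, which is the point of the comparison in the table.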
Practice Problems
There are n items in a store. For i = 1, 2, ..., n, item i has weight wi > 0 and worth vi > 0. A thief can carry a maximum weight of W pounds in a knapsack. In this version of the problem the items can be broken into smaller pieces, so the thief may decide to carry only a fraction xi of item i, where 0 ≤ xi ≤ 1. Item i contributes xi*wi to the total weight in the knapsack, and xi*vi to the value of the load.
Applications
The problem often arises in resource allocation where there are financial constraints, and it is studied in fields such as combinatorics, computer science, complexity theory, cryptography and applied mathematics.
Algorithm
Greedy-fractional-knapsack (w, v, W)
    FOR i = 1 to n do
        x[i] = 0
    weight = 0
    while weight < W and items remain do
        i = best remaining item (highest ratio v[i] / w[i])
        IF weight + w[i] ≤ W then
            x[i] = 1
            weight = weight + w[i]
        else
            x[i] = (W - weight) / w[i]
            weight = W
    return x
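A runnable sketch of the greedy procedure above, assuming "best remaining item" means the item with the highest value-to-weight ratio. The function name `fractional_knapsack` is hypothetical; the data is the 5-object, capacity-100 instance used in the worked example of these notes.

```python
def fractional_knapsack(weights, values, capacity):
    """Return (fractions, total_value) for the fractional knapsack problem."""
    n = len(weights)
    x = [0.0] * n
    # Visit items in decreasing order of value-to-weight ratio.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    weight = 0.0
    total = 0.0
    for i in order:
        if weight >= capacity:
            break                          # the sack is full
        # Take the whole item if it fits, otherwise the fraction that fills the sack.
        take = min(1.0, (capacity - weight) / weights[i])
        x[i] = take
        weight += take * weights[i]
        total += take * values[i]
    return x, total


weights = [10, 20, 30, 40, 50]
values = [20, 30, 66, 40, 60]
fractions, best = fractional_knapsack(weights, values, 100)
print(fractions, best)
```

Sorting once up front replaces the repeated "best remaining item" search and gives O(n log n) running time.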
Kruskal's algorithm finds a minimum cost spanning tree using the greedy approach. The algorithm treats the graph as a forest and every node in it as an individual tree. A tree connects to another if and only if the connecting edge has the least cost among all available options and does not violate the MST properties.
Algorithm
Step 2 - Arrange all edges in increasing order of weight: The next step is to create a set of edges with their weights and arrange them in ascending order of weight (cost).
Step 3 - Add the edge which has the least weight: Now we start adding edges to the graph, beginning with the one that has the least weight. At all times we keep checking that the spanning tree property remains intact. If adding an edge would violate the spanning tree property (that is, it would create a circuit), we do not include that edge in the graph.
The least cost is 2 and the edges involved are B,D and D,T, so we add them. Adding them does not violate the spanning tree property, so we continue to the next edge selection.
The next cost is 3, and the associated edges are A,C and C,D, so we add them.
The next cost in the table is 4, and we observe that adding it will create a circuit in the graph
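The sort-then-add procedure can be sketched as below. The circuit check uses a union-find (disjoint-set) structure, which is one common choice rather than something the notes prescribe. The function name `kruskal` and the 4-vertex example graph are hypothetical, since the graph from the notes' figure is not available.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning tree.

    edges is a list of (weight, u, v) with vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))   # every vertex starts as its own tree

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for weight, u, v in sorted(edges):   # ascending order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:             # different trees: no circuit is created
            parent[root_u] = root_v      # merge the two trees
            total += weight
            chosen.append((u, v, weight))
    return total, chosen


# Hypothetical example graph (not the one from the notes' figure).
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
mst_weight, mst_edges = kruskal(4, edges)
print(mst_weight, mst_edges)
```

Edges whose endpoints already share a root are skipped, which corresponds to the "adding it will create a circuit" case in the walkthrough.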
Dynamic programming is used when the solution can be described recursively in terms of solutions to subproblems (optimal substructure). The algorithm finds solutions to subproblems and stores them in memory for later use. This is more efficient than "brute-force methods", which solve the same subproblems over and over again.
Optimal substructure: The optimal solution to the problem consists of optimal solutions to subproblems.
Overlapping subproblems: There are few distinct subproblems in total, but many recurring instances of each.
Bottom-up approach: Solve bottom-up, building a table of solved subproblems that are used to solve larger ones.
Greedy vs. Dynamic Programming
Greedy: Make an optimal choice (without knowing solutions to subproblems) and then solve the remaining subproblems.
Dynamic programming: Solve subproblems first, then use those solutions to make an optimal choice.
Greedy: Solutions are top down.
Dynamic programming: Solutions are bottom up.
Greedy: The best choice does not depend on solutions to subproblems.
Dynamic programming: The choice at each step depends on solutions to subproblems.
Greedy: Make the best choice at the current time, then work on subproblems. The best choice does depend on the choices made so far.
Dynamic programming: Many subproblems are repeated in solving larger problems. This repetition results in great savings when the computation is bottom up.
Greedy: Optimal substructure: the solution to the problem contains within it optimal solutions to subproblems.
Dynamic programming: Optimal substructure: the solution to the problem contains within it optimal solutions to subproblems.
Greedy: Fractional knapsack: at each step, choose the item with the highest value-to-weight ratio.
Dynamic programming: 0-1 knapsack: to determine whether to include item i for a given size, we must consider the best solution, at that size, with and without item i.
The most common problem being solved is the 0-1 knapsack problem, which restricts the number xi of copies of each kind of item to zero or one. Given a set of n items numbered from 1 up to n, each with a weight wi and a value vi, along with a maximum weight capacity W,
maximize $\sum_{i=1}^{n} v_i x_i$ subject to $\sum_{i=1}^{n} w_i x_i \le W$ and $x_i \in \{0, 1\}$.
Here xi represents the number of instances of item i to include in the knapsack. Informally, the problem is to maximize the sum of the values of the items in the knapsack so that the sum of the weights is less than or equal to the knapsack's capacity.
Optimal substructure: To consider all subsets of items, there are two cases for every item: (1) the item is included in the optimal subset, (2) it is not included in the optimal subset. Therefore, the maximum value that can be obtained from n items is the maximum of the following two values.
(1) Maximum value obtained by n-1 items and W weight (excluding the nth item).
(2) Value of the nth item plus the maximum value obtained by n-1 items and W minus the weight of the nth item (including the nth item).
If the weight of the nth item is greater than W, then the nth item cannot be included and case 1 is the only possibility.
Pseudo Code
// Input:
// Values (stored in array v)
// Weights (stored in array w)
// Number of distinct items (n)
// Knapsack capacity (W)
for j from 0 to W do:
    m[0, j] := 0
for i from 1 to n do:
    for j from 0 to W do:
        if w[i] <= j then:
            m[i, j] := max(m[i-1, j], m[i-1, j-w[i]] + v[i])
        else:
            m[i, j] := m[i-1, j]
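The table-filling pseudocode above translates fairly directly into Python. In this sketch the function name `knapsack_01` and the traceback step that recovers the chosen items are additions for illustration. The input is the 5-object, capacity-100 instance listed in these notes, and weights are assumed to be non-negative integers.

```python
def knapsack_01(weights, values, capacity):
    """Return (best_value, chosen_items) with items numbered from 1."""
    n = len(weights)
    # m[i][j]: best value using only the first i items with capacity j.
    m = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(capacity + 1):
            m[i][j] = m[i - 1][j]                        # case 1: skip item i
            if w <= j:                                   # case 2: take item i if it fits
                m[i][j] = max(m[i][j], m[i - 1][j - w] + v)

    # Walk the table backwards to recover one optimal set of items.
    chosen, j = [], capacity
    for i in range(n, 0, -1):
        if m[i][j] != m[i - 1][j]:
            chosen.append(i)
            j -= weights[i - 1]
    return m[n][capacity], sorted(chosen)


weights = [10, 20, 30, 40, 50]
values = [20, 30, 66, 40, 60]
best_value, items = knapsack_01(weights, values, 100)
print(best_value, items)
```

The table has (n+1) x (W+1) entries, so both time and space are O(nW).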
ii) Second item is selected
Object          3    1    2    5    4
W               30   10   20   50   40
Ratio = v/w     2.2  2    1.5  1.2  1
Selected item   1    1
Sack Weight = 30 + 10 = 40 < 100
Sack value = 66 + (2*10) = 86
iii) Third item is selected
Object          3    1    2    5    4
W               30   10   20   50   40
Ratio = v/w     2.2  2    1.5  1.2  1
Selected item   1    1    1
Sack Weight = 40 + 20 = 60 < 100
Sack value = 86 + (1.5*20) = 116
iv) Fourth item is selected
Object          3    1    2    5    4
W               30   10   20   50   40
Ratio = v/w     2.2  2    1.5  1.2  1
Selected item   1    1    1    1
Sack Weight = 60 + 50 = 110 > 100
Hence the fourth item (object 5) is selected partially.
Sack Weight = 60 + (100 - 60) = 100 <= 100
Sack value = 116 + (1.2*40) = 116 + 48 = 164
Now the sack is FULL. Hence we stop.
Total selected weight = 100 and total value = 2.2*30 + 2*10 + 1.5*20 + 1.2*40 = 164.
Total value = 164
0-1 Knapsack
Input: 5 objects, C = 100
W   10  20  30  40  50
V   20  30  66  40  60
Solution: Given total number of items = 5, sack capacity = 100,
Travelling Salesman Problem (TSP): Given a set of cities and the distance between every pair of cities, the problem is to find the shortest possible route that visits every city exactly once and returns to the starting point.
A Hamiltonian path in an undirected graph is a path that visits each vertex exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a Hamiltonian path such that there is an edge (in the graph) from the last vertex to the first vertex of the path.
Note the difference between the Hamiltonian cycle problem and TSP. The Hamiltonian cycle problem is to determine whether there exists a tour that visits every city exactly once. Here we know that a Hamiltonian tour exists (because the graph is complete), and in fact many such tours exist; the problem is to find a minimum weight Hamiltonian cycle.
For example, consider the graph shown in the figure on the right side. A TSP tour in the graph is 1 - 2 - 4 - 3 -
Dynamic Programming: Let the given set of vertices be {1, 2, 3, 4, ..., n}. Let us consider 1 as the starting and ending point of the output. For every other vertex i (other than 1), we find the minimum cost path with 1 as the starting point, i as the ending point and all vertices appearing exactly once. Let the cost of this path be cost(i); the cost of the corresponding cycle is then cost(i) + dist(i, 1), where dist(i, 1) is the distance from i to 1. Finally, we return the minimum of all [cost(i) + dist(i, 1)] values.
This looks simple so far. Now the question is how to get cost(i). To calculate cost(i) using dynamic programming, we need a recursive relation in terms of subproblems. Let C(S, i) be the cost of the minimum cost path visiting each vertex in set S exactly once, starting at 1 and ending at i. We start with all subsets of size 2 and calculate C(S, i) for each of them, then we calculate C(S, i) for all subsets S of size 3, and so on. Note that 1 must be present in every subset.
If the size of S is 2, then S must be {1, i} and
    C(S, i) = dist(1, i)
Else, if the size of S is greater than 2,
    C(S, i) = min { C(S - {i}, j) + dist(j, i) }, where j belongs to S, j != i and j != 1.
For a set of size n, we consider n-2 subsets, each of size n-1, such that none of the subsets contains the nth vertex.
Using the above recurrence relation, we can write a dynamic programming based solution. There are at most O(n * 2^n) subproblems, and each one takes linear time to solve. The total running time is therefore O(n^2 * 2^n). This time complexity is much less than O(n!), but it is still exponential. The space required is also exponential, so this approach is infeasible even for a slightly higher number of vertices.
Example
Distance matrix:
g(2, Ø) = c21 = 1
g(3, Ø) = c31 = 15
g(4, Ø) = c41 = 6
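The recurrence for C(S, i) can be sketched as follows. Two things differ from the notation above and are choices of this sketch: vertices are numbered from 0, and the set S leaves out the start vertex instead of always containing it. The function name `held_karp` is hypothetical. The distance matrix is an assumed example: its first column matches c21 = 1, c31 = 15 and c41 = 6 from the example, but the remaining entries are not given in the notes.

```python
from itertools import combinations


def held_karp(dist):
    """Return the minimum tour cost that starts and ends at vertex 0 (n >= 2)."""
    n = len(dist)
    # C[(S, i)]: cheapest path that starts at 0, visits every vertex in the
    # frozenset S exactly once, and ends at i (i is in S, 0 is not in S).
    C = {(frozenset([i]), i): dist[0][i] for i in range(1, n)}
    for size in range(2, n):                 # grow the subsets one vertex at a time
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for i in subset:
                rest = S - {i}
                # The path to i must arrive from some j already in the subset.
                C[(S, i)] = min(C[(rest, j)] + dist[j][i] for j in rest)
    full = frozenset(range(1, n))
    # Close the cycle by returning from the last vertex to the start.
    return min(C[(full, i)] + dist[i][0] for i in range(1, n))


# Assumed distance matrix; only the first column comes from the notes.
dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
tour_cost = held_karp(dist)
print(tour_cost)
```

The dictionary holds one entry per (subset, end vertex) pair, which is where the exponential space requirement comes from.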
"Dynamic programming is an algorithm design method that can be used when the solution to a problem may be viewed as the result of a sequence of decisions."
Minimum cost path in a multistage graph using dynamic programming
a. Forward Approach
b. Backward Approach
a. Forward Approach: Find the minimum cost path through a multistage graph using the following considerations
· Identify the source and destination nodes.
· Find all possible paths from the source to the destination and sum the edge weights along each path.
· The path giving the least total weight is the minimum cost path.
Consider the multistage graph given below.
Identifying source and destination nodes:
Source node -> S
Destination node -> D
The possible ways to connect S and D:
d(S,D) = min { 1 + d(A,D) ; 2 + d(F,D) ; 5 + d(C,D) }   (1)
d(A,D) = min { 4 + d(B,D) ; 9 + d(G,D) }
       = min { 4 + 18 ; 9 + 13 }   (substituting weights)
       = min { 22 ; 22 }
d(A,D) = 22   (2)
d(F,D) = min { 9 + d(B,D) ; 5 + d(G,D) ; 16 + d(E,D) }
       = min { 9 + 18 ; 5 + 13 ; 16 + 2 }   (substituting weights)
       = min { 27 ; 18 ; 18 }
d(F,D) = 18   (3)
d(C,D) = min { 2 + d(E,D) }
       = min { 2 + 2 }   (substituting weights)
       = min { 4 }
d(C,D) = 4   (4)
Substituting (2), (3) and (4) in (1) gives
d(S,D) = min { 1 + d(A,D) ; 2 + d(F,D) ; 5 + d(C,D) }
d(S,D) = min { 1 + 22 ; 2 + 18 ; 5 + 4 }
d(S,D) = min { 23 ; 20 ; 9 }
d(S,D) = 9
Hence, according to the Forward Approach, the minimum cost path from S to D is S -> C -> E -> D.
b. Backward Approach: The Backward Approach is just the reverse of the forward approach: here the source node and the next node are considered at every stage. Considering the same multistage graph,
1 -> 2: Source node S to next nodes A, F and C
d(S,A) = 1
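The forward-approach computation of d(S,D) can be reproduced with a short memoized recursion. This is a sketch under stated assumptions: the edge weights are read off the derivation above, and because the figure is not available, d(B,D) = 18, d(G,D) = 13 and d(E,D) = 2 are modelled here as direct edges into D. The names `GRAPH`, `TARGET` and `d` are hypothetical.

```python
from functools import lru_cache

# Edge weights taken from the derivation; the three edges into D are an
# assumption standing in for d(B,D), d(G,D) and d(E,D).
GRAPH = {
    "S": {"A": 1, "F": 2, "C": 5},
    "A": {"B": 4, "G": 9},
    "F": {"B": 9, "G": 5, "E": 16},
    "C": {"E": 2},
    "B": {"D": 18},
    "G": {"D": 13},
    "E": {"D": 2},
    "D": {},
}
TARGET = "D"


@lru_cache(maxsize=None)
def d(node):
    """Return (cost, path) of the cheapest route from node to TARGET."""
    if node == TARGET:
        return 0, (node,)
    best_cost, best_path = float("inf"), ()
    for nxt, weight in GRAPH[node].items():
        cost, path = d(nxt)                  # solved once, then served from the cache
        if weight + cost < best_cost:
            best_cost, best_path = weight + cost, (node,) + path
    return best_cost, best_path


cost, path = d("S")
print(cost, " -> ".join(path))
```

Each node's result is cached, so d(B), d(G) and d(E) are computed only once even though several stage-2 nodes depend on them.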