
Huffman Encoding, Correctness - Design and Analysis - Study Notes, Study notes of Digital Systems Design

Jaypee University of Engineering & Technology Digital Systems Design

Huffman Encoding Correctness, Optimal prefix code tree T, Swap x and b in tree prefix tree T, Activity Selection, Maximum depth in the tree, Claim and Proof are the key points in this study notes file.

Typology: Study notes

2011/2012

Uploaded on 11/03/2012

ankitay 🇮🇳


Lecture No. 25

7.2.2 Huffman Encoding: Correctness

The Huffman algorithm uses a greedy approach to generate a prefix code T that minimizes the expected length B(T) of the encoded string. In other words, the Huffman algorithm generates an optimal prefix code.

The question that remains is: why is the algorithm correct?

Recall that the cost of any encoding tree T is

B(T) = Σ p(x) · d(x), summed over all characters x

where p(x) is the probability of character x and d(x) is the depth of its leaf in T.

Our approach to proving the correctness of Huffman encoding is to show that any tree that differs from the one constructed by the Huffman algorithm can be converted into one that is equal to Huffman's tree without increasing its cost. Note that the binary tree constructed by the Huffman algorithm is a full binary tree.
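For reference while reading the proof, here is a minimal Python sketch of the greedy construction and of the cost B(T). It assumes the usual "merge the two least probable subtrees" formulation of Huffman's algorithm. The function names and the sample probabilities are illustrative and do not come from the lecture.

```python
import heapq
from itertools import count

def huffman_depths(probs):
    """Return {character: depth} for a tree built by repeated greedy merges."""
    if len(probs) == 1:
        return {ch: 0 for ch in probs}
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(p, next(tie), {ch: 0}) for ch, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)   # least probable subtree
        p2, _, d2 = heapq.heappop(heap)   # second least probable subtree
        # Joining two subtrees under a new parent pushes every leaf down by one.
        merged = {ch: d + 1 for ch, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

def cost(probs, depths):
    """B(T): sum of p(ch) * d(ch) over all characters."""
    return sum(probs[ch] * depths[ch] for ch in probs)

# Hypothetical probabilities, chosen only for illustration.
probs = {"a": 0.05, "b": 0.10, "c": 0.15, "d": 0.30, "e": 0.40}
depths = huffman_depths(probs)
print(depths, cost(probs, depths))
```

In this example the two least probable characters, a and b, end up as the deepest leaves, which is the situation the claim below describes.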

Claim:

Consider two characters x and y with the smallest probabilities. Then there is an optimal code tree in which these two characters are siblings at the maximum depth in the tree.

Proof:

Let T be any optimal prefix code tree with two siblings b and c at the maximum depth of the tree. Such a tree is shown in Figure 7.2. Assume without loss of generality that

p(b) ≤ p(c) and p(x) ≤ p(y)

Figure 7.2: Optimal prefix code tree T


Since x and y have the two smallest probabilities (we claimed this), it follows that

p(x) ≤ p(b) and p(y) ≤ p(c)

Since b and c are at the deepest level of the tree, we know that

d(b) ≥ d(x) and d(c) ≥ d(y) (d is the depth)

Thus we have

p(b) - p(x) ≥ 0 and d(b) - d(x) ≥ 0

Hence their product is non-negative. That is,

(p(b) - p(x)) · (d(b) - d(x)) ≥ 0

Now swap the positions of x and b in the tree.

Figure 7.3: Swap x and b in tree prefix tree T

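The swap moves x down to depth d(b) and b up to depth d(x), so the cost of the resulting tree T′ is

B(T′) = B(T) - p(x)d(x) - p(b)d(b) + p(x)d(b) + p(b)d(x)

= B(T) - (p(b) - p(x)) · (d(b) - d(x))

≤ B(T)

Thus T′ costs no more than T. Since T is optimal, T′ is optimal as well. Swapping y and c in T′ in the same way yields a tree T′′ that is also optimal.

The final tree T′′ satisfies the claim we made earlier: x and y, the two characters with the smallest probabilities, are siblings at the maximum depth of an optimal code tree.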
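The effect of the swaps on the cost can be checked numerically. The sketch below uses a hypothetical five-character tree, described only by its leaf depths. The tree is deliberately not optimal, so that the swaps have something to improve. It assumes that T′′ is obtained by a second swap of y and c. The names swap, depths_T and so on are illustrative.

```python
def cost(probs, depths):
    """B(T): sum of p(ch) * d(ch) over all characters."""
    return sum(probs[ch] * depths[ch] for ch in probs)

def swap(depths, u, v):
    """Return the leaf depths after exchanging the positions of u and v."""
    swapped = dict(depths)
    swapped[u], swapped[v] = depths[v], depths[u]
    return swapped

# Hypothetical data: x and y are least probable, b and c are the deepest siblings.
probs = {"x": 0.05, "y": 0.10, "b": 0.20, "c": 0.25, "e": 0.40}
depths_T = {"e": 1, "x": 2, "y": 3, "b": 4, "c": 4}

depths_T1 = swap(depths_T, "x", "b")    # T': x and b exchanged
depths_T2 = swap(depths_T1, "y", "c")   # T'': y and c exchanged as well

# Savings predicted by the product (p(b) - p(x)) * (d(b) - d(x)), and its y/c analogue.
saving_1 = (probs["b"] - probs["x"]) * (depths_T["b"] - depths_T["x"])
saving_2 = (probs["c"] - probs["y"]) * (depths_T1["c"] - depths_T1["y"])

print(cost(probs, depths_T), cost(probs, depths_T1), cost(probs, depths_T2))
```

Each swap lowers the cost by exactly the non-negative product from the proof, and after both swaps x and y sit at the maximum depth.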

The claim we just proved asserts that the first step of the Huffman algorithm (the greedy step) is the proper one to perform. The complete proof of correctness for the Huffman algorithm follows by induction on n.

Claim: The Huffman algorithm produces an optimal prefix code tree.

Proof: The proof is by induction on n, the number of characters. For the basis case, n = 1, the tree consists of a single leaf node, which is obviously optimal. Assume that the algorithm produces an optimal tree for any set of n - 1 characters. We want to show that it also does so for exactly n characters.

Suppose we have exactly n characters. By the previous claim, there is an optimal tree in which the two characters x and y with the lowest probabilities are siblings at the lowest level of the tree. Remove x and y and replace them with a new character z whose probability is p(z) = p(x) + p(y). Thus n - 1 characters remain.

Consider any prefix code tree T made with this new set of n - 1 characters. We can convert T into a prefix code tree T′ for the original set of n characters by replacing the leaf z with an internal node whose children are x and y. This is essentially undoing the operation where x and y were removed and replaced by z. The cost of the new tree T′ is

B(T′) = B(T) - p(z)d(z) + p(x)[d(z) + 1] + p(y)[d(z) + 1]

= B(T) - (p(x) + p(y))d(z) + (p(x) + p(y))[d(z) + 1]

= B(T) + (p(x) + p(y))[d(z) + 1 - d(z)]

= B(T) + p(x) + p(y)
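The identity B(T′) = B(T) + p(x) + p(y) can be checked on small examples. The sketch below represents a tree only by its leaf depths and expands the leaf z into two children x and y. The helper name expand and the two sample trees are hypothetical.

```python
def cost(probs, depths):
    """B(T): sum of p(ch) * d(ch) over all characters."""
    return sum(probs[ch] * depths[ch] for ch in probs)

def expand(probs, depths, z, x, px, y, py):
    """Replace leaf z by an internal node with children x and y (undo the merge)."""
    new_probs = {ch: p for ch, p in probs.items() if ch != z}
    new_depths = {ch: d for ch, d in depths.items() if ch != z}
    new_probs[x], new_probs[y] = px, py
    new_depths[x] = new_depths[y] = depths[z] + 1   # both children sit one level below z
    return new_probs, new_depths

# Hypothetical set of n - 1 = 4 characters, where z stands for x and y merged.
px, py = 0.05, 0.10
probs = {"z": px + py, "c": 0.15, "d": 0.30, "e": 0.40}

# Two differently shaped trees T on the same characters.
skewed = {"e": 1, "d": 2, "c": 3, "z": 3}
balanced = {"e": 2, "d": 2, "c": 2, "z": 2}

increases = []
for depths in (skewed, balanced):
    probs_n, depths_n = expand(probs, depths, "z", "x", px, "y", py)
    increases.append(cost(probs_n, depths_n) - cost(probs, depths))
print(increases)
```

Both trees show the same increase, p(x) + p(y), regardless of their shape.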

The cost changes, but the change depends in no way on the structure of the tree T (T is for n - 1 characters). Therefore, to minimize the cost of the final tree T′, we need to build the tree T on the n - 1 characters optimally. By induction, this is exactly what the Huffman algorithm does. Thus the final tree is optimal.
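The induction can be mirrored by a short recursive computation of the cost: merge the two smallest probabilities, solve the problem on n - 1 characters, and add p(x) + p(y). The sketch below is illustrative only. The brute-force comparison tries every possible pair at each merge step, which covers every full binary tree. It is practical only for very small inputs, and the sample probabilities are hypothetical.

```python
from itertools import combinations

def huffman_cost(ps):
    """Cost of the greedy tree, computed the way the induction unfolds."""
    if len(ps) == 1:
        return 0.0                      # basis case: a single leaf at depth 0
    ps = sorted(ps)
    x, y, rest = ps[0], ps[1], ps[2:]   # x and y are the two smallest probabilities
    # B(T') = B(T) + p(x) + p(y), where T is built on the n - 1 remaining characters.
    return huffman_cost(rest + [x + y]) + x + y

def brute_force_cost(ps):
    """Minimum cost over all full binary trees, by trying every merge order."""
    if len(ps) == 1:
        return 0.0
    best = float("inf")
    for i, j in combinations(range(len(ps)), 2):
        rest = [p for k, p in enumerate(ps) if k not in (i, j)]
        merged = ps[i] + ps[j]
        best = min(best, brute_force_cost(rest + [merged]) + merged)
    return best

ps = [0.05, 0.10, 0.15, 0.30, 0.40]    # hypothetical probabilities
print(huffman_cost(ps), brute_force_cost(ps))
```

On this input the greedy recursion and the exhaustive search agree, as the proof predicts.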

7.3 Activity Selection

The activity scheduling problem is a simple scheduling problem for which the greedy approach provides an optimal solution. We are given a set S = {a_1, a_2, ..., a_n} of n activities that are to be scheduled to use some resource. Each activity a_i must start at a given start time s_i and end at a given finish time f_i.

An example is a number of lectures that are to be given in a single lecture hall. The start and end times have been set up in advance, and the lectures are to be scheduled. There is only one resource (e.g., the lecture hall). Some start and finish times may overlap, so not all requests can be honored. We say that two activities a_i and a_j are non-interfering if their start-finish intervals do not overlap, i.e., (s_i, f_i) ∩ (s_j, f_j) = ∅.

The activity selection problem is to select a maximum-size set of mutually non-interfering activities for use of the resource. So how do we schedule the largest number of activities on the resource? Intuitively, we do not like long activities because they occupy the resource and keep us from honoring other requests. This suggests the greedy strategy: repeatedly select the activity with the smallest duration (f_i - s_i) and schedule it, provided that it does not interfere with any previously scheduled activities. Unfortunately, this turns out to be non-optimal.
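A small counterexample shows why the shortest-duration strategy fails. The sketch below represents activities as hypothetical (start, finish) pairs, treats the intervals as open as in the definition above, and compares the greedy choice with an exhaustive search. The function names and the three sample activities are illustrative.

```python
from itertools import combinations

def interferes(a, b):
    """True if the open intervals (s, f) of activities a and b overlap."""
    return a[0] < b[1] and b[0] < a[1]

def shortest_first(activities):
    """Greedy by smallest duration f - s, skipping activities that interfere."""
    chosen = []
    for act in sorted(activities, key=lambda a: a[1] - a[0]):
        if all(not interferes(act, other) for other in chosen):
            chosen.append(act)
    return chosen

def best_by_brute_force(activities):
    """Largest mutually non-interfering subset, found by trying all subsets."""
    for size in range(len(activities), 0, -1):
        for subset in combinations(activities, size):
            if all(not interferes(a, b) for a, b in combinations(subset, 2)):
                return list(subset)
    return []

# Hypothetical instance: the short middle activity blocks both long ones.
activities = [(1, 5), (4, 6), (5, 9)]
greedy = shortest_first(activities)
optimal = best_by_brute_force(activities)
print(len(greedy), len(optimal))
```

The shortest activity (4, 6) overlaps both of the others, so the greedy strategy schedules one activity where two are possible.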
