Scheduling and Huffman Encoding Techniques - Prof. J. W. Demmel, Exams of Computer Science

University of California - Berkeley Computer Science

Prof. J. W. Demmel

Information on two important techniques used in computer science: scheduling and huffman encoding. The scheduling section explains how to find the critical path in a task graph and how to determine the maximum number of tasks that can be executed in parallel. The huffman encoding section discusses how to compress files using this method and how to determine the optimal encoding for each symbol. Both sections include examples and algorithms.

Typology: Exams

2010/2011

Uploaded on 06/19/2011

koofers-user-v3o 🇺🇸


CS 170 Second Midterm 5 April 2007

NAME (1 pt):

TA (1 pt):

Name of Neighbor to your left (1 pt):

Name of Neighbor to your right (1 pt):

Instructions: This is a closed book, closed calculator, closed computer, closed network, open brain exam, but you are permitted a 1 page, double-sided set of notes, large enough to read without a magnifying glass.

You get one point each for filling in the 4 lines at the top of this page. Each other question is worth the amount shown.

Write all your answers on this exam. If you need scratch paper, ask for it, write your name on each sheet, and attach it when you turn it in (we have a stapler).

Total


(1) Scheduling (30 points).

You’re an engineer planning the construction of a large bridge. There are constraints between the tasks involved. For instance, a portion of road cannot be attached to the suspensions until these are in place, or until the road itself is built. Let each task be represented as a node in a graph, where a directed edge joins A to B if task A must be completed before B begins.

Give a condition on this graph for you to be able to order the tasks so as to satisfy all the constraints.
Answer. The graph must be a DAG.

Give a method to check this condition.
Answer. Run Depth First Search. The graph is a DAG if and only if no back-edge is found.

Give a method to find an ordering of the tasks that satisfies the constraints if one exists.
Answer. Run DFS and order the nodes by decreasing post number.
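
The DFS-based check and ordering can be sketched in Python as follows. This is only an illustration under assumptions: the function name `linearize`, the adjacency-dict representation, and the task names in the usage lines are hypothetical and do not come from the exam.

```python
def linearize(graph):
    """Return a valid task order, or None if DFS finds a back-edge (a cycle).

    `graph` maps each task to the list of tasks that must come after it.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    color = {u: WHITE for u in nodes}
    postorder = []  # nodes in increasing post number

    def explore(u):
        color[u] = GRAY
        for v in graph.get(u, []):
            if color[v] == GRAY:
                return False  # edge to a node still on the stack: back-edge
            if color[v] == WHITE and not explore(v):
                return False
        color[u] = BLACK
        postorder.append(u)
        return True

    for u in sorted(nodes):
        if color[u] == WHITE and not explore(u):
            return None
    return postorder[::-1]  # decreasing post number


# Hypothetical usage: two prerequisites for one attachment task.
order = linearize({"suspensions": ["attach road"], "road": ["attach road"]})
cyclic = linearize({"A": ["B"], "B": ["A"]})  # no valid order exists
```

The recursion mirrors the textbook explore() procedure; for very deep graphs an explicit stack would avoid Python's recursion limit.
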
In fact, in a real schedule, tasks can happen simultaneously, unless a constraint forces us to finish one before beginning the other. To represent task duration, let the length of edge A → B be the duration of A. (In other words, every outgoing edge from A must have the same length.) Let S (“Start”) and F (“Finish”) be special tasks of length 0 that must happen respectively before and after all other tasks. For instance, they can represent the contract signature and the inauguration. An important concept in scheduling is the critical path, that is, a sequence of tasks S, A_1, ..., A_k, F such that A_i → A_{i+1} and the length of the path is equal to that of the shortest possible schedule. We call this length the construction time.

[Figure 1 (graph on nodes S, A, B, C, D, E, F; drawing not preserved): A list of tasks (nodes), constraints (edges) and task durations (edge lengths).]

In Fig. 1, find the critical path and the construction time.
Answer. The critical path is the longest path from S to F. Here it is S → A → C → E → F, of length 9.

Give a method to find the critical path automatically on a given graph (hint: this method uses the property of this graph that you found in the first part of this problem).

(1) Scheduling (30 points).

Planning the construction of a large building is a challenging engineering task, including dealing with constraints among the tasks involved. For instance, a portion of roof cannot be attached to the building until supports are in place, or until the roof itself is available. Let each task be represented as a node in a graph, where a directed edge joins A to B if task A must be completed before B begins.

Give a condition on this graph for you to be able to order the tasks so as to satisfy all the constraints.
Answer. The graph must be a DAG.

Give a method to check this condition.
Answer. Run Depth First Search. The graph is a DAG if and only if no back-edge is found.

Give a method to find an ordering of the tasks that satisfies the constraints if one exists.
Answer. Run DFS and order the nodes by decreasing post number.
In fact, in a real schedule, tasks can happen simultaneously, unless a constraint forces us to finish one before beginning the other. To represent task duration, let the length of edge A → B be the duration of A. (In other words, every outgoing edge from A must have the same length.) Let S (“Start”) and F (“Finish”) be special tasks of length 0 that must happen respectively before and after all other tasks. For instance, they can represent the contract signature and the inauguration. An important concept in scheduling is the critical path, that is, a sequence of tasks S, A_1, ..., A_k, F such that A_i → A_{i+1} and the length of the path is equal to that of the shortest possible schedule. We call this length the construction time.

[Figure 2 (graph on nodes S, A, B, C, D, E, F; drawing not preserved): A list of tasks (nodes), constraints (edges) and task durations (edge lengths).]

In Fig. 2, find the critical path and the construction time.
Answer. The critical path is the longest path from S to F. Here it is S → A → C → E → F, of length 9.

Give a method to find the critical path automatically on a given graph (hint: this method uses the property of this graph that you found in the first part of this problem).

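Answer. The best method is to adapt the “dags-shortest-path” method to find the longest path instead. Find a linearization order as in the third question. Iterate through each node u in linearized order, calling update() on each edge u → v, where update is now

    update(u, v): dist(v) = max(dist(v), dist(u) + l(u, v))

The array dist() is initialized to −∞, with dist(S) = 0; initializing every entry to 0 also works, since all durations are nonnegative and every task comes after S. We can call this method “dags-longest-path”. To recover the critical path itself and not just its length, record for each v the node u that last increased dist(v), and follow these pointers back from F.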
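
A minimal Python sketch of “dags-longest-path” follows. It is an illustration, not the exam's reference code: the function names, the `prev` pointer map, and the adjacency-dict format are assumptions, and `example_graph` is a hypothetical graph built from the unit-time example A → B, A → C, B → D, C → D, with S and F attached by hand (the edges of the exam's figure are not available here).

```python
def topological_order(graph):
    """Linearize a DAG: run DFS and list nodes by decreasing post number."""
    visited, postorder = set(), []

    def explore(u):
        visited.add(u)
        for v, _ in graph.get(u, []):
            if v not in visited:
                explore(v)
        postorder.append(u)

    for u in graph:
        if u not in visited:
            explore(u)
    return postorder[::-1]


def dag_longest_path(graph, source):
    """Longest-path distances from `source`; graph[u] = [(v, l(u, v)), ...]."""
    order = topological_order(graph)
    dist = {u: float("-inf") for u in order}
    prev = {}  # prev[v] = node that last increased dist(v)
    dist[source] = 0
    for u in order:
        if dist[u] == float("-inf"):
            continue  # u is not reachable from the source
        for v, length in graph.get(u, []):
            if dist[u] + length > dist[v]:  # update(u, v) with max
                dist[v] = dist[u] + length
                prev[v] = u
    return dist, prev


def critical_path(prev, source, target):
    """Follow predecessor pointers back from `target` to `source`."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical example: unit-time tasks A, B, C, D; S has duration 0.
example_graph = {
    "S": [("A", 0)],
    "A": [("B", 1), ("C", 1)],
    "B": [("D", 1)],
    "C": [("D", 1)],
    "D": [("F", 1)],
    "F": [],
}
dist, prev = dag_longest_path(example_graph, "S")
```

In this sketch `dist["F"]` plays the role of the construction time, and `dist[u]` is the length of the longest path from S to u.
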

What we really want is not just the construction time, but an entire schedule, specifying for each task when to start it. How can you use the intermediate results of your algorithm to output a start time for each task (each node)? Show that this schedule is indeed valid, i.e. it does not violate any constraint.

Answer. The array element dist(u) is the length of the longest path from S to u. We can use it as the start time of u. To prove that this gives us a valid schedule, look at a constraint u → v. The start time of u is dist(u), and l(u, v) is the duration of u, so u ends at time dist(u) + l(u, v). The update equation gives us dist(v) ≥ dist(u) + l(u, v), so the start time of v is no earlier than the end time of u.
If each task requires a team of workers, show how to compute the number of teams we need to hire, i.e. the maximum number of tasks that will be executed in parallel. For example, if all tasks take unit time and A → B, A → C, B → D, C → D, then the answer is 2 teams, because B and C can be done in parallel.

Answer. We now have a start time and an end time for each task, where the start time is the length of the longest path to the vertex, and end time = start time + task duration (the length of its outgoing edges). We can form records of the form (start time, +1), (end time, −1) and sort all the records by their first entry, yielding the list (t_1, s_1), (t_2, s_2), ..., (t_n, s_n) where t_1 ≤ t_2 ≤ ... ≤ t_n and each s_i is +1 or −1. If there are ties (multiple equal t_i) then put all the records with s_i = −1 before those with s_i = +1. Now do

    parallel_tasks = 0; max_parallel_tasks = 0;
    for i = 1 to n
        parallel_tasks = parallel_tasks + s_i
        // increases by 1 when a new task starts
        // decreases by 1 when one ends
        max_parallel_tasks = max(max_parallel_tasks, parallel_tasks)
    end
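
The sweep over sorted records might look like this in Python. It is a sketch under assumptions: the function name `max_parallel_tasks` and the (start, end) pair input are hypothetical, and the zero-length tasks S and F are assumed to be left out since they need no team.

```python
def max_parallel_tasks(intervals):
    """Maximum number of tasks running at once.

    `intervals` is a list of (start_time, end_time) pairs, one per task.
    """
    events = []
    for start, end in intervals:
        events.append((start, +1))  # a task starts
        events.append((end, -1))    # a task ends
    # Sort by time; at equal times -1 sorts before +1, so a task that ends
    # at time t frees its team before another task starts at time t.
    events.sort()
    running = best = 0
    for _, delta in events:
        running += delta
        best = max(best, running)
    return best


# Unit-time example from the text: A runs [0, 1], B and C run [1, 2], D runs [2, 3].
teams = max_parallel_tasks([(0, 1), (1, 2), (1, 2), (2, 3)])
```

Sorting the (time, ±1) tuples directly gives the tie rule for free, because Python compares tuples element by element and −1 < +1. The sort dominates the cost, so the sweep runs in O(n log n) for n records.
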

(2) (20 points) Let G be a file containing symbols b_1, ..., b_m, where b_i appears c_i times. Suppose c_1 = 1, c_2 = 2 and c_i = c_{i−1} + c_{i−2} for i > 2 (a Fibonacci sequence). We want to use Huffman encoding to compress G. Determine an optimal encoding of each symbol b_i.

Answer. The algorithm for Huffman encoding creates a priority queue of nodes (representing sets of symbols) ordered by increasing frequency of appearance (the sum of all the frequencies of symbols in the set). Initially the priority queue contains (b_1, c_1), ..., (b_m, c_m). We claim that at step i of the algorithm the two lowest frequency sets removed from the priority queue are (b_{i+1}, c_{i+1}) and ({b_1, ..., b_i}, c_1 + ··· + c_i), where c_1 + ··· + c_i = c_{i+2} − 2 (which is ≥ c_{i+1} once i ≥ 2).

We prove this by induction. The base case is i = 1: the two lowest frequencies are c_1 = 1 and c_2 = 2, so the claim holds. Now suppose it is true for i; we must show it is true for i + 1. At step i, we remove these two sets and merge them to form {b_1, ..., b_{i+1}} with frequency c_{i+1} + c_{i+2} − 2 = c_{i+3} − 2, where we have used the definition of the Fibonacci sequence. The two other lowest frequency entries in the queue are (b_{i+2}, c_{i+2}) and (b_{i+3}, c_{i+3}). Since c_{i+2} < c_{i+3} and c_{i+3} − 2 < c_{i+3}, the two lowest frequency items on the queue at step i + 1 are (b_{i+2}, c_{i+2}) and ({b_1, ..., b_{i+1}}, c_{i+3} − 2), as claimed.

The resulting tree created by the Huffman encoding algorithm is therefore a chain, with b_1 and b_2 being leaves at the bottom (level m), and b_k being the only leaf at level m − k + 2, for k > 2 (the root being at level 1). So one possible optimal encoding is for b_1 to get codeword 0···0 (m − 1 zeros), and b_k to get codeword 0···01 (m − k zeros) for 1 < k ≤ m.
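
The chain shape can be checked numerically with a small Python sketch. The helper names `fibonacci_counts` and `huffman_code_lengths` are assumptions for illustration; the sketch computes only codeword lengths (leaf depths below the root), not the codewords themselves.

```python
import heapq
from itertools import count


def fibonacci_counts(m):
    """Return [c_1, ..., c_m] with c_1 = 1, c_2 = 2, c_i = c_{i-1} + c_{i-2}."""
    counts = []
    a, b = 1, 2
    for _ in range(m):
        counts.append(a)
        a, b = b, a + b
    return counts


def huffman_code_lengths(freqs):
    """Codeword length of each symbol under Huffman's algorithm."""
    tiebreak = count()  # unique counter so the heap never compares the lists
    heap = [(f, next(tiebreak), [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        f1, _, set1 = heapq.heappop(heap)  # lowest frequency set
        f2, _, set2 = heapq.heappop(heap)  # second lowest
        for i in set1 + set2:
            lengths[i] += 1  # every symbol in the merged set moves one level down
        heapq.heappush(heap, (f1 + f2, next(tiebreak), set1 + set2))
    return lengths


lengths = huffman_code_lengths(fibonacci_counts(6))
```

For m = 6 the lengths come out as 5, 5, 4, 3, 2, 1 for b_1, ..., b_6, matching m − 1 for b_1 and m − k + 1 for b_k with k ≥ 2.
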

(3) (20 points) Let p(n) be the number of ways you can write the positive integer n as a sum of positive integers. For example, 3 can be written as 3, 2 + 1 and 1 + 1 + 1, so p(3) = 3. (Note that 2 + 1 = 1 + 2 is only counted once, i.e. the order of summands doesn’t matter.) Give a dynamic programming algorithm for computing p(n). Hint: Start with a dynamic programming algorithm for the slightly different function p(n, k) = the number of ways you can write n as a sum of positive integers less than or equal to k. You should include an update formula for p(n, k) (with justification), a program for filling in the values of p(n, k), a bound on the running time (using O()), and how to compute p(n) from the function p(n, k).

Answer. The base case for p(n, k) is p(n, 1) = 1 (since n = 1 + 1 + ··· + 1 can only be written one way). We will also need p(0, 0) = 1 for convenience. The update formula is

    p(n, k) = p(n, k − 1) + p(n − k, min(k, n − k)),

since n can either be written not using k (p(n, k − 1) ways) or using k (p(n − k, min(k, n − k)) ways). The reason for having min(k, n − k) instead of just k is that n − k cannot be written using numbers any larger than n − k. We can now compute p(n, k) using the following program (where N is the largest value of n that we are interested in).

    p(0, 0) = 1
    for n = 1 to N
        p(n, 1) = 1
    end for
    for n = 2 to N
        for k = 2 to n
            p(n, k) = p(n, k − 1) + p(n − k, min(k, n − k))
        end for
    end for

The cost of this algorithm is O(N^2). Finally, p(n) = p(n, n).
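
A direct Python transcription of this table-filling program, as a sketch: the function name `partition_count` and the list-of-lists table are assumptions, and entries the recurrence never reads are simply left at 0.

```python
def partition_count(N):
    """Number of ways to write N as a sum of positive integers (order ignored)."""
    # p[n][k] = number of ways to write n using summands <= k
    p = [[0] * (N + 1) for _ in range(N + 1)]
    p[0][0] = 1  # convenience value: the empty sum
    for n in range(1, N + 1):
        p[n][1] = 1  # only 1 + 1 + ... + 1
    for n in range(2, N + 1):
        for k in range(2, n + 1):
            # either no summand equals k, or one summand is k and the
            # remainder n - k uses summands <= min(k, n - k)
            p[n][k] = p[n][k - 1] + p[n - k][min(k, n - k)]
    return p[N][N]


p3 = partition_count(3)  # 3, matching the example 3, 2 + 1, 1 + 1 + 1
```

The two nested loops fill at most N^2 entries with constant work each, which is where the O(N^2) bound comes from.
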
