

An analysis of the merge sort algorithm, focusing on the running time of the merge procedure and the use of a recurrence relation to describe the worst-case running time of mergesort. It also discusses the importance of divide-and-conquer design technique and mathematical techniques for solving recurrences.
Typology: Study notes


First let us consider the running time of the procedure Merge(A, p, q, r). Let n = r - p + 1 denote the total length of the left and right sub-arrays combined. What is the running time of Merge as a function of n? The algorithm contains four loops, none nested inside another. It is easy to see that each loop iterates at most n times. (If you are a bit more careful you can see that all the while-loops together iterate exactly n times in total, because each iteration copies one new element to the array B, and B only has space for n elements.) Thus the running time to merge n items is Θ(n). Let us write this without the asymptotic notation, simply as n. (We’ll see later why we do this.)
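The counting argument above can be checked with a small sketch. The notes do not show the body of Merge, so the version below is an assumption: three while-loops that copy into a temporary array B, followed by one for-loop that copies B back into A, using 0-based inclusive indices. The `copies` counter is added purely for illustration.

```python
def merge(A, p, q, r):
    """Merge the sorted runs A[p..q] and A[q+1..r] (inclusive, 0-based) in place.

    Returns how many elements were copied into the temporary array B.
    This is an assumed layout of the four loops, not the notes' own code.
    """
    n = r - p + 1
    B = [None] * n          # temporary array with room for exactly n items
    i, j, k = p, q + 1, 0
    copies = 0

    # Loop 1: both runs still have elements; copy the smaller head.
    while i <= q and j <= r:
        if A[i] <= A[j]:
            B[k] = A[i]
            i += 1
        else:
            B[k] = A[j]
            j += 1
        k += 1
        copies += 1

    # Loop 2: leftovers of the left run.
    while i <= q:
        B[k] = A[i]
        i += 1
        k += 1
        copies += 1

    # Loop 3: leftovers of the right run.
    while j <= r:
        B[k] = A[j]
        j += 1
        k += 1
        copies += 1

    # Loop 4: copy the merged result back into A.
    for k in range(n):
        A[p + k] = B[k]

    return copies


A = [2, 5, 8, 1, 3, 9, 10]
copies = merge(A, 0, 2, 6)
print(copies, A)  # 7 [1, 2, 3, 5, 8, 9, 10]
```

Here n = 7 and the three while-loops together perform exactly 7 copies, which is the Θ(n) bound in miniature.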
Now, how do we describe the running time of the entire MergeSort algorithm? We will do this through the use of a recurrence, that is, a function that is defined recursively in terms of itself. To avoid circularity, the recurrence for a given value of n is defined in terms of values that are strictly smaller than n. Finally, a recurrence has some basis values (e.g. for n = 1), which are defined explicitly.
Let T(n) denote the worst-case running time of MergeSort on an array of length n. If we call MergeSort with an array containing a single item (n = 1), then the running time is constant. We can just write T(1) = 1, ignoring all constants. For n > 1, MergeSort splits the array into two halves, sorts each half, and then merges them together. The left half has size ⌈n/2⌉ and the right half has size ⌊n/2⌋. How long does it take to sort the elements in a sub-array of size ⌈n/2⌉? We do not know this, but because ⌈n/2⌉ < n for n > 1, we can express it as T(⌈n/2⌉). Similarly, the time taken to sort the right sub-array is expressed as T(⌊n/2⌋). Merging the two sorted halves takes n time, as shown above. In conclusion we have

T(n) = 1                               if n = 1,
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + n         otherwise.
This is called a recurrence relation, i.e., a recursively defined function. Divide-and-conquer is an important design technique, and it naturally gives rise to recursive algorithms. It is thus important to develop mathematical techniques for solving recurrences, either exactly or asymptotically. Let’s expand the terms.
T(1) = 1
T(2) = T(1) + T(1) + 2 = 1 + 1 + 2 = 4
T(3) = T(2) + T(1) + 3 = 4 + 1 + 3 = 8
T(4) = T(2) + T(2) + 4 = 4 + 4 + 4 = 12
T(5) = T(3) + T(2) + 5 = 8 + 4 + 5 = 17
...
T(8) = T(4) + T(4) + 8 = 12 + 12 + 8 = 32
...
T(16) = T(8) + T(8) + 16 = 32 + 32 + 16 = 80
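The expansion can also be reproduced mechanically. The sketch below evaluates the recurrence directly, with T(1) = 1 and the merge cost written as n, exactly as in the text; the memoization via `lru_cache` is only a convenience and not part of the analysis.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """Worst-case cost of MergeSort on n items, per the recurrence above."""
    if n == 1:
        return 1
    left = (n + 1) // 2   # ceil(n/2), size of the left half
    right = n // 2        # floor(n/2), size of the right half
    return T(left) + T(right) + n


print([T(n) for n in (1, 2, 3, 4, 5, 8, 16)])
# [1, 4, 8, 12, 17, 32, 80]
```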
What is the pattern here? Let’s consider the ratios T(n)/n for powers of 2:
T(1)/1 = 1
T(2)/2 = 2
T(4)/4 = 3
T(8)/8 = 4
T(16)/16 = 5
T(32)/32 = 6
This suggests that, when n is a power of 2, T(n)/n = log₂ n + 1, that is, T(n) = n log₂ n + n, which is Θ(n log n) (using the limit rule).
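As a sanity check on the guessed formula, the following sketch compares the recurrence with n log n + n (logarithm base 2) for the powers of 2 up to 1024. The helper name `closed_form` is hypothetical, and a finite check like this supports the pattern but does not prove it.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """The MergeSort recurrence: T(1) = 1, T(n) = T(ceil(n/2)) + T(floor(n/2)) + n."""
    if n == 1:
        return 1
    return T((n + 1) // 2) + T(n // 2) + n


def closed_form(n):
    """n*log2(n) + n, computed in integer arithmetic for n a power of 2."""
    k = n.bit_length() - 1            # k = log2(n) when n is a power of 2
    assert n == 1 << k, "n must be a power of 2"
    return n * k + n


for k in range(11):                   # n = 1, 2, 4, ..., 1024
    n = 2 ** k
    assert T(n) == closed_form(n)

print(T(32), closed_form(32))  # 192 192
```

For values of n that are not powers of 2 the formula is no longer exact (for example T(3) = 8, while 3 log 3 + 3 is about 7.75), but the growth rate is still Θ(n log n).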