Linear Time Selection/Median | Exams Computer Programming

601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz

Topic: Linear time selection/median Date: 9/9/21

4.1 Introduction and Problem Definition

We saw last lecture a way to sort in time O(n log n): Randomized Quicksort. There are also other

sorting algorithms with similar time bounds, most notably Mergesort and Heapsort (you should all

know both of these already). In this lecture we will discuss a related problem with some surprisingly

efficient algorithms: median-finding, or more generally, selection.

The median problem is the following: given an unsorted array, find and return the median element.

In other words, given an array of length n, find and return the (n/2)th smallest element. The

selection problem is only slightly more general: given an array of length n and a value k ≤ n, find

and return the kth smallest element. From now on we’ll mostly talk about selection.

It is obvious that selection can be done in time O(n log n): we can sort the array (using, e.g.,

mergesort), and then return the kth smallest element. Can we do any better?
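
As a baseline, the sort-then-index approach can be sketched as follows. This is only an illustration: the function name select_by_sorting is hypothetical, k is treated as 1-indexed as in the text, and Python's built-in sorted (which is not mergesort, but also runs in O(n log n) time) stands in for the sorting step.

```python
def select_by_sorting(values, k):
    """Return the k-th smallest element of values (k is 1-indexed).

    Baseline sketch: sort a copy in O(n log n) time, then index into it.
    The name and the 1-indexed convention are assumptions for illustration.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    # sorted() returns a new list, so the input array is left unmodified.
    return sorted(values)[k - 1]
```

For example, select_by_sorting([7, 2, 9, 4, 1], 3) sorts the input to [1, 2, 4, 7, 9] and returns 4.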

It turns out that the answer is yes! We can do selection in O(n) time, both randomized (worst-case

expected time) and deterministic.

There are a few easy cases, which we can do to warm up. For example, suppose k = 1. Then

we are trying to find the smallest element, which we can do by simply scanning the array in O(n)

time and keeping track of the smallest. Similarly, if k = n, a simple scan also suffices. In general,

this strategy works whenever k = O(1) or k = n − O(1), since we can just keep track of the k

smallest/largest elements we see while we do a scan.
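
A minimal sketch of this scan for small k, assuming a 1-indexed k and a hypothetical function name select_small_k. It keeps the k smallest elements seen so far in a sorted list; each insertion costs O(k), so the whole scan takes O(nk) time, which is O(n) when k = O(1). The case k = n − O(1) is symmetric (track the largest elements instead) and is not shown.

```python
import bisect


def select_small_k(values, k):
    """Return the k-th smallest element of values (k is 1-indexed).

    Single left-to-right scan that maintains the k smallest elements seen
    so far in a sorted list. Intended for k = O(1); names are illustrative.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    smallest = []  # sorted ascending, never holds more than k elements
    for x in values:
        if len(smallest) < k:
            # Still filling up: every element is among the k smallest so far.
            bisect.insort(smallest, x)
        elif x < smallest[-1]:
            # x displaces the current k-th smallest element.
            bisect.insort(smallest, x)
            smallest.pop()
    # The largest of the k smallest elements is the k-th smallest overall.
    return smallest[-1]
```

With k = 1 the list holds a single element and the loop reduces to the usual "keep track of the minimum" scan.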

This doesn’t work for k = n/2, though. If we kept track of the k smallest elements, then when

considering a new element in the scan we would have to figure out its place in the smallest k, which

takes time Θ(log k) = Θ(log n) (upper bound via binary search, lower bound something we’ll see

next week). So the total time would be Θ(n log k) = Θ(n log n).

4.2 Randomized Quickselect

The idea here is to use randomized quicksort, but instead of recursing on both sides we only recurse

on the side which has the desired element. Slightly more formally, suppose we are given an array

A of length n and an integer k ≤ n. Then Randomized Quickselect does the following:

1. If n = 1, return the element.

2. Pick a pivot element p uniformly at random from A.

3. Compare each element of A to p, creating subarrays L of elements less than p and G of

elements greater than p.

4. (a) If |L| = k − 1, then return p.
