Linear Time Sorting - Introduction to Algorithms - Lecture Slides, Slides of Computer Science

These are the lecture slides of Introduction to Algorithms, which include Expensive Operations, Sort Edges, Running Time, Upshot, Union, Makeset, Disjoint Set, Disjoint Set Union, Naïve Implementation, etc.

Typology: Slides

2012/2013

Uploaded on 03/23/2013

dhruv


Algorithms

Linear-Time Sorting Continued

Medians and Order Statistics

Review: Comparison Sorts

● Comparison sorts: Ω(n lg n) comparisons in the worst case, so O(n lg n) is the best achievable

■ Model sort with decision tree

■ Path down tree = execution trace of algorithm

■ Leaves of tree = possible permutations of input

■ Tree must have at least n! leaves, so height is Ω(n lg n)
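The height bound follows from counting leaves. A short derivation, using only the fact that a binary tree of height h has at most 2^h leaves and that the largest n/2 factors of n! are each at least n/2:

\[
2^h \ge n! \;\Longrightarrow\; h \ge \lg(n!) \ge \lg\!\left(\frac{n}{2}\right)^{n/2} = \frac{n}{2}\,\lg\frac{n}{2} = \Omega(n \lg n)
\]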

Review: Counting Sort

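CountingSort(A, B, k)          // A[1..n]: input, keys in 1..k; B[1..n]: output
    for i=1 to k
        C[i] = 0;
    for j=1 to n
        C[A[j]] += 1;          // C[i] = number of keys equal to i
    for i=2 to k
        C[i] = C[i] + C[i-1];  // C[i] = number of keys ≤ i
    for j=n downto 1           // right to left keeps the sort stable
        B[C[A[j]]] = A[j];
        C[A[j]] -= 1;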
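As a runnable counterpart to the pseudocode above, here is a minimal Python sketch. It assumes integer keys in 1..k as on the slide, returns a new list instead of filling a caller-supplied B, and uses 0-based list indices; the name `counting_sort` is chosen for illustration.

```python
def counting_sort(a, k):
    """Stably sort a list of integers drawn from 1..k (assumed range)."""
    count = [0] * (k + 1)            # count[v] = occurrences of v; index 0 unused
    for v in a:
        count[v] += 1
    for v in range(2, k + 1):        # prefix sums: count[v] = number of keys <= v
        count[v] += count[v - 1]
    out = [None] * len(a)
    for v in reversed(a):            # right to left keeps equal keys in input order
        out[count[v] - 1] = v        # count[v] is a 1-based position; shift to 0-based
        count[v] -= 1
    return out
```

For example, `counting_sort([4, 1, 3, 4, 2], 4)` returns `[1, 2, 3, 4, 4]`.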

Review: Radix Sort

● How did IBM get rich originally?

● Answer: punched card readers for census tabulation in the early 1900s.

■ In particular, a card sorter that could sort cards into different bins

○ Each column can be punched in 12 places

○ Decimal digits use 10 places

■ Problem: only one column can be sorted on at a time

Radix Sort

● Can we prove it will work?

● Sketch of an inductive argument (induction on the number of passes):

■ Assume lower-order digits {j: j < i} are sorted

■ Show that sorting next digit i leaves array correctly sorted

○ If two digits at position i are different, ordering numbers by that digit is correct (lower-order digits irrelevant)

○ If they are the same, numbers are already sorted on the lower-order digits. Since we use a stable sort, the numbers stay in the right order

Radix Sort

● What sort will we use to sort on digits?

● Counting sort is obvious choice:

■ Sort n numbers on digits that range from 1..k

■ Time: O(n + k)

● Each pass over n numbers with d digits takes time O(n + k), so total time O(dn + dk)

■ When d is constant and k = O(n), takes O(n) time

● How many bits in a computer word?
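A minimal sketch of radix sort built on a stable counting pass per digit, least significant digit first. It assumes non-negative integers and treats each digit as a value in 0..base-1 (so k corresponds to `base`); the function name and the `base` parameter are illustrative choices, not taken from the slides.

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers (assumed input domain)."""
    out = list(nums)
    if not out:
        return out
    largest = max(out)
    place = 1                                  # 1, base, base^2, ...: current digit weight
    while place <= largest:                    # one pass per digit of the largest key
        count = [0] * base
        for x in out:
            count[(x // place) % base] += 1
        for d in range(1, base):               # prefix sums: count[d] = keys with digit <= d
            count[d] += count[d - 1]
        nxt = [0] * len(out)
        for x in reversed(out):                # right to left keeps each pass stable
            d = (x // place) % base
            count[d] -= 1
            nxt[count[d]] = x
        out = nxt
        place *= base
    return out
```

Each pass costs O(n + base) and there are d passes, which matches the O(dn + dk) total above. Choosing a larger base trades fewer passes for a larger count array.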

Radix Sort

● In general, radix sort based on counting sort is

■ Fast

■ Asymptotically fast (i.e., O(n))

■ Simple to code

■ A good choice

● To think about: Can radix sort be used on floating-point numbers?

Summary: Radix Sort

● Radix sort:

■ Assumption: input has d digits ranging from 0 to k

■ Basic idea:

○ Sort elements by digit starting with least significant

○ Use a stable sort (like counting sort) for each stage

■ Each pass over n numbers with d digits takes time O(n + k), so total time O(dn + dk)

○ When d is constant and k = O(n), takes O(n) time

■ Fast! Stable! Simple!

■ Doesn’t sort in place

Order Statistics

● The ith order statistic in a set of n elements is the ith smallest element

● The minimum is thus the 1st order statistic

● The maximum is (duh) the nth order statistic

● The median is the (n+1)/2 th order statistic when n is odd

■ If n is even, there are 2 medians: the n/2 th and the (n/2 + 1)th order statistics

● How can we calculate order statistics?

● What is the running time?

Order Statistics

● How many comparisons are needed to find the minimum element in a set? The maximum?

● Can we find the minimum and maximum with less than twice the cost?

● Yes:

■ Walk through elements by pairs

○ Compare each element in pair to the other

○ Compare the larger to the maximum, the smaller to the minimum

■ Total cost: 3 comparisons per 2 elements, about 3n/2 comparisons instead of 2n − 2
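The pairwise walk can be sketched as follows. This is an illustrative Python version, assuming a non-empty list; the function name `min_max` and the returned comparison counter are additions made here to make the 3-per-pair cost visible.

```python
def min_max(items):
    """Return (minimum, maximum, comparisons) using 3 comparisons per pair."""
    n = len(items)
    if n == 0:
        raise ValueError("min_max() needs at least one element")
    comparisons = 0
    if n % 2:                                  # odd n: first element seeds both extremes
        lo = hi = items[0]
        start = 1
    else:                                      # even n: first pair seeds them, 1 comparison
        comparisons += 1
        if items[0] < items[1]:
            lo, hi = items[0], items[1]
        else:
            lo, hi = items[1], items[0]
        start = 2
    for i in range(start, n, 2):
        a, b = items[i], items[i + 1]
        comparisons += 3                       # pair, smaller vs lo, larger vs hi
        if a < b:
            small, large = a, b
        else:
            small, large = b, a
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

On six elements this performs 7 comparisons (3n/2 − 2), against 10 for two separate scans.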

Randomized Selection

● Key idea: use partition() from quicksort

■ But, only need to examine one subarray

■ This savings shows up in running time: expected O(n)

● We will again use a slightly different partition than the book:

q = RandomizedPartition(A, p, r)

[Figure: A[p..r] after partitioning. Elements in A[p..q-1] are ≤ A[q]; elements in A[q+1..r] are ≥ A[q].]

Randomized Selection

RandomizedSelect(A, p, r, i)
    if (p == r) then return A[p];
    q = RandomizedPartition(A, p, r)
    k = q - p + 1;
    if (i == k) then return A[q];   // not in book
    if (i < k) then
        return RandomizedSelect(A, p, q-1, i);
    else
        return RandomizedSelect(A, q+1, r, i-k);

[Figure: A[p..r] with pivot at q. The k = q - p + 1 elements A[p..q] are ≤ A[q]; elements in A[q+1..r] are ≥ A[q].]
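The pseudocode above translates fairly directly to Python. The slides do not spell out their partition routine, so this sketch assumes a Lomuto-style partition around a randomly chosen pivot that ends in its final position q; indices p and r are 0-based and inclusive, while the rank i stays 1-based as in the pseudocode.

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] in place around a random pivot; return the pivot's index."""
    s = random.randint(p, r)                   # inclusive on both ends
    a[s], a[r] = a[r], a[s]                    # move the pivot to the end
    pivot = a[r]
    i = p - 1                                  # a[p..i] holds elements <= pivot
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]            # pivot lands between the two sides
    return i + 1


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1                              # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)
```

Because only one side is recursed on, the function could also be written as a loop; the recursive form is kept here to mirror the slide.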

Randomized Selection

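● Average case

■ For upper bound, assume ith element always falls in larger side of partition:

\[
\begin{aligned}
T(n) &\le \frac{1}{n}\sum_{k=0}^{n-1} T\bigl(\max(k,\; n-k-1)\bigr) + \Theta(n) \\
     &\le \frac{2}{n}\sum_{k=n/2}^{n-1} T(k) + \Theta(n)
\end{aligned}
\]

What happened here? As k runs from 0 to n − 1, max(k, n − k − 1) takes each value from n/2 to n − 1 at most twice, so the sum folds onto its upper half.

■ Let’s show that T(n) = O(n) by substitution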

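Randomized Selection

● Assume T(n) ≤ cn for sufficiently large c:

\[
\begin{aligned}
T(n) &\le \frac{2}{n}\sum_{k=n/2}^{n-1} T(k) + \Theta(n) && \text{The recurrence we started with} \\
     &\le \frac{2}{n}\sum_{k=n/2}^{n-1} ck + \Theta(n) && \text{Substitute } T(n) \le cn \text{ for } T(k) \\
     &= \frac{2c}{n}\left(\sum_{k=1}^{n-1} k - \sum_{k=1}^{n/2-1} k\right) + \Theta(n) && \text{“Split” the recurrence} \\
     &= \frac{2c}{n}\left(\frac{1}{2}(n-1)\,n - \frac{1}{2}\left(\frac{n}{2}-1\right)\frac{n}{2}\right) + \Theta(n) && \text{Expand arithmetic series} \\
     &= c(n-1) - \frac{c}{2}\left(\frac{n}{2}-1\right) + \Theta(n) && \text{Multiply it out}
\end{aligned}
\]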
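The preview stops at the multiplied-out line. As a sketch of how the substitution argument is usually completed from that point (this step is not part of the text above), the remaining algebra is:

\[
\begin{aligned}
c(n-1) - \frac{c}{2}\left(\frac{n}{2}-1\right) + \Theta(n)
  &= cn - c - \frac{cn}{4} + \frac{c}{2} + \Theta(n) \\
  &= cn - \left(\frac{cn}{4} + \frac{c}{2} - \Theta(n)\right) \\
  &\le cn \quad \text{provided } c \text{ is large enough that } \frac{cn}{4} \text{ dominates the } \Theta(n) \text{ term.}
\end{aligned}
\]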