Algorithm Review: Medians, Order Stats, Radix & Bucket Sort, Selection, Dynamic Sets, Lecture notes of Algorithms and Programming

A collection of notes from a computer science course (CS 332) on algorithms. The notes cover medians and order statistics, radix sort, bucket sort, selection algorithms, and dynamic sets. They include review sections on radix sort and bucket sort, an explanation of the selection problem and its solutions, and a treatment of worst-case linear-time selection and linear-time median selection.

Typology: Lecture notes

2015/2016

Uploaded on 10/21/2016

David Luebke 1
10/21/16
CS 332: Algorithms
Medians and Order Statistics
Structures for Dynamic Sets

Homework 3

● On the web shortly…

■ Due Wednesday at the beginning of class (test)

Review: Bucket Sort

● Bucket sort
■ Assumption: input is n reals from [0, 1)
■ Basic idea:
○ Create n linked lists (buckets) to divide interval [0, 1) into subintervals of size 1/n
○ Add each input element to appropriate bucket and sort buckets with insertion sort
■ Uniform input distribution ⇒ O(1) bucket size
○ Therefore the expected total time is O(n)
■ These ideas will return when we study hash tables
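The bucket sort idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python lists stand in for the linked lists mentioned in the notes, and the names `bucket_sort` and `insertion_sort` are hypothetical helpers, not taken from the course.

```python
def insertion_sort(items):
    """Sort a list in place with insertion sort (cheap for O(1)-size buckets)."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key


def bucket_sort(values):
    """Return a sorted copy of values, assumed to be floats in [0, 1)."""
    n = len(values)
    if n == 0:
        return []
    # Assumption: Python lists play the role of the n linked lists.
    buckets = [[] for _ in range(n)]
    for v in values:
        # Bucket b covers [b/n, (b+1)/n); min() guards against rounding up to n.
        buckets[min(int(n * v), n - 1)].append(v)
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    return result
```

With uniformly distributed input each bucket is expected to hold a constant number of elements, so the insertion sorts stay cheap. Heavily skewed input would pile elements into a few buckets and lose the linear-time behavior.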

Review: Order Statistics

● The ith order statistic in a set of n elements is the ith smallest element
● The minimum is thus the 1st order statistic
● The maximum is (duh) the nth order statistic
● The median is the middle order statistic: i = (n+1)/2 when n is odd
■ If n is even, there are 2 medians (i = n/2 and i = n/2 + 1)
● Could calculate order statistics by sorting
■ Time: O(n lg n) w/ comparison sort
■ We can do better
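As a baseline, the sort-based approach might look like the sketch below. The function names are hypothetical and chosen for illustration. Ranks are 1-indexed to match the definition of the ith order statistic.

```python
def order_statistic_by_sorting(values, i):
    """Return the i-th smallest element (1-indexed) in O(n lg n) time."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    return sorted(values)[i - 1]


def medians(values):
    """Return (lower median, upper median); the two coincide when n is odd."""
    n = len(values)
    lower = order_statistic_by_sorting(values, (n + 1) // 2)
    upper = order_statistic_by_sorting(values, (n + 2) // 2)
    return lower, upper
```

Sorting does more work than needed, since it orders every element just to read one position. The selection algorithms in these notes avoid that.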

Review: Randomized Selection

● Key idea: use partition() from quicksort

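■ But, only need to examine one subarray
■ This savings shows up in running time: expected O(n)

[Figure: subarray A[p..r] partitioned at index q, with elements ≤ A[q] to the left of q and elements ≥ A[q] to the right]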

Review: Randomized Selection

    RandomizedSelect(A, p, r, i)
        if (p == r) then return A[p];
        q = RandomizedPartition(A, p, r)
        k = q - p + 1;
        if (i == k) then return A[q];    // not in book
        if (i < k) then
            return RandomizedSelect(A, p, q-1, i);
        else
            return RandomizedSelect(A, q+1, r, i-k);

[Figure: subarray A[p..r] partitioned at index q; the k = q - p + 1 elements A[p..q] are ≤ A[q], and the elements to the right of q are ≥ A[q]]
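A runnable version of this pseudocode could look like the following Python sketch. It assumes 0-based inclusive indices p and r, a 1-based rank i within A[p..r], and a Lomuto-style partition with a uniformly random pivot. The notes do not say which partition scheme RandomizedPartition uses, so that choice is an assumption.

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    pivot_index = random.randint(p, r)      # inclusive on both ends
    a[pivot_index], a[r] = a[r], a[pivot_index]
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a, p, r, i):
    """Return the i-th smallest element (1-indexed) of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1                           # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)
```

A call such as `randomized_select(list(data), 0, len(data) - 1, i)` works on a copy, which matters because the partitioning reorders the array it is given.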

Worst-Case Linear-Time Selection

● Randomized algorithm works well in practice

● What follows is a worst-case linear-time algorithm, really of theoretical interest only

● Basic idea:

■ Generate a good partitioning element
■ Call this element x

Worst-Case Linear-Time Selection

● The algorithm in words:

  1. Divide n elements into groups of 5
  2. Find median of each group (How? How long?)
  3. Use Select() recursively to find median x of the ⌊n/5⌋ medians
  4. Partition the n elements around x. Let k = rank(x)
  5. if (i == k) then return x
     if (i < k) then use Select() recursively to find ith smallest element in first partition
     else (i > k) use Select() recursively to find (i-k)th smallest element in last partition
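The five steps can be sketched in Python as follows. This is a minimal illustration, not the course's implementation. It makes several assumptions: only the full groups of 5 contribute medians, each group median is found by sorting the group (constant time per group), inputs of 5 or fewer elements are sorted directly, and a three-way split handles duplicate keys. It also builds new lists rather than partitioning in place.

```python
def select(values, i):
    """Return the i-th smallest element (1-indexed) in worst-case O(n) time."""
    if len(values) <= 5:
        return sorted(values)[i - 1]        # small input: sort directly

    # Steps 1-2: split into full groups of 5 and take each group's median.
    full_groups = len(values) // 5
    medians = [sorted(values[5 * j:5 * j + 5])[2] for j in range(full_groups)]

    # Step 3: recursively find the median x of the group medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x (three-way, so duplicates of x are handled).
    smaller = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    larger = [v for v in values if v > x]

    # Step 5: return x, or recurse into the one side that holds rank i.
    if i <= len(smaller):
        return select(smaller, i)
    if i <= len(smaller) + len(equal):
        return x
    return select(larger, i - len(smaller) - len(equal))
```

Sorting a 5-element group is one way to answer the "How? How long?" question in step 2: each group costs O(1), so all the group medians together cost O(n).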

Worst-Case Linear-Time Selection

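● At least half of the ⌊n/5⌋ group medians are ≤ x, and each of those medians has two more elements of its own group that are ≤ it, so at least 3⌊n/10⌋ elements are ≤ x; for large enough n this is at least n/4 (and symmetrically for elements ≥ x)

● Thus after partitioning around x, step 5 will call Select() on at most 3n/4 elements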

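● The recurrence is therefore:

$$
\begin{aligned}
T(n) &\le T(\lfloor n/5 \rfloor) + T(3n/4) + \Theta(n) \\
     &\le T(n/5) + T(3n/4) + \Theta(n) && \lfloor n/5 \rfloor \le n/5 \\
     &\le cn/5 + 3cn/4 + \Theta(n) && \text{Substitute } T(n) = cn \\
     &= 19cn/20 + \Theta(n) && \text{Combine fractions} \\
     &= cn - \left(cn/20 - \Theta(n)\right) && \text{Express in desired form} \\
     &\le cn \quad \text{if } c \text{ is big enough} && \text{What we set out to prove}
\end{aligned}
$$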

Worst-Case Linear-Time Selection

● Intuitively:

■ Work at each level is a constant fraction (19/20) of the work at the level above
○ Geometric progression!
■ Thus the O(n) work at the root dominates
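To make the geometric progression concrete, suppose the root does at most an work for some constant a (a hypothetical name introduced here), and each level does at most 19/20 of the work of the level above it. Summing over all levels, even over infinitely many, gives:

```latex
\sum_{k=0}^{\infty} \left(\frac{19}{20}\right)^{k} a n
  \;=\; \frac{a n}{1 - 19/20}
  \;=\; 20\,a n
  \;=\; O(n)
```

So the total is within a constant factor (here 20) of the work done at the root alone.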

Linear-Time Median Selection

● Worst-case O(n lg n) quicksort

■ Find median x and partition around it
■ Recursively quicksort two halves
■ T(n) = 2T(n/2) + O(n) = O(n lg n)
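A sketch of quicksort with a median pivot is given below. To stay self-contained it carries its own compact linear-time `select` helper, written under the assumption that only full groups of 5 contribute medians. Both function names are hypothetical, and the version shown returns a new list rather than sorting in place.

```python
def select(values, i):
    """Worst-case linear-time selection of the i-th smallest (1-indexed)."""
    if len(values) <= 5:
        return sorted(values)[i - 1]
    medians = [sorted(values[5 * j:5 * j + 5])[2] for j in range(len(values) // 5)]
    x = select(medians, (len(medians) + 1) // 2)
    smaller = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    if i <= len(smaller):
        return select(smaller, i)
    if i <= len(smaller) + len(equal):
        return x
    larger = [v for v in values if v > x]
    return select(larger, i - len(smaller) - len(equal))


def median_quicksort(values):
    """Quicksort that always pivots on the median, so each side has <= n/2 items."""
    if len(values) <= 1:
        return list(values)
    x = select(values, (len(values) + 1) // 2)      # lower median as pivot
    smaller = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    larger = [v for v in values if v > x]
    return median_quicksort(smaller) + equal + median_quicksort(larger)
```

Because the pivot is the true median, neither recursive call receives more than half of the elements, which is what yields the T(n) = 2T(n/2) + O(n) recurrence.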

Structures…

● Done with sorting and order statistics for now

● Ahead of schedule, so…

● Next part of class will focus on data structures

● We will get a couple in before the first exam

■ Yes, these will be on this exam

Binary Search Trees

● Binary Search Trees (BSTs) are an important data structure for dynamic sets

● In addition to satellite data, elements have:

■ key: an identifying field inducing a total ordering
■ left: pointer to a left child (may be NULL)
■ right: pointer to a right child (may be NULL)
■ p: pointer to a parent node (NULL for root)
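The four fields map directly onto a small node class. The sketch below uses Python's `None` in place of NULL. The `data` attribute for satellite data and the `set_left` / `set_right` helpers are hypothetical names introduced only for illustration.

```python
class Node:
    """A BST element: a key, optional satellite data, and three pointers."""

    def __init__(self, key, data=None):
        self.key = key          # identifying field; keys must be comparable
        self.data = data        # satellite data (hypothetical attribute name)
        self.left = None        # left child, or None
        self.right = None       # right child, or None
        self.p = None           # parent, or None for the root


def set_left(parent, child):
    """Attach child as parent's left child, keeping the parent pointer in sync."""
    parent.left = child
    if child is not None:
        child.p = parent


def set_right(parent, child):
    """Attach child as parent's right child, keeping the parent pointer in sync."""
    parent.right = child
    if child is not None:
        child.p = parent
```

Updating `p` in the same helper that sets `left` or `right` is one way to keep child and parent pointers from drifting out of agreement.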

Binary Search Trees

● BST property:

key[left(x)] ≤ key[x] ≤ key[right(x)]

● Example:

        F
       / \
      B   H
     / \   \
    A   D   K
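The BST property can be checked mechanically. The sketch below assumes the example lists its keys level by level, so F is the root, B and H are its children, A and D are the children of B, and K is the right child of H. The `Node` class and function names are hypothetical. The check applies the property to whole subtrees by passing down bounds, since comparing each node only with its immediate children would miss violations deeper in the tree.

```python
class Node:
    """Minimal node for the check: a key plus left and right children."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """True if every key in the subtree lies within [low, high] as required."""
    if node is None:
        return True
    if low is not None and node.key < low:
        return False
    if high is not None and node.key > high:
        return False
    # Left subtree keys must be <= node.key; right subtree keys must be >= it.
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


def inorder(node):
    """Return the keys in sorted order by visiting left, node, right."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# The example tree, under the level-order assumption stated above.
example = Node("F",
               Node("B", Node("A"), Node("D")),
               Node("H", None, Node("K")))
```

An in-order walk of a valid BST visits the keys in nondecreasing order, which gives a second way to sanity-check a tree.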