Medians and Order Statistics - Introduction to Algorithms - Lecture Slides, Slides of Computer Science

These are the Lecture Slides of Introduction to Algorithms which includes Expensive Operations, Sort Edges, Running Time, Upshot, Union, Makeset, Disjoint Set, Disjoint Set Union, Naïve Implementation etc. Key important points are: Medians and Order Statistics, Structures For Dynamic Sets, Radix Sort, Assumption, Input, Digits Ranging, Basic Idea, Digit Starting, Counting Sort, Bucket Sort

Typology: Slides

2012/2013

Uploaded on 03/23/2013

dhruv
Algorithms
Medians and Order Statistics
Structures for Dynamic Sets

Homework 3

● On the web shortly…

■ Due Wednesday at the beginning of class (test)

Review: Bucket Sort

● Bucket sort

■ Assumption: input is n reals from [0, 1)

■ Basic idea:

○ Create n linked lists (buckets) to divide the interval [0, 1) into subintervals of size 1/n

○ Add each input element to the appropriate bucket and sort the buckets with insertion sort

■ Uniform input distribution → expected O(1) bucket size

○ Therefore the expected total time is O(n)

■ These ideas will return when we study hash tables
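The bucket sort idea above can be sketched in a few lines of Python. This is a minimal sketch, not the course's reference code: the names `bucket_sort` and `insertion_sort` are hypothetical, Python lists stand in for the linked lists, and the input is assumed to be reals from [0, 1).

```python
def insertion_sort(bucket):
    """Sort a small list in place with insertion sort."""
    for j in range(1, len(bucket)):
        key = bucket[j]
        i = j - 1
        # Shift larger elements one slot right to make room for key.
        while i >= 0 and bucket[i] > key:
            bucket[i + 1] = bucket[i]
            i -= 1
        bucket[i + 1] = key


def bucket_sort(values):
    """Return a sorted copy of values, assumed to be reals in [0, 1)."""
    n = len(values)
    if n == 0:
        return []
    # Bucket b covers the subinterval [b/n, (b+1)/n).
    buckets = [[] for _ in range(n)]
    for v in values:
        index = min(int(n * v), n - 1)  # min() guards against rounding up to n
        buckets[index].append(v)
    result = []
    for bucket in buckets:
        insertion_sort(bucket)
        result.extend(bucket)
    return result
```

With uniformly distributed input each bucket is expected to hold O(1) elements, so the insertion sorts are cheap on average. A skewed input would put many elements in one bucket and lose the linear expected time.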

Review: Order Statistics

● The ith order statistic in a set of n elements is the ith smallest element

● The minimum is thus the 1st order statistic

● The maximum is (duh) the nth order statistic

● The median is the (n/2)th order statistic

■ If n is even, there are 2 medians

● Could calculate order statistics by sorting

■ Time: O(n lg n) w/ comparison sort

■ We can do better
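As a baseline, the sorting approach can be sketched as follows. The function names are hypothetical, and `i` is 1-indexed to match the slides; taking the lower of the two medians for even n is a choice made for this sketch.

```python
def order_statistic_by_sorting(values, i):
    """Return the i-th smallest element (1-indexed) in O(n lg n) time."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    return sorted(values)[i - 1]


def lower_median(values):
    """Return the lower median; for even n this is the smaller of the 2 medians."""
    return order_statistic_by_sorting(values, (len(values) + 1) // 2)
```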

Review: Randomized Selection

● Key idea: use partition() from quicksort

■ But, only need to examine one subarray

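■ This savings shows up in the running time: expected O(n)

[Figure: array A[p..r] after partitioning; elements ≤ A[q] lie to the left of position q, elements ≥ A[q] to the right]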

Review: Randomized Selection

RandomizedSelect(A, p, r, i)
    if (p == r) then return A[p];
    q = RandomizedPartition(A, p, r)
    k = q - p + 1;
    if (i == k) then return A[q];   // not in book
    if (i < k) then
        return RandomizedSelect(A, p, q-1, i);
    else
        return RandomizedSelect(A, q+1, r, i-k);

[Figure: A[p..r] partitioned around A[q]; the left part A[p..q], holding elements ≤ A[q], has k elements]
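The pseudocode above translates almost line for line into Python. This is a sketch under assumptions: the slides do not show RandomizedPartition, so the version here (a random pivot swapped to the end, followed by a Lomuto-style partition) is one common choice rather than necessarily the one used in the course. Array indices are 0-based; the rank `i` stays 1-based as in the slides.

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    s = random.randint(p, r)       # both endpoints are inclusive
    a[s], a[r] = a[r], a[s]        # move the pivot to the end
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a, p, r, i):
    """Return the i-th smallest element (1-indexed) of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1                  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)
```

Only one side of the partition is ever examined, which is where the savings over quicksort come from.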

Worst-Case Linear-Time Selection

● Randomized algorithm works well in practice

● What follows is a worst-case linear-time algorithm, really of theoretical interest only

● Basic idea:

■ Generate a good partitioning element

■ Call this element x

Worst-Case Linear-Time Selection

● The algorithm in words:

  1. Divide n elements into groups of 5
  2. Find median of each group ( How? How long? )
  3. Use Select() recursively to find median x of the n/5 medians
  4. Partition the n elements around x. Let k = rank( x )
  5. if (i == k) then return x
     if (i < k) then use Select() recursively to find the ith smallest element in the first partition
     else (i > k) use Select() recursively to find the (i-k)th smallest element in the last partition

Worst-Case Linear-Time Selection

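● At least half of the group medians are ≤ x, and each of those groups contributes at least 3 elements ≤ x, so for large enough n at least n/4 elements are ≤ x (and, symmetrically, at least n/4 are ≥ x)

● Thus after partitioning around x, step 5 will call Select() on at most 3n/4 elements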

● The recurrence is therefore:

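$$
\begin{aligned}
T(n) &\le T(\lfloor n/5 \rfloor) + T(3n/4) + \Theta(n) \\
     &\le T(n/5) + T(3n/4) + \Theta(n) && \lfloor n/5 \rfloor \le n/5 \\
     &\le cn/5 + 3cn/4 + \Theta(n) && \text{substitute } T(n) = cn \\
     &= 19cn/20 + \Theta(n) && \text{combine fractions} \\
     &= cn - \bigl(cn/20 - \Theta(n)\bigr) && \text{express in desired form} \\
     &\le cn \quad \text{if } c \text{ is big enough} && \text{what we set out to prove}
\end{aligned}
$$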

Worst-Case Linear-Time Selection

● Intuitively:

■ Work at each level is a constant fraction (at most 19/20) of the work at the level above

○ Geometric progression!

■ Thus the O(n) work at the root dominates
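The five steps of the algorithm can be sketched in Python as below. This is an illustrative sketch rather than the lecture's implementation: it copies sublists instead of partitioning in place, it uses a three-way split so that duplicates of x are handled, and the name `select` and its 1-indexed rank `i` are assumptions made here.

```python
def select(values, i):
    """Return the i-th smallest element (1-indexed) of a non-empty list."""
    if len(values) <= 5:
        return sorted(values)[i - 1]          # constant-size base case
    # Steps 1-2: groups of 5, median of each by sorting (O(1) per group).
    groups = [values[j:j + 5] for j in range(0, len(values), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 3: recursively find the median x of the group medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x (three-way, to cope with duplicates).
    lows = [v for v in values if v < x]
    highs = [v for v in values if v > x]
    equal = len(values) - len(lows) - len(highs)
    # Step 5: answer directly or recurse into exactly one side.
    if i <= len(lows):
        return select(lows, i)
    if i <= len(lows) + equal:
        return x
    return select(highs, i - len(lows) - equal)
```

The two recursive calls correspond to the T(n/5) and T(3n/4) terms of the recurrence; the list comprehensions account for the Θ(n) term.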

Linear-Time Median Selection

● A linear-time median algorithm gives a worst-case O(n lg n) quicksort

■ Find median x in O(n) time and partition around it

■ Recursively quicksort two halves

■ T(n) = 2T(n/2) + O(n) = O(n lg n)
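The structure of this quicksort can be sketched as follows. One loud assumption: `statistics.median_low` is used only as a stand-in for a linear-time median routine. It sorts internally, so this sketch shows the shape of the recursion and not the O(n lg n) worst-case bound itself. The function name is hypothetical.

```python
import statistics


def median_pivot_quicksort(values):
    """Return a sorted copy of values, always pivoting on the (lower) median."""
    if len(values) <= 1:
        return list(values)
    # Stand-in for an O(n) worst-case Select(); see the note above.
    x = statistics.median_low(values)
    lows = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    highs = [v for v in values if v > x]
    # Each side holds at most half of the elements, giving 2T(n/2) + O(n).
    return median_pivot_quicksort(lows) + equal + median_pivot_quicksort(highs)
```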

Structures…

● Done with sorting and order statistics for now

● Ahead of schedule, so…

● Next part of class will focus on data structures

● We will get a couple in before the first exam

■ Yes, these will be on this exam

Binary Search Trees

● Binary Search Trees (BSTs) are an important data structure for dynamic sets

● In addition to satellite data, elements have:

■ key : an identifying field inducing a total ordering

■ left : pointer to a left child (may be NULL)

■ right : pointer to a right child (may be NULL)

■ p : pointer to a parent node (NULL for root)
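A node with these fields might look like the following in Python. This is a minimal sketch: the class name `Node`, the `data` field for satellite data, and the helper `attach_left` are hypothetical, and `None` plays the role of NULL.

```python
class Node:
    """One BST element: a key, optional satellite data, and three pointers."""

    def __init__(self, key, data=None):
        self.key = key        # identifying field inducing a total ordering
        self.data = data      # satellite data
        self.left = None      # left child (None if absent)
        self.right = None     # right child (None if absent)
        self.p = None         # parent (None for the root)


def attach_left(parent, child):
    """Make child the left child of parent, keeping both pointers consistent."""
    parent.left = child
    child.p = parent
```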

Binary Search Trees

● BST property:

key[left(x)] ≤ key[x] ≤ key[right(x)]

● Example:

        F
      /   \
     B     H
    / \     \
   A   D     K
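The example tree can be built and checked in Python. This sketch makes two assumptions: that A and D are the children of B and that K is the right child of H (the only placement of those keys at that depth that satisfies the BST property), and that the property is meant to hold for every key in a node's subtrees, not only for its immediate children. All names are hypothetical, and parent pointers are omitted for brevity.

```python
class Node:
    """Minimal BST node for this check (no parent pointer)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node):
    """Return the keys of the subtree rooted at node in left-root-right order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def is_bst(node, lo=None, hi=None):
    """Check that every key in the subtree lies within [lo, hi]."""
    if node is None:
        return True
    if lo is not None and node.key < lo:
        return False
    if hi is not None and node.key > hi:
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)


example = Node("F",
               Node("B", Node("A"), Node("D")),
               Node("H", None, Node("K")))
print(inorder(example))   # ['A', 'B', 'D', 'F', 'H', 'K']
print(is_bst(example))    # True
```

An in-order walk of a tree satisfying the BST property visits the keys in sorted order, which gives a quick sanity check on the example.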