Linear Sorting Review and Direct Access Array Sort, Summaries of Introduction to Computers

These notes review the lower bound for comparison search and how random access indexing can beat it. They introduce the direct access array and show how it can be used for sorting, present the direct access array sort algorithm and its time complexity, and discuss how keys from a larger range can be represented as tuples and sorted. Useful for students studying algorithms and data structures.

Typology: Summaries

2021/2022

Available from 04/06/2023

praveen-kumar-336
Lecture 5: Linear Sorting

Review

  • Comparison search lower bound: any decision tree with n nodes has height ≥ ⌈lg(n+1)⌉ − 1
  • Can do faster using random access indexing: an operation with linear branching factor!
  • Direct access array is fast, but may use a lot of space (Θ(u))
  • Solve space problem by mapping (hashing) key space u down to m = Θ(n)
  • Hash tables give expected O(1) time operations, amortized if dynamic
  • Expectation input-independent: choose hash function randomly from universal hash family
  • Data structure overview!
  • Last time we achieved faster find. Can we also achieve faster sort?

Operations O(·); (e) = expected, (a) = amortized

| Data Structure | Container: build(X) | Static: find(k) | Dynamic: insert(x), delete(k) | Order: find_min(), find_max() | Order: find_prev(k), find_next(k) |
|---|---|---|---|---|---|
| Array | n | n | n | n | n |
| Sorted Array | n log n | log n | n | 1 | log n |
| Direct Access Array | u | 1 | 1 | u | u |
| Hash Table | n(e) | 1(e) | 1(a)(e) | n | n |

Comparison Sort Lower Bound

  • Comparison model implies that algorithm decision tree is binary (constant branching factor)
  • Requires # leaves L ≥ # possible outputs
  • Tree height lower bounded by Ω(log L), so worst-case running time is Ω(log L)
  • To sort array of n elements, # outputs is n! permutations
  • Thus height lower bounded by log(n!) ≥ log((n/2)^(n/2)) = Ω(n log n)
  • So merge sort is optimal in comparison model
  • Can we exploit a direct access array to sort faster?
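
The bound log(n!) = Ω(n log n) used above can be spelled out in two steps. This is a short worked derivation added for clarity, assuming n ≥ 2: keep only the largest factors of n!, each of which is at least n/2, and there are at least n/2 of them.

```latex
\begin{align*}
n! &= \prod_{i=1}^{n} i
     \;\ge\; \prod_{i=\lceil n/2 \rceil}^{n} i
     \;\ge\; \left(\frac{n}{2}\right)^{n/2}, \\
\log(n!) &\ge \frac{n}{2}\,\log\frac{n}{2}
     \;=\; \frac{n}{2}\log n - \frac{n}{2}\log 2
     \;=\; \Omega(n \log n).
\end{align*}
```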

Direct Access Array Sort

  • Example: [5, 2, 7, 0, 4]
  • Suppose all keys are unique non-negative integers in range {0, ..., u − 1}, so n ≤ u
  • Insert each item into a direct access array with size u in Θ(n)
  • Return items in order they appear in direct access array in Θ(u)
  • Running time is Θ(u), which is Θ(n) if u = Θ(n). Yay!

    def direct_access_sort(A):
        "Sort A assuming items have distinct non-negative keys"
        u = 1 + max([x.key for x in A])     # O(n) find maximum key
        D = [None] * u                      # O(u) direct access array
        for x in A:                         # O(n) insert items
            D[x.key] = x
        i = 0
        for key in range(u):                # O(u) read out items in order
            if D[key] is not None:
                A[i] = D[key]
                i += 1

  • What if keys are in a larger range, where u is no longer Θ(n) but still u < n^2?
  • Idea! Represent each key k by tuple (a, b) where k = an + b and 0 ≤ b < n
  • Specifically a = ⌊k/n⌋ < n and b = (k mod n) (just a 2-digit base-n number!)
  • This is a built-in Python operation (a, b) = divmod(k, n)
  • Example (n = 5): [17, 3, 24, 22, 12] ⇒ [(3,2), (0,3), (4,4), (4,2), (2,2)] ⇒ 32, 03, 44, 42, 22
  • How can we sort tuples?
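
The tuple representation above can be checked with a few lines of Python. This is a minimal sketch rather than code from the lecture: the helper names `to_tuple` and `from_tuple` are hypothetical, and it assumes the base n is the number of keys and that every key is below n^2.

```python
def to_tuple(k, n):
    """Split key k into base-n digits (a, b) with k = a*n + b and 0 <= b < n."""
    return divmod(k, n)

def from_tuple(t, n):
    """Recombine a digit tuple (a, b) into the original key a*n + b."""
    a, b = t
    return a * n + b

keys = [17, 3, 24, 22, 12]
n = len(keys)  # assumption: the base is the number of keys, here 5
tuples = [to_tuple(k, n) for k in keys]
print(tuples)  # [(3, 2), (0, 3), (4, 4), (4, 2), (2, 2)]

# Comparing tuples lexicographically matches comparing the original keys,
# because a is the most significant digit and b the least significant.
assert sorted(tuples) == [to_tuple(k, n) for k in sorted(keys)]
assert [from_tuple(t, n) for t in tuples] == keys
```

Because the tuple order agrees with the key order, any correct way of sorting the tuples also sorts the keys.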

Radix Sort

  • Idea! If u < n^2 , use tuple sort with auxiliary counting sort to sort tuples (a, b)
  • Sort least significant key b, then most significant key a
  • Stability ensures previous sorts stay sorted
  • Running time for this algorithm is O(2n) = O(n). Yay!
  • If every key < n^c for some positive c = log_n(u), every key has at most c digits base n
  • A c-digit number can be written as a c-element tuple in O(c) time
  • We sort each of the c base-n digits in O(n) time
  • So tuple sort with auxiliary counting sort runs in O(cn) time in total
  • If c is constant, so each key is ≤ n^c, this sort is linear O(n)!

    def radix_sort(A):
        "Sort A assuming items have non-negative keys"
        n = len(A)
        u = 1 + max([x.key for x in A])         # O(n) find maximum key
        # c digits suffice: n >= 2^(n.bit_length() - 1), so n^c > u when n >= 2
        c = 1 + (u.bit_length() // max(1, n.bit_length() - 1))
        class Obj: pass
        D = [Obj() for a in A]
        for i in range(n):                      # O(nc) make digit tuples
            D[i].digits = []
            D[i].item = A[i]
            high = A[i].key
            for j in range(c):                  # O(c) make digit tuple
                high, low = divmod(high, n)
                D[i].digits.append(low)
        for i in range(c):                      # O(nc) sort each digit
            for j in range(n):                  # O(n) assign key i to tuples
                D[j].key = D[j].digits[i]
            counting_sort(D)                    # O(n) sort on digit i (stable; not shown in this excerpt)
        for i in range(n):                      # O(n) output to A
            A[i] = D[i].item

| Algorithm | Time O(·) | In-place? | Stable? | Comments |
|---|---|---|---|---|
| Insertion Sort | n^2 | Y | Y | O(nk) for k-proximate |
| Selection Sort | n^2 | Y | N | O(n) swaps |
| Merge Sort | n log n | N | Y | stable, optimal comparison |
| Counting Sort | n + u | N | Y | O(n) when u = O(n) |
| Radix Sort | n + n log_n(u) | N | Y | O(n) when u = O(n^c) |
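
The radix sort listing relies on a stable auxiliary `counting_sort`, which this excerpt does not show. The following is one possible sketch, not necessarily the lecture's version: it assumes each item exposes a non-negative integer `.key` attribute and uses one chain (list) per key so that items with equal keys keep their input order. The `Item` record in the usage lines is hypothetical and exists only for the demonstration.

```python
from collections import namedtuple

def counting_sort(A):
    """Stably sort A in place, assuming each item has a non-negative integer .key."""
    if not A:                              # nothing to sort
        return
    u = 1 + max(x.key for x in A)          # O(n) find maximum key
    chains = [[] for _ in range(u)]        # O(u) one chain per possible key
    for x in A:                            # O(n) appending preserves input order
        chains[x.key].append(x)
    i = 0
    for chain in chains:                   # O(n + u) read chains out in key order
        for x in chain:
            A[i] = x
            i += 1

# Hypothetical record type used only to demonstrate stability.
Item = namedtuple("Item", ["key", "label"])
items = [Item(2, "a"), Item(0, "b"), Item(2, "c"), Item(1, "d")]
counting_sort(items)
labels = [x.label for x in items]
print(labels)  # ['b', 'd', 'a', 'c']
```

Items "a" and "c" share key 2 and come out in their original relative order, which is the stability property that radix sort depends on when it sorts digit by digit.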