Slides on Sorting - Object-Oriented Programming and Data Structures | CS 2110, Study notes of Computer Science

Material Type: Notes; Class: Object-Oriented Programming and Data Structures; Subject: Computer Science; University: Cornell University; Term: Summer 2008;


Uploaded on 08/30/2009

Sorting
Lecture 12
CS211 – Summer 2008

InsertionSort

- Many people sort cards this way
- Invariant: everything to the left of i is already sorted
- Works especially well when input is nearly sorted
- Worst-case is O(n^2)
  - Consider reverse-sorted input
- Best-case is O(n)
  - Consider sorted input
- Expected case is O(n^2)
  - Expected number of inversions is n(n-1)/4

    // sort a[], an array of int
    for (int i = 1; i < a.length; i++) {
        int temp = a[i];
        int k;
        for (k = i; 0 < k && temp < a[k-1]; k--)
            a[k] = a[k-1];
        a[k] = temp;
    }
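
To make the best-case and worst-case claims concrete, here is a minimal sketch that wraps the slide's loop in a method and counts element shifts. The class name `InsertionSortDemo` and the shift counter are assumptions added for illustration; they are not part of the lecture code. Each shift removes exactly one inversion, so the count equals the number of inversions in the input.

```java
import java.util.Arrays;

// Hypothetical demo class; the shift counter is added only for illustration.
class InsertionSortDemo {
    // Sorts a[] in place with the loop from the slide and returns
    // how many elements were shifted one position to the right.
    static long insertionSort(int[] a) {
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int temp = a[i];
            int k;
            // Invariant: a[0..i-1] is already sorted.
            for (k = i; 0 < k && temp < a[k - 1]; k--) {
                a[k] = a[k - 1];
                shifts++;
            }
            a[k] = temp;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5};
        int[] reversed = {5, 4, 3, 2, 1};
        System.out.println(insertionSort(sorted));     // prints 0
        System.out.println(insertionSort(reversed));   // prints 10, i.e. n(n-1)/2 for n = 5
        System.out.println(Arrays.toString(reversed)); // prints [1, 2, 3, 4, 5]
    }
}
```

Sorted input needs no shifts, so only the n-1 outer comparisons remain, while reverse-sorted input needs n(n-1)/2 shifts.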

SelectionSort

- To sort an array of size n:
  - Examine a[0] to a[n-1]; find the smallest one and swap it with a[0]
  - Examine a[1] to a[n-1]; find the smallest one and swap it with a[1]
  - In general, in step i, examine a[i] to a[n-1]; find the smallest one and swap it with a[i]
- This is the other common way for people to sort cards
- Runtime
  - Worst-case O(n^2)
  - Best-case O(n^2)
  - Expected-case O(n^2)
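
The steps above can be sketched in Java as follows. The slide gives no code for SelectionSort, so this is one possible rendering; the class and method names are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class sketching the steps described on the slide.
class SelectionSortDemo {
    // In step i, find the smallest element of a[i..n-1] and swap it with a[i].
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            // The scan always covers the whole unsorted suffix, whatever the input.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {20, 31, 24, 19, 45};
        selectionSort(a);
        System.out.println(Arrays.toString(a)); // prints [19, 20, 24, 31, 45]
    }
}
```

Because the inner scan never stops early, the number of comparisons is n(n-1)/2 for every input, which is why the best case is no better than the worst case.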

Divide & Conquer?

- It often pays to
  - Break the problem into smaller subproblems,
  - Solve the subproblems separately, and then
  - Assemble a final solution
- This technique is called divide-and-conquer
  - Caveat: It won't help unless the partitioning and assembly processes are inexpensive
- Can we apply this approach to sorting?

MergeSort

- Quintessential divide-and-conquer algorithm
- Divide array into equal parts, sort each part, then merge
- Questions:
  - Q1: How do we divide array into two equal parts?
  - A1: Find middle index: a.length/2
  - Q2: How do we sort the parts?
  - A2: Call MergeSort recursively!
  - Q3: How do we merge the sorted subarrays?
  - A3: We have to write some (easy) code

Merging Sorted Arrays A and B

- Create an array C of size = size of A + size of B
- Keep three indices:
  - i into A
  - j into B
  - k into C
- Initialize all three indices to 0 (start of each array)
- Compare element A[i] with B[j], and move the smaller element into C[k]
- Increment i or j, whichever one we took, and k
- When either A or B becomes empty, copy remaining elements from the other array (B or A, respectively) into C
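
The merging steps above, combined with the recursive scheme from the MergeSort slide, can be sketched as follows. The lecture's own code is not reproduced in these notes, so the class name, method names, and the choice to return a new array are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; not the lecture's official implementation.
class MergeSortDemo {
    // Merges two sorted arrays a and b into a new sorted array c.
    static int[] merge(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        // Move the smaller front element into c and advance that index and k.
        while (i < a.length && j < b.length) {
            if (a[i] <= b[j]) {
                c[k++] = a[i++];
            } else {
                c[k++] = b[j++];
            }
        }
        // One input is exhausted: copy whatever remains of the other.
        while (i < a.length) c[k++] = a[i++];
        while (j < b.length) c[k++] = b[j++];
        return c;
    }

    // Returns a sorted array with the elements of a (a itself if its length is 0 or 1).
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a;
        int mid = a.length / 2; // middle index, as in A1
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[]{1, 3, 4, 6, 8}, new int[]{4, 7, 7, 8, 9});
        System.out.println(Arrays.toString(merged)); // prints [1, 3, 4, 4, 6, 7, 7, 8, 8, 9]
        System.out.println(Arrays.toString(mergeSort(new int[]{5, 2, 9, 1}))); // prints [1, 2, 5, 9]
    }
}
```

Taking from A when the two front elements are equal (`<=`) keeps equal elements in their original order, which makes this version stable.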

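Merging Sorted Arrays

[Figure: two sorted input arrays, 1 3 4 6 8 and 4 7 7 8 9 (labeled A and B), are scanned with indices i and j; the merged array C holds 1 3 4 4 6 7 so far, with index k marking the next position to fill.]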

MergeSort Analysis

- Outline (detailed code on the website)
  - Split array into two halves
  - Recursively sort each half
  - Merge the two halves
- Merge = combine two sorted arrays to make a single sorted array
  - Rule: always choose the smallest item
  - Time: O(n) where n is the combined size of the two arrays
- Runtime recurrence
  - Let T(n) be the time to sort an array of size n
  - T(n) = 2T(n/2) + O(n)
  - T(1) = 1
- Can show by induction that T(n) is O(n log n)
- Alternatively, can see that T(n) is O(n log n) by looking at the tree of recursive calls
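
The tree-of-calls argument can be written out by unrolling the recurrence. This is a sketch under two simplifying assumptions that are not stated on the slide: n is a power of two, and the O(n) merge cost is written as cn for some constant c.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn .
\end{align*}
% The recursion bottoms out when n/2^k = 1, i.e. k = \log_2 n, and T(1) = 1:
\begin{align*}
T(n) &= n\,T(1) + cn\log_2 n = n + cn\log_2 n = O(n \log n).
\end{align*}
```

Each level of the call tree contributes cn in total, and there are log_2 n levels, which is where the n log n term comes from.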

MergeSort Notes

- Asymptotic complexity: O(n log n)
  - Much faster than O(n^2)
- Disadvantage
  - Need extra storage for temporary arrays
  - In practice, this can be a disadvantage, even though MergeSort is asymptotically optimal for sorting
  - Can do MergeSort in place, but this is very tricky (and it slows down the algorithm significantly)
- Are there good sorting algorithms that do not use so much extra storage?
  - Yes: QuickSort

QuickSort

- Intuitive idea
  - Given an array A to sort, choose a pivot value p
  - Partition A into two subarrays, AX and AY
    - AX contains only elements ≤ p
    - AY contains only elements ≥ p
  - Sort subarrays AX and AY separately
  - Concatenate (not merge!) sorted AX and AY to get sorted A
    - Concatenation is easier than merging: O(1)

[Figure: the array 20 31 24 19 45 56 4 65 5 72 14 99 is partitioned around a pivot into two unsorted subarrays; QuickSort is applied to each subarray, and the sorted results are concatenated to give 4 5 14 19 20 24 31 45 56 65 72 99.]

QuickSort Questions

- Key problems
  - How should we choose a pivot?
  - How do we partition an array in place?
- Partitioning in place
  - Can be done in O(n) time (next slide)
- Choosing a pivot
  - Ideal pivot is the median, since this splits array in half
  - Computing the median of an unsorted array is O(n), but algorithm is quite complicated
  - Popular heuristics:
    - Use first value in array (usually not a good choice)
    - Use middle value in array
    - Use median of first, last, and middle values in array
    - Choose a random element
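
As a rough illustration of both questions, the sketch below picks the middle value as the pivot (one of the heuristics listed above) and partitions in place in O(n) time. The slide with the lecture's own partitioning code is not included in these notes, so the Lomuto-style partition used here is a swapped-in, commonly taught scheme and may differ from the one presented in class. All names are assumptions.

```java
import java.util.Arrays;

// Hypothetical demo class; the partition scheme is an assumption, not the lecture's.
class QuickSortDemo {
    static void quickSort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (inclusive bounds) in place.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);  // elements < pivot
        quickSort(a, p + 1, hi);  // elements >= pivot
        // No merge step: the two sorted parts are already in their final places.
    }

    // Lomuto-style partition using the middle value as the pivot.
    // Returns the final index of the pivot.
    static int partition(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        swap(a, mid, hi);          // park the pivot at the end
        int pivot = a[hi];
        int store = lo;            // a[lo..store-1] holds elements < pivot
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);        // put the pivot between the two parts
        return store;
    }

    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {20, 31, 24, 19, 45, 56, 4, 65, 5, 72, 14, 99};
        quickSort(a);
        System.out.println(Arrays.toString(a));
        // prints [4, 5, 14, 19, 20, 24, 31, 45, 56, 65, 72, 99]
    }
}
```

Switching to another heuristic, such as median-of-three or a random element, only changes how `mid` is chosen before the pivot is parked at the end.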

Comparison Trees

- Comparison-based algorithms make decisions based on comparison of data elements
- This gives a comparison tree
- If the algorithm fails to terminate for some input, then the comparison tree is infinite
- The height of the comparison tree represents the worst-case number of comparisons for that algorithm
- Can show that any correct comparison-based sorting algorithm must make on the order of n log n comparisons, at least, in the worst case

[Figure: a comparison-tree node testing a[i] < a[j], with "no" and "yes" branches.]

Lower Bound for Comparison Sorting

- Say we have a correct comparison-based algorithm
- Suppose we want to sort the elements in an array B[]
- Assume the elements of B[] are distinct
- Any permutation of the elements is initially possible
- When done, B[] is sorted
- But the algorithm could not have taken the same path in the comparison tree on different input permutations
- How many input permutations are possible? n! ~ 2^(n log n)
- For a comparison-based sorting algorithm to be correct, it must have at least that many leaves in its comparison tree
- To have at least n! ~ 2^(n log n) leaves, it must have height at least n log n, up to a constant factor (since it is only binary branching, the number of nodes at most doubles at every depth)
- Therefore its longest path must be of length at least n log n, up to a constant factor, and that is its worst-case running time
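
The step from "at least n! leaves" to "height on the order of n log n" can be spelled out as follows. This is a sketch of one standard way to bound log n!, assuming n ≥ 2; the symbol h for the tree height is introduced here for illustration.

```latex
% A binary tree of height h has at most 2^h leaves, so a correct algorithm needs
\begin{align*}
2^{h} \;\ge\; n! \quad\Longrightarrow\quad h \;\ge\; \log_2 n! .
\end{align*}
% The largest n/2 factors of n! are each at least n/2, hence
\begin{align*}
n! \;\ge\; \left(\frac{n}{2}\right)^{n/2}
\quad\Longrightarrow\quad
h \;\ge\; \log_2 n! \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n).
\end{align*}
```

So the worst-case number of comparisons grows at least proportionally to n log n, which matches MergeSort's O(n log n) upper bound.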

Lower Bound for Comparison Sorting