Parallel Processing Lecture 8: Sorting Algorithms - Prof. Anton W. Bohm, Study notes of Computer Science

Part of the CS575 Parallel Processing course lecture notes. It covers sorting algorithms, focusing on internal sorting methods such as merge sort and bitonic sort. It introduces the concepts of sorters, input and output storage, and comparison and non-comparison sorting, and provides an assignment on parallel merge sort.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

CS575 Parallel Processing
Lecture eight: Sorting
Wim Bohm, Colorado State University
Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.

Sorting Problem
  • Sorting
    • Input: sequence $S = (a_0, a_1, \ldots, a_{n-1})$
    • Output: $(b_0, b_1, \ldots, b_{n-1})$ = permutation of S s.t. $b_i \le b_{i+1}$

  • Sorting Algorithm Categories
    • Internal sorting: S is small enough to fit in memory/network
      • We concentrate on this
    • External sorting: S partly stored on external device (disk)
      • Job for MapReduce ☺
    • Comparison sorting: Uses compares and exchanges
      • Ω(n log(n)) work
    • Non comparison sorting: Uses extra information about input data
      • Values lie in a small range (Radix Sort)
      • S is permutation of (1 .. N) (Pigeon hole sort)
      • Sometimes Ω(n) work
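To make the non-comparison case concrete, here is a minimal sketch of pigeonhole sorting under the slide's assumption that S is a permutation of 1 .. N. The function name `pigeonhole_sort` is illustrative and does not come from the lecture.

```python
def pigeonhole_sort(s):
    """Sort a permutation of 1..N with O(N) work, never comparing two elements."""
    n = len(s)
    holes = [None] * n
    for value in s:
        # Assumption: the values are exactly 1..N, each occurring once.
        if not 1 <= value <= n or holes[value - 1] is not None:
            raise ValueError("input is not a permutation of 1..N")
        holes[value - 1] = value  # the value itself names its final slot
    return holes


print(pigeonhole_sort([7, 2, 3, 1, 5, 6, 4, 8]))
```

Each element is placed directly by its value, which is the extra information about the input that lets the sort avoid the Ω(n log(n)) comparison bound.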

Assignment: Parallel Merge Sort
  • A pipeline of sorters $S_0, S_1, \ldots, S_n$
  • $S_0$:
    • One input stream, two output streams
    • reads input stream and creates "sorted" subsequences of size 1
    • sends the subsequences to its outputs (alternating between the two)
  • $S_i$ (i = 1 .. n-1):
    • Two input streams, two output streams
    • merges sorted subsequences from two input streams
    • sends double-sized, merged subsequences to its outputs (again alternating)
  • $S_n$:
    • Two input streams, one output stream
    • merges sorted subsequences from two inputs into one result
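The data flow through the pipeline can be sketched as a sequential simulation. This is only an illustration of which subsequences get merged where: it ignores time steps and concurrency, assumes a power-of-two input length, and the names `first_sorter`, `merge_runs` and `pipeline_merge_sort` are hypothetical.

```python
from heapq import merge


def first_sorter(stream):
    """First sorter: make runs of size 1 and alternate them over two outputs."""
    runs = [[x] for x in stream]
    return runs[0::2], runs[1::2]


def merge_runs(in_a, in_b):
    """Merge the k-th run from one input with the k-th run from the other."""
    return [list(merge(a, b)) for a, b in zip(in_a, in_b)]


def pipeline_merge_sort(stream):
    """Push a power-of-two-length stream through the chain of sorters."""
    n = len(stream)
    if n < 2 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length >= 2")
    out_a, out_b = first_sorter(stream)
    while len(out_a) > 1:
        merged = merge_runs(out_a, out_b)          # a middle sorter
        out_a, out_b = merged[0::2], merged[1::2]  # alternate double-sized runs
    return merge_runs(out_a, out_b)[0]             # last sorter: single output


print(pipeline_merge_sort([7, 2, 3, 1, 5, 6, 4, 8]))
```

In the real pipeline every sorter works at the same time on different parts of the stream; the loop above only reproduces the order in which runs are paired.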

Parallel Merge Sort (cont.)

Example with sorters $S_0$ .. $S_3$ (each stream is written with its first element rightmost; | separates subsequences):

  input to S_0:   7 2 3 1 5 6 4 8
  S_0 outputs:    2 | 1 | 6 | 8    and    7 | 3 | 5 | 4
  S_1 outputs:    3 1 | 8 4        and    7 2 | 6 5
  S_2 outputs:    8 6 5 4          and    7 3 2 1
  S_3 output:     8 7 6 5 4 3 2 1

Questions:
1. Given n = $2^m$ input numbers, how many sorters are needed?
2. If a sorter can read one number in one time step, write one number in one time step, and store and compare in zero time steps, how many time steps does it take to sort n numbers?
3. Is this algorithm cost optimal?

Sorting networks
  • n numbers, n lines, M stages
  • Each stage:
    • n/2 compare-exchanges
    • Each compare exchange computes in O(1) time
  • time complexity: M
  • Cost: M*n
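A sorting network can be modelled as a list of stages, each stage being a list of compare-exchange pairs on disjoint lines. The sketch below uses a small 4-line network chosen for illustration (it is not taken from the slides), and `run_network` and `NETWORK_4` are hypothetical names.

```python
def run_network(values, stages):
    """Apply a sorting network; each stage is a list of (low, high) line pairs."""
    lines = list(values)
    for stage in stages:
        # Pairs within one stage touch disjoint lines, so in hardware they
        # would all fire in parallel; here they are applied one by one.
        for low, high in stage:
            if lines[low] > lines[high]:
                lines[low], lines[high] = lines[high], lines[low]
    return lines


# Illustrative 4-line network with M = 3 stages.
NETWORK_4 = [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(1, 2)]]

print(run_network([3, 1, 4, 2], NETWORK_4))
```

With one time step per stage the running time is the number of stages M, and the cost is M times the number of lines n, as stated above.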

Bitonic Sequence
  • A sequence $A = a_0, a_1, \ldots, a_{n-1}$ is bitonic iff
    1. There is an index i, 0 < i < n, s.t. $a_0 .. a_i$ is increasing and $a_i .. a_{n-1}$ is decreasing, or
    2. There is a cyclic shift of A for which 1 holds.
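The definition can be checked directly by trying every cyclic shift. This is a brute-force sketch written for readability, not efficiency; it treats equal neighbours as allowed in both the increasing and the decreasing part, which is an assumption the slide does not spell out. The function names are hypothetical.

```python
def rises_then_falls(seq):
    """Condition 1: non-decreasing up to some index, non-increasing after it."""
    k = 0
    while k + 1 < len(seq) and seq[k] <= seq[k + 1]:
        k += 1
    while k + 1 < len(seq) and seq[k] >= seq[k + 1]:
        k += 1
    return k == len(seq) - 1


def is_bitonic(seq):
    """Condition 2: some cyclic shift of seq satisfies condition 1."""
    seq = list(seq)
    return any(rises_then_falls(seq[s:] + seq[:s]) for s in range(len(seq)))


print(is_bitonic([1, 3, 5, 7, 8, 6, 4, 2]))
print(is_bitonic([5, 7, 8, 6, 4, 2, 1, 3]))
print(is_bitonic([1, 3, 2, 4]))
```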

Bitonic Merge
  • Given: a Bitonic Sequence BS of size n = $2^m$
  • Sort BS using m (parallel) Bitonic Split stages

Example with n = 8, using + compare exchangers:

  input:                    1 3 5 7 8 6 4 2
  after split 1 (dist 4):   1 3 4 2 8 6 5 7
  after split 2 (dist 2):   1 2 4 3 5 6 8 7
  after split 3 (dist 1):   1 2 3 4 5 6 7 8
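A bitonic merge can be sketched as m split stages with halving compare distances. The sketch assumes a bitonic input of power-of-two length; `bitonic_merge` and its `trace` return value are illustrative names, and the inner loop stands in for compare-exchanges that would run in parallel.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence of length 2**m using m bitonic split stages."""
    a = list(seq)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length")
    trace = []
    distance = n // 2
    while distance >= 1:
        for i in range(n):
            # Lines whose `distance` bit is 0 compare with the line that is
            # `distance` positions further down.
            if (i & distance) == 0:
                partner = i + distance
                if (a[i] > a[partner]) == ascending:
                    a[i], a[partner] = a[partner], a[i]
        trace.append(list(a))  # state after this split stage
        distance //= 2
    return a, trace


result, stages = bitonic_merge([1, 3, 5, 7, 8, 6, 4, 2])
for stage in stages:
    print(stage)
```

Passing `ascending=False` plays the role of the - compare exchangers and yields decreasing order.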

Bitonic merge = log(n) bitonic split stages
  • Can sort a bitonic sequence in log(n) steps
  • Increasing order: +BM(n)
    • use + compare exchangers
  • Decreasing order: -BM(n)
    • use - compare exchangers
  • Bitonic sort = log(n) bitonic merge stages

[Figure: +BM2 and -BM2 blocks feed +BM4 and -BM4 blocks, which feed a final +BM block]
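The way bitonic sort is assembled from +BM and -BM blocks can be sketched recursively: sort one half increasing and the other decreasing, so that together they form a bitonic sequence, then merge. This is a sequential sketch assuming a power-of-two length; the helper names `_merge` and `_sort` are hypothetical.

```python
def _merge(a, lo, n, ascending):
    """+BM(n) or -BM(n) applied to a[lo:lo+n], which must be bitonic."""
    if n == 1:
        return
    half = n // 2
    for i in range(lo, lo + half):          # one bitonic split stage
        if (a[i] > a[i + half]) == ascending:
            a[i], a[i + half] = a[i + half], a[i]
    _merge(a, lo, half, ascending)
    _merge(a, lo + half, half, ascending)


def _sort(a, lo, n, ascending):
    if n == 1:
        return
    half = n // 2
    _sort(a, lo, half, True)                # increasing half (+BM side)
    _sort(a, lo + half, half, False)        # decreasing half (-BM side)
    _merge(a, lo, n, ascending)             # the two halves form a bitonic run


def bitonic_sort(seq, ascending=True):
    """Sort a sequence of length 2**m using only compare-exchanges."""
    a = list(seq)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length")
    _sort(a, 0, n, ascending)
    return a


print(bitonic_sort([7, 2, 3, 1, 5, 6, 4, 8]))
```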

Bitonic Sort network

[Figure: bitonic sort network diagram]

Bitonic Sort on the hyper cube
  • One element per processor
  • Power of two distances in bitonic sort map perfectly on cube
  • Question: perform +BMx or -BMx?
    • stage i, step j: compare bit-j to bit-i+1
      • equal: take minimum
      • unequal: take maximum

  for i = 0 to d-1
      for j = i downto 0
          partner = myLabel with bit j flipped
          exchange data with partner
          if (myLabel[i+1] == myLabel[j])
              keep min
          else
              keep max
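The per-processor rule can be tried out with a small simulation in which a list index plays the role of the processor label. This is a sketch under the assumption of exactly 2**d processors holding one element each; the exchange between partners is modelled by reading both values from the previous step's list, and the function name is hypothetical.

```python
def hypercube_bitonic_sort(data):
    """Simulate bitonic sort with one element per hypercube processor."""
    p = len(data)
    d = p.bit_length() - 1
    if p == 0 or p != 1 << d:
        raise ValueError("this sketch assumes 2**d processors")
    values = list(data)
    for i in range(d):                        # stage i
        for j in range(i, -1, -1):            # step j
            new_values = list(values)
            for label in range(p):
                partner = label ^ (1 << j)    # flip bit j of the label
                bit_j = (label >> j) & 1
                bit_i1 = (label >> (i + 1)) & 1
                pair = (values[label], values[partner])
                # Equal bits: keep the minimum; unequal bits: keep the maximum.
                new_values[label] = min(pair) if bit_i1 == bit_j else max(pair)
            values = new_values
    return values


print(hypercube_bitonic_sort([7, 2, 3, 1, 5, 6, 4, 8]))
```

In the last stage bit i+1 lies above the highest label bit and is 0 everywhere, so every processor behaves as part of one increasing merge.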

Bitonic Sort on Mesh
  • No ideal mapping; best: nearest = most used

  0000 -- 0001    0100 -- 0101
   |       |       |       |
  0010 -- 0011    0110 -- 0111

  1000 -- 1001    1100 -- 1101
   |       |       |       |
  1010 -- 1011    1110 -- 1111

  • Distance 1: used 7 times, Distance 2: used 3 times
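The usage counts can be recomputed from the loop structure of bitonic sort. The sketch assumes the 4 x 4 layout shown above, where label bits 0 and 2 select the column and bits 1 and 3 select the row; `mesh_position` and `distance_usage` are hypothetical names.

```python
from collections import Counter


def mesh_position(label):
    """Assumed 4x4 layout: bits 0 and 2 pick the column, bits 1 and 3 the row."""
    col = (label & 1) | (((label >> 2) & 1) << 1)
    row = ((label >> 1) & 1) | (((label >> 3) & 1) << 1)
    return row, col


def distance_usage():
    """Count, per mesh distance, the compare-exchange steps for 16 elements."""
    d = 4
    usage = Counter()
    for i in range(d):                      # stage i
        for j in range(i, -1, -1):          # step j pairs labels differing in bit j
            # The distance is the same for every such pair, so labels 0 and
            # 1 << j are enough to measure it.
            row_a, col_a = mesh_position(0)
            row_b, col_b = mesh_position(1 << j)
            usage[abs(row_a - row_b) + abs(col_a - col_b)] += 1
    return dict(usage)


print(distance_usage())
```

The low bits are exchanged in the most steps, which is why this layout places them on nearest neighbours.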

Bubble sort family
  • Odd-Even sort:
    • sorts n elements in n/2 phases
    • Each phase has two stages
      • first stage compares even element with next element
      • second stage compares odd element with next
    • O(n) time, O($n^2$) work
  • Shell sort on d-dimensional hypercube
    • for i = 0..d-1 compare splits with neighbor in dimension i
    • followed by odd even sort until sorted
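Odd-even sort can be sketched sequentially as follows. The innermost loop stands in for compare-exchanges that would run in parallel with one element per processor; rounding the phase count up for odd n is an assumption of this sketch, and the function name is illustrative.

```python
def odd_even_sort(seq):
    """Odd-even transposition sort: about n/2 phases of two stages each."""
    a = list(seq)
    n = len(a)
    for _ in range((n + 1) // 2):           # n/2 phases, rounded up
        for start in (0, 1):                # even stage, then odd stage
            # Pairs within one stage are disjoint, so a parallel machine
            # would compare them all in a single time step.
            for k in range(start, n - 1, 2):
                if a[k] > a[k + 1]:
                    a[k], a[k + 1] = a[k + 1], a[k]
    return a


print(odd_even_sort([8, 7, 6, 5, 4, 3, 2, 1]))
```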

Quicksort
  • Sequential Algorithm
    • Select pivot element
    • Partition the array
      • left: elements <= pivot
      • right: elements > pivot
    • Quicksort(left)
    • Quicksort(right)
  • Sequential complexity?
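The sequential steps can be sketched in a few lines. Choosing the first element as pivot is an assumption made here for brevity, and the sketch builds new lists instead of partitioning in place.

```python
def quicksort(seq):
    """Sequential quicksort following the steps listed above."""
    if len(seq) <= 1:
        return list(seq)
    pivot, rest = seq[0], seq[1:]            # assumption: first element is pivot
    left = [x for x in rest if x <= pivot]   # elements <= pivot
    right = [x for x in rest if x > pivot]   # elements > pivot
    return quicksort(left) + [pivot] + quicksort(right)


print(quicksort([7, 2, 3, 1, 5, 6, 4, 8]))
```

The pivot itself is kept out of `left` so that both recursive calls always receive strictly shorter lists.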