Pseudocode - Parallel and Distributed Computing - Lecture Slides, Slides of Parallel Computing and Programming

In the Parallel and Distributed Computing course we learn the core of programming. The main points discussed in these lecture slides are: Pseudocode, Unmarked Natural Numbers, Sequential Algorithm, Complexity, Sources of Parallelism, Domain Decomposition, Array Element, Agglomeration Goals, Data Decomposition Options, Block Decomposition

Typology: Slides

2012/2013

Uploaded on 04/24/2013

banamala

Pseudocode
1. Create list of unmarked natural numbers 2, 3, …, n
2. k ← 2
3. Repeat:
   (a) Mark all multiples of k between k² and n
   (b) k ← smallest unmarked number > k
   until k² > n
4. The unmarked numbers are primes

Sequential Algorithm

Complexity: ( n ln ln n )

Making 3(a) Parallel

Mark all multiples of k between k² and n

for all j where k² ≤ j ≤ n do
    if j mod k = 0 then
        mark j (it is not a prime)
    endif
endfor
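Because every iteration of this loop touches a different element, the range k²..n can be split among tasks freely. A minimal sketch, assuming each task is handed its own sub-range `[lo, hi]` (the name `mark_range` and its parameters are hypothetical):

```c
/* Mark the multiples of k that fall in [lo, hi], one task's share of
 * the range k*k..n. marked[j] != 0 means j is not a prime. */
void mark_range(char *marked, long k, long lo, long hi)
{
    for (long j = lo; j <= hi; j++) {
        if (j % k == 0)
            marked[j] = 1;   /* j is not a prime */
    }
}
```

Two tasks could, for example, call `mark_range(marked, 3, 9, 20)` and `mark_range(marked, 3, 21, 30)`; the calls write to disjoint elements, so their order does not matter.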

Making 3(b) Parallel

Find smallest unmarked number > k

Min-reduction (to find smallest unmarked number > k )

Broadcast (to get result to all tasks)
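The two steps can be illustrated without a message-passing library. In this sketch each task proposes the smallest unmarked number greater than k in its own range, and a min-reduction combines the proposals; the function names and the use of `LONG_MAX` as "no candidate" are assumptions made for the example.

```c
#include <limits.h>

/* One task's contribution: the smallest unmarked number greater than k
 * within its own range [lo, hi], or LONG_MAX if it has none. */
long local_candidate(const char *marked, long k, long lo, long hi)
{
    long start = (k + 1 > lo) ? k + 1 : lo;
    for (long j = start; j <= hi; j++)
        if (!marked[j]) return j;
    return LONG_MAX;
}

/* Min-reduction over the p candidates gathered from the tasks. */
long min_reduce(const long *candidates, int p)
{
    long best = LONG_MAX;
    for (int i = 0; i < p; i++)
        if (candidates[i] < best) best = candidates[i];
    return best;
}
```

In an MPI program, a single `MPI_Allreduce` call with the `MPI_MIN` operation would perform the reduction and deliver the result to every task, covering the broadcast as well.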

Data Decomposition Options

• Interleaved (cyclic)

– Easy to determine “owner” of each index

– Leads to load imbalance for this problem

• Block

– Balances loads

– More complicated to determine owner if n is not a multiple of p
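A small sketch of the interleaved option shows both properties. It assumes array index i holds the number i + 2 (an assumption for this example), and the function names are hypothetical.

```c
/* Owner of array index i under an interleaved (cyclic) decomposition. */
int cyclic_owner(long i, int p)
{
    return (int)(i % p);
}

/* Number of multiples of k in [k*k, n] that process id would mark.
 * Assumption: array index i holds the number i + 2. */
long cyclic_marks(long k, long n, int p, int id)
{
    long count = 0;
    for (long j = k * k; j <= n; j += k)
        if (cyclic_owner(j - 2, p) == id)
            count++;
    return count;
}
```

With n = 20 and p = 2, every multiple of 2 lands on process 0, so `cyclic_marks(2, 20, 2, 0)` is 9 while `cyclic_marks(2, 20, 2, 1)` is 0: one process does all the marking for k = 2.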

Block Decomposition Options

• Want to balance workload when n is not a multiple of p

• Each process gets either ⌈n/p⌉ or ⌊n/p⌋ elements

• Seek simple expressions

– Find low, high indices given an owner

– Find owner given an index

Examples

[Figures: 17 elements divided among 7 processes; 17 elements divided among 5 processes; 17 elements divided among 3 processes]

Method #1 Calculations

In the expressions below, r = n mod p.

• First element controlled by process i: i⌊n/p⌋ + min(i, r)

• Last element controlled by process i: (i + 1)⌊n/p⌋ + min(i + 1, r) − 1

• Process controlling element j: max(⌊j / (⌊n/p⌋ + 1)⌋, ⌊(j − r) / ⌊n/p⌋⌋)
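The Method #1 expressions can be checked with a short C sketch. The function names are hypothetical, process ranks and element indices are 0-based, and r = n mod p is the number of processes that hold one extra element. The owner is computed by cases, which gives the larger of the two quotients ⌊j/(⌊n/p⌋+1)⌋ and ⌊(j−r)/⌊n/p⌋⌋ and so stays consistent with the low and high indices; it also avoids dividing by zero when n < p.

```c
/* First element controlled by process i. */
long m1_low(long i, long p, long n)
{
    long q = n / p, r = n % p;
    return i * q + (i < r ? i : r);
}

/* Last element controlled by process i. */
long m1_high(long i, long p, long n)
{
    return m1_low(i + 1, p, n) - 1;
}

/* Process controlling element j. */
long m1_owner(long j, long p, long n)
{
    long q = n / p, r = n % p;
    if (j < r * (q + 1))        /* j lies in one of the r larger blocks */
        return j / (q + 1);
    return (j - r) / q;         /* j lies in one of the smaller blocks */
}
```

For 17 elements and 7 processes, process 3 holds elements `m1_low(3, 7, 17)` through `m1_high(3, 7, 17)`, that is 9 through 10, and `m1_owner(16, 7, 17)` is 6.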

Examples

[Figures: 17 elements divided among 7 processes; 17 elements divided among 5 processes; 17 elements divided among 3 processes]

Comparing Methods

Operations    Method 1    Method 2
Low index     4           2
High index    6           4
Owner         7           4

Assuming no operations for the “floor” function. Our choice: Method 2.

Pop Quiz

• Illustrate how the block decomposition method would divide 13 elements among 5 processes.

Block Decomposition Macros

#define BLOCK_LOW(id,p,n) ((id)*(n)/(p))
#define BLOCK_HIGH(id,p,n) (BLOCK_LOW((id)+1,p,n)-1)
#define BLOCK_SIZE(id,p,n) (BLOCK_LOW((id)+1,p,n)-BLOCK_LOW(id,p,n))
#define BLOCK_OWNER(index,p,n) (((p)*((index)+1)-1)/(n))
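The same four expressions can also be written as functions, which evaluate each argument once and, with a wider integer type, postpone overflow in the product of id and n. This is only a sketch; the lower-case names and the choice of `long long` are assumptions, not part of the slides.

```c
/* Function forms of the four block-decomposition expressions. */
static inline long long block_low(long long id, long long p, long long n)
{
    return id * n / p;                     /* first index owned by id */
}

static inline long long block_high(long long id, long long p, long long n)
{
    return block_low(id + 1, p, n) - 1;    /* last index owned by id */
}

static inline long long block_size(long long id, long long p, long long n)
{
    return block_low(id + 1, p, n) - block_low(id, p, n);
}

static inline long long block_owner(long long index, long long p, long long n)
{
    return (p * (index + 1) - 1) / n;      /* process that owns index */
}
```

For 17 elements and 5 processes these give block sizes 3, 3, 4, 3, 4, and `block_owner(9, 5, 17)` is 2, the process whose block runs from index 6 to index 9.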

Local vs. Global Indices

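[Figure: the elements held by each process, labelled with local indices (L) and global indices (G); for example, one block holds global indices 0 and 1, another holds 7, 8 and 9.]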
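Converting between the two kinds of index only needs the first global index of a process's block. A minimal sketch, assuming the block boundaries are given by ⌊id·n/p⌋ and using hypothetical function names:

```c
/* Global index of local element `local` on process `id`. */
long global_index(long local, long id, long p, long n)
{
    return id * n / p + local;
}

/* Local index on process `id` of the element with global index `global`. */
long local_index(long global, long id, long p, long n)
{
    return global - id * n / p;
}
```

With 13 elements and 5 processes, process 3 starts at global index 7, so its local indices 0, 1, 2 correspond to global indices 7, 8, 9.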

Decomposition Affects Implementation

• Largest prime used to sieve is √n
• First process has ⌊n/p⌋ elements
  – It has all sieving primes if p < √n
• First process always broadcasts next sieving prime
• No reduction step needed
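The effect of these points can be seen in a sequential simulation of the block-decomposed sieve. This is a sketch under stated assumptions, not a message-passing program: one shared array stands in for the per-process blocks, array index i holds the number i + 2, block boundaries use ⌊id·m/p⌋ with m = n − 1 elements, and the function name is hypothetical.

```c
#include <stdlib.h>

/* Simulate p processes that each mark multiples inside their own block
 * of the numbers 2..n. Returns the number of primes, or -1 if the
 * arguments are unusable (including p*p >= n) or allocation fails. */
long block_sieve_count(long n, int p)
{
    if (n < 2 || p < 1 || (long)p * p >= n) return -1;

    long m = n - 1;                          /* how many numbers in 2..n */
    char *marked = calloc((size_t)m, 1);     /* marked[i] refers to i + 2 */
    if (marked == NULL) return -1;

    long k = 2;                              /* current sieving prime */
    while (k * k <= n) {
        for (int id = 0; id < p; id++) {     /* one pass per "process" */
            long low  = id * m / p + 2;          /* smallest number in block */
            long high = (id + 1) * m / p + 1;    /* largest number in block */
            long first = k * k;
            if (first < low)
                first = (low + k - 1) / k * k;   /* first multiple >= low */
            for (long j = first; j <= high; j += k)
                marked[j - 2] = 1;
        }
        /* Next sieving prime. In a distributed run process 0 would find
         * it in its own block and broadcast it; no reduction is needed. */
        do { k++; } while (marked[k - 2]);
    }

    long count = 0;
    for (long i = 0; i < m; i++)
        if (!marked[i]) count++;

    free(marked);
    return count;
}
```

The check on p rejects process counts for which the first block might not contain every sieving prime, which is the case the slide's condition p < √n rules out.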