









Material Type: Notes; Professor: Bohm; Class: Parallel Processing; Subject: Computer Science; University: Colorado State University; Term: Spring 2010;
Typology: Study notes










Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license
^ Routing / Switching Techniques
^ Store and Forward: $t_s + t_w \cdot m \cdot l$, which is $O(m \cdot l)$
^ Cut Through: $t_s + t_w \cdot m + t_h \cdot l$, which is $O(m + l)$
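The two cost formulas can be compared numerically. The sketch below is illustrative only: it assumes the usual reading of the symbols (t_s startup time, t_w per-word transfer time, t_h per-hop time, m message size in words, l number of links traversed), and the function names are hypothetical.

```python
def store_and_forward_time(t_s, t_w, m, l):
    """Store and forward: the whole message is retransmitted on every link."""
    return t_s + t_w * m * l


def cut_through_time(t_s, t_w, t_h, m, l):
    """Cut through: the message is pipelined; only a per-hop cost scales with l."""
    return t_s + t_w * m + t_h * l


# Example values (assumed, not taken from the notes).
sf = store_and_forward_time(t_s=10, t_w=1, m=100, l=4)
ct = cut_through_time(t_s=10, t_w=1, t_h=2, m=100, l=4)
```

With these assumed values, the store-and-forward cost grows with the product m*l while the cut-through cost grows with the sum m + l, which is the O(m.l) versus O(m+l) distinction in the bullets.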
^ Ring: $l \le p/2$
^ Wraparound square 2D mesh: $l \le \sqrt{p}$
^ Hypercube: $l \le \log(p)$
^ Two directly connected PEs can send messages of size m to each other simultaneously in $t_s + t_w \cdot m$ time
^ WASTEFUL
^ 2*(1 + 2 + ... + p/2) hops: $O(p^2)$ hops
^ store copy of message and forward it
^ Initiating node (say node 0) sends two messages
^ one in clockwise direction to node p/2
^ one in counter-clockwise direction to node p/2+1
^ the messages travel overlapping in time
^ #steps <= p/2+1: $(t_s + t_w \cdot m) \cdot (p/2 + 1)$
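A small simulation can check the step bound for the ring broadcast. This is a sketch under one assumption that the notes do not state here: the initiating node issues its two sends in consecutive steps (clockwise first), and every other node stores the message and forwards it in the next step. The function name is hypothetical.

```python
def ring_broadcast_arrivals(p, source=0):
    """Return {node: step at which it holds the message} for a ring of p nodes."""
    arrival = {source: 0}
    # Clockwise chain: started in step 1, covers the next p//2 nodes.
    for k in range(1, p // 2 + 1):
        arrival[(source + k) % p] = k
    # Counter-clockwise chain: started in step 2, covers the remaining nodes.
    remaining = p - 1 - p // 2
    for k in range(1, remaining + 1):
        arrival[(source - k) % p] = k + 1
    return arrival


def ring_broadcast_time(p, t_s, t_w, m):
    """Number of steps times the cost (t_s + t_w*m) of one neighbor transfer."""
    steps = max(ring_broadcast_arrivals(p).values())
    return steps * (t_s + t_w * m)
```

Under this assumption every node is reached within p/2 + 1 steps, matching the bound in the bullets.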
^ Ring broadcast in row of initiating node
^ Ring broadcast in all columns from row of initiating node
^ Time steps: $(t_s + t_w \cdot m) \cdot \sqrt{p}$
^ Similar (n phase) procedure works on nD mesh
^ n×n matrix, n×1 vector, n×n mesh
^ DISCUSS
^ Need a re-labeling mechanism: My-ID XOR Sender
^ This maps all bit patterns onto all bit patterns
^ Why? Why is that relevant?
^ Nearest neighbors in the original labeling are nearest neighbors in the new labeling.
^ Why? Why is that relevant?
^ Phase i still sends along the dimension related to bit i
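The re-labeling idea can be sketched as a simulation of a one-to-all broadcast on a d-dimensional hypercube from an arbitrary sender. This is an illustrative sketch, not necessarily the exact procedure from the lecture: the function name and the choice to run phases from the highest bit down are assumptions.

```python
def hypercube_broadcast(d, sender):
    """Simulate a broadcast on a 2**d-node hypercube; return (schedule, holders).

    schedule[k] lists the (from_node, to_node) sends of phase k.
    """
    holders = {sender}
    schedule = []
    for i in reversed(range(d)):          # one phase per dimension (bit i)
        sends = []
        for node in sorted(holders):
            virtual = node ^ sender       # re-label: the sender becomes virtual 0
            partner_virtual = virtual ^ (1 << i)
            partner = partner_virtual ^ sender   # map back to the real label
            sends.append((node, partner))
        holders.update(to for _, to in sends)
        schedule.append(sends)
    return schedule, holders


schedule, holders = hypercube_broadcast(d=3, sender=5)
```

Because XOR with a fixed sender ID is its own inverse, every label maps to a distinct label, and each send pair still differs in exactly bit i, so it uses a real hypercube link.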
^ Assumptions: Only one send per time step possible; any topology allowed
^ Step 1: node i sends its data to node (i+1) mod p
^ Step 2 .. (p-1): node i sends the data it received in the previous step to node (i+1) mod p
^ $T \sim (t_s + t_w \cdot m) \cdot p$
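The ring steps above can be simulated directly. A minimal sketch, assuming one data item per node and synchronous steps; the function name is hypothetical.

```python
def ring_all_to_all(data):
    """Simulate all-to-all broadcast on a ring: node i always sends to (i+1) mod p."""
    p = len(data)
    collected = [[x] for x in data]   # what each node holds so far
    outgoing = list(data)             # what each node sends in the next step
    for _ in range(p - 1):
        # Node i receives what node (i-1) mod p sent in this step.
        incoming = [outgoing[(i - 1) % p] for i in range(p)]
        for i in range(p):
            collected[i].append(incoming[i])
        outgoing = incoming           # forward in the next step what was just received
    return collected


result = ring_all_to_all(["a", "b", "c", "d"])
```

After p-1 steps every node holds all p items, and each step costs one neighbor transfer of size m, which gives the estimate T ~ (t_s + t_w.m).p.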
^ Phase one: broadcast in row: message size m
^ Phase two: broadcast in col: message size $m \cdot \sqrt{p}$
^ $T \sim t_s \cdot \sqrt{p} + t_w \cdot m \cdot p$
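The mesh estimate can be related to the ring estimate by treating each phase as a ring all-to-all broadcast over sqrt(p) nodes. The sketch below is an assumption-laden illustration: it reuses the approximate ring cost (t_s + t_w * size) * nodes for each phase, and the function name is hypothetical.

```python
import math


def mesh_all_to_all_phases(t_s, t_w, m, p):
    """Return (row_phase_time, column_phase_time) for a sqrt(p) x sqrt(p) mesh."""
    q = math.isqrt(p)
    assert q * q == p, "p must be a perfect square"
    row_phase = (t_s + t_w * m) * q          # messages of size m
    col_phase = (t_s + t_w * m * q) * q      # messages of size m*sqrt(p)
    return row_phase, col_phase


row_t, col_t = mesh_all_to_all_phases(t_s=10, t_w=1, m=4, p=16)
```

Summing the two phases gives 2*t_s*sqrt(p) + t_w*m*(sqrt(p) + p); dropping the constant factor and the lower-order term leaves the t_s.sqrt(p) + t_w.m.p form in the bullets.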
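^ Pre: each $PE_i$ has $X_i$
^ Post (reduction to a single PE): that PE has Reduce(op, $X_0, \ldots, X_{p-1}$)
^ Post (result at every PE): $PE_i$ has Reduce(op, $X_0, \ldots, X_{p-1}$)
^ Why associative?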
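One way to approach the associativity question is to compare a sequential left-to-right reduction with a pairwise (tree-shaped) reduction, which is the kind of regrouping a parallel reduction performs. This is an illustrative sketch; `tree_reduce` is a hypothetical helper, not a library function.

```python
from functools import reduce
import operator


def tree_reduce(op, xs):
    """Combine neighbors pairwise, level by level, keeping the left-to-right order."""
    xs = list(xs)
    while len(xs) > 1:
        paired = [op(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:               # odd element carries over unchanged
            paired.append(xs[-1])
        xs = paired
    return xs[0]


values = [1, 2, 3, 4]
same = tree_reduce(operator.add, values) == reduce(operator.add, values)
differs = tree_reduce(operator.sub, values) != reduce(operator.sub, values)
```

Addition is associative, so both groupings agree; subtraction is not, so the tree grouping (1-2)-(3-4) and the sequential grouping ((1-2)-3)-4 give different answers.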
^ Every PE has two variables
^ my-result: add incoming value from $PE_i$ if i < my-id
^ total: add incoming value
^ exchange total
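The two-variable scheme can be sketched as a prefix sum on a hypercube, assuming p is a power of two and that in phase i each PE exchanges its running total with the partner whose ID differs in bit i. The function name is hypothetical, and addition stands in for the reduction operator.

```python
def hypercube_prefix_sum(values):
    """Simulate prefix sums over p = 2**d PEs; returns my-result for each PE."""
    p = len(values)
    d = p.bit_length() - 1
    assert p >= 1 and (1 << d) == p, "p must be a power of two"
    my_result = list(values)   # becomes the prefix sum X_0 + ... + X_id
    total = list(values)       # running sum over the sub-cube seen so far
    for i in range(d):
        # Exchange totals with the partner across dimension i.
        incoming = [total[node ^ (1 << i)] for node in range(p)]
        for node in range(p):
            partner = node ^ (1 << i)
            if partner < node:             # only lower-numbered PEs contribute
                my_result[node] += incoming[node]
            total[node] += incoming[node]  # total always absorbs the incoming value
    return my_result


prefix = hypercube_prefix_sum([1, 2, 3, 4, 5, 6, 7, 8])
```

After log(p) exchange phases, each PE's my-result holds the sum of the values of all PEs with an ID less than or equal to its own.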
^ (SF, CT) * (ring, mesh, hypercube)
^ Along different paths
^ Simultaneous communication over all ports of a node
^ E.g. for reduction (switches combine messages)