










Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Prof. Bhairav Gupta delivered this lecture at Ankit Institute of Technology and Science for Parallel Processing course. It includes: SPMD, Program, Elements, SMP, System, Parallel, Machines, Communication, Costs, Programming, Model, Semantics
Typology: Slides
1 / 18
This page cannot be seen from the preview
Don't miss anything!











-^
-^
-^
-^
h^
Store-and-forward makes poor use of communicationresources.
-^
Packet routing breaks messages into packets andpipelines them through the network.
-^
Since packets may take different paths, each packetmust carry routing information, error checking,sequencing, and other related header information.
-^
The total communication time for packet routing isapproximated by:
-^
The factor
t
h^
accounts for overheads in packet headers.
(m
p
Takes the concept of packet routing to an extreme byfurther dividing messages into basic units called flits.
-^
Since flits are typically small, the header informationmust be minimized.
-^
This is done by forcing all flits to take the same path, insequence.
-^
A tracer message first programs all intermediate routers.All flits then take the same route.
-^
Error checks are performed on the entire message, asopposed to flits.
-^
No sequence numbers are needed.
Group communication operations are built usingpoint-to-point messaging primitives.
-^
Recall from our discussion of architectures thatcommunicating a message of size
m
over an
uncongested network takes time (
w
We use this as the basis for our analyses. Wherenecessary, we take congestion into account explicitlyby scaling the
tw
term.
We assume that the network is bidirectional and thatcommunication is single-ported.
Simplest way is to send
p-
messages from the source
to the other
p-
processors - this is not very efficient.
Use recursive doubling: source sends a message to aselected processor. We now have two independentproblems derived over halves of machines.
-^
Reduction can be performed in an identical fashion byinverting the process.
One-to-All Broadcast on a Ring: Recursive doubling
One-to-all broadcast on an eight-node ring. Node 0 is the source of the broadcast. Each
message transfer step is shown by a numbered, dotted arrow from the source of themessage to its destination. The number on an arrow indicates the time step during
which the message is transferred.
One-to-all broadcast on a 16-node mesh.
One-to-all broadcast on a three-dimensional hypercube. The binary
representations of node labels are shown in parentheses.