Message-Passing Programming
Jingke Li
Portland State University
Jingke Li (Portland State University) CS 415/515 Message-Passing Programming 1 / 26
Hardware Characteristics
• Nodes are independent computers with private memory
• Processors communicate via message passing through an
interconnection network
Basic Programming Issues
• Decomposition —
Partitioning data and computation, and distributing them to
processors.
• Which first, data decomposition or computation decomposition?
• How to select a decomposition strategy?
• Communication —
Passing messages between processors to facilitate data sharing and
computation synchronization.
• Figuring out senders and receivers for each communication
• Selecting proper communication routines
• Placing communication routines in the program
• Deciding when to invoke communication routines
Data and Computation Decomposition
• Computation Decomposition First —
Decompose the computation workload into disjoint tasks, and map
them to the processors first. Partition and distribute data later.
Since these tasks are likely to access the same data set, a large
number of messages may have to be generated regardless of how the
data are distributed. Not well suited to large-scale message-passing systems.
• Data Decomposition First —
Decompose the data into small portions, and map them to the
processors. For each data portion, the associated computation is
carried out on the assigned processor.
Data Decomposition Strategies
• Block — A data domain is decomposed into blocks of contiguous
elements.
(Figure: a data domain divided among P1, P2, P3, P4 in four block shapes:
(a) (1, 8)-block, (b) (2, 4)-block, (c) (4, 2)-block, (d) (8, 1)-block.)
• Cyclic — A data domain is decomposed into small blocks, then the
small blocks are assigned to processors in a round-robin fashion.
(Figure: small blocks assigned in the order P1, P2, P3, P4, P1, P2, P3, P4.)
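The two strategies above can be written as owner functions that map an element index to a processor. This is a minimal one-dimensional sketch; the names block_owner and cyclic_owner and the 0-based processor numbering are assumptions made for illustration, not part of the original slides.

```python
def block_owner(index, n, p):
    """Owner of element `index` when n elements are split into p contiguous blocks."""
    block_size = -(-n // p)  # ceiling division, so every element gets an owner
    return index // block_size


def cyclic_owner(index, p, chunk=1):
    """Owner of element `index` when chunks of `chunk` elements are dealt round-robin."""
    return (index // chunk) % p


n, p = 8, 4
print([block_owner(i, n, p) for i in range(n)])   # [0, 0, 1, 1, 2, 2, 3, 3]
print([cyclic_owner(i, p) for i in range(n)])     # [0, 1, 2, 3, 0, 1, 2, 3]
```

With chunk greater than 1 the cyclic function gives a block-cyclic layout, which is the "small blocks" case described above.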
Fine-Grained Decomposition
Advantages:
• The program does not have to know the number of processors
available on the target machine, making it more flexible and portable.
• Can use the decomposition that best fits the application domain.
Disadvantages: Requires aggregation
Aggregation — The goal is to minimize communication overhead.
• Surface-to-Volume Effects — Volume corresponds to the amount of
data involved in local computation; surface often corresponds to the
amount of data to be communicated.
• Communication Patterns — Different aggregations result in different
communication patterns.
Surface-Volume Example
4 × 4 block: Volume = 4 × 4 = 16, Surface = 4 × 8 = 32
8 × 2 block: Volume = 8 × 2 = 16, Surface = 8 × 4 + 2 × 4 = 40
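The two cases above can be reproduced with a few lines of arithmetic. The surface formula used here, 4 × (rows + cols), is an assumption inferred from the two examples (both fit it); the slides do not state the formula, and the figure that presumably explained it is missing.

```python
def volume(rows, cols):
    """Elements held locally in a rows-by-cols block."""
    return rows * cols


def surface(rows, cols):
    """Assumed communication amount: 4 * (rows + cols), inferred from the examples."""
    return 4 * (rows + cols)


for rows, cols in [(4, 4), (8, 2)]:
    v, s = volume(rows, cols), surface(rows, cols)
    print(f"{rows} x {cols}: volume={v} surface={s} surface/volume={s / v}")
```

Under this assumed formula, blocks with the same volume communicate least when they are closest to square.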
Data Alignment
Programs often contain multiple data objects. Data decomposition needs
to take the dependencies between these objects into consideration;
otherwise communication cost may be higher than necessary. For example:
forall i=1,n
  forall j=1,n
    b(j) = b(j) + a(i,j)
  end forall
end forall
(Figure: the two-dimensional domain of a, with axes i and j, beside the
one-dimensional domain of b.)
Two possible alignments for the two domains:
• Align array b with first row of array a.
• Align array b with first column of array a.
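To see why the choice of alignment matters, the sketch below counts how many a(i,j) references in the loop live on a different processor than the b(j) they update. It assumes that a is distributed in square blocks over a two-dimensional processor grid and that indices are 0-based; neither assumption comes from the slides, and the function names are hypothetical.

```python
def owner_of_a(i, j, block):
    """Processor (grid row, grid column) owning a(i, j) under a 2-D block layout."""
    return (i // block, j // block)


def remote_accesses(n, block, align):
    """Count a(i, j) references whose owner differs from the owner of b(j)."""
    remote = 0
    for i in range(n):
        for j in range(n):
            if align == "row":
                owner_of_b = owner_of_a(0, j, block)   # b(j) placed with a(0, j)
            else:
                owner_of_b = owner_of_a(j, 0, block)   # b(j) placed with a(j, 0)
            if owner_of_a(i, j, block) != owner_of_b:
                remote += 1
    return remote


print(remote_accesses(8, 4, "row"))     # 32
print(remote_accesses(8, 4, "column"))  # 48
```

For this loop, aligning b with the first row of a keeps more references local, because b(j) then sits in the same processor column as column j of a.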
Send/Receive Primitives
Provide the basic message-passing service from one source node to one
destination node. The two nodes do not have to be connected by a
physical link, i.e. any node can send a message to any other node.
• Sender — The user program issues a
send routine call; the routine copies
data from the user space and sends it
to the destination (it may use an
intermediate buffer).
• Receiver — The user program issues
a receive routine call; the routine
receives the message sent by the
sender (it may need to wait) and
places it in user space.
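The sender and receiver roles above can be mimicked on one machine with two threads and a queue standing in for the network. This is only a sketch: send, receive, and demo are hypothetical names, and a real message-passing library would move the data between separate address spaces.

```python
import queue
import threading


def send(channel, data):
    """Copy the data out of the caller's buffer and hand the copy to the channel."""
    channel.put(list(data))


def receive(channel):
    """Wait until a message is available, then return it to the caller."""
    return channel.get()


def demo():
    channel = queue.Queue()
    received = []

    def receiver():
        received.append(receive(channel))

    worker = threading.Thread(target=receiver)
    worker.start()
    buffer = [1, 2, 3]
    send(channel, buffer)
    buffer[0] = 99          # the message was copied, so this does not affect it
    worker.join()
    return buffer, received[0]


print(demo())   # ([99, 2, 3], [1, 2, 3])
```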
Blocking vs. Non-Blocking
Depending on the timing of return, a send or receive routine can be either
blocking or non-blocking.
• Blocking means the send/receive routine will block until it is “safe”
to return — when a blocking routine returns, it is safe to issue other
sends/receives.
• For a blocking send routine, safe means that the message data can be
modified and the communication buffer can be reused.
• For a blocking receive routine, safe means the message has been
received and is available for use. If the message has not arrived at the
time the receive routine is issued, it will wait until the message arrives.
• If not careful, blocking sends and receives can lead to deadlock.
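The deadlock risk can be demonstrated with two threads that exchange one value each. In this sketch the queues act as buffered channels, so a send never blocks; the deadlock comes from both sides issuing their blocking receive first. The timeout exists only so the demonstration terminates, and all names are hypothetical.

```python
import queue
import threading


def exchange(inbox, outbox, value, receive_first, timeout=0.5):
    """Swap one value with a partner; returns None if the receive never completes."""
    def blocking_receive():
        try:
            return inbox.get(timeout=timeout)
        except queue.Empty:
            return None     # without the timeout this call would wait forever

    if receive_first:
        got = blocking_receive()
        if got is None:
            return None     # gave up before ever reaching our own send
        outbox.put(value)
        return got
    outbox.put(value)
    return blocking_receive()


def run_pair(a_receives_first, b_receives_first):
    a_to_b, b_to_a = queue.Queue(), queue.Queue()
    results = {}

    def worker(name, inbox, outbox, value, receive_first):
        results[name] = exchange(inbox, outbox, value, receive_first)

    threads = [
        threading.Thread(target=worker, args=("A", b_to_a, a_to_b, "from A", a_receives_first)),
        threading.Thread(target=worker, args=("B", a_to_b, b_to_a, "from B", b_receives_first)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pair(True, True))    # both receive first: neither gets a message
print(run_pair(False, True))   # A sends first: the exchange completes
```

Ordering the calls so that at least one side sends before it receives is enough to break the cycle in this two-node case.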
Blocking vs. Non-Blocking (cont.)
• A non-blocking routine always returns immediately. It does not wait for
the send or receive action to finish. The benefit is that the user
program can quickly move on to do something useful, instead of just
waiting.
• When a non-blocking send routine returns, it is not safe to alter the
message data or to reuse the communication buffer.
One issue needs to be resolved — no matter how long one waits after
the routine returns, there is no guarantee that the send is finished.
Solution: Use a specific routine to test for the completion of the send.
• A non-blocking receive routine returns after checking the local buffer,
regardless of whether the expected message has arrived; if it has not
arrived, the process does not get the message.
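The completion test mentioned above can be sketched with a request handle returned by the send call. Here isend, irecv, and SendRequest are hypothetical names loosely modeled on the usual non-blocking style; a background thread stands in for the communication system.

```python
import queue
import threading


class SendRequest:
    """Handle returned by isend; lets the caller test or wait for completion."""

    def __init__(self):
        self._done = threading.Event()

    def _complete(self):
        self._done.set()

    def test(self):
        return self._done.is_set()

    def wait(self):
        self._done.wait()


def isend(channel, buffer):
    """Start the send and return at once; the buffer is in use until the request completes."""
    request = SendRequest()

    def transfer():
        channel.put(list(buffer))   # the copy happens in the background
        request._complete()

    threading.Thread(target=transfer).start()
    return request


def irecv(channel):
    """Check the local buffer once; return (arrived, message) without waiting."""
    try:
        return True, channel.get_nowait()
    except queue.Empty:
        return False, None


channel = queue.Queue()
print(irecv(channel))          # (False, None): nothing has arrived yet
request = isend(channel, [1, 2, 3])
request.wait()                 # only after this is it safe to reuse the buffer
print(request.test(), irecv(channel))
```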
“One-Way” Communication
The data producers (senders) are passive; they only respond to requests
from the consumers (receivers).
“Remote reads” and “remote writes.”
(Figure: tasks labeled C issue requests such as read(1), read(3), and
write(5) to tasks labeled D.)
The distributed data structure is encapsulated in a set of tasks responsible
only for responding to read and write requests.
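A data task of this kind can be sketched as a loop that does nothing except answer requests. The request tuple layout, the reply queues, and the function names below are assumptions made for illustration; a thread stands in for the remote task.

```python
import queue
import threading


def data_task(requests, store):
    """Passive owner of `store`: serves read and write requests until told to stop."""
    while True:
        op, key, value, reply = requests.get()
        if op == "stop":
            break
        if op == "write":
            store[key] = value
            reply.put(None)             # acknowledge the write
        else:                           # "read"
            reply.put(store.get(key))


def remote_write(requests, key, value):
    reply = queue.Queue()
    requests.put(("write", key, value, reply))
    reply.get()                         # wait for the acknowledgement


def remote_read(requests, key):
    reply = queue.Queue()
    requests.put(("read", key, None, reply))
    return reply.get()


def demo():
    requests = queue.Queue()
    task = threading.Thread(target=data_task, args=(requests, {1: "a", 3: "c"}))
    task.start()
    first = remote_read(requests, 1)
    remote_write(requests, 5, "e")
    second = remote_read(requests, 5)
    requests.put(("stop", None, None, None))
    task.join()
    return first, second


print(demo())   # ('a', 'e')
```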
Collective Communications
Synchronous concurrent messages implementing global communication
patterns.
• One-to-Many — Spread data from one node to many other nodes.
• Broadcast: send same data to every other node.
• Multicast: send same data to a set of nodes.
• Scatter: send different data to different nodes.
(Figure: broadcast and multicast.)
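The three one-to-many patterns differ only in who receives what. The sketch below models the outcome of each operation as a mapping from node to received value; it says nothing about how a library would route the messages, and the function names are hypothetical.

```python
def broadcast(root, value, nodes):
    """Every node except the root receives the same value."""
    return {node: value for node in nodes if node != root}


def multicast(value, targets):
    """Only the chosen target nodes receive the value."""
    return {node: value for node in targets}


def scatter(items, nodes):
    """Each node receives its own item: items[k] goes to nodes[k]."""
    return dict(zip(nodes, items))


print(broadcast(0, "x", range(4)))        # {1: 'x', 2: 'x', 3: 'x'}
print(multicast("x", [1, 3]))             # {1: 'x', 3: 'x'}
print(scatter([10, 20, 30], [1, 2, 3]))   # {1: 10, 2: 20, 3: 30}
```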
Collective Communications (cont.)
• Many-to-One — Combine messages from many nodes to one node.
• Reduce: combine multiple data items into a single value using a
reduction function (+, −, ×, /, min, max, etc.).
• Gather: collect multiple data items at a single node.
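Gather and reduce can be modeled the same way, as functions of what each node contributes. In this sketch the contributions are given as a dictionary keyed by node number, and the result is what the root node ends up holding; the names gather and reduce_to_root are hypothetical.

```python
import operator
from functools import reduce as fold


def gather(contributions):
    """Root collects one value per node, ordered by node number."""
    return [contributions[node] for node in sorted(contributions)]


def reduce_to_root(contributions, op):
    """Root ends up with a single value: the contributions combined with `op`."""
    return fold(op, gather(contributions))


values = {2: 30, 0: 10, 1: 20}
print(gather(values))                        # [10, 20, 30]
print(reduce_to_root(values, operator.add))  # 60
print(reduce_to_root(values, max))           # 30
```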
Collective Communications (cont.)
• Many-to-Many — Concurrent, disjoint send/receive pairs.
(Figure: example patterns: shift, transpose, rotate, flip, skew.)
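Three of the named patterns can be sketched as permutations in which every node sends one value and receives at most one. The figure defining them is not available, so the definitions below follow common usage and should be read as assumptions: rotate wraps around, shift drops values pushed past the end, and transpose swaps grid coordinates.

```python
def rotate(values, k):
    """Node p sends to node (p + k) mod n; returns what each node receives."""
    n = len(values)
    received = [None] * n
    for p, v in enumerate(values):
        received[(p + k) % n] = v
    return received


def shift(values, k, fill=None):
    """Node p sends to node p + k; values pushed past either end are dropped."""
    n = len(values)
    received = [fill] * n
    for p, v in enumerate(values):
        if 0 <= p + k < n:
            received[p + k] = v
    return received


def transpose(grid):
    """Node (i, j) sends to node (j, i) on a square processor grid."""
    return [list(row) for row in zip(*grid)]


print(rotate(["a", "b", "c", "d"], 1))   # ['d', 'a', 'b', 'c']
print(shift(["a", "b", "c", "d"], 1))    # [None, 'a', 'b', 'c']
print(transpose([[1, 2], [3, 4]]))       # [[1, 3], [2, 4]]
```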
How to Reduce Communication Overhead?
• Increasing locality — localized communication (i.e. sender and
receiver are neighbors) is less likely to interfere with other
communications
• Vectorizing messages — grouping small messages together; sending
fewer, larger messages
• Utilizing collective communications — collective communications are
often implemented as library routines, which are often optimized for
the given architecture
• Overlapping communication with computation — hiding
communication latency; this can be very effective
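The benefit of vectorizing messages can be estimated with a simple cost model in which each message pays a fixed startup cost plus a cost per item. The model and the numbers below (a startup cost of 100 units and 1 unit per item) are hypothetical values chosen for illustration, not measurements.

```python
def message_time(items, latency, per_item):
    """Cost of one message: fixed startup cost plus a cost per item."""
    return latency + per_item * items


def total_time(num_messages, items_per_message, latency=100, per_item=1):
    """Cost of sending the same number of items in every message."""
    return num_messages * message_time(items_per_message, latency, per_item)


print(total_time(100, 1))   # 100 one-item messages: 10100
print(total_time(1, 100))   # one 100-item message: 200
```

The same 100 items cost far less as one message because the startup cost is paid once instead of 100 times.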
Languages and Tools
• CSP (Communicating Sequential Processes):
• An explicit message-passing language, proposed by Hoare.
• It extends ordinary sequential programming with a minimal set of
constructs for expressing parallelism, communication, and
indeterminacy.
• PVM (Parallel Virtual Machine):
• Provides a software environment for message passing between
homogeneous or heterogeneous computers and has a collection of
library routines that the user can employ with C or Fortran programs.
• Developed by Oak Ridge National Laboratory; available free of charge.
• The first such tool; widely used at universities and research labs.
• MPI (Message Passing Interface):
• A standard developed by a group of academics and industrial partners.
• Defines a much richer set of message passing routines than PVM.
• Several free implementations exist, e.g. MPICH and LAM.