Distributed Parallel Programming

CMPSCI 377 Operating Systems

10.1 Distributed parallel programming

So far we have focused on using threads to exploit concurrency. The problem with using only threads is that a single machine eventually runs out of resources (cores, memory, etc.). So, instead of programming with threads, that is, using shared memory on a single machine, we now focus on using message passing to distribute the processing across several machines. This approach works well for highly parallelizable problems, such as simulating the weather or nuclear blasts, or solving bioinformatics problems. Because this approach requires communication across the network, which is slow, we typically want each machine to perform as much computation as it can on its own, and only then to communicate its results, sparingly, to the other nodes in the network.

10.1.1 Message passing

Message passing is the mechanism that allows parallel computers to communicate with each other. Using message passing assumes that we have a good way of partitioning the problem across a set of machines. In general, message passing is efficient because it makes data sharing explicit, and because it can communicate only what is strictly necessary for performing the computation[1]. However, because message passing requires the problem to be partitioned manually, its use is not trivial.
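As a rough single-machine illustration of these ideas (not MPI itself), the sketch below uses POSIX processes and pipes, which, like separate machines, share no memory. The function name `message_passing_sum` and the limit of 64 workers are assumptions made for this example. The problem (summing 1..n) is partitioned by hand, each worker computes its partial sum locally, and the only data communicated is one number per worker.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sums 1..n using `workers` processes that share no
 * memory. Each worker sends back exactly one message (a single long) over
 * its own pipe. Returns -1 on error; error handling is kept minimal. */
long message_passing_sum(long n, int workers)
{
    assert(n >= 0 && workers > 0 && workers <= 64);
    int fds[64][2];
    pid_t pids[64];

    for (int w = 0; w < workers; w++) {
        if (pipe(fds[w]) == -1)
            return -1;
        pids[w] = fork();
        if (pids[w] == -1)
            return -1;
        if (pids[w] == 0) {               /* worker process */
            close(fds[w][0]);
            long partial = 0;
            /* Manual partitioning: worker w takes w+1, w+1+workers, ... */
            for (long i = w + 1; i <= n; i += workers)
                partial += i;             /* purely local computation */
            ssize_t sent = write(fds[w][1], &partial, sizeof partial);
            _exit(sent == (ssize_t)sizeof partial ? 0 : 1);
        }
        close(fds[w][1]);                 /* parent keeps only the read end */
    }

    long total = 0;
    int ok = 1;
    for (int w = 0; w < workers; w++) {
        long partial = 0;
        /* Explicit communication: receive one result from worker w. */
        if (read(fds[w][0], &partial, sizeof partial) != (ssize_t)sizeof partial)
            ok = 0;
        total += partial;
        close(fds[w][0]);
        waitpid(pids[w], NULL, 0);
    }
    return ok ? total : -1;
}
```

Calling `message_passing_sum(100, 4)` would split the range over four workers and combine four single-number messages. Compare this with threads, where every worker could read and write every variable of the program.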

Message passing can be used on a variety of computer system architectures, from large clusters of machines, to NUMA supercomputers, to SMPs. Its advantage is that it performs well on all of these architectures. Shared-memory parallelism (threads) can perform well on SMPs, but does not perform well on distributed cluster systems.

The actual implementation of a message-passing program usually makes use of the Message Passing Interface (MPI). MPI is a language-independent communications protocol used to program parallel computers. It is implemented as a library, generally produced by machine vendors in a version optimized for their systems. For more details, please check the slides and also the Wikipedia entry for MPI: http://en.wikipedia.org/wiki/Message_Passing_Interface.

MPI’s execution model is what is called “SPMD,” standing for “Single Program Multiple Data.” Each machine in the cluster runs the same program, with different data and different local memory.
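One common way to give the same program different data on each machine is to let every process derive its own slice of the work from its identity. The helper below is a minimal sketch of such a block decomposition; `block_range` is a hypothetical name for this example and is not part of MPI.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): computes the half-open range
 * [*lo, *hi) of the n work items owned by process `rank` out of `size`.
 * The first n % size ranks each receive one extra item. */
void block_range(long n, int rank, int size, long *lo, long *hi)
{
    assert(n >= 0 && size > 0 && rank >= 0 && rank < size);
    long base  = n / size;   /* items every rank gets */
    long extra = n % size;   /* leftover items, one each to the low ranks */
    *lo = rank * base + (rank < extra ? rank : extra);
    *hi = *lo + base + (rank < extra ? 1 : 0);
}
```

In an MPI program, `rank` and `size` would be the values obtained from MPI_Comm_rank and MPI_Comm_size, so every process runs identical code yet works on a different range.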

Let us now see how we could use MPI to implement a program that runs in parallel on several machines:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;  // size = number of processes running this program
                     // rank = which process am I?

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("hello world from process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}

[1] In contrast to threads, which implicitly share everything.

We could start, say, 10 copies of this program, spread over the available machines, by running:

mpirun -np 10 exampleProgram

Notice that printf “magically” passes its output back to the machine that ran mpirun!
