Distributed Computing: Making Multiple Computers Look Like a Single System - Prof. Peter M, Exams of Electrical and Electronics Engineering

Covers various aspects of distributed computing, including distributed shared memory, remote procedure calls, process migration, parallelizing compilers, distributed file systems, and the Internet. It explains how to make multiple computers appear as a single system and gives examples of protocols such as NFS, HTTP, e-mail, ssh, RPC, UDP, TCP, IP, and Ethernet.

Typology: Exams

Pre 2010

Uploaded on 09/02/2009

EECS 482 1 Peter M. Chen
Networks and distributed computing

Hardware reality
  • lots of different manufacturers of NICs
  • network card has a fixed MAC address, e.g. 00:01:03:1C:8A:2E
  • send packet to MAC address (max size 1500 bytes)
  • packets may be reordered, corrupted, dropped, duplicated
  • anyone can sniff the packets from the network

What abstractions does the OS provide for network communication?

Distributed computing (not covered much in EECS 482): making multiple computers look more like a single computer
  • distributed shared memory: make multiple memories look like 1 memory
  • remote procedure call, process migration, parallelizing compilers: make multiple CPUs look like one CPU
  • distributed file systems: make disks on multiple computers look like one file system

Abstractions and protocol layers

Why build up abstractions in layers?

[Figure: protocol layers, listed top to bottom: NFS, HTTP, e-mail, ssh, RPC, UDP, TCP, IP, Ethernet, ATM, PPP]

Routing

Hardware interface: deliver to neighbor computer on LAN
Application interface: deliver to final destination through several hops
Provided by the IP (Internet Protocol) layer
Messages on a LAN (e.g. Ethernet) are sent via the physical ID of the network interface card (e.g. 0:a0:c9:95:f5:58)

[Figure: computer 1, computer 2, and computer 3 connected through an Ethernet switch]

Translation from hostname to IP address is provided by DNS (domain name system)

Used to be done with one central server
  • central server has to learn about all changes
  • central server has to answer all lookups

Split up the data into a hierarchical database (each DNS server stores part of the database). Hierarchy allows local management (so everybody doesn’t notify one central server whenever their hostname changes), and spreads the lookup work across multiple servers

Example: translating www.eecs.umich.edu
  • start with the (well-known) IP address of the root name server (A.ROOT-SERVERS.NET, 198.41.0.4)
  • ask root name server for IP address of the edu name server (also A.ROOT-SERVERS.NET, 198.41.0.4)
  • ask edu name server for the IP address of the umich.edu name server (dns.itd.umich.edu, 141.211.144.15)
  • ask umich.edu name server for the IP address of eecs.umich.edu name server (zip.eecs.umich.edu, 141.213.4.4)
  • ask eecs.umich.edu name server for the IP address of www.eecs.umich.edu: 141.213.4.
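The lookup walk in the example can be sketched as a loop that starts at the root and follows one delegation per query. This is a toy, in-memory sketch and not a real resolver: the `delegation` table, the `example.edu` names, the 192.0.2.x addresses (a range reserved for documentation), and the function names are all hypothetical, and no network traffic is sent.

```c
#include <stddef.h>
#include <string.h>

/* One row: the server for `zone` knows the address for `child`.
 * All names and addresses below are hypothetical placeholders. */
struct delegation {
    const char *zone;     /* zone whose name server is being asked ("" = root) */
    const char *child;    /* more specific zone, or the final hostname */
    const char *address;  /* address returned for `child` */
};

static const struct delegation table[] = {
    { "",               "edu",                "192.0.2.1"  },
    { "edu",            "example.edu",        "192.0.2.2"  },
    { "example.edu",    "cs.example.edu",     "192.0.2.3"  },
    { "cs.example.edu", "www.cs.example.edu", "192.0.2.80" },
};

/* True if `name` equals `zone` or ends with "." followed by `zone`. */
static int in_zone(const char *name, const char *zone)
{
    size_t n = strlen(name), z = strlen(zone);
    if (z > n || strcmp(name + (n - z), zone) != 0)
        return 0;
    return z == n || name[n - z - 1] == '.';
}

/* Resolve `hostname` by asking one server per level, starting at the root.
 * Stores the number of servers asked in *queries.
 * Returns the address string, or NULL if no server knows the name. */
const char *resolve(const char *hostname, int *queries)
{
    const char *zone = "";            /* start with the root name server */
    *queries = 0;
    for (;;) {
        const struct delegation *next = NULL;
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
            if (strcmp(table[i].zone, zone) == 0 &&
                in_zone(hostname, table[i].child)) {
                next = &table[i];
                break;
            }
        }
        if (next == NULL)
            return NULL;              /* current server has no answer */
        (*queries)++;
        if (strcmp(next->child, hostname) == 0)
            return next->address;     /* reached the hostname itself */
        zone = next->child;           /* descend one level and ask again */
    }
}
```

Each server only needs the rows for its own zone, which is the local-management property described above; here the rows simply share one array for brevity.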

Message size

Hardware interface: physical network type limits size of a message (e.g. Ethernet maximum packet size is 1500 bytes)
Application interface: can send larger message (e.g. IP maximum packet size is 64 KB)
IP layer can fragment a packet when it’s larger than the next hop’s MTU (maximum transmission unit), then re-assemble it at the destination
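As a rough illustration of the fragmentation arithmetic, the sketch below counts how many fragments a payload needs for a given MTU. It assumes IPv4, where fragment offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 payload bytes. The function names are hypothetical, and the header length is a parameter (20 bytes is the IPv4 header without options).

```c
#include <stddef.h>

/* Largest payload one fragment can carry: the MTU minus the IP header,
 * rounded down to a multiple of 8 (IPv4 fragment offsets use 8-byte units).
 * Returns 0 if the MTU leaves no room for payload. */
size_t fragment_payload(size_t mtu, size_t header_len)
{
    if (mtu <= header_len)
        return 0;
    return ((mtu - header_len) / 8) * 8;
}

/* Number of packets needed to carry payload_len bytes over a hop with the
 * given MTU. Returns 0 if the MTU cannot carry any payload. */
size_t fragments_needed(size_t payload_len, size_t mtu, size_t header_len)
{
    size_t per_fragment = fragment_payload(mtu, header_len);
    if (per_fragment == 0)
        return 0;
    if (payload_len + header_len <= mtu)
        return 1;                                  /* fits without fragmenting */
    return (payload_len + per_fragment - 1) / per_fragment;   /* round up */
}
```

For example, with a 1500-byte MTU and a 20-byte header each fragment carries up to 1480 payload bytes, so a 4000-byte payload needs three fragments (1480 + 1480 + 1040).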

Sockets and ports

Hardware interface: machine-to-machine communication (one network endpoint per machine)
Application interface: process-to-process communication (one or more network endpoints per process)

A process can ask the OS to create a “socket”, which will be one endpoint of a network connection
  • thread is like a virtual processor
  • address space is like a virtual memory
  • an endpoint (socket) is like a virtual network interface card

Each socket on a computer has a unique “port” number
  • a process can associate a specific port number with a socket using the bind call
  • when sending to a socket, the destination port number is included in each message. This allows the destination machine to know which process (and which socket in that process) should receive the message.

This allows the OS to multiplex several network connections onto a single physical card

[Figure: sockets 1, 2, and 3 belonging to process A and process B, multiplexed by the operating system onto one network interface card]

UDP (user datagram protocol) provides this process-to-process abstraction on top of IP
TCP (transmission control protocol) is also built on IP
  • provides additional abstractions beyond UDP: ordered, reliable, byte streams
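A minimal sketch of creating a socket and associating a port with it through bind, using the POSIX sockets API. The helper names `open_udp_socket` and `socket_port` are hypothetical, and binding to all local interfaces (INADDR_ANY) is an assumption made for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket and bind it to `port` on all local interfaces.
 * Passing port 0 asks the OS to pick a free port.
 * Returns the socket descriptor, or -1 on error. */
int open_udp_socket(unsigned short port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);           /* port travels in network byte order */

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Port number currently associated with the socket, or -1 on error. */
int socket_port(int sock)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(sock, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Two sockets opened this way get different port numbers, and a second bind to a port that is already in use fails, which is what lets the OS route each incoming message to exactly one socket.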

Duplicate messages are easy to detect (look at the sequence #) and fix (just drop the duplicate)

To detect corrupted messages, add some redundant information, e.g. checksum
  • if message is corrupted, simply drop it. This transforms the problem of a corrupted message into the problem of a dropped message (and we already know how to handle that).
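The two checks above can be sketched on the receiving side as follows. Everything here is an assumption made for illustration: the `message` layout, the additive toy checksum (real protocols use stronger ones), and the simplification that messages are not reordered, so any sequence number below the next expected one is a duplicate.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 64   /* hypothetical fixed payload capacity */

struct message {
    uint32_t seq;                   /* sequence number assigned by the sender */
    uint32_t len;                   /* bytes used in payload */
    uint8_t  payload[MAX_PAYLOAD];
    uint32_t checksum;              /* redundant information added by the sender */
};

/* Toy checksum: sum of the sequence number, length, and payload bytes. */
uint32_t compute_checksum(const struct message *m)
{
    uint32_t sum = m->seq + m->len;
    for (uint32_t i = 0; i < m->len && i < MAX_PAYLOAD; i++)
        sum += m->payload[i];
    return sum;
}

struct receiver {
    uint32_t next_seq;   /* lowest sequence number not yet delivered */
};

/* Returns 1 if the message should be delivered, 0 if it is dropped. */
int accept_message(struct receiver *r, const struct message *m)
{
    if (m->len > MAX_PAYLOAD || compute_checksum(m) != m->checksum)
        return 0;                  /* corrupted: now just a dropped message */
    if (m->seq < r->next_seq)
        return 0;                  /* duplicate: already delivered */
    r->next_seq = m->seq + 1;
    return 1;
}
```

A sender fills in `checksum` with `compute_checksum` before transmitting; the receiver recomputes it and silently drops anything that does not match.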

Byte streams

Hardware interface: send information over network in distinct messages
Application interface: send data in a continuous stream (similar to reading/writing a file)

TCP provides byte streams instead of distinct messages
Sender sends messages of arbitrary size that are combined into a single stream
TCP layer breaks up the stream into fragments, sends them as distinct messages, then reassembles them at the destination into a byte stream for the receiver. In contrast, UDP preserves the message boundary between sender and receiver.

E.g.
  • sender sends 100 bytes, then sends another 100 bytes
  • TCP receive may return 1-200 bytes

If receiver wants to receive a certain number of bytes, it must loop around the receive call
How to know # of bytes to receive?
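The loop around the receive call can be sketched with the POSIX `recv` function. The helper name `recv_all` is hypothetical, and the caller is assumed to already know how many bytes to expect; how it learns that number is the question left open above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receive exactly `len` bytes from a stream socket into `buf`.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        /* recv may return fewer bytes than requested, so ask for the rest */
        ssize_t n = recv(sock, p + got, len - got, 0);
        if (n > 0) {
            got += (size_t)n;
        } else if (n == 0) {
            return -1;             /* peer closed before all bytes arrived */
        } else if (errno != EINTR) {
            return -1;             /* real error; EINTR just retries */
        }
    }
    return 0;
}
```

With this helper, a receiver that wants both 100-byte sends from the example can call `recv_all(sock, buf, 200)` regardless of how the stream was split on the way.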

Why build distributed applications?

Performance: aggregate performance of many machines can be higher than the performance of a single fast machine
Co-location: locate different computers near local resources
  • examples of local resources: people, sensors, actuators
Reliability: can provide continuous service, even if one computer is down

Building distributed applications

Send/receive as communication primitive
  • how did we communicate between threads running on a single computer?
  • this doesn’t work for threads running across different computers (distributed applications)
  • to communicate, must send/receive messages

Send/receive as synchronization primitives
  • what hardware primitives did we build on top of to synchronize between threads on a single machine?
  • these don’t work for synchronizing between multiple machines
  • we’ll use send/receive as the atomic primitives that allow us to synchronize distributed applications

client_produce() {
    send produce request message to server
    wait for response
}

client_consume() {
    send consume message to server
    wait for response
}

server() {
    receive request (from any producer or any client)
    if (request is from a producer) {
        put coke in machine
    } else {
        take coke out of machine
    }
    send response
}

Problems with this code?
How to fix the code?

server() {
    receive request (from any producer or any client)
    if (request is from a producer) {
        fork a thread that calls server_produce()
    } else {
        fork a thread that calls server_consume()
    }
}

server_produce() {
    lock
    while (machine is full) {
        wait
    }
    put coke in machine
    send response to producer client
    unlock
}

server_consume() {
    lock
    while (machine is empty) {
        wait
    }
    take coke out of machine
    send response to consumer client
    unlock
}
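One way the thread-per-request server could look with POSIX threads is sketched below. It is only an approximation of the pseudocode: requests are simulated by counts passed to a hypothetical `serve` function instead of arriving over the network, responses are omitted, `MAX_COKES` and `MAX_REQUESTS` are made-up limits, and a condition-variable broadcast (not shown in the pseudocode) is added so that waiting threads re-check the machine.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_COKES    2    /* hypothetical machine capacity */
#define MAX_REQUESTS 16   /* hypothetical cap on requests per call */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static int cokes = 0;     /* shared state, protected by lock */

static void *server_produce(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    while (cokes == MAX_COKES)              /* machine is full */
        pthread_cond_wait(&changed, &lock);
    cokes++;                                /* put coke in machine */
    pthread_cond_broadcast(&changed);       /* let waiting threads re-check */
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *server_consume(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    while (cokes == 0)                      /* machine is empty */
        pthread_cond_wait(&changed, &lock);
    cokes--;                                /* take coke out of machine */
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Stand-in for the server loop: fork one thread per request, wait for all.
 * Returns the cokes left, or -1 if the request mix could never complete. */
int serve(int produce_requests, int consume_requests)
{
    if (produce_requests < 0 || produce_requests > MAX_REQUESTS ||
        consume_requests < 0 || consume_requests > MAX_REQUESTS)
        return -1;
    int total = produce_requests + consume_requests;
    int remaining = cokes + produce_requests - consume_requests;
    if (total > MAX_REQUESTS || remaining < 0 || remaining > MAX_COKES)
        return -1;

    pthread_t threads[MAX_REQUESTS];
    for (int i = 0; i < total; i++) {
        void *(*handler)(void *) =
            (i < consume_requests) ? server_consume : server_produce;
        if (pthread_create(&threads[i], NULL, handler, NULL) != 0)
            abort();      /* sketch: treat thread-creation failure as fatal */
    }
    for (int i = 0; i < total; i++)
        pthread_join(threads[i], NULL);
    return cokes;
}
```

Consumer threads are deliberately started first here so that some of them block on an empty machine until a producer thread runs, which is the situation the single-threaded server cannot handle.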

This creates a new thread for each request. How to lower the overhead of creating threads?

There are other ways of solving the problem (but threads are the cleanest, because each thread just has to keep track of one thing at a time, and it can be blocking, as long as it doesn’t hold a lock)
  • polling (using select())
  • signals (using SIGIO)

RPC

We’ve been using send/receive. E.g. client sends request to server, server receives request, then sends response message to client
  • this exposes the distributed nature of the system to the programmer
  • we’d like to make building a distributed application as similar as possible to building a centralized application

What else in programming is like making a request to a server and getting a response?

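[Figure: client stub and server stub]

Note that client makes a normal function call, server function is called like a normal function
RPC is the mechanism behind CORBA and COM

Producer consumer using RPC

This uses datagrams (like UDP) and assumes messages are reliable

Client stub is named produce()

    #include <sys/socket.h>

    extern int sock;    /* assumed: connected datagram socket, set up elsewhere */

    /* Client stub: looks like a local produce(), but forwards the call. */
    int produce(int n) {
        int status;
        send(sock, &n, sizeof(n), 0);             /* ship the argument to the server */
        recv(sock, &status, sizeof(status), 0);   /* block until the result comes back */
        return status;
    }

Server stub can be named anything

    #include <sys/socket.h>

    extern int sock;              /* assumed: connected datagram socket, set up elsewhere */
    extern int produce(int n);    /* the real produce(), linked into the server */

    /* Server stub: unpacks the argument, calls the real function, replies. */
    void produce_stub(void) {
        int n;
        int status;
        recv(sock, &n, sizeof(n), 0);
        status = produce(n);    /* call “produce” function on server */
        send(sock, &status, sizeof(status), 0);
    }

Client and server stubs can be generated automatically. What information do you need to generate the stub?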

Problems with RPC

RPC tries to make request/response to remote server look like a function call, but some differences remain

Hard to pass pointers and global variables
  • what happens if you pass a pointer to the remote server and the server de-references it?
  • one way to fix this is to also send the data being pointed to, then change the pointers on the server to point to the remote copy of the data, then copy the data back to the client when the server is finished

Data might have different representations on different machines
  • solve by agreeing to some conventional format

Different failure modes can occur in RPC than in a normal function call
  • e.g. server fails but client stays up
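For the data-representation problem, one widely used conventional format for integers is network byte order (big-endian), which the standard `htonl` and `ntohl` functions convert to and from. The sketch below shows how a stub might marshal a 32-bit argument this way; the helper names are hypothetical, and it assumes the usual two's-complement representation when converting back to a signed value.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Write `value` into `buf` (4 bytes) in network byte order (big-endian),
 * so that machines with different native byte orders agree on the bytes. */
void marshal_int32(unsigned char buf[4], int32_t value)
{
    uint32_t wire = htonl((uint32_t)value);
    memcpy(buf, &wire, sizeof wire);
}

/* Read a 32-bit value that was written by marshal_int32. */
int32_t unmarshal_int32(const unsigned char buf[4])
{
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return (int32_t)ntohl(wire);
}
```

A client stub would send the 4-byte buffer instead of the raw `int`, and the server stub would unmarshal it before calling the real function.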

Structuring a concurrent system

1 multi-threaded process on 1 computer
1 multi-threaded process on each of several computers

[Figure labels: send, receive]