Systems Programming Lec5 - Introduction to Distributed Systems

Description: Systems Programming, Introduction to Distributed Systems. Topics: Basic Concepts, Tightly-Coupled systems, Motivation for Loosely-Coupled Distributed Systems, Distributed System Architecture.

Typology: Study notes

2010/2011

Uploaded on 09/10/2011

aristocrat

Systems Programming

Introduction to Distributed Systems


Introduction to Distributed Systems – Basic Concepts

A ‘distributed’ computing system is one in which the computation is distributed (spread) across multiple processing entities.

In addition to processing, various aspects of the ‘system’ can be distributed:

  • Operating System
  • File System
  • Database / data
  • Authentication
  • Business logic
  • Workload (resource allocation / load sharing)

Challenges include:

  • Scalability
  • Avoiding a single point of failure (centralisation)
  • Replication
  • Availability and performance
  • Naming, addressing and locating resources
  • Binding (mapping between parts of the system)


Introduction to Distributed Systems – Loosely-Coupled systems

The processor units are within separate computers.

The computers are connected by a network technology.

General purpose hardware:

  • Cheap
  • Abundant

Challenges include:

Each computer has its own clock:

  • absolute synchronisation is NOT possible.
  • ‘loose’ synchronisation is necessary.

Computers have separate memory, so shared memory cannot be used for inter-processor communication.

The individual computers are autonomous and need some overall guidance.

The individual computers are heterogeneous: they differ in memory size, disk size, processor speed, hardware platform, operating system etc.


Introduction to Distributed Systems – Motivation for Loosely-Coupled Distributed Systems

Interest in distributed systems has grown because of:

  • The need to share large amounts of data,
  • The availability of cheap workstations,
  • The need to share expensive peripherals,
  • The availability of cheap high-speed networks,
  • The need for local control but overall access,
  • The need to communicate and interact,
  • The need for flexibility of growth,
  • The need to provide users with facilities with realistic response times.


Introduction to Distributed Systems – Distributed System Architecture 1

The Processor-Pool model

(‘Grid Computing’ is based on this model, but tends to be larger scale and can span multiple organisations.)


Introduction to Distributed Systems – Transparency 1

Distributed systems present numerous challenges to the developer, such as:

  • Where is the process? Can it be moved?
  • Where is the data / resource? Can it (data) be moved?
  • Providing robustness, and dealing with failures
  • Ensuring consistency
  • Building scalable systems (communication efficiency, interaction model)

Transparency means hiding the details of distribution.

The goal is to reduce the burden on developers so that they can focus their efforts on the ‘business logic’ of the application and not have to deal with the vast array of technical issues that arise because of distribution.


Introduction to Distributed Systems – Transparency 3

Failure transparency

Faults are concealed such that applications can continue without knowledge that a fault has occurred.

Migration transparency

(For data objects) Objects can be moved without affecting the operation of applications that use those objects.

(For processes) Processes can be moved without affecting their operations or results.

Performance transparency

The performance of systems should degrade gracefully as the load on the system increases.

Scaling transparency

It should be possible to scale up an application, service or system without changing the system structure or algorithms.
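As a rough illustration of failure transparency, the sketch below hides a failed replica from the caller by quietly trying the next one. The names call_with_failover, broken_replica and healthy_replica are hypothetical, and the replicas are plain Python functions standing in for remote services; this is a minimal sketch under those assumptions, not a complete fault-tolerance mechanism.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised only when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in turn and return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # The fault is recorded here but never shown to the caller.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed")


def broken_replica(request: str) -> str:
    # Hypothetical stand-in for a remote service that is down.
    raise ConnectionError("replica unreachable")


def healthy_replica(request: str) -> str:
    # Hypothetical stand-in for a remote service that is up.
    return f"result for {request}"


if __name__ == "__main__":
    print(call_with_failover([broken_replica, healthy_replica], "query-1"))
```

The caller sees only a result. The fault surfaces only when every replica has failed, at which point it can no longer be concealed.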


Introduction to Distributed Systems – Application Issues

Parallel processing

  • Multiple processors are used to reduce the execution time (the goal is a ‘speedup’).
  • The processing work needs to be split appropriately (difficult).
  • Special programming languages can be used (Occam, Parallel C etc.).
  • Interactions and dependencies between sub-processes are problematic and can restrict the amount of ‘speedup’ achieved.
  • Can be performed on tightly-coupled or loosely-coupled systems.

Key issues are:

  • Communication overheads
  • Data dependencies

The current trend for loosely-coupled systems is ‘Grid Computing’.
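One common way to quantify how dependencies restrict speedup is Amdahl's law, which these notes do not name; it is brought in here as an assumption for illustration. If a fraction s of the work must run serially, the speedup on n processors is at most 1 / (s + (1 - s) / n). The function name amdahl_speedup is hypothetical, and the model ignores communication overheads, so real speedups are usually lower.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup when part of the work cannot be parallelised."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part takes the same time; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    for n in (1, 2, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.1, n), 2))
```

With 10% serial work the bound approaches 10 however many processors are added, which is why splitting the work well matters more than adding hardware.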


Introduction to Distributed Systems – Client - Server model 1

[Figure: Client Process A and Client Process B each connected to Server Process S; the links are labelled Connection 1 and Connection 2.]

Connections are private between one client and one server.

One server may allow many clients to be connected at one time.

Clients usually initiate communication (as and when service is needed).

All communication is via the server (clients do not communicate directly).
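The points above can be sketched with TCP sockets on the local machine. This is a minimal sketch, not a production server: the function names start_server, handle_client and request are hypothetical, the upper-casing "service" is invented for illustration, and the server uses one thread per client so that several clients can be connected at the same time.

```python
import socket
import threading
from typing import Tuple


def handle_client(conn: socket.socket) -> None:
    """Serve one private connection: reply to each request in upper case."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # the client closed its end of the connection
                break
            conn.sendall(data.upper())


def start_server(host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Start a listening server on an OS-chosen port; return (socket, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 lets the OS pick a free port
    listener.listen()

    def accept_loop() -> None:
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:  # raised once the listener has been closed
                break
            # One thread per client, so many clients can be connected at once.
            threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, listener.getsockname()[1]


def request(port: int, message: bytes, host: str = "127.0.0.1") -> bytes:
    """Client side: initiate a connection, send one request, return the reply."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(message)
        # A short reply normally arrives in one recv; real protocols need framing.
        return conn.recv(1024)


if __name__ == "__main__":
    server_socket, server_port = start_server()
    print(request(server_port, b"hello from client a"))
    print(request(server_port, b"hello from client b"))
    server_socket.close()
```

Each call to request opens its own connection, so the two clients never talk to each other directly; anything they share has to go through the server.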


Introduction to Distributed Systems – Client - Server model 2

[Figure: Client Process A, Client Process B and a Database Server.]

Client – Server is quite a flexible model and can operate at several levels.

Consider a system comprising:

  • Database server (holds the database itself and the access / update logic),
  • Database clients (user-local interfaces to access the database),
  • Authentication service (holds information to validate / authenticate users).

[Figure: message flow between the clients, the Database Server and the Authentication Server. Message labels: Database access request, Authentication request, Request accepted, Request denied, Result, Refused.]
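The multi-level arrangement can be sketched in a single process, with method calls standing in for network messages. The flow assumed here (the database server forwards an authentication request before answering) is inferred from the figure labels and may not match the original diagram exactly; the class names, the sample user and the records are all hypothetical.

```python
from typing import Dict


class AuthenticationServer:
    """Holds the information needed to validate users."""

    def __init__(self, credentials: Dict[str, str]) -> None:
        # Plain-text passwords are for illustration only.
        self._credentials = credentials

    def authenticate(self, user: str, password: str) -> bool:
        # True corresponds to "Request accepted", False to "Request denied".
        return self._credentials.get(user) == password


class DatabaseServer:
    """Server to its clients, and itself a client of the authentication server."""

    def __init__(self, auth: AuthenticationServer, records: Dict[str, str]) -> None:
        self._auth = auth
        self._records = records

    def handle_request(self, user: str, password: str, key: str) -> str:
        # Authentication request: here the database server plays the client role.
        if not self._auth.authenticate(user, password):
            return "Refused"
        # "Result" in the figure: the value stored under the requested key.
        return self._records.get(key, "not found")


if __name__ == "__main__":
    auth = AuthenticationServer({"alice": "secret"})
    database = DatabaseServer(auth, {"k1": "v1"})
    print(database.handle_request("alice", "secret", "k1"))
    print(database.handle_request("alice", "wrong", "k1"))
```

The same process is a server at one level and a client at another, which is the sense in which the model operates at several levels.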


Introduction to Distributed Systems – Peer – Peer model

[Figure: Processes A, B and C, with two links each labelled ‘Communication at Time T’.]

Connectivity is ad-hoc (i.e. it can be spontaneous, unplanned, unstructured).

Peers can interact with others in any order, at any time.

Well-suited to mobile applications on mobile devices (for example, some games).

Some applications (including some ‘sensor network’ applications) rely on ‘promiscuous’ connectivity of peers to pass information across a system.


Introduction to Distributed Systems – Communication Issues

Sockets API

A library of primitives to support transport-layer communication with UDP and TCP.

  • The most flexible of all techniques (because it is ‘low-level’).
  • Requires the application developer to deal with the communication aspects.

User Datagram Protocol (UDP)

  • Connectionless
  • Unreliable
  • Lightweight
  • Point-to-point (unicast)
  • Broadcast

Transmission Control Protocol (TCP)

  • Connection-oriented
  • Reliable
  • Relatively high overheads (larger header, acks, more processing overhead)
  • Point-to-point
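To make the connectionless nature of UDP concrete, the sketch below sends a single datagram between two sockets on the local machine using the sockets API. The function name udp_round_trip is hypothetical. The timeout is there because UDP itself gives no delivery guarantee; on the loopback interface the datagram normally arrives, but the code does not rely on that.

```python
import socket


def udp_round_trip(payload: bytes, timeout: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(timeout)     # give up if the datagram is lost
        # No connection is set up: the destination is named on each send.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_round_trip(b"hello"))
```

A TCP version of the same exchange would need listen, accept and connect calls before any data could flow, which is part of the higher overhead listed above.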


Introduction to Distributed Systems – Multicast communication

One message is sent and delivered to a subset of the available ‘receivers’.

Sender may not need to know how many receivers there are, or their identities (depends on implementation).

Insecure (any process on the appropriate port can hear the message).

Usually limited to LAN scope (routers block some application multicasts, but routers use multicast when sharing routing information amongst themselves).

Inefficient (interrupts at all receivers, even if not interested in the data).

[Figure: Processes A to H, some marked as members of the multicast group and the others as not members of the multicast group.]
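With the sockets API, group membership is something a receiver opts into. The sketch below shows one way this is commonly done for IPv4 over UDP. The function names, and the group address 224.1.1.1 and port 5007 used in the usage lines, are hypothetical choices for illustration, and whether multicast traffic is actually delivered depends on the host and network configuration.

```python
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value: group address, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Create a UDP socket and join the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what makes this process one of the 'receivers'.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_multicast(group: str, port: int, payload: bytes, ttl: int = 1) -> None:
    """Send one datagram to the group; the sender needs no list of receivers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        # A TTL of 1 keeps the datagram on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))


if __name__ == "__main__":
    receiver = open_multicast_receiver("224.1.1.1", 5007)
    send_multicast("224.1.1.1", 5007, b"hello group")
    receiver.settimeout(1.0)
    try:
        print(receiver.recvfrom(2048))
    except socket.timeout:
        print("no datagram received (multicast may be disabled on this host)")
    finally:
        receiver.close()
```

The sender addresses the group rather than any individual process, which matches the point that it may not know how many receivers there are.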


Introduction to Distributed Systems – Communication Issues

Message passing

This is a simple unstructured form of communication in which messages are passed from a sender to a receiver.

Conceptually this is very similar to datagram communication provided by UDP.

Message Passing Interface (MPI)

A specific, standardised form of message passing (MPI is a specification that has several implementations).

Very popular in parallel processing applications.

Has a large number of enhancements (beyond, for example, UDP datagrams), added specifically to support the synchronisation aspects of communication in parallel applications.