Understanding Distributed Systems: Autonomy, Sharing, and Transparency - Prof. Kiros
Lecture notes, Distributed Programming and Computing
CHAPTER ONE
Introduction to Distributed Systems
Main contents
Introduction
Definitions
Goals
Distributed System: Definition
A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.
Two aspects of a distributed system:
1) Autonomous computing elements, also referred to as nodes, be they hardware devices or software processes
2) Single coherent system: users or applications perceive a single system, so nodes need to collaborate
Each node is autonomous:
- Each node has its own notion of time: there is no global clock
- This leads to fundamental synchronization and coordination problems
Collection of nodes and group:
- How to manage group membership?
- How to know you are communicating with an authorized member?
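To illustrate why the lack of a global clock causes coordination problems, here is a minimal Python sketch. It assumes two nodes whose local clocks differ by a fixed offset; the `Node` class, the offsets, and the event labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A node with its own clock; `offset` is an assumed skew in seconds."""
    name: str
    offset: float

    def local_time(self, true_time: float) -> float:
        # A node only ever sees its own clock, never the "true" time.
        return true_time + self.offset


def order_by_timestamp(events):
    """Sort (timestamp, label) pairs the way a naive log merger would."""
    return [label for _, label in sorted(events)]


a = Node("A", offset=0.0)
b = Node("B", offset=-2.0)  # B's clock runs 2 seconds behind A's

# Real order: A writes at t=10, then B reads at t=11.
events = [
    (a.local_time(10.0), "A:write"),
    (b.local_time(11.0), "B:read"),
]
merged = order_by_timestamp(events)
print(merged)  # ['B:read', 'A:write']
```

Ordering by local timestamps places the read before the write, although the write really happened first. This is the kind of problem that synchronization and coordination protocols have to address.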
Why Distributed?
Because we need to…
Resource and Data Sharing
printers, databases, multimedia servers, ...
Availability, Reliability
the loss of some instances can be hidden
Scalability, Extensibility
the system grows with demand (e.g., extra servers)
Performance
huge power (CPU, memory, ...) available
Inherent distribution, communication
organizational distribution, e-mail, video
Characteristics of Distributed Systems
Differences between the computers and the way they communicate are hidden from
users.
Users and applications can interact with a distributed system in a consistent and
uniform way regardless of location.
Distributed systems should be easy to expand and scale.
A distributed system is normally continuously available, even if there may be partial
failures.
Organization of distributed system
Overlay network
- Each node in the collection communicates only with other nodes in the system, its neighbors.
- The set of neighbors may be dynamic, or may even be known only implicitly (i.e., requires a lookup).
Overlay types
- Well-known example of overlay networks: peer-to-peer systems.
- Structured: each node has a well-defined set of neighbors with whom it can communicate (tree, ring).
- Unstructured: each node has references to randomly selected other nodes from the system.
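The difference between the two overlay types can be sketched in a few lines of Python. This is only an illustration: the function names, the node labels, and the choice of `k` random neighbors are assumptions, not something prescribed by the notes.

```python
import random


def ring_overlay(nodes):
    """Structured overlay: each node's neighbors are its predecessor and successor."""
    n = len(nodes)
    return {
        nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]]
        for i in range(n)
    }


def random_overlay(nodes, k, seed=0):
    """Unstructured overlay: each node keeps k randomly chosen other nodes."""
    rng = random.Random(seed)
    return {
        node: rng.sample([m for m in nodes if m != node], k)
        for node in nodes
    }


nodes = ["n0", "n1", "n2", "n3", "n4"]
ring = ring_overlay(nodes)
rand = random_overlay(nodes, k=2)

print(ring["n0"])  # ['n4', 'n1']
print(rand["n0"])  # two randomly selected nodes other than n0
```

In the ring, a node's neighbors follow from its position, so they can be computed. In the random overlay they can only be discovered, which is why lookups in unstructured systems typically rely on flooding or random walks.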
Middleware: OS of Distributed Systems
What’s inside?
- Commonly used components and functions that need not be implemented by
applications separately.
What do we want to achieve?
Supporting sharing of resources
Distribution transparency
Openness
Scalability
Distribution Transparency
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move (itself) to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide the failure and recovery of an object
Note: Distribution transparency is a nice goal, but aiming at full distribution transparency may be too much.
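As one concrete case, location and migration transparency can be sketched with a name lookup in Python. The `NameService` class, the logical name, and the addresses below are hypothetical and serve only to show the idea.

```python
class NameService:
    """Hypothetical registry mapping logical names to current addresses."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


def fetch(names, name):
    """Client code: uses only the logical name, never a fixed address."""
    address = names.resolve(name)
    return f"GET {name} via {address}"


names = NameService()
names.register("reports/q1", "10.0.0.5:9000")
first = fetch(names, "reports/q1")

# The object migrates; only the registry entry changes, not the client code.
names.register("reports/q1", "10.0.7.21:9000")
second = fetch(names, "reports/q1")

print(first)
print(second)
```

The client call is identical before and after the move, which is what hiding the location of an object means in practice.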
Degree of Transparency
Observation: Aiming at full distribution transparency may be too much.
- There are communication latencies that cannot be hidden
- Completely hiding failures of networks and nodes is (theoretically and practically) impossible
  - You cannot distinguish a slow computer from a failing one
  - You can never be sure that a server actually performed an operation before a crash
- Full transparency will cost performance, exposing distribution of the system
  - Keeping replicas exactly up-to-date with the master takes time
  - Immediately flushing write operations to disk for fault tolerance is slow
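The point that a slow computer cannot be told apart from a failing one can be made concrete with a timeout check. The sketch below is a simplified model in Python, not a real failure detector; `classify`, the delays, and the timeout value are assumptions.

```python
def classify(reply_delay, timeout):
    """What a client can conclude after waiting `timeout` seconds.

    `reply_delay` is None for a crashed server, otherwise the number of
    seconds until its reply would arrive.
    """
    if reply_delay is not None and reply_delay <= timeout:
        return "alive"
    return "suspected"  # slow or crashed: the client cannot tell which


slow_server = 5.0      # would reply, but only after 5 s
crashed_server = None  # never replies
timeout = 2.0

print(classify(slow_server, timeout))     # suspected
print(classify(crashed_server, timeout))  # suspected
```

Both servers lead to the same observation at the client. A longer timeout reduces false suspicions but delays the detection of real crashes.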
Openness of Distributed Systems
Open distributed system
Be able to interact with services from other open systems, irrespective
of the underlying environment:
- Systems should conform to well-defined interfaces
- Systems should support portability of applications
- Systems should easily interoperate
Achieving openness
At least make the distributed system independent from heterogeneity of the underlying environment:
- Hardware
- Platforms
- Languages
Policy versus Mechanisms
Implementing openness: support for different policies:
- What level of consistency do we require for client-cached data?
- Which operations do we allow downloaded code to perform?
- Which QoS requirements do we adjust in the face of varying bandwidth?
- What level of secrecy do we require for communication?
Implementing openness: ideally, a distributed system provides only mechanisms:
- Allow (dynamic) setting of caching policies
- Support different levels of trust for mobile code
- Provide adjustable QoS parameters per data stream
- Offer different encryption algorithms
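The separation of policy from mechanism can be sketched for the client-caching case. In this minimal Python example, `ClientCache` is the mechanism and the freshness rule is a policy passed in from outside. The class, the two policies, and the age thresholds are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# A policy decides whether a cached entry of a given age (seconds) is still usable.
Policy = Callable[[float], bool]


class ClientCache:
    """Mechanism only: stores entries and asks the policy whether to reuse them."""

    def __init__(self, policy: Policy):
        self.policy = policy
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str, now: float) -> None:
        self._entries[key] = (value, now)

    def get(self, key: str, now: float) -> Optional[str]:
        if key not in self._entries:
            return None
        value, stored_at = self._entries[key]
        return value if self.policy(now - stored_at) else None

    def set_policy(self, policy: Policy) -> None:
        # Dynamic setting of the caching policy, without touching the mechanism.
        self.policy = policy


def strict(age: float) -> bool:
    return age <= 1.0   # accept near-fresh data only


def relaxed(age: float) -> bool:
    return age <= 60.0  # tolerate staleness up to a minute


cache = ClientCache(strict)
cache.put("profile", "v1", now=0.0)
before = cache.get("profile", now=10.0)  # rejected by the strict policy
cache.set_policy(relaxed)
after = cache.get("profile", now=10.0)   # accepted by the relaxed policy
print(before, after)  # None v1
```

The cache code does not change when the consistency requirement changes; only the policy function is swapped.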
Geographical scalability
Cannot simply go from LAN to WAN:
Many distributed systems assume synchronous (blocking) client-server interactions
- Client sends request and waits for an answer.
- Latency (delay) may easily prohibit this scheme.
WAN links are often inherently unreliable
- Simply moving streaming video from LAN to WAN will fail.
Lack of multipoint communication
- A simple search broadcast cannot be deployed.
- Solution: Develop separate naming and directory services (having their own scalability problems).
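A rough calculation shows why blocking request-reply interactions that work on a LAN can become unusable on a WAN. The round-trip times and the request count in this Python sketch are illustrative assumptions, not measurements.

```python
def blocking_total(requests: int, rtt_s: float) -> float:
    """Total wait when each request blocks until its reply arrives."""
    return requests * rtt_s


LAN_RTT = 0.0005  # 0.5 ms round trip, assumed
WAN_RTT = 0.100   # 100 ms round trip, assumed
N = 200           # requests needed for one user action, assumed

lan = blocking_total(N, LAN_RTT)
wan = blocking_total(N, WAN_RTT)
print(f"LAN: {lan:.1f} s, WAN: {wan:.1f} s")
```

Under these assumptions the same interaction pattern takes about a tenth of a second on the LAN and about twenty seconds on the WAN. This is why wide-area designs tend to favor asynchronous communication, batching, or moving computation closer to the client.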
Administrative scalability
Essence: Conflicting policies concerning usage, management, and security
Examples:
- Computational grids: share expensive resources between different domains.
- Shared equipment: how to control, manage, and use a shared radio telescope constructed as a large-scale shared sensor network?
Exception: Several P2P networks
- File-sharing systems (based, e.g., on BitTorrent)
- Peer-to-peer telephony (Skype)
- Peer-assisted audio streaming (Spotify)
Note: In these networks end users collaborate, not administrative entities.