State Machine Replication: Ensuring Consistency and Fault-Tolerance - Prof. Shouhuai Xu, Study notes of Cryptography and System Security

An overview of the challenges and solutions for replicating state machines to ensure consistency and fault-tolerance in distributed systems. Topics include making servers deterministic, replica coordination, and reliable broadcast. The document also explores different failure models and their implications for consensus and termination.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-ja1


State-Machine Replication

The Problem

[Figure: Clients, Server]

Solution: replicate server!

The Solution

1. Make server deterministic (state machine)

2. Replicate server

3. Ensure correct replicas step through the same sequence of state transitions

4. Vote on replica outputs for fault-tolerance

[Figure: Clients, Voter, State machine]
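Steps 2 to 4 can be illustrated with a small sketch. It is not taken from the slides: the `CounterReplica` class, the `vote` helper, and the strict-majority rule are assumptions made for illustration, and one replica is marked faulty by hand so that the voter has something to mask.

```python
from collections import Counter

class CounterReplica:
    """Hypothetical deterministic replica whose state is a single integer."""
    def __init__(self, faulty=False):
        self.value = 0
        self.faulty = faulty  # assumption: a faulty replica reports wrong outputs

    def apply(self, request):
        op, amount = request
        if op == "add":
            self.value += amount
        # A faulty replica corrupts its output (illustration only).
        return -1 if self.faulty else self.value

def vote(outputs):
    """Return the output reported by a strict majority of replicas, else None."""
    winner, count = Counter(outputs).most_common(1)[0]
    return winner if count > len(outputs) // 2 else None

# Three replicas, one of them faulty; all see the same request sequence.
replicas = [CounterReplica(), CounterReplica(), CounterReplica(faulty=True)]
requests = [("add", 5), ("add", 2)]

results = []
for req in requests:
    outputs = [r.apply(req) for r in replicas]
    results.append(vote(outputs))
print(results)
```

With two correct replicas out of three, the majority output masks the faulty one. The sketch says nothing about how the replicas come to see the same request sequence, which is the coordination problem the rest of the notes address.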

A conundrum

... A: voter and client share fate!

Ahhh, Java… simple, object-oriented, portable, distributed, interpreted, high-performance, multi-threaded, secure

Semantic Characterization of a State Machine

Outputs of a state machine are completely determined by the sequence of requests it processes, independent of time and any other activity in a system.
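A minimal sketch of this characterization, assuming a hypothetical key-value store as the state machine (the class name, the request format, and the `run` helper are illustrative, not from the slides). Replaying the same request sequence yields the same outputs, while a different order may not.

```python
class KeyValueStateMachine:
    """Hypothetical state machine: outputs depend only on the request sequence."""
    def __init__(self):
        self.store = {}

    def apply(self, request):
        op, key, *rest = request
        if op == "put":
            self.store[key] = rest[0]
            return "ok"
        if op == "get":
            return self.store.get(key)
        raise ValueError(f"unknown operation: {op}")

def run(requests):
    """Feed a request sequence to a fresh state machine and collect its outputs."""
    sm = KeyValueStateMachine()
    return [sm.apply(r) for r in requests]

requests = [("put", "x", 1), ("get", "x"), ("put", "x", 2), ("get", "x")]

first_run = run(requests)
second_run = run(requests)  # same sequence, same outputs
reordered = run([requests[2], requests[1], requests[0], requests[3]])

print(first_run == second_run)
print(first_run == reordered)
```

Nothing in `apply` reads a clock, a random source, or shared state touched by another thread, which is what keeps the outputs a function of the request sequence alone.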

Broadcast

If a process sends a message m, then every process eventually delivers m.

[Figure: processes p0, p1, p2, p3]

How can we adapt the spec for an environment where processes can fail? And what does “fail” mean?

A hierarchy of failure models

  • Benign failures: Fail-stop, Crash, Send Omission, Receive Omission, General Omission
  • Arbitrary failures with message authentication
  • Arbitrary (Byzantine) failures

Reliable Broadcast

Validity: If the sender is correct and broadcasts a message m, then all correct processes eventually deliver m.

Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m.

Integrity: Every correct process delivers at most one message, and if it delivers m, then some process must have broadcast m.
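One common way to meet these properties under crash failures is for each process to relay a message to everyone the first time it receives it, and only then deliver it. The sketch below simulates that idea; it is an assumption layered on the slides, not their algorithm. The `Network` and `Process` classes, the `reach` parameter (used to model a sender that crashes partway through its sends), and the assumption of reliable links are all hypothetical.

```python
from collections import deque

class Network:
    """Assumed reliable links: every queued message is eventually handed over."""
    def __init__(self):
        self.procs = {}
        self.queue = deque()

    def add(self, proc):
        self.procs[proc.pid] = proc

    def pids(self):
        return sorted(self.procs)

    def send(self, dest, m):
        self.queue.append((dest, m))

    def run(self):
        while self.queue:
            dest, m = self.queue.popleft()
            self.procs[dest].receive(m)

class Process:
    def __init__(self, pid, net):
        self.pid, self.net = pid, net
        self.crashed = False
        self.seen = set()      # messages already handled
        self.delivered = []    # messages delivered to the application
        net.add(self)

    def broadcast(self, m, reach=None):
        """Send m to all processes; `reach` models a crash after a partial send."""
        for dest in (self.net.pids() if reach is None else reach):
            self.net.send(dest, m)
        if reach is not None:
            self.crashed = True

    def receive(self, m):
        if self.crashed or m in self.seen:
            return
        self.seen.add(m)
        for dest in self.net.pids():   # relay first ...
            if dest != self.pid:
                self.net.send(dest, m)
        self.delivered.append(m)       # ... then deliver

net = Network()
procs = [Process(i, net) for i in range(4)]
procs[0].broadcast("m", reach=[1])     # p0 crashes after reaching only p1
net.run()
delivered = {p.pid: p.delivered for p in procs}
print(delivered)
```

Because p1 relays before delivering, p2 and p3 also deliver m even though the sender crashed, which is the Agreement property. The `seen` set prevents a process from delivering the same message twice.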

Properties of send(m) and receive(m)

Benign failures:

Validity: If p sends m to q, and p, q, and the link between them are correct, then q eventually receives m.

Uniform* Integrity: For any message m, q receives m at most once from p, and only if p sent m to q.

  • A property is uniform if it applies to both correct and faulty processes
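The Integrity property can be read as a check over a trace of events. The sketch below is a hypothetical checker, not something from the slides: the event tuple format and the function name `satisfies_integrity` are assumptions. It rejects a trace in which a process receives a message twice from the same sender, or receives a message that was never sent to it.

```python
def satisfies_integrity(trace):
    """Check Integrity over a time-ordered list of events.

    Assumed event formats (hypothetical):
      ("send", p, q, m)     p sent m to q
      ("receive", q, p, m)  q received m from p
    """
    sent = set()
    received = set()
    for kind, a, b, m in trace:
        if kind == "send":
            sent.add((a, b, m))
        elif kind == "receive":
            receiver, sender = a, b
            if (sender, receiver, m) not in sent:
                return False  # received something that was never sent
            if (receiver, sender, m) in received:
                return False  # received the same message twice
            received.add((receiver, sender, m))
    return True

ok_trace = [("send", "p", "q", "m"), ("receive", "q", "p", "m")]
duplicate = ok_trace + [("receive", "q", "p", "m")]
spurious = [("receive", "q", "p", "m")]

print(satisfies_integrity(ok_trace))
print(satisfies_integrity(duplicate))
print(satisfies_integrity(spurious))
```

Validity is a liveness property ("eventually receives"), so it cannot be refuted by a finite trace in the same way; the checker covers Integrity only.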

Properties of send(m) and receive(m)

Arbitrary failures:

Integrity: For any message m, if p and q are correct then q receives m at most once from p, and only if p sent m to q.

Questions, Questions…

  • Are these problems solvable at all?
  • Can they be solved independent of the failure model?
  • Does solvability depend on the ratio between faulty and correct processes?
  • Does solvability depend on assumptions about the reliability of the network?
  • Are the problems solvable in both synchronous and asynchronous systems?
  • If a solution exists, how expensive is it?