CS347 Lecture 1: The Two-Generals Problem and Eventual Commit Protocol, Slides of Distributed Database Management Systems

These slides introduce the two-generals problem in CS 347, a distributed systems course. The problem deals with the challenge of synchronizing two armies to attack at the same time, and the lecture explores approaches such as the eventual commit protocol. The slides also touch on the importance of distributed and parallel data processing in today's economy.

Typology: Slides

2011/2012

Uploaded on 07/16/2012



CS 347 Lecture 1 7

Rules:

  • Blue and red army must attack at same time
  • Blue and red generals synchronize through messengers
  • Messengers can be lost

CS 347 Lecture 1 8

How Many Messages Do We Need?

assume blue starts...

BG → RG: attack at 9am

Is this enough??

CS 347 Lecture 1 9

How Many Messages Do We Need?

assume blue starts...

BG → RG: attack at 9am

RG → BG: ack (red goes at 9am)

Is this enough??

CS 347 Lecture 1 10

How Many Messages Do We Need?

assume blue starts...

BG → RG: attack at 9am

RG → BG: ack (red goes at 9am)

BG → RG: got ack

Is this enough??

CS 347 Lecture 1 11

Stated problem is Impossible!

  • Theorem: There is no protocol that uses a finite number of messages that solves the two-generals problem (as stated here)

Alternatives??

CS 347 Lecture 1 12

Probabilistic Approach?

  • Send as many messages as possible, hope one gets through...

assume blue starts...

BG → RG: attack at 9am (the same message is sent repeatedly)

CS 347 Lecture 1 13

Eventual Commit

  • Eventually both sides attack...

assume blue starts...

BG → RG: attack ASAP (retransmits)

RG → BG: on my way! (retransmits)
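The retransmission scheme on this slide can be sketched as a small simulation. This is an illustrative sketch, not code from the lecture: the function name `simulate_eventual_commit`, the per-message delivery probability `p`, the one-messenger-per-time-unit timing, and the rule that red starts replying one time unit after it commits are all assumptions.

```python
import random


def simulate_eventual_commit(p, max_time, rng):
    """Simulate blue and red retransmitting over a lossy channel.

    Assumptions (not from the slides): each messenger independently gets
    through with probability p, and each side sends at most one messenger
    per time unit.

    Returns (red_commit_time, blue_ack_time); None means "not by max_time".
    """
    red_commit = None  # time at which red first receives "attack ASAP"
    blue_ack = None    # time at which blue first receives "on my way!"

    for t in range(1, max_time + 1):
        # Red replies "on my way!" in every time unit after it has committed.
        red_replies = red_commit is not None and t > red_commit

        # Blue retransmits "attack ASAP" until it hears back from red.
        if rng.random() < p and red_commit is None:
            red_commit = t

        if red_replies and rng.random() < p:
            blue_ack = t
            break  # blue stops retransmitting once it has the reply

    return red_commit, blue_ack
```

For example, `simulate_eventual_commit(0.3, 50, random.Random(1))` runs one trial with a 30% delivery rate. Note that red never learns whether its reply arrived, which is why the protocol only guarantees that both sides attack eventually, not simultaneously.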

CS 347 Lecture 1 14

Eventual Commit

  • One message sent every time unit
  • Probability that a single message gets through is p
  • What is the probability that red commits by time t?

BG → RG: attack ASAP (retransmits)

RG → BG: on my way! (retransmits)

CS 347 Lecture 1 15

Eventual Commit

BG → RG: attack ASAP (retransmits)

RG → BG: on my way! (retransmits)

  • C(1) = p

CS 347 Lecture 1 16

Eventual Commit

BG → RG: attack ASAP (retransmits)

RG → BG: on my way! (retransmits)

  • C(1) = p
  • C(2) = p + (1-p)p

CS 347 Lecture 1 17

Eventual Commit

BG → RG: attack ASAP (retransmits)

RG → BG: on my way! (retransmits)

  • C(1) = p
  • C(2) = p + (1-p)p
  • C(3) = p + (1-p)p + (1-p)^2 p
  • C(4) = p + (1-p)p + (1-p)^2 p + (1-p)^3 p

CS 347 Lecture 1 18

Eventual Commit

[Figure: plot of C(t) against t, with p marked on the plot]

CS 347 Lecture 1 25

  • Renewed Interest in Distributed/Parallel Data Processing!
      - Massive web data, manage with many computers
      - How to crawl and search the web?
      - Peer-to-peer systems manage huge amounts of data
      - Data from many sources (e.g., comparison shopping): how to integrate?
      - Sensor networks: data generated at many sensors/devices, need to analyze
      - Multi-player games (e.g., Second Life): tons of distributed data

CS 347 Lecture 1 26

It’s the Economy, Stupid!

  • Example: Multi-player games

[Figure: diagram with a "Data" / "state" node and ten player nodes labeled P]

CS 347 Lecture 1 27

It’s the Economy, Stupid!

  • Example: Multi-player games

[Figure: diagram with a "Data" / "state" node and ten player nodes labeled P, one of which also carries a "state" label]

CS 347 Lecture 1 28

Logistics

  • LECTURES: Mondays and Wednesdays 12:50pm to 2:05pm, Gates B
  • INSTRUCTOR: Hector Garcia-Molina; Office: Gates Hall 434 Email: [email protected]; Office Hours: Mondays, Wednesdays 11am to 12noon.
  • TEACHING ASSISTANT: Kushal Tayal; Email: [email protected]; News Group: su.class.cs347; Office Hours: TBD
  • SECRETARY: Marianne Siroker; Office: Gates Hall 436; Email: [email protected]; Phone: (650) 723-

CS 347 Lecture 1 29

Logistics

  • TEXTBOOK: No required textbook. Some material for the lectures will be drawn from the following book: - M. Tamer Ozsu and Patrick Valduriez, "Principles of Distributed Database Systems," Second Edition, Prentice Hall 1999.
  • CLASS WEB PAGE: http://www.stanford.edu/class/cs Will contain homework assignments, course news, etc. Be sure to check it periodically.
  • ASSIGNMENTS: about 5 homeworks
  • GRADING: Homeworks: 20%, Midterm 30%, Final: 50%.

CS 347 Lecture 1 30

Tentative Syllabus 2012 (Part I)

DATE: TOPIC

  • Monday April 2: Introduction [N01]
  • Wednesday April 4: Data Fragmentation [N02]
  • Monday April 9: Query processing [N03]
  • Wednesday April 11: Query processing & Optimization [N04]
  • Monday April 16: Concurrency Control, Failures [N05]
  • Wednesday April 18: Reliable Data Management [N06]
  • Monday April 23: Reliable Data Management [N06]
  • Wednesday April 25: Replicated Data Management [N07]
  • Monday April 30: Partitions, Entity Resolution [N11]
  • Wednesday May 2: Midterm

CS 347 Lecture 1 31

Tentative Syllabus 2012 (Part II)

DATE: TOPIC

  • Monday May 7: Peer to Peer Systems [N08]
  • Wednesday May 9: Peer to Peer Systems [N08]
  • Monday May 14: Map-Reduce [N09]
  • Wednesday May 16: Map-Reduce [N09]
  • Monday May 21: Distributed IR [N10]
  • Wednesday May 23: Publish Subscribe Systems [N14]
  • Wednesday May 30: Time [N12]
  • Monday June 4: Heterogeneous Systems [N13]
  • Wednesday June 6: Extra Topic
  • Friday June 8, 8:30 am!!!: FINAL EXAM

Interesting New Systems

  • Storm (from Twitter)
  • S4 (from Yahoo)
  • Cassandra (key-value store)
  • Hive (SQL over Hadoop)
  • Pregel (graph execution)
  • Kestrel (queues?)
  • ZooKeeper (replicated data)
  • Spark (Berkeley)
  • HBase
  • Hyracks (UC Irvine)

CS 347 Lecture 1 32

  • Memcached
  • PNUTS
  • Dynamo (Amazon)
  • Megastore (Google)
  • Paxos
  • G-Store (UC Santa Barbara)
  • ElasTraS (UC Santa Barbara)
  • TAO (Facebook)

CS 347 Lecture 1 33

Concepts you should be familiar with:

  • CS245: query plan, cost estimation, join algorithms, recovery, logging,…
  • Interconnection networks (bus, mesh, hypercube,…)
  • Computer networks (LAN, WAN,…)

CS 347 Lecture 1 34

Introductory topics

  • Database architectures
  • Client-server systems
  • Distributed vs. parallel DB systems
  • Cloud Computing

CS 347 Lecture 1 35

DB architectures

(1) Shared memory

[Figure: processors P, P, ..., P all connected to one shared memory M]

CS 347 Lecture 1 36

DB architectures

(2) Shared disk

[Figure: three processors (P), each with its own memory (M)]

CS 347 Lecture 1 43

(5) Unusual — processor per track or processor per disk

[Figure: one processor P with memory M and three P’ units; the P’ units are labeled “small” processors, “tiny” memories]

CS 347 Lecture 1 44

(6) Unusual — sensor networks

[Figure: one P’ node with memory M and five nodes that each have a processor (P), a memory (M), and a B component; labels: data collection node, sensor, battery]

CS 347 Lecture 1 45

Issues for selecting architecture

  • Reliability
  • Scalability
  • Geographic distribution of data
  • Data “clusters”
  • Performance
  • Cost

CS 347 Lecture 1 46

Client-Server Systems

(or how to partition software)

Software layers: Application, Front End, Query Processor, Transaction Processing, File Access

[Figure: the layers above are divided between client and server]

CS 347 Lecture 1 49

Transaction Servers

  • Clients ship transactions consisting of 1 or more SQL commands

E.g., Open DataBase Connectivity (ODBC) (standard API)
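As a rough illustration of a client shipping one transaction made of several SQL commands, here is a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a real transaction server reached through an API such as ODBC. The table names, the `reserve_room` and `make_demo_db` helpers, and the hotel scenario are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, available INTEGER)")
    conn.execute("CREATE TABLE reservations (guest TEXT, room_id INTEGER)")
    conn.execute("INSERT INTO rooms VALUES (1, 1)")
    conn.commit()
    return conn


def reserve_room(conn, guest, room_id):
    """Run two SQL commands as one transaction; return True if it commits."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "UPDATE rooms SET available = 0 WHERE id = ? AND available = 1",
                (room_id,),
            )
            if cur.rowcount != 1:
                raise ValueError("room not available")
            conn.execute(
                "INSERT INTO reservations (guest, room_id) VALUES (?, ?)",
                (guest, room_id),
            )
        return True
    except ValueError:
        return False
```

The client only names the work to be done. All data access happens on the database side, and either both statements take effect or neither does.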

CS 347 Lecture 1 50

Data Servers

  • Client requests pages or records
  • Popular for OODB systems

CS 347 Lecture 1 51

Issues

  • Object granularity
  • Where is data cached?
  • Where is locking done?

CS 347 Lecture 1 52

Basic Tradeoff

  • Offloading work to clients
  • Data transmitted

[Figure: two client (C) / server (S) pairs; one client sends "Get pages", the other sends "Reserve hotel room"]
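The tradeoff between offloading work and data transmitted can be made concrete with a back-of-the-envelope estimate. The sketch below is only illustrative: the page size, message sizes, and function names are assumptions chosen for the example, not figures from the lecture.

```python
PAGE_SIZE = 4096    # bytes per page (assumed)
MESSAGE_SIZE = 64   # bytes per request or reply message (assumed)


def data_server_bytes(pages_needed, pages_cached=0):
    """Bytes moved when the client fetches pages and does the work itself.

    Pages already cached at the client need no transfer.
    """
    pages_to_fetch = max(0, pages_needed - pages_cached)
    return pages_to_fetch * (MESSAGE_SIZE + PAGE_SIZE)


def transaction_server_bytes():
    """Bytes moved when the client ships one request and gets one reply."""
    return 2 * MESSAGE_SIZE
```

Under these assumptions, a "Get pages" style client that touches 10 uncached pages moves far more data than a "Reserve hotel room" style request, but it also takes processing load off the server, and a warm client cache narrows the gap.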

CS 347 Lecture 1 53

Note: Similar issues arise when we partition software/functionality within server

[Figure: a "Reserve hotel room" request and three processor (P) / memory (M) pairs]

  • Where is data cached?
  • Where is locking done?

CS 347 Lecture 1 54

Parallel or distributed DB system?

  • More similarities than differences!

CS 347 Lecture 1 61

Next

  • How to describe distributed data
  • Query processing in parallel DBs
  • Query processing in distributed DBs

CS 347 Lecture 1 62

Query processing in parallel DBs:

  • Typically: we can distribute/partition/sort... the data to make certain DB operations (e.g., Join) fast
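One common way to partition data so that a join runs fast is to hash both relations on the join key, so that matching rows land on the same node and each node joins only its own share. The sketch below simulates this on a single machine. The function names and the tuple-based row layout are assumptions made for illustration, and the lecture does not say this is the specific technique it has in mind.

```python
from collections import defaultdict


def hash_partition(rows, key_index, num_nodes):
    """Assign each row to a node by hashing its join key."""
    parts = [[] for _ in range(num_nodes)]
    for row in rows:
        parts[hash(row[key_index]) % num_nodes].append(row)
    return parts


def local_hash_join(r_rows, s_rows, r_key, s_key):
    """Join two local fragments with an in-memory hash table built on R."""
    table = defaultdict(list)
    for r in r_rows:
        table[r[r_key]].append(r)
    return [r + s for s in s_rows for r in table.get(s[s_key], [])]


def partitioned_join(r_rows, s_rows, r_key, s_key, num_nodes):
    """Partition R and S the same way, then join fragment by fragment."""
    r_parts = hash_partition(r_rows, r_key, num_nodes)
    s_parts = hash_partition(s_rows, s_key, num_nodes)
    result = []
    for node in range(num_nodes):  # each iteration could run on its own node
        result.extend(local_hash_join(r_parts[node], s_parts[node], r_key, s_key))
    return result
```

Because rows with equal keys always hash to the same node, no cross-node comparisons are needed after partitioning.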

CS 347 Lecture 1 63

Query processing in distributed DBs:

  • Typically: we are given the data distribution; we need to find a query processing strategy that minimizes cost (e.g., communication cost)
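A minimal sketch of this kind of decision, for a join of two relations stored at different sites, is shown below. It assumes a deliberately simple cost model in which cost equals the number of bytes shipped and the result stays where it is computed. The function name and site labels are hypothetical, and real optimizers consider many more strategies.

```python
def cheapest_join_site(size_r, size_s, site_r="site_R", site_s="site_S"):
    """Choose which relation to ship so that the fewest bytes cross the network.

    size_r and size_s are the sizes in bytes of relations R and S, which are
    stored at site_r and site_s respectively.
    Returns (strategy description, bytes shipped).
    """
    strategies = {
        f"ship R to {site_s}": size_r,
        f"ship S to {site_r}": size_s,
    }
    best = min(strategies, key=strategies.get)
    return best, strategies[best]
```

With a 1000-byte R and a 50-byte S, the sketch picks shipping S to R's site. The data placement is fixed, and only the strategy changes.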