CS 347: Distributed Transactions - Failure, Node, Network, Scenarios, and Process Models, Slides of Distributed Database Management Systems

Dhirubhai Ambani Institute of Information and Communication Technology Distributed Database Management Systems

Notes on distributed transactions, covering failure models, node models (fail-stop and byzantine), network models (reliable and partitionable), scenarios, and process models (cohorts and transaction servers).

Typology: Slides

2011/2012

Uploaded on 07/16/2012

sambandam 🇮🇳


CS 347 Notes06 7

Failure model:

Events
- Desired
- Undesired
  - Expected
  - Unexpected

CS 347 Notes06 8

Node models

(1) Fail-stop nodes

[Timeline: perfect -> halted -> recovery -> perfect]

- Volatile memory lost
- Stable storage ok

CS 347 Notes06 9

Node models

(2) Byzantine nodes

[Diagram: nodes A, B, C; timeline for a node: perfect -> arbitrary failure -> recovery -> perfect]

At any given time, at most some fraction f of nodes have failed (typically f < 1/2 or f < 1/3)

CS 347 Notes06 10

Network models

(1) Reliable network

- In-order messages
- No spontaneous messages
- Timeout TD

I.e., no lost messages, except for node failures

If no ack in TD sec., destination is down (not paused)
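The timeout rule above can be sketched as a small check. This is only an illustration under the reliable-network assumption; the function name and the use of plain numeric timestamps are hypothetical, not from the slides.

```python
def destination_status(sent_at, ack_at, td):
    """Classify a destination under the reliable-network model.

    sent_at: time the message was sent (seconds)
    ack_at:  time the ack arrived, or None if no ack has been seen
    td:      the timeout TD (seconds)

    On a reliable network a missing ack after TD seconds can only mean
    the destination is down; it cannot merely be slow or paused.
    """
    if ack_at is not None and ack_at - sent_at <= td:
        return "up"
    return "down"
```

On a partitionable network the same silence would be ambiguous, which is why that model has no timeout.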

CS 347 Notes06 11

Variation of reliable net:

- Persistent messages
  - If destination down, net will eventually deliver message
  - Simplifies node recovery, but leads to inefficiencies (hides too much)
  - Not considered here

CS 347 Notes06 12

Network models

(2) Partitionable network

- In-order messages
- No spontaneous messages
- No timeout; nodes can have different views of failures

CS 347 Notes06 13

Scenarios

Reliable network
- Fail-stop nodes
  - No data replication (1)
  - Data replication (2)
Partitionable network
- Fail-stop nodes (3)

CS 347 Notes06 14

No Data Replication

- Reliable network, fail-stop nodes
- Basic idea: node P controls X

[Diagram: the net connects to node P, which holds item X]

CS 347 Notes06 15

- Single control point simplifies concurrency control, recovery
- Not an availability hit: if P down, X unavailable too!

CS 347 Notes06 16

“P controls X” means

P does concurrency control for X
P does recovery for X

CS 347 Notes06 17

Say transaction T wants to access X:

PT is the process that represents T at this node

[Diagram: PT sends req to the local DBMS, which contains the lock mgr, the LOG, and X]

CS 347 Notes06 18

Process models

(A) Cohorts

[Diagram: USER and cohorts T1, T2, T3, each cohort with its own local DBMS. Legend: spawn process, communication, data access]

CS 347 Notes06 25

Example: after participant fails, its log contains:

- T1, X, undo/redo info
- T1, Y, ... info ...
- T1, "W" state

CS 347 Notes06 26

At recovery:

- T1 is in "W" state
- Obtain X, Y write locks (no read locks!)
- Wait for message from coordinator (or ask coordinator for outcome)

CS 347 Notes06 27

Other examples:

- No "W" record on log => abort T1
- See "C" record on log => finish T1
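The recovery cases above can be collected into one decision function. This is a minimal sketch: the function name, the return strings, and the treatment of an "A" record (assumed to mean the abort was already decided) are illustrative assumptions, not from the slides.

```python
def participant_recovery_action(record_types):
    """Decide what a recovering participant does for one transaction.

    record_types: the record types found on the log for that
    transaction, e.g. ["update", "update", "W"].
    """
    seen = set(record_types)
    if "C" in seen:
        # Outcome is known: finish the commit.
        return "finish"
    if "A" in seen or "W" not in seen:
        # Never reached "W" (or abort already logged): safe to abort.
        return "abort"
    # In "W" with no outcome: re-acquire write locks, then wait for
    # (or ask) the coordinator.
    return "relock-and-wait"
```

For the log shown in the earlier example (updates to X and Y followed by a "W" record), the function returns "relock-and-wait".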

CS 347 Notes06 28

Add timeouts to cope with messages lost during crashes
Add finish (“F”) state for coordinator – all done, can forget outcome

Presumed abort protocol

“F” and “A” states combined in coordinator
Saves persistent space (forget quicker)
Presumed commit is analogous
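A minimal sketch of the presumed abort idea, assuming an in-memory table stands in for the coordinator's log. The class and method names are hypothetical. The point is that an aborted transaction is forgotten at once, and an inquiry about an unknown transaction is answered with "abort".

```python
class PresumedAbortCoordinator:
    """Illustrative bookkeeping only; a real coordinator logs each transition."""

    def __init__(self):
        # txn id -> "W" (collecting votes) or "C" (committed, acks pending).
        # Aborted and finished transactions have no entry at all.
        self._state = {}

    def start_voting(self, txn):
        self._state[txn] = "W"

    def decide_commit(self, txn):
        self._state[txn] = "C"

    def decide_abort(self, txn):
        # "A" and "F" are one state: forget the transaction right away.
        self._state.pop(txn, None)

    def all_acks_received(self, txn):
        # Every participant knows the outcome, so nobody will ask again.
        self._state.pop(txn, None)

    def answer_inquiry(self, txn):
        state = self._state.get(txn)
        if state == "C":
            return "commit"
        if state == "W":
            return "undecided"
        return "abort"  # no information: presume abort
```

Forgetting a committed transaction is safe only after all participants have acknowledged the commit, which is why the entry survives until `all_acks_received`.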

CS 347 Notes06 35

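Presumed abort-coordinator (participant unchanged)

[State diagram: coordinator states I, W, C, and the combined A/F state. Labels visible on the slide: go / exec*, ok* / commit*, nok, t / abort*, c-ok* / -, ping / abort, ping / commit, t / cping]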

CS 347 Notes06 36

Remember:

- All state transitions must be logged

Example: tracking who has sent "OK" msgs

Log at coord:
- T1 start, part={a,b}
- T1 OK from a RCV

- After failure, we know we are still waiting for OK from node b
- Alternative: do not log receipts of "OK"s; after a failure, simply abort T1

CS 347 Notes06 43

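3PC

Coordinator (states I, W, P, C, A):
- I -> W: go / exec*
- W -> P: ok* / pre*
- W -> A: nok / abort*
- P -> C: ack* / commit*

Participant (states I, W, P, C, A):
- I -> W: exec / ok
- I -> A: exec / nok
- W -> P: pre / ack
- W -> A: abort
- P -> C: commit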

CS 347 Notes06 44

3PC recovery rules: termination protocol

Survivors try to complete transaction, based on their current states
Goal:
- If dead nodes committed or aborted, then survivors should not contradict!
- Else, survivors can do as they please...

CS 347 Notes06 45

Let {S1, S2, …, Sn} be survivor sites
- If one or more Si = COMMIT => COMMIT T
- If one or more Si = ABORT => ABORT T
- If one or more Si = PREPARE => T could not have aborted => COMMIT T
- If no Si = PREPARE (or COMMIT) => T could not have committed => ABORT T
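The four rules above can be written as one function over the survivors' states. This is a sketch that assumes states are encoded as the single letters used on the state diagrams ("I", "W", "P", "C", "A"); the function name is hypothetical.

```python
def terminate_3pc(survivor_states):
    """Apply the 3PC termination rules to the survivors' current states."""
    states = set(survivor_states)
    if "C" in states:
        return "commit"
    if "A" in states:
        return "abort"
    if "P" in states:
        # Someone is prepared, so T could not have aborted.
        return "commit"
    # Nobody is prepared or committed, so T could not have committed.
    return "abort"
```

The rules are checked in order, so a survivor in COMMIT or ABORT always settles the outcome before the PREPARE rule is considered.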

CS 347 Notes06 46

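Example:

[Diagram: nodes in states P, W, W]

CS 347 Notes06 47

Example:

[Diagram: nodes in states I, W, W]

CS 347 Notes06 48

Example:

[Diagram: nodes in states P, P, C]

CS 347 Notes06 49

Example:

[Diagram: nodes in states P, W, A]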

CS 347 Notes06 50

Once survivors make a decision, they must select a new coordinator to continue 3PC

[Diagram: states (W, P, C) of the surviving nodes at Time 1 through Time 4, after they decide to commit]

CS 347 Notes06 51

Note: when survivors continue 3PC, failed nodes do not count

E.g., "OK*" means OK's have been received from all non-failed nodes

[Diagram: transition W -> P labeled ok* / pre*]

CS 347 Notes06 52

Note: 3PC unsafe with partitions!

[Diagram: a partition separates nodes in state W from a node in state P; one side decides abort, the other decides commit]

CS 347 Notes06 53

Node recovery:

After node N recovers from failure:
- Do not participate in termination protocol (why?)

[Diagram: nodes in states W, W, P; a transition to A]

CS 347 Notes06 54

Node recovery (continued):

[Same diagram, with the annotation "later on..."]

CS 347 Notes06 61

Example (1):

[Diagram: five nodes: Coord, P1, P2, P3, P4; P2, P3, P4 are in state W]

- Nodes P2, P3, P4 enter "W" state and fail
- When they recover, coord. and P1 are down
- Each node has 1 vote, V=5, Maj=

CS 347 Notes06 62

- Since P2, P3, P4 have a majority, they know coord. could not have gone to "P" without at least one of their votes
- Therefore, T can be aborted!

CS 347 Notes06 63

Example (2):

[Diagram: five nodes: Coord, P1, P2, P3, P4; P3 is in state "P", P4 is in state "W"]

- Each node has 1 vote; V=5, Maj=
- Nodes fail after entering states shown; P3, P4 recover

CS 347 Notes06 64

- Termination rule says we can try to commit, but P3, P4 do not have enough votes, so they do nothing!
- P3, P4 doing nothing is good because later on, coord., P1, P2 could abort T
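Both examples can be replayed with a small sketch that checks votes before applying the termination rule. The function name, the dictionary encoding, and the "block" return value are assumptions for illustration. The sketch covers only the basic majority check and does not model any later refinement of the protocol.

```python
def majority_decision(group_states, votes, total_votes):
    """Decide for a group of recovered nodes under majority voting.

    group_states: node name -> state letter ("I", "W", "P", "C", "A")
    votes:        node name -> number of votes held by that node
    total_votes:  V, the sum of all votes in the system
    """
    group_votes = sum(votes[node] for node in group_states)
    if 2 * group_votes <= total_votes:
        # Not a strict majority: another group might decide differently.
        return "block"
    states = set(group_states.values())
    if "C" in states:
        return "commit"
    if "A" in states:
        return "abort"
    if "P" in states:
        return "commit"
    return "abort"
```

With one vote per node and V=5, the group {P2, P3, P4} in state W may abort, while the group {P3, P4} must do nothing even though the termination rule alone would let it try to commit.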

CS 347 Notes06 65

Summary: Majority rule ensures that any decision (e.g., preparing, committing) will be known to any future group making a decision

[Diagram: the groups making decision #1 and decision #2 overlap]

CS 347 Notes06 66

Important Detail for Majority 3PC

Example:

[Diagram: nodes in states W and P; a transition to A]

CS 347 Notes06 67

Important Detail for Majority 3PC

Example:

[Same diagram, with a further transition P -> C]

CS 347 Notes06 68

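Need "Prepare To Abort" State

Coordinator (states I, W, PC, PA, C, A):
- I -> W: go / exec*
- W -> PC: ok* / preC*
- W -> PA: nok / preA*
- PC -> C: ackC* / commit*
- PA -> A: ackA* / abort*

Participant (states I, W, PC, PA, C, A):
- I -> W: exec / ok
- I -> A: exec / nok
- W -> PC: preC / ackC
- W -> PA: preA / ackA
- PC -> C: commit
- PA -> A: abort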

CS 347 Notes06 69

Example Revisited

[Diagram: nodes in states W and PC; a transition to PA]

CS 347 Notes06 70

Example Revisited

[Same diagram, with a further transition PC -> C]

OK to commit since transaction could not have aborted

CS 347 Notes06 71

Example Revisited -II

[Diagram: nodes in states W, PC, and PA]

CS 347 Notes06 72

No decision: Transaction could have aborted or could have committed... Block!

CS 347 Notes06 79

Need to "combine" wait-for graphs (WFGs) to discover global deadlock

[Diagram: local wait-for graphs between transactions at different nodes]

e.g., central detection node
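A minimal sketch of what a central detection node might do: merge the edges reported by each site into one graph and look for a cycle. The function name and the edge encoding, (waiter, holder) pairs, are assumptions for illustration.

```python
def has_global_deadlock(local_wfgs):
    """Merge local wait-for graphs and report whether the union has a cycle.

    local_wfgs: one list per site of (waiter, holder) edges, meaning
    transaction `waiter` waits for a lock held by `holder`.
    """
    graph = {}
    for edges in local_wfgs:
        for waiter, holder in edges:
            graph.setdefault(waiter, set()).add(holder)
            graph.setdefault(holder, set())

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True  # back edge: cycle found
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)
```

Neither site alone sees a cycle when one reports T1 waiting for T2 and the other reports T2 waiting for T1; only the merged graph does.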

CS 347 Notes06 80

Problem: False deadlocks

[Diagram: local wait-for graphs at Time 1, Time 2, and Time 3, the info sent, and the resulting graph at the central site]

CS 347 Notes06 81

Many deadlock solutions
- Distributed vs. centralized
- Detection vs. prevention
  - timeouts
  - wait-die
  - wound-wait
Covered in CS
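For the two timestamp-based prevention schemes in the list, a short sketch of their standard rules, assuming a smaller timestamp means an older transaction. The function names and return strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits; a younger requester aborts (dies)."""
    return "wait" if requester_ts < holder_ts else "abort requester"


def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester aborts (wounds) the holder; a younger one waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"
```

In both schemes, waiting is only ever allowed in one direction of age (older waits for younger in wait-die, younger waits for older in wound-wait), so no cycle of waits can form.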
