Understanding Database Transactions and Serializability, Study notes of Database Management Systems (DBMS)

The concept of database transactions, focusing on isolation and durability. It discusses the importance of serializability and how the scheduler ensures it by analyzing conflicting operations. The document also covers the precedence graph and the implications of conflicting locks.

Typology: Study notes

2011/2012

Uploaded on 02/03/2012

christin
christin 🇺🇸

4.6

(18)

263 documents

1 / 38

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
-1-
Lecture Notes for Database Systems
Patrick E. O'Neil
Chapter 10
Class 20.
We have encountered the idea of a transaction before in Embedded SQL.
Def. 10.1 Transaction. A transaction is a means to package together a
number of database operations performed by a process, so the database
system can provide several guarantees, called the ACID properties.
Think of writing: BEGIN TRANSACTION op1 op2 . . . opN END TRANSACTION
Then all ops within the transaction are packaged together.
There is no actual BEGIN TRANSACTION statement in SQL. A transaction is
begun by a system when there is none in progress and the application first
performs an operation that accesses data: Select, Insert, Update, etc.
The application logic can end a transaction successfully by executing:
exec sql commit work; /* called simply a Commit */
Then any updates performed by operations in the transaction are success-
fully completed and made permanent and all locks on data items are re-
leased. Alternatively:
exec sql rollback work; /* called an Abort */
means that the transaction was unsuccessful: all updates are reversed, and
locks on data items are released.
The ACID guarantees are extremely important -- This and SQL is what
differentiates a database from a file system.
Imagine that you are trying to do banking applications on the UNIX file system
(which has it's own buffers, but no transactions). There will be a number of
problems, the kind that faced database practitioners in the 50s.
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26

Partial preview of the text

Download Understanding Database Transactions and Serializability and more Study notes Database Management Systems (DBMS) in PDF only on Docsity!

Lecture Notes for Database Systems

Patrick E. O'Neil

Chapter 10

Class 20.

We have encountered the idea of a transaction before in Embedded SQL.

Def. 10.1 Transaction. A transaction is a means to package together a number of database operations performed by a process, so the database system can provide several guarantees, called the ACID properties.

Think of writing: BEGIN TRANSACTION op1 op2... opN END TRANSACTION

Then all ops within the transaction are packaged together.

There is no actual BEGIN TRANSACTION statement in SQL. A transaction is begun by a system when there is none in progress and the application first performs an operation that accesses data: Select, Insert, Update, etc.

The application logic can end a transaction successfully by executing:

exec sql commit work; /* called simply a Commit * /

Then any updates performed by operations in the transaction are success- fully completed and made permanent and all locks on data items are re- leased. Alternatively:

exec sql rollback work; /* called an Abort * /

means that the transaction was unsuccessful: all updates are reversed, and locks on data items are released.

The ACID guarantees are extremely important -- This and SQL is what differentiates a database from a file system.

Imagine that you are trying to do banking applications on the UNIX file system (which has it's own buffers, but no transactions). There will be a number of problems, the kind that faced database practitioners in the 50s.

  1. Inconsistent result. Our application is transferring money from one account to another (different pages). One account balance gets out to disk (run out of buffer space) and then the computer crashes.

When bring computer up again, have no idea what used to be in memory buffer, and on disk we have destroyed money.

  1. Errors of concurrent execution. (One kind: Inconsistent Analysis.) Teller 1 transfers money from Acct A to Acct B of the same customer, while Teller 2 is performing a credit check by adding balances of A and B. Teller 2 can see A after transfer subtracted, B before transfer added.
  2. Uncertainty as to when changes become permanent. At the very least, we want to know when it is safe to hand out money: don't want to forget we did it if system crashes, then only data on disk is safe.

Want this to happen at transaction commit. And don't want to have to write out all rows involved in transaction (teller cash balance -- very popular, we buffer it to save reads and want to save writes as well).

To solve these problems, systems analysts came up with idea of transaction (formalized in 1970s). Here are ACID guarantees:

Atomicity. The set of record updates that are part of a transaction are indivisible (either they all happen or none happen). This is true even in presence of a crash (see Durability, below).

Consistency. If all the individual processes follow certain rules (money is neither created nor destroyed) and use transactions right, then the rules won't be broken by any set of transactions acting together. Implied by Isolation, below.

Isolation. Means that operations of different transactions seem not to be interleaved in time -- as if ALL operations of one Tx before or after all operations of any other Tx.

Durability. When the system returns to the logic after a Commit Work statement, it guarantees that all Tx Updates are on disk. Now ATM machine can hand out money.

The system is kind of clever about Durability. It doesn't want to force all updated disk pages out of buffer onto disk with each Tx Commit.

The reason for this notation is that often have to consider complex inter- leaved histories of concurrent transactions; Example history:

(10.1.2)... R 2 (A) W 2 (A) R 1 (A) R 1 (B) R 2 (B) W 2 (B) C 1 C 2...

Note C (^) i means commit by Tx i. A sequence of operations like this is known as a History or sometimes a Schedule.

A history results from a series of operations submitted by users, translated into R & W operations at the level of the Scheduler. See Fig. 10.1.

It is the job of the scheduler to look at the history of operations as it comes in and provide the Isolation guarantee, by sometimes delaying some operations, and occasionally insisting that some transactions be aborted.

By this means it assures that the sequence of operations is equivalent in effect to some serial schedule (all ops of a Tx are performed in sequence with no interleaving with other transactions). See Figure 10.1, pg. 640.

In fact, (10.1.2) above is an ILLEGAL schedule. Because we can THINK of a situation where this sequence of operations gives an inconsistent result.

Example 10.1.1. Say that the two elements A and B in (10.1.2) are Acct records with each having balance 50 to begin with. Inconsistent Analysis.

T 1 is adding up balances of two accounts, T 2 is transferring 30 units from A to B.

... R2(A, 50) W2(A, 20) R1(A, 20) R1(B, 50) R2(B, 50) W2(B, 80) C1 C..

.

And T determines that the customer fails the credit check (because under balance total of 80, say).

But this could never have happened in a serial schedule, where all operation of T 2 occurred before or after all operations of T 2.

... R2(A, 50) W2(A, 20) R2(B, 50) W2(B, 80) C2 R1(A, 20) R1(B, 80) C...

o r

... R1(A, 50) R1(B, 50) C1 R2(A, 50) W2(A, 20) R2(B, 50) W2(B, 80) C...

And in both cases, T 1 sees total of 100, a Consistent View.

Notice we INTERPRETED the Reads and Writes of (10.1.2) to create a model of what was being read and written to show there was an inconsistency.

This would not be done by the Scheduler. It simply follows a number of rules we explain shortly. We maintain that a serial history is always consistent under any interpretation.

10.2. Interleaved Read/Write Operations Quick>> If a serial history is always consistent, why don't we just enforce serial histories.

The scheduler could take the first operation that it encounters of a given transaction (T 2 in the above example) and delay all ops of other Txs (the Scheduler is allowed to do this) until all operations of T 2 are completed and the transaction commits (C 2 ).

Reason we don't do this? Performance. It turns out that an average Tx has relatively small CPU bursts and then I/O during which CPU has nothing to do. See Fig 10.3, pg. 644. When I/O is complete, CPU can start up again.

What do we want to do? Let another Tx run (call this another thread) during slack CPU time. (Interleave). Doesn't help much if have only one disk (disk is bottleneck). See Fig 10.4, pg. 644.

But if we have two disks in use all the time we get about twice the throughput. Fig 10.5, pg. 645.

And if we have many disks in use, we can keep the CPU 100% occupied. Fig 10.6, pg 646.

In actuality, everything doesn't work out perfectly evenly as in Fig 10.6. Have multiple threads and multiple disks, and like throwing darts at slots.

Try to have enough threads running to keep lots of disk occupied so CPU is 90% occupied. When one thread does an I/O, want to find another thread with completed I/O ready to run again.

Leave this to you -- covered in Homework.

Now these transaction numbers are just arbitrarily assigned labels, so it is clear we could have written the above as follows:

If R 2 (A) << (^) H W 1 (A) then R 2 (A) << (^) H W 1 (A).

Here Tx 1 and Tx 2 have exchanged labels. This is another one of the four cases. Now what can we say about the following?

R 1 (A) << (^) H R 2 (A)

This can be commuted -- reads can come in any order since they don't affect each other. Note that if there is a third transaction, T 3 , where:

R 1 (A) << H W 3 (A) << H R 2 (A)

Then the reads cannot be commuted (because we cannot commute either one with W 3 (A)), but this is because of application of the earlier rules, not depending on the reads as they affect each other.

Finally, we consider:

W 1 (A) << (^) H W 2 (A)

And it should be clear that these two operations cannot commute. The ul- timate outcome of the value of A would change. That is:

If W 1 (A) << (^) H W 2 (A) then W 1 (A) << (^) S(H) W 2 (A)

To summarize our discussion, we have Definition 10.3.1, pg. 650.

Def. 10.3.1. Two operations X (^) i (A) and Y (^) j (B) in a history are said to conflict (i.e., the order matters) if and only if the following three conditions hold:

(1) A ≡ B. Operations on distinct data items never conflict.

(2) i ≠ j. Operations conflict only if they are performed by different Txs.

(3) One of the two operations X or Y is a write, W. (Other can be R or W.)

Note in connection with (2) that two operations of the SAME transaction also cannot be commuted in a history, but not because they conflict. If the scheduler delays the first, the second one will not be submitted.

Thus W 2 (A) << (^) S(H1) R 1 (A) means that T 2 << (^) S(H1) T 1 , i.e. T 2 occurs before T 1 in any serial history S(H1) (there might be more than one).

But now consider the fourth and sixth operations of H1. We have:

R 1 (B) << (^) H1 W 2 (B)

Since these operations conflict, we also have R 1 (B) << (^) S(H1) W 2 (B)

But this implies that T 1 comes before T 2 , T 1 << (^) S(H1) T 2 , in any equivalent serial history H1. And this is at odds with our previous conclusion.

In any serial history S(H1), either T 1 << (^) S(H1) T 2 or T 2 << (^) S(H1) T 1 , not both. Since we conclude from examining H1 that both occur, S(H1) must not really exist. Therefore, H1 is not SR.

We illustrate this technique a few more times, and then prove a general characterization of SR histories in terms of conflicting operations.

Ex. 10.3.2. Consider the history:

H2 = R 1 (A) R 2 (A) W 1 (A) W 2 (A) C 1 C (^2)

We give a interpretation of this history as a paradigm called a lost update.

Assume that A is a bank balance starting with the value 100 and T1 tries to add 40 to the balance at the same time that T2 tries to add 50.

H2' = R 1 (A, 100) R 2 (A, 100) W 1 (A, 140) W 2 (A, 150) C 1 C (^2)

Clearly the final result is 150, and we have lost the update where we added

  1. This couldn't happen in a serial schedule, so H1 is non-SR.

In terms of conflicting operations, note that operations 1 and 4 imply that T1 << (^) S(H2) T2. But operations 2 and 3 imply that T2 << (^) S(H2) T1. No SR schedule could have both these properties, therefore H2 is non-SR.

By the way, this example illustrates that a conflicting pair of the form R 1 (A)

... W 2 (A) does indeed impose an order on the transactions, T1 << T2, in any equivalent serial history.

H2 has no other types of pairs that COULD conflict and make H2 non-SR.

Ex 10.3.3. Consider the history:

H3 = W 1 (A) W 2 (A) W 2 (B) W 1 (B) C 1 C (^2)

This example will illustrate that a conflicting pair W 1 (A)... W 2 (A) can impose an order on the transactions T1 << T2 in any equivalent SR history.

Understand that these are what are known as "Blind Writes": there are no Reads at all involved in the transactions.

Assume the logic of the program is that T1 and T2 are both meant to "top up" the two accounts A and B, setting the sum of the balances to 100.

T1 does this by setting A and B both to 50, T2 does it by setting A to 80 and B to 20. Here is the result for the interleaved history H3.

H3' = W 1 (A, 50) W 2 (A, 80) W 2 (B, 80) W 1 (B, 50) C 1 C (^2)

Clearly in any serial execution, the result would have A + B = 100. But with H3' the end value for A is 80 and for B is 50.

To show that H3 is non-SR by using conflicting operations, note that op- erations 1 and 2 imply T1 << T2, and operations 3 and 4 that T2 << T1.

Seemingly, the argument that an interleaved history H is non-SR seems to reduce to looking at conflicting pairs of operations and keeping track of the order in which the transactions will occur in an equivalent S(H).

When there are two transactions, T1 and T2, we expect to find in a non-SR schedule that T1 << (^) S(H) T2 and T2 << (^) S(H) T1, an impossibility.

If we don't have such an impossibility arise from conflicting operations in a history H, does that mean that H is SR?

And what about histories with 3 or more transactions involved? WIll we ever see something impossible other than T1 << (^) S(H) T2 and T2 << (^) S(H) T1?

We start by defining a Precedence Graph. The idea here is that this allows us to track all conflicting pairs of operations in a history H.

Proof. Leave only if for exercises at end of chapter. I.e., will show there that a circuit in PG(H) implies there is no serial ordering of transactions.

Here we prove that if PG(H) contains no circuit, there is a serial ordering of the transactions so no edge of PG(H) ever points from a later to an earlier transaction.

Assume there are m transactions involved, and label them T1, T2,.. ., Tm.

We are trying to find a reordering of the integers 1 to m, i(1), i(2),.. ., i(m), so that Ti(1), Ti(2),.. ., Ti(m) is the desired serial schedule.

Assume a lemma to prove later: In any directed graph with no circuit there is always a vertex with no edge entering it.

OK, so we are assuming PG(H) has no circuit, and thus there is a vertex, or transaction, Tk, with no edge entering it. We choose Tk to be Ti(1).

Note that since Ti(1) has no edge entering it, there is no conflict in H that forces some other transaction to come earlier.

(This fits our intuition perfectly. All other transactions can be placed after it in time, and there won't be an edge going backward in time.)

Now remove this vertex, Ti(1), from PG(H) and all edges leaving it. Call the resulting graph PG 1 (H).

By the Lemma, there is now a vertex in PG 1 (H) with no edge entering it. Call that vertex Ti(2).

(Note that an edge from Ti(1) might enter Ti(2), but that edge doesn't count because it's been removed from PG 1 (H).)

Continue in this fashion, removing Ti(2) and all it's edges to form PG 2 (H), and so on, choosing Ti(3) from PG 2 (H),.. ., Ti(m) from PG m-1^ (H).

By construction, no edge of PG(H) will ever point backward in the sequence S(H), from Ti(m) to Ti(n), m > n.

The algorithm we have used to determine this sequence is known as a topological sort. This was a hiring question I saw asked at Microsoft.

The proof is complete, and we now know how to create an equivalen SR schedule from a history whose precedence graph has no circuit.

Lemma 10.3.5. In any finite directed acyclic graph G there is always a vertex with no edges entering it.

Proof. Choose any vertex v1 from G. Either this has the desired property, or there is an edge entering it from another vertex v2.

(There might be several edges entering v1, but choose one.)

Now v2 either has the desired property or there is an edge entering it from vertex v3. We continue in this way, and either the sequence stops at some vertex vm, or the sequence continues forever.

If the sequence stops at a vertex vm, that's because there is no edge en- tering vm, and we have found the desired vertex.

But if the sequence continues forever, since this is a finite graph, sooner or later in the sequence we will have to have a repeated vertex.

Say that when we add vertex vn, it is the same vertex as some previously mentioned vertex in the sequence, vi.

Then there is a path from vn -> v(n-1) ->... v(i+1) ->vi, where vi ≡ vn. But this is the definition of a circuit, which we said was impossible.

Therefore the sequence had to terminate with vm and that vertex was the one desired with no edges entering.

Ex. Here is history not SR (Error in text: this is a variant of Ex. 10.4.1).

H4 = R 1 (A) R 2 (A) W 2 (A) R 2 (B) W 2 (B) R 1 (B) C 1 C (^2)

Same idea as 10.3.1 why it is non-SR: T2 reads two balances that start out A=50 and B=50, T1 moves 30 from A to B. Non-SR history because T1 sees A=50 and B=80. Now try locking and releasing locks at commit.

RL 1 (A) R 1 (A) RL 2 (A) R 2 (A) WL 2 (A) (conflicting lock held by T1 so T2 must WAIT) RL 1 (B) R 1 (B) C 1 (now T2 can get WL2(A)) W 2 (A) RL 2 (B) R 2 (B) WL 2 (B)

W 2 (B) C (^2)

Works fine: T1 now sees A=50, B=50. Serial schedule, T1 then T2.

But what if allowed to Unlock and then acquire more locks later. Get non-SR schedule. Shows necessity of 2PL Rule (3).

RL 1 (A) R 1 (A) RU 1 (A) RL 2 (A) R 2 (A) WL 2 (A) W 2 (A) WU 2 (A) RL 2 (B) R 2 (B) WL 2 (B) W 2 (B) WU 2 (B) RL 1 (B) R 1 (B) C 1 C (^2)

So H4 above is possible. But only 2PL rule broken is that T1 and T2 unlock rows, then lock other rows later.

The Waits-For Graph. How scheduler checks if deadlock occurs. Vertices are currently active Txs, Directed edgs Ti -> Tj iff Ti is waiting for a lock held by Tj.

(Note, might be waiting for lock held by several other Txs. And possibly get in queue for W lock behind others who are also waiting. Draw picture.)

The scheduler performs lock operations and if Tx required to wait, draws new directed edges resulting, then checks for circuit.

Ex 10.4.2. Here is schedule like H4 above, where T2 reverses order it touches A and B (now touches B first), but same example shows non-SR.

H5 = R 1 (A) R 2 (B) W 2 (B) R 2 (A) W 2 (A) R 1 (B) C 1 C (^2)

Locking result:

RL 1 (A) R 1 (A) RL 2 (B) R 2 (B) WL 2 (B) W 2 (B) RL 2 (A) R 2 (A) WL 2 (A) (Fails: RL1(A) held, T2 must WAIT for T1 to complete and release locks) RL 1 (B) (Fails: WL2(B) held, T1 must wait for T2 to complete: But this is a deadlock! Choose T2 as victim (T1 chosen in text)) A2 (now RL 1 (B) will succeed) R1(B) C1 (start T2 over, retry, it gets Tx number 3) RL 3 (B) R 3 (B) WL 3 (B) W 3 (B) RL 3 (A) R 3 (A) WL 3 (A) W3(A) C3.

Locking serialized T1, then T2 (retried as T3).

Thm. 10.4.2. Locking Theorem. A history of transactional operations that follows the 2PL discipline is SR.

First, Lemma 10.4.3. If H is a Locking Extended History that is 2PL and the edge Ti -> Tj is in PG(H), then there must exist a data item D and two conflicting operations Xi(D) and Yj(D) such that XUi(D) <<H YLj(D).

Proof. Since Ti -> Tj in PG(H), there must be two conflicting ops Xi(D) and Yj(D) such that Xi(D) <<H Yj(D).

By the definition of 2PL, there must be locking and unlocking ops on either side of both ops, e.g.: XLi(D) <<H Xi(D) <<H XUi(D).

Now between the lock and unlock for Xi(D), the X lock is held by Ti and similarly for Yj(D) and Tj. Since X and Y conflict, the locks conflict and the intervals cannot overlap. Thus, since Xi(D) <<H Yj(D), we must have:

XLi(D) <<H Xi(D) <<H XUi(D) <<H YLj(D) <<H Yj(D) <<H YUj(D)

And in particular XUi(D) <<H YLj(D).

Proof of Thm. 10.4.2. We want to show that every 2PL history H is SR.

Assume in contradiction that there is a cycle T1 -> T2 ->... -> Tn -> T1 in PG(H). By the Lemma, for each pair Tk -> T(k+1), there is a data item Dk where XUk(Dk) <<H YL(k+1)(Dk). We write this out as follows:

  1. XU1(D1) <<H YL2(D1)
  2. XU2(D2) <<H YL3(D2)

... n-1. XU(n-1)(D(n-1)) <<H YLn(D(n-1)) n. XUn(Dn) <<H YL1(Dn) (Note, T1 is T(n+1) too.)

Class 23.

10.5 Levels of Isolation

The idea of Isolation Levels, defined in ANSI SQL-92, is that people might want to gain more concurrency, even at the expense of imperfect isolation.

A paper by Tay showed that when there is serious loss of throughput due to locking, it is generally not because of deadlock aborts (having to retry) but simply because of transactions being blocked and having to wait.

Recall that the reason for interleaving transaction operations, rather than just insisting on serial schedules, was so we could keep the CPU busy.

We want there to always be a new transaction to run when the running transaction did an I/O wait.

But if we assume that a lot of transactions are waiting for locks, we lose this. There might be only one transaction running even if we have 20 trying to run. All but one of the transactions are in a wait chain!

So the idea is to be less strict about locking and let more transactions run. The problem is that dropping proper 2PL might cause SERIOUS errors in applications. But people STILL do it.

The idea behind ANSI SQL-99 Isolation Levels is to weaken how locks are held. Locks aren't always taken, and even when they are, many locks are released before EOT.

And more locks are taken after some locks are released in these schemes. Not Two-Phase, so not perfect Isolation.

(Note in passing, that ANSI SQL-92 was originally intended to define isolation levels that did not require locking, but it has been shown that the definitions failed to do this. Thus the locking interpretation is right.)

Define short-term locks to mean a lock is taken prior to the operation (R or W) and released IMMEDIATELY AFTERWARD. This is the only alternative to long-term locks , which are held until EOT.

Then ANSI SQL-92 Isolation levels are defined as follows (Fig. 10.9 -- some difference from the text):

Write locks on rows of a table are long term

Read Locks on rows of a table are long term

Read locks on predicates are long term Read Uncommitted (Dirty Reads)

NA

(Read Only)

No Read Locks taken at all

No Read Locks taken at all Read Committed Yes No No Repeatable Read Yes Yes No Serializable Yes Yes Yes

Note that Write Predicate Locks are taken and held long-term in all isolation levels listed. What this means is explained later.

In Read Uncommitted (RU), no Read locks are taken, thus can read data on which Write lock exists (nothing to stop you if don't have to WAIT for RL).

Thus can read uncommitted data; it will be wrong if Tx that changed it later aborts. But RU is just to get a STATISTICAL idea of sales during the day (say). CEO wants to know ballpark figure -- OK if not exact.

In Read Committed (RC), we take Write locks and hold them to EOT, and Read Locks on rows read and predicates and release immediately. (Cover predicates below.)

Problem that can arise is serious one, Lost Update (Example 10.3.2):

... R 1 (A,100) R 2 (A,100) W 1 (A,140) W 2 (A,150) C 1 C 2 ...

Since R locks are released immediately, nothing stops the later Writes, and the increment of 40 is overwritten by an increment of 50, instead of the two increments adding to give 90.

Call this the Scholar's Lost Update Anomaly (since many people say Lost Update only happens at Read Uncommitted).

This is EXTREMELY serious, obviously, and an example of lost update in SQL is given in Figure 10.12 (pg. 666) for a slightly more restrictive level: Cursor Stability. Applications that use RC must avoid this kind of update.

In Figure 10.11, we see how to avoid this by doing the Update indivisibly in a single operation.