Transaction Processing:
Recovery
CPS 216
Advanced Database Systems
Announcements (April 28)
Homework #4 due today
Sample solution will be emailed to you by tomorrow
morning
Project demo period: April 28 – May 1
Remember to email me to sign up for a 30-minute slot
Final exam on Monday, May 2, 2-5pm
3 hours—no time pressure!
Open book, open notes
Comprehensive, but with emphasis on the second half of
the course and materials exercised in homework
Solution to sample final available
Review
ACID
Atomicity
Consistency
Isolation → Concurrency control
Durability → Recovery
Execution model
Before it can be operated upon, disk-resident data must first
be brought into memory
input(X): copy the disk block containing object X to memory
v = read(X): read the value of X into a local variable v
- Execute input(X) first if necessary
write(X, v): write value v to X in memory
- Execute input(X) first if necessary
output(X): write the memory block containing X to disk
[Figure: CPU, memory, and disk; objects X, Y, … appear both in memory and on disk. Labels: "Issued by transactions" and "Issued by DBMS".]
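The four operations of the execution model can be sketched as a toy in-memory simulation. This is only an illustration: the class name BufferedStore is hypothetical, and it assumes each object lives in its own block, so "copying a block" is just copying one value.

```python
class BufferedStore:
    """Toy model of input/read/write/output (assumes one object per block)."""

    def __init__(self, disk):
        self.disk = dict(disk)  # object name -> value on disk
        self.memory = {}        # object name -> value in memory

    def input(self, x):
        # Copy the disk block containing x into memory.
        self.memory[x] = self.disk[x]

    def read(self, x):
        # Execute input(x) first if necessary, then return the in-memory value.
        if x not in self.memory:
            self.input(x)
        return self.memory[x]

    def write(self, x, v):
        # Execute input(x) first if necessary, then change only the memory copy.
        if x not in self.memory:
            self.input(x)
        self.memory[x] = v

    def output(self, x):
        # Write the memory block containing x back to disk.
        self.disk[x] = self.memory[x]


store = BufferedStore({"X": 10, "Y": 20})
v = store.read("X")
store.write("X", v + 5)   # the disk copy of X is unchanged at this point
store.output("X")         # now the disk copy of X reflects the write
```

The gap between write and output is what recovery has to deal with: a crash between the two leaves memory and disk out of step.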
Failures
System crashes in the middle of a transaction T;
partial effects of T were written to disk
How do we undo T (atomicity)?
System crashes right after a transaction T commits;
not all effects of T were written to disk
How do we complete T (durability)?
Media fails; data on disk corrupted
How do we reconstruct the database (durability)?
Naïve approach
Force: When a transaction commits, all writes of this
transaction must be reflected on disk
Without force, if the system crashes right after T commits, effects of T will be lost
Problem: forcing every write to disk at commit time causes many random writes, which hurts performance
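No steal: Writes of a transaction can only be flushed to disk
at commit time
With steal, if the system crashes before T commits but after some writes of T have been flushed to disk, there is no way to undo these writes
Problem: every block modified by an uncommitted transaction must be held in memory until commit, which takes a lot of memory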
Checkpointing
Naïve approach:
Stop accepting new transactions (lame!)
Finish all active transactions
Take a database dump
Now safe to truncate the log
Fuzzy checkpointing
Determine S, the set of currently active transactions, and log ⟨begin-checkpoint S⟩
Flush all modified memory blocks at your leisure
Log ⟨end-checkpoint begin-checkpoint_location⟩
Between begin and end, continue processing old and new transactions
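The fuzzy checkpointing steps can be sketched as follows. This is a minimal sketch under stated assumptions: the log is a Python list, a log "location" is a list index, and the function name fuzzy_checkpoint and the flush callback are hypothetical names used only for illustration.

```python
def fuzzy_checkpoint(log, active, dirty, flush):
    """Append a begin/end checkpoint pair around a flush of dirty blocks.

    log    -- list of log records (assumed representation)
    active -- set of currently active transaction ids (the set S)
    dirty  -- set of modified memory blocks at the time the checkpoint begins
    flush  -- callback that writes one block to disk (hypothetical)
    """
    begin_location = len(log)
    log.append(("begin-checkpoint", frozenset(active)))

    # Flush the blocks that were dirty when the checkpoint began.
    # Other transactions may append log records while this loop runs.
    for block in sorted(dirty):
        flush(block)

    # The end record points back at the matching begin record.
    log.append(("end-checkpoint", begin_location))
    return begin_location
```

Because the end record stores the location of its begin record, recovery can find the pair even when records of other transactions were logged in between.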
Recovery: analysis and redo phase
Need to determine U, the set of active transactions at time
of crash
Scan log backward to find the last end-checkpoint record
and follow the pointer to find the corresponding
⟨start-checkpoint S⟩
Initially, let U be S
Scan forward from that start-checkpoint to end of the log
For a log record ⟨T, start⟩, add T to U
For a log record ⟨T, commit | abort⟩, remove T from U
For a log record ⟨T, X, old, new⟩, issue write(X, new)
Basically repeats history!
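The analysis and redo phase might be sketched like this. The tuple encoding of log records, the dictionary standing in for the database, and the function name analyze_and_redo are all assumptions made for illustration; the checkpoint record is tagged "start-checkpoint" to match the wording on this slide.

```python
def analyze_and_redo(log, db):
    """Return U (transactions active at crash) and redo updates into db.

    Assumed record shapes:
      ("start-checkpoint", S), ("end-checkpoint", location),
      (T, "start"), (T, "commit"), (T, "abort"), (T, X, old, new)
    """
    active = set()
    scan_from = 0  # with no complete checkpoint, scan the whole log

    # Scan backward for the last end-checkpoint and follow its pointer.
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "end-checkpoint":
            start_location = log[i][1]
            active = set(log[start_location][1])  # initially, U = S
            scan_from = start_location + 1
            break

    # Scan forward, maintaining U and repeating history.
    for rec in log[scan_from:]:
        if rec[0] in ("start-checkpoint", "end-checkpoint"):
            continue
        if len(rec) == 4:
            _, x, _old, new = rec
            db[x] = new              # write(X, new)
        elif rec[1] == "start":
            active.add(rec[0])
        elif rec[1] in ("commit", "abort"):
            active.discard(rec[0])
    return active


log = [
    ("T1", "start"), ("T1", "X", 1, 2),
    ("start-checkpoint", frozenset({"T1"})),
    ("T2", "start"), ("T2", "Y", 5, 6),
    ("end-checkpoint", 2),
    ("T1", "commit"),
    ("T2", "Y", 6, 7),
]
db = {"X": 2, "Y": 5}   # assumed disk state at crash: checkpoint flushed X
U = analyze_and_redo(log, db)
```

Note that the updates of T2 are redone even though T2 never committed; undoing them is left to the next phase.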
Recovery: undo phase
Scan log backward
Undo the effects of transactions in U
That is, for each log record ⟨T, X, old, new⟩ where T is
in U, issue write(X, old), and log this operation too (part
of the repeating-history paradigm)
Log ⟨T, abort⟩ when all effects of T have been undone
An optimization
Each log record stores a pointer to the previous log
record for the same transaction; follow the pointer chain
during undo
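The undo phase can be sketched in the same style. The tuple encoding of log records and the function name undo are assumptions for illustration, as is the choice to log each undo write as an ordinary update record with old and new swapped. The sketch scans the whole log backward rather than following per-transaction pointer chains, so it does not implement the optimization.

```python
def undo(log, db, losers):
    """Undo the effects of the transactions in losers (the set U).

    Assumed record shapes: (T, "start"), (T, "commit"), (T, "abort"),
    and (T, X, old, new) for updates.
    """
    remaining = set(losers)
    end = len(log)  # only scan records that existed before undo began

    for i in range(end - 1, -1, -1):
        if not remaining:
            break
        rec = log[i]
        t = rec[0]
        if t not in remaining:
            continue
        if len(rec) == 4:
            _, x, old, new = rec
            db[x] = old                    # write(X, old)
            log.append((t, x, new, old))   # log the undo operation too
        elif rec[1] == "start":
            # Reaching T's start record means all its effects are undone.
            log.append((t, "abort"))
            remaining.discard(t)


log = [
    ("T1", "start"), ("T1", "X", 1, 2),
    ("T2", "start"), ("T2", "Y", 5, 6),
    ("T1", "commit"),
    ("T2", "Y", 6, 7),
]
db = {"X": 2, "Y": 7}   # state after the redo phase
undo(log, db, {"T2"})
```

Logging the undo writes means that a crash during recovery is handled by the same procedure: the next recovery run redoes them and sees the abort record if one was written.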
Physical vs. logical logging
Physical logging (what we have assumed so far)
Log before and after images of data
Logical logging
Log operations (e.g., insert a row into a table)
Smaller log records
- An insertion could cause rearrangement of things on disk
- Or trigger hundreds of other events
Sometimes necessary
- Assume row-level rather than page(block)-level locking
- Data might have moved to another block at time of undo!
Much harder to make redo/undo idempotent
See solution offered by ARIES
ARIES
“ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging,” by Mohan et al. TODS 1992
Same basic ideas: steal, no force, WAL
Three phases: analysis, redo, undo
Repeats history (redo even incomplete transactions)
Better than our simple algorithm
CLR (Compensation Log Record) for transaction aborts
Redo/undo on an object is only performed when necessary → idempotency requirement lifted → logical logging supported
- Each disk block records the LSN (log sequence number) of the last change
Can take advantage of a partial checkpoint
- Recovery can start from any start-checkpoint, not necessarily one that corresponds to an end-checkpoint
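The idea of redoing a change only when necessary can be illustrated with a small sketch. It covers just the LSN comparison, not the full ARIES algorithm; the dictionary used as a page and the names redo_if_needed and increment are hypothetical.

```python
def redo_if_needed(page, record_lsn, apply_change):
    """Apply a logged change only if the page has not already seen it."""
    if page["lsn"] >= record_lsn:
        return False              # change is already reflected; skip it
    apply_change(page)
    page["lsn"] = record_lsn      # remember the last change applied
    return True


def increment(page):
    # A logical, non-idempotent operation: applying it twice would be wrong.
    page["count"] += 1


page = {"lsn": 3, "count": 10}
first = redo_if_needed(page, 7, increment)    # applied, count becomes 11
second = redo_if_needed(page, 7, increment)   # skipped, count stays 11
```

Since the LSN check prevents a change from being applied twice, the operation itself no longer has to be idempotent, which is what makes logical log records usable.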
Summary
Concurrency control
Serial schedule: no interleaving
Conflict-serializable schedule: no cycles in the precedence
graph; equivalent to a serial schedule
2PL: guarantees a conflict-serializable schedule
Strict 2PL: also guarantees recoverability
Recovery: undo/redo logging with fuzzy
checkpointing
Normal operation: write-ahead logging, no force, steal
Recovery: first redo (forward), and then undo (backward)
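The precedence-graph test for conflict serializability mentioned in the summary can be sketched as below. The schedule encoding, a list of (transaction, operation, object) triples with operation "r" or "w", and the function name is_conflict_serializable are assumptions for illustration.

```python
def is_conflict_serializable(schedule):
    """Return True if the precedence graph of the schedule has no cycle."""
    # Build edges Ti -> Tj for each pair of conflicting operations where
    # Ti's operation comes first: same object, different transactions,
    # and at least one of the two operations is a write.
    edges = {}
    for i, (ti, op_i, x_i) in enumerate(schedule):
        edges.setdefault(ti, set())
        for tj, op_j, x_j in schedule[i + 1:]:
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges[ti].add(tj)

    # Depth-first search; reaching a node that is still being visited
    # means there is a cycle.
    state = {}

    def has_cycle(node):
        state[node] = "visiting"
        for nxt in edges[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and has_cycle(nxt):
                return True
        state[node] = "done"
        return False

    return not any(node not in state and has_cycle(node) for node in edges)


interleaved = [("T1", "r", "A"), ("T2", "w", "A"),
               ("T2", "r", "B"), ("T1", "w", "B")]
serial = [("T1", "r", "A"), ("T1", "w", "A"),
          ("T2", "r", "A"), ("T2", "w", "A")]
```

In the interleaved schedule, the conflict on A gives the edge T1 → T2 and the conflict on B gives T2 → T1, so the graph has a cycle; the serial schedule yields only T1 → T2.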