File System Reliability: Ensuring Consistency and Data Integrity

These slides cover the reliability of file systems, focusing on the challenges of writing data and maintaining consistency in the face of crashes and multi-step updates. They compare write-through and write-back caching, describe how the Unix file system handles metadata and user data, and introduce transactions and their implementation, which provide atomicity, serializability, and durability.

Typology: Slides

2010/2011

Uploaded on 10/07/2011

File System Reliability

Arvind Krishnamurthy

Spring 2004

File Systems

- File systems have lots of data structures
  - File headers (or i-nodes)
    - File headers point to file data blocks
    - For large files, file headers point to indirect blocks
  - Directories are special files: a table mapping names to i-node numbers
  - Bitmap for free blocks
  - Bitmap for i-nodes (to remember which i-nodes have been allocated)
- Question: what kind of inconsistencies could occur in file systems?

[Figure: File Header, File Contents]
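
The structures listed above can be modelled in a few lines, which makes the closing question concrete. This is only an illustrative sketch: the class names, field names, and the `check` method are assumptions made for the example, not part of any real file system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    """File header: points at data blocks, or at indirect blocks for large files."""
    direct_blocks: List[int] = field(default_factory=list)
    indirect_blocks: List[int] = field(default_factory=list)


@dataclass
class FileSystem:
    """Hypothetical in-memory model of the on-disk structures."""
    num_blocks: int
    num_inodes: int
    block_bitmap: List[bool] = field(init=False)  # True = block in use
    inode_bitmap: List[bool] = field(init=False)  # True = i-node allocated
    inodes: Dict[int, Inode] = field(default_factory=dict)
    # Each directory is a table mapping names to i-node numbers.
    directories: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def __post_init__(self):
        self.block_bitmap = [False] * self.num_blocks
        self.inode_bitmap = [False] * self.num_inodes

    def check(self) -> List[str]:
        """Report two example inconsistencies between the structures."""
        problems = []
        for dirname, table in self.directories.items():
            for name, ino in table.items():
                if not self.inode_bitmap[ino] or ino not in self.inodes:
                    problems.append(f"{dirname}/{name} -> unallocated i-node {ino}")
        for ino, inode in self.inodes.items():
            for blk in inode.direct_blocks + inode.indirect_blocks:
                if not self.block_bitmap[blk]:
                    problems.append(f"i-node {ino} uses block {blk} marked free")
        return problems
```

A directory entry whose i-node was never allocated, or an i-node that uses a block the bitmap says is free, are two of the inconsistencies a crash could leave behind.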

File System Reliability

- For performance, all must be cached!
  - This is OK for reads, but what about writes?
- Options for writing data:
  - Write-through: write change immediately to disk
    - Problem: slow! Have to wait for the write to complete before you go on
  - Write-back: delay writing modified data back to disk (for example, until replaced)
    - Problem: can lose data on a crash!
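
The trade-off between the two policies can be seen in a toy model. This is a minimal sketch, assuming a dictionary stands in for the disk and another for the cache; `CachedDisk` and its methods are hypothetical names for illustration.

```python
class CachedDisk:
    """Toy block cache with a selectable write policy."""

    def __init__(self, write_through: bool):
        self.write_through = write_through
        self.disk = {}        # survives a crash
        self.cache = {}       # lost on a crash
        self.dirty = set()    # cached blocks not yet written to disk
        self.disk_writes = 0  # proxy for how slow the policy is

    def write(self, block, data):
        self.cache[block] = data
        if self.write_through:
            self._write_to_disk(block)  # caller waits for the disk
        else:
            self.dirty.add(block)       # write-back: defer the disk write

    def read(self, block):
        if block not in self.cache:
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def flush(self):
        """Write-back only: push all dirty blocks to disk."""
        for block in sorted(self.dirty):
            self._write_to_disk(block)
        self.dirty.clear()

    def crash(self):
        """Simulated crash: everything that was only in memory is gone."""
        self.cache.clear()
        self.dirty.clear()

    def _write_to_disk(self, block):
        self.disk[block] = self.cache[block]
        self.disk_writes += 1
```

Writing the same block twice costs two disk writes under write-through and only one (at flush time) under write-back, but any write-back update made after the last flush disappears on `crash()`.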

Multiple updates

- If multiple updates are needed to perform some operation, a crash can occur between them!
- Moving a file between directories:
  - Delete file from old directory
  - Add file to new directory
- Create new file:
  - Allocate space on disk for header, data
  - Write new header to disk
  - Add the new file to directory
- What if there is a crash in the middle? Even with write-through, there can still be problems.
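
The move example can be simulated to show what a crash between the two updates leaves behind. A sketch only, assuming each directory is a dictionary written through to disk; `move`, `crash_outcome`, the file name, and the i-node number are invented for the example.

```python
class Crash(Exception):
    """Simulated power failure between two directory updates."""


def move(old_dir, new_dir, name, delete_first, crash_between):
    """Move a directory entry in two separate (write-through) updates."""
    inode_number = old_dir[name]

    def delete_from_old():
        del old_dir[name]

    def add_to_new():
        new_dir[name] = inode_number

    if delete_first:
        first, second = delete_from_old, add_to_new
    else:
        first, second = add_to_new, delete_from_old
    first()
    if crash_between:
        raise Crash()
    second()


def crash_outcome(delete_first):
    """Return (still in old dir?, present in new dir?) after a mid-move crash."""
    old_dir, new_dir = {"report.txt": 12}, {}
    try:
        move(old_dir, new_dir, "report.txt", delete_first, crash_between=True)
    except Crash:
        pass
    return ("report.txt" in old_dir, "report.txt" in new_dir)
```

Deleting first leaves the file in neither directory after the crash; adding first leaves it in both. Neither order makes the pair of updates atomic.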

Ordering of operations (contd.)

- Let’s say you want to create an empty file in a directory:
  - find a free i-node
  - write i-node map
  - write i-node
  - write directory
- In what order should the above operations be performed?
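
One way to explore the question is to enumerate every ordering of the three writes and look at the disk state after a crash at each point. The classification below (a "dangling" directory entry is dangerous, a "leaked" i-node is merely wasteful) is an assumption made for this sketch, not an answer given in the slides.

```python
from itertools import permutations

WRITES = ("inode_map", "inode", "directory")


def state_after(order, k):
    """Classify the on-disk state if we crash after the first k writes."""
    done = set(order[:k])
    if "directory" in done and "inode" not in done:
        return "dangling: directory entry names an uninitialized i-node"
    if "directory" in done and "inode_map" not in done:
        return "dangling: directory entry names an i-node still marked free"
    if "inode_map" in done and "directory" not in done:
        return "leak: i-node allocated but unreachable"
    return "consistent"


def is_safe(order):
    """An order is 'safe' here if no crash point leaves a dangling entry."""
    return not any(
        state_after(order, k).startswith("dangling")
        for k in range(len(order) + 1)
    )


safe_orders = [order for order in permutations(WRITES) if is_safe(order)]
```

Under these assumptions the only safe orders are the ones that write the directory last: a crash then leaves, at worst, an allocated i-node that no directory refers to.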

User data consistency

- For user data, Unix uses “write back”: modified blocks are forced to disk every 30 seconds (or the user can call “sync” to force them to disk immediately).
- No guarantee that blocks are written to disk in any particular order.
- Sometimes meta-data consistency is good enough.
- How should vi save changes to a file to disk?
  - Wrong: delete old version, create new version
  - Correct:
    - Write new version in temp file
    - Move old version to another temp file
    - Move new version into real file
    - Unlink old version
- If crash, look at temp area; if any files are out there, send email to the user that there might be a problem.
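
A common modern variant of the "correct" save can be sketched with the Python standard library. Note the swapped-in technique: instead of moving the old version aside and unlinking it later, this sketch relies on `os.replace`, which renames the temp file over the real file in one step. The function name `safe_save` and the `.save-` prefix are assumptions for illustration.

```python
import os
import tempfile


def safe_save(path, data: bytes):
    """Write data to a temp file, then rename it over the real file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so the rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".save-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the new version to disk first
        os.replace(tmp_path, path)  # old version is replaced in a single step
    except BaseException:
        # On failure, the old version is untouched; clean up the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the program dies before `os.replace`, the old version is still intact and a stray `.save-` file in the directory plays the role of the "temp area" that can be inspected after a crash.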

Transaction concept

- Transactions: group actions together so that they are
  - Atomic: either happens or it does not (no partial operations)
  - Serializable: transactions appear to happen one after the other
  - Durable: once it happens, stays happened
- Critical sections are atomic and serializable, but not durable
- Need two more terms to describe transactions:
  - Commit: when transaction is done (durable)
  - Rollback: if failure during a transaction (means it didn’t happen at all)
- Do a set of operations tentatively. If you get to commit, OK. Otherwise, roll back the operations as if the transaction never happened.

Transaction implementation

- Key idea: fix the problem of how you make multiple updates to disk by turning multiple updates into a single disk write
- Example: money transfer from account y to account x:

    Begin transaction
      x = x + 1
      y = y - 1
    Commit

- Keep log on disk of all changes in transaction.
  - A log is like a journal: never erased, a record of everything you’ve done
  - Once both changes are on log, transaction is committed.
  - Then can “write behind” changes to disk: if crash after commit, replay log to make sure updates get to disk
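
The log-then-replay idea can be sketched as follows. The record layout (`("update", txid, key, new_value)` and `("commit", txid)`) and the `recover` function are assumptions made for this example; the single commit record is what turns several updates into one atomic disk write.

```python
def recover(log, disk):
    """Replay the log: apply only the updates of committed transactions."""
    pending = {}  # txid -> list of (key, new_value) not yet known to be committed
    for record in log:
        kind, txid = record[0], record[1]
        if kind == "update":
            _, _, key, new_value = record
            pending.setdefault(txid, []).append((key, new_value))
        elif kind == "commit":
            # The commit record is the single write that makes the
            # whole transaction take effect.
            for key, new_value in pending.pop(txid, []):
                disk[key] = new_value
    # Anything left in `pending` never committed and is ignored.
    return disk


# Transaction 1 (x = x + 1, y = y - 1) committed before the crash.
committed_log = [("update", 1, "x", 1), ("update", 1, "y", -1), ("commit", 1)]
# Transaction 2 crashed before its commit record reached the log.
uncommitted_log = [("update", 2, "x", 1), ("update", 2, "y", -1)]
```

Because the log stores new values rather than increments, replaying it twice gives the same result as replaying it once, so recovery itself can safely be interrupted and restarted.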

Transaction implementation (cont’d)

- Can we write X back to disk before commit?
  - Yes: keep an “undo log”
    - Save old value along with new value
    - If transaction does not commit, “undo change”
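
A small sketch of the undo log, assuming a dictionary stands in for the disk and a `RuntimeError` simulates the failure; `run_with_undo` and its `fail_after` parameter are hypothetical names for illustration.

```python
def run_with_undo(disk, changes, fail_after=None):
    """Apply changes to disk before commit, keeping old values for rollback.

    Returns True if the transaction committed, False if it was rolled back.
    """
    undo_log = []  # entries are (key, old_value, new_value)
    try:
        for i, (key, new_value) in enumerate(changes):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated failure before commit")
            undo_log.append((key, disk[key], new_value))  # log first
            disk[key] = new_value                         # then write early
    except RuntimeError:
        # No commit: restore old values in reverse order.
        for key, old_value, _new_value in reversed(undo_log):
            disk[key] = old_value
        return False
    return True
```

With `fail_after=1`, the write to the first key reaches the disk and is then undone, so the disk ends up exactly as it started.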

Transactions under multiple threads

- What if two threads run the same transaction at the same time? Use locks.

    Begin transaction
      Lock x, y
      x = x + 1
      y = y - 1
      Unlock x, y
    Commit

- Is the above approach correct?

Two-phase locking

- Don’t allow “unlock” before commit.
- First phase: only allowed to acquire locks (this avoids deadlock concerns).
- Second phase: all unlocks happen at commit
- Thread B can’t see any of A’s changes until A commits and releases locks. This provides serializability.
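
The two phases can be sketched with Python threads. This is a minimal illustration, not a full transaction system: the `Accounts` class and `transfer` function are invented for the example, and acquiring locks in sorted name order is an extra assumption added here so that concurrent transactions cannot deadlock.

```python
import threading


class Accounts:
    """Shared state with one lock per account."""

    def __init__(self):
        self.values = {"x": 0, "y": 0}
        self.locks = {name: threading.Lock() for name in self.values}


def transfer(accounts):
    """x = x + 1, y = y - 1 under two-phase locking."""
    held = []
    try:
        # Phase 1: only acquire locks (in a fixed order, an added assumption).
        for name in sorted(["x", "y"]):
            accounts.locks[name].acquire()
            held.append(name)
        accounts.values["x"] += 1
        accounts.values["y"] -= 1
        # The commit point would go here (e.g. writing a commit record).
    finally:
        # Phase 2: all unlocks happen at commit, never earlier.
        for name in reversed(held):
            accounts.locks[name].release()


def run_concurrently(accounts, num_threads, transfers_per_thread):
    def worker():
        for _ in range(transfers_per_thread):
            transfer(accounts)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because no lock is released until both updates are done, no other thread can observe `x` updated while `y` is not.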

Transactions in file systems

- Write-ahead logging
  - Almost all file systems built after 1985 (NT, Solaris) use “write-ahead logging”
  - Write all changes in a transaction to log (update directory, allocate block, etc.) before sending any changes to disk.
  - Example transactions: “Create file”, “Delete file”, “Move file”
- This eliminates any need for file system check (fsck) after crash
- If crash, read log:
  - If log is not complete, no change!
  - If log is completely written, apply all changes to disk
  - If log is empty, then all updates have gotten to disk.
- Pros: reliability. Cons: all data written twice
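
The three recovery cases can be written out directly. A sketch under stated assumptions: the journal holds one transaction at a time, a final `("commit",)` record marks it as completely written, and the location strings in the example are made up for a hypothetical “Create file” transaction.

```python
def recover(journal, disk):
    """Decide what to do with the journal after a crash."""
    if not journal:
        return "empty log: all updates already on disk"
    if journal[-1] != ("commit",):
        journal.clear()  # discard the partial transaction
        return "incomplete log: no change"
    for _kind, location, value in journal[:-1]:
        disk[location] = value  # second write of the same data
    journal.clear()             # log can be emptied once changes are on disk
    return "complete log: changes applied"


def create_file_journal(committed):
    """Hypothetical log records for a 'Create file' transaction."""
    records = [
        ("write", "inode_map[7]", True),
        ("write", "inode[7]", "empty file"),
        ("write", "directory['notes.txt']", 7),
    ]
    if committed:
        records.append(("commit",))
    return records
```

The loop that copies logged values to their final locations is where the "all data written twice" cost shows up: once into the log, once into place.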