Log Structured FS, Lecture Slide - Computer Science, Slides of Computer Numerical Control

Yale University Computer Numerical Control

Log structure, File systems, Motivation, LFS writes, Floating inodes, LFS data structure, Compaction, Threading, Cleaning process, Cost benefit analysis, Postscript, Array of disk

Typology: Slides

2010/2011

Uploaded on 10/07/2011


Log Structured FS

Arvind Krishnamurthy

Spring 2004

Log Structured File Systems

- Radical, different approach to designing file systems
- Technology motivations: some technologies are advancing faster than others
  - CPUs are getting faster every year (x2 every 1-2 years)
  - Everything except the CPU will become a bottleneck (Amdahl's law)
  - Disks are not getting much faster
  - Memory is growing in size dramatically (x2 every 1.5 years)
  - File systems => file caches are a good idea (cut down on disk bandwidth)


Motivation (contd.)

- File system motivations:
  - File caches help reads a lot
  - File caches do not help writes very much
    - Delayed writes help, but writes cannot be delayed forever
  - File caches make disk writes more frequent than disk reads
  - Files are mostly small: too much synchronous I/O
  - Disk geometries are not predictable
  - RAID: a whole bunch of disks with data striped across them
    - Increases bandwidth, but does not change latency
    - Does not help small files (more on this later)

LFS Writes

- Treat disk as a tape!
  - Buffer recent writes in memory
  - Log is append-only: no overwrite in place
  - The log is the only thing on disk! It is the main storage structure
- When you create a small file (less than a block):
  - Write data block to memory log
  - Write file inode to memory log
  - Write directory block to memory log
  - Write directory inode to memory log
  - When the memory log accumulates to, say, 1 MB, or, say, 30 seconds have elapsed, write the log to disk as a single write
- No seeks for writes
- But inodes are now floating
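The buffering rule above (flush when about 1 MB has accumulated or about 30 seconds have passed) can be sketched as follows. This is a minimal illustration, not the Sprite implementation: the class `LogBuffer`, the helper `create_small_file`, and the use of a `bytearray` to stand in for the on-disk log are all assumptions made for the example.

```python
import time


class LogBuffer:
    """In-memory log that is flushed to 'disk' as one sequential write."""

    def __init__(self, max_bytes=1 << 20, max_age_s=30.0, clock=time.monotonic):
        self.max_bytes = max_bytes      # size threshold (1 MB by default)
        self.max_age_s = max_age_s      # age threshold (30 s by default)
        self.clock = clock
        self.pending = []               # buffered (kind, payload) records
        self.pending_bytes = 0
        self.first_write_at = None      # time of the oldest buffered record
        self.disk_log = bytearray()     # hypothetical stand-in for the disk log
        self.flushes = 0

    def append(self, kind, payload):
        if self.first_write_at is None:
            self.first_write_at = self.clock()
        self.pending.append((kind, payload))
        self.pending_bytes += len(payload)
        if self.should_flush():
            self.flush()

    def should_flush(self):
        if not self.pending:
            return False
        too_big = self.pending_bytes >= self.max_bytes
        too_old = self.clock() - self.first_write_at >= self.max_age_s
        return too_big or too_old

    def flush(self):
        # One large append to the end of the log: no seeks between records.
        self.disk_log += b"".join(payload for _, payload in self.pending)
        self.pending.clear()
        self.pending_bytes = 0
        self.first_write_at = None
        self.flushes += 1


def create_small_file(log, data, inode, dir_block, dir_inode):
    """Creating a small file adds four records to the memory log."""
    for kind, payload in (("data", data), ("inode", inode),
                          ("dir_block", dir_block), ("dir_inode", dir_inode)):
        log.append(kind, payload)
```

In this sketch the age check only runs when a new record arrives; a real system would also need a timer so that an idle buffer is still flushed.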

LFS: floating inodes

On a write:
- Append data, inode, and a piece of the inode map to the log
- Record the location of that piece of the inode map in the map of the inode map (in memory)
- Checkpoint the map of the inode map once in a while

LFS Data structures

On a read:
- Go from the map of the inode map ("map map"), to the inode map, to the inode, to the block
- Get some locality in the inode map
- Cache a lot of hot pieces of the inode map
- Number of I/Os per read: a little worse than FFS
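The write and read paths through the floating inodes can be sketched with a toy model. Everything here is an assumption made for illustration: the log is a Python list whose indices act as disk addresses, each file has a single data block, and the names `TinyLFS` and `PIECE_SIZE` are hypothetical.

```python
PIECE_SIZE = 4  # inodes covered by one inode-map piece (arbitrary here)


class TinyLFS:
    def __init__(self):
        self.log = []          # the log; an "address" is an index into it
        self.map_of_map = {}   # piece number -> log address of newest piece

    def _append(self, record):
        self.log.append(record)
        return len(self.log) - 1

    def _load_piece(self, piece_no):
        addr = self.map_of_map.get(piece_no)
        return {} if addr is None else self.log[addr][2]

    def write(self, inode_no, data):
        # Append data, then the inode, then the updated inode-map piece.
        data_addr = self._append(("data", data))
        inode_addr = self._append(("inode", inode_no, data_addr))
        piece_no = inode_no // PIECE_SIZE
        piece = dict(self._load_piece(piece_no))   # copy, never overwrite
        piece[inode_no] = inode_addr
        # Record where the new piece lives in the in-memory map of the map.
        self.map_of_map[piece_no] = self._append(("imap_piece", piece_no, piece))

    def read(self, inode_no):
        piece = self._load_piece(inode_no // PIECE_SIZE)  # map map -> inode map
        _, _, data_addr = self.log[piece[inode_no]]       # inode map -> inode
        return self.log[data_addr][1]                     # inode -> block
```

Rewriting a file leaves the old data, inode, and map piece in the log as dead records, which is why a cleaner is needed later.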

LFS Data structures (contd.)

On recovery:
- Read the checkpoint to get the map of the inode map
- Roll forward in the log to update the map of the inode map
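A rough sketch of checkpoint-plus-roll-forward recovery, under assumptions chosen for the example: the log is a list of tuples, inode-map pieces are tagged `"imap_piece"`, and a checkpoint records how much of the log it covers. The function names and record layout are hypothetical, not taken from the slides.

```python
def take_checkpoint(log, map_of_map):
    """Snapshot the map of the inode map plus the log position it covers."""
    return {"log_len": len(log), "map_of_map": dict(map_of_map)}


def recover(log, checkpoint):
    """Rebuild the in-memory map of the inode map after a crash."""
    map_of_map = dict(checkpoint["map_of_map"])
    # Roll forward: replay inode-map pieces written after the checkpoint.
    for addr in range(checkpoint["log_len"], len(log)):
        record = log[addr]
        if record[0] == "imap_piece":
            piece_no = record[1]
            map_of_map[piece_no] = addr   # a later piece supersedes an older one
    return map_of_map
```

Only the part of the log written after the checkpoint is scanned, so more frequent checkpoints mean a shorter roll-forward.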

Wrap Around Problem

- Pretty soon you run out of space on the disk
- Log needs to wrap around
- Two approaches:
  - Compaction
  - Threading
- Sprite (first implementation of LFS):
  - Combination of the two; open up free segments & avoid copying

Combined Solution

- Want benefits of both:
  - Compaction: big free space
  - Threading: leave long-lived things in place so they aren't copied again and again
- Solution: "segmented log"
  - Chop disk into a bunch of large "segments"
  - Compaction within segments
  - Threading among segments
  - Always write to the "current clean" segment before moving on to the next one
  - Segment cleaner: pick some segments and collect their live data together

Recap

- In LFS, everything is stored in a single log
  - Carry over the data-blocks and I-node data structures from Unix
  - Buffer writes and write them to disk as a sequential log
  - Use inode-map and inode-map-map to keep track of floating I-nodes
  - Cache (in memory) typically minimizes the cost of the extra levels of indirection
  - Inode-map-map and pieces of inode-map are cached in memory

Cleaning

- Eventually the log could fill the entire disk
  - Reclaim the holes in the log. Two approaches:
    - Compaction of entire disk
    - Threading over live data
  - LFS uses a hybrid strategy. Divides disk into "segments"
    - Threads over non-empty segments
    - Segments guarantee that seek costs are amortized
    - Every once in a while, picks a few segments and compacts them to generate empty segments

Cleaning Process

- When to clean?
  - When the number of free segments falls below a certain threshold
- Choosing a segment to clean:
  - Will be based on the amount of live data it contains
  - Segment usage table: tracks number of live bytes in each segment
  - When you rewrite I-nodes/data blocks, find the old segment in which they used to live, and decrement the usage count for the old segment
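The bookkeeping described above can be sketched as a small table of live bytes per segment. This is an illustrative model only; the class `SegmentUsageTable`, the per-block `location` dictionary, and the `min_free` threshold parameter are assumptions, not details from the slides.

```python
class SegmentUsageTable:
    def __init__(self, num_segments, segment_bytes):
        self.segment_bytes = segment_bytes
        self.live_bytes = [0] * num_segments
        self.location = {}   # block id -> (segment, size) of its live copy

    def write_block(self, block_id, segment, size):
        # A rewrite kills the old copy: decrement the old segment's count.
        old = self.location.get(block_id)
        if old is not None:
            old_segment, old_size = old
            self.live_bytes[old_segment] -= old_size
        self.live_bytes[segment] += size
        self.location[block_id] = (segment, size)

    def utilization(self, segment):
        return self.live_bytes[segment] / self.segment_bytes

    def free_segments(self):
        return [s for s, live in enumerate(self.live_bytes) if live == 0]

    def needs_cleaning(self, min_free):
        # Clean when the number of free segments falls below the threshold.
        return len(self.free_segments()) < min_free
```

For example, after writing blocks "a" (40 bytes) and "b" (60 bytes) to segment 0 and then rewriting "a" into segment 1, segment 0 holds 60 live bytes and segment 1 holds 40.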

Cleaning Goals

- Want bimodal distribution:
  - Small number of low-utilized segments
    - So that cleaner can always find easy segments to clean
  - Large number of highly-utilized segments
    - So that disk is well utilized

[Figure: distribution plot; axis label: # segs]

Greedy Cleaner

- Greedy cleaner: pick the segment with the lowest utilization "u" to clean
- Workload #1: uniform (pick random files to overwrite)
- Workload #2: hot-cold workload (90% of the updates go to 10% of the files)

Greedy Cleaner

- Greedy strategy does not create a bimodal distribution
- Slow-moving segments are likely to make the cleaning threshold high
- Separating data into hot & cold data also didn't help

Better Approach

- Cold segment space is more valuable: if you clean cold segments, it takes them longer to come back
- Hot free space is less valuable: might as well wait a bit longer
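One way to turn this intuition into a selection rule is the cost-benefit ratio from the Sprite LFS paper, (1 - u) * age / (1 + u), where u is the segment's utilization and age is how long its data has been stable. The slides do not spell out the formula, so treat the sketch below as an assumption about what "better approach" refers to; the segment records and function names are made up for the example.

```python
def greedy_choice(segments):
    """Greedy cleaner: pick the segment with the lowest utilization."""
    return min(segments, key=lambda s: s["u"])["name"]


def cost_benefit(u, age):
    """Free space gained (1 - u), weighted by how long it should stay free
    (age), per unit of I/O: read the whole segment (1) plus rewrite the
    live part (u)."""
    return (1.0 - u) * age / (1.0 + u)


def cost_benefit_choice(segments):
    return max(segments, key=lambda s: cost_benefit(s["u"], s["age"]))["name"]


# Hypothetical segments: the hot one is emptier, the cold one is older.
segments = [
    {"name": "hot", "u": 0.50, "age": 1.0},
    {"name": "cold", "u": 0.75, "age": 20.0},
]
```

With these numbers the greedy policy picks the hot segment, while the cost-benefit policy picks the cold one even though it is 75% utilized, because its freed space is expected to stay free much longer.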

When is LFS good?

- LFS does well on "common" cases
- LFS degrades for "corner" cases

Why is this good research?

- Driven by keen awareness of technology trends
- Willing to radically depart from conventional practice
- Yet keeps sufficient compatibility to keep things simple and limit grunge work
- Provides insight with simplified math
- Simulation to evaluate and validate ideas
- Solid real implementation and measurements

Announcements

- Design review meetings:
  - Tomorrow from 2-4pm
  - Thursday from 2-4pm with Zheng Ma
- Suggested background readings:
  - RAID paper
  - Unix Time Sharing System paper

RAIDs and availability

- Suppose you need to store more data than fits on a single disk (e.g., large database or file servers). How should you arrange data across disks?
- Option 1: treat disks as a huge pool of disk blocks
  - Disk1 has blocks 1, 2, …, N
  - Disk2 has blocks N+1, N+2, …, 2N
  - …
- Option 2: stripe data across disks, with k disks:
  - Disk1 has blocks 1, k+1, 2k+1, …
  - Disk2 has blocks 2, k+2, 2k+2, …
  - …
- What are the advantages/disadvantages of the two options?
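The two placement options reduce to simple arithmetic on the block number. The sketch below uses the slide's 1-based block and disk numbering; the function names and the returned (disk, offset) pair are conventions chosen for the example.

```python
def concatenated(block, blocks_per_disk):
    """Option 1: disk 1 holds blocks 1..N, disk 2 holds N+1..2N, and so on.
    Returns (disk number, block offset within that disk)."""
    index = block - 1
    return index // blocks_per_disk + 1, index % blocks_per_disk


def striped(block, num_disks):
    """Option 2: with k disks, disk 1 holds blocks 1, k+1, 2k+1, ...
    Returns (disk number, block offset within that disk)."""
    index = block - 1
    return index % num_disks + 1, index // num_disks


# Which disks does a sequential read of blocks 1..4 touch?
disks_option1 = {concatenated(b, 100)[0] for b in range(1, 5)}
disks_option2 = {striped(b, 4)[0] for b in range(1, 5)}
```

With N = 100 and k = 4, the sequential read stays on one disk under option 1 but spreads over all four disks under option 2, which is where striping gets its bandwidth for large transfers.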

Writes to RAID 4

- Large writes that access all disks (say, a stripe of blocks)
  - Compute the parity block and store it on the parity disk
- Small writes. Two options:
  - Read current stripe of blocks, compute parity with the new block, write parity block
  - Better option:
    - Read current version of block being written
    - Read current version of parity block
    - Compute how parity would change:
      - If a bit in the block changed, the corresponding parity bit needs to be flipped
    - Write new version of block
    - Write new version of parity block
- Disk containing parity block is updated on all writes
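The "better option" for small writes is an XOR read-modify-write. The sketch below models blocks as Python `bytes` and a stripe as a list of data blocks; these representations and the function names are assumptions for illustration.

```python
def xor_blocks(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(data_blocks):
    """Large write: parity is the XOR of every data block in the stripe."""
    parity = bytes(len(data_blocks[0]))   # all-zero block
    for block in data_blocks:
        parity = xor_blocks(parity, block)
    return parity


def small_write(stripe, parity, index, new_block):
    """Small write: two reads (old block, old parity) and two writes."""
    old_block = stripe[index]                        # read old data
    changed_bits = xor_blocks(old_block, new_block)  # bits that differ
    new_parity = xor_blocks(parity, changed_bits)    # flip those parity bits
    stripe[index] = new_block                        # write new data
    return new_parity                                # caller writes new parity
```

The small write touches only the target disk and the parity disk, regardless of how many data disks are in the stripe, but every such write still goes through the single parity disk.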

Distributed Parity

- Parity blocks are distributed across disks
  - Spreads load evenly
  - Multiple writes could potentially be serviced at the same time
  - All disks can be used for servicing reads
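One simple way to distribute parity is to rotate the parity block by one disk per stripe. Real arrays use several different layouts, so the rotation below is just one assumed scheme; disks and stripes are 0-based here, and the function names are hypothetical.

```python
def parity_disk(stripe, num_disks):
    """Stripe 0 keeps parity on the last disk, stripe 1 on the one before
    it, and so on, wrapping around."""
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes, num_disks):
    """Return one row per stripe, marking parity (P) and data (D) disks."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if d == p else "D" for d in range(num_disks)])
    return rows
```

Over any run of `num_disks` consecutive stripes, each disk holds parity exactly once, so small writes to different stripes can update parity on different disks in parallel.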

Comparison

- RAID-5 vs. normal disks:
  - RAID-5: better throughput, better reliability, good bandwidth for large reads, small waste of space
  - Normal disks: perform better for small writes
- RAID-1 vs. RAID-5: Which is better?
  - RAID-1 wastes more space
  - For small writes: RAID-1 is better
- HP-AutoRAID system:
  - Stores hot data in RAID-
  - Cold data in RAID-
  - Does automatic background propagation of data as working set changes
