









Log structure, File systems, Motivation, LFS writes, Floating inodes, LFS data structure, Compaction, Threading, Cleaning process, Cost benefit analysis, Postscript, Array of disk










- CPUs are getting faster every year (x2 every 1-2 years)
- Everything except the CPU will become a bottleneck (Amdahl’s law)
- Disks are not getting much faster
- Memory is growing in size dramatically (x2 every 1.5 years)
- File systems: file caches are a good idea (they cut down on disk bandwidth)
- File caches help reads a lot
- File caches do not help writes very much
  - Delayed writes help, but writes cannot be delayed forever
- File caches make disk writes more frequent than disk reads
- Files are mostly small: too much synchronous I/O
- Disk geometries are not predictable
- RAID: a whole bunch of disks with data striped across them
  - Increases bandwidth, but does not change latency
  - Does not help small files (more on this later)
- Treat the disk as a tape!
- Buffer recent writes in memory
- The log is append-only: no overwrite in place
- The log is the only thing on disk! It is the main storage structure
- When you create a small file (less than a block):
  - Write data block to memory log
  - Write file inode to memory log
  - Write directory block to memory log
  - Write directory inode to memory log
- When the memory log has accumulated, say, 1 MB, or, say, 30 seconds have elapsed, write the log to disk as a single write
- No seeks for writes
- But inodes are now floating
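The buffering policy above can be sketched in a few lines of Python. This is a minimal illustration, not the LFS implementation: the class name `MemoryLog`, the record sizes, and the explicit `now` argument are assumptions made for the example; only the 1 MB and 30 second thresholds come from the slide.

```python
from dataclasses import dataclass, field
from typing import Optional

FLUSH_BYTES = 1 << 20    # "say 1 MB" from the slide
FLUSH_SECONDS = 30.0     # "say 30 seconds" from the slide


@dataclass
class MemoryLog:
    """Hypothetical in-memory log tail; each flush is one sequential disk write."""
    disk_writes: list = field(default_factory=list)  # one entry per disk write
    pending: list = field(default_factory=list)      # buffered (kind, payload) records
    pending_bytes: int = 0
    first_append: Optional[float] = None             # time of oldest buffered record

    def append(self, kind: str, payload: bytes, now: float) -> None:
        if self.first_append is None:
            self.first_append = now
        self.pending.append((kind, payload))
        self.pending_bytes += len(payload)
        too_big = self.pending_bytes >= FLUSH_BYTES
        too_old = now - self.first_append >= FLUSH_SECONDS
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        # Everything buffered so far goes out as a single write: no seeks.
        if self.pending:
            self.disk_writes.append(list(self.pending))
        self.pending.clear()
        self.pending_bytes = 0
        self.first_append = None

    def create_small_file(self, data: bytes, now: float) -> None:
        # The four records listed on the slide; sizes are placeholders.
        self.append("data block", data, now)
        self.append("file inode", b"\0" * 128, now)
        self.append("directory block", b"\0" * 512, now)
        self.append("directory inode", b"\0" * 128, now)
```

Creating many small files within the time window therefore costs one disk write in total, where an update-in-place file system would issue several separate writes per file.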
When writing:
- Append data, inode, and piece of inode map to the log
- Record the location of the piece of inode map in the map of inode map (in memory)
- Checkpoint the map of inode map once in a while

When reading:
- Go from map of map, to inode map, to inode, to block
- Get some locality in inode map
- Cache a lot of hot pieces of inode map
- Number of I/Os per read: a little worse than FFS

When recovering:
- Read checkpoint, get map of map
- Roll forward in log to update map of map
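The write, read, and recovery paths above can be traced with a toy model. This is a sketch under simplifying assumptions: the log is a Python list whose indices stand in for disk addresses, each file has a single data block, and the names `TinyLFS` and `PIECE` are invented for the example.

```python
PIECE = 4  # inode-map entries per piece (assumed; tiny for illustration)


class TinyLFS:
    def __init__(self):
        self.log = []               # the on-disk log; address = list index
        self.map_of_map = {}        # in memory: piece number -> address of piece
        self.checkpoint = ({}, 0)   # (saved map of map, log length at that time)

    def _append(self, record):
        self.log.append(record)
        return len(self.log) - 1

    def write(self, ino, data):
        # Append data, inode, and the updated piece of the inode map.
        data_addr = self._append(("data", data))
        inode_addr = self._append(("inode", ino, data_addr))
        piece_no = ino // PIECE
        old = self.map_of_map.get(piece_no)
        piece = dict(self.log[old][2]) if old is not None else {}
        piece[ino] = inode_addr
        # Record where the new piece lives in the in-memory map of map.
        self.map_of_map[piece_no] = self._append(("imap", piece_no, piece))

    def read(self, ino):
        # map of map -> inode-map piece -> inode -> data block
        piece = self.log[self.map_of_map[ino // PIECE]][2]
        _, _, data_addr = self.log[piece[ino]]
        return self.log[data_addr][1]

    def take_checkpoint(self):
        self.checkpoint = (dict(self.map_of_map), len(self.log))

    def recover(self):
        # Start from the checkpoint, then roll forward through the log.
        saved, start = self.checkpoint
        self.map_of_map = dict(saved)
        for addr in range(start, len(self.log)):
            record = self.log[addr]
            if record[0] == "imap":
                self.map_of_map[record[1]] = addr
```

Old copies of data, inodes, and inode-map pieces stay in the log as dead records; they are the holes that the cleaner has to reclaim.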
- Compaction
- Threading
- Combination of the two: open up free segments & avoid copying

- Want benefits of both:
  - Compaction: big free space
  - Threading: leave long-living things in place so they aren’t copied again and again
- Solution: “segmented log”
  - Chop disk into a bunch of large “segments”
  - Compaction within segments
  - Threading among segments
  - Always write to the “current clean” segment before moving on to the next one
  - Segment cleaner: pick some segments and collect their live data together
- Carry over the data blocks and inode data structures from Unix
- Buffer writes and write them to disk as a sequential log
- Use inode-map and inode-map-map to keep track of floating inodes
- Caching (in memory) typically minimizes the cost of the extra levels of indirection
  - Inode-map-map and pieces of inode-map are cached in memory
- Reclaim the holes in the log. Two approaches:
  - Compaction of entire disk
  - Threading over live data
- LFS uses a hybrid strategy: it divides the disk into “segments”
  - Threads over non-empty segments
  - Segments guarantee that seek costs are amortized
  - Every once in a while, picks a few segments and compacts them to generate empty segments
- Clean when the number of free segments falls below a certain threshold

- The choice of segment to clean is based on the amount of live data it contains
  - Segment usage table: tracks the number of live bytes in each segment
  - When you rewrite inodes/data blocks, find the old segment in which they used to live, and decrement the usage count for the old segment
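A segment usage table of this kind can be sketched as follows. This is an illustrative model only: the class name and the `where` dictionary (which remembers each block's previous segment) are assumptions standing in for the lookup a real file system would perform through the old inode.

```python
class SegmentUsageTable:
    def __init__(self, num_segments):
        self.live_bytes = [0] * num_segments  # live bytes per segment
        self.where = {}                       # block id -> (segment, size); hypothetical helper

    def write_block(self, block_id, segment, size):
        # A rewrite kills the old copy: decrement the old segment's count.
        old = self.where.get(block_id)
        if old is not None:
            old_segment, old_size = old
            self.live_bytes[old_segment] -= old_size
        self.live_bytes[segment] += size
        self.where[block_id] = (segment, size)

    def utilization(self, segment, segment_size):
        # Fraction of the segment still occupied by live data.
        return self.live_bytes[segment] / segment_size
```

For example, after writing blocks `a` and `b` into segment 0 and then rewriting `a` into segment 1, segment 0 holds only the bytes of `b`, so its utilization drops and it becomes cheaper to clean.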
- Small number of low-utilized segments
  - So that the cleaner can always find easy segments to clean
- Large number of highly-utilized segments
  - So that the disk is well utilized
[Figure: number of segments (# segs) plotted against segment utilization (u)]
- Greedy cleaner: pick the segments with the lowest “u” to clean
- Workload #1: uniform (pick random files to overwrite)
- Workload #2: hot-cold workload (90% of the updates go to 10% of the files)

- The greedy strategy does not create a bimodal distribution
- Slow-moving segments are likely to make the cleaning threshold high
- Separation of data into hot & cold data also didn’t help

- Cold segment space is more valuable: if you clean cold segments, it takes them longer to come back
- Hot free space is less valuable: might as well wait a bit longer
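The slides do not spell out the cost-benefit formula. The sketch below uses the ratio from the original LFS paper (Rosenblum and Ousterhout), benefit/cost = (1 - u) * age / (1 + u): cleaning a segment costs one segment read plus writing back the live fraction u, and frees 1 - u of a segment, weighted by how long the data has been stable. The tuple layout and the sample numbers are assumptions for illustration.

```python
def greedy_pick(segments):
    """segments: list of (name, u, age). Pick the lowest utilization."""
    return min(segments, key=lambda s: s[1])[0]


def benefit_to_cost(u, age):
    # Cost: read the whole segment (1) and write back its live data (u).
    # Benefit: free space generated (1 - u), weighted by the data's age.
    return (1.0 - u) * age / (1.0 + u)


def cost_benefit_pick(segments):
    return max(segments, key=lambda s: benefit_to_cost(s[1], s[2]))[0]


# Made-up example: a hot segment at 50% utilization and an old,
# cold segment at 75% utilization.
segments = [("hot", 0.50, 1.0), ("cold", 0.75, 20.0)]
```

On this input the greedy cleaner picks the hot segment, while the cost-benefit cleaner picks the cold one even though it is fuller, which matches the point that cold free space is the more valuable kind.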
- LFS does well on “common” cases
- LFS degrades for “corner” cases
- Tomorrow from 2-4pm
- Thursday from 2-4pm with Zheng Ma

- RAID paper
- Unix Time Sharing System paper
- Suppose you need to store more data than fits on a single disk (e.g., a large database or file server). How should you arrange data across disks?
- Option 1: treat the disks as one huge pool of disk blocks
  - Disk1 has blocks 1, 2, …, N
  - Disk2 has blocks N+1, N+2, …, 2N
  - …
- Option 2: stripe data across disks, with k disks:
  - Disk1 has blocks 1, k+1, 2k+1, …
  - Disk2 has blocks 2, k+2, 2k+2, …
  - …
- What are the advantages/disadvantages of the two options?
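The two layouts can be written as mapping functions from a logical block number to a (disk, offset) pair. This is a small sketch using the 1-based block and disk numbering of the slide; the function names are invented for the example.

```python
def concat_layout(block, blocks_per_disk):
    """Option 1: Disk1 holds blocks 1..N, Disk2 holds N+1..2N, and so on."""
    disk = (block - 1) // blocks_per_disk + 1
    offset = (block - 1) % blocks_per_disk
    return disk, offset


def striped_layout(block, num_disks):
    """Option 2: Disk1 holds blocks 1, k+1, 2k+1, ...; Disk2 holds 2, k+2, ..."""
    disk = (block - 1) % num_disks + 1
    offset = (block - 1) // num_disks
    return disk, offset


def disks_touched(layout, param, first_block, count):
    """Set of disks involved in reading `count` consecutive blocks."""
    return {layout(b, param)[0] for b in range(first_block, first_block + count)}
```

With 4 disks, a sequential read of blocks 1 to 4 touches only Disk1 under option 1 (for any N of at least 4) but all four disks under option 2, which is where striping gets its bandwidth for large transfers. A single-block request still waits on one disk either way, so latency does not improve.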
- Compute the parity block and store it on the parity disk

- Read current stripe of blocks, compute parity with the new block, write parity block
- Better option:
  - Read current version of block being written
  - Read current version of parity block
  - Compute how parity would change:
    - If a bit in the block changed, the corresponding parity bit needs to be flipped
  - Write new version of block
  - Write new version of parity block
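Both update strategies reduce to XOR. The sketch below compares recomputing parity over the whole stripe with the "better option" that reads only the old block and the old parity; blocks are modeled as equal-length `bytes` objects, and the function names are assumptions for the example.

```python
from functools import reduce


def xor_blocks(a, b):
    """Bitwise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity over the whole stripe: XOR of all data blocks."""
    return reduce(xor_blocks, blocks)


def small_write_parity(old_block, new_block, old_parity):
    # Bits that differ between old and new data are exactly the
    # parity bits that need to be flipped.
    changed = xor_blocks(old_block, new_block)
    return xor_blocks(old_parity, changed)
```

The small-write path costs two reads and two writes regardless of how many disks are in the stripe, whereas recomputing from scratch has to read every other data block in the stripe.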
- Spreads load evenly
- Multiple writes could potentially be serviced at the same time
- All disks can be used for servicing reads

- RAID-5: better throughput, better reliability, good bandwidth for large reads, small waste of space
- Normal disks: perform better for small writes

- RAID-1 wastes more space
- For small writes: RAID-1 is better
- Stores hot data in RAID-
- Cold data in RAID-
- Does automatic background propagation of data as the working set changes