Storage Systems - Lecture Slides, CMSC 411
Storage Systems (2)
Outline
- Historical Context of Storage I/O
- Secondary and Tertiary Storage Devices
- I/O Buses
- Processor Interface Issues
- Storage I/O Performance Measures
- A Little Queuing Theory
- Redundant Arrays of Inexpensive Disks (RAID)
I/O Performance Measures
- Diversity
  - Which I/O devices can connect?
- Capacity
  - How many I/O devices can connect?
- Response time (latency)
  - Time a task takes from the moment it is placed in the queue until the server completes its service
- Throughput (I/O bandwidth)
  - Average number of tasks completed by the server per unit time
Disk Device Performance
Disk Latency = Queuing Time + Seek Time + Rotation Time + Xfer Time
Order-of-magnitude times for 4 KB transfers:
- Seek: 12 ms or less
- Rotate: 4.2 ms @ 7200 RPM (8.3 ms @ 3600 RPM)
- Xfer: 1 ms @ 7200 RPM (2 ms @ 3600 RPM)
Disk I/O Performance
Response time = Queue + Device Service time
[Figure: Proc -> Queue -> IOC -> Device]
Disk Latency = Queuing Time + Controller Overhead + Seek Time + Rotation Time + Transfer Time
Disk Time Example
- Disk parameters:
  - Transfer size is 8 KB
  - Advertised average seek is 12 ms
  - Disk spins at 7200 RPM
  - Transfer rate is 4 MB/s
  - Controller overhead is 2 ms
- Assume the disk is idle, so there is no queuing delay
- What is the average disk access time for a sector?
  - Avg seek + avg rotational delay + transfer time + controller overhead
  - 12 ms + 0.5/(7200 RPM/60) + 8 KB/(4 MB/s) + 2 ms
  - 12 + 4.17 + 2 + 2 = 20.17 ms, or about 20 ms
- The advertised seek time assumes no locality. Real seeks typically take 1/4 to 1/3 of the advertised seek time, which lowers the total from 20 ms to about 12 ms
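The arithmetic in this example can be checked with a short Python sketch. The function name and parameter names are illustrative, not from the slides, and the code assumes 1 MB = 1000 KB, as the slide's own arithmetic does.

```python
def avg_disk_access_ms(seek_ms, rpm, transfer_kb, rate_mb_per_s, controller_ms):
    """Average access time in ms for an idle disk (no queuing delay)."""
    # Average rotational delay is half a revolution.
    rotation_ms = 0.5 / (rpm / 60.0) * 1000.0
    # Transfer time; assumes 1 MB = 1000 KB, matching the slide's 8 KB / 4 MB/s = 2 ms.
    transfer_ms = transfer_kb / (rate_mb_per_s * 1000.0) * 1000.0
    return seek_ms + rotation_ms + transfer_ms + controller_ms


# Advertised average seek of 12 ms.
advertised = avg_disk_access_ms(12.0, 7200, 8, 4, 2.0)
# Seek at 1/3 of the advertised time, to model locality.
with_locality = avg_disk_access_ms(12.0 / 3, 7200, 8, 4, 2.0)

print(f"advertised seek:        {advertised:.2f} ms")
print(f"seek at 1/3 advertised: {with_locality:.2f} ms")
```

With the seek cut to a third, the rotational delay (about 4.17 ms) becomes the largest single term.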
Response Time vs Throughput
Improving Performance I
- Provide more resources
- Improve throughput and response time?
A Little Queuing Theory: Notation
- Queuing models assume a state of equilibrium: input rate = output rate
- Notation:
  - r (λ): average number of arriving customers/second
  - Tser: average time to service a customer (traditionally μ = 1/Tser)
  - u (ρ): server utilization (0..1): u = r x Tser (or u = r / μ)
  - Tq: average time/customer in queue
  - Tsys: average time/customer in system: Tsys = Tq + Tser
  - Lq: average length of queue: Lq = r x Tq
  - Lsys: average length of system: Lsys = r x Tsys
- Little's Law: Length_system = rate x Time_system (mean number of customers = arrival rate x mean time in system)
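As a rough illustration of the notation, the sketch below applies Little's Law to the queue and to the whole system. The numbers are hypothetical (they are not from the slides), and the helper name `littles_law_length` is made up for this example.

```python
def littles_law_length(arrival_rate, mean_time):
    """Little's Law: mean number of customers = arrival rate x mean time spent."""
    return arrival_rate * mean_time


# Hypothetical workload: 50 requests/s, 10 ms in queue, 10 ms service time.
r = 50.0        # arrivals per second
t_q = 0.010     # seconds in queue
t_ser = 0.010   # seconds in service

t_sys = t_q + t_ser                      # Tsys = Tq + Tser
u = r * t_ser                            # utilization, u = r x Tser
l_q = littles_law_length(r, t_q)         # Lq = r x Tq
l_sys = littles_law_length(r, t_sys)     # Lsys = r x Tsys

print(f"u = {u:.2f}, Lq = {l_q:.2f}, Lsys = {l_sys:.2f}")
```

Note that Lsys - Lq = r x Tser = u: the difference is the average number of customers being served, which for a single server is the utilization.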
[Figure: Proc -> Queue -> server (IOC, Device); the queue and the server together form the System]
A Little Queuing Theory
- Service time completions vs. waiting time for a busy server: a randomly arriving event joins a queue of arbitrary length when the server is busy, and is otherwise serviced immediately
  - Unlimited-length queues are the key simplification
- A single-server queue: a servicing facility that accommodates 1 customer at a time (server) plus a waiting area (queue), together called a system
- The server spends a variable amount of time with customers; how do you characterize the variability?
  - Distribution of a random variable: histogram? curve?
  - Mean and variance are sufficient to characterize the distribution
A Little Queuing Theory
- Server spends a variable amount of time with customers
  - Weighted mean m1 = (f1 x T1 + f2 x T2 + ... + fn x Tn)/F (F = f1 + f2 + ... + fn)
  - variance = (f1 x T1^2 + f2 x T2^2 + ... + fn x Tn^2)/F - m1^2
    » Must keep track of the unit of measure (100 ms^2 vs. 0.0001 s^2)
  - Squared coefficient of variation: C = variance/m1^2
    » Unitless measure (the same whether times are in ms or s)
- Exponential distribution, C = 1: most values short relative to the average, a few long; 90% < 2.3 x average, 63% < average
- Hypoexponential distribution, C < 1: most values close to the average; C = 0.5 => 90% < 2.0 x average, only 57% < average
- Hyperexponential distribution, C > 1: values further from the average; C = 2.0 => 90% < 2.8 x average, 69% < average
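The weighted mean, variance, and squared coefficient C defined above can be computed from a frequency histogram as in the sketch below. The histogram values are hypothetical and the function name is illustrative. Running it in both milliseconds and seconds shows why the variance needs its unit tracked while C does not.

```python
def service_time_stats(samples):
    """samples: list of (frequency, time) pairs. Returns (mean, variance, C)."""
    total = sum(f for f, _ in samples)                       # F = f1 + f2 + ...
    mean = sum(f * t for f, t in samples) / total            # m1
    variance = sum(f * t * t for f, t in samples) / total - mean ** 2
    c = variance / mean ** 2                                 # unitless
    return mean, variance, c


# Hypothetical histogram: 6 requests took 5 ms, 3 took 10 ms, 1 took 40 ms.
hist_ms = [(6, 5.0), (3, 10.0), (1, 40.0)]
hist_s = [(f, t / 1000.0) for f, t in hist_ms]

mean_ms, var_ms, c_ms = service_time_stats(hist_ms)
mean_s, var_s, c_s = service_time_stats(hist_s)

print(f"ms: mean = {mean_ms:.1f}, variance = {var_ms:.1f} ms^2, C = {c_ms:.2f}")
print(f"s:  mean = {mean_s:.3f}, variance = {var_s:.6f} s^2, C = {c_s:.2f}")
```

The variance differs by a factor of 10^6 between the two unit choices, while C comes out the same in both.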
A Little Queuing Theory: Variable Service Time
- Server spends a variable amount of time with customers
  - Weighted mean m1 = (f1 x T1 + f2 x T2 + ... + fn x Tn)/F (F = f1 + f2 + ... + fn)
  - Squared coefficient of variation C
- Disk response times: C = 1.5 (majority of seeks < average)
- Yet usually pick C = 1.0 for simplicity
- Another useful value is the average time a new arrival must wait for the server to complete the task in progress: m1(z)
  - Not just 1/2 x m1, because that doesn't capture the variance
  - Can derive m1(z) = 1/2 x m1 x (1 + C)
  - No variance: C = 0 => m1(z) = 1/2 x m1
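A minimal sketch of the m1(z) formula, evaluated for the three values of C mentioned on this slide. The 20 ms mean is a hypothetical figure chosen for illustration, and the function name is not from the slides.

```python
def mean_residual_service_time(m1, c):
    """Average remaining time of the task in service: m1(z) = 1/2 x m1 x (1 + C)."""
    return 0.5 * m1 * (1.0 + c)


M1_MS = 20.0  # hypothetical mean service time in ms

residuals = {c: mean_residual_service_time(M1_MS, c) for c in (0.0, 1.0, 1.5)}
for c, z in residuals.items():
    print(f"C = {c:.1f}: m1(z) = {z:.1f} ms")
```

With C = 1 the residual equals the full mean m1, which is the memoryless property of the exponential distribution. With C > 1 the residual exceeds the mean, because an arrival is more likely to land in the middle of a long service.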
A Little Queuing Theory: Average Wait Time
- Calculating average wait time in queue Tq
  - If a customer is at the server, it takes m1(z) on average to complete
  - Chance the server is busy = u; average delay is u x m1(z)
  - All customers in line must also complete; each takes Tser on average

    Tq = u x m1(z) + Lq x Tser
    Tq = 1/2 x u x Tser x (1 + C) + Lq x Tser
    Tq = 1/2 x u x Tser x (1 + C) + r x Tq x Tser
    Tq = 1/2 x u x Tser x (1 + C) + u x Tq
    Tq x (1 - u) = Tser x u x (1 + C) / 2
    Tq = Tser x u x (1 + C) / (2 x (1 - u))

- Notation:
  - r: average number of arriving customers/second
  - Tser: average time to service a customer
  - u: server utilization (0..1): u = r x Tser
  - Tq: average time/customer in queue
  - Lq: average length of queue: Lq = r x Tq
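The closed form for Tq can be sanity-checked numerically: plugging the result back into the starting equation, Tq = 1/2 x u x Tser x (1 + C) + u x Tq, should reproduce the same value. The sketch below does that for hypothetical inputs; the function name is illustrative.

```python
def mg1_queue_time(t_ser, u, c):
    """Average time in queue: Tq = Tser x u x (1 + C) / (2 x (1 - u))."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    return t_ser * u * (1.0 + c) / (2.0 * (1.0 - u))


# Hypothetical inputs: Tser = 20 ms, u = 0.5, C = 1.5.
t_ser, u, c = 20.0, 0.5, 1.5
t_q = mg1_queue_time(t_ser, u, c)

# Right-hand side of the equation the derivation starts from,
# with Lq x Tser already rewritten as u x Tq.
rhs = 0.5 * u * t_ser * (1.0 + c) + u * t_q

print(f"Tq = {t_q:.2f} ms, right-hand side = {rhs:.2f} ms")
```

The guard on `u` reflects the equilibrium assumption: at u = 1 the denominator vanishes and no finite average wait exists.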
A Little Queuing Theory: M/G/1 and M/M/1
- Assumptions so far:
  - System in equilibrium
  - Times between two successive arrivals in line are random
  - Server can start on the next customer immediately after the prior one finishes
  - No limit to the queue: works First-In-First-Out
  - Afterward, all customers in line must complete; each takes Tser on average
- This describes "memoryless" or Markovian request arrival (M: exponentially random, C = 1), General service distribution (no restrictions), 1 server: the M/G/1 queue
- When service times have C = 1, this is the M/M/1 queue:
    Tq = Tser x u x (1 + C) / (2 x (1 - u)) = Tser x u / (1 - u)
  - Tser: average time to service a customer
  - u: server utilization (0..1): u = r x Tser
  - Tq: average time/customer in queue
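To see how the M/M/1 formula behaves as the server gets busier, the sketch below sweeps the utilization for a fixed service time. The 20 ms service time and the chosen utilization values are hypothetical.

```python
def mm1_queue_time(t_ser, u):
    """Average M/M/1 queue time: Tq = Tser x u / (1 - u)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    return t_ser * u / (1.0 - u)


T_SER_MS = 20.0  # hypothetical average service time in ms

sweep = {u: mm1_queue_time(T_SER_MS, u) for u in (0.2, 0.5, 0.8, 0.9)}
for u, t_q in sweep.items():
    print(f"u = {u:.1f}: Tq = {t_q:6.1f} ms, Tsys = {t_q + T_SER_MS:6.1f} ms")
```

Queue time grows nonlinearly: going from 80% to 90% utilization more than doubles the wait, which is why disks are rarely run near full utilization.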
A Little Queuing Theory: An Example
- A processor sends 10 x 8 KB disk I/Os per second; requests and service times are exponentially distributed; avg. disk service time = 20 ms
- On average, how utilized is the disk?
- What is the number of requests in the queue?
- What is the average time spent in the queue?
- What is the average response time for a disk request?
- Notation:
  - r: average number of arriving customers/second = 10
  - Tser: average time to service a customer = 20 ms (0.02 s)
  - u: server utilization (0..1): u = r x Tser = 10/s x 0.02 s = 0.2
  - Tq: average time/customer in queue = Tser x u / (1 - u) = 20 x 0.2/(1 - 0.2) = 20 x 0.25 = 5 ms (0.005 s)
  - Tsys: average time/customer in system: Tsys = Tq + Tser = 25 ms
  - Lq: average length of queue: Lq = r x Tq = 10/s x 0.005 s = 0.05 requests in queue
  - Lsys: average # tasks in system: Lsys = r x Tsys = 10/s x 0.025 s = 0.25
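The same calculation can be written as a small Python helper that returns all five quantities at once. This is a sketch under the M/M/1 assumptions stated in the example; the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class MM1Metrics:
    utilization: float  # u
    t_q: float          # average time in queue, seconds
    t_sys: float        # average time in system, seconds
    l_q: float          # average number of requests in queue
    l_sys: float        # average number of requests in system


def mm1_metrics(arrival_rate, t_ser):
    """M/M/1 metrics from arrival rate (per second) and service time (seconds)."""
    u = arrival_rate * t_ser
    if u >= 1.0:
        raise ValueError("server is saturated (u >= 1); no equilibrium")
    t_q = t_ser * u / (1.0 - u)
    t_sys = t_q + t_ser
    return MM1Metrics(u, t_q, t_sys, arrival_rate * t_q, arrival_rate * t_sys)


# 10 I/Os per second, 20 ms average service time.
m = mm1_metrics(10.0, 0.020)
print(f"u = {m.utilization:.2f}")
print(f"Tq = {m.t_q * 1000:.1f} ms, Tsys = {m.t_sys * 1000:.1f} ms")
print(f"Lq = {m.l_q:.3f}, Lsys = {m.l_sys:.3f}")
```

Working in seconds throughout avoids unit slips when multiplying rates (per second) by times.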
A Little Queuing Theory: Another Example
- A processor sends 20 x 8 KB disk I/Os per second; requests and service times are exponentially distributed; avg. disk service time = 12 ms
- On average, how utilized is the disk?
- What is the number of requests in the queue?
- What is the average time spent in the queue?
- What is the average response time for a disk request?
- Notation:
  - r: average number of arriving customers/second = 20
  - Tser: average time to service a customer = 12 ms (0.012 s)
  - u: server utilization (0..1): u = r x Tser = 20/s x 0.012 s = 0.24
  - Tq: average time/customer in queue = Tser x u / (1 - u) = 12 x 0.24/(1 - 0.24) = 12 x 0.32 = 3.8 ms (0.0038 s)
  - Tsys: average time/customer in system: Tsys = Tq + Tser = 16 ms
  - Lq: average length of queue: Lq = r x Tq = 20/s x 0.0038 s = 0.076 requests in queue
  - Lsys: average # tasks in system: Lsys = r x Tsys = 20/s x 0.016 s = 0.32
Manufacturing Advantages of Disk Arrays
[Figure: Disk Product Families, low end to high end. Conventional: 4 disk designs (3.5", 5.25", 10", 14"). Disk Array: 1 disk design (3.5").]
Replace Small # of Large Disks with Large # of Small Disks! (1988 Disks)
| Disk | Data Capacity | Volume | Power | Data Rate | I/O Rate | MTTF | Cost |
|---|---|---|---|---|---|---|---|
| IBM 3390 (K) | 20 GBytes | 97 cu. ft. | 3 KW | 15 MB/s | 600 I/Os/s | 250 KHrs | $250K |
| IBM 3.5" 0061 | 320 MBytes | 0.1 cu. ft. | 11 W | 1.5 MB/s | 55 I/Os/s | 50 KHrs | $2K |
| IBM 3.5" 0061 x 70 | 23 GBytes | 11 cu. ft. | 1 KW | 120 MB/s | 3900 I/Os/s | ??? Hrs | $150K |

Disk arrays have potential for:
- large data and I/O rates
- high MB per cu. ft., high MB per KW
- reliability?
Array Reliability
- Reliability of N disks = Reliability of 1 disk ÷ N
  - 50,000 hours ÷ 70 disks ≈ 700 hours
  - Disk system MTTF drops from 6 years to 1 month!
- Arrays (without redundancy) are too unreliable to be useful!
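The reliability rule above is easy to reproduce. The sketch below assumes, as the slide's formula implicitly does, that disks fail independently at a constant rate and that any single failure takes down an array without redundancy. The constant and function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365
HOURS_PER_DAY = 24


def array_mttf_hours(disk_mttf_hours, n_disks):
    """MTTF of an array with no redundancy: single-disk MTTF divided by N."""
    return disk_mttf_hours / n_disks


single_disk = 50_000.0
array = array_mttf_hours(single_disk, 70)

print(f"single disk: {single_disk / HOURS_PER_YEAR:.1f} years")
print(f"70-disk array: {array:.0f} hours, about {array / HOURS_PER_DAY:.0f} days")
```

This gives roughly 714 hours, which the slide rounds to 700 hours, or about one month.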
Hot spares support reconstruction in parallel with access: very high media availability can be achieved
Redundant Arrays of Disks
- Files are "striped" across multiple spindles
- Redundancy yields high data availability
  - Disks will fail
  - Contents are reconstructed from data redundantly stored in the array
    - Capacity penalty to store it
    - Bandwidth penalty to update
- Techniques:
  - Mirroring/Shadowing (high capacity cost)
  - Horizontal Hamming Codes (overkill)
  - Parity & Reed-Solomon Codes
  - Failure Prediction (no capacity overhead!): VaxSimPlus; the technique is controversial
Redundant Arrays of Disks RAID 1: Disk Mirroring/Shadowing
- Each disk is fully duplicated onto its "shadow"
  - Very high availability can be achieved
- Bandwidth sacrifice on write:
  - Logical write = two physical writes
- Reads may be optimized
- Most expensive solution: 100% capacity overhead
- Targeted for high I/O rate, high availability environments
[Figure: each disk and its shadow form a recovery group]
Redundant Arrays of Disks RAID 3: Parity Disk
[Figure: a logical record (10010011 11001101 ...) striped as physical records across the data disks, with a parity disk P]
- Parity computed across the recovery group to protect against hard disk failures
  - 33% capacity cost for parity in this configuration
  - Wider arrays reduce capacity costs, decrease expected availability, and increase reconstruction time
- Arms logically synchronized, spindles rotationally synchronized
  - Logically a single high-capacity, high-transfer-rate disk
- Targeted for high-bandwidth applications: scientific, image processing
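A minimal sketch of how parity across a recovery group allows a failed disk to be rebuilt. It assumes the parity is a bytewise XOR of the data disks, which the slide does not state explicitly, and it models each disk as a short `bytes` value. The function names and data are illustrative.

```python
from functools import reduce


def xor_parity(blocks):
    """Bytewise XOR of equal-length blocks, one block per disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three hypothetical data disks in one recovery group.
data = [b"\x93\xcd", b"\x0f\xf0", b"\xaa\x55"]
parity = xor_parity(data)

# Simulate losing disk 1 and rebuilding it from the other disks and parity.
lost = 1
survivors = data[:lost] + data[lost + 1:]
recovered = reconstruct(survivors, parity)

print("parity:   ", parity.hex())
print("recovered:", recovered.hex())
```

Reconstruction has to read every surviving disk in the group, which is why wider arrays take longer to rebuild.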
Redundant Arrays of Disks RAID 5+: High I/O Rate Parity
- A logical write becomes four physical I/Os
- Independent writes are possible because of interleaved parity
- Reed-Solomon Codes ("Q") for protection during reconstruction
[Figure: data blocks D0, D1, D2, ... laid out across the disks]
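One common reading of the four physical I/Os is a read-modify-write: read the old data, read the old parity, write the new data, write the new parity. The sketch below follows that reading, which is an assumption since the slide does not list the four operations, and it also assumes XOR parity. The stripe is modeled as a Python list and all names are illustrative.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe, data_idx, parity_idx, new_data):
    """Update one data block and its parity; returns the physical I/O count."""
    physical_ios = 0
    old_data = stripe[data_idx]        # I/O 1: read old data
    physical_ios += 1
    old_parity = stripe[parity_idx]    # I/O 2: read old parity
    physical_ios += 1
    # New parity = old parity XOR old data XOR new data.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[data_idx] = new_data        # I/O 3: write new data
    physical_ios += 1
    stripe[parity_idx] = new_parity    # I/O 4: write new parity
    physical_ios += 1
    return physical_ios


# Hypothetical stripe: three data blocks plus one parity block (index 3).
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb", b""]
stripe[3] = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

ios = small_write(stripe, data_idx=1, parity_idx=3, new_data=b"\xff\x00")

# Parity recomputed from scratch should match the incrementally updated parity.
expected_parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])
print(f"physical I/Os: {ios}, parity consistent: {stripe[3] == expected_parity}")
```

Only the target data disk and the parity disk are touched, so writes to different stripes whose parity sits on different disks can proceed independently.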