Storage Systems - Lecture Slides | CMSC 411, Study notes of Computer Science

University of Maryland Computer Science

Prof. Sam Hyuk Noh

Material Type: Notes; Professor: Noh; Class: SYSTM ARCHITECTURE; Subject: Computer Science; University: University of Maryland; Term: Fall 2001;

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-d8f 🇺🇸


Page 1

Storage Systems (2)

Outline

Historical Context of Storage I/O
Secondary and Tertiary Storage Devices
I/O Buses
Processor Interface Issues
Storage I/O Performance Measures
A Little Queuing Theory
Redundant Arrays of Inexpensive Disks (RAID)

I/O Performance Measures

Diversity
- Which IO devices can connect?
Capacity
- How many IO devices can connect?
Response time (Latency)
- time a task takes from the moment it is placed in the queue until the service is completed by the server
Throughput (IO Bandwidth)
- average number of tasks completed by the server over a time period

Disk Device Performance

Disk Latency = Queuing Time + Seek Time + Rotation Time + Xfer Time
Order of magnitude times for 4K byte transfers:
- Seek: 12 ms or less
- Rotate: 4.2 ms @ 7200 rpm (8.3 ms @ 3600 rpm)
- Xfer: 1 ms @ 7200 rpm (2 ms @ 3600 rpm)

Page 2

Disk I/O Performance

Response time = Queue + Device Service time

[Figure: Proc → Queue → IOC → Device]

Disk Latency = Queuing Time + Controller Overhead + Seek Time + Rotation Time + Transfer Time

Disk Time Example

Disk Parameters:
- Transfer size is 8K bytes
- Advertised average seek is 12 ms
- Disk spins at 7200 RPM
- Transfer rate is 4 MB/sec
Controller overhead is 2 ms
Assume that disk is idle so no queuing delay
What is Average Disk Access Time for a Sector?
- Ave seek + ave rot delay + transfer time + controller overhead
- 12 ms + 0.5/(7200 RPM/60) + 8 KB/(4 MB/s) + 2 ms
- 12 + 4.17 + 2 + 2 ≈ 20 ms
Advertised seek time assumes no locality: typically 1/ to 1/3 advertised seek time: 20 ms → 12 ms (at 1/3 of the advertised seek: 4 + 4.17 + 2 + 2 ≈ 12 ms)
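The arithmetic of the example above can be checked with a short script. This is a minimal sketch: the function name `disk_access_time_ms` and its parameters are illustrative, not from the slides, and it assumes 1 MB = 1000 KB so that 8 KB at 4 MB/s takes 2 ms, as on the slide.

```python
def disk_access_time_ms(seek_ms, rpm, transfer_kb, rate_mb_per_s,
                        controller_ms, queue_ms=0.0):
    """Average access time = queue + controller + seek + rotation + transfer."""
    # On average the head waits half a revolution for the sector to arrive.
    rotation_ms = 0.5 / (rpm / 60.0) * 1000.0
    # Assumption: 1 MB = 1000 KB, matching the slide's 8 KB / 4 MB/s = 2 ms.
    transfer_ms = transfer_kb / (rate_mb_per_s * 1000.0) * 1000.0
    return queue_ms + controller_ms + seek_ms + rotation_ms + transfer_ms


# Advertised seek (12 ms), idle disk, so no queuing delay.
advertised = disk_access_time_ms(12.0, 7200, 8, 4, 2.0)
# With locality, assume the seek is 1/3 of the advertised value.
with_locality = disk_access_time_ms(12.0 / 3, 7200, 8, 4, 2.0)

print(f"advertised seek: {advertised:.2f} ms")
print(f"1/3 seek:        {with_locality:.2f} ms")
```

The two results round to the 20 ms and 12 ms quoted on the slide.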

Response Time vs Throughput

Improving Performance I

Provide more resources
- Improve throughput and response time?

Page 4

A Little Queuing Theory: Notation

Queuing models assume state of equilibrium: input rate = output rate
Notation:
- r (λ): average number of arriving customers/second
- Tser: average time to service a customer (traditionally μ = 1/Tser)
- u (ρ): server utilization (0..1): u = r x Tser (or u = r / μ)
- Tq: average time/customer in queue
- Tsys: average time/customer in system: Tsys = Tq + Tser
- Lq: average length of queue: Lq = r x Tq
- Lsys: average length of system: Lsys = r x Tsys
Little's Law: Length_system = rate x Time_system (Mean number of customers = arrival rate x mean time in system)

[Figure: Proc → Queue → server (IOC, Device); queue and server together form the System]

A Little Queuing Theory

Service time completions vs. waiting time for a busy server: randomly arriving event joins a queue of arbitrary length when server is busy, otherwise serviced immediately
- Unlimited length queues key simplification
A single server queue: combination of a servicing facility that accommodates 1 customer at a time (server) + waiting area (queue): together called a system
Server spends a variable amount of time with customers; how do you characterize variability?
- Distribution of a random variable: histogram? curve?
- Mean and variance sufficient to characterize distribution


A Little Queuing Theory

Server spends a variable amount of time with customers
- Weighted mean m1 = (f1 x T1 + f2 x T2 +...+ fn x Tn)/F (F=f1 + f2...)
- variance = (f1 x T1^2 + f2 x T2^2 +...+ fn x Tn^2)/F - m1^2
  » Must keep track of unit of measure (100 ms^2 = 0.0001 s^2)
- Squared coefficient of variance: C = variance/m1^2
  » Unitless measure
Exponential distribution C = 1: most short relative to average, few others long; 90% < 2.3 x average, 63% < average
Hypoexponential distribution C < 1: most close to average; C=0.5 => 90% < 2.0 x average, only 57% < average
Hyperexponential distribution C > 1: further from average; C=2.0 => 90% < 2.8 x average, 69% < average
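The weighted mean, variance, and squared coefficient of variance can be computed directly from a histogram of service times. The sketch below uses a made-up two-bucket histogram (three requests at 10 ms, one at 30 ms); the function name and the data are assumptions for illustration only.

```python
def service_time_stats(freqs, times):
    """Return (m1, variance, C) for a histogram of service times.

    freqs[i] is how often service time times[i] was observed.
    """
    total = sum(freqs)                                  # F = f1 + f2 + ...
    m1 = sum(f * t for f, t in zip(freqs, times)) / total
    second_moment = sum(f * t * t for f, t in zip(freqs, times)) / total
    variance = second_moment - m1 ** 2
    c = variance / m1 ** 2                              # unitless
    return m1, variance, c


# Hypothetical histogram, first in milliseconds, then in seconds.
freqs = [3, 1]
m1_ms, var_ms, c_ms = service_time_stats(freqs, [10.0, 30.0])
m1_s, var_s, c_s = service_time_stats(freqs, [0.010, 0.030])

print(m1_ms, var_ms, c_ms)   # mean in ms, variance in ms^2, C
print(m1_s, var_s, c_s)      # mean in s, variance in s^2, same C
```

The variance changes by a factor of 10^6 between the two unit choices while C stays the same, which is why C is the convenient measure to carry into the queuing formulas.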


A Little Queuing Theory: Variable Service Time

Server spends a variable amount of time with customers
- Weighted mean m1 = (f1xT1 + f2xT2 +...+ fnxTn)/F (F=f1+f2+...)
- Squared coefficient of variance C
Disk response times C = 1.5 (majority seeks < average)
Yet usually pick C = 1.0 for simplicity
Another useful value is average time must wait for server to complete task: m1(z)
- Not just 1/2 x m1 because doesn't capture variance
- Can derive m1(z) = 1/2 x m1 x (1 + C)
- No variance: C = 0 => m1(z) = 1/2 x m1
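A small numeric illustration of m1(z) = 1/2 x m1 x (1 + C) follows. The function name `mean_residual_time` and the 20 ms mean are assumptions chosen for the example; the C values are the ones mentioned on the slides.

```python
def mean_residual_time(m1, c):
    """Average time until the task currently at the server completes.

    m1 is the mean service time, c the squared coefficient of variance.
    """
    return 0.5 * m1 * (1.0 + c)


m1 = 20.0  # hypothetical mean service time in ms
for c in (0.0, 1.0, 1.5):
    print(f"C = {c}: m1(z) = {mean_residual_time(m1, c)} ms")
```

With no variance (C = 0) an arriving request waits half a service time on average; with exponential service (C = 1) it waits a full mean service time, because arrivals are more likely to land inside long services.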


Page 5

A Little Queuing Theory: Average Wait Time

Calculating average wait time in queue Tq
- If something at server, it takes to complete on average m1(z)
- Chance server is busy = u; average delay is u x m1(z)
- All customers in line must complete; each avg Tser
Tq = u x m1(z) + Lq x Tser
Tq = 1/2 x u x Tser x (1 + C) + Lq x Tser
Tq = 1/2 x u x Tser x (1 + C) + r x Tq x Tser
Tq = 1/2 x u x Tser x (1 + C) + u x Tq
Tq x (1 - u) = Tser x u x (1 + C) / 2
Tq = Tser x u x (1 + C) / (2 x (1 - u))
Notation:
- r: average number of arriving customers/second
- Tser: average time to service a customer
- u: server utilization (0..1): u = r x Tser
- Tq: average time/customer in queue
- Lq: average length of queue: Lq = r x Tq
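The closed form for Tq can be checked against the first line of the derivation, Tq = u x m1(z) + Lq x Tser. The sketch below is illustrative; `mg1_queue_wait` is a hypothetical name, and the sample inputs (10 requests/s, 20 ms service, C = 1.5) are assumptions.

```python
def mg1_queue_wait(r, t_ser, c):
    """Average time in queue: Tq = Tser * u * (1 + C) / (2 * (1 - u))."""
    u = r * t_ser                      # server utilization
    if not 0.0 <= u < 1.0:
        raise ValueError("equilibrium requires 0 <= u < 1")
    return t_ser * u * (1.0 + c) / (2.0 * (1.0 - u))


r, t_ser, c = 10.0, 0.020, 1.5         # hypothetical workload
t_q = mg1_queue_wait(r, t_ser, c)

# Re-evaluate the starting equation with the computed Tq.
u = r * t_ser
m1_z = 0.5 * t_ser * (1.0 + c)         # residual time of the task in service
l_q = r * t_q                          # Little's Law applied to the queue
rhs = u * m1_z + l_q * t_ser

print(t_q, rhs)                        # the two values should agree
```

Both printed values agree, confirming that the closed form is the fixed point of the starting equation.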

A Little Queuing Theory: M/G/1 and M/M/1

Assumptions so far:
- System in equilibrium
- Time between two successive arrivals in line is random
- Server can start on next customer immediately after prior finishes
- No limit to the queue: works First-In-First-Out
- Afterward, all customers in line must complete; each avg Tser
Described “memoryless” or Markovian request arrival (M for C=1 exponentially random), General service distribution (no restrictions), 1 server: M/G/1 queue
When service times have C = 1: M/M/1 queue
Tq = Tser x u x (1 + C) / (2 x (1 - u)) = Tser x u / (1 - u)
- Tser: average time to service a customer
- u: server utilization (0..1): u = r x Tser
- Tq: average time/customer in queue
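The M/M/1 formula makes the cost of high utilization easy to see. The following sketch tabulates Tq for a few utilization levels; the function name and the 20 ms service time are assumptions for illustration.

```python
def mm1_queue_wait(t_ser, u):
    """M/M/1 average time in queue: Tq = Tser * u / (1 - u)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("equilibrium requires 0 <= u < 1")
    return t_ser * u / (1.0 - u)


t_ser_ms = 20.0  # hypothetical average service time
for u in (0.2, 0.5, 0.8, 0.9, 0.95):
    print(f"u = {u:4.2f}  Tq = {mm1_queue_wait(t_ser_ms, u):6.1f} ms")
```

At u = 0.5 the average wait equals one service time (20 ms); at u = 0.9 it is nine service times (180 ms), so the wait grows without bound as u approaches 1.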

A Little Queuing Theory: An Example

processor sends 10 x 8KB disk I/Os per second, requests & service exponentially distrib., avg. disk service = 20 ms
On average, how utilized is the disk?
- What is the number of requests in the queue?
- What is the average time spent in the queue?
- What is the average response time for a disk request?
Notation:
- r: average number of arriving customers/second = 10
- Tser: average time to service a customer = 20 ms (0.02s)
- u: server utilization (0..1): u = r x Tser = 10/s x .02s = 0.2
- Tq: average time/customer in queue = Tser x u / (1 - u) = 20 x 0.2/(1-0.2) = 20 x 0.25 = 5 ms (0.005s)
- Tsys: average time/customer in system: Tsys = Tq + Tser = 25 ms
- Lq: average length of queue: Lq = r x Tq = 10/s x .005s = 0.05 requests in queue
- Lsys: average # tasks in system: Lsys = r x Tsys = 10/s x .025s = 0.25

A Little Queuing Theory: Another Example

processor sends 20 x 8KB disk I/Os per sec, requests & service exponentially distrib., avg. disk service = 12 ms
On average, how utilized is the disk?
- What is the number of requests in the queue?
- What is the average time spent in the queue?
- What is the average response time for a disk request?
Notation:
- r: average number of arriving customers/second = 20
- Tser: average time to service a customer = 12 ms
- u: server utilization (0..1): u = r x Tser = 20/s x .012s = 0.24
- Tq: average time/customer in queue = Tser x u / (1 - u) = 12 x 0.24/(1-0.24) = 12 x 0.32 = 3.8 ms
- Tsys: average time/customer in system: Tsys = Tq + Tser = 3.8 + 12 ≈ 16 ms
- Lq: average length of queue: Lq = r x Tq = 20/s x .0038s = 0.076 requests in queue
- Lsys: average # tasks in system: Lsys = r x Tsys = 20/s x .016s = 0.32
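Both examples can be worked with one helper that applies the M/M/1 formula and Little's Law. This is a sketch under the slides' assumptions (exponential arrivals and service, equilibrium); the function name `mm1_metrics` and the dictionary keys are illustrative.

```python
def mm1_metrics(r, t_ser):
    """Return utilization, times (seconds) and lengths for an M/M/1 queue."""
    u = r * t_ser                       # utilization
    if not 0.0 <= u < 1.0:
        raise ValueError("equilibrium requires 0 <= u < 1")
    t_q = t_ser * u / (1.0 - u)         # average time in queue
    t_sys = t_q + t_ser                 # average response time
    return {
        "u": u,
        "t_q": t_q,
        "t_sys": t_sys,
        "l_q": r * t_q,                 # Little's Law on the queue
        "l_sys": r * t_sys,             # Little's Law on the whole system
    }


first = mm1_metrics(r=10.0, t_ser=0.020)    # 10 I/Os per second, 20 ms service
second = mm1_metrics(r=20.0, t_ser=0.012)   # 20 I/Os per second, 12 ms service

for name, m in (("first", first), ("second", second)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Although the second workload issues twice as many requests, the faster 12 ms service keeps utilization at 0.24 and the response time below that of the first example.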

Page 7

Manufacturing Advantages of Disk Arrays

Disk Product Families
- Conventional: 4 disk designs (3.5", 5.25", 10", 14"), from low end to high end
- Disk Array: 1 disk design (3.5")

Replace Small # of Large Disks with Large # of Small Disks! (1988 Disks)

Disk               Data Capacity  Volume       Power  Data Rate  I/O Rate     MTTF      Cost
IBM 3390 (K)       20 GBytes      97 cu. ft.   3 KW   15 MB/s    600 I/Os/s   250 KHrs  $250K
IBM 3.5" 0061      320 MBytes     0.1 cu. ft.  11 W   1.5 MB/s   55 I/Os/s    50 KHrs   $2K
IBM 3.5" 0061 x70  23 GBytes      11 cu. ft.   1 KW   120 MB/s   3900 I/Os/s  ??? Hrs   $150K

Disk Arrays have potential for
- large data and I/O rates
- high MB per cu. ft., high MB per KW
- reliability?

Array Reliability

Reliability of N disks = Reliability of 1 Disk ÷ N

50,000 Hours ÷ 70 disks ≈ 700 hours
Disk system MTTF: Drops from 6 years to 1 month!

- Arrays (without redundancy) too unreliable to be useful!
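The reliability estimate above is a one-line division, shown here with unit conversions. The function name is an assumption; the model (array MTTF = single-disk MTTF ÷ N, with independent failures and no redundancy) is the one stated on the slide.

```python
HOURS_PER_YEAR = 24 * 365
HOURS_PER_MONTH = HOURS_PER_YEAR / 12


def array_mttf_hours(disk_mttf_hours, n_disks):
    """MTTF of an array with no redundancy: any single failure loses data."""
    return disk_mttf_hours / n_disks


single = 50_000.0
array = array_mttf_hours(single, 70)

print(f"one disk: {single / HOURS_PER_YEAR:.1f} years")
print(f"70 disks: {array:.0f} hours = {array / HOURS_PER_MONTH:.1f} months")
```

One disk lasts a little under six years on average, while the 70-disk array is expected to lose a disk in about a month, which is the motivation for adding redundancy.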

Hot spares support reconstruction in parallel with access: very high media availability can be achieved

Redundant Arrays of Disks

Files are "striped" across multiple spindles
Redundancy yields high data availability
- Disks will fail
- Contents reconstructed from data redundantly stored in the array
- Capacity penalty to store it
- Bandwidth penalty to update
Techniques:
- Mirroring/Shadowing (high capacity cost)
- Horizontal Hamming Codes (overkill)
- Parity & Reed-Solomon Codes
- Failure Prediction (no capacity overhead!): VaxSimPlus, technique is controversial

Page 8

Redundant Arrays of Disks RAID 1: Disk Mirroring/Shadowing

Each disk is fully duplicated onto its "shadow"
- Very high availability can be achieved
Bandwidth sacrifice on write:
- Logical write = two physical writes
Reads may be optimized
Most expensive solution: 100% capacity overhead
Targeted for high I/O rate, high availability environments

[Figure label: recovery group]

Redundant Arrays of Disks RAID 3: Parity Disk

[Figure: a logical record is striped into physical records across the data disks, with a parity disk P]

Parity computed across recovery group to protect against hard disk failures
- 33% capacity cost for parity in this configuration
- wider arrays reduce capacity costs, decrease expected availability, increase reconstruction time
Arms logically synchronized, spindles rotationally synchronized
- logically a single high capacity, high transfer rate disk
Targeted for high bandwidth applications: Scientific, Image Processing
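Parity protection can be illustrated with bytewise XOR. The sketch below assumes a recovery group of three data disks plus one parity disk (one parity disk per three data disks, i.e. a 33% capacity cost); the block contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of three data disks in one recovery group.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)              # stored on the parity disk

# Suppose disk 1 fails: rebuild it from the surviving disks plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)                         # contents of the failed disk
```

Reconstruction has to read every surviving disk in the group, which is why wider arrays lower the capacity cost but lengthen reconstruction time.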

Redundant Arrays of Disks RAID 5+: High I/O Rate Parity

A logical write becomes four physical I/Os
Independent writes possible because of interleaved parity
Reed-Solomon Codes ("Q") for protection during reconstruction

[Figure: stripe layout with interleaved parity. Columns are disks; each row is a stripe and each entry a stripe unit; logical disk addresses increase left to right, then top to bottom.]

D0   D1   D2   D3   P
D4   D5   D6   P    D7
D8   D9   P    D10  D11
D12  P    D13  D14  D15
P    D16  D17  D18  D19
D20  D21  D22  D23  P
...

Targeted for mixed applications

Problems of Disk Arrays: Small Writes

[Figure: small write to D0 in a stripe D0 D1 D2 D3 P. The old data D0 (1. Read) and the old parity P (2. Read) are XORed with the new data D0' to form the new parity P'; then D0' (3. Write) and P' (4. Write) are written.]

RAID-5: Small Write Algorithm
1 Logical Write = 2 Physical Reads + 2 Physical Writes
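The small write algorithm works because the new parity can be computed from the old data, the new data, and the old parity alone, without touching the other disks in the stripe. The sketch below is illustrative: the block contents and function names are assumptions, and the disks are modelled as in-memory byte strings.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data, old_parity, new_data):
    """New parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_data, new_data), old_parity)


# Hypothetical stripe of four data blocks and its parity block.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]
parity = reduce(xor_bytes, stripe)

new_d0 = b"\xff\x00"
old_d0 = stripe[0]                                    # 1. read old data
old_parity = parity                                   # 2. read old parity
new_parity = small_write_parity(old_d0, old_parity, new_d0)
stripe[0] = new_d0                                    # 3. write new data
parity = new_parity                                   # 4. write new parity

# The shortcut matches recomputing parity over the whole updated stripe.
print(parity == reduce(xor_bytes, stripe))
```

Only two disks are involved in the update, so writes to different stripes whose data and parity fall on different disks can proceed independently.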

Page 10

Interfacing to an Operating System

Stale data problem
- Inconsistent view of the data

DMA and Virtual Memory

Should DMA use virtual or physical address?

Chapter 6

Did not discuss “Examples of Benchmarks of Disk Performance” on page 516
- May be helpful in understanding context
- Recommended reading, though will not be in exam
Did not discuss Section 6.
- May be useful when studying Operating Systems
Homework #5 due Dec. 10
- Six questions in Section 6.7; note that answers are provided in text, but you still have to do them and SUBMIT them
- 6.6, 6.10, 6.16, 6.
- Optional: 6.5, 6.7, 6.

Summary: A Little Queuing Theory

Queuing models assume state of equilibrium: input rate = output rate
Notation:
- r: average number of arriving customers/second
- Tser: average time to service a customer (traditionally μ = 1/Tser)
- u: server utilization (0..1): u = r x Tser
- Tq: average time/customer in queue
- Tsys: average time/customer in system: Tsys = Tq + Tser
- Lq: average length of queue: Lq = r x Tq
- Lsys: average length of system: Lsys = r x Tsys
Little's Law: Length_system = rate x Time_system (Mean number of customers = arrival rate x mean time in system)


Page 11

Summary: Redundant Arrays of Disks (RAID) Techniques

Disk Mirroring, Shadowing (RAID 1)
- Each disk is fully duplicated onto its "shadow"
- Logical write = two physical writes
- 100% capacity overhead
Parity Data Bandwidth Array (RAID 3)
- Parity computed horizontally
- Logically a single high data bw disk
High I/O Rate Parity Array (RAID 5)
- Interleaved parity blocks
- Independent reads and writes
- Logical write = 2 reads + 2 writes
- Parity + Reed-Solomon codes
