Prepara i tuoi esami
Ottieni punti
Guide e consigli
Vendi su Docsity
Docsity AI

Prepara i tuoi esami

Studia grazie alle numerose risorse presenti su Docsity

Ottieni i punti per scaricare

Guadagna punti aiutando altri studenti oppure acquistali con un piano Premium

Guide e consigli

Vendi su Docsity

Docsity AI

Accedi Registrati

Prepara i tuoi esami

Studia grazie alle numerose risorse presenti su Docsity

Cerca documenti

Prepara i tuoi esami con i documenti condivisi da studenti come te su Docsity

Cerca la tua università

Trova i documenti specifici per gli esami della tua università

Video Corsi

Preparati con lezioni e prove svolte basate sui programmi universitari!

Quiz

Rispondi a reali domande d’esame e scopri la tua preparazione

Docsity AINEW

Riassumi i tuoi documenti, fagli domande, convertili in quiz e mappe concettuali

Maturità 2026

Studia con prove svolte, tesine e consigli utili

Esplora domande

Togliti ogni dubbio leggendo le risposte alle domande fatte da altri studenti come te

Argomenti di studio

Esplora i documenti più scaricati per gli argomenti di studio più popolari

Ottieni i punti per scaricare

Guadagna punti aiutando altri studenti oppure acquistali con un piano Premium

Condividi documenti

20 Punti

Per ogni documento caricato

Rispondi alle domande

5 Punti

per ogni risposta data (max 1 al giorno)

Tutti i modi per ottenere punti gratis

Ottieni punti subito

Scegli un piano Premium con tutti i punti di cui hai bisogno

Opportunità di studio

Scegli il tuo prossimo programma di studio

Entra in contatto con le migliori università del mondo e scegli il tuo percorso di studi

Classifica delle migliori università

Scopri le migliori università italiane secondo gli studenti

Community

Chiedi alla community

Chiedi aiuto alla community e sciogli i tuoi dubbi legati allo studio

Guide Gratuite

I nostri eBook salva studente

Scarica gratuitamente le nostre guide sulle tecniche di studio, metodi per gestire l'ansia, dritte per la tesi realizzati da tutor Docsity

Appunti di Computing Infrastructures, Dispense di Sistemi Integrati

Politecnico di Milano (POLIMI)Sistemi Integrati

Prof. Manuel Roveri

Appunti di Computing Infrastructures in lingua inglese. Argomenti trattati: - Storage: HDD, SSD, RAID, NAS, SAN - Virtual environment: Virtual Machines, Emulation - Performance

Tipologia: Dispense

2019/2020

In vendita dal 26/03/2020

Dr.Stoned 🇮🇹

(1)

13 documenti

1 / 68

Questa pagina non è visibile nell’anteprima

Non perderti parti importanti!

COMPUTING INFRASTUCTURES

Index

• Architectures

o Hard Disk (HDD), SSD (Solid State Disk), RAID

o NAS, SAN, DAS

o Dependability

o Data Centers

o Virtual Machines, Emulators

• Performance

o Forced Flow Law, Throughput, Transaction workloads, Balance flow

o Model with one job class

Files

Disks can be seen by an OS as a collection of data block that can be read or written independently.

To allow the ordering/management among them, each block is characterized by a unique numerical

address called LBA (Logical Block Address).

Typically, the OS groups blocks into clusters to simplify the access to the disk.

Clusters: minimal unit that an OS can read from or write to a disk.

Files: a better way to address data on the clusters than LBA numbers.

Clusters contains:

• File data: the actual content of the files

• Metadata: the information required to support the file system (additional data)

Metadata contains:

• File names

• Directory structures and symbolic links

• File size and file type

• Creation, modification, last access dates

• Security information

• Links to the LBA where the file content can be located on the disk

Scopri Dispense di Sistemi Integrati Politecnico di Milano (POLIMI)

Documenti correlati

Fondamenti di informatica grafica EDA

(1)

Computing Infrastructures (Dispensa del Corso)

Warehouse-Scale Computing: Architectures, Storage, and Virtualization - Prof. Roveri

Network Infrastructures Notes - Cuomo Sapienza

(1)

Appunti corso Computing Infrastructure - Manuel Roveri 2024/2025

Network Infrastructures Notes - Cuomo Sapienza Domande d'esame

(1)

Network Infrastructures Theory

FrontEnd - Interaction Design

appunti informatica grafica

Riassunto Teoria 'Fondamenti di Informatica' (POLIMI Roveri Manuel 2023)

(1)

Appunti Informatica Grafica

Appunti sul primo modulo di Network Infrastructures

Anteprima parziale del testo

Scarica Appunti di Computing Infrastructures e più Dispense in PDF di Sistemi Integrati solo su Docsity!

COMPUTING INFRASTUCTURES

Index

• Architectures

o Hard Disk (HDD), SSD (Solid State Disk), RAID

o NAS, SAN, DAS

o Dependability

o Data Centers

o Virtual Machines, Emulators

• Performance

o Forced Flow Law, Throughput, Transaction workloads, Balance flow

o Model with one job class

Files

Disks can be seen by an OS as a collection of data block that can be read or written independently.

To allow the ordering/management among them, each block is characterized by a unique numerical

address called LBA (Logical Block Address).

Typically, the OS groups blocks into clusters to simplify the access to the disk.

Clusters : minimal unit that an OS can read from or write to a disk.

Files: a better way to address data on the clusters than LBA numbers.

Clusters contains:

File data: the actual content of the files
Metadata: the information required to support the file system (additional data)

Metadata contains:

File names
Directory structures and symbolic links
File size and file type
Creation, modification, last access dates
Security information
Links to the LBA where the file content can be located on the disk

We can:

Accessing the meta-data to locate its blocks → Access the blocks to read its content
Accessing the meta-data to locate free space → Write the data in the assigned blocks

Since the file system can only access clusters, the real occupation of space on a disk for a file is always a

multiple of the cluster size.

Given:

s: file size
c: cluster size (cannot be 1 byte because the metadata)
a: actual size on disk

Then, we have:

Optimal: 𝑎 = 𝑠

Commonly: 𝑎 > 𝑠

And the quantity 𝑤 = 𝑎 − 𝑠 is wasted disk space due to the organization of the file into clusters

This waste of space is called internal fragmentation of files.

Ex.

Deleting a file requires:

Only to update the meta-data to say the blocks where the file was stored are no longer in use by

the O.S.

Disk service time and response time

Service time 𝑠 𝑑𝑖𝑠𝑘

𝑑𝑖𝑠𝑘

Seek time : head movement time, is a function of the number of cylinders traversed

[

]

Latency time : time to wait for the sector (

𝑟𝑜𝑢𝑛𝑑) [𝑚𝑠𝑒𝑐]

Transfer time : is a function of rotation speed, storing density, cylinder position [

𝑀𝐵

𝑠𝑒𝑐

]

Controller overhead : buffer management (data transfer) and interrupt sending time

Response time R: time spent in queue waiting for the resource (that is busy executing other requests) and

for the execution of the request.

Response time is function of the:

Number of requests in the waiting queue
Utilization level of the resource
Mean disk service time
Variance of disk service time (distribution)
Arrival rate and inter-arrival times distribution

Assuming the same mean values of the metrics, the response time increases as the variability increases.

Ex.2.

Read/write of a sector of 512 𝐵𝑦𝑡𝑒 = 0. 5 𝐾𝐵

The previous service times consider only the very pessimistic case where sectors are fragmented on the

disk in worst possible way.

This is true if the files are very small or the disk is very fragmented.

In many circumstances, this is not the case: files are larger than one block, and they are stored in a

contiguous way to decrease the needs of seek and rotational latency times.

Suppose we can measure the data locality of a disk as the percentage of blocks that do not need seek or

rotational latency to be found.

Ex.2.

Read/write of a sector of 512 𝐵𝑦𝑡𝑒 = 0. 5 𝐾𝐵

RAID disk

Redundant Arrays of Independent Disks

The data are striped across several disks accessed in parallel:

High data transfer rate : large data accesses (heavy I/O op.)
High I/O rate : small but frequent data accesses (light I/O op.)
Load balancing across the disks

Two orthogonal techniques:

Data striping : to improve performance
Redundancy : to improve reliability

I/O virtualization: data are distributed transparently over the disks

Data striping:

striping : data are written sequentially (a vector, a file, a table, ...) in units (stripe unit: bit, byte,

blocks) on multiple disks according to a cyclic algorithm

stripe unit : dimension of the unit of data that are written on a single
disk stripe width : number of disks considered by the striping algorithm (does not necessarily

coincide with the number of physical disks in the array – there can be “hot spares”)

Multiple independent I/O requests will be executed in parallel by several disks decreasing the

queue length (and time) of the disks.

Single multiple-block I/O requests will be executed by multiple disks in parallel increasing of the

transfer rate of a single request

RAID architectures

RAID 0: stripping only
RAID 1: mirroring only
RAID 2: bit interleaving
RAID 3: byte interleaving - redundancy
RAID 4: block interleaving – redundancy
RAID 5: block interleaving – redundancy
RAID 6: greater redundancy (2 failed disks are tolerated)
RAID 7: proprietary solutions

N.B. RAID levels can be combined

RAID level 0: striping, no redundancy

Data are written on a single logical disk and splitted in several blocks distributed across the disks according

to a striping algorithm.

Used where performance and capacity, rather than reliability, are the primary concerns, minimum two

drives required.

Pros: lowest cost because it does not employ redundancy and best write performance because it does not

need to update redundant data and it is parallelized

Cons: single disk failure will result in data loss

RAID level 1: mirroring

Whenever data is written to a disk it is also duplicated (mirrored) to a second disk, minimum 2 disk drives.

Pros: high reliability (when a disk fails the second copy is used), read of data (it can be retrieved from the

disk with the shorter queueing, seek and latency delays), fast writes (no error correcting code should be

computed, but still lower than RAID 0)

Cons: high costs (50% of the capacity is used)

In principle, a RAID 1 can mirror the content over more than one disk. This give resiliency to errors even if

more than one disk break. It allows also with a voting mechanism to identify errors not reported by the disk

controller.

NEVER USED IN PRACTICE , because the overhead and costs are too high.

Two organization RAID 0+1 or RAID 1+0 are possible.

RAID 0+

High performance and reliability
Minimum 4 drives
After the first failure the model becomes as a RAID 0
High overhead (disk duplication)

RAID 1+

High performance and higher reliability than R0+
Minimum 4 drives are required
Used in databases with very high workload (fast writes)

R01 VS R

The blocks are the same but are allocated in a different order. Some controllers 0+1 execute striping and

mirroring as a single operation.

o Two concurrent disk failures are not tolerated

o After a disk fails, any other disk failure concerning another stripe is a single point of failure

o All the disks of the array are needed to recover the lost information

o One disk failure for each RAID1 group is tolerated, but after this failure any other disk

becomes a single point of failure of the entire array

RAID level 2: bit interleaved parity (not used)

Assumption: the failed component is unknown.

Memory stile ECC, less cost than mirroring but still very high

Number of redundant disks: proportional to the log of the total number of disks.

Parity blocks are computed for several subset of overlapped data.

When a disk fails several of the parity blocks will have inconsistent values, the failed component is the one

held in common by each incorrect subset.

RAID level 3: bit or byte interleaved parity

RAID5 data loss → two disks failed concurrently

Ex.4 - RAID level 5 -

5 physical disks configured as a single logical disk RAID5.

Write op. that update only one block require 5 disk I/Os (read-modify write)

Example: write of a new data in Block

Read Block 2,3 and 4
Computation of new Parity Block 1-4 with new Block 1
Write Block 1 and Parity Block 1- 4

INEFFICIENT

5 disk I/O are required → BAD!

Disk 0 Disk 1 Disk 2 Disk 3 Disk 4

Stripe 1 Block 1 Block 2 Block 3 Block 4 Parity 1- 4

Stripe 2 Block 5 Block 6 Block 7 Parity 5- 8 Block 8

Stripe 3 Block 9 Block 10 Parity 9- 12 Block 11 Block 12

Write new data in Block 1

Read old Block 1, old Parity 1- 4
Compute new Parity 1-4 from old Block 1, new Block 1, old Parity 1- 4
Write new Block 1 and Parity Block 1- 4

Only 4 disk I/O are required.

Write : compute new Parity 1-4 = old Parity 1- 4 – Block 1 + new Block 1
Read : parity blocks are not accessed during a read op. However, they are read if a sector suffer

from a CRC (Cyclic Redundancy Check) error. In this case the sector with the error can be

reconstructed from the remaining blocks of the stripe and the parity block.

RAID level 3, 4 and 5 comparison

Raid level 3,4 and 5 have all the same characteristics in terms of overhead, failure protection and total

available capacity.

However, they require controllers gradually more sophisticated:

Raid 3 has a simple controller, but requires disks synchronization
Raid 4 has the bottleneck on the parity disk

Raid 5 needs to implement a more complex algorithm

Currently, with the reduction of the hardware costs, Raid 5 are exactly as expensive as Raid 3 and 4.

For this Raid 3 and 4 are no longer used.

Current controller has also reduced the impact of interleaving, making thus the choice between corse and

fine grained less influent.

RAID level 6

As the size of the disk arrays are increasing, multiple failures become likely: this is due to the longer

recovery times required to consider all the sectors of the disks.

The probability of encountering an URE (Unrecoverable – Uncorrectable Read Error) can be significant.

RAID6: more fault tolerance with respect RAID

Uses Solomon-Reeds codes with two redundancy schemes: (P+Q) distributed and independent 2

concurrent failures are tolerated high overhead for the computation of parities N + 2 disks required with 4

data disks, each write require 6 disk accesses due to the need to update both the P and Q parity blocks

(slow writes) adopted for very critical applications.

N.B. Not efficient when the number of disks is small

When the number of disks increases the loss of efficiency decreases but the probability of 2 concurrent

failures increases, so RAID 6 becomes mandatory.

Let as call 𝐷 𝑖

the i-th data block

, and P and Q the two parity values.

Parity P is computed exactly as in RAID 5:

𝑛− 1

𝑖

𝑛− 1

𝑖= 0

Parity Q is instead computed using a generator value g:

𝑛− 1

𝑖

𝑛− 1

𝑖= 0

Ex.

Disk 0 Disk 1 Disk 2 Disk 3 Disk 4

Stripe 1 Block 1 Block 2 Block 3 P 1- 3 Q 1- 3

Values 2 10 7 19 50

Let as call the 𝐷 𝑖

′ the old data in the i-th block

, and P’ and Q’ the two old paries.

The new parity P can be computed as:

𝑛𝑒𝑤

𝑜𝑙𝑑

𝑛𝑒𝑤

𝑖𝑖

𝑜𝑙𝑑

𝑗−𝑖

𝑗− 1

𝑖

𝑗− 1

The secondary disk, can then be recovered from the first by simply using parity P:

𝑗

∗

𝑖

Ex.5.

Disk 0 Disk 1 Disk 2 Disk 3 Disk 4

Stripe 1 Block 1 Block 2 Block 3 P 1- 3 Q 1- 3

Values 2 10 7 19 50

∗

2 − 1

− 1

𝑗− 1

DAS (Directed Attached Storage)

DAS is a storage system directly attached to a server or workstation.

The term is used to differentiate non-networked storage from SAN and NAS.

DAS has:

Limited scalability
Complex manageability
Limited performance
To read files in other machines, the “file sharing” protocol of the OS must be used

DAS does not necessary mean “internal drives”, all the external disks, connected with a point to point

protocol to a PC can be considered as DAS.

USB and eSATA drive are example of DAS.

The current state of the art connection system for DAS is called SAS – Serial Attached SCSI.

Both DAS and NAS can potentially increase availability of data by using RAID or clustering.

Comparing NAS with local DAS, the performance of NAS depends mainly on the speed of and congestion on

the network.

SAN (Storage Area Network)

SAN are remote storage units that are connected to PC using a specific networking technology.

NAS vs SAN

NAS provide both storage and a file system
This is often contrasted with SAN which provides only block-based storage and leaves file system

concerns on the “client” side

One way to loosely conceptualize the difference between a NAS and a SAN is that

o NAS appears to the client OS as file server

o A disk available through a SAN still appears to the client OS as a disk: it will be visible in the

disks and volumes management utilities, and available to be formatted with a file system

In conclusion:

NAS is used for low-volume access to a large amount of storage by many users
SAN is the solution for petabytes of storage and multiple , simultaneous access to files , such as

streaming audio/video

SANs have a special network devoted to the accesses to storage devices.

Two distinct networks (TCP/IP and dedicated network)

High scalability

Problems of TCP/IP based storage networks

TCP/IP stack over Ethernet adds several overheads:

Managing Ethernet frames
Size of Ethernet frames
Routing of IP packets

This overhead is in most cases not required to transfer disk commands.

Moreover the TCP/IP overhead is proportional to the quantity of data transferred.

A specific protocol can reduce the overhead and increase performance ( fibre channel , a high speed

network technology that is well established in the open system environment as the underlining

architecture of the SAN).

SAN + NAS head

SANs are used to actually store the data
SAN disks are accessed using standard Ethernet, using a special appliance called NAS head

Solid State Disk (SSD)

A solid state drive is a non-volatile storage device.

A controller is included in the device with one or more solid state memory components.

The device should use traditional hard disk (HDD) interfaces and form factors.

Data are stored in array of NAND cells:

SLC: cells store one bit
MLC: cells store more bit by multilevel voltage

o Twice the capacity of SLC using MLC

o Tolerance on adjacent level smaller than SLC → decreasing data reliability

NAND flash is organized into:

o Pages: smallest unit that can be read/written. In particular: a sub-unit of an erase block

consisting of number of bytes which can be read from and written to in single operations,

through the loading or unloading of a page buffer and the issuance of a program or read

command

o Blocks (or Erase Block): smallest unit that can be erased, typically consisting of multiple

pages

Ex.

Hypothetical SSD:

Page size: 4KB
Block size: 5 Pages
Drive size: 1 Block
Read Speed: 2 KB/s
Write Speed: 1 KB/s

Let’s write a 4KB text file to the brand new SSD

Now let’s write a 8KB pic file to the almost brand new SSD

Now let’s delete the txt file in the first page

Finally lets write a 12KB pic to the SSD.

How long should it take? 1KB/s write speed

Problems:

The OS is told there are 3 open pages on the SSD when there are only 2 available
Time for the SSD to do some fancy footwork to open up the space

So steps:

Read block into cache
Delete page from cache
Write new pic into cache
Delete the old block on SSD
Write cache to SSD

The OS only thought it was writing 12KB of data when in fact the SSD had to read 12KB and then write

20KB, the entire block.

Since the SSD is quite slow the operation should have taken 12sec but actually took 26sec, resulting in a

write speed of 0.46KB/s and not 1KB/s!

SSD drawbacks

They cost more than the conventional HDD
Flash memory can be written only a limited number of times (wear):

o Have a shorter lifetime

o Error correcting codes

o Over-Provisioning

Different read/write speed
Write performance degrades of one order of magnitude after the first writing
Often the controller become the real bottleneck to the transfer rate
SSD are not affected by data-locality and must not be defragmented

Definitions

Retention failure: a data error occurring when the SSD is read after an extended period of time

following the previous write

Endurance failure: a failure caused by endurance stressing
Endurance rating (TBW rating): the number of terabytes that may be written to the SSD while still

meeting the requirements

Write amplification factor (WAF): the data written to the NVM divided by data written by the host

to the SSD

Uncoverable Bit Error Ratio (UBER): a metric for the rate of occurrence of data errors, equal to the

number of data error per bits read

Trim command

As we have seen, deleting a file only updates the meta data:

The TRIM command allows an operating system to indicate blocks that are no longer used in a

solid-state drive, such as blocks released after deleting one or more files

Generally, in the delete operation performed by an OS the data blocks are marked as not in use
The TRIM allows the OS to pass this information to the SSD controller which otherwise would not

be able to know which blocks to delete