Memory Hierarchies and Caches - Computer Systems - Lecture Slides, Slides of Computer Science

Punjab Engineering College Computer Science

These are the lecture slides of Computer Systems, which cover Writing to Cache, Memory Access, Simple Direct-Mapped Cache, Inconsistent Memory, Write-Through Caches, Write-Back Caches, Finishing Write Back, Write Misses, etc. Key points are: Memory Hierarchies and Caches, Memory Systems, Cache Introduction, Introducing Caches, Principle of Locality, Spatial Locality, Temporal Locality, Locality in Program, Kinds of Caches.

Typology: Slides

2012/2013

Uploaded on 03/27/2013

agarkar 🇮🇳


CSE 410 Computer Systems

Lecture 11 – Memory Hierarchies & Caches

Memory Systems and I/O

• We’ve already seen how to make a fast processor. How can we supply the CPU with enough data to keep it busy?
• Part of CS410 focuses on memory and input/output issues, which are frequently bottlenecks that limit the performance of a system.
• We’ll start off by looking at memory systems and turn to I/O.
– How caches can dramatically improve the speed of memory accesses.
– How virtual memory provides security and ease of programming.
– How processors, memory and peripheral devices can be connected.

[Figure: Processor, Memory and Input/Output]

Large and fast

• Today’s computers depend upon large and fast storage systems.
– Large storage capacities are needed for many database applications, scientific computations with large data sets, video and music, and so forth.
– Speed is important to keep up with our pipelined CPUs, which may access both an instruction and data in the same clock cycle. Things become even worse if we move to a superscalar CPU design.
• So far we’ve assumed our memories can keep up and our CPU can access memory twice in one cycle, but as we’ll see, that’s a simplification.

Small or slow

• Unfortunately there is a tradeoff between speed, cost and capacity.

Storage       Speed     Cost        Capacity
Static RAM    Fastest   Expensive   Smallest
Dynamic RAM   Slow      Cheap       Large
Hard disks    Slowest   Cheapest    Largest

• Fast memory is too expensive for most people to buy a lot of.
• But dynamic memory has a much longer delay than other functional units in a datapath. If every lw or sw accessed dynamic memory, we’d have to either increase the cycle time or stall frequently.

• Here are rough estimates of some current storage parameters.*

Storage       Delay               Cost/MB     Capacity
Static RAM    1-10 cycles                     128KB-2MB
Dynamic RAM   100-200 cycles      ~$0.        128MB-4GB
Hard disks    10,000,000 cycles   ~$0.0005    20GB-400GB

*These numbers are a couple of years old now, but the ratios are still about right. More recent numbers are in Sec. 5.1 of the book.

The principle of locality

• It’s usually difficult or impossible to figure out what data will be “most frequently accessed” before a program actually runs, which makes it hard to know what to store into the small, precious cache memory.
• But in practice, most programs exhibit locality, which the cache can take advantage of.
– The principle of temporal locality says that if a program accesses one memory address, there is a good chance that it will access the same address again.
– The principle of spatial locality says that if a program accesses one memory address, there is a good chance that it will also access other nearby addresses.

Temporal locality in programs

• The principle of temporal locality says that if a program accesses one memory address, there is a good chance that it will access the same address again.
• Loops are excellent examples of temporal locality in programs.
– The loop body will be executed many times.
– The computer will need to access those same few locations of the instruction memory repeatedly.
• For example:

    Loop:   lw    $t0, 0($s1)       # mnemonic lost in extraction; lw assumed
            add   $t0, $t0, $s2     # register truncated in source; $s2 assumed
            sw    $t0, 0($s1)       # mnemonic lost in extraction; sw assumed
            addi  $s1, $s1, -4      # immediate truncated in source; -4 assumed
            bne   $s1, $0, Loop

– Each instruction will be fetched over and over again, once on every loop iteration.

Spatial locality in programs

• The principle of spatial locality says that if a program accesses one memory address, there is a good chance that it will also access other nearby addresses.

            sub   $sp, $sp, 16      # operands truncated in source; $sp, 16 assumed
            sw    $ra, 0($sp)       # sw mnemonics lost in extraction; assumed
            sw    $s0, 4($sp)
            sw    $a0, 8($sp)
            sw    $a1, 12($sp)

• Nearly every program exhibits spatial locality, because instructions are usually executed in sequence: if we execute an instruction at memory location i, then we will probably also execute the next instruction, at the memory location that follows it.
• Code fragments such as loops exhibit both temporal and spatial locality.

Spatial locality in data

• Programs often access data that is stored contiguously.
– Arrays, like a in the first code fragment below, are stored in memory contiguously.
– The individual fields of a record or object like employee are also kept contiguously in memory.
• Can data have both spatial and temporal locality?

    sum = 0;
    for (i = 0; i < MAX; i++)
        sum = sum + a[i];

    employee.name = "Homer Simpson";
    employee.boss = "Mr. Burns";
    employee.age = 45;
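The contiguity described on this slide can be checked directly in C. The following is a minimal sketch and not part of the original slides: the helper names array_stride_bytes, fields_in_declaration_order and sum_array, the layout of struct employee and the value of MAX are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, offsetof */

#define MAX 8         /* assumed array length; the slides do not give one */

/* Hypothetical record mirroring the employee example on the slide. */
struct employee {
    const char *name;
    const char *boss;
    int age;
};

/* Byte distance between two neighbouring array elements. */
size_t array_stride_bytes(void)
{
    int a[MAX];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}

/* Returns 1 if the fields sit in memory in declaration order. */
int fields_in_declaration_order(void)
{
    return offsetof(struct employee, name) < offsetof(struct employee, boss)
        && offsetof(struct employee, boss) < offsetof(struct employee, age);
}

/* The summation loop from the slide, wrapped in a function. */
int sum_array(const int *a, int n)
{
    assert(a != NULL && n >= 0);
    int sum = 0;                  /* sum and i are reused on every iteration */
    for (int i = 0; i < n; i++)
        sum = sum + a[i];         /* a[i] walks through consecutive addresses */
    return sum;
}
```

One way to read the loop with the slide's closing question in mind: sum and i are touched on every iteration, while the elements of a are touched once each at neighbouring addresses.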

How caches take advantage of spatial locality

• When the CPU reads location i from main memory, a copy of that data is placed in the cache.
• But instead of just copying the contents of location i, we can copy several values into the cache at once, such as the four bytes from locations i through i + 3.
– If the CPU later does need to read from locations i + 1, i + 2 or i + 3, it can access that data from the cache and not the slower main memory.
– For example, instead of reading just one array element at a time, the cache might actually be loading four array elements at once.
• Again, the initial load incurs a performance penalty, but we’re gambling on spatial locality and the chance that the CPU will need the extra data.

[Figure: CPU, a little static RAM (cache), lots of dynamic RAM]
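The payoff of copying four bytes at a time can be illustrated with a toy model. This is a rough sketch under stated assumptions, not the design from the slides: the cache here remembers only a single block, blocks are assumed to start at addresses that are multiples of four, and the names one_block_cache, cache_access and count_misses_sequential are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_BYTES 4u   /* bytes copied per miss, as in the slide's example */

/* A toy cache that remembers only one block (an assumption for brevity). */
struct one_block_cache {
    int      valid;      /* 0 until the first block has been loaded */
    uint32_t base;       /* address of the first byte of the cached block */
};

/* Returns 1 on a hit; on a miss, loads the block holding addr and returns 0. */
int cache_access(struct one_block_cache *c, uint32_t addr)
{
    assert(c != 0);
    uint32_t base = addr - (addr % BLOCK_BYTES);  /* assumed: aligned blocks */
    if (c->valid && c->base == base)
        return 1;
    c->valid = 1;
    c->base  = base;
    return 0;
}

/* Counts the misses when reading n consecutive byte addresses from start. */
unsigned count_misses_sequential(uint32_t start, unsigned n)
{
    struct one_block_cache c = {0, 0};
    unsigned misses = 0;
    for (unsigned k = 0; k < n; k++)
        if (!cache_access(&c, start + k))
            misses++;
    return misses;
}
```

In this model, reading 16 consecutive bytes starting at address 0 misses only on the first byte of each four-byte block, so the other reads are served from the cache.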

Other kinds of caches

• The general idea behind caches is used in many other situations.
• Networks are probably the best example.
– Networks have relatively high “latency” and low “bandwidth,” so repeated data transfers are undesirable.
– Browsers like Netscape and Internet Explorer store your most recently accessed web pages on your hard disk.
– Administrators can set up a network-wide cache, and companies like Akamai also provide caching services.
• A few other examples:
– Many processors have a “translation lookaside buffer,” which is a cache dedicated to virtual memory support.
– Operating systems may store frequently-accessed disk blocks, like directories, in main memory... and that data may then in turn be stored in the CPU cache!

A simple cache design

• Caches are divided into blocks, which may be of various sizes.
– The number of blocks in a cache is usually a power of 2.
– For now we’ll say that each block contains one byte. This won’t take advantage of spatial locality, but we’ll do that next time.
• Here is an example cache with eight blocks, each holding one byte.

Block index   8-bit data
000
001
010
011
100
101
110
111

Four important questions

1. When we copy a block of data from main memory to the cache, where exactly should we put it?
2. How can we tell if a word is already in the cache, or if it has to be fetched from main memory first?
3. Eventually, the small cache memory might fill up. To load a new block from main RAM, we’d have to replace one of the existing blocks in the cache... which one?
4. How can write operations be handled by the memory system?

• Questions 1 and 2 are related: we have to know where the data is placed if we ever hope to find it again later!

It’s all divisions…

• One way to figure out which cache block a particular memory address should go to is to use the mod (remainder) operator.
• If the cache contains 2^k blocks, then the data at memory address i would go to cache block index i mod 2^k.
• For instance, with the four-block cache here, address 14 would map to cache block 2.

    14 mod 4 = 2

[Figure: memory addresses 0 through 15 mapped to cache block indices 0 through 3]
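The mod rule translates almost directly into C. The sketch below is illustrative only; the function name block_index_mod and the choice of 32-bit addresses are assumptions, not something the slides specify.

```c
#include <assert.h>
#include <stdint.h>

/* Block index for address addr in a cache with 2^k blocks: addr mod 2^k. */
uint32_t block_index_mod(uint32_t addr, unsigned k)
{
    assert(k < 32);                          /* keep the shift well defined */
    uint32_t num_blocks = (uint32_t)1 << k;  /* 2^k blocks in the cache */
    return addr % num_blocks;
}
```

With k = 2 this models the four-block cache from the slide, so block_index_mod(14, 2) gives the slide's result of 14 mod 4 = 2.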

…or least-significant bits

• An equivalent way to find the placement of a memory address in the cache is to look at the least significant k bits of the address.
• With our four-byte cache we would inspect the two least significant bits of our memory addresses.
• Again, you can see that address 14 (1110 in binary) maps to cache block 2 (10 in binary).
• Taking the least k bits of a binary value is the same as computing that value mod 2^k.

[Figure: memory addresses 0000 through 1111 mapped to cache block indices 00 through 11]
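The equivalence between the low k bits and mod 2^k can be spot-checked with a bit mask. This is a small sketch under assumptions: the names block_index_bits and bits_match_mod are hypothetical, and addresses are taken to be 32-bit unsigned values.

```c
#include <assert.h>
#include <stdint.h>

/* Block index taken from the k least significant bits of addr. */
uint32_t block_index_bits(uint32_t addr, unsigned k)
{
    assert(k < 32);                          /* keep the shift well defined */
    uint32_t mask = ((uint32_t)1 << k) - 1;  /* k ones in the low bits */
    return addr & mask;
}

/* Returns 1 if masking and mod agree for every address below limit. */
int bits_match_mod(uint32_t limit, unsigned k)
{
    assert(k < 32);
    for (uint32_t addr = 0; addr < limit; addr++)
        if (block_index_bits(addr, k) != addr % ((uint32_t)1 << k))
            return 0;
    return 1;
}
```

Calling bits_match_mod(16, 2) walks the sixteen addresses of the slide's example and compares both ways of computing the index; hardware favors the mask form because selecting low bits needs no division.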
