Cache Memory Management: How to Decide What to Remove when Full - Prof. Alan L. Sussman, Study notes of Computer Science

University of Maryland Computer Science

Prof. Alan L. Sussman

These notes cover cache memory management, focusing on how to decide what to remove when the cache is full. They describe cache organizations (fully associative, set associative, and direct mapped) and their replacement policies, and touch on cache hits and misses and the impact of cache misses on machine performance. From the computer science course CMSC 411, taught by Alan Sussman.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

CMSC 411 - A. Sussman (from D. O'Leary) 1

Computer Systems Architecture

CMSC 411

Unit 5 – Memory Hierarchy

Alan Sussman

October 7, 2004

Administrivia

• HW #3 due today

– questions

• Quiz 2 Tuesday, Oct. 12

– on Unit 3, basic pipelining

– practice quiz posted, answers posted later today

– questions?

• Read Chapter 5

– except 5.11-5.15

Last time

• Long instructions

– can cause structural hazards, and WAW hazards – why?

– detect hazards early, to allow precise exceptions

    • in ID pipeline stage, and delay EX cycle if problem detected

    • can delay WB, use history or future file, let OS deal with it, to enable precise exceptions

• MIPS R4000 pipeline design

– 8 stage pipeline – superpipelining

– extra stages come from multi-cycle cache accesses

– 2 cycle load delay and 3 cycle branch delay (1 delay slot, 2 cycle stall for taken branches)

– complex FP pipeline – 8 stages used in different combinations for different operations

Cache Memory

Issues to consider

• How big should the fastest memory (cache memory) be?

• How do we decide what to put in cache memory?

• If the cache is full, how do we decide what to remove?

• How do we find something in cache?

• How do we handle writes?

First, there is main memory

• Jargon:

– frame address – which page?

– block number – which cache block?

– contents – the data

Then add a cache

• Jargon: Each address of a memory location is partitioned into

– block address

    • tag

    • index

– block offset

Fig. 5.

How does cache memory work?

• The following slides discuss:

– what cache memory is

– three organizations for cache memory

    • direct mapped

    • set associative

    • fully associative

– how the bookkeeping is done

• Important note: All addresses shown are in octal. Addresses in the book are usually decimal.

What is cache memory?

Main memory first

Main memory is divided into (cache) blocks. Each block contains many words (16-64 common now).

Main memory

Blocks are grouped into frames (pages), 3 frames in this picture.

Main memory (cont.)

Blocks are addressed by their frame number, and their block number within the frame.

Cache memory

Cache has many, MANY fewer blocks than main memory, each with a block number, a memory address, data, a valid bit, and a dirty bit.

Set associative cache (cont.)

Note that the last two bits of the memory block's address always match the set number, so they do not need to be stored. This part of the address is called the index. The higher order bits are stored, and are called the tag. In these pictures, both index and tag are shown.

[Figure: cache with four sets, Set 0 through Set 3]
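The split of an address into tag, index, and block offset can be sketched in C. This is a minimal illustration, not the hardware's actual logic: the type `cache_addr`, the function `split_address`, and the field widths passed to it are all assumptions made for the example. C octal literals (leading `0`) match the octal addresses used on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a memory address as seen by a set associative cache. */
typedef struct {
    uint32_t tag;     /* high-order bits of the block address, stored in the cache */
    uint32_t index;   /* low-order bits of the block address: the set number */
    uint32_t offset;  /* byte position within the block */
} cache_addr;

/* Split addr for a cache with 2^offset_bits bytes per block and
   2^index_bits sets. Both widths are illustrative parameters. */
cache_addr split_address(uint32_t addr, unsigned offset_bits, unsigned index_bits)
{
    assert(offset_bits + index_bits < 32);
    cache_addr a;
    uint32_t block_address = addr >> offset_bits;
    a.offset = addr & ((1u << offset_bits) - 1u);
    a.index  = block_address & ((1u << index_bits) - 1u);
    a.tag    = block_address >> index_bits;
    return a;
}
```

For example, with 16-byte blocks and four sets, `split_address(0346, 4, 2)` yields block offset 6, index (set number) 2, and tag 3. Only the tag would need to be stored, since the index is implied by the set.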

Set associative cache replacement

• Which entry in the set to replace?

• Three common choices:

– Replace an eligible random block

– Replace the least recently used (LRU) block

    • can be hard to keep track of, so often only approximated

– Replace the oldest eligible block (First In, First Out, or FIFO)
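A rough sketch of exact LRU replacement within one set follows. It assumes a 4-way set and a full time stamp per block, which is one reason real hardware often only approximates LRU. The names `cache_set` and `access_set` and the constant `WAYS` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WAYS 4  /* illustrative associativity */

typedef struct {
    int      valid[WAYS];
    uint32_t tag[WAYS];
    uint64_t last_used[WAYS];  /* time stamp of the most recent access */
} cache_set;

/* Look up tag in the set at time `now`. Returns 1 on a hit, 0 on a miss.
   On a miss the block is placed in an invalid way if one exists,
   otherwise in the least recently used way. */
int access_set(cache_set *s, uint32_t tag, uint64_t now)
{
    assert(s != 0);
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (s->valid[w] && s->tag[w] == tag) {
            s->last_used[w] = now;   /* hit: refresh recency */
            return 1;
        }
    }
    for (int w = 0; w < WAYS; w++) {
        if (!s->valid[w]) { victim = w; break; }   /* free way: use it */
        if (s->last_used[w] < s->last_used[victim]) victim = w;
    }
    s->valid[victim] = 1;
    s->tag[victim] = tag;
    s->last_used[victim] = now;
    return 0;
}
```

FIFO would differ only in that a hit does not update the time stamp, so the stamp records load time rather than last use.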

Data cache replacement – Fig. 5.

SPEC2000, in misses per 1000 instructions, by set associativity

         Two-way                 Four-way                Eight-way
Size     LRU     Random  FIFO    LRU     Random  FIFO    LRU     Random  FIFO
16KB     114.1   117.3   115.5   111.7   115.1   113.3   109.0   111.8   110.
64KB     103.4   104.3   103.9   102.4   102.3   103.1   99.7    100.5   100.
256KB    92.2    92.1    92.5    92.1    92.1    92.5    92.1    92.1    92.

Computer Systems Architecture

CMSC 411

Unit 5 – Memory Hierarchy

Alan Sussman

October 12, 2004

Administrivia

• Quiz 2 today

– questions?

• HW for Unit 5 out soon

Last time

• Main memory

– frame address – page number

– block number – cache block within page

– contents – the data

• Cache memory

– block address

    • tag – high order bits for matching

    • index – which set, for set associative caches

– block offset – which byte within the block

– contains way fewer blocks than main memory

– for each cache block – block number, memory address of block it contains, data, valid bit, dirty bit

Last time (cont.)

• Direct mapped cache

– each memory block can only go into 1 cache block – use low order bits of block address

• Set associative cache

– multiple places for a memory block to go – the degree of set associativity is how many

– don't need to store the index (the set number), since it's known from the cache block number – rest of block address is the tag

– replacement policy determines which block to replace when new one is loaded (e.g., random, LRU, FIFO)

Fully associative cache

In fully associative cache, memory blocks may be stored anywhere.

So block 14 might be put in the first available block -- one with valid = 0.

Fully associative cache (cont.)

With this result.

Managing cache

Use direct mapped cache as an example.

After first read operation, cache memory looked like this.

[Figure: cache contents, with Valid and Dirty columns]

Managing cache (cont.)

If all other memory references involved block 14, no other blocks would need to be fetched from memory.

But suppose eventually need to fetch blocks 10, 31 and 66.

Need to fetch all three, because don’t have valid versions of them.

[Figure: cache contents, with Valid and Dirty columns]

Managing cache (cont.)

The result looks like this.

Now suppose write to block 66.

[Figure: cache contents, with Valid and Dirty columns]

Write through vs. write back

• Which is better?

– Write back gives faster writes, since don't have to wait for main memory

– Write back is very efficient if want to modify many bytes in a given block

– But write back can slow down some reads, since a cache miss might cause a write back

– In multiprocessors, write through might be the only correct solution. Why?

Cache summary

• Cache memory can be organized as direct mapped, set associative, or fully associative

• Can be write-through or write-back

• Extra bits such as valid and dirty bits help keep track of the status of the cache
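The role of the valid and dirty bits under the two write policies can be sketched with a toy direct mapped cache that only counts traffic to main memory. This is a simplified model under stated assumptions: 8 cache blocks, whole-block granularity, and write allocate on a write miss. None of the names (`dm_cache`, `cache_read`, `cache_write`) come from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BLOCKS 8  /* illustrative cache size, in blocks */

typedef struct {
    int      valid[NUM_BLOCKS];
    int      dirty[NUM_BLOCKS];
    uint32_t block_addr[NUM_BLOCKS];  /* memory block currently held */
    unsigned memory_writes;           /* blocks written to main memory */
    int      write_through;           /* 1 = write through, 0 = write back */
} dm_cache;

/* Bring memory block b into its (only possible) cache block,
   writing back the old contents first if they are dirty. */
static void load_block(dm_cache *c, uint32_t b)
{
    unsigned i = b % NUM_BLOCKS;
    if (c->valid[i] && c->block_addr[i] == b) return;    /* hit */
    if (c->valid[i] && c->dirty[i]) c->memory_writes++;  /* write back victim */
    c->valid[i] = 1;
    c->dirty[i] = 0;
    c->block_addr[i] = b;
}

void cache_read(dm_cache *c, uint32_t b)
{
    assert(c != 0);
    load_block(c, b);
}

void cache_write(dm_cache *c, uint32_t b)
{
    assert(c != 0);
    load_block(c, b);                  /* write allocate: an assumption of this sketch */
    if (c->write_through)
        c->memory_writes++;            /* memory updated immediately */
    else
        c->dirty[b % NUM_BLOCKS] = 1;  /* memory updated only on replacement */
}
```

Three writes to the same block cost three memory writes with write through, but only one with write back, and that one is paid later, when a read miss replaces the dirty block.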

Computer Systems Architecture

CMSC 411

Unit 5 – Memory Hierarchy

Alan Sussman

October 14, 2004

Administrivia

• HW for Unit 5 posted

– due date TBD

• Quizzes returned Tuesday

– answers already posted

• Grad school workshop Tuesday, Oct. 19, 5-7PM, CSIC 2117

– come ask questions to both faculty and current grad students!

Last time

• Fully associative cache

– any memory block can go into any cache block

• Write through cache

– memory gets updated immediately on write

– reads only cause block to get loaded on miss

• Write back cache

– writes only to cache

– cache and main memory can be inconsistent

– reads can cause updates from cache to memory, if block replaced is dirty

• Write through vs. write back

– name one good feature of each

How much do memory stalls slow down a machine?

• Suppose that on pipelined MIPS, each instruction takes, on average, 2 clock cycles, not counting cache faults/misses

• Suppose, on average, there are 1.33 memory references per instruction, memory access time is 50 cycles, and the miss rate is 2%

• Then each instruction takes, on average:

2 + (0 × .98) + (1.33 × .02 × 50) = 3.33 clock cycles
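The same arithmetic as a small C function, for trying other miss rates or penalties. The function and parameter names are illustrative only. The hit term contributes no extra cycles, so it is dropped.

```c
#include <assert.h>

/* Average cycles per instruction including memory stalls:
   base CPI plus (memory references per instruction) x (miss rate) x (miss penalty). */
double cpi_with_stalls(double base_cpi, double refs_per_instr,
                       double miss_rate, double miss_penalty)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return base_cpi + refs_per_instr * miss_rate * miss_penalty;
}
```

With the slide's numbers, `cpi_with_stalls(2.0, 1.33, 0.02, 50.0)` gives about 3.33 cycles, so memory stalls account for roughly 40% of the execution time.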

Memory stalls (cont.)

• To reduce the impact of cache misses, can reduce any of three parameters:

– main memory access time (miss penalty)

– miss rate

– cache access (hit) time

Reducing cache miss penalty

• 5 strategies:

– Give priority to read misses over write misses

– Don't wait for the whole block

– Use a nonblocking cache

– Multi-level cache

– Victim caches

• First 4 used in most desktop and server machines

Give priority to read misses over write misses

• But need to be careful

• Example:

– Suppose have a direct mapped cache, with room for 8 blocks of 16 bytes each

– Then M[512] and M[1024] both get stored in block 0, so can't be in cache at the same time

• Consider the following instructions:

SD R3, 512(R0)

LD R1, 1024(R0)

LD R2, 512(R0)
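The claim that M[512] and M[1024] collide can be checked with a one-line mapping function. This is a sketch only, and `direct_mapped_block` and its parameter names are made up for the example.

```c
#include <assert.h>

/* Cache block that holds byte address addr in a direct mapped cache.
   block_bytes and num_blocks are illustrative parameter names. */
unsigned direct_mapped_block(unsigned addr, unsigned block_bytes, unsigned num_blocks)
{
    assert(block_bytes > 0 && num_blocks > 0);
    return (addr / block_bytes) % num_blocks;
}
```

With 8 blocks of 16 bytes, address 512 is memory block 32 and address 1024 is memory block 64. Both are multiples of 8, so both map to cache block 0.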

Example (cont.)

• If the cache is write-through, the SD will cause memory location 512 to be changed

• The first LD will cause block 0 to be replaced, so that the contents of M[512] are no longer available in cache

– If the system is write-back, this is when memory location 512 will be changed

– Physically, the contents of block 0 will be put into temporary storage (a write buffer) while the new block is loaded, then the write back proceeds

• The second LD again replaces block 0, but this time no write-back is necessary

• But get a RAW hazard if don't ensure that the write-through or write-back completes before the second LD reads memory

Example (cont.)

• To avoid such RAW hazards:

– Can force the read miss to always wait until the write buffer is empty

– Or can force the hardware to check the write buffer before read and only wait if there is a potential hazard
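The second option, checking the write buffer on a read miss, might look like the sketch below. The buffer depth, the `write_buffer` layout, and the function name `read_miss_conflicts` are assumptions for illustration. Real hardware compares all entries in parallel rather than in a loop.

```c
#include <assert.h>
#include <stdint.h>

#define WB_ENTRIES 4  /* illustrative write buffer depth */

typedef struct {
    int      valid[WB_ENTRIES];
    uint32_t block_addr[WB_ENTRIES];  /* block waiting to be written to memory */
} write_buffer;

/* Returns 1 if a read miss on block_addr has a potential RAW hazard,
   i.e. the buffer still holds a pending write to that block. */
int read_miss_conflicts(const write_buffer *wb, uint32_t block_addr)
{
    assert(wb != 0);
    for (int i = 0; i < WB_ENTRIES; i++)
        if (wb->valid[i] && wb->block_addr[i] == block_addr)
            return 1;
    return 0;
}
```

In the running example, the read of address 1024 finds no match and can go ahead of the buffered write, while the read of address 512 matches and must wait (or take its data from the buffer).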

Another write buffer optimization

• Write buffer mechanics, with merging

– An entry may contain multiple words (maybe even a whole cache block)

– If there's an empty entry, the data and address are written to the buffer, and the CPU is done with the write

– If buffer contains other modified blocks, check to see if new address matches one already in the buffer – if so, combine the new data with that entry

– If buffer full and no address match, cache and CPU wait for an empty entry to appear (meaning some entry has been written to main memory)

– Merging improves memory efficiency, since multi-word writes usually faster than one word at a time
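The merging steps above can be sketched as follows. The sizes (4 entries of 4 words each) and all names are hypothetical, addresses are word addresses, and draining entries to memory is left out.

```c
#include <assert.h>
#include <stdint.h>

#define WB_ENTRIES      4   /* illustrative buffer depth */
#define WORDS_PER_ENTRY 4   /* illustrative: one entry covers 4 aligned words */

typedef struct {
    int      valid;
    uint32_t base;                         /* word address of the first word */
    int      word_valid[WORDS_PER_ENTRY];
    uint32_t data[WORDS_PER_ENTRY];
} wb_entry;

typedef struct { wb_entry e[WB_ENTRIES]; } merging_buffer;

/* Buffer a one-word write. Returns 1 if accepted (merged or placed in an
   empty entry), 0 if the buffer is full and the CPU would have to wait. */
int buffer_write(merging_buffer *b, uint32_t word_addr, uint32_t value)
{
    assert(b != 0);
    uint32_t base = word_addr - word_addr % WORDS_PER_ENTRY;
    unsigned slot = word_addr % WORDS_PER_ENTRY;
    int empty = -1;
    for (int i = 0; i < WB_ENTRIES; i++) {
        if (b->e[i].valid && b->e[i].base == base) {  /* address match: merge */
            b->e[i].word_valid[slot] = 1;
            b->e[i].data[slot] = value;
            return 1;
        }
        if (!b->e[i].valid && empty < 0) empty = i;
    }
    if (empty < 0) return 0;                          /* full, no match: wait */
    b->e[empty].valid = 1;
    b->e[empty].base = base;
    for (int w = 0; w < WORDS_PER_ENTRY; w++) b->e[empty].word_valid[w] = 0;
    b->e[empty].word_valid[slot] = 1;
    b->e[empty].data[slot] = value;
    return 1;
}

/* Number of entries currently in use. */
int entries_used(const merging_buffer *b)
{
    int n = 0;
    for (int i = 0; i < WB_ENTRIES; i++) n += b->e[i].valid;
    return n;
}
```

Four writes to consecutive words 100 through 103 occupy a single entry, which can then go to memory as one multi-word write. Without merging they would fill the whole buffer.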

Miss rate – Fig. 5.

[Figure: miss rate, SPEC2000, LRU replacement]

How to reduce the miss rate?

• Use larger blocks

• Use more associativity, to reduce conflict misses

• Victim cache

• Pseudo-associative caches (won’t talk about this)

• Prefetch (hardware controlled)

• Prefetch (compiler controlled)

• Compiler optimizations

Increasing block size

• Want the block size large so don't have to stop so often to load blocks

• Want the block size small so that blocks load quickly

Fig. 5.16 – SPEC

Increasing block size (cont.)

• So large block size reduces miss rates, but...

• Example:

– Suppose that loading a block takes 80 cycles (overhead) plus 2 clock cycles for each 16 bytes

– A block of size 64 bytes can be loaded in 80 + 2*64/16 cycles = 88 cycles (miss penalty)

– If the miss rate is 7%, then the average memory access time is 1 + .07 * 88 = 7.16 cycles
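The example's two formulas can be written as C functions so that other block sizes are easy to try. The function and parameter names are illustrative. The hit time of 1 cycle is taken from the leading 1 in the slide's formula.

```c
#include <assert.h>

/* Cycles to load one block: fixed overhead plus a transfer cost per chunk. */
unsigned miss_penalty_cycles(unsigned overhead, unsigned cycles_per_chunk,
                             unsigned chunk_bytes, unsigned block_bytes)
{
    assert(chunk_bytes > 0 && block_bytes % chunk_bytes == 0);
    return overhead + cycles_per_chunk * (block_bytes / chunk_bytes);
}

/* Average memory access time = hit time + miss rate x miss penalty. */
double avg_access_time(double hit_time, double miss_rate, double miss_penalty)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_time + miss_rate * miss_penalty;
}
```

`miss_penalty_cycles(80, 2, 16, 64)` gives 88 cycles, and `avg_access_time(1.0, 0.07, 88.0)` gives about 7.16 cycles. A larger block raises the penalty, so it pays off only if the miss rate drops enough to compensate.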

Memory Access Times – Fig. 5.

[Table: average memory access time by block size and miss penalty, for cache sizes 4K, 16K, 64K, and 256K; SPEC92 benchmarks on DEC workstation]

Computer Systems Architecture

CMSC 411

Unit 5 – Memory Hierarchy

Alan Sussman

October 19, 2004

Administrivia

• HW for Unit 5 posted

– due date TBD

– turn it in!

• Quizzes returned today

– Average: 62

– Median: 65, 25%: 51, 75%: 73

– questions

• Grad school workshop today, 5-7PM, CSIC 2117

Last time

• Reducing cache miss penalty

– priority to read misses over write misses

    • be careful to use contents of write buffer

    • can merge entries into write buffer

– don't wait for whole block

    • early restart or critical word first

– use a non-blocking cache

    • works best for more complex pipelines than we've seen so far

– multi-level cache

    • to capture misses in lower level caches

    • lowers effective miss penalty

– victim cache

    • to reduce conflict misses

Last time (cont.)

• Reducing miss rate - compulsory, capacity, conflict misses

– use larger blocks

    • what's the cost of larger blocks?

– use higher associativity

– victim cache

– prefetch – hardware or software/compiler

– compiler optimizations

Higher associativity

• A direct-mapped cache of size N has about the same miss rate as a 2-way set-associative cache of size N/2

– 2:1 cache rule of thumb (seems to work up to 128KB caches)

• But associative cache is slower than direct-mapped, so the clock may need to run slower

• Example:

– Suppose that the clock for 2-way memory needs to run at a factor of 1.1 times the clock for 1-way memory

    • the hit time increases with higher associativity

– Then the average memory access time for 2-way is 1.10 + miss rate × 50 (assuming that the miss penalty is 50)
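To see when the slower clock is worth it, the two access times can be compared directly. The sketch below assumes a 1-cycle hit time for the direct mapped cache and the 50-cycle miss penalty from the example. The function names, and the miss rates used in the note after the code, are hypothetical.

```c
#include <assert.h>

/* Average memory access time in units of the one-way clock cycle.
   clock_factor is 1.00 for direct mapped and 1.10 for two-way in the example. */
double amat(double clock_factor, double miss_rate, double miss_penalty)
{
    assert(clock_factor > 0.0);
    return clock_factor + miss_rate * miss_penalty;
}

/* Nonzero if the two-way cache has the lower average access time. */
int two_way_wins(double miss_rate_1way, double miss_rate_2way)
{
    return amat(1.10, miss_rate_2way, 50.0) < amat(1.00, miss_rate_1way, 50.0);
}
```

Under these assumptions two-way wins only if it lowers the miss rate by more than 0.10/50 = 0.002, that is, 0.2 percentage points. Going from 5% to 4% qualifies, while going from 1.0% to 0.9% does not.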

Memory access time – Fig. 5.

[Table: average memory access time by cache size (KB) and associativity: one-way, two-way, four-way, eight-way]

Pseudo-associative cache

• Uses the technique of chaining, with a series of cache locations to check if the block is not found in the first location

– e.g., invert most significant bit of index part of address (as if it were a set associative cache)

• The idea:

– Check the direct mapped address

– Until the block is found or the chain of addresses ends, check the next alternate address

– If the block has not been found, bring it in from memory

• Three different delays generated, depending on which step succeeds
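A minimal sketch of the lookup, assuming a chain of length two (the direct mapped location, then the location with the most significant index bit inverted) and an 8-block cache. The names `pa_cache` and `pa_lookup` are hypothetical, and the whole block address is stored in place of a tag to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

#define PA_INDEX_BITS 3                     /* illustrative: 8 cache blocks */
#define PA_BLOCKS     (1u << PA_INDEX_BITS)

typedef struct {
    int      valid[PA_BLOCKS];
    uint32_t block_addr[PA_BLOCKS];  /* full block address, standing in for the tag */
} pa_cache;

/* Returns 1 for a hit in the direct mapped location, 2 for a hit in the
   alternate location, and 0 for a miss (block must come from memory). */
int pa_lookup(const pa_cache *c, uint32_t block_addr)
{
    assert(c != 0);
    uint32_t first = block_addr & (PA_BLOCKS - 1u);
    uint32_t alt   = first ^ (1u << (PA_INDEX_BITS - 1));  /* invert MSB of index */
    if (c->valid[first] && c->block_addr[first] == block_addr) return 1;
    if (c->valid[alt]   && c->block_addr[alt]   == block_addr) return 2;
    return 0;
}
```

The three return values correspond to the three different delays: a fast hit, a slower hit in the alternate location, and a miss.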

Merging arrays (cont.)

Means that at least 2 blocks must be in cache to begin using the arrays.

val[0] val[1] val[2] val[3] . . .

val[64] val[65] val[66] val[67] . . .

val[size-1] key[0] key[1] key[2] key[3] . . .

Merging arrays (cont.)

More efficient, especially if more than two arrays are coupled this way, to store them together.

val[0] key[0] val[1] key[1] . . .

val[32] key[32] val[33] key[33] . . .

Merging arrays (cont.)

Can do this by making the two arrays part of a structure.

val[0] key[0] val[1] key[1] . . .

val[32] key[32] val[33] key[33] . . .
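In C, merging the two arrays into an array of structures might look like the sketch below. The struct name `record` and the function `lookup` are hypothetical. The point is only that `val` and `key` for the same element end up next to each other in memory.

```c
#include <assert.h>
#include <stddef.h>

/* After merging: one array of structures replaces the separate
   val[] and key[] arrays, so val and key for element i are adjacent. */
struct record {
    int val;
    int key;
};

/* Find key in an array of n records and return its val, or -1 if absent. */
int lookup(const struct record *r, size_t n, int key)
{
    assert(r != 0);
    for (size_t i = 0; i < n; i++)
        if (r[i].key == key)
            return r[i].val;  /* val sits right next to the key just read */
    return -1;
}
```

With separate arrays, reading `key[i]` and then `val[i]` touches two different cache blocks. With the merged layout, the second access usually hits in the block already loaded.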

Technique 2: interchanging loops

Example:

For j=0, 1, …, 99
  For i=0, 1, …, 4999
    x[i][j] = 2 * x[i][j];
  End for;
End for;

Interchanging loops (cont.)

Notice that accesses are by columns, so the elements are spaced 100 words apart.

Blocks are bouncing in and out of cache.

For j=0, 1, …, 99
  For i=0, 1, …, 4999
    x[i][j] = 2 * x[i][j];
  End for;
End for;

Interchanging loops (cont.)

First color the loops:

For j=0, 1, …, 99
  For i=0, 1, …, 4999
    x[i][j] = 2 * x[i][j];
  End for;
End for;

Interchanging loops (cont.)

Notice that the program has the same effect if the two loops are interchanged:

For i=0, 1, …, 4999
  For j=0, 1, …, 99
    x[i][j] = 2 * x[i][j];
  End for;
End for;

Interchanging loops (cont.)

But with this ordering, use every element in a cache block before needing another block!

For i=0, 1, …, 4999
  For j=0, 1, …, 99
    x[i][j] = 2 * x[i][j];
  End for;
End for;
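Both loop orders written out as C functions, as a sketch. C stores two-dimensional arrays by rows, so the row length `COLS` is the stride between `x[i][j]` and `x[i+1][j]`. The function names and the `rows` parameter are illustrative.

```c
#include <assert.h>

#define COLS 100  /* row length, as in the example */

/* Before: the inner loop walks down a column, touching elements
   COLS words apart, so nearly every access lands in a different block. */
void scale_by_columns(int (*x)[COLS], int rows)
{
    assert(x != 0 && rows >= 0);
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < rows; i++)
            x[i][j] = 2 * x[i][j];
}

/* After interchange: the inner loop walks along a row, touching
   consecutive words, so each block is used up before the next is needed. */
void scale_by_rows(int (*x)[COLS], int rows)
{
    assert(x != 0 && rows >= 0);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < COLS; j++)
            x[i][j] = 2 * x[i][j];
}
```

Both functions compute the same result. Only the order of memory accesses differs, which is why the interchange is safe here.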

Technique 3: loop fusion

Example:

For i=0, 1, …, 4999
  For j=0, 1, …, 99
    x[i][j] = 2 * x[i][j];
  End for;
End for;

For i=0, 1, …, 4999
  For j=0, 1, …, 99
    y[i][j] = x[i][j] * a[i][j];
  End for;
End for;

Loop fusion (cont.)

Note that the loop control is the same for both sets of loops.

Loop fusion (cont.)

And note that the array x is used in each, so probably needs to be loaded into cache twice, which wastes cycles.

Loop fusion (cont.)

So combine, or fuse, the loops to improve efficiency.

For i=0, 1, …, 4999
  For j=0, 1, …, 99
    x[i][j] = 2 * x[i][j];
    y[i][j] = x[i][j] * a[i][j];
  End for;
End for;
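The separate and fused versions as C functions, as a sketch with illustrative names. Fusion is legal here because each `y[i][j]` depends only on the `x[i][j]` computed in the same iteration.

```c
#include <assert.h>

#define COLS 100  /* row length, as in the example */

/* Before: x is traversed twice, once in each loop nest. */
void separate_loops(int (*x)[COLS], int (*y)[COLS], int (*a)[COLS], int rows)
{
    assert(x != 0 && y != 0 && a != 0);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < COLS; j++)
            x[i][j] = 2 * x[i][j];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < COLS; j++)
            y[i][j] = x[i][j] * a[i][j];
}

/* After fusion: x[i][j] is reused while it is still in cache
   (and probably still in a register). */
void fused_loops(int (*x)[COLS], int (*y)[COLS], int (*a)[COLS], int rows)
{
    assert(x != 0 && y != 0 && a != 0);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < COLS; j++) {
            x[i][j] = 2 * x[i][j];
            y[i][j] = x[i][j] * a[i][j];
        }
}
```

With 5000 rows, x occupies about 2 MB of 4-byte integers, more than a typical cache of the period could hold, so the second loop nest in the unfused version would miss on x all over again.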

Blocking access to arrays (cont.)

[Figure: matrix product A B = C]

Blocking access to arrays (cont.)

Instead, order the computation using rectangular blocks of A and B.

[Figure: matrix product A B = C, with a partial answer in C]

Blocking access to arrays (cont.)

If the block of A has k rows, then only need to load B m/k times.

[Figure: matrix product A B = C, with a partial answer in C]

Blocking access to arrays (cont.)

Improves temporal locality

/* Before */
for (i=0; i<N; i++)
  for (j=0; j<N; j++) {
    r=0;
    for (k=0; k<N; k++)
      r=r+y[i][k]*z[k][j];
    x[i][j]=r;
  }

/* After: B is the blocking factor; x must be zeroed beforehand */
for (jj=0; jj<N; jj=jj+B)
  for (kk=0; kk<N; kk=kk+B)
    for (i=0; i<N; i++)
      for (j=jj; j<min(jj+B,N); j++) {
        r=0;
        for (k=kk; k<min(kk+B,N); k++)
          r=r+y[i][k]*z[k][j];
        x[i][j]=x[i][j]+r;
      }

Computer Systems Architecture

CMSC 411

Unit 5 – Memory Hierarchy

Alan Sussman

October 21, 2004

Administrivia

• Quiz 2 questions?

• HW for Unit 5

– questions?

– due date posted by tomorrow

• Midterm

– will be rescheduled to later by tomorrow

Last time

• Reducing cache miss rate

– larger blocks

– higher associativity

    • but can make cache hits slower

– hardware prefetch

    • works well for sequential accesses

    • cost?

– software/compiler prefetch

    • instruction that moves data into cache, w/o causing exceptions or pipeline bubbles

– compiler optimizations

Last time (cont.)

• Compiler optimizations

– merging arrays

    • separate arrays to array of structs ordering, improve spatial locality

– loop interchange

    • to access data in the order it is stored, improve spatial locality

– loop fusion

    • to improve temporal locality

– blocking

    • improves both temporal and spatial locality

Reducing the time for cache hits

• K.I.S.S.

• Use virtual addresses rather than physical addresses in the cache.

• Pipeline cache accesses

• Trace caches (won’t talk about these)

K.I.S.S.

• Cache should be small enough to fit on the processor chip

• Direct mapped is faster than associative, especially on read

– overlap tag check with transmitting data

• For current processors, small L1 caches to

keep fast clock cycle time, hide L1 misses

with dynamic scheduling, and use L

caches to avoid main memory accesses

Main memory management

• Questions:

– How big should main memory be?

– How to handle reads and writes?

– How to find something in main memory?

– How to decide what to put in main memory?

– If main memory is full, how to decide what to replace?

The scale of things

• Typically (as of 2000):

– Registers: < 1 KB, access time .25 - .5 ns

– Cache: < 8 MB, access time .5 - 25 ns

– Main Memory: < 4 GB, access time 150 - 250 ns

– Disk Storage: > 30 GB, access time 5,000,000 ns (5ms)

• Memory Technology: CMOS (Complementary Metal Oxide Semiconductor)

– uses a combination of n- and p-doped semiconductor material to achieve low power dissipation.
