Cache Operation - Advanced Microprocessor Systems Design | ECE 463, Study notes of Electrical and Electronics Engineering

Material Type: Notes; Class: Advanced Microprocessor Systems Design; Subject: Electrical and Computer Engineering; University: North Carolina State University; Term: Unknown 1989;

Lecture 5
Cache Operation
ECE 463/521
Fall 2002
Edward F. Gehringer
Based on notes by Drs. Eric Rotenberg & Tom Conte of NCSU

Outline

  • Review of cache parameters
  • Example of operation of a direct-mapped cache.
  • Example of operation of a set-associative cache.
  • Simulating a generic cache.
  • Write-through vs. write-back caches.
  • Write-allocate vs. no write-allocate.
  • Victim caches.

Cache Parameters

  • SIZE = total amount of cache data storage, in bytes
  • BLOCKSIZE = total number of bytes in a single block
  • ASSOC = associativity, i.e., # of blocks in a set

Cache Parameters (cont.)

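• Equation for # of cache blocks in cache:

$$\#\,\text{cache blocks} = \frac{\text{SIZE}}{\text{BLOCKSIZE}}$$

• Equation for # of sets in cache:

$$\#\,\text{sets} = \frac{\#\,\text{cache blocks}}{\text{ASSOC}} = \frac{\text{SIZE}}{\text{BLOCKSIZE} \times \text{ASSOC}}$$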
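
The two equations above can be checked numerically. The following is a minimal sketch; the function name `cache_geometry` is made up for illustration, and it assumes all three parameters are powers of two so that the divisions are exact.

```python
def cache_geometry(size, blocksize, assoc):
    """Return (# cache blocks, # sets) from SIZE, BLOCKSIZE and ASSOC.

    Hypothetical helper: assumes the parameters divide evenly.
    """
    num_blocks = size // blocksize      # # cache blocks = SIZE / BLOCKSIZE
    num_sets = num_blocks // assoc      # # sets = # cache blocks / ASSOC
    return num_blocks, num_sets


# A 256-byte cache with 32-byte blocks, direct-mapped and then 2-way.
print(cache_geometry(256, 32, 1))  # (8, 8)
print(cache_geometry(256, 32, 2))  # (8, 4)
```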

Address Fields

Address layout (most significant bits first): tag | index | block offset

• Tag: compared to the tag(s) of the indexed cache line(s).
  • If it matches, block is there (hit).
  • If it doesn’t match, block is not there (miss).

• Index: used to look up a “set,” whose lines contain one or more memory blocks. (The # of blocks per set is the “associativity.”)

• Block offset: once a block is found, the offset selects a particular byte or word of data in the block.

Address Fields (cont.)

• Widths of address fields (# bits), assuming 32-bit addresses:

$$\#\,\text{index bits} = \log_2(\#\,\text{sets})$$

$$\#\,\text{block offset bits} = \log_2(\text{block size})$$

$$\#\,\text{tag bits} = 32 - \#\,\text{index bits} - \#\,\text{block offset bits}$$

Address layout, bit 31 down to bit 0: tag | index | block offset
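
The field widths can be computed directly from the cache parameters. This is a sketch only: `field_widths` is an illustrative name, the 32-bit address width is the slide's assumption, and `int.bit_length() - 1` is used as an exact log2 that is valid only for powers of two.

```python
def field_widths(size, blocksize, assoc, addr_bits=32):
    """Return (# tag bits, # index bits, # block offset bits).

    Hypothetical helper; assumes # sets and BLOCKSIZE are powers of two.
    """
    num_sets = size // (blocksize * assoc)
    index_bits = num_sets.bit_length() - 1     # log2(# sets)
    offset_bits = blocksize.bit_length() - 1   # log2(block size)
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


print(field_widths(256, 32, 1))  # (24, 3, 5)
print(field_widths(256, 32, 2))  # (25, 2, 5)
```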

Example (cont.)

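$$\#\,\text{sets} = \frac{\text{SIZE}}{\text{BLOCKSIZE} \times \text{ASSOC}} = 8$$

$$\#\,\text{index bits} = \log_2(\#\,\text{sets}) = \log_2(8) = 3$$

$$\#\,\text{block offset bits} = \log_2(\text{block size}) = \log_2(32\ \text{bytes}) = 5$$

$$\#\,\text{tag bits} = \text{total}\ \#\,\text{address bits} - \#\,\text{index bits} - \#\,\text{block offset bits} = 32 - 3 - 5 = 24$$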

Thus, the top 6 nibbles (24 bits) of address form the tag and lower 2 nibbles (8 bits) of address form the index and block offset fields.
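
Splitting an address into these fields is a matter of shifting and masking. A minimal sketch follows, with `split_address` as an illustrative name and the 3-bit index and 5-bit offset taken from the example above.

```python
def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, block offset) using shifts and masks."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (index_bits + offset_bits)
    return tag, index, offset


tag, index, offset = split_address(0xFF0040E8, index_bits=3, offset_bits=5)
print(hex(tag), index, format(offset, "05b"))  # 0xff0040 7 01000
```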

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0xFF0040    111 0 0000                     7                 Miss
0xBEEF005C      0xBEEF00    010 1 1100                     2                 Miss
0xFF0040E2      0xFF0040    111 0 0010                     7                 Hit
0xFF0040E8      0xFF0040    111 0 1000                     7                 Hit
0x00101078      0x001010    011 1 1000                     3                 Miss
0x002183E0      0x002183    111 0 0000                     7                 Miss/Replace
0x00101064      0x001010    011 0 0100                     3                 Hit
0x0012255C      0x001225    010 1 1100                     2                 Miss/Replace
0x00122544      0x001225    010 0 0100                     2                 Hit

[Figures: contents of the direct-mapped cache (tags and data) after each access. The index selects one line, and its tag is compared with the address tag (“Match?”). On a miss, the block is fetched from memory (slow).]
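
The direct-mapped example can be replayed in a few lines. This sketch assumes the accesses occur in the order listed in `TRACE` (starting from an empty cache), which is the order consistent with the Hit/Miss comments; the function name is illustrative and only tags are modelled, not data.

```python
def simulate_direct_mapped(trace, num_sets=8, blocksize=32):
    """Classify each access as Hit, Miss (empty line) or Miss/Replace."""
    tags = [None] * num_sets          # one tag per line; None means invalid
    results = []
    for addr in trace:
        block = addr // blocksize     # strip the block offset
        index = block % num_sets      # low bits of the block number
        tag = block // num_sets       # remaining high bits
        if tags[index] == tag:
            results.append("Hit")
        else:
            results.append("Miss" if tags[index] is None else "Miss/Replace")
            tags[index] = tag         # bring the block in
    return results


TRACE = [0xFF0040E0, 0xBEEF005C, 0xFF0040E2, 0xFF0040E8, 0x00101078,
         0x002183E0, 0x00101064, 0x0012255C, 0x00122544]
results = simulate_direct_mapped(TRACE)
print(results.count("Hit"), results.count("Miss/Replace"))  # 4 2
```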

Set-Associative Example

• Example: The processor accesses a 256-byte, 2-way set-associative cache with a block size of 32 bytes, using the following sequence of addresses.

  • Show contents of cache after each access.
  • Count # of hits and # of replacements.

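[Figure: 2-way set-associative cache. The address from the processor is split into tag, index, and block offset. The index selects a set, each way's tag is compared with the address tag (“Match?”), the two match signals are ORed to produce the hit signal and select a 32-byte block, and the offset selects certain bytes.]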

Example (cont.)

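$$\#\,\text{sets} = \frac{\text{SIZE}}{\text{BLOCKSIZE} \times \text{ASSOC}} = \frac{256}{32 \times 2} = 4$$

$$\#\,\text{index bits} = \log_2(\#\,\text{sets}) = \log_2(4) = 2$$

$$\#\,\text{block offset bits} = \log_2(\text{block size}) = \log_2(32\ \text{bytes}) = 5$$

$$\#\,\text{tag bits} = \text{total}\ \#\,\text{address bits} - \#\,\text{index bits} - \#\,\text{block offset bits} = 32 - 2 - 5 = 25$$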

Address (hex)   Tag (hex)    Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0x1FE0081    11 0 0000                      3                 Miss
0xBEEF005C      0x17DDE00    10 1 1100                      2                 Miss
0x00101078      0x0002020    11 1 1000                      3                 Miss
0xFF0040E2      0x1FE0081    11 0 0010                      3                 Hit
0x00101078      0x0002020    11 1 1000                      3                 Hit
0x002183E0      0x0004307    11 0 0000                      3                 Miss/Replace
0x00101064      0x0002020    11 0 0100                      3                 Hit

[Figures: tag contents of sets 0 to 3 after each access. The 25-bit address tag is compared with both tags in the indexed set (“Match?”). Data not shown for convenience.]
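
The set-associative example can be replayed the same way. This is a sketch under two assumptions that the slides do not state: the accesses occur in the order listed in `TRACE`, and the replacement policy is LRU. Each set is kept as a list of tags ordered from least to most recently used.

```python
def simulate_set_associative(trace, num_sets=4, assoc=2, blocksize=32):
    """Classify each access for an LRU set-associative cache (tags only)."""
    sets = [[] for _ in range(num_sets)]   # each list: LRU first, MRU last
    results = []
    for addr in trace:
        block = addr // blocksize
        index = block % num_sets
        tag = block // num_sets
        ways = sets[index]
        if tag in ways:
            ways.remove(tag)               # re-appended below as MRU
            results.append("Hit")
        elif len(ways) < assoc:
            results.append("Miss")         # a free way is available
        else:
            ways.pop(0)                    # evict the LRU block (assumed policy)
            results.append("Miss/Replace")
        ways.append(tag)
    return results


TRACE = [0xFF0040E0, 0xBEEF005C, 0x00101078, 0xFF0040E2,
         0x00101078, 0x002183E0, 0x00101064]
print(simulate_set_associative(TRACE))
```

Under these assumptions the run gives 3 hits and 1 replacement, matching the comments in the table.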

Generic Cache

  • Every cache is an n-way set-associative cache.

  1. Direct-mapped:
    • 1-way set-associative
  2. n-way set-associative:
    • n-way set-associative
  3. Fully-associative:
    • n-way set-associative, where there is only 1 set containing n blocks
    • # index bits = $\log_2(1) = 0$ (equation still works!)

Generic Cache (cont.)

  • The same equations hold for any cache type.
  • Equation for # of cache blocks in cache:

$$\#\,\text{cache blocks} = \frac{\text{SIZE}}{\text{BLOCKSIZE}}$$

  • Equation for # of sets in cache:

$$\#\,\text{sets} = \frac{\#\,\text{cache blocks}}{\text{ASSOC}} = \frac{\text{SIZE}}{\text{BLOCKSIZE} \times \text{ASSOC}}$$

  • Fully-associative: ASSOC = # cache blocks
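
Because the same equations cover every organization, one simulator parameterized by SIZE, BLOCKSIZE and ASSOC can model all three. The sketch below is illustrative only: the class name `GenericCache` is made up, LRU replacement is an assumption, and the trace is the nine-address sequence from the direct-mapped example.

```python
class GenericCache:
    """Tag-only model of an n-way set-associative cache (LRU assumed)."""

    def __init__(self, size, blocksize, assoc):
        self.blocksize = blocksize
        self.assoc = assoc
        self.num_sets = size // (blocksize * assoc)
        self.sets = [[] for _ in range(self.num_sets)]  # LRU first, MRU last
        self.hits = self.misses = self.replacements = 0

    def access(self, addr):
        block = addr // self.blocksize
        index = block % self.num_sets      # always 0 when there is 1 set
        tag = block // self.num_sets
        ways = self.sets[index]
        if tag in ways:
            ways.remove(tag)
            self.hits += 1
        else:
            self.misses += 1
            if len(ways) == self.assoc:    # set is full: evict the LRU block
                ways.pop(0)
                self.replacements += 1
        ways.append(tag)


TRACE = [0xFF0040E0, 0xBEEF005C, 0xFF0040E2, 0xFF0040E8, 0x00101078,
         0x002183E0, 0x00101064, 0x0012255C, 0x00122544]

# ASSOC = 1 is direct-mapped; ASSOC = 8 (= # cache blocks) is fully associative.
for assoc in (1, 2, 8):
    cache = GenericCache(256, 32, assoc)
    for addr in TRACE:
        cache.access(addr)
    print(assoc, cache.num_sets, cache.hits, cache.replacements)
```

On this trace the hit count is the same for all three settings, while the number of replacements falls from 2 (direct-mapped) to 1 (2-way) to 0 (fully associative).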

Handling Writes

• What happens when a write occurs?

• Two issues:

  1. Is just the cache updated with new data, or are lower levels of the memory hierarchy updated at the same time?
  2. Do we allocate a new block in the cache if the write misses?

The Write-Update Question

  • Write-through (WT) policy: Writes that go to the cache are also “written through” to the next level in the memory hierarchy.

[Figure: cache and next level in memory hierarchy.]

The Write-Update Question, cont.

  • Write-back (WB) policy: Writes go only to the cache, and are not (immediately) written through to the next level of the hierarchy.

[Figure: cache and next level in memory hierarchy.]

The Write-Update Question, cont.

• Write-back, cont.

  • What happens when a line previously written to needs to be replaced?

  1. We need to have a “dirty bit” (D) with each line in the cache, and set it when the line is written to.
  2. When a dirty block is replaced, we need to write the entire block back to the next level of memory (“write-back”).

The Write-Update Question, cont.

  • With the write-back policy, replacement of a dirty block triggers a writeback of the entire line.

[Figure: cache line with dirty bit (D) and next level in memory hierarchy. Replacement causes writeback.]
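
The difference between the two policies shows up in how many writes reach the next level. The following sketch counts them for a small direct-mapped cache; the function name, the `("R" | "W", address)` operation format, and the choice to allocate on every miss are assumptions made for illustration.

```python
def next_level_writes(ops, policy, num_sets=8, blocksize=32):
    """Count writes sent to the next level under "WT" or "WB".

    Sketch only: direct-mapped, tags only, allocates on every miss.
    """
    tags = [None] * num_sets
    dirty = [False] * num_sets            # the D bit of each line
    writes = 0
    for op, addr in ops:
        block = addr // blocksize
        index, tag = block % num_sets, block // num_sets
        if tags[index] != tag:            # miss: the resident line is replaced
            if policy == "WB" and dirty[index]:
                writes += 1               # write the entire dirty block back
            tags[index], dirty[index] = tag, False
        if op == "W":
            if policy == "WT":
                writes += 1               # every store is written through
            else:
                dirty[index] = True       # just mark the line dirty
    return writes


# Four stores to one block, then a read that conflicts with it.
ops = [("W", 0x00), ("W", 0x04), ("W", 0x08), ("W", 0x0C), ("R", 0x100)]
print(next_level_writes(ops, "WT"), next_level_writes(ops, "WB"))  # 4 1
```

Note that the counts are not directly comparable in bytes: each write-through transfer carries one store, while the single writeback transfers a whole block.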

The Write-Allocation Question

  • Write-Allocate (WA)
    • Bring the block into the cache if the write misses (handled just like a read miss).
    • Typically used with write-back policy: WBWA
  • Write-No-Allocate (NA)
    • Do not bring the block into the cache if the write misses.
    • Must be used in conjunction with write-through: WTNA.
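
The effect of the allocation policy is easiest to see on a write miss followed by a read of the same block. A minimal sketch, with an illustrative function name and operation format, modelling tags only in a direct-mapped cache:

```python
def run_accesses(ops, write_allocate, num_sets=8, blocksize=32):
    """Return "Hit"/"Miss" per access under write-allocate or no-allocate."""
    tags = [None] * num_sets
    outcomes = []
    for op, addr in ops:
        block = addr // blocksize
        index, tag = block % num_sets, block // num_sets
        hit = tags[index] == tag
        outcomes.append("Hit" if hit else "Miss")
        # Read misses always allocate; write misses allocate only under WA.
        if not hit and (op == "R" or write_allocate):
            tags[index] = tag
    return outcomes


ops = [("W", 0x40), ("R", 0x44)]   # both addresses fall in the same block
print(run_accesses(ops, write_allocate=True))   # ['Miss', 'Hit']
print(run_accesses(ops, write_allocate=False))  # ['Miss', 'Miss']
```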

Victim-Cache Example

• Suppose we have a 2-entry victim cache.

  • It initially holds blocks X and Y.
  • Y is the LRU block in the victim cache.

• The main cache is direct-mapped.

  • Blocks A and B map to the same set in the main cache.
  • Trace: A B A B A B …

Initial state: the main cache holds A; the victim cache holds X and Y (LRU).

Victim-Cache Example, cont.

State after the access to B: the main cache holds B; the victim cache holds X (LRU) and A.

  1. B misses in main cache and evicts A.
  2. A goes to victim cache and replaces Y (the previous LRU).
  3. X becomes LRU.

Victim-Cache Example, cont.

State after the next access to A: the L1 (main) cache holds A; the victim cache holds X (LRU) and B.

  1. A misses in L1 but hits in the victim cache, so A and B swap positions.
  2. A is moved from the victim cache to L1, and B (the victim) goes to the victim cache where A was located. (Note: We don’t replace the LRU block, X, in case of a victim-cache hit.)
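
The steps of this example can be traced with a small model. This sketch makes simplifying assumptions: only the single main-cache set shared by A and B is modelled, the victim cache is a list ordered from LRU to MRU, and the function and result labels are invented for illustration.

```python
def simulate_victim_cache(trace, main_block, victim_lru_order, capacity=2):
    """Trace one main-cache set plus a small victim cache.

    victim_lru_order lists the victim-cache blocks from LRU to MRU.
    """
    victim = list(victim_lru_order)
    results = []
    for block in trace:
        if block == main_block:
            results.append("main hit")
            continue
        if block in victim:
            victim.remove(block)           # swap: requested block leaves
            results.append("victim hit")
        else:
            results.append("miss")
            if main_block is not None and len(victim) >= capacity:
                victim.pop(0)              # replace the LRU victim entry
        if main_block is not None:
            victim.append(main_block)      # displaced block enters as MRU
        main_block = block
    return results, main_block, victim


results, main_block, victim = simulate_victim_cache(
    ["A", "B", "A", "B"], main_block="A", victim_lru_order=["Y", "X"])
print(results)             # ['main hit', 'miss', 'victim hit', 'victim hit']
print(main_block, victim)  # B ['X', 'A']
```

After the first miss, every further access in the A B A B pattern is served by a swap with the victim cache, and X stays in the LRU position throughout.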


Victim Cache – Why?

• Direct-mapped caches suffer badly from repeated conflicts.

  • Victim cache provides illusion of set associativity.
  • A poor man’s version of set associativity.
  • A victim cache does not have to be large to be effective; even a 4–8 entry victim cache will frequently remove > 50% of conflict misses.