Cache Operation - Advanced Microprocessor Systems Design | ECE 463
Lecture 5
Cache Operation
ECE 463/
Fall 2002
Edward F. Gehringer
Based on notes by Drs. Eric Rotenberg & Tom Conte of NCSU
Outline
- Review of cache parameters
- Example of operation of a direct-mapped
cache.
- Example of operation of a set-associative
cache.
- Simulating a generic cache.
- Write-through vs. write-back caches.
- Write-allocate vs. no write-allocate.
- Victim caches.
Cache Parameters
• SIZE = total amount of cache data storage,
in bytes
• BLOCKSIZE = total number of bytes in a
single block
• ASSOC = associativity, i.e., # of blocks in a
set
Cache Parameters (cont.)
• Equation for # of cache blocks in cache:
  # cache blocks = SIZE / BLOCKSIZE
• Equation for # of sets in cache:
  # sets = # cache blocks / ASSOC = SIZE / (BLOCKSIZE × ASSOC)
Address Fields
Address layout: tag | index | block offset
• Tag: the tag field is compared to the tag(s) of the indexed cache line(s).
- If it matches, block is there (hit).
- If it doesn’t match, block is not there (miss).
• Index: used to look up a “set,” whose lines contain one or more memory blocks. (The # of blocks per set is the “associativity.”)
• Block offset: once a block is found, the offset selects a particular byte or word of data in the block.
Address Fields (cont.)
• Widths of address fields (# bits)
# index bits = log2(# sets)
# block offset bits = log2(block size)
# tag bits = 32 – # index bits – # block offset bits
Assuming 32-bit addresses (bits 31 down to 0), laid out as: tag | index | block offset
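The field-width equations can be sketched in a few lines of Python. This is only an illustration: the function name `split_address` and its parameters are hypothetical, and it assumes 32-bit addresses and power-of-two values for the number of sets and the block size.

```python
def split_address(addr, num_sets, block_size, addr_bits=32):
    """Return (tag, index, block offset, # tag bits) for one address.

    Assumes num_sets and block_size are powers of two.
    """
    offset_bits = block_size.bit_length() - 1   # log2(block size)
    index_bits = num_sets.bit_length() - 1      # log2(# sets)
    tag_bits = addr_bits - index_bits - offset_bits

    offset = addr & (block_size - 1)                 # low-order bits
    index = (addr >> offset_bits) & (num_sets - 1)   # middle bits
    tag = addr >> (offset_bits + index_bits)         # remaining high bits
    return tag, index, offset, tag_bits


# 8 sets of 32-byte blocks: 3 index bits, 5 offset bits, 24 tag bits.
tag, index, offset, tag_bits = split_address(0xFF0040E8, num_sets=8, block_size=32)
print(hex(tag), index, offset, tag_bits)   # 0xff0040 7 8 24
```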
Example (cont.)
# sets = SIZE / (BLOCKSIZE × ASSOC) = 256 / (32 × 1) = 8
# index bits = log2(# sets) = log2(8) = 3
# block offset bits = log2(block size) = log2(32 bytes) = 5
# tag bits = total # address bits – # index bits – # block offset bits = 32 bits – 3 bits – 5 bits = 24
Thus, the top 6 nibbles (24 bits) of address form the tag and lower 2 nibbles (8 bits) of address form the index and block offset fields.
Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0xFF0040    111 0 0000                     7                 Miss
0xBEEF005C      0xBEEF00    010 1 1100                     2                 Miss
0xFF0040E2      0xFF0040    111 0 0010                     7                 Hit
0xFF0040E8      0xFF0040    111 0 1000                     7                 Hit
0x00101078      0x001010    011 1 1000                     3                 Miss
0x002183E0      0x002183    111 0 0000                     7                 Miss/Replace
0x00101064      0x001010    011 0 0100                     3                 Hit
0x0012255C      0x001225    010 1 1100                     2                 Miss/Replace
0x00122544      0x001225    010 0 0100                     2                 Hit
[Figures: direct-mapped cache lookup, one figure per access. The address is split into tag, index, and block offset; the tag stored at the indexed entry of the Tags array is compared with the address tag (“Match?”). On a miss, the block is fetched from memory (slow). Entries shown include tag FF0040 at index 7 and tag BEEF00 at index 2.]
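The direct-mapped example can be replayed with a short simulation. This is a sketch under stated assumptions, not the course simulator: the function name is hypothetical, the 256-byte size is inferred from 8 sets of 32-byte blocks, and the access order in `trace` is assumed to be the one in which each miss precedes the hits to the same block.

```python
def simulate_direct_mapped(addresses, size=256, block_size=32):
    """Return the outcome of each access to a direct-mapped cache."""
    num_sets = size // block_size      # ASSOC = 1, so one block per set
    tags = [None] * num_sets           # stored tag per set; None = empty
    outcomes = []
    for addr in addresses:
        block = addr // block_size     # drop the block offset bits
        index = block % num_sets       # low bits of the block number
        tag = block // num_sets        # remaining high bits
        if tags[index] == tag:
            outcomes.append("Hit")
        elif tags[index] is None:
            outcomes.append("Miss")            # cold miss, nothing evicted
        else:
            outcomes.append("Miss/Replace")    # a valid block is evicted
        tags[index] = tag
    return outcomes


# Assumed access order for the example addresses.
trace = [0xFF0040E0, 0xBEEF005C, 0xFF0040E2, 0xFF0040E8, 0x00101078,
         0x002183E0, 0x00101064, 0x0012255C, 0x00122544]
for addr, outcome in zip(trace, simulate_direct_mapped(trace)):
    print(f"0x{addr:08X}  {outcome}")
```

With this order the run gives 4 hits and 2 replacements.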
Set-Associative Example
• Example: Processor accesses a 256-byte, 2-way set-associative cache with a block size of 32 bytes, using the following sequence of addresses.
- Show contents of cache after each access.
- Count # of hits and # of replacements.
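[Figure: 2-way set-associative lookup. The address (from processor) is split into tag, index, and block offset. The index selects a set in the Tags and Data arrays; each of the two stored tags is compared with the address tag (“Match?”), and the two results are ORed to produce the hit signal. The matching way selects a 32-byte block, and the block offset selects certain bytes within it.]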
Example (cont.)
# sets = SIZE / (BLOCKSIZE × ASSOC) = 256 / (32 × 2) = 4
# index bits = log2(# sets) = log2(4) = 2
# block offset bits = log2(block size) = log2(32 bytes) = 5
# tag bits = total # address bits – # index bits – # block offset bits = 32 bits – 2 bits – 5 bits = 25
Address (hex)   Tag (hex)    Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0x1FE0081    11 0 0000                      3                 Miss
0xBEEF005C      0x17DDE00    10 1 1100                      2                 Miss
0x00101078      0x0002020    11 1 1000                      3                 Miss
0xFF0040E2      0x1FE0081    11 0 0010                      3                 Hit
0x00101078      0x0002020    11 1 1000                      3                 Hit
0x002183E0      0x0004307    11 0 0000                      3                 Miss/Replace
0x00101064      0x0002020    11 0 0100                      3                 Hit
[Figures: 2-way set-associative cache contents after successive accesses. Each figure shows the Tags array for sets 0 to 3 (data not shown for convenience); the 25-bit address tag is compared with both tags in the indexed set (“Match?”). Entries shown include tag 1FE0081 in set 3 and tag 17DDE00 in set 2.]
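The set-associative example can be replayed the same way. The sketch below makes two assumptions that this section does not state: replacement within a set is LRU, and the accesses occur in the order listed in `trace`. The function name is hypothetical.

```python
def simulate_set_associative(addresses, size=256, block_size=32, assoc=2):
    """Return the outcome of each access, assuming LRU replacement."""
    num_sets = size // (block_size * assoc)
    sets = [[] for _ in range(num_sets)]   # tags per set, ordered LRU -> MRU
    outcomes = []
    for addr in addresses:
        block = addr // block_size
        index = block % num_sets
        tag = block // num_sets
        ways = sets[index]
        if tag in ways:
            ways.remove(tag)               # re-appended below as MRU
            outcomes.append("Hit")
        elif len(ways) < assoc:
            outcomes.append("Miss")        # free way available
        else:
            ways.pop(0)                    # evict the LRU tag
            outcomes.append("Miss/Replace")
        ways.append(tag)                   # accessed tag becomes MRU
    return outcomes


# Assumed access order for the example addresses.
trace = [0xFF0040E0, 0xBEEF005C, 0x00101078, 0xFF0040E2,
         0x00101078, 0x002183E0, 0x00101064]
for addr, outcome in zip(trace, simulate_set_associative(trace)):
    print(f"0x{addr:08X}  {outcome}")
```

Under these assumptions the second access to 0x00101078 makes tag 1FE0081 the LRU entry of set 3, so 0x002183E0 replaces it and the later access to 0x00101064 still hits.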
Generic Cache
- Every cache is an n-way set-associative cache.
1. Direct-mapped:
   - 1-way set-associative
2. n-way set-associative:
   - n-way set-associative
3. Fully-associative:
   - n-way set-associative, where there is only 1 set containing n blocks
   - # index bits = log2(1) = 0 (equation still works!)
Generic Cache (cont.)
- The same equations hold for any cache type
- Equation for # of cache blocks in cache:
  # cache blocks = SIZE / BLOCKSIZE
- Equation for # of sets in cache:
  # sets = # cache blocks / ASSOC = SIZE / (BLOCKSIZE × ASSOC)
- Fully-associative: ASSOC = # cache blocks
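To see that one pair of equations covers all three organizations, they can be evaluated for a 256-byte cache with 32-byte blocks. This is a minimal sketch; `cache_geometry` is a hypothetical helper name, and power-of-two parameters are assumed.

```python
def cache_geometry(size, block_size, assoc):
    """Return (# cache blocks, # sets, # index bits)."""
    num_blocks = size // block_size
    num_sets = num_blocks // assoc
    index_bits = num_sets.bit_length() - 1   # log2(# sets)
    return num_blocks, num_sets, index_bits


SIZE, BLOCKSIZE = 256, 32
configs = [
    ("direct-mapped", 1),
    ("2-way set-associative", 2),
    ("fully-associative", SIZE // BLOCKSIZE),   # ASSOC = # cache blocks
]
for name, assoc in configs:
    print(name, cache_geometry(SIZE, BLOCKSIZE, assoc))
# direct-mapped (8, 8, 3)
# 2-way set-associative (8, 4, 2)
# fully-associative (8, 1, 0)
```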
Handling Writes
• What happens when a write occurs?
• Two issues
1. Is just the cache updated with new data,
or are lower levels of memory hierarchy
updated at same time?
2. Do we allocate a new block in the cache
if the write misses?
The Write-Update Question
- Write-through (WT) policy: Writes that go to
the cache are also “written through” to the
next level in the memory hierarchy.
[Figure: cache and next level in memory hierarchy.]
The Write-Update Question, cont.
- Write-back (WB) policy: Writes go only to the
cache, and are not (immediately) written
through to the next level of the hierarchy.
[Figure: cache and next level in memory hierarchy.]
The Write-Update Question, cont.
• Write-back, cont.
- What happens when a line previously
written to needs to be replaced?
1. We need to have a “dirty bit” (D) with
each line in the cache, and set it when
line is written to.
2. When a dirty block is replaced, we need
to write the entire block back to next level
of memory (“write-back”).
The Write-Update Question, cont.
- With the write-back policy, replacement of a
dirty block triggers a writeback of the entire
line.
[Figure: cache line with dirty bit (D) and next level in memory hierarchy. Replacement causes writeback.]
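The difference between the two update policies shows up in the traffic sent to the next level. The following sketch counts that traffic for a small direct-mapped cache. It is illustrative only: the function name and trace format are hypothetical, and write-allocate is assumed under both policies so that only the update policy differs.

```python
def next_level_writes(trace, policy, num_sets=8, block_size=32):
    """Count writes sent to the next level of the hierarchy.

    trace  : list of ("R" or "W", address) pairs
    policy : "WT" (write-through) or "WB" (write-back)
    Returns (individual writes passed through, whole-block writebacks).
    """
    lines = [None] * num_sets              # each entry: [tag, dirty] or None
    word_writes = block_writebacks = 0
    for op, addr in trace:
        block = addr // block_size
        index, tag = block % num_sets, block // num_sets
        line = lines[index]
        if line is None or line[0] != tag:       # miss: replace the line
            if line is not None and line[1]:     # the replaced line is dirty
                block_writebacks += 1            # write the entire block back
            line = lines[index] = [tag, False]
        if op == "W":
            if policy == "WT":
                word_writes += 1                 # written through immediately
            else:
                line[1] = True                   # only set the dirty bit
    return word_writes, block_writebacks


# Three writes to one block, then a read that replaces that block.
trace = [("W", 0x100), ("W", 0x104), ("W", 0x108), ("R", 0x200)]
print(next_level_writes(trace, "WT"))   # (3, 0)
print(next_level_writes(trace, "WB"))   # (0, 1)
```

Write-through sends every write onward; write-back sends nothing until the dirty block is replaced, then writes the whole block once.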
The Write-Allocation Question
- Write-Allocate (WA)
- Bring the block into the cache if the write misses (handled just like a read miss).
- Typically used with write-back policy: WBWA
- Write-No-Allocate (NA)
- Do not bring the block into the cache if the write misses.
- Must be used in conjunction with write-through: WTNA.
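The effect of the allocation policy is easiest to see on a write miss followed by a read of the same block. A minimal sketch, assuming a direct-mapped cache; the function name `run` and the trace format are hypothetical.

```python
def run(trace, write_allocate, num_sets=8, block_size=32):
    """Return "hit" or "miss" for each ("R" or "W", address) access."""
    tags = [None] * num_sets
    outcomes = []
    for op, addr in trace:
        block = addr // block_size
        index, tag = block % num_sets, block // num_sets
        hit = tags[index] == tag
        outcomes.append("hit" if hit else "miss")
        # Read misses always bring the block in; write misses do so
        # only under the write-allocate policy.
        if not hit and (op == "R" or write_allocate):
            tags[index] = tag
    return outcomes


trace = [("W", 0x40), ("R", 0x44)]         # same 32-byte block
print(run(trace, write_allocate=True))     # ['miss', 'hit']
print(run(trace, write_allocate=False))    # ['miss', 'miss']
```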
Victim-Cache Example
• Suppose we have a 2-entry victim
cache
- It initially holds blocks X and Y.
- Y is the LRU block in the victim cache.
• The main cache is direct-mapped.
- Blocks A and B map to the same set in
main cache
[Figure: main cache holds A; victim cache holds X and Y (LRU).]
Victim-Cache Example, cont.
[Figure: main cache holds B; victim cache holds X (LRU) and A.]
- B misses in main cache and evicts A.
- A goes to victim cache & replaces Y (the previous LRU).
- X becomes LRU.
Victim-Cache Example, cont.
[Figure: L1 cache holds A; victim cache holds X (LRU) and B.]
- A misses in L1 but hits in victim cache, so A and B swap positions:
- A is moved from victim cache to L1, and B (the victim) goes to victim cache where A was located. (Note: We don’t replace the LRU block, X, in case of a victim-cache hit.)
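The A/B sequence above can be traced with a small model of a direct-mapped main cache backed by a victim cache. This is a sketch, not a definitive design: the class and attribute names are hypothetical, blocks are represented by their names, and on a victim-cache hit the swapped-in block is treated as the most recently used victim entry (an assumption; the example only requires that X is not replaced).

```python
class VictimCachedDM:
    """Direct-mapped main cache plus a small fully-associative victim cache."""

    def __init__(self, set_of, victim_entries=2):
        self.set_of = set_of              # block name -> main-cache set index
        self.main = {}                    # set index -> resident block name
        self.victim = []                  # victim blocks, ordered LRU -> MRU
        self.victim_entries = victim_entries

    def access(self, block):
        index = self.set_of[block]
        resident = self.main.get(index)
        if resident == block:
            return "main hit"
        if block in self.victim:
            # Swap: the block moves to the main cache and the block it
            # displaces takes its place in the victim cache.
            self.victim.remove(block)
            if resident is not None:
                self.victim.append(resident)
            self.main[index] = block
            return "victim hit"
        # Miss in both: the evicted main-cache block goes to the victim
        # cache, replacing the LRU victim entry if the victim cache is full.
        if resident is not None:
            if len(self.victim) == self.victim_entries:
                self.victim.pop(0)
            self.victim.append(resident)
        self.main[index] = block
        return "miss"


cache = VictimCachedDM(set_of={"A": 0, "B": 0})
cache.main = {0: "A"}
cache.victim = ["Y", "X"]                 # Y is the LRU entry
print(cache.access("B"), cache.main, cache.victim)   # miss {0: 'B'} ['X', 'A']
print(cache.access("A"), cache.main, cache.victim)   # victim hit {0: 'A'} ['X', 'B']
```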
Victim Cache – Why?
• Direct-mapped caches suffer badly from
repeated conflicts
- Victim cache provides illusion of set
associativity.
- A poor-man’s version of set associativity.
- A victim cache does not have to be large to
be effective; even a 4–8 entry victim cache
will frequently remove > 50% of conflict
misses.