Memory Cache Exercise solutions, Exercises of Advanced Computer Architecture

Detailed solution for exercise on memory cache

Typology: Exercises

2015/2016

Uploaded on 12/03/2016

Lucky.Bohemia
CSCE430/830 Memory: Set-Associative $
Memory Hierarchy: Set-Associative Cache
CSCE430/830 Computer Architecture
Lecturer: Prof. Hong Jiang
Courtesy of Yifeng Zhu (U. Maine)
Fall, 2006
Portions of these slides are derived from:
Dave Patterson © UCB

Miss-oriented Approach to Memory Access:

$$CPUtime = IC \times \left( CPI_{Execution} + \frac{MemAccess}{Inst} \times MissRate \times MissPenalty \right) \times CycleTime$$

$$CPUtime = IC \times \left( CPI_{Execution} + \frac{MemMisses}{Inst} \times MissPenalty \right) \times CycleTime$$

CPI_Execution includes ALU and Memory instructions
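As a minimal sketch, the miss-oriented CPU time formula above can be evaluated directly. The function name, parameter names, and the sample numbers are illustrative assumptions and do not come from the slides.

```python
def cpu_time_miss_oriented(ic, cpi_execution, mem_access_per_inst,
                           miss_rate, miss_penalty, cycle_time):
    """CPU time = IC x (CPI_exec + accesses/inst x miss rate x penalty) x cycle time."""
    # Stall cycles added to every instruction, on average.
    stall_cpi = mem_access_per_inst * miss_rate * miss_penalty
    return ic * (cpi_execution + stall_cpi) * cycle_time


# Hypothetical workload: 1M instructions, 1.5 memory accesses per instruction,
# 2% miss rate, 50-cycle penalty, 1 ns cycle time.
t = cpu_time_miss_oriented(ic=1_000_000, cpi_execution=1.0,
                           mem_access_per_inst=1.5, miss_rate=0.02,
                           miss_penalty=50, cycle_time=1e-9)
print(f"CPU time: {t * 1e3:.2f} ms")
```

The second form of the formula is the same computation with MemMisses/Inst supplied in place of the product MemAccess/Inst x MissRate.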

Cache performance

Separating out Memory component entirely

AMAT = Average Memory Access Time

CPI_AluOps does not include memory instructions

$$CPUtime = IC \times \left( \frac{AluOps}{Inst} \times CPI_{AluOps} + \frac{MemAccess}{Inst} \times AMAT \right) \times CycleTime$$

$$AMAT = HitTime + MissRate \times MissPenalty$$

$$= \left( HitTime_{Inst} + MissRate_{Inst} \times MissPenalty_{Inst} \right) + \left( HitTime_{Data} + MissRate_{Data} \times MissPenalty_{Data} \right)$$
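The AMAT form can be sketched the same way. This is only an illustration under assumed names and sample values (none of them appear in the slides); it implements the basic relation AMAT = HitTime + MissRate x MissPenalty and the CPU time expression built on it.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in cycles."""
    return hit_time + miss_rate * miss_penalty


def cpu_time_amat(ic, alu_ops_per_inst, cpi_alu_ops,
                  mem_access_per_inst, amat_cycles, cycle_time):
    """CPU time with the memory component separated out entirely."""
    alu_part = alu_ops_per_inst * cpi_alu_ops
    mem_part = mem_access_per_inst * amat_cycles
    return ic * (alu_part + mem_part) * cycle_time


# Hypothetical cache: 1-cycle hit, 5% miss rate, 40-cycle miss penalty.
a = amat(hit_time=1, miss_rate=0.05, miss_penalty=40)
print(f"AMAT is about {a:.1f} cycles")
```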

Performance Example Problem

Assume:

For gcc, the frequency for all loads and stores is 36%.
instruction cache miss rate for gcc = 2%
data cache miss rate for gcc = 4%.
If a machine has a CPI of 2 without memory stalls
and the miss penalty is 40 cycles for all misses,
how much faster is a machine with a perfect cache?

Instruction miss cycles = IC x 2% x 40 = 0.80 x IC

Data miss cycles = IC x 36% x 4% x 40 = 0.576 x IC

CPIstall = 2 + (0.80 + 0.576) = 2 + 1.376 = 3.376

(IC x CPIstall x Clock period) / (IC x CPIperfect x Clock period) = 3.376 / 2 = 1.69

The machine with the perfect cache is about 1.69 times faster.
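The arithmetic of this example can be checked with a short script. The helper name and its parameters are assumptions made for illustration; the input values are the ones given in the problem statement.

```python
def cpi_with_stalls(base_cpi, inst_miss_rate, data_miss_rate,
                    mem_frac, miss_penalty):
    """Base CPI plus instruction-fetch and data-access stall cycles per instruction."""
    inst_stall = inst_miss_rate * miss_penalty             # every instruction is fetched
    data_stall = mem_frac * data_miss_rate * miss_penalty  # only loads/stores access data
    return base_cpi + inst_stall + data_stall


cpi_perfect = 2.0
cpi_stall = cpi_with_stalls(base_cpi=cpi_perfect, inst_miss_rate=0.02,
                            data_miss_rate=0.04, mem_frac=0.36, miss_penalty=40)
# IC and clock period cancel, so the speedup is the ratio of CPIs.
speedup = cpi_stall / cpi_perfect
print(round(cpi_stall, 3), round(speedup, 2))  # 3.376 1.69
```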

Performance Example Problem

Assume: we increase the performance of the previous machine by doubling its clock rate. Since the main memory speed is unlikely to change, assume that the absolute time to handle a cache miss does not change. How much faster will the machine be with the faster clock?

For gcc, the frequency for all loads and stores is 36%

With the clock period halved, the 40-cycle miss penalty becomes 80 cycles.

Instruction miss cycles = IC x 2% x 80 = 1.600 x IC

Data miss cycles = IC x 36% x 4% x 80 = 1.152 x IC

Total miss cycles = 2.752 x IC

CPI fastClk = 2 + 2.752 = 4.752

(IC x CPI slowClk x Clock period) / (IC x CPI fastClk x Clock period / 2) = 3.376 / (4.752 x 0.5) = 1.42 (not 2)
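A short sketch of the doubled-clock comparison follows. The helper and variable names are illustrative assumptions; time is measured in units of the original (slow) clock period so that the instruction count cancels.

```python
def cpi_with_stalls(base_cpi, inst_miss_rate, data_miss_rate,
                    mem_frac, miss_penalty):
    """Base CPI plus instruction and data miss stall cycles per instruction."""
    return (base_cpi
            + inst_miss_rate * miss_penalty
            + mem_frac * data_miss_rate * miss_penalty)


# Same miss time in nanoseconds, so the penalty in cycles doubles: 40 -> 80.
slow_cpi = cpi_with_stalls(2.0, 0.02, 0.04, 0.36, miss_penalty=40)
fast_cpi = cpi_with_stalls(2.0, 0.02, 0.04, 0.36, miss_penalty=80)

slow_time = slow_cpi * 1.0   # time per instruction, slow clock period = 1.0
fast_time = fast_cpi * 0.5   # fast clock period is half as long
speedup = slow_time / fast_time
print(round(fast_cpi, 3), round(speedup, 2))  # 4.752 1.42
```

Doubling the clock rate yields well under a 2x speedup because the memory stall time, measured in seconds, stays fixed.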

Q1: Block Placement

Where can block be placed in cache?

In one predetermined place - direct-mapped

Use part of address to calculate block location in cache

Compare cache block with tag to check if block present

Anywhere in cache - fully associative

Compare tag to every block in cache

In a limited set of places - set-associative

Use portion of address to calculate set (like direct-mapped)

Place in any block in the set

Compare tag to every block in set

Hybrid of direct mapped and fully associative
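The three placement policies can be treated as one scheme parameterized by associativity. The sketch below is an illustration under assumptions: the function name, the `ways` parameter, and the convention that set `s` occupies slots `s*ways` through `s*ways + ways - 1` are not from the slides.

```python
def candidate_slots(block_addr, num_blocks, ways):
    """Return the cache slots where a memory block may be placed.

    ways == 1           -> direct-mapped (one predetermined place)
    ways == num_blocks  -> fully associative (anywhere in the cache)
    otherwise           -> set-associative (any block within one set)
    """
    if num_blocks % ways != 0:
        raise ValueError("num_blocks must be a multiple of ways")
    num_sets = num_blocks // ways
    set_index = block_addr % num_sets   # part of the address selects the set
    first = set_index * ways
    return list(range(first, first + ways))


print(candidate_slots(12, num_blocks=8, ways=1))  # [4]
print(candidate_slots(12, num_blocks=8, ways=2))  # [0, 1]
print(candidate_slots(12, num_blocks=8, ways=8))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The number of tags to compare on a lookup equals the length of the returned list, which is why associativity trades lookup cost against placement flexibility.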

Direct Mapped Block Placement

[Figure: Cache with slots *0, *4, *8, *C; Memory with addresses 00, 04, 08, 0C, ..., 4C]

address maps to block:

location = (block address MOD # blocks in cache)
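A small sketch of the mapping rule follows. It assumes that the hexadecimal labels in the figure are byte addresses of 4-byte blocks and that the cache holds four blocks; both values, and the function name, are assumptions for illustration.

```python
BLOCK_BYTES = 4   # assumed block size
NUM_BLOCKS = 4    # assumed number of blocks in the cache


def dm_location(byte_addr):
    """location = (block address MOD number of blocks in cache)."""
    block_addr = byte_addr // BLOCK_BYTES
    return block_addr % NUM_BLOCKS


for addr in range(0x00, 0x50, BLOCK_BYTES):
    print(f"address {addr:02X} -> cache block {dm_location(addr)}")
```

Addresses 00, 10, 20, 30, and 40 all land in cache block 0, so they compete for the same location.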

Example: Accessing A Direct-Mapped Cache

The DM cache contains four 1-word blocks. Find the number of misses for each cache given this sequence of memory block accesses: 0, 8, 0, 6, 8

DM Memory Access 1: Mapping: 0 mod 4 = 0

Block 0 Mem[0]

Block 1

Block 2

Block 3

Mem Block DM Hit/Miss

0 miss

Set 0 is empty: write Mem[0]

DM Memory Access 2: Mapping: 8 mod 4 = 0

Block 0 Mem[8]

Block 1

Block 2

Block 3

Mem Block DM Hit/Miss

0 miss

8 miss

Set 0 contains Mem[0]. Overwrite with Mem[8]

DM Memory Access 3: Mapping: 0 mod 4 = 0

Block 0 Mem[0]

Block 1

Block 2

Block 3

Mem Block DM Hit/Miss

0 miss

8 miss

0 miss

Set 0 contains Mem[8]. Overwrite with Mem[0]

DM Memory Access 4: Mapping: 6 mod 4 = 2

Block 0 Mem[0]

Block 1

Block 2 Mem[6]

Block 3

Mem Block DM Hit/Miss

0 miss

8 miss

0 miss

6 miss

Set 2 empty. Write Mem[6]

DM Memory Access 5: Mapping: 8 mod 4 = 0

Block 0 Mem[8]

Block 1

Block 2 Mem[6]

Block 3

Mem Block DM Hit/Miss

0 miss

8 miss

0 miss

6 miss

8 miss

Set 0 contains Mem[0]. Overwrite with Mem[8]

Total: 5 misses in 5 accesses
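The walk-through can be reproduced with a few lines of simulation. This is a minimal sketch with assumed names; it models only which memory block each cache block holds and ignores data, valid bits, and tags as separate fields.

```python
def simulate_direct_mapped(accesses, num_blocks):
    """Replay block accesses on a direct-mapped cache of one-word blocks."""
    cache = [None] * num_blocks          # cache[i] = memory block held in slot i
    trace = []
    for block in accesses:
        index = block % num_blocks       # the single slot this block may use
        outcome = "hit" if cache[index] == block else "miss"
        trace.append((block, index, outcome))
        cache[index] = block             # on a miss, overwrite whatever was there
    return trace, cache


trace, final_cache = simulate_direct_mapped([0, 8, 0, 6, 8], num_blocks=4)
for block, index, outcome in trace:
    print(f"block {block} -> slot {index}: {outcome}")
print("misses:", sum(1 for _, _, o in trace if o == "miss"))  # misses: 5
```

Blocks 0 and 8 both map to slot 0, so they keep evicting each other even though slots 1 and 3 are never used.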

Direct-Mapped Cache with n one-word blocks

Pro: finds data fast

Con: What if we access 00001 and 10001 repeatedly?

We always miss…

[Figure: Cache and Memory; memory addresses 00001, 00101, 01001, 01101, 10001, 10101, 11001, 11101]
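The conflict can be demonstrated numerically. The sketch assumes an 8-block cache indexed by the low three address bits, which is an assumption since the slide only says "n one-word blocks"; the function name is also illustrative.

```python
def count_misses_direct_mapped(accesses, num_blocks):
    """Count misses for a sequence of block addresses on a direct-mapped cache."""
    cache = [None] * num_blocks
    misses = 0
    for block in accesses:
        index = block % num_blocks
        if cache[index] != block:
            misses += 1
            cache[index] = block
    return misses


# 00001 and 10001 share the low bits 001, so both map to the same slot.
pattern = [0b00001, 0b10001] * 4
print(count_misses_direct_mapped(pattern, num_blocks=8))  # 8
```

Every one of the eight accesses misses, although seven of the eight cache blocks stay empty.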

Fully Associative Block Placement

[Figure: Cache and Memory; memory addresses 00, 04, 08, 0C, ..., 4C]

arbitrary block mapping

location = any
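For comparison, here is a sketch of a fully associative cache replaying the access sequence 0, 8, 0, 6, 8 from the direct-mapped example. The LRU replacement policy is an assumption (the slides do not name one here), as are the function and variable names; with four blocks and only three distinct addresses, no eviction happens, so the policy does not affect this result.

```python
from collections import OrderedDict


def simulate_fully_associative(accesses, num_blocks):
    """Replay block accesses on a fully associative cache (assumed LRU eviction)."""
    cache = OrderedDict()                 # keys are resident blocks, oldest first
    outcomes = []
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)      # mark as most recently used
            outcomes.append("hit")
        else:
            if len(cache) >= num_blocks:
                cache.popitem(last=False) # evict the least recently used block
            cache[block] = True           # block may go in any free location
            outcomes.append("miss")
    return outcomes


outcomes = simulate_fully_associative([0, 8, 0, 6, 8], num_blocks=4)
print(outcomes)  # ['miss', 'miss', 'hit', 'miss', 'hit']
```

Only the three first-time accesses miss, because blocks 0 and 8 no longer compete for one location.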