Memory Hierarchy - Advanced Computer Architecture - Lecture Slides, Slides of Computer Architecture and Organization

Main points of this lecture are: Memory Hierarchy, Locality, Cache Design, Virtual Address Spaces, Page Table Layout, Design Options, Levels of Memory Hierarchy, Programs Address, Principle of Locality, Block Placement, Direct Mapped

Typology: Slides

2012/2013

Uploaded on 04/23/2013

atasi
atasi 🇮🇳

CIS 600 Advanced Computer
Architecture
Lecture 4 Memory Hierarchy
Review
Docsity.com

Outline

  • Memory hierarchy
  • Locality
  • Cache design
  • Virtual address spaces
  • Page table layout
  • TLB design options
  • Conclusion

1977: DRAM faster than microprocessors

Apple ][ (1977), Steve Wozniak and Steve Jobs

CPU: 1000 ns, DRAM: 400 ns

Levels of the Memory Hierarchy

Level: capacity, access time, cost
  • CPU registers: 100s of bytes, <10s ns
  • Cache: KBytes, 10-100 ns, 1-0.1 cents/bit
  • Main memory: MBytes, 200-500 ns, .0001-.00001 cents/bit
  • Disk: GBytes, 10 ms (10,000,000 ns), 10^-5 to 10^-6 cents/bit
  • Tape: infinite, sec-min, 10^-…

Staging/transfer unit between adjacent levels: unit, managed by, size
  • Registers and cache: instructions and operands, program/compiler, 1-8 bytes
  • Cache and memory: blocks, cache controller, 8-128 bytes
  • Memory and disk: pages, OS, 512-4K bytes
  • Disk and tape: files, user/operator, MBytes

Upper levels are faster; lower levels are larger.

iMac’s PowerPC 970: All caches on-chip

  • Registers (1K)
  • L1 (64K Instruction)
  • L1 (32K Data)
  • L2 (512K)

The Principle of Locality

  • The Principle of Locality:
    • Programs access a relatively small portion of the address space at any instant of time.
  • Two Different Types of Locality:
    • Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
    • Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
  • For the last 15 years, hardware has relied on locality for speed

It is a property of programs which is exploited in machine design.
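A minimal sketch of spatial locality, under assumptions that are not in the slides: a hypothetical 64 x 64 array of 8-byte elements stored row by row, and 64-byte blocks. It counts how often two consecutive accesses land in different blocks for a row-order and a column-order traversal of the same data.

```python
def block_switches(addresses, block_size=64):
    """Count how often consecutive accesses fall in different blocks."""
    blocks = [addr // block_size for addr in addresses]
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if prev != cur)


# Hypothetical layout: N x N array, ELEM bytes per element, row-major order.
N, ELEM = 64, 8

# Row order: consecutive accesses are 8 bytes apart (good spatial locality).
row_major = [(i * N + j) * ELEM for i in range(N) for j in range(N)]

# Column order: consecutive accesses are N * ELEM = 512 bytes apart.
col_major = [(i * N + j) * ELEM for j in range(N) for i in range(N)]

print("row order   :", block_switches(row_major))  # 511
print("column order:", block_switches(col_major))  # 4095
```

Both traversals touch the same 4096 elements; only the order differs. The row-order walk stays in a block for 8 accesses in a row, while the column-order walk changes block on every access.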

Memory Hierarchy: Terminology

  • Hit: data appears in some block in the upper level (example: Block X)
    • Hit Rate: the fraction of memory accesses found in the upper level
    • Hit Time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
  • Miss: data needs to be retrieved from a block in the lower level (Block Y)
    • Miss Rate = 1 - (Hit Rate)
    • Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
  • Hit Time << Miss Penalty (500 instructions on 21264!)

Figure: the processor exchanges data with the upper-level memory (holding Blk X), which exchanges blocks with the lower-level memory (holding Blk Y).

Cache Measures

  • Hit rate: fraction found in that level
    • Usually so high that we talk about miss rate instead
    • Miss rate fallacy: miss rate is to average memory access time as MIPS is to CPU performance, an indirect measure that can mislead
  • Average memory-access time = Hit time + Miss rate x Miss penalty (ns or clocks)
  • Miss penalty: time to replace a block from lower level, including time to …
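The average memory-access time formula above can be checked with a short sketch. The function name `amat` and the sample numbers (1-clock hit time, 95% hit rate, 100-clock miss penalty) are illustrative assumptions, not values from the lecture.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory-access time = hit time + miss rate x miss penalty.

    All times must use the same unit (ns or clocks).
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: Miss Rate = 1 - Hit Rate.
hit_rate = 0.95
print(amat(hit_time=1, miss_rate=1 - hit_rate, miss_penalty=100))  # about 6 clocks
```

Even with a 95% hit rate, the average access here costs about six times the hit time, because the miss penalty is so much larger than the hit time.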

Q1: Where can a block be placed in the upper level?

  • Block 12 placed in an 8-block cache:
    • Fully associative, direct mapped, or 2-way set associative
    • Set-associative mapping: Set = Block Number modulo Number of Sets

Figure: memory blocks 0-31 mapped onto cache blocks 0-7 under each scheme
  • Fully associative: block 12 can go in any of the 8 cache blocks
  • Direct mapped: (12 mod 8) = 4, so block 12 goes only in cache block 4
  • 2-way set associative: (12 mod 4) = 0, so block 12 goes in either block of set 0
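The three placements of block 12 can be reproduced with one small function, since direct mapped is the 1-way case and fully associative is the case where one set holds every block. This is a sketch; the name `candidate_frames` and the convention that set s occupies cache blocks s x ways through s x ways + ways - 1 are assumptions made for illustration.

```python
def candidate_frames(block_number, num_frames, ways):
    """Return the cache blocks where a memory block may be placed.

    Set = block number modulo number of sets.
    """
    if ways <= 0 or num_frames % ways != 0:
        raise ValueError("num_frames must be a positive multiple of ways")
    num_sets = num_frames // ways
    set_index = block_number % num_sets
    first = set_index * ways  # assumed layout: sets stored contiguously
    return list(range(first, first + ways))


print(candidate_frames(12, 8, ways=1))  # direct mapped: [4]
print(candidate_frames(12, 8, ways=2))  # 2-way: set 0 -> [0, 1]
print(candidate_frames(12, 8, ways=8))  # fully associative: [0, 1, ..., 7]
```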

Q2: How is a block found if it is in the upper level?

  • Tag on each block
    • No need to check index or block offset
  • Increasing associativity shrinks index, expands tag

Address layout: | Tag | Index | Block Offset |, where Tag and Index together form the Block Address
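A sketch of the address split, assuming power-of-two block sizes and set counts. The helper name `split_address`, the example address, and the 8 KB cache with 64-byte blocks are hypothetical. Doubling the associativity at a fixed cache size halves the number of sets, so one bit moves from the index into the tag.

```python
def split_address(addr, block_size, num_sets):
    """Split an address into (tag, index, block offset)."""
    for n in (block_size, num_sets):
        if n <= 0 or n & (n - 1) != 0:
            raise ValueError("block_size and num_sets must be powers of two")
    offset_bits = block_size.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (block_size - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# Hypothetical 8 KB cache with 64-byte blocks: 128 blocks in total.
ADDR, BLOCK, FRAMES = 0x12345678, 64, 128
for ways in (1, 2, 4):
    num_sets = FRAMES // ways
    tag, index, offset = split_address(ADDR, BLOCK, num_sets)
    print(f"{ways}-way: tag={tag:#x} index={index} offset={offset}")
```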

Q3: After a cache read miss, if there are no empty cache blocks, which block should be removed from the cache?

A randomly chosen block? Easy to implement, but how well does it work?

The Least Recently Used (LRU) block? Appealing, but hard to implement for high associativity

Also, try other LRU approximations

Miss Rate for 2-way Set Associative Cache

Size      Random   LRU
16 KB     5.7%     5.2%
64 KB     2.0%     1.9%
256 KB    1.17%    1.15%
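A toy simulator for comparing the two policies on a trace of block numbers. This is a sketch only: the function name `count_misses`, the seed parameter, and the traces are assumptions, and it makes no attempt to reproduce the measured miss rates in the table above.

```python
import random
from collections import OrderedDict


def count_misses(trace, num_sets, ways, policy="lru", seed=0):
    """Count misses for a set-associative cache on a list of block numbers.

    policy is "lru" or "random"; seed only affects the random policy.
    """
    rng = random.Random(seed)
    # One OrderedDict per set, ordered from least to most recently used.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)  # hit: mark as most recently used
            continue
        misses += 1
        if len(lines) == ways:  # set is full, choose a victim
            if policy == "lru":
                lines.popitem(last=False)  # evict least recently used
            else:
                del lines[rng.choice(list(lines))]  # evict a random block
        lines[block] = True
    return misses


trace = [0, 1, 0, 2, 1]  # hypothetical trace, one set with 2 ways
print("LRU misses   :", count_misses(trace, num_sets=1, ways=2))
print("random misses:", count_misses(trace, num_sets=1, ways=2, policy="random"))
```

On this trace LRU misses 4 times: the access to block 2 evicts block 1, which is then needed again. The random result depends on the seed.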

Q4: What happens on a write?

  • Write-Through
    • Policy: data written to cache block is also written to lower-level memory
    • Debug: easy
    • Do read misses produce writes? No
    • Do repeated writes make it to lower level? Yes
  • Write-Back
    • Policy: write data only to the cache; update lower level when a block falls out of the cache
    • Debug: hard
    • Do read misses produce writes? Yes
    • Do repeated writes make it to lower level? No

Additional option: let writes to an un-cached address allocate a new cache line (“write-allocate”).
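The write-policy comparison can be illustrated by counting writes that reach the lower level. This is a simplified sketch under stated assumptions: a small fully associative cache with LRU replacement, write-allocate for both policies, and dirty blocks still in the cache at the end of the trace are not flushed. The name `lower_level_writes` and the trace are hypothetical.

```python
from collections import OrderedDict


def lower_level_writes(trace, capacity, policy):
    """Count writes to the lower level for a trace of (op, block) pairs.

    op is "R" or "W"; policy is "write-through" or "write-back".
    """
    cache = OrderedDict()  # block -> dirty flag, least recently used first
    writes = 0
    for op, block in trace:
        if block in cache:
            cache.move_to_end(block)
        else:
            if len(cache) == capacity:
                _, dirty = cache.popitem(last=False)
                if dirty:
                    writes += 1  # write-back: evicted dirty block is written
            cache[block] = False  # allocate on read or write miss
        if op == "W":
            if policy == "write-through":
                writes += 1  # every write also goes to the lower level
            else:
                cache[block] = True  # write-back: only mark the block dirty
    return writes


# Three writes to block 0, then two reads that push block 0 out.
trace = [("W", 0), ("W", 0), ("W", 0), ("R", 1), ("R", 2)]
print(lower_level_writes(trace, capacity=2, policy="write-through"))  # 3
print(lower_level_writes(trace, capacity=2, policy="write-back"))     # 1
```

With write-through, all three repeated writes reach the lower level. With write-back, they collapse into a single write, and that write is triggered by the read miss on block 2.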

5 Basic Cache Optimizations

  • Reducing Miss Rate
  1. Larger Block size (compulsory misses)
  2. Larger Cache size (capacity misses)
  3. Higher Associativity (conflict misses)
  • Reducing Miss Penalty
  4. Multilevel Caches
  • Reducing hit time
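To see why multilevel caches reduce the miss penalty, the average memory-access time formula can be applied twice: the L1 miss penalty becomes the average access time of L2. This is a sketch; the function name `amat_two_level` and all numbers are hypothetical, and the L2 miss rate is taken as a local miss rate (misses in L2 divided by accesses to L2).

```python
def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, mem_penalty):
    """AMAT = L1 hit time + L1 miss rate x (L2 hit time + L2 miss rate x memory penalty)."""
    l1_miss_penalty = l2_hit + l2_local_miss_rate * mem_penalty
    return l1_hit + l1_miss_rate * l1_miss_penalty


# Hypothetical numbers, in clocks.
without_l2 = 1 + 0.05 * 100
with_l2 = amat_two_level(l1_hit=1, l1_miss_rate=0.05,
                         l2_hit=10, l2_local_miss_rate=0.2, mem_penalty=100)
print(f"without L2: {without_l2:.2f} clocks")  # 6.00
print(f"with L2   : {with_l2:.2f} clocks")     # 2.50
```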

The Limits of Physical Addressing

Figure: CPU connected to Memory by address lines (A0-A31) and data lines (D0-D31); the address lines carry the “physical addresses” of memory locations, the data lines carry data.

All programs share one address space: The physical address space

No way to prevent a program from accessing any machine resource

Machine language programs must be aware of the machine organization