Memory Organization - Computer Organization II | ECE 366, Study notes of Computer Architecture and Organization

University of Illinois - Chicago Computer Architecture and Organization

Material Type: Notes; Class: Computer Organization II; Subject: Electrical and Computer Engr; University: University of Illinois - Chicago; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 07/23/2009

koofers-user-v6x 🇺🇸


EECS 366: Computer Architecture

Instructor: Shantanu Dutt

Department of EECS

University of Illinois at Chicago

Lecture Notes # 16

Memory Organization



© Shantanu Dutt, UIC


Memory Hierarchy Design

Many programs need large amounts of memory, as the size of the problems they solve increases. To solve a problem quickly, fast access is needed to all this data.

One solution is, of course, to build very large fast memory units capable of storing 1000s of MBytes. As we saw, fast memory (static memory, for example) consumes too much VLSI area and power, so large memory of this kind is impractical to realize.

Furthermore, even if it becomes feasible to build large amounts of fast memory, it is well known that access to this memory gets slower as it gets larger.

Fortunately, there is a way out! Because of the locality property of most programs, it is not necessary to have large amounts of fast memory for quick access to large amounts of data:

(1) Temporal Locality: An item just referenced will be referenced again soon.

(2) Spatial Locality: When an item is referenced, nearby items in memory will also be referenced soon.


Memory Hierarchy Design (contd.)

In principle, there can be several levels in the memory hierarchy, as shown below.

[Figure: The Memory Hierarchy. Upper levels are faster and more expensive; lower levels are slower and less expensive.]


Memory Hierarchy Design (contd.)

The data in an upper level is generally a subset of the data contained in the next lower level, and also belongs to the entire memory address space.

An exception is the register level, all of whose data may not be contained in the cache at all times. Also, the register file is not part of the memory address space: registers are addressed by a different address that pertains to the register file only, and data transfers between the register file and the lower levels are handled explicitly by the program using LOADs and STOREs.

The rest of the levels share a common memory address space, and data transfers between them are "automatic" and transparent to the program: they are handled either by hardware (cache–main mem. hierarchy) or by the operating system (main mem.–secondary storage hierarchy).


General Definitions and Principles of Memory Hierarchy (contd.)

Consider any 2 adjacent levels in the memory hierarchy:

Miss penalty: Time to replace a block in the upper level by a needed block that is not in that level. Since there can be hits or misses at lower levels while obtaining the required block, the miss penalty for the upper-most level (level 1) is expressed in terms of the miss rate in each lower level and the block replacement time from each of those levels.

[Equation for the level-1 miss penalty not recoverable from the source.]

The average memory access time for the CPU is given by:

[Equation not recoverable from the source.]

The block replacement time = access time (time to access the 1st word of the block in the lower level) + transfer time (time to access the remaining words), where the transfer time is determined by the block size in the upper level and the transfer rate (per word) from the lower level.

For example, there is an initial time required to search for the block/page location in main memory (MM), and further, due to refreshing, we saw that the average time to access MM is given by:

[Equation not recoverable from the source.]

Then the initial access time to MM is:

[Equation not recoverable from the source.]

However, the entire row is stored in the row register after spending the time to access the first word, and the required block is part of this row. Thus the rest of the words in the block can be sent in approximately a fixed time per word [value not recoverable from the source], which gives the block replacement time from MM.

Example: There are 3 levels in the memory hierarchy: cache, MM, and secondary storage.

The parameter values given for this example include cache block size = 4 words and MM page size = 2K words. The remaining values (times in cc's and miss rates) and the resulting average time taken by the CPU to access a word are not recoverable from the source.
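As a rough sketch of how the quantities defined above combine, the following Python snippet computes a block replacement time, the level-1 miss penalty, and the average memory access time for a 3-level hierarchy. The nested form assumed for the miss penalty (replacement time from the next level, plus that level's miss rate times its own penalty), the function names, and every numeric value are assumptions chosen for illustration; they are not the lecture's own symbols or figures.

```python
def replacement_time(first_word_time, block_words, per_word_time):
    """Access time for the 1st word plus transfer time for the remaining words."""
    return first_word_time + (block_words - 1) * per_word_time


def miss_penalty(lower_miss_rates, repl_times):
    """Miss penalty seen by the upper-most level (assumed nested form).

    lower_miss_rates[k] is the miss rate of the k-th lower level and
    repl_times[k] is the block replacement time out of that level.
    The last level always hits, so its miss rate is 0.0.
    """
    penalty = 0.0
    # Work upward from the lowest level: penalty = T + m * (penalty below).
    for miss_rate, repl in zip(reversed(lower_miss_rates), reversed(repl_times)):
        penalty = repl + miss_rate * penalty
    return penalty


def average_access_time(hit_time, miss_rate_1, penalty_1):
    """Average time per CPU access: hit time plus miss rate times miss penalty."""
    return hit_time + miss_rate_1 * penalty_1


# Hypothetical values in clock cycles (cc's); only the block and page sizes
# (4 words, 2K words) echo the example in the notes.
t_mm = replacement_time(first_word_time=10, block_words=4, per_word_time=2)
t_disk = replacement_time(first_word_time=100_000, block_words=2048, per_word_time=4)
penalty = miss_penalty([0.01, 0.0], [t_mm, t_disk])
amat = average_access_time(hit_time=1, miss_rate_1=0.1, penalty_1=penalty)
print(f"T_mm={t_mm} cc, T_disk={t_disk} cc, penalty={penalty:.2f} cc, AMAT={amat:.3f} cc")
```

Even with a small main-memory miss rate, the secondary-storage replacement time dominates the penalty in this sketch, which is why the miss rates of the lower levels matter so much.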


General Definitions and Principles of Memory Hierarchy (contd.)

Effect of Block Size:

The larger the block size, the better the anticipation of nearby items to be referenced soon (spatial locality).

However, beyond a certain block size, the concept of spatial locality is stretched. Note that while a program may access almost all items in a small or medium-size block, it later accesses a random next block, not necessarily the one following the current one: spatial locality is punctuated by random accesses (for example, due to branches).

Thus for large block sizes, a block will contain many useless data items that the program might not access in the near future. Since the space in the upper level is limited, the larger the block size, the smaller the # of blocks. Hence the miss rate increases when the next random block is accessed by the program.


Effect of Block Size (contd.)

[Figure: Effect of block size on the miss pattern of a program that alternates between working on A and working on C.

(a) Program structure: four consecutive 16-word regions A, B, C, D.

(b) Miss pattern with block size = 32 words: initial A access, miss, A & B loaded; work on A; next access is C, miss, C & D loaded; work on C; next access is A, miss, A & B loaded. 2 misses per iteration.

(c) Miss pattern with block size = 16 words: initial A access, miss, A loaded (other block empty); work on A; next access is C, miss, C loaded; work on C; next access is A, hit; next access is C, hit. 0 misses per iteration.]
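The miss patterns in this figure can be reproduced with a small simulation. The sketch below assumes an upper level that holds 32 words in total (one 32-word block or two 16-word blocks), fully associative placement, and LRU replacement; these are assumptions inferred from the figure, and the function names are hypothetical.

```python
from collections import OrderedDict


def count_misses(block_size, cache_words, accesses):
    """Count misses for a fully associative LRU cache (assumed organization)."""
    capacity = cache_words // block_size   # number of blocks that fit
    resident = OrderedDict()               # block number -> True, oldest first
    misses = 0
    for word_address in accesses:
        block = word_address // block_size
        if block in resident:
            resident.move_to_end(block)    # mark as most recently used
        else:
            misses += 1
            if len(resident) >= capacity:
                resident.popitem(last=False)   # evict least recently used
            resident[block] = True
    return misses


def alternating_trace(iterations):
    """Work on region A (words 0-15), then region C (words 32-47), repeatedly."""
    trace = []
    for _ in range(iterations):
        trace.extend(range(0, 16))     # region A
        trace.extend(range(32, 48))    # region C
    return trace


trace = alternating_trace(10)
print("32-word blocks:", count_misses(32, 32, trace), "misses")
print("16-word blocks:", count_misses(16, 32, trace), "misses")
```

With 32-word blocks, the single resident block is replaced on every switch between A and C, so each iteration costs 2 misses. With 16-word blocks, A and C both stay resident after the 2 initial misses.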


General Definitions and Principles of Memory Hierarchy (contd.)

What the CPU does on a miss in the upper level:

(1) If the miss penalty is a few 10s of clock cycles (cc's), then the CPU waits (ex., cache miss).

(2) If the miss penalty is 100s to 1000s of cc's (as in a main-memory miss or page fault), the CPU is interrupted on a miss, and another process starts executing. When the requested block is brought in, this is noted in the previous process's status, so that it can start re-executing at a later stage (when the current process is done or it also has a miss).

Block transfer mechanism:

(1) Done in hardware for few 10s of cc's penalty (cache).

(2) Done in software (the O.S. could do this) for a main-mem. miss: the O.S. sets up the appropriate disk interface for a DMA and leaves the CPU; the CPU executes another process, while the transfer from disk to main-mem. takes place simultaneously.


Some Basic Issues in Memory Hierarchies

Again we consider 2 adjacent levels of the hierarchy:

1. Block Placement: Where can a block be placed in the upper level?

2. Block Identification: How is a block found in the upper level?

3. Block Replacement: Which block to replace during a miss?

4. Write Strategy: What happens on a write to the upper level, and how is this percolated to the lower level?


Some Basic Issues in Memory Hierarchies

(1) Block Placement (contd.):

Fully associative (FA) and direct mapped (DM) are special cases of set-associative. In FA, there is only one set containing all the blocks. In DM, there are as many sets as there are blocks, each containing exactly 1 block.

FA has the most flexibility in placing a block, while DM has the least.
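To make the "special cases" view concrete, here is a minimal Python sketch that treats all three organizations as set-associative with a different number of blocks per set. The function name, the slot numbering (set i occupies consecutive slots), and the 8-block cache with block # 14 are illustrative assumptions.

```python
def candidate_slots(block_number, cache_blocks, ways):
    """Return (set index, cache slots in which the block may be placed).

    ways = 1             -> direct mapped (one block per set)
    ways = cache_blocks  -> fully associative (a single set)
    anything in between  -> set-associative
    """
    num_sets = cache_blocks // ways
    set_index = block_number % num_sets        # block # mod number of sets
    first_slot = set_index * ways              # assumed layout of sets in the cache
    return set_index, list(range(first_slot, first_slot + ways))


for name, ways in [("DM", 1), ("2-way SA", 2), ("FA", 8)]:
    set_index, slots = candidate_slots(block_number=14, cache_blocks=8, ways=ways)
    print(f"{name}: set {set_index}, candidate slots {slots}")
```

The number of candidate slots grows from 1 (DM) to the whole cache (FA), which is the flexibility trade-off described above.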


Some Basic Issues in Memory Hierarchies (contd.)

(2) Block Identification:

Associative or content-addressable memory (CAM): stores the block # or tags of resident blocks for each set.

The index, which is the log2(s) rightmost bits of the block # (s = # of sets), determines which set of the CAM to search for the rest of the block # (the tag). This is generally used in the cache–main-mem. hierarchy.

[Figure (a): Block identification in different cache types, for block 14 and an 8-block cache. Fully associative (FA): search everywhere. Direct mapped (DM): search only in tag position 14 mod 8 = 6. 2-way set associative (SA), sets 0 to 3: search everywhere within set 14 mod 4 = 2. The search is performed in parallel in FA and SA caches for speed.]

[Figure (b): Different portions of an address. The block # consists of the tag and the index, and is followed by the block offset (word #). The index (address mod s) is used to select the set (in DM and SA), the tag is used to check all blocks in the "indexed" set, and the word # is used to select the word in the block.]
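The address fields described here can be sketched in a few lines of Python. The helper names, the 2-bit word offset, and the tiny tag store are hypothetical; the sequential `in` test stands in for what the hardware does as a parallel comparison over the indexed set.

```python
def split_address(word_address, index_bits, offset_bits):
    """Split a word address into (tag, index, word offset)."""
    offset = word_address & ((1 << offset_bits) - 1)   # word # within the block
    block_number = word_address >> offset_bits
    index = block_number & ((1 << index_bits) - 1)     # block # mod number of sets
    tag = block_number >> index_bits                   # rest of the block #
    return tag, index, offset


def lookup(tag_store, block_number, index_bits):
    """Return True if the block is resident: compare its tag within the indexed set only."""
    index = block_number & ((1 << index_bits) - 1)
    tag = block_number >> index_bits
    return tag in tag_store[index]


# 4 sets (index_bits = 2), as in the 2-way set-associative case of an 8-block cache.
tag_store = [set() for _ in range(4)]
tag_store[2].add(14 >> 2)               # make block 14 resident: set 2, tag 3

print(split_address(14 * 4 + 1, index_bits=2, offset_bits=2))
print(lookup(tag_store, 14, index_bits=2), lookup(tag_store, 6, index_bits=2))
```

Block 6 maps to the same set as block 14 but carries a different tag, so only the tag comparison distinguishes the two.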


Some Basic Issues in Memory Hierarchies (contd.)

CAMs:

Hardware Complexity: For a FA cache, the complexity of the parallel search logic grows with both the size of the cache in blocks and the # of bits in the block #. This can be prohibitive when both are large.

For a SA cache, we have one such CAM for each of the sets, sized by the set size and the tag width (2**(r-l) = 32 blocks and m-l = 15 bits in the figure below). The total CAM size is the sum over all sets. However, there is only one parallel search logic, sized for a single set, which is used to search only the indexed set.

[Figure: Organization of a set-associative cache with a tag store, a data store, search logic, an l-to-2**l (5-to-32) index decoder selecting set # 0 to 2**l - 1 = 31, and a 2**(r-l)-to-1 (32-to-1) mux selecting the data block. Each set holds 2**(r-l) = 32 tags of m-l = 15 bits and 32 data blocks of 512 bits. Address fields: block # (20 bits) = tag (15 bits) + index (5 bits), followed by the word #. Parameters: r = 10, l = 5; cache size = 2**r = 1024 blocks, # of sets = 32, set size = 32 blocks.]

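There is only one equality comparator in a DM cache; thus its search logic is the simplest of the three. [Complexity expression not recoverable from the source.]

Time complexity of search: a separate expression is given for FA, for SA, and for DM. [Expressions not recoverable from the source.]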
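As a rough way to compare the three organizations, the sketch below counts tag bits for the figure's parameters (r = 10, l = 5, and a 20-bit block #, called m here). Using "tag bits compared in parallel" as a proxy for search-logic size is an assumption of this sketch, not the lecture's complexity expression, and the function name is hypothetical.

```python
def cam_parameters(r, l, m):
    """Tag-store and search-logic sizes for a cache of 2**r blocks and 2**l sets.

    l = 0 gives a fully associative cache; l = r gives a direct-mapped cache.
    """
    num_sets = 2 ** l
    set_size = 2 ** (r - l)          # blocks per set
    tag_bits = m - l                 # block # bits left after removing the index
    return {
        "sets": num_sets,
        "set_size": set_size,
        "tag_bits": tag_bits,
        "tag_store_bits": num_sets * set_size * tag_bits,   # all tags in the cache
        "bits_compared_in_parallel": set_size * tag_bits,   # one indexed set only
    }


for name, l in [("FA", 0), ("SA", 5), ("DM", 10)]:
    print(name, cam_parameters(r=10, l=l, m=20))
```

For the figure's SA configuration this gives 32 sets of 32 blocks with 15-bit tags, matching the decoder, mux, and tag widths labeled there.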
