Cache Systems and Memory Access Exam - Summer 2004: Advanced Computer Architecture, Exams of Computer Architecture and Organization

Georgia Institute of Technology - Main Campus Computer Architecture and Organization

Questions and answers for Exam III of the Advanced Computer Architecture course (ECE 4100/6100), held in summer 2004. The exam covers cache systems and memory access, including cache sizes, cache miss penalties, and cache conflict resolution. Students apply this material to problems on cache hit rates, average memory access time, and loop unrolling.

Typology: Exams

Pre 2010

Uploaded on 08/05/2009

koofers-user-a7d 🇺🇸


SCORE:________ Name:__________________________________________

ECE 4100/6100 Advanced Computer Architecture

Exam III – Summer 2004

1. (10 points) What limits the size and complexity of the L1 cache and why is an L2 (and even an L3) in common use today?

The L1 cache is in the critical delay path that sets the clock cycle time on most processors. Larger memories are slower, so an L1 that grows too large would slow down everything else. Adding larger L2 or L3 cache levels keeps L1 hits as fast as possible and reduces the miss penalty on an L1 miss.

2. (10 points) List the five approaches suggested in the text to reduce the cache miss penalty.

Write Policy

Read Priority over write on a miss

Early Restart – Critical Word First

Non-Blocking Caches

Add an additional level of cache

3. (10 points) Other than increasing cache size, associativity, or number of levels, how can you reduce the bad effects of cache conflicts (in hardware and software)?

Hardware: Victim Cache

Software: compiler optimizations to reduce conflicts

4. (15 points) A computer has three levels of cache. The L1 cache has an 8% local miss rate and the L2 cache has a local miss rate of 30%. The global (L1, L2, L3 combined) cache system hit rate is 98%. Main memory takes 50 clock cycles at 4 GHz, an L3 hit is 20 clock cycles, an L2 hit is 8 clock cycles, and an L1 hit is 2 clock cycles. Compute the average memory access time.

Need the L3 local miss rate: MRL1 * MRL2 * MRL3 = global miss rate, so .08 * .30 * MRL3 = 1 - .98 = .02, and MRL3 = .833

AMAT = HTL1 + MRL1 * (HTL2 + MRL2 * (HTL3 + MRL3 * Mem))

AMAT = 2 + .08 * (8 + .3 * (20 + .833 * 50)) = 4.12

4.12 clocks / 4 GHz = 1.03 ns

Average memory access time = _____4.12______clocks and _____1.03___________ ns.
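The arithmetic in problem 4 can be double-checked with a few lines of Python. This is only a sketch: the function names `local_l3_miss_rate` and `amat` are made up for illustration, and the numbers are the ones given in the problem statement.

```python
def local_l3_miss_rate(mr_l1, mr_l2, global_hit_rate):
    """Solve MRL1 * MRL2 * MRL3 = (1 - global hit rate) for MRL3."""
    return (1.0 - global_hit_rate) / (mr_l1 * mr_l2)


def amat(hit_times, local_miss_rates, memory_time):
    """Average memory access time in clocks.

    hit_times and local_miss_rates are ordered from L1 outward.
    The nested formula is evaluated from the innermost level out.
    """
    penalty = memory_time
    for hit_time, miss_rate in zip(reversed(hit_times), reversed(local_miss_rates)):
        penalty = hit_time + miss_rate * penalty
    return penalty


mr_l3 = local_l3_miss_rate(0.08, 0.30, 0.98)
clocks = amat([2, 8, 20], [0.08, 0.30, mr_l3], 50)
ns = clocks / 4.0  # at 4 GHz there are 4 clock cycles per nanosecond

print(round(mr_l3, 3), round(clocks, 2), round(ns, 2))
```

Working from the innermost level outward mirrors the nested parentheses in the AMAT formula, so adding or removing a cache level only changes the length of the two lists.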


Instruction producing result    Instruction using result    Latency in clock cycles
FP. ALU Op                      FP. ALU Op                  3
FP. ALU Op                      Store Double                2
Int. ALU Op                     Any                         1
Load Double                     FP. ALU Op                  2
Load Double                     Store Double                0

LOOP: L.D    F2, 0(R1)
      SUB.D  F6, F4, F2
      ADD.D  F4, F6, F0
      S.D    F4, 0(R1)
      DADDIU R1, R1, #8
      BNE    R1, R3, LOOP

Before loop unrolling a single execution requires ___15_____clocks (original code - do not reschedule)

(25 points) Unroll the loop shown above four times and schedule operations to reduce the number of stalls and control overhead. You can assume that the loop executes a multiple of four times. Use extra registers F8..F30, as needed. Indicate any stalls in your answer. Assume 1 branch delay slot is present.

LOOP: L.D    F2, 0(R1)
      L.D    F8, 8(R1)
      L.D    F14, 16(R1)
      L.D    F20, 24(R1)
      SUB.D  F6, F4, F2
      SUB.D  F12, F10, F8
      SUB.D  F18, F16, F14
      SUB.D  F24, F22, F20
      ADD.D  F4, F6, F0
      ADD.D  F10, F12, F0
      DADDIU R1, R1, #32
      ADD.D  F16, F18, F0
      ADD.D  F22, F24, F0
      S.D    F4, -32(R1)
      S.D    F10, -24(R1)
      S.D    F16, -16(R1)
      BNE    R1, R3, LOOP
      S.D    F22, -8(R1)

After unrolling, a (single) execution of the original loop’s operations now requires ___4.5____clocks
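Both clock counts (15 for the original loop, and 18/4 = 4.5 per original iteration after unrolling) can be reproduced with a small in-order issue model. The sketch below is not the course's method, only a cross-check under stated assumptions: one instruction issues per clock, the table's latency is the number of stall cycles between a producer and a dependent consumer, producer/consumer pairs the table does not list need no stall, each unrolled copy subtracts the value its own L.D loaded, and an unfilled branch delay slot costs one clock. The helper names (`clocks`, `load`, `fp`, `store`) are invented for this example.

```python
# Stall cycles from the latency table: (producer kind, consumer kind) -> stalls.
LATENCY = {
    ("fp", "fp"): 3,
    ("fp", "store"): 2,
    ("load", "fp"): 2,
    ("load", "store"): 0,
}


def stalls_between(producer, consumer):
    """Stall cycles required between a producer and a dependent consumer."""
    if producer == "int":
        return 1  # "Int. ALU Op -> Any" row of the table
    # Assumption: pairs the table does not list need no stall.
    return LATENCY.get((producer, consumer), 0)


def clocks(program):
    """Clocks for one pass over program, a list of (kind, dest, sources)."""
    ready = {}  # register -> (issue cycle of its producer, producer kind)
    cycle = 0
    for kind, dest, sources in program:
        earliest = cycle + 1  # in-order, one instruction per clock
        for reg in sources:
            if reg in ready:
                issued, producer = ready[reg]
                earliest = max(earliest, issued + stalls_between(producer, kind) + 1)
        cycle = earliest
        if dest is not None:
            ready[dest] = (cycle, kind)
    if program[-1][0] == "branch":
        cycle += 1  # the branch delay slot is left empty
    return cycle


def load(dest):
    return ("load", dest, ["R1"])


def fp(dest, a, b):
    return ("fp", dest, [a, b])


def store(src):
    return ("store", None, [src, "R1"])


DADDIU = ("int", "R1", ["R1"])
BNE = ("branch", None, ["R1", "R3"])

ORIGINAL = [load("F2"), fp("F6", "F4", "F2"), fp("F4", "F6", "F0"),
            store("F4"), DADDIU, BNE]

UNROLLED = [
    load("F2"), load("F8"), load("F14"), load("F20"),
    fp("F6", "F4", "F2"), fp("F12", "F10", "F8"),
    fp("F18", "F16", "F14"), fp("F24", "F22", "F20"),
    fp("F4", "F6", "F0"), fp("F10", "F12", "F0"), DADDIU,
    fp("F16", "F18", "F0"), fp("F22", "F24", "F0"),
    store("F4"), store("F10"), store("F16"), BNE,
    store("F22"),  # fills the branch delay slot
]

print(clocks(ORIGINAL))      # 15
print(clocks(UNROLLED) / 4)  # 4.5
```

The model tracks only read-after-write dependences, which is all this schedule needs; it does not model WAR or WAW hazards or memory aliasing.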

(10 points) Using your (four times) unrolled loop code from the earlier problem, schedule it on this VLIW machine. Assume the same latencies as the earlier problem and no branch delay. Just like the book's example, it can do two memory operations (loads and/or stores), two floating point operations, and an integer ALU or branch operation every clock cycle.

Memory Ref. 1      Memory Ref. 2      FP Operation 1       FP Operation 2       Int. Op/Branch
L.D F2,0(R1)       L.D F8,8(R1)
L.D F14,16(R1)     L.D F20,24(R1)
Must stall
                                      SUB.D F6,F4,F2       SUB.D F12,F10,F8
                                      SUB.D F18,F16,F14    SUB.D F24,F22,F20
Must stall
Must stall
                                      ADD.D F4,F6,F0       ADD.D F10,F12,F0
                                      ADD.D F16,F18,F0     ADD.D F22,F24,F0     DADDIU R1,R1,#32
Must stall
S.D F4,-32(R1)     S.D F10,-24(R1)
S.D F16,-16(R1)    S.D F22,-8(R1)                                               BNE R1,R3,LOOP

A single loop iteration now takes 12/4 = 3 clocks (based on operations in the original loop code).
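A VLIW schedule like the one above has to satisfy two kinds of constraints: the per-cycle slot limits (two memory references, two FP operations, one integer or branch operation) and the producer-to-consumer latencies. The following sketch checks both for the 12-cycle schedule. It is an illustrative validator, not part of the exam; the names `check`, `SCHEDULE`, and the instruction helpers are hypothetical, and it assumes that unlisted producer/consumer pairs need no stall and that each unrolled copy subtracts the value its own L.D loaded.

```python
LIMITS = {"mem": 2, "fp": 2, "int": 1}  # issue slots available per cycle
UNIT = {"load": "mem", "store": "mem", "fp": "fp", "int": "int", "branch": "int"}
LATENCY = {("fp", "fp"): 3, ("fp", "store"): 2,
           ("load", "fp"): 2, ("load", "store"): 0}


def stalls_between(producer, consumer):
    """Stall cycles required between a producer and a dependent consumer."""
    if producer == "int":
        return 1  # "Int. ALU Op -> Any" row of the latency table
    # Assumption: pairs the table does not list need no stall.
    return LATENCY.get((producer, consumer), 0)


def check(schedule):
    """Validate a schedule (one list of operations per cycle, [] = stall).

    Returns the cycle count, or raises ValueError on the first violation.
    """
    produced = {}  # register -> (issue cycle of its producer, producer kind)
    for cycle, bundle in enumerate(schedule, start=1):
        used = {}
        for kind, dest, sources in bundle:
            unit = UNIT[kind]
            used[unit] = used.get(unit, 0) + 1
            if used[unit] > LIMITS[unit]:
                raise ValueError(f"cycle {cycle}: too many {unit} operations")
            for reg in sources:
                if reg in produced:
                    issued, producer = produced[reg]
                    if cycle < issued + stalls_between(producer, kind) + 1:
                        raise ValueError(f"cycle {cycle}: {reg} is not ready")
        # Results become visible only to later cycles, not within the bundle.
        for kind, dest, _ in bundle:
            if dest is not None:
                produced[dest] = (cycle, kind)
    return len(schedule)


def load(dest):
    return ("load", dest, ["R1"])


def fp(dest, a, b):
    return ("fp", dest, [a, b])


def store(src):
    return ("store", None, [src, "R1"])


DADDIU = ("int", "R1", ["R1"])
BNE = ("branch", None, ["R1", "R3"])

SCHEDULE = [
    [load("F2"), load("F8")],
    [load("F14"), load("F20")],
    [],
    [fp("F6", "F4", "F2"), fp("F12", "F10", "F8")],
    [fp("F18", "F16", "F14"), fp("F24", "F22", "F20")],
    [],
    [],
    [fp("F4", "F6", "F0"), fp("F10", "F12", "F0")],
    [fp("F16", "F18", "F0"), fp("F22", "F24", "F0"), DADDIU],
    [],
    [store("F4"), store("F10")],
    [store("F16"), store("F22"), BNE],
]

cycles = check(SCHEDULE)
print(cycles, cycles / 4)  # 12 cycles, 3.0 per original iteration
```

Deleting any of the empty "Must stall" rows makes `check` raise a `ValueError`, which is a quick way to see that each stall in the table is forced by a latency rather than by a shortage of issue slots.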
