CS433 Midterm Exam for CS Class with Instructions on Pipelining and Tomasulo's Algorithm -, Exams of Computer Architecture and Organization

University of Illinois - Urbana-Champaign Computer Architecture and Organization

Prof. Josep Torrellas

A midterm exam for a computer science class focusing on cs433. The exam covers topics such as pipelining, control hazards, exceptions, tomasulo's algorithm, and speculative execution. Students are required to answer questions related to instruction execution, functional unit usage, and cycle occupancy in the is, ex, wr, and cmt stages.

Typology: Exams

Pre 2010

Uploaded on 03/16/2009

koofers-user-xdk-1 🇺🇸

10 documents


CS433 Midterm

Prof Josep Torrellas

October 17, 2006

Time: 1 hour + 15 minutes

Name:

Alias:

Instructions:

1. This is a closed-book, closed-notes examination.

2. The Exam has 3 Questions. Please budget your time.

3. Calculators are allowed.

4. Please write your answers neatly. Good luck!

Problem No.    Max Points    Points Scored
1              50
2              60
3              40
Total          150


I Pipelining [50 points]

A. Control Hazards [25 points]

Suppose we have a MIPS processor with a 1-delay slot for branches. Consider codes (a) through (c):

    ADD R1,R2,R3            ADD R1,R2,R3            ADD R1,R2,R
    NOP                     NOP                     NOP
    BEQZ R4 label           BEQZ R1 label           BEQZ R1 label
    [ ]                     [ ]                     [ ]
    ADD R10,R10,R10         ADD R7,R9,R10           ADD R13,R9,R
    JMP end                 JMP end                 JMP end
    NOP                     NOP                     NOP
    label: ADD R6,R6,R6     label: ADD R7,R11,R12   label: ADD R14,R9,R
    end:                    end:                    end:

    (a)                     (b)                     (c)

a. [2 points] What is the best instruction to put in the delay slot in code (a)? Explain why.

    ADD R1,R2,R

b. [2 points] What is the best instruction to put in the delay slot in code (b)? Explain why.

    NOP. Cannot find any

c. [7 points] What is the best instruction to put in the delay slot in code (c) if R2+R3=0 60% of the time? Show the resulting code. In this case, what are the instructions executed when R2+R3=0, and what are the instructions executed when R2+R3!=0?

    ADD R1,R2,R
    NOP
    BEQ R1 end
    ADD R14,R9,R
    ADD R13,R9,R
    JMP end
    NOP
    label: ADD R14,R9,R
    end:

    case 1) inst # 1, 2, 3, 4
    case 2) inst # 1, 2, 3, 4, 5, 6, 7

d. [7 points] Repeat the whole question c if R2+R3=0 40% of the time.
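
The two execution traces given for part c can be checked with a small simulation. The following is a minimal sketch, not part of the exam: it models a single delay slot, in which the instruction after a taken branch or jump always executes before control transfers. The tuple encoding, the `trace` and `expected_count` functions, and the label table are hypothetical names introduced for illustration; register operands are omitted because they do not affect the control flow.

```python
# Rescheduled code from part c, numbered 1..8; operands are omitted on purpose.
PROGRAM = [
    ("ADD", None),    # 1
    ("NOP", None),    # 2
    ("BEQZ", "end"),  # 3  conditional branch
    ("ADD", None),    # 4  branch delay slot
    ("ADD", None),    # 5
    ("JMP", "end"),   # 6  unconditional jump
    ("NOP", None),    # 7  jump delay slot
    ("ADD", None),    # 8  at "label:", no longer a branch target in this listing
]
# 0-based instruction indices; "end" points just past the last instruction.
LABELS = {"label": 7, "end": 8}


def trace(branch_taken):
    """Return the 1-based numbers of the instructions that execute."""
    executed = []
    pc = 0
    pending = None  # target to jump to once the delay slot has executed
    while pc < len(PROGRAM):
        op, target = PROGRAM[pc]
        executed.append(pc + 1)
        if pending is not None:
            # This instruction was the delay slot: now take the transfer.
            pc, pending = pending, None
            continue
        if op == "JMP" or (op == "BEQZ" and branch_taken):
            pending = LABELS[target]
        pc += 1
    return executed


def expected_count(p_taken):
    """Average number of instructions executed for a given taken probability."""
    return p_taken * len(trace(True)) + (1 - p_taken) * len(trace(False))


print(trace(True))   # branch taken (R2+R3 = 0)
print(trace(False))  # branch not taken (R2+R3 != 0)
```

Under these assumptions the taken trace is instructions 1 to 4 and the not-taken trace is instructions 1 to 7, which agrees with case 1 and case 2 in part c. `expected_count(0.6)` weights the two path lengths by the stated branch probability.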

B. Exceptions [25 points]

a. [5 points] What does it mean that a pipeline supports precise exceptions?

b. [4 points] How does a pipeline support precise exceptions?

c. [4 points] List one good thing and one bad thing of precise exceptions?

d. [4 points] What is a statically and a dynamically scheduled machine?

e. [4 points] What is a superscalar and a VLIW? Are they static or dynamic machines?

f. [4 points] How is the reorder buffer related to exception handling?

II Tomasulo’s Algorithm and Speculative Execution [60 points]

Consider the following code fragment

          DADDI  R1, R0, #2
    LOOP: L.D    F0, 0(R2)
          MUL.D  F2, F0, F2
          DADDI  R2, R2, #32
          S.D    F2, 0(R2)
          ADD.D  F4, F2, F4
          DSUBI  R1, R1, #1
          BNEZ   R1, LOOP

running on a system with the following specifications, noting that the assumptions are the same as in the homework except for those in bold typeface.

Assume a single-issue machine with unlimited reservation stations and the pipeline functional units described by Table 1.

Functional Unit    Cycles in EX    # Functional Units
Integer            1               1
FP add             3               1
FP multiply        8               1

Table 1: Functional Unit Specification

Functional units are not pipelined.
All stages except EX take one cycle to complete.
There is no forwarding between functional units. Both integer and floating point results are communicated through the CDB.
Memory accesses and branches use the integer functional unit to perform address calculations. All loads and stores access memory during the EX stage and take one cycle to execute.
There are unlimited load/store buffers and an infinite instruction queue.
If an instruction is in the WR stage in cycle x, then an instruction that is waiting on the same functional unit (due to a structural hazard) can begin execution in cycle x.
Only one instruction can write to the CDB in a clock cycle.
Branches and stores do not need the CDB. When executing with speculative hardware, branches and stores can commit in the cycle that follows EX if there is no other constraint that prevents it.
Whenever there is a conflict for a functional unit or the CDB, assume program order.
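
The rules above fix when a stalled consumer can occupy EX. As a minimal sketch (the function name `ex_window` and the dictionary keys are hypothetical, and the sketch assumes the consumer has already issued and its functional unit is free, so the only constraint is a RAW hazard on one producer):

```python
# Cycles spent in EX per functional unit, taken from Table 1.
EX_LATENCY = {"integer": 1, "fp_add": 3, "fp_multiply": 8}


def ex_window(producer_wr_cycle, unit):
    """Return (first, last) EX cycle for an instruction waiting on one producer.

    Assumption: no forwarding, so the operand is only usable in the cycle
    after the producer broadcasts it on the CDB (its WR cycle).
    """
    first = producer_wr_cycle + 1
    # The unit is not pipelined, so it is held for the full latency.
    last = first + EX_LATENCY[unit] - 1
    return first, last


print(ex_window(4, "fp_multiply"))  # producer writes in cycle 4
print(ex_window(13, "fp_add"))      # producer writes in cycle 13
```

For example, a multiply whose operand is broadcast in cycle 4 would occupy EX in cycles 5 through 12, and an FP add whose operand is broadcast in cycle 13 would occupy EX in cycles 14 through 16. Structural hazards and CDB conflicts are deliberately left out of this sketch.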

B. Tomasulo’s Algorithm with Speculative Execution [30 points]

Now, assume the architecture above, except with hardware speculation. Assume that the reorder buffer has four entries, named 0, 1, 2, and 3. Only one ROB entry can commit per cycle. Complete Table 3, including:

The ROB entry used by each instruction.
The cycles that each instruction occupies in the IS, EX, WR, and CMT stages.
Comments to justify your answer such as type of hazards/stalls and the registers/ROB entries involved.

Instruction          ROB   IS   EX      WR   CMT   Comments (if appropriate)
DADDI R1, R0, #2     0     1    2       3    4
L.D F0, 0(R2)        1     2    3       4    5
MUL.D F2, F0, F2     2     3    5-12    13   14    RAW (L.D F0)
DADDI R2, R2, #32    3     4    5       6    15    in-order CMT
S.D F2, 0(R2)        0     5    14      -    16    RAW (MUL.D F2), in-order CMT
ADD.D F4, F2, F4     1     6    14-16   17   18    RAW (MUL.D F2)
DSUBI R1, R1, #1     2     15   16-17   18   19    No ROB until 15, CDB conflict
BNEZ R1, LOOP        3     16   19      -    20    RAW (DSUBI R1)
L.D F0, 0(R2)        0     17   18      19   21    in-order CMT
MUL.D F2, F0, F2     1     19   20-27   28   29    No ROB until 19
DADDI R2, R2, #32    2     20   21      22   30    in-order CMT

Table 3: Tomasulo’s Algorithm with Speculative Execution
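
Several of the constraints behind Table 3 can be checked mechanically. The sketch below is illustrative only: `ROWS` copies the ROB, IS, WR, and CMT columns from the table, while `check` and its messages are hypothetical. It assumes ROB entries are handed out round-robin starting at entry 0, and that an entry can be reused only in a cycle after its previous occupant commits.

```python
# (instruction, ROB entry, IS, WR or None, CMT) copied from Table 3.
ROWS = [
    ("DADDI R1, R0, #2", 0, 1, 3, 4),
    ("L.D F0, 0(R2)", 1, 2, 4, 5),
    ("MUL.D F2, F0, F2", 2, 3, 13, 14),
    ("DADDI R2, R2, #32", 3, 4, 6, 15),
    ("S.D F2, 0(R2)", 0, 5, None, 16),
    ("ADD.D F4, F2, F4", 1, 6, 17, 18),
    ("DSUBI R1, R1, #1", 2, 15, 18, 19),
    ("BNEZ R1, LOOP", 3, 16, None, 20),
    ("L.D F0, 0(R2)", 0, 17, 19, 21),
    ("MUL.D F2, F0, F2", 1, 19, 28, 29),
    ("DADDI R2, R2, #32", 2, 20, 22, 30),
]
ROB_SIZE = 4


def check(rows, rob_size):
    """Return a list of violated constraints (empty if none are found)."""
    problems = []
    for i, (name, rob, issue, _wr, cmt) in enumerate(rows):
        if rob != i % rob_size:
            problems.append(f"{name}: ROB entry is not round-robin")
        if i > 0 and issue <= rows[i - 1][2]:
            problems.append(f"{name}: more than one issue per cycle or out of order")
        if i > 0 and cmt <= rows[i - 1][4]:
            problems.append(f"{name}: commit is not in order, one per cycle")
        if i >= rob_size and issue <= rows[i - rob_size][4]:
            problems.append(f"{name}: issued before its ROB entry was freed")
    # Only one instruction may write to the CDB in a clock cycle.
    wr_cycles = [row[3] for row in rows if row[3] is not None]
    if len(wr_cycles) != len(set(wr_cycles)):
        problems.append("two instructions share a CDB cycle")
    return problems


print(check(ROWS, ROB_SIZE))
```

The check covers single issue, in-order commit, ROB reuse, and CDB exclusivity. It does not verify EX timing or functional-unit conflicts, so an empty result is a consistency check rather than a proof that the schedule is correct.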

III Software ILP [40 points]

Consider the following machine.

There is 1 integer functional unit, taking 1 cycle to perform integer addition (including effective address calculation for loads/stores), subtraction, logic operations, and branch operations
There is 1 FP/integer multiplier, taking 7 cycles to perform any multiply. It is pipelined
There is 1 FP adder, taking 4 cycles to perform FP additions and subtractions. It is pipelined
Branches are resolved in the ID stage.
There is one branch delay slot
There is full forwarding and bypassing, including forwarding from the end of an FU to the MEM stage for stores
Loads and stores complete in one cycle. That is, they spend one cycle in the MEM stage after the effective address calculation
There are as many registers, both FP and integer, as you need
While the hardware has full forwarding/bypassing, it is the responsibility of the compiler to schedule such that the operands of each instruction are available when needed by each instruction
If unspecified, its properties are like those in the MIPS pipeline we studied in class

Now consider this code fragment:

    loop: L.D     F2, 0(R1)
          L.D     F4, 0(R2)
          DADDUI  R1, R1, #
          DADDUI  R2, R2, #
          ADD.D   F6, F2, F
          MUL.D   F8, F4, F
          ADD.D   F10, F6, F
          S.D     F10, 0(R3)
          DADDUI  R3, R3, #
          DSUBUI  R4, R4, #
          BNEZ    R4, loop

B. Software Pipelining [12 points] Software pipeline the loop and reorder the instructions to reduce stalls. Don’t write the startup or cleanup code.

    loop: S.D     F10, -24(R3)   //x-
          ADD.D   F10, F6, F8    //x-
          ADD.D   F6, F2, F2     //x-
          MUL.D   F8, F4, F4     //x-
          L.D     F2, 0(R1)      //x
          L.D     F4, 0(R2)      //x
          DADDUI  R1, R1, #
          DSUBUI  R3, R3, #
          DADDUI  R2, R2, #
          BNEZ    R4, loop
          DADDUI  R4, R4, #
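
The idea behind the software-pipelined kernel can be illustrated with a short simulation. This is a sketch under stated assumptions, not the exam's solution: it assumes the store runs three iterations behind the loads (suggested by the -24 offset if each element is 8 bytes), the final add two iterations behind, and the first add and the multiply one iteration behind. The function names are hypothetical, and the `if` guards stand in for the startup and cleanup code that the question says to omit.

```python
def reference(a, b):
    """Straight-line version: one element fully computed per iteration."""
    return [(x + x) + (y * y) for x, y in zip(a, b)]


def software_pipelined(a, b):
    """Each kernel step works on four different iterations at once."""
    n = len(a)
    out = [None] * n
    f2 = f4 = f6 = f8 = f10 = None
    for k in range(n + 3):          # 3 extra steps drain the pipeline
        if k >= 3:
            out[k - 3] = f10        # S.D   F10        (iteration k-3)
        if 0 <= k - 2 < n:
            f10 = f6 + f8           # ADD.D F10,F6,F8  (iteration k-2)
        if 0 <= k - 1 < n:
            f6 = f2 + f2            # ADD.D F6,F2,F2   (iteration k-1)
            f8 = f4 * f4            # MUL.D F8,F4,F4   (iteration k-1)
        if k < n:
            f2, f4 = a[k], b[k]     # L.D F2 / L.D F4  (iteration k)
    return out


a = [1, 2, 3, 4, 5]
b = [10, 20, 30, 40, 50]
print(software_pipelined(a, b) == reference(a, b))
```

The stages appear in reverse order (store first, loads last) so that each one reads a register before the later stage in the same step overwrites it. That is why no extra register copies are needed in this sketch.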

C. Short Answer [8 points]

a. [4 points] Name one advantage and disadvantage of Loop Unrolling.

Advantages: more ILP, fewer overhead instructions
Disadvantages: code size increases, register pressure increases, problem becomes worse in multiple issue processors

b. [4 points] Name one advantage of using VLIW and one problem with the original VLIW model.

Advantages: keep more FU’s busy by issuing multiple instructions, simpler hardware
Problems: code size increase, limitations of lockstep operation, binary compatibility, finding parallelism
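
The "fewer overhead instructions" point can be made concrete with a little arithmetic. The sketch below is a rough illustration under assumptions that are not stated in the exam: it treats the loop in Section III as 6 work instructions (two loads, three FP operations, one store) plus 5 overhead instructions (three pointer updates, the counter update, and the branch), and it assumes an unrolled body pays that overhead only once by folding the pointer updates into load/store offsets. The names are hypothetical.

```python
from fractions import Fraction

WORK = 6      # L.D, L.D, ADD.D, MUL.D, ADD.D, S.D (assumed classification)
OVERHEAD = 5  # three DADDUI, one DSUBUI, one BNEZ (assumed classification)


def instructions_per_element(unroll):
    """Instructions executed per original iteration after unrolling."""
    return Fraction(WORK * unroll + OVERHEAD, unroll)


for factor in (1, 2, 4):
    print(factor, instructions_per_element(factor))
```

As the unroll factor grows, the per-element cost approaches the 6 work instructions, while the loop body itself grows, which is the code-size and register-pressure cost listed above.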
