CS433ug Midterm Exam Questions and Solutions for MIPS Pipeline and Tomasulo's Algorithm, Exams of Computer Architecture and Organization

University of Illinois - Urbana-Champaign Computer Architecture and Organization

Prof. Josep Torrellas

The CS433ug midterm exam for computer organization, with questions on the MIPS pipeline and Tomasulo's algorithm. It includes solutions for the questions on the modified and original MIPS pipelines, Tomasulo's algorithm, and software ILP. Students can use this document as a study resource for understanding MIPS pipeline structures, Tomasulo's algorithm, and software ILP concepts.

Typology: Exams

Pre 2010

Uploaded on 03/16/2009

koofers-user-d8g 🇺🇸

CS433ug Midterm

Prof Josep Torrellas

March 8, 2007

Time: 1 hour + 15 minutes

Name:

Alias:

Instructions:

1. This is a closed-book, closed-notes examination.

2. The exam has 3 questions. Please budget your time.

3. Calculators are allowed.

4. Please write your answers neatly. Good luck!

Problem No.   Max Points   Points Scored
1             40
2             40
3             40
Total         120


I MIPS Pipeline [40 points]

A. Modified MIPS Pipeline [20 points]

We change the MIPS pipeline to have the following structure.

IF: instruction fetch.
ID: register read and instruction decode.
ALU1: first stage of execution. The branch condition is completed in this cycle, as is the address of the branch target; ALU operands are needed at the beginning of this cycle, as is the base register for loads and stores.
ALU2: second stage of execution; ALU results are available in this cycle; the effective address for loads and stores is available.
MEM1: first cycle of data memory access; only the address is needed.
MEM2: second cycle of data memory access; store data is needed at the beginning of the cycle, load data is available in this cycle.
WB: write back of results.

Assume the register file operates on split cycles as discussed in Appendix A, so as to minimize bypassing requirements.

a) How many branch delay slots are there? Why?

Solution: 2 delay slots. The branch target is not available until the end of the ALU1 stage (which is 2 cycles after the fetch).

b) Assuming all possible forwarding is supported, how many stall cycles do we have in the following case? (Draw a pipeline picture.)

ADD R1 R2 R
ADD R7 R1 R

Solution: 1 stall.

ADD R1:   IF    ID    ALU1  ALU2  MEM1  MEM2  WB
ADD R7:         IF    ID    Stall ALU1  ALU2  MEM1  MEM2  WB

c) Repeat b) for:

LOAD R1, 10(R5)
ADD R7 R1 R

Solution: 3 stalls.

LOAD R1:  IF    ID    ALU1  ALU2  MEM1  MEM2  WB
ADD R7:         IF    ID    Stall Stall Stall ALU1  ALU2  MEM1  MEM2  WB
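The stall counts in parts b) and c), and the delay-slot count in part a), follow from comparing stage positions. The sketch below is one way to check that arithmetic; the function names and the `STAGES` list are illustrative and not part of the exam. It assumes a result becomes usable the cycle after the stage that produces it, and that the consumer is fetched one cycle after its producer.

```python
# Stage order of the modified pipeline described above.
STAGES = ["IF", "ID", "ALU1", "ALU2", "MEM1", "MEM2", "WB"]


def stall_cycles(produced_in: str, needed_at: str = "ALU1") -> int:
    """Stalls for a consumer issued right after its producer, with full forwarding.

    produced_in: stage during which the producer's result becomes available
                 (usable from the following cycle).
    needed_at:   stage at whose beginning the consumer needs the value.
    """
    # The consumer is fetched one cycle later, so without stalls it reaches
    # stage index n exactly one cycle after the producer reaches index n.
    produced = STAGES.index(produced_in)
    needed = STAGES.index(needed_at)
    return max(0, produced - needed)


def branch_delay_slots(resolved_in: str) -> int:
    """Instructions fetched after a branch before its target is known."""
    # The target is known at the end of the resolving stage, so one
    # instruction is fetched for every stage up to and including it,
    # not counting the branch's own fetch.
    return STAGES.index(resolved_in)


if __name__ == "__main__":
    print("ADD  -> ADD stalls:", stall_cycles("ALU2"))
    print("LOAD -> ADD stalls:", stall_cycles("MEM2"))
    print("branch delay slots:", branch_delay_slots("ALU1"))
```

With an ALU result produced in ALU2 and load data produced in MEM2, the helper gives the 1 and 3 stalls of the solutions, and resolving branches in ALU1 gives the 2 delay slots.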

II Tomasulo’s Algorithm and Speculative Execution [40 points]

Consider the following code fragment

LOOP: L.D F0, 0(R1)
L.D F2, 8(R1)
MUL.D F4, F2, F0
ADD.D F4, F4, F0
S.D F4, 0(R2)
DADDI R2, R2, #8
DSUBI R2, R1, #16
BNEZ R1, LOOP

running on a system with the following specifications, noting that the assumptions are the same as in the homework except for those in bold typeface.

Assume a single-issue machine with unlimited reservation stations and the pipeline functional units described by Table 1.

Functional Unit   Cycles in EX   # Functional Units
Integer           1              1
FP Add            3              1
FP Multiply       8              1

Table 1: Functional Unit Specification

Functional units are not pipelined.
All stages except EX take one cycle to complete.
All forwarding occurs through the CDB in the WR stage.
Loads and stores take one cycle to execute. During that cycle, they use the integer functional unit to perform effective address calculation and, in addition, they access memory.
There are unlimited load/store buffers and an infinite instruction queue.
Branches are resolved in the EX stage.
If an instruction is in the WR stage in cycle x, then an instruction that is waiting on the same functional unit (due to a structural hazard) can begin execution in cycle x + 1.
Only one instruction can write to the CDB in a clock cycle.
Branches and stores do not need the CDB.
Whenever there is a conflict for a functional unit or the CDB, assume program order.

When an instruction is done executing in its functional unit and is waiting for the CDB, it is still occupying the functional unit and its reservation station (meaning no other instruction may enter).
Treat the BNEZ instruction as an Integer instruction. Assume the L.D instruction after the BNEZ can be issued the cycle after the BNEZ instruction is issued, due to branch prediction.

A. Tomasulo’s Algorithm [20 points]

Complete Table 2 using Tomasulo’s algorithm for the given code fragment with no hardware speculation for branches. Include:

The functional unit used by each instruction. For structural hazards, assume program order.
The cycles that each instruction occupies in the IS, EX, and WR stages.
Comments to justify your answer such as type of hazards and the registers involved in the hazard.

Instruction          Funct. Unit   IS   EX      WR       Comments (if appropriate)
L.D F0, 0(R1)        Integer       1    2       3
L.D F2, 8(R1)        Integer       2    3       4
MUL.D F4, F2, F0     FP Mul        3    5-12    13       RAW F
ADD.D F4, F4, F0     FP Add        4    14-16   17       RAW F
S.D F4, 0(R2)        Integer       5    18      19 (-)   RAW F
DADDI R2, R2, #8     Integer       6    7       8
DSUBI R2, R1, #16    Integer       7    8       9
BNEZ R1, LOOP        Integer       8    9       10 (-)
L.D F0, 0(R1)        Integer       9    10      11

Table 2: Execution profile using Tomasulo’s Algorithm
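The cycle numbers in Table 2 can be cross-checked with a small timing model. The sketch below is a simplified model and not the exam's reference method: the `Instr` class, `schedule` function, and `PROGRAM` list are hypothetical names. It processes instructions in program order, issues one per cycle, and lets an operand be used the cycle after its producer's WR. It also assumes a functional unit is free again in the cycle its previous occupant writes to the CDB, which is the interpretation that reproduces the table's numbers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Cycles in EX per functional unit, from Table 1 (one unit of each kind).
LATENCY = {"Integer": 1, "FP Mul": 8, "FP Add": 3}


@dataclass
class Instr:
    text: str
    unit: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read
    uses_cdb: bool = True    # branches and stores do not need the CDB


def schedule(program):
    """Return (IS, EX start, EX end, WR) for each instruction, in program order."""
    reg_ready = {}                           # register -> first cycle value is usable
    unit_busy = {u: set() for u in LATENCY}  # cycles in which each unit is occupied
    cdb_busy = set()                         # cycles in which the CDB is taken
    rows = []
    for issue, ins in enumerate(program, start=1):   # single issue
        start = max([issue + 1] + [reg_ready.get(r, 0) for r in ins.srcs])
        lat = LATENCY[ins.unit]
        # Units are not pipelined: slide until the whole EX window is free.
        while any(c in unit_busy[ins.unit] for c in range(start, start + lat)):
            start += 1
        end = start + lat - 1
        wr = end + 1
        if ins.uses_cdb:
            while wr in cdb_busy:            # earlier instructions win the CDB
                wr += 1
            cdb_busy.add(wr)
        # The unit is held from EX start until the cycle before WR.
        unit_busy[ins.unit].update(range(start, wr))
        if ins.dest:
            reg_ready[ins.dest] = wr + 1     # value forwarded on the CDB in WR
        rows.append((issue, start, end, wr))
    return rows


PROGRAM = [
    Instr("L.D F0, 0(R1)", "Integer", "F0", ("R1",)),
    Instr("L.D F2, 8(R1)", "Integer", "F2", ("R1",)),
    Instr("MUL.D F4, F2, F0", "FP Mul", "F4", ("F2", "F0")),
    Instr("ADD.D F4, F4, F0", "FP Add", "F4", ("F4", "F0")),
    Instr("S.D F4, 0(R2)", "Integer", None, ("F4", "R2"), uses_cdb=False),
    Instr("DADDI R2, R2, #8", "Integer", "R2", ("R2",)),
    Instr("DSUBI R2, R1, #16", "Integer", "R2", ("R1",)),
    Instr("BNEZ R1, LOOP", "Integer", None, ("R1",), uses_cdb=False),
    Instr("L.D F0, 0(R1)", "Integer", "F0", ("R1",)),
]

if __name__ == "__main__":
    for ins, (i, s, e, w) in zip(PROGRAM, schedule(PROGRAM)):
        ex = str(s) if s == e else f"{s}-{e}"
        print(f"{ins.text:<20} IS={i:<3} EX={ex:<6} WR={w}")
```

Because the model tracks only the most recent writer of each register in program order, the two writes to F4 do not interfere, which mirrors the register renaming that reservation stations provide.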

III Software ILP [40 points]

Consider the following machine.

There is 1 integer functional unit, taking 1 cycle to perform integer addition (including effective address calculation for loads/stores), subtraction, logic operations, and branch operations.
There is 1 FP/integer multiplier, taking 8 cycles to perform any multiply. It is pipelined.
There is 1 FP adder, taking 3 cycles to perform FP additions and subtractions. It is pipelined.
Branches are resolved in the ID stage.
There is one branch delay slot.
There is full forwarding and bypassing, including forwarding from the end of an FU to the MEM stage for stores.
Loads and stores spend 1 cycle in the MEM stage after the effective address calculation.
There are as many registers, both FP and Integer, as you need.
While the hardware has full forwarding/bypassing, it is the responsibility of the compiler to schedule such that the operands of each instruction are available when needed by each instruction.
If unspecified, its properties are like those in the MIPS pipeline we studied in class.

Now consider this code fragment:

loop: L.D F0, 0(R1)
L.D F2, 8(R1)
ADD.D F4, F2, F
MUL.D F6, F0, F
ADD.D F6, F6, F
S.D F6, 0(R2)
DADDUI R1, R1, #
DADDUI R2, R2, #
DSUBUI R3, R3, #
BNEZ R3, loop

A. Loop Unrolling [30 points]

a. [15 points] Reschedule the code to minimize stalls. How many stalls are there? Please show the resulting code.

Solution:

loop: L.D F0, 0(R1)
L.D F2, 8(R1)
MUL.D F6, F0, F
ADD.D F4, F2, F
DADDUI R1, R1, #
DADDUI R2, R2, #
DSUBUI R3, R3, #
(3 STALLS)
ADD.D F6, F6, F
BNEZ R3, loop
S.D F6, -8(R2)

Answer: 3 stalls
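The 3 stalls in the rescheduled loop come from the 8-cycle multiply feeding the second ADD.D. A minimal sketch of that bookkeeping follows; the helper name and its parameters are hypothetical. It assumes a consumer needs its operand at the start of its own execute stage unless `slack` says otherwise, and it treats a load as a 2-cycle producer (address calculation plus one MEM cycle).

```python
def stalls_needed(latency: int, between: int, slack: int = 0) -> int:
    """Stall cycles a compiler must leave between a producer and its consumer.

    latency: cycles the producer takes to compute its result
    between: independent instructions already scheduled between the two
    slack:   extra cycles before the consumer really needs the value
             (1 for store data, which is forwarded to the MEM stage)
    """
    return max(0, latency - 1 - between - slack)


if __name__ == "__main__":
    # MUL.D F6 -> ADD.D F6 with ADD.D F4 and the three integer ops in between.
    print("MUL.D -> ADD.D:", stalls_needed(latency=8, between=4))
    # ADD.D F6 -> S.D F6 with BNEZ in between; store data is needed in MEM.
    print("ADD.D -> S.D:  ", stalls_needed(latency=3, between=1, slack=1))
    # L.D F0 -> MUL.D with the second L.D in between.
    print("L.D   -> MUL.D:", stalls_needed(latency=2, between=1))
```

Only the multiply dependence leaves a gap that the available independent instructions cannot cover, which matches the answer of 3 stalls.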

B. Short Answer [10 points]

a. [5 points] How does loop unrolling improve performance? What are 2 disadvantages of loop unrolling?

Solution: Increases ILP in each iteration; fewer overhead instructions. Disadvantages: code size increases, register pressure increases.

b. [5 points] What are 2 differences between dynamically scheduled superscalar and VLIW processors?

Solution:
Superscalar: issues multiple arbitrary instructions; instructions are dynamically scheduled; if an instruction cannot be issued, don't issue it.
VLIW: issues a fixed number of different types of instructions; instructions are packaged together at compile time; if parallel instructions cannot be found, put a NOP in its slot.
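The "fewer overhead instructions" point in part a can be made concrete with a small count. This sketch is illustrative only and its function name is hypothetical. It assumes the loop from Problem III has 6 instructions of real work per element (two L.D, two ADD.D, one MUL.D, one S.D) and 4 overhead instructions (two DADDUI, one DSUBUI, one BNEZ), and that unrolling keeps one copy of the overhead per unrolled iteration.

```python
def instructions_per_element(body: int, overhead: int, unroll: int) -> float:
    """Dynamic instructions executed per element after unrolling `unroll` times."""
    if unroll < 1:
        raise ValueError("unroll factor must be at least 1")
    # The body is replicated; the pointer updates and branch are not.
    return (body * unroll + overhead) / unroll


if __name__ == "__main__":
    for factor in (1, 2, 4):
        per_elem = instructions_per_element(body=6, overhead=4, unroll=factor)
        print(f"unroll x{factor}: {per_elem} instructions per element")
```

The same count shows the cost side: the static loop body grows from 10 to 6 * unroll + 4 instructions, and each extra copy needs its own registers.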
