Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Pipeline Hazards and Tomasulo's Algorithm: Homework Solution - Prof. Prabhat Kumar Mishra, Assignments of Electrical and Electronics Engineering

University of Florida (UF)Electrical and Electronics Engineering

Prof. Prabhat Kumar Mishra

The solution to a homework assignment on pipeline hazards and tomasulo's algorithm. The assignment involves identifying hazards in a given program and using forwarding techniques to avoid them. The solution also covers the use of tomasulo's algorithm with and without speculation, and the impact of the reorder buffer size on processor performance.

Typology: Assignments

Pre 2010

Uploaded on 03/10/2009

koofers-user-ije 🇺🇸

10 documents

1 / 5

This page cannot be seen from the preview

Don't miss anything!

Homework 2 Solution

CDA 5155: Fall 2008

Due Date: 10/16/2008 11:59 PM (UF EDGE Students: 10/23/2008 11:59 PM)

Brief Solution

You are not allowed to take or give help in completing this assignment. Submit the PDF version of

the submission in e-Learning website before the deadline. Please include the sentence in bold on

top of your submission: “I have neither given nor received any unauthorized aid on this

assignment”.

1. [10 points] Identify all possible hazards in the following instructions and show clearly which

instructions are involved in what type of hazard (structural, WAW, WAR, RAW and control)

and why? Is it possible to avoid such hazard using data forwarding? If possible, what

forwarding should be used? Consider a simple 5-stage MIPS pipeline as shown in Figure A.3

for this exercise. I1 is the first instruction and I13 is the last instruction in the program.

Solution:

Since the 5-stage pipeline is in-order issue and in-order execution pipeline, there is no WAR or

WAW hazard.

Possible hazards Possible solution

I 1 LD R1,1024(R4)

I 2 SUBI R5,R5,#4

I 3 ADD R1,R2,R1 RAW: R1 is the result of I1 Forwarding from MEM/WB to ID/EX

I 4 LD R2,1000(R5) RAW: R5 is the result of I2 Forwarding from MEM/WB to ID/EX

I 5 SUB R1,R2,R1 RAW: R1 is the result of I3

RAW: R2 is the result of I4

Forwarding from MEM/WB to ID/EX

I 6 BGTZ R1, #16 RAW: R1 is the result of I5

Control hazard

Forwarding from EX/MEM to ID/EX

I 7 SUB R1,R0,R1 RAW: R1 is the result of I5

Forwarding from MEM/WB to ID/EX

I 8 SD R1, 1024(R4) RAW: R1 is the result of I7 Forwarding from MEM/WB to EX/MEM

I 9 SUBI R4,R4,#4

I 10 BNE R4, R0, #64 RAW: R4 is the result of I9

Control hazard

Forwarding from EX/MEM to ID/EX

I 11 NOP

I 12 J #800 Control hazard

I 13 BREAK

Discover Assignments of Electrical and Electronics Engineering University of Florida (UF)

Partial preview of the text

Download Pipeline Hazards and Tomasulo's Algorithm: Homework Solution - Prof. Prabhat Kumar Mishra and more Assignments Electrical and Electronics Engineering in PDF only on Docsity!

Homework 2 Solution

CDA 5155: Fall 2008

Due Date: 10/16/2008 11:59 PM (UF EDGE Students: 10/23/2008 11:59 PM)

Brief Solution

You are not allowed to take or give help in completing this assignment. Submit the PDF version of the submission in e-Learning website before the deadline. Please include the sentence in bold on top of your submission: “ I have neither given nor received any unauthorized aid on this assignment ”.

[10 points] Identify all possible hazards in the following instructions and show clearly which instructions are involved in what type of hazard (structural, WAW, WAR, RAW and control) and why? Is it possible to avoid such hazard using data forwarding? If possible, what forwarding should be used? Consider a simple 5-stage MIPS pipeline as shown in Figure A. for this exercise. I1 is the first instruction and I13 is the last instruction in the program.

Solution: Since the 5-stage pipeline is in-order issue and in-order execution pipeline, there is no WAR or WAW hazard.

Possible hazards Possible solution I 1 LD R1,1024(R4)

I 2 SUBI R5,R5,#

I 3 ADD R1,R2,R1 RAW: R1 is the result of I1 Forwarding from MEM/WB to ID/EX

I 4 LD R2,1000(R5) RAW: R5 is the result of I2 Forwarding from MEM/WB to ID/EX

I 5 SUB R1,R2,R1 RAW: R1 is the result of I RAW: R2 is the result of I

Forwarding from MEM/WB to ID/EX Forwarding from MEM/WB to ID/EX I 6 BGTZ R1, #16 RAW: R1 is the result of I Control hazard

Forwarding from EX/MEM to ID/EX

I 7 SUB R1,R0,R1 RAW: R1 is the result of I5 Forwarding from MEM/WB to ID/EX

I 8 SD R1, 1024(R4) RAW: R1 is the result of I7 Forwarding from MEM/WB to EX/MEM

I 9 SUBI R4,R4,#

I 10 BNE R4, R0, #64 RAW: R4 is the result of I Control hazard

Forwarding from EX/MEM to ID/EX

I 11 NOP

I 12 J #800 Control hazard

I 13 BREAK

[20 points] For the following instructions:

Loop: L.D F0, 1024(R1) MUL.D F4, F2, F S.D F4, 1024(R2) DIV.D F5, F3, F ADD.D F5, F5, F S.D F5, 1024(R1) DSUBIU R2, R2, # DSUBIU R1, R1, # BNEZ R1, Loop

The second column in the following table indicates the number of cycles spent by the instruction (first column) in their respective functional units. Only exception is the load-store instructions where it spends one clock cycle in “Integer ALU” and three clock cycles in load/store (Memory) unit. The third column indicates the number of functional units for different types of instructions.

Instruction Cycle Number of unit Integer ALU 1 2 Load/Store 3 2 ADD.D 3 1 MUL.D 3 1 DIV.D 10 1

Assume that the reservation station and the reorder buffer both have infinite size. The integer ALUs are used for effective address calculation, ALU operations and branch condition evaluation. Assume that you can make at most two writes to CDB in one clock cycle. Complete the following two tables showing when each instruction issues, begins execution, accesses memory and writes its result to the CDB for the first two iterations for the following two scenarios using Tomasulo’s algorithm.

a) Use a MIPS pipeline with two-issue and without speculation. Assume that branches are issued alone (single-issue for that time step) and branch prediction is perfect. To answer this question, use the MIPS pipeline structure in Figure 2.9 in page 94 of the book with modified information based on the abovementioned table.

b) Use a MIPS pipeline with two-issue and with speculation. You also need to specify when each instruction commits. Moreover, assume that up to two instructions of any type can commit per cycle. Furthermore, assume that branches are issued alone (single-issue for that time step) and branch prediction is perfect. Note that stores will spend 3 cycles in the commit stage, because its memory access occurs during commit. To answer this question, use the MIPS pipeline structure in Figure 2.14 in page 107 of the book with modified information based on the abovementioned table.

c) In part b, stores access the memory only during the commit stage. Why is this important for Tomasulo’s algorithm with speculation?

Each wrong cycle calculation violating any structural /data dependences, - Any instruction issued in a wrong cycle, -2. In part a), any instruction executed before the execution of BNEZ in the first loop, - In part b), any instruction committed in a wrong cycle, -

c) This ensures the memory is not updated until the store instruction is no longer speculative. (Note that answers like “to keep loads/stores in order” or “to keep in order completion of instructions” are not accurate enough. The point here is that if some instruction is still speculative, we must be able to roll it back. )

[10 points]

a) Draw the state transition diagram of a 2-bit predictor that can “E-perfectly” predict the following branch sequence:

<NT, NT, NT, T>, <NT, NT, NT, T>, ….. (repeat the 4-element sequence indefinitely).

Recall that you have four possible states for any 2-bit predictor. The initial state of your predictor can be anyone of the four states. By “E-perfectly”, we mean that your predictor must be always correct after a finite number of wrong predictions, regardless of its initial state. (The correct solution is not unique.)

b) Is it possible to build a 2-bit predictor that can “E-perfectly” predict the following branch sequence?

<NT, NT, NT, NT, T>, <NT, NT, NT, NT, T>, ….. (repeat the 5-element sequence).

If possible, draw its state transition diagram. If not, why?

Solution: a)

b) No. It is not possible. If we have such a predictor, it cannot have more than 4 states. Assume after finite number of wrong predictions, it can predict the sequence of NT, NT, NT, NT, T correctly.

NT (00)

T (11)

T

T NT

NT (10)

NT (^) T

NT

T^ NT

NT (01)

Then after its correct prediction of the last NT, its state must go back to some previous state, within which it predicts that the branch will not be taken, because we only have 4 different states and all previous 4 predictions are correct. Therefore its next prediction will not be correct.

[10 points] In this problem, you are going to investigate the impact of the reorder buffer size on the performance of a processor implementing Tomasulo’s algorithm. Assume we have infinite functions units and reservation stations. The branch predictor works perfectly. Infinite number of instructions can be issued and committed per cycle. There are no data dependences in the flow of instruction. a) If our fetch unit can fetch up 8 instructions per cycle. Every instruction takes 1 cycle to execute. What is the average IPC for such a processor? b) Suppose that now every 100th^ instruction is a float division instruction which takes 200 cycles to complete. What is the average IPC? c) Consider the system in part b) except that we only have 30 entries in ROB. What is the average IPC for this processor? If we still want to keep an IPC of 8, what is the minimum size of the ROB? Solution: a) IPC is 8 b) IPC is still 8 c) If we only have 30 entries in ROB, the IPC will be 100/(200+(100-30)/8)=0.48. (The exact IPC may be different if you make some different assumptions about the architecture. But the result should be close to 0.5. The reason is that we just need a little more than 200 cycles to run every 100 instructions.)

To keep an IPC of 8 we need 200*8=1600 entries in ROB.

Pipeline Hazards and Tomasulo's Algorithm: Homework Solution - Prof. Prabhat Kumar Mishra, Assignments of Electrical and Electronics Engineering

Related documents

Partial preview of the text

Download Pipeline Hazards and Tomasulo's Algorithm: Homework Solution - Prof. Prabhat Kumar Mishra and more Assignments Electrical and Electronics Engineering in PDF only on Docsity!

Homework 2 Solution

CDA 5155: Fall 2008

Due Date: 10/16/2008 11:59 PM (UF EDGE Students: 10/23/2008 11:59 PM)

Brief Solution