Type: Theses

2025/2026

Uploaded on 11.12.2025

masoumeh-darabi

Computer Microarchitecture Practice Questions
COMP2300/ENGN2219/COMP6300
Total Points: 100

Important Instructions: (1) Write down the names and UIDs of each student in a group (if applicable) on the first page of your submission. (2) Submit the solution as a single PDF file.

Instructions for Q12: You can fill out the PowerPoint slide deck and convert it to a PDF document. You can then combine that PDF with a second PDF file containing your responses to all other questions. Alternatively, you can copy and paste the main structures from the PowerPoint slide deck into your Word document and then submit a single PDF file. You can also print out the slides, fill in the contents by hand, scan the document, and convert it to PDF.

Note about MIPS ISA: The practice questions in this homework assume the MIPS ISA. Please use your favorite search engine to learn about the differences between the ARM and MIPS ISAs. It will not take more than 10 minutes to understand MIPS if you know ARM. Once you understand one ISA, learning about another ISA should be straightforward. Perhaps all you need to know is that MIPS base plus offset addressing specifies the base register inside parentheses and the offset outside the parentheses.

(2 Points) Q1. Compilers impact the performance of applications in different ways. For a program, compiler X results in a dynamic instruction count of 1 billion instructions and an execution time of one second. A second compiler Y results in an execution time of 1.5 seconds and a dynamic instruction count of 1. billion instructions. For a processor with a clock cycle time of one nanosecond, find the average CPI for each of the two programs.

(8 Points) Q2. We are interested in adding a register-memory arithmetic instruction to the MIPS architecture. The new instruction reuses the I-format used by load, store, and branch instructions. The new instruction has the label ACCM and employs an unused opcode in the ISA (the exact opcode is irrelevant). The semantics of the ACCM instruction are shown below:

Instruction: ACCM Rt, Const(Rs)
Interpretation: Reg[Rt] = Reg[Rt] + Mem[Reg[Rs] + Const]

MIPS I-Format (shown for convenience)

A. Draw the datapath and control signals for a single-cycle implementation of the ACCM instruction. Your datapath should show the new components, control signals, multiplexers, and instruction labels. Your illustration must show every logic and memory element on the critical path.

B. Identify the critical path for the ACCM instruction. Write the equation (like the lecture slides) for the critical path. For example, use tALU and tMEM for the latency of the ALU and memory. List all your assumptions.
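The relationship Q1 relies on is cycles = execution time / clock cycle time, and average CPI = cycles / dynamic instruction count. A minimal Python sketch of that arithmetic follows. The function names are illustrative and not part of the assignment, and times are passed in whole nanoseconds as a convenience. Only compiler X is evaluated end to end, because the instruction count for compiler Y is incomplete in the text above.

```python
def cycles(exec_time_ns: int, cycle_time_ns: int) -> float:
    """Clock cycles elapsed = execution time / clock cycle time."""
    return exec_time_ns / cycle_time_ns


def average_cpi(exec_time_ns: int, cycle_time_ns: int, instr_count: int) -> float:
    """Average CPI = clock cycles / dynamic instruction count."""
    return cycles(exec_time_ns, cycle_time_ns) / instr_count


# Compiler X: 1 billion instructions, 1 s = 1e9 ns, 1 ns clock cycle.
cpi_x = average_cpi(1_000_000_000, 1, 1_000_000_000)

# Compiler Y: 1.5 s = 1.5e9 ns. Only the cycle count is computed here;
# dividing by Y's instruction count gives its CPI in the same way.
cycles_y = cycles(1_500_000_000, 1)

print(cpi_x, cycles_y)
```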

(10 Points) Q4. Consider an analytics application running on top of the MIPS processor. A fraction of the instructions in this application exposes a specific type of RAW hazard. We identify the type of RAW hazard by the stage that produces the result (EX or MEM) and the instruction that consumes the result (the 1st following instruction, the 2nd following instruction, or both). The type of RAW hazard and the fraction of instructions are shown in the table below. Answer the questions below with the following assumptions: (1) a register write happens in the first half of the clock cycle and a register read happens in the second half, (2) the CPI of the processor is one if there are no data hazards. Assume stores are never followed by loads. All other hazards can be resolved by other tricks (RF read/write policy).

A. What fraction of the cycles does the pipeline stall with no forwarding?

B. What fraction of the cycles does the pipeline stall with full forwarding?

C. What is the speedup with full forwarding versus no forwarding? Note: Speedup is defined as the ratio of execution times with and without an optimization.

D. To avoid the complexity of large-input multiplexers, we need to decide if it is better to forward only from the EX/MEM pipeline register or only from the MEM/WB pipeline register. Which option would you choose to minimize data stall cycles? (Show your calculation.)
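The calculation pattern behind parts A to C of Q4 can be sketched in Python. The table of hazard fractions is not reproduced in this text, so the fractions below are made-up placeholders, not the assignment's data. The stall counts are also an assumption: they are the usual values for a classic five-stage pipeline under assumption (1), where a dependent instruction must reach ID no earlier than the producer's WB cycle.

```python
from fractions import Fraction

# Stall cycles per hazard, keyed by (producing stage, consuming instruction).
# ASSUMPTION: classic 5-stage pipeline, RF written in the first half-cycle
# and read in the second half-cycle.
STALLS_NO_FORWARDING = {
    ("EX", "1st"): 2, ("EX", "2nd"): 1, ("EX", "both"): 2,
    ("MEM", "1st"): 2, ("MEM", "2nd"): 1, ("MEM", "both"): 2,
}
# With full forwarding, only a MEM-stage result needed by the very next
# instruction (load-use) still costs one stall cycle.
STALLS_FULL_FORWARDING = {
    ("EX", "1st"): 0, ("EX", "2nd"): 0, ("EX", "both"): 0,
    ("MEM", "1st"): 1, ("MEM", "2nd"): 0, ("MEM", "both"): 1,
}


def cpi(hazard_fractions, stalls):
    """Base CPI of 1 plus the average stall cycles per instruction."""
    return 1 + sum(frac * stalls[kind] for kind, frac in hazard_fractions.items())


def stall_fraction(hazard_fractions, stalls):
    """Fraction of all cycles that are stall cycles."""
    total = cpi(hazard_fractions, stalls)
    return (total - 1) / total


# HYPOTHETICAL fractions, for illustration only.
example = {
    ("EX", "1st"): Fraction(5, 100),
    ("MEM", "1st"): Fraction(20, 100),
    ("EX", "2nd"): Fraction(5, 100),
    ("MEM", "2nd"): Fraction(10, 100),
    ("EX", "both"): Fraction(10, 100),
}

no_fwd = stall_fraction(example, STALLS_NO_FORWARDING)
full_fwd = stall_fraction(example, STALLS_FULL_FORWARDING)
speedup = cpi(example, STALLS_NO_FORWARDING) / cpi(example, STALLS_FULL_FORWARDING)
print(no_fwd, full_fwd, speedup)
```

Exact fractions are used so that the results can be compared against hand calculations without rounding noise. Part D follows the same pattern with a third stall table for each restricted forwarding option.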

(4 points, 2, 2) Q5. Find the longest chain of dependent instructions in the following code sequence. If maximizing IPC is the goal, should a microarchitect consider a stall-on-use in-order pipeline over a stall-on-miss in-order pipeline?

     name  dst  src1  src2
i1:  add   r1   r1    r
i2:  add   r1   r1    r
i3:  sub   r1   r1    r
i4:  load  r5   #0    r
i5:  load  r7   #0    r
i6:  add   r9   r5    r

(12 Points) Q6. Assume that a branch has the following sequence of taken (T) and not-taken (N) outcomes: T,T,T,N,N,T,T,T,N,N,T,T,T,N,N

A. What is the prediction accuracy for a 2-bit counter (Smith predictor) for this sequence, assuming the initial state is strongly taken?

B. What is the minimum local history length needed to achieve perfect branch prediction for this branch outcome sequence?

C. Draw the corresponding PHT and fill in each entry with one of T (predict taken), N (predict not taken), or X (does not matter).
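The experiments in Q6 can be replayed with a short Python simulation. This is a sketch under stated assumptions, not the official solution: the 2-bit counter is modeled as a saturating counter with states 0 to 3 that predicts taken when the state is 2 or 3, and the history search ignores the warm-up predictions made before the first full history is available. All function names are illustrative. PHT indices are written with T as 1 and N as 0, oldest outcome first.

```python
from itertools import product


def two_bit_counter_accuracy(outcomes, state=3):
    """Replay outcomes through a saturating 2-bit counter (3 = strongly taken)."""
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        correct += predicted_taken == taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct, len(outcomes)


def history_table(outcomes, length):
    """Map each observed history to its next outcome; None if any history is ambiguous."""
    table = {}
    for i in range(length, len(outcomes)):
        key = "".join("1" if t else "0" for t in outcomes[i - length:i])
        if table.setdefault(key, outcomes[i]) != outcomes[i]:
            return None
    return table


def min_local_history_length(outcomes, max_length=8):
    """Smallest history length for which every history has a single next outcome."""
    for length in range(1, max_length + 1):
        table = history_table(outcomes, length)
        if table is not None:
            return length, table
    raise ValueError("no unambiguous history length found")


def format_pht(table, length):
    """Fill every PHT index with T, N, or X (history never observed)."""
    names = {True: "T", False: "N"}
    return {
        "".join(bits): names.get(table.get("".join(bits)), "X")
        for bits in product("01", repeat=length)
    }


outcomes = [c == "T" for c in "TTTNNTTTNNTTTNN"]
correct, total = two_bit_counter_accuracy(outcomes)
length, table = min_local_history_length(outcomes)
pht = format_pht(table, length)
print(f"2-bit counter: {correct}/{total} correct")
print(f"history length: {length}")
for index, prediction in pht.items():
    print(f"{index}: {prediction}")
```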

(6 points) Q9. In this question, consider an out-of-order pipeline with an architectural register file (ARF) and a reorder buffer (ROB). The ROB has 32 entries. The tail currently points at the eighth entry of the ROB (rob7). The head of the ROB is stalled for an additional 100 cycles. The state of the ARF, the rename map table (RMT), and the ROB are shown below. Rename the destination and source register specifiers in the instruction sequence below. Identify the dependences in the original and the renamed sequence. Draw the state of the RMT after the instruction sequence is renamed.
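The renaming step that Q9 asks for can be sketched in Python. The ARF, RMT, and ROB tables and the instruction sequence are not reproduced in this text, so the program and the initial RMT below are hypothetical. The sketch also assumes every instruction has a destination register and that the RMT maps a register name to a ROB tag only while its latest value lives in the ROB.

```python
ROB_SIZE = 32  # the question states a 32-entry ROB


def rename(instructions, rmt, tail):
    """Rename (op, dst, src...) tuples; return renamed list, new RMT, new tail."""
    rmt = dict(rmt)  # work on a copy so the caller's table is untouched
    renamed = []
    for op, dst, *srcs in instructions:
        # Sources are looked up BEFORE the destination mapping is updated,
        # so an instruction like "add r1, r1, r2" reads the old r1.
        new_srcs = [rmt.get(src, src) for src in srcs]
        tag = f"rob{tail}"
        rmt[dst] = tag                  # newest producer of dst
        tail = (tail + 1) % ROB_SIZE    # the ROB is a circular buffer
        renamed.append((op, tag, *new_srcs))
    return renamed, rmt, tail


# HYPOTHETICAL input: r2 is already mapped to rob3, and the tail is at rob7.
program = [
    ("add", "r1", "r2", "r3"),
    ("sub", "r4", "r1", "r2"),
    ("add", "r1", "r4", "r3"),
]
renamed, final_rmt, new_tail = rename(program, {"r2": "rob3"}, tail=7)
for op, dst, *srcs in renamed:
    print(op, dst, *srcs)
print(final_rmt)
```

In this example the two writes to r1 receive different tags (rob7 and rob9), which removes the WAW hazard between them, while the RAW chain through rob7 and rob8 is preserved.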

(8 points, 2.5, 2.5) Q10. Briefly explain how we can add the following features to the CDC 6600 scoreboard: (1) register renaming, (2) hardware speculation. Start with the scoreboard design as we studied it in the lectures and briefly explain the steps required to add the two features.

(8 points, 2, 2, 2) Q11. The complexity of the processor pipelines we have encountered in the lectures varies. We rank three different pipelines in order of increasing complexity as follows: (1) stall-on-miss (simple), (2) stall-on-use (moderately complex), (3) ARF+ROB (very complex). For each of the following scenarios, pick the simplest pipeline that would likely deliver the highest IPC. The in-order pipelines do not use branch prediction. The OOO pipeline uses a simple one-bit branch predictor.

  1. Scenario 1: Frequent RAW hazards, infrequent branches, negligible WAR/WAW hazards, infrequent memory operations
  2. Scenario 2: Infrequent RAW hazards, frequent hard-to-predict branches, frequent independent memory operations, frequency of WAR/WAW hazards is unknown
  3. Scenario 3: Same as scenario 2, but easy to predict branches, and the frequency of WAR/WAW hazards is known to be very high

Solutions

Q1: Cycles(X)=1e+9, Cycles(Y)=1.5e+9, CPI(X)=1, CPI(Y)=1.

Q2: One approach: (1) There is an extra ALU in the datapath. (2) The second ALU is fed data from either data memory or the RF. (3) A multiplexer in front of the second ALU chooses an operand from either memory (ACCM) or a register (other instructions). (Note: There are other ways of doing this, but a second ALU and a multiplexer are unavoidable. There are ways to keep the extra multiplexer off the critical path, but most correct solutions report the following equation for the critical path.)

The critical path is: Tpc + 2*Tmem + Trf-read + 2*Talu + 2*Tmux + Trf-setup

Q3: (A) 2 nops b/w i1 and i2, 1 nop b/w i3 and i4, 2 nops b/w i4 and i

(B) Code executes correctly. There is no load-use hazard.

(C) We need to detect a dependency on the first or second youngest instruction. The instruction that is currently in the ID stage needs to be stalled if it depends on a value produced by the instruction in the EX stage or the instruction in the MEM stage, so we need to check the destination register of these two instructions. For the instruction in the EX stage, we need to check Rd for R-type instructions and Rt for loads. For the instruction in the MEM stage, the destination register is already selected (by the mux in the EX stage), so we need to check that register number (this is the bottommost output of the EX/MEM pipeline register).

First youngest: ID/EX.RegWrite and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))

Second youngest: EX/MEM.RegWrite and ((EX/MEM.RegisterRt = IF/ID.RegisterRs) or (EX/MEM.RegisterRt = IF/ID.RegisterRt))
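The two stall conditions in Q3 (C) translate directly into boolean logic. The Python sketch below mirrors them field for field. The class and function names are illustrative, each pipeline register is reduced to the fields the conditions mention, and refinements such as ignoring writes to register 0 are not modeled.

```python
from dataclasses import dataclass


@dataclass
class IfId:
    """Source register numbers of the instruction currently in ID."""
    register_rs: int
    register_rt: int


@dataclass
class Producer:
    """Fields of ID/EX or EX/MEM used by the stall conditions."""
    reg_write: bool
    register_rt: int


def depends_on(producer: Producer, if_id: IfId) -> bool:
    """RegWrite and (Rt = IF/ID.Rs or Rt = IF/ID.Rt)."""
    return producer.reg_write and producer.register_rt in (
        if_id.register_rs,
        if_id.register_rt,
    )


def must_stall(if_id: IfId, id_ex: Producer, ex_mem: Producer) -> bool:
    """Stall ID on a dependence on the first or second youngest instruction."""
    return depends_on(id_ex, if_id) or depends_on(ex_mem, if_id)


# Example: the instruction in ID reads r5 and r6; the one in EX writes r5.
consumer = IfId(register_rs=5, register_rt=6)
print(must_stall(consumer, Producer(True, 5), Producer(False, 0)))
print(must_stall(consumer, Producer(False, 5), Producer(True, 7)))
```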

011: T
100: T
101: X
110: N
111: N

Q7: A: The values of destination registers reside either in the ARF or in the ROB. If an instruction in the rename/register-read stage does not capture the values from the ROB, then the value may move to the ARF by the time the instruction is ready to execute. This is why the register-read stage must come before the issue stage in ARF+ROB.

B: The issue stage enables dynamic scheduling of instructions out of the original program order. Instructions enter the issue queue in program order (dispatch) and are selected by the dynamic scheduler (issue) for execution out of order.

Q8: (1) no hazards (2) RAW only (3) All hazards are possible (need renaming to avoid WAR and WAW)

i1->i2: RAW (r3)
i1->i3: RAW (r3)
i2->i3: RAW (r4)
i1->i2: WAR (r4)
i2->i3: WAR (r3)
i1->i3: WAW (r3)

Q9: No false (WAR/WAW) dependences in the renamed sequence (the purpose of renaming); the RAW dependences remain.

Dependences in the original sequence:

i1->i2: RAW (r5)
i1->i4: RAW (r5)
i1->i5: RAW (r5)
i1->i3: WAR (r2)
i2->i4: WAW (r3)
i2->i4: RAW (r3)
i4->i5: RAW (r3)

Renamed sequence:

add rob7, rob3, rob
lw rob8, 4(rob7)
lw rob9, 0(rob3)
or rob10, rob7, rob
sw rob10, 0(rob7)

RMT:

r
r1 1 rob
r2 1 rob
r3 1 rob
r
r5 1 rob

Q10: Register renaming: (1) Add an RMT, (2) Add an extended set of registers (e.g., via the ROB). Hardware speculation: Use a ROB.

Q11: (1) Stall-on-miss (2) Stall-on-use (3) ARF+ROB

Q12: See next few pages.