Electrical and Computer Engineering - Quiz 2 | ECE 511, Quizzes of Computer Architecture and Organization

Material Type: Quiz; Class: Computer Architecture; Subject: Electrical and Computer Engr; University: University of Illinois - Urbana-Champaign; Term: Fall 2004;

Uploaded on 03/10/2009

University of Illinois at Urbana-Champaign
Department of Electrical and Computer Engineering
Quiz 2
ECE 511, Fall 2004
You have until 6pm Saturday, December 18 to complete this quiz. By turning in this quiz, you
will have attested that you have neither received nor given inappropriate aid on this quiz from any
person except the instructor. You may use class notes, textbooks, web resources, computer
simulations, and published materials.
NAME:

1a. (10 pts) The MIPS R10000 renaming scheme that we discussed in class has a problem. As you discovered on homework 3, we often discover a misprediction for some branch deep in the reorder buffer, but because of the way we cancel instructions, we don’t tell the front end of the pipeline (the address generator and renamer) about the misprediction until the branch in question retires. Thus, as we make the reorder buffer longer, performance can actually decrease. The Pentium 4 attempts to solve this problem by telling the address generator about the misprediction as soon as the mispredicted branch completes. Unfortunately, on the Pentium 4 the renamer still cannot start renaming down the correct path until the mispredicted branch retires. Explain why. (Give an example instruction sequence where the rename Register Alias Table (RAT) will be in an inconsistent state until the mispredicted branch retires.)

1b. (10 pts) The Alpha 21264 renamer can start renaming down the correct path as soon as a mispredicted branch completes (rather than waiting for the branch to retire) because in the 21264 every branch is given a copy of the register alias table. This means that instead of just two register alias tables (one representing the speculative state in the renamer and one representing the architectural state at retirement), the 21264 contains many register alias tables. As each branch passes the renamer, it receives a copy of the speculative RAT at that point. If a branch completes and is mispredicted, the address generator is signaled to fetch down the now correctly determined path, and the copy of the rename table from the mispredicted branch is copied over the rename RAT so that the renamer can go ahead and consistently rename the correct-path instructions.

Suppose that the reorder buffer on the 21264 has 128 entries and that extensive measurements have shown that in the Alpha ISA each basic block has an average length of 8 instructions. Thus you expect, on average, 16 branches to be in the reorder buffer if it is full. You also know that the branch predictor on the 21264 is better than 95% accurate on average. Suppose you also know that the scoreboard has 16 entries, and that extensive measurements have shown that there are rarely more than 4 branches that have dispatched but not yet completed. How many register alias tables would you suggest the designers of the 21264 implement? (Please explain your answer; don’t just give a number. Think about the “lifetime” of a register alias table copy. Each copy gets made as a branch passes through the renamer. What’s the earliest time at which it is no longer useful to hold a register alias table copy? Do we have to wait until the branch retires, or can we reuse the register alias table sooner?)
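
The checkpoint mechanism described in 1b can be sketched in C as a minimal model: copy the speculative RAT when a branch passes the renamer, and copy it back on a misprediction. This is an illustrative sketch only, not the 21264's actual implementation. The names `Rat`, `Renamer`, `rename_dest`, `checkpoint_take`, `checkpoint_restore`, and `checkpoint_release` are hypothetical, the register count is assumed, and `NUM_CHECKPOINTS` is an arbitrary placeholder rather than a suggested answer. The sketch also omits freeing the checkpoints of younger branches after a restore.

```c
#include <assert.h>
#include <string.h>

#define NUM_ARCH_REGS   32  /* assumed architectural register count */
#define NUM_CHECKPOINTS 8   /* arbitrary placeholder, NOT an answer to 1b */

/* One register alias table: architectural register -> physical register. */
typedef struct {
    int map[NUM_ARCH_REGS];
} Rat;

typedef struct {
    Rat spec;                         /* speculative RAT used by the renamer */
    Rat checkpoint[NUM_CHECKPOINTS];  /* per-branch copies of the RAT */
    int in_use[NUM_CHECKPOINTS];      /* 1 while a branch owns the slot */
} Renamer;

/* Start with an identity mapping and every checkpoint slot free. */
void renamer_init(Renamer *r) {
    memset(r, 0, sizeof *r);
    for (int i = 0; i < NUM_ARCH_REGS; i++)
        r->spec.map[i] = i;
}

/* Rename a destination: point register `arch` at physical register `phys`. */
void rename_dest(Renamer *r, int arch, int phys) {
    assert(arch >= 0 && arch < NUM_ARCH_REGS);
    r->spec.map[arch] = phys;
}

/* A branch passes the renamer: copy the speculative RAT into a free slot.
   Returns the slot index, or -1 if none is free (the renamer would stall). */
int checkpoint_take(Renamer *r) {
    for (int s = 0; s < NUM_CHECKPOINTS; s++) {
        if (!r->in_use[s]) {
            r->checkpoint[s] = r->spec;  /* struct assignment copies the map */
            r->in_use[s] = 1;
            return s;
        }
    }
    return -1;
}

/* The branch owning `slot` completed and was mispredicted: copy its
   checkpoint back over the speculative RAT. */
void checkpoint_restore(Renamer *r, int slot) {
    assert(slot >= 0 && slot < NUM_CHECKPOINTS && r->in_use[slot]);
    r->spec = r->checkpoint[slot];
}

/* Hand the slot back for reuse. Deciding how early this is safe is
   exactly what the question asks. */
void checkpoint_release(Renamer *r, int slot) {
    assert(slot >= 0 && slot < NUM_CHECKPOINTS);
    r->in_use[slot] = 0;
}
```

In this model, `checkpoint_take` returning -1 corresponds to the renamer stalling because every copy is held, which is the cost to weigh against the hardware needed for more copies.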

2. (15 pts.) Suppose you have a dynamically optimizing processor with a frame cache. The following sequence of instructions is in the frame cache. Determine the set of instructions that can be eliminated from the frame by the dynamic optimizer. For each instruction that you eliminate, explain why it is legal to eliminate it.

    [beginning of frame]
    Ra <- Rx + Ry
    Rb <- Ra + Rz
    Rc <- Rc + Rb
    Rd <- Rc + Rw
    Ri <- Ri + 1
    Assert Ri < Rn
    Ra <- Rx + Ry
    Rb <- Ra + Rz
    Rc <- Rc + Rb
    Rd <- Rc + Rw
    Ri <- Ri + 1
    Assert Ri < Rn
    Ra <- Rx + Ry
    Rb <- Ra + Rz
    Rc <- Rc + Rb
    Rd <- Rc + Rw
    Ri <- Ri + 1
    Branch Ri < Rn target
    [end of frame]
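
One way to sanity-check a proposed set of eliminations is to run the original frame and a candidate frame on the same entry state and compare the results. The C sketch below models only the original frame. It assumes that a firing assertion aborts the frame and rolls the registers back to their entry values, and that every register is live on exit; both are assumptions, since the question does not state them. The names `Regs`, `run_original_frame`, `FrameFn`, and `frames_agree` are hypothetical, and arithmetic overflow is ignored.

```c
/* Register names taken from the frame in the question. */
typedef struct {
    long a, b, c, d, i, n, w, x, y, z;
} Regs;

/* One copy of the five-instruction body that precedes each Assert/Branch. */
static void frame_body(Regs *r) {
    r->a = r->x + r->y;
    r->b = r->a + r->z;
    r->c = r->c + r->b;
    r->d = r->c + r->w;
    r->i = r->i + 1;
}

/* Runs the original frame. Returns 1 if it completes, storing the final
   branch outcome in *taken. Returns 0 if an assertion fires, in which case
   the registers are rolled back to their entry values (assumed semantics). */
int run_original_frame(Regs *r, int *taken) {
    Regs entry = *r;
    for (int k = 0; k < 2; k++) {
        frame_body(r);
        if (!(r->i < r->n)) {   /* Assert Ri < Rn fires */
            *r = entry;
            return 0;
        }
    }
    frame_body(r);
    *taken = (r->i < r->n);     /* Branch Ri < Rn target */
    return 1;
}

typedef int (*FrameFn)(Regs *, int *);

/* Returns 1 if `candidate` matches the original frame on this entry state:
   same completion status, same branch outcome, same final registers.
   Comparing every register is conservative (assumes all are live-out). */
int frames_agree(FrameFn candidate, Regs entry) {
    Regs r1 = entry, r2 = entry;
    int t1 = 0, t2 = 0;
    int ok1 = run_original_frame(&r1, &t1);
    int ok2 = candidate(&r2, &t2);
    if (ok1 != ok2) return 0;
    if (ok1 && t1 != t2) return 0;
    return r1.a == r2.a && r1.b == r2.b && r1.c == r2.c && r1.d == r2.d &&
           r1.i == r2.i && r1.n == r2.n && r1.w == r2.w && r1.x == r2.x &&
           r1.y == r2.y && r1.z == r2.z;
}
```

A candidate optimized frame would be written as another function with the `FrameFn` signature and passed to `frames_agree` for a range of entry states. Agreement on test inputs supports an elimination but does not prove it legal; the written justification is still required.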
3. (15 pts.) You are profiling the newly released SPEC 2004 benchmark suite on your Pentium 4-based Linux workstation, and discover that one of the new benchmarks spends 80% of its execution time in a procedure called “sum_array” (shown below):

    int sum_array(int *a, int length) {
        int j;
        int sum = 0;
        for (j = 0; j < length; j++) {
            sum += a[j];
        }
        return sum;
    }

Your lab partner suggests that the routine could be rewritten to get 4 times better ILP as follows:

    int sum_array(int *a, int length) {
        int j;
        int sum0 = 0;
        int sum1 = 0;
        int sum2 = 0;
        int sum3 = 0;
        for (j = 0; (j+3) < length; j+=4) {
            sum0 += a[j];
            sum1 += a[j+1];
            sum2 += a[j+2];
            sum3 += a[j+3];
        }
        while (j < length) {
            sum0 += a[j];
            j++;
        }
        return sum0 + sum1 + sum2 + sum3;
    }

Under what circumstances is this idea going to succeed or fail?
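
Before reasoning about performance, it can help to confirm that the two versions return the same sum. The C sketch below places both routines side by side under the hypothetical names `sum_array_simple` and `sum_array_unrolled` (the quiz calls both `sum_array`) and adds a hypothetical helper, `sums_match`, that compares them. It checks correctness only and says nothing about ILP. It also assumes inputs whose partial sums stay within the range of `int`, since signed overflow is undefined behavior in C.

```c
/* Original routine from the question: one accumulator. */
int sum_array_simple(const int *a, int length) {
    int sum = 0;
    for (int j = 0; j < length; j++) {
        sum += a[j];
    }
    return sum;
}

/* Lab partner's rewrite: four independent accumulators, then a cleanup
   loop for the leftover 0 to 3 elements. */
int sum_array_unrolled(const int *a, int length) {
    int j;
    int sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
    for (j = 0; (j + 3) < length; j += 4) {
        sum0 += a[j];
        sum1 += a[j + 1];
        sum2 += a[j + 2];
        sum3 += a[j + 3];
    }
    while (j < length) {
        sum0 += a[j];
        j++;
    }
    return sum0 + sum1 + sum2 + sum3;
}

/* Returns 1 if both versions agree on the first `length` elements of `a`. */
int sums_match(const int *a, int length) {
    return sum_array_simple(a, length) == sum_array_unrolled(a, length);
}
```

Lengths that are not multiples of 4, and lengths below 4, are worth including in any check because they exercise the cleanup loop.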