ECE 412 Exam 2 Study Questions - Fall 2003 - Prof. Rakesh Kumar, Study notes of Computer Architecture and Organization

Study questions for Exam 2 of the ECE 412 course, covering simulators, register renaming, loop unrolling, dynamic scheduling, cache coherence, and parallel processing. The questions address out-of-order execution, control and data dependence speculation, and cache invalidations.

Uploaded on 03/09/2009

ECE 412 Exam 2 Study Questions
Fall 2003
1. The first version of the ECE412 course simulator issued instructions in order and
had no store buffer. When the course staff first got out-of-order issue working,
c.p.i. increased rather than decreased. Then we implemented the store buffer and
c.p.i. improved substantially. Explain.
2. There are two very strong reasons why the register renaming process gets worse
as the width of the machine is increased. State both.
3. Is loop unrolling likely to have a positive or negative effect on correlated branch
predictors?
4. What is the dynamic scheduling analog of the EPIC concept of control
speculation?
5. What is the dynamic scheduling analog of the EPIC concept of data dependence
speculation?
6. Consider the following C code running on two processors of a cache coherent
shared memory multiprocessor:
Processor A:                  Processor B:
for (i=0; i < 5; i++)         for (j=0; j < 5; j++)
    x++;                          x++;
a. Assume x is initialized to 0. What are the possible values x can take after
both processors are done?
b. If the multiprocessor uses the Illinois protocol for cache coherence, what are
the minimum and maximum numbers of cache invalidations that can occur when
the code is run?
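
For question 6a, one way to explore the space of outcomes is to model each x++ as a separate load and store and then enumerate every interleaving of the two processors. The sketch below is only an illustration and not part of the original exam. It assumes that x++ is not atomic, that memory is sequentially consistent, and that each processor keeps the loaded value in a private register; the names explore and possible_final_values are hypothetical helpers.

```c
#include <stdbool.h>
#include <string.h>

#define ITERS 5                 /* loop trip count on each processor */
#define STEPS (2 * ITERS)       /* each x++ is modeled as a load, then a store */
#define MAXV  (2 * ITERS + 1)   /* x and both registers stay within 0..2*ITERS */

/* Visited set over (pcA, pcB, regA, regB, x) so each state is expanded once. */
static bool seen[STEPS + 1][STEPS + 1][MAXV][MAXV][MAXV];
static unsigned final_mask;     /* bit v is set if x == v is a reachable final value */

/* Depth-first search over all interleavings of the two load/store streams. */
static void explore(int pa, int pb, int ra, int rb, int x)
{
    if (seen[pa][pb][ra][rb][x])
        return;
    seen[pa][pb][ra][rb][x] = true;

    if (pa == STEPS && pb == STEPS) {   /* both processors are done */
        final_mask |= 1u << x;
        return;
    }
    if (pa < STEPS) {
        if (pa % 2 == 0)
            explore(pa + 1, pb, x, rb, x);        /* A loads x into its register */
        else
            explore(pa + 1, pb, ra, rb, ra + 1);  /* A stores register + 1 to x */
    }
    if (pb < STEPS) {
        if (pb % 2 == 0)
            explore(pa, pb + 1, ra, x, x);        /* B loads x into its register */
        else
            explore(pa, pb + 1, ra, rb, rb + 1);  /* B stores register + 1 to x */
    }
}

/* Returns a bitmask of every final value of x reachable under this model. */
unsigned possible_final_values(void)
{
    memset(seen, 0, sizeof seen);
    final_mask = 0;
    explore(0, 0, 0, 0, 0);
    return final_mask;
}
```

Calling possible_final_values() and printing the positions of the set bits lists the reachable final values under these assumptions. Trying other small values of ITERS shows how the range of outcomes depends on the loop count.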
7. Consider the following two (equivalent) C programs. Under what circumstances
would you expect the second version to achieve better c.p.i. on a high
throughput out-of-order superscalar processor?
First version:
    double a[n];
    double x[n+1];
    int j;
    x[0] = 0.0;
    for (j = 0; j < n; j++) {
        a[j] = a[j] * pi;
        x[j+1] = x[j]*7.0 + a[j];
    }
Second version:
    double a[n];
    double x[n+1];
    int j;
    for (j = 0; j < n; j++) {
        a[j] = a[j] * pi;
    }
    x[0] = 0.0;
    for (j = 0; j < n; j++) {
        x[j+1] = x[j]*7.0 + a[j];
    }
8. Consider the following C code:
    double total[n];
    double a[n][n];
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            total[i] = total[i] + a[i][j];
        }
    }
Write an equivalent program that will get higher throughput when run on a
wide-issue machine.
9. Suppose we have four processors A, B, C and D, connected with a bi-directional
ring network. (By a bi-directional ring we mean that there are unidirectional
links A->B->C->D->A and A->D->C->B->A.) Give a routing rule for each of the
12 possible message source/destination pairs (A->B, A->C, A->D, B->C, B->D,
B->A, C->D, C->A, C->B, D->A, D->B, D->C) that guarantees that the
interconnect is deadlock free.
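
For question 9, a candidate routing rule can be checked with the usual channel-dependency argument: with deterministic routing, the network is deadlock free if no cycle exists among "holds link X while requesting link Y" dependencies. The sketch below is an illustration under stated assumptions, not the exam's intended solution. It assumes neighbors are always reached over the direct link (one hop creates no dependency), so only the four two-hop pairs need a direction choice. The function name ring_rule_is_deadlock_free and the channel numbering are hypothetical.

```c
#include <stdbool.h>

#define N  4          /* nodes A..D are numbered 0..3 */
#define CH (2 * N)    /* channel i:     clockwise link         i -> (i+1) % N
                         channel N + i: counter-clockwise link i -> (i-1+N) % N */

/*
 * cw[s] chooses the direction for the two-hop message from node s to node
 * (s+2) % N: true = clockwise, false = counter-clockwise.
 * Returns true if the resulting channel dependency graph has no cycle.
 */
bool ring_rule_is_deadlock_free(const bool cw[N])
{
    bool dep[CH][CH] = {{false}};

    /* Record "holds first link, requests second link" for each two-hop route. */
    for (int s = 0; s < N; s++) {
        if (cw[s])
            dep[s][(s + 1) % N] = true;
        else
            dep[N + s][N + (s + N - 1) % N] = true;
    }

    /* Transitive closure (Warshall): dep[i][j] becomes "i can wait on j". */
    for (int k = 0; k < CH; k++)
        for (int i = 0; i < CH; i++)
            for (int j = 0; j < CH; j++)
                if (dep[i][k] && dep[k][j])
                    dep[i][j] = true;

    /* A channel that can transitively wait on itself lies on a cycle. */
    for (int i = 0; i < CH; i++)
        if (dep[i][i])
            return false;
    return true;
}
```

As a usage example, passing {true, true, true, true} (every two-hop message travels clockwise) makes the four clockwise links wait on one another in a cycle, so the function returns false. Passing {true, true, false, false} (A->C and B->D clockwise, C->A and D->C->B counter-clockwise) leaves both rings acyclic, so it returns true.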