CS/ECE 752 Spring 2014 Exam 1: Computer Architecture and Organization, Study notes of Computer Architecture and Organization

University of Massachusetts - Dartmouth Computer Architecture and Organization

University of Wisconsin - Madison. CS/ECE 752 Advanced Computer Architecture I. Midterm Exam 1. Monday, February 17, 2014. Instructions:.

Typology: Study notes

2022/2023

Uploaded on 05/11/2023

leonpan 🇺🇸

CS/ECE 752 Spring 2014 Exam 1 -- Page 1

Last (family) name: _________________________

First (given) name: _________________________

Student I.D. #: _____________________________

Department of Computer Sciences

University of Wisconsin - Madison

CS/ECE 752 Advanced Computer Architecture I

Midterm Exam 1

Monday, February 17, 2014

Instructions:

1. Open book/open notes.

2. The exam is multiple choice and will be graded using a separate automatically read grading sheet. Please write your name, ID number, and answers on the grading sheet. Be sure to fill in the bubbles fully for each question.

3. Upon announcement of the end of the exam, stop writing on the exam paper immediately. Pass the exam to aisles to be picked up by the proctors. The instructor will announce when to leave the room.

4. Failure to follow instructions may result in forfeiture of your exam and will be handled according to UWS 14 Academic misconduct procedures.

Problem   Type                                        Points   Score
1-15      Multiple Choice                             30
16-18     Hierarchical Branch Predictor Performance   15
19-23     Program Data Dependence Analysis            15
24-27     Instruction Scheduling                      20
28-37     From the Readings                           20
Total                                                 100

Problems 1-20 (40 pts): Multiple choice; select the best answer for each question.

Note: Some of the answers below are “list” answers, such as “All of the above” and “Both (a) and (b).” In these answers, the word “above” refers only to “real” answers with specific content, not other list answers.

A pipelined processor that does not have any WAR register hazards
(a) Must have an earlier register read stage and a later register write stage
(b) Must have an earlier register write stage and a later register read stage
(c) Must have two stages that write registers, one later than the other
(d) None of the above

A VLIW instruction set processor
(a) Packs multiple operations into a single instruction
(b) Usually relies on software to resolve pipeline hazards
(c) Can operate at much higher frequency than other approaches
(d) Exposes a lot more instruction-level parallelism than other approaches
(e) Both (a) and (b)
(f) None of the above

Local branch history in a dynamic branch predictor is used to:
(a) Predict a branch based on how neighboring branches were resolved recently
(b) Predict a branch based on how that same branch resolved recently
(c) Predict a branch based on the sign bit of its offset field
(d) Predict a branch based on a profiling run collected with a representative input set

A branch that is mispredicted as not-taken requires the processor control logic to:
(a) Restart fetching instructions from the not-taken path
(b) Clear out all instructions that were not tagged with the mispredicted branch’s tag
(c) Fix up the rename table to correspond to the state following the branch
(d) None of the above
(e) Both (b) and (c)

A two-level dynamic branch predictor:
(a) Uses a second-level branch history table to capture the branch working sets of very large programs that contain thousands of branches
(b) Uses two levels of branch confidence to identify easy-to-predict and hard-to-predict branches
(c) Learns more than one possible prediction for a static branch by using branch outcome history as part of the lookup index into the pattern history table
(d) Accurately predicts exits from loops that have iteration counts in the hundreds
(e) None of the above

The return address stack (RAS):
(a) Pushes the return address whenever a call instruction is encountered
(b) Pops a return address for every return instruction
(c) Raises an exception when the stack overflows or underflows
(d) None of the above
(e) Both (a) and (b)
(f) All of the above

Clustering an N-wide superscalar typically:
(a) Increases CPI and increases cycle time
(b) Increases CPI and decreases cycle time
(c) Decreases CPI and increases cycle time
(d) Decreases CPI and decreases cycle time

Dynamic power has a
(a) Linear relationship with voltage
(b) Quadratic relationship with voltage
(c) Cubic relationship with voltage
(d) None of the above

Reducing clock frequency
(a) Reduces dynamic power
(b) Reduces static power
(c) Reduces performance
(d) All of the above
(e) Both (a) and (b)
(f) Both (a) and (c)
(g) None of the above
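
As a reminder for the two power questions, the usual first-order CMOS power model can be written as below. The symbols (activity factor $\alpha$, switched capacitance $C$, supply voltage $V_{dd}$, clock frequency $f$, leakage current $I_{leak}$) are standard textbook notation assumed here, not notation taken from the exam.

```latex
P_{\text{dynamic}} \approx \alpha \, C \, V_{dd}^{2} \, f
\qquad
P_{\text{static}} \approx V_{dd} \, I_{leak}
```

In this simplified model the clock frequency appears only in the dynamic term.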

Hierarchical Branch Predictors (15 points)

Some highly-pipelined processors use a hierarchical branch prediction scheme, similar to how most modern processors now use hierarchical caches. These systems typically have a small, simple level-1 (L1) predictor (e.g., a branch target buffer) that can return a prediction within a single cycle. The level-2 (L2) predictor is typically a much larger multilevel predictor (such as the Alpha EV-8 direction predictor) that makes a much more accurate prediction, but requires two or more cycles to make the prediction. Both predictors are accessed for each branch. The processor uses the L1 predictor to begin speculative execution of the branch, but checks this prediction using the L2 predictor. If the L2 predictor disagrees with the L1, the branch is aborted and restarted using the L2 prediction. Assuming the L2 prediction is correct, the L1 misprediction penalty is much smaller than a full misprediction penalty. These hierarchical predictors can be better than a large single-level predictor because the latter may have a larger penalty on a correctly predicted branch.

Consider a hierarchical predictor with the following performance:

L1 Prediction   L2 Prediction   Stall Cycles
Correct         Correct         0
Correct         Incorrect       11
Incorrect       Correct         3
Incorrect       Incorrect       8

Assume that one in 6 instructions are branch instructions and that the L1 and L2 predictions are independent. Assume that all branch prediction stalls directly impact performance (as they might in the MIPS 5-stage pipeline). Thus we want to know the contribution to the CPI caused by branch mispredictions (i.e., the stall cycles per instruction). Assume the L1 predictor is right 80% of the time and the L2 predictor is right 95% of the time.

How many stall cycles per instruction are due to either L1 or L2 mispredictions?
(a) 1.65
(e) None of the above

Which case contributes the most to stall cycles per instruction?
(a) L1 Correct, L2 Correct
(b) L1 Correct, L2 Incorrect
(c) L1 Incorrect, L2 Correct
(d) L1 Incorrect, L2 Incorrect
(e) None of the above

Which is more important?
(a) Improving the L1 predictor from 80% to 90% accurate?
(b) Improving the L2 predictor from 95% to 97% accurate?
(c) Reducing the penalty on L1 mispredict, L2 correct predict from 3 to 2 cycles?
(d) Reducing the penalty on L1 correct predict, L2 mispredict from 11 to 8 cycles?
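
The expected-stall arithmetic behind the hierarchical predictor questions can be sketched as follows. This is a minimal sketch, not part of the exam: the names `PENALTY` and `stall_cycles_per_instruction` are illustrative, and the calculation relies on the problem's stated assumptions that the L1 and L2 outcomes are independent and that one in six instructions is a branch.

```python
from itertools import product

# Stall cycles for each (L1 correct?, L2 correct?) case, from the problem statement.
PENALTY = {
    (True, True): 0,
    (True, False): 11,
    (False, True): 3,
    (False, False): 8,
}


def stall_cycles_per_instruction(p_l1, p_l2, penalty=PENALTY, branch_fraction=1 / 6):
    """Return (total, per_case) stall cycles per instruction.

    Assumes, as the problem states, that L1 and L2 outcomes are independent.
    """
    per_case = {}
    for l1_ok, l2_ok in product((True, False), repeat=2):
        # Probability of this joint outcome under independence.
        prob = (p_l1 if l1_ok else 1 - p_l1) * (p_l2 if l2_ok else 1 - p_l2)
        per_case[(l1_ok, l2_ok)] = branch_fraction * prob * penalty[(l1_ok, l2_ok)]
    return sum(per_case.values()), per_case


total, per_case = stall_cycles_per_instruction(0.80, 0.95)
worst_case = max(per_case, key=per_case.get)
print(f"stall cycles per instruction: {total:.4f}")
print("largest contributor as (L1 correct?, L2 correct?):", worst_case)
```

The same function can be called with different accuracies or a modified penalty table to compare the four improvements listed in the last question.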

Instruction Scheduling (20 points)

Using the same pseudo-assembly language, but adding floating point (i.e., the “.d” suffix means double precision), the questions below involve instruction schedules.

1  lw.d    F0 = mem[R2 + 0]
2  mult.d  F3 = F0 * F
3  sw.d    F3 -> mem[R2 + 0]
4  lw.d    F4 = mem[R3 + 0]
5  mult.d  F5 = F4 * F
6  sw.d    F5 -> mem[R3 + 0]
7  add.d   F6 = F3 + F
8  sw.d    F6 -> mem[R4 + 0]

Assume the standard MIPS 5-stage (single-issue) pipeline plus separate pipes for floating point multiply and floating point add. All loads and stores use the integer pipeline and the floating-point pipeline does not have an “M” stage. Load-use delay is one cycle, regardless of which pipeline uses the result. Floating point multiply has 3 execute cycles and floating point add takes two execute cycles, and both are fully pipelined.

On what cycle does instruction 8 store its value to memory?
(a) Cycle 13
(b) Cycle 14
(c) Cycle 15
(d) Cycle 16
(e) Cycle 17
(f) None of the above

A compiler scheduler that is trying to reduce stalls might generate the following instruction schedule (using the instruction numbers):
(a) 1, 2, 3, 4, 5, 6, 7, 8
(b) 1, 2, 3, 4, 5, 7, 6, 8
(c) 1, 2, 3, 4, 7, 5, 6, 8
(d) 1, 4, 2, 5, 3, 6, 7, 8
(e) 1, 4, 7, 8, 2, 5, 3, 6
(f) None of the above

The key source of stalls in this code sequence is:
(a) The depth of the floating point multiply pipeline
(b) The depth of the floating point add pipeline
(c) The load-use delay
(d) A maybe dependence
(e) None of the above

If the instruction set were IA-64 instead of MIPS, we could reduce stalls using:
(a) An advanced load for instruction 1
(b) An advanced load for instruction 2
(c) A speculative load for instruction 1
(d) A speculative load for instruction 2
(e) None of the above
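
Before reasoning about schedules, it can help to list the register read-after-write (RAW) dependences in the eight-instruction sequence. The sketch below is illustrative only: the `program` encoding and the `raw_dependences` helper are assumptions made for this example, and the second source operand of instructions 2, 5, and 7 is truncated in the listing, so it is represented by the placeholder `"F?"` and any dependence through it is not reported. Memory dependences are not modeled.

```python
# Each entry: (instruction number, opcode, destination register or None, source registers).
# "F?" is a placeholder for an operand that is truncated in the listing (assumption:
# it is treated as having no producer within this sequence).
program = [
    (1, "lw.d", "F0", ["R2"]),
    (2, "mult.d", "F3", ["F0", "F?"]),
    (3, "sw.d", None, ["F3", "R2"]),
    (4, "lw.d", "F4", ["R3"]),
    (5, "mult.d", "F5", ["F4", "F?"]),
    (6, "sw.d", None, ["F5", "R3"]),
    (7, "add.d", "F6", ["F3", "F?"]),
    (8, "sw.d", None, ["F6", "R4"]),
]


def raw_dependences(instructions):
    """Return (producer, consumer, register) triples for register RAW dependences."""
    last_writer = {}  # register name -> number of the most recent writing instruction
    deps = []
    for num, _opcode, dest, sources in instructions:
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], num, reg))
        if dest is not None:
            last_writer[dest] = num
    return deps


deps = raw_dependences(program)
for producer, consumer, reg in deps:
    print(f"instruction {consumer} reads {reg} written by instruction {producer}")
```

With the operands that are legible, the load-multiply-store chains starting at instructions 1 and 4 share no registers, which is the kind of independence a scheduler can exploit when reordering.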

From the readings (20 points)

In Moore’s classic paper on semiconductor scaling, he predicted which of the following technologies might be made possible:
(a) Personal computers
(b) Cell phones
(c) Self-driving cars
(d) None of the above
(e) All of the above

Moore also predicted that the maximum number of transistors per chip would double:
(a) Every 12 months
(b) Every 18 months
(c) Every 24 months
(d) None of the above

In Wulf’s paper on Compilers and Computer Architecture, he argues that “the failure of general-register machines to treat all their registers alike” violates the following property:
(a) Regularity
(b) Orthogonality
(c) Composability
(d) None of the above
(e) All of the above

In Srinivasan, et al.’s paper on optimal pipeline depth, they argue that as the pipeline depth increases, the following changes also occur:
(a) FO4 per stage increases, average glitch factor increases
(b) FO4 per stage increases, average glitch factor decreases
(c) FO4 per stage decreases, average glitch factor increases
(d) FO4 per stage decreases, average glitch factor decreases
(e) None of the above

In Seznec, et al.’s paper on the Alpha EV-8 branch predictor, the authors argue that partial update is superior to full update because:
(a) It limits the number of strengthened counters on a correct prediction.
(b) It doesn’t steal a table entry if it can be avoided.
(c) It utilizes space better.
(d) None of the above
(e) All of the above

The Intel IA-64 instruction set includes the following features:
(a) A register stack to pass arguments to and return values from subroutines
(b) Support for full predication
(c) Register renaming to support software pipelined loops
(d) None of the above
(e) All of the above

(blank page for additional work)
