MIPS Assembly Midterm Examination: Problem Solutions, Study notes of Electrical and Electronics Engineering

Solutions to the problems presented in the MIPS assembly midterm examination held on October 11, 2006. The questions cover writing C code, identifying RAW, WAR, and WAW conflicts, optimizing assembly code with register renaming and loop unrolling, and analyzing the performance of a speculative processor.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

EEL 5708 – Midterm examination
Date: October 11, 2006
Name:
Student Code:
Instructions:
This exam is open book and open notes. Allotted time is 50 minutes.
Explicitly state all your assumptions.
Don’t just give answers; always try to write down your reasoning.
Even if you cannot find a solution, write down the paths you have tried.
Problem 1 (50 points)
Consider the following assembly program written for a 32-bit MIPS machine with no
delayed branches.
li r20, 0 # load immediate
li r19, 16000
li r15, 0
start:
ld r1, 0(r20) # load into r1 from memory
add r15, r15, r1
addi r20, r20, 32 # add immediate
sub r3, r19, r20
bnez r3, start # branch to start if r3 is not equal to zero
a) Write a C, C++ or Java program which corresponds to this assembly program
b) Identify the RAW, WAR and WAW conflicts in the assembly program.
c) Assume that the program is run on a machine where there is a stall of 3 cycles
between the writing and the reading of a register [1]. Rewrite the assembly program
by inserting NOP operations where there will be stalls.
d) Unroll the loop in the program 4 times in software (assembly). Transform the
program by register renaming such that the number of stalls is minimized.
[1] That is, you need 3 NOPs (or other operations) between ADD R1, R2, R3 and ADD R6, R1, R5.
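For part (a), one possible reading of the loop can be sketched in C. This is a minimal sketch under stated assumptions, not the official solution: it assumes that `ld` loads one 32-bit word, that memory starting at address 0 is modelled by the array `mem`, and that overflow of the running sum can be ignored. The function name `strided_sum` and its parameters are hypothetical and do not appear in the exam.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds up one 32-bit word every `stride_bytes` bytes,
 * starting at byte offset 0 and stopping when the offset equals `end_bytes`.
 * Assumption: `mem` models the memory that the assembly reads from address 0. */
int32_t strided_sum(const int32_t *mem, size_t end_bytes, size_t stride_bytes)
{
    /* The assembly only terminates when the offset hits the end exactly. */
    assert(stride_bytes > 0 && stride_bytes % sizeof(int32_t) == 0);
    assert(end_bytes > 0 && end_bytes % stride_bytes == 0);

    int32_t sum = 0;      /* li r15, 0 */
    size_t offset = 0;    /* li r20, 0 */

    /* The branch is tested at the bottom of the loop, so the body always
     * runs at least once: a do-while rather than a while. */
    do {
        sum += mem[offset / sizeof(int32_t)];  /* ld r1, 0(r20); add r15, r15, r1 */
        offset += stride_bytes;                /* addi r20, r20, 32 */
    } while (end_bytes - offset != 0);         /* sub r3, r19, r20; bnez r3, start */

    return sum;
}
```

With the constants from the exam, the call would be `strided_sum(mem, 16000, 32)`, which runs the loop body 16000 / 32 = 500 times and reads every eighth word of `mem`.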

Problem 2 (50 points)
Suppose that a processor with a load/store architecture and no delayed branches executes at a clock rate of 2 GHz. Arithmetic and logic instructions require 1 cycle, load and store operations require 2 cycles, and conditional branches require 3 cycles because of the control hazard involved. The typical applications run on this processor contain a mix of 70% arithmetic and logic instructions, 10% load and store instructions, and 20% conditional branch instructions.
An engineer proposes a modification to the architecture which introduces speculation. The branch prediction algorithm would be correct 80% of the time. When the prediction is correct, a branch would take 1 cycle; when it is incorrect, 3 cycles as before. However, the modification requires reducing the clock frequency to 1.5 GHz.
a) What is the average cycles per instruction of the original processor?
b) What is the average cycles per instruction of the speculative processor?
c) Is the speculative processor faster or slower than the original one? By how much?
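The arithmetic behind Problem 2 can be sketched in C as a weighted average over the instruction mix, followed by a division by the clock rate. This is only a sketch of one way to set up the calculation; the helper names `average_cpi`, `expected_branch_cycles` and `time_per_instruction_ns` are hypothetical and not part of the exam.

```c
#include <assert.h>

/* Weighted average of cycle counts over a three-class instruction mix.
 * The fractions must add up to 1 (checked with a small tolerance). */
double average_cpi(double frac_alu, double frac_mem, double frac_branch,
                   double cycles_alu, double cycles_mem, double cycles_branch)
{
    double excess = frac_alu + frac_mem + frac_branch - 1.0;
    assert(excess < 1e-9 && excess > -1e-9);
    return frac_alu * cycles_alu
         + frac_mem * cycles_mem
         + frac_branch * cycles_branch;
}

/* Expected cycles for one branch, given the prediction accuracy and the
 * cost of a correctly and an incorrectly predicted branch. */
double expected_branch_cycles(double accuracy,
                              double cycles_correct, double cycles_wrong)
{
    return accuracy * cycles_correct + (1.0 - accuracy) * cycles_wrong;
}

/* Average time per instruction in nanoseconds: CPI divided by the clock
 * rate in GHz (cycles per nanosecond). */
double time_per_instruction_ns(double cpi, double clock_ghz)
{
    return cpi / clock_ghz;
}
```

Plugging in the exam's numbers, `average_cpi(0.7, 0.1, 0.2, 1, 2, 3)` gives 1.5 for the original processor, and replacing the branch cost with `expected_branch_cycles(0.8, 1, 3)` (1.4 cycles) gives 1.18 for the speculative one. Dividing by 2 GHz and 1.5 GHz respectively yields 0.75 ns versus about 0.787 ns per instruction, so under these figures the speculative design comes out roughly 5% slower despite its lower CPI.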