ENEE 350H: Computer Organization Midterm Solutions

Solutions to the midterm exam for the computer organization course ENEE 350H. The exam covers translating C to assembly code, interpreting assembly code, interpreting machine code, processor performance analysis, and state machines. Useful for students preparing for exams, quizzes, or assignments in computer organization.


Uploaded on 07/30/2009

You will have the entire class period to work on this exam. The exam is closed book, closed notes.
There are five questions; each is worth 4 points.
Name:______________________________________________________
LoginID: ____________________________________________________
Midterm Solutions
October 22, 1998
20 points (20% of final grade)
Solutions
blj
Distribution of Grades:

22  -
    -  21.5 21.5 21.5 21.5 21.5
21  -  21
    -  20.5 20.5
20  -  20 20 20 20
    -
19  -  19
    -
18  -
    -
17  -  17
    -
16  -
    -
15  -
    -  14.5
14  -
    -
13  -

  1. Writing assembly code

A. Translate the following C code to assembly. You do not have to declare .fills for a, b, and c.

int a, b, c;

do {
    if (b != c) {
        a++;
    } else {
        b++;
    }
    c = a & b;    /* c = a AND b */
} while (a < b);

B. Translate the following C code to assembly. You do not have to declare .fills for s1, s2, or cp.

char s1[], s2[];
char *cp;

cp = s1;
while (*cp != '\0') {
    *s2 = *cp;
    cp++;
}

C. (extra credit)

Create a code fragment to perform the following:

signed int x;       /* these two variables are not held in memory; */
unsigned char c;    /* they are held in registers (assume $1 and $2) */

x = (int)c;

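Solution to A:

loop:   bne   $b, $c, bNEc      # b != c: branch to the a++ case
        addi  $b, $b, 1         # b == c: b++
        jump  endif
bNEc:   addi  $a, $a, 1         # b != c: a++
endif:  and   $c, $a, $b        # c = a AND b
        blt   $a, $b, loop      # repeat while a < b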

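Solution to B:

        lw    $s, addr.s2($0)   # $s = address of s2
        lw    $c, addr.s1($0)   # $c = cp = address of s1
loop:   lbu   $1, 0($c)         # load-byte-unsigned: $1 = *cp
        beq   $1, $0, out       # stop at the terminating '\0'
        sbu   $1, 0($s)         # *s2 = *cp
        addi  $c, $c, 1         # cp++
        jump  loop
out:    ...

addr.s1: .fill s1
addr.s2: .fill s2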

Solution to C:

        andi  $1, $2, 0x00FF    # move c into x, make sure the top bits are zero
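The conversion in part C is a zero extension: because c is unsigned, masking with 0x00FF is enough. The following Python sketch models that mask and, for contrast, what a sign extension would give for the same byte. The function names and the sample register value are hypothetical and chosen only for illustration.

```python
def zero_extend_byte(register_value: int) -> int:
    """Model of `andi $1, $2, 0x00FF`: keep the low 8 bits, clear the rest."""
    return register_value & 0x00FF


def sign_extend_byte(register_value: int) -> int:
    """For contrast only: the result if c were a *signed* char."""
    low = register_value & 0xFF
    return low - 0x100 if low & 0x80 else low


if __name__ == "__main__":
    reg2 = 0xABF0  # hypothetical contents of $2; only the low byte 0xF0 is c
    print(zero_extend_byte(reg2))  # 240
    print(sign_extend_byte(reg2))  # -16
```

Since c is declared unsigned char, only the first function matches the exam's intent; the second shows why the distinction matters when the top bit of the byte is set.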

  3. Interpreting machine code

A. Interpret the following block of LC-998 machine code. The numbers are hexadecimal, and you can assume that they are loaded into memory starting at address 0.

4809 4405 bd00 4406 bd00 4407 bd00 c000 7fff 000a 5808 24e0 2490 a380

B. What does this code do?

Solution to A:

Addr   Binary                 Assembly
 0:    0100 1000 0000 1001    lw 2 0 9
 1:    0100 0100 0000 0101    lw 1 0 5
 2:    1011 1101 0000 0000    jalr 7 2
 3:    0100 0100 0000 0110    lw 1 0 6
 4:    1011 1101 0000 0000    jalr 7 2
 5:    0100 0100 0000 0111    lw 1 0 7
 6:    1011 1101 0000 0000    jalr 7 2
 7:    1100 0000 0000 0000    halt
 8:    0111 1111 1111 1111    .fill 0x7fff (or sw 7 7 -1)
 9:    0000 0000 0000 1010    .fill 10
10:    0101 1000 0000 1000    lw 6 0 8
11:    0010 0100 1110 0000    nand 1 1 6
12:    0010 0100 1001 0000    nand 1 1 1
13:    1010 0011 1000 0000    jalr 0 7
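The decoding above can be checked mechanically. Here is a minimal Python sketch of a disassembler that follows the field layout in Appendix A (3-bit opcode, 3-bit registers, 7-bit signed immediate). The names `disassemble`, `MNEMONICS`, and `PROGRAM` are illustrative choices, not part of the exam.

```python
# Opcode order follows the table in Appendix A (000 = add ... 111 = nop).
MNEMONICS = ["add", "nand", "lw", "sw", "beq", "jalr", "halt", "nop"]


def sign_extend_7(value: int) -> int:
    """Interpret a 7-bit field as a two's-complement number (-64 to 63)."""
    return value - 128 if value & 0x40 else value


def disassemble(word: int) -> str:
    """Decode one 16-bit LC-998 word into assembly text."""
    name = MNEMONICS[(word >> 13) & 0x7]
    reg_a = (word >> 10) & 0x7
    reg_b = (word >> 7) & 0x7
    if name in ("add", "nand"):                      # R-type
        reg_c = (word >> 4) & 0x7
        return f"{name} {reg_a} {reg_b} {reg_c}"
    if name in ("lw", "sw", "beq"):                  # I-type
        return f"{name} {reg_a} {reg_b} {sign_extend_7(word & 0x7F)}"
    if name == "jalr":                               # J-type
        return f"{name} {reg_a} {reg_b}"
    return name                                      # O-type: halt, nop


# The 14 words from the question, loaded starting at address 0.
PROGRAM = [0x4809, 0x4405, 0xBD00, 0x4406, 0xBD00, 0x4407, 0xBD00,
           0xC000, 0x7FFF, 0x000A, 0x5808, 0x24E0, 0x2490, 0xA380]

if __name__ == "__main__":
    for address, word in enumerate(PROGRAM):
        print(f"{address:2d}: {word:04x}  {disassemble(word)}")
```

A disassembler cannot tell data from code, so the two data words at addresses 8 and 9 come out as `sw 7 7 -1` and `add 0 0 0`; reading them as `.fill` values is a judgment made from how the program uses them.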

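Solution to B:

This code calls a subroutine three times and then halts. On each call, the subroutine zeroes out the top bit of the number in register 1 by ANDing it with 0x7fff (a nand followed by a second nand that inverts the result). The numbers that the program loads and sends to the subroutine are actually instructions: the words stored at addresses 5, 6, and 7.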
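One way to confirm this reading is to run the program. The sketch below is a small, unofficial LC-998 simulator written from the instruction descriptions in Appendix A. It assumes that register 0 always reads as zero (the program's use of register 0 as a base address only works under that assumption) and that values are 16 bits wide; `run` and `PROGRAM` are hypothetical names.

```python
PROGRAM = [0x4809, 0x4405, 0xBD00, 0x4406, 0xBD00, 0x4407, 0xBD00,
           0xC000, 0x7FFF, 0x000A, 0x5808, 0x24E0, 0x2490, 0xA380]


def sign_extend_7(value: int) -> int:
    """Interpret a 7-bit field as a two's-complement number (-64 to 63)."""
    return value - 128 if value & 0x40 else value


def run(program, max_steps=1000):
    """Execute until halt; return (registers, value of r1 at every jalr)."""
    mem = list(program)
    regs = [0] * 8
    pc = 0
    r1_at_jumps = []
    for _ in range(max_steps):
        word = mem[pc]
        opcode = (word >> 13) & 0x7
        a, b, c = (word >> 10) & 0x7, (word >> 7) & 0x7, (word >> 4) & 0x7
        imm = sign_extend_7(word & 0x7F)
        next_pc = pc + 1
        if opcode == 0:                                  # add
            regs[a] = (regs[b] + regs[c]) & 0xFFFF
        elif opcode == 1:                                # nand
            regs[a] = ~(regs[b] & regs[c]) & 0xFFFF
        elif opcode == 2:                                # lw
            regs[a] = mem[regs[b] + imm]
        elif opcode == 3:                                # sw
            mem[regs[b] + imm] = regs[a]
        elif opcode == 4:                                # beq
            if regs[a] == regs[b]:
                next_pc = pc + 1 + imm
        elif opcode == 5:                                # jalr
            r1_at_jumps.append(regs[1])
            target = regs[b]                             # read before writing regA
            regs[a] = pc + 1
            next_pc = target
        elif opcode == 6:                                # halt
            break
        regs[0] = 0          # assumption: register 0 is hard-wired to zero
        pc = next_pc
    return regs, r1_at_jumps


if __name__ == "__main__":
    registers, trace = run(PROGRAM)
    print([f"{value:04x}" for value in trace])
```

The trace records register 1 at each call and each return, so the values come in before/after pairs. The first word, 0x4407, already has its top bit clear and comes back unchanged; 0xbd00 and 0xc000 come back as 0x3d00 and 0x4000.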

  4. Processor performance

You have two processors, X and Z, with the following characteristics. X is a single-cycle processor (an entire instruction executes in a single cycle); Z divides each instruction execution into multiple cycles with dedicated data paths (it is not bus-based). They execute the same instruction set (LC-998). Assume that our implementation has the following timing characteristics:

- Cache read/write: 32 ns
- Register file read/write: 17 ns
- ALU operation: 23 ns

Z is not bus-based, but it has only one ALU; therefore its PC-increment happens at the end of each instruction, after all other computation has finished. For JALR's timing, assume that you can read and write to the register file simultaneously (which is normally the case).

Assume we execute 100,000,000 instructions with the following distribution of instruction types:

- R-type: 35%
- beq: 27%
- jalr: 13%
- lw: 16%
- sw: 9%

A. Compare the execution times of the two processors.

B. Suppose that we redesigned Z so that the PC-increment could be overlapped with the next-to-last cycle of each instruction, so long as that cycle didn't already use the ALU. For example, the next-to-last cycle of the R-type instructions uses the register file, so the PC increment can use the ALU at the same time without conflict. How do the processors X and Z' compare?

Solution to A:

In the paths below, IC is an instruction-cache read, RF a register-file access, and DC a data-cache access.

Machine X (single-cycle):

R-type: IC -> RF -> ALU -> RF
BEQ:    IC -> RF -> ALU
JALR:   IC -> RF read/write
LW:     IC -> RF -> ALU -> DC -> RF
SW:     IC -> RF -> ALU -> DC

Tx = time for LW (the longest instruction) = 32 + 17 + 23 + 32 + 17 = 121 ns

Machine Z (multi-cycle; the trailing ALU step is the PC increment):

R-type: IC -> RF -> ALU -> RF ... ALU          (5 cycles)
BEQ:    IC -> RF -> ALU ... ALU                (4 cycles)
JALR:   IC -> RF read/write                    (2 cycles)
LW:     IC -> RF -> ALU -> DC -> RF ... ALU    (6 cycles)
SW:     IC -> RF -> ALU -> DC ... ALU          (5 cycles)

Tz = worst-case of any stage = cache access = 32 ns

CPI for machine Z = (0.35 * 5) + (0.27 * 4) + (0.13 * 2) + (0.16 * 6) + (0.09 * 5) = 4.5

Execution time = T * I * C (cycle time * instruction count * cycles per instruction)

Ex = Tx * Ix * Cx = (121 * 10^-9) * (10^8) * (1) = 12.1 sec
Ez = Tz * Iz * Cz = (32 * 10^-9) * (10^8) * (4.5) = 14.4 sec

Ez/Ex = 14.4/12.1 = 1.19 ... X is roughly 19% faster than Z

Solution to B:

Machine Z' (RF/ALU and DC/ALU mark a final cycle that also performs the PC increment):

R-type: IC -> RF -> ALU -> RF/ALU              (4 cycles)
BEQ:    IC -> RF -> ALU ... ALU                (4 cycles)
JALR:   IC -> RF read/write                    (2 cycles)
LW:     IC -> RF -> ALU -> DC -> RF/ALU        (5 cycles)
SW:     IC -> RF -> ALU -> DC/ALU              (4 cycles)

CPI for Z' = (0.35 * 4) + (0.27 * 4) + (0.13 * 2) + (0.16 * 5) + (0.09 * 4) = 3.9

Ez' = Tz' * Iz' * Cz' = (32 * 10^-9) * (10^8) * (3.9) = 12.5 sec

Ez'/Ex = 12.5/12.1 = 1.03 ... X is roughly 3% faster than Z'
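The arithmetic in this solution is easy to re-check in a few lines. The Python sketch below recomputes the CPI values and execution times from the stage latencies and the instruction mix given in the question; the per-instruction cycle counts are read off the data paths in the solution, and all identifiers are illustrative.

```python
CACHE_NS, REGFILE_NS, ALU_NS = 32, 17, 23
INSTRUCTIONS = 100_000_000

# Fraction of executed instructions of each type (from the question).
MIX = {"R-type": 0.35, "beq": 0.27, "jalr": 0.13, "lw": 0.16, "sw": 0.09}

# Cycles per instruction type, counted from the data paths in the solution.
CYCLES_Z = {"R-type": 5, "beq": 4, "jalr": 2, "lw": 6, "sw": 5}
CYCLES_Z_PRIME = {"R-type": 4, "beq": 4, "jalr": 2, "lw": 5, "sw": 4}

# X: one cycle as long as the slowest instruction (LW: IC, RF, ALU, DC, RF).
CYCLE_X_NS = CACHE_NS + REGFILE_NS + ALU_NS + CACHE_NS + REGFILE_NS
# Z and Z': one cycle as long as the slowest single stage.
CYCLE_Z_NS = max(CACHE_NS, REGFILE_NS, ALU_NS)


def cpi(cycles_by_type):
    """Average cycles per instruction, weighted by the instruction mix."""
    return sum(MIX[kind] * cycles_by_type[kind] for kind in MIX)


def execution_time_s(cycle_ns, cpi_value, count=INSTRUCTIONS):
    """Execution time = cycle time * instruction count * CPI, in seconds."""
    return cycle_ns * 1e-9 * count * cpi_value


if __name__ == "__main__":
    e_x = execution_time_s(CYCLE_X_NS, 1.0)
    e_z = execution_time_s(CYCLE_Z_NS, cpi(CYCLES_Z))
    e_zp = execution_time_s(CYCLE_Z_NS, cpi(CYCLES_Z_PRIME))
    print(f"X:  {e_x:.2f} s")
    print(f"Z:  {e_z:.2f} s  (ratio to X: {e_z / e_x:.2f})")
    print(f"Z': {e_zp:.2f} s  (ratio to X: {e_zp / e_x:.2f})")
```

Changing the mix or the latencies shows how sensitive the comparison is: Z' trails X by only about 3%, so a small shift away from lw and toward jalr or beq could reverse the result.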

  5. State machines

A. (extra credit)

How does a state machine work?

B. Construct a finite state machine that has four states and two input values (a single input that is either 0 or 1). On each cycle, the state machine should output its state value. If the input is zero, the state should return to zero. On every cycle in which the input value is non-zero, the state machine should advance to the next numerical state. From state 3, either input should return the state machine to state 0. Show a truth table and resulting FSM. You can use the same PLA notation as in the appendix example if you like.

Solution to A:

A state machine is just a register feeding through some logic back into itself. The next state is completely dependent on the present state and input values.

Solution to B:

[Truth table. Inputs: STATE, IN. Outputs: OUT, New State.]

[FSM diagram. Inputs: STATE, IN. Outputs: OUT-msb, OUT-lsb, New-State-msb, New-State-lsb.]
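The behaviour asked for in part B can be written down directly from the question text. The Python sketch below is derived from that specification, not copied from the original answer's table: it generates the eight-row truth table and checks one possible sum-of-products form of the next-state logic, the kind of expression a PLA would implement. All function names are hypothetical.

```python
def next_state(state: int, inp: int) -> int:
    """Input 0 resets to state 0; input 1 advances; state 3 always wraps to 0."""
    if inp == 0 or state == 3:
        return 0
    return state + 1


def truth_table():
    """Rows of (STATE, IN, OUT, New State); the output is the current state."""
    return [(state, inp, state, next_state(state, inp))
            for state in range(4) for inp in (0, 1)]


def next_state_logic(s_msb: int, s_lsb: int, inp: int):
    """One sum-of-products realization of the next-state bits."""
    not_msb, not_lsb = 1 - s_msb, 1 - s_lsb
    # New msb is 1 when advancing out of state 1 (01) or state 2 (10).
    new_msb = (inp & not_msb & s_lsb) | (inp & s_msb & not_lsb)
    # New lsb is 1 when advancing out of state 0 (00) or state 2 (10).
    new_lsb = inp & not_lsb
    return new_msb, new_lsb


if __name__ == "__main__":
    print("STATE IN | OUT NEW")
    for state, inp, out, new in truth_table():
        print(f"  {state:02b}   {inp} |  {out:02b}  {new:02b}")
```

The output bits need no logic at all, since OUT-msb and OUT-lsb are just the two state-register bits.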

Appendix A: LC-998 Instruction-Set Architecture

There are 4 machine-code instruction formats: R-type, I-type, J-type, and O-type.

The following table describes the different opcodes.

Mnemonic  Name and Format                   Opcode    Assembly Format    Action
                                            (binary)
add       Add (R-type)                      000       add rA rB rC       Add contents of regB with regC, store result in regA.
nand      Nand (R-type)                     001       nand rA rB rC      Nand contents of regB with regC, store results in regA.
lw        Load word (I-type)                010       lw rA rB imm       Load value from memory into regA. Memory address is formed by adding immediate with contents of regB.
sw        Store word (I-type)               011       sw rA rB imm       Store value from regA into memory. Memory address is formed by adding immediate with contents of regB.
beq       Branch if equal (I-type)          100       beq rA rB imm      If the contents of regA and regB are the same, branch to the address PC+1+immediate, where PC is the address of the beq instruction.
jalr      Jump-and-link-register (J-type)   101       jalr rA rB         Branch to the address in regB. Store PC+1 into regA, where PC is the address of the jalr instruction.
halt      Halt (O-type)                     110       halt               Halt the machine; do nothing and announce that the machine has stopped.
nop       No-op (O-type)                    111       nop                Do nothing for this cycle.

Bit:      15 14 13   12 11 10   9 8 7      6 5 4      3 2 1 0

R-type:   opcode     reg A      reg B      reg C      0
          (3 bits)   (3 bits)   (3 bits)   (3 bits)   (4 bits)

I-type:   opcode     reg A      reg B      signed immediate (-64 to 63)
          (3 bits)   (3 bits)   (3 bits)   (7 bits)

J-type:   opcode     reg A      reg B      0
          (3 bits)   (3 bits)   (3 bits)   (7 bits)

O-type:   opcode     0
          (3 bits)   (13 bits)
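To make the four formats concrete, here is a small Python sketch that packs the fields into a 16-bit word according to the layouts above. The `encode` function and its argument names are hypothetical; the opcode values come from the table in this appendix.

```python
OPCODES = {"add": 0b000, "nand": 0b001, "lw": 0b010, "sw": 0b011,
           "beq": 0b100, "jalr": 0b101, "halt": 0b110, "nop": 0b111}


def encode(mnemonic: str, reg_a: int = 0, reg_b: int = 0, last: int = 0) -> int:
    """Pack one instruction into a 16-bit word.

    `last` is regC for R-type instructions and the signed immediate for
    I-type instructions; it is ignored for J-type and O-type.
    """
    if not (0 <= reg_a <= 7 and 0 <= reg_b <= 7):
        raise ValueError("register numbers must be 0 to 7")
    word = OPCODES[mnemonic] << 13                       # bits 15-13
    if mnemonic in ("add", "nand"):                      # R-type
        if not 0 <= last <= 7:
            raise ValueError("regC must be 0 to 7")
        word |= (reg_a << 10) | (reg_b << 7) | (last << 4)
    elif mnemonic in ("lw", "sw", "beq"):                # I-type
        if not -64 <= last <= 63:
            raise ValueError("immediate must be -64 to 63")
        word |= (reg_a << 10) | (reg_b << 7) | (last & 0x7F)
    elif mnemonic == "jalr":                             # J-type
        word |= (reg_a << 10) | (reg_b << 7)
    return word                                          # O-type: opcode only


if __name__ == "__main__":
    for args in [("lw", 2, 0, 9), ("jalr", 7, 2), ("nand", 1, 1, 6), ("halt",)]:
        print(f"{' '.join(map(str, args)):12s} -> {encode(*args):04x}")
```

Masking the immediate with 0x7F stores negative values in two's complement, which is why `sw 7 7 -1` encodes to 7fff.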

Appendix B: PLA example

Here is an example PLA based on the following truth table:

[Truth table with columns INPUT and OUTPUT, followed by the PLA diagram with input INPUT and outputs OUT-msb and OUT-lsb.]