Assembly Language and Computer Organization - Assignment | CSCE 2610, Study notes of Computer Architecture and Organization

Material Type: Notes; Class: Assembly Language and Computer Organization; Subject: Computer Science/Computer Engineering; University: University of North Texas; Term: Fall 2008;

Uploaded on 08/19/2009

koofers-user-n61

2.30
        sll  $a2, $a2, 2        # max i = 2500 * 4
        sll  $a3, $a3, 2        # max j = 2500 * 4
        add  $v0, $zero, $zero  # $v0 = 0
        add  $t0, $zero, $zero  # i = 0
outer:  add  $t4, $a0, $t0      # $t4 = address of array 1[i]
        lw   $t4, 0($t4)        # $t4 = array 1[i]
        add  $t1, $zero, $zero  # j = 0
inner:  add  $t3, $a1, $t1      # $t3 = address of array 2[j]
        lw   $t3, 0($t3)        # $t3 = array 2[j]
        bne  $t3, $t4, skip     # if (array 1[i] != array 2[j]) skip $v0++
        addi $v0, $v0, 1        # $v0++
skip:   addi $t1, $t1, 4        # j++
        bne  $t1, $a3, inner    # loop if j != 2500 * 4
        addi $t0, $t0, 4        # i++
        bne  $t0, $a2, outer    # loop if i != 2500 * 4
The code counts the (i, j) pairs for which array 1[i] equals array 2[j], that is,
the number of matching elements between the two arrays, and returns this count
in register $v0.
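
For readers who find the assembly hard to follow, the same counting logic can be sketched in Python. This is only an illustrative equivalent, not part of the original solution: the function name `count_matches` and the sample lists are assumptions, and the sketch ignores the fact that the MIPS loops test their condition at the bottom (so the assembly assumes non-empty arrays).

```python
def count_matches(array1, array2):
    """Count the (i, j) pairs with array1[i] == array2[j], mirroring $v0."""
    matches = 0                  # add $v0, $zero, $zero
    for x in array1:             # outer loop: x plays the role of array 1[i]
        for y in array2:         # inner loop: y plays the role of array 2[j]
            if x == y:           # bne ... skip is NOT taken when they are equal
                matches += 1     # addi $v0, $v0, 1
    return matches


# Hypothetical sample data: each 2 in the first list matches both 2s in the second.
print(count_matches([1, 2, 3, 2], [2, 2, 5]))  # prints 4
```

Note that an element is counted once for every partner it matches, so duplicates in either array increase the result.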
2.31 Ignoring the four instructions before the loops, we see that the outer loop
(which iterates 2500 times) has three instructions before the inner loop and two
after. The cycles needed to execute these are 1 + 2 + 1 = 4 cycles and 1 + 2 = 3
cycles, for a total of 7 cycles per iteration, or 2500 × 7 cycles. The inner loop
requires 1 + 2 + 2 + 1 + 1 + 2 = 9 cycles per iteration and it repeats 2500 × 2500
times, for a total of 9 × 2500 × 2500 cycles. The total number of cycles executed is
therefore (2500 × 7) + (9 × 2500 × 2500) = 56,267,500. The overall execution time
is therefore (56,267,500) / (2 × 10^9) ≈ 28 ms. Note that the execution time for the
inner loop is really the only code of significance.
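
The arithmetic above can be re-checked with a short Python sketch. It assumes the per-instruction costs implied by the sums in the solution (2 cycles for lw and bne, 1 cycle for add and addi) and the clock rate implied by the division by 2 × 10^9; the names `COST`, `cycles`, and `CLOCK_HZ` are illustrative only.

```python
N = 2500                 # iterations of each loop
CLOCK_HZ = 2 * 10**9     # cycles per second, as implied by the solution

# Cycle costs implied by the solution's sums (an assumption of this sketch).
COST = {"add": 1, "addi": 1, "lw": 2, "bne": 2}

# Outer loop: three instructions before the inner loop, two after it.
outer_body = ["add", "lw", "add", "addi", "bne"]
# Inner loop: all six instructions, counting the addi $v0 on every pass.
inner_body = ["add", "lw", "bne", "addi", "addi", "bne"]


def cycles(instructions):
    """Sum the cycle cost of a straight-line instruction sequence."""
    return sum(COST[name] for name in instructions)


outer_total = N * cycles(outer_body)        # 2500 * 7
inner_total = N * N * cycles(inner_body)    # 2500 * 2500 * 9
total = outer_total + inner_total
milliseconds = 1000 * total / CLOCK_HZ
inner_share = inner_total / total

print(total)                    # prints 56267500
print(round(milliseconds, 1))   # prints 28.1
```

In this sketch the inner loop accounts for more than 99.9% of the cycles, which is why the outer-loop term barely changes the result.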

Pseudoinstruction     What it accomplishes          Solution
move $t1, $t2         $t1 = $t2                     add $t1, $t2, $zero
clear $t0             $t0 = 0                       add $t0, $zero, $zero
beq $t1, small, L     if ($t1 == small) go to L     li $at, small
                                                    beq $t1, $at, L
beq $t2, big, L       if ($t2 == big) go to L       li $at, big
                                                    beq $t2, $at, L
li $t1, small         $t1 = small                   addi $t1, $zero, small
li $t2, big           $t2 = big                     lui $t2, upper(big)
                                                    ori $t2, $t2, lower(big)
ble $t3, $t5, L       if ($t3 <= $t5) go to L       slt $at, $t5, $t3
                                                    beq $at, $zero, L
bgt $t4, $t5, L       if ($t4 > $t5) go to L        slt $at, $t5, $t4
                                                    bne $at, $zero, L
bge $t5, $t3, L       if ($t5 >= $t3) go to L       slt $at, $t5, $t3
                                                    beq $at, $zero, L
addi $t0, $t2, big    $t0 = $t2 + big               li $at, big
                                                    add $t0, $t2, $at
lw $t5, big($t2)      $t5 = Memory[$t2 + big]       li $at, big
                                                    add $at, $at, $t2
                                                    lw $t5, 0($at)

Note: In the solutions, we make use of the li instruction, which should be implemented as shown in rows 5 and 6.
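
Two of the expansions can be sanity-checked with a small Python sketch: the lui/ori pair used to load a 32-bit constant, and the slt-based test behind ble. The helper names (`split_big`, `lui_ori`, `slt`, `ble_taken`) and the sample constant are hypothetical; the sketch models registers as plain Python integers and is not a MIPS simulator.

```python
def split_big(big):
    """Split a 32-bit constant into the halves written upper(big) and lower(big)."""
    upper = (big >> 16) & 0xFFFF   # high 16 bits, the lui immediate
    lower = big & 0xFFFF           # low 16 bits, the ori immediate
    return upper, lower


def lui_ori(upper, lower):
    """Model lui followed by ori on a 32-bit register."""
    reg = (upper << 16) & 0xFFFFFFFF   # lui: immediate goes to the top half, low half is 0
    reg |= lower                       # ori: zero-extended immediate fills the low half
    return reg


def slt(a, b):
    """Model slt: 1 if a < b, else 0."""
    return 1 if a < b else 0


def ble_taken(t3, t5):
    """ble $t3, $t5, L expands to slt $at, $t5, $t3 then beq $at, $zero, L."""
    at = slt(t5, t3)
    return at == 0                     # branch taken when $t5 < $t3 is false


big = 0x12345678                       # hypothetical constant too wide for addi
upper, lower = split_big(big)
print(hex(upper), hex(lower))          # prints 0x1234 0x5678
print(lui_ori(upper, lower) == big)    # prints True
```

ori is used for the low half because it zero-extends its immediate; addi sign-extends, so it would corrupt the upper half whenever bit 15 of the constant is set.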

Effective CPI = Sum of (CPI of instruction type × Frequency of execution)

The average instruction frequencies for SPEC2000int and SPEC2000fp are 0.47 arithmetic (0.36 arithmetic and 0.11 logical), 0.375 data transfer, 0.12 conditional branch, 0.015 jump. Thus, the effective CPI is 0.47 × 1.0 + 0.375 × 1.4 + 0.12 × 1. + 0.015 × 1.2 = 1.2.
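
The weighted sum can be sketched in Python as follows. The frequencies and the arithmetic, data transfer, and jump CPIs come from the text; the conditional branch CPI is truncated in the source, so the value `BRANCH_CPI = 1.7` below is purely an assumption chosen for illustration (it is one of several values that still round to 1.2). The names `effective_cpi` and `mix` are likewise illustrative.

```python
def effective_cpi(mix):
    """Sum of (frequency * CPI) over all instruction classes."""
    return sum(freq * cpi for freq, cpi in mix.values())


# ASSUMPTION: the branch CPI is cut off in the source; 1.7 is a stand-in value.
BRANCH_CPI = 1.7

# instruction class: (frequency of execution, CPI)
mix = {
    "arithmetic":         (0.47,  1.0),   # 0.36 arithmetic + 0.11 logical
    "data transfer":      (0.375, 1.4),
    "conditional branch": (0.12,  BRANCH_CPI),
    "jump":               (0.015, 1.2),
}

cpi = effective_cpi(mix)
print(round(cpi, 1))   # prints 1.2
```

Under that assumption the unrounded sum is about 1.217, consistent with the 1.2 stated in the text.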