

Material Type: Notes; Class: Assembly Language and Computer Organization; Subject: Computer Science/Computer Engineering; University: University of North Texas; Term: Fall 2008;


        sll  $a2, $a2, 2          # max i = 2500 * 4
        sll  $a3, $a3, 2          # max j = 2500 * 4
        add  $v0, $zero, $zero    # $v0 = 0
        add  $t0, $zero, $zero    # i = 0
outer:  add  $t4, $a0, $t0        # $t4 = address of array1[i]
        lw   $t4, 0($t4)          # $t4 = array1[i]
        add  $t1, $zero, $zero    # j = 0
inner:  add  $t3, $a1, $t1        # $t3 = address of array2[j]
        lw   $t3, 0($t3)          # $t3 = array2[j]
        bne  $t3, $t4, skip       # if (array1[i] != array2[j]) skip $v0++
        addi $v0, $v0, 1          # $v0++
skip:   addi $t1, $t1, 4          # j++
        bne  $t1, $a3, inner      # loop if j != 2500 * 4
        addi $t0, $t0, 4          # i++
        bne  $t0, $a2, outer      # loop if i != 2500 * 4

The code counts the matching elements between the two arrays and returns the count in register $v0.

2.31 Ignoring the four instructions before the loops, we see that the outer loop (which iterates 2500 times) has three instructions before the inner loop and two after. The cycles needed to execute these are 1 + 2 + 1 = 4 cycles and 1 + 2 = 3 cycles, for a total of 7 cycles per iteration, or 2500 × 7 cycles. The inner loop requires 1 + 2 + 2 + 1 + 1 + 2 = 9 cycles per iteration and repeats 2500 × 2500 times, for a total of 9 × 2500 × 2500 cycles. The total number of cycles executed is therefore (2500 × 7) + (9 × 2500 × 2500) = 56,267,500. The overall execution time is therefore 56,267,500 / (2 × 10^9) ≈ 28 ms. Note that the inner loop accounts for almost all of the execution time; the rest of the code is negligible.
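As a cross-check of the counting loop and the cycle arithmetic above, here is a minimal Python sketch. The function names `count_matches` and `total_cycles` are made up for illustration and are not part of the original solution. The per-iteration costs (7 and 9 cycles) and the 2 × 10^9 Hz clock are taken from the text. Note that, like the MIPS code, the sketch counts every matching (i, j) pair, so duplicate values are counted more than once.

```python
def count_matches(array1, array2):
    """Count pairs (i, j) with array1[i] == array2[j], mirroring the nested MIPS loops."""
    matches = 0                      # plays the role of $v0
    for x in array1:                 # outer loop over array1[i]
        for y in array2:             # inner loop over array2[j]
            if x == y:
                matches += 1         # addi $v0, $v0, 1
    return matches


def total_cycles(n_outer, n_inner, outer_cycles=7, inner_cycles=9):
    """Cycles spent in the two loops, ignoring the four setup instructions."""
    return n_outer * outer_cycles + n_outer * n_inner * inner_cycles


cycles = total_cycles(2500, 2500)
seconds = cycles / 2e9               # clock rate of 2 * 10^9 cycles per second, as in the text
print(cycles, f"{seconds * 1000:.1f} ms")
```

The outer-loop term contributes only 17,500 of the roughly 56 million cycles, which is why the inner loop dominates the execution time.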
Pseudoinstruction     What it accomplishes         Solution
move $t1, $t2         $t1 = $t2                    add  $t1, $t2, $zero
clear $t0             $t0 = 0                      add  $t0, $zero, $zero
beq $t1, small, L     if ($t1 == small) go to L    li   $at, small
                                                   beq  $t1, $at, L
beq $t2, big, L       if ($t2 == big) go to L      li   $at, big
                                                   beq  $t2, $at, L
li $t1, small         $t1 = small                  addi $t1, $zero, small
li $t2, big           $t2 = big                    lui  $t2, upper(big)
                                                   ori  $t2, $t2, lower(big)
ble $t3, $t5, L       if ($t3 <= $t5) go to L      slt  $at, $t5, $t3
                                                   beq  $at, $zero, L
bgt $t4, $t5, L       if ($t4 > $t5) go to L       slt  $at, $t5, $t4
                                                   bne  $at, $zero, L
bge $t5, $t3, L       if ($t5 >= $t3) go to L      slt  $at, $t5, $t3
                                                   beq  $at, $zero, L
addi $t0, $t2, big    $t0 = $t2 + big              li   $at, big
                                                   add  $t0, $t2, $at
lw $t5, big($t2)      $t5 = Memory[$t2 + big]      li   $at, big
                                                   add  $at, $at, $t2
                                                   lw   $t5, 0($at)

Note: In the solutions, we make use of the li instruction, which should be implemented as shown in rows 5 and 6.
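The two li expansions in rows 5 and 6 of the table can be sketched in Python as below. The helper name `expand_li` is made up for illustration. The notes do not define "small" and "big", so the sketch assumes that "small" means a constant that fits the signed 16-bit immediate of addi, and that upper(big) and lower(big) are the high and low 16 bits of a 32-bit constant.

```python
def expand_li(reg, value):
    """Return the real MIPS instructions for the pseudoinstruction `li reg, value`.

    Assumption: "small" = fits a signed 16-bit immediate; anything else is "big".
    """
    if -32768 <= value <= 32767:
        # Row 5: one addi with $zero loads a small constant.
        return [f"addi {reg}, $zero, {value}"]
    word = value & 0xFFFFFFFF          # treat the constant as a 32-bit pattern
    upper = (word >> 16) & 0xFFFF      # upper(big): high 16 bits, loaded by lui
    lower = word & 0xFFFF              # lower(big): low 16 bits, merged in by ori
    # Row 6: lui fills the upper half (and zeroes the lower), ori fills the lower half.
    return [f"lui {reg}, 0x{upper:04X}", f"ori {reg}, {reg}, 0x{lower:04X}"]


for line in expand_li("$t2", 0x12345678):
    print(line)
```

ori is used rather than addi for the low half because ori zero-extends its immediate, so a low half with bit 15 set does not disturb the upper 16 bits.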
Effective CPI = Sum of (CPI of instruction type × Frequency of execution)

The average instruction frequencies for SPEC2000int and SPEC2000fp are 0.47 arithmetic (0.36 arithmetic and 0.11 logical), 0.375 data transfer, 0.12 conditional branch, and 0.015 jump. Thus, the effective CPI is 0.47 × 1.0 + 0.375 × 1.4 + 0.12 × 1.
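The weighted sum in the effective CPI formula can be sketched in Python as follows. The function name `effective_cpi` is made up for illustration. Only the two terms that appear in full in the notes (arithmetic and data transfer) are used, because the CPI values for conditional branches and jumps are cut off in the source. The result is therefore a partial sum, not the final answer.

```python
def effective_cpi(classes):
    """Weighted sum over instruction classes: sum of (CPI * frequency of execution)."""
    return sum(cpi * freq for cpi, freq in classes)


# (CPI, frequency) pairs for the terms that are complete in the notes.
known_terms = [
    (1.0, 0.47),    # arithmetic: 0.36 arithmetic + 0.11 logical
    (1.4, 0.375),   # data transfer
]

partial = effective_cpi(known_terms)
print(f"partial effective CPI = {partial:.3f}")
```

These two classes contribute about 0.995 to the effective CPI. The branch and jump terms would be appended to `known_terms` in the same (CPI, frequency) form once their CPI values are known.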