



Computer Architecture Questions
Typology: Exercises




Single cycle   10000   -   2.0 ns   -
Multi cycle    -       -   -        -
Pipelined      -       -   -        -
lw  $t0, 0($t2)
lw  $t1, 4($t0)
sub $s5, $t1, $t
sw  $s5, 4($t0)
mulacc r1, r2, r3 // r1 = r1 + r2*r
/* Process P1 */        /* Process P2 */
lw  $t0,a               lw  $t2,a
lw  $t1,b               lw  $t3,c
add $t0,$t0,$t1         sub $t2,$t2,$t
sw  $t0,a               sw  $t2,a
void strcpy (char x[], char y[])
{
    int i;
    i = 0;
    while ((x[i] = y[i]) != 0) // copy and test byte
        i++;
}
Assume you have a program with a serial part (S), which shows no speedup when executed on a parallel processor, and a parallel part (P), which shows ideal speedup on a parallel processor.
a. Assume S = 10% of the program when running on a single core with a data input of fixed size (D). Give the speedup when running this program on a 10-core system.
b. Give the general formula for the speedup when running this program on an N-core parallel system.
c. Now assume so-called weak scaling, i.e. the amount of input data (D) for the parallel part is proportional to the number of cores (N). Give the general formula for the speedup again.
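One possible way to check the arithmetic for parts a to c is the small C sketch below. It is not taken from the exercise sheet. It assumes S and P are fractions of the single-core run time with S + P = 1, that the serial time does not grow with the input, and that the parallel part scales ideally. The function names `amdahl_speedup` and `gustafson_speedup` are hypothetical; they stand for the fixed-size formula 1 / (S + (1 - S) / N), commonly known as Amdahl's law, and the weak-scaling formula S + (1 - S) * N, commonly known as Gustafson's law.

```c
#include <assert.h>

/* Fixed input size (parts a and b), assuming s is the serial fraction of
   the single-core run time: the serial part still takes s, while the
   parallel part (1 - s) is divided evenly over n cores. */
double amdahl_speedup(double s, int n)
{
    assert(s >= 0.0 && s <= 1.0 && n >= 1);
    return 1.0 / (s + (1.0 - s) / n);
}

/* Weak scaling (part c), assuming the parallel work grows in proportion
   to n while the serial work stays constant: one core would need
   s + (1 - s) * n, whereas n cores need s + (1 - s) = 1. */
double gustafson_speedup(double s, int n)
{
    assert(s >= 0.0 && s <= 1.0 && n >= 1);
    return s + (1.0 - s) * n;
}
```

Under these assumptions, `amdahl_speedup(0.1, 10)` evaluates to 1 / 0.19, roughly 5.26, and `gustafson_speedup(0.1, 10)` evaluates to 9.1. The gap between the two illustrates why growing the input with the core count hides much of the serial bottleneck.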
Question 12. Parallel execution of Matrix-Matrix multiply
We are building a highly parallel system (like a GPU processor) to support matrix multiplication. The system contains: