Practice Midterm Exam - High Performance Computing | CSI 702, Exams of Computer Science

Midterm exam Material Type: Exam; Professor: Wallin; Class: High-Performance Comput; Subject: Computational Sci& Informatics; University: George Mason University; Term: Fall 2007;


Uploaded on 12/09/2008

CSI 702 FALL 2007 Midterm Exam
Answers should be brief - most require only one paragraph. Please use additional paper to
answer these questions, and write your name at the top of each page. Good luck!
Honor Code Certification
Name :
I certify that I have abided by the GMU honor code in taking this examination. The work
on this exam is my own. I have received no assistance from other persons in completing this
exam.
Signature:
1. (35 pts) Optimization and Modern CPUs. For this problem, use the table, plots, and
profile below.
(a) (5 pts) Explain why branches within loops generally reduce performance.
(b) (5 pts) Explain why unrolling loops generally improves performance.
(c) (5 pts) Give a short pseudo-code example of how a “sentinel value” can be used to
improve performance.
(d) (5 pts) The table and graph below show the CPU and memory requirements as
a function of run-size for a particular code using the workstations in Research I
room 249 (aka the COS cluster). What is the maximum run size practical on this
workstation and why?
(e) (5 pts) Assume you needed to do twenty runs with a size of 100 million. Estimate
the computational requirements and CPU configuration for the run. Justify this
estimate, and explain changes that might be needed to run the code on this machine.
(f) (5 pts) Estimate how many years it will be until a run of size 100 million could be
executed on a typical home computer.
(g) (5 pts) From the abbreviated profile of the code, briefly discuss the prospects for
optimizing this code.
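For readers reviewing part (c), the following is a minimal sketch of one common use of a sentinel value, written in C rather than pseudo-code. It is only an illustration, not the expected exam answer. The function names `find_plain` and `find_sentinel` are hypothetical, and the sketch assumes the caller has reserved one spare slot at the end of the array. Placing the search key in that slot guarantees the loop stops, so the inner loop needs one comparison per element instead of two.

```c
#include <assert.h>
#include <stddef.h>

/* Plain linear search: each iteration tests both the bound (i < n)
 * and the match (a[i] == key). Returns the index or -1. */
long find_plain(const int *a, size_t n, int key) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] == key) {
            return (long)i;
        }
    }
    return -1;
}

/* Sentinel search (illustrative): the caller must provide a writable
 * spare slot at a[n]. The key is stored there, so the loop is certain
 * to terminate and only the match test remains inside it. */
long find_sentinel(int *a, size_t n, int key) {
    assert(a != NULL);
    a[n] = key;               /* sentinel: overwrites the spare slot */
    size_t i = 0;
    while (a[i] != key) {     /* single test per iteration */
        i++;
    }
    return (i < n) ? (long)i : -1;  /* i == n means only the sentinel matched */
}
```

Both functions return the same index for the same input; the sentinel version trades one extra array slot for a simpler loop body.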

Run Size    Memory Requirements (bytes)    Total CPU Time (seconds)
  262144                      109069440                       4102.
  131072                       54543488                       1953.
   65536                       27280512                        926.
   32768                       13649024                        425.
   16384                        6833280                        185.
    8192                        3425408                         88.
    4096                        1721472                         37.
    2048                         869504                         16.
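The memory column in the table grows almost exactly linearly with run size, which can be checked with a two-point fit. The sketch below uses the 2048 and 262144 rows from the table; the function names are hypothetical, and extrapolating the line far beyond the measured range (for example to a run size of 100 million) assumes the code keeps the same per-element storage, which the exam does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Two rows copied from the table: (run size, memory in bytes). */
static const int64_t N_SMALL = 2048,   MEM_SMALL = 869504;
static const int64_t N_LARGE = 262144, MEM_LARGE = 109069440;

/* Slope of the line through the two rows: bytes per unit of run size. */
int64_t bytes_per_element(void) {
    return (MEM_LARGE - MEM_SMALL) / (N_LARGE - N_SMALL);
}

/* Intercept of the same line: memory that does not scale with run size. */
int64_t fixed_overhead_bytes(void) {
    return MEM_SMALL - bytes_per_element() * N_SMALL;
}

/* Linear model; only a rough guide outside the measured range. */
int64_t predicted_memory_bytes(int64_t n) {
    assert(n >= 0);
    return bytes_per_element() * n + fixed_overhead_bytes();
}
```

With these two rows the fit is 416 bytes per unit of run size plus 17536 bytes of fixed overhead, and it reproduces the 131072 row (54543488 bytes) exactly. Under the stated assumption, a run size of 100 million would need about 41.6 GB.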

[Plot: Memory Requirements (bytes) vs. problem size, single node run]

[Plot: CPU time (seconds) vs. problem size, single node run]

Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  Ks/call  Ks/call  name
 73.72   3023.91  3023.91  47226979     0.00     0.00  gravity_module_mp_calculate_gravity_
 19.78   3835.35   811.44  47226979     0.00     0.00  walk_module_mp_walk_theta_
  2.10   3921.56    86.21       366     0.00     0.01  walk_module_mp_gravity_walk_
  1.04   3964.36    42.80       366     0.00     0.00  build_module_mp_build_recursive_
  0.48   3984.03    19.68       366     0.00     0.00  build_module_mp_quadrupole_recursive
  0.26   3994.59    10.56       366     0.00     0.00  build_module_mp_build_master_
  0.23   4004.20     9.60       364     0.00     0.00  verlet_module_mp_predict_verlet_its_
  0.21   4012.74     8.55       366     0.00     0.00  build_module_mp_build_tree_
  0.21   4021.22     8.47       366     0.00     0.00  build_module_mp_set_mac_bmax_

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define NUM_THREADS 8
int counter = 0;   /* shared by all threads */

void *test_mutex(void *tid) {
    int *tt;
    int thread_id;
    int rnd;
    int count_start;

    tt = tid;
    thread_id = (int) *tt;
    count_start = counter;              /* read the shared counter */
    rnd = rand() % 5;
    sleep(rnd);                         /* wait 0 to 4 seconds */
    counter = count_start + thread_id;  /* write the shared counter */
    printf("Hello World! It's me, thread #%d counter = %d!\n", thread_id, counter);
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t thread1[NUM_THREADS];
    int t;
    int ec;
    int thread_ids[NUM_THREADS];

    srand(time(0));

    for (t = 0; t < NUM_THREADS; t++) {
        thread_ids[t] = t;
        printf("In main: creating thread %d\n", t);
        ec = pthread_create(&thread1[t], NULL, test_mutex, (void *) &thread_ids[t]);
    }
    printf("the final results is %d \n", counter);
    pthread_exit(NULL);
}
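The question that accompanies this listing is not part of the excerpt, but the function name suggests it concerns mutual exclusion: each thread reads `counter`, waits, and then writes it back, so updates from other threads can be lost, and `main` prints the result without waiting for the threads. The following is a minimal sketch of one possible correction, not the exam's answer key. The names `counter_lock`, `add_thread_id`, and `run_threads` are hypothetical, and the random sleep is left out so the example finishes quickly.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 8

static int counter = 0;
/* Hypothetical lock guarding every access to counter. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds its own id to the shared counter. The read and the
 * write happen while the lock is held, so no update can be lost. */
static void *add_thread_id(void *arg) {
    int thread_id = *(int *)arg;

    pthread_mutex_lock(&counter_lock);
    counter = counter + thread_id;
    pthread_mutex_unlock(&counter_lock);
    return NULL;
}

/* Starts NUM_THREADS threads, waits for all of them, and returns the
 * final counter value. */
int run_threads(void) {
    pthread_t threads[NUM_THREADS];
    int thread_ids[NUM_THREADS];

    counter = 0;
    for (int t = 0; t < NUM_THREADS; t++) {
        thread_ids[t] = t;
        int ec = pthread_create(&threads[t], NULL, add_thread_id, &thread_ids[t]);
        assert(ec == 0);
    }
    /* Join before reading counter so every thread has finished. */
    for (int t = 0; t < NUM_THREADS; t++) {
        int ec = pthread_join(threads[t], NULL);
        assert(ec == 0);
    }
    return counter;
}
```

With the lock held across the update and the joins in place, `run_threads()` returns 0 + 1 + ... + 7 = 28 on every run, regardless of how the threads are scheduled.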