


Material Type: Assignment; Professor: Carothers; Class: PARALLEL PROGRAMMING; Subject: Computer Science; University: Rensselaer Polytechnic Institute; Term: Spring 2009;



For this assignment you will be comparing your implementation of a parallel reduction on the Blue
Gene/L vs. the existing implementation (already done for you) on CUDA. The Blue Gene details are:
To execute the CUDA reduction algorithm, execute the following command on AREA 51 (IP address is 72.224.56.37):
/usr/local/cuda/bin/linux/release/reduction --n=SIZE
where SIZE is one of the array sizes above. You can examine the source code in:
/usr/local/cuda/projects/reduction
NOTE, you will have to execute the following shell command so the CUDA shared lib is loaded when you execute the reduction command above.
export LD_LIBRARY_PATH=/usr/local/cuda/lib/:$LD_LIBRARY_PATH
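For orientation, the access pattern behind a parallel reduction can be sketched serially in C. This is only an illustration of a pairwise (tree) sum, not the CUDA sample's kernel and not a required Blue Gene design; the function name tree_reduce_sum is hypothetical.

```c
#include <stddef.h>

/* Sum n values in place with a tree-shaped (pairwise) reduction.
 * Each pass adds element i + stride into element i, and the stride
 * doubles after every pass. The additions within one pass do not
 * depend on each other, which is what a parallel machine exploits.
 * The total ends up in data[0]. */
static long long tree_reduce_sum(long long *data, size_t n)
{
    if (n == 0)
        return 0;
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* One pass: these additions could all run concurrently. */
        for (size_t i = 0; i + stride < n; i += 2 * stride)
            data[i] += data[i + stride];
    }
    return data[0];
}
```

With n elements the outer loop runs about log2(n) passes, so a machine with enough processors would need roughly that many steps rather than n - 1 sequential additions.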
2 Timing Code
The following is a routine that you would include as “rdtsc.h”. RDTSC stands for “read time-stamp
counter”. It is an x86 assembly language instruction (PowerPC provides an equivalent time-base
register) that returns a 64-bit count of the cycles the machine has executed since it was last
booted. The C code for this is:
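#ifndef RDTSC_H_DEFINED
#define RDTSC_H_DEFINED

#if defined(__i386__)
/* 32-bit x86: the "=A" constraint returns the 64-bit counter in edx:eax. */
static inline unsigned long long rdtsc(void)
{
    unsigned long long int x;
    __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
    return x;
}
#elif defined(__x86_64__)
// typedef unsigned long long int unsigned long long;
/* 64-bit x86: RDTSC returns the low half in eax and the high half in edx. */
static inline unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    /* The listing is cut off after the declaration above; the two
       statements below are the usual completion (an assumption). */
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}
#endif

#endif /* RDTSC_H_DEFINED */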
To time a test, read the counter before and after it:
unsigned long long start_time, finish_time, total_time = 0;
int i;

start_time = rdtsc();
for( i = 0; i < MAX_WHATEVER; i++ ) { /* DO TEST */ }
finish_time = rdtsc();
total_time = finish_time - start_time;
Note, I’ll place a copy of this in rdtsc.h on the class website for you to download.
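As a rough, self-contained sketch of the timing pattern, the code below times a serial sum and converts the cycle count to seconds. The names read_cycles, time_sum and cycles_to_seconds are hypothetical, the fallback to clock() on non-x86 machines is an assumption made only so the sketch compiles everywhere, and the clock rate passed to cycles_to_seconds must be supplied by you for the machine you measured.

```c
#include <stddef.h>
#include <time.h>

/* Read a cycle counter. On x86 this issues RDTSC; on other
 * architectures it falls back to clock() ticks (an assumption,
 * not part of the class header). */
static inline unsigned long long read_cycles(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned hi, lo;
    /* The "memory" clobber keeps the compiler from moving loads and
       stores across the counter read. */
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi) : : "memory");
    return ((unsigned long long)hi << 32) | lo;
#else
    return (unsigned long long)clock();
#endif
}

/* Convert a cycle count to seconds for a given clock rate in Hz. */
static double cycles_to_seconds(unsigned long long cycles, double clock_hz)
{
    return (double)cycles / clock_hz;
}

/* Time a serial sum over n ints. Returns the elapsed counter ticks
 * and stores the sum through sum_out. */
static unsigned long long time_sum(const int *data, size_t n,
                                   long long *sum_out)
{
    long long sum = 0;
    unsigned long long start = read_cycles();
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    unsigned long long finish = read_cycles();
    *sum_out = sum;
    return finish - start;
}
```

For example, cycles_to_seconds(total_time, 700e6) would convert a count taken on a 700 MHz clock; the rate shown is a placeholder. Reporting seconds rather than raw cycles makes measurements from machines with different clock rates comparable on one graph.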
3 HAND-IN INSTRUCTIONS
THE DEADLINE FOR THIS ASSIGNMENT IS TUESDAY, APRIL 7th, 2009.
Write up a short report that describes your Blue Gene implementation and graphs your Blue
Gene and CUDA performance data for the different array sizes. Hand in a hard copy of your report
in class on Tuesday, April 7th. Also, please place a copy of your code on AREA 51 in your account under
the subdirectory assignment3.