Parallel Min and Max, Schemes and Mind Maps of Advanced Computer Programming

2022/2023

Uploaded on 03/01/2023

agrima

Parallel Min and Max
Course Level: CS1

PDC Concepts Covered:

PDC Concept             Bloom Level
Concurrency             C
Sequential Dependency   C
Data Parallel           A

Programming Knowledge Prerequisites:
Basic programming knowledge in C++ is required for this lab. More specifically, the skills below
are sufficient to complete the coding assignment.
1. Variable declaration
2. If-else
3. For loop
4. Functions/methods
Tools Required:
An editor
A C++ compiler that supports OpenMP (e.g., the GNU C++ compiler, g++)
Problem Description
In this lab you will write a program to find the smallest and the largest numbers in an
array. Consider Table 1 below, which shows the steps for finding the smallest number in an array.
The algorithm uses a loop that indexes into the array using the loop variable. It also uses a
variable that holds the smallest value found in the array so far. This variable, called smallest,
begins with the value of the first item in the array. Then, at each iteration, the value of the array
item at the current index is compared to the value in smallest. If the array value is smaller,
smallest is updated to hold the new, smaller value. Once all iterations are finished, smallest
holds the overall smallest value in the array. Note that for the ten-element array in Table 1 the
loop iterates only nine times, because it starts at index 1, not 0. It does not need to compare the
first number in the array because that number is initially considered the smallest. Implementing
an algorithm to find the largest is similar.
Convince yourself that the table below is correct. Can you create an algorithm that does what
the table depicts?

array:  5.0  10.3  8.7  5.0  52.9  18.0  13.0  2.3  82.7  68.2

Loop step    Index   array[index]   Smallest
initialize                          5.0 <- array[0]
first        1       10.3           5.0
second       2       8.7            5.0
third        3       5.0            5.0
fourth       4       52.9           5.0
fifth        5       18.0           5.0
sixth        6       13.0           5.0
seventh      7       2.3            2.3
eighth       8       82.7           2.3
ninth        9       68.2           2.3

Table 1: Finding the smallest value in an array.

Methodology

To implement a parallel version, you will use domain decomposition, also sometimes called data decomposition, to find the smallest (min) and largest (max) value of the array in parallel. Domain decomposition requires dividing the array into equal parts and assigning each part to a processor.

Consider the simple example of 12 numbers and three processors. The computation occurs in two phases. First, the work is divided equally among the three processors. In this case, each processor finds the min and max of four numbers in the array. The first processor finds the min/max value among the elements at indices 0 through 3, the second processor among the elements at indices 4 through 7, and the third processor among the elements at indices 8 through 11.

The second phase occurs after all of the processors have finished the first phase. In this phase, each processor compares its min/max to the global min/max and updates the global values if needed.

Now, inside the for loop, we can read the values and update the local min and max as needed.

The final thing to do is to combine the local mins and maxes into a single global min and max. The potential problem here is that there is only one global min (and max) variable. If multiple threads try to change the value at the same time, they create a race condition. In other words, the final value depends on the order of the writes. In fact, if you run the program multiple times, you can get different answers each time! The solution is to use a critical section:

    #pragma omp critical
    {
        // put code here to update gmax and gmin
    }

The code inside the braces is executed by all the threads, but only by one thread at a time. Inside this section we can compare the local min/max to the global min/max and update the global values if needed, which avoids the race condition.

To see if your parallel version of this program is any faster, add some timing code. OpenMP includes a function, declared in <omp.h>, that returns the current wall-clock time in seconds. Its prototype is:

    double omp_get_wtime();

Put a call to this function before and after your parallel code, then print the difference, i.e.

    double start = omp_get_wtime();
    // Do all the parallel stuff above right here…
    double end = omp_get_wtime();
    std::cout << "The time was " << end - start << std::endl;

Post Lab Questions

  1. What is the average time for twenty runs of the serial version of the code?
  2. What is the average time for twenty runs of the parallel version of the code?
  3. Calculate the speedup of the parallel version. Is the parallel code significantly faster?
  4. The Methodology section above described how the min/max routine is decomposed to parallelize it. In your program, OpenMP did that work for you. How many elements of the array do you think OpenMP assigned to each processor? Hint: have your code print the number of threads in the computation (the function omp_get_num_threads() returns the number of threads in the current parallel region; omp_get_thread_num() returns the ID of the calling thread).

Turn In

Submit a file called README, as a text file, that has the post lab questions and the answers to those questions. Also include your source code.