Shared Memory Parallelism - Parallel and Distributed Computing - Lecture Slides
How Shared Memory Parallelism Behaves
The Fork/Join Model
- Many shared memory parallel systems use a programming model called Fork/Join. Each program begins executing on just a single thread, called the parent.
- Fork: When a parallel region is reached, the parent thread spawns additional child threads as needed.
- Join: When the parallel region ends, the child threads shut down, leaving only the parent still running.
The Fork/Join Model (cont’d)
- In principle, as a parallel section completes, the child threads shut down (join the parent), forking off again when the parent reaches another parallel section.
- In practice, the child threads often continue to exist but are idle.
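- Why? Creating and destroying threads is costly, so the runtime commonly keeps the child threads alive and reuses them for the next parallel region.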
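The fork/join cycle can be sketched with two back-to-back OpenMP parallel regions (OpenMP is the API these slides introduce next). This is a minimal sketch, not code from the slides: the function name `fork_join_twice` and the `visits` counter are hypothetical, and the `reduction` clause is used only to count threads safely. Whether the second region reuses idle threads is up to the runtime.

```c
#include <stdio.h>

/* Hypothetical helper: runs two parallel regions and returns the total
 * number of thread "visits" across both of them. */
int fork_join_twice(void)
{
    int visits = 0;

    /* Serial part: only the parent thread runs here. */
    printf("parent: before the first parallel region\n");

    /* Fork: a team of threads executes this block. */
    #pragma omp parallel reduction(+:visits)
    {
        visits += 1;   /* each thread in the team counts itself once */
    }
    /* Join: implicit barrier, after which only the parent continues. */

    printf("parent: between regions\n");

    /* Second fork: the runtime may reuse threads it kept idle. */
    #pragma omp parallel reduction(+:visits)
    {
        visits += 1;
    }

    return visits;
}
```

Compiled without OpenMP support the pragmas are ignored and the function returns 2; with OpenMP enabled (for example `-fopenmp` on GCC) each region contributes one visit per thread in its team.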
Principle vs. Practice
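[Figure: two fork/join timelines, each running from Start to End through a Fork and a Join; the second timeline also shows the child threads as Idle.]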
OpenMP
What Is OpenMP?
- Portable, shared-memory threading API
- Fortran, C, and C++
- Multi-vendor support for both Linux and Windows
- Standardizes task & loop-level parallelism
- Supports coarse-grained parallelism
- Combines serial and parallel code in single source
- Standardizes ~ 20 years of compiler-directed threading experience
http://www.openmp.org
Current spec is OpenMP 3: 318 pages (combined C/C++ and Fortran)
A Few Syntax Details to Get Started
- Most of the constructs in OpenMP are compiler directives or pragmas
  - For C and C++, the pragmas take the form:
    #pragma omp construct [clause [clause]…]
- Header file: omp.h
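As an illustration of the directive form and the header, the sketch below uses the `parallel` construct with one clause, `num_threads`. The function name `team_size_with_clause` is hypothetical. The `_OPENMP` guard is an assumption added so the file also compiles when OpenMP is switched off.

```c
#ifdef _OPENMP
#include <omp.h>   /* declarations for the OpenMP runtime library */
#endif

/* Hypothetical helper: asks for `requested` threads and reports how many
 * threads the parallel region actually ran with. */
int team_size_with_clause(int requested)
{
    int size = 1;   /* value used when compiled without OpenMP */

    /* construct: parallel    clause: num_threads(requested) */
    #pragma omp parallel num_threads(requested)
    {
#ifdef _OPENMP
        /* Only thread 0 writes the shared variable, so there is no race. */
        if (omp_get_thread_num() == 0) {
            size = omp_get_num_threads();
        }
#endif
    }
    return size;
}
```

The runtime may grant fewer threads than requested, so the result is at most `requested` and at least 1.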
Agenda
- What is OpenMP?
- Parallel regions
- Worksharing
- Data environment
- Synchronization
Code 1: Hello World(s)
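The code for this slide did not survive extraction. A typical OpenMP "Hello World(s)" might look like the sketch below; it is not the original slide code, and the function name `hello_worlds` and its return value are hypothetical additions that make the example easy to check.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Prints one greeting per thread and returns how many were printed. */
int hello_worlds(void)
{
    int greetings = 0;

    #pragma omp parallel reduction(+:greetings)
    {
        int id = 0, team = 1;          /* defaults for a build without OpenMP */
#ifdef _OPENMP
        id = omp_get_thread_num();     /* this thread's number within the team */
        team = omp_get_num_threads();  /* size of the team */
#endif
        printf("Hello World from thread %d of %d\n", id, team);
        greetings += 1;
    }
    return greetings;
}
```

Calling `hello_worlds()` from `main` prints one line per thread; the order of the lines can differ from run to run.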
Worksharing
- Worksharing is the general term used in OpenMP to describe distribution of work across threads.
- Three examples of worksharing in OpenMP are:
  - omp for construct
  - omp sections construct
  - omp task construct
Automatically divides work among threads
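A minimal sketch of the omp for construct follows. The names `square_all` and `owner` are hypothetical; `owner[i]` records which thread ran iteration `i`, which makes the division of work visible.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Squares each element of in[] into out[] and records in owner[] the number
 * of the thread that handled each iteration. */
void square_all(const int *in, int *out, int *owner, int n)
{
    #pragma omp parallel
    {
        int me = 0;                 /* thread number; 0 without OpenMP */
#ifdef _OPENMP
        me = omp_get_thread_num();
#endif
        /* The iterations of this loop are divided among the team. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            out[i] = in[i] * in[i];
            owner[i] = me;
        }
    }
}
```

Each iteration writes only its own elements of `out` and `owner`, so no synchronization is needed inside the loop.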
Combining constructs
- These two code segments are equivalent
    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < MAX; i++) {
            res[i] = huge();
        }
    }

    #pragma omp parallel for
    for (i = 0; i < MAX; i++) {
        res[i] = huge();
    }
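The equivalence can be checked with a small self-contained sketch. `huge_stub` is a hypothetical stand-in for the slides' `huge()`, given an argument so that each element receives a distinct value. The value of `MAX` is also an assumption.

```c
#define MAX 16

/* Hypothetical stand-in for huge(): a pure function of the index. */
static int huge_stub(int i) { return i * i + 1; }

/* Separate parallel and for constructs. */
void fill_split(int *res)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < MAX; i++) {
            res[i] = huge_stub(i);
        }
    }
}

/* Combined parallel for construct. */
void fill_combined(int *res)
{
    #pragma omp parallel for
    for (int i = 0; i < MAX; i++) {
        res[i] = huge_stub(i);
    }
}

/* Returns 1 if both forms produce the same array, 0 otherwise. */
int same_result(void)
{
    int a[MAX], b[MAX];
    fill_split(a);
    fill_combined(b);
    for (int i = 0; i < MAX; i++) {
        if (a[i] != b[i]) return 0;
    }
    return 1;
}
```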
The Private Clause
- Reproduces the variable for each task
- Variables are un-initialized; C++ object is default constructed
- Any value external to the parallel region is undefined
    void work(float *c, int N) {
        float x, y;
        int i;
        /* a and b are arrays defined elsewhere (not shown on the slide) */
        #pragma omp parallel for private(x, y)
        for (i = 0; i < N; i++) {
            x = a[i];
            y = b[i];
            c[i] = x + y;
        }
    }
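A self-contained variant of the routine above, assuming `a` and `b` are passed in as parameters rather than defined elsewhere (the name `add_arrays` is hypothetical). The initial values given to `x` and `y` are there only to show that the private copies do not inherit them.

```c
/* c[i] = a[i] + b[i], using x and y as per-thread scratch variables. */
void add_arrays(const float *a, const float *b, float *c, int n)
{
    /* These initial values are NOT carried into the private copies. */
    float x = -1.0f, y = -1.0f;
    int i;

    #pragma omp parallel for private(x, y)
    for (i = 0; i < n; i++) {
        x = a[i];   /* private copies start uninitialized: assign before reading */
        y = b[i];
        c[i] = x + y;
    }
}
```

Because every iteration assigns `x` and `y` before reading them, the uninitialized private copies are never observed.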
Schedule Clause Example
    #pragma omp parallel for schedule(static, 8)
    for (int i = start; i <= end; i += 2) {
        /* gPrimesFound is shared: this unsynchronized increment is a data race */
        if (TestForPrime(i)) gPrimesFound++;
    }
Iterations are divided into chunks of 8
- If start = 3, then the first chunk is i = {3, 5, 7, 9, 11, 13, 15, 17}
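The chunk arithmetic can be worked through in code. In the sketch below, `chunk_values` and `chunk_owner` are hypothetical helpers that reproduce the mapping by hand: with schedule(static, chunk), chunks are handed to threads round-robin in thread-number order. `is_prime` is a hypothetical stand-in for `TestForPrime`, and `count_primes` uses a `reduction` clause (an assumption, not from the slide) so the shared counter is updated safely.

```c
/* Writes the loop-variable values of chunk number k into out[] and returns
 * how many there are. Chunk k covers iterations k*chunk .. k*chunk+chunk-1. */
int chunk_values(int start, int end, int step, int chunk, int k, int *out)
{
    int count = 0;
    for (int iter = k * chunk; iter < (k + 1) * chunk; iter++) {
        int i = start + iter * step;
        if (i > end) break;
        out[count++] = i;
    }
    return count;
}

/* With a static schedule, chunk k goes to thread k mod nthreads. */
int chunk_owner(int k, int nthreads) { return k % nthreads; }

/* Hypothetical stand-in for TestForPrime: trial division. */
static int is_prime(int n)
{
    if (n < 2) return 0;
    for (int d = 2; d * d <= n; d++) {
        if (n % d == 0) return 0;
    }
    return 1;
}

/* Counts primes among start, start+2, ... up to end, in chunks of 8. */
int count_primes(int start, int end)
{
    int found = 0;
    #pragma omp parallel for schedule(static, 8) reduction(+:found)
    for (int i = start; i <= end; i += 2) {
        if (is_prime(i)) found++;
    }
    return found;
}
```

For start = 3 and a step of 2, chunk 0 holds 3 through 17 and chunk 1 begins at 19; with four threads, chunk 5 would go to thread 1.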
Data Scoping – What’s shared
- OpenMP uses a shared-memory programming model
- Shared variable - a variable whose name provides access to the same block of storage for each task region
  - Shared clause can be used to make items explicitly shared
  - Global variables are shared among tasks
  - C/C++: File scope variables, namespace scope variables, static variables, and variables with const-qualified type having no mutable member are shared
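A minimal sketch of a shared file-scope variable follows. The names `table`, `TABLE_SIZE`, and `fill_table` are hypothetical. The `default(none)` clause is an assumption added so that the sharing of `table` has to be stated explicitly with the shared clause.

```c
#define TABLE_SIZE 8

/* File-scope variable: one block of storage seen by every thread. */
static int table[TABLE_SIZE];

/* Fills the shared table in parallel, then sums it on the parent thread. */
int fill_table(void)
{
    int sum = 0;

    /* default(none) requires each variable's scoping to be spelled out. */
    #pragma omp parallel for default(none) shared(table)
    for (int i = 0; i < TABLE_SIZE; i++) {
        table[i] = i * i;   /* distinct element per iteration, so no race */
    }

    /* After the join, the parent sees every element the team wrote. */
    for (int i = 0; i < TABLE_SIZE; i++) {
        sum += table[i];
    }
    return sum;
}
```

The loop variable `i` is private to each thread, while `table` names the same storage in all of them, which is why the serial loop after the region can read the complete result.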