OpenMP Programming: Dot Product of Two Vectors using Parallel Communication, Slides of Parallel Computing and Programming

Information on writing an openmp-based program to compute the dot product of two vectors with 1200 elements each on a 6-core smp system. It covers openmp directives and library functions, group communication operations, cost analysis, and examples of broadcast and reduction in matrix-vector multiplication. The document also discusses communication patterns in different topologies.

Typology: Slides

2011/2012

Uploaded on 07/23/2012

paramita
paramita ๐Ÿ‡ฎ๐Ÿ‡ณ

4.6

(16)

120 documents

1 / 13

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Quiz๎˜ƒ#๎˜ƒ05
Marks๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ10๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒ๎˜ƒTime๎˜ƒ10๎˜ƒminutes
Write๎˜ƒan๎˜ƒOpenMP๎˜ƒbased๎˜ƒprogram๎˜ƒto๎˜ƒcompute๎˜ƒthe๎˜ƒ
Dot๎˜ƒproduct๎˜ƒof๎˜ƒtwo๎˜ƒvectors๎˜ƒA๎˜ƒ&๎˜ƒB๎˜ƒ(with๎˜ƒ1200๎˜ƒ
elements๎˜ƒeach)๎˜ƒusing๎˜ƒa๎˜ƒ6โ€Core๎˜ƒSMP๎˜ƒsystem.
Refer๎˜ƒdirective/๎˜ƒlib๎˜ƒfunction๎˜ƒlist
docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd

Partial preview of the text

Download OpenMP Programming: Dot Product of Two Vectors using Parallel Communication and more Slides Parallel Computing and Programming in PDF only on Docsity!

Quiz

Marks

Time

minutes

Write

an

OpenMP

based

program

to

compute

the

Dot

product

of

two

vectors

A

B

(with

elements

each)

using

a

โ€Core

SMP

system.

Refer

directive/

lib

function

list

Some openMp directives & library Functions

void omp_set_num_threads

(int num_threads); int omp_get_num_threads ();int omp_get_max_threads ();int omp_get_thread_num ();int omp_get_num_procs ();int omp_in_parallel();

#pragma omp parallel [clause list]

reduction (operator: variable list). #pragma omp for [clause list

#pragma omp sections

#pragma omp section

void omp_set_lock

*(omp_lock_t lock); void omp_unset_lock

*(omp_lock_t lock); int omp_test_lock

*(omp_lock_t lock);

Summary: Group Communication

Operations

โ€ข^

Group communication operations are builtusing point-to-point messaging primitives.

-^

Communicating a message of size

m

over an

uncongested network takes time (

ts

+ t

w

m )

โ€ข^

Where necessary, we take congestion intoaccount explicitly by scaling the

t

w^

term.

โ€ข^

We assume that the network is bidirectionaland that communication is single-ported.

Cost Analysis: one-to-all broadcast

& reduction

-^

Using recursive doubling approach thebroadcast or reduction procedure on all thetopologies (array, mesh, tree, hypercube)involves

log p

point-to-point simple message

transfers, each at a time cost of

t^ s

+ t

w

m

-^

The total time (assuming no congestion) istherefore given by:

Broadcast and Reduction: Matrix-Vector Multiplication One-to-all broadcast and all-to-one reduction in the multiplication of a

4 x 4

matrix with a

4 x 1

vector.

Communication Patterns in Different

Topologies

  • One-to-All Broadcast and All-to-One

Reduction

  • All-to-All Broadcast and Reduction

& All-Reduce Operations

  • Scatter and Gather (one to all & all to

one personalized comm)

All-to-All Broadcast & Reduction

on a Ring

โ€ข^

Simplest approach: perform

p

one-to-all broadcasts

This is not the most efficient way, though.

-^

Each node first sends to one of its neighbors thedata it needs to broadcast.

-^

In subsequent steps, it forwards the data receivedfrom one of its neighbors to its other neighbor.

-^

The algorithm terminates in

p-

steps.

All-to-All Broadcast and Reduction on a Ring

-^

On a ring, the time is given by:

(t

s^

+ t

w

m)(p-1)

. docsity.com

All-to-all broadcast on a

3 x 3

mesh. The groups of nodes communicating with each other in each phase

are enclosed by dotted boundaries. By the end of the second phase, all nodes get (0,1,2,3,4,5,6,7)

All-to-all Broadcast on a Mesh

-^

On a mesh, the time is given by:

2t

(s^

p โ€“ 1) + t

m(p-1)w^