PARALLEL ALGORITHMS CONTEXT, Lecture notes of Parallel Computing and Programming

Walden University Parallel Computing and Programming


Typology: Lecture notes

2022/2023

Available from 06/07/2023

cynthia-std 🇺🇸



Parallel and Distributed Systems

MOD 4: PARALLEL ALGORITHMS

MOD 4 - Parallel Algorithms

CSE524 Parallel Algorithms

Part I: Introduction

Goal: Set the parameters for studying parallelism

Why Study Parallelism?

- After all, for most of our daily computer uses, sequential processing is plenty fast
- It is a fundamental departure from the “normal” computer model, therefore it is inherently cool
- The extra power from parallel computers is enabling in science, engineering, business, …
- Multicore chips present a new opportunity
- Deep intellectual challenges for CS -- models, programming languages, algorithms, HW, …

Size vs Power

Processor (typical use)                  Die area           Power @ clock
Power5 (server)                          389 mm^2           120 W @ 1900 MHz
Intel Core2 sc (laptop)                  130 mm^2           15 W @ 1000 MHz
ARM Cortex A8 (automobiles)              5 mm^2             0.8 W @ 800 MHz
Tensilica DP (cell phones / printers)    0.8 mm^2           0.09 W @ 600 MHz
Tensilica Xtensa (Cisco router)          0.32 mm^2 for 3!   0.05 W @ 600 MHz

[Figure: relative die sizes of Power5, Intel Core2, ARM Cortex A8, Tensilica DP, and Xtensa x 3]

Each processor operates with 0.3-0.1 efficiency of the largest chip: more threads, lower power

Topic Overview

Goal: To give a good idea of parallel computation
- Concepts -- looking at problems with “parallel eyes”
- Algorithms -- different resources; different goals
- Languages -- reduce control flow; increase independence; new abstractions
- Hardware -- the challenge is communication, not instruction execution
- Programming -- describe the computation without saying it sequentially
- Practical wisdom about using parallelism

Parallel vs Distributed Computing

Comparisons are often matters of degree

Characteristic   Parallel   Distributed
Overall Goal     Speed      Convenience
Interactions     Frequent   Infrequent
Granularity      Fine       Coarse
Reliable         Assumed    Not Assumed

Parallel vs Concurrent

- In OS and DB communities execution of multiple threads is logically simultaneous
- In Arch and HPC communities execution of multiple threads is physically simultaneous
- The issues are often the same, say with respect to races
- Parallelism can achieve states that are impossible with concurrent execution because two events happen at once

Parallel Summation

To sum a sequence in parallel
- add pairs of values producing 1st level results,
- add pairs of 1st level results producing 2nd level results,
- sum pairs of 2nd level results …

That is,

(…((A[0]+A[1]) + (A[2]+A[3])) + ... + (A[n-2]+A[n-1]))…)
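The pairwise scheme above can be sketched in C. This is a minimal illustration, not code from the lecture: the function name `tree_sum` is hypothetical, the array is overwritten in place, and the loops run sequentially. The point is that every addition inside one pass of the outer loop is independent of the others, so a parallel machine could perform a whole level at once.

```c
#include <assert.h>
#include <stddef.h>

/* Pairwise (tree) summation, assumed helper name.
   Level by level, add values that sit `stride` positions apart.
   Overwrites a[] and returns the total. */
long tree_sum(long *a, size_t n) {
    assert(n == 0 || a != NULL);
    if (n == 0) return 0;
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* All iterations of this inner loop touch disjoint elements,
           so they could execute simultaneously. */
        for (size_t i = 0; i + stride < n; i += 2 * stride) {
            a[i] += a[i + stride];
        }
    }
    return a[0];
}
```

For n values the outer loop runs about log2(n) times, which is the number of levels in the summation tree; the total number of additions is still n - 1, the same as the left-to-right sum.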

Express the Two Formulations

- Graphic representation makes difference clear
  - Same number of operations; different order

[Figure: the two formulations applied to the values 6 4 16 10 16 14 2 8]

What’s the Problem?

- It’s not likely a compiler will produce parallel code from a C specification any time soon…
- Fact: For most computations, a “best” sequential solution (practically, not theoretically) and a “best” parallel solution are usually fundamentally different …
  - Different solution paradigms imply computations are not “simply” related
  - Compiler transformations generally preserve the solution paradigm

Therefore... the programmer must discover the || solution

A Related Computation

- Consider computing the prefix sums
- Semantics ...
  - A[0] is unchanged
  - A[1] = A[1] + A[0]
  - A[2] = A[2] + (A[1] + A[0])
  - ...
  - A[n-1] = A[n-1] + (A[n-2] + ( ... (A[1] + A[0]) … ))

    for (i=1; i<n; i++) {
        A[i] += A[i-1];
    }

A[i] is the sum of the first i + 1 elements

What advantage can ||ism give?

Parallel Prefix Algorithm

[Figure: binary tree over the input 6 4 16 10 16 14 2 8, yielding the prefix sums 6 10 26 36 52 66 68 76]

- Compute sum going up
- Figure prefixes going down
- Invariant: Parent data is sum of elements to left of subtree
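The up-then-down idea can be sketched in C using a common in-place formulation, often called an up-sweep/down-sweep scan. This is an illustrative sketch under stated assumptions, not necessarily the exact tree layout drawn in the slides: the name `prefix_sums` is hypothetical, n is assumed to be a power of two, and the loops run sequentially even though each inner loop's iterations are independent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Inclusive prefix sums via an up-sweep and a down-sweep (assumed name).
   Requires n to be a power of two. Returns 0 on success, -1 if the
   scratch buffer cannot be allocated. */
int prefix_sums(long *a, size_t n) {
    assert(a != NULL && n > 0 && (n & (n - 1)) == 0);
    long *t = malloc(n * sizeof *t);
    if (t == NULL) return -1;
    memcpy(t, a, n * sizeof *t);

    /* Going up: each parent slot receives the sum of its subtree. */
    for (size_t d = 1; d < n; d *= 2)
        for (size_t i = 2 * d - 1; i < n; i += 2 * d)
            t[i] += t[i - d];

    /* Going down: each node holds the sum of everything to the left
       of its subtree. The root has nothing to its left. */
    t[n - 1] = 0;
    for (size_t d = n / 2; d >= 1; d /= 2)
        for (size_t i = 2 * d - 1; i < n; i += 2 * d) {
            long left_sum = t[i - d];
            t[i - d] = t[i];      /* left child inherits the parent's value */
            t[i] += left_sum;     /* right child adds the left subtree's sum */
        }

    /* t[i] is now the sum of elements strictly left of i; add a[i] itself. */
    for (size_t i = 0; i < n; i++)
        a[i] += t[i];

    free(t);
    return 0;
}
```

Each sweep has log2(n) levels, so with enough processors the whole computation takes on the order of 2 log n steps instead of the n - 1 steps of the sequential loop.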

Fundamental Tool of || Pgmming

- Original research on parallel prefix algorithm published by R. E. Ladner and M. J. Fischer, Parallel Prefix Computation, Journal of the ACM 27(4):831-838, 1980
- The Ladner-Fischer algorithm requires 2 log n time, twice as much as simple tournament global sum, not linear time

Applies to a wide class of operations
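To illustrate the "wide class of operations", here is a hedged sketch of a prefix computation parameterized by any associative binary operator. The names `scan`, `binop`, `op_add`, and `op_max` are hypothetical. It uses a step-doubling scheme (a different, simpler variant than the tree algorithm, doing more total work) and simulates each parallel round sequentially by walking the array from the high end.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*binop)(long, long);

long op_add(long x, long y) { return x + y; }
long op_max(long x, long y) { return x > y ? x : y; }

/* Inclusive scan with an associative operator (assumed helper).
   In round d, element i combines with the element d positions to its
   left. Iterating i downward guarantees a[i - d] still holds its value
   from the previous round, mimicking a simultaneous update. */
void scan(long *a, size_t n, binop op) {
    assert(op != NULL);
    for (size_t d = 1; d < n; d *= 2)
        for (size_t i = n; i-- > d; )
            a[i] = op(a[i - d], a[i]);
}
```

Calling `scan` with `op_add` yields prefix sums, while `op_max` yields running maxima; any operator works as long as it is associative, since the algorithm regroups the operands.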