Understanding Vector Pipeline, Shared & Distributed Memory Models in Parallel Processing: Study notes of Computer Science

An overview of parallel processing and distributed systems, focusing on vector pipeline, shared memory, and distributed memory models. It covers the concepts of subdividing arithmetic operations into stages, shared memory machines, and distributed memory machines. The document also discusses the importance of interconnection topology and strategies for reducing cache misses.

Typology: Study notes

Pre 2010

Uploaded on 02/12/2009

koofers-user-hwz

High Performance Computing
1. Code Optimization
2. Shared Memory Systems
3. Distributed Memory Systems

Why Supercomputers?
We want to:
• Run many problems in a timely manner (e.g. optimization)
• Run more complex problems (e.g. multi-scale models)
• Run larger problems (e.g. more resolution)
• Run for longer times (e.g. atmospheric dispersion models)
Major issues:
• Speed
• Memory
• Storage

Forms of Parallelism
• Multiple functional units
• Pipelining
• Vector processors
• Multiprocessor systems
• Distributed systems

Multiple Functional Units
• This is one of the earliest forms of parallelism
• It consists of multiplying the number of functional units, such as adders and multipliers
• The detection of parallelism is done at compile time with a dependence analysis tree
• Example of dependence analysis for an arithmetic expression and parallel processing of operations: (a + b) + (c * d + d * e)
[Figure: dependence analysis tree; recoverable nodes: + + + a b * c d * e f]
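The dependence analysis described above can be illustrated with a short sketch. It assumes the example expression is (a + b) + (c * d + d * e) and stores it as nested tuples, which is a hypothetical representation chosen for illustration rather than anything from the notes. Operations that land on the same step do not depend on one another, so separate functional units could execute them at the same time.

```python
# Hypothetical expression tree for (a + b) + (c * d + d * e):
# each node is (operator, left, right); leaves are variable names.
EXPR = ("+", ("+", "a", "b"), ("+", ("*", "c", "d"), ("*", "d", "e")))


def schedule(node, levels):
    """Return the step at which `node` is ready, recording operations per step."""
    if isinstance(node, str):
        return 0  # operands are available before the first step
    op, left, right = node
    # An operation can start only after both of its inputs are ready.
    step = 1 + max(schedule(left, levels), schedule(right, levels))
    levels.setdefault(step, []).append(op)
    return step


def parallel_steps(expr):
    """Group the operators of `expr` into steps that can run in parallel."""
    levels = {}
    schedule(expr, levels)
    return [levels[s] for s in sorted(levels)]


steps = parallel_steps(EXPR)
for number, ops in enumerate(steps, start=1):
    print(f"step {number}: {ops}")
```

With enough adders and multipliers, the five operations finish in three steps instead of five, because the first step runs one addition and two multiplications side by side.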

Vector Processors
• Vector computers are equipped with pipelined functional units, such as pipelined floating-point adders and multipliers
• In addition, they incorporate vector instructions explicitly as part of their instruction sets. For example:
  – vload: load a vector from memory to a vector register
  – vadd: add the contents of two vector registers
  – vmul: multiply the contents of two vector registers
• As with multiple functional units on scalar machines, vector pipelines can be duplicated into multiple vector pipelines
• Examples: NEC, Fujitsu
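To make the three vector instructions concrete, here is a minimal sketch that mimics them with Python lists standing in for vector registers. Only the names vload, vadd, and vmul come from the notes; the register length `VLEN`, the `memory` list, and the function signatures are assumptions made for illustration and do not describe any real instruction set.

```python
VLEN = 4  # assumed vector register length (illustrative only)

# Assumed flat memory holding two vectors back to back.
memory = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]


def vload(address):
    """Load VLEN consecutive memory words into a vector register."""
    return memory[address:address + VLEN]


def vadd(v1, v2):
    """Add two vector registers element by element."""
    return [x + y for x, y in zip(v1, v2)]


def vmul(v1, v2):
    """Multiply two vector registers element by element."""
    return [x * y for x, y in zip(v1, v2)]


# Each call stands for one vector instruction acting on VLEN elements.
va = vload(0)
vb = vload(4)
vsum = vadd(va, vb)
vprod = vmul(va, vb)
print(vsum, vprod)
```

On a scalar machine the same work would need a loop of VLEN separate add or multiply instructions; the vector instruction expresses the whole loop at once.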

Multiprocessor Systems
• A multiprocessor system is a computer, or a set of computers, consisting of several processing elements, each consisting of a CPU, a memory, and an I/O system
• The processing elements are interconnected with a bus or network
• Examples: dual-processor PCs, IBM SP

Parallel Processing Models
• Vector pipeline model
• Shared memory model
• Single instruction, multiple data (SIMD) or data parallel model
• Distributed memory message passing model

Vector pipeline model
• Vector computers have vector pipeline processors connected to a large global memory
• The parallelization is done at the level of the arithmetic operations
• Useful when the same operation must be applied many times to an array of data (vector operation)
• Model: subdivide the arithmetic operations into different stages and perform every stage on a different entry
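The pipeline model can be sketched as a small cycle-by-cycle simulation. The three stage functions below are arbitrary stand-ins for the stages of a real arithmetic unit, and `run_pipeline` is a hypothetical helper written for this illustration. In every cycle each stage works on a different entry, so n entries pass through s stages in n + s - 1 cycles rather than n * s.

```python
def run_pipeline(stages, inputs):
    """Push `inputs` through `stages`; return (results, number of cycles)."""
    slots = [None] * len(stages)  # slots[i]: output of stage i from the last cycle
    pending = list(inputs)
    results = []
    cycles = 0
    # Keep clocking while entries are waiting or still inside the pipeline.
    while pending or any(slot is not None for slot in slots[:-1]):
        # Walk backwards so each stage reads what its predecessor
        # produced in the previous cycle.
        for i in range(len(stages) - 1, 0, -1):
            previous = slots[i - 1]
            slots[i] = stages[i](previous) if previous is not None else None
        slots[0] = stages[0](pending.pop(0)) if pending else None
        if slots[-1] is not None:
            results.append(slots[-1])  # an entry leaves the last stage
        cycles += 1
    return results, cycles


# Stand-in stages (assumptions): a real unit might align, add, and normalize.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
inputs = [1, 2, 3, 4]

results, cycles = run_pipeline(stages, inputs)
unpipelined_cycles = len(inputs) * len(stages)
print(results, cycles, unpipelined_cycles)
```

For four entries and three stages the pipeline needs 6 cycles, compared with 12 if each entry had to finish all stages before the next one started. The gain grows with the length of the array, which is why the model suits long vector operations.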

SIMD or data parallel model
• A typical distributed memory machine consists of a large number of identical processors, each with its own memory, interconnected in a regular topology
• In the SIMD model, a host processor stores the program and each slave processor holds different data
• The host then broadcasts instructions to the processors, which execute them simultaneously
• The interconnection topology is important for mapping the physical problem to the processing space
[Figure: processors (P) interconnected as a ring, a mesh, and an N-cube]
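The three topologies named above differ in which processors are directly connected, and that determines how a physical problem maps onto the machine. The sketch below computes the direct neighbours of a processor in each case. The numbering schemes (ranks 0 to size - 1 on the ring, (row, column) pairs on a mesh without wraparound, binary addresses on the N-cube) are assumptions chosen for illustration.

```python
def ring_neighbors(rank, size):
    """Left and right neighbours of `rank` on a ring of `size` processors."""
    return [(rank - 1) % size, (rank + 1) % size]


def mesh_neighbors(row, col, rows, cols):
    """Up, down, left, and right neighbours on a rows x cols mesh (no wraparound)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def ncube_neighbors(rank, dimension):
    """Neighbours on an N-cube: flip each of the `dimension` address bits."""
    return [rank ^ (1 << bit) for bit in range(dimension)]


ring = ring_neighbors(0, 8)
mesh = mesh_neighbors(0, 0, 3, 3)
cube = ncube_neighbors(5, 3)
print(ring, mesh, cube)
```

A one-dimensional domain split into consecutive pieces maps naturally onto the ring, and a two-dimensional grid onto the mesh, because pieces that exchange boundary data sit on directly connected processors. On the N-cube every processor has as many neighbours as the cube has dimensions.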

MIMD or distributed computing model
• Distributed computing uses the message passing model
• In this model there is no global synchronization of the parallel tasks
• Computations are data driven: a processor performs a given task only when the operands it requires become available
• The programmer must code all the data exchanges between the processors explicitly
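A small emulation can show what "data driven" and "explicit data exchange" mean in practice. The sketch below uses Python threads and queues as stand-ins for two processors and the links between them. That is an assumption made to keep the example self-contained: threads in one process actually share memory, whereas a real distributed memory machine would send these messages over a network. The worker names and the choice of a split sum are hypothetical.

```python
import queue
import threading

data = list(range(1, 9))  # the values 1..8, split between two workers
channel = queue.Queue()   # stand-in for the link from worker 0 to worker 1
result = queue.Queue()    # stand-in for the link back to the caller


def worker0(chunk):
    """Compute a partial sum and send it explicitly to worker 1."""
    partial = sum(chunk)
    channel.put(partial)  # explicit send


def worker1(chunk):
    """Compute a partial sum, then wait for the operand from worker 0."""
    partial = sum(chunk)
    other = channel.get()  # explicit receive: blocks until the operand arrives
    result.put(partial + other)


threads = [
    threading.Thread(target=worker0, args=(data[:4],)),
    threading.Thread(target=worker1, args=(data[4:],)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = result.get()
print(total)
```

There is no global barrier here: worker 1 proceeds as soon as the value it needs arrives, and every transfer of data appears in the code as a send or a receive.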

Grid Computing
• The idea of grid computing is that a user submits a calculation to a computational grid (a heterogeneous network of computers) and gets the results without having to worry about which computer it will run on
• The system monitors the availability and usage of resources and, depending on the size of the requested calculation, ships the necessary codes and data for execution to the appropriate computer
• When the results are ready, they are shipped back to the user

Grid Systems
Key components of a grid computing system include:
• System management (which machines are online?)
• Resource management (what is the load of each machine?)
• Process management (are the processes alive or done?)
• Fault tolerance (if a process is interrupted, it is restarted)
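The four components listed above can be sketched together as a toy dispatcher. Everything in this example is hypothetical: the machine names, the load figures, the `pick_machine` and `run_with_restart` helpers, and the use of Python's built-in InterruptedError to signal an interrupted process. It is meant only to show how the pieces relate, not how any real grid middleware works.

```python
# Assumed snapshot of the grid: which machines are online and how loaded they are.
machines = {
    "node-a": {"online": True, "load": 0.80},
    "node-b": {"online": False, "load": 0.10},
    "node-c": {"online": True, "load": 0.25},
}


def pick_machine(machines):
    """System and resource management: the least-loaded machine that is online."""
    online = {name: info for name, info in machines.items() if info["online"]}
    if not online:
        raise RuntimeError("no machine is online")
    return min(online, key=lambda name: online[name]["load"])


def run_with_restart(job, machines, max_attempts=3):
    """Process management and fault tolerance: restart an interrupted job."""
    for attempt in range(1, max_attempts + 1):
        target = pick_machine(machines)
        try:
            return target, attempt, job(target)
        except InterruptedError:
            continue  # the process was interrupted, so submit it again
    raise RuntimeError("job failed after all attempts")


attempts_seen = []


def flaky_job(machine):
    """Hypothetical job that is interrupted on its first run only."""
    attempts_seen.append(machine)
    if len(attempts_seen) == 1:
        raise InterruptedError("process interrupted")
    return 6 * 7


outcome = run_with_restart(flaky_job, machines)
print(outcome)
```

The offline machine is never chosen even though it reports the lowest load, and the user only sees the final result: the interrupted first attempt is retried without any action on their part.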

High Performance Programming
• Coding for scalar computers
• Coding for vector computers
• Coding for shared memory computers
• Coding for distributed memory computers

Memory architecture
Modern computers have a hierarchical memory system:
• Backup system: tapes / DVD / CD-ROM / floppy
• Hard disk / virtual or swap memory
• RAM
• Cache
• Registers on the processor
[Figure: memory hierarchy diagram with labels Processor, Registers, Cache, RAM, Hard Disk, Tape]