High Performance Computing
1. Code Optimization
2. Shared Memory Systems
3. Distributed Memory Systems
Why Supercomputers?
We want to:
• Run many problems in a timely manner (e.g. optimization)
• Run more complex problems (e.g. multi-scale models)
• Run larger problems (e.g. more resolution)
• Run for longer times (e.g. atmospheric dispersion models)
Major issues:
• Speed
• Memory
• Storage
Forms of Parallelism
• Multiple functional units
• Pipelining
• Vector processors
• Multiprocessor systems
• Distributed systems
Multiple Functional Units
• This is one of the earliest forms of parallelism
• It consists of multiplying the number of functional units, such as adders and multipliers
• The detection of parallelism is done at compile time with a dependence analysis tree
• Example of dependence analysis for an arithmetic expression and parallel processing of operations
[Figure: dependence analysis tree with + and * operator nodes over operand leaves a, b, c, d; expression as extracted: (a + b) + (c * d + d * e) * e f]
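The idea of a dependence analysis tree can be sketched in a few lines. The expression used here, (a + b) + (c * d + e * f), is an assumed example and not necessarily the one on the slide; the function names `schedule` and `parallel_steps` are hypothetical. Each operation is placed at the earliest step at which both of its operands are ready, so operations listed under the same step could go to different functional units.

```python
# Expression tree as nested tuples: (operator, left, right); leaves are names.
# Assumed example expression: (a + b) + (c * d + e * f)
EXPR = ("+", ("+", "a", "b"), ("+", ("*", "c", "d"), ("*", "e", "f")))


def schedule(node, levels):
    """Return the step at which `node` is available, recording each operation
    under the earliest step at which both of its operands are ready."""
    if isinstance(node, str):  # a leaf (variable) is available at step 0
        return 0
    op, left, right = node
    step = max(schedule(left, levels), schedule(right, levels)) + 1
    levels.setdefault(step, []).append(op)
    return step


def parallel_steps(expr):
    """List the operations that can run in parallel at each step."""
    levels = {}
    depth = schedule(expr, levels)
    return [levels[s] for s in range(1, depth + 1)]


steps = parallel_steps(EXPR)
for i, ops in enumerate(steps, start=1):
    print(f"step {i}: {len(ops)} operation(s) in parallel: {ops}")
```

With enough functional units, the five operations of this expression complete in three steps instead of five.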
Vector Processors
• Vector computers are equipped with pipelined functional units, such as pipelined floating point adders and multipliers
• In addition, they incorporate vector instructions explicitly as part of their instruction sets. For example:
  – vload: load a vector from memory to a vector register
  – vadd: add the content of two vector registers
  – vmul: multiply the content of two vector registers
• As with multiple functional units on scalar machines, vector pipelines can be duplicated to form multiple vector pipelines
• Examples: NEC, Fujitsu
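As an illustration only, the three vector instructions named above can be emulated with plain Python lists. The register length `VLEN`, the argument order, and the memory layout are assumptions made for this sketch; a real vector instruction set defines its own operands and encodings.

```python
VLEN = 4  # assumed vector register length


def vload(memory, addr, n=VLEN):
    """Load n consecutive words starting at addr into a vector register."""
    return memory[addr:addr + n]


def vadd(v1, v2):
    """Element-wise sum of two vector registers."""
    return [x + y for x, y in zip(v1, v2)]


def vmul(v1, v2):
    """Element-wise product of two vector registers."""
    return [x * y for x, y in zip(v1, v2)]


# Hypothetical memory image: a at address 0, b at address 4, c at address 8.
memory = [1.0, 2.0, 3.0, 4.0,
          10.0, 20.0, 30.0, 40.0,
          2.0, 2.0, 2.0, 2.0]

v0 = vload(memory, 0)
v1 = vload(memory, 4)
v2 = vload(memory, 8)
result = vmul(vadd(v0, v1), v2)  # (a + b) * c, one instruction per whole vector
print(result)
```

The point of the sketch is that the loop over elements disappears from the program: three loads, one add and one multiply process all four elements.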
Multiprocessor Systems
• A multiprocessor system is a computer, or a set of computers, consisting of several processing elements, each consisting of a CPU, a memory and an I/O system
• The processing elements are interconnected with a bus or network
• Examples: dual-processor PCs, IBM SP
Parallel Processing Models
• Vector pipeline model
• Shared memory model
• Single instruction multiple data (SIMD) model, or data parallel model
• Distributed memory message passing model
Vector pipeline model
• Vector computers have vector pipeline processors connected to a large global memory
• The parallelization is done at the level of the arithmetic operations
• Useful when the same operation must be applied many times to an array of data (vector operation)
• Model: subdivide the arithmetic operations into different stages and perform every stage on a different entry
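The pipeline model can be sketched with a small cycle-by-cycle simulation. It assumes an idealized pipeline in which every stage takes one cycle and a new entry can be fed in every cycle; the function name `pipelined_add` and the default of four stages are hypothetical.

```python
def pipelined_add(a, b, stages=4):
    """Add two vectors through an idealized pipeline.

    Returns (results, cycles). In each cycle every stage works on a
    different entry of the vectors.
    """
    n = len(a)
    pipeline = [None] * stages  # pipeline[k] = index of the entry in stage k
    results = [None] * n
    fed = 0      # number of entries already fed into stage 0
    done = 0     # number of entries that have left the last stage
    cycles = 0
    while done < n:
        # Every entry advances one stage; a new entry enters stage 0 if any remain.
        new = fed if fed < n else None
        pipeline = [new] + pipeline[:-1]
        if new is not None:
            fed += 1
        cycles += 1
        # The entry in the last stage completes at the end of this cycle.
        last = pipeline[-1]
        if last is not None:
            results[last] = a[last] + b[last]
            done += 1
    return results, cycles


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
results, cycles = pipelined_add(a, b, stages=4)
print(results, cycles)
```

Under these assumptions n entries take n + stages - 1 cycles, compared with n * stages cycles if each addition had to finish before the next one started (11 versus 32 cycles for the vectors above).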
SIMD or data parallel model
• A typical distributed memory machine consists of a large number of identical processors, each with its own memory, interconnected in a regular topology
• In the SIMD model a host processor stores the program and each slave processor holds different data
• The host then broadcasts instructions to the processors, which execute them simultaneously
• The interconnection topology is important for mapping the physical problem to the processing space
[Figure: interconnection topologies of processors (P): Ring, Mesh, N-cube]
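A minimal sketch of the two ideas on this slide: a host broadcasting one instruction to processors that each hold different data, and the neighbor relations of the three topologies. The processor numbering (row-major for the mesh, binary labels for the N-cube), the non-periodic mesh boundary, and all function names are assumptions of this sketch.

```python
def simd_step(instruction, local_data):
    """The host broadcasts one instruction; every processor applies it to
    its own data. The loop emulates what the hardware does simultaneously."""
    return [instruction(x) for x in local_data]


def ring_neighbors(p, n):
    """Neighbors of processor p in a ring of n processors."""
    return sorted({(p - 1) % n, (p + 1) % n})


def mesh_neighbors(p, rows, cols):
    """Neighbors of processor p in a rows x cols mesh (no wrap-around)."""
    r, c = divmod(p, cols)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return sorted(i * cols + j for i, j in candidates
                  if 0 <= i < rows and 0 <= j < cols)


def ncube_neighbors(p, dim):
    """Neighbors of processor p in an N-cube: labels differing in one bit."""
    return sorted(p ^ (1 << k) for k in range(dim))


local_data = [1, 2, 3, 4, 5, 6, 7, 8]  # one value per processor
squared = simd_step(lambda x: x * x, local_data)
print(squared)
print(ring_neighbors(0, 8), mesh_neighbors(5, 4, 4), ncube_neighbors(5, 3))
```

The neighbor lists show why the topology matters for mapping a problem: a processor has 2 direct neighbors in a ring, up to 4 in a mesh, and as many as the cube dimension in an N-cube.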
MIMD or distributed computing model
• Distributed computing uses the message passing model
• In this model there is no global synchronization of the parallel tasks
• Computations are data driven: a processor performs a given task only when the operands it requires become available
• The programmer must code all the data exchanges between the processors explicitly
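The message passing model can be sketched with standard-library threads and queues. This is only a stand-in: the threads play the role of processors and the queues play the role of the network, whereas a real distributed memory code would typically use a message passing library such as MPI. The names `worker` and `distributed_dot` are hypothetical.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Data driven: block until both operands have arrived, then compute."""
    x = inbox.get()
    y = inbox.get()
    outbox.put((rank, x * y))  # explicit send of the partial result


def distributed_dot(xs, ys):
    """Dot product with one worker per element pair and explicit messages."""
    n = len(xs)
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n)]
    threads = [threading.Thread(target=worker, args=(r, inboxes[r], outbox))
               for r in range(n)]
    for t in threads:
        t.start()
    # The programmer codes every data exchange explicitly.
    for r in range(n):
        inboxes[r].put(xs[r])
        inboxes[r].put(ys[r])
    # Collect one message per worker, in whatever order they finish.
    partial = dict(outbox.get() for _ in range(n))
    for t in threads:
        t.join()
    return sum(partial[r] for r in range(n))


total = distributed_dot([1, 2, 3], [4, 5, 6])
print(total)
```

There is no global synchronization here: each worker starts computing as soon as its own operands arrive, and the results can come back in any order.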
Grid Computing
• The idea of grid computing is that a user submits a calculation to a computational grid (a heterogeneous network of computers) and gets the results without having to worry about which computer it will run on
• The system monitors the availability and usage of resources and, depending on the size of the requested calculation, ships the necessary codes and data for execution to the appropriate computer
• When the results are ready they are shipped back to the user
Grid Systems
Key components of a grid computing system include:
• System management (which machines are online?)
• Resources management (what is the load of each machine?)
• Process management (are the processes alive / done?)
• Fault tolerance (if a process is interrupted, it is restarted)
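The four components can be illustrated with a toy dispatcher. Everything here is hypothetical (the machine records, the `submit` function, the rule of picking the least loaded online machine, and the use of `RuntimeError` to signal an interrupted process); it is a sketch of the idea, not of any particular grid middleware.

```python
def pick_machine(machines):
    """System and resources management: least loaded machine that is online."""
    online = [m for m in machines if m["online"]]
    if not online:
        raise RuntimeError("no machine online")
    return min(online, key=lambda m: m["load"])


def submit(job, machines, max_attempts=3):
    """Run job(machine_name) on the grid; restart it elsewhere if interrupted.

    Returns (result, log), where log records what happened on each machine.
    """
    log = []
    for _ in range(max_attempts):
        machine = pick_machine(machines)
        machine["load"] += 1
        try:
            result = job(machine["name"])              # process management
            log.append((machine["name"], "done"))
            return result, log
        except RuntimeError:
            log.append((machine["name"], "interrupted"))
            machine["online"] = False                  # fault tolerance: retry elsewhere
        finally:
            machine["load"] -= 1
    raise RuntimeError("job failed on every attempt")


def flaky_job(machine_name):
    """Assumed job that is interrupted whenever it lands on machine B."""
    if machine_name == "B":
        raise RuntimeError("process interrupted")
    return f"result computed on {machine_name}"


machines = [
    {"name": "A", "online": True, "load": 3},
    {"name": "B", "online": True, "load": 1},
    {"name": "C", "online": False, "load": 0},
]
result, log = submit(flaky_job, machines)
print(result, log)
```

The user only calls `submit` and receives the result; which machine ran the job, and the fact that it had to be restarted, is visible only in the log.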
High Performance Programming
• Coding for scalar computers
• Coding for vector computers
• Coding for shared memory computers
• Coding for distributed memory computers
Memory architecture
Modern computers have a hierarchical memory system:
• Backup system: tapes / DVD / CD-ROM / floppy
• Hard disk / virtual or swap memory
• RAM
• Cache
• Registers on the processor
[Figure: memory hierarchy: Processor, Registers, Cache, RAM, Hard Disk, Tape]