Multiprocessor - Advance Computers Architectures - Lecture Slides, Slides of Computer Architecture and Organization

Islamic University of Science & Technology Computer Architecture and Organization

Main points of this lecture are: Multiprocessor, Uniprocessor Performance, Data-Intensive Applications, Leveraging Design, Flynn’s Taxonomy, Centralized Memory Multiprocessor, Distributed Memory, Centralized Memory, Symmetric Multiprocessors

Typology: Slides

2012/2013

Uploaded on 04/23/2013

atasi 🇮🇳


CIS 600 Advanced Computer Architecture

Lecture 8 – Multiprocessor Introduction


Uniprocessor Performance (SPECint)

[Figure: Performance (vs. VAX-11/780), vertical axis with ticks at 100, 1000, 10000, plotted against year, 1978 to 2006]

VAX: 25%/year, 1978 to 1986
RISC + x86: 52%/year, 1986 to 2002
RISC + x86: ??%/year, 2002 to present

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, 2006


Other Factors ⇒ Multiprocessors

Growth in data-intensive applications
- Databases, file servers, …
Growing interest in servers and server performance
Increasing desktop performance is less important
- Outside of graphics
Improved understanding of how to use multiprocessors effectively
- Especially in servers, where there is significant natural thread-level parallelism (TLP)
Advantage of leveraging design investment by replication
- Rather than a unique design

Flynn’s Taxonomy

Flynn classified by data and control streams in 1966
SIMD ⇒ Data Level Parallelism
MIMD ⇒ Thread Level Parallelism
MIMD popular because
- Flexible: N programs and 1 multithreaded program
- Cost-effective: same microprocessor (MPU) in desktops and MIMD machines

Single Instruction Single Data (SISD): uniprocessor
Single Instruction Multiple Data (SIMD): single PC; Vector, CM-2
Multiple Instruction Single Data (MISD): (????)
Multiple Instruction Multiple Data (MIMD): clusters, SMP servers

M.J. Flynn, "Very High-Speed Computers", Proc. of the IEEE, V 54, 1900-1909, Dec. 1966.

Centralized vs. Distributed Memory

[Figure: block diagrams of the two organizations, labeled Centralized Memory and Distributed Memory, each showing processors (P1 …) with caches ($), an interconnection network, and memory modules (Mem), with a Scale label]

Centralized Memory Multiprocessor

Also called symmetric multiprocessors (SMPs) because single main memory has a symmetric relationship to all processors
Large caches ⇒ single memory can satisfy memory demands of small number of processors
Can scale to a few dozen processors by using a switch and by using many memory banks
Although scaling beyond that is technically conceivable, it becomes less attractive as the number of processors sharing centralized memory increases

2 Models for Communication and Memory Architecture

1. Communication occurs by explicitly passing messages among the processors: message-passing multiprocessors
2. Communication occurs through a shared address space (via loads and stores): shared memory multiprocessors, either
- UMA (Uniform Memory Access time) for shared address, centralized memory MP
- NUMA (Non Uniform Memory Access time multiprocessor) for shared address, distributed memory MP
In past, confusion whether “sharing” means sharing physical memory (Symmetric MP) or sharing address space
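
The two communication models can be contrasted in a small sketch. This is only an illustration using Python threads: the function names `sum_by_message_passing` and `sum_by_shared_memory` are hypothetical, and because threads in one process already share memory, the message-passing half merely emulates explicit sends and receives with queues.

```python
import queue
import threading


def sum_by_message_passing(values):
    """The worker gets data and returns its result only through explicit messages."""
    inbox = queue.Queue()    # messages sent to the worker
    outbox = queue.Queue()   # messages sent back by the worker

    def worker():
        total = 0
        while True:
            msg = inbox.get()        # explicit receive
            if msg is None:          # sentinel: no more data
                break
            total += msg
        outbox.put(total)            # explicit send of the result

    t = threading.Thread(target=worker)
    t.start()
    for v in values:
        inbox.put(v)                 # explicit send
    inbox.put(None)
    t.join()
    return outbox.get()


def sum_by_shared_memory(values):
    """Workers communicate implicitly by loading and storing a shared variable."""
    shared = {"total": 0}            # lives in the address space all threads share
    lock = threading.Lock()          # keeps each read-modify-write atomic

    def worker(chunk):
        for v in chunk:
            with lock:
                shared["total"] += v     # load, add, store

    half = len(values) // 2
    threads = [threading.Thread(target=worker, args=(chunk,))
               for chunk in (values[:half], values[half:])]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"]
```

In the first function nothing is exchanged unless it is sent; in the second, the only communication is ordinary reads and writes to `shared`, which is why synchronization (here a lock) becomes the programmer's concern.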

Challenges of Parallel Processing

First challenge is the % of the program that is inherently sequential
Suppose we want an 80X speedup from 100 processors. What fraction of the original program can be sequential?
a. 10%
b. 5%
c. 1%
d. <1%
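
The question can be checked with Amdahl's law, speedup = 1 / (s + (1 - s)/N), where s is the sequential fraction and N the processor count. The sketch below assumes that model only (no communication or synchronization overhead); the function names are illustrative, not from the lecture.

```python
def amdahl_speedup(sequential_fraction, processors):
    """Speedup predicted by Amdahl's law for a given sequential fraction."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


def max_sequential_fraction(target_speedup, processors):
    """Solve target = 1 / (s + (1 - s)/N) for the sequential fraction s."""
    return (processors / target_speedup - 1.0) / (processors - 1.0)


s = max_sequential_fraction(80, 100)
print(f"sequential fraction: {s:.4%}")   # roughly 0.25%
```

Under these assumptions only about 0.25% of the original program may be sequential, which corresponds to option d. For comparison, `amdahl_speedup(0.01, 100)` gives a speedup of only about 50.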

Challenges of Parallel Processing

Application parallelism ⇒ addressed primarily via new algorithms that have better parallel performance
Long remote latency impact ⇒ reduced both by the architect and by the programmer

For example, reduce frequency of remote accesses either by
- Caching shared data (HW)
- Restructuring the data layout to make more accesses local (SW)
Today’s lecture is on HW to help latency via caches

Symmetric Shared-Memory Architectures

From multiple boards on a shared bus to multiple processors inside a single chip
Caches hold both
- Private data, which are used by a single processor
- Shared data, which are used by multiple processors
Caching shared data ⇒ reduces latency to shared data and reduces the memory bandwidth and interconnect bandwidth needed for shared data, but introduces the cache coherence problem
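
The cache coherence problem can be seen in a toy model. The sketch below is a hypothetical simplification, not a description of any real machine: two private write-through caches sit in front of one memory, and nothing invalidates or updates the other cache on a write.

```python
class Cache:
    """Private write-through cache with NO coherence protocol (toy model)."""

    def __init__(self, memory):
        self.memory = memory     # the single shared main memory (a dict)
        self.lines = {}          # this processor's private cached copies

    def read(self, addr):
        if addr not in self.lines:               # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                  # hit: return the cached copy

    def write(self, addr, value):
        self.lines[addr] = value                 # update own copy
        self.memory[addr] = value                # write through to memory


def demo():
    memory = {"X": 1}
    cache_a, cache_b = Cache(memory), Cache(memory)
    cache_a.read("X")            # processor A caches X = 1
    cache_b.read("X")            # processor B caches X = 1
    cache_a.write("X", 0)        # A stores 0; memory is updated, B's copy is not
    return cache_a.read("X"), cache_b.read("X"), memory["X"]


print(demo())   # (0, 1, 0): processor B still reads the stale value 1
```

Memory and processor A agree on the new value, yet processor B keeps hitting its old copy. A hardware coherence protocol exists to rule out exactly this outcome.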

Example

P1                                            P2
/* Assume initial value of A and flag is 0 */
A = 1;                                        while (flag == 0); /* spin idly */
flag = 1;                                     print A;

Intuition not guaranteed by coherence
Expect memory to respect order between accesses to different locations issued by a given process
- and to preserve orders among accesses to same location by different processes

[Figure: Conceptual Picture, processors P1 … Pn connected to a single memory (Mem)]
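
Why coherence alone does not guarantee the intuitive result can be shown with a small deterministic model. This is a hypothetical sketch: `value_printed_by_p2` simply replays the order in which P1's two writes become visible to P2, which is an assumption made for illustration rather than a model of any particular interconnect.

```python
def value_printed_by_p2(arrival_order):
    """Return what P2 prints when P1's writes reach P2 in the given order."""
    p2_view = {"A": 0, "flag": 0}        # initial values, as assumed in the example
    for location, value in arrival_order:
        p2_view[location] = value        # one of P1's writes becomes visible to P2
        if p2_view["flag"] != 0:         # P2's spin loop exits once it sees flag set
            return p2_view["A"]          # P2 executes "print A"
    return None                          # flag never arrived: P2 would spin forever


program_order = [("A", 1), ("flag", 1)]  # order in which P1 issues its writes
reordered = [("flag", 1), ("A", 1)]      # same writes, visible to P2 in the other order

print(value_printed_by_p2(program_order))   # 1
print(value_printed_by_p2(reordered))       # 0
```

In both runs each location, taken by itself, behaves coherently: P2 sees it go from 0 to 1. Only the relative order of writes to two different locations changes, and that ordering is what a consistency model has to pin down.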

Intuitive Memory Model

Reading an address should return the last value written to that address
- Easy in uniprocessors, except for I/O
Too vague and simplistic; 2 issues
1. Coherence defines values returned by a read
2. Consistency determines when a written value will be returned by a read

[Figure: memory hierarchy including Memory and Disk levels, with an entry for address 100]

Write Consistency

For now assume
A write does not complete (and allow the next write to occur) until all processors have seen the effect of that write
The processor does not change the order of any write with respect to any other memory access
⇒ if a processor writes location A followed by location B, any processor that sees the new value of B must also see the new value of A

Basic Schemes for Enforcing Coherence

Program on multiple processors will normally have copies of the same data in several caches
- Unlike I/O, where it's rare
Rather than trying to avoid sharing in SW, SMPs use a HW protocol to maintain coherent caches
- Migration and Replication key to performance of shared data
Migration - data can be moved to a local cache and used there in a transparent fashion
