Multithreading - Intro to Computer Architecture - Lecture Slides, Slides of Computer Architecture and Organization

This lecture from the Intro to Computer Architecture course covers: Multithreading, Pipeline Hazards, Peripheral Processors, Simple Multithreaded Pipeline, Multithreading Costs, Thread Scheduling Policies, Coarse-Grained Multithreading, Multithreading Design Choices, Instruction Format

Typology: Slides

2012/2013

Uploaded on 05/06/2013

anurati 🇮🇳


CS 162 Computer Architecture

Lecture 10: Multithreading


Pipeline Hazards

LW r1, 0(r2)
LW r5, 12(r1)
ADDI r5, r5, #
SW 12(r1), r

Each instruction may depend on the one before it
- Without bypassing, need interlocks
Bypassing cannot completely eliminate interlocks or delay slots
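
To make the dependence chain in the sequence above concrete, here is a minimal Python sketch that lists read-after-write dependences between nearby instructions. The tuple encoding and the `raw_hazards` helper are illustrative assumptions, not part of the lecture. The data register of the SW is truncated in the slide, so only its base register is modeled.

```python
# Each entry: (assembly text, destination register or None, set of source registers).
# The operands cut off in the slide are left as-is; for SW only the base register r1 is known.
program = [
    ("LW r1, 0(r2)", "r1", {"r2"}),
    ("LW r5, 12(r1)", "r5", {"r1"}),
    ("ADDI r5, r5, #", "r5", {"r5"}),
    ("SW 12(r1), r", None, {"r1"}),
]


def raw_hazards(prog, window=1):
    """Return (producer, consumer, register) triples for read-after-write
    dependences where the consumer is at most `window` instructions later.
    Simplification: an intervening redefinition of the register is ignored."""
    hazards = []
    for i, (_, dest, _) in enumerate(prog):
        if dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(prog))):
            if dest in prog[j][2]:
                hazards.append((i, j, dest))
    return hazards


for producer, consumer, reg in raw_hazards(program):
    print(f"{program[consumer][0]!r} reads {reg} written by {program[producer][0]!r}")
```

Each reported pair is a place where a simple pipeline would need either an interlock or a bypass path. Widening `window` shows dependences that reach further back, such as the SW using r1 from the first LW.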

CDC 6600 Peripheral Processors

(Cray, 1965)

First multithreaded hardware
10 “virtual” I/O processors
fixed interleave on simple pipeline
pipeline has 100ns cycle time
each processor executes one instruction every 1000ns
accumulator-based instruction set to reduce processor state
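
The 1000ns figure follows from the fixed interleave: ten virtual processors share a pipeline with a 100ns cycle, so each one gets every tenth slot. A small Python sketch of that arithmetic follows; the function names are hypothetical.

```python
def issue_slot_owner(cycle, num_threads):
    """Virtual processor that owns a given pipeline cycle under fixed interleave."""
    return cycle % num_threads


def per_thread_interval_ns(num_threads, cycle_time_ns):
    """Time between successive instructions of the same virtual processor."""
    return num_threads * cycle_time_ns


# Figures from the slide: 10 virtual processors, 100ns pipeline cycle.
interval = per_thread_interval_ns(10, 100)
owners = [issue_slot_owner(c, 10) for c in range(12)]
print(interval, owners)
```

With these inputs the interval is 10 x 100ns = 1000ns, and the slot owner wraps around to processor 0 after processor 9.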

Simple Multithreaded Pipeline

Have to carry thread select down pipeline to ensure correct state bits read/written at each pipe stage

Thread Scheduling Policies

Fixed interleave (CDC 6600 PPUs, 1965)
- each of N threads executes one instruction every N cycles
- if thread not ready to go in its slot, insert pipeline bubble
Software-controlled interleave (TI ASC PPUs, 1971)
- OS allocates S pipeline slots amongst N threads
- hardware performs fixed interleave over S slots, executing whichever thread is in that slot
Hardware-controlled thread scheduling (HEP, 1982)
- hardware keeps track of which threads are ready to go
- picks next thread to execute based on hardware priority scheme
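
The three policies above can be compared with a rough Python sketch. Everything here is an illustrative assumption: readiness is given as a per-thread, per-cycle boolean table, `None` stands for a pipeline bubble, and the hardware scheduler uses round-robin priority because the slide does not specify the priority scheme.

```python
def slot_interleave(slot_table, ready, cycles):
    """Fixed or software-controlled interleave: cycle c belongs to the thread
    in slot c mod S. A thread that is not ready leaves a bubble (None)."""
    schedule = []
    for c in range(cycles):
        t = slot_table[c % len(slot_table)]
        schedule.append(t if ready[t][c] else None)
    return schedule


def hardware_scheduled(ready, cycles):
    """Hardware-controlled scheduling: each cycle, pick the first ready thread
    in round-robin order after the last one issued (an assumed priority scheme)."""
    n = len(ready)
    schedule = []
    last = n - 1  # so thread 0 has highest priority in the first cycle
    for c in range(cycles):
        chosen = None
        for k in range(1, n + 1):
            t = (last + k) % n
            if ready[t][c]:
                chosen = t
                break
        if chosen is not None:
            last = chosen
        schedule.append(chosen)
    return schedule


# Three threads over six cycles; thread 1 is stalled for the first four cycles.
ready = [
    [True] * 6,
    [False, False, False, False, True, True],
    [True] * 6,
]

fixed = slot_interleave([0, 1, 2], ready, 6)        # N slots, one per thread
software = slot_interleave([0, 0, 2, 1], ready, 6)  # S = 4 slots, thread 0 gets two
hardware = hardware_scheduled(ready, 6)
print(fixed, software, hardware, sep="\n")
```

In this example the fixed interleave wastes the slot of the stalled thread, the slot table lets the OS give thread 0 a larger share, and the hardware scheduler skips the stalled thread entirely. The sketch ignores any restriction on one thread issuing in back-to-back cycles.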

What “Grain” Multithreading?

So far assumed fine-grained multithreading
- CPU switches every cycle to a different thread
- When does this make sense?
Coarse-grained multithreading
- CPU switches every few cycles to a different thread
- When does this make sense?

Denelcor HEP

(Burton Smith, 1982)

First commercial machine to use hardware threading in main CPU

120 threads per processor
10 MHz clock rate
Up to 8 processors
precursor to Tera MTA (Multithreaded Architecture)

Tera MTA Overview

Up to 256 processors
Up to 128 active threads per processor
Processors and memory modules populate a sparse 3D torus interconnection fabric
Flat, shared main memory
- No data cache
- Sustains one main memory access per cycle per processor
50W/processor @ 260MHz

MTA Multithreading

Each processor supports 128 active hardware threads
- 128 SSWs, 1024 target registers, 4096 general-purpose registers
Every cycle, one instruction from one active thread is launched into pipeline
Instruction pipeline is 21 cycles long
At best, a single thread can issue one instruction every 21 cycles
- Clock rate is 260MHz, effective single thread issue rate is 260/21 = 12.4MHz
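
The issue-rate arithmetic above can be checked with a short Python sketch. The utilization helper goes one step beyond the slide and rests on a stated assumption: every active thread is always ready and has at most one instruction in the pipeline at a time. Both function names are hypothetical.

```python
def single_thread_issue_rate_mhz(clock_mhz, pipeline_depth):
    """Best-case issue rate of one thread that must wait for its previous
    instruction to leave the pipeline before issuing the next."""
    return clock_mhz / pipeline_depth


def pipeline_utilization(active_threads, pipeline_depth):
    """Fraction of issue slots filled, assuming all threads are always ready
    and each has at most one instruction in flight (an assumption)."""
    return min(1.0, active_threads / pipeline_depth)


rate = single_thread_issue_rate_mhz(260, 21)
print(f"single-thread issue rate: {rate:.1f} MHz")
print(f"utilization with 7 threads: {pipeline_utilization(7, 21):.2f}")
print(f"utilization with 128 threads: {pipeline_utilization(128, 21):.2f}")
```

Under that assumption, at least 21 ready threads are needed to keep the pipeline full, which is well within the 128 hardware threads per processor.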

MTA Pipeline

MIT Alewife

Modified SPARC chips
- register windows hold different thread contexts
Up to four threads per node
Thread switch on local cache miss

IBM PowerPC RS64-III

(Pulsar)

Commercial coarse-grain multithreading CPU
Based on PowerPC with quad-issue in-order five-stage pipeline
Each physical CPU supports two virtual CPUs
On L2 cache miss, pipeline is flushed and execution switches to second thread
- short pipeline minimizes flush penalty (4 cycles), small compared to memory access latency
- flush pipeline to simplify exception handling
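
To illustrate why a 4-cycle flush is cheap next to a memory access, here is a toy Python model of switch-on-miss multithreading. It is a sketch under strong assumptions, not a model of the RS64-III: each thread runs a fixed number of useful cycles between misses, and the run length (20) and miss latency (100) used in the example are made-up values. Only the 4-cycle flush penalty comes from the slide.

```python
def switch_on_miss_utilization(run_length, miss_latency, flush_penalty,
                               num_threads, total_cycles):
    """Fraction of cycles doing useful work when the pipeline switches to the
    next thread on every miss and pays `flush_penalty` cycles per switch."""
    ready_at = [0] * num_threads            # cycle at which each thread's miss resolves
    executed_since_miss = [0] * num_threads
    current = 0
    flush_left = 0
    useful = 0
    for cycle in range(total_cycles):
        if flush_left > 0:                  # pipeline is being flushed
            flush_left -= 1
            continue
        if ready_at[current] > cycle:       # current thread still waiting on memory
            continue
        useful += 1
        executed_since_miss[current] += 1
        if executed_since_miss[current] == run_length:   # thread misses now
            executed_since_miss[current] = 0
            ready_at[current] = cycle + 1 + miss_latency
            if num_threads > 1:
                current = (current + 1) % num_threads
                flush_left = flush_penalty
    return useful / total_cycles


# Hypothetical workload: 20 useful cycles between misses, 100-cycle miss latency.
one_thread = switch_on_miss_utilization(20, 100, 4, 1, 1200)
two_threads = switch_on_miss_utilization(20, 100, 4, 2, 1200)
print(f"one thread: {one_thread:.2f}, two threads: {two_threads:.2f}")
```

In this toy workload the second thread fills part of the time the first spends waiting on memory, and the flush cycles are a small fraction of each miss period.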


Superscalar Machine Efficiency

Vertical Multithreading