Simultaneous Multithreading - Computer Systems Architecture - Lecture Slides, Slides of Computer Architecture and Organization

Some concepts of Computer Systems Architecture are Acyclic Graph, Advanced Micro Devices, Basic Grid Architecture, Control Flow Prediction, Desktop Processor Architecture, Message-Driven Processor. Main points of this lecture are: Simultaneous Multithreading, Motivation, Parallelism, Horizontal Wasted Slot, Superscalar Processors, Multithreading, Simulation Results, Comparison, Drawbacks.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

jutt

Simultaneous Multithreading

AGENDA

  • INTRODUCTION
    • Motivation
    • Types of Parallelism
    • Vertical and Horizontal Wasted Slot
    • Superscalar Processors
  • Multithreading
  • Simultaneous Multithreading
    • The Idea
    • SMT Model
    • Issues: What to Fetch and What to Issue? Caching
  • Performance Analysis
    • Simulation Results
    • Comparison
    • Drawbacks
  • Commercial Examples
    • IBM POWER
  • Future Tendencies

INTRODUCTION: Motivation

  • Improving the memory subsystem or increasing system integration alone is not sufficient for a significant performance improvement.
  • Solution: increase parallelism in all its available forms
  • Combine the multiple-instruction-issue-per-cycle capability of modern superscalar processors
  • with the latency-hiding ability of multithreaded architectures

INTRODUCTION: Types of Parallelism

  • Bit-level
    • Wider processor datapaths (8, 16, 32, 64…)
  • Word-level (SIMD)
    • Vector processors
    • Multimedia instruction sets (Intel’s MMX and SSE, Sun’s VIS, etc.)
  • Instruction-level
    • Pipelining
    • Superscalar
    • VLIW and EPIC
  • Task and application levels
    • Explicit parallel programming
    • Multiple threads
    • Multiple applications

INTRODUCTION: Superscalar

  • Issues multiple instructions in each cycle, typically 4.
  • Several functional units of the same type, e.g. ALUs
  • Dispatcher reads instructions and decides which can run in parallel
  • Limited by instruction dependencies and long-latency operations
  • Result: horizontal and vertical waste of issue slots
  • Low Utilization even with higher-issue machines; 8 Issue with %

INTRODUCTION: Superscalar

  • Many slots in the execution core are unused.
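
The two kinds of waste can be made concrete with a small accounting sketch. It assumes the usual definitions: a cycle in which nothing issues is vertical waste (all slots lost), and the empty slots of a cycle in which at least one instruction issues are horizontal waste. The function name `count_waste` and the trace values are hypothetical, chosen only for illustration.

```python
def count_waste(issue_log, width):
    """Classify empty issue slots of a superscalar core.

    issue_log: instructions issued in each cycle (each value 0..width).
    width:     number of issue slots per cycle.
    Returns (vertical_waste, horizontal_waste, utilization).
    """
    vertical = 0
    horizontal = 0
    for issued in issue_log:
        if issued == 0:
            vertical += width             # whole cycle is empty
        else:
            horizontal += width - issued  # partly filled cycle
    total = width * len(issue_log)
    used = total - vertical - horizontal
    return vertical, horizontal, used / total


# Hypothetical trace: a 4-issue machine observed for 6 cycles.
trace = [2, 0, 1, 4, 0, 3]
vertical, horizontal, utilization = count_waste(trace, width=4)
print(vertical, horizontal, round(utilization, 3))
```

In this made-up trace, 8 of 24 slots are lost to vertical waste and 6 to horizontal waste, so fewer than half of the slots do useful work.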

MULTITHREADING

  • What does a processor need for multithreading?
  1. The processor must hold several independent states, one per thread:
    • Program Counter
    • Register File (and Flags)
    • Memory
  2. Either multiple resources in the processor or a fast way to switch between states
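
A minimal sketch of these two requirements, under stated assumptions: each thread gets its own program counter, register file and flags, and "switching" is just selecting another context index rather than saving state to memory. The class names, the 32-register file and the 4-byte instruction step are assumptions for illustration; per-thread memory state is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """Per-thread architectural state kept in hardware."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * 32)  # size is assumed
    flags: int = 0


class MultithreadedCore:
    """Toy core holding one context per hardware thread."""

    def __init__(self, n_threads):
        self.contexts = [ThreadContext() for _ in range(n_threads)]
        self.active = 0

    def switch_to(self, tid):
        # Fast switch: only the active index changes, nothing is copied.
        self.active = tid

    def step(self):
        # Pretend one 4-byte instruction of the active thread completed.
        ctx = self.contexts[self.active]
        ctx.pc += 4
        return ctx.pc


core = MultithreadedCore(n_threads=2)
core.step()          # thread 0 advances
core.switch_to(1)
core.step()
core.step()          # thread 1 advances twice
print([ctx.pc for ctx in core.contexts])
```

Because every context lives in the core at once, the state of thread 0 is untouched while thread 1 runs.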

MULTITHREADING: Coarse-Grain

MULTITHREADING: Fine-Grain Multithreading

  • Context-switch between threads on every clock cycle.
  • Occupancy of the execution core is now much higher
  • Hides both long and short latency events
  • Vertical waste is eliminated but horizontal waste is not: if a thread has few or no operations ready to execute, issue slots are still wasted.
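
The last point can be illustrated with a toy model. It is a sketch under simplifying assumptions, not a cycle-accurate simulator: `ready[t][c]` is a hypothetical count of instructions thread `t` could issue in cycle `c`, one thread owns all issue slots in a given cycle, stalled threads are skipped in round-robin order, and instructions that are not issued are not carried over.

```python
def fine_grain(ready, width):
    """Issue from exactly one thread per cycle, rotating between threads.

    ready[t][c]: instructions thread t could issue in cycle c (0 = stalled).
    width:       issue slots per cycle.
    Returns the number of instructions issued in each cycle.
    """
    n_threads = len(ready)
    n_cycles = len(ready[0])
    issued = []
    next_thread = 0
    for c in range(n_cycles):
        count = 0
        # Try threads in round-robin order, skipping stalled ones.
        for k in range(n_threads):
            t = (next_thread + k) % n_threads
            if ready[t][c] > 0:
                count = min(ready[t][c], width)
                next_thread = (t + 1) % n_threads
                break
        issued.append(count)
    return issued


# Hypothetical readiness of two threads over four cycles, 4-issue core.
ready = [[2, 0, 1, 0],
         [1, 3, 0, 2]]
per_cycle = fine_grain(ready, width=4)
print(per_cycle)
print("empty cycles:", per_cycle.count(0))
print("unused slots in busy cycles:", sum(4 - n for n in per_cycle if n > 0))
```

In this example no cycle is empty, so vertical waste is gone, yet half of the 16 slots stay unused because each cycle is limited to what a single thread can supply.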

Simultaneous Multithreading: Idea

  • Combine superscalar and multithreading:
  1. Issue multiple instructions per cycle – Superscalar
  2. Hardware state for several programs/threads – Multithreading
  • So: issue multiple instructions from multiple threads in each cycle
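
A small sketch of that idea, again under simplifying assumptions: `ready[t][c]` is a hypothetical count of instructions thread `t` could issue in cycle `c`, and slots are handed out in fixed thread order. The fixed priority is an assumption made for brevity; which thread to favor is a separate design question.

```python
def smt_issue(ready, width):
    """Fill the issue slots of each cycle from several threads at once.

    ready[t][c]: instructions thread t could issue in cycle c.
    width:       issue slots per cycle.
    Returns, per cycle, a list with the instructions issued by each thread.
    """
    n_cycles = len(ready[0])
    schedule = []
    for c in range(n_cycles):
        free = width
        per_thread = []
        for thread_ready in ready:      # fixed priority: thread 0 first
            take = min(thread_ready[c], free)
            per_thread.append(take)
            free -= take
        schedule.append(per_thread)
    return schedule


# Hypothetical readiness of two threads over four cycles, 4-issue core.
ready = [[2, 0, 1, 3],
         [3, 3, 0, 2]]
schedule = smt_issue(ready, width=4)
print(schedule)
print("slots used per cycle:", [sum(cycle) for cycle in schedule])
```

In cycles 0 and 3 the slots left over by thread 0 are taken by thread 1, which a design that issues from only one thread per cycle cannot do.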

Simultaneous Multithreading: Model

  • Resources redesigned
    • Instruction fetch unit
    • Processor pipeline
  • Instruction Scheduling
    • Does not require additional hardware
    • Register renaming (same as superscalar)
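
One way to read the scheduling point: if renaming maps each thread's architectural registers onto a shared pool of physical registers, instructions from different threads never name the same physical register, so the scheduler can treat them like instructions of a single program. The sketch below is an illustration under that assumption. The class `RenameTable`, its sizes and its method names are hypothetical, and returning old physical registers to the free list at commit is omitted.

```python
class RenameTable:
    """Toy rename stage with a physical register pool shared by all threads."""

    def __init__(self, n_threads, n_arch, n_phys):
        self.free = list(range(n_phys))
        self.map = {}
        # Give every (thread, architectural register) pair its own
        # physical register to start with.
        for t in range(n_threads):
            for r in range(n_arch):
                self.map[(t, r)] = self.free.pop(0)

    def rename(self, tid, dst, srcs):
        """Rename one instruction `dst <- op(srcs)` of thread `tid`."""
        # Sources read the current mapping of this thread only.
        phys_srcs = [self.map[(tid, s)] for s in srcs]
        if not self.free:
            raise RuntimeError("no free physical register")
        # The destination gets a fresh physical register.
        phys_dst = self.free.pop(0)
        self.map[(tid, dst)] = phys_dst
        return phys_dst, phys_srcs


table = RenameTable(n_threads=2, n_arch=4, n_phys=16)
# Both threads execute "r1 <- r1 op r2" on their own architectural registers.
print(table.rename(0, dst=1, srcs=[1, 2]))
print(table.rename(1, dst=1, srcs=[1, 2]))
```

The two renamed instructions use disjoint physical registers even though both refer to architectural r1 and r2, so no false dependence appears between the threads.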

Simultaneous Multithreading: Model

Superscalar Architecture