Lecture Notes for CS 333: Pipelining and Instruction Execution - Appendix A, Study notes of Computer Architecture and Organization

Lecture notes for CS 333 on pipelining and instruction execution. They cover key concepts such as pipelining, throughput, latency, speedup, structural hazards, data hazards, and control hazards. The notes also discuss solutions to hazards, including bypassing, pipeline scheduling, and delayed branches.

Typology: Study notes

Pre 2010

Uploaded on 03/16/2009

Lecture notes for CS 333 - Appendix A 2008-1-30
Sarita Adve
Appendix A: Pipelining: Basic and Intermediate Concepts
  • Key ideas and simple pipeline (Section A.1)
  • Hazards (Sections A.2 and A.3)
      • Structural hazards
      • Data hazards
      • Control hazards
  • Exceptions (Section A.4)
  • Multicycle operations (Section A.5)

Pipelining - Key Idea

[Figure: instrns plotted against time, with Latency and 1/Throughput marked]

Ideally,

$\text{Time}_{\text{pipeline}} = \dfrac{\text{Time}_{\text{sequential}}}{\text{Pipeline Depth}}$

$\text{Speedup} = \dfrac{\text{Time}_{\text{sequential}}}{\text{Time}_{\text{pipeline}}} = \text{Pipeline Depth}$

Practical Limit 2 - Overheads

Let Δ > 0 be the extra delay per stage (e.g., latches). Δ limits the useful depth of a pipeline.

With an n-stage pipeline, where t_i is the delay of stage i and T = Σ t_i is the unpipelined time:

$\text{Throughput} = \dfrac{1}{\Delta + \max t_i} < \dfrac{n}{T}$

$\text{Latency} = n \times (\Delta + \max t_i) \ge n\Delta + T$

$\text{Speedup} = \dfrac{\sum t_i}{\Delta + \max t_i} < n$
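The three overhead formulas can be checked numerically. The sketch below is only an illustration: the function name `pipeline_metrics` and the stage times and Δ used in the example are made-up values, not taken from the notes.

```python
def pipeline_metrics(stage_times_ns, delta_ns):
    """Return (throughput, latency, speedup) for an n-stage pipeline.

    stage_times_ns -- t_i, the delay of each stage in ns
    delta_ns       -- per-stage overhead (the Δ above), e.g. latch delay
    """
    n = len(stage_times_ns)
    total = sum(stage_times_ns)              # T = Σ t_i (unpipelined time)
    cycle = delta_ns + max(stage_times_ns)   # clock period = Δ + max t_i
    throughput = 1 / cycle                   # instructions per ns
    latency = n * cycle                      # ns for one instruction
    speedup = total / cycle                  # versus the unpipelined machine
    return throughput, latency, speedup


# Hypothetical, unbalanced 5-stage pipeline with 1 ns of overhead per stage.
stage_times = [8, 10, 10, 10, 7]
throughput, latency, speedup = pipeline_metrics(stage_times, 1)
print(f"throughput = {throughput:.4f} instr/ns")
print(f"latency    = {latency} ns")
print(f"speedup    = {speedup:.2f}")
```

With any positive overhead, the speedup stays strictly below the number of stages and the latency exceeds nΔ + T whenever the stages are unbalanced.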

Example

Let t_{1,2,…} = … ns and Δ = … ns

Throughput =
Latency =
Speedup =

Pipelining a Basic RISC ISA

MIPS ISA
  • Only loads and stores affect memory
      • Base register + immediate offset = effective address
  • ALU operations
      • Only access registers
      • Two sources – two registers, or register and immediate
  • Branches and jumps
      • Comparison between a register and zero
      • Address = PC + offset
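As a small illustration of the addressing rule for loads and stores, the sketch below computes base register + immediate offset. It assumes a 16-bit sign-extended immediate and 32-bit addresses (as in classic MIPS); the function names are hypothetical.

```python
def sign_extend_16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Effective address = base register + immediate offset (mod 2^32)."""
    return (base + sign_extend_16(imm16)) & 0xFFFFFFFF


def branch_taken_if_zero(reg_value):
    """Branch condition: comparison between a register and zero."""
    return reg_value == 0


# LW r2,100(r0): r0 is always 0 in MIPS, so the address is simply 100.
print(effective_address(0, 100))
# A negative offset: 0xFFFC encodes -4.
print(hex(effective_address(0x1000, 0xFFFC)))
```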

A Simple Five Stage RISC Pipeline

Pipeline Stages
  • IF – Instruction Fetch
  • ID – Instruction decode, register read, branch computation
  • EX – Execution and Effective Address
  • MEM – Memory Access
  • WB – Writeback

        1    2    3    4    5    6    7    8    9
i       IF   ID   EX   MEM  WB
i+1          IF   ID   EX   MEM  WB
i+2               IF   ID   EX   MEM  WB
i+3                    IF   ID   EX   MEM  WB
i+4                         IF   ID   EX   MEM  WB

Pipelining really isn't this simple
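The ideal (hazard-free) timing diagram can be generated mechanically: instruction i+k enters IF in cycle k+1 and advances one stage per cycle. The helper names below (`ideal_schedule`, `render`) are hypothetical, and the sketch ignores all hazards.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def ideal_schedule(num_instructions):
    """Row k lists the stage instruction i+k occupies in each cycle ('' if none)."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for k in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[k + s] = stage      # instruction k starts one cycle after k-1
        rows.append(row)
    return rows


def render(rows):
    """Format the schedule as an aligned text table with 1-based cycle numbers."""
    header = " " * 6 + "".join(f"{c:<5}" for c in range(1, len(rows[0]) + 1))
    lines = [header]
    for k, row in enumerate(rows):
        label = "i" if k == 0 else f"i+{k}"
        lines.append(f"{label:<6}" + "".join(f"{cell:<5}" for cell in row))
    return "\n".join(line.rstrip() for line in lines)


print(render(ideal_schedule(5)))
```

Five instructions finish in 5 + 5 - 1 = 9 cycles, and no column contains the same stage twice, which is what the ideal picture relies on.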

Hazards

  • Structural Hazards
  • Data Hazards
  • Control Hazards

Handling Hazards

Pipeline interlock logic
  • Detects hazard and takes appropriate action

Simplest solution: stall
  • Increases CPI
  • Decreases performance

Other solutions are harder, but have better performance

Structural Hazards, cont.

Pipeline Resource
  + Good performance
  - Often complex to do
  • Use when simple to do
  • E.g., write & read registers every cycle

Structural hazards are avoided if each instruction uses a resource
  • At most once
  • Always in the same pipeline stage
  • For one cycle
(so there is no cycle where two instructions use the same resource)
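The condition above can be checked with a small reservation-table sketch. It assumes one instruction is issued per cycle and that every instruction has the same resource usage; the function name and the resource labels (`imem`, `dmem`, `mem`, ...) are hypothetical.

```python
from collections import defaultdict


def find_structural_hazards(usage, num_instructions):
    """Return conflicts as a sorted list of (cycle, resource, [instructions]).

    usage -- list of (stage_offset, resource): each instruction uses
             `resource` that many cycles after its own fetch.
    Cycles are 0-based; instruction k is fetched in cycle k.
    """
    users = defaultdict(list)
    for instr in range(num_instructions):
        for offset, resource in usage:
            users[(instr + offset, resource)].append(instr)
    return sorted((cycle, res, instrs)
                  for (cycle, res), instrs in users.items()
                  if len(instrs) > 1)


# Separate instruction and data memories: each resource is used once,
# always in the same stage, so no conflicts are reported.
split = [(0, "imem"), (1, "regread"), (2, "alu"), (3, "dmem"), (4, "regwrite")]
print(find_structural_hazards(split, 5))

# One shared memory port used in both IF (offset 0) and MEM (offset 3):
# instruction k in MEM collides with the fetch of instruction k+3.
shared = [(0, "mem"), (2, "alu"), (3, "mem")]
print(find_structural_hazards(shared, 5))
```

The shared-port case models a worst case in which every instruction accesses data memory; with a mix of instructions only the loads and stores would conflict.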

Structural Hazard Example

  • Loads/stores (MEM) use same memory port as instrn fetches (IF)
  • 30% of all instructions are loads and stores
  • Assume CPI_old is 1.

        1    2    3    4    5    6    7    8    9    10
i       IF   ID   EX   MEM  WB                            <- a load
i+1          IF   ID   EX   MEM  WB
i+2               IF   ID   EX   MEM  WB
i+3                    **   IF   ID   EX   MEM  WB
i+4                              IF   ID   EX   MEM  WB

(** = stall)

How much faster could a new machine with two memory ports be?
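One way to approach the question is sketched below. It assumes that on the old machine every load or store costs exactly one stall cycle, that CPI_old already includes those stalls, and that both machines have the same clock. The value of CPI_old is cut off in these notes, so the 1.5 used here is purely a hypothetical input, and `two_port_speedup` is a made-up name.

```python
def two_port_speedup(cpi_old, mem_fraction, stall_cycles=1):
    """Speedup from removing the memory-port stalls, at equal clock rate.

    cpi_old      -- CPI of the one-port machine, stalls included
    mem_fraction -- fraction of instructions that are loads or stores
    stall_cycles -- stall cycles each load/store causes on the old machine
    """
    cpi_new = cpi_old - mem_fraction * stall_cycles
    return cpi_old / cpi_new


# Hypothetical CPI_old = 1.5, with 30% loads and stores as in the example.
print(f"{two_port_speedup(1.5, 0.3):.2f}x")
```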

Example Read-After-Write Hazards

ADD r1,,          IF   ID   EX   MEM  WB           r1 written (WB)
SUB ,r1,               IF   ID   EX   MEM  WB      r1 read (ID)
NOT OK!

LW  r1,,          IF   ID   EX   MEM  WB           r1 written (WB)
SUB ,r1,               IF   ID   EX   MEM  WB      r1 read (ID)
NOT OK!

SW r1,100(r0)     IF   ID   EX   MEM  WB           memory written (MEM)
LW r2,100(r0)          IF   ID   EX   MEM  WB      memory read (MEM)
CORRECT!
(Unless LW instrn is at address 100(r0))
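Detecting a register RAW hazard amounts to comparing each instruction's source registers with the destinations of the instructions just ahead of it. The sketch below assumes a 5-stage pipeline without bypassing in which registers are written and then read in the same cycle, so only producers one or two instructions ahead matter. The operands missing from the slide are filled with hypothetical registers, and `find_raw_hazards` is a made-up name.

```python
def find_raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) triples.

    program -- list of (text, dest, sources); dest is None if no register
               is written (e.g. a store).
    window  -- how many instructions back a producer can still be in flight.
    """
    hazards = []
    for j, (_, _, sources) in enumerate(program):
        for reg in sources:
            # Scan backwards so the nearest writer of `reg` is reported.
            for i in range(j - 1, max(-1, j - window - 1), -1):
                if program[i][1] == reg:
                    hazards.append((i, j, reg))
                    break
    return hazards


# Hypothetical operands; only r1 and 100(r0) come from the slide.
program = [
    ("ADD r1,r2,r3",  "r1", ["r2", "r3"]),
    ("SUB r4,r1,r5",  "r4", ["r1", "r5"]),
    ("SW r1,100(r0)", None, ["r1", "r0"]),
    ("LW r2,100(r0)", "r2", ["r0"]),
]
for producer, consumer, reg in find_raw_hazards(program):
    print(f"{program[consumer][0]} reads {reg} too early after {program[producer][0]}")
```

The SW/LW pair is not flagged: its dependence goes through memory, and both instructions access memory in the MEM stage in program order.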

RAW Solutions

Solutions must first detect RAW, and then ...

Stall

ADD r1,,      IF     ID     EX     MEM    WB                         r1 written
SUB ,r1,             IF     ID     stall  stall  EX     MEM    WB    r1 read

(Assumes registers written then read each cycle)

  + Low cost, simple
  - Increases CPI (plus 2 per stall in 5 stage pipeline)
  • Use for rare events
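The "plus 2 per stall" figure follows from the stage positions: the producer writes in WB, the consumer reads in ID, and the register file is written then read within a cycle. The sketch below assumes exactly that and no bypassing; the function names and the 20% hazard frequency in the example are hypothetical.

```python
ID_STAGE = 1   # stage indices: IF=0, ID=1, EX=2, MEM=3, WB=4
WB_STAGE = 4


def raw_stall_cycles(distance):
    """Stall cycles for a consumer `distance` instructions after its producer.

    The consumer's register read (ID) must not come before the producer's
    write (WB); reading in the same cycle as the write is allowed.
    """
    return max(0, (WB_STAGE - ID_STAGE) - distance)


def effective_cpi(base_cpi, hazard_fraction, stalls_per_hazard):
    """Average CPI when a fraction of instructions each incur some stalls."""
    return base_cpi + hazard_fraction * stalls_per_hazard


for d in (1, 2, 3):
    print(f"distance {d}: {raw_stall_cycles(d)} stall cycle(s)")

# Hypothetical: 20% of instructions depend on the instruction just before them.
print(f"CPI = {effective_cpi(1.0, 0.2, raw_stall_cycles(1)):.2f}")
```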

Bypass, cont.

Figure A.

Additional hardware
  • Muxes supply correct result to ALU

Additional control
  • Interlock logic must control muxes

  • This figure has been taken from Computer Architecture, A Quantitative Approach, 3rd Edition Copyright 2003 by Elsevier Inc. All rights reserved. It has been used with permission by Elsevier Inc.
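The mux control for one ALU input can be sketched as a priority choice between the newest in-flight result and the register file. This is a simplified model, not the figure's exact datapath: the pipeline-register names EX/MEM and MEM/WB follow common textbook usage, and the dictionary layout and function name are assumptions.

```python
def select_alu_operand(src_reg, regfile_value, ex_mem, mem_wb):
    """Pick the value for one ALU input and report where it came from.

    ex_mem, mem_wb -- None, or {"dest": register name, "value": result} for
                      the instruction currently held in that pipeline register.
    The younger result (EX/MEM) takes priority over the older one (MEM/WB).
    """
    if ex_mem is not None and ex_mem["dest"] == src_reg:
        return ex_mem["value"], "EX/MEM"
    if mem_wb is not None and mem_wb["dest"] == src_reg:
        return mem_wb["value"], "MEM/WB"
    return regfile_value, "REGFILE"


# SUB reads r1 right after ADD produced it: the stale register-file value (0)
# is replaced by the ALU result still sitting in EX/MEM.
value, source = select_alu_operand(
    "r1", regfile_value=0,
    ex_mem={"dest": "r1", "value": 42}, mem_wb=None)
print(value, source)
```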

RAW Solutions, cont.

Hybrid solution sometimes required:

LW  r1,,      IF     ID     EX     MEM    WB
SUB ,r1,             IF     ID     stall  EX     MEM    WB

LW:  data available after MEM; r1 written in WB
SUB: r1 read in ID; data used in EX

  • One-cycle bubble if result of load used by next instruction

Pipeline scheduling at compile-time
  • Moves instructions to eliminate stalls
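The effect of scheduling can be illustrated by counting load-use bubbles before and after reordering. The sketch assumes full bypassing, so the only remaining stall is the one-cycle bubble when the instruction right after a LW reads the loaded register. The instruction sequences and the function name are hypothetical.

```python
def load_use_stalls(program):
    """Count one-cycle bubbles: an instruction reading the register loaded
    by the LW immediately before it.

    program -- list of (opcode, dest, sources).
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev[0] == "LW" and prev[1] in cur[2]:
            stalls += 1
    return stalls


# Hypothetical code: SUB needs the loaded r1, ADD is independent of both.
original = [
    ("LW",  "r1", ["r0"]),
    ("SUB", "r4", ["r1", "r5"]),
    ("ADD", "r6", ["r7", "r8"]),
]
# Scheduled version: the independent ADD fills the slot after the load.
scheduled = [original[0], original[2], original[1]]

print(load_use_stalls(original))
print(load_use_stalls(scheduled))
```

The reordering is legal here only because ADD neither reads nor writes any register that LW or SUB touches; a real scheduler has to verify such dependences before moving an instruction.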