Pipelining in ECE 4436 & ECE 5367: Single-Cycle vs. Pipelined Multi-Cycle, Exams of Microprocessors

An in-depth analysis of pipelining in ECE 4436 and ECE 5367, comparing single-cycle machines and pipelined multi-cycle machines. Topics covered include instruction execution time, structural and control hazards, stall-on-branch performance, and branch prediction. The document also discusses the impact of pipelining on MIPS and the elimination of data hazards through data forwarding.

Typology: Exams

Pre 2010

Uploaded on 08/19/2009

koofers-user-klv

ECE 4436 & ECE 5367

Pipelining

Simple Pipelining Example

[Figure: two timing diagrams, each showing tasks A, B, C, D in task order against a time axis running from 6 PM to 2 AM.]

Pipelined Multi-cycle Machine

[Figure: pipelined datapath with the PC, instruction memory, register file, sign extend, shift left 2, adders, ALU, data memory, and muxes, separated by the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

lw $10, 20($1)
sub $11, $2, $3

[Figure: two snapshots of the pipelined datapath. Clock 1: lw $10, 20($1) is in instruction fetch. Clock 2: sub $11, $2, $3 is in instruction fetch and lw $10, 20($1) is in instruction decode.]

lw $10, 20($1)
sub $11, $2, $3

[Figure: two snapshots of the pipelined datapath. Clock 5: lw $10, 20($1) is in write back and sub $11, $2, $3 is in memory. Clock 6: sub $11, $2, $3 is in write back.]
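The clock-by-clock snapshots for the lw/sub pair can be reproduced with a short sketch. It assumes an ideal five-stage pipeline (one stage per clock, no stalls) and that lw is fetched in clock 1 and sub in clock 2; the names `stage_at` and `snapshot` are illustrative, not from the slides.

```python
# Stage names of the five-stage pipeline, in the order an instruction visits them.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

# The two-instruction program from the slides, in program order.
program = ["lw $10, 20($1)", "sub $11, $2, $3"]


def stage_at(issue_clock, clock):
    """Return the stage occupied at `clock` by an instruction fetched in
    `issue_clock` (both 1-based), or None if it is not in the pipeline.

    Assumption: ideal pipeline, one stage per clock, no stalls.
    """
    index = clock - issue_clock
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None


def snapshot(clock):
    """Map each in-flight instruction to the stage it occupies at `clock`."""
    result = {}
    for i, instr in enumerate(program):
        stage = stage_at(i + 1, clock)  # instruction i is fetched in clock i + 1
        if stage is not None:
            result[instr] = stage
    return result


for clock in (1, 2, 5, 6):
    print("Clock", clock, snapshot(clock))
```

Under these assumptions the sketch places lw in WB and sub in MEM at clock 5, and sub alone in WB at clock 6, matching the snapshots.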

Pipeline Graphical Representation

[Figure: an add instruction (destination $s0, first source $t0) drawn along a time axis as five stages: IF, ID, EX, MEM, WB.]

Single-Cycle Vs. Pipelined

[Figure: timing diagrams for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0) in program execution order. Each instruction passes through Instruction fetch, Reg, ALU, Data access, Reg. Single-cycle: each instruction takes 8 ns, and the next one starts 8 ns later. Pipelined: each stage takes 2 ns, and the next instruction starts 2 ns later.]

Pipelining Performance

• What’s the speed-up in the previous slide?

• Will you always get this speed-up? Why or why not?
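One way to approach these questions is to compute both totals directly. The sketch below uses the 8 ns single-cycle time and the 2 ns stage time from the previous slide and assumes an ideal five-stage pipeline with no stalls; the function names are illustrative.

```python
def single_cycle_time(n_instructions, cycle_ns=8):
    """Total time when every instruction takes one full 8 ns cycle."""
    return n_instructions * cycle_ns


def pipelined_time(n_instructions, stage_ns=2, n_stages=5):
    """Total time on an ideal pipeline: the first instruction needs
    n_stages clocks, and each later one completes one clock after it."""
    return (n_stages + n_instructions - 1) * stage_ns


def speedup(n_instructions):
    """Ratio of single-cycle time to pipelined time."""
    return single_cycle_time(n_instructions) / pipelined_time(n_instructions)


# Three lw instructions, as on the previous slide.
print(single_cycle_time(3), pipelined_time(3))  # 24 14
print(round(speedup(3), 2))                     # 1.71

# With many instructions the fill time is amortized and the ratio
# approaches cycle_ns / stage_ns = 4.
print(round(speedup(1000), 2))                  # 3.98
```

For three instructions the ratio is well below 4 because the pipeline fill time dominates. Even for long programs the ratio stays at 8 ns / 2 ns = 4 rather than 5, since the clock must fit the slowest stage, and this sketch ignores the stalls that hazards introduce.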

MIPS:…..

• Memory accesses only through load and store (load/store architecture)

– can always do address(execute)-memory access

– as opposed to address(execute)-memory-execute-address(execute)-memory for mem-to-mem operation

– ex: add $t1, 16($t4), 8($t4)

• Operands must be aligned in memory

– don’t need to worry about two mem. accesses to get data.
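The alignment point can be illustrated with a small sketch. It assumes a memory organized in 4-byte words, so that one access returns one whole word; `WORD_BYTES`, `is_aligned`, and `words_touched` are illustrative names, not part of the slides.

```python
WORD_BYTES = 4  # assumed memory word size in bytes


def is_aligned(address, size=WORD_BYTES):
    """True if `address` is a multiple of the operand size."""
    return address % size == 0


def words_touched(address, size=WORD_BYTES):
    """Number of memory words a `size`-byte operand starting at
    `address` overlaps, i.e. how many accesses it would need."""
    first = address // WORD_BYTES
    last = (address + size - 1) // WORD_BYTES
    return last - first + 1


print(is_aligned(100), words_touched(100))  # True 1
print(is_aligned(102), words_touched(102))  # False 2
```

An aligned word always falls inside a single memory word, so the memory stage needs exactly one access; the unaligned case would need two, which is what the alignment restriction rules out.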

Pipeline Hazards

(Even the best designs have problems!)

Structural Hazards

  • Two instructions need the same hardware at the same time (such as reg. file, mem., or ALU)

Control Hazards

  • Because of branch tests, we don’t know where to fetch the next instruction. Results in stalls.

Data Hazards

  • Some instructions need data from the previous instruction. Unfortunately, the previous instruction has not completed.
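The data-hazard condition described above can be checked mechanically. The following is a minimal sketch, assuming each instruction is described by a destination register and a list of source registers; the two example instructions are hypothetical and do not come from the slides.

```python
def has_data_hazard(producer, consumer):
    """True if `consumer` reads the register that `producer` writes.

    Each instruction is a dict with a "dest" register (or None) and a
    list of "srcs". Register $0 is excluded because it always reads as
    zero on MIPS, so a write to it creates no dependence.
    """
    dest = producer["dest"]
    return dest is not None and dest != "$0" and dest in consumer["srcs"]


# Hypothetical pair: sub needs $s0, which add has not yet written back.
add_instr = {"text": "add $s0, $t0, $t1", "dest": "$s0", "srcs": ["$t0", "$t1"]}
sub_instr = {"text": "sub $t2, $s0, $t3", "dest": "$t2", "srcs": ["$s0", "$t3"]}

print(has_data_hazard(add_instr, sub_instr))  # True
print(has_data_hazard(sub_instr, add_instr))  # False
```

This only compares one instruction with the one immediately after it, which is the case the slide describes; a real hazard unit would also look at instructions further apart in the pipeline.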

Control Hazard

[Figure: add $4,$5,$6; beq $1,$2,40; lw $2,300($0) occupying the pipeline stages IF, ID/RF, EX, MEM, WB.]

Control Hazard

[Figure: pipeline timing diagram in program execution order for add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0), each passing through Instruction fetch, Reg, ALU, Data access, Reg, on a time axis from 2 to 16 ns. Instruction starts are spaced 2 ns apart except for one 4 ns gap, the stall caused by the branch.]

Note: This assumes we add a bunch of extra hardware so that we can resolve the branch test at the end of the second stage. (Which is not the case in our machine.)
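The cost of stalling on every branch can be estimated with a one-line formula. This is a sketch under stated assumptions: an ideal CPI of 1, one stall cycle per branch (the 4 ns gap versus the usual 2 ns, given the early branch resolution in the note), and a branch frequency of 20%, which is an illustrative value rather than a figure from the slides.

```python
def cpi_with_branch_stalls(branch_fraction, stall_cycles=1, base_cpi=1.0):
    """Average cycles per instruction when every branch stalls the
    pipeline for `stall_cycles` clocks."""
    return base_cpi + branch_fraction * stall_cycles


# Assumed workload: 20% of instructions are branches.
one_stall = cpi_with_branch_stalls(0.20)
print(round(one_stall, 2))  # 1.2

# If the branch is resolved later in the pipeline, each branch costs more
# stall cycles (3 is an assumed value for illustration).
late_resolve = cpi_with_branch_stalls(0.20, stall_cycles=3)
print(round(late_resolve, 2))  # 1.6
```

The `stall_cycles` parameter is where the note matters: without the extra hardware, the branch outcome is known later, each branch stalls longer, and the CPI penalty grows in proportion.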

Branch Prediction

  • predict taken
    • (assuming modified hardware)

[Figure: add $4,$5,$6; beq $1,$2,40; lw $2,300($0) occupying the pipeline stages IF, ID/RF, EX, MEM, WB.]

Branch Prediction

  • predict not-taken
    • (assuming modified hardware)

[Figure: add $4,$5,$6; beq $1,$2,40; lw $2,300($0) occupying the pipeline stages IF, ID/RF, EX, MEM, WB.]
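The two static schemes can be compared by counting misprediction penalties over a sequence of branch outcomes. The sketch assumes that a correct prediction costs no extra cycles (which, for predict taken, relies on the modified hardware making the target available early) and that each misprediction costs one cycle; the outcome sequence is a hypothetical loop branch, not data from the slides.

```python
def penalty_cycles(outcomes, predict_taken, mispredict_penalty=1):
    """Total penalty cycles for a static prediction scheme.

    `outcomes` is a sequence of booleans, True when the branch was
    actually taken. Every outcome that differs from the fixed
    prediction costs `mispredict_penalty` cycles.
    """
    return sum(mispredict_penalty for taken in outcomes if taken != predict_taken)


# Hypothetical loop-closing branch: taken three times, then falls through.
outcomes = [True, True, True, False]

print(penalty_cycles(outcomes, predict_taken=True))   # 1
print(penalty_cycles(outcomes, predict_taken=False))  # 3
```

Which scheme wins depends entirely on the branch behavior: for this backward loop branch predict taken is better, while a branch that rarely fires would favor predict not-taken.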