Data Hazards - Computer Architecture - Lecture Notes | ECEN 4243, Study notes of Computer Architecture and Organization

Material Type: Notes; Professor: Johnson; Class: COMPUTER ARCHITECTURE; Subject: Electrical and Computer Engineering ; University: Oklahoma State University - Stillwater; Term: Spring 2008;

Uploaded on 11/08/2009

ECEN 4243 Computer Architecture
Data Hazards March 5, 2008 page 1 of 6
Data Hazards
Operand Forwarding. Note that the pipeline reads an instruction's source registers from the
register file before the previous instruction has written its result into the register file. Thus,
ADD R1,R2,R3
SUB R4,R3,R1
the above instruction sequence has a RAW (read-after-write) hazard: R1 is read as a source
for the SUB instruction before the result of the preceding ADD has been stored in it. The
following methods can be used to ensure that the hazard is avoided and only valid register
data are used.
1. Let the optimizing compiler keep track of what data is in each register. In this example,
R1 still contains its old value, not the sum of R2 and R3, when the SUB reads it. If the
compiler wants the sum of R2 and R3 to be in R1 at that point, then it must put another
instruction between the ADD and SUB (a NOP if nothing else).
2. Provide the new register data from other pipeline stages, rather than waiting for it to be
stored in the register file. This is called operand forwarding.
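Method 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the course material: the (op, dest, src1, src2) tuple format and the name insert_nops are hypothetical, and the sketch assumes the simple pipeline described here, where only the immediately following instruction can read a stale register.

```python
# Hypothetical instruction format: (op, dest, src1, src2).
NOP = ("NOP", None, None, None)

def insert_nops(program):
    """Insert a NOP between adjacent instructions that have a RAW dependence."""
    scheduled = []
    for instr in program:
        if scheduled:
            prev_dest = scheduled[-1][1]
            # RAW: this instruction reads the register the previous one writes.
            if prev_dest is not None and prev_dest in instr[2:]:
                scheduled.append(NOP)
        scheduled.append(instr)
    return scheduled

program = [("ADD", "R1", "R2", "R3"), ("SUB", "R4", "R3", "R1")]
for instr in insert_nops(program):
    print(instr)
```

A real compiler would first try to move an independent, useful instruction into the gap and fall back to a NOP only when none is available.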
Let us first consider operand forwarding with just the pipelined register file and ALU. The
new register contents become available in the D-register just when the source operands for
the next instruction need to be read. If the next instruction needs the modified register
contents, then the data can be provided from the D-register rather than from the A-register
or B-register. Note that there is a connection from the D-register to both ALU inputs, so
the D-register can provide either or both of the operands to the next instruction.

[Figure: register file, A and B registers, ALU, and D-register, with a D forward bus feeding a MUX on each ALU input. Labels: A-select, B-select, rd, wr, WR, A data bus, B data bus, CL, CL_D.]

The timing of the two-instruction example is shown below. Control logic must be provided
for the multiplexers so that the D-register is selected when appropriate to drive the ALU inputs.
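The effect of the forwarding multiplexers on the two-instruction example can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the datapath from the notes: it assumes ADD R1,R2,R3 means R1 = R2 + R3 and SUB R4,R3,R1 means R4 = R3 - R1, and the function name run_pair and the initial register values are made up.

```python
def run_pair(regs, forwarding):
    """Run ADD R1,R2,R3 then SUB R4,R3,R1 on a toy model of the datapath."""
    regs = dict(regs)
    # ADD: the ALU result is latched in the D-register, not yet in the register file.
    d_reg, d_dest = regs["R2"] + regs["R3"], "R1"
    # SUB reads its sources now, while R1 still holds the old value.
    src_a, src_b = "R3", "R1"
    a_reg, b_reg = regs[src_a], regs[src_b]
    # ADD's result reaches the register file only after SUB has read it.
    regs[d_dest] = d_reg
    if forwarding:
        # Each MUX picks the D-register when it holds the register being read.
        a_in = d_reg if src_a == d_dest else a_reg
        b_in = d_reg if src_b == d_dest else b_reg
    else:
        a_in, b_in = a_reg, b_reg
    regs["R4"] = a_in - b_in
    return regs

init = {"R1": 100, "R2": 5, "R3": 7, "R4": 0}
print(run_pair(init, forwarding=False)["R4"])  # uses the stale R1
print(run_pair(init, forwarding=True)["R4"])   # uses the forwarded sum
```

With forwarding disabled the SUB subtracts the stale R1; with forwarding enabled it subtracts the sum just produced by the ADD.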

Operand forwarding is more complex for a longer pipeline such as the 5-stage MIPS. In this case, RAW hazards can occur between any 5 consecutive instructions (see fig. 6.28, p. 405).

For ALU instructions, operand forwarding can provide the necessary operands by forwarding them from one of the pipeline registers (see fig. 6.29, p. 408). Note that the last two instructions really do not cause a hazard since the correct operands are already in the register file when it is read by these instructions.

A close inspection of fig. 6.29 shows that ALU operands may be forwarded from either the EX/MEM pipeline register (for the second instruction) or the MEM/WB register (for the third instruction). Multiplexers are added to the ALU inputs to select between the normal input (ID/EX pipeline register) and the two forwarding locations, as shown in fig. 6.30. The forwarding MUX on the B input to the ALU should really be combined with the MUX that selects between the register file and the immediate input, as shown in fig. 6.33, p. 412.
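The selection made by each forwarding multiplexer can be written as a small decision function. The following is a sketch under assumptions: the parameter names are modeled on conventional pipeline-register field names (a RegWrite control bit and a destination register number in EX/MEM and MEM/WB), and forward_select is a hypothetical name. Register 0 is never forwarded because it is hardwired to zero in MIPS.

```python
def forward_select(id_ex_src, ex_mem_regwrite, ex_mem_rd,
                   mem_wb_regwrite, mem_wb_rd):
    """Choose the source for one ALU input: 'ID/EX', 'EX/MEM', or 'MEM/WB'."""
    # The most recent result wins, so EX/MEM is checked before MEM/WB.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_src:
        return "EX/MEM"
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_src:
        return "MEM/WB"
    # No newer value is in flight: use the operand read from the register file.
    return "ID/EX"

# The instruction in EX reads $2; the instruction one ahead of it wrote $2.
print(forward_select(2, True, 2, False, 0))
# The instruction in EX reads $2; the instruction two ahead of it wrote $2.
print(forward_select(2, True, 9, True, 2))
```

The same function would be evaluated once for the A input and once for the B input, each with its own source register number.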

So far we have taken care of only RAW data hazards. What about WAR and WAW hazards? In the MIPS pipeline, operands are read in the ID stage near the beginning of the pipeline and results are written in the WB stage at the end of the pipeline. Since an instruction writes its result only after all previous instructions have read their operands, WAR hazards cannot occur. Also, since instruction results are written back in the same order that instructions are fetched, WAW hazards cannot occur.
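The three dependence types can be told apart by comparing the destination and source registers of an earlier and a later instruction. The sketch below is illustrative only: the (dest, sources) pair format and the name classify are assumptions. It names the dependences; whether a dependence becomes a hazard depends on the pipeline, and in the MIPS pipeline described above only RAW needs attention.

```python
def classify(first, second):
    """List the dependences from an earlier instruction to a later one.

    Each instruction is a hypothetical (dest, sources) pair.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    kinds = []
    if dest1 is not None and dest1 in srcs2:
        kinds.append("RAW")  # later instruction reads what the earlier one writes
    if dest2 is not None and dest2 in srcs1:
        kinds.append("WAR")  # later instruction writes what the earlier one reads
    if dest1 is not None and dest1 == dest2:
        kinds.append("WAW")  # both instructions write the same register
    return kinds

add = ("R1", ("R2", "R3"))  # ADD R1,R2,R3
sub = ("R4", ("R3", "R1"))  # SUB R4,R3,R1
print(classify(add, sub))
```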

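Timing of the two-instruction example:

Clock          1        2        3        4        5
ADD R1,R2,R3   I-fetch  src      ALU      dest
SUB R4,R3,R1            I-fetch  src      ALU      dest

When SUB reads its sources (src), the old (wrong) R1 is in the B-register. In the SUB ALU stage, the MUX selects the D-register in place of R1. R1 has its new value once the ADD dest stage completes.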

LW R3, 0(R1)

also produces a RAW hazard that cannot be eliminated with operand forwarding. Note that a store followed by another instruction does not cause a hazard because a store instruction does not modify the register file.

A stall (the “bubble” in fig. 6.35, p. 415) avoids the RAW hazard by delaying the instructions that follow the load in the pipeline. The delayed instructions can then use operand forwarding to obtain the loaded value. Operand forwarding avoids having to stall for an additional clock cycle, but it cannot prevent the single-cycle stall.

Controlling Stalls. Since the instruction after a load with a RAW hazard (the “and” instruction in fig. 6.34) must stall before going into the EX pipeline stage, the pipeline controller must detect the hazard while the instruction after the load is in ID and the load is in EX. The hazard can be detected by comparing the source operands of the instruction following the load (in IF/ID) with the destination operand of the load (in ID/EX). This is done by the hazard detection unit in fig. 6.36, p. 416.
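The comparison performed by the hazard detection unit can be sketched as a single Boolean function. The parameter names below are assumptions modeled on conventional pipeline-register fields (a MemRead control bit and the load's destination register number in ID/EX, and the two source register numbers in IF/ID); must_stall is a hypothetical name.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when the instruction in ID must wait for the load in EX."""
    # Stall only if EX holds a load whose destination is a source of the
    # instruction currently being decoded.
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)

# lw $2, 20($1) is in EX and "and $4, $2, $5" is in ID: the and reads $2.
print(must_stall(True, 2, 2, 5))
# Same registers, but the instruction in EX is not a load: forwarding suffices.
print(must_stall(False, 2, 2, 5))
```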

A careful understanding of the timing in fig. 6.35, p. 415 is needed to understand how the hazard detection unit works. Although this type of diagram makes it clear where each instruction is during each clock cycle, it is not clear what each stage in the pipeline is doing during each clock cycle. Rather than organizing the pipeline timing diagram by instruction, organize the diagram by pipeline stage to show what each pipeline stage must do and when it must do it. It is much easier to see the stalls (bubbles) in each pipeline stage in these diagrams.

The stall is detected during clock cycle 3.

The EX, MEM, and WB stalls can be implemented by inserting a NOP (any ALU op with R0 as the destination) into the ID/EX pipeline register at the beginning of clock cycle 4 and letting the NOP flow down the rest of the pipeline in a normal manner.

The IF and ID stalls can be implemented by repeating, during cycle 4, the instructions that were in IF and ID during cycle 3; that is, leave the PC and IF/ID register unchanged at the beginning of cycle 4.

Pipeline Stage
Time   IF               ID               EX               MEM              WB
CC 1   lw $2, 20($1)
CC 2   and $4, $2, $5   lw $2, 20($1)
CC 3   or $8, $2, $6    and $4, $2, $5   lw $2, 20($1)
CC 4   REPEAT           REPEAT           NOP              lw $2, 20($1)
CC 5   add $9, $4, $2   or $8, $2, $6    and $4, $2, $5   NOP              lw $2, 20($1)
CC 6   slt $1, $6, $7   add $9, $4, $2   or $8, $2, $6    and $4, $2, $5   NOP
CC 7                    slt $1, $6, $7   add $9, $4, $2   or $8, $2, $6    and $4, $2, $5

The snapshots clearly show that each stage is stalled for just a single clock cycle, but that different stages can be stalled at different times.
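The stage-by-stage view can be reproduced with a short simulation. This is a sketch under assumptions, not the hardware in the notes: the (text, dest, sources, is_load) tuple format and the name simulate are hypothetical, and the repeated instructions in the stall cycle are shown by name rather than as REPEAT.

```python
NOP = ("NOP", None, (), False)

def simulate(program, cycles):
    """Return one [IF, ID, EX, MEM, WB] row of instruction text per clock cycle."""
    stages = [None] * 5
    pc = 0
    rows = []
    for _ in range(cycles):
        prev = stages
        in_id, in_ex = prev[1], prev[2]
        # Load-use hazard seen in the previous cycle: load in EX, consumer in ID.
        stall = (in_ex is not None and in_ex[3]
                 and in_id is not None and in_ex[1] in in_id[2])
        if stall:
            # Hold IF and ID, insert a NOP into EX, let MEM and WB advance.
            stages = [prev[0], prev[1], NOP, prev[2], prev[3]]
        else:
            fetched = program[pc] if pc < len(program) else None
            pc += 1
            stages = [fetched, prev[0], prev[1], prev[2], prev[3]]
        rows.append([s[0] if s else "" for s in stages])
    return rows

program = [
    ("lw $2, 20($1)", 2, (1,), True),
    ("and $4, $2, $5", 4, (2, 5), False),
    ("or $8, $2, $6", 8, (2, 6), False),
    ("add $9, $4, $2", 9, (4, 2), False),
    ("slt $1, $6, $7", 1, (6, 7), False),
]
rows = simulate(program, 7)
for cc, row in enumerate(rows, start=1):
    print(f"CC {cc}: " + " | ".join(f"{text:<14}" for text in row))
```

The hazard is seen while the load is in EX (cycle 3), and the held IF and ID stages plus the NOP in EX appear in the cycle 4 row.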

The following hardware diagram shows an implementation of this scheme.

Compiler Scheduling. The hardware can eliminate all of the data hazards except for a load followed by an instruction that uses the load destination operand. The optimizing compiler can reduce the performance impact of the stall by trying to find a useful instruction to insert between the load instruction and the ALU instruction. This is not always easy to do. Consider the following.

Code for a = b + c;

lw $1, b
lw $2, c
add $3, $1, $2    (must be stalled!)
sw $3, a

Code for a = b + c; d = e - f

lw $1, b
lw $2, c
lw $4, e          (instruction order changed to separate lw and add)
add $3, $1, $2    (no stall)
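The benefit of reordering can be checked by counting load-use stalls in each sequence. The sketch below is an illustration under assumptions: the (op, dest, sources) tuple format and the name count_load_use_stalls are hypothetical, and the second sequence contains only the four instructions shown above.

```python
def count_load_use_stalls(program):
    """Count loads whose destination is read by the very next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        # One stall cycle whenever a load feeds the instruction right after it.
        if prev[0] == "lw" and prev[1] in cur[2]:
            stalls += 1
    return stalls

unscheduled = [
    ("lw", "$1", ("b",)),
    ("lw", "$2", ("c",)),
    ("add", "$3", ("$1", "$2")),
    ("sw", None, ("$3", "a")),
]
scheduled = [
    ("lw", "$1", ("b",)),
    ("lw", "$2", ("c",)),
    ("lw", "$4", ("e",)),
    ("add", "$3", ("$1", "$2")),
]
print(count_load_use_stalls(unscheduled))
print(count_load_use_stalls(scheduled))
```

Moving the independent load of e between the load of c and the add removes the only load-use pair in the first sequence.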

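[Figure: stall hardware. Labels: PC, IF/ID, ID/EX, a MUX at each of these, a NOP input, and a stall signal from the Hazard Detection Unit, which compares the IF/ID sources with the ID/EX load destination.]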