Solving Data Hazards - High Performance Computing - Lecture Slides, Slides of Computer Science

Some concepts of High Performance Computing are Addressing Modes, Program Execution, Basic Computer Organization, Control Hazard Solutions, Least Recently Used, Memory Hierarchy Progression. Main points of this lecture are: Solving Data Hazards, Instruction Scheduling, Load Delay Slot, Forwarding or Bypassing, Interlocks, Stalling Dependent Instructions, Instructions, Dynamic Instruction Scheduling, Eliminate Data Hazards, Execution Time.

Typology: Slides

2012/2013

Uploaded on 04/28/2013

dewaan
High Performance Computing
Lecture 24

Solving Data Hazards

1. Interlocks & stalling dependent instructions

2. Forwarding or Bypassing

3. Load delay slot

4. Instruction Scheduling

• Reorder the instructions of the program so that dependent instructions are far enough apart
• This could be done either
  • by the compiler, before the program runs: Static Instruction Scheduling
  • by the hardware, when the program is running: Dynamic Instruction Scheduling


Static Instruction Scheduling

• Reorder the instructions of the program to eliminate data hazards …
  • or in general to reduce the execution time of the program
• Reordering must be safe
  • should not change the meaning of the program
• Two instructions can be exchanged if they are independent of each other


Example: Static Instruction Scheduling

Program fragment:

LW R3, 0(R1)
ADDI R5, R3, 1
ADD R2, R2, R
LW R13, 0(R11)
ADD R12, R13, R

Scheduling:

[Figure: the program fragment annotated with stall counts (1 stall, 1 stall, 2 stalls, 0 stalls)]
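The stall counts for this fragment can be reproduced with a minimal Python sketch. It rests on several assumptions that are not in the slide text: a five-stage pipeline with forwarding, where the only stall is one cycle when an instruction reads the register loaded by the LW immediately before it; hypothetical last operands R4 and R14 for the two ADD instructions, because the extracted text truncates them; and one possible reordering (hoisting the second load), which is not necessarily the schedule shown on the slide. The names `Instr` and `load_use_stalls` are illustrative only.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "LW"
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read


def load_use_stalls(program):
    """Count stalls, assuming forwarding: one stall whenever an instruction
    reads the register loaded by the LW immediately before it."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "LW" and prev.dest in cur.srcs:
            stalls += 1
    return stalls


# R4 and R14 are hypothetical operands; the slide text truncates them.
original = [
    Instr("LW",   "R3",  ("R1",)),
    Instr("ADDI", "R5",  ("R3",)),
    Instr("ADD",  "R2",  ("R2", "R4")),
    Instr("LW",   "R13", ("R11",)),
    Instr("ADD",  "R12", ("R13", "R14")),
]

# One possible safe schedule (an assumption): hoist the second load so that
# each load is separated from the instruction that uses its result.
scheduled = [original[0], original[3], original[1], original[2], original[4]]

print(load_use_stalls(original), load_use_stalls(scheduled))
```

Under these assumptions the original order incurs one stall after each load, and the reordered version incurs none, which is consistent with the "2 stalls" and "0 stalls" annotations on the slide.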


Kinds of Data Dependence

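• True dependence: the second instruction reads a register (R1) that the first instruction writes
  ADD R1, R2, R
  SUB R4, R1, R
• Anti-dependence: the second instruction writes a register (R2) that the first instruction reads
  ADD R1, R2, R
  SUB R2, R4, R
• Output dependence: both instructions write the same register (R1)
  ADD R1, R2, R
  SUB R1, R4, R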
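The three kinds of dependence can be checked mechanically by comparing the registers each instruction reads and writes. The following is a minimal Python sketch, not a full compiler analysis: it looks at register operands only and ignores dependences through memory. The function names `classify` and `can_exchange` are hypothetical, and the final operands (R3, R5, R6) are made up for illustration because the extracted slide text truncates them.

```python
def classify(first, second):
    """Return the kinds of dependence between two instructions in program
    order. Each instruction is a pair (dest, sources)."""
    dest1, srcs1 = first
    dest2, srcs2 = second
    kinds = []
    if dest1 in srcs2:
        kinds.append("true")    # second reads what first writes
    if dest2 in srcs1:
        kinds.append("anti")    # second writes what first reads
    if dest1 == dest2:
        kinds.append("output")  # both write the same register
    return kinds


def can_exchange(first, second):
    """Two instructions can be exchanged if they are independent."""
    return not classify(first, second)


# Final source operands (R3, R5, R6) are hypothetical.
add = ("R1", ("R2", "R3"))                 # ADD R1, R2, R3
print(classify(add, ("R4", ("R1", "R5"))))  # SUB R4, R1, R5
print(classify(add, ("R2", ("R4", "R5"))))  # SUB R2, R4, R5
print(classify(add, ("R1", ("R4", "R5"))))  # SUB R1, R4, R5
print(can_exchange(add, ("R4", ("R5", "R6"))))  # SUB R4, R5, R6
```

A pair of instructions can carry more than one kind of dependence at once, which is why `classify` returns a list rather than a single label.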


Dynamic Instruction Scheduling

[Figure: pipeline stages IF, ID, EX, MEM, WB, with an Instruction Window, an Instruction Queue, and Functional Units (Floating point Adder, Floating point Multiplier, Integer ALU, Integer Multiplier, Memory Unit)]

With dynamic instruction scheduling …


Problem: Pipeline Hazards

A situation where an instruction cannot proceed through the pipeline as it should

1. Structural hazard: When 2 or more instructions in the pipeline need to use the same resource at the same time
2. Data hazard: When an instruction depends on the data result of a prior instruction that is still in the pipeline
3. Control hazard: A hazard that arises due to control transfer instructions


Recall: Execution of Branch Instruction

[Figure: branch datapath across the IF, ID, EX, MEM and WB stages; labeled components: PC, Mem, constant 4, Reg File, Sign extend, ALU, Zero?, Mem]


Control Hazards

• Observation: Since the branch is resolved only in the EX stage, there must be 2 stall cycles after every conditional branch instruction


Reducing Impact of Branch Stall

• The execution of a conditional branch instruction involves 2 activities
  1. evaluating the branch condition (determine whether it is to be taken or not-taken)
  2. computing the branch target address
• To reduce branch stall effect we could
  • evaluate the condition earlier (in ID stage)
  • compute the target address earlier (in ID stage)
• The number of stall cycles would then be reduced to 1 cycle
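The effect of cutting the branch stall from 2 cycles to 1 can be estimated with a short calculation. This is a minimal Python sketch under stated assumptions that are not in the slides: an ideal CPI of 1, no stalls other than branch stalls, and a hypothetical branch frequency of 20% of executed instructions. The function name `cpi_with_branch_stalls` is illustrative.

```python
def cpi_with_branch_stalls(branch_fraction, stall_cycles, base_cpi=1.0):
    """Cycles per instruction when every branch costs `stall_cycles`
    extra cycles and nothing else stalls (a simplifying assumption)."""
    return base_cpi + branch_fraction * stall_cycles


BRANCH_FRACTION = 0.2  # hypothetical: 20% of instructions are branches

cpi_resolved_in_ex = cpi_with_branch_stalls(BRANCH_FRACTION, 2)
cpi_resolved_in_id = cpi_with_branch_stalls(BRANCH_FRACTION, 1)

print(f"branch resolved in EX: CPI = {cpi_resolved_in_ex:.2f}")
print(f"branch resolved in ID: CPI = {cpi_resolved_in_id:.2f}")
```

With these assumed numbers, moving branch resolution from EX to ID lowers the CPI from about 1.4 to about 1.2; the actual gain depends on how frequent branches are in the program.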


Prediction and Correctness

• Prediction: guessing what is going to happen
• What if the guess is incorrect?
  • The pipelined processor hardware must be built to detect the misprediction and take appropriate corrective action


Control Hazard Solutions

1. Static Branch Prediction

Example: Static Not-Taken policy

• The hardware is built to fetch next from PC + 4
• After ID stage, if it is found that the branch condition is false (i.e., not taken), continue with the fetched instruction (from PC + 4)
• Else, squash the fetched instruction and re-fetch from the branch target address
  • squash: cancel, annul the processing of that instruction


Static Not-Taken Branch Prediction

[Figure: pipeline timing diagram for BEQZ R3, out (IF ID EX MEM WB). The next instruction is fetched while the branch is in ID. Suppose that the condition evaluates to TRUE: the fetched instruction is squashed and the instruction at the branch target address is fetched (IF ID EX MEM, etc.)]

i.e., ONE BRANCH STALL CYCLE


Control Hazard Solutions

1. Static Branch Prediction

Example: Static Not-Taken policy

• The hardware is built to fetch next from PC + 4
• After ID stage, if it is found that the branch condition is false (i.e., not taken), continue with the fetched instruction (from PC + 4): 0 stall cycles
• Else, squash the fetched instruction and re-fetch from the branch target address: 1 stall cycle
• Thus, average branch penalty < 1 cycle
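The average penalty under the static not-taken policy follows directly from the two cases: a not-taken branch costs 0 stall cycles and a taken branch costs 1. The following minimal Python sketch computes it for a sequence of branch outcomes. The function name `average_branch_penalty` and the sample outcomes are hypothetical, chosen only for illustration.

```python
def average_branch_penalty(outcomes):
    """Average stall cycles per branch under static not-taken prediction:
    0 cycles for a not-taken branch, 1 cycle for a taken branch.
    `outcomes` is a sequence of booleans (True = branch taken)."""
    if not outcomes:
        raise ValueError("need at least one branch outcome")
    taken = sum(1 for was_taken in outcomes if was_taken)
    return taken / len(outcomes)


# Hypothetical run: 3 of 5 branches are taken.
outcomes = [True, False, True, True, False]
penalty = average_branch_penalty(outcomes)
print(f"average branch penalty = {penalty:.2f} cycles")
```

The result equals the fraction of branches that are taken, so it stays below 1 cycle whenever at least one branch is not taken.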