Dynamic Instruction Scheduling: Overcoming Data Dependencies and Improving Performance, Slides of Advanced Computer Architecture

This presentation provides an overview of instruction scheduling, focusing on dynamic scheduling methods such as scoreboarding and Tomasulo's algorithm. Dynamic scheduling allows instructions to execute out of order, improving performance by working around data dependencies and tolerating unpredictable delays. The presentation also covers the advantages and disadvantages of dynamic scheduling.

Typology: Slides

2019/2020

Uploaded on 03/24/2020

saniksha-murria

A PRESENTATION ON
DYNAMIC INSTRUCTION
SCHEDULING

Submitted By

Saniksha Murria

MCA 6th Sem

Instruction Scheduling

  1. Broad Classification of Instruction Scheduling

Dynamic Instruction Scheduling

Dynamic instruction scheduling is a method in which the hardware, rather than the compiler, decides at run time which instructions to execute next. In essence, the processor executes instructions out of order. Dynamic scheduling is similar to a data flow machine, in which instructions execute not in the order in which they appear, but according to the availability of their source operands.
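The data-flow idea above can be sketched in a few lines of Python. This is a simplified timing model, not a description of any real processor: it assumes unlimited functional units, that every register is written at most once, and that producers appear before their consumers. The instruction names, registers, and latencies are hypothetical values chosen for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str      # label used only for printing
    dest: str      # register written
    srcs: tuple    # registers read
    latency: int   # assumed execution time in cycles


def dataflow_schedule(program, initially_ready):
    """Start each instruction as soon as all its source operands are available."""
    ready_at = {reg: 0 for reg in initially_ready}  # register -> cycle its value exists
    schedule = []
    for instr in program:
        start = max(ready_at[src] for src in instr.srcs)
        finish = start + instr.latency
        ready_at[instr.dest] = finish
        schedule.append((instr.name, start, finish))
    return schedule


def completion_order(schedule):
    """Instruction names sorted by the cycle in which they finish."""
    return [name for name, _, _ in sorted(schedule, key=lambda row: row[2])]


# Hypothetical program: ADDD needs the result of DIVD, SUBD is independent.
PROGRAM = [
    Instr("DIVD", "F0", ("F2", "F4"), 10),
    Instr("ADDD", "F6", ("F0", "F8"), 2),
    Instr("SUBD", "F10", ("F8", "F12"), 2),
]
READY = ["F2", "F4", "F8", "F12"]

print(completion_order(dataflow_schedule(PROGRAM, READY)))
# prints ['SUBD', 'DIVD', 'ADDD']
```

SUBD appears last in the program but finishes first, because its operands are ready immediately while ADDD has to wait for the long-running DIVD.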

Need of Dynamic Instruction Scheduling

╺ To overcome data dependencies through interlocking and forwarding.
╺ Scheduling means ordering the execution of instructions in a program so as to improve performance.
╺ To handle dependencies that cannot be detected at compile time.

Approaches for Dynamic Instruction Scheduling

╺ Scoreboarding approach
╺ Tomasulo's approach

Scoreboarding Approach

╺ Scoreboarding, a technique due to Thornton, allows instructions to execute out of order when there are sufficient resources and no data dependencies.
╺ The goal of a scoreboard is to maintain an execution rate of one instruction per clock cycle by executing each instruction as early as possible. Thus, when the next instruction to execute is stalled, other instructions can be issued and executed if they do not depend on any active or stalled instruction.
╺ The scoreboard takes full responsibility for instruction issue and execution, including all hazard detection.
╺ Every instruction goes through the scoreboard, where a record of the data dependencies is constructed; this step corresponds to instruction issue and replaces part of the ID step in the DLX pipeline.

Scoreboard Example

We'll assume a machine with 2 integer units, 2 FP add units, 1 FP multiply unit, and 1 FP divide unit. A scoreboard consists of three parts:

╺ Instruction status [table]
╺ Functional unit status [table]
╺ Register result status [table]

╺ In the tables above, we have filled in the initial status, assuming all the instructions have been fetched into the instruction status table. We issue instructions sequentially until we encounter a WAW hazard or run out of functional units. In this case, we ran out of functional units.
╺ Here the scoreboard machine took 33 cycles to complete the above instructions. The only limiting factor is the lack of a second floating-point multiply unit.
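The issue rule described here (issue in program order until a WAW hazard or a missing functional unit forces a stall) can be sketched as follows. This is only an illustration of the issue step at a single instant, assuming no instruction completes in the meantime. The unit counts match the example machine; the instruction list and the names `FREE_UNITS` and `issue_in_order` are hypothetical.

```python
from collections import Counter

# Functional units of the example machine.
FREE_UNITS = {"int": 2, "fp_add": 2, "fp_mul": 1, "fp_div": 1}


def issue_in_order(program, free_units):
    """Issue instructions in program order until one must stall.

    program: list of (name, unit_kind, dest_register).
    Returns (issued_names, stall_reason); stall_reason is None if all issued.
    """
    free = Counter(free_units)   # copy, so the caller's dict is not modified
    pending_dests = set()        # destinations of instructions still in flight
    issued = []
    for name, unit, dest in program:
        if free[unit] == 0:
            return issued, f"{name}: no free {unit} unit"      # structural hazard
        if dest in pending_dests:
            return issued, f"{name}: WAW hazard on {dest}"     # output dependence
        free[unit] -= 1
        pending_dests.add(dest)
        issued.append(name)
    return issued, None


# Hypothetical instruction stream with two multiplies.
program = [
    ("LD F6", "int", "F6"),
    ("LD F2", "int", "F2"),
    ("MULTD F0", "fp_mul", "F0"),
    ("SUBD F8", "fp_add", "F8"),
    ("MULTD F10", "fp_mul", "F10"),
]
print(issue_in_order(program, FREE_UNITS))
```

With a single FP multiply unit, the second multiply cannot issue until the first one releases the unit, which is the kind of stall the example attributes to the missing second multiplier.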

Tomasulo's algorithm

╺ Tomasulo's algorithm is another method of implementing dynamic scheduling. This scheme was invented by Robert Tomasulo and was first used in the IBM 360/91.
╺ Tomasulo's algorithm differs from scoreboarding in that it uses register renaming to eliminate output and anti-dependences, i.e. WAW and WAR hazards. Output and anti-dependences are just name dependences; there is no actual data dependence.

╺ For example, the code below

    MULTD F4,F2,F
    ADDD F2,F0,F

╺ contains an anti-dependence, since the first instruction reads from F2 and the second instruction writes to F2 (a WAR hazard). However, there is no data dependence, as is shown by the code below:

    MULTD F4,F2,F
    ADDD F8,F0,F

╺ The anti-dependence is removed without changing the semantics of the code simply by changing the F2 to an F8. However, we don't have to use F8; we can use any available register.
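A small sketch can make the distinction between true and name dependences concrete. The function below classifies the dependences between two instructions by comparing their register sets. The third operands are cut off in the listing above, so F6 and F10 are placeholders chosen purely for illustration, and `classify_dependences` is a hypothetical helper name.

```python
def classify_dependences(first, second):
    """Dependences of `second` on `first`, where `first` is earlier in program order.

    Each instruction is given as (dest_register, source_registers).
    """
    first_dest, first_srcs = first
    second_dest, second_srcs = second
    found = []
    if first_dest in second_srcs:
        found.append("RAW")  # true data dependence
    if second_dest in first_srcs:
        found.append("WAR")  # anti-dependence (name only)
    if second_dest == first_dest:
        found.append("WAW")  # output dependence (name only)
    return found


multd = ("F4", ("F2", "F6"))          # MULTD F4,F2,<placeholder F6>
addd = ("F2", ("F0", "F10"))          # ADDD  F2,F0,<placeholder F10>
addd_renamed = ("F8", ("F0", "F10"))  # same add, writing F8 instead

print(classify_dependences(multd, addd))          # prints ['WAR']
print(classify_dependences(multd, addd_renamed))  # prints []
```

Changing only the destination name removes the WAR entry, which is why it is called a name dependence rather than a data dependence.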

╺ Suppose there were some extra registers. We can use those extra registers for register renaming, which allows the hardware to detect a name dependence and eliminate it by storing the result of an instruction somewhere else. Thus, with register renaming, in the code

    MULTD F4,F2,F
    ADDD F2,F0,F

the add can execute and finish before the multiply starts, even though looking at the code suggests that this would produce an erroneous answer.
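Renaming with a pool of extra registers can be sketched as below. This is a simplified illustration, not Tomasulo's actual hardware, which performs the renaming through reservation stations. The spare register names P0 to P2, the placeholder operands F6 and F10 (the listing above is cut off), and the trailing SUBD are all assumptions added for the example.

```python
def rename(program, arch_regs, spare_regs):
    """Give every result a fresh register and redirect later reads to it.

    program: list of (op, dest, srcs) in program order.
    """
    mapping = {reg: reg for reg in arch_regs}  # architectural name -> current location
    free = list(spare_regs)
    renamed = []
    for op, dest, srcs in program:
        # Read the current mappings first, so a source sees the latest earlier write.
        new_srcs = tuple(mapping[src] for src in srcs)
        new_dest = free.pop(0)                 # fresh location for this result
        mapping[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


arch = ["F0", "F2", "F4", "F6", "F8", "F10", "F12"]
program = [
    ("MULTD", "F4", ("F2", "F6")),   # reads the old F2
    ("ADDD", "F2", ("F0", "F10")),   # writes F2: WAR hazard with MULTD
    ("SUBD", "F12", ("F2", "F4")),   # hypothetical later reader of both results
]
for instr in rename(program, arch, ["P0", "P1", "P2"]):
    print(instr)
# prints ('MULTD', 'P0', ('F2', 'F6'))
# prints ('ADDD', 'P1', ('F0', 'F10'))
# prints ('SUBD', 'P2', ('P1', 'P0'))
```

After renaming, the add writes P1 rather than F2, so it can complete before the multiply has read F2, and the later SUBD still picks up the correct values through the mapping.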