Predicated Execution and Control Dependence Analysis - Prof. Scott Mahlke, Assignments of Electrical and Electronics Engineering

These slides cover predicated execution and control dependence analysis (CDA) as used in if-conversion. Predicated execution replaces conditional branches with operations guarded by predicates, so that the operations from both sides of a branch can be placed in one straight-line block. CDA determines which branch outcomes control the execution of each basic block. The slides introduce compare-to-predicate operations (CMPPs), OR-type and AND-type predicates, and the steps involved in generating predicated code, illustrated with a running example.

Typology: Assignments

Pre 2010

Uploaded on 09/17/2009

koofers-user-wfl-2
EECS 583 – Lecture 5
If-conversion
University of Michigan
January 22, 2003

  • 1 -

Homework 1
  • Due next Monday (1/27) @ 11:59pm
  • Check out the class newsgroup for help/info
  • Notes
    » My soln is ~400 lines
    » Please delete the original code in loopdet.cpp when you turn in the HW
    » 4th testcase not ready yet, will be soon

  • 3 -

Recap: Predicated Execution Example

Source code:
    a = b + c
    if (a > 0)
        e = f + g
    else
        e = f / g
    h = i - j

Traditional branching code:
    BB1        add a, b, c
    BB1        bgt a, 0, L1
    BB3        div e, f, g
    BB3        jump L2
    BB2    L1: add e, f, g
    BB4    L2: sub h, i, j

Predicated code (p2 -> BB2, p3 -> BB3):
    BB1    add a, b, c     if T
    BB1    p2 = a > 0      if T
    BB1    p3 = a <= 0     if T
    BB3    div e, f, g     if p3
    BB2    add e, f, g     if p2
    BB4    sub h, i, j     if T

[Figure: control flow graph in which BB1 branches to BB2 and BB3, which rejoin at BB4.]

  • 4 -

Compare-to-Predicate Operations (CMPPs)
  • How do we compute predicates
    » Compare registers/literals like a branch would do
    » Efficiency, code size, nested conditionals, etc
  • 2 targets for computing taken/fall-through conditions with 1 operation

    p1, p2 = CMPP.cond.D1a.D2a (r1, r2) if p3

    p1       = first destination predicate
    p2       = second destination predicate
    cond     = compare condition (e.g. EQ, LT, GE, …)
    D1a      = action specifier for first destination
    D2a      = action specifier for second destination
    (r1, r2) = data inputs to be compared (e.g. r1 < r2)
    p3       = guarding predicate
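The CMPP format above can be modeled in a few lines of Python. This is only a sketch: the action table (UN, UC, ON, OC, AN, AC) is not spelled out in this excerpt, so the semantics below follow the HPL-PD style definitions as commonly described and should be read as an assumption. The names `cmpp_dest` and `cmpp` are hypothetical helpers, not part of any toolchain.

```python
def cmpp_dest(action, old, cond, guard):
    """Return the new value of one CMPP destination predicate.

    action: two letters, kind (U/O/A) followed by sense (N/C).
    Assumed semantics (not given in this excerpt):
      U: always written; result is 0 when the guard is false
      O: may only set the predicate to 1 (wired-OR)
      A: may only clear the predicate to 0 (wired-AND)
      N uses the compare result as is, C uses its complement.
    """
    kind, sense = action
    c = cond if sense == "N" else (not cond)
    if kind == "U":
        return guard and c
    if not guard:
        return old            # O and A types leave the destination untouched
    if kind == "O":
        return True if c else old
    if kind == "A":
        return old if c else False
    raise ValueError("unknown action specifier: " + action)


def cmpp(cond, d1, d2, p1_old=False, p2_old=False, guard=True):
    """Model p1, p2 = CMPP.cond.D1a.D2a (r1, r2) if guard."""
    return (cmpp_dest(d1, p1_old, cond, guard),
            cmpp_dest(d2, p2_old, cond, guard))


# One operation yields both the taken and the fall-through predicate.
taken, fall_through = cmpp(3 < 5, "UN", "UC")
```

With the guard true, `UN`/`UC` produce complementary predicates from a single compare, which is the "2 targets with 1 operation" point on the slide.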

  • 6 -

OR-type, AND-type Predicates

OR-type (wired-OR into p1):
    p1 = 0
    p1 = cmpp_ON (r1 < r2) if T
    p1 = cmpp_OC (r3 < r4) if T
    p1 = cmpp_ON (r5 < r6) if T

    p1 = (r1 < r2) | (!(r3 < r4)) | (r5 < r6)

AND-type (wired-AND into p1):
    p1 = 1
    p1 = cmpp_AN (r1 < r2) if T
    p1 = cmpp_AC (r3 < r4) if T
    p1 = cmpp_AN (r5 < r6) if T

    p1 = (r1 < r2) & (!(r3 < r4)) & (r5 < r6)

  • Generating predicated code for some source code requires OR-type predicates
  • Talk about these later – used for control height reduction
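A small Python check can make the wired-OR and wired-AND behavior concrete. It is a sketch under the assumption that an OR-type compare can only set its destination and an AND-type compare can only clear it; `cmpp_or`, `cmpp_and`, `accumulate` and `check_all_inputs` are hypothetical names. The three booleans stand for the results of (r1 < r2), (r3 < r4) and (r5 < r6).

```python
from itertools import permutations, product


def cmpp_or(p, cond, complemented=False, guard=True):
    """OR-type update: set p to 1 when the (possibly complemented) compare holds."""
    c = (not cond) if complemented else cond
    return True if (guard and c) else p


def cmpp_and(p, cond, complemented=False, guard=True):
    """AND-type update: clear p to 0 when the (possibly complemented) compare fails."""
    c = (not cond) if complemented else cond
    return False if (guard and not c) else p


def accumulate(update, init, terms):
    """Apply a sequence of (cond, complemented) compares to one predicate."""
    p = init
    for cond, complemented in terms:
        p = update(p, cond, complemented)
    return p


def check_all_inputs():
    """Compare the accumulated predicate with the closed-form expression."""
    for a, b, c in product([False, True], repeat=3):
        terms = [(a, False), (b, True), (c, False)]   # ON/OC/ON or AN/AC/AN
        want_or = a or (not b) or c
        want_and = a and (not b) and c
        for order in permutations(terms):
            if accumulate(cmpp_or, False, order) != want_or:
                return False
            if accumulate(cmpp_and, True, order) != want_and:
                return False
    return True
```

The check also runs the three compares in every order, since the result of a wired-OR or wired-AND does not depend on the order in which the compares write the predicate.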

  • 7 -

Use of OR-type Predicates

Source code:
    a = b + c
    if (a > 0 && b > 0)
        e = f + g
    else
        e = f / g
    h = i - j

Traditional branching code:
    BB1        add a, b, c
    BB1        ble a, 0, L1
    BB5        ble b, 0, L1
    BB2        add e, f, g
    BB2        jump L2
    BB3    L1: div e, f, g
    BB4    L2: sub h, i, j

Predicated code (p2 -> BB2, p3 -> BB3, p5 -> BB5):
    BB1    add a, b, c                  if T
    BB1    p3, p5 = cmpp.ON.UC a <= 0   if T
    BB5    p3, p2 = cmpp.ON.UC b <= 0   if p5
    BB3    div e, f, g                  if p3
    BB2    add e, f, g                  if p2
    BB4    sub h, i, j                  if T

[Figure: control flow graph with blocks BB1, BB5, BB2, BB3 and BB4.]
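To see that the predicated sequence above computes the same values as the source, the two versions can be run side by side. This Python sketch makes a few assumptions that the slide does not state: `div` is modeled as integer division with a nonzero divisor, the OR-type predicate p3 is cleared to 0 before the first cmpp, and the function names `source_version` and `predicated_version` are hypothetical.

```python
def source_version(b, c, f, g, i, j):
    """Direct transcription of the source code on the slide."""
    a = b + c
    if a > 0 and b > 0:
        e = f + g
    else:
        e = f // g            # assumption: integer division, g != 0
    h = i - j
    return a, e, h


def predicated_version(b, c, f, g, i, j):
    """Straight-line model of the predicated code; each op is guarded."""
    p2 = p3 = p5 = False      # assumption: p3 (OR-type) starts at 0
    e = None

    a = b + c                 # add a, b, c if T

    cond = a <= 0             # p3, p5 = cmpp.ON.UC a <= 0 if T
    p3 = True if cond else p3     # ON: set p3 when the compare holds
    p5 = not cond                 # UC: complement of the compare

    cond = b <= 0             # p3, p2 = cmpp.ON.UC b <= 0 if p5
    p3 = True if (p5 and cond) else p3
    p2 = p5 and (not cond)        # UC writes 0 when the guard p5 is false

    if p3:
        e = f // g            # div e, f, g if p3
    if p2:
        e = f + g             # add e, f, g if p2
    h = i - j                 # sub h, i, j if T
    return a, e, h


def versions_agree(values):
    """Check both versions on every (b, c) pair drawn from values."""
    return all(
        source_version(b, c, 7, 2, 9, 4) == predicated_version(b, c, 7, 2, 9, 4)
        for b in values for c in values
    )
```

For every input exactly one of p2 and p3 ends up true, so exactly one of the guarded `add` and `div` takes effect, matching the if/else in the source.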

  • 9 -

If-conversion
  • Algorithm for generating predicated code
    » Automate what we’ve been doing by hand
    » Handle arbitrary complex graphs
      - But, acyclic subgraph only!!
      - Need a branch to get you back to the top of a loop
    » Efficient
  • Roots are from Vector computer days
    » Vectorize a loop with an if-statement in the body
  • 4 steps
    » 1. Loop backedge coalescing
    » 2. Control dependence analysis
    » 3. Control flow substitution
    » 4. CMPP compaction
  • My version of Park & Schlansker

  • 10 -

Running Example – Initial State

    do {
        b = load(a)
        if (b < 0) {
            if ((c > 0) && (b > 13))
                b = b + 1
            else
                c = c + 1
            d = d + 1
        } else {
            e = e + 1
            if (c > 25) continue
        }
        a = a + 1
    } while (e < 34)

[Figure: control flow graph of the loop. Legible labels: BB1 (b < 0 / b >= 0), BB2 (c > 0 / c <= 0), BB3 (e++; c > 25 / c <= 25), BB4 (b > 13 / b <= 13), and blocks containing b++, c++, d++ and a++ (e < 34 / e >= 34). The remaining block numbers were lost in extraction.]

  • 12 -

Running Example – Backedge Coalescing

[Figure: control flow graph of the running example before and after backedge coalescing. Legible labels in both graphs: BB1 (b < 0 / b >= 0), BB2 (c > 0 / c <= 0), BB3 (e++; c > 25 / c <= 25), BB4 (b > 13 / b <= 13), and blocks containing b++, c++, d++ and a++ (e < 34 / e >= 34). The remaining block numbers were lost in extraction.]
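The transformation on this slide can be sketched in Python as follows. The slide's drawing did not survive extraction, so this is an assumption-laden reading: every edge that jumps back to the loop header is redirected to one new block, and that block holds the only remaining backedge. The block numbering, the edge list in `before`, and the helper `coalesce_backedges` are all hypothetical; the numbering was chosen to agree with the nine blocks used on the later slides.

```python
def coalesce_backedges(succ, header, new_block):
    """Redirect every edge into `header` to `new_block`, which then
    carries the single backedge new_block -> header.

    Assumes `succ` holds only loop-body blocks, so that every edge
    into the header is a backedge (no entry edge is present).
    """
    out = {}
    for node, targets in succ.items():
        out[node] = [new_block if t == header else t for t in targets]
    out[new_block] = [header]
    return out


# Assumed successor lists for the running example before coalescing.
# Two backedges reach BB1: the `continue` in BB3 and the loop test in BB8.
before = {
    1: [2, 3],
    2: [4, 6],
    3: [1, 8],
    4: [5, 6],
    5: [7],
    6: [7],
    7: [8],
    8: [1, "exit"],
}

after = coalesce_backedges(before, header=1, new_block=9)
backedge_sources = [n for n, targets in after.items() if 1 in targets]
```

After the call only BB9 branches back to BB1, so removing that one edge leaves an acyclic subgraph, which is what the remaining if-conversion steps require.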

  • 13 -

Step 2: Control Dependence Analysis (CD)
  • Control flow – Execution transfer from 1 BB to another via a taken branch or fallthrough path
  • Dependence – Ordering constraint between 2 operations
    » Must execute in proper order to achieve the correct result
    » O1: a = b + c
    » O2: d = a – e
    » O2 dependent on O1
  • Control dependence – One operation controls the execution of another
    » O1: blt a, 0, SKIP
    » O2: b = c + d
    » SKIP:
    » O2 control dependent on O1
  • Control dependence analysis derives these dependences

  • 15 -

Control Dependence Example

[Figure: control flow graph with T/F edge labels. Legible block labels: BB2, BB4, BB5.]

Control dependences
    BB1:
    BB2:
    BB3:
    BB4:
    BB5:
    BB6:
    BB7:

Notation
    positive BB number = fallthru direction
    negative BB number = taken direction

  • 16 -

Running Example – CDs
  • First, nuke backedge(s)
  • Second, nuke exit edges
  • Then, add pseudo entry/exit nodes
    » Entry -> nodes with no predecessors
    » Exit -> nodes with no successors

Control deps (left is taken)
    BB1:
    BB2:
    BB3:
    BB4:
    BB5:
    BB6:
    BB7:
    BB8:
    BB9:

[Figure: the running-example CFG with pseudo Entry and Exit nodes. Legible labels: BB1 (b < 0 / b >= 0), c <= 0 / c > 0, b > 13 / b <= 13, c > 25 / c <= 25, b++, c++, d++, e++, a++, e < 34.]

  • 18 -

Running Example – Post Dominators

    BB      pdom               ipdom
    BB1:    1, 9, ex
    BB2:    2, 7, 8, 9, ex
    BB3:    3, 9, ex
    BB4:    4, 7, 8, 9, ex
    BB5:    5, 7, 8, 9, ex
    BB6:    6, 7, 8, 9, ex
    BB7:    7, 8, 9, ex
    BB8:    8, 9, ex
    BB9:    9, ex              ex

[Figure: the running-example CFG with pseudo Entry and Exit nodes. Legible labels: b < 0 / b >= 0, c <= 0 / c > 0, BB4 (b > 13 / b <= 13), c > 25 / c <= 25, b++, c++, d++, e++, a++, e < 34.]
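The pdom sets in the table can be reproduced with a standard iterative post-dominator computation. The Python sketch below assumes an edge list for the running example after the backedge and exit edges are removed; the figure did not survive extraction, so `SUCC` is a reconstruction chosen to be consistent with the table, and `post_dominators` and `immediate_post_dominator` are hypothetical helper names.

```python
def post_dominators(succ, exit_node):
    """Iteratively solve pdom(n) = {n} | intersection of pdom(s) over successors s.

    Assumes every node in `succ` has at least one successor.
    """
    nodes = set(succ) | {exit_node}
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in succ:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def immediate_post_dominator(pdom, n):
    """Return the closest strict post-dominator of n, or None for the exit."""
    strict = pdom[n] - {n}
    for d in strict:
        if pdom[d] == strict:     # d's own pdom set is exactly n's strict set
            return d
    return None


# Assumed successor lists (backedge and exit edges already removed).
SUCC = {
    1: [2, 3],
    2: [4, 6],
    3: [8, 9],
    4: [5, 6],
    5: [7],
    6: [7],
    7: [8],
    8: [9],
    9: ["ex"],
}

PDOM = post_dominators(SUCC, "ex")
IPDOM = {n: immediate_post_dominator(PDOM, n) for n in SUCC}
```

Under these assumed edges the computed sets match the pdom column above, and the immediate post-dominators come out as 9, 7, 9, 7, 7, 7, 8, 9 and ex for BB1 through BB9.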

  • 19 -

Running Example – CDs Via Algorithm

[Figure: the running-example CFG with pseudo Entry and Exit nodes. Legible labels: BB1 (b < 0 / b >= 0), BB2 (c > 0 / c <= 0), BB3 (e++; c > 25 / c <= 25), BB4 (b > 13 / b <= 13), b++, c++, d++, a++, e < 34.]

Trace for the 1 -> 2 edge (aka –1):
    x = 1
    e = taken edge 1 -> 2
    y = 2
    y not in pdom(x)
    lub = 9
    x_id = -1
    t = 2    2 != 9    cd(2) += -1
    t = 7    7 != 9    cd(7) += -1
    t = 8    8 != 9    cd(8) += -1
    t = 9    9 == 9
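The trace above suggests the shape of the algorithm, which can be sketched in Python. This is a reconstruction from the trace, not the slide's own pseudocode: for each edge x -> y where y does not post-dominate x, walk from y up the immediate post-dominator chain until reaching ipdom(x), recording x (negated for a taken edge) on every block visited. The pdom sets are copied from the post-dominator table; the edge list and all taken/fall-through directions other than the 1 -> 2 edge are assumptions, and the helper names are hypothetical.

```python
# pdom sets from the post-dominator table of the running example.
PDOM_TABLE = {
    1: {1, 9, "ex"},
    2: {2, 7, 8, 9, "ex"},
    3: {3, 9, "ex"},
    4: {4, 7, 8, 9, "ex"},
    5: {5, 7, 8, 9, "ex"},
    6: {6, 7, 8, 9, "ex"},
    7: {7, 8, 9, "ex"},
    8: {8, 9, "ex"},
    9: {9, "ex"},
    "ex": {"ex"},
}

# (x, y, taken) edges. Only "1 -> 2 is taken" comes from the trace;
# the other directions are assumptions made for illustration.
EDGES = [
    (1, 2, True), (1, 3, False),
    (2, 6, True), (2, 4, False),
    (3, 9, True), (3, 8, False),
    (4, 6, True), (4, 5, False),
    (5, 7, False), (6, 7, False), (7, 8, False),
    (8, 9, False), (9, "ex", False),
]


def ipdom(pdom, n):
    """Closest strict post-dominator of n, derived from the pdom sets."""
    strict = pdom[n] - {n}
    for d in strict:
        if pdom[d] == strict:
            return d
    return None


def control_dependences(edges, pdom):
    """Signed control dependences: -x for x's taken edge, +x for fall-through."""
    cd = {n: set() for n in pdom if n != "ex"}
    for x, y, taken in edges:
        if y in pdom[x]:
            continue                  # y always runs after x: no dependence
        lub = ipdom(pdom, x)
        x_id = -x if taken else x
        t = y
        while t != lub:
            cd[t].add(x_id)
            t = ipdom(pdom, t)
    return cd


CD = control_dependences(EDGES, PDOM_TABLE)
```

For the 1 -> 2 edge this visits blocks 2, 7 and 8 and stops at 9, reproducing the trace. Under the assumed edge directions the full result is cd(3) = {1}, cd(4) = {2}, cd(5) = {4}, cd(6) = {-2, -4}, cd(8) = {-1, 3}, with BB1 and BB9 control dependent on nothing.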