Efficient Compiler Support: Predicated Execution & Control Analysis - Prof. Scott Mahlke, Papers of Electrical and Electronics Engineering

Lecture slides on effective compiler support for predicated execution. They cover how predicates are computed, introduce the cmpp action specifiers, and distinguish OR-type from AND-type predicates. They also cover the use of OR-type predicates in generating predicated code and the role of control dependence analysis in if-conversion.

EECS 583 – Class 4
If-conversion
University of Michigan
September 17, 2007

Reading Material

• Today's class
  » "The Program Dependence Graph and Its Use in Optimization", J. Ferrante, K. Ottenstein, and J. Warren, ACM TOPLAS, 1987
      - This is a long paper; the part we care about is the control dependence stuff. The PDG is interesting and you should skim it over, but we will not talk about it now
  » "On Predicated Execution", Park and Schlansker, HPL Technical Report, 1991.
• Next class
  » "Effective Compiler Support for Predicated Execution using the Hyperblock", S. Mahlke et al., MICRO-25, 1992.
  » "Profile Guided Code Positioning", K. Pettis and R. Hansen, Proc. PLDI-90, 1990.

CMPP Action Specifiers

Guarding    Compare
predicate   result     UN  UC    ON  OC    AN  AC
    0          0        0   0     -   -     -   -
    0          1        0   0     -   -     -   -
    1          0        0   1     -   1     0   -
    1          1        1   0     1   -     -   0

(- = destination predicate is left unchanged)

UN/UC = Unconditional normal/complement
  » This is what we used in the earlier examples
  » guard = 0, both outputs are 0
  » guard = 1, UN = Compare result, UC = opposite

ON/OC = OR-type normal/complement
AN/AC = AND-type normal/complement
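The table can also be read as a small function. The sketch below is a minimal Python model of the six action specifiers, assuming one destination predicate per call. The function name `cmpp` and its parameter names are illustrative only; they are not taken from any real ISA definition or toolchain.

```python
def cmpp(action: str, guard: bool, compare: bool, old: bool) -> bool:
    """Return the new value of one destination predicate.

    action  : "UN", "UC", "ON", "OC", "AN" or "AC"
    guard   : value of the guarding predicate
    compare : result of the comparison
    old     : value the destination predicate held before the cmpp
    """
    if action == "UN":   # unconditional: always writes the destination
        return guard and compare
    if action == "UC":
        return guard and not compare
    if action == "ON":   # OR-type: may only set the destination to 1
        return True if (guard and compare) else old
    if action == "OC":
        return True if (guard and not compare) else old
    if action == "AN":   # AND-type: may only clear the destination to 0
        return False if (guard and not compare) else old
    if action == "AC":
        return False if (guard and compare) else old
    raise ValueError(f"unknown action specifier: {action}")
```

Returning `old` models the "-" entries in the table: the OR-type and AND-type forms leave the destination untouched unless they fire.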

OR-type, AND-type Predicates

p1 = 0
p1 = cmpp_ON (r1 < r2) if T
p1 = cmpp_OC (r3 < r4) if T
p1 = cmpp_ON (r5 < r6) if T

  p1 = (r1 < r2) | (!(r3 < r4)) | (r5 < r6)
  Wired-OR into p1

p1 = 1
p1 = cmpp_AN (r1 < r2) if T
p1 = cmpp_AC (r3 < r4) if T
p1 = cmpp_AN (r5 < r6) if T

  p1 = (r1 < r2) & (!(r3 < r4)) & (r5 < r6)
  Wired-AND into p1
  Talk about these later: used for control height reduction

Generating predicated code for some source code requires OR-type predicates
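The two equivalences above can be checked exhaustively. The following is a minimal sketch, assuming the guard is always true (the "if T" on each cmpp) and modelling each compare result as a Python bool. The helper names `or_type`, `and_type`, `wired_or`, `wired_and` and `check_all` are hypothetical.

```python
from itertools import product

def or_type(old, compare, complement=False):
    """cmpp_ON (complement=False) or cmpp_OC (complement=True), guard = T."""
    fires = (compare != complement)
    return True if fires else old

def and_type(old, compare, complement=False):
    """cmpp_AN (complement=False) or cmpp_AC (complement=True), guard = T."""
    clears = (compare == complement)
    return False if clears else old

def wired_or(c1, c2, c3):
    p1 = False                              # p1 = 0
    p1 = or_type(p1, c1)                    # p1 = cmpp_ON (r1 < r2) if T
    p1 = or_type(p1, c2, complement=True)   # p1 = cmpp_OC (r3 < r4) if T
    p1 = or_type(p1, c3)                    # p1 = cmpp_ON (r5 < r6) if T
    return p1

def wired_and(c1, c2, c3):
    p1 = True                               # p1 = 1
    p1 = and_type(p1, c1)                   # p1 = cmpp_AN (r1 < r2) if T
    p1 = and_type(p1, c2, complement=True)  # p1 = cmpp_AC (r3 < r4) if T
    p1 = and_type(p1, c3)                   # p1 = cmpp_AN (r5 < r6) if T
    return p1

def check_all():
    """Compare both sequences with the closed-form expressions."""
    for c1, c2, c3 in product([False, True], repeat=3):
        if wired_or(c1, c2, c3) != (c1 or (not c2) or c3):
            return False
        if wired_and(c1, c2, c3) != (c1 and (not c2) and c3):
            return False
    return True
```

Because each OR-type cmpp can only set p1 and each AND-type cmpp can only clear it, the three compares in either sequence can be issued in any order.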

Class Problem

if (a > 0) {
    if (b > 0)
        r = t + s
    else
        u = v + 1
    y = x + 1
}

a. Draw the CFG
b. Predicate the code removing all branches
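One possible answer to part (b), offered as a sketch rather than the official solution, uses three predicates: p1 for the outer test, and p2/p3 for the two arms of the inner test, both guarded by p1. The Python below simulates that predicated sequence next to the original branching code. The function names and the default initial values of r, u and y are assumptions made for illustration.

```python
def branching(a, b, t, s, v, x, r=0, u=0, y=0):
    """The class problem as written, with branches."""
    if a > 0:
        if b > 0:
            r = t + s
        else:
            u = v + 1
        y = x + 1
    return r, u, y

def predicated(a, b, t, s, v, x, r=0, u=0, y=0):
    """Branch-free version: every operation carries a guarding predicate."""
    p1 = a > 0                  # p1 = cmpp_UN (a > 0) if T
    p2 = p1 and (b > 0)         # p2 = cmpp_UN (b > 0) if p1
    p3 = p1 and not (b > 0)     # p3 = cmpp_UC (b > 0) if p1
    r = t + s if p2 else r      # r = t + s if p2
    u = v + 1 if p3 else u      # u = v + 1 if p3
    y = x + 1 if p1 else y      # y = x + 1 if p1
    return r, u, y

def agree():
    """Check both versions on every sign combination of a and b."""
    for a in (-1, 0, 1):
        for b in (-1, 0, 1):
            if branching(a, b, 2, 3, 4, 5) != predicated(a, b, 2, 3, 4, 5):
                return False
    return True
```

Only unconditional (UN/UC) compares are needed here, because each block in this example is reached along a single control flow edge.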

If-conversion

• Algorithm for generating predicated code
  » Automate what we've been doing by hand
  » Handle arbitrarily complex graphs
      - But, acyclic subgraph only!!
      - Need a branch to get you back to the top of a loop
  » Efficient
• Roots are from Vector computer days
  » Vectorize a loop with an if-statement in the body
• 4 steps
  » 1. Loop backedge coalescing
  » 2. Control dependence analysis
  » 3. Control flow substitution
  » 4. CMPP compaction
• My version of Park & Schlansker

Step 1: Backedge Coalescing

• Recall: loop backedge is branch from inside the loop back to the loop header
• This step only applicable for a loop body
  » If not a loop body → skip this step
• Process
  » Create a new basic block
      - New BB contains an unconditional branch to the loop header
  » Adjust all other backedges to go to new BB rather than header
• Why do this?
  » Heuristic step: not essential for correctness
      - If-conversion cannot remove backedges (only forward edges)
      - But this allows the control logic to figure out which backedge you take to be eliminated
  » Generally this is a good thing to do
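The process can be sketched on a CFG stored as a successor map. This is a minimal illustration, assuming the map contains only blocks of the loop body, so that any edge into the header is a backedge. The function name `coalesce_backedges` and the block name `BB_latch` are hypothetical.

```python
def coalesce_backedges(succs, header, new_block="BB_latch"):
    """Return a copy of the loop-body CFG with all backedges merged.

    succs  : dict mapping each loop-body block to its list of successors
    header : name of the loop header block
    Every edge into `header` is redirected to `new_block`, which then
    holds the only (unconditional) branch back to the header.
    """
    out = {}
    found_backedge = False
    for block, targets in succs.items():
        redirected = []
        for target in targets:
            if target == header:
                found_backedge = True
                redirected.append(new_block)
            else:
                redirected.append(target)
        out[block] = redirected
    if found_backedge:
        out[new_block] = [header]   # unconditional branch to the header
    return out

# Small made-up loop with two backedges (BB2 -> BB1 and BB3 -> BB1).
loop = {
    "BB1": ["BB2", "BB3"],
    "BB2": ["BB1", "EXIT"],
    "BB3": ["BB1", "EXIT"],
}
coalesced = coalesce_backedges(loop, "BB1")
```

After the call, BB2 and BB3 both branch forward to `BB_latch`, and the single remaining backedge is `BB_latch` to `BB1`. The branches in BB2 and BB3 are now forward edges, which is what lets if-conversion remove them.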

Running Example – Backedge Coalescing

[Figure: CFG of the running example, drawn twice: before and after backedge coalescing. Branch conditions on the edges: b < 0 / b >= 0, c <= 0 / c > 0, b <= 13 / b > 13, c <= 25 / c > 25, e < 34 / e >= 34. Block bodies: a++, b++, c++, d++, e++.]

Control Dependences

• Recall
  » Post dominator: BBX is post dominated by BBY if every path from BBX to EXIT contains BBY
  » Immediate post dominator: first breadth-first successor of a block that is a post dominator
• Control dependence: BBY is control dependent on BBX iff
  » 1. There exists a directed path P from BBX to BBY with any BBZ in P (excluding BBX and BBY) post dominated by BBY
  » 2. BBX is not post dominated by BBY
• In English,
  » A BB is control dependent on the closest BB(s) that determine(s) its execution
  » It's actually not a BB, it's a control flow edge coming out of a BB
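The post dominator definition can be turned into a simple fixed-point computation: a block is post dominated by itself plus whatever post dominates all of its successors. The sketch below is one standard way to compute this, not necessarily the method used in class. It assumes an acyclic or cyclic CFG in which every block other than the exit has at least one successor. The function name and the diamond-shaped CFG are made up for illustration.

```python
def post_dominators(succs, exit_block):
    """Compute pdom(b) for every block b by iterating to a fixed point.

    succs      : dict mapping each block to its list of successors
    exit_block : the unique exit block (it has no successors)
    """
    blocks = set(succs)
    # Start from "everything post dominates everything" and shrink.
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b in blocks - {exit_block}:
            # Blocks that post dominate every successor of b ...
            common = set.intersection(*(pdom[s] for s in succs[b]))
            # ... plus b itself.
            new = {b} | common
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom

# Diamond: A branches to B or C, and both rejoin at D (the exit).
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
pdom = post_dominators(diamond, "D")
```

In the diamond, B does not post dominate A, and the edge A to B leads straight to B, so B is control dependent on A (on that edge). D post dominates A, so D is not control dependent on A.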

Control Dependence Example

[Figure: example CFG with basic blocks BB1 through BB7; branch edges are labeled T and F.]

Control dependences
  BB1:
  BB2:
  BB3:
  BB4:
  BB5:
  BB6:
  BB7:

Notation
  positive BB number = fallthru direction
  negative BB number = taken direction

Algorithm for Control Dependence Analysis

for each basic block x in region
    for each outgoing control flow edge e of x
        y = destination basic block of e
        if (y not in pdom(x)) then
            lub = ipdom(x)
            if (e corresponds to a taken branch) then
                x_id = -x.id
            else
                x_id = x.id
            endif
            t = y
            while (t != lub) do
                cd(t) += x_id;
                t = ipdom(t)
            endwhile
        endif
    endfor
endfor

Notes
  » Compute cd(x) which contains those BBs which x is control dependent on
  » Iterate on per edge basis, adding edge to each cd set it is a member of
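The pseudocode translates almost line for line into Python. In the sketch below, the edge list for the running example is an assumption: it was reconstructed from the post dominator and control dependence tables on the slides, since the CFG figure itself is not reproduced. The pdom and ipdom values are likewise hard-coded (without the exit node) rather than computed.

```python
# Running-example CFG (reconstructed, see the note above).
# Each block maps to a list of (successor, is_taken_branch) pairs.
EDGES = {
    1: [(2, True), (3, False)],
    2: [(4, True), (6, False)],
    3: [(8, True), (9, False)],
    4: [(5, True), (6, False)],
    5: [(7, False)],   # direction is irrelevant: 7 post dominates 5
    6: [(7, False)],
    7: [(8, False)],
    8: [(9, False)],
    9: [],
}
PDOM = {
    1: {1, 9}, 2: {2, 7, 8, 9}, 3: {3, 9}, 4: {4, 7, 8, 9},
    5: {5, 7, 8, 9}, 6: {6, 7, 8, 9}, 7: {7, 8, 9}, 8: {8, 9}, 9: {9},
}
IPDOM = {1: 9, 2: 7, 3: 9, 4: 7, 5: 7, 6: 7, 7: 8, 8: 9, 9: None}

def control_dependences(edges, pdom, ipdom):
    """Return cd[t]: the signed block ids that block t is control dependent on.

    A negative id means the taken edge of that block, a positive id
    means its fallthru edge.
    """
    cd = {block: [] for block in edges}
    for x in edges:
        for y, taken in edges[x]:
            if y in pdom[x]:
                continue              # this edge decides nothing
            lub = ipdom[x]
            x_id = -x if taken else x
            t = y
            # Walk up the post dominator tree from y until reaching ipdom(x).
            while t != lub:
                cd[t].append(x_id)
                t = ipdom[t]
    return cd

cd = control_dependences(EDGES, PDOM, IPDOM)
```

With these inputs, `cd[8]` comes out as `[-1, -3]` and `cd[6]` as `[2, 4]`.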

Running Example – Post Dominators

[Figure: CFG of the running example with Entry and Exit nodes.]

        pdom               ipdom
BB1:    1, 9, ex           9
BB2:    2, 7, 8, 9, ex     7
BB3:    3, 9, ex           9
BB4:    4, 7, 8, 9, ex     7
BB5:    5, 7, 8, 9, ex     7
BB6:    6, 7, 8, 9, ex     7
BB7:    7, 8, 9, ex        8
BB8:    8, 9, ex           9
BB9:    9, ex              ex
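The ipdom of a block can be read off its pdom set. The strict post dominators of a block form a chain, so the nearest one is the one with the largest pdom set of its own. The sketch below applies that observation to the pdom sets listed for the running example. The function name is hypothetical, and the approach is one convenient way to derive ipdom, not necessarily how it was done in class.

```python
# pdom sets from the running example ("ex" is the exit node).
PDOM_TABLE = {
    1: {1, 9, "ex"},
    2: {2, 7, 8, 9, "ex"},
    3: {3, 9, "ex"},
    4: {4, 7, 8, 9, "ex"},
    5: {5, 7, 8, 9, "ex"},
    6: {6, 7, 8, 9, "ex"},
    7: {7, 8, 9, "ex"},
    8: {8, 9, "ex"},
    9: {9, "ex"},
    "ex": {"ex"},
}

def immediate_post_dominators(pdom):
    """Derive ipdom(x) for every block x from the full pdom sets."""
    ipdom = {}
    for x, doms in pdom.items():
        strict = doms - {x}          # post dominators other than x itself
        if not strict:
            ipdom[x] = None          # the exit has no post dominator
            continue
        # The closest strict post dominator is post dominated by all the
        # others, so it has the largest pdom set among them.
        ipdom[x] = max(strict, key=lambda d: len(pdom[d]))
    return ipdom

ipdom_table = immediate_post_dominators(PDOM_TABLE)
```

For example, the strict post dominators of BB2 are 7, 8, 9 and ex, and BB7 has the largest pdom set among them, so `ipdom_table[2]` is 7.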

Running Example – CDs Via Algorithm (2)

[Figure: CFG of the running example with Entry and Exit nodes.]

3 → 8 edge (aka -3)
  x = 3
  e = taken edge 3 → 8
  y = 8
  y not in pdom(x)
  lub = 9
  x_id = -3
  t = 8
  8 != 9
  cd(8) += -3
  t = 9
  9 == 9

Class Problem: 1 → 3 edge (aka 1)

Running Example – CDs Via Algorithm (3)

[Figure: CFG of the running example with Entry and Exit nodes.]

Control deps (left is taken)
  BB1: none
  BB2: -1
  BB3: 1
  BB4: -2
  BB5: -4
  BB6: 2, 4
  BB7: -1
  BB8: -1, -3
  BB9: none