Advanced Compilers - Hyperblocks Control Height Reduction | EECS 583, Study notes of Electrical and Electronics Engineering

Material Type: Notes; Professor: Mahlke; Class: Advanced Compilers; Subject: Electrical Engineering And Computer Science; University: University of Michigan - Ann Arbor; Term: Winter 2004;


Uploaded on 09/02/2009

EECS 583 – Lecture 5
Hyperblocks
Control Height Reduction
University of Michigan
January 26, 2004

Recap: Backedge Coalescing

[Figure: the running-example loop CFG before and after backedge coalescing. Edge conditions: b < 0 / b >= 0, c <= 0 / c > 0, b <= 13 / b > 13, c <= 25 / c > 25, e < 34 / e >= 34. Block operations: a++, b++, c++, d++, e++.]


Recap: CDs Via Algorithm

Control deps (left is taken)
  BB1: none
  BB2: -1
  BB3: 1
  BB4: -2
  BB5: -4
  BB6: 2, 4
  BB7: -1
  BB8: -1, -3
  BB9: none

[Figure: CFG of the running example from Entry to Exit, blocks BB1 through BB9, with edge conditions b < 0 / b >= 0, c <= 0 / c > 0, b <= 13 / b > 13, c <= 25 / c > 25, e < 34 and block operations a++, b++, c++, d++, e++.]


Recap: CMPP Creation

K = {{-1}, {1}, {-2}, {-4}, {2,4}, {-1,-3}}
p's = p1, p2, p3, p4, p5, p6

  p1 = cmpp.ON (b < 0) if T
  p2 = cmpp.ON (b >= 0) if T
  p3 = cmpp.ON (c > 0) if T
  p4 = cmpp.ON (b > 13) if T
  p5 = cmpp.ON (c <= 0) if T
  p5 = cmpp.ON (b <= 13) if T
  p6 = cmpp.ON (b < 0) if T
  p6 = cmpp.ON (c <= 25) if T

[Figure: CFG of the running example (Entry to Exit) with each CMPP above attached by an arrow to a basic block.]


Running Example – Control Flow Substitution

Loop:
    p1 = p2 = p3 = p4 = p5 = p6 = 0
    b = load(a) if T
    p1 = cmpp.ON (b < 0) if T
    p2 = cmpp.ON (b >= 0) if T
    p6 = cmpp.ON (b < 0) if T
    p3 = cmpp.ON (c > 0) if p1
    p5 = cmpp.ON (c <= 0) if p1
    p4 = cmpp.ON (b > 13) if p3
    p5 = cmpp.ON (b <= 13) if p3
    b = b + 1 if p4
    c = c + 1 if p5
    d = d + 1 if p1
    p6 = cmpp.ON (c <= 25) if p2
    e = e + 1 if p2
    a = a + 1 if p6
    bge e, 34, Done if p6
    jump Loop if T
Done:

[Figure: CFG of the running example, shown beside the predicated code for comparison.]
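To make the semantics of the predicated loop body concrete, here is a minimal Python sketch that executes one iteration of it. The helper names `cmpp_on` and `run_iteration` and the dictionary standing in for memory are assumptions made for illustration; they are not part of the lecture. Guarded operations are modeled with ordinary `if` statements, which is equivalent to nullifying an operation whose predicate is false.

```python
def cmpp_on(old, cond, guard):
    """OR-type compare-to-predicate: it can set the destination, never clear it."""
    return old or (guard and cond)

def run_iteration(mem, a, c, d, e):
    """Run one pass of the predicated loop body.

    mem is a hypothetical stand-in for memory (address -> value).
    Returns (a, b, c, d, e, exit_taken).
    """
    p1 = p2 = p3 = p4 = p5 = p6 = False   # OR-type predicates start cleared
    b = mem[a]                            # b = load(a) if T
    p1 = cmpp_on(p1, b < 0, True)
    p2 = cmpp_on(p2, b >= 0, True)
    p6 = cmpp_on(p6, b < 0, True)
    p3 = cmpp_on(p3, c > 0, p1)
    p5 = cmpp_on(p5, c <= 0, p1)
    p4 = cmpp_on(p4, b > 13, p3)
    p5 = cmpp_on(p5, b <= 13, p3)
    if p4:
        b = b + 1
    if p5:
        c = c + 1
    if p1:
        d = d + 1
    p6 = cmpp_on(p6, c <= 25, p2)
    if p2:
        e = e + 1
    if p6:
        a = a + 1
    exit_taken = p6 and e >= 34           # bge e, 34, Done if p6
    return a, b, c, d, e, exit_taken

mem = {0: -5, 1: 20}
print(run_iteration(mem, 0, 0, 0, 0))    # path through b < 0, c <= 0
print(run_iteration(mem, 1, 10, 0, 33))  # path through b >= 0, c <= 25
```

Note that every operation is fetched on every iteration; only the predicate values decide which ones take effect.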


Step 4: CMPP Compaction

• Convert ON CMPPs to UN
  » All singly defined predicates don't need to be OR-type
  » OR of 1 condition → Just compute it!
  » Remove initialization (unconditional CMPPs don't require init)
• Reduce number of CMPPs
  » Utilize 2nd destination slot
  » Combine any 2 CMPPs with:
    - Same source operands
    - Same guarding predicate
    - Same or opposite compare conditions
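The two compaction steps above can be sketched in Python as follows. This is an illustrative sketch, not the lecture's implementation: the `Cmpp` record, the function names, and the greedy pairing order are assumptions, and the two-destination spelling (`UN.UC`, `UN.OC`, where a trailing `C` means the complemented sense of the compare) follows HPL-PD-style naming as an assumption. The sketch also assumes nothing between two merged CMPPs redefines their operands or guard.

```python
from collections import Counter
from dataclasses import dataclass, replace

# Opposite compare conditions, used to detect complementary CMPP pairs.
OPPOSITE = {"<": ">=", ">=": "<", ">": "<=", "<=": ">", "==": "!=", "!=": "=="}

@dataclass(frozen=True)
class Cmpp:
    dest: str   # destination predicate
    kind: str   # "ON" (OR-type) or "UN" (unconditional)
    lhs: str
    op: str
    rhs: str
    guard: str  # guarding predicate, "T" for always true

def convert_on_to_un(cmpps):
    """A predicate with a single definition is an OR of one condition: make it UN."""
    defs = Counter(c.dest for c in cmpps)
    return [replace(c, kind="UN") if defs[c.dest] == 1 else c for c in cmpps]

def predicates_needing_init(cmpps):
    """Only predicates still written by OR-type CMPPs must be cleared first."""
    return sorted({c.dest for c in cmpps if c.kind == "ON"})

def combine_pairs(cmpps):
    """Greedily merge two compatible CMPPs into one two-destination CMPP."""
    merged, used = [], set()
    for i, first in enumerate(cmpps):
        if i in used:
            continue
        text = (f"{first.dest} = cmpp.{first.kind} "
                f"({first.lhs} {first.op} {first.rhs}) if {first.guard}")
        for j in range(i + 1, len(cmpps)):
            second = cmpps[j]
            same_inputs = ((first.lhs, first.rhs, first.guard)
                           == (second.lhs, second.rhs, second.guard))
            if j in used or not same_inputs:
                continue
            if second.op == first.op:
                action = second.kind            # same condition
            elif second.op == OPPOSITE[first.op]:
                action = second.kind[0] + "C"   # opposite condition
            else:
                continue
            used.add(j)
            text = (f"{first.dest}, {second.dest} = cmpp.{first.kind}.{action} "
                    f"({first.lhs} {first.op} {first.rhs}) if {first.guard}")
            break
        merged.append(text)
    return merged

# The CMPPs of the running example, in program order.
example = [
    Cmpp("p1", "ON", "b", "<", "0", "T"),   Cmpp("p2", "ON", "b", ">=", "0", "T"),
    Cmpp("p6", "ON", "b", "<", "0", "T"),   Cmpp("p3", "ON", "c", ">", "0", "p1"),
    Cmpp("p5", "ON", "c", "<=", "0", "p1"), Cmpp("p4", "ON", "b", ">", "13", "p3"),
    Cmpp("p5", "ON", "b", "<=", "13", "p3"), Cmpp("p6", "ON", "c", "<=", "25", "p2"),
]
compacted = convert_on_to_un(example)
print("init:", predicates_needing_init(compacted))
for line in combine_pairs(compacted):
    print(line)
```

On the running example this leaves only p5 and p6 needing initialization and reduces eight CMPPs to five.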


Class Problem

    if (a > 0) {
        r = t + s
        if (b > 0 || c > 0)
            u = v + 1
        else if (d > 0)
            x = y + 1
        else
            z = z + 1
    }

a. Draw the CFG
b. Compute CD
c. If-convert the code
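As a way to check an answer to part (c), the sketch below shows one possible if-conversion of the code, written in Python so it can be compared against the branching original. The predicate names (`p1` through `p5`, `p_u`) and the choice of CMPP types in the comments are assumptions for illustration, not the official solution. An unconditional CMPP is modeled as `guard and cond`, and an OR-type CMPP as `old or (guard and cond)`.

```python
from itertools import product

def original(state):
    """The branching code from the problem statement."""
    s = dict(state)
    if s["a"] > 0:
        s["r"] = s["t"] + s["s"]
        if s["b"] > 0 or s["c"] > 0:
            s["u"] = s["v"] + 1
        elif s["d"] > 0:
            s["x"] = s["y"] + 1
        else:
            s["z"] = s["z"] + 1
    return s

def if_converted(state):
    """One possible predicated form: no branches, only guarded operations."""
    s = dict(state)
    p_u = False                               # OR-type destination needs init
    p1 = s["a"] > 0                           # p1 = cmpp.UN (a > 0) if T
    if p1:
        s["r"] = s["t"] + s["s"]
    p_u = p_u or (p1 and s["b"] > 0)          # p_u = cmpp.ON (b > 0) if p1
    p2 = p1 and not (s["b"] > 0)              # p2  = cmpp.UC (b > 0) if p1
    p_u = p_u or (p2 and s["c"] > 0)          # p_u = cmpp.ON (c > 0) if p2
    p3 = p2 and not (s["c"] > 0)              # p3  = cmpp.UC (c > 0) if p2
    p4 = p3 and s["d"] > 0                    # p4  = cmpp.UN (d > 0) if p3
    p5 = p3 and not (s["d"] > 0)              # p5  = cmpp.UC (d > 0) if p3
    if p_u:
        s["u"] = s["v"] + 1
    if p4:
        s["x"] = s["y"] + 1
    if p5:
        s["z"] = s["z"] + 1
    return s

base = {"r": 0, "t": 1, "s": 2, "u": 0, "v": 10, "x": 0, "y": 20, "z": 30}
all_match = all(
    original({**base, "a": a, "b": b, "c": c, "d": d})
    == if_converted({**base, "a": a, "b": b, "c": c, "d": d})
    for a, b, c, d in product([-1, 1], repeat=4)
)
print(all_match)
```

The predicate `p_u` is the only one with two definitions, which is why it is the only OR-type predicate here.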


Region Formation + If-conversion

• Control flow representation
  » branches
  » predicated operations
• If-conversion is not an all-or-nothing deal
  » Often bad to apply in blanket mode
  » Selectively apply
• Regions
  » Extend a superblock to contain if-converted code
  » Convert off-trace transitions to on-trace
  » A hyperblock is born
  » Superblock is a special case HB where all guarding predicates are True

[Figure: example CFG with profile weights, including the duplicated blocks BB4', BB5' and BB6'.]


Negative 1: Resource Usage

Resource usage is additive for all BBs that are if-converted.

Case 1: Each BB requires 3 resources. Assume processor has 2 resources.
  No IC: 1*3 + .6*3 + .4*3 + 1*3 = 9
         9 / 2 = 4.5, rounded up to 5 cycles
  IC:    1*(3 + 3 + 3 + 3) = 12
         12 / 2 = 6 cycles

Case 2: Each BB requires 3 resources. Assume processor has 6 resources.
  No IC: 1*3 + .6*3 + .4*3 + 1*3 = 9
         9 / 6 = 1.5, rounded up to 2 cycles
  IC:    1*(3 + 3 + 3 + 3) = 12
         12 / 6 = 2 cycles

[Figure: diamond CFG BB1 → {BB2, BB3} → BB4 with entry weight 100 and a 60/40 split, next to its if-converted form: BB1, BB2 if p1, BB3 if p2, BB4.]
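The resource arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch under the slide's own model (a diamond of four blocks with a 60/40 split); the names `FREQS`, `USAGE` and `cycles` are assumptions for illustration. Exact fractions are used so the rounding up is not disturbed by floating-point error.

```python
from fractions import Fraction
from math import ceil

# Execution frequency of BB1..BB4 relative to the region entry (60/40 split).
FREQS = [Fraction(1), Fraction(6, 10), Fraction(4, 10), Fraction(1)]
# Resources required by each basic block.
USAGE = [3, 3, 3, 3]

def cycles(freqs, usage, width):
    """Return (cycles without if-conversion, cycles with if-conversion)."""
    # Without IC, a block only costs resources when it actually executes.
    no_ic_demand = sum(f * u for f, u in zip(freqs, usage))
    # With IC, every block's operations are always fetched and issued.
    ic_demand = sum(usage)
    return ceil(no_ic_demand / width), ceil(Fraction(ic_demand, width))

for width in (2, 6):
    print(width, cycles(FREQS, USAGE, width))
```

With 2 resources if-conversion costs an extra cycle; with 6 resources the extra operations fit in otherwise idle slots and the cost disappears.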


Negative 2: Dependence Height

Dependence height is the max over all BBs that are if-converted (dep height = schedule length with infinite resources).

Case 1: height(bb1) = 1, height(bb2) = 3, height(bb3) = 9, height(bb4) = 2
  No IC: 1*1 + .6*3 + .4*9 + 1*2 = 8.4
  IC:    1*1 + 1*MAX(3,9) + 1*2 = 12

Case 2: height(bb1) = 1, height(bb2) = 3, height(bb3) = 3, height(bb4) = 2
  No IC: 1*1 + .6*3 + .4*3 + 1*2 = 6
  IC:    1*1 + 1*MAX(3,3) + 1*2 = 6

[Figure: diamond CFG BB1 → {BB2, BB3} → BB4 with entry weight 100 and a 60/40 split, next to its if-converted form: BB1, BB2 if p1, BB3 if p2, BB4.]
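The same comparison for dependence height can be sketched in Python. The function name `expected_heights` and the probability constants are assumptions for illustration; the model is the slide's diamond with a 60/40 split. Using height(bb4) = 2 as given, the if-converted height for Case 1 works out to 1 + 9 + 2 = 12.

```python
from fractions import Fraction

# Branch probabilities for the diamond BB1 -> {BB2, BB3} -> BB4.
P_BB2, P_BB3 = Fraction(6, 10), Fraction(4, 10)

def expected_heights(h1, h2, h3, h4):
    """Return (height without if-conversion, height with if-conversion)."""
    # Without IC, only the executed side of the diamond contributes.
    no_ic = h1 + P_BB2 * h2 + P_BB3 * h3 + h4
    # With IC, both sides are merged, so the taller side sets the height.
    ic = h1 + max(h2, h3) + h4
    return no_ic, ic

for heights in [(1, 3, 9, 2), (1, 3, 3, 2)]:
    no_ic, ic = expected_heights(*heights)
    print(heights, float(no_ic), ic)
```

When the two sides have matched heights, if-conversion adds nothing to the height; when one side is much taller and rarely taken, it dominates every execution.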


When To If-convert

• Resources
  » Small resource usage ideal for less important paths
• Dependence height
  » Matched heights are ideal
  » Close to same heights is ok
• Remember everything is relative for resources and dependence height!
• Hazards
  » Avoid hazards unless on most important path
• Estimate of benefit
  » Branches/Mispredicts removed
  » Fudge factor

[Figure: diamond CFG of BB1 through BB4 next to its if-converted form: BB1, BB2 if p1, BB3 if p2, BB4.]


The Hyperblock

• Hyperblock - Collection of basic blocks in which control flow may only enter at the first BB. All internal control flow is eliminated via if-conversion
  » "Likely control flow paths"
  » Acyclic (outer backedge ok)
  » Multiple intersecting traces with no side entrances
  » Side exits still exist
• Hyperblock formation
  » 1. Block selection
  » 2. Tail duplication
  » 3. If-conversion

[Figure: example CFG of basic blocks used to illustrate hyperblock formation.]


Block Selection

• Create a trace → "main path"
  » Use a heuristic function to select other blocks that are "compatible" with the main path
  » Consider each BB by itself for simplicity
    - Compute priority for other BB's
    - Normalize against main path
• BSVi = (K x (weight_bbi / size_bbi) x (size_main_path / weight_main_path) x bb_chari)
  » weight = execution frequency
  » size = number of operations
  » bb_char = characteristic value of each BB
    - Max value = 1, hazardous instructions reduce this to 0.5, 0.25, ...
  » K = constant to represent processor issue rate
• Include BB when BSVi > Threshold


Example - Step 1 - Block Selection

main path = 1,2,4,6
num_ops = 5 + 8 + 3 + 2 = 18
weight = 80

Calculate the BSVs for BB3, BB5 assuming no hazards, K = 4
  BSV3 = 4 x (20 / 2) x (18 / 80) = 9
  BSV5 = 4 x (10 / 5) x (18 / 80) = 1.8

If Threshold = 2.0, select BB3 along with main path

[Figure: CFG annotated with operation counts per block (BB1 - 5, BB2 - 8, BB3 - 2, BB4 - 3, BB5 - 5, BB6 - 2) and edge weights including 80 and 10.]
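The BSV calculation in this example can be checked with a short Python sketch. The function names `bsv` and `select_blocks` and the dictionary layout are assumptions for illustration; the formula and the numbers (weights 20 and 10, sizes 2 and 5, main path size 18 and weight 80, K = 4, threshold 2.0) come from the example itself.

```python
def bsv(weight_bb, size_bb, weight_main, size_main, k, bb_char=1.0):
    """Block selection value: block priority normalized against the main path."""
    return k * (weight_bb / size_bb) * (size_main / weight_main) * bb_char

def select_blocks(candidates, weight_main, size_main, k, threshold):
    """Return the names of candidate blocks whose BSV exceeds the threshold."""
    return [
        name
        for name, (weight, size) in candidates.items()
        if bsv(weight, size, weight_main, size_main, k) > threshold
    ]

# Off-path candidates from the example: name -> (weight, size), no hazards.
candidates = {"BB3": (20, 2), "BB5": (10, 5)}

for name, (weight, size) in candidates.items():
    print(name, round(bsv(weight, size, weight_main=80, size_main=18, k=4), 2))

print(select_blocks(candidates, weight_main=80, size_main=18, k=4, threshold=2.0))
```

A block containing hazardous instructions would pass a smaller `bb_char` (0.5, 0.25, ...), which lowers its BSV and makes it less likely to be included.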