Slides on Classic Optimization - Fall 2006 | EECS 583, Study notes of Electrical and Electronics Engineering

Material Type: Notes; Professor: Mahlke; Class: Advanced Compilers; Subject: Electrical Engineering And Computer Science; University: University of Michigan - Ann Arbor; Term: Winter 2006;

EECS 583 – Class 10
Classic Optimization
Guest speaker today: Rajiv Ravindran
University of Michigan
February 13, 2006

Reading Material

Today's class
  » Compilers: Principles, Techniques, and Tools, A. Aho, R. Sethi, and J. Ullman, Addison-Wesley, 1988, 9.9, 10.2, 10.3, 10.
Material for the next lecture
  » "Compiler Code Transformations for Superscalar-Based High-Performance Systems", S. Mahlke et al., Supercomputing '92, Nov. 1992, pp. 808-817.
Predicate-sensitive dataflow analysis
  » Finish this on Wednesday when Scott is back


Classical Optimizations

Operation-level – 1 operation in isolation
  » Constant folding, strength reduction
  » Dead code elimination (global, but 1 op at a time)
Local/Global – Pairs of operations
  » Constant propagation
  » Forward copy propagation
  » Backward copy propagation
  » CSE
  » Constant combining
  » Operation folding
Loop – Body of a loop
  » Invariant code removal
  » Global variable migration
  » Induction variable strength reduction
  » Induction variable elimination


Caveat

Traditional compiler class
  » Fancy implementations of optimizations, efficient algorithms
  » Entire papers written on how to do 1 optimization
  » Spend entire class on 1 optimization
For this class – Go over concepts of each optimization
  » What it is, why it's useful
  » When it can be applied (set of conditions that must be satisfied)
Challenges
  » How do predicates affect things?
  » Register pressure?
  » ILP versus operation count


Strength Reduction

Replace expensive ops with cheaper ones
  » Constant propagation creates opportunities for this
Power of 2 constants
  » Multiply by power of 2, replace with left shift
      r1 = r2 * 8     =>  r1 = r2 << 3
  » Divide by power of 2, replace with right shift
      r1 = r2 / 4     =>  r1 = r2 >> 2
  » Remainder by power of 2, replace with logical and
      r1 = r2 REM 16  =>  r1 = r2 & 15
More exotic
  » Replace multiply by constant by sequence of shift and adds/subs
      r1 = r2 * 6     =>  r100 = r2 << 2; r101 = r2 << 1; r1 = r100 + r
      r1 = r2 * 7     =>  r100 = r2 << 3; r1 = r100 - r
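
The power-of-2 rewrites on this slide can be sketched as a small pattern matcher in Python. This is a minimal illustration under stated assumptions, not the course's implementation: `is_power_of_two`, `reduce_by_constant`, `mul6`, and `mul7` are hypothetical names, and the shift-and-add sequences in `mul6` and `mul7` are worked out from the arithmetic (6 = 4 + 2, 7 = 8 - 1). The divide and remainder rewrites are exact only for unsigned or non-negative operands when signed division truncates toward zero, as it does in C.

```python
def is_power_of_two(n):
    """True when n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def reduce_by_constant(op, c):
    """Return (cheaper_op, new_constant) for a power-of-2 constant c, else None."""
    if not is_power_of_two(c):
        return None
    shift = c.bit_length() - 1          # log2(c) for a power of 2
    if op == "*":
        return ("<<", shift)
    if op == "/":
        return (">>", shift)            # assumes a non-negative dividend
    if op == "REM":
        return ("&", c - 1)             # assumes a non-negative dividend
    return None


def mul6(r2):
    """r2 * 6 as two shifts and an add (hypothetical temporaries r100, r101)."""
    r100 = r2 << 2
    r101 = r2 << 1
    return r100 + r101


def mul7(r2):
    """r2 * 7 as one shift and a subtract."""
    r100 = r2 << 3
    return r100 - r2
```

Whether the multi-op sequences actually win depends on the target's multiply latency and on register pressure, since they introduce temporaries.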


Dead Code Elimination

Remove any operation whose result is never consumed
Rules
  » X can be deleted
      no stores or branches
  » DU chain empty or dest register not live
This misses some dead code!!
  » Especially in loops
  » Critical operation
      store or branch operation
  » Any operation that does not directly or indirectly feed a critical operation is dead
  » Trace UD chains backwards from critical operations
  » Any op not visited is dead

Example (CFG figure; edges not recoverable)
    r1 = 3
    r2 = 10
    r4 = r4 + 1
    r7 = r1 * r

    r2 = 0

    r3 = r3 + 1

    r3 = r2 + r1
    store (r1, r3)
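
The "trace backwards from critical operations" idea can be sketched as a mark phase over a list of ops. This is a simplified illustration, not the algorithm from the lecture: the `Op` record and `dead_ops` are hypothetical, and real UD chains are replaced by a conservative, flow-insensitive stand-in that treats every definition of a register as possibly reaching every use of it.

```python
from collections import namedtuple

# Hypothetical op record: destination register (or None), source registers,
# and whether the op is critical (a store or a branch).
Op = namedtuple("Op", "dest srcs critical")


def dead_ops(ops):
    """Return indices of ops that never feed a critical op."""
    defs = {}                                   # register -> indices of its defs
    for i, op in enumerate(ops):
        if op.dest is not None:
            defs.setdefault(op.dest, []).append(i)

    marked = set()
    worklist = [i for i, op in enumerate(ops) if op.critical]
    while worklist:
        i = worklist.pop()
        if i in marked:
            continue
        marked.add(i)
        for src in ops[i].srcs:                 # walk back to every def of src
            worklist.extend(defs.get(src, []))
    return [i for i in range(len(ops)) if i not in marked]


ops = [
    Op("r1", (), False),                # r1 = 3
    Op("r2", (), False),                # r2 = 10
    Op("r4", ("r4",), False),           # r4 = r4 + 1
    Op("r3", ("r2", "r1"), False),      # r3 = r2 + r1
    Op(None, ("r1", "r3"), True),       # store (r1, r3)
]
print(dead_ops(ops))                    # [2]
```

The self-increment `r4 = r4 + 1` shows why the simple rule misses dead code in loops: in a loop its destination is live because the op itself reads it, yet nothing critical ever depends on it.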


Constant Propagation

Forward propagation of moves of the form
  » rx = L (where L is a literal)
  » Maximally propagate
  » Assume no instruction encoding restrictions
When is it legal?
  » SRC: Literal is a hard coded constant, so never a problem
  » DEST: Must be available
      Guaranteed to reach
      May reach not good enough

Example (CFG figure; edges not recoverable)
    r1 = 5
    r2 = r1 + r

    r1 = r1 + r

    r7 = r1 + r

    r8 = r1 + 3

    r9 = r1 + r
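
The difference between "guaranteed to reach" and "may reach" shows up at a control-flow join. As a rough sketch (the function name `meet` and the dictionary representation are assumptions made for illustration), the constants usable in a block are only those on which every predecessor agrees:

```python
def meet(pred_outs):
    """Constants guaranteed to reach a join point.

    pred_outs: one dict per predecessor block, mapping a register name to the
    literal it is known to hold at the end of that block.
    """
    if not pred_outs:
        return {}
    result = dict(pred_outs[0])
    for out in pred_outs[1:]:
        # Keep a register only if this path carries the same literal.
        result = {r: v for r, v in result.items() if r in out and out[r] == v}
    return result


# r1 = 5 reaches on both paths; r2 = 3 reaches on only one of them.
print(meet([{"r1": 5, "r2": 3}, {"r1": 5}]))    # {'r1': 5}
```

A "may reach" analysis would take the union instead, which would wrongly allow `r2` to be replaced by 3 on the path where it was never set.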


Local Constant Propagation

Consider 2 ops, X and Y in a BB, X is before Y
  1. X is a move
  2. src1(X) is a literal
  3. Y consumes dest(X)
  4. There is no definition of dest(X) between X and Y
  5. No danger between X and Y
       When dest(X) is a Macro reg, BRL destroys the value
Note, ignore operation format issues, so all operations can have literals in either operand position

Example
    r1 = 5
    r2 = '_x'
    r3 = 7
    r4 = r4 + r1
    r1 = r1 + r2
    r1 = r1 + 1
    r3 = 12
    r8 = r1 - r2
    r9 = r3 + r5
    r3 = r2 + 1
    r10 = r3 - r
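
Within one basic block, rules 1 through 4 amount to a single forward pass that remembers which registers currently hold a known literal. The sketch below assumes a hypothetical op format `(dest, opcode, srcs)` in which registers are strings and literals are integers. Rule 5 (macro registers clobbered by BRL) is not modeled.

```python
def local_const_prop(block):
    """Forward-propagate literal moves through one basic block."""
    consts = {}                         # register -> literal known to reach here
    out = []
    for dest, opcode, srcs in block:
        # Rule 3: replace each consumed register that holds a known literal.
        new_srcs = tuple(consts.get(s, s) if isinstance(s, str) else s
                         for s in srcs)
        out.append((dest, opcode, new_srcs))
        if opcode == "mov" and isinstance(new_srcs[0], int):
            consts[dest] = new_srcs[0]  # rules 1-2: a move of a literal
        else:
            consts.pop(dest, None)      # rule 4: any other def kills the literal
    return out


block = [
    ("r1", "mov", (5,)),                # r1 = 5
    ("r3", "mov", (7,)),                # r3 = 7
    ("r4", "add", ("r4", "r1")),        # r4 = r4 + r1
    ("r1", "add", ("r1", "r2")),        # r1 = r1 + r2
    ("r3", "mov", (12,)),               # r3 = 12
    ("r9", "add", ("r3", "r5")),        # r9 = r3 + r5
]
for op in local_const_prop(block):
    print(op)
```

On this input `r4 = r4 + r1` and `r1 = r1 + r2` both receive the literal 5, and `r9 = r3 + r5` receives 12 rather than 7 because the second move to `r3` replaces the first.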


What About Predicated Code?

Use global formulation with predicate-sensitive dataflow
Use local formulation
  » Predicate dominates – predicate of def >= predicate of use
  » Intervening writes do not matter if they are on disjoint predicate

Examples
    r1 = 2 if p1
    ...
    r2 = r1 + r3 if p

    r1 = 2 if p1
    ...
    r1 = load(r5) if p3
    ...
    r2 = r1 + r3 if p


Predicated Local Constant Propagation

Consider 2 ops, X and Y in a Predicated BB, X is before Y
  1. X is a move
  2. src1(X) is a literal
  3. Y consumes dest(X)
  3.5. pred(X) >= pred(Y)
  4. There is no definition of dest(X), D, between X and Y such that pred(D) ∩ pred(X) != False
  5. No danger between X and Y
       When dest(X) is a Macro reg, BRL destroys the value

Example
    r1 = 5 if T
    r2 = '_x' if T
    r3 = 7 if T
    p1,p2 = cmppUNUC r5 < 0 if T
    r4 = r4 + r1 if p1
    r1 = r1 + r2 if p1
    r1 = r1 + 1 if p1
    r3 = 12 if p1
    r8 = r1 - r2 if p2
    r9 = r3 + r5 if p2
    r3 = r1 + 1 if p2
    r10 = r3 - r1 if p
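
The predicated rules can be checked mechanically once predicates have a concrete representation. The sketch below makes two assumptions that are not in the slides. First, a predicate is modeled as the set of mutually exclusive outcomes under which it is true, so "pred(X) >= pred(Y)" becomes a superset test. Second, the garbled rule 4 is read as "no intervening definition whose predicate overlaps pred(X)". The name `can_propagate` and the op format `(dest, opcode, srcs, pred)` are hypothetical.

```python
# Hypothetical predicate model for "p1,p2 = cmpp r5 < 0".
T = frozenset({"lt", "ge"})             # always true
P1 = frozenset({"lt"})                  # true when r5 < 0
P2 = frozenset({"ge"})                  # complement of p1


def can_propagate(block, x, y):
    """Check rules 1-4 (including 3.5) for ops at indices x < y."""
    dest_x, opcode_x, srcs_x, pred_x = block[x]
    _, _, srcs_y, pred_y = block[y]
    if opcode_x != "mov" or not isinstance(srcs_x[0], int):
        return False                    # rules 1-2: X moves a literal
    if dest_x not in srcs_y:
        return False                    # rule 3: Y consumes dest(X)
    if not pred_y <= pred_x:
        return False                    # rule 3.5: pred(X) >= pred(Y)
    for dest_d, _, _, pred_d in block[x + 1:y]:
        if dest_d == dest_x and pred_d & pred_x:
            return False                # rule 4: overlapping intervening def
    return True


block = [
    ("r1", "mov", (5,), T),             # r1 = 5 if T
    ("r3", "mov", (7,), T),             # r3 = 7 if T
    ("r4", "add", ("r4", "r1"), P1),    # r4 = r4 + r1 if p1
    ("r1", "add", ("r1", "r2"), P1),    # r1 = r1 + r2 if p1
    ("r3", "mov", (12,), P1),           # r3 = 12 if p1
    ("r8", "sub", ("r1", "r2"), P2),    # r8 = r1 - r2 if p2
    ("r9", "add", ("r3", "r5"), P2),    # r9 = r3 + r5 if p2
]
print(can_propagate(block, 0, 2))       # True
print(can_propagate(block, 0, 5))       # False
```

Under this reading the check is conservative. `r1 = 5` is not propagated into `r8 = r1 - r2 if p2` because the write under `p1` overlaps pred(X) = T, even though `p1` and `p2` can never both be true.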


Forward Copy Propagation

Forward propagation of the RHS of moves
    r1 = r
    r4 = r1 + 1  =>  r4 = r2 + 1
Benefits
  » Reduce chain of dependences
  » Eliminate the move
Rules (ops X and Y)
  » X is a move
  » src1(X) is a register
  » Y consumes dest(X)
  » X.dest is an available def at Y
  » X.src1 is an available expr at Y

Example (CFG figure; edges not recoverable)
    r1 = r2
    r3 = r

    r2 = 0

    r6 = r3 + 1

    r5 = r2 + r
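
Inside a single basic block the two availability conditions reduce to "neither the move's destination nor its source has been redefined since the move". A minimal sketch of that local case follows, with a hypothetical op format `(dest, opcode, srcs)` where registers are strings and literals are integers. The global case needs available-definition and available-expression dataflow, which is not shown.

```python
def local_copy_prop(block):
    """Forward-propagate register-to-register moves through one basic block."""
    copies = {}                         # dest -> source of a still-valid move
    out = []
    for dest, opcode, srcs in block:
        new_srcs = tuple(copies.get(s, s) for s in srcs)
        out.append((dest, opcode, new_srcs))
        # Redefining a register kills every copy that reads or writes it.
        copies = {d: s for d, s in copies.items() if d != dest and s != dest}
        if opcode == "mov" and isinstance(new_srcs[0], str) and new_srcs[0] != dest:
            copies[dest] = new_srcs[0]
    return out


block = [
    ("r1", "mov", ("r2",)),             # r1 = r2
    ("r4", "add", ("r1", 1)),           # r4 = r1 + 1  -> r4 = r2 + 1
    ("r2", "mov", (0,)),                # r2 = 0 kills the copy r1 = r2
    ("r5", "add", ("r1", 1)),           # r5 = r1 + 1  stays as is
]
for op in local_copy_prop(block):
    print(op)
```

The pass only rewrites uses. Removing the now-unused move is left to dead code elimination.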


Backward Copy Propagation

Backward propagation of the LHS of moves
    r1 = r2 + r  =>  r4 = r2 + r
    ...
    r5 = r1 + r  =>  r5 = r4 + r
    ...
    r4 = r       =>  noop
Rules (ops X and Y in same BB)
  » dest(X) is a register
  » dest(X) not live out of BB(X)
  » Y is a move
  » dest(Y) is a register
  » Y consumes dest(X)
  » dest(Y) not consumed in (X…Y)
  » dest(Y) not defined in (X…Y)
  » There are no uses of dest(X) after the first redefinition of dest(Y)

Example
    r1 = r8 + r9
    r2 = r9 + r1
    r4 = r2
    r6 = r2 + 1
    r9 = r1
    r10 = r6
    r5 = r6 + 1
    r4 = 0
    r8 = r2 + r
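
The rule list can be turned into a legality check for a candidate pair (X, Y). The sketch below is an illustration under assumptions, not the lecture's implementation: `can_backward_propagate`, the op format `(dest, opcode, srcs)`, and the empty `live_out` set are all hypothetical. It adds one check the slide leaves implicit, namely that dest(X) is not redefined between X and Y, so that Y really reads X's value.

```python
def can_backward_propagate(block, x, y, live_out):
    """Check whether X can write dest(Y) directly, turning move Y into a noop."""
    dest_x = block[x][0]
    dest_y, opcode_y, srcs_y = block[y]
    if opcode_y != "mov" or srcs_y != (dest_x,):
        return False                    # Y is a move that consumes dest(X)
    if dest_x in live_out:
        return False                    # dest(X) not live out of the BB
    for dest, _, srcs in block[x + 1:y]:
        if dest_y in srcs or dest == dest_y or dest == dest_x:
            return False                # dest(Y) untouched in (X...Y)
    redefined = False
    for dest, _, srcs in block[y + 1:]:
        if redefined and dest_x in srcs:
            return False                # use of dest(X) after dest(Y) is redefined
        if dest == dest_y:
            redefined = True
    return True


block = [
    ("r1", "add", ("r8", "r9")),        # r1 = r8 + r9
    ("r2", "add", ("r9", "r1")),        # r2 = r9 + r1
    ("r4", "mov", ("r2",)),             # r4 = r2
    ("r6", "add", ("r2", 1)),           # r6 = r2 + 1
    ("r9", "mov", ("r1",)),             # r9 = r1
    ("r10", "mov", ("r6",)),            # r10 = r6
    ("r5", "add", ("r6", 1)),           # r5 = r6 + 1
    ("r4", "mov", (0,)),                # r4 = 0
    ("r8", "add", ("r2",)),             # r8 = r2 + r (second operand truncated in the source)
]
print(can_backward_propagate(block, 1, 2, set()))   # False
print(can_backward_propagate(block, 0, 4, set()))   # False
print(can_backward_propagate(block, 3, 5, set()))   # True
```

The first pair fails because `r2` is still used after `r4` is redefined. The second fails because `r9` is consumed between the two ops. The third passes, assuming `r6` is not live out of the block.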


Class Problem

Optimize this applying
  1. dead code elimination
  2. forward copy propagation
  3. backward copy propagation
  4. CSE

    r1 = r5 if True
    r4 = r1 if True
    r6 = r15 if True
    r2 = r3 * r4 if True
    r8 = r2 + r5 if True
    r9 = r3 if True
    p1,p2 = cmpp(r2, r8) if True
    r7 = load(r2) if True
    r5 = r9 * r4 if p1
    r3 = load(r2) if p2
    r10 = r3 / r6 if p2
    r11 = r2 if True
    r12 = load(r11) if p1
    store(r8, r7) if p2
    store(r12, r3) if True

Be careful of the predicates!


Loop Optimizations

The most important set of optimizations
  » Because programs spend so much time in loops
Optimize given that you know a sequence of code will be repeatedly executed
Optis
  » Invariant code removal
  » Global variable migration
  » Induction variable strength reduction
  » Induction variable elimination