


















These notes cover two compiler optimization techniques: induction variable strength reduction and induction variable elimination. Strength reduction replaces expensive operations on induction variables with cheaper incremental updates, and elimination removes induction variables that are no longer needed. The notes also cover examples of their application and potential issues.
Typology: Study notes



















Reading Material
Induction Variable Strength Reduction
Rules
» X is a *, <<, + or - operation
» src1(X) is a basic induction variable
» src2(X) is invariant
» No other ops modify dest(X)
» dest(X) != src(X) for all srcs
» dest(X) is a register
Transformation
» Insert the following into the preheader
    new_reg = RHS(X)
» If opcode(X) is not add/sub, insert to the bottom of the preheader
    new_inc = inc(src1(X)) opcode(X) src2(X)
» else
    new_inc = inc(src1(X))
» Insert the following at each update of src1(X)
    new_reg += new_inc
» Change X
    dest(X) = new_reg
r5 = r4 - 3
r4 = r4 + 1
r7 = r4 * r
r6 = r4 << 2
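The transformation can be checked on a small loop. The following Python sketch is an illustration only, not code from the notes: it simulates a loop whose body computes r6 = r4 << 2 from the basic induction variable r4 (increment 1), and then the strength-reduced form using new_reg and new_inc as named in the transformation rules. The function names, the start value, and the trip count are assumptions made for the example.

```python
def original_loop(r4_start, trips):
    """Loop body recomputes r6 = r4 << 2 on every iteration."""
    r4 = r4_start
    seen = []
    for _ in range(trips):
        r6 = r4 << 2          # X: shift applied to the basic induction variable
        seen.append(r6)
        r4 = r4 + 1           # update of src1(X), increment is 1
    return seen


def reduced_loop(r4_start, trips):
    """Same loop after induction variable strength reduction."""
    r4 = r4_start
    # Preheader: new_reg = RHS(X). The opcode is not add/sub, so
    # new_inc = inc(src1(X)) opcode(X) src2(X) = 1 << 2.
    new_reg = r4 << 2
    new_inc = 1 << 2
    seen = []
    for _ in range(trips):
        r6 = new_reg          # X is changed to dest(X) = new_reg
        seen.append(r6)
        r4 = r4 + 1
        new_reg += new_inc    # inserted at the update of src1(X)
    return seen
```

Both functions produce the same sequence of r6 values; the reduced loop only performs an add per iteration where the original performed a shift.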
Class Problem
Optimize this by applying induction variable strength reduction and induction variable elimination.

r1 = 0
r2 = 0

r5 = r5 + 1
r11 = r5 * 2
r10 = r11 + 2
r12 = load(r10+0)

r9 = r1 << 1
r4 = r9 - 10
r3 = load(r4+4)
r3 = r3 + 1
store(r4+0, r3)
r7 = r3 << 2
r6 = load(r7+0)
r13 = r2 - 1
r1 = r1 + 1
r2 = r2 + 1

liveout: r13, r12, r6, r
ILP Optimization
Global Register Renaming
» A single use may have multiple reaching defs
» Identify webs
    Take a def, add all uses
    Take all uses, add all reaching defs
    Take all defs, add all uses
    Repeat until stable soln
» Each web renamed if name is the same as another web
[Figure: control flow graph whose blocks contain defs (x =, y =) and uses (= x, = y), from which the webs are identified]
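Web identification amounts to finding connected components of the def-use graph. The sketch below is one possible way to compute them in Python; it uses union-find in place of the "repeat until stable" iteration described above, which is a swapped-in technique that reaches the same fixed point. The input format (a mapping from each use to the set of defs reaching it) and all labels are hypothetical, and the reaching-definition analysis itself is assumed to have been done already.

```python
def find_webs(reaching_defs):
    """Group defs and uses into webs.

    reaching_defs maps each use label to the set of def labels that
    reach it. A web is a connected component of this def-use graph.
    Defs with no uses do not appear in the input and are not reported.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    # A use and every def reaching it belong to the same web.
    for use, defs in reaching_defs.items():
        for d in defs:
            union(use, d)

    webs = {}
    for node in parent:
        webs.setdefault(find(node), set()).add(node)
    return sorted(webs.values(), key=lambda w: sorted(w))
```

For example, with uses u1 reached by d1 and d2, u2 reached by d2, and u3 reached by d3, the result is two webs: {d1, d2, u1, u2} and {d3, u3}. Two webs that share a register name can then be given different names.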
Rename with Copy
» The worst case is a web that spans all defs/uses
» Want to enable some of the defs within the web to be reordered or executed in parallel
» Rename def
» Rename uses for which def is the only reaching def
» Insert copy
    orig_dest = new_dest
[Figure: control flow graph with several defs (y =) and uses (= y) of y, before and after renaming with copy]
Promote with Copy
» Promotion alone is not legal because a live value is destroyed
» Might as well choose True
» Substitute uses for which def is the only reaching def
» Insert copy of old_dest = new_dest if original_pred
» Again, must ensure operation will not cause a spurious exception
Class Problem
using promotion with renaming
Tree Height Reduction
» Obey precedence rules
» Essentially re-parenthesize
Height reduced (n terms)
» from n-1 (assuming unit latency) to ceil(log2(n))
Number of operations remains constant
Cost
» Temporary registers "live" longer
Watch out for
» Always ok for integer arithmetic
» Floating-point: may not be!!
original:
r9 = r1 + r2
r10 = r9 + r3
r11 = r10 - r4
r12 = r11 + r5
r13 = r12 - r

after back subs:
r13 = r1 + r2 + r3 - r4 + r5 - r

[Figure: balanced expression tree for r13, with the terms grouped in pairs]

final code:
t1 = r1 + r2
t2 = r3 - r4
t3 = r5 - r6
t4 = t1 + t2
r13 = t4 + t
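The re-parenthesization can be sketched as pairing adjacent terms level by level. The Python below is a minimal illustration under stated assumptions, not the notes' algorithm: it handles only a back-substituted chain of + and - terms, assumes unit latency and integer arithmetic, and requires the first term to be positive. The function name, the (sign, operand) input format, and the temporary names t1, t2, ... are hypothetical.

```python
from itertools import count


def tree_height_reduce(terms, dest):
    """Return (code, height) for a balanced evaluation of signed terms.

    terms is a list of (sign, operand) pairs with sign "+" or "-",
    e.g. [("+", "r1"), ("-", "r4")]. dest names the final result.
    """
    if terms[0][0] != "+":
        raise ValueError("this sketch assumes the first term is positive")
    temps = count(1)
    level = list(terms)
    code = []
    height = 0
    while len(level) > 1:
        is_root = len(level) == 2
        nxt = []
        for i in range(0, len(level) - 1, 2):
            (s1, a), (s2, b) = level[i], level[i + 1]
            if s1 == "+":                 # +a +b or +a -b
                expr, sign = f"{a} {s2} {b}", "+"
            elif s2 == "+":               # -a +b is b - a
                expr, sign = f"{b} - {a}", "+"
            else:                         # -a -b is -(a + b)
                expr, sign = f"{a} + {b}", "-"
            name = dest if is_root else f"t{next(temps)}"
            code.append(f"{name} = {expr}")
            nxt.append((sign, name))
        if len(level) % 2:                # odd term waits for the next level
            nxt.append(level[-1])
        level = nxt
        height += 1
    return code, height
```

Called on the six terms r1 + r2 + r3 - r4 + r5 - r6 with dest "r13", the sketch emits five operations in three levels, compared with a height of five for the original serial chain. The operation count stays the same; only the dependence height changes.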
Fancier Tree Height Reduction
» Reassociate to maximize opportunities for combining literals at compile time
» Reduces amount of computation

after back subs:
r13 = r1 + 4 + r2 - 3 + r3 - 6
reassociate:
r13 = r1 + r2 + r3 + (4 - 3 - 6)
simplify:
r13 = r1 + r2 + r3 - 5
balance:
[Figure: balanced expression tree for r13 = r1 + r2 + r3 - 5]
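The reassociate and simplify steps can be illustrated with a short Python sketch. It is a hedged example rather than the notes' implementation: the function name and the (sign, operand) representation are assumptions, literals are Python ints, register names are strings, and only integer + and - chains are handled.

```python
def fold_literals(terms):
    """Move all literals to the end of the chain and combine them.

    terms is a list of (sign, operand) pairs; an operand is either an
    int literal or a register name. Returns the simplified term list.
    """
    constant = 0
    registers = []
    for sign, operand in terms:
        if isinstance(operand, int):
            # Reassociate: accumulate every literal at compile time.
            constant += operand if sign == "+" else -operand
        else:
            registers.append((sign, operand))
    # Simplify: emit at most one literal term, none if it folds to zero.
    if constant > 0:
        registers.append(("+", constant))
    elif constant < 0:
        registers.append(("-", -constant))
    return registers
```

For r1 + 4 + r2 - 3 + r3 - 6 the literals fold to -5, leaving r1 + r2 + r3 - 5: three run-time operations where the back-substituted chain had five. The shorter chain can then be balanced as in ordinary tree height reduction.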
Class Problem
Assume: + = 1, * = 3
operand arrival times
0 r
0 r
0 r
1 r
2 r
0 r
Back substitute
Re-express in tree-height reduced form
Account for latency and arrival times
Optimizing Unrolled Loops
Unroll = replicate loop body n-1 times.
Hope to enable overlap of operation execution from different iterations.
Not possible!

loop:
r1 = load(r2)
r3 = load(r4)
r5 = r1 * r3
r6 = r6 + r5
r2 = r2 + 4
r4 = r4 + 4
if (r4 < 400) goto loop

unroll 3 times

iter1 iter2 iter
loop:
r1 = load(r2)
r3 = load(r4)
r5 = r1 * r3
r6 = r6 + r5
r2 = r2 + 4
r4 = r4 + 4

r1 = load(r2)
r3 = load(r4)
r5 = r1 * r3
r6 = r6 + r5
r2 = r2 + 4
r4 = r4 + 4

r1 = load(r2)
r3 = load(r4)
r5 = r1 * r3
r6 = r6 + r5
r2 = r2 + 4
r4 = r4 + 4
if (r4 < 400) goto loop
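A small Python sketch can show both the unrolling step and one plausible reading of why overlap is "not possible" in the unrolled code as written: every copy of the body writes and reads the same registers, so each copy depends on the one before it. The helper names and the string representation of operations are assumptions made for this illustration.

```python
import re

BODY = [
    "r1 = load(r2)",
    "r3 = load(r4)",
    "r5 = r1 * r3",
    "r6 = r6 + r5",
    "r2 = r2 + 4",
    "r4 = r4 + 4",
]
BRANCH = "if (r4 < 400) goto loop"


def unroll(body, branch, copies):
    """Place `copies` copies of the body back to back, with one branch at the end."""
    return body * copies + [branch]


def cross_copy_reuse(body):
    """Registers written by one copy of the body and named again in the next copy."""
    written = {op.split("=")[0].strip() for op in body}
    named = set()
    for op in body:
        named.update(re.findall(r"r\d+", op))
    return sorted(written & named, key=lambda r: int(r[1:]))
```

Here unroll(BODY, BRANCH, 3) yields the 19-operation loop shown above, and cross_copy_reuse(BODY) reports all of r1 through r6, which suggests that some form of renaming is needed before operations from different iterations can overlap.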