



















Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Exam; Professor: Mahlke; Class: Advanced Compilers; Subject: Electrical Engineering And Computer Science; University: University of Michigan - Ann Arbor; Term: Fall 2007;
Typology: Exams
1 / 27
This page cannot be seen from the preview
Don't miss anything!




















Schedule for the Rest of the Semester^ ^
Reading Material^ ^
From Last Time: A Software PipelineLoop bodywith 4 ops
Prologue - fill the pipe Kernel – steady state Epilogue - drain the pipe
time
Dependences in a Loop^ ^
Intra-iteration » Inter-iteration ^
Minimum time interval betweenthe start of operations » Operation read/write times ^
Number of iterations separatingthe 2 operations involved » Distance of 0 means intra-iteration ^
<1,2> <1,0>
<1,2>
<delay, distance> Edges annotated with tuple
Dynamic Single Assignment (DSA) Form
1: r3 = load(r1)2: r4 = r3 * 263: store (r2, r4)4: r1 = r1 + 45: r2 = r2 + 46: p1 = cmpp (r1 < r9)7: brct p1 Loop Impossible to overlap iterations because each iteration writes to the sameregister. So, we’ll have to remove the anti and output dependences.Recall back the notion of a rotating register (virtual for now)* Each register is an infinite push down array (Expanded virtual reg or EVR
Æ^ r[n] changes to r[n+1]
A program is in DSA form if the same virtual register (EVR element) is neverassigned to more than 1x on any dynamic execution path
1: r3[-1] = load(r1[0])2: r4[-1] = r3[-1] * 263: store (r2[0], r4[-1])4: r1[-1] = r1[0] + 45: r2[-1] = r2[0] + 46: p1[-1] = cmpp (r1[-1] < r9)remap r1, r2, r3, r4, p17: brct p1[-1] Loop DSAconversion
Loop Dependence Example
1: r3[-1] = load(r1[0])2: r4[-1] = r3[-1] * 263: store (r2[0], r4[-1])4: r1[-1] = r1[0] + 45: r2[-1] = r2[0] + 46: p1[-1] = cmpp (r1[-1] < r9)remap r1, r2, r3, r4, p17: brct p1[-1] Loop
In DSA form, there are no inter-iteration anti or output dependences!
3,0 1, 2,
1,
1, 0,0 1,1 1,1 1, 0,0 <delay, distance>
Class Problem
1: r1[-1] = load(r2[0])2: r3[-1] = r1[1] – r1[2]3: store (r3[-1], r2[0])4: r2[-1] = r2[0] + 45: p1[-1] = cmpp (r2[-1] < 100)remap r1, r2, r36: brct p1[-1] Loop^ Draw the dependence graph^ showing both intra and inter^ iteration dependences Latencies: ld = 2, st = 1, add = 1, cmpp = 1, br = 1
ResMII^ Concept: If there were no dependences between the operations, what^ is the the shortest possible schedule?^ Simple resource model^ A processor has a set of resources R. For each resource r in R^ there is count(r) specifying the number of identical copies
ResMII = MAX
(uses(r) / count(r)) for all r in R uses(r) = number of times the resource is used in 1 iteration In reality its more complex than this because operations can have multiple alternatives (different choices for resources it could be assigned to), but we will ignore this for now
RecMII Example^ 1: r3 = load(r1)2: r4 = r3 * 263: store (r2, r4)4: r1 = r1 + 45: r2 = r2 + 46: p1 = cmpp (r1 < r9)7: brct p1 Loop
3,0 1, 2,
1,1 1, 0,0 1,1 1,1 1, 0,0 <delay, distance>
RecMII = MAX(1,1,1,1) = 1 Then, MII = MAX(ResMII, RecMII) MII = MAX(2,1) = 2
Class Problem
1: r1[-1] = load(r2[0])2: r3[-1] = r1[1] – r1[2]3: store (r3[-1], r2[0])4: r2[-1] = r2[0] + 45: p1[-1] = cmpp (r2[-1] < 100)remap r1, r2, r36: brct p1[-1] Loop^ Calculate RecMII, ResMII, and MII Latencies: ld = 2, st = 1, add = 1, cmpp = 1, br = 1 Resources: 1 ALU, 1 MEM, 1 BR
Priority Function^ Height-based priority worked well for acyclic scheduling, makes sense^ that it will work for loops as well Acyclic:
Height(X) =
0, if X has no successors MAX
((Height(Y) + Delay(X,Y)), otherwise
for all Y = succ(X)
Cyclic:
HeightR(X) =
0, if X has no successors MAX
((HeightR(Y) + EffDelay(X,Y)), otherwise
for all Y = succ(X) EffDelay(X,Y) = Delay(X,Y) – II*Distance(X,Y)
Calculating Height
2,2 1,
Insert pseudo edges from all nodes to branch with latency = 0, distance = 0 (dotted edges)
2.^
Compute II, For this example assume II = 2
3.^
HeightR(4) =
4.^
HeightR(3) =
5.^
HeightR(2) =
6.^
HeightR(1)
2, 0,0 0,
0,