Control Flow Analysis: Loop Detection and Unrolling in EECS 483 at University of Michigan, Study notes of Electrical and Electronics Engineering

This document from the University of Michigan covers control flow analysis, specifically loop detection and loop unrolling, as taught in EECS 483. It explains natural loops, backedges, loop detection, and loop unrolling, and includes examples and class problems to help students understand these concepts.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009

Control Flow Analysis/Opti II
Loop Detection, Unrolling
EECS 483 – Lecture 17
University of Michigan
Wednesday, November 10, 2004

From Last Time: Natural Loops

• Cycle suitable for optimization
  » Discuss optimization later
• 2 properties:
  » Single entry point called the header
    - Header dominates all blocks in the loop
  » Must be one way to iterate the loop (i.e., at least 1 path back to the header from within the loop), called a backedge
• Backedge detection
  » Edge x → y where the target (y) dominates the source (x)
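The backedge test above can be sketched in Python. This is a minimal illustration, not the course's implementation: the function names and the six-block CFG are assumptions made up for the example, and dominators are computed with the standard iterative set formulation.

```python
def compute_dominators(succ, entry):
    """Iterative dominators: dom(n) = {n} | intersection of dom(p) over preds p."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}  # start from "everything dominates n"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def find_backedges(succ, dom):
    """An edge x -> y is a backedge when the target y dominates the source x."""
    return sorted((x, y) for x in succ for y in succ[x] if y in dom[x])


# Hypothetical CFG (edges are an assumption for illustration only).
cfg = {
    "E": ["1"],
    "1": ["2"],
    "2": ["3", "6"],
    "3": ["4", "5"],
    "4": ["3", "5"],
    "5": ["3", "6"],
    "6": ["2", "X"],
    "X": [],
}
dom = compute_dominators(cfg, "E")
backedges = find_backedges(cfg, dom)
print(backedges)  # [('4', '3'), ('5', '3'), ('6', '2')]
```

Every block must appear as a key of `succ`, even if it has no successors, so that the predecessor map can be built in one pass.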


Loop Detection

• Identify all backedges using dominance info
• Each backedge (x → y) defines a loop
  » Loop header is the backedge target (y)
  » LoopBB: the basic blocks that comprise the loop
    - All predecessor blocks of x for which control can reach x without going through y are in the loop
• Merge loops with the same header
  » I.e., a loop with 2 continues
  » LoopBackedge = LoopBackedge1 + LoopBackedge2
  » LoopBB = LoopBB1 + LoopBB2
• Important property
  » Header dominates all LoopBB
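The LoopBB computation and the merge step can be sketched as a backward walk from the backedge source that stops at the header. The code below is a hedged illustration: the CFG, the list of backedges, and the function names are all hypothetical, and the backedges are taken as given rather than derived from dominance.

```python
from collections import defaultdict


def natural_loop(preds, backedge):
    """Blocks of the loop defined by backedge x -> y: y, x, and every block
    that can reach x without passing through y."""
    x, y = backedge
    loop = {y, x}
    worklist = [x] if x != y else []  # a self-loop contains only its header
    while worklist:
        n = worklist.pop()
        for p in preds[n]:
            if p not in loop:  # the header is already in the set, so the walk stops there
                loop.add(p)
                worklist.append(p)
    return loop


def merge_by_header(preds, backedges):
    """Union the backedges and blocks of all loops that share a header."""
    loops = defaultdict(lambda: {"backedges": [], "blocks": set()})
    for x, y in backedges:
        loops[y]["backedges"].append((x, y))
        loops[y]["blocks"] |= natural_loop(preds, (x, y))
    return dict(loops)


# Hypothetical CFG and backedges (assumptions for illustration only).
succ = {
    "E": ["1"], "1": ["2"], "2": ["3", "6"], "3": ["4", "5"],
    "4": ["3", "5"], "5": ["3", "6"], "6": ["2", "X"], "X": [],
}
preds = {n: set() for n in succ}
for n, targets in succ.items():
    for t in targets:
        preds[t].add(n)

loops = merge_by_header(preds, [("6", "2"), ("4", "3"), ("5", "3")])
for header in sorted(loops):
    print(header, sorted(loops[header]["blocks"]), loops[header]["backedges"])
```

Keying the result by header makes the merge a plain set union, which matches the "LoopBB = LoopBB1 + LoopBB2" rule.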


Loop Detection Example

Loop detection: 3 steps:
  1. Identify backedges
  2. Compute LoopBB
  3. Merge loops with the same header

[Figure: control flow graph with Entry, BB1 through BB6, and Exit]

dom(1) = E,1
dom(2) = E,1,2
dom(3) = E,1,2,3
dom(4) = E,1,2,3,4
dom(5) = E,1,2,3,5
dom(6) = E,1,2,6

Loop1: defined by 6 → 2
  LoopBB = 2,3,4,5,6

Loop2: defined by 4 → 3
  LoopBB = 3,4

Loop3: defined by 5 → 3
  LoopBB = 3,4,5

Merge loops 2,3
  LoopBB = 3,4,5
  Backedges = 4 → 3, 5 → 3


Important Parts of a Loop

• Header, LoopBB
• Backedges, BackedgeBB
• Exitedges, ExitBB
  » For each LoopBB, examine each outgoing edge
  » If the edge is to a BB not in LoopBB, then it's an exit
• Preheader (Preloop)
  » New block before the header (falls through to header)
  » Whenever you invoke the loop, the preheader is executed
  » Whenever you iterate the loop, the preheader is NOT executed
  » All edges entering header
    - Backedges: no change; all others: retarget to preheader
• Postheader (Postloop): analogous
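Exit-edge discovery and preheader insertion can be sketched as follows. This is an illustrative sketch under assumptions: the CFG, the block name `Pre`, and both function names are hypothetical. It relies on the property that any edge into the header from inside LoopBB is a backedge, so only edges from outside the loop are retargeted.

```python
def exit_edges(succ, loop_bb):
    """Edges leaving the loop: source in LoopBB, target not in LoopBB."""
    return sorted((b, t) for b in loop_bb for t in succ[b] if t not in loop_bb)


def insert_preheader(succ, header, loop_bb, name):
    """Return a new successor map with a preheader in front of the header.
    Backedges keep their target; every other edge into the header is retargeted."""
    new_succ = {}
    for b, targets in succ.items():
        if b in loop_bb:
            new_succ[b] = list(targets)  # edges from inside the loop: no change
        else:
            new_succ[b] = [name if t == header else t for t in targets]
    new_succ[name] = [header]  # preheader falls through to the header
    return new_succ


# Hypothetical CFG with an inner loop {3, 4, 5} headed by block 3.
succ = {
    "E": ["1"], "1": ["2"], "2": ["3", "6"], "3": ["4", "5"],
    "4": ["3", "5"], "5": ["3", "6"], "6": ["2", "X"], "X": [],
}
inner = {"3", "4", "5"}
exits = exit_edges(succ, inner)
with_pre = insert_preheader(succ, "3", inner, "Pre")
print(exits)          # [('5', '6')]
print(with_pre["2"])  # ['Pre', '6']
```

Returning a new map instead of mutating `succ` keeps the original graph available for comparison.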


ExitBB/Preheader Example

Note: the preheader for the blue loop is contained in the yellow loop.

[Figure: the example CFG (Entry, BB1 through BB6, Exit) shown before and after inserting the preheader blocks Pre1 and Pre2]

Exit BB

Blue loop: BB

Yellow loop: Exit


Trip Count Calculation Example

Calculate the trip counts for all the loops in the graph.

[Figure: the example CFG (Entry, BB1 through BB6, Exit) annotated with profile weights]

Blue loop:
  w(header) = w(BB3) = 2000
  w(preheader) = w(BB2) = 60 (why not 100???)
  avg trip count = 2000/60 ≈ 33.3

Yellow loop:
  w(header) = w(BB2) = 100
  w(preheader) = w(BB1) = 20
  avg trip count = 100/20 = 5
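The calculation in this example divides the header weight by the preheader weight. A minimal sketch of that ratio follows; the function name is hypothetical, and returning 0 for a loop that was never invoked is an assumption, not something the slides specify.

```python
def average_trip_count(header_weight, preheader_weight):
    """Average iterations per invocation: w(header) / w(preheader)."""
    if preheader_weight == 0:
        return 0.0  # assumption: a loop never entered in the profile has trip count 0
    return header_weight / preheader_weight


blue = average_trip_count(2000, 60)   # figures from the blue loop above
yellow = average_trip_count(100, 20)  # figures from the yellow loop above
print(round(blue, 1), yellow)  # 33.3 5.0
```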


Loop Induction Variables

• Induction variables are variables such that every time they change value, they are incremented/decremented by some constant
• Basic induction variable: an induction variable whose only assignments within a loop are of the form j = j +/- C, where C is a constant
• Primary induction variable: a basic induction variable that controls the loop execution. In for (i=0; i<100; i++), i (the virtual register holding i) is the primary induction variable
• Derived induction variable: a variable that is a linear function of a basic induction variable
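The definition of a basic induction variable can be checked mechanically on a simplified loop body. The sketch below assumes a made-up tuple encoding `(dest, src, op, operand)` for `dest = src op operand`; it illustrates the definition only and is not how a production compiler represents code.

```python
def basic_induction_variables(body):
    """Return variables whose every assignment in the loop is v = v +/- C,
    with C an integer constant."""
    is_basic = {}
    for dest, src, op, operand in body:
        ok = dest == src and op in ("+", "-") and isinstance(operand, int)
        is_basic[dest] = is_basic.get(dest, True) and ok
    return {v for v, ok in is_basic.items() if ok}


# Hypothetical body of: for (i = 0; i < 100; i++) { j = i * 4; total = total + j; }
body = [
    ("j", "i", "*", 4),              # j is a linear function of i: derived, not basic
    ("total", "total", "+", "j"),    # increment is not a constant
    ("i", "i", "+", 1),              # i = i + 1: basic (and primary here)
]
print(basic_induction_variables(body))  # {'i'}
```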


Reducible Flow Graphs

• A flow graph is reducible if and only if we can partition the edges into 2 disjoint groups, often called forward and back edges, with the following properties
  » The forward edges form an acyclic graph in which every node can be reached from the Entry
  » The back edges consist only of edges whose destinations dominate their sources
• More simply: take a CFG, remove all the backedges (x → y where y dominates x), and you should have a connected, acyclic graph
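The "remove the backedges and check what is left" test can be sketched directly. The code below is an illustration under assumptions: function names and both example graphs are hypothetical, dominators use the iterative set formulation, and acyclicity is checked with a topological peel.

```python
def dominators(succ, entry):
    """Iterative dominators: dom(n) = {n} | intersection of dom(p) over preds p."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pd = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pd) if pd else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def is_reducible(succ, entry):
    dom = dominators(succ, entry)
    # Forward edges: everything except backedges (target dominates source).
    forward = {n: [t for t in succ[n] if t not in dom[n]] for n in succ}
    # Every node must be reachable from the entry along forward edges.
    seen, stack = {entry}, [entry]
    while stack:
        for t in forward[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    if seen != set(succ):
        return False
    # Forward edges must be acyclic: repeatedly peel nodes with no incoming edge.
    indeg = {n: 0 for n in succ}
    for n in succ:
        for t in forward[n]:
            indeg[t] += 1
    ready = [n for n in succ if indeg[n] == 0]
    peeled = 0
    while ready:
        n = ready.pop()
        peeled += 1
        for t in forward[n]:
            indeg[t] -= 1
            if indeg[t] == 0:
                ready.append(t)
    return peeled == len(succ)


# Hypothetical graphs: a simple natural loop, and a two-entry cycle.
natural = {"entry": ["h"], "h": ["b"], "b": ["h", "exit"], "exit": []}
two_entry = {"bb1": ["bb2", "bb3"], "bb2": ["bb3"], "bb3": ["bb2"]}
print(is_reducible(natural, "entry"), is_reducible(two_entry, "bb1"))  # True False
```

In the second graph neither block of the cycle dominates the other, so no edge qualifies as a backedge and the cycle survives in the forward edges.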


Irreducible Flow Graph Example

  • In C/C++, it's not possible to create an irreducible flow graph without using gotos
  • Cyclic graphs that are NOT natural loops cannot be optimized by the compiler

    L1: x = x + 1
        if (x) {
    L2:     y = y + 1
            if (y > 10) goto L3
        } else {
    L3:     z = z + 1
            if (z > 0) goto L2
        }

[Figure: the corresponding three-block CFG, marked "Non-reducible!"]


Loop Unroll – Type 1

Counted loop, all parms known. r2 is the loop variable:
  Increment is 1
  Initial value is 0
  Final value is 100
  Trip count is 100

Original loop:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          blt r2 100 Loop

Step 1: replicate the body N times (N = 2 here) and remove the branch from the first N-1 iterations:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          blt r2 100 Loop

Step 2: remove the r2 increments from the first N-1 iterations and update the last increment:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r1 = MEM[r2 + 1]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 2
          blt r2 100 Loop
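To see that the Type 1 transformation preserves behavior, the loop can be modeled in Python. This is a sketch under assumptions: the function names are hypothetical, and each store to MEM[r3 + 0] is modeled as appending to a list so that every stored value can be compared.

```python
def original_loop(mem, r5, final=100):
    stored = []
    r2 = 0
    while True:                      # bottom-tested, like the blt at the loop end
        r1 = mem[r2 + 0]
        r4 = r1 * r5
        r6 = r4 << 2
        stored.append(r6)            # stands in for MEM[r3 + 0] = r6
        r2 = r2 + 1
        if not r2 < final:
            break
    return stored


def unrolled_by_2(mem, r5, final=100):
    assert final % 2 == 0, "Type 1 needs a trip count known to be a multiple of N"
    stored = []
    r2 = 0
    while True:
        r1 = mem[r2 + 0]
        stored.append((r1 * r5) << 2)
        r1 = mem[r2 + 1]             # second copy: offset adjusted, no increment
        stored.append((r1 * r5) << 2)
        r2 = r2 + 2                  # single combined increment
        if not r2 < final:           # single remaining branch
            break
    return stored


mem = list(range(100))
print(original_loop(mem, 3) == unrolled_by_2(mem, 3))  # True
```

The unrolled version executes 50 branches instead of 100 for the same 100 stores.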


Loop Unroll – Type 2

Counted loop, some parms unknown. r2 is the loop variable:
  Increment is ?
  Initial value is ?
  Final value is ?
  Trip count is ?

Original loop:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + X
          blt r2 Y Loop

Unrolled loop with remainder loop:

          tc = final - initial
          tc = tc / increment
          rem = tc % N
          fin = rem * increment
    RemLoop:
          r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + X
          blt r2 fin RemLoop
    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r1 = MEM[r2 + X]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + (N*X)
          blt r2 Y Loop

The remainder loop executes the "leftover" iterations. The unrolled loop is the same as Type 1, and is guaranteed to execute a multiple of N times.
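The remainder-loop scheme can be modeled in Python. This is a hedged sketch, not the slide's exact code: it assumes a positive increment and final > initial, computes the trip count with ceiling division, offsets `fin` by the initial value, and uses top-tested loops so that a remainder of zero runs no leftover iterations. The function names are hypothetical, and the inner `for` stands in for N textual copies of the body.

```python
def unrolled_with_remainder(mem, r5, initial, final, increment, n):
    stored = []

    def body(addr):
        stored.append((mem[addr] * r5) << 2)  # models the load, multiply, shift, store

    tc = (final - initial + increment - 1) // increment  # trip count (ceiling)
    rem = tc % n                                         # leftover iterations
    fin = initial + rem * increment                      # assumption: offset by initial

    r2 = initial
    while r2 < fin:                  # remainder loop: runs rem times
        body(r2)
        r2 += increment
    while r2 < final:                # unrolled loop: a multiple of n iterations remain
        for k in range(n):           # stands in for n copies of the body
            body(r2 + k * increment)
        r2 += n * increment
    return stored


def reference(mem, r5, initial, final, increment):
    return [(mem[a] * r5) << 2 for a in range(initial, final, increment)]


mem = list(range(200))
print(unrolled_with_remainder(mem, 3, 4, 34, 3, 4) == reference(mem, 3, 4, 34, 3))  # True
```

Running the remainder loop first means the unrolled loop never needs an exit test in the middle of its body.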


Loop Unroll Summary

• Goals
  » Reduce number of executed branches inside loop
    - Note: Type1/Type2 only
  » Enable the overlapped execution of multiple iterations
    - Reorder instructions between iterations
  » Enable dataflow optimization across iterations
• Type 1 is the most effective
  » All intermediate branches removed, least code expansion
  » Only applicable to a small fraction of loops


Loop Unroll Summary (2)

• Type 2 is almost as effective
  » All intermediate branches removed
  » Remainder loop is required since trip count not known at compile time
  » Need to make sure we don't spend much time in the remainder loop
• Type 3 can be effective
  » No branches eliminated
  » But iteration overlap still possible
  » Always applicable (most loops fall into this category!)
  » Use average trip count to guide unroll amount
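Type 3 is not defined in this excerpt. Reading the summary above, a plausible interpretation is that the body is replicated while every exit test is kept, which works even when the trip count is unknown. The sketch below illustrates that assumption on a hypothetical sentinel-terminated scan; it is not taken from the slides.

```python
def count_until_zero(values):
    """Original loop: trip count depends on the data, so it is not a counted loop."""
    i = 0
    while values[i] != 0:
        i += 1
    return i


def count_until_zero_unrolled(values):
    """Unrolled by 2 with both exit tests kept: no branch is eliminated, but
    two iterations now sit in one loop body and can be scheduled together."""
    i = 0
    while True:
        if values[i] == 0:   # exit test of the first copy
            break
        i += 1
        if values[i] == 0:   # exit test of the second copy, still present
            break
        i += 1
    return i


data = [5, 3, 8, 0, 1]
print(count_until_zero(data), count_until_zero_unrolled(data))  # 3 3
```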