Control Flow Analysis: Loop Detection and Unrolling in EECS 483 at University of Michigan, Study notes of Electrical and Electronics Engineering

University of Michigan (UM) - Ann Arbor Electrical and Electronics Engineering

This document from the University of Michigan covers control flow analysis, specifically loop detection and unrolling, in the context of the EECS 483 course. It explains natural loops, backedges, loop detection, and loop unrolling, and includes examples and class problems to help students understand these concepts.

Typology: Study notes

Pre 2010

Uploaded on 09/17/2009

koofers-user-4vb-2 🇺🇸


Control Flow Analysis/Opti II

Loop Detection, Unrolling

EECS 483 – Lecture 17

University of Michigan

Wednesday, November 10, 2004


From Last Time: Natural Loops

• Cycle suitable for optimization
  » Discuss optimizations later
• 2 properties:
  » Single entry point called the header
    - Header dominates all blocks in the loop
  » Must be one way to iterate the loop (i.e., at least 1 path back to the header from within the loop), called a backedge
• Backedge detection
  » Edge x → y where the target (y) dominates the source (x)
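The backedge test above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names (`compute_dominators`, `find_backedges`) and the small CFG are assumptions made up for the example, and dominators are computed with the simple iterative set-intersection method.

```python
def compute_dominators(succ, entry):
    """Iterative dataflow: dom(n) = {n} | intersection of dom(p) over all preds p."""
    nodes = list(succ)
    preds = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def find_backedges(succ, dom):
    """An edge x -> y is a backedge when the target y dominates the source x."""
    return [(x, y) for x in succ for y in succ[x] if y in dom[x]]


# Hypothetical CFG (an assumption for illustration): BB2 and BB3 form a loop.
cfg = {
    "Entry": ["BB1"],
    "BB1": ["BB2"],
    "BB2": ["BB3", "Exit"],
    "BB3": ["BB2"],
    "Exit": [],
}

dom = compute_dominators(cfg, "Entry")
print(find_backedges(cfg, dom))
```

In this graph the only edge whose target dominates its source is BB3 → BB2, so BB2 is the header of the single natural loop.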

Loop Detection

• Identify all backedges using dominance info
• Each backedge (x → y) defines a loop
  » Loop header is the backedge target (y)
  » LoopBB: the basic blocks that comprise the loop
    - All predecessor blocks of x for which control can reach x without going through y are in the loop
• Merge loops with the same header
  » I.e., a loop with 2 continues
  » LoopBackedge = LoopBackedge1 + LoopBackedge2
  » LoopBB = LoopBB1 + LoopBB2
• Important property
  » Header dominates all LoopBB
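The LoopBB computation and the merge step can be sketched as a backward walk from the backedge source that stops at the header. The code below is a minimal sketch under stated assumptions: the helper names (`natural_loop`, `merge_by_header`) are invented for illustration, the backedges are taken as already identified, and the predecessor map describes a hypothetical six-block CFG rather than a graph taken from the slides.

```python
def natural_loop(preds, backedge):
    """Blocks that can reach x without passing through y, plus y itself."""
    x, y = backedge
    loop = {y, x}
    worklist = [x] if x != y else []
    while worklist:
        n = worklist.pop()
        for p in preds[n]:
            if p not in loop:  # the header y is already in the set, so the walk stops there
                loop.add(p)
                worklist.append(p)
    return loop


def merge_by_header(preds, backedges):
    """Union the backedges and LoopBB of all loops that share a header."""
    loops = {}
    for x, y in backedges:
        info = loops.setdefault(y, {"backedges": [], "blocks": set()})
        info["backedges"].append((x, y))
        info["blocks"] |= natural_loop(preds, (x, y))
    return loops


# Hypothetical CFG, given as a predecessor map (an assumption for illustration).
# Edges: E->1, 1->2, 2->3, 2->6, 3->4, 3->5, 4->3, 4->5, 5->3, 5->6, 6->2, 6->X
preds = {
    "E": [], 1: ["E"], 2: [1, 6], 3: [2, 4, 5],
    4: [3], 5: [3, 4], 6: [2, 5], "X": [6],
}
backedges = [(6, 2), (4, 3), (5, 3)]

for header, info in merge_by_header(preds, backedges).items():
    print(header, sorted(info["blocks"]), info["backedges"])
```

The two backedges into block 3 end up as one loop with two backedges, which is the "loop with 2 continues" case.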

Loop Detection Example

Loop detection: 3 steps:
• Identify backedges
• Compute LoopBB
• Merge loops with the same header

[Figure: control flow graph with Entry, BB1 through BB6, and Exit]

dom(1) = E,1
dom(2) = E,1,2
dom(3) = E,1,2,3
dom(4) = E,1,2,3,4
dom(5) = E,1,2,3,5
dom(6) = E,1,2,6

Loop1: defined by 6 → 2, LoopBB = 2,3,4,5,6
Loop2: defined by 4 → 3, LoopBB = 3,4
Loop3: defined by 5 → 3, LoopBB = 3,4,5

Merge loops 2,3: LoopBB = 3,4,5, Backedges = 4 → 3, 5 → 3

Important Parts of a Loop

• Header, LoopBB
• Backedges, BackedgeBB
• Exitedges, ExitBB
  » For each LoopBB, examine each outgoing edge
  » If the edge is to a BB not in LoopBB, then it's an exit
• Preheader (Preloop)
  » New block before the header (falls through to header)
  » Whenever you invoke the loop, preheader executed
  » Whenever you iterate the loop, preheader NOT executed
  » All edges entering header
    - Backedges: no change
    - All others: retarget to preheader
• Postheader (Postloop): analogous
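Exit-edge discovery and preheader insertion can be sketched on a successor map. This is a minimal sketch, assuming the loop's header and LoopBB are already known; the function names, the block name `Pre`, and the tiny CFG are hypothetical.

```python
def exit_edges(succ, loop_bb):
    """Edges that leave the loop: source in LoopBB, target outside LoopBB."""
    return [(b, t) for b in sorted(loop_bb) for t in succ[b] if t not in loop_bb]


def insert_preheader(succ, header, loop_bb, name):
    """Return a new CFG in which non-backedge entries to the header go through `name`."""
    new_succ = {}
    for b, targets in succ.items():
        if b in loop_bb:
            # Edges to the header from inside the loop are backedges: no change.
            new_succ[b] = list(targets)
        else:
            # Every other edge entering the header is retargeted to the preheader.
            new_succ[b] = [name if t == header else t for t in targets]
    new_succ[name] = [header]  # preheader falls through to the header
    return new_succ


# Hypothetical CFG (an assumption for illustration): loop {BB2, BB3}, header BB2.
cfg = {
    "Entry": ["BB1"],
    "BB1": ["BB2"],
    "BB2": ["BB3"],
    "BB3": ["BB2", "Exit"],
    "Exit": [],
}
loop_bb = {"BB2", "BB3"}

print(exit_edges(cfg, loop_bb))
print(insert_preheader(cfg, "BB2", loop_bb, "Pre"))
```

After the transformation, BB1 branches to `Pre`, which runs once per loop invocation, while the backedge BB3 → BB2 still bypasses it on every iteration.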

ExitBB/Preheader Example

Note: the preheader for the blue loop is contained in the yellow loop.

[Figure: the control flow graph before and after preheader insertion, with the blue and yellow loops highlighted]

Exit BB
  Blue loop: BB
  Yellow loop: Exit

Trip Count Calculation Example

Calculate the trip counts for all the loops in the graph.

[Figure: control flow graph with Entry, BB1 through BB6, and Exit, annotated with profile weights]

Blue loop:
  w(header) = w(BB3) = 2000
  w(preheader) = w(BB2) = 60 (why not 100???)
  avg trip count = 2000/60 = 33.3

Yellow loop:
  w(header) = w(BB2) = 100
  w(preheader) = w(BB1) = 20
  avg trip count = 100/20 = 5
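The calculation used for both loops is the header weight divided by the preheader weight. A minimal sketch, assuming the two weights come from profile data; the function name `average_trip_count` is made up for illustration.

```python
def average_trip_count(header_weight, preheader_weight):
    """Average iterations per invocation: w(header) / w(preheader)."""
    if preheader_weight == 0:
        raise ValueError("loop was never invoked in the profile")
    return header_weight / preheader_weight


# Weights taken from the example's two divisions.
print(average_trip_count(2000, 60))  # blue loop
print(average_trip_count(100, 20))   # yellow loop
```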

Loop Induction Variables

• Induction variables are variables such that every time they change value, they are incremented/decremented by some constant
• Basic induction variable – induction variable whose only assignments within a loop are of the form j = j +/- C, where C is a constant
• Primary induction variable – basic induction variable that controls the loop execution. In for (i=0; i<100; i++), i (the virtual register holding i) is the primary induction variable
• Derived induction variable – variable that is a linear function of a basic induction variable
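The three kinds of induction variable can be seen side by side in a small loop. The loop below is a made-up illustration (the variable names and constants are assumptions): `i` is the primary induction variable, `j` is another basic induction variable, and `k` is derived from `i`.

```python
def run_loop():
    """Record (i, j, k) at the top of every iteration."""
    trace = []
    i = 0   # primary induction variable: controls the loop, i = i + 1
    j = 3   # basic induction variable: only assignment is j = j - 2
    while i < 100:
        k = 4 * i + 7   # derived induction variable: linear function of i
        trace.append((i, j, k))
        j = j - 2
        i = i + 1
    return trace


trace = run_loop()
# Per-iteration change of each variable: constant for all three.
steps = {(b[0] - a[0], b[1] - a[1], b[2] - a[2]) for a, b in zip(trace, trace[1:])}
print(steps)
```

Because `k` is linear in `i`, its value also moves by a fixed constant (4 times the step of `i`) on each iteration.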

Reducible Flow Graphs

• A flow graph is reducible if and only if we can partition the edges into 2 disjoint groups, often called forward and back edges, with the following properties
  » The forward edges form an acyclic graph in which every node can be reached from the Entry
  » The back edges consist only of edges whose destinations dominate their sources
• More simply: take a CFG, remove all the backedges (x → y where y dominates x), and you should have a connected, acyclic graph
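The "remove the backedges, then check for a connected acyclic graph" test can be sketched directly. This is a minimal sketch, assuming every block in the CFG is reachable from the entry; the function names and both example graphs are hypothetical, and dominators are computed with a simple iterative method.

```python
def dominators(succ, entry):
    """Iterative dominator sets: dom(n) = {n} | intersection of dom(preds)."""
    preds = {n: [] for n in succ}
    for n in succ:
        for t in succ[n]:
            preds[t].append(n)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry:
                continue
            ps = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*ps) if ps else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def is_reducible(succ, entry):
    dom = dominators(succ, entry)
    # Drop every backedge x -> y (y dominates x); what remains are forward edges.
    forward = {n: [t for t in succ[n] if t not in dom[n]] for n in succ}

    # Every node must still be reachable from the entry.
    seen, stack = set(), [entry]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(forward[n])
    if seen != set(succ):
        return False

    # The forward edges must be acyclic (Kahn's topological sort).
    indeg = {n: 0 for n in succ}
    for n in succ:
        for t in forward[n]:
            indeg[t] += 1
    ready = [n for n in succ if indeg[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for t in forward[n]:
            indeg[t] -= 1
            if indeg[t] == 0:
                ready.append(t)
    return removed == len(succ)


# Hypothetical graphs: a natural loop, and a two-entry cycle between L2 and L3.
natural = {"Entry": ["A"], "A": ["B"], "B": ["A", "Exit"], "Exit": []}
two_entry = {"Entry": ["L1"], "L1": ["L2", "L3"],
             "L2": ["L3", "Exit"], "L3": ["L2", "Exit"], "Exit": []}

print(is_reducible(natural, "Entry"), is_reducible(two_entry, "Entry"))
```

In the second graph neither L2 nor L3 dominates the other, so no edge qualifies as a backedge and the cycle between them survives the removal step.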

Irreducible Flow Graph Example

• In C/C++, it's not possible to create an irreducible flow graph without using gotos
• Cyclic graphs that are NOT natural loops cannot be optimized by the compiler

    L1: x = x + 1
        if (x) {
    L2:     y = y + 1
            if (y > 10) goto L3
        } else {
    L3:     z = z + 1
            if (z > 0) goto L2
        }

[Figure: flow graph of the three basic blocks, labeled "Non-reducible!"]

Loop Unroll – Type 1

Counted loop, all parms known

Original loop:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          blt r2 100 Loop

r2 is the loop variable. Increment is 1, initial value is 0, final value is 100, trip count is 100.

Remove branch from first N-1 iterations:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 1
          blt r2 100 Loop

Remove r2 increments from first N-1 iterations and update last increment:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r1 = MEM[r2 + 1]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + 2
          blt r2 100 Loop
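The effect of Type 1 unrolling can be checked by simulating both versions and counting executed branches. This is a rough Python model, not compiler output: register names follow the slide, but the results are appended to a list instead of being stored to MEM[r3 + 0], which is an assumption made so the two versions can be compared.

```python
TRIP_COUNT = 100  # known at compile time, as Type 1 requires


def rolled(mem, r5):
    """Original loop: one body and one branch per iteration."""
    out, branches, r2 = [], 0, 0
    while True:
        out.append((mem[r2 + 0] * r5) << 2)
        r2 = r2 + 1
        branches += 1              # blt r2 100 Loop
        if not r2 < TRIP_COUNT:
            break
    return out, branches


def unrolled_by_2(mem, r5):
    """Two bodies per pass; intermediate increment and branch removed."""
    out, branches, r2 = [], 0, 0
    while True:
        out.append((mem[r2 + 0] * r5) << 2)
        out.append((mem[r2 + 1] * r5) << 2)  # increment folded into the offset
        r2 = r2 + 2                          # last increment updated
        branches += 1                        # blt r2 100 Loop
        if not r2 < TRIP_COUNT:
            break
    return out, branches


mem = list(range(TRIP_COUNT))
print(rolled(mem, 3)[1], unrolled_by_2(mem, 3)[1])
```

Both versions produce the same values, and the unrolled version executes half as many branches. This only works without a remainder loop because the unroll factor divides the known trip count.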

Loop Unroll – Type 2

Counted loop, some parms unknown

Original loop:

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + X
          blt r2 Y Loop

r2 is the loop variable. Increment is ?, initial value is ?, final value is ?, trip count is ?

Unrolled code with remainder loop:

          tc = final – initial
          tc = tc / increment
          rem = tc % N
          fin = rem * increment

    RemLoop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + X
          blt r2 fin RemLoop

    Loop: r1 = MEM[r2 + 0]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r1 = MEM[r2 + X]
          r4 = r1 * r5
          r6 = r4 << 2
          MEM[r3 + 0] = r6
          r2 = r2 + (N*X)
          blt r2 Y Loop

Remainder loop executes the "leftover" iterations.

Unrolled loop is the same as Type 1, and is guaranteed to execute a multiple of N times.
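The remainder-loop scheme can be modeled in Python and checked against the original loop. This is a sketch under stated assumptions, and it deviates from the slide in three labeled ways: the loops test at the top so that zero-iteration cases are safe, the trip count is rounded up when the range is not a multiple of the increment, and `fin` is offset by the initial value so the loop does not have to start at 0. Results are appended to a list rather than stored to MEM[r3 + 0].

```python
def rolled(mem, r5, initial, final, x):
    """Reference loop: r2 runs from initial up to (not including) final in steps of x."""
    out, r2 = [], initial
    while r2 < final:
        out.append((mem[r2] * r5) << 2)
        r2 += x
    return out


def unrolled_type2(mem, r5, initial, final, x, n):
    """Unroll by n with a remainder loop; assumes x > 0 and n >= 1."""
    out = []
    tc = max(0, -(-(final - initial) // x))  # trip count, rounded up (assumption)
    rem = tc % n                             # leftover iterations
    fin = initial + rem * x                  # offset by initial (assumption)

    r2 = initial
    while r2 < fin:                          # RemLoop: runs rem times
        out.append((mem[r2] * r5) << 2)
        r2 += x

    while r2 < final:                        # Loop: remaining count is a multiple of n
        for k in range(n):                   # stands in for n written-out body copies
            out.append((mem[r2 + k * x] * r5) << 2)
        r2 += n * x                          # single combined increment
    return out


mem = list(range(64))
print(unrolled_type2(mem, 3, 0, 7, 3, 2) == rolled(mem, 3, 0, 7, 3))
```

Running the remainder loop first means the main loop never needs an exit test between body copies, which is what lets it keep the Type 1 shape.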

Loop Unroll Summary

• Goals
  » Reduce number of executed branches inside loop
    - Note: Type1/Type2 only
  » Enable the overlapped execution of multiple iterations
    - Reorder instructions between iterations
  » Enable dataflow optimization across iterations
• Type 1 is the most effective
  » All intermediate branches removed, least code expansion
  » Only applicable to a small fraction of loops

Loop Unroll Summary (2)

• Type 2 is almost as effective
  » All intermediate branches removed
  » Remainder loop is required since trip count not known at compile time
  » Need to make sure not much time is spent in the remainder loop
• Type 3 can be effective
  » No branches eliminated
  » But iteration overlap still possible
  » Always applicable (most loops fall into this category!)
  » Use average trip count to guide unroll amount