Data Flow Analysis: Forward and Backward, Lattices, Fixpoints, and Elimination - Prof. Jef, Study notes of Computer Science

These notes cover the concepts of data flow analysis, including forward and backward analysis, lattices, fixpoints, and elimination methods. Topics include monotonic functions, meets, and the height of function lattices, as well as elimination methods for conditionals and loops and the challenges of analyzing functions and the heap.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009


CMSC 631, Fall 2004 34

Forward Data Flow, Again

  • Out(s) = Top for all statements s
  • W := { all statements } (worklist)
  • repeat

■ Take s from W

■ temp := $f_s(\sqcap_{s' \in pred(s)} Out(s'))$ ($f_s$ monotonic transfer fn)

■ if (temp != Out(s)) {

  • Out(s) := temp
  • W := W ∪ succ(s)

■ }

  • until W = ∅
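The worklist loop above can be sketched in Python. This is a minimal illustration, not the course's implementation: the five-statement program, the definition labels `d1`, `d2`, `d4`, and the helper names `forward_dataflow` and `gen_kill` are all assumptions made for the example. It instantiates the algorithm for reaching definitions, where Top is the empty set and meet is union.

```python
from collections import deque

def forward_dataflow(stmts, pred, succ, transfer, meet, top):
    """Worklist solver following the slide: every Out(s) starts at Top."""
    out = {s: top for s in stmts}
    work = deque(stmts)
    while work:
        s = work.popleft()
        # Meet of Out over all predecessors; with no predecessors this is Top.
        in_s = top
        for p in pred[s]:
            in_s = meet(in_s, out[p])
        temp = transfer[s](in_s)
        if temp != out[s]:
            out[s] = temp
            # Successors may now be stale, so put them back on the worklist.
            for t in succ[s]:
                if t not in work:
                    work.append(t)
    return out

def gen_kill(gen, kill):
    """Transfer function f(x) = gen | (x - kill) (hypothetical helper)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

# Hypothetical program:
#   1: x = 1 (d1)   2: y = 2 (d2)   3: while (...)   4: x = x + y (d4)   5: exit
stmts = [1, 2, 3, 4, 5]
succ = {1: [2], 2: [3], 3: [4, 5], 4: [3], 5: []}
pred = {1: [], 2: [1], 3: [2, 4], 4: [3], 5: [3]}
transfer = {
    1: gen_kill({"d1"}, {"d4"}),
    2: gen_kill({"d2"}, set()),
    3: gen_kill(set(), set()),      # loop test defines nothing
    4: gen_kill({"d4"}, {"d1"}),
    5: gen_kill(set(), set()),
}

out = forward_dataflow(stmts, pred, succ, transfer, frozenset.union, frozenset())
print(sorted(out[5]))  # ['d1', 'd2', 'd4']
```

Both definitions of x reach the exit because the loop body may run zero or more times. For an analysis whose Top is not the right value at program entry (for example available expressions), the entry statement would need a transfer function that supplies the boundary value; that detail is left out of this sketch.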

Lattices (P, ≤)

• Available expressions

■ P = sets of expressions

■ S1 ⊓ S2 = S1 ∩ S2

■ Top = set of all expressions

• Reaching definitions

■ P = sets of definitions (assignment statements)

■ S1 ⊓ S2 = S1 ∪ S2

■ Top = empty set

Fixpoints

• We always start with Top

■ Every expression is available, no defns reach this point

■ Most optimistic assumption

■ Strongest possible hypothesis

  • = true of fewest number of states

• Revise as we encounter contradictions

■ Always move down in the lattice (with meet)

• Result: A greatest fixpoint

Lattices (P, ≤), cont’d

• Live variables

■ P = sets of variables

■ S1 ⊓ S2 = S1 ∪ S2

■ Top = empty set

• Very busy expressions

■ P = sets of expressions

■ S1 ⊓ S2 = S1 ∩ S2

■ Top = set of all expressions
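The four lattices from these slides can be written down side by side. The sketch below is only illustrative: the universe `EXPRS` and the names `ANALYSES` and `leq` are assumptions for the example. It records each analysis's meet and Top and recovers the order ≤ from the meet, using x ≤ y iff x ⊓ y = x.

```python
# Hypothetical universe of expressions, used only for this example.
EXPRS = frozenset({"a+b", "a*b", "b-c"})

def meet_intersect(s1, s2):
    return s1 & s2

def meet_union(s1, s2):
    return s1 | s2

# analysis name -> (meet, Top), as listed on the slides
ANALYSES = {
    "available expressions": (meet_intersect, EXPRS),
    "reaching definitions": (meet_union, frozenset()),
    "live variables": (meet_union, frozenset()),
    "very busy expressions": (meet_intersect, EXPRS),
}

def leq(meet, x, y):
    """Order induced by the meet: x <= y iff x meet y == x."""
    return meet(x, y) == x

for name, (meet, top) in ANALYSES.items():
    x = frozenset({"a+b"})
    # Top is the identity for meet, and every element lies below Top.
    print(name, meet(top, x) == x, leq(meet, x, top))
```

Note that when meet is union the order is reversed set inclusion, which is why the empty set can be Top.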

Forward vs. Backward

• Forward

Out(s) = Top for all s
W := { all statements }
repeat
  Take s from W
  temp := $f_s(\sqcap_{s' \in pred(s)} Out(s'))$
  if (temp != Out(s)) {
    Out(s) := temp
    W := W ∪ succ(s)
  }
until W = ∅

• Backward

In(s) = Top for all s
W := { all statements }
repeat
  Take s from W
  temp := $f_s(\sqcap_{s' \in succ(s)} In(s'))$
  if (temp != In(s)) {
    In(s) := temp
    W := W ∪ pred(s)
  }
until W = ∅
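As a sketch of the backward variant, the code below runs the same worklist scheme with the roles of pred and succ swapped, instantiated for live variables (Top is the empty set, meet is union). The five-statement program and the helper names `backward_dataflow` and `use_def` are assumptions for the example.

```python
from collections import deque

def backward_dataflow(stmts, pred, succ, transfer, meet, top):
    """Backward worklist solver: In(s) = f_s(meet of In over successors)."""
    in_ = {s: top for s in stmts}
    work = deque(stmts)
    while work:
        s = work.popleft()
        out_s = top
        for t in succ[s]:
            out_s = meet(out_s, in_[t])
        temp = transfer[s](out_s)
        if temp != in_[s]:
            in_[s] = temp
            # A change at s can only affect its predecessors.
            for p in pred[s]:
                if p not in work:
                    work.append(p)
    return in_

def use_def(use, define):
    """Liveness transfer function f(x) = use | (x - def) (hypothetical helper)."""
    use, define = frozenset(use), frozenset(define)
    return lambda x: use | (x - define)

# Hypothetical program:
#   1: a = input()   2: if a > 0   3: b = a   4: b = 0 (else)   5: print(b)
stmts = [1, 2, 3, 4, 5]
succ = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}
pred = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}
transfer = {
    1: use_def(set(), {"a"}),
    2: use_def({"a"}, set()),
    3: use_def({"a"}, {"b"}),
    4: use_def(set(), {"b"}),
    5: use_def({"b"}, set()),
}

live_in = backward_dataflow(stmts, pred, succ, transfer, frozenset.union, frozenset())
print({s: sorted(v) for s, v in live_in.items()})
```

Variable a is live on entry to statements 2 and 3, b is live only on entry to statement 5, and nothing is live before statement 1.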

Termination Revisited

• How many times can we apply this step:

temp := $f_s(\sqcap_{s' \in pred(s)} Out(s'))$

if (temp != Out(s)) { ... }

■ Claim: Out(s) only shrinks

  • Proof: Out(s) starts out as Top
    • So temp must be ≤ Top after first step
  • Assume Out(s') shrinks for all predecessors s' of s
  • Then $\sqcap_{s' \in pred(s)} Out(s')$ shrinks
  • Since $f_s$ is monotonic, $f_s(\sqcap_{s' \in pred(s)} Out(s'))$ shrinks

Termination Revisited (cont’d)

• A descending chain in a lattice is a sequence

■ x0 > x1 > x2 > ...

• The height of a lattice is the length of the longest descending chain in the lattice

• Then, dataflow must terminate in O(nk) time

■ n = # of statements in program

■ k = height of lattice

■ assumes meet operation takes O(1) time
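One way to see where the nk factor comes from is to count updates. The sketch below introduces the notation $c_s$ for the number of times Out(s) changes; it is not from the slides, and the exact constant depends on the convention used for measuring chain length.

```latex
% c_s = number of times Out(s) changes (notation introduced for this sketch).
% Each change moves Out(s) strictly down the lattice, so the successive
% values of Out(s) form a descending chain, whose length is bounded by k.
\[
  c_s \le k \quad \text{for every statement } s,
  \qquad\text{so}\qquad
  \sum_{s} c_s \;\le\; n \cdot k .
\]
% Statements are only re-added to the worklist when some Out(s) changes,
% so the number of such changes bounds how long the loop can keep running.
```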

Least vs. Greatest Fixpoints

• Dataflow tradition: Start with Top, use meet

■ To do this, we need a meet semilattice with top

■ meet semilattice = meets defined for any set

■ Computes greatest fixpoint

• Denotational semantics tradition: Start with Bottom, use join

■ Computes least fixpoint

Distributive Data Flow Problems

• By monotonicity, we also have

$f(x \sqcap y) \le f(x) \sqcap f(y)$

• A function f is distributive if

$f(x \sqcap y) = f(x) \sqcap f(y)$

Benefit of Distributivity

• Joins lose no information

[Figure: flow graph with nodes f, g, h, and k]

$k(h(f(\top) \sqcap g(\top))) = k(h(f(\top)) \sqcap h(g(\top))) = k(h(f(\top))) \sqcap k(h(g(\top)))$

Accuracy of Data Flow Analysis

• Ideally, we would like to compute the meet over all paths (MOP) solution:

■ Let $f_s$ be the transfer function for statement s

■ If p is a path $\{s_1, \dots, s_n\}$, let $f_p = f_n ; \dots ; f_1$

■ Let path(s) be the set of paths from the entry to s

$MOP(s) = \sqcap_{p \in path(s)} f_p(\top)$

• If a data flow problem is distributive, then solving the data flow equations in the standard way yields the MOP solution
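For an acyclic graph the MOP definition can be evaluated directly by enumerating paths. The sketch below does this for reaching definitions on a hypothetical diamond-shaped program. The helper names `paths_to`, `mop`, and `gen_kill` are assumptions for the example, and each path is taken to include s itself, so the result corresponds to the value after s.

```python
def paths_to(target, succ, node):
    """All paths from node to target in an acyclic graph (hypothetical helper)."""
    if node == target:
        return [[node]]
    return [[node] + rest
            for nxt in succ[node]
            for rest in paths_to(target, succ, nxt)]

def mop(target, succ, entry, transfer, meet, top):
    """Meet over all paths p of f_p(Top), applying f_1 first and f_n last."""
    result = top                      # Top is the identity for meet
    for path in paths_to(target, succ, entry):
        value = top
        for s in path:
            value = transfer[s](value)
        result = meet(result, value)
    return result

def gen_kill(gen, kill):
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

# Hypothetical diamond: 1: x = 1 (d1), then either 2: x = 2 (d2) or 3: y = 3 (d3), then 4
succ = {1: [2, 3], 2: [4], 3: [4], 4: []}
transfer = {
    1: gen_kill({"d1"}, {"d2"}),
    2: gen_kill({"d2"}, {"d1"}),
    3: gen_kill({"d3"}, set()),
    4: gen_kill(set(), set()),
}

mop_4 = mop(4, succ, 1, transfer, frozenset.union, frozenset())
print(sorted(mop_4))  # ['d1', 'd2', 'd3']
```

With loops the set of paths is infinite, so this enumeration no longer terminates, which is one reason the iterative solution is used in practice.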

What Problems are Distributive?

• Analyses of how the program computes

■ Live variables

■ Available expressions

■ Reaching definitions

■ Very busy expressions

• All Gen/Kill problems are distributive
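The claim about Gen/Kill problems can be checked by brute force on a small universe. The sketch below is a sanity check rather than a proof; the three-element universe and the helper names are assumptions for the example. It tests every transfer function of the form f(x) = gen ∪ (x − kill) against both union and intersection as the meet.

```python
from itertools import combinations

UNIVERSE = ("d1", "d2", "d3")   # hypothetical set of data flow facts

def subsets(items):
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def gen_kill(gen, kill):
    return lambda x: gen | (x - kill)

def is_distributive(f, meet, points):
    """Check f(x meet y) == f(x) meet f(y) for all pairs of points."""
    return all(f(meet(x, y)) == meet(f(x), f(y))
               for x in points for y in points)

POINTS = subsets(UNIVERSE)
all_ok = all(
    is_distributive(gen_kill(g, k), meet, POINTS)
    for g in POINTS
    for k in POINTS
    for meet in (frozenset.union, frozenset.intersection)
)
print(all_ok)  # True
```

The check succeeds because union distributes over both union and intersection, and removing the kill set commutes with both operations.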

Terminology Review

• Must vs. May

■ (Not always followed in literature)

• Forwards vs. Backwards

• Flow-sensitive vs. Flow-insensitive

• Distributive vs. Non-distributive

Another Approach: Elimination

• Recall in practice, one transfer function per basic block

• Why not generalize this idea beyond a basic block?

■ “Collapse” larger constructs into smaller ones, combining data flow equations

■ Eventually program collapsed into a single node!

■ “Expand out” back to original constructs, rebuilding information

Lattices of Functions

• Let (P, ≤) be a lattice

• Let M be the set of monotonic functions on P

■ Define $f \le_f g$ if for all x, f(x) ≤ g(x)

• Define the function f ⊓ g as

■ (f ⊓ g)(x) = f(x) ⊓ g(x)

• Claim: $(M, \le_f)$ forms a lattice
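The pointwise definitions can be tried out on a small finite lattice. In the sketch below P is the powerset of {a, b} ordered by inclusion with intersection as meet. The functions `f` and `g` and all helper names are assumptions for the example. It checks that the pointwise meet of two monotonic functions is again monotonic and lies below both.

```python
from itertools import combinations

# Hypothetical finite lattice: all subsets of {a, b}, meet = intersection.
P = [frozenset(c) for r in range(3) for c in combinations(("a", "b"), r)]
meet = frozenset.intersection

def leq(x, y):
    """x <= y iff x meet y == x (here: x is a subset of y)."""
    return meet(x, y) == x

def fmeet(f, g):
    """Pointwise meet: (f meet g)(x) = f(x) meet g(x)."""
    return lambda x: meet(f(x), g(x))

def fleq(f, g):
    """Pointwise order: f <=_f g iff f(x) <= g(x) for all x in P."""
    return all(leq(f(x), g(x)) for x in P)

def is_monotone(f):
    return all(leq(f(x), f(y)) for x in P for y in P if leq(x, y))

f = lambda x: x | {"a"}     # always adds a
g = lambda x: x - {"b"}     # always removes b
h = fmeet(f, g)

print(is_monotone(h), fleq(h, f), fleq(h, g))  # True True True
```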

Elimination Methods: Conditionals

[Figure: a flow graph with nodes If, Then, and Else is collapsed into a single IfThenElse node]

$f_{ite} = (f_{then} \circ f_{if}) \sqcap (f_{else} \circ f_{if})$

$Out(if) = f_{if}(In(ite))$

$Out(then) = (f_{then} \circ f_{if})(In(ite))$

$Out(else) = (f_{else} \circ f_{if})(In(ite))$
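The collapse and expand steps for a conditional can be sketched with ordinary function composition. The example below uses reaching definitions (meet is union) on a hypothetical conditional where d0 reaches the If node, the Then branch contains d1, and the Else branch contains d2; the definition labels and helper names are assumptions.

```python
def compose(f, g):
    """(f o g)(x) = f(g(x))"""
    return lambda x: f(g(x))

def fmeet(f, g, meet):
    """Pointwise meet of two transfer functions."""
    return lambda x: meet(f(x), g(x))

def gen_kill(gen, kill):
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

meet = frozenset.union                      # reaching definitions
f_if = gen_kill(set(), set())               # the test defines nothing
f_then = gen_kill({"d1"}, {"d0", "d2"})     # d1: x = 1
f_else = gen_kill({"d2"}, {"d0", "d1"})     # d2: x = 2

# Collapse: one transfer function for the whole IfThenElse node.
f_ite = fmeet(compose(f_then, f_if), compose(f_else, f_if), meet)

# Expand: recover the facts at the inner nodes from In(ite).
in_ite = frozenset({"d0"})                  # d0: x = 0 reaches the conditional
out_if = f_if(in_ite)
out_then = compose(f_then, f_if)(in_ite)
out_else = compose(f_else, f_if)(in_ite)
out_ite = f_ite(in_ite)

print(sorted(out_ite))  # ['d1', 'd2']
```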

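Elimination Methods: Loops

[Figure: a flow graph with nodes Head and Body is collapsed into a single While node]

$f_{while} = f_{head} \sqcap (f_{head} \circ f_{body} \circ f_{head}) \sqcap (f_{head} \circ f_{body} \circ f_{head} \circ f_{body} \circ f_{head}) \sqcap \dots$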

Elimination Methods: Loops (cont’d)

• Let $f^i = f \circ f \circ \dots \circ f$ (i times)

■ $f^0 = id$

• Let $g(j) = \sqcap_{i \in [0..j]} (f_{head} \circ f_{body})^i \circ f_{head}$

• Need to compute limit as j goes to infinity

■ Does such a thing exist?

• Observe: g(j+1) ≤ g(j)
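On a finite lattice the limit of g(j) can be computed by iterating until two successive functions agree at every point. The sketch below does this for reaching definitions with a hypothetical loop body d2: x = x + 1 that kills an earlier definition d1. All names are assumptions for the example, and stopping at the first j with g(j+1) = g(j) relies on the transfer functions being distributive, which holds for the gen/kill functions used here.

```python
from itertools import combinations

DEFS = ("d1", "d2")   # hypothetical definitions: d1 before the loop, d2 in the body
POINTS = [frozenset(c) for r in range(len(DEFS) + 1) for c in combinations(DEFS, r)]
meet = frozenset.union   # reaching definitions

def compose(f, g):
    return lambda x: f(g(x))

def fmeet(f, g):
    return lambda x: meet(f(x), g(x))

def same(f, g):
    """Extensional equality of functions on the finite lattice."""
    return all(f(x) == g(x) for x in POINTS)

f_head = lambda x: x                                   # loop test defines nothing
f_body = lambda x: frozenset({"d2"}) | (x - {"d1"})    # gen d2, kill d1

def while_limit(f_head, f_body):
    """Iterate g(j) = meet over i in [0..j] of (f_head o f_body)^i o f_head."""
    step = compose(f_head, f_body)
    term = f_head            # i = 0 term
    g, j = term, 0           # g(0)
    while True:
        term = compose(step, term)       # next power of (f_head o f_body)
        g_next = fmeet(g, term)
        if same(g, g_next):
            return g, j
        g, j = g_next, j + 1

f_while, steps = while_limit(f_head, f_body)
print(sorted(f_while(frozenset({"d1"}))), steps)  # ['d1', 'd2'] 1
```

Both d1 and d2 reach the loop exit, since the body may execute zero times or at least once.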

Height of Function Lattice

• Assume underlying lattice (P, ≤) has finite height

■ What is height of lattice of monotonic functions?

■ Claim: At most |P|×Height(P)

• Therefore, g(j) converges

Non-Reducible Flow Graphs

• Elimination methods usually only applied to reducible flow graphs

■ Ones that can be collapsed

■ Standard constructs yield only reducible flow graphs

• Unrestricted goto can yield non-reducible graphs

[Figure: flow graph with nodes A, B, and C]

Comments

• Can also do backwards elimination

■ Not quite as nice (regions are usually single entry but often not single exit)

• For bit-vector problems, elimination efficient

■ Easy to compose functions, compute meet, etc.

• Elimination originally seemed like it might be faster than iteration

■ Not really the case

Data Flow Analysis and Functions

• What happens at a function call?

■ Lots of proposed solutions in data flow analysis literature

• In practice, only analyze one procedure at a time

• Consequences

■ Call to function kills all data flow facts

■ May be able to improve depending on language, e.g., function call may not affect locals

More Terminology

• An analysis that models only a single function at a time is intraprocedural

• An analysis that takes multiple functions into account is interprocedural

• An analysis that takes the whole program into account is...guess?

• Note: global analysis means “more than one basic block,” but still within a function

Data Flow Analysis and The Heap

• Data Flow is good at analyzing local variables

■ But what about values stored in the heap?

■ Not modeled in traditional data flow

• In practice: *x := e

■ Assume all data flow facts killed (!)

■ Or, assume write through x may affect any variable whose address has been taken

• In general, hard to analyze pointers
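The two conservative treatments of `*x := e` can be sketched as a transfer function for available expressions. This is an illustration under assumptions: the function name `store_kills`, the `vars_of` map from each expression to the variables it reads, and the sample expressions are all hypothetical.

```python
def store_kills(available, vars_of, address_taken=None):
    """Transfer function sketch for a store through a pointer, `*x := e`.

    With no information about address-taken variables, every fact is killed.
    Otherwise only expressions that read an address-taken variable are killed.
    """
    if address_taken is None:
        return frozenset()
    return frozenset(e for e in available
                     if not (vars_of[e] & address_taken))

# Hypothetical facts before the store.
available = frozenset({"a+b", "c*d"})
vars_of = {"a+b": frozenset({"a", "b"}), "c*d": frozenset({"c", "d"})}

print(sorted(store_kills(available, vars_of)))                    # []
print(sorted(store_kills(available, vars_of, frozenset({"a"}))))  # ['c*d']
```

The second call keeps `c*d` because neither c nor d has had its address taken, so the store cannot change their values under this assumption.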