Lecture Slides on Data Flow Analysis | CMSC 631, Study notes of Computer Science

Material Type: Notes; Class: PROG ANLYS&UNDERSTANDING; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

koofers-user-4ho-1

Data flow analysis

Abstract Syntax Trees

x = a*b;
y = a*b;
while (y > a+b) {
    a = a+1;
    x = a+b;
}

StmtList
  x = a*b
  StmtList
    y = a*b
    StmtList
      whileStmt
        y > a+b
        StmtList
          a = a+1
          StmtList
            x = a+b
            StmtList

AST stopped at statement/expression level for brevity

Control Flow Graph

x = a*b;
y = a*b;
while (y > a+b) {
    a = a+1;
    x = a+b;
}

Entry -> x = a*b -> y = a*b -> if y > a+b
if y > a+b -> Exit                               (condition false)
if y > a+b -> a = a+1 -> x = a+b -> if y > a+b   (condition true, loop back)

Choosing a representation

  • Control flow graph is more general
  • AST allows for more efficient algorithms
    • but new programming constructs require changing the algorithm
      • e.g., continue, break, switch, try-catch-finally, goto
    • program transformations may not leave the program in AST form
    • bytecode/machine code isn’t in AST form
      • although you may be able to recover it

Data flow analysis

  • A framework for proving facts about a program
    • reasoning about lots of little facts
    • little or no interaction between facts
    • based on all paths through program
      • including infeasible paths
    • e.g., which assignments to x can be seen at this read of x?

Reaching definitions

  • Each assignment to a variable is a definition
  • defs(v) represents the set of all definitions of v
  • Assume all variables are scalars
    • no pointers or arrays

Computing In(S)

  • If S has one predecessor P, In(S) = Out(P)
  • Otherwise,
    • In(S) = meet over all P in Pred(S) of Out(P)
  • The meet function defines how to combine alternatives
  • For reaching definitions, meet = union
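
A minimal Python sketch of this rule. The name `compute_in` and the small `preds`/`out` fragment are made up for illustration, and the definition numbers in it are arbitrary. The meet is passed in as a parameter so the same code serves union (reaching definitions) or any other meet.

```python
from functools import reduce


def compute_in(node, preds, out, meet):
    """Combine the Out sets of all predecessors of `node` with `meet`.

    With a single predecessor this is just Out(P).
    Assumes the node has at least one predecessor (so not Entry).
    """
    return reduce(meet, (out[p] for p in preds[node]))


# Hypothetical fragment: a loop test with two predecessors.
preds = {"if": ["y = a*b", "x = a+b"]}
out = {
    "y = a*b": frozenset({1, 2}),
    "x = a+b": frozenset({2, 3}),
}


def union(s, t):
    return s | t


in_if = compute_in("if", preds, out, union)
print(sorted(in_if))
```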

Iterative solution

  • For control flow graphs with cycles, can’t directly solve the equations
    • i.e., can’t compute each final value purely in terms of other final values already known
  • Use iterative solution
    • Can compute dataflow values in any order
      • some orders are more efficient than others
    • computation will converge to right answer

Initial value

  • For iterative solution
    • might need Out(S) before we get a chance to compute In(S)
  • Need an initial value for Out(S) of all statements other than Entry

Control Flow Graph

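parameter a;
parameter b;
x = a*b;
y = a*b;
while (y > a+b) {
    a = a+1;
    x = a+b;
}

Definitions, numbered as in the control flow graph:

  1: parameter a
  2: parameter b
  3: x = a*b
  4: y = a*b
  5: a = a+1
  6: x = a+b

defs(x) = {3,6}   defs(y) = {4}   defs(a) = {1,5}   defs(b) = {2}

Control flow: Entry -> 1 -> 2 -> 3 -> 4 -> if y > a+b; the test goes to Exit or to 5 -> 6 -> back to the test.

Reaching definitions, with every Out set initialized to {}:

  Node             Out, first pass    Out, at convergence
  Entry            {}                 {}
  1: parameter a   {1}                {1}
  2: parameter b   {1,2}              {1,2}
  3: x = a*b       {1,2,3}            {1,2,3}
  4: y = a*b       {1,2,3,4}          {1,2,3,4}
  if y > a+b       {1,2,3,4}          {1,2,3,4,5,6}
  5: a = a+1       {2,3,4,5}          {2,3,4,5,6}
  6: x = a+b       {2,4,5,6}          {2,4,5,6}

In(Exit) = {1,2,3,4,5,6} at convergence.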
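
The example can be replayed with a small round-robin solver. This is a sketch under stated assumptions, not the course's implementation: the names `SUCC`, `DEFINES` and `reaching_definitions` are invented here, definitions are numbered 1 to 6 as labeled in the control flow graph, each definition generates itself and kills the other definitions of the same variable, and every Out set starts as the empty set.

```python
# CFG of the example. Integer nodes 1..6 are the numbered definitions;
# "Entry", "if" (the loop test) and "Exit" define nothing.
SUCC = {
    "Entry": [1], 1: [2], 2: [3], 3: [4], 4: ["if"],
    "if": [5, "Exit"], 5: [6], 6: ["if"], "Exit": [],
}
DEFINES = {1: "a", 2: "b", 3: "x", 4: "y", 5: "a", 6: "x"}


def reaching_definitions(succ, defines, order):
    """Round-robin iteration; returns (Out sets, number of passes)."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    defs = {}  # defs(v): all definitions of variable v
    for d, v in defines.items():
        defs.setdefault(v, set()).add(d)

    out = {n: frozenset() for n in succ}  # initial value: empty set
    passes, changed = 0, True
    while changed:
        changed = False
        passes += 1
        for n in order:
            # meet = union over the predecessors' Out sets
            in_n = frozenset().union(*(out[p] for p in preds[n]))
            if n in defines:
                kill = defs[defines[n]] - {n}  # other defs of the same variable
                new_out = (in_n - kill) | {n}  # Gen = {n}
            else:
                new_out = in_n
            if new_out != out[n]:
                out[n], changed = new_out, True
    return out, passes


forward = ["Entry", 1, 2, 3, 4, "if", 5, 6, "Exit"]
out_fwd, passes_fwd = reaching_definitions(SUCC, DEFINES, forward)
out_rev, passes_rev = reaching_definitions(SUCC, DEFINES, forward[::-1])

print(sorted(out_fwd["if"]), passes_fwd, passes_rev)
```

Both visiting orders converge to the same Out sets. On this graph the forward order needs fewer passes than the reverse order, which illustrates the remark that some orders are more efficient than others.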

More control flow programs

  • Definitely uninitialized variables
  • Possibly uninitialized variables
    • compare with definitely initialized variables
  • What is Gen and Kill?
  • What is Out(Entry)?
  • What is the meet function?
  • What is the initial value?

Available expressions

  • An expression e is available at point p if on all paths to p, e must have been computed and, since that computation, none of the variables in e have been modified
    • i.e., computation of e here would be redundant
  • Gen( x = a+b ) = { a+b } - Kill( x = a+b )
  • Kill( x = a+b ) = any expression using x
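
The Gen and Kill rules above can be sketched in Python as follows. The `USES` table, `gen_kill` and `transfer` are hypothetical names chosen for this illustration, and the universe of expressions is limited to the three that occur in the running example. Because availability is required on all paths, the sketch combines paths by intersection.

```python
# Hypothetical universe: each expression mapped to the variables it reads.
USES = {"a*b": {"a", "b"}, "a+b": {"a", "b"}, "a+1": {"a"}}


def gen_kill(target, expr, uses=USES):
    """Gen and Kill for the assignment `target = expr`."""
    kill = {e for e, variables in uses.items() if target in variables}
    gen = {expr} - kill
    return gen, kill


def transfer(in_set, target, expr):
    """Out = Gen union (In - Kill)."""
    gen, kill = gen_kill(target, expr)
    return gen | (in_set - kill)


gen_x, kill_x = gen_kill("x", "a+b")  # nothing in USES reads x
gen_a, kill_a = gen_kill("a", "a+1")  # every expression in USES reads a

# Loop head of the running example: one path arrives from `y = a*b`,
# the other from the loop body ending in `x = a+b`. The loop test's own
# computation of a+b is ignored for brevity; `a = a+1` kills every
# expression in USES, so the set entering the body does not matter here.
before_loop = transfer(transfer(set(), "x", "a*b"), "y", "a*b")
after_body = transfer(transfer(before_loop, "a", "a+1"), "x", "a+b")
at_loop_head = before_loop & after_body  # "on all paths": intersection

print(before_loop, after_body, at_loop_head)
```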

Questions

  • Does it terminate?
  • Does it compute a valid answer?

Definitions

  • Meet function: ⊓
  • Meet function is commutative and associative
  • x ⊓ x = x
  • Unique bottom ⊥ and top ⊤ element
    • x ⊓ ⊥ = ⊥
    • x ⊓ ⊤ = x

Ordering

  • x ⊑ y if and only if x ⊓ y = x
  • A function f is monotone if for all x and y,
    • x ⊑ y implies f(x) ⊑ f(y)

Lattice example

        111
  011   101   110
  001   010   100
        000

Each element sits above the elements obtained by clearing one of its bits.

meet is bit-vector logical and
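
As a quick sanity check, the properties listed under Definitions and Ordering can be verified exhaustively for this 3-bit lattice. The sketch below is illustrative only; the names `meet`, `leq` and `laws` are chosen here and do not come from the slides.

```python
from itertools import product

ELEMENTS = range(8)          # the bit vectors 000 .. 111
TOP, BOTTOM = 0b111, 0b000


def meet(x, y):
    return x & y             # bit-vector logical and


def leq(x, y):
    return meet(x, y) == x   # x is below y iff (x meet y) = x


pairs = list(product(ELEMENTS, repeat=2))
triples = list(product(ELEMENTS, repeat=3))
laws = {
    "idempotent": all(meet(x, x) == x for x in ELEMENTS),
    "commutative": all(meet(x, y) == meet(y, x) for x, y in pairs),
    "associative": all(
        meet(meet(x, y), z) == meet(x, meet(y, z)) for x, y, z in triples
    ),
    "bottom": all(meet(x, BOTTOM) == BOTTOM for x in ELEMENTS),
    "top": all(meet(x, TOP) == x for x in ELEMENTS),
}

print(laws)
print(leq(0b001, 0b011), leq(0b011, 0b101))
```

Note that 011 and 101 are incomparable: neither is below the other, and their meet is 001.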

Relating to data flow analysis

  • Top is the value to initialize non-entry nodes to
    • the identity element for the meet function
  • If node function is monotone
    • each re-evaluation of a node moves down the lattice, if it moves at all
  • If height of lattice is finite, must terminate

Is it accurate?

  • We want the meet over all paths solution
  • MOP( B ) = meet over all p in Paths(Entry, B ) of f_p(Init)
    • f_p is the composition of the node functions along path p
    • note that the set of paths can be infinite if there are loops
  • As good as we can do given the framework
  • Iterative analysis computes Maximum Fixed Point solution
    • largest solution, ordered by ⊑, that is a fixed point of the iterative computation
    • bottom is also a fixed point, but often not maximal

We know x ⊓ y ⊑ x
since f is monotone, f(x ⊓ y) ⊑ f(x)
which means f(x ⊓ y) ⊓ f(x) = f(x ⊓ y)
and, by the same argument, f(x ⊓ y) ⊓ f(y) = f(x ⊓ y)
so f(x ⊓ y) ⊓ f(x) ⊓ f(y) = f(x ⊓ y) ⊓ f(y) = f(x ⊓ y)
i.e., f(x ⊓ y) ⊑ f(x) ⊓ f(y)

Control flow graph: Entry -> a, Entry -> b, a -> c, b -> c, c -> d

MeetOverAllPaths(dIn) = fc(fa(EntryOut)) ⊓ fc(fb(EntryOut))
MaximalFixedPoint(dIn) = fc(fa(EntryOut) ⊓ fb(EntryOut))

Distributive problems

  • For a distributive problem
    • you can push transfer functions over meets without causing any reduction in accuracy
  • Which problems are distributive?
    • reaching definitions, very busy expressions, live variables, available expressions
  • Which are not?
    • most formulations of constant propagation

Constant propagation

Entry -> x = 1  -> y = x*x
Entry -> x = -1 -> y = x*x
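
A small Python sketch of why this example is not distributive, assuming the two branches assign x = 1 and x = -1 before both reach y = x*x. The lattice encoding (`TOP` for "no information yet", `BOTTOM` for "not a constant") and all function names are assumptions made for this illustration; `fa`, `fb` and `fc` follow the naming of the Entry/a/b/c graph used in the meet-over-all-paths comparison.

```python
TOP, BOTTOM = "top", "bottom"


def meet(u, v):
    """Meet of two abstract values: TOP is the identity, unequal constants give BOTTOM."""
    if u == TOP:
        return v
    if v == TOP:
        return u
    return u if u == v else BOTTOM


def meet_env(e1, e2):
    """Pointwise meet of two variable environments (missing variables are TOP)."""
    return {
        var: meet(e1.get(var, TOP), e2.get(var, TOP))
        for var in e1.keys() | e2.keys()
    }


def assign_const(var, c):
    return lambda env: {**env, var: c}


def assign_square(target, source):
    def f(env):
        v = env.get(source, TOP)
        result = v if v in (TOP, BOTTOM) else v * v
        return {**env, target: result}
    return f


fa = assign_const("x", 1)      # node a: x = 1
fb = assign_const("x", -1)     # node b: x = -1 (assumed)
fc = assign_square("y", "x")   # node c: y = x*x
entry_out = {}

# Meet over all paths: apply fc along each path, then combine.
mop = meet_env(fc(fa(entry_out)), fc(fb(entry_out)))
# Iterative solution: combine at the join first, then apply fc.
mfp = fc(meet_env(fa(entry_out), fb(entry_out)))

print(mop["y"], mfp["y"])
```

Along each path y is 1, so the meet over all paths keeps y constant. The iterative solution merges x to "not a constant" before the multiplication and loses that fact.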

All Gen/Kill problems are distributive

  • If Out(S) = Gen(S) union (In(S) - Kill(S))
  • Problem is distributive
    • left as an exercise for the reader
    • and/or exam question
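
Without giving away the proof, the claim can at least be checked by brute force on a small universe. The sketch below is a sanity check, not a substitute for the exercise; `subsets` and `distributes` are names invented here, and the three-element universe is an arbitrary choice.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def distributes(gen, kill, meet, candidates):
    """True if f(x meet y) == f(x) meet f(y) for f(s) = gen | (s - kill)."""
    def f(s):
        return gen | (s - kill)
    return all(
        f(meet(x, y)) == meet(f(x), f(y))
        for x, y in product(candidates, repeat=2)
    )


ALL = subsets({1, 2, 3})
union_ok = all(
    distributes(g, k, frozenset.union, ALL)
    for g, k in product(ALL, repeat=2)
)
intersection_ok = all(
    distributes(g, k, frozenset.intersection, ALL)
    for g, k in product(ALL, repeat=2)
)

print(union_ok, intersection_ok)
```

The check covers every choice of Gen and Kill over the universe, for both union and intersection as the meet.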

Are all problems monotone?

  • No, you have to be careful
  • Consider constant propagation of truth values
    • What is the rule for if x then y else z?
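
One way to explore the question is to test candidate rules for monotonicity by brute force. The sketch below assumes a four-element lattice for truth values (`TOP`, `True`, `False`, `BOTTOM`). Both rules, `naive_rule` and `careful_rule`, are illustrative guesses written for this example and are not taken from the slides.

```python
from itertools import product

TOP, BOTTOM = "top", "bottom"
VALUES = [TOP, True, False, BOTTOM]


def meet(u, v):
    if u == TOP:
        return v
    if v == TOP:
        return u
    return u if u == v else BOTTOM


def leq(u, v):
    return meet(u, v) == u


def naive_rule(x, y, z):
    """Hypothetical rule: when x is unknown, keep a value only if y and z are equal."""
    if x is True:
        return y
    if x is False:
        return z
    return y if y == z else BOTTOM


def careful_rule(x, y, z):
    """Hypothetical rule: no information about x gives TOP; a non-constant x meets both arms."""
    if x == TOP:
        return TOP
    if x is True:
        return y
    if x is False:
        return z
    return meet(y, z)


def is_monotone(rule):
    """Check that lo <= hi (componentwise) implies rule(lo) <= rule(hi)."""
    triples = list(product(VALUES, repeat=3))
    for lo, hi in product(triples, repeat=2):
        if all(leq(a, b) for a, b in zip(lo, hi)):
            if not leq(rule(*lo), rule(*hi)):
                return False
    return True


print(is_monotone(naive_rule), is_monotone(careful_rule))
```

The naive rule fails, for instance, when x moves from `TOP` down to `True` with y = `True` and z = `False`: the result moves up from `BOTTOM` to `True`.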