Pointer Analysis
• Outline:
– What is pointer analysis
– Intraprocedural pointer analysis
– Interprocedural pointer analysis
• Andersen and Steensgaard
Useful for what?
• Improve the precision of analyses that require knowing what
is modified or referenced (e.g., constant propagation, CSE, …)
• Eliminate redundant loads/stores and dead stores.
• Parallelization of code
- can recursive calls to quick_sort be run in parallel? Yes, provided that
they reference distinct regions of the array.
• Identify objects to be tracked in error detection tools
x := *p; ... y := *p; // replace with y := x?
*x := ...; // is *x dead?
x.lock(); ... y.unlock(); // same object as x?
Kinds of alias information
• Points-to information (must or may versions)
- at each program point, compute a set of pairs of the form p → x, where p
points to x.
- can represent this information in a points-to graph
• Alias pairs
- at each program point, compute the set of all pairs (e_1, e_2) where e_1
and e_2 must/may reference the same memory.
• Storage shape analysis
- at each program point, compute an
abstract description of the pointer structure.
[Figure: example points-to graph over p, x, y, and z]
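To make the relationship between points-to information and alias pairs concrete, here is a minimal Python sketch. It assumes a may-points-to graph stored as a dictionary from pointer names to sets of targets; the pointers `q` and `r` and all function names are hypothetical and chosen only for illustration.

```python
# May-points-to graph: each pointer maps to the set of locations it may reference.
# The concrete edges below are illustrative, not taken from the slides.
points_to = {
    "p": {"x", "y"},
    "q": {"y", "z"},
    "r": {"z"},
}

def may_alias(graph, a, b):
    """*a and *b may alias if their points-to sets share a location."""
    return bool(graph.get(a, set()) & graph.get(b, set()))

def alias_pairs(graph):
    """Derive the unordered may-alias pairs (a, b), meaning *a may alias *b."""
    names = sorted(graph)
    return {
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if may_alias(graph, a, b)
    }
```

Under these assumptions the alias-pair view can be derived from the points-to view, which is one reason points-to graphs are a convenient primary representation.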
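Flow functions
• Each statement has a flow function F that maps the points-to information "in" before the statement to the information "out" after it:
- x := a + b:  out = F_{x := a+b}(in)
- x := k:      out = F_{x := k}(in)
- x := &y:     out = F_{x := &y}(in)
- x := y:      out = F_{x := y}(in)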
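A minimal Python sketch of how flow functions for these four statement forms are commonly defined in a may-points-to analysis. The definitions follow the usual kill/gen pattern and are an assumption, not a transcription of the slides; in particular, treating `x := a + b` as producing no pointer assumes there is no pointer arithmetic. Points-to information is modeled as a set of `(pointer, target)` pairs, and all function names are hypothetical.

```python
def kill(in_set, x):
    """Remove every pair x -> * (x is about to be overwritten)."""
    return {(p, t) for (p, t) in in_set if p != x}

def flow_const(in_set, x):
    """x := k and x := a + b: x no longer points to anything."""
    return kill(in_set, x)

def flow_addr(in_set, x, y):
    """x := &y: x now points exactly to y."""
    return kill(in_set, x) | {(x, y)}

def flow_copy(in_set, x, y):
    """x := y: x now points to whatever y pointed to on entry."""
    return kill(in_set, x) | {(x, t) for (p, t) in in_set if p == y}
```

Each function kills the old targets of the assigned variable first (a strong update), which is sound here because the left-hand side names a single variable.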
Intraprocedural Points-to Analysis
• Flow functions:
Pointers to dynamically-allocated memory
- Handle statements of the form: x := new T
- One idea: generate a new variable each time
the new statement is analyzed to stand for the
new location:
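A small Python sketch of this idea, under several assumptions: fresh locations are named V1, V2, and so on; the example program's last three statements form a loop body; `*p := t` is treated as a weak update; and states are joined by set union. The function names are hypothetical.

```python
import itertools

def kill(s, x):
    """Remove every pair x -> *."""
    return {(a, b) for (a, b) in s if a != x}

def analyze_with_fresh_nodes(rounds):
    """Analyze the loop body `rounds` times, inventing a node per `new`."""
    fresh = (f"V{i}" for i in itertools.count(1))
    s = {("l", next(fresh))}                                    # l := new Cons
    s = kill(s, "p") | {("p", t) for (a, t) in s if a == "l"}   # p := l
    head = set(s)                                               # state at loop head
    for _ in range(rounds):
        s = kill(head, "t") | {("t", next(fresh))}              # t := new Cons
        s |= {(v, t) for (a, v) in s if a == "p"
                     for (b, t) in s if b == "t"}               # *p := t (weak)
        s = kill(s, "p") | {("p", t) for (a, t) in s if a == "t"}  # p := t
        head |= s                                               # join at loop head
    return head

def abstract_locations(state):
    """All V-nodes that appear as a points-to target."""
    return {b for (_, b) in state if b.startswith("V")}
```

Every additional pass over the loop adds another V-node, so the state at the loop head keeps growing and the analysis never reaches a fixed point on its own.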
Example solved
l := new Cons
p := l
t := new Cons
*p := t
p := t
[Figure: points-to graphs over l, p, and t after each statement; each analysis of a new statement adds a fresh node (V1, V2, …)]
What went wrong?
• Lattice was infinitely tall!
• Instead, we need to summarize the infinitely many allocated
objects in a finite way.
- introduce summary nodes, which will stand for a whole class of
allocated objects.
• For example: for each new statement with label L, introduce a
summary node loc_L, which stands for the memory allocated
by statement L.
• Summary nodes can use other criteria for merging.
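The allocation-site scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the program is the S1/S2 example with the last three statements forming a loop, the store `*p := t` is a weak update, the join is set union, and summary nodes are named directly by their labels ("S1", "S2"). All function names are hypothetical.

```python
def kill(s, x):
    """Remove every pair x -> *."""
    return {(a, b) for (a, b) in s if a != x}

def alloc(s, x, label):
    """L: x := new T, where x points to the summary node for site L."""
    return kill(s, x) | {(x, label)}

def copy(s, x, y):
    """x := y."""
    return kill(s, x) | {(x, t) for (a, t) in s if a == y}

def store(s, x, y):
    """*x := y as a weak update: add edges, never remove any."""
    return s | {(v, t) for (a, v) in s if a == x for (b, t) in s if b == y}

def loop_body(s):
    s = alloc(s, "t", "S2")   # S2: t := new Cons
    s = store(s, "p", "t")    # *p := t
    return copy(s, "p", "t")  # p := t

def analyze():
    """Iterate to a fixed point of the state at the loop head."""
    entry = copy(alloc(set(), "l", "S1"), "p", "l")
    head = set(entry)
    while True:
        new_head = entry | loop_body(head)
        if new_head == head:
            return head
        head = new_head
```

Because there are only finitely many variables and allocation sites, the set of possible edges is finite and the iteration terminates. At the fixed point the loop head contains the self-edge S2 → S2: the single summary node for S2 stands for every cell allocated in the loop.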
Example revisited & solved
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: points-to graphs over l, p, t and the summary nodes S1, S2 at each program point, shown for Iter 1, Iter 2, and Iter 3]
Array aliasing, and pointers to arrays
• Array indexing can cause aliasing:
– a[i] aliases b[j] if:
• a aliases b and i = j
• a and b overlap, and i = j + k, where k is the offset (in elements)
of the start of b from the start of a.
• Can have pointers to elements of an array
– p := &a[i]; ...; p++;
• How can arrays be modeled?
– Could treat the whole array as one location.
– Could try to reason about the array index
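The index condition for overlapping arrays can be checked with a small Python sketch. It assumes both arrays live in one underlying buffer and are described by their base offsets measured in elements, and it interprets k as the difference between those base offsets; the function names are hypothetical.

```python
def elements_alias(base_a, i, base_b, j):
    """a[i] and b[j] name the same cell iff their absolute offsets match."""
    return base_a + i == base_b + j

def aliasing_index(base_a, base_b, j):
    """Index i such that a[i] aliases b[j], namely i = j + k with k = base_b - base_a."""
    k = base_b - base_a
    return j + k
```

For example, with `a` starting at offset 0 and `b` at offset 6, `b[0]` coincides with `a[6]`. Treating the whole array as one location corresponds to ignoring `i` and `j` entirely and asking only whether the two arrays overlap.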
Summary
• We just saw:
- intraprocedural points-to analysis
- handling dynamically allocated memory
- handling pointers to arrays
• But intraprocedural pointer analysis is not enough.
- Sharing data structures across multiple procedures is one of the big
benefits of pointers: instead of passing whole data structures
around, just pass pointers to them (e.g., C pass by reference).
- So pointers end up pointing to structures shared across procedures.
- If you don't do an interprocedural analysis, you'll have to make conservative
assumptions at function entries and function calls.
Conservative approximation on entry
• Say we don’t have interprocedural pointer
analysis.
• What should the information be at the input
of the following procedure:
global g;
void p(x,y) {
[Figure: points-to graph over x, y, and g at procedure entry]