
Efficient Context-Sensitive Pointer Analysis for C Programs
Robert P. Wilson and Monica S. Lam
Computer Systems Laboratory
Stanford University, CA 94305
http://suif.stanford.edu
{bwilson,lam}@cs.stanford.edu
Abstract
This paper proposes an efficient technique for context-
sensitive pointer analysis that is applicable to real C pro-
grams. For efficiency, we summarize the effects of pro-
cedures using partial transfer functions. A partial transfer
function (PTF) describes the behavior of a procedure assum-
ing that certain alias relationships hold when it is called. We
can reuse a PTF in many calling contexts as long as the aliases
among the inputs to the procedure are the same. Our empirical
results demonstrate that this technique is successful: a
single PTF per procedure is usually sufficient to obtain
completely context-sensitive results. Because many C programs
use features such as type casts and pointer arithmetic to cir-
cumvent the high-level type system, our algorithm is based
on a low-level representation of memory locations that safely
handles all the features of C. We have implemented our algo-
rithm in the SUIF compiler system and we show that it runs
efficiently for a set of C benchmarks.
1 Introduction
Pointer analysis promises significant benefits for optimizing
and parallelizing compilers, yet despite much recent progress
it has not advanced beyond the research stage. Several problems
remain to be solved before it can become a practical
tool. First, the analysis must be efficient without sacrificing
the accuracy of the results. Second, pointer analysis algo-
rithms must handle real C programs. If an analysis only
provides correct results for well-behaved input programs, it
will not be widely used. We have developed a pointer analysis
algorithm that addresses these issues.
The goal of our analysis is to identify the potential values
of the pointers at each statement in a program. We represent
that information using points-to functions. We consider heap-allocated
data structures as well as global and stack variables,
but we do not attempt to analyze the relationships between
individual elements of recursive data structures.

This research was supported in part by ARPA contract DABT63-94-C-0054,
an NSF Young Investigator award, and an Intel Foundation graduate
fellowship.
In Proceedings of the ACM SIGPLAN’95 Conference on Programming Language
Design and Implementation, La Jolla, CA, June 18–21, 1995, pp.
1–12. Copyright © 1995 by ACM, Inc.

Interprocedural analysis is crucial for accurately identify-
ing pointer values. Only very conservative estimates are pos-
sible by analyzing each procedure in isolation. One straight-
forward approach is to combine all the procedures into a
single control flow graph, adding edges for calls and returns.
An iterative data-flow analysis using such a graph is relatively
simple but suffers from the problem of unrealizable paths.
That is, values can propagate from one call site, through the
callee procedure, and back to a different call site. Some
algorithms attempt to avoid unrealizable paths by tagging
the pointer information with abstractions of the calling con-
texts [2, 12]. However, these algorithms still inappropriately
combine some information from different contexts.
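The unrealizable-path problem can be seen in a few lines of C. The sketch below is illustrative only: the names identity, a, b, and unrealizable_path_demo are hypothetical and do not come from the paper. It shows one callee invoked from two call sites, and the comments describe what an analysis over a single merged flow graph would conclude.

```c
#include <assert.h>
#include <stddef.h>

static int a, b;            /* two distinct pointer targets */

/* Hypothetical callee: returns its argument unchanged. */
static int *identity(int *p) { return p; }

/* Hypothetical driver with two call sites that pass different arguments. */
static int unrealizable_path_demo(void)
{
    int *x = identity(&a);  /* call site 1: at run time x == &a */
    int *y = identity(&b);  /* call site 2: at run time y == &b */

    /* An analysis that merges both call sites into one flow graph sees
       p -> {a, b} inside identity and propagates that set back to both
       return points, so it reports x -> {a, b} and y -> {a, b}.
       The path "enter from site 1, return to site 2" is unrealizable:
       no execution takes it, as the assertions below confirm. */
    assert(x == &a && x != &b);
    assert(y == &b && y != &a);

    *x = 1;
    *y = 2;
    return a * 10 + b;      /* the two stores never interfere */
}
```

With the merged result, a compiler would have to assume that the store through y might overwrite a, even though that can never happen here.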
Emami et al. have proposed a context-sensitive algorithm
that completely reanalyzes a procedure for each of its calling
contexts [6]. This not only prevents values from propagating
along unrealizable paths, but also guarantees that the analysis
of a procedure in one calling context is completely indepen-
dent of all the other contexts. A procedure may behave quite
differently in each context due to aliases among its inputs,
and a context-sensitive analysis keeps those behaviors sepa-
rate. Reanalyzing for every calling context is only practical
for small programs. For larger programs, the exponential
cost quickly becomes prohibitive.
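To make the dependence on input aliases concrete, here is a minimal sketch in C. The procedure redirect and the driver alias_demo are hypothetical names chosen for illustration, not code from the paper. The same procedure leaves its first argument pointing at different things depending on whether its two pointer parameters refer to the same location.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedure whose effect depends on whether p and q alias. */
static void redirect(int **p, int **q, int *target)
{
    *p = target;   /* store through p */
    *q = NULL;     /* if q == p, this overwrites the store above */
}

/* Hypothetical driver: calls redirect in two different alias contexts. */
static int alias_demo(void)
{
    int v = 7;
    int *r = NULL, *s = NULL;

    redirect(&r, &s, &v);           /* context 1: p and q are not aliased */
    int unaliased_ok = (r == &v) && (s == NULL);  /* r points to v */

    r = NULL;
    redirect(&r, &r, &v);           /* context 2: p and q are aliased */
    int aliased_ok = (r == NULL);   /* r ends up null instead */

    return unaliased_ok && aliased_ok;
}
```

A context-sensitive analysis keeps these two outcomes apart: in the first context it can conclude that r points to v, and in the second that r is null. Merging the contexts would give only the weaker result that r may be either.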
Interval analysis, which has been successfully used to an-
alyze side effects for scalar and array variables in Fortran
programs [7, 10], is an approach that combines context sensitivity
and efficiency. This technique summarizes the effects
of a procedure by a transfer function. For each call site where
the procedure is invoked, it computes the effects of the pro-
cedure by applying the transfer function to the specific input
parameters at the call site. This provides context sensitivity
without reanalyzing at every call site.
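The idea of summarizing a procedure once and applying the summary at each call site can be sketched as follows. This is a deliberately simplified stand-in, not the interval analysis of [7, 10]: the Summary type, the bit-set encoding, and the variable numbering are all assumptions made for illustration. The summary records which by-reference formal parameters a procedure may modify, and applying it maps those formals to the actual arguments at a given call site.

```c
#include <stddef.h>

/* Hypothetical summary: bit i of mod_formals is set if the callee may
   modify the variable passed by reference as formal parameter i. */
typedef struct {
    unsigned mod_formals;
    size_t   nparams;
} Summary;

/* Apply the summary at one call site. actuals[i] is the id (0..31) of
   the caller variable bound to formal i. Returns a bit set of caller
   variables that the call may modify. */
static unsigned apply_summary(const Summary *s, const unsigned *actuals)
{
    unsigned modified = 0;
    for (size_t i = 0; i < s->nparams; i++)
        if (s->mod_formals & (1u << i))
            modified |= 1u << actuals[i];
    return modified;
}

/* Hypothetical driver: one summary, two call sites, two distinct results. */
static int summary_demo(void)
{
    /* Summary of a two-parameter procedure that writes only formal 0. */
    Summary copy = { 0x1u, 2 };
    unsigned site1[] = { 3, 5 };   /* call with variables v3, v5 */
    unsigned site2[] = { 5, 1 };   /* call with variables v5, v1 */

    return apply_summary(&copy, site1) == (1u << 3)
        && apply_summary(&copy, site2) == (1u << 5);
}
```

The procedure body is examined once to build the summary, yet each call site gets a result specific to its own arguments, which is the source of the efficiency described above.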
Interval analysis relies on being able to concisely sum-
marize the effects of the procedures. Unfortunately, pointer
analysis is not amenable to succinct summarization. The ef-
fects of a procedure may depend heavily on the aliases that
hold when it is called. Thus, the evaluation of a transfer func-
tion that summarizes the pointer assignments in a procedure