Program Analysis and Understanding - Slides | CMSC 631, Study notes of Computer Science

Material Type: Notes; Professor: Hicks; Class: PROG ANLYS&UNDERSTANDING; Subject: Computer Science; University: University of Maryland; Term: Fall 2007;


Uploaded on 02/13/2009

CMSC 631 – Program Analysis and Understanding
Fall 2007

CMSC 631 2

Analyzing & understanding software

  • Three main focus areas:
    ■ Formal systems and notations
      - Vocabulary for talking about programs
    ■ Static analysis
      - Automatic reasoning about source code
    ■ Programming language features
      - Affects programs and how we reason about them

CMSC 631 3

Personnel

  • Instructor: Michael Hicks
    ■ Office: 4131 AVW
    ■ E-mail: [email protected]
    ■ Office hours: TWF 10am-11am
      - Or by appointment
  • Grader: Brent Gordon
    ■ E-mail: [email protected]

CMSC 631 4

Prerequisite

  • CMSC 430 or equivalent compiler class
    ■ Ideas we will use in this class:
      - Parse trees/abstract syntax trees
      - BNF notation for grammars
      - Type checking (usually little coverage in a compilers class)
      - Data flow analysis (coverage varies in a compilers class)
      - Tools like yacc and lex may be useful for your project
    ■ We won’t use most of the other material
      - So even without taking a compilers class, you may be OK
      - Talk to me if you’re not sure

CMSC 631 7

Expectations: Homework

  • First half of class: two kinds of assignments
    ■ Programming assignments (20% of grade)
      - Every two weeks
      - Implement the ideas we see in lecture
    ■ Written assignments (10% of grade)
      - Every week
      - Short problem sets
  • This is how you will learn things
    ■ Much more effective than (just) listening to a lecture

CMSC 631 8

Late Policy on Assignments

  • Programming assignments: Due at midnight
    ■ We use Marmoset for submissions
      - http://submit.cs.umd.edu
  • Written assignments: Due at start of class
    ■ No late submissions
  • Contact me about extenuating circumstances
    ■ E.g., religious holidays
    ■ Inform me as soon as possible

CMSC 631 9

Expectations: Participation

  • Will need to read some papers for class
    ■ More during the second half of the semester
    ■ Should come prepared to contribute to discussion
  • (Possible) student presentations later in the semester
    ■ Read 1-2 papers on a topic
    ■ Present a lecture in class about the material
  • 10% of grade on class participation

CMSC 631 10

Expectations: Project

  • Class goal: Teach you how to do research
    ■ So you have to do research as part of the class
  • Substantial research project (35% of grade)
    ■ Any topic vaguely related to the class
      - Will post some suggestions for projects later on
      - May also be able to share project with other class
    ■ Completed in groups of size 2 (possibly 1 or 3)
  • This will consume the second half of the semester

CMSC 631 13

Academic Dishonesty

  • Don’t do it

CMSC 631 14

Software Chat

  • http://www.cs.umd.edu/projects/softchat
  • Weekly meeting about PL and SE research
  • Mondays at 11am in 3258 AVW this fall
    ■ Starting Sep. 10
  • Topics include
    ■ Current research in the department
    ■ Practice talks
    ■ Interesting recent papers

20 Ideas and Applications in Program Analysis in 40 Minutes

CMSC 631 – Program Analysis and Understanding
Fall 2007

CMSC 631 16

Abstract Interpretation

  • Rice’s Theorem: Any non-trivial property of a program’s behavior is undecidable
    ■ Uh-oh! We can’t do anything. So much for this course...
  • Need to make some kind of approximation
    ■ Abstract the behavior of the program
    ■ ...and then analyze the abstraction
  • Seminal papers: Cousot and Cousot, 1977, 1979
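As a minimal sketch of "abstract the behavior, then analyze the abstraction", the following Python evaluates arithmetic expressions over the signs of integers rather than over the integers themselves. The sign domain, the tuple encoding of expressions, and every name here (`alpha`, `abs_add`, `abs_mul`, `abs_eval`) are assumptions chosen for illustration; none of them comes from the slides.

```python
# Abstract values: the sign of an integer, or TOP when the sign is unknown.
NEG, ZERO, POS, TOP = "-", "0", "+", "T"

def alpha(n):
    """Abstraction function: map a concrete integer to its sign."""
    return ZERO if n == 0 else (POS if n > 0 else NEG)

def abs_mul(a, b):
    """Abstract multiplication on signs."""
    if ZERO in (a, b):
        return ZERO              # anything times zero is zero
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

def abs_add(a, b):
    """Abstract addition on signs; mixed signs lose precision."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP  # covers TOP + x and (+) + (-)

def abs_eval(e):
    """Evaluate an expression over signs.
    Hypothetical encoding: an int, ("+", e1, e2) or ("*", e1, e2)."""
    if isinstance(e, int):
        return alpha(e)
    op, left, right = e
    f = abs_add if op == "+" else abs_mul
    return f(abs_eval(left), abs_eval(right))

print(abs_eval(("+", ("*", -2, -3), 4)))  # sign of (-2 * -3) + 4
print(abs_eval(("+", 5, -5)))             # sign of 5 + (-5)
```

The second call returns `TOP` even though the concrete result is 0: the abstraction is sound (it never reports a wrong sign) but not always precise, which is the price paid for decidability.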

CMSC 631 19

Control-Flow Graph

[Figure: control-flow graph annotated with the dataflow fact for x at each program point; the facts shown are x = 3, x = 6, x = ? and x = *]

CMSC 631 20

Lattices and Termination

  • Dataflow facts form a lattice
  • Each statement has a transformation function
    ■ Out(S) = Gen(S) ∪ (In(S) − Kill(S))
  • Terminates because
    ■ Finite height lattice
    ■ Monotone transformation functions

[Figure: lattice of dataflow facts for x, with elements x = ?, x = 3, x = 6, ..., x = *]
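The transfer equation Out(S) = Gen(S) ∪ (In(S) − Kill(S)) can be sketched as a simple fixed-point iteration. The example below is an illustration under stated assumptions: it uses reaching definitions as the analysis, takes In(S) to be the union of Out over the predecessors of S, and runs on a hypothetical four-node graph with two definitions of x named `d1` and `d2`. None of these specifics is taken from the slides.

```python
def reaching_definitions(nodes, preds, gen, kill):
    """Iterate Out(S) = Gen(S) | (In(S) - Kill(S)) until nothing changes.
    In(S) is the union of Out(P) over all predecessors P of S."""
    out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            in_n = set().union(*(out[p] for p in preds[n]))
            new_out = gen[n] | (in_n - kill[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out

# Hypothetical CFG: 1: x = 3 (d1); 2: branch; 3: x = 6 (d2); 4: join.
nodes = [1, 2, 3, 4]
preds = {1: [], 2: [1], 3: [2], 4: [2, 3]}
gen = {1: {"d1"}, 2: set(), 3: {"d2"}, 4: set()}
kill = {1: {"d2"}, 2: set(), 3: {"d1"}, 4: set()}

result = reaching_definitions(nodes, preds, gen, kill)
print(result[4])  # definitions of x that may reach the join point
```

The loop terminates for the two reasons the slide gives: each Out set only grows from one round to the next (monotone transfer functions), and it is bounded by the finite set of all definitions (finite lattice height).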

CMSC 631 21

Lambda Calculus

  • Three syntactic forms
    ■ x  variable
    ■ λx.e  function
    ■ e1 e2  function application
  • One reduction rule
    ■ (λx.e1) e2 → e1[e2/x]  (replace x by e2 in e1)
  • Can represent any computable function!

CMSC 631 22

Example

  • Conditionals
    ■ true = λx.λy.x   false = λx.λy.y
    ■ if a then b else c = a b c
  • if true then b else c = (λx.λy.x) b c → (λy.b) c → b
  • if false then b else c = (λx.λy.y) b c → (λy.y) c → c
  • Can also represent numbers, pairs, data structures, etc.
  • Result: Lingua franca of PL
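The standard Church encoding of booleans, in which a boolean is a function that selects one of its two arguments, can be tried directly with Python lambdas. This is a small sketch: the names `TRUE`, `FALSE`, `IF`, `AND`, `NOT` and `to_bool` are illustrative, and `AND` and `NOT` are extra examples that do not appear on the slide.

```python
# A Church boolean is a function that selects one of two arguments.
TRUE = lambda x: lambda y: x    # true  = λx.λy.x
FALSE = lambda x: lambda y: y   # false = λx.λy.y

def IF(a):
    """if a then b else c  =  a b c"""
    return lambda b: lambda c: a(b)(c)

# Extra connectives built from the same encoding (not from the slide).
AND = lambda p: lambda q: p(q)(FALSE)
NOT = lambda p: p(FALSE)(TRUE)

def to_bool(church):
    """Convert a Church boolean to a Python bool for inspection."""
    return church(True)(False)

print(IF(TRUE)("b")("c"))
print(IF(FALSE)("b")("c"))
print(to_bool(AND(TRUE)(FALSE)), to_bool(NOT(FALSE)))
```

One caveat: Python evaluates arguments eagerly, so both branches passed to `IF` are computed before one is selected. That differs from a reduction strategy that only reduces the chosen branch.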

CMSC 631 25

Operational Semantics

  • Evaluation is described operationally, as the execution of some abstract machine
    ■ Program states are reduced according to some transition relation →. An example is our lambda calculus rule:
    ■ (λx.e1) e2 → e1[e2/x]
  • There are different styles of abstract machine
    ■ Small-step (as above), big-step (a.k.a. natural semantics), SECD machine, …
  • The meaning of a program is its fully reduced form (a.k.a. a value)
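To make the small-step style concrete without the bookkeeping of substitution, here is a sketch of a transition relation for arithmetic expressions rather than for the lambda calculus. The tuple encoding and the function names (`is_value`, `step`, `evaluate`) are assumptions for illustration only.

```python
def is_value(e):
    """Values are fully reduced expressions: here, plain integers."""
    return isinstance(e, int)

def step(e):
    """One transition e -> e', reducing the leftmost reducible subexpression.
    Hypothetical encoding: an int, ("+", e1, e2) or ("*", e1, e2)."""
    op, left, right = e
    if not is_value(left):
        return (op, step(left), right)
    if not is_value(right):
        return (op, left, step(right))
    return left + right if op == "+" else left * right

def evaluate(e):
    """Apply -> repeatedly, recording every intermediate program state."""
    trace = [e]
    while not is_value(e):
        e = step(e)
        trace.append(e)
    return trace

for state in evaluate(("+", ("*", 2, 3), 4)):
    print(state)
```

The last element of the trace is the value, which is the meaning of the program in this style. A big-step semantics would instead relate the initial expression directly to that value, with no intermediate states.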

CMSC 631 26

Denotational Semantics

  • The meaning of a program is defined as a mathematical object, like a function or number
    ■ Rather than a sequence of machine states
  • The semantics is given in terms of an interpretation function [|.|]
    ■ Takes a program fragment as its argument and returns its meaning as the result, e.g., as a mathematical object
  • Things get interesting when trying to define denotations for recursive constructs

CMSC 631 27

Denotational Semantics example

  • b ::= true | false | b ∨ b | b ∧ b
  • e ::= 0 | 1 | … | e + e | e * e
  • s ::= e | if b then s else s
    ■ [| true |] = true
    ■ [| b1 ∨ b2 |] = [| b1 |] or [| b2 |]
    ■ [| if b then s1 else s2 |] =
        [|s1|] if [|b|] holds
        [|s2|] if [|b|] does not hold
    ■ How would we handle a while loop?

CMSC 631 28

Axiomatic Semantics

  • With the aforementioned semantics, we define the behavior of programs, and then reason about programs in terms of this behavior
    ■ Are two programs equivalent? Does a program terminate? Does a program implement a particular specification?
  • Alternatively, axiomatic semantics defines the meaning of a program as what one can prove about it
    ■ Hoare, Dijkstra, Gries, others
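Returning to the denotational style, the interpretation function [|.|] for the small language b / e / s can be sketched as three mutually dependent Python functions that map syntax to Python booleans and integers. The tuple encoding of syntax trees, the operator tags `"or"` and `"and"`, and the names `den_b`, `den_e`, `den_s` are all assumptions made for this illustration.

```python
def den_b(b):
    """[| b |] : maps a boolean expression to a Python bool."""
    if b == "true":
        return True
    if b == "false":
        return False
    op, b1, b2 = b
    if op == "or":
        return den_b(b1) or den_b(b2)
    return den_b(b1) and den_b(b2)   # op == "and"

def den_e(e):
    """[| e |] : maps an arithmetic expression to an integer."""
    if isinstance(e, int):
        return e
    op, e1, e2 = e
    return den_e(e1) + den_e(e2) if op == "+" else den_e(e1) * den_e(e2)

def den_s(s):
    """[| s |] : the meaning of a statement is the meaning of the branch
    selected by [| b |], or of the expression itself."""
    if isinstance(s, tuple) and s[0] == "if":
        _, b, s1, s2 = s
        return den_s(s1) if den_b(b) else den_s(s2)
    return den_e(s)

program = ("if", ("or", "false", "true"), ("+", 1, 2), ("*", 3, 4))
print(den_s(program))
```

Each function is compositional: the meaning of a phrase is computed only from the meanings of its immediate subphrases. A while loop does not fit this pattern directly, because its meaning would refer to itself, which is where recursive denotations become interesting.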

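CMSC 631 31

Simply-typed λ-calculus

  • e ::= x | n | λx:τ.e | e e
  • τ ::= int | τ → τ
  • A ⊢ e : τ  in type environment A, expression e has type τ

[Typing rules, including the side condition x ∈ dom(A) for variables]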
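The typing judgment "in type environment A, expression e has type τ" can be read as a recursive function from an environment and a term to a type. The sketch below assumes a standard presentation of the simply-typed λ-calculus with a single base type `int` and integer literals; the tuple encoding and the name `typeof` are hypothetical and not taken from the slides.

```python
def typeof(A, e):
    """Return tau such that A |- e : tau, or raise TypeError.
    Hypothetical encoding of terms:
      ("var", x), ("num", n), ("lam", x, tau, body), ("app", e1, e2)
    Types are "int" or ("->", tau1, tau2)."""
    tag = e[0]
    if tag == "var":
        x = e[1]
        if x not in A:                        # side condition: x in dom(A)
            raise TypeError(f"unbound variable {x}")
        return A[x]
    if tag == "num":
        return "int"
    if tag == "lam":
        _, x, tau, body = e
        # Check the body in the environment extended with x : tau.
        return ("->", tau, typeof({**A, x: tau}, body))
    if tag == "app":
        _, e1, e2 = e
        t1 = typeof(A, e1)
        t2 = typeof(A, e2)
        # The function's parameter type must match the argument's type.
        if isinstance(t1, tuple) and t1[0] == "->" and t1[1] == t2:
            return t1[2]
        raise TypeError("ill-typed application")
    raise ValueError(f"unknown term {e!r}")

identity = ("lam", "x", "int", ("var", "x"))
print(typeof({}, identity))
print(typeof({}, ("app", identity, ("num", 3))))
```

Applying a number to a number, as in `("app", ("num", 1), ("num", 2))`, raises `TypeError`, since `int` is not a function type.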

CMSC 631 32

Subtyping

  • Liskov:
    ■ If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.
  • Informal statement
    ■ If anyone expecting a T can be given an S instead, then S is a subtype of T.

CMSC 631 33

Other Technologies and Topics

  • Control-flow analysis
  • CFL reachability and polymorphism
  • Constraint-based analysis
  • Alias and pointer analysis
  • Region-based memory management
  • Garbage collection
  • Theorem proving
  • More …

CMSC 631 34
Applications: Abstract Interpretation

  • Polyspace
    ■ Looks for race conditions, out-of-bounds array accesses, null pointer dereferences, non-initialized data access, etc.
    ■ Also includes arithmetic equation solver
  • ASTREE
    ■ Used to detect all possible runtime failures (divide by zero, null pointer dereference, array out-of-bounds access) in embedded code
    ■ Used regularly on Airbus avionics software
  • Stacktool
    ■ Abstractly interprets machine code to check for possible stack overflow in embedded systems

CMSC 631 37

Applications: Type Systems

  • Type qualifiers
    ■ Format-string vulnerabilities, deadlocks, file I/O protocol errors, kernel security holes
  • Vault and Cyclone
    ■ Memory allocation and deallocation errors, library protocol errors, misuse of locks

CMSC 631 38

Applications: Proof Assistants

  • Twelf, Coq, Isabelle/HOL
    ■ Propositions can be expressed as types, and their proofs are expressed as terms having that type
    ■ Proposition: A → A, Proof: λx:A.x
    ■ Type checking thus becomes proof checking
    ■ Can be used for more convincing formal proofs, or even for proof-carrying code
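The propositions-as-types idea can be seen in a few lines of a proof assistant. The sketch below uses Lean 4, which is not one of the systems named on the slide and is swapped in here only for illustration. The theorem names are hypothetical.

```lean
-- Proposition: A → A.  Proof term: the identity function (λx:A.x).
theorem id_proof (A : Prop) : A → A :=
  fun x => x

-- A slightly larger example: from A and A → B, conclude B.
theorem modus_ponens (A B : Prop) : A → (A → B) → B :=
  fun a f => f a
```

In both cases the checker accepts the theorem exactly when the term after `:=` has the stated type, so checking the proof is the same operation as type checking the term.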

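CMSC 631 39

Conclusion

  • PL has a great mix of theory and practice
    ■ Very deep theory
    ■ But lots of practical applications
  • Recent exciting new developments
    ■ Focus on program correctness instead of speed
    ■ Forget about full correctness, though
    ■ Scalability to large programs essential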
  • Recent exciting new developments ■ Focus on program correctness instead of speed ■ Forget about full correctness, though ■ Scalability to large programs essential Conclusion