Type Systems in Programming Languages - Prof. Jeffrey S. Foster, Study notes of Computer Science

These notes introduce type systems in programming languages, with a definition, examples, and a detailed explanation of the simply-typed lambda calculus. They also cover type judgments, type environments, type equivalence, and recursive types, and go on to explore parametric polymorphism and System F.


Uploaded on 07/30/2009

Type Systems
CMSC 631 – Program Analysis and Understanding
Spring 2009

The Need for a Type System

  • Consider the (untyped) lambda calculus
    • false = $\lambda x.\lambda y.x$
    • 0 (Scott) = $\lambda x.\lambda y.x$
  • Everything is encoded as a function
    • So we can easily misuse combinators: false 0, if 0 then ..., etc.
    • This is no better than assembly language!

What is a Type System?

  • A type system is some mechanism for distinguishing good programs from bad
    • Good programs = well typed
    • Bad programs = ill typed or not typable
  • Examples:
    • 0 + 1 // well typed
    • false 0 // ill-typed: can’t apply a boolean
    • 1 + (if true then 0 else false) // ill-typed: can’t add boolean to integer

A Definition of Type Systems

“A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.”
– Benjamin Pierce, Types and Programming Languages

Simply-Typed Lambda Calculus

  • $e ::= n \mid x \mid \lambda x{:}t.e \mid e\ e$
    • Functions include the type of their argument
    • We don’t really need this, but it will come in handy
  • $t ::= \mathit{int} \mid t \to t$
    • $t_1 \to t_2$ is the type of a function that, given an argument of type $t_1$, returns a result of type $t_2$
    • $t_1$ is the domain, and $t_2$ is the range

Type Judgments

  • Our type system will prove judgments of the form
    • $A \vdash e : t$
    • “In type environment $A$, expression $e$ has type $t$”

Type Environments

  • A type environment is a map from variables to types (a kind of symbol table)
    • $\emptyset$ is the empty type environment
      • A closed term $e$ is well-typed if $\emptyset \vdash e : t$ for some $t$
      • We’ll abbreviate this as $\vdash e : t$
    • $A, x{:}t$ is just like $A$, except $x$ now has type $t$
      • The type of $x$ in $A, x{:}t$ is $t$
      • The type of $z \neq x$ in $A, x{:}t$ is the type of $z$ in $A$
  • When we see a variable in a program, we look in the type environment to find its type

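Type Rules

$$\frac{}{A \vdash n : \mathit{int}} \qquad \frac{x \in \mathrm{dom}(A)}{A \vdash x : A(x)}$$

$$\frac{A, x{:}t \vdash e : t'}{A \vdash \lambda x{:}t.e : t \to t'} \qquad \frac{A \vdash e_1 : t \to t' \quad A \vdash e_2 : t}{A \vdash e_1\ e_2 : t'}$$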
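The four type rules can be read directly as a recursive checker. The following is a minimal Python sketch, not part of the original slides. The tagged-tuple encoding of terms and types and the name `type_of` are assumptions made for illustration.

```python
# Assumed encoding (not from the slides):
#   types: "int" or ("->", domain, range)
#   terms: ("num", n), ("var", x), ("lam", x, t, body), ("app", e1, e2)

def type_of(env, e):
    """Return the type of term e in type environment env (a dict), or raise TypeError."""
    tag = e[0]
    if tag == "num":                       # A |- n : int
        return "int"
    if tag == "var":                       # x in dom(A) gives A |- x : A(x)
        if e[1] not in env:
            raise TypeError("unbound variable " + e[1])
        return env[e[1]]
    if tag == "lam":                       # check the body in A, x:t
        _, x, t, body = e
        return ("->", t, type_of({**env, x: t}, body))
    if tag == "app":                       # e1 must be a function whose domain matches e2
        t1 = type_of(env, e[1])
        t2 = type_of(env, e[2])
        if t1 == "int" or t1[1] != t2:
            raise TypeError("cannot apply %r to %r" % (t1, t2))
        return t1[2]
    raise ValueError("unknown term")

identity = ("lam", "x", "int", ("var", "x"))
print(type_of({}, identity))                        # ('->', 'int', 'int')
print(type_of({}, ("app", identity, ("num", 3))))   # int
```

Because every lambda carries its argument type, the checker never has to guess: each case corresponds to exactly one rule.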

Progress

  • Suppose $\vdash e : t$. Then either $e$ is a value, or there exists $e'$ such that $e \to e'$
  • Proof by induction on $e$
    • Base cases $n$, $\lambda x.e$: these are values, so we’re done
    • Base case $x$: can’t happen (empty type environment)
    • Inductive case $e_1\ e_2$: If $e_1$ is not a value, then by induction we can evaluate it, so we’re done, and similarly for $e_2$. Otherwise both $e_1$ and $e_2$ are values. Inspection of the type rules shows that $e_1$ must have a function type, and therefore must be a lambda since it’s a value. Therefore we can make progress.

Preservation

  • If $\vdash e : t$ and $e \to e'$ then $\vdash e' : t$
  • Proof by induction on $e \to e'$
    • Induction (easier than the base case!). Expression $e$ must have the form $e_1\ e_2$.
    • Assume $\vdash e_1\ e_2 : t$ and $e_1\ e_2 \to e'$. Then we have $\vdash e_1 : t' \to t$ and $\vdash e_2 : t'$.
    • Then there are three cases.
      • If $e_1 \to e_1'$, then by induction $\vdash e_1' : t' \to t$, so $e_1'\ e_2$ has type $t$
      • If the reduction is inside $e_2$, the argument is similar

Preservation, cont’d

  • Otherwise $(\lambda x.e)\ v \to e[v\backslash x]$. Then we have

$$\frac{x{:}t' \vdash e : t}{\vdash \lambda x.e : t' \to t}$$

    • Thus we have
      • $x{:}t' \vdash e : t$
      • $\vdash v : t'$
    • Then by the substitution lemma (not shown) we have
      • $\vdash e[v\backslash x] : t$
    • And so we have preservation

Substitution Lemma

  • If $A \vdash v : t$ and $A, x{:}t \vdash e : t'$, then $A \vdash e[v\backslash x] : t'$
  • Proof: Induction on the structure of $e$
  • For lazy semantics, we’d prove
    • If $A \vdash e_1 : t$ and $A, x{:}t \vdash e : t'$, then $A \vdash e[e_1\backslash x] : t'$

Soundness

  • So we have
    • Progress: Suppose $\vdash e : t$. Then either $e$ is a value, or there exists $e'$ such that $e \to e'$
    • Preservation: If $\vdash e : t$ and $e \to e'$ then $\vdash e' : t$
  • Putting these together, we get soundness
    • If $\vdash e : t$ then either there exists a value $v$ such that $e \to^* v$, or $e$ diverges (doesn’t terminate).
  • What does this mean?
    • Evaluation getting stuck is bad, so
    • “Well-typed programs don’t go wrong”
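To make "getting stuck" concrete, here is a small Python sketch of a call-by-value stepper. It is an illustration only: the tuple encoding and the names `step`, `subst`, and `evaluate` are assumptions, and substitution is only correct for closed values, which is all that closed programs need.

```python
# Assumed encoding: ("num", n), ("var", x), ("lam", x, t, body), ("app", e1, e2)

def is_value(e):
    # Values are numbers and lambdas.
    return e[0] in ("num", "lam")

def subst(e, x, v):
    """Compute e[v\\x] for a closed value v (so no variable capture can occur)."""
    tag = e[0]
    if tag == "num":
        return e
    if tag == "var":
        return v if e[1] == x else e
    if tag == "lam":
        if e[1] == x:                         # the binder shadows x: leave the body alone
            return e
        return ("lam", e[1], e[2], subst(e[3], x, v))
    return ("app", subst(e[1], x, v), subst(e[2], x, v))

def step(e):
    """Take one call-by-value step; return None when no rule applies."""
    if e[0] != "app":
        return None                           # values and free variables do not step
    f, a = e[1], e[2]
    if not is_value(f):
        f2 = step(f)
        return None if f2 is None else ("app", f2, a)
    if not is_value(a):
        a2 = step(a)
        return None if a2 is None else ("app", f, a2)
    if f[0] == "lam":
        return subst(f[3], f[1], a)           # (\x.e) v -> e[v\x]
    return None                               # stuck: applying a number

def evaluate(e):
    """Step until no rule applies; the result is a value or a stuck term."""
    while True:
        nxt = step(e)
        if nxt is None:
            return e
        e = nxt

identity = ("lam", "x", "int", ("var", "x"))
print(evaluate(("app", identity, ("num", 3))))   # ('num', 3)
print(step(("app", ("num", 3), ("num", 4))))     # None: stuck, and not a value
```

The second term, `3 4`, is neither a value nor able to step. It is also ill-typed, which is exactly the situation soundness rules out for well-typed programs.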

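Product Types (Tuples)

  • $e ::= \ldots \mid (e, e) \mid \mathit{fst}\ e \mid \mathit{snd}\ e$

$$\frac{A \vdash e : t \times t'}{A \vdash \mathit{fst}\ e : t} \qquad \frac{A \vdash e : t \times t'}{A \vdash \mathit{snd}\ e : t'} \qquad \frac{A \vdash e_1 : t \quad A \vdash e_2 : t'}{A \vdash (e_1, e_2) : t \times t'}$$

  • Or, maybe, just add functions
    • $\mathit{pair} : t \to t' \to t \times t'$
    • $\mathit{fst} : t \times t' \to t$
    • $\mathit{snd} : t \times t' \to t'$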

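Sum Types (Tagged Unions)

  • $e ::= \ldots \mid \mathit{inL}_{t_2}\ e \mid \mathit{inR}_{t_1}\ e \mid (\mathit{case}\ e\ \mathit{of}\ x_1{:}t_1 \to e_1 \mid x_2{:}t_2 \to e_2)$

$$\frac{A \vdash e : t_1}{A \vdash \mathit{inL}_{t_2}\ e : t_1 + t_2} \qquad \frac{A \vdash e : t_2}{A \vdash \mathit{inR}_{t_1}\ e : t_1 + t_2}$$

$$\frac{A \vdash e : t_1 + t_2 \quad A, x_1{:}t_1 \vdash e_1 : t \quad A, x_2{:}t_2 \vdash e_2 : t}{A \vdash (\mathit{case}\ e\ \mathit{of}\ x_1{:}t_1 \to e_1 \mid x_2{:}t_2 \to e_2) : t}$$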

Self Application and Types

  • Self application is not checkable in our system

$$\frac{\dfrac{A, x{:}? \vdash x : t \to t' \qquad A, x{:}? \vdash x : t}{A, x{:}? \vdash x\ x : \ldots}}{A \vdash \lambda x{:}?.\,x\ x : \ldots}$$

    • It would require a type $t$ such that $t = t \to t'$
      • (We’ll see this next, but so far...)
  • The simply-typed lambda calculus is strongly normalizing
    • Every program has a normal form
    • I.e., every program halts!

ML Datatypes Example

  • type intlist = Int of int | Cons of int * intlist
    • Equivalent to $\mu\alpha.\mathit{int} + (\mathit{int} \times \alpha)$
  • (Int 3) equivalent to
    • $\mathit{fold}\ (\mathit{inL}_{\mathit{int} \times \mu\alpha.\mathit{int}+(\mathit{int} \times \alpha)}\ 3)$
  • (Cons (2, (Int 3))) equivalent to
    • $\mathit{fold}\ (\mathit{inR}_{\mathit{int}}\ (2, \mathit{fold}\ (\mathit{inL}_{\mathit{int} \times \mu\alpha.\mathit{int}+(\mathit{int} \times \alpha)}\ 3)))$
  • match e with Int x -> e1 | Cons x -> e2 same as
    • $\mathit{case}\ (\mathit{unfold}\ e)\ \mathit{of}\ x{:}\mathit{int} \to e_1 \mid x{:}\mathit{int} \times (\mu\alpha.\mathit{int}+(\mathit{int} \times \alpha)) \to e_2$

Discussion

  • In the pure lambda calculus, every term is typable with recursive types
    • (Pure = variables, functions, applications only)
  • Most languages have some kind of “recursive” type
    • E.g., for data structures like lists, trees, etc.
  • However, usually two recursive types that define the same structure but use a different name are considered different
    • E.g., struct foo { int x; struct foo *next; } is different from struct bar { int x; struct bar *next; }

Recap

  • We’ve discussed simple types so far
    • Integers, functions, pairs, unions
    • Extensions for recursive types and updatable refs
  • Type systems have nice properties
    • Type checking is straightforward (needs annotations)
    • Well typed programs don’t go “wrong”
      • They don’t get stuck in the operational semantics
  • But... we can’t type check all good programs

Up Next: Improving Types

  • How can we build more flexible type systems?
    • More programs type check
    • Type checking is still tractable
  • How can we reduce the annotation burden?
    • Type inference

Parametric Polymorphism

  • Observation: $\lambda x.x$ returns its argument exactly and places no constraints on the type of $x$
    • The identity function works for any argument type
  • We can express this with universal quantification:
    • $\lambda x.x : \forall\alpha.\alpha \to \alpha$
    • For any type $\alpha$, the identity function has type $\alpha \to \alpha$
    • This is also known as parametric polymorphism

System F: annotated polymorphism

  • Let’s extend our system as follows:
    • $t ::= \alpha \mid \mathit{int} \mid t \to t \mid \forall\alpha.t$
    • $e ::= n \mid x \mid \lambda x.e \mid e\ e \mid \Lambda\alpha.e \mid e\ [t]$
  • That is, we add polymorphic types, and we add explicit type abstraction (generalization) …
    • Annotated code locations at which a value of polymorphic type is created
  • … and type application (instantiation)
    • Explicitly annotated code locations at which a value of polymorphic type is used
  • This system is due to Girard and, concurrently, Reynolds

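Defining Polymorphic Functions

  • Polymorphic functions map types to terms
    • Normal functions map terms to terms
  • Examples
    • $\Lambda\alpha.\lambda x{:}\alpha.x : \forall\alpha.\alpha \to \alpha$
    • $\Lambda\alpha.\Lambda\beta.\lambda x{:}\alpha.\lambda y{:}\beta.x : \forall\alpha.\forall\beta.\alpha \to \beta \to \alpha$
    • $\Lambda\alpha.\Lambda\beta.\lambda x{:}\alpha.\lambda y{:}\beta.y : \forall\alpha.\forall\beta.\alpha \to \beta \to \beta$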

Instantiation

  • When we use a parametric polymorphic type, we apply (or instantiate) it with a particular type
    • In System F this is done by hand:
    • $(\Lambda\alpha.\lambda x{:}\alpha.x)[t_1] : t_1 \to t_1$
    • $(\Lambda\alpha.\lambda x{:}\alpha.x)[t_2] : t_2 \to t_2$
  • This is where the term parametric comes from
    • The type $\forall\alpha.\alpha \to \alpha$ is a “function” in the domain of types, and it is passed a parameter at instantiation time
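Instantiation amounts to substituting a type for the bound variable of a quantified type. The Python sketch below illustrates that substitution. The tuple encoding and the names `subst_type` and `instantiate` are assumptions. To keep it short, it assumes the argument type has no free variables that clash with bound names, so it does not rename binders.

```python
# Assumed encoding of types:
#   "int", ("tvar", name), ("->", t1, t2), ("forall", name, t)

def subst_type(t, alpha, u):
    """Replace free occurrences of type variable alpha in t by u."""
    if t == "int":
        return t
    tag = t[0]
    if tag == "tvar":
        return u if t[1] == alpha else t
    if tag == "->":
        return ("->", subst_type(t[1], alpha, u), subst_type(t[2], alpha, u))
    if tag == "forall":
        if t[1] == alpha:                    # alpha is rebound here: nothing free below
            return t
        return ("forall", t[1], subst_type(t[2], alpha, u))
    raise ValueError("unknown type")

def instantiate(poly, u):
    """Type application: turn (forall a. t) and a type u into t with a replaced by u."""
    if poly == "int" or poly[0] != "forall":
        raise TypeError("not a polymorphic type")
    return subst_type(poly[2], poly[1], u)

a = ("tvar", "a")
id_type = ("forall", "a", ("->", a, a))
print(instantiate(id_type, "int"))                   # ('->', 'int', 'int')
print(instantiate(id_type, ("->", "int", "int")))    # ('->', ('->', 'int', 'int'), ('->', 'int', 'int'))
```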

Type Inference

  • Let’s consider the simply typed lambda calculus with integers
    • $e ::= n \mid x \mid \lambda x{:}t.e \mid e\ e$
    • (No parametric polymorphism)
  • Type inference: Given a bare term (with no type annotations), can we reconstruct a valid typing for it, or show that it has no valid typing?

Type Language

  • Problem: Consider the rule for functions

$$\frac{A, x{:}t \vdash e : t'}{A \vdash \lambda x{:}t.e : t \to t'}$$

  • Without type annotations, where do we get $t$?
    • We’ll use type variables to stand for as-yet-unknown types
      • $t ::= \alpha \mid \mathit{int} \mid t \to t$
    • We’ll generate equality constraints $t = t$ among the types and type variables
      • And then we’ll solve the constraints to compute a typing

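Type Inference Rules

$$\frac{}{A \vdash n : \mathit{int}} \qquad \frac{x \in \mathrm{dom}(A)}{A \vdash x : A(x)}$$

$$\frac{A, x{:}\alpha \vdash e : t' \quad \alpha\ \text{fresh}}{A \vdash \lambda x.e : \alpha \to t'} \qquad \frac{A \vdash e_1 : t_1 \quad A \vdash e_2 : t_2 \quad t_1 = t_2 \to \beta \quad \beta\ \text{fresh}}{A \vdash e_1\ e_2 : \beta}$$

  • In the application rule, $t_1 = t_2 \to \beta$ is the “generated” constraint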

Example

$$\frac{\dfrac{A, x{:}\alpha \vdash x : \alpha}{A \vdash (\lambda x.x) : \alpha \to \alpha} \qquad A \vdash 3 : \mathit{int} \qquad \alpha \to \alpha = \mathit{int} \to \beta}{A \vdash (\lambda x.x)\ 3 : \beta}$$

  • We collect all constraints appearing in the derivation into some set $C$ to be solved
  • Here, $C$ contains just $\alpha \to \alpha = \mathit{int} \to \beta$
    • Solution: $\alpha = \mathit{int} = \beta$
  • Thus this program is typable, and we can derive a typing by replacing $\alpha$ and $\beta$ by $\mathit{int}$ in the proof
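Constraint generation for this example can be sketched in a few lines of Python. This is an illustration under assumed names (`infer`, `fresh`, and the tuple encoding are not from the slides). It walks the term, invents a fresh type variable wherever the rules ask for one, and records one equality per application.

```python
import itertools

# Assumed encoding:
#   types: "int", ("tvar", name), ("->", t1, t2)
#   terms: ("num", n), ("var", x), ("lam", x, body), ("app", e1, e2)

def infer(env, e, constraints, fresh):
    """Return a type for e, appending generated equalities (lhs, rhs) to constraints."""
    tag = e[0]
    if tag == "num":
        return "int"
    if tag == "var":
        return env[e[1]]
    if tag == "lam":                           # the argument gets a fresh variable
        a = ("tvar", next(fresh))
        return ("->", a, infer({**env, e[1]: a}, e[2], constraints, fresh))
    if tag == "app":                           # generate t1 = t2 -> b with b fresh
        t1 = infer(env, e[1], constraints, fresh)
        t2 = infer(env, e[2], constraints, fresh)
        b = ("tvar", next(fresh))
        constraints.append((t1, ("->", t2, b)))
        return b
    raise ValueError("unknown term")

fresh = ("t%d" % i for i in itertools.count())
C = []
result = infer({}, ("app", ("lam", "x", ("var", "x")), ("num", 3)), C, fresh)
print(result)   # ('tvar', 't1')
print(C)        # [(('->', ('tvar', 't0'), ('tvar', 't0')), ('->', 'int', ('tvar', 't1')))]
```

Here `t0` plays the role of $\alpha$ and `t1` the role of $\beta$, so the single recorded constraint is $\alpha \to \alpha = \mathit{int} \to \beta$.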

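Solving Equality Constraints

  • We can solve the equality constraints using the following rewrite rules, which reduce a larger set of constraints to a smaller set
    • $C \cup \{\mathit{int} = \mathit{int}\} \Rightarrow C$
    • $C \cup \{\alpha = t\} \Rightarrow C[t\backslash\alpha]$
    • $C \cup \{t = \alpha\} \Rightarrow C[t\backslash\alpha]$
    • $C \cup \{t_1 \to t_2 = t_1' \to t_2'\} \Rightarrow C \cup \{t_1 = t_1'\} \cup \{t_2 = t_2'\}$
    • $C \cup \{\mathit{int} = t_1 \to t_2\} \Rightarrow$ unsatisfiable
    • $C \cup \{t_1 \to t_2 = \mathit{int}\} \Rightarrow$ unsatisfiable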

Termination

  • We can prove that the constraint solving algorithm terminates.
  • For each rewriting rule, either
    • We reduce the size of the constraint set
    • We reduce the number of “arrow” constructors in the constraint set
  • As a result, the constraint set always gets “smaller” and eventually becomes empty
    • A similar argument is made for strong normalization in the simply-typed lambda calculus

Occurs Check

  • We don’t have recursive types, so we shouldn’t infer them
  • So in the operation $C[t\backslash\alpha]$, require that $\alpha \notin FV(t)$
  • In practice, it may be better to allow $\alpha \in FV(t)$ and do the occurs check at the end
    • But that can be awkward to implement
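The rewrite rules and the occurs check can be combined into one small solver. The Python sketch below is an illustration, not the slides' algorithm verbatim: the names `solve`, `subst`, and `occurs` are assumptions, it performs the occurs check eagerly at each substitution, and it returns the accumulated substitution as a dictionary.

```python
# Assumed encoding of types: "int", ("tvar", name), ("->", t1, t2)

def subst(t, a, u):
    """Replace type variable a by u inside t."""
    if t == "int":
        return t
    if t[0] == "tvar":
        return u if t[1] == a else t
    return ("->", subst(t[1], a, u), subst(t[2], a, u))

def occurs(a, t):
    """True when type variable a is free in t."""
    if t == "int":
        return False
    if t[0] == "tvar":
        return t[1] == a
    return occurs(a, t[1]) or occurs(a, t[2])

def solve(constraints):
    """Return a dict from type-variable names to types, or raise TypeError."""
    work = list(constraints)
    solution = {}
    while work:
        s, t = work.pop()
        if s == t:                                   # e.g. int = int: drop it
            continue
        if s != "int" and s[0] == "tvar":            # alpha = t
            a, u = s[1], t
        elif t != "int" and t[0] == "tvar":          # t = alpha
            a, u = t[1], s
        elif s != "int" and t != "int":              # arrow = arrow: split it
            work.append((s[1], t[1]))
            work.append((s[2], t[2]))
            continue
        else:                                        # int = arrow, or arrow = int
            raise TypeError("unsatisfiable: int is not a function type")
        if occurs(a, u):
            raise TypeError("occurs check failed for " + a)
        work = [(subst(l, a, u), subst(r, a, u)) for l, r in work]
        solution = {k: subst(v, a, u) for k, v in solution.items()}
        solution[a] = u
    return solution

alpha, beta = ("tvar", "alpha"), ("tvar", "beta")
print(solve([(("->", alpha, alpha), ("->", "int", beta))]))
# {'alpha': 'int', 'beta': 'int'}
```

Without the occurs check, a constraint such as $\alpha = \alpha \to \mathit{int}$ would make the substitution step grow the type forever. Rejecting it is what keeps recursive types out of the solution.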

Unifying a Variable and a Type

  • Computing $C[t\backslash\alpha]$ by substitution is inefficient
  • Instead, use a union-find data structure to represent equal types
    • The terms are in a union-find forest
    • When a variable and a term are equated, we union them so they have the same ECR (equivalence class representative)
  • We want the ECR to be the concrete type with which variables have been unified, if one exists. We can then read off the solution by reading the ECR of each set.
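A union-find version of unification might look like the following Python sketch. The class `Node` and the helpers `var`, `arrow`, `find`, `unify`, and `read` are hypothetical names chosen for illustration. The sketch omits the occurs check, so it assumes the input constraints do not force a recursive type.

```python
class Node:
    """A type node in the union-find forest (an assumed representation)."""
    def __init__(self, label, kids=(), is_var=False):
        self.label = label          # "int", "->", or a variable name
        self.kids = list(kids)      # argument and result nodes for "->"
        self.is_var = is_var
        self.parent = self          # a root points to itself

def var(name):
    return Node(name, is_var=True)

def arrow(a, b):
    return Node("->", (a, b))

def find(n):
    """Return the ECR of n, halving paths along the way."""
    while n.parent is not n:
        n.parent = n.parent.parent
        n = n.parent
    return n

def unify(a, b):
    a, b = find(a), find(b)
    if a is b:
        return
    if a.is_var:
        a.parent = b                # a variable never stays ECR over a concrete type
    elif b.is_var:
        b.parent = a
    elif a.label == b.label and len(a.kids) == len(b.kids):
        a.parent = b
        for x, y in zip(a.kids, b.kids):
            unify(x, y)
    else:
        raise TypeError("cannot unify %s with %s" % (a.label, b.label))

def read(n):
    """Read the solved type off the ECR of n."""
    n = find(n)
    if n.is_var or not n.kids:
        return n.label
    return "(" + " -> ".join(read(k) for k in n.kids) + ")"

alpha, beta = var("alpha"), var("beta")
unify(arrow(alpha, alpha), arrow(Node("int"), beta))
print(read(alpha), read(beta))   # int int
```

Always pointing the variable at the other node is the design choice that keeps a concrete type as the representative whenever one exists in the class.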

Attempting Type Inference

  • Let’s extend the simply-typed calculus as follows:
    • $t ::= \alpha \mid \mathit{int} \mid t \to t \mid \forall\alpha.t$
    • $e ::= n \mid x \mid \lambda x.e \mid e\ e$
  • Type inference will automatically infer where to generalize a term, to introduce polymorphic types, and where to instantiate them

Instantiation

$$\frac{A \vdash e : \forall\alpha.t}{A \vdash e : t[t'\backslash\alpha]}$$

  • This rule is exactly the same as System F, but we just “magically” pick which $t'$ to instantiate with
    • You’re surely wondering about algorithmics. We’ll get to that …

Generalization

  • Question: When is it safe to generalize (quantify) a type variable $\alpha$ in the type of expression $e$?
  • Answer: Whenever we can redo the typing proof for $e$, choosing $\alpha$ to be anything we want, and still have a valid typing proof.

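Examples

$$\frac{A, x{:}\alpha \vdash x : \alpha}{A \vdash \lambda x.x : \alpha \to \alpha} \qquad \frac{A, x{:}\mathit{int} \vdash x : \mathit{int}}{A \vdash \lambda x.x : \mathit{int} \to \mathit{int}}$$

$$\frac{A, x{:}(\mathit{int} \to \mathit{int}) \vdash x : (\mathit{int} \to \mathit{int})}{A \vdash \lambda x.x : (\mathit{int} \to \mathit{int}) \to (\mathit{int} \to \mathit{int})}$$

  • The choice of the type of $x$ is purely local to type checking $\lambda x.x$
    • There is no interaction with the outside environment
    • Thus we can generalize the type of $x$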

Examples (cont’d)

$$\frac{A, x{:}\mathit{int} \vdash x + 3 : \mathit{int}}{A \vdash \lambda x.x+3 : \mathit{int} \to \mathit{int}}$$

  • The function restricts the type of $x$, so we cannot introduce a type variable
    • Thus we cannot generalize the type of $x$
    • We can only generalize when the function doesn’t “look at” its parameter

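Examples (cont’d)

$$\frac{A, y{:}\alpha, x{:}\alpha \vdash \mathit{if}\ p\ \mathit{then}\ x\ \mathit{else}\ y : \alpha}{A, y{:}\alpha \vdash \lambda x.\mathit{if}\ p\ \mathit{then}\ x\ \mathit{else}\ y : \alpha \to \alpha}$$

$$\frac{A, y{:}\alpha, x{:}\mathit{int} \vdash \mathit{if}\ p\ \mathit{then}\ x\ \mathit{else}\ y : \mathit{int}}{A, y{:}\alpha \vdash \lambda x.\mathit{if}\ p\ \mathit{then}\ x\ \mathit{else}\ y : \mathit{int} \to \mathit{int}}$$

  • The choice of the type of $x$ depends on the type environment
    • In the first derivation, $x$ and $y$ have the same type; if we generalize the type of $x$, they could have different types
    • Thus we cannot generalize the type of $x$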

Generalization Rule

  • We can generalize any type variable that is unconstrained by the environment
    • Warning: This won’t quite work with refs

$$\frac{A \vdash e : t \quad \alpha \notin FV(A)}{A \vdash e : \forall\alpha.t}$$
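The side condition $\alpha \notin FV(A)$ is easy to compute. The Python sketch below, with assumed names `free_vars` and `generalize` and an assumed tuple encoding, quantifies every variable of a type that the environment does not mention.

```python
# Assumed encoding of types:
#   "int", ("tvar", name), ("->", t1, t2), ("forall", name, t)

def free_vars(t):
    """Free type variables of t."""
    if t == "int":
        return set()
    if t[0] == "tvar":
        return {t[1]}
    if t[0] == "forall":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def generalize(env, t):
    """Quantify each variable in FV(t) that is not in FV(env); env maps names to types."""
    env_fv = set()
    for u in env.values():
        env_fv |= free_vars(u)
    for a in sorted(free_vars(t) - env_fv):
        t = ("forall", a, t)
    return t

a = ("tvar", "a")
print(generalize({}, ("->", a, a)))          # ('forall', 'a', ('->', ('tvar', 'a'), ('tvar', 'a')))
print(generalize({"y": a}, ("->", a, a)))    # ('->', ('tvar', 'a'), ('tvar', 'a'))
```

The second call mirrors the `if p then x else y` example: because `y` already has type `a` in the environment, `a` is constrained and stays unquantified.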

Another Justification

  • Suppose we have
    • $A \vdash e : t$ and $\alpha \notin FV(A)$
  • Then let $u$ be any type. By induction, we can show
    • $A[u\backslash\alpha] \vdash e : t[u\backslash\alpha]$
    • But then since $\alpha \notin FV(A)$, that’s equivalent to
    • $A \vdash e : t[u\backslash\alpha]$

Algorithm W

  • A type inference algorithm that explicitly solves the equality constraints on-line
  • Instead of implicit global substitution (like we used before), it threads the substitution through the inference
  • In practice, use the previous algorithm, plus generalize at let and instantiate at variable uses.
    • Solve for the type of $e_1$, generalize it, then instantiate its solution when doing inference on $e_2$

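Example

  • Parametric polymorphic type inference
    • let x = $\lambda x.x$ in // $x : \forall\alpha.\alpha \to \alpha$
    • x 3; // $x : \beta \to \beta$, $\beta = \mathit{int}$
    • x ($\lambda y.y$) // $x : \gamma \to \gamma$, $\gamma = \delta \to \delta$
  • This would be untypable in a monomorphic type system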
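The let-polymorphism example can be reproduced with a compact inference sketch in Python. Everything here is an assumption made for illustration (the names `infer`, `unify`, `resolve`, and `type_of`, the tuple encoding, and a mutable substitution dictionary threaded through inference). Sequencing `e1; e2` is encoded as `let _ = e1 in e2`, since the sketch has no separate sequencing form.

```python
import itertools

_ids = itertools.count()

def fresh():
    return ("tvar", "t%d" % next(_ids))

def resolve(t, s):
    """Apply substitution s (variable name -> type) to t, all the way down."""
    if t == "int":
        return t
    if t[0] == "tvar":
        return resolve(s[t[1]], s) if t[1] in s else t
    return ("->", resolve(t[1], s), resolve(t[2], s))

def free_vars(t):
    if t == "int":
        return set()
    if t[0] == "tvar":
        return {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def unify(a, b, s):
    a, b = resolve(a, s), resolve(b, s)
    if a == b:
        return
    if a != "int" and a[0] == "tvar":
        if a[1] in free_vars(b):
            raise TypeError("occurs check")
        s[a[1]] = b
    elif b != "int" and b[0] == "tvar":
        unify(b, a, s)
    elif a != "int" and b != "int":          # both are arrows
        unify(a[1], b[1], s)
        unify(a[2], b[2], s)
    else:
        raise TypeError("cannot unify int with a function type")

def infer(env, e, s):
    """env maps names to schemes (quantified variable names, type)."""
    tag = e[0]
    if tag == "num":
        return "int"
    if tag == "var":                         # instantiate at each variable use
        qs, t = env[e[1]]
        return resolve(t, {q: fresh() for q in qs})
    if tag == "lam":                         # lambda-bound names stay monomorphic
        a = fresh()
        return ("->", a, infer({**env, e[1]: ((), a)}, e[2], s))
    if tag == "app":
        f, x, r = infer(env, e[1], s), infer(env, e[2], s), fresh()
        unify(f, ("->", x, r), s)
        return r
    if tag == "let":                         # generalize at let
        t1 = resolve(infer(env, e[2], s), s)
        env_fv = set()
        for qs, t in env.values():
            env_fv |= free_vars(resolve(t, s)) - set(qs)
        scheme = (tuple(free_vars(t1) - env_fv), t1)
        return infer({**env, e[1]: scheme}, e[3], s)
    raise ValueError("unknown term")

def type_of(e):
    s = {}
    return resolve(infer({}, e, s), s)

ident = ("lam", "x", ("var", "x"))
uses = ("let", "_", ("app", ("var", "x"), ("num", 3)),
        ("app", ("var", "x"), ("lam", "y", ("var", "y"))))
poly = ("let", "x", ident, uses)             # let x = \x.x in x 3; x (\y.y)
mono = ("app", ("lam", "x", uses), ident)    # the same uses with x lambda-bound
```

Calling `type_of(poly)` yields a type of the form $\delta \to \delta$ for some fresh variable, while `type_of(mono)` raises `TypeError`, because a lambda-bound `x` must be both $\mathit{int} \to \beta$ and a function on functions.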

Kinds of Polymorphism

  • We’ve just seen parametric polymorphism
    • System F and Hindley-Milner style polymorphism
  • Another popular form is subtype polymorphism
    • As in OO programming
    • These two can be combined (e.g., Java Generics)
  • Some languages also have ad-hoc polymorphism
    • E.g., + operator that works on ints and floats
    • E.g., overloading in Java

An Imperative Language

  • $e ::= x \mid \lambda x.e \mid e\ e$
    • $\mid \mathit{ref}\ e$ (allocation)
    • $\mid\ !e$ (dereference)
    • $\mid e := e$ (assignment)
    • $\mid e; e$ (sequencing)
  • Notice that this is not C
    • Variables cannot be updated; only references can
    • I.e., there are no l-values or r-values
  • This is a language with updatable references

Examples

  • $!(\mathit{ref}\ 0)$
  • $\mathit{let}\ x = \mathit{ref}\ 0\ \mathit{in}\ x := {!x} + 1$
  • $\mathit{let}\ x = \mathit{ref}\ 0\ \mathit{in}\ \lambda y.\ x := {!x} + 1;\ {!x}$
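For readers more used to mainstream languages, the three programs can be mimicked in Python with an explicit cell object. The class `Ref` and the function `make_counter` are hypothetical names for this illustration. Python variables are themselves mutable, so the sketch only models the ref operations, not the restriction that variables cannot be updated.

```python
class Ref:
    """A mutable cell: Ref(v) models `ref v`, get() models `!e`, set() models `e1 := e2`."""
    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value

    def set(self, value):
        self.value = value      # returns None, playing the role of unit

# !(ref 0)
first = Ref(0).get()

# let x = ref 0 in x := !x + 1
x = Ref(0)
x.set(x.get() + 1)

# let x = ref 0 in \y. x := !x + 1; !x
def make_counter():
    cell = Ref(0)               # allocated once, shared by every call of bump
    def bump(y):
        cell.set(cell.get() + 1)
        return cell.get()
    return bump

counter = make_counter()
print(first, x.get(), counter(None), counter(None))   # 0 1 1 2
```

The third program returns a function that closes over the reference, so each call observes the updates made by earlier calls.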

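Type Checking Rules

  • $t ::= \ldots \mid \mathit{ref}\ t$
    • Note: in ML this type is written t ref

$$\frac{A \vdash e : t}{A \vdash \mathit{ref}\ e : \mathit{ref}\ t} \qquad \frac{A \vdash e : \mathit{ref}\ t}{A \vdash\ !e : t} \qquad \frac{A \vdash e_1 : \mathit{ref}\ t \quad A \vdash e_2 : t}{A \vdash e_1 := e_2 : t}$$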

Unit and the Unit Type

  • Sometimes in imperative programs we write expressions that have some side effect but no interesting result
  • To represent this directly, use unit:
    • $e ::= \ldots \mid ()$
    • $t ::= \ldots \mid \mathit{unit}$

$$\frac{}{A \vdash () : \mathit{unit}} \qquad \frac{A \vdash e_1 : \mathit{ref}\ t \quad A \vdash e_2 : t}{A \vdash e_1 := e_2 : \mathit{unit}}$$

Operational Semantics

  • Now we need to keep track of memory
    • State is a map from locations to values
    • Our redexes will be tuples ‹State, expression›
    • As a consequence, order of evaluation matters
  • As before, evaluation will yield a fully-evaluated term, also called a value
    • $v ::= x \mid \lambda x.e$
    • $e ::= v \mid e\ e \mid \mathit{ref}\ e \mid\ !e \mid e := e$

Solution: The Value Restriction

  • Only allow values to be generalized
    • $v ::= x \mid n \mid \lambda x.e$
    • $e ::= v \mid e\ e \mid \mathit{ref}\ e \mid\ !e \mid e := e$
    • Intuition: Values cannot later be updated
    • This solution is due to Wright and Felleisen
      • Tofte found a much more complicated solution

$$\frac{A \vdash v : t_1 \quad A, x{:}\forall\vec{\alpha}.t_1 \vdash e_2 : t_2 \quad \vec{\alpha} = FV(t_1) - FV(A)}{A \vdash \mathit{let}\ x = v\ \mathit{in}\ e_2 : t_2}$$
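The restriction is a purely syntactic test on the let-bound expression. The Python sketch below shows one way it might be wired into generalization. The names `is_value` and `let_scheme` and the tuple encoding are assumptions, and `env_fv` stands for the already-computed set $FV(A)$.

```python
# Assumed encoding:
#   terms: ("var", x), ("num", n), ("lam", x, body), ("app", e1, e2), ("ref", e), ...
#   types: "int", "unit", ("tvar", name), ("->", t1, t2), ("ref", t)

def is_value(e):
    # v ::= x | n | \x.e
    return e[0] in ("var", "num", "lam")

def free_vars(t):
    """Free type variables of t."""
    if isinstance(t, str):
        return set()
    if t[0] == "tvar":
        return {t[1]}
    out = set()
    for child in t[1:]:          # covers ("->", t1, t2) and ("ref", t)
        out |= free_vars(child)
    return out

def let_scheme(env_fv, e1, t1):
    """Return (quantified variables, type) for the binding in `let x = e1 in ...`."""
    if is_value(e1):
        return (sorted(free_vars(t1) - env_fv), t1)
    return ([], t1)              # not a value: keep the type monomorphic

a = ("tvar", "a")
id_type = ("->", a, a)
ident = ("lam", "x", ("var", "x"))
print(let_scheme(set(), ident, id_type))                      # (['a'], ('->', ('tvar', 'a'), ('tvar', 'a')))
print(let_scheme(set(), ("ref", ident), ("ref", id_type)))    # ([], ('ref', ('->', ('tvar', 'a'), ('tvar', 'a'))))
```

In the second call, `ref (\x.x)` is an allocation rather than a value, so its type variable is not quantified and every later use of the cell must agree on a single type.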

Benefits of Type Inference

  • Handles higher-order functions
  • Handles data structures smoothly
  • Works in infinite domains
    • Set of types is unlimited
  • No forward/backward distinction
  • Polymorphism provides context-sensitivity

Drawbacks to Type Inference

  • Flow-insensitive
    • Types are the same at all program points
    • May produce coarse results
    • Type inference failure can be hard to understand
  • Polymorphic type inference may not scale
    • Exponential in worst case
    • Seems fine in practice (witness ML)
