Lecture Notes on
Polymorphism
15-411: Compiler Design
Frank Pfenning
Lecture 24
November 14, 2013
1 Introduction
Polymorphism in programming languages refers to the possibility that a function
or data structure can accommodate data of different types. There are two principal
forms of polymorphism: ad hoc polymorphism and parametric polymorphism. Ad hoc
polymorphism allows a function to compute differently, based on the type of the
argument. Parametric polymorphism means that a function behaves uniformly
across the various types [Rey74].
In C0, the equality == and disequality != operators are ad hoc polymorphic:
they can be applied to small types (int, bool, τ∗, τ[ ], and also char, which we
don’t have in L4), and they behave differently at different types (32 bit vs
64 bit comparisons). Common examples from other languages are arithmetic
operators, so that e1 + e2 could be addition of integers or floating point
numbers or even concatenation of strings. Type checking should resolve the
ambiguities and translate the expression to the correct internal form.
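
As a rough illustration of how type checking might resolve this kind of overloading, the following Python sketch maps == and != to width-specific internal operators. The type encoding and the names eq32, neq32, eq64 and neq64 are assumptions made for this example and are not taken from the course compiler. The special typing of NULL is ignored.

```python
# Hypothetical type encoding: "int", "bool", ("ptr", t), ("array", t).
SMALL_32 = {"int", "bool"}          # assumed to be compared as 32-bit values


def is_reference(tau):
    """Pointers and arrays are assumed to be 64-bit addresses."""
    return isinstance(tau, tuple) and tau[0] in ("ptr", "array")


def elaborate_eq(op, tau1, tau2):
    """Resolve an overloaded == or != into a width-specific internal operator."""
    if op not in ("==", "!="):
        raise ValueError(f"not an equality operator: {op}")
    if tau1 != tau2:
        raise TypeError(f"operands of {op} have different types")
    name = "eq" if op == "==" else "neq"
    if tau1 in SMALL_32:
        return name + "32"
    if is_reference(tau1):
        return name + "64"
    raise TypeError(f"{op} is not defined at type {tau1}")
```

For instance, elaborate_eq("==", "int", "int") selects the 32-bit comparison, while the same operator at a pointer type selects the 64-bit one.
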
The language extension of void∗ we discussed in Assignment 4 is a (somewhat
borderline) example of parametric polymorphism, as long as we do not add a
construct hastype(τ, e) or eqtype(e1, e2) into the language and as long as the
execution does not raise a dynamic tag exception. It should therefore be
considered somewhat borderline parametric, since implementations must treat it
uniformly but a dynamic tag error depends on the run-time type of a polymorphic
value.
Generally, whether polymorphism is parametric depends on all the details of
the language definition. The importance of parametricity for data abstraction in
language implementations cannot be overstated. Failure of parametricity often
means failure of data abstraction: an implementation of a generic data structure
cannot necessarily be replaced by another one (even if it is correct!) without
breaking a client.

2 Parametric Polymorphism

The prototypical example of a parametric function is the identity function, λx. x : α → α. In C0, we might write this as

a id(a x) { return x; }

which interprets the undefined type name a as a type variable whose scope is the current function. The projection function, which ignores its second argument, would be

a proj(a x, b y) { return x; }

with both a and b as type variables. From this we extract an abstract form of definition:

id : ∀a. (a) → a
proj : ∀a, b. (a, b) → a

When type-checking the body of a function, the free type variables in the function definition are treated like new basic types. In particular, they are not subject to instantiation, since in the end the function has to work for all types. To account for this we allow a new form of declaration a : type in our typing context Γ.

When type-checking the use of a polymorphic function, we can instantiate the type variables to other types. For example,

if (id(true)) return id(id(4));

should be well-typed. In order to formalize this we will need a substitution θ for the (quantified) type variables from the definition of a function, using concrete types and other type variables declared in the context. We write

\[ \Gamma \vdash \theta : (a_1, \ldots, a_k) \]

if θ substitutes types that are well-formed in Γ for the type variables $a_1, \ldots, a_k$. Furthermore, we write θ(τ) for the result of applying the substitution θ to the type τ. Our typing rule then shapes up as follows:

\[
\frac{f : \forall a_1, \ldots, a_k.\,(\tau_1, \ldots, \tau_n) \to \tau \qquad \Gamma \vdash \theta : (a_1, \ldots, a_k) \qquad \Gamma \vdash e_1 : \theta(\tau_1) \;\cdots\; \Gamma \vdash e_n : \theta(\tau_n)}{\Gamma \vdash f(e_1, \ldots, e_n) : \theta(\tau)}
\]

4 Pairs

We can easily define a product type, which would usually be written as a ∗ b in a functional language.

struct prod<a,b> { a fst; b snd; };

typedef struct prod<a,b>* prod<a,b>;

a fst(prod<a,b> p) { return p->fst; }

b snd(prod<a,b> p) { return p->snd; }

prod<a,b> pair(a x, b y) {
  prod<a,b> p = alloc(struct prod<a,b>);
  p->fst = x;
  p->snd = y;
  return p;
}

5 Function Pointers

Polymorphism in data structures is severely handicapped unless we can store function pointers. For example, a hash table may be parameterized by a type key for keys and a type a for the elements stored in the table. We store in the header functions to hash a key value, to compare keys, and to extract a key from an element.

struct ht_header<key,a> {
  int size;                             /* size >= 0 */
  int capacity;                         /* capacity > 0 */
  list<a>*[] table;                     /* \length(table) == capacity */
  int (*hash)(key k);                   /* hash function */
  bool (*key_equal)(key k1, key k2);    /* key comparison */
  key (*elem_key)(a elem);              /* extracting key from element */
};

typedef struct ht_header<key,a>* ht<key,a>;

a* ht_lookup(ht<key,a> H, key k)
//@requires is_ht(H);
{
  int i = (H->hash)(k);
  list<a>* p = H->table[i];
  while (p != NULL) {
    //@assert p->data != NULL;
    if ((H->key_equal)((H->elem_key)(p->data), k))
      return p->data;
    else
      p = p->next;
  }
  /* not in list */
  return NULL;
}
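
The role of the stored functions can be seen in a small Python analogue, where the header holds three callables in place of the function pointers. This is a sketch and not a translation of the C0 code: ht_insert is a hypothetical helper that does not check for duplicate keys, Python lists stand in for the chains, and the index is reduced modulo the capacity, which is an assumption the code above does not make.

```python
class HashTable:
    """Header storing the table together with its key-handling functions."""

    def __init__(self, capacity, hash_fn, key_equal, elem_key):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.size = 0
        self.capacity = capacity
        self.table = [[] for _ in range(capacity)]  # one chain per bucket
        self.hash = hash_fn              # key -> int
        self.key_equal = key_equal       # (key, key) -> bool
        self.elem_key = elem_key         # element -> key


def ht_insert(H, elem):
    """Hypothetical helper: add elem to the chain selected by its key."""
    i = H.hash(H.elem_key(elem)) % H.capacity
    H.table[i].append(elem)
    H.size += 1


def ht_lookup(H, k):
    """Return the element with key k, or None if it is not in the table."""
    i = H.hash(k) % H.capacity
    for elem in H.table[i]:
        if H.key_equal(H.elem_key(elem), k):
            return elem
    return None


# Usage: elements are (word, count) pairs and the key is the word.
H = HashTable(2, len, lambda k1, k2: k1 == k2, lambda elem: elem[0])
for entry in [("id", 1), ("proj", 2), ("fst", 3)]:
    ht_insert(H, entry)
```

The lookup code never inspects keys or elements directly. Everything it knows about them comes through the three stored functions, which is what allows the same table code to serve any choice of key and element types.
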

6 Interactions With Other Language Features

The interactions between parametric and ad hoc polymorphism are often tricky. In C0 with parametric polymorphism, the main issue arises with equality. What if we have e1 == e2 where e1 and e2 are of type a? If a stands for a small type, this might be feasible, but there is still a difference between 32-bit and 64-bit comparisons. Alternatively, we could simply rule this out. This would suggest itself in particular in C0 with a type string, which is not subject to equality testing.

A general approach to interactions between ad hoc and parametric polymorphism is type classes as they are used in Haskell. In lecture, students proposed some extensions of the above so that polymorphism can be limited to type classes. Since I did not take any pictures of the blackboard at the time, these extensions are lost to posterity unless someone sends me some suggestions.

7 Type Inference

Often associated with parametric polymorphism is the idea of type inference. For the polymorphic part of the language, this actually presents rather few problems, since the scope of type variables is naturally delineated by function definitions. However, in C0 there is a problem with field selection, e.f. Since fields are global and can freely be shared between different structs, it will be difficult to disambiguate uses of the field name f and therefore the type of e.
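
The difficulty with field selection can be illustrated with a short Python sketch. The struct table and the function names below are invented for the example. The point is only that a field name shared between structs gives an inference procedure more than one candidate for the type of e.

```python
# Hypothetical struct table: struct name -> {field name: field type}.
STRUCTS = {
    "point": {"x": "int", "y": "int"},
    "cell": {"x": "bool", "next": ("ptr", ("struct", "cell"))},
}


def candidates_for_field(structs, field):
    """All (struct name, field type) pairs for structs declaring the field."""
    return [(name, fields[field])
            for name, fields in structs.items()
            if field in fields]


def infer_field_access(structs, field):
    """Infer the struct type of e and the type of e.field from the field name alone."""
    candidates = candidates_for_field(structs, field)
    if len(candidates) != 1:
        raise TypeError(f"cannot infer the type of e in e.{field}: "
                        f"{len(candidates)} candidate structs")
    return candidates[0]
```

Here e.next determines that e is a struct cell, but e.x is ambiguous between point and cell, so the type of e would have to come from elsewhere, for example from a declaration.
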