Memory Management and Garbage Collection in OCaml and Other Programming Languages (study notes, Programming Languages)

An overview of memory management and garbage collection in OCaml, focusing on the implementation of stacks using modules and closures. It also discusses the relationship between objects and closures, encoding objects with functions, and memory classes, and touches on memory management in other programming languages such as C, Ruby, and Java.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-icm-1
CMSC 330: Organization of Programming Languages

Objects vs. Functional Programming

OOP vs. FP

Object-oriented programming (OOP)

  • Computation as interactions between objects
  • Objects encapsulate mutable data (state)
    • Accessed / modified via object’s public methods

Functional programming (FP)

  • Computation as evaluation of functions
    • Mutable data used to improve efficiency
  • Higher-order functions implemented as closures
    • Closure = function + environment

An Integer “Stack” Abstraction in Java

    import java.util.NoSuchElementException;

    class Stack {
        // Linked-list cell holding one element
        class Node {
            Integer val; Node next;
            Node(Integer v, Node n) { val = v; next = n; }
        }
        private Node theStack;   // top of the stack, or null when empty
        void push(Integer v) {
            theStack = new Node(v, theStack);
        }
        Integer pop() {
            if (theStack == null)
                throw new NoSuchElementException();
            Integer temp = theStack.val;
            theStack = theStack.next;
            return temp;
        }
    }

A “Stack” Abstraction in OCaml

    module type STACK =
      sig
        type 'a stack
        val new_stack : unit -> 'a stack
        val push : 'a stack -> 'a -> unit
        val pop : 'a stack -> 'a
      end

    module Stack : STACK =
      struct
        type 'a stack = 'a list ref
        let new_stack () = ref []
        let push s x = s := (x::!s)
        let pop s = match !s with
            [] -> failwith "Empty stack"
          | (h::t) -> s := t; h
      end

Another “Stack” Abstraction in OCaml

    let new_stack () =
      let this = ref [] in
      let push x = this := (x::!this)
      and pop () = match !this with
          [] -> failwith "Empty stack"
        | (h::t) -> this := t; h
      in
      (push, pop)

    # let s = new_stack ();;
    val s : ('_a -> unit) * (unit -> '_a) = (<fun>, <fun>)
    # fst s 3;;   (* applies 1st part of s to 3 *)
    - : unit = ()
    # snd s ();;  (* applies 2nd part of s to () *)
    - : int = 3

Two OCaml Stack Implementations

1st implementation (OOP style)

  • Based on modules
  • Specifies methods for
    • Creating stack
    • Pushing value onto stack parameter
    • Popping value from stack parameter

2nd implementation (FP style)

  • Based on closures
  • Creating stack returns tuple containing
    • Closure for pushing value onto created stack
    • Closure for popping value from created stack

Relating Objects and Closures

An object...

  • Is a collection of fields (data)
  • ...and methods (code)
  • When a method is invoked
    • Method has implicit this parameter that can be used to access fields of object

A closure...

  • Is a pointer to an environment (data)
  • ...and a function body (code)
  • When a closure is invoked
    • Function has implicit environment that can be used to access variables

Relating Objects and Closures (cont.)

Java:

    class C {
        int x = 0;
        void set_x(int y) { x = y; }
        int get_x() { return x; }
    }

    C c = new C();
    c.set_x(3);
    int y = c.get_x();

OCaml:

    let make () =
      let x = ref 0 in
      ( (fun y -> x := y),
        (fun () -> !x) )

    let (set, get) = make ();;
    set 3;;
    let y = get ();;

[Figure: the object c holds the field x = 0; the closures fun y -> x := y and fun () -> !x share an environment holding x = ref 0.]

Encoding Objects with Functions

We can apply this transformation in general

    class C { f1 ... fn; m1 ... mn; }

  • becomes

    let make () =
      let f1 = ...
      ...
      and fn = ... in
      ( fun ... , (* body of m1 *)
        ...
        fun ...   (* body of mn *)
      )           (* tuple containing closures *)

  • make () is like the constructor
  • The closure environment contains the fields
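The same encoding can be sketched in Java, where lambdas play the role of the closures. This is only an illustration: the names ClosureObject, Accessors, and make are hypothetical, and a one-element array stands in for OCaml's ref cell because Java lambdas can capture only effectively final locals.

```java
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

// Hypothetical names; a sketch of "object = tuple of closures" in Java.
final class ClosureObject {
    // The pair of closures that stands in for the methods set_x and get_x.
    static final class Accessors {
        final IntConsumer set;
        final IntSupplier get;
        Accessors(IntConsumer set, IntSupplier get) {
            this.set = set;
            this.get = get;
        }
    }

    // Plays the role of the constructor: allocates the "field" and returns
    // closures that share it through their captured environment.
    static Accessors make() {
        int[] x = {0};  // one-element array acts like OCaml's `ref 0`
        return new Accessors(y -> x[0] = y, () -> x[0]);
    }
}
```

Each call to make() creates a fresh cell, so two results of make() behave like two distinct objects with their own private field.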

Recall a Useful Higher-Order Function

Map applies an arbitrary function f

  • To each element of a list
  • And returns the resulting modified list

Can we encode this in Java?

  • Using object oriented programming

    let rec map f = function
        [] -> []
      | (h::t) -> (f h)::(map f t)

A Map Method for Stack

Problem – Write a map method in Java

  • Must pass a function into another function

Solution

  • Can be done using an object with a known method
  • Use interface to specify what method must be present

    public interface Function {
        Integer eval(Integer arg);
    }

A Map Method for Stack (cont.)

Examples

  • Two classes which both implement Function interface

    class AddOne implements Function {
        // Interface methods are implicitly public, so the implementation must be public too
        public Integer eval(Integer arg) {
            return arg + 1;
        }
    }

    class MultTwo implements Function {
        public Integer eval(Integer arg) {
            return arg * 2;
        }
    }
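The map method itself does not appear in this excerpt. One possible way to write it, as a sketch only, is shown here. The class name IntStack (chosen to avoid clashing with java.util.Stack) and the helper mapNodes are assumptions, not names from the slides; the recursion mirrors the OCaml definition of map.

```java
import java.util.NoSuchElementException;

interface Function {
    Integer eval(Integer arg);
}

class AddOne implements Function {
    public Integer eval(Integer arg) { return arg + 1; }
}

// Hypothetical name for the slides' integer stack.
class IntStack {
    private static class Node {
        final Integer val;
        final Node next;
        Node(Integer v, Node n) { val = v; next = n; }
    }

    private Node theStack;  // top of the stack, or null when empty

    void push(Integer v) { theStack = new Node(v, theStack); }

    Integer pop() {
        if (theStack == null) throw new NoSuchElementException();
        Integer temp = theStack.val;
        theStack = theStack.next;
        return temp;
    }

    // Returns a new stack holding f.eval(v) for every v, in the same order.
    // The original stack is left unchanged.
    IntStack map(Function f) {
        IntStack result = new IntStack();
        result.theStack = mapNodes(theStack, f);
        return result;
    }

    // Mirrors the OCaml version: [] -> [] | h::t -> (f h)::(map f t)
    private static Node mapNodes(Node n, Function f) {
        if (n == null) return null;
        return new Node(f.eval(n.val), mapNodes(n.next, f));
    }
}
```

Passing new AddOne() to map is the Java counterpart of passing a function value in OCaml: the object exists only to carry its eval method.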

Garbage Collection

Memory Attributes

Memory to store data in programming languages has several attributes

  • Persistence (or lifetime)
    • How long the memory exists
  • Allocation
    • When the memory is available for use
  • Recovery
    • When the system recovers the memory for reuse

Memory Attributes (cont.)

Most programming languages are concerned with some subset of the following 4 memory classes

  1. Fixed (or static) memory
  2. Automatic memory
  3. Programmer allocated memory
  4. Persistent memory

Memory Classes

Static memory – Usually a fixed address in memory

  • Persistence – Lifetime of execution of program
  • Allocation – By compiler for entire execution
  • Recovery – By system when program terminates

Automatic memory – Usually on a stack

  • Persistence – Lifetime of method using that data
  • Allocation – When method is invoked
  • Recovery – When method terminates

Memory Classes (cont.)

Allocated memory – Usually memory on a heap

  • Persistence – As long as memory is needed
  • Allocation – Explicitly by programmer
  • Recovery – Either by programmer or automatically (when possible; depends on the language)

Memory Classes (cont.)

Persistent memory – Usually the file system

  • Persistence – Multiple executions of a program (e.g., files or databases)
  • Allocation – By program or user, often outside of program execution
  • Recovery – When data no longer needed
  • Note
    • Dealing with persistent memory → databases (CMSC 424)

Memory Management in C

Local variables live on the stack

  • Allocated at function invocation time
  • Deallocated when function returns
  • Storage space reused after function returns

Space on the heap allocated with malloc()

  • Must be explicitly freed with free()
  • Called explicit or manual memory management
    • Deletions must be done by the user

Memory Management Mistakes

May forget to free memory (memory leak)

    { int *x = (int *) malloc(sizeof(int)); }

May retain ptr to freed memory (dangling pointer)

    { int *x = ...malloc(); free(x); *x = 5; /* oops! */ }

May try to free something twice

    { int *x = ...malloc(); free(x); free(x); }

  • This may corrupt the memory management data structures
    • E.g., the memory allocator maintains a free list of space on the heap that’s available

Ways to Avoid Mistakes

Don’t allocate memory on the heap

  • Often impractical
  • Leads to confusing code (e.g., alloca() )

Never free memory

  • OS will reclaim process’s memory anyway at exit
  • Memory is cheap; who cares about a little leak?
  • LISP model – System halts program and reclaims unused memory when there is no more available

Use a garbage collector

  • E.g., conservative Boehm-Weiser collector for C

Memory Management in Ruby

Local variables live on the stack

  • Storage reclaimed when method returns

Objects live on the heap

  • Created with calls to Class.new

Objects never explicitly freed

  • Ruby uses automatic memory management
    • Uses a garbage collector to reclaim memory

Memory Management in OCaml

Local variables live on the stack

Tuples, closures, and constructed types live on the heap

  • let x = (3, 4) (* heap-allocated *)
  • let f x y = x + y in f 3 (* result heap-allocated *)
  • type 'a t = None | Some of 'a
  • None (* not on the heap – just a primitive *)
  • Some 37 (* heap-allocated *)

Garbage collection reclaims memory

Memory Management in Java

Local variables live on the stack

  • Allocated at method invocation time
  • Deallocated when method returns

Other data lives on the heap

  • Memory is allocated with new
  • But never explicitly deallocated
    • Java uses automatic memory management
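As a small illustration of the points above, the following sketch (class and method names are hypothetical) shows a local primitive, a heap object created with new, and the moment the object becomes unreachable. It describes the conceptual model only; an optimizing JVM is free to place data differently.

```java
// Hypothetical demo class; describes the conceptual memory model only.
final class HeapDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static int sumOfTemporary() {
        int local = 1;               // primitive local: lives in this method's frame
        Point p = new Point(3, 4);   // object allocated on the heap; p is a local reference
        int sum = local + p.x + p.y;
        p = null;                    // no pointer to the Point remains: eligible for GC
        return sum;                  // the frame (local, p, sum) is discarded on return
    }
}
```

There is no call that frees the Point; the collector reclaims it at some later time of its own choosing.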

Reference Counting

Old technique (1960)

Each object has count of number of pointers to it from other objects and from the stack

  • When count reaches 0, object can be deallocated

Counts tracked by either compiler or manually

To find pointers, need to know layout of objects

  • In particular, need to distinguish pointers from ints

Method works mostly for reclaiming memory

  • Doesn’t handle fragmentation problem
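The bookkeeping described above can be simulated in a few lines of Java. This is a toy model, not how any real runtime stores its counts: RcObject, addField, incRoot, and release are hypothetical names, and "freeing" is represented by setting a flag.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; all names are hypothetical.
final class RcObject {
    final String name;
    int count = 0;                                  // number of pointers to this object
    final List<RcObject> fields = new ArrayList<>(); // outgoing pointers
    boolean freed = false;

    RcObject(String name) { this.name = name; }

    // Store a pointer to target inside this object.
    void addField(RcObject target) {
        fields.add(target);
        target.count++;
    }

    // A pointer from the stack to o.
    static void incRoot(RcObject o) { o.count++; }

    // Drop one pointer to o. At zero the object is deallocated and every
    // pointer it held is released in turn (the cascading decrements).
    static void release(RcObject o) {
        o.count--;
        if (o.count == 0) {
            o.freed = true;
            for (RcObject child : o.fields) release(child);
            o.fields.clear();
        }
    }
}
```

For example, with stack → a → b → c and a second stack pointer to c, releasing the stack pointer to a frees a and b, while c survives with a count of 1.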

Reference Counting Example

[Figure sequence: a stack pointing to heap objects labeled with their reference counts (1, 2); images not preserved.]

Reference Counting Tradeoffs

Advantage

  • Incremental technique
    • Generally small, constant amount of work per memory write
    • With more effort, can even bound running time

Disadvantages

  • Cascading decrements can be expensive
  • Requires extra storage for reference counts
  • Can’t collect cycles, since counts never go to 0
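The cycle problem is easy to reproduce in a toy model. In the sketch below (Cell, pointTo, and run are hypothetical names) two cells point at each other; after the only stack pointer is dropped, both counts are still 1, so neither would ever be deallocated.

```java
// Toy model showing why reference counting cannot reclaim a cycle.
final class CycleDemo {
    static final class Cell {
        int count = 0;   // number of pointers to this cell
        Cell next;       // single outgoing pointer
    }

    static void pointTo(Cell from, Cell to) {
        from.next = to;
        to.count++;
    }

    // Returns the counts of the two cells after the stack pointer is dropped.
    static int[] run() {
        Cell a = new Cell();
        Cell b = new Cell();
        a.count++;        // stack -> a
        pointTo(a, b);    // a -> b
        pointTo(b, a);    // b -> a, closing the cycle
        a.count--;        // the stack pointer goes away
        return new int[] { a.count, b.count };
    }
}
```

Both cells are unreachable from the stack at the end of run(), yet each is kept "alive" by the other's pointer.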

Mark and Sweep GC

Idea

  • Only objects reachable from stack can be live

Every so often, stop the world and do GC

  • Mark all objects on stack as live
  • Mark object reachable from live object as live
    • Repeat until no more reachable objects
  • Deallocate any non-reachable objects

This is a tracing garbage collector

  • Does not handle fragmentation problem
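The mark and sweep steps above can be sketched as a simulation over a list of objects. This is a minimal model under stated assumptions: MarkSweepHeap, Obj, alloc, and collect are hypothetical names, the heap is a plain list, and the roots list stands in for the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-and-sweep collector; all names are hypothetical.
final class MarkSweepHeap {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();  // outgoing pointers
        boolean marked = false;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> heap = new ArrayList<>();   // every allocated object
    final List<Obj> roots = new ArrayList<>();  // pointers held on the stack

    Obj alloc(String name) {
        Obj o = new Obj(name);
        heap.add(o);
        return o;
    }

    // Runs one collection and returns the number of objects reclaimed.
    int collect() {
        // Mark phase: everything reachable from the roots is live.
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.marked) continue;
            o.marked = true;
            work.addAll(o.fields);
        }
        // Sweep phase: walk the whole heap, dropping unmarked objects.
        int freed = 0;
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;   // reset for the next collection
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }
}
```

Because liveness is decided by reachability rather than by counts, an unreachable cycle is reclaimed here, unlike under reference counting. Note that the sweep loop visits every object, live or dead.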

Mark and Sweep Example

[Figure sequence: objects reachable from the stack are marked, then unmarked objects are swept; images not preserved.]

Mark and Sweep Tradeoffs (cont.)

Disadvantages

  • Fragmentation
    • Available space broken up into many small pieces
      - Thus many mark-and-sweep systems may also have a compaction phase (like defragmenting your disk)
  • Cost proportional to heap size
    • Sweep phase needs to traverse whole heap – it touches dead memory to put it back on to the free list
  • Not appropriate for real-time applications
    • Bad if your car’s braking system performs GC while you are trying to stop at a busy intersection

Stop and Copy GC

Like mark and sweep, but only touches live objects

  • Divide heap into two equal parts (semispaces)
  • Only one semispace active at a time
  • At GC time, flip semispaces
    1. Trace the live data starting from the stack
    2. Copy live data into other semispace
    3. Declare everything in current semispace dead
    4. Switch to other semispace
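The four steps above can be sketched as follows. This is a simulation, not a real collector: the semispaces are Java lists, "copying" allocates a fresh Obj, and an identity map plays the role of forwarding pointers. CopyingHeap and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy stop-and-copy collector; all names are hypothetical.
final class CopyingHeap {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();  // outgoing pointers
        Obj(String name) { this.name = name; }
    }

    List<Obj> fromSpace = new ArrayList<>();    // active semispace
    List<Obj> toSpace = new ArrayList<>();      // idle semispace
    final List<Obj> roots = new ArrayList<>();  // pointers held on the stack

    Obj alloc(String name) {
        Obj o = new Obj(name);
        fromSpace.add(o);
        return o;
    }

    void collect() {
        Map<Obj, Obj> forward = new IdentityHashMap<>();  // old object -> its copy
        // Steps 1 and 2: trace from the roots, copying live data as it is found.
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), forward));
        }
        // Scan the copies in order; toSpace grows while it is being scanned.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.fields.size(); i++) {
                o.fields.set(i, copy(o.fields.get(i), forward));
            }
        }
        // Steps 3 and 4: whatever was not copied is dead; flip the semispaces.
        fromSpace = toSpace;
        toSpace = new ArrayList<>();
    }

    // Copies old into toSpace unless it was already copied.
    private Obj copy(Obj old, Map<Obj, Obj> forward) {
        Obj existing = forward.get(old);
        if (existing != null) return existing;
        Obj fresh = new Obj(old.name);
        fresh.fields.addAll(old.fields);  // still old pointers; fixed during the scan
        forward.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }
}
```

Dead objects are never visited: the work done is proportional to the live data, and the survivors end up packed together at the start of the new active semispace.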

Stop and Copy Example

[Figure sequence: live objects reachable from the stack are copied into the other semispace; images not preserved.]

Stop and Copy Tradeoffs

Advantages

  • Only touches live data
  • No fragmentation (automatically compacts)
    • Will probably increase locality

Disadvantages

  • Requires twice the memory space
  • Like mark and sweep, need to “stop the world”
    • Program must stop running to let garbage collector move around data in the heap

The Generational Principle

[Figure: plot with axes labeled “Object lifetime increases” and “More objects live”; image not preserved.]

“Young objects die quickly; old objects keep living”

Generational Collection

Long lived objects get copied over and over

  • Idea: Have more than one semispace, divide into generations
    • Older generations collected less often
    • Objects that survive many collections get pushed into older generations
    • Need to track pointers from old to young generations to use as roots for young generation collection

One popular setup

  • Generational stop and copy
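The bookkeeping behind a young-generation collection can be sketched as below. This is a simplified model, not a generational stop-and-copy implementation: objects are not moved, generations are plain lists, and the names GenerationalHeap, store, minorCollect, and the promotion threshold PROMOTE_AGE are all assumptions made for illustration. It also assumes every pointer store goes through store(), which acts as the write barrier, and it never prunes the remembered set (conservative, but still safe).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy generational bookkeeping; all names and the threshold are hypothetical.
final class GenerationalHeap {
    static final int PROMOTE_AGE = 2;  // minor collections survived before promotion

    static final class Obj {
        final List<Obj> fields = new ArrayList<>();
        int age = 0;          // minor collections survived so far
        boolean old = false;  // true once promoted
    }

    final List<Obj> youngGen = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();
    final List<Obj> roots = new ArrayList<>();
    // Old objects that may hold pointers into the young generation.
    final Set<Obj> remembered = Collections.newSetFromMap(new IdentityHashMap<>());

    Obj alloc() {
        Obj o = new Obj();
        youngGen.add(o);
        return o;
    }

    // Write barrier: records old-to-young pointers as they are created.
    void store(Obj from, Obj to) {
        from.fields.add(to);
        if (from.old && !to.old) remembered.add(from);
    }

    // Collects only the young generation; returns the number reclaimed.
    int minorCollect() {
        Set<Obj> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> work = new ArrayDeque<>(roots);
        for (Obj r : remembered) work.addAll(r.fields);  // extra roots
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.old || !live.add(o)) continue;  // never trace into the old generation
            work.addAll(o.fields);
        }
        List<Obj> survivors = new ArrayList<>();
        List<Obj> promoted = new ArrayList<>();
        for (Obj o : youngGen) {
            if (!live.contains(o)) continue;       // dead young object
            if (++o.age >= PROMOTE_AGE) promoted.add(o);
            else survivors.add(o);
        }
        int reclaimed = youngGen.size() - survivors.size() - promoted.size();
        for (Obj o : promoted) o.old = true;
        oldGen.addAll(promoted);
        // A newly promoted object may still point at young objects.
        for (Obj o : promoted) {
            for (Obj f : o.fields) {
                if (!f.old) { remembered.add(o); break; }
            }
        }
        youngGen.clear();
        youngGen.addAll(survivors);
        return reclaimed;
    }
}
```

The old generation is never traversed during a minor collection; the remembered set is what keeps a young object alive when its only referrer has already been promoted.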

Java HotSpot SDK 1.4.2 Collector

Multi-generational, hybrid collector

  • Young generation
    • Stop and copy collector
  • Tenured generation
    • Mark and sweep collector
  • Permanent generation
    • No collection

Questions

  • Why does using a copy collector for the youngest generation make sense?
  • What apps will be penalized by this setup?

More Issues in GC (cont.)

Stopping the world is a big hit

  • Unpredictable performance
    • Bad for real-time systems
  • Need to stop all threads
    • Without a much more sophisticated GC

One-size-fits-all solution

  • Sometimes, GC just gets in the way
  • But correctness comes first

What Does GC Mean to You?

Ideally, nothing

  • GC should make programming easier
  • GC should not affect performance (much)

Usually bad idea to manage memory yourself

  • Using object pools, free lists, object recycling, etc…
  • GC implementations have been heavily tuned
    • May be more efficient than explicit deallocation

If GC becomes a problem, hard to solve

  • You can set parameters of the GC
  • You can modify your program
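In Java, for instance, heap sizes are usually set on the command line with the java launcher options -Xms (initial heap) and -Xmx (maximum heap), and a program can inspect what it was given. The sketch below uses only standard library calls; the class GcInfo and its method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting the running JVM's GC configuration.
final class GcInfo {
    // Maximum heap size the JVM will attempt to use, in bytes (affected by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Names of the collectors active in this JVM. The result depends on the
    // JVM version and on the flags it was started with.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }
}
```

Checking these values before and after changing a flag is a simple way to confirm that a tuning parameter actually took effect.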