

Two topics from the CS61B course at UC Berkeley: generational garbage collection and augmenting data structures. Generational garbage collection separates objects into young and old generations, with the young generation divided into Eden and two survivor spaces. References from old objects to young objects are kept in a special table so that a minor collection can find them. Augmenting data structures means adding "extra" abilities to existing data structures, such as combining a hash table with a 2-3-4 tree to determine the next smaller and larger keys in O(1) time, or recording size and height information in splay tree nodes and maintaining it during rotations.


CS61B: Lecture 41
Friday, December 3, 2010

Generational Garbage Collection
-------------------------------
Studies of memory allocation have shown that most objects allocated by most
programs have short lifetimes, while a few go on to survive through many
garbage collections.  This observation has inspired generational garbage
collectors, which separate old from new objects.

A generational collector has two or more generations, which are like the
separate spaces used by copying collectors, except that the generations can be
of different sizes, and can change size during a program’s lifetime.

Sun’s 1.3 JVM divides objects into an old generation and a young generation.
Because old objects tend to last longer, the old generation doesn’t need to be
garbage collected nearly as often.  Hence, the old generation uses a compacting
mark-and-sweep collector, because speed is not critical, but memory efficiency
might be.  Because old objects are long-lived, and because mark and sweep only
uses one memory space, the old generation tends to remain compact.

The young generation is itself divided into three areas.  The largest area is
called "Eden", and it is the space where all objects are born, and most die.
Eden is large enough that most objects in it will become garbage long before it
gets full.  When Eden fills up, it is garbage collected and the surviving
objects are copied into one of two survivor spaces.  The survivor spaces are
just the two spaces of a copying garbage collector.

If an unexpectedly large number of objects survive Eden, the survivor spaces
can expand if necessary to make room for additional objects.

Objects move back and forth between the two survivor spaces until they age
enough to be tenured - moved to the old generation.  Young objects benefit
from the speed of the copying collector while they’re still wild and prone to
die young.

Thus, the Sun JVM takes advantage of the best features of both the
mark-and-sweep and copying garbage collection methods.

There are two types of garbage collection:  minor collections, which happen
frequently but only affect the young generation - thereby saving lots of time -
and major collections, which happen much less often but cover all the objects
in memory.

This introduces a problem.  Suppose a young object is live only because an old
object references it.  How does the minor collection find this out, if it
doesn’t search the old generation?

References from old objects to young objects tend to be rare, because old
objects are set in their ways and don’t change much.  Since references from old
objects to young are so rare, the JVM keeps a special table of them, which it
updates whenever such a reference is created.  The table of references is added
to the roots of the young generation’s copying collector.
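
  -------------------------------------------
  | old generation                          |
  |                                         |
  -------------------------------------------
  | young generation                        |
  |  -------------------------------------  |
  |  | survivor space                    |  |
  |  -------------------------------------  |
  |  | survivor space                    |  |
  |  -------------------------------------  |
  |  | Eden                              |  |
  |  |                                   |  |
  |  -------------------------------------  |
  -------------------------------------------
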
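
The role of the old-to-young reference table can be illustrated with a toy
model.  The sketch below is not how any real JVM is implemented; the class and
method names (ToyObject, ToyGenerationalHeap, writeRef, liveYoung) are
hypothetical, objects never change generation, and references are never
removed.  It only shows that a minor collection can find every live young
object by starting from the ordinary roots plus the table, without traversing
the old generation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy heap object: a name, a generation flag, and outgoing references. */
class ToyObject {
    final String name;
    final boolean old;                        // true if in the old generation
    final List<ToyObject> refs = new ArrayList<>();
    ToyObject(String name, boolean old) { this.name = name; this.old = old; }
}

/** Hypothetical model of a young generation with an old-to-young table. */
class ToyGenerationalHeap {
    final List<ToyObject> stackRoots = new ArrayList<>();
    // Young objects that some old object references (the "special table").
    final Set<ToyObject> oldToYoung = new HashSet<>();

    /** All reference stores go through here so the table stays up to date. */
    void writeRef(ToyObject from, ToyObject to) {
        from.refs.add(to);
        if (from.old && !to.old) oldToYoung.add(to);
    }

    /** Minor collection: find the live young objects without scanning old ones. */
    Set<ToyObject> liveYoung() {
        Set<ToyObject> live = new HashSet<>();
        List<ToyObject> work = new ArrayList<>(oldToYoung);  // table entries are roots
        work.addAll(stackRoots);
        while (!work.isEmpty()) {
            ToyObject o = work.remove(work.size() - 1);
            if (o.old || !live.add(o)) continue;  // skip old objects and repeats
            work.addAll(o.refs);
        }
        return live;
    }
}
```

In this model the table is conservative:  a young object referenced only by a
dead old object stays alive until something outside the minor collection
clears its table entry.
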
AUGMENTING DATA STRUCTURES
==========================
Once you know how to design one of the data structures taught in this class,
it’s sometimes easy to augment it to have "extra" abilities.

You’ve already augmented data structures in Project 3.  For example, the set E
of edges is stored as both a hash table and an adjacency list.  The hash table
allows you to test set membership in O(1) time, unlike the adjacency list.  The
adjacency list tells you the edges adjoining a vertex in O(degree) time, unlike
the hash table.

2-3-4 Trees with Fast Neighbors
-------------------------------
Suppose you have a 2-3-4 tree with no duplicate keys.  Given a key k, you want
to be able to determine whether k is in the tree, and what the next smaller and
larger keys are, in O(1) time.  The insert() and delete() operations must still
take O(log n) time.  Can you do it?

It’s easy if you combine the 2-3-4 tree with a hash table.  The hash table maps
each key to a record that stores the next smaller and next larger keys in the
tree.

      ----------------          -----------------
      |              |          | -----   ----- |
      | Hash table   |          | | 4 |   | 9 | |
      |      5 ------+--------->| -----   ----- |
      ----------------          | prev    next  |
                                -----------------

The trick is that when you insert a key into the tree, you can determine by
tree search in O(log n) time what the next smaller and larger keys are.  Then,
you update all three keys’ records in the hash table in O(1) time.

Similarly, when you delete a key from the tree, you must delete it from the
hash table too, and update the records for the two neighboring keys.  These
hash table updates too take O(1) time.
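
The combination can be sketched in Java as follows.  This is a minimal
illustration, not the course's code:  java.util.TreeSet (a red-black tree)
stands in for the 2-3-4 tree, and the names NeighborIndex and Neighbors are
hypothetical.  The prev() and next() queries are answered from the hash table
alone, for keys that are in the tree.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class NeighborIndex {
    /** Hash table record: next smaller and next larger keys (null if none). */
    static final class Neighbors {
        Integer prev, next;
        Neighbors(Integer prev, Integer next) { this.prev = prev; this.next = next; }
    }

    private final TreeSet<Integer> tree = new TreeSet<>();   // stand-in for the 2-3-4 tree
    private final Map<Integer, Neighbors> table = new HashMap<>();

    // Queries touch only the hash table: O(1) expected time.
    boolean contains(int k) { return table.containsKey(k); }
    Integer prev(int k) { Neighbors r = table.get(k); return r == null ? null : r.prev; }
    Integer next(int k) { Neighbors r = table.get(k); return r == null ? null : r.next; }

    void insert(int k) {
        if (!tree.add(k)) return;                  // no duplicate keys
        Integer lo = tree.lower(k);                // O(log n) tree searches
        Integer hi = tree.higher(k);
        table.put(k, new Neighbors(lo, hi));       // then update three records
        if (lo != null) table.get(lo).next = k;
        if (hi != null) table.get(hi).prev = k;
    }

    void delete(int k) {
        Neighbors r = table.remove(k);
        if (r == null) return;                     // k was not present
        tree.remove(k);
        // Link the two neighbors of k to each other.
        if (r.prev != null) table.get(r.prev).next = r.next;
        if (r.next != null) table.get(r.next).prev = r.prev;
    }
}
```

With the keys 4, 5, and 9 inserted, the record for 5 holds prev = 4 and
next = 9, as in the diagram of the hash table record for key 5.
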
Splay Trees with Node Information
---------------------------------
Sometimes it’s useful for a binary search tree to record extra information in
each node, like the size and height of each subtree at each node.

In splay trees, this is easy to maintain.  Splaying is just a sequence of tree
rotations.  Each rotation changes the sizes of only two subtrees, and we can
easily compute their new sizes after the rotation.  Let size(Y) be the number
of nodes in the subtree rooted at node Y.  After a right rotation (for
instance) you can recompute the information as follows:

          Y                             X
         / \                           / \
        X   ^                         ^   Y
       / \ /C\     rotate right      /A\ / \
      ^   ^        ------------>        ^   ^
     /A\ /B\                           /B\ /C\

  size(Y) = 1 + size(B) + size(C)
  size(X) = 1 + size(A) + size(Y)

  height(Y) = 1 + max{height(B), height(C)}
  height(X) = 1 + max{height(A), height(Y)}

Recompute Y before X, because after the rotation Y is a child of X and X’s new
values depend on Y’s.

(Note:  to make this work, we must say that the height of an empty tree is -1,
so that a single node has height 0.)

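
As a minimal sketch, a right rotation that maintains both fields might look
like this in Java.  The names AugNode, AugRotations, and update are
hypothetical, and the code assumes the height of an empty tree is taken to be
-1 and its size 0.

```java
class AugNode {
    int key;
    int size = 1;      // number of nodes in the subtree rooted here
    int height = 0;    // a single node has height 0
    AugNode left, right;
    AugNode(int key) { this.key = key; }
}

class AugRotations {
    static int size(AugNode t)   { return t == null ? 0 : t.size; }
    static int height(AugNode t) { return t == null ? -1 : t.height; }  // empty tree: -1

    /** Recompute t's fields from its children, whose fields must be current. */
    static void update(AugNode t) {
        t.size = 1 + size(t.left) + size(t.right);
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    /** Right rotation at y: y's left child x becomes the root of this subtree. */
    static AugNode rotateRight(AugNode y) {
        AugNode x = y.left;
        y.left = x.right;   // subtree B moves from x to y
        x.right = y;
        update(y);          // y first: it is now x's child
        update(x);
        return x;
    }
}
```

Only the two rotated nodes are recomputed here; the fields of nodes above
them are the caller's concern.
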
Be forewarned that a rotation does not just change the heights of X and Y: it
also can change the heights of all their ancestors.  But X gets splayed all the
way to the root, so all the ancestors’ heights are fixed on the way up.

Likewise, inserting or removing an item changes the subtree sizes of all the
ancestors of the affected item, and possibly their heights as well.  But a
newly inserted item gets splayed to the top; and a deleted node’s parent is
splayed to the top.  So again, all the sizes and heights will get fixed during
the rotations.

Let’s watch the size fields as we insert a new node X into a splay tree.  (The
following numbers are sizes, not keys.)  Note that the very first rotation is
at the grandparent of node X (zig-zig).

[Diagram:  the size fields after each rotation as X is splayed to the root;
the steps are labeled zig, zig, zig-zag, zig.]

How can we use this information?  We can answer the query "How many keys are
there between x and y?" in O(log n) amortized time if the splay tree has no
duplicate keys and we label every subtree with its size.  Our strategy is to
set c = n, where n is the number of keys in the tree, then deduct from c the
number of keys outside the range [x, y].

  find(x);  // After the splaying, the keys in the root’s left
            // subtree are all less than x, so subtract their number from c.
  c = c - size(root’s left subtree);
  if (root key < x)  // Only possible if x is not in the tree;
    c--;             // otherwise x was splayed to the root.

  find(y);  // After the splaying, the keys in the root’s right
            // subtree all exceed y.
  c = c - size(root’s right subtree);
  if (root key > y) c--;

[Diagram:  the example tree after find(4), with key 3 at the root, and after
find(7), with key 6 at the root.]

Now, c is the number of keys in [x, y].
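
The counting strategy can be tried out with a small size-augmented splay tree.
The sketch below is one possible implementation, not the one used in the
course:  the class name SizedSplayTree is hypothetical, splaying is done with a
compact recursive routine, and insert() splays first and then splits at the
root instead of inserting at a leaf and splaying afterwards.  Only sizes are
maintained.

```java
class SizedSplayTree {
    private static final class Node {
        final int key;
        int size = 1;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    private static int size(Node t) { return t == null ? 0 : t.size; }
    private static void update(Node t) { t.size = 1 + size(t.left) + size(t.right); }

    private static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right; x.right = y;
        update(y); update(x);            // child first, then the new subtree root
        return x;
    }

    private static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left; y.left = x;
        update(x); update(y);
        return y;
    }

    /** Brings key, or the last node on its search path, to the root of h. */
    private static Node splay(Node h, int key) {
        if (h == null || key == h.key) return h;
        if (key < h.key) {
            if (h.left == null) return h;
            if (key < h.left.key) {                       // zig-zig: grandparent first
                h.left.left = splay(h.left.left, key);
                h = rotateRight(h);
            } else if (key > h.left.key) {                // zig-zag
                h.left.right = splay(h.left.right, key);
                if (h.left.right != null) h.left = rotateLeft(h.left);
            }
            return h.left == null ? h : rotateRight(h);
        } else {
            if (h.right == null) return h;
            if (key > h.right.key) {                      // zig-zig
                h.right.right = splay(h.right.right, key);
                h = rotateLeft(h);
            } else if (key < h.right.key) {               // zig-zag
                h.right.left = splay(h.right.left, key);
                if (h.right.left != null) h.right = rotateRight(h.right);
            }
            return h.right == null ? h : rotateLeft(h);
        }
    }

    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        root = splay(root, key);
        if (key == root.key) return;                      // no duplicate keys
        Node n = new Node(key);
        if (key < root.key) { n.left = root.left; n.right = root; root.left = null; }
        else                { n.right = root.right; n.left = root; root.right = null; }
        update(root); update(n);                          // old root is now n's child
        root = n;
    }

    /** Number of keys in [x, y], following the strategy in the text. */
    int countInRange(int x, int y) {
        if (root == null || x > y) return 0;
        int c = size(root);                               // c = n
        root = splay(root, x);                            // find(x)
        c -= size(root.left);
        if (root.key < x) c--;
        root = splay(root, y);                            // find(y)
        c -= size(root.right);
        if (root.key > y) c--;
        return c;
    }
}
```

For example, with the keys 1, 2, 3, 5, 6, and 8 in the tree,
countInRange(4, 7) counts the keys 5 and 6.
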