Algorithms: Analysis of Splay Trees, Convex Hull, and Binary Heaps - Prof. John Weng, Exams of Computer Science

An analysis of various algorithms including splay trees, convex hull using quickhull, and binary heaps. Topics covered include time complexity, average and worst-case scenarios, and space complexity. Splay trees discuss search and insert operations, while convex hull covers the quickhull algorithm and its recurrence relation. Binary heaps are analyzed in terms of insertion and deletion operations.

Typology: Exams

2011/2012

Uploaded on 05/04/2012

connorsname
Algorithms
An online algorithm can accept its input one item at a time.
An offline algorithm takes all of its input at once.
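Splay Trees
          Average     Worst case
Space     O(n)        O(n)
Search    O(log n)    amortized O(log n)
Insert    O(log n)    amortized O(log n)
Delete    O(log n)    amortized O(log n)
(A single operation can take O(n) time; the O(log n) bounds are
amortized over a sequence of operations.)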
After each insert, delete, or lookup, you splay:
The node you just looked up, or
The node you just inserted, or
The parent of the node you just deleted
while N is not the root:
    if N is a child of the root:
        // ZIG:
        Rotate about the root to bring N to the root
    else:
        P := N.parent
        G := P.parent   // the grandparent of N
        if N and P are both left or both right children:
            // ZIG-ZIG:
            Rotate about G then about P to bring N up two levels
        else:
            // ZIG-ZAG:
            Rotate about P then about G to bring N up two levels
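The splay loop above can be sketched in Python as follows. This is a minimal illustration, not a full splay tree: the `Node` class, the `rotate_up` helper, and the use of parent pointers are assumptions made for the example. "Rotate about G" is taken to mean lifting P over G, and "rotate about P" to mean lifting N over P.

```python
class Node:
    """Hypothetical tree node with parent pointers (an assumption of this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Lift x above its parent p with a single left or right rotation."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                      # x is a left child: rotate right
        p.left, x.right = x.right, p
        if p.left:
            p.left.parent = p
    else:                                # x is a right child: rotate left
        p.right, x.left = x.left, p
        if p.right:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g:                                # re-attach x where p used to hang
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(n):
    """Bring n to the root using zig, zig-zig, and zig-zag steps."""
    while n.parent:
        p, g = n.parent, n.parent.parent
        if g is None:
            rotate_up(n)                 # ZIG
        elif (g.left is p) == (p.left is n):
            rotate_up(p)                 # ZIG-ZIG: rotate about G ...
            rotate_up(n)                 # ... then about P
        else:
            rotate_up(n)                 # ZIG-ZAG: rotate about P ...
            rotate_up(n)                 # ... then about G
    return n
```

Splaying the deepest node of a left-leaning chain 3 -> 2 -> 1 triggers one zig-zig step and leaves 1 at the root with 2 and 3 hanging to its right.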
Convex Hull
quickHull(S, ps, pe)
    if S == {} then
        return []   // the empty list
    else
        pf = point of S farthest from the dividing line (the line through ps and pe)
        PR = set of points to the right of the line between ps and pf
        PL = set of points to the right of the line between pf and pe
        return quickHull(PR, ps, pf) + [pf] + quickHull(PL, pf, pe)
    end if
If the algorithm divides the points into two halves, the
recurrence relation is T(N) = 2T(N/2) + O(N), which solves to
O(N log N).
In the worst case, all the points wind up on one side of
each dividing line and the algorithm is then O(N^2).
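A runnable sketch of the quickHull pseudocode, assuming points are `(x, y)` tuples and that "to the right of the line" is decided by the sign of a cross product. The helper `cross` and the driver `convex_hull` (which picks the leftmost and rightmost points as the first dividing line) are assumptions added for the example, not part of the notes.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b; negative when b lies right of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def quick_hull(s, ps, pe):
    """Hull points among s (all strictly right of ps->pe), ordered from ps to pe."""
    if not s:
        return []
    # The most negative cross product is the point farthest from the line.
    pf = min(s, key=lambda p: cross(ps, pe, p))
    pr = [p for p in s if cross(ps, pf, p) < 0]   # right of ps->pf
    pl = [p for p in s if cross(pf, pe, p) < 0]   # right of pf->pe
    return quick_hull(pr, ps, pf) + [pf] + quick_hull(pl, pf, pe)

def convex_hull(points):
    """Hypothetical driver: split on the leftmost/rightmost pair, then recurse."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    a, b = pts[0], pts[-1]
    below = [p for p in pts if cross(a, b, p) < 0]
    above = [p for p in pts if cross(b, a, p) < 0]
    return [a] + quick_hull(below, a, b) + [b] + quick_hull(above, b, a)

hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
```

Points that lie exactly on a dividing line are dropped in this sketch, so the interior point `(1, 1)` and the collinear point `(1, 0)` do not appear in `hull`.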
Time complexity
f(x) = O(g(x)) (big-oh) means that the growth rate of f(x) is
asymptotically less than or equal to the growth rate of g(x).
f(x) = Ω(g(x)) (big-omega) means that the growth rate of f(x) is
asymptotically greater than or equal to the growth rate of g(x).
f(x) = o(g(x)) (small-oh) means that the growth rate of f(x) is
asymptotically less than the growth rate of g(x).
f(x) = ω(g(x)) (small-omega) means that the growth rate of f(x) is
asymptotically greater than the growth rate of g(x).
f(x) = Θ(g(x)) (theta) means that the growth rate of f(x) is
asymptotically equal to the growth rate of g(x).
T(n) = 3n^2 + 4n^2.5 + 6n^2 - 16 then T(n) = Ω(n^2.5)
T(n) = 111.5nlog(n) + 8.8n^2 - 3n + 13 then T(n) = O(n^2)
T(n) = 210(3^n) + 4n^2log(n) - 100 then T(n) = Ω(1)
T(n) = 4n^2 - 12.4nlog(n) - 93 then T(n) = O(n^2log(n))
T(n) = 2nlog(n) + 12.3nlog^2(n) + 23 then T(n) != O(nlog(n))
T(n) = 22.3nlog(n) - 4.3n^2 + 4.5n then T(n) = O(nlog(n))
(the dominant term -4.3n^2 is negative, so T(n) is eventually
below any positive multiple of nlog(n))
T(n) = 1000.3 then T(n) = O(1)
T(n) = 31(2^n) + 11(3^n) + 4n^3 + 3n^2 + 21 then T(n) != Θ(2^n)
T(n) = 4n^3 + 6n + 5n^4 + 1244 then T(n) = Θ(n^4)
T(n) = O(4n^3 + 3n^3 + 5n) and T(n) = Ω(.07n^3 - 3n^2 + 54)
then T(n) = Θ(n^3)
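One informal way to sanity-check a claim such as T(n) = O(n^2) is to watch the ratio T(n) / g(n) as n grows: if it settles near a constant, the bound is plausible. The sketch below does this for the second example, assuming base-2 logarithms (the notes do not fix a base, and the base only changes a constant factor). This is a numeric illustration, not a proof.

```python
import math

def t(n):
    # T(n) = 111.5 n log(n) + 8.8 n^2 - 3n + 13, with log taken as log2 (assumption)
    return 111.5 * n * math.log2(n) + 8.8 * n**2 - 3 * n + 13

# Ratio T(n) / n^2 at increasing n; it should approach the leading coefficient 8.8.
ratios = [t(n) / n**2 for n in (10**3, 10**6, 10**9)]
```

The ratios shrink toward 8.8, the coefficient of the dominant n^2 term, which is consistent with T(n) = O(n^2).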
B-Tree
A B-Tree of order M is an M-ary tree with the following properties:
1. The data items are stored at leaves
2. The nonleaf nodes store up to M-1 keys to guide the searching;
   key i represents the smallest key in subtree i+1
3. The root is either a leaf or has between two and M children
4. All nonleaf nodes (except the root) have between ceil(M/2)
   and M children
5. All leaves are at the same depth and have between ceil(L/2)
   and L data items, for some L
Max Number of Nodes m = (2d)^(h+1) − 1.
Min Number of Nodes m = 2d^h − 1
Binary Heaps
          Average     Worst case
Space     O(n)        O(n)
Search    O(n)        O(n)
Insert    O(1)        O(log n)
Delete    O(log n)    O(log n)
Insert
To add an element to a heap we must perform an up-heap operation
(also known as bubble-up, percolate-up, sift-up, trickle-up,
heapify-up, or cascade-up), by following this algorithm:
1. Add the element to the bottom level of the heap.
2. Compare the added element with its parent; if they are in the
   correct order, stop.
3. If not, swap the element with its parent and return to the
   previous step.
The worst case is when the new element needs to become the root,
requiring one iteration for each level in the tree, which is
O(log n). However, since approximately 50% of the elements are
leaves and 75% are in the bottom two levels, it is likely that the
new element to be inserted will only move a few levels upwards to
maintain the heap. Thus, binary heaps support insertion in average
constant time, O(1).
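The three up-heap steps can be sketched as below, assuming a min-heap stored in a 0-indexed Python list, so the parent of index `i` is `(i - 1) // 2`. The function name `heap_insert` is chosen for the example.

```python
def heap_insert(heap, item):
    """Insert item into a min-heap stored in a 0-indexed list (illustrative)."""
    heap.append(item)                      # step 1: add at the bottom level
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:        # step 2: correct order, stop
            break
        heap[parent], heap[i] = heap[i], heap[parent]   # step 3: swap, then repeat
        i = parent

h = []
for x in [5, 3, 8, 1]:
    heap_insert(h, x)
```

The loop runs at most once per level of the tree, which is where the O(log n) worst case comes from.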
BuildHeap
A heap could be built by successive insertions. This approach
requires O(n log n) time because each insertion takes O(log n)
time and there are n elements. However, this is not the optimal
method. The optimal method starts by arbitrarily putting the
elements on a binary tree, respecting the shape property (the tree
could be represented by an array). Then, starting from the lowest
level and moving upwards, sift the root of each subtree downward
as in the deletion algorithm until the heap property is restored.
More specifically, if all the subtrees starting at some height h
(measured from the bottom) have already been "heapified", the
trees at height h+1 can be heapified by sending their root down
along the path of maximum valued children when building a
max-heap, or minimum valued children when building a min-heap.
This process takes O(h) operations (swaps) per node. In this
method most of the heapification takes place in the lower levels.
The number of nodes at height h is at most ceil(n / 2^(h+1)).
Therefore, the cost of heapifying all subtrees is:
sum over h of ceil(n / 2^(h+1)) * O(h) = O(n * sum over h of h / 2^h) = O(n)
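A minimal sketch of the bottom-up construction for a max-heap, assuming a 0-indexed list where the children of index `i` are `2i + 1` and `2i + 2`. The names `sift_down` and `build_heap` are chosen for the example.

```python
def sift_down(a, i, n):
    """Push a[i] down within a[:n] along the path of larger children."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:                  # heap property holds here, stop
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_heap(a):
    """Heapify in place, starting from the last internal node and moving up."""
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))

data = [1, 2, 3, 4, 5, 6, 7]
build_heap(data)
```

Leaves are skipped entirely because a single node is already a heap; only the `len(a) // 2` internal nodes are sifted.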
Heapsort
Worst case O(n log n), best case O(n log n), average O(n log n)
Build a max-heap, then repeatedly take the max node (root), swap
it with the end of the array, and re-heapify the remaining nodes.
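For comparison, here is a short sketch of the same idea using Python's standard `heapq` module. Note the swapped-in variant: `heapq` provides a min-heap, so this version pops the minimum into a new list instead of swapping the maximum to the end of the array in place.

```python
import heapq

def heapsort(items):
    """Return a sorted copy of items using a binary min-heap (not in place)."""
    heap = list(items)
    heapq.heapify(heap)                      # O(n) bottom-up build
    # Each pop removes the smallest remaining element in O(log n).
    return [heapq.heappop(heap) for _ in range(len(heap))]

result = heapsort([5, 1, 4, 2, 8])
```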
Birthday Problem
1 - (n! * (365 C n)) / 365^n = prob. that at least 2 of n people
share a birthday
(364/365)^n = prob. that none of n birthdays fall on Jan 1st
((n C x) * (365 C n-1)) / (365^n) = prob. that exactly x people
of n share a birthday
(n C x) * (1/b)^x * ((b-1)/b)^(n-x) = prob. that, with b buckets
and n keys, bucket 1 has exactly x keys
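The first and last formulas can be evaluated directly with the standard library. The function names are chosen for the example, and the bucket formula is used in its binomial form, with the (n C x) factor counting which x keys land in bucket 1.

```python
from math import comb, factorial

def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    return 1 - factorial(n) * comb(days, n) / days**n

def p_bucket_has_x(n, b, x):
    """Probability that bucket 1 of b buckets receives exactly x of n keys."""
    return comb(n, x) * (1 / b) ** x * ((b - 1) / b) ** (n - x)

p23 = p_shared_birthday(23)   # just over one half
```

Summing `p_bucket_has_x(n, b, x)` over x = 0..n gives 1, which is a quick check that the binomial factor belongs in the formula.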
Hash Table
          Average       Worst case
Space     O(n)          O(n)
Search    O(1 + n/k)    O(n)
Insert    O(1)          O(1)
Delete    O(1 + n/k)    O(n)
(n = number of keys, k = number of buckets)
Graph algorithms
• Reachability. Can you get to B from A? (depth-first or
  breadth-first search)
• Shortest path (min-cost path). Find the path from A to B with
  the minimum cost (determined as some simple function of the
  edges traversed in the path). (Dijkstra)
• Minimum spanning tree. Find the "smallest" subset of the edges
  in which all the nodes are connected. (Greedy; Prim's algorithm
  is like Dijkstra's except that it does not measure distances
  from one fixed starting point.)
• Traveling salesman. Find the smallest-cost path through all the
  nodes.
• Visit all nodes. Traversal.
Topological sort
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    insert n into L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)
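The pseudocode above can be sketched in Python as follows. As an assumption of this example, the graph is a dict mapping every node to a list of its successors, and "removing an edge" is modeled by decrementing an in-degree counter rather than mutating the graph.

```python
from collections import deque

def topological_sort(graph):
    """graph: dict mapping each node to a list of successors (every node is a key)."""
    indegree = {n: 0 for n in graph}
    for n in graph:
        for m in graph[n]:
            indegree[m] += 1
    s = deque(n for n in graph if indegree[n] == 0)   # nodes with no incoming edges
    order = []
    while s:
        n = s.popleft()
        order.append(n)
        for m in graph[n]:
            indegree[m] -= 1              # stands in for "remove edge e"
            if indegree[m] == 0:
                s.append(m)
    if len(order) != len(graph):          # some edges were never removed
        raise ValueError("graph has at least one cycle")
    return order

order = topological_sort({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []})
```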
quickSort
Quicksort is a divide and conquer algorithm. Quicksort first
divides a large list into two smaller sub-lists: the low elements
and the high elements. Quicksort can then recursively sort the
sub-lists.
The steps are:
1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements with values less than the
   pivot come before the pivot, while all elements with values
   greater than the pivot come after it (equal values can go
   either way). After this partitioning, the pivot is in its final
   position. This is called the partition operation.
3. Recursively sort the sub-list of lesser elements and the
   sub-list of greater elements.
The base cases of the recursion are lists of size zero or one,
which never need to be sorted.
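A compact sketch of these steps. It builds new lists rather than partitioning in place, and it picks the middle element as the pivot; both choices are assumptions made to keep the example short.

```python
def quicksort(items):
    """Return a sorted copy of items (list-building variant, not in place)."""
    if len(items) <= 1:                   # base case: nothing to sort
        return items
    pivot = items[len(items) // 2]        # step 1: pick a pivot
    less = [x for x in items if x < pivot]        # step 2: partition
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)   # step 3: recurse

result = quicksort([3, 6, 1, 8, 1, 9, 2])
```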
Mergesort O(nlogn)
Conceptually, a merge sort works as follows:
1. Divide the unsorted list into n sublists, each containing 1
   element (a list of 1 element is considered sorted).
2. Repeatedly merge sublists to produce new sublists until there
   is only 1 sublist remaining. (This will be the sorted list.)
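The two steps map directly onto a bottom-up merge sort. The sketch below is one possible rendering; the helper name `merge` and the pairwise merging order are choices made for the example.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]

def merge_sort(items):
    runs = [[x] for x in items]           # step 1: n sublists of one element
    while len(runs) > 1:                  # step 2: merge neighbors until one remains
        runs = [merge(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
                for k in range(0, len(runs), 2)]
    return runs[0] if runs else []

result = merge_sort([5, 1, 4, 2, 8])
```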
BubbleSort O(n^2); can detect already-sorted input in O(n)
Let us take the array of numbers "5 1 4 2 8" and sort it from
lowest number to greatest number using the bubble sort algorithm.
In each step, two adjacent elements are compared. Three passes
will be required.
First Pass:
( 5 1 4 2 8 ) -> ( 1 5 4 2 8 ), here the algorithm

compares the first two elements and swaps them (5 > 1).
( 1 5 4 2 8 ) -> ( 1 4 5 2 8 ), swap since 5 > 4
( 1 4 5 2 8 ) -> ( 1 4 2 5 8 ), swap since 5 > 2
( 1 4 2 5 8 ) -> ( 1 4 2 5 8 ), now, since these elements are already in order (8 > 5), the algorithm does not swap them.
Repeat the passes until one completes with no swaps.
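The pass structure can be sketched as below. The `swapped` flag is what lets the algorithm stop after a single O(n) pass on input that is already in order; returning the pass count is an addition made for the example.

```python
def bubble_sort(a):
    """Sort list a in place and return the number of passes made."""
    passes = 0
    swapped = True
    while swapped:                        # stop after a pass with no swaps
        swapped = False
        passes += 1
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:           # adjacent pair out of order
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return passes

nums = [5, 1, 4, 2, 8]
passes = bubble_sort(nums)
```

On the array 5 1 4 2 8 this makes three passes: two that swap elements and a final one that confirms the order.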

Insertion sort: O(n^2), best case O(n). Iterate through the array and place each number in its correct position relative to the already sorted numbers.
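A minimal in-place sketch of that idea. The inner loop shifts larger elements one slot to the right until the gap is where the current number belongs; on sorted input it never runs, which gives the O(n) best case.

```python
def insertion_sort(a):
    """Sort list a in place and return it."""
    for i in range(1, len(a)):
        x = a[i]                          # next number to place
        j = i - 1
        while j >= 0 and a[j] > x:        # shift larger sorted elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x                      # drop x into its position
    return a

result = insertion_sort([5, 1, 4, 2, 8])
```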

Binary Trees

h + 1 ≤ Nodes ≤ 2^(h+1) - 1

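Prim's: Start from a chosen node and initialize every other node with its distance from that node. Repeatedly select the node with the shortest connecting edge, add it to the tree, and update the distances of the remaining nodes using the newly added node.
Kruskal's: Start with every node as its own tree in a forest. Select the shortest remaining edge. If that edge makes a loop, don't add it. Repeat until all the nodes are connected.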
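Kruskal's algorithm can be sketched with a small union-find structure to detect loops. The edge format `(weight, u, v)`, the node numbering 0..num_nodes-1, and the helper `find` are assumptions of this example.

```python
def kruskal(num_nodes, edges):
    """edges: list of (weight, u, v); returns the edges of a minimum spanning forest."""
    parent = list(range(num_nodes))       # each node starts as its own tree

    def find(x):
        """Return the representative of x's tree, shortening the path as it goes."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):         # shortest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                      # different trees, so no loop is formed
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

mst = kruskal(4, [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3)])
```

Here the edge of weight 3 is skipped because nodes 0 and 2 are already connected through node 1.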

Comparison-based Sorting Alg. Proof (adapted for both the sorting algorithm and searching a binary tree):
Claim: Any comparison-based sorting algorithm takes Ω(n log n) time (Ω(log n) for search).
Proof: Given any comparison-based sorting (search) algorithm, we can represent its behavior on an input of size n by a decision tree. Note: we need only consider the comparisons in the algorithm (other operations only make the algorithm take longer).

  • each internal node in the decision tree corresponds to one of the comparisons in the algorithm
  • start at the root and do the first comparison: if ≤ take the left branch; if > take the right branch, etc.
  • each leaf represents one possible ordering of the input (or result of a search); the longest root-to-leaf path is the worst case

The number of nodes in a full binary tree of height N is T(N) = 2T(N-1) + 1, with T(0) = 1.

In the case of sorting: the decision tree is binary with height h, so it has at most 2^h leaves, and it needs at least n! leaves (one per possible ordering):

n! ≤ # leaves ≤ 2^h ; 2^h ≥ n! ; log(2^h) ≥ log(n!) ; h ≥ log(n!) = Ω(n log n)

In the case of search: the height is determined by N ≤ 2^(h+1) - 1 ; N + 1 ≤ 2^(h+1) ; log(N+1) ≤ h+1 ; log(N+1) - 1 ≤ h. Hence, the complexity to traverse the height of the decision tree is Ω(log N).

PROOF (MAX # OF NODES BINARY TREE):

Show that the maximum number of nodes in a binary tree of height h is 2^(h+1) - 1.
Base case: Let h = 0. This gives us a binary "tree" with a single node. For this case, 2^(h+1) - 1 = 2^(1) - 1 = 2 - 1 = 1. Thus, the maximum number of nodes in a binary tree of height h is 2^(h+1) - 1 when h = 0.
Inductive hypothesis: Assume that for some k ≥ 0, the maximum number of nodes in a binary tree of height k is known to be 2^(k+1) - 1.
Inductive step: Consider a binary tree of height k + 1. The root node can have at most two subtrees. Each subtree can have a height of at most k. By our inductive hypothesis, each of these subtrees can contain a maximum of 2^(k+1) - 1 nodes. Considering these two subtrees and the root, the maximum number of nodes in a binary tree of height k + 1 is therefore 2 * (2^(k+1) - 1) + 1. Distributing terms, 2 * (2^(k+1) - 1) + 1 = 2 * 2^(k+1) - 2 + 1 = 2^[(k+1) + 1] - 1.
By induction, the maximum number of nodes in a binary tree of height h is 2^(h+1) - 1 for any h ≥ 0.

AMORTIZED ANALYSIS (this was specifically for that problem on exam 2):
T'(n) is the time up to n without considering all the copies, but with an installment added each time to make up for the underestimate. T'(n) is smoother and simpler than T(n).
Relate T(n) to T'(n): T(n) = T'(n) + b(n)
Evaluate for n = 4: T(4) = T'(4) + b(4) = 7 + 0 = 7
Table size at 4: 8
Relate T'(n) to T'(n-1): T'(n) = T'(n-1) + 1 + 1, T'(1) = 1

Derive T'(n) for all n = 0, 1, 2, ...: T'(n) = T'(n-1) + 2 = T'(n-2) + 2 + 2 = 1 + 2 + 2 + ... + 2 = 2(n-1) + 1
Give the bound b(n) = T(n) - T'(n) in big-O: O(n)
Derive T(n) in big-O, but tight: T(n) = T'(n) + b(n) = 2(n-1) + 1 + b(n) = O(n) + O(n) = O(n)
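The exam problem itself is not reproduced in these notes, so the following is only a plausible model: a table that starts with capacity 1, doubles whenever it is full, charges 1 for each write, and charges 1 per element copied during a resize. Under that assumed model the total cost of n insertions can be simulated directly.

```python
def total_cost(n):
    """Total writes plus copies for n appends into a doubling table (assumed model)."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:              # table full: copy everything, then double
            cost += size
            capacity *= 2
        cost += 1                         # write the new element
        size += 1
    return cost

cost_at_4 = total_cost(4)
```

In this model the total for n = 4 is 7, the same value as T(4) above, and the total stays below 3n for every n, which agrees with the O(n) bound.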

PROOF: WARSHALL’S CORRECTNESS

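Claim: after step k, D[i,j] is the length of the shortest path from i to j whose intermediate nodes all come from 1 to k.
Proof: by induction on k
Basis: k=0, trivially true (no intermediate nodes are allowed, so D[i,j] is the weight of the direct edge).
Inductive Hypothesis: Assume that the claim is true for k-1.
Inductive Step: We will prove the claim is true for k. Consider the shortest path from i to j using nodes 1 to k. Two cases:

  1. The path from i to j goes through k. Then it splits into a path from i to k and a path from k to j, each using only nodes 1 to k-1, so by the hypothesis its length is D[i,k] + D[k,j].
  2. It does not go through k. Then it uses only nodes 1 to k-1, so by the hypothesis its length is the previous D[i,j].

In either case, we can prove the shortest special path will be D[i,j], because step k sets D[i,j] to the smaller of the two values.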
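The two cases of the inductive step correspond to the two arguments of `min` in the usual triple loop. The sketch below assumes the shortest-path form of the algorithm, with the graph given as a square matrix of edge weights in which `math.inf` marks a missing edge and the diagonal is 0.

```python
import math

def floyd_warshall(weights):
    """Return the matrix of shortest-path lengths for a weight matrix."""
    n = len(weights)
    d = [row[:] for row in weights]       # D before any intermediate node is allowed
    for k in range(n):                    # now allow node k as an intermediate
        for i in range(n):
            for j in range(n):
                # case 2: skip k (old d[i][j]); case 1: go through k
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

INF = math.inf
dist = floyd_warshall([[0, 3, INF],
                       [INF, 0, 1],
                       [2, INF, 0]])
```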

PROOF: PARENT - CHILD from EXAM 2
Suppose the memory is M[i], i = 1, 2, ... , n for storing a binary heap.
a) Suppose that a child is at M[c], where c is the address. What is the address p of its parent?
Parent is at p = floor(c/2), c ≥ 2.
b) Prove your above result using induction on c (not p), c = 2, 3, 4, ...
Claim: the parent of a child at c is at floor(c/2).
Basis: c = 2: parent at floor(2/2) = 1, true. c = 3: parent at floor(3/2) = 1, true.
Hypothesis: The claim is true for c, c ≥ 3.
Induction Step: prove the claim for c+1. Two cases (L and R denote left and right children; the A's and B's next to the R's and L's are subscripts):

  1. c+1 is even (L_B)
  • Its left neighbor R_A is at c, and c is odd => by hypothesis R_A's parent is at floor(c/2) = (c-1)/2.
  • L_B's parent is the next cell, at (c-1)/2 + 1 = (c-1+2)/2 = (c+1)/2 = floor((c+1)/2) -> the claim is true for c+1.

  2. c+1 is odd (R_B)
  • Its left neighbor L_B is at c, and c is even => by hypothesis L_B's parent is at floor(c/2) = c/2.
  • R_B shares that parent, at c/2 = floor((c+1)/2) => the claim is true again. => Q.E.D.
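The result can also be spot-checked numerically. With 1-based addresses, the children of the node at p sit at 2p and 2p + 1, and integer division recovers p from either one. The function name `parent` is chosen for the example.

```python
def parent(c):
    """Address of the parent of the child at address c (1-based heap, c >= 2)."""
    return c // 2                         # floor(c / 2)

# Both children of every node p map back to p.
checks = [(parent(2 * p), parent(2 * p + 1)) for p in range(1, 8)]
```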