




























































































These lecture notes cover the basics of algorithms, including design, analysis, and implementation. Topics covered include complexity, sorting algorithms, graph algorithms, network flow algorithms, computational geometry, string algorithms, greedy/dynamic algorithms, and optimization algorithms. The notes also cover recurrence relations and how to solve them. Examples and illustrations are from xkcd.com. These notes are suitable for university students studying computer science and software engineering.
Typology: Lecture notes





























































































Computer Science and Software Engineering, 2011
Notes by CSSE, Comics by xkcd.com
Overview
(a) What are Algorithms?
(b) Design of Algorithms.
(c) Types of Algorithms.
(a) Growth rates.
(b) Asymptotic analysis, O and Θ.
(c) Average case analysis.
(d) Recurrence relations.
(a) Insertion Sort.
(b) Merge Sort.
(c) QuickSort.
What you should already know?
This unit will require the following basic
knowledge:
structures, recursion, testing, etc
trees, etc.
notation, amortized analysis etc.
by induction, some understanding of
continuous functions
What will we be studying?
We will study a collection of algorithms,
examining their design, analysis and sometimes
even implementation. The topics we will cover
will be taken from the following list:
The importance of design
By far the most important thing in a program
is the design of the algorithm. It is far more
significant than the language the program is
written in, or the clock speed of the computer.
To demonstrate this, we consider the problem
of computing the Fibonacci numbers.
The Fibonacci sequence is the sequence of
integers starting
1, 1, 2, 3, 5, 8, 13, ...
which is formally defined by
F_1 = F_2 = 1 and F_n = F_{n−1} + F_{n−2}.
Let us devise an algorithm to compute F_n.
The naive solution
The naive solution is to simply write a recursive
method that directly models the problem.
static int fib(int n) {
    return (n < 3 ? 1 : fib(n - 1) + fib(n - 2));
}
Is this a good algorithm/program in terms of
resource usage?
Timing it on a (2005) iMac gives the following
results (the time is in seconds and is for a loop
calculating F_n 10000 times).
[Table: Value (n = 20 to 27) against Time; the measured times are not preserved.]
How long will it take to compute F_30, F_40 or F_50?
Experimental results
Make a plot of the times taken.
Theoretical results
Each method call to fib() does roughly the
same amount of work (just two comparisons
and one addition), so we will have a very rough
estimate of the time taken if we count how
many method calls are made.
Exercise: Show the number of method calls
made to fib() is 2F_n − 1.
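The call count can be checked experimentally before attempting a proof. The sketch below instruments the naive method with a counter and compares the count with 2F_n − 1; the class name FibCallCount and the helper countCalls are illustrative names, not part of the notes, and agreement on a few values is evidence rather than a proof.

```java
public class FibCallCount {
    // Number of calls to fib() made so far; reset before each experiment.
    static long calls = 0;

    // The naive recursive method from the notes, instrumented to count calls.
    static int fib(int n) {
        calls++;
        return (n < 3) ? 1 : fib(n - 1) + fib(n - 2);
    }

    // Returns how many calls a single top-level fib(n) triggers.
    static long countCalls(int n) {
        calls = 0;
        fib(n);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 25; n += 4) {
            long counted = countCalls(n);
            long predicted = 2L * fib(n) - 1;
            System.out.println("n=" + n + " calls=" + counted + " 2F_n-1=" + predicted);
        }
    }
}
```

Since the count grows as fast as F_n itself, the running time of the naive method grows exponentially in n.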
Re-design the algorithm
We can easily re-design the algorithm as an
iterative algorithm.
static int fib(int n) {
    int f_2;     /* F(i+2) */
    int f_1 = 1; /* F(i+1) */
    int f_0 = 1; /* F(i) */
    for (int i = 1; i < n; i++) {
        /* F(i+2) = F(i+1) + F(i) */
        f_2 = f_1 + f_0;
        /* F(i) = F(i+1); F(i+1) = F(i+2) */
        f_0 = f_1;
        f_1 = f_2;
    }
    return f_0;
}
An Iterative Algorithm
An iterative algorithm gives the following
times:
[Table: Value (n = 20 to 23 and n = 10^3 to 10^6) against Time; the measured times are not preserved.]
Another solution?
The Fibonacci sequence is specified by the
homogeneous recurrence relation:
F(n) = 1                      if n = 1, 2;
F(n) = F(n − 1) + F(n − 2)    otherwise.
In general we can define a closed form for
these recurrence equations:
F(n) = Aα^n + Bβ^n
where α, β are the roots of
x^2 − x − 1 = 0.
relation that describes an algorithms
complexity.
linear recurrence relations specify
exponential functions.
See CLRS, Chapter 4.
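As a rough illustration of the closed form, the sketch below evaluates Aα^n + Bβ^n in floating point. It assumes the constants A = 1/√5 and B = −1/√5, which follow from requiring F(1) = F(2) = 1; the class and method names are illustrative. Because it relies on double arithmetic, the rounding is only trustworthy for moderate n.

```java
public class FibClosedForm {
    // Evaluates F(n) = A*alpha^n + B*beta^n with A = 1/sqrt(5), B = -1/sqrt(5),
    // where alpha and beta are the two roots of x^2 - x - 1 = 0.
    static long fibClosed(int n) {
        double sqrt5 = Math.sqrt(5.0);
        double alpha = (1 + sqrt5) / 2;
        double beta = (1 - sqrt5) / 2;
        // Round to the nearest integer to absorb floating point error.
        return Math.round((Math.pow(alpha, n) - Math.pow(beta, n)) / sqrt5);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.print(fibClosed(n) + " ");
        }
        System.out.println();
    }
}
```

Since |β| < 1, the term Bβ^n shrinks towards zero, so F(n) grows like α^n: the recurrence specifies an exponential function.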
Recurrence Relations
Recurrence relations can be a useful way to
specify the complexity of recursive functions.
For example the linear homogeneous
recurrence relation:
F(n) = 1                      if n = 1, 2;
F(n) = F(n − 1) + F(n − 2)    otherwise
specifies the sequence 1, 1, 2, 3, 5, 8, 13, ...
In general a linear homogeneous recurrence
relation is given as:
F(1) = c_1
F(2) = c_2
...
F(k) = c_k
F(n) = a_1 F(n − 1) + ... + a_k F(n − k)
For example
F(n) = 1                       if n = 1, 2;
F(n) = 2F(n − 1) + F(n − 2)    otherwise
specifies the sequence 1, 1, 3, 7, 17, 41, ...
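The general form can be evaluated directly from its definition. The following is a minimal sketch, assuming the initial values c_1..c_k and the coefficients a_1..a_k are passed as arrays and that n ≥ 1; the names LinearRecurrence and evaluate are illustrative.

```java
public class LinearRecurrence {
    // Evaluates F(n) where F(i) = c[i-1] for 1 <= i <= k and
    // F(n) = a[0]*F(n-1) + ... + a[k-1]*F(n-k) for n > k.
    static long evaluate(long[] c, long[] a, int n) {
        int k = c.length;
        long[] f = new long[Math.max(n, k) + 1];
        for (int i = 1; i <= k; i++) {
            f[i] = c[i - 1];
        }
        for (int m = k + 1; m <= n; m++) {
            long sum = 0;
            for (int j = 1; j <= k; j++) {
                sum += a[j - 1] * f[m - j];
            }
            f[m] = sum;
        }
        return f[n];
    }

    public static void main(String[] args) {
        long[] ones = {1, 1};
        // Fibonacci: a = (1, 1); second example from the notes: a = (2, 1).
        for (int n = 1; n <= 6; n++) {
            System.out.println(evaluate(ones, new long[] {1, 1}, n)
                    + " " + evaluate(ones, new long[] {2, 1}, n));
        }
    }
}
```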
A computational problem: Travelling Salesman
Instance: A set of “cities” X together with a
“distance” d(x, y) between any pair x, y ∈ X.
Question: What is the shortest circular route
that starts and ends at a given city and visits
all the cities?
An instance of Travelling Salesman is a list of
cities, together with the distances between the
cities, such as
d =
An algorithm for Sorting
One simple algorithm for Sorting is called
Insertion Sort. The basic principle is that it
takes a series of steps such that after the i-th
step, the first i objects in the array are sorted.
Then the (i + 1)-th step inserts the (i + 1)-th
element into the correct position, so that now
the first i + 1 elements are sorted.
procedure INSERTION-SORT(A)
  for j ← 2 to length[A]
    do key ← A[j]
       ▷ Insert A[j] into the sorted sequence
       ▷ A[1 .. j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key
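For readers who prefer running code, here is one possible Java rendering of the procedure. It is a sketch rather than the notes' own implementation: Java arrays are indexed from 0, so the loop bounds shift by one relative to the pseudo-code, and the class name InsertionSort is illustrative.

```java
import java.util.Arrays;

public class InsertionSort {
    // Sorts the array in place into non-decreasing order.
    static void sort(int[] a) {
        for (int j = 1; j < a.length; j++) {
            int key = a[j];
            // Insert a[j] into the sorted prefix a[0 .. j-1].
            int i = j - 1;
            while (i >= 0 && a[i] > key) {
                a[i + 1] = a[i]; // shift larger elements one place right
                i--;
            }
            a[i + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 6, 1, 3};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```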
Pseudo-code
Pseudo-code provides a way of expressing
algorithms in a way that is independent of any
programming language. It abstracts away other
program details such as the type system and
declaring variables and arrays. Some points to
note are:
as in Java.
then...else and while have similar
interpretations to Java.
comment line.
Pseudo-code (contd)
should be evaluated and the resulting value
assigned to variable v. Or, in the case of
v_1 ← v_2 ← e, to variables v_1 and v_2.
their procedures.
arrays are assumed to be indexed from 1 to
N (rather than 0 to N − 1, the approach
followed by Java).
See CLRS (page 19-20) for more details.
But to return to the insertion sort: What do
we actually mean by a good algorithm?
Evaluating Algorithms
There are many considerations involved in this
question.
Correctness of insertion sort
Insertion sort can be shown to be correct by a
proof by induction.
procedure INSERTION-SORT(A)
  for j ← 2 to length[A]
    do key ← A[j]
       ▷ Insert A[j] into the sorted sequence
       ▷ A[1 .. j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key
We do the induction over the loop variable j.
The base case of the induction is:
“the first element is sorted”,
and the inductive step is:
“given the first j elements are sorted after the
j-th iteration, the first j + 1 elements will be
sorted after the (j + 1)-th iteration”.
Proof by Induction
To show insertion sort is correct, let p(n) be
the statement “after the n-th iteration, the first
n + 1 elements of the array are sorted”.
To show p(0) we simply note that a single
element is always sorted.
Given p(i) is true for all i < n, we must show
that p(n) is true:
After the (n − 1)-th iteration the first n
elements of the array are sorted.
The n-th iteration takes the (n + 1)-th element
and inserts it after the last element that a)
comes before it, and b) is not greater than it.
Therefore after the n-th iteration, the first n + 1
elements of the array are sorted.
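The statement p(n) can also be checked at run time, which is a useful sanity test though not a substitute for the proof. The sketch below re-checks the invariant after every iteration of a 0-indexed insertion sort; the names CheckedInsertionSort and prefixSorted are illustrative.

```java
public class CheckedInsertionSort {
    // True if the first count elements of a are in non-decreasing order.
    static boolean prefixSorted(int[] a, int count) {
        for (int i = 1; i < count; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    // Insertion sort that verifies p(n) after the n-th iteration:
    // the first n + 1 elements of the array are sorted.
    static void sort(int[] a) {
        for (int n = 1; n < a.length; n++) {
            int key = a[n];
            int i = n - 1;
            while (i >= 0 && a[i] > key) {
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
            if (!prefixSorted(a, n + 1)) {
                throw new IllegalStateException("p(" + n + ") violated");
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6};
        sort(data);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```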
Aside: Proof by Contradiction
Another proof technique you may need is proof
by contradiction.
Here, if you want to show some property p is
true, you assume p is not true, and show this
assumption leads to a contradiction
(something we know is not true, like i < i).
For example, two sorted arrays of integers, M
and N, containing exactly the same elements, must be
identical.
Proof by contradiction: Suppose M ≠ N are
two distinct, sorted arrays containing the same
elements. Let i be the least number such that
M[i] ≠ N[i]. Suppose a = M[i] < N[i] (the other
case is symmetric). Since M
and N contain the same elements, and
M[j] = N[j] for all j < i, we must have
a = N[k] for some k > i. But then N[k] < N[i]
so N is not sorted: contradiction.
Numerical Stability
You can be fairly certain of exact results from
a computer program provided all arithmetic is
done with the integers and you think
carefully about any overflow.
However the situation is entirely different when
the problem involves real numbers, because
there is necessarily some round-off error when
real numbers are stored in a computer. A
floating point representation of a number in
base β with precision p is a representation of
the form
d.ddddd × β^e
where d.ddddd has exactly p digits.
Accumulation of errors
Performing repeated calculations will take the
small truncation errors and cause them to
accumulate. The resulting error is known as
roundoff error. If we are careful or lucky, the
roundoff error will tend to behave randomly,
both positive and negative, and the growth of
error will be slow.
Certain calculations however, vastly increase
roundoff error and can cause errors to grow
catastrophically to the point where they
completely swamp the real result.
Two particular operations that can cause
numerical instability are
zero
It is important to be aware of the possibility for
roundoff error and to alter your algorithm
appropriately.
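A small experiment makes the accumulation visible. The sketch below adds 0.1 repeatedly in double precision; since 0.1 has no exact binary representation, each addition carries a tiny error and the total drifts away from the exact answer. The names RoundoffDemo and sumTenths are illustrative.

```java
public class RoundoffDemo {
    // Adds 0.1 to an accumulator the given number of times.
    static double sumTenths(int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        double ten = sumTenths(10);
        System.out.println("sum of ten 0.1s = " + ten);
        System.out.println("equal to 1.0?     " + (ten == 1.0));
        // The absolute error after many more additions.
        System.out.println("error after 10^6 additions = "
                + Math.abs(sumTenths(1_000_000) - 100_000.0));
    }
}
```

A practical consequence is that floating point results should be compared against a tolerance rather than tested for exact equality.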
Efficiency
An algorithm is efficient if it uses as few
resources as possible. Typically the resources
which we are interested in are time and space.
Other resources are important in practical
terms, but are outside the scope of the design
and analysis of algorithms.
In many situations there is a trade-off between
time and space, in that an algorithm can be
made faster if it uses more space or smaller if
it takes longer.
Although a thorough analysis of an algorithm
should consider both time and space, time is
considered more important, and this course will
focus on time complexity.
Measuring time
How should we measure the time taken by an
algorithm?
We can do it experimentally by measuring the
number of seconds it takes for a program to
run — this is often called benchmarking and is
often seen in popular magazines. This can be
useful, but depends on many factors:
being run, both in terms of size and
which particular instance it is.
So it is not an independent measure of the
algorithm, but rather a measure of the
implementation, the machine and the instance.
Complexity
The complexity of an algorithm is a
“device-independent” measure of how much
time it consumes. Rather than expressing the
time consumed in seconds, we attempt to
count how many “elementary operations” the
algorithm performs when presented with
instances of different sizes.
The result is expressed as a function, giving
the number of operations in terms of the size
of the instance. This measure is not as precise
as a benchmark, but much more useful for
answering the kind of questions that commonly
arise:
big. How long will that take me?
as fast? What size of problem can we
solve in the same time?
The answers to questions like this depend on
the complexity of the algorithm.
Example
Suppose you run a small business and have a
program to keep track of your 1024 customers.
The list of customers is changing frequently
and you often need to sort it. Your two
programmers Alice and Bob both come up
with algorithms.
Alice presents an algorithm that will sort n
names using 256 n lg n comparisons and Bob
presents an algorithm that uses n^2
comparisons. (Note: lg n ≡ log_2 n)
Your current computer system takes 10^−3
seconds to make one comparison, and so when
your boss benchmarks the algorithms he
concludes that clearly Bob’s algorithm is
better.
Size Alice Bob
But is he right?
Expansion
Alice however points out that the business is
expanding and that using Bob’s algorithm
could be a mistake. As the business expands,
her algorithm becomes more competitive, and
soon overtakes Bob’s.
Size Alice Bob
So Alice’s algorithm is much better placed for
expansion.
A benchmark only tells you about the situation
today, whereas a software developer should be
thinking about the situation both today and
tomorrow!
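The comparison counts behind both benchmarks follow directly from the two formulas, 256 n lg n for Alice and n^2 for Bob. The sketch below tabulates them for a few sizes, at 10^−3 seconds per comparison; the class name AliceVsBob and the chosen sizes beyond 1024 are illustrative assumptions.

```java
public class AliceVsBob {
    // Comparisons used by Alice's algorithm: 256 n lg n.
    static double alice(long n) {
        return 256.0 * n * (Math.log(n) / Math.log(2));
    }

    // Comparisons used by Bob's algorithm: n^2.
    static double bob(long n) {
        return (double) n * n;
    }

    public static void main(String[] args) {
        double secondsPerComparison = 1e-3;
        System.out.println("Size   Alice(s)   Bob(s)");
        for (long n = 1024; n <= 8192; n *= 2) {
            System.out.printf("%d   %.0f   %.0f%n", n,
                    alice(n) * secondsPerComparison,
                    bob(n) * secondsPerComparison);
        }
    }
}
```

At n = 1024 Bob needs fewer comparisons, but somewhere between 2048 and 4096 customers the two curves cross and Alice's algorithm takes the lead.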
Hardware improvement
A time-critical application requires you to sort
as many items as possible in an hour. How
many can you sort?
An hour has 3600 seconds, so we can make
3600000 comparisons. Thus if Alice’s
algorithm can sort n_A items, and Bob’s n_B
items, then
3600000 = 256 n_A lg n_A = n_B^2
which has the solution
n_A = 1352, n_B = 1897.
But suppose that we replace the machines with
ones that are four times as fast. Now each
comparison takes 1/4 × 10^−3 seconds so we can
make 14400000 comparisons in the same time.
Solving
14400000 = 256 n_A lg n_A = n_B^2
yields
n_A = 4620, n_B = 3794.
Notice that Alice’s algorithm gains much more
from the faster machines than Bob’s.
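These figures can be reproduced numerically. The sketch below finds the largest whole n that fits within a budget of comparisons, using binary search for 256 n lg n (which is increasing for n ≥ 1) and an integer square root for n^2; the names SortCapacity, maxAlice and maxBob are illustrative.

```java
public class SortCapacity {
    // Largest n >= 1 with 256 n lg n <= budget, found by binary search.
    static long maxAlice(long budget) {
        long lo = 1;      // cost at n = 1 is 0, which always fits
        long hi = budget; // generous upper bound
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            double cost = 256.0 * mid * (Math.log(mid) / Math.log(2));
            if (cost <= budget) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // Largest n with n^2 <= budget.
    static long maxBob(long budget) {
        long n = (long) Math.sqrt((double) budget);
        while (n * n > budget) n--;             // correct any rounding up
        while ((n + 1) * (n + 1) <= budget) n++; // correct any rounding down
        return n;
    }

    public static void main(String[] args) {
        for (long budget : new long[] {3_600_000L, 14_400_000L}) {
            System.out.println(budget + " comparisons: n_A=" + maxAlice(budget)
                    + " n_B=" + maxBob(budget));
        }
    }
}
```

A fourfold speed-up exactly doubles Bob's capacity, since n_B grows as the square root of the budget, while Alice's capacity grows by a factor of roughly 3.4.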
Why is big-O notation useful?
In one sense, big-O notation hides or loses a
lot of useful information. For example, the
functions
f(n) = n^2 / 1000
g(n) = 100 n^2
h(n) = 10^10 n^2
are all O(n^2) despite being quite different.
However in another sense, the notation
contains the essential information, in that it
completely describes the asymptotic rate of
growth of the function. In particular it contains
enough information to give answers to the
questions:
faster as the input size increases?
what size problems can I solve in the
same time?
An asymptotically better sorting algorithm
procedure MERGE-SORT(A, p, r)
  if p < r
    then q ← ⌊(p + r)/2⌋
         MERGE-SORT(A, p, q)
         MERGE-SORT(A, q + 1, r)
         MERGE(A, p, q, r)
procedure MERGE(A, p, q, r)
  n_1 ← q − p + 1; n_2 ← r − q
  allocate arrays L[1 .. n_1 + 1] and R[1 .. n_2 + 1]
  for i ← 1 to n_1
    do L[i] ← A[p + i − 1]
  for j ← 1 to n_2
    do R[j] ← A[q + j]
  L[n_1 + 1] ← ∞; R[n_2 + 1] ← ∞
  i ← 1; j ← 1
  for k ← p to r
    do if L[i] ≤ R[j]
         then A[k] ← L[i]
              i ← i + 1
         else A[k] ← R[j]
              j ← j + 1
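A Java version of the same idea might look as follows. This is a sketch, not the notes' own code: it uses 0-based indices and replaces the sentinel entries at the ends of L and R with explicit bounds checks, so that the input may contain any int value. The class name MergeSort is illustrative.

```java
import java.util.Arrays;

public class MergeSort {
    // Sorts a[p .. r] (inclusive bounds) in place.
    static void mergeSort(int[] a, int p, int r) {
        if (p < r) {
            int q = p + (r - p) / 2; // floor of the midpoint, without overflow
            mergeSort(a, p, q);
            mergeSort(a, q + 1, r);
            merge(a, p, q, r);
        }
    }

    // Merges the sorted runs a[p .. q] and a[q+1 .. r].
    static void merge(int[] a, int p, int q, int r) {
        int[] left = Arrays.copyOfRange(a, p, q + 1);
        int[] right = Arrays.copyOfRange(a, q + 1, r + 1);
        int i = 0;
        int j = 0;
        for (int k = p; k <= r; k++) {
            // Take from the left run if the right run is exhausted,
            // or if the left head is no larger than the right head.
            if (j >= right.length || (i < left.length && left[i] <= right[j])) {
                a[k] = left[i++];
            } else {
                a[k] = right[j++];
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 7, 1, 3, 2, 6};
        mergeSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```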
Merge-sort complexity
The complexity of Merge Sort can be shown to
be Θ(n lg n).
The Master Theorem
Merge Sort’s complexity can be described by
the recurrence relation:
F (n) = 2F (n/2) + n, where F (1) = 1.
As this variety of recurrence relation appears
frequently in divide and conquer algorithms it
is useful to have a method to find the
asymptotic complexity of these functions.
The Master Theorem: Let f(n) be a
function described by the recurrence:
f(n) = a f(n/b) + c n^d
where a ≥ 1, b > 1, d ≥ 0 and c > 0 are constants.
Then f(n) is
O(n^d)          if a < b^d
O(n^d lg n)     if a = b^d
O(n^(log_b a))  if a > b^d
See CLRS, 4.3.
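Applying the theorem is a matter of comparing a with b^d. The sketch below picks the matching case and reports the exponent of n; the class name MasterTheorem, the case numbering 1 to 3 (in the order the cases are listed above) and the small tolerance used for the equality test are all assumptions of this example.

```java
public class MasterTheorem {
    // Returns 1 if a < b^d, 2 if a = b^d, 3 if a > b^d.
    static int whichCase(double a, double b, double d) {
        double bd = Math.pow(b, d);
        if (Math.abs(a - bd) < 1e-9) {
            return 2;
        }
        return (a < bd) ? 1 : 3;
    }

    // Exponent of n in the bound: d in cases 1 and 2, log_b a in case 3.
    static double exponent(double a, double b, double d) {
        return (whichCase(a, b, d) == 3) ? Math.log(a) / Math.log(b) : d;
    }

    // Human-readable form of the bound.
    static String bound(double a, double b, double d) {
        String base = "O(n^" + exponent(a, b, d);
        return (whichCase(a, b, d) == 2) ? base + " lg n)" : base + ")";
    }

    public static void main(String[] args) {
        // Merge Sort: F(n) = 2F(n/2) + n, so a = 2, b = 2, d = 1.
        System.out.println("Merge Sort: case " + whichCase(2, 2, 1)
                + ", " + bound(2, 2, 1));
    }
}
```

For Merge Sort a = b^d, so the middle case applies and the bound is O(n lg n), in agreement with the complexity stated earlier.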
Average case analysis
The major problem with average case analysis
is that we must make an assumption about the
probability distribution of the inputs. For a
problem like Sorting there is at least a
theoretically reasonable choice—assume that
every permutation of length n has an equal
chance of occurring (already we are assuming
that the list has no duplicates).
For example, we can consider each of the 24
permutations when sorting four inputs with
insertion sort:
Comparisons  Inputs
3            1234, 2134
4            1243, 1324, 2143, 2314, 3124, 3214
5            1342, 1423, 2341, 2413, 3142, 3241, 4123, 4213
6            1432, 2431, 3412, 3421, 4132, 4231, 4312, 4321
So the weighted average of comparisons is
(2·3 + 6·4 + 8·5 + 8·6)/24 = 118/24 ≈ 4.92
(recall that the best case for four inputs is 3,
whereas the worst case is 6).
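The table can be regenerated by brute force. The sketch below counts the key comparisons (tests of the form A[i] > key) that insertion sort makes on one input and sums them over every permutation of a given array; the names AverageCase, comparisons and totalOverPermutations are illustrative.

```java
public class AverageCase {
    // Number of "A[i] > key" comparisons insertion sort performs on input.
    static int comparisons(int[] input) {
        int[] a = input.clone(); // leave the caller's array untouched
        int count = 0;
        for (int j = 1; j < a.length; j++) {
            int key = a[j];
            int i = j - 1;
            while (i >= 0) {
                count++; // one comparison of a[i] with key
                if (a[i] <= key) {
                    break;
                }
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
        }
        return count;
    }

    // Sum of comparisons over all permutations of a[k ..], with a[0 .. k-1] fixed.
    static int totalOverPermutations(int[] a, int k) {
        if (k == a.length) {
            return comparisons(a);
        }
        int total = 0;
        for (int i = k; i < a.length; i++) {
            swap(a, k, i);
            total += totalOverPermutations(a, k + 1);
            swap(a, k, i); // restore before trying the next choice
        }
        return total;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int total = totalOverPermutations(new int[] {1, 2, 3, 4}, 0);
        System.out.println("total = " + total + ", average = " + total / 24.0);
    }
}
```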
Inversions
Definition An inversion in a permutation σ is
an ordered pair (i, j) such that
i < j and σ_i > σ_j.
For example, the permutation σ = 1342 has
two inversions, while σ = 2431 has four.
It is straightforward to see that the number of
comparisons that a permutation requires to be
sorted is at least the number of inversions in it
(check this!) and exceeds it by at most n − 1.
(For sorting four inputs, the difference is at most 3.)
So the average number of comparisons
required is equal to the average number of
inversions in all the permutations of length n,
plus at most n − 1.
Theorem The average number of inversions
among all the permutations of length n is
n(n − 1)/4.
Thus Insertion Sort takes O(n^2) time on
average.
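Counting inversions straight from the definition takes only a double loop. The following sketch does exactly that, in O(n^2) time; the class name Inversions is illustrative, and the permutation is passed as an array of its entries.

```java
public class Inversions {
    // Counts pairs (i, j) with i < j and sigma[i] > sigma[j].
    static int count(int[] sigma) {
        int inversions = 0;
        for (int i = 0; i < sigma.length; i++) {
            for (int j = i + 1; j < sigma.length; j++) {
                if (sigma[i] > sigma[j]) {
                    inversions++;
                }
            }
        }
        return inversions;
    }

    public static void main(String[] args) {
        System.out.println("1342: " + count(new int[] {1, 3, 4, 2}));
        System.out.println("2431: " + count(new int[] {2, 4, 3, 1}));
    }
}
```

A permutation and its reverse have n(n − 1)/2 inversions between them, which is one way to see why the average is n(n − 1)/4.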
An asymptotically worse algorithm
Quicksort is Θ(n^2) in the worst case, but its
average-case running time is Θ(n lg n) and in practice
it is often faster than Merge Sort! (CLRS Chapter 7)
procedure QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q − 1)
         QUICKSORT(A, q + 1, r)
procedure PARTITION(A, p, r)
  x ← A[r]
  i ← p − 1
  for j ← p to r − 1
    do if A[j] ≤ x
         then i ← i + 1
              exchange A[i] ↔ A[j]
  exchange A[i + 1] ↔ A[r]
  return i + 1
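The two procedures translate almost line for line into Java. The version below is a sketch using 0-based, inclusive bounds and the last element as the pivot, as in the pseudo-code; the class name QuickSort is illustrative.

```java
import java.util.Arrays;

public class QuickSort {
    // Sorts a[p .. r] (inclusive bounds) in place.
    static void quicksort(int[] a, int p, int r) {
        if (p < r) {
            int q = partition(a, p, r);
            quicksort(a, p, q - 1);
            quicksort(a, q + 1, r);
        }
    }

    // Rearranges a[p .. r] around the pivot a[r] and returns the pivot's
    // final index: smaller-or-equal elements end up to its left.
    static int partition(int[] a, int p, int r) {
        int x = a[r];
        int i = p - 1;
        for (int j = p; j < r; j++) {
            if (a[j] <= x) {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, r);
        return i + 1;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {2, 8, 7, 1, 3, 5, 6, 4};
        quicksort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

With this pivot choice an already sorted input is a worst case: every partition is maximally unbalanced, which is where the quadratic bound comes from.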
Input size
The complexity of an algorithm is a measure of
how long it takes as a function of the size of
the input. For Sorting we took the number of
items n, as a measure of the size of the input.
This is only true provided that the actual size
of the items does not grow as their number
increases. As long as they are all some
constant size K, then the input size is Kn. The
actual value of the constant does not matter,
as we are only expressing the complexity in
big-O notation, which suppresses all constants.
But what is an appropriate input parameter for
Travelling Salesman? If the instance has n
cities, then the input itself has size Kn^2. This
is because we need to specify the distance
between each pair of cities.
Therefore you must be careful about what
parameter most accurately reflects the size of
the input.
Summary (cont.)
Insertion Sort runs in time O(n^2).
Merge Sort runs in time O(n lg n).
Quicksort runs in time O(n^2) but is faster than Merge Sort
in the average case.
O(n), O(n lg n), O(n^k)) are regarded as
feasible.
n ), O(n))
are regarded as infeasible.
Overview
What is a graph?
Definition A graph G consists of a set V (G)
called vertices together with a collection E(G)
of pairs of vertices. Each pair {x, y} ∈ E(G) is
called an edge of G.
Example If
and
then G is a graph with 4 vertices and 5 edges.
Isomorphisms
Consider the following two graphs:
Apart from the “names” of the vertices and
the geometric positions it is clear that these
two graphs are basically the same — in this
situation we say that they are isomorphic.
Definition Two graphs G_1 and G_2 are
isomorphic if there is a one-one mapping
φ : V(G_1) → V(G_2) such that
{φ(x), φ(y)} ∈ E(G_2) if and only if
{x, y} ∈ E(G_1).
In this case the isomorphism is given by the
mapping
φ(A) = 2, φ(B) = 3, φ(C) = 4, φ(D) = 1
Basic properties of graphs
Let us consider some of the basic terminology
of graphs:
Adjacency If {x, y} ∈ E(G), we say that x and
y are adjacent to each other, and sometimes
write x ∼ y. The number of vertices adjacent
to v is called the degree or valency of v. The
sum of the degrees of the vertices of a graph is
even.
Paths A path of length n in a graph is a
sequence of vertices v_1 ∼ v_2 ∼ · · · ∼ v_{n+1}
such that {v_i, v_{i+1}} ∈ E(G) and the vertices
{v_1, v_2, ..., v_{n+1}} are distinct.
Cycles A cycle of length n is a sequence of
vertices v_1 ∼ v_2 ∼ · · · ∼ v_n ∼ v_{n+1} such that
v_1 = v_{n+1}, {v_i, v_{i+1}} ∈ E(G) and therefore only
the vertices {v_1, v_2, ..., v_n} are distinct.
Distance The distance between two vertices x
and y in a graph is the length of the shortest
path between them.
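In an unweighted graph the distance can be computed by breadth-first search, which visits vertices in order of increasing path length. The sketch below assumes vertices are numbered 0 to n − 1 and the graph is stored as neighbour lists; the class name GraphDistance, the helper fromEdges and the example graph in main are hypothetical, not taken from the notes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class GraphDistance {
    // Builds neighbour lists for an undirected graph on vertices 0 .. n-1.
    static List<List<Integer>> fromEdges(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < n; v++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Length of a shortest path from source to target, or -1 if none exists.
    static int distance(List<List<Integer>> adj, int source, int target) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1); // -1 marks "not yet reached"
        Deque<Integer> queue = new ArrayDeque<>();
        dist[source] = 0;
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            if (u == target) {
                return dist[u];
            }
            for (int v : adj.get(u)) {
                if (dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical example: a path 0-1-2-3 with a shortcut edge 0-2.
        List<List<Integer>> g = fromEdges(4, new int[][] {{0, 1}, {1, 2}, {2, 3}, {0, 2}});
        System.out.println("distance(0, 3) = " + distance(g, 0, 3));
    }
}
```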
Subgraphs
If G is a graph, then a subgraph H is a graph
such that
V(H) ⊆ V(G)
and
E(H) ⊆ E(G).
A spanning subgraph H has the property that
V (H) = V (G) — in other words H has been
obtained from G only by removing edges.
An induced subgraph H must contain every
edge of G whose endpoints lie in V (H) — in
other words H has been obtained from G by
removing vertices and their adjoining edges.
Counting Exercises
In the graph G 4
there?
there?
Connectivity, forests and trees
Connected A graph G is connected if there is
a path between any two vertices. If the graph
is not connected then its connected
components are the maximal induced
subgraphs that are connected.
Forests A forest is a graph that has no cycles.
Trees A tree is a forest with only one
connected component. It is easy to see that a
tree with n vertices must have exactly n − 1
edges.
The vertices of degree 1 in a tree are called
the leaves of the tree.
Directed and weighted graphs
There are two important extensions to the
basic definition of a graph.
Directed graphs In a directed graph, an
edge is an ordered pair of vertices, and hence
has a direction. In directed graphs, edges are
often called arcs.
Directed Tree Each vertex has at most one
directed edge leading into it, and there is one
vertex (the root) which has a path to every
other vertex.
Weighted graphs In a weighted graph, each
of the edges is assigned a weight (usually a
non-negative integer). More formally we say
that a weighted graph is a graph G together
with a weight function w : E(G) → R (then
w(e) represents the weight of the edge e).
Distance in weighted graphs
When talking about weighted graphs, we need
to extend the concept of distance.
Definition In a weighted graph X a path
x = x_0 ∼ x_1 ∼ · · · ∼ x_n = y
has weight
Σ_{i=0}^{n−1} w(x_i, x_{i+1}).
The shortest path between two vertices x and
y is the path of minimum weight.
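The weight of a given path is just the sum in the definition. The sketch below assumes the weights are stored in a symmetric matrix w, with w[x][y] the weight of the edge between x and y, and that the path is given as an array of vertex numbers; the class name PathWeight and the example weights are hypothetical.

```java
public class PathWeight {
    // Sums w(x_i, x_{i+1}) along the path x_0, x_1, ..., x_n.
    static int pathWeight(int[][] w, int[] path) {
        int total = 0;
        for (int i = 0; i + 1 < path.length; i++) {
            total += w[path[i]][path[i + 1]];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical triangle with edge weights 4 (0-1), 1 (0-2) and 2 (1-2).
        int[][] w = {
            {0, 4, 1},
            {4, 0, 2},
            {1, 2, 0}
        };
        System.out.println("direct 0-1:   " + pathWeight(w, new int[] {0, 1}));
        System.out.println("via 2, 0-2-1: " + pathWeight(w, new int[] {0, 2, 1}));
    }
}
```

In this example the path with more edges has the smaller weight, which is why distance in a weighted graph cannot be found by counting edges alone.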
Representation of graphs
There are two main ways to represent a graph
— adjacency lists or an adjacency matrix.
Adjacency lists The graph G is represented
by an array of |V (G)| linked lists, with each list
containing the neighbours of a vertex.
Therefore we would represent G_4 as follows:
[Figure: adjacency lists of G_4; the layout is not preserved.]
This representation requires two list elements
for each edge and therefore the space required
is Θ(|V (G)| + |E(G)|).
Note: In general to avoid writing |V(G)| and
|E(G)| we shall simply put V = |V(G)| and
E = |E(G)|.
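A minimal adjacency-list structure could be sketched in Java as below. It uses ArrayList rather than hand-built linked lists, which is a substitution made for brevity; the class name AdjacencyListGraph, its methods and the example edges are illustrative, and vertices are assumed to be numbered from 0.

```java
import java.util.ArrayList;
import java.util.List;

public class AdjacencyListGraph {
    // adj.get(v) holds the neighbours of vertex v.
    private final List<List<Integer>> adj = new ArrayList<>();
    private int edgeCount = 0;

    AdjacencyListGraph(int vertices) {
        for (int v = 0; v < vertices; v++) {
            adj.add(new ArrayList<>());
        }
    }

    // Adds the undirected edge {x, y}: one list element in each endpoint's list.
    void addEdge(int x, int y) {
        adj.get(x).add(y);
        adj.get(y).add(x);
        edgeCount++;
    }

    // The degree of v is the length of its neighbour list.
    int degree(int v) {
        return adj.get(v).size();
    }

    int edges() {
        return edgeCount;
    }

    // Total number of list elements across all vertices.
    int listElements() {
        int total = 0;
        for (List<Integer> neighbours : adj) {
            total += neighbours.size();
        }
        return total;
    }

    public static void main(String[] args) {
        AdjacencyListGraph g = new AdjacencyListGraph(4);
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {1, 3}, {2, 3}};
        for (int[] e : edges) {
            g.addEdge(e[0], e[1]);
        }
        System.out.println("list elements = " + g.listElements()
                + ", 2E = " + 2 * g.edges());
    }
}
```

The total number of list elements is the sum of the degrees, which equals 2E; together with the array of V list heads this gives the Θ(V + E) space bound.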
For comparison, the graph G_4:
[Figure: drawing of the graph G_4]