Analysis of Algorithms: Understanding the Running Time of Sorting Algorithms, Slides of Data Representation and Algorithm Design

An analysis of algorithms that focuses on the sorting problem and its solutions, including insertion sort. It covers observations on running time, experimental hypotheses, and predictions, and it discusses how to measure and estimate the running time of algorithms.

Typology: Slides

2011/2012

Uploaded on 07/15/2012

saandeep

Analysis of Algorithms

Running Time

"As soon as an Analytical Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise - By what course of calculation can these results be arrived at by the machine in the shortest time?" - Charles Babbage (1864)

[Figure: Analytical Engine (schematic)]

Overview

Analysis of algorithms. Framework for comparing algorithms and predicting performance.

Scientific method.
• Observe some feature of the universe.
• Hypothesize a model that is consistent with observation.
• Predict events using the hypothesis.
• Verify the predictions by making further observations.
• Validate the theory by repeating the previous steps until the hypothesis agrees with the observations.

Universe = computer itself.

Case Study: Sorting

Sorting problem:
• Given N items, rearrange them in ascending order.
• Applications: statistics, databases, data compression, computational biology, computer graphics, scientific computing, ...

Example.
Unsorted: Hauser, Hong, Hsu, Hayes, Haskell, Hanley, Hornet
Sorted: Hanley, Haskell, Hauser, Hayes, Hong, Hornet, Hsu
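
As a minimal illustration of the problem statement, the sketch below rearranges the seven names from the slide into ascending order with the Java library method Arrays.sort. The class name SortNames and the method sorted are assumptions made for this example; the slides do not define them.

```java
import java.util.Arrays;

// Hypothetical demo class (not from the slides).
class SortNames {
    // Return a sorted copy so the caller's array is left untouched.
    public static String[] sorted(String[] items) {
        String[] copy = Arrays.copyOf(items, items.length);
        Arrays.sort(copy);  // ascending order as defined by String.compareTo
        return copy;
    }

    public static void main(String[] args) {
        String[] names = { "Hauser", "Hong", "Hsu", "Hayes", "Haskell", "Hanley", "Hornet" };
        System.out.println(Arrays.toString(sorted(names)));
    }
}
```

The library call hides the algorithm; the rest of the slides look at what a specific algorithm costs.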

Insertion Sort

Insertion sort.
• Brute-force sorting solution.
• Move left-to-right through array.
• Exchange next element with larger elements to its left, one-by-one.

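    // less(v, w) and exch(a, i, j) are helper methods not shown on the slide.
    public static void insertionSort(double[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            // Exchange a[i] with each larger element to its left.
            for (int j = i; j > 0; j--) {
                if (less(a[j], a[j-1]))
                    exch(a, j, j-1);
                else break;
            }
        }
    }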
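
The method above relies on two helpers, less and exch, that the slide does not show. The following is a self-contained sketch under the assumption that less tests whether one double is strictly smaller than another and that exch swaps two array entries. Both helper bodies and the class name InsertionDemo are assumptions, not part of the slides.

```java
import java.util.Arrays;

// Hypothetical wrapper class; the helper bodies are assumed.
class InsertionDemo {
    // Assumed helper: is v strictly smaller than w?
    private static boolean less(double v, double w) {
        return v < w;
    }

    // Assumed helper: swap a[i] and a[j].
    private static void exch(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void insertionSort(double[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            // Move a[i] left until the element before it is not larger.
            for (int j = i; j > 0; j--) {
                if (less(a[j], a[j - 1]))
                    exch(a, j, j - 1);
                else break;
            }
        }
    }

    public static void main(String[] args) {
        double[] a = { 0.7, 0.1, 0.9, 0.3, 0.5 };
        insertionSort(a);
        System.out.println(Arrays.toString(a));
    }
}
```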

Insertion Sort: Observation

Observe and tabulate running time for various values of N.
• Data source: N random numbers between 0 and 1.

N          Comparisons
5,000      6.2 million
10,000     25 million
20,000     99 million
40,000     400 million
80,000     1,600 million

Insertion Sort: Experimental Hypothesis

Data analysis. Plot # comparisons vs. input size on log-log scale.

Regression. Fit line through data points ≈ a N^b. The slope of the line on the log-log plot is the exponent b.

Hypothesis. # comparisons grows quadratically with input size ≈ N^2/4.

Insertion Sort: Prediction and Verification

Experimental hypothesis. # comparisons ≈ N^2/4.

Prediction. 400 million comparisons for N = 40,000.

Observations.

N          Comparisons
40,000     401.3 million
40,000     399.7 million
40,000     401.6 million
40,000     400.0 million

Agrees.

Prediction. 10 billion comparisons for N = 200,000.

Observation.

N          Comparisons
200,000    9.997 billion

Agrees.
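
The observe, hypothesize, and predict loop can be tried at small scale. The sketch below counts the comparisons insertion sort makes on N random numbers between 0 and 1, prints the N^2/4 prediction next to each count, and estimates the exponent b from two runs at sizes N and 2N. The class name, method names, seed, and input sizes are assumptions chosen for illustration.

```java
import java.util.Random;

// Hypothetical experiment harness; names and seed are assumptions.
class ComparisonCount {
    // Insertion sort that returns the number of comparisons it made.
    public static long countComparisons(double[] a) {
        long count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i; j > 0; j--) {
                count++;  // one comparison of a[j] with a[j-1]
                if (a[j] < a[j - 1]) {
                    double t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
                } else break;
            }
        }
        return count;
    }

    // N random numbers between 0 and 1, matching the slides' data source.
    public static double[] randomInput(int N, Random rng) {
        double[] a = new double[N];
        for (int i = 0; i < N; i++) a[i] = rng.nextDouble();
        return a;
    }

    // Estimate b in a * N^b from counts at sizes N and 2N:
    // the slope of the line through the two points on a log-log plot.
    public static double slope(long countN, long count2N) {
        return Math.log((double) count2N / countN) / Math.log(2.0);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);  // fixed seed, chosen arbitrarily
        long prev = 0;
        for (int N = 1000; N <= 16000; N *= 2) {
            long c = countComparisons(randomInput(N, rng));
            double predicted = (double) N * N / 4;
            System.out.printf("N=%d comparisons=%d N^2/4=%.0f", N, c, predicted);
            if (prev > 0) System.out.printf(" slope=%.2f", slope(prev, c));
            System.out.println();
            prev = c;
        }
    }
}
```

Counting comparisons rather than seconds keeps the experiment independent of the machine, which is why the slides tabulate comparisons first.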

Insertion Sort: Experimental Hypothesis

Data analysis. Plot time vs. input size on log-log scale.

Regression. Fit line through data points ≈ a N^b.

Hypothesis. Running time grows quadratically with input size.

Timing in Java

Wall clock. Measure time between beginning and end of computation.
• Manual: Skagen wristwatch.
• Automatic: Stopwatch.java library.

    Stopwatch.tic();
    // ... computation to be timed ...
    double elapsed = Stopwatch.toc();

    public class Stopwatch {
        private static long start;

        // Record the wall-clock time at the start of the computation.
        public static void tic() {
            start = System.currentTimeMillis();
        }

        // Return the number of seconds elapsed since the last tic().
        public static double toc() {
            long stop = System.currentTimeMillis();
            return (stop - start) / 1000.0;
        }
    }

Measuring Running Time

Factors that affect running time.
• Machine.
• Compiler.
• Algorithm.
• Input data.

More factors.
• Caching.
• Garbage collection.
• Just-in-time compilation.
• CPU used by other processes.

Bottom line. Often hard to get precise measurements.

Summary

Analysis of algorithms. Framework for comparing algorithms and predicting performance.

Scientific method.
• Observe some feature of the universe.
• Hypothesize a model that is consistent with observation.
• Predict events using the hypothesis.
• Verify the predictions by making further observations.
• Validate the theory by repeating the previous steps until the hypothesis agrees with the observations.

Remaining question. How to formulate a hypothesis?

Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226

How To Formulate a Hypothesis

Types of Hypotheses

Worst case running time. Obtain bound on running time of algorithm on any input of a given size N.
• Generally captures efficiency in practice.
• Draconian view, but hard to find effective alternative.

Average case running time. Obtain bound on running time of algorithm on random input as a function of input size N.
• Hard to accurately model real instances by random distributions.
• May perform poorly on other distributions.

Amortized running time. Worst-case bound on running time of any sequence of N operations.
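
To make the difference between worst-case and average-case hypotheses concrete, the sketch below counts insertion sort comparisons on three inputs of the same size: already sorted, reverse sorted (the worst case for insertion sort), and random. The class name, method names, and seed are assumptions for illustration only.

```java
import java.util.Random;

// Hypothetical demo; not part of the slides.
class CaseAnalysis {
    // Count comparisons made by insertion sort on a copy of the input.
    public static long comparisons(double[] input) {
        double[] a = input.clone();
        long count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i; j > 0; j--) {
                count++;
                if (a[j] < a[j - 1]) {
                    double t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
                } else break;
            }
        }
        return count;
    }

    // Already sorted input: 0, 1, ..., N-1.
    public static double[] ascending(int N) {
        double[] a = new double[N];
        for (int i = 0; i < N; i++) a[i] = i;
        return a;
    }

    // Reverse sorted input: N, N-1, ..., 1.
    public static double[] descending(int N) {
        double[] a = new double[N];
        for (int i = 0; i < N; i++) a[i] = N - i;
        return a;
    }

    public static void main(String[] args) {
        int N = 1000;
        double[] random = new Random(1).doubles(N).toArray();
        System.out.println("sorted input:  " + comparisons(ascending(N)));   // N - 1
        System.out.println("reverse input: " + comparisons(descending(N)));  // N(N-1)/2
        System.out.println("random input:  " + comparisons(random));         // roughly N^2/4
    }
}
```

The worst-case count is about twice the random-input count, so a worst-case bound of N(N-1)/2 is safe but pessimistic for random data.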

Estimating the Running Time

Total running time: sum of cost × frequency for all of the basic ops.
• Cost depends on machine, compiler.
• Frequency depends on algorithm, input.

Cost for sorting.
• A = # exchanges.
• B = # comparisons.
• Cost on a typical machine = 11A + 4B.

Frequency of sorting ops.
• N = # elements to sort.
• Selection sort: A = N-1, B = N(N-1)/2.

[Figure: Donald Knuth, 1974 Turing Award]

Asymptotic Running Time

An easier alternative.
(i) Analyze asymptotic growth as a function of input size N.
(ii) For medium N, run and measure time.
(iii) For large N, use (i) and (ii) to predict time.

Asymptotic growth rates.
• Estimate as a function of input size N.
  • N, N log N, N^2, N^3, 2^N, N!
• Ignore lower order terms and leading coefficients.
  • Ex. 6N^3 + 17N^2 + 56 is asymptotically proportional to N^3
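
The selection sort frequencies and the cost model can be checked together. The sketch below is one plausible selection sort that counts exchanges (A) and comparisons (B) and then applies the 11A + 4B cost from the slides. Substituting A = N-1 and B = N(N-1)/2 gives 11(N-1) + 2N(N-1) = 2N^2 + 9N - 11, which is asymptotically proportional to N^2. The class name, the counters, and the choice to exchange once per pass even when the minimum is already in place are assumptions.

```java
// Hypothetical demo; the counters and class name are assumptions.
class SelectionCost {
    // Sort a in place; return {exchanges A, comparisons B}.
    public static long[] sortAndCount(double[] a) {
        long A = 0, B = 0;
        int N = a.length;
        for (int i = 0; i < N - 1; i++) {
            int min = i;
            for (int j = i + 1; j < N; j++) {
                B++;  // one comparison
                if (a[j] < a[min]) min = j;
            }
            double t = a[i]; a[i] = a[min]; a[min] = t;
            A++;      // one exchange per pass, even if min == i
        }
        return new long[] { A, B };
    }

    // Cost model from the slides: 11A + 4B on a typical machine.
    public static long cost(long A, long B) {
        return 11 * A + 4 * B;
    }

    public static void main(String[] args) {
        int N = 1000;
        double[] a = new double[N];
        for (int i = 0; i < N; i++) a[i] = Math.random();
        long[] ab = sortAndCount(a);
        System.out.println("A = " + ab[0] + " (N-1 = " + (N - 1) + ")");
        System.out.println("B = " + ab[1] + " (N(N-1)/2 = " + ((long) N * (N - 1) / 2) + ")");
        System.out.println("cost = " + cost(ab[0], ab[1]));
    }
}
```

Unlike insertion sort, these counts do not depend on the input order, so the frequencies are exact rather than averages.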

Logarithmic Time

Logarithmic time. Running time is O(log N).

Searching in a sorted list. Given a sorted array of items, find index of query item.

O(log N) solution. Binary search.

    public static int binarySearch(String[] a, String key) {
        int left = 0;
        int right = a.length - 1;
        while (left <= right) {
            // Midpoint computed this way cannot overflow.
            int mid = left + (right - left) / 2;
            int cmp = key.compareTo(a[mid]);
            if (cmp < 0) right = mid - 1;
            else if (cmp > 0) left = mid + 1;
            else return mid;
        }
        return -1;  // key not found
    }
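
The logarithmic bound can be observed directly by counting how many array entries a search examines. The sketch below is an instrumented copy of the method above. The probes counter, the class name, and the zero-padded test keys are assumptions added for illustration. For an array of N items the loop runs at most floor(log2 N) + 1 times, because each iteration discards at least half of the remaining range.

```java
// Hypothetical instrumented variant; the probe counter is an assumption.
class BinarySearchProbes {
    public static int probes;  // entries examined by the most recent search

    public static int binarySearch(String[] a, String key) {
        probes = 0;
        int left = 0;
        int right = a.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            probes++;
            int cmp = key.compareTo(a[mid]);
            if (cmp < 0) right = mid - 1;
            else if (cmp > 0) left = mid + 1;
            else return mid;
        }
        return -1;  // key not found
    }

    public static void main(String[] args) {
        int N = 1 << 16;
        String[] a = new String[N];
        // Zero-padded decimal strings sort in the same order as the numbers.
        for (int i = 0; i < N; i++) a[i] = String.format("%05d", i);
        int index = binarySearch(a, "12345");
        System.out.println("index=" + index + " probes=" + probes);
        binarySearch(a, "zzzzz");  // unsuccessful search
        System.out.println("miss probes=" + probes);
    }
}
```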

Linear Time

Linear time. Running time is O(N).

Find the maximum. Find the maximum value of N items in an array.

    // Assumes a is an array holding N doubles.
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < N; i++) {
        if (a[i] > max)
            max = a[i];
    }

Linearithmic Time

Linearithmic time. Running time is O(N log N).

Sorting. Given an array of N elements, rearrange in ascending order.

O(N log N) solution. Mergesort. [stay tuned]

Remark. Ω(N log N) comparisons required. [stay tuned]

Quadratic Time

Quadratic time. Running time is O(N^2).

Closest pair of points. Given N points in the plane, find closest pair.

O(N^2) solution. Enumerate all pairs of points.

Remark. Ω(N^2) seems inevitable, but this is just an illusion.

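    // Assumes x[i] and y[i] hold the coordinates of point i, for N points.
    // min ends up as the squared distance between the closest pair.
    double min = Double.POSITIVE_INFINITY;
    for (int i = 0; i < N; i++) {
        for (int j = i + 1; j < N; j++) {
            double dx = (x[i] - x[j]);
            double dy = (y[i] - y[j]);
            if (dx*dx + dy*dy < min)
                min = dx*dx + dy*dy;
        }
    }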

Exponential Time

Exponential time. Running time is O(a^N) for some constant a > 1.

Fibonacci sequence: 1 1 2 3 5 8 13 21 34 55 …

O(φ^N) solution. The recursive method F. Spectacularly inefficient!

Efficient solution. Closed-form formula using the nearest integer function.

    public static int F(int n) {
        if (n == 0 || n == 1) return n;
        else return F(n-1) + F(n-2);
    }
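
To see how quickly the recursion blows up, the sketch below counts its calls and compares the result with two cheaper alternatives: a linear-time loop and the nearest-integer closed form. The loop is an addition not found on the slide, and all class and method names are assumptions. The closed form is computed in double precision, so this sketch should only be trusted for small n.

```java
// Hypothetical comparison harness; names are assumptions.
class FibCompare {
    public static long calls;  // number of invocations of slow()

    // The exponential-time recursion from the slide, instrumented.
    public static long slow(int n) {
        calls++;
        if (n == 0 || n == 1) return n;
        return slow(n - 1) + slow(n - 2);
    }

    // Linear-time alternative: keep only the last two values.
    public static long fast(int n) {
        if (n == 0) return 0;
        long prev = 0, curr = 1;
        for (int i = 2; i <= n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }

    // Closed form: nearest integer to phi^n / sqrt(5).
    // Limited by double precision, so use only for small n.
    public static long closedForm(int n) {
        double phi = (1 + Math.sqrt(5)) / 2;
        return Math.round(Math.pow(phi, n) / Math.sqrt(5));
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 30; n += 10) {
            calls = 0;
            long f = slow(n);
            System.out.println("F(" + n + ") = " + f + ", recursive calls = " + calls
                + ", loop = " + fast(n) + ", closed form = " + closedForm(n));
        }
    }
}
```

The call count itself grows like the Fibonacci numbers, which is where the O(φ^N) running time comes from.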

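$$F(N) = \left\lfloor \frac{\phi^N}{\sqrt{5}} \right\rceil, \qquad \phi = \frac{1 + \sqrt{5}}{2}$$

Here $\lfloor \cdot \rceil$ is the nearest integer function.

Summary of Common Hypotheses

Complexity   Description                                                               When N doubles, running time
1            Constant algorithm is independent of input size.                          does not change
log N        Logarithmic algorithm gets slightly slower as N grows.                    increases by a constant
N            Linear algorithm is optimal if you need to process N inputs.              doubles
N log N      Linearithmic algorithm scales to huge problems.                           slightly more than doubles
N^2          Quadratic algorithm practical for use only on relatively small problems.  quadruples
2^N          Exponential algorithm is not usually practical.                           squares!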
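
The "when N doubles" behavior can be checked numerically for the polynomial and linearithmic rows by computing T(2N) / T(N) for a cost model T. The sketch below does this; the class name, the ratio helper, and the sample value of N are assumptions for illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper for the doubling test; not from the slides.
class DoublingRatio {
    // Ratio T(2N) / T(N) for a given cost model T.
    public static double ratio(DoubleUnaryOperator T, double N) {
        return T.applyAsDouble(2 * N) / T.applyAsDouble(N);
    }

    public static void main(String[] args) {
        double N = 1_000_000;
        System.out.println("N:       " + ratio(n -> n, N));
        System.out.println("N log N: " + ratio(n -> n * Math.log(n), N));
        System.out.println("N^2:     " + ratio(n -> n * n, N));
        System.out.println("N^3:     " + ratio(n -> n * n * n, N));
    }
}
```

Running the same ratio on measured times instead of formulas turns the table into a quick experiment: a ratio near 4 suggests quadratic growth, a ratio slightly above 2 suggests linearithmic growth.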

$^ % & '^ (^. nearest integer function 30 Summary of Common Hypotheses When N doubles, Complexity Description running time 2 N Exponential algorithm is not usually practical. squares! N^2 Q reulaatdirvaetliyc samlgaollr iptrhombl^ eprmasc.tical^ for^ use^ only^ on quadruples 1 Constant algorithm is independent of input size. does not change increases by a constant Logarithmic algorithm gets slightly slower as N log N grows. N^ L Nin ienapru^ taslg.orithm^ is^ optimal^ if^ you^ need^ to^ process doubles slightly more than N log N Linearithmic algorithm scales to huge problems. doubles