Optimizing Program Performance: Strategies and Techniques, Study notes of Computer Science

These notes give an overview of program performance optimization: when and how to optimize, the best strategies for optimization, and specific topics such as bottlenecks, timing and profiling, algorithm analysis, and tuning the code. They also include examples and exercises.

Uploaded on 08/19/2009

koofers-user-eod
Performance*

• Objective: To learn when and how to optimize the performance of a program.
• “The first principle of optimization is don’t.”
  – Knowing how a program will be used and the environment it runs in, is there any benefit to making it faster?

*The examples in these slides come from Brian W. Kernighan and Rob Pike, “The Practice of Programming”, Addison-Wesley, 1999.


Approach

• The best strategy is to use the simplest, cleanest algorithms and data structures appropriate for the task.
• Then measure performance to see if changes are needed.
• Enable compiler options to generate the fastest possible code.
• Assess which changes to the program will have the most effect (profile the code).
• Make changes one at a time and re-assess (always retest to verify correctness).
  – Consider alternative algorithms
  – Tune the code
  – Consider a lower-level language (just for time-sensitive components)

A Bottleneck

• isspam example from the text
  – Heavily used
  – Existing implementation not fast enough in current environment
• Benchmark
• Profile
  – Tune code
  – Change algorithm

Timing

• In a Unix environment
  – time command
• Example (time quicksort from chapter 2)
  – head -10000 < /usr/share/dict/words | shuffle > in
  – gcc -o sort1 sort1.c quicksort.c
  – time sort1 < in > /dev/null

Solving Recurrences By Repeated Substitution

• T(n) = 2T(n/2) + cn

$$
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 2[2T(n/4) + c(n/2)] + cn = 4T(n/4) + cn + cn \\
     &= 2^2 T(n/2^2) + 2cn \\
     &= 2^2[2T(n/2^3) + c(n/2^2)] + 2cn \\
     &= 2^3 T(n/2^3) + 3cn \\
     &\;\;\vdots \\
     &= 2^j T(n/2^j) + jcn
\end{aligned}
$$

Suppose n = 2^k and T(1) = c. Taking j = k gives

$$
T(n) = 2^k T(1) + kcn = cn + cn\lg n
$$
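
The closed form for T(n) = 2T(n/2) + cn with T(1) = c can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names `t_recurrence` and `t_closed_form` are illustrative, and n is assumed to be a power of two, as in the derivation.

```c
#include <assert.h>

/* Evaluate T(n) = 2T(n/2) + c*n with T(1) = c directly from the recurrence.
   Assumption: n is a power of two, as in the derivation. */
long t_recurrence(long n, long c)
{
    assert(n >= 1 && (n & (n - 1)) == 0);
    if (n == 1)
        return c;
    return 2 * t_recurrence(n / 2, c) + c * n;
}

/* Evaluate the closed form c*n + c*n*lg(n) for n = 2^k. */
long t_closed_form(long n, long c)
{
    assert(n >= 1 && (n & (n - 1)) == 0);
    long k = 0;
    for (long m = n; m > 1; m /= 2)
        k++;                        /* k = lg(n) */
    return c * n + c * n * k;
}
```

For example, with c = 1 and n = 8 both functions give 8 + 8·3 = 32.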

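Solving Recurrences By Repeated Substitution

• T(n) = T(n-1) + cn

$$
\begin{aligned}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &= T(n-3) + c(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(n-j) + c[(n-j+1) + \cdots + n]
\end{aligned}
$$

Suppose T(1) = c. Taking j = n-1 gives

$$
\begin{aligned}
T(n) &= T(n-(n-1)) + c[(n-(n-1)+1) + \cdots + n] \\
     &= T(1) + c[2 + 3 + \cdots + n] \\
     &= c[1 + 2 + \cdots + n] \\
     &= c\sum_{j=1}^{n} j = \frac{cn(n+1)}{2}
\end{aligned}
$$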
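
The same kind of check works for T(n) = T(n-1) + cn with T(1) = c. This is a small sketch under the slide's assumptions. The names `t_linear_recurrence` and `t_linear_closed_form` are illustrative, and the recurrence is unrolled with a loop rather than recursion to avoid deep call stacks.

```c
#include <assert.h>

/* Build T(n) = T(n-1) + c*n upward from T(1) = c. */
long t_linear_recurrence(long n, long c)
{
    assert(n >= 1);
    long t = c;                     /* T(1) */
    for (long m = 2; m <= n; m++)
        t += c * m;                 /* T(m) = T(m-1) + c*m */
    return t;
}

/* Closed form c*n*(n+1)/2 obtained by repeated substitution. */
long t_linear_closed_form(long n, long c)
{
    assert(n >= 1);
    return c * n * (n + 1) / 2;
}
```

With c = 2 and n = 100 both functions give 2·100·101/2 = 10100.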

What does Asymptotic Analysis mean for Actual Runtimes?

• If $T(n) = \Theta(n^2)$, say $T(n) = cn^2 + o(n^2)$
  – Doubling the input size increases the time by a factor of 4
  – $T(2n)/T(n) = (4cn^2 + o(n^2)) / (cn^2 + o(n^2))$, which in the limit is equal to 4. Here $o(n^2)$ means lower-order terms.
• If $T(n) = \Theta(n \log n)$, say $T(n) = cn\log n + o(n \log n)$
  – Doubling the input size roughly doubles the time [same as linear]
  – $T(2n)/T(n) = (2cn\log(2n) + o(n\log n)) / (cn\log n + o(n\log n)) = (2cn\log n + o(n\log n)) / (cn\log n + o(n\log n))$, which in the limit is equal to 2. The second step uses $2cn\log(2n) = 2cn\log n + 2cn\log 2$, where the last term is itself $o(n\log n)$.
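
To see how quickly the doubling ratios approach their limits, one can evaluate them on explicit cost models. The sketch below is an illustration, not taken from the slides. The models c·n² + b·n and c·n·lg(n) + b·n, the constants c and b, and all function names are assumptions chosen for the example. It restricts n to powers of two (n ≥ 2) so that lg(n) can be computed without the math library.

```c
#include <assert.h>

/* Illustrative cost model: c*n^2 plus a lower-order term b*n. */
static double quad_model(double n, double c, double b)
{
    return c * n * n + b * n;
}

/* lg(n) for n a power of two, computed by repeated halving. */
static int lg_pow2(long n)
{
    assert(n >= 1 && (n & (n - 1)) == 0);
    int k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}

/* Illustrative cost model: c*n*lg(n) plus a lower-order term b*n. */
static double nlogn_model(long n, double c, double b)
{
    return c * (double)n * lg_pow2(n) + b * (double)n;
}

/* T(2n)/T(n) for the quadratic model; tends to 4 as n grows. */
double doubling_ratio_quad(long n, double c, double b)
{
    return quad_model(2.0 * (double)n, c, b) / quad_model((double)n, c, b);
}

/* T(2n)/T(n) for the n*lg(n) model (n a power of two, n >= 2); tends to 2. */
double doubling_ratio_nlogn(long n, double c, double b)
{
    assert(n >= 2);
    return nlogn_model(2 * n, c, b) / nlogn_model(n, c, b);
}
```

With b = 0 the n·lg(n) ratio is exactly 2(1 + 1/lg n), so it converges to 2 much more slowly than the quadratic ratio converges to 4. At n = 4 it is still 3, and at n = 2^20 it is 2.1.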

Empirical Confirmation

• Run and time quicksort (without random pivot) on sorted inputs of size 10,000, 20,000, and 40,000
• Compute the ratio of successive times to see whether each doubling increases the time by a factor of 4.
• What if random inputs are used?

Determining Growth Rate Empirically

• Time quicksort with a range of input sizes
  – e.g. 1000, 2000, 3000, …, 10000
• Write a program that times sort for a range of inputs. Use the clock function to time code inside a program.
  – T(1000), T(2000), T(3000), …, T(10000)
  – Plot the times for the range of inputs to visualize the growth
  – Compute ratios to compare to known functions
  – T(1000)/1000^2, T(2000)/2000^2, …, T(10000)/10000^2
• Does the ratio approach a constant, go to 0, or go to ∞?
  – I.e. is it growing at the same rate as, slower than, or faster than the comparison function?
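
Timing code from inside a program with the standard `clock` function might look like the sketch below. It is an illustration rather than the program used in the slides. `quadratic_work` is a stand-in workload, and `time_call` and `ratio_to_square` are hypothetical helper names.

```c
#include <assert.h>
#include <time.h>

/* Stand-in workload: a doubly nested loop doing about n*n additions. */
void quadratic_work(long n)
{
    volatile long sink = 0;     /* volatile keeps the loops from being optimized away */
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
            sink += j;
}

/* Return the CPU time, in seconds, used by one call of work(n). */
double time_call(void (*work)(long), long n)
{
    assert(work != NULL);
    clock_t start = clock();
    work(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Ratio T(n)/n^2, used to compare a measured time against n^2. */
double ratio_to_square(double seconds, long n)
{
    return seconds / ((double)n * (double)n);
}
```

Calling `ratio_to_square(time_call(quadratic_work, n), n)` for n = 1000, 2000, …, 10000 and printing the results gives the sequence of ratios described above. For small inputs a single call may take less time than the clock resolution, so repeating the call and averaging is a common workaround.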

Worst Case for Quicksort

• Modify the code to remove the random selection of the pivot
  – This makes it possible to deterministically construct a worst case input (this is why randomization was used)
  – The worst case will occur for sorted or reverse sorted input
  – For sorted input, the number of comparisons Q(n) as a function of input size satisfies
  – Q(n) = Q(n-1) + n-1, Q(1) = 0
  – Q(n) = n(n-1)/2
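
The comparison count for sorted input can be confirmed by instrumenting a quicksort that always uses the first element as the pivot. The code below is a sketch of a simple in-place quicksort written for this illustration, not necessarily the exact code from the text. The counter and the names `quicksort_first_pivot` and `count_comparisons_sorted` are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

static long comparisons;            /* number of element comparisons performed */

static void swap_ints(int v[], int i, int j)
{
    int tmp = v[i];
    v[i] = v[j];
    v[j] = tmp;
}

/* Quicksort with the pivot fixed at v[0] (no random pivot selection). */
static void quicksort_first_pivot(int v[], int n)
{
    int i, last;

    if (n <= 1)
        return;
    last = 0;
    for (i = 1; i < n; i++) {       /* partition around v[0] */
        comparisons++;
        if (v[i] < v[0])
            swap_ints(v, ++last, i);
    }
    swap_ints(v, 0, last);          /* put the pivot in its final place */
    quicksort_first_pivot(v, last);
    quicksort_first_pivot(v + last + 1, n - last - 1);
}

/* Sort the already sorted array 0, 1, ..., n-1 and return the comparison count. */
long count_comparisons_sorted(int n)
{
    assert(n >= 1);
    int *v = malloc((size_t)n * sizeof *v);
    assert(v != NULL);
    for (int i = 0; i < n; i++)
        v[i] = i;
    comparisons = 0;
    quicksort_first_pivot(v, n);
    free(v);
    return comparisons;
}
```

On sorted input every partition leaves the pivot in place, so the recursion handles subarrays of sizes n-1, n-2, …, 1 and the count is n(n-1)/2, for example 4950 for n = 100. The recursion depth is n in this case, so very large sorted inputs can exhaust the stack.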

Obtaining Range of Times

• sortr 1000 10 1000
  – sorts and times sorted arrays of size 1000, 2000, 3000, …

Profiling with gcov

• Uses instrumentation added by the compiler to count the number of times each statement in the source code is executed.
  – $ gcc -fprofile-arcs -ftest-coverage sorti.c quicksorti.c -o sorti
  – $ sorti 10
  – $ gcov sorti.c
  – $ gcov quicksorti.c