MATLAB Profiling and Optimization Techniques - Prof. Xin Li, Study Guides, Projects, Research of Electrical and Electronics Engineering

An overview of matlab profiling and optimization techniques. It covers the use of the matlab profiler to identify time-consuming lines in a script, as well as various optimization strategies such as vectorization, preallocation, and loop acceleration. The document also discusses the importance of understanding the data types and sizes of arrays, and provides examples of common optimization techniques.

Typology: Study Guides, Projects, Research

Pre 2010

Uploaded on 07/30/2009

koofers-user-8u2-1
koofers-user-8u2-1 🇺🇸

10 documents

1 / 33

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Pascal Getreuer, February 2009
Contents
Page
1 Introduction 1
2 The Profiler 3
3 Array Preallocation 5
4 JIT Acceleration 7
5 Vectorization 9
6 Inlining Simple Functions 14
7 Referencing Operations 16
8 Solving Ax =b19
9 Numerical Integration 22
10 Signal Processing 26
11 Miscellaneous 29
12 Further Reading 31
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21

Partial preview of the text

Download MATLAB Profiling and Optimization Techniques - Prof. Xin Li and more Study Guides, Projects, Research Electrical and Electronics Engineering in PDF only on Docsity!

Contents

  • Pascal Getreuer, February
  • 1 Introduction Page
  • 2 The Profiler
  • 3 Array Preallocation
  • 4 JIT Acceleration
  • 5 Vectorization
  • 6 Inlining Simple Functions
  • 7 Referencing Operations
  • 8 Solving Ax = b
  • 9 Numerical Integration
  • 10 Signal Processing
  • 11 Miscellaneous
  • 12 Further Reading

1 Introduction

Matlab is a prototyping environment, meaning it focuses on the ease of development with language

flexibility, interactive debugging, and other conveniences lacking in performance-oriented languages like

C and Fortran. While Matlab may not be as fast as C, there are ways to bring it closer.

Disclaimer This is not a beginner’s tutorial to Matlab, but a tutorial on

performance. The methods discussed here are generally fast, but no claim is

made on what is fastest. Use at your own risk.

Before beginning our main topic, let’s step back for a moment and ask ourselves what we really want. Do

we want the fastest possible code?—if so, we should switch to a performance language like C. Probably

more accurately, we want to spend less time total from developing, debugging, running, and until finally

obtaining results. This section gives some tips on working smart, not hard.

M-Lint Messages

In more recent versions of Matlab, the integrated editor automatically gives feedback on possible code

errors and suggested optimizations. These messages are helpful in improving a program and in learning

more about Matlab.

for k = 1:NumTrials

r = rand;

x(k) = runsim(r);

end

hist(x); % Plot histogram of simulation outputs

p x p might be growing inside a loop. Consider preallocating for speed.

Hold the mouse cursor over ::::::::::::::: underlined code to read a message. Or, see all M-Lint messages at once

with Tools → M-Lint → Show M-Lint Report.

Organization

ˆ Use a separate folder for each project.

A separate folder for each project keeps related things together and simplifies making copies and

backups of project files.

ˆ Write header comments, especially H1.

The first line of the header comment is called the H1 comment. Type help(cd) to get a listing

of all project files and their H1 comments.

2 The Profiler

Matlab 5.0 (R10) and newer versions include a tool called the profiler that helps identify bottlenecks

in a program. Use it with the profile command:

profile on Turn the profiler on

profile off Turn it back off

profile clear Clear profile statistics

profile report View the results from the profiler

For example, consider profiling the following function:

function result = example1(Count)

for k = 1:Count

result(k) = sin(k/50);

if result(k) < −0.

result(k) = gammaln(k);

end

end

To analyze the efficiency this function, first enable and clear the profiler, run the function, and then

view the profile report:

profile on, profile clear

example1(5000);

profile report

There is a slight parsing overhead when running code for the first time; run the test code twice and time

the second run. The profiler report command shows a report. Depending on the system, profiler

results may differ from this example.

MATLAB Profile Report: Summary

Report generated 30-Jul-2004 16:57:

Total recorded time: 3.09 s

Number of M-functions: 4

Clock precision: 0.016 s

Function List

Name Time Time Time/call Self time Location

example1 3.09 100.0% 1 3.094000 2.36 76.3% ~/example1.m

gammaln 0.73 23.7% 3562 0.000206 0.73 23.7% ../toolbox/matlab/specfun/gammaln.m

profile 0.00 0.0% 1 0.000000 0.00 0.0% ../toolbox/matlab/general/profile.m

profreport 0.00 0.0% 1 0.000000 0.00 0.0% ../toolbox/matlab/general/profreport.m

Clicking the “example1” link gives more details:

Lines where the most time was spent

Line Number Code Calls Total Time % Time

4 result(k) = sin(k/50); 5000 2.11 s 68%

7 result(k) = gammaln(k); 721 0.84 s 27%

6 if result(k) < -0.9 5000 0.14 s 5%

Totals 3.09 s 100%

The most time-consuming lines are displayed, along with time, time percentage, and line number. The

most costly lines are the computations on lines 4 and 7.

Another helpful section of the profile report is “M-Lint Results,” which gives feedback from the M-Lint

code analyzer. Possible errors and suggestions are listed here.

M-Lint results

Line number Message

4 ‘result’ might be growing inside a loop. Consider preallocating for speed.

7 ‘result’ might be growing inside a loop. Consider preallocating for speed.

(Preallocation is discussed in the next section.)

The profiler has limited time resolution, so to profile a piece of code that runs too quickly, run the test

code multiple times with a loop. Adjust the number of loop iterations so that the time it takes to run

is noticeable. More iterations yields better time resolution.

The profiler is an essential tool for identifying bottlenecks and per-statement analysis, however, for

more accurate timing of a piece of code, use the tic/toc stopwatch timer.

tic; example1(5000); toc;

Elapsed time is 3.082055 seconds

For serious benchmarking, also close your web browser, anti-virus, and other background processes that

may be taking CPU cycles.

What if the final array size can vary? Here is an example:

a = zeros(1,10000); % Preallocate

count = 0;

for k = 1:

v = exp(rand*rand);

if v > 0.5 % Conditionally add to array

count = count + 1;

a(count) = v;

end

end

a = a(1:count); % Trim extra zeros from the results

The average run time of this program is 0.42 seconds without preallocation and 0.18 seconds with it.

Preallocation is also beneficial for cell arrays, using the cell command to create a cell array of the

desired size.

4 JIT Acceleration

Matlab 6.5 (R13) and later feature the Just-In-Time (JIT) Accelerator for improving the speed of

M-functions, particularly with loops. By knowing a few things about the accelerator, you can improve

its performance.

The JIT Accelerator is enabled by default. To disable it, type “feature accel off” in the console,

and “feature accel on” to enable it again.

As of Matlab R2008b, only a subset of the Matlab language is supported for acceleration. Upon

encountering an unsupported feature, acceleration processing falls back to non-accelerated evaluation.

Acceleration is most effective when significant contiguous portions of code are supported.

ˆ Data types: Code must use supported data types for acceleration: double (both real and

complex), logical, char, int8– 32 , uint8– 32. Some struct, cell, classdef, and function

handle usage is supported. Sparse arrays are not accelerated.

ˆ Array shapes: Array shapes of any size with 3 or fewer dimensions are supported. Changing the

shape or data type of an array interrupts acceleration. A few limited situations with 4D arrays

are accelerated.

ˆ Function calls: Calls to built-in functions and M-functions are accelerated. Calling MEX func-

tions and Java interrupts acceleration. (See also page 14 on inlining simple functions.)

ˆ Conditionals and loops: The conditional statements if, elseif, and simple switch statements

are supported if the conditional expression evaluates to a scalar. Loops of the form for k=a:b,

for k=a:b:c, and while loops are accelerated if all code within the loop is supported.

In-place computation

Introduced in Matlab 7.3 (R2006b), the element-wise operators (+, .*, etc.) and some other functions

can be computed in-place. That is, a computation like

x = 5*sqrt(x.ˆ2 + 1);

is handled internally without needing temporary storage for accumulating the result. An M-function

can also be computed in-place if its output argument matches one of the input arguments.

x = myfun(x);

function x = myfun(x)

x = 5*sqrt(x.ˆ2 + 1);

return;

To enable in-place computation, the in-place operation must be within an M-function (and for an in-

place function, the function itself must be called within an M-function). Currently, there is no support

for in-place computation with MEX-functions.

5 Vectorization

A computation is vectorized by taking advantage of vector operations. A variety of programming

situations can be vectorized, and often improving speed to 10 times faster or even better. Vectorization

is one of the most general and effective techniques for writing fast M-code.

5.1 Vectorized Computations

Many standard Matlab functions are “vectorized”: they can operate on an array as if the function

had been applied individually to every element.

sqrt([1,4;9,16])

ans =

1 2

3 4

abs([0,1,2,−5,−6,−7])

ans =

0 1 2 5 6 7

Consider the following function:

function d = minDistance(x,y,z)

% Find the min distance between a set of points and the origin

nPoints = length(x);

d = zeros(nPoints,1); % Preallocate

for k = 1:nPoints % Compute distance for every point

d(k) = sqrt(x(k)ˆ2 + y(k)ˆ2 + z(k)ˆ2);

end

d = min(d); % Get the minimum distance

For every point, its distance from the origin is computed and stored in d. For speed, array d is

preallocated (see Section 3). The minimum distance is then found with min. To vectorize the distance

computation, replace the for loop with vector operations:

function d = minDistance(x,y,z)

% Find the min distance between a set of points and the origin

d = sqrt(x.ˆ2 + y.ˆ2 + z.ˆ2); % Compute distance for every point

d = min(d); % Get the minimum distance

The modified code performs the distance computation with vector operations. The x, y and z arrays

are first squared using the per-element power operator, .^ (the per-element operators for multiplication

and division are .* and ./). The squared components are added with vector addition. Finally, the

square root of the vector sum is computed per element, yielding an array of distances. (A further

improvement: it is equivalent to compute d = sqrt(min(x.^2 + y.^2 + z.^2))).

The first version of the minDistance program takes 0.73 seconds on 50000 points. The vectorized

version takes less than 0.04 seconds, more than 18 times faster.

Some useful functions for vectorizing computations:

min, max, repmat, meshgrid, sum, cumsum, diff, prod, cumprod, accumarray

5.2 Vectorized Logic

The previous section shows how to vectorize pure computation. But bottleneck code often involves

conditional logic. Like computations, Matlab’s logic operators are vectorized:

[1,5,3] < [2,2,4]

ans =

1 0 1

Two arrays are compared per-element. Logic operations return “logical” arrays with binary values.

How is this useful? Matlab has a few powerful functions for operating on logical arrays:

find([1,5,3] < [2,2,4]) % Find indices of nonzero elements

ans =

1 3

any([1,5,3] < [2,2,4]) % True if any element of a vector is nonzero

% (or per−column for a matrix)

ans =

1

all([1,5,3] < [2,2,4]) % True if all elements of a vector are nonzero

% (or per−column for a matrix)

ans =

0

Vectorized logic also works on arrays.

find(eye(3) == 1) % Find which elements == 1 in the 3x3 identity matrix

ans =

1 % (element locations given as indices)

5

9

By default, find returns element locations as indices (see page 16 for indices vs. subscripts).

Example 4: Drawing images with meshgrid

The meshgrid function takes two input vectors and converts them to matrices by replicating the first

over the rows and the second over the columns.

[x,y] = meshgrid(1:5,1:3)

x =

1 2 3 4 5

1 2 3 4 5

1 2 3 4 5

y =

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

The matrices above work like a map for a width 5, height 3 image. For each pixel, the x-location can

be read from x and the y-location from y. This may seem like a gratuitous use of memory as x and y

simply record the column and row positions, but this is useful. For example, to draw an ellipse,

% Create x and y for a width 150, height 100 image

[x,y] = meshgrid(1:150,1:100);

% Ellipse with origin (60,50) of size 15 x 40

Img = sqrt(((x−60).ˆ2 / 15ˆ2) + ((y−50).ˆ2 / 40ˆ2)) > 1;

% Plot the image

imagesc(Img); colormap(copper);

axis image, axis off

Drawing lines is just a change in the formula.

[x,y] = meshgrid(1:150,1:100);

% The line y = x*0.8 + 20

Img = (abs((x*0.8 + 20) − y) > 1);

imagesc(Img); colormap(copper);

axis image, axis off

Polar functions can be drawn by first converting x and y variables with the cart2pol function.

[x,y] = meshgrid(1:150,1:100);

[th,r] = cart2pol(x − 75,y − 50); % Convert to polar

% Spiral centered at (75,50)

Img = sin(r/3 + th);

imagesc(Img); colormap(hot);

axis image, axis off

Example 5: Polynomial interpolation

Given n points x 1 ,... xn and corresponding function values y 1 ,... yn, the coefficients c 0 ,... , cn− 1 of the

interpolating polynomial

P (x) = cn− 1 x

n− 1

  • · · · + c 1 x + c 0

can be found by solving

x 1 n− 1 x 1 n− 2 · · · x 1 2 x 1 1

x 2

n− 1 x 2

n− 2 · · · x 2

2 x 2 1

. . .

xn n− 1 xn n− 2 · · · xn 2 xn 1

cn− 1

cn− 2

. . .

c 0

y 1

y 2

. . .

yn

function c = polyint(x,y)

% Given a set of points and function values x and y,

% computes the interpolating polynomial.

x = x(:); % Make sure x and y are both column vectors

y = y(:);

n = length(x); % n = Number of points

% Construct the matrix on the left−hand−side

xMatrix = repmat(x, 1, n); % Make an n by n matrix with x on every column

powMatrix = repmat(n−1:−1:0, n, 1); % Make another n by n matrix of exponents

A = xMatrix .ˆ powMatrix; % Compute the powers

c = A\y; % Solve matrix equation for coefficients

The strategy to construct the left-hand side matrix is to first make two n×n matrices of bases and

exponents and then put them together using the per element power operator, .^. The repmat function

(“replicate matrix”) is used to make the base matrix xMatrix and the exponent matrix powMatrix.

xMatrix =

x( 1 ) x( 1 ) · · · x( 1 )

x( 2 ) x( 2 ) · · · x( 2 )

. . .

x(n) x(n) · · · x(n)

powMatrix =

n − 1 n − 2 · · · 0

n − 1 n − 2 · · · 0

. . .

n − 1 n − 2 · · · 0

The xMatrix is made by repeating the column vector x over the columns n times. The powMatrix is a

row vector with elements n − 1 , n − 2 , n − 3 ,... , 0 repeated down the rows n times. The two matrices

could also have been created with [powMatrix, xMatrix] = meshgrid(n-1:-1:0, x).

The code above is only an example—use the standard polyfit function for serious use.

For example, in Matlab R11 and earlier, ifft was implemented by simply calling fft and conj. If x

is a one-dimensional array, y = ifft(x) can be inlined with y = conj(fft(conj(x)))/length(x).

Another example: b = unique(a) can be inlined with

b = sort(a(:)); b( b((1:end−1)') == b((2:end)') ) = [];

While repmat has the generality to operate on matrices, it is often only necessary to tile a vector or

just a scalar. To repeat a column vector y over the columns n times,

A = y(:,ones(1,n)); % Equivalent to A = repmat(y,1,n);

To repeat a row vector x over the rows m times,

A = x(ones(1,m),:); % Equivalent to A = repmat(x,m,1);

To repeat a scalar s into an m×n matrix,

A = s(ones(m,n)); % Equivalent to A = repmat(s,m,n);

This method avoids the overhead of calling an M-file function. It is never slower than repmat (critics

should note that repmat.m itself uses this method to construct mind and nind). For constructing

matrices with constant value, there are other efficient methods, for example, s+zeros(m,n).

Don’t go overboard. Inlining functions is only beneficial when the function is simple and when it is

called frequently. Doing it unnecessarily obfuscates the code.

7 Referencing Operations

Referencing in Matlab is varied and powerful enough to deserve a section of discussion. Good under-

standing of referencing enables vectorizing a broader range of programming situations.

7.1 Subscripts vs. Indices

Subscripts are the most common method used to refer to matrix elements, for example, A(3,9) refers

to row 3, column 9. Indices are an alternative referencing method. Consider a 10 × 10 matrix A.

Internally, Matlab stores the matrix data linearly as a one-dimensional, 100-element array of data in

column-major order.

An index refers to an element’s position in this one-dimensional array. Using an index, A(83) also refers

to the element on row 3, column 9.

It is faster to scan down columns than over rows. Column-major order means that elements

along a column are sequential in memory while elements along a row are further apart. Scanning down

columns promotes cache efficiency.

Conversion between subscripts and indices can be done with the sub2ind and ind2sub functions.

However, because these are M-file functions rather than fast built-in operations, it is more efficient to

compute conversions directly (also see Section 6). For a two-dimensional matrix A of size M×N, the

conversions between subscript (i,j) and index (index) are

A(i,j) ↔ A(i + (j-1)*M)

A(index) ↔ A(rem(index-1,M)+1, floor((index-1)/M)+1)

Indexing extends to N-D matrices as well, with indices increasing first through the columns, then

through the rows, through the third dimension, and so on. Subscript notation extends intuitively,

A(... , dim4, dim3, row, col).

7.2 Vectorized Subscripts

It is useful to work with submatrices rather than individual elements. This is done with a vector of

indices or subscripts. If A is a two-dimensional matrix, a vector subscript reference has the syntax

A(rowv, colv)

7.4 Reference Wildcards

Using the wildcard, :, in a subscript refers to an entire row or column. For example, A(:,1) refers

to every row in column one–the entire first column. This can be combined with vector subscripts,

A([2,4],:) refers to the second and fourth rows.

When the wildcard is used in a vector index, the entire matrix is referenced. On the right-hand side,

this always returns a column vector.

A(:) = column vector

This is frequently useful: for example, if a function input must be a row vector, the user’s input can be

quickly reshaped into row vector with A(:).’ (make a column vector and transpose to a row vector).

A(:) on the left-hand side assigns all the elements of A, but does not change its size. For example,

A(:) = 8 changes all elements of matrix A to 8.

7.5 Logical Indexing

Given a logical array mask with the same size as A,

A(mask)

refers to all elements of A where the corresponding element of mask is 1 (true). It is equivalent to

A(find(mask)). A common usage of logical indexing is to refer to elements based on a per-element

condition, for example,

A(abs(A) < 1e−3) = 0

sets all elements with magnitude less than 10

− 3 to zero. Logical indexing is also useful to select non-

rectangular sets of elements, for example,

A(logical(eye(size(A))))

references the diagonal elements of A. Note that for a right-hand side reference, it is faster to use

diag(A) to get the diagonal of A.

7.6 Deleting Submatrices with [ ]

Elements in a matrix can be deleted by assigning the empty matrix. For example, A([3,5]) = [ ]

deletes the third and fifth element from A. If this is done with index references, the matrix is reshaped

into a row vector.

It is also possible to delete with subscripts if all but one subscript are the wildcard. A(2,:) = [ ]

deletes the second row. Deletions like A(2,1) = [ ] or even A(2,1:end) = [ ] are illegal.

8 Solving^ Ax^ =^ b

Many problems involve solving a linear system, that is, solving the Ax = b problem

a 1 , 1 a 1 , 2 · · · a 1 ,n

a 2 , 1 a 2 , 2 · · · a 2 ,n

. . .

an, 1 an, 2 · · · an,n

x 1

x 2

. . .

xn

b 1

b 2

. . .

bn

The condition number cond(A) describes how sensitive the problem is to small perturbations. Matlab

can efficiently approximate cond(A) with condest. Smaller condition number is better, and in this case

A is said to be well-conditioned. An infinite condition number implies that A is singular—the problem

cannot be solved, though it may be solved approximately with the pseudoinverse (x = pinv(A)*b).

Small- to moderately-sized problems are solved efficiently by the backslash operator:

x = A\b; % Solves A*x = b

We shall focus on large problems.

8.1 Iterative Methods

For large problems, Matlab is well-equipped with iterative solvers: bicg, bicgstab, cgs, gmres, lsqr,

minres, pcg, symmlq, qmr. The general calling syntax (except for gmres) is

[x,flag,relres,iter,resvec] = method (A,b,tol,maxit,M1,M2,x0)

where the required arguments are the matrix A, the right-hand side b, and the solution x. The minres,

pcg, and symmlq methods require A to be symmetric (A = A’) and pcg additionally requires positive

definiteness (all eigenvalues are positive). The optional input variables are

tol accuracy tolerance (default 10 − 6 )

maxit maximum number of iterations (default 20)

M1, M2 the method uses preconditioner M = M1*M2; it solves M

− 1 Ax = M

− 1 b

x0 initial guess

The optional output variables are

flag convergence flag (0 = success, nonzero values are error codes)

relres relative residual ‖b − Ax‖/‖b‖

iter number of iterations

resvec a vector listing the residual norm ‖b − Ax‖ at each iteration

The gmres method includes one additional input,

[x,flag,relres,iter,resvec] = gmres(A,b,restart,tol,maxit,...)

The method restarts every restart iterations, or if restart=[], the method does not restart.