

























Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
An overview of matlab profiling and optimization techniques. It covers the use of the matlab profiler to identify time-consuming lines in a script, as well as various optimization strategies such as vectorization, preallocation, and loop acceleration. The document also discusses the importance of understanding the data types and sizes of arrays, and provides examples of common optimization techniques.
Typology: Study Guides, Projects, Research
1 / 33
This page cannot be seen from the preview
Don't miss anything!


























Matlab is a prototyping environment, meaning it focuses on the ease of development with language
flexibility, interactive debugging, and other conveniences lacking in performance-oriented languages like
C and Fortran. While Matlab may not be as fast as C, there are ways to bring it closer.
performance. The methods discussed here are generally fast, but no claim is
made on what is fastest. Use at your own risk.
Before beginning our main topic, let’s step back for a moment and ask ourselves what we really want. Do
we want the fastest possible code?—if so, we should switch to a performance language like C. Probably
more accurately, we want to spend less time total from developing, debugging, running, and until finally
obtaining results. This section gives some tips on working smart, not hard.
In more recent versions of Matlab, the integrated editor automatically gives feedback on possible code
errors and suggested optimizations. These messages are helpful in improving a program and in learning
more about Matlab.
for k = 1:NumTrials
r = rand;
x(k) = runsim(r);
end
hist(x); % Plot histogram of simulation outputs
p x p might be growing inside a loop. Consider preallocating for speed.
Hold the mouse cursor over ::::::::::::::: underlined code to read a message. Or, see all M-Lint messages at once
with Tools → M-Lint → Show M-Lint Report.
Use a separate folder for each project.
A separate folder for each project keeps related things together and simplifies making copies and
backups of project files.
Write header comments, especially H1.
The first line of the header comment is called the H1 comment. Type help(cd) to get a listing
of all project files and their H1 comments.
Matlab 5.0 (R10) and newer versions include a tool called the profiler that helps identify bottlenecks
in a program. Use it with the profile command:
profile on Turn the profiler on
profile off Turn it back off
profile clear Clear profile statistics
profile report View the results from the profiler
For example, consider profiling the following function:
function result = example1(Count)
for k = 1:Count
result(k) = sin(k/50);
if result(k) < −0.
result(k) = gammaln(k);
end
end
To analyze the efficiency this function, first enable and clear the profiler, run the function, and then
view the profile report:
profile on, profile clear
example1(5000);
profile report
There is a slight parsing overhead when running code for the first time; run the test code twice and time
the second run. The profiler report command shows a report. Depending on the system, profiler
results may differ from this example.
Report generated 30-Jul-2004 16:57:
Total recorded time: 3.09 s
Number of M-functions: 4
Clock precision: 0.016 s
Name Time Time Time/call Self time Location
example1 3.09 100.0% 1 3.094000 2.36 76.3% ~/example1.m
gammaln 0.73 23.7% 3562 0.000206 0.73 23.7% ../toolbox/matlab/specfun/gammaln.m
profile 0.00 0.0% 1 0.000000 0.00 0.0% ../toolbox/matlab/general/profile.m
profreport 0.00 0.0% 1 0.000000 0.00 0.0% ../toolbox/matlab/general/profreport.m
Clicking the “example1” link gives more details:
Lines where the most time was spent
Line Number Code Calls Total Time % Time
4 result(k) = sin(k/50); 5000 2.11 s 68%
7 result(k) = gammaln(k); 721 0.84 s 27%
6 if result(k) < -0.9 5000 0.14 s 5%
Totals 3.09 s 100%
The most time-consuming lines are displayed, along with time, time percentage, and line number. The
most costly lines are the computations on lines 4 and 7.
Another helpful section of the profile report is “M-Lint Results,” which gives feedback from the M-Lint
code analyzer. Possible errors and suggestions are listed here.
M-Lint results
Line number Message
4 ‘result’ might be growing inside a loop. Consider preallocating for speed.
7 ‘result’ might be growing inside a loop. Consider preallocating for speed.
(Preallocation is discussed in the next section.)
The profiler has limited time resolution, so to profile a piece of code that runs too quickly, run the test
code multiple times with a loop. Adjust the number of loop iterations so that the time it takes to run
is noticeable. More iterations yields better time resolution.
The profiler is an essential tool for identifying bottlenecks and per-statement analysis, however, for
more accurate timing of a piece of code, use the tic/toc stopwatch timer.
tic; example1(5000); toc;
Elapsed time is 3.082055 seconds
For serious benchmarking, also close your web browser, anti-virus, and other background processes that
may be taking CPU cycles.
What if the final array size can vary? Here is an example:
a = zeros(1,10000); % Preallocate
count = 0;
for k = 1:
v = exp(rand*rand);
if v > 0.5 % Conditionally add to array
count = count + 1;
a(count) = v;
end
end
a = a(1:count); % Trim extra zeros from the results
The average run time of this program is 0.42 seconds without preallocation and 0.18 seconds with it.
Preallocation is also beneficial for cell arrays, using the cell command to create a cell array of the
desired size.
Matlab 6.5 (R13) and later feature the Just-In-Time (JIT) Accelerator for improving the speed of
M-functions, particularly with loops. By knowing a few things about the accelerator, you can improve
its performance.
The JIT Accelerator is enabled by default. To disable it, type “feature accel off” in the console,
and “feature accel on” to enable it again.
As of Matlab R2008b, only a subset of the Matlab language is supported for acceleration. Upon
encountering an unsupported feature, acceleration processing falls back to non-accelerated evaluation.
Acceleration is most effective when significant contiguous portions of code are supported.
Data types: Code must use supported data types for acceleration: double (both real and
complex), logical, char, int8– 32 , uint8– 32. Some struct, cell, classdef, and function
handle usage is supported. Sparse arrays are not accelerated.
Array shapes: Array shapes of any size with 3 or fewer dimensions are supported. Changing the
shape or data type of an array interrupts acceleration. A few limited situations with 4D arrays
are accelerated.
Function calls: Calls to built-in functions and M-functions are accelerated. Calling MEX func-
tions and Java interrupts acceleration. (See also page 14 on inlining simple functions.)
Conditionals and loops: The conditional statements if, elseif, and simple switch statements
are supported if the conditional expression evaluates to a scalar. Loops of the form for k=a:b,
for k=a:b:c, and while loops are accelerated if all code within the loop is supported.
Introduced in Matlab 7.3 (R2006b), the element-wise operators (+, .*, etc.) and some other functions
can be computed in-place. That is, a computation like
x = 5*sqrt(x.ˆ2 + 1);
is handled internally without needing temporary storage for accumulating the result. An M-function
can also be computed in-place if its output argument matches one of the input arguments.
x = myfun(x);
function x = myfun(x)
x = 5*sqrt(x.ˆ2 + 1);
return;
To enable in-place computation, the in-place operation must be within an M-function (and for an in-
place function, the function itself must be called within an M-function). Currently, there is no support
for in-place computation with MEX-functions.
A computation is vectorized by taking advantage of vector operations. A variety of programming
situations can be vectorized, and often improving speed to 10 times faster or even better. Vectorization
is one of the most general and effective techniques for writing fast M-code.
Many standard Matlab functions are “vectorized”: they can operate on an array as if the function
had been applied individually to every element.
sqrt([1,4;9,16])
ans =
1 2
3 4
abs([0,1,2,−5,−6,−7])
ans =
0 1 2 5 6 7
Consider the following function:
function d = minDistance(x,y,z)
% Find the min distance between a set of points and the origin
nPoints = length(x);
d = zeros(nPoints,1); % Preallocate
for k = 1:nPoints % Compute distance for every point
d(k) = sqrt(x(k)ˆ2 + y(k)ˆ2 + z(k)ˆ2);
end
d = min(d); % Get the minimum distance
For every point, its distance from the origin is computed and stored in d. For speed, array d is
preallocated (see Section 3). The minimum distance is then found with min. To vectorize the distance
computation, replace the for loop with vector operations:
function d = minDistance(x,y,z)
% Find the min distance between a set of points and the origin
d = sqrt(x.ˆ2 + y.ˆ2 + z.ˆ2); % Compute distance for every point
d = min(d); % Get the minimum distance
The modified code performs the distance computation with vector operations. The x, y and z arrays
are first squared using the per-element power operator, .^ (the per-element operators for multiplication
and division are .* and ./). The squared components are added with vector addition. Finally, the
square root of the vector sum is computed per element, yielding an array of distances. (A further
improvement: it is equivalent to compute d = sqrt(min(x.^2 + y.^2 + z.^2))).
The first version of the minDistance program takes 0.73 seconds on 50000 points. The vectorized
version takes less than 0.04 seconds, more than 18 times faster.
Some useful functions for vectorizing computations:
min, max, repmat, meshgrid, sum, cumsum, diff, prod, cumprod, accumarray
The previous section shows how to vectorize pure computation. But bottleneck code often involves
conditional logic. Like computations, Matlab’s logic operators are vectorized:
[1,5,3] < [2,2,4]
ans =
1 0 1
Two arrays are compared per-element. Logic operations return “logical” arrays with binary values.
How is this useful? Matlab has a few powerful functions for operating on logical arrays:
find([1,5,3] < [2,2,4]) % Find indices of nonzero elements
ans =
1 3
any([1,5,3] < [2,2,4]) % True if any element of a vector is nonzero
% (or per−column for a matrix)
ans =
1
all([1,5,3] < [2,2,4]) % True if all elements of a vector are nonzero
% (or per−column for a matrix)
ans =
0
Vectorized logic also works on arrays.
find(eye(3) == 1) % Find which elements == 1 in the 3x3 identity matrix
ans =
1 % (element locations given as indices)
5
9
By default, find returns element locations as indices (see page 16 for indices vs. subscripts).
The meshgrid function takes two input vectors and converts them to matrices by replicating the first
over the rows and the second over the columns.
[x,y] = meshgrid(1:5,1:3)
x =
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
y =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
The matrices above work like a map for a width 5, height 3 image. For each pixel, the x-location can
be read from x and the y-location from y. This may seem like a gratuitous use of memory as x and y
simply record the column and row positions, but this is useful. For example, to draw an ellipse,
% Create x and y for a width 150, height 100 image
[x,y] = meshgrid(1:150,1:100);
% Ellipse with origin (60,50) of size 15 x 40
Img = sqrt(((x−60).ˆ2 / 15ˆ2) + ((y−50).ˆ2 / 40ˆ2)) > 1;
% Plot the image
imagesc(Img); colormap(copper);
axis image, axis off
Drawing lines is just a change in the formula.
[x,y] = meshgrid(1:150,1:100);
% The line y = x*0.8 + 20
Img = (abs((x*0.8 + 20) − y) > 1);
imagesc(Img); colormap(copper);
axis image, axis off
Polar functions can be drawn by first converting x and y variables with the cart2pol function.
[x,y] = meshgrid(1:150,1:100);
[th,r] = cart2pol(x − 75,y − 50); % Convert to polar
% Spiral centered at (75,50)
Img = sin(r/3 + th);
imagesc(Img); colormap(hot);
axis image, axis off
Given n points x 1 ,... xn and corresponding function values y 1 ,... yn, the coefficients c 0 ,... , cn− 1 of the
interpolating polynomial
P (x) = cn− 1 x
n− 1
can be found by solving
x 1 n− 1 x 1 n− 2 · · · x 1 2 x 1 1
x 2
n− 1 x 2
n− 2 · · · x 2
2 x 2 1
. . .
xn n− 1 xn n− 2 · · · xn 2 xn 1
cn− 1
cn− 2
. . .
c 0
y 1
y 2
. . .
yn
function c = polyint(x,y)
% Given a set of points and function values x and y,
% computes the interpolating polynomial.
x = x(:); % Make sure x and y are both column vectors
y = y(:);
n = length(x); % n = Number of points
% Construct the matrix on the left−hand−side
xMatrix = repmat(x, 1, n); % Make an n by n matrix with x on every column
powMatrix = repmat(n−1:−1:0, n, 1); % Make another n by n matrix of exponents
A = xMatrix .ˆ powMatrix; % Compute the powers
c = A\y; % Solve matrix equation for coefficients
The strategy to construct the left-hand side matrix is to first make two n×n matrices of bases and
exponents and then put them together using the per element power operator, .^. The repmat function
(“replicate matrix”) is used to make the base matrix xMatrix and the exponent matrix powMatrix.
xMatrix =
x( 1 ) x( 1 ) · · · x( 1 )
x( 2 ) x( 2 ) · · · x( 2 )
. . .
x(n) x(n) · · · x(n)
powMatrix =
n − 1 n − 2 · · · 0
n − 1 n − 2 · · · 0
. . .
n − 1 n − 2 · · · 0
The xMatrix is made by repeating the column vector x over the columns n times. The powMatrix is a
row vector with elements n − 1 , n − 2 , n − 3 ,... , 0 repeated down the rows n times. The two matrices
could also have been created with [powMatrix, xMatrix] = meshgrid(n-1:-1:0, x).
The code above is only an example—use the standard polyfit function for serious use.
For example, in Matlab R11 and earlier, ifft was implemented by simply calling fft and conj. If x
is a one-dimensional array, y = ifft(x) can be inlined with y = conj(fft(conj(x)))/length(x).
Another example: b = unique(a) can be inlined with
b = sort(a(:)); b( b((1:end−1)') == b((2:end)') ) = [];
While repmat has the generality to operate on matrices, it is often only necessary to tile a vector or
just a scalar. To repeat a column vector y over the columns n times,
A = y(:,ones(1,n)); % Equivalent to A = repmat(y,1,n);
To repeat a row vector x over the rows m times,
A = x(ones(1,m),:); % Equivalent to A = repmat(x,m,1);
To repeat a scalar s into an m×n matrix,
A = s(ones(m,n)); % Equivalent to A = repmat(s,m,n);
This method avoids the overhead of calling an M-file function. It is never slower than repmat (critics
should note that repmat.m itself uses this method to construct mind and nind). For constructing
matrices with constant value, there are other efficient methods, for example, s+zeros(m,n).
Don’t go overboard. Inlining functions is only beneficial when the function is simple and when it is
called frequently. Doing it unnecessarily obfuscates the code.
Referencing in Matlab is varied and powerful enough to deserve a section of discussion. Good under-
standing of referencing enables vectorizing a broader range of programming situations.
Subscripts are the most common method used to refer to matrix elements, for example, A(3,9) refers
to row 3, column 9. Indices are an alternative referencing method. Consider a 10 × 10 matrix A.
Internally, Matlab stores the matrix data linearly as a one-dimensional, 100-element array of data in
column-major order.
An index refers to an element’s position in this one-dimensional array. Using an index, A(83) also refers
to the element on row 3, column 9.
It is faster to scan down columns than over rows. Column-major order means that elements
along a column are sequential in memory while elements along a row are further apart. Scanning down
columns promotes cache efficiency.
Conversion between subscripts and indices can be done with the sub2ind and ind2sub functions.
However, because these are M-file functions rather than fast built-in operations, it is more efficient to
compute conversions directly (also see Section 6). For a two-dimensional matrix A of size M×N, the
conversions between subscript (i,j) and index (index) are
A(i,j) ↔ A(i + (j-1)*M)
A(index) ↔ A(rem(index-1,M)+1, floor((index-1)/M)+1)
Indexing extends to N-D matrices as well, with indices increasing first through the columns, then
through the rows, through the third dimension, and so on. Subscript notation extends intuitively,
A(... , dim4, dim3, row, col).
It is useful to work with submatrices rather than individual elements. This is done with a vector of
indices or subscripts. If A is a two-dimensional matrix, a vector subscript reference has the syntax
A(rowv, colv)
Using the wildcard, :, in a subscript refers to an entire row or column. For example, A(:,1) refers
to every row in column one–the entire first column. This can be combined with vector subscripts,
A([2,4],:) refers to the second and fourth rows.
When the wildcard is used in a vector index, the entire matrix is referenced. On the right-hand side,
this always returns a column vector.
A(:) = column vector
This is frequently useful: for example, if a function input must be a row vector, the user’s input can be
quickly reshaped into row vector with A(:).’ (make a column vector and transpose to a row vector).
A(:) on the left-hand side assigns all the elements of A, but does not change its size. For example,
A(:) = 8 changes all elements of matrix A to 8.
Given a logical array mask with the same size as A,
A(mask)
refers to all elements of A where the corresponding element of mask is 1 (true). It is equivalent to
A(find(mask)). A common usage of logical indexing is to refer to elements based on a per-element
condition, for example,
A(abs(A) < 1e−3) = 0
sets all elements with magnitude less than 10
− 3 to zero. Logical indexing is also useful to select non-
rectangular sets of elements, for example,
A(logical(eye(size(A))))
references the diagonal elements of A. Note that for a right-hand side reference, it is faster to use
diag(A) to get the diagonal of A.
Elements in a matrix can be deleted by assigning the empty matrix. For example, A([3,5]) = [ ]
deletes the third and fifth element from A. If this is done with index references, the matrix is reshaped
into a row vector.
It is also possible to delete with subscripts if all but one subscript are the wildcard. A(2,:) = [ ]
deletes the second row. Deletions like A(2,1) = [ ] or even A(2,1:end) = [ ] are illegal.
Many problems involve solving a linear system, that is, solving the Ax = b problem
a 1 , 1 a 1 , 2 · · · a 1 ,n
a 2 , 1 a 2 , 2 · · · a 2 ,n
. . .
an, 1 an, 2 · · · an,n
x 1
x 2
. . .
xn
b 1
b 2
. . .
bn
The condition number cond(A) describes how sensitive the problem is to small perturbations. Matlab
can efficiently approximate cond(A) with condest. Smaller condition number is better, and in this case
A is said to be well-conditioned. An infinite condition number implies that A is singular—the problem
cannot be solved, though it may be solved approximately with the pseudoinverse (x = pinv(A)*b).
Small- to moderately-sized problems are solved efficiently by the backslash operator:
x = A\b; % Solves A*x = b
We shall focus on large problems.
For large problems, Matlab is well-equipped with iterative solvers: bicg, bicgstab, cgs, gmres, lsqr,
minres, pcg, symmlq, qmr. The general calling syntax (except for gmres) is
[x,flag,relres,iter,resvec] = method (A,b,tol,maxit,M1,M2,x0)
where the required arguments are the matrix A, the right-hand side b, and the solution x. The minres,
pcg, and symmlq methods require A to be symmetric (A = A’) and pcg additionally requires positive
definiteness (all eigenvalues are positive). The optional input variables are
tol accuracy tolerance (default 10 − 6 )
maxit maximum number of iterations (default 20)
M1, M2 the method uses preconditioner M = M1*M2; it solves M
− 1 Ax = M
− 1 b
x0 initial guess
The optional output variables are
flag convergence flag (0 = success, nonzero values are error codes)
relres relative residual ‖b − Ax‖/‖b‖
iter number of iterations
resvec a vector listing the residual norm ‖b − Ax‖ at each iteration
The gmres method includes one additional input,
[x,flag,relres,iter,resvec] = gmres(A,b,restart,tol,maxit,...)
The method restarts every restart iterations, or if restart=[], the method does not restart.