



These notes introduce round-off errors in floating point computations and their impact on the final result. They work through example calculations, giving the absolute and relative round-off errors, and discuss methods to estimate and reduce round-off errors.
Typology: Lecture notes




1.6.1 Round-off errors.
When people or computers do computations with floating point numbers, they usually round the result of each arithmetic operation to a certain fixed number of digits of precision. This introduces additional errors into the final result called round-off errors. Usually round-off errors are insignificant compared to errors in measurement or truncation errors, but sometimes they will actually be larger. This is the case when the result of an addition or subtraction is significantly smaller in magnitude than the numbers which one is adding or subtracting. In some cases the round-off errors can be serious enough to cause the final result to be meaningless.
Example 1. An object moves along a straight line so that its position $x$ at time $t$ is given by $x = t^3$. Let $t_0 = 10$ and $t_1 = 10 + h$ be two times and $x_0 = t_0^3 = 10^3 = 1000$ and $x_1 = t_1^3 = (10 + h)^3$ be the corresponding positions. The displacement $\Delta x$ is the change in position, i.e. $\Delta x = x_1 - x_0 = (10 + h)^3 - 1000$. Suppose $h = 0.014$.
a. Compute $\Delta x$ exactly.
b. Compute $\Delta x$ doing the calculations using four digit decimal floating point arithmetic. What is the error in the result?
c. An alternative formula for $\Delta x$ is $\Delta x = 3 t_0^2 h + 3 t_0 h^2 + h^3 = 300h + 30h^2 + h^3$. Compute $\Delta x$ using this alternative formula, again doing the calculations using four digit decimal floating point arithmetic. What is the error in the result? How does this compare with part b?
Solution. a. Compute $\Delta x$ exactly.
$10 + h = 10 + 0.014 = 10.014$
$(10 + h)^2 = (10.014)^2 = 100.280196$
$(10 + h)^3 = (100.280196)(10.014) = 1004.205882744$
$\Delta x = 1004.205882744 - 1000 = 4.205882744$
b. Compute $\Delta x$, rounding results to four digits after each operation. In the following, $\to$ indicates rounding and a subscript $a$ indicates an approximate value.
$10 + h = 10.014 \to (10 + h)_a = 10.01$
$[(10 + h)_a]^2 = (10.01)^2 = 100.2001 \to [(10 + h)^2]_a = 100.2$
$[(10 + h)^2]_a (10 + h)_a = (100.2)(10.01) = 1003.002 \to [(10 + h)^3]_a = 1003$
$[(10 + h)^3]_a - 1000 = 1003 - 1000 = 3 \to [\Delta x]_a = 3$
Absolute error $= 4.205882744 - 3 = 1.205882744$
Relative error $= 1.205882744 / 3 \approx 0.4 = 40\%$
c. Compute $\Delta x$ using the alternative formula.
$10 + h = 10.014 \to (10 + h)_a = 10.01$
$300h = (300)(0.014) = 4.2$
$h^2 = (0.014)^2 = 0.000196$
$30h^2 = (30)(0.000196) = 0.00588$
$h^3 = 0.000002744$
$300h + 30h^2 + h^3 = 4.205882744 \to [\Delta x]_a = 4.206$
Absolute error $= |4.205882744 - 4.206| = 0.000117256$
Relative error $= 0.000117256 / 4.206 \approx 0.00003 = 0.003\%$
This is much better than the result in part b.
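The two calculations in parts b and c can be checked mechanically. The following is a minimal sketch that uses Python's `decimal` module with a four-digit context to imitate rounding after every operation. The function names and the choice of `ROUND_HALF_UP` are assumptions made for illustration; the notes only say that results are rounded to four digits.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def delta_x_direct(h, ctx):
    """Compute (10 + h)^3 - 1000, rounding after every operation (part b)."""
    s = ctx.add(Decimal(10), h)        # 10 + h
    sq = ctx.multiply(s, s)            # (10 + h)^2
    cube = ctx.multiply(sq, s)         # (10 + h)^3
    return ctx.subtract(cube, Decimal(1000))

def delta_x_expanded(h, ctx):
    """Compute 300h + 30h^2 + h^3, rounding after every operation (part c)."""
    h2 = ctx.multiply(h, h)            # h^2
    h3 = ctx.multiply(h2, h)           # h^3
    total = ctx.add(ctx.multiply(Decimal(300), h),
                    ctx.multiply(Decimal(30), h2))
    return ctx.add(total, h3)

# Four significant decimal digits, as in Example 1.
four = Context(prec=4, rounding=ROUND_HALF_UP)
h = Decimal("0.014")

print(delta_x_direct(h, four))    # 3
print(delta_x_expanded(h, four))  # 4.206
```

Both functions evaluate the same mathematical quantity; only the order of operations differs, which is what changes the round-off behavior.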
This example illustrates that sometimes one formula for computing a quantity is better than another equivalent formula from the standpoint of round-off error. It also raises several questions. Why did the result in part b have such a large relative error while the result in part c did not? Would the same be true if we did the calculations with more digits of precision? Is there a way to predict how large the round-off error might be before we do the computation?
Estimating the round-off error in a certain computation is often difficult. One way is to repeat the same computation doing the second calculation with more digits of precision. By comparing the two values one can estimate the round-off error in the value obtained with fewer digits of precision. Another way is to try to estimate the round-off error at each step of the computation.
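The first approach, repeating the computation with more digits, might be sketched as follows for the direct formula of Example 1. The helper `estimate_roundoff` and its parameters are hypothetical names introduced here, and the choice of 12 digits for the second run is an assumption; any precision comfortably larger than the first would do.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def cube_minus_1000(h, ctx):
    """(10 + h)^3 - 1000 with rounding after every operation."""
    s = ctx.add(Decimal(10), h)
    return ctx.subtract(ctx.multiply(ctx.multiply(s, s), s), Decimal(1000))

def estimate_roundoff(compute, arg, low_digits, high_digits):
    """Run `compute` at two precisions; the difference estimates the
    round-off error of the lower-precision result."""
    low = compute(arg, Context(prec=low_digits, rounding=ROUND_HALF_UP))
    high = compute(arg, Context(prec=high_digits, rounding=ROUND_HALF_UP))
    return low, high, abs(high - low)

low, high, estimate = estimate_roundoff(cube_minus_1000, Decimal("0.014"), 4, 12)
print(low, high, estimate)
```

The estimate is only as trustworthy as the higher-precision run, so this check works best when the two precisions differ by several digits.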
If we look carefully we can see that in the computation in part b we lost about three digits of precision when we did the final subtraction 1003 – 1000 = 3. Until then the intermediate results had almost four digits of precision. The two numbers we subtracted, 1003 and 1000, were close in the relative sense so the result was a number, 3, that was much smaller than either. The computation in part c did not involve the subtraction of two nearly equal numbers, so the only round-off errors were small in the relative sense.
Example 2. Redo part b of Example 1 with a general $h$ that is small with respect to 10, assuming the computations are done on a computer with machine epsilon equal to $\varepsilon$. (In parts b and c of Example 1 one has $\varepsilon = 5 \times 10^{-4} = 0.0005$.) For simplicity you may make approximations in the calculation of the error.
Solution. Recall from section 1.5 that the relative error between a number $x$ and its rounded value is no more than $\varepsilon$. The first thing we do in the calculation of $(10 + h)^3 - 1000$ is to round $h$. The rounded value of $h$ may have a relative error as large as $\varepsilon$ and an absolute error as large as $h\varepsilon$. The next thing to do is to compute $10 + h$. Before rounding, this sum may have an absolute error as much as $h\varepsilon$ and hence a relative error as much as $h\varepsilon/(10 + h)$, which is about $h\varepsilon/10$. Rounding introduces another relative error of approximately $\varepsilon$, which is added to $h\varepsilon/10$, giving $\varepsilon + h\varepsilon/10 \approx \varepsilon$. Next we multiply $10 + h$ by itself to get $(10 + h)^2$. Before rounding, the relative error of the product is about the sum of the relative errors of the two factors, $2\varepsilon$, and rounding adds another $\varepsilon$, giving about $3\varepsilon$. Multiplying by $10 + h$ once more gives a relative error of about $3\varepsilon + \varepsilon = 4\varepsilon$ before rounding, and rounding adds an additional relative error of $\varepsilon$. It follows that the computed value of $(10 + h)^3$ may have a relative error of about $5\varepsilon$ and an absolute error of about $5\varepsilon (10 + h)^3$, which is about $5000\varepsilon$. Finally, we compute $(10 + h)^3 - 1000$. The result has an absolute error of about $5000\varepsilon$ and a relative error of about $5000\varepsilon / [(10 + h)^3 - 1000]$. Since $(10 + h)^3 - 1000 \approx 300h$, the relative error of $(10 + h)^3 - 1000$ is about $5000\varepsilon/(300h) \approx 17\varepsilon/h$. In part b one had $h = 0.014$, so the worst case relative error is about $1200\varepsilon$. If $\varepsilon = 5 \times 10^{-4}$, then the relative error might be as large as 0.6. In fact it was only about 0.4, which is about 2/3 of the worst case.
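The worst-case estimate $5000\varepsilon/(300h)$ can be compared with the error that actually occurs in four-digit arithmetic. This is a rough sketch with hypothetical helper names. One assumption to note: here the relative error is taken with respect to the exact value, whereas Example 1 divided by the computed value 3, so the figure for $h = 0.014$ comes out near 0.29 rather than 0.4.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def worst_case_relative_error(h, eps):
    """Approximate bound 5000*eps / (300*h), roughly 17*eps/h."""
    return 5000 * eps / (300 * h)

def actual_relative_error(h_text, digits=4):
    """Relative error (w.r.t. the exact value) of (10 + h)^3 - 1000
    when every operation is rounded to `digits` significant digits."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    h = ctx.create_decimal(h_text)               # rounding h is the first step
    s = ctx.add(Decimal(10), h)
    approx = ctx.subtract(ctx.multiply(ctx.multiply(s, s), s), Decimal(1000))
    exact = (10 + Decimal(h_text)) ** 3 - 1000   # default 28-digit context
    return float(abs(exact - approx) / exact)

eps = 5e-4  # four decimal digits
for h_text in ["0.014", "0.05", "0.2"]:
    bound = worst_case_relative_error(float(h_text), eps)
    print(h_text, actual_relative_error(h_text), bound)
```

As $h$ shrinks, the bound grows like $1/h$, which reflects the increasing cancellation in the final subtraction.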
Example 3. The equation $y = \sqrt{4 - x^2}$ describes the top half of the circle of radius 2 centered at the origin. If one starts at $x$ on the $x$ axis and goes left to $x = 0$, then the change in the $y$ values is $\Delta y = 2 - \sqrt{4 - x^2}$. Suppose $x^2 = 0.0016$, i.e. $x = 0.04$.
a. Compute $\Delta y$ exactly.
b. Compute $\Delta y$ doing the calculations using four digit decimal floating point arithmetic. What is the error in the result?
c. An alternative formula for $\Delta y$ is $\Delta y = \dfrac{x^2}{2 + \sqrt{4 - x^2}}$. Compute $\Delta y$ using this alternative formula, again doing the calculations using four digit decimal floating point arithmetic. What is the error in the result? How does this compare with part b?
Solution. a. Compute $\Delta y$ exactly.
$4 - x^2 = 4 - 0.0016 = 3.9984$
$\sqrt{4 - x^2} = \sqrt{3.9984} = 1.999599960\ldots$
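Parts b and c of this example can be explored with the same four-digit simulation. The sketch below rests on two assumptions: that the quantity subtracted from 4 is $x^2 = 0.0016$, as in the computation above, and that the alternative formula is the rationalized form $\Delta y = x^2 / (2 + \sqrt{4 - x^2})$. The function names are illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def delta_y_direct(x2, ctx):
    """2 - sqrt(4 - x^2), rounding after every operation."""
    root = ctx.sqrt(ctx.subtract(Decimal(4), x2))
    return ctx.subtract(Decimal(2), root)

def delta_y_rationalized(x2, ctx):
    """x^2 / (2 + sqrt(4 - x^2)), rounding after every operation."""
    root = ctx.sqrt(ctx.subtract(Decimal(4), x2))
    return ctx.divide(x2, ctx.add(Decimal(2), root))

four = Context(prec=4, rounding=ROUND_HALF_UP)
x2 = Decimal("0.0016")                 # x^2, assumed as described above

print(delta_y_direct(x2, four))        # subtracts two nearly equal numbers
print(delta_y_rationalized(x2, four))  # no cancellation
```

Under these assumptions the direct formula loses almost all of its significant digits in the final subtraction, while the rationalized form avoids subtracting nearly equal numbers.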
Example 4. Consider the calculation of $y = 1 + x + x^2/2! + x^3/3! + \cdots + x^n/n!$ discussed in section 1.1. Let's estimate the round-off error in the computation when $x = -5.5$ and $n = 25$, and the calculations are done with decimal floating point numbers with six digits of precision. In this case $\varepsilon = 5 \times 10^{-6}$.
Solution. The answer depends somewhat on the algorithm used to compute the sum. Note that $y = y_{25}$ where
$y_j = 1 + x + x^2/2! + x^3/3! + \cdots + x^j/j! = 1 + t_1 + t_2 + t_3 + \cdots + t_j = y_{j-1} + t_j$
where
$t_j = x^j/j! = q_j/f_j$
$q_j = x^j = x q_{j-1}$
$f_j = j! = j f_{j-1}$
Suppose one uses the following algorithm.
x = -5.5; n = 25;
y_0 = 1; q_0 = 1; f_0 = 1;
for j = 1 to n do
begin
    q_j = x q_{j-1};
    f_j = j f_{j-1};
    t_j = q_j / f_j;
    y_j = y_{j-1} + t_j
end
One way to estimate the round-off error is to first do the computation using six digits of precision and then with more digits of precision. This is done in Example 9.5.2a in Section 1.9.5 below. With six digits of precision one obtains 0.00405471, with 10 digits one obtains 0.00408674 and with 14 digits one obtains 0.00408673. It appears that the true value is about 0.0040867, so the six digit calculation is off by about $3 \times 10^{-5}$, which is about a 1% error.
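The algorithm above can be run at several precisions with Python's `decimal` module. This is a sketch under stated assumptions: `partial_sum` is a hypothetical name, and rounding is `ROUND_HALF_UP` after every operation. The low-order digits it produces need not agree with the values quoted above, since those depend on the rounding rules of the system that produced them.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def partial_sum(x_text, n, digits):
    """Sum 1 + x + x^2/2! + ... + x^n/n!, rounding every operation
    to `digits` significant decimal digits."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    x = ctx.create_decimal(x_text)
    y = q = f = Decimal(1)                 # y_0, q_0, f_0
    for j in range(1, n + 1):
        q = ctx.multiply(x, q)             # q_j = x * q_{j-1}
        f = ctx.multiply(Decimal(j), f)    # f_j = j * f_{j-1}
        t = ctx.divide(q, f)               # t_j = q_j / f_j
        y = ctx.add(y, t)                  # y_j = y_{j-1} + t_j
    return y

for digits in (6, 10, 14):
    print(digits, partial_sum("-5.5", 25, digits))
```

Comparing the six-digit result with the 14-digit one gives the kind of empirical error estimate described in the paragraph above.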
Another way to estimate the round-off error is to estimate the error at each stage of the computation. This can be somewhat complicated, as we saw in Example 2. First consider the error in $q_j = x^j$. The values of $q_0$ and $q_1$ can be represented exactly. In this particular example the values of $q_2$ and $q_3$ can also be represented exactly, but if $x$ had some other value this might not be true. So we shall give an estimate that holds for any value of $x$. To get $q_j$ we multiply $q_{j-1}$ by $x$ and round. Before rounding, the relative error in $x q_{j-1}$ is no more than the relative error in $q_{j-1}$. Rounding introduces an additional relative error of no more than $\varepsilon$. So the relative error in $q_j$ is no more than approximately $(j-1)\varepsilon$. Similarly, the relative error in $f_j$ is no more than approximately $(j-1)\varepsilon$.
Now consider the error in $t_j = q_j / f_j$. Before rounding, the relative error in $t_j$ is no more than approximately the sum of the relative errors in $q_j$ and $f_j$, which is $2(j-1)\varepsilon$. Rounding introduces an additional relative error of no more than approximately $\varepsilon$, so the relative error in $t_j$ is no more than about $(2j-1)\varepsilon$. The absolute error in $t_j$ is no more than about $(2j-1)\varepsilon\,|t_j|$.
Finally consider the error in $y_j = y_{j-1} + t_j$. Before rounding, the absolute error in $y_j$ is no more than the sum of the absolute errors in $y_{j-1}$ and $t_j$. Rounding introduces an additional absolute error of no more than $|y_j|\,\varepsilon$. So the absolute error in $y_j$ is no more than about $b_j \varepsilon$, where $b_j = \sum_{k=2}^{j} (2k-1)\,|t_k| + \sum_{k=2}^{j} |y_k|$ (the sums start at $k = 2$ because $t_1$ and $y_1$ are computed exactly here).
This value is computed for $j = 25$ in Example 2 in section 1.6.2. One obtains $b_{25} \approx 2559$, so the error in $y_{25}$ is bounded by about $b_{25}\varepsilon \approx 0.013$. The terms $|t_j|$ start at 5.5 for $j = 1$ and go 15..., 27..., 38..., 41... for $j = 2, 3, 4, 5$ and then start to decrease. The terms $(2j-1)|t_j|$ start at 45... for $j = 2$ and go 138..., 266..., 377..., 422... for $j = 3, 4, 5, 6$ and then start to decrease. The terms $|y_j|$ start at 4.5 for $j = 1$ and go 10..., 17..., 21... for $j = 2, 3, 4$ and then start to decrease. It turns out that the main contributions to $b_{25}$ are the terms $(2k-1)|t_k|$ for $k$ between 3 and 12. This estimate of the error is quite a bit larger than the one obtained above by repeating the computations using more digits. This is because it assumes the worst possible case at each step.
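The coefficient $b_{25}$ can be checked numerically. The sketch below assumes that $b_j$ is the sum of $(2k-1)|t_k|$ plus the sum of $|y_k|$, both taken over $k = 2, \ldots, j$; with that reading the result agrees with the value of about 2559 quoted above. Ordinary double precision is used, which is more than accurate enough for this purpose, and the function name is illustrative.

```python
from math import factorial

def error_bound_coefficient(x, n):
    """Sum of (2k-1)|t_k| + |y_k| for k = 2..n, where t_k = x^k / k!
    and y_k is the k-th partial sum of the series."""
    y = 1.0
    b = 0.0
    for k in range(1, n + 1):
        t = x**k / factorial(k)
        y += t
        if k >= 2:                 # t_1 and y_1 are exact, so they add no error
            b += (2 * k - 1) * abs(t) + abs(y)
    return b

b25 = error_bound_coefficient(-5.5, 25)
eps = 5e-6                          # six decimal digits
print(b25, b25 * eps)
```

Multiplying the coefficient by $\varepsilon = 5 \times 10^{-6}$ gives the worst-case absolute error bound of roughly 0.013 for the six-digit computation.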