



Solutions to problems related to floating point number representation, error analysis, and algorithm implementation. Topics include binary mantissa format, base-ten conversion, machine epsilon, bisection algorithm, false position method, and finite difference approximation of derivatives.
Typology: Exams




ME2016A, Spring 2004, Dr. Ferri
Name: _____ Solution _______
Test 1, February 9
This is a closed-book, closed-notes test. There are 6 problems for a total of 40 points.
Honor Pledge: On my honor, I pledge that I have neither given nor received any inappropriate aid in the preparation of this test.
Signature
Problem 1 (10) ____ 10 ___
Problem 2 (15) ____ 15 ___
Problem 3 (8) ______ 8 ___
Problem 4-6 (7) ____ 7 ____
Total (40) ________ 40 ____
Problem 1. (10 points)
In the text, we studied a particular format for floating point numbers, $x = m \cdot 2^{e}$, where the binary mantissa must have the form $m = (0.1\ldots)_2$. In other words, the first digit of m after the decimal point must be a "1". To avoid wasting a bit this way, most computers use an alternate format:
$x = (1 + f) \cdot 2^{e}$
where f can be any binary number such that $0 \le f \le (0.1111\ldots1)_2$.
Consider a 9-bit computer where the first bit is a sign bit for x (0 = positive, 1 = negative), the next bit is a sign bit for the exponent e, the next 3 bits are for the exponent itself, and the next 4 are the digits after the decimal point in f.
(a) What is the base-ten number given by the following register values:
[Register figure not reproduced. The solution to (a) implies a positive number with $e = +(011)_2$ and $f = (0.0101)_2$.]
(b) What is the smallest positive number that can be represented on this computer?
(c) What is the largest positive number that can be represented on this computer?
(d) What is the smallest number ε such that 1 + ε > 1?
(a) $f = (0.0101)_2 = (1/4 + 1/16)_{10} = (5/16)_{10}$; $e = +(011)_2 = (+3)_{10}$
$x = (1 + f) \cdot 2^{e} = (1 + 5/16) \cdot 2^{3} = (10.5)_{10}$
(b) Smallest: $f = 0$, $e = -7$ → $x = (1 + 0) \cdot 2^{-7} = 2^{-7}$
(c) Largest: $f = (0.1111)_2 = (1/2 + 1/4 + 1/8 + 1/16)_{10} = (15/16)_{10}$; $e = +7$
$x = (1 + f) \cdot 2^{+7} = (31/16)_{10} \cdot 2^{+7} = (248)_{10}$
(d) Machine epsilon: $\varepsilon = 2^{1-t}$, where t is the number of bits in m. In this case, t = (4 (for f) plus 1) = 5.
$\varepsilon = 2^{-4} = (1/16)_{10}$
Check: $1 = (1.0000)_2 \cdot 2^{0}$ and $1/16 = (1.0000)_2 \cdot 2^{-4}$. To add, shift the decimal point:
$1 + 1/16 = (1.0001)_2 \cdot 2^{0}$. Anything smaller than 1/16 will not register when added to 1.
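The four answers above can be cross-checked with a few lines of Matlab. This is only a sketch under stated assumptions: the register `bits` is inferred from the part (a) solution rather than copied from the test, the exponent sign bit is assumed to follow the same convention as the sign bit for x (0 = positive), and all variable names are illustrative.

```matlab
% Sketch: decode the 9-bit format x = (1 + f) * 2^e from Problem 1.
% Assumed bit layout: [sign of x, sign of e, 3 exponent bits, 4 fraction bits].
% The register below is inferred from the part (a) solution (assumption).
bits = [0 0 0 1 1 0 1 0 1];

sx = 1 - 2*bits(1);                    % +1 or -1 for the number (0 = positive)
se = 1 - 2*bits(2);                    % +1 or -1 for the exponent (assumed same convention)
e  = se * sum(bits(3:5) .* [4 2 1]);   % signed exponent, magnitude 0..7
f  = sum(bits(6:9) .* 2.^(-(1:4)));    % fraction, 0..15/16
xa = sx * (1 + f) * 2^e;               % part (a)

xmin = (1 + 0) * 2^(-7);               % part (b): f = 0, e = -7
xmax = (1 + 15/16) * 2^7;              % part (c): f = 0.1111, e = +7
epsm = 2^(1 - 5);                      % part (d): t = 5 bits in m

fprintf('a: %g  b: %g  c: %g  d: %g\n', xa, xmin, xmax, epsm);
```

All of these values are exact in double precision, so they can be compared directly with 10.5, 2^-7, 248, and 1/16.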
[Figure not reproduced: plot marking the true root; the root value is truncated in this copy.]
Problem 3. (8 points) The formula for the first forward finite difference approximation of a derivative is
$f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h} - \frac{R_1}{h}$
where $R_1$ is the Taylor-series remainder. It is assumed that the spacing between the $x_i$ coordinates is constant; i.e., $x_{i+1} - x_i = h$ for all i. The 4 lines of Matlab code below generate a sequence of points defining the curve $f(x) = \sin(10x)$.
(a) Add some lines of Matlab code that will take the vectors x and f (already in the workspace), and the variables h and n, to approximate the first derivative of f at each $x_i$ using the expression above. Your code MUST contain a loop.
(b) The approximation generated by your answer to (a) will have some error relative to the exact result. If h is reduced by a factor of 10, should the error go up or down? By what factor will it change? Explain your reasoning.
h = 0.1;
x = 0:h:50;
f = sin(10*x);
n = length(x);
for k = 1:(n-1)
    fprime(k) = (f(k+1) - f(k))/h;   % forward difference at x(k)
end
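The problem requires a loop, but the loop's output can be sanity-checked against Matlab's built-in `diff`, which returns the successive differences of a vector. The sketch below repeats the setup so that it runs on its own; `fprime_vec` and `maxdiff` are illustrative names, and the preallocation with `zeros` is an addition that is not part of the original solution.

```matlab
% Sketch: compare the looped forward difference with a vectorized version.
h = 0.1;
x = 0:h:50;
f = sin(10*x);
n = length(x);

fprime = zeros(1, n-1);               % preallocate (addition, not in the test solution)
for k = 1:(n-1)
    fprime(k) = (f(k+1) - f(k))/h;    % forward difference at x(k)
end

fprime_vec = diff(f)/h;               % same quantity without a loop
maxdiff = max(abs(fprime - fprime_vec));
fprintf('largest disagreement: %g\n', maxdiff);
```

Both versions return n-1 values, because the forward difference is not defined at the last point.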
(b) We know that $\frac{R_1}{h} = \frac{O(h^2)}{h} = O(h)$, so the truncation error is first order in h. If h is reduced by a factor of 10, the truncation error will also be reduced by a factor of 10.
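The first-order behavior in (b) can be observed numerically. The sketch below compares the worst-case forward-difference error for two step sizes that differ by a factor of 10. The step sizes, the interval [0, 1], and the variable names are assumptions chosen for illustration; the steps are deliberately much smaller than the test's h = 0.1 so that the leading O(h) term dominates for f(x) = sin(10x).

```matlab
% Sketch: measure the forward-difference error for two step sizes.
hs  = [1e-3 1e-4];                         % illustrative step sizes (not from the test)
err = zeros(size(hs));
for j = 1:length(hs)
    h = hs(j);
    x = 0:h:1;
    f = sin(10*x);
    n = length(x);
    fprime = zeros(1, n-1);
    for k = 1:(n-1)
        fprime(k) = (f(k+1) - f(k))/h;     % forward difference at x(k)
    end
    exact  = 10*cos(10*x(1:n-1));          % analytic derivative of sin(10x)
    err(j) = max(abs(fprime - exact));     % worst-case error on the grid
end
ratio = err(1)/err(2);                     % expected to be close to 10
fprintf('max error %.3g (h = %g), %.3g (h = %g), ratio %.2f\n', ...
        err(1), hs(1), err(2), hs(2), ratio);
```

For much smaller h, round-off in f(k+1) - f(k) would eventually outweigh the truncation error, so the factor-of-10 reduction holds only while truncation error dominates.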