

Material Type: Assignment; Class: NUMERICAL ANALYSIS; Subject: MATHEMATICS; University: Texas A&M University; Term: Unknown 1989;
Typology: Assignments


Lecturer: Prof. Wolfgang Bangerth, Blocker Bldg., Room 507D, (979) 845 6393, [email protected]
Teaching Assistant: Seungil Kim, Blocker Bldg., Room 507A, (979) 862 3259, [email protected]
All problems are discussed in the lab on Wednesday. You do not need to hand anything in.
Problem 1 (Continuous vs. discrete). Functions $f(x)$ are usually defined over an entire domain $x \in I = (a, b) \subset \mathbb{R}$ and, if interesting, take values in an image $f(I) \subset \mathbb{R}$. Both domain and image are sets with infinitely many elements. On the other hand, computers can only represent numbers using a finite number of bits, most often as 32-bit (float, or REAL4) or 64-bit (double, or REAL8) IEEE floating point numbers, which store numbers in the form $\pm m \, 2^{e}$, where $0 \le m < 1$ is the mantissa
$$m = b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + \cdots + b_M 2^{-M} \qquad (1)$$
and $e$ is the exponent and has the form
$$e = \pm\left(u_1 2^{1} + u_2 2^{2} + u_3 2^{3} + \cdots + u_E 2^{E}\right). \qquad (2)$$
The coefficients $b_i$, $u_i$ are single-bit numbers, i.e. either 0 or 1. In the binary system, floating point numbers can therefore be written as $\pm 0.b_1 b_2 b_3 \ldots \times 2^{\pm u_1 u_2 u_3 \ldots u_E}$. The representation needs M bits for the mantissa, E + 1 bits for the exponent, and 2 bits for the two signs. Obviously, not all elements of $I$ and $f(I)$ can be represented. Write a short program to find
a) an approximation to the smallest and largest positive numbers that can be represented in float and double precision;
b) the smallest float and double floating point number you can add to 1 such that the result is different from 1.
c) In exact arithmetic, the system of linear equations
$$x_1 + x_2 = 2, \qquad x_1 + 10^{20} x_2 = 1 + 10^{20}$$
has the solution $x_1 = x_2 = 1$. Are there corresponding floating point numbers for $x_1$, $x_2$ that, when plugged into the left hand side of the equations, yield the exact values on the right hand side? If so, which? If not, is this a problem?
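The representation in equation (1) and the probing asked for in parts a) and b) can be made concrete with a short sketch. It is written in Python, whose built-in float is a 64-bit IEEE double, so it covers only the double case; the float case would need a 32-bit type from a library such as NumPy, which is not shown. The helper names `decompose` and `halve_until` are made up for this illustration, and the halving loops only find powers of two, so they give approximations in the sense of part a).

```python
import math

def decompose(x):
    """Split a nonzero double x into (sign, bits, e) such that
    |x| = 0.b1 b2 b3 ... (in binary) * 2**e, as in equation (1)."""
    m, e = math.frexp(abs(x))   # |x| == m * 2**e with 0.5 <= m < 1
    sign = -1 if x < 0 else 1
    bits = []
    for _ in range(53):         # a double carries 53 significant bits
        m *= 2                  # move the next binary digit in front of the point
        bit = int(m)
        bits.append(bit)
        m -= bit
    return sign, bits, e

def halve_until(still_ok):
    """Starting from 1.0, keep halving x for as long as still_ok(x / 2) holds."""
    x = 1.0
    while still_ok(x / 2):
        x /= 2
    return x

# Smallest power of two that still changes 1.0 when added to it.
eps = halve_until(lambda h: 1.0 + h != 1.0)
# Smallest positive power of two that has not yet underflowed to zero.
tiny = halve_until(lambda h: h > 0.0)

sign, bits, e = decompose(0.1)
print(sign, e, "".join(map(str, bits)))
print(eps, tiny)
```

The same halving idea, run in the other direction (doubling until the result overflows), gives an approximation to the largest representable number.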
Problem 2 (Floating point vs real numbers). Let $\varepsilon$ be the smallest floating point number in double precision such that in computer arithmetic $1 + \varepsilon \neq 1$ (you determined $\varepsilon$ in Problem 1b). What are the floating point values of $(1 + \frac{\varepsilon}{2}) - 1$, $1 + (\frac{\varepsilon}{2} - 1)$, and $(1 - 1) + \frac{\varepsilon}{2}$? In what important way do exact and floating point arithmetic therefore differ?
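A minimal sketch for experimenting with this problem, reading the three expressions as involving ε/2. It assumes ε is taken to be `sys.float_info.epsilon`, i.e. the gap between 1.0 and the next larger double, which is the value a simple halving loop finds; a different definition of ε would change the numbers.

```python
import sys

# Assumption: eps is the gap between 1.0 and the next double (2**-52).
eps = sys.float_info.epsilon
h = eps / 2

# The same three quantities, grouped differently.
a = (1.0 + h) - 1.0
b = 1.0 + (h - 1.0)
c = (1.0 - 1.0) + h

print(a, b, c)
print("a == b:", a == b, " b == c:", b == c)
```

In exact arithmetic all three groupings give the same number, so any disagreement in the printed values comes purely from rounding of the intermediate results.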
Problem 3 (Taylor series). Derive the first four terms and integral remainder term of the Taylor series of
a) $f(x) = \sin x$ when expanded around $x_0 = 0$;
b) $f(x) = x \sin x$ when expanded around $x_0 = \pi/2$;
c) $f(x) = 4(x - 3)^2 (x + 2)$ when expanded around $x_0 = 1$. What happened to the remainder term and what does this mean for the accuracy of the Taylor expansion with only four terms?
d) $f(x) = x^x$ when expanded around $x_0 = 1$. (Note: You will first have to figure out how to differentiate $f(x)$. Use the identity $a^b = e^{b \ln a}$.)
You may use a computer algebra system like Maple to compute derivatives of f (x), but not to generate the entire Taylor series.
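For reference, the general form these derivations start from can be written as follows. This is Taylor's theorem with the remainder in integral form, assuming $f$ is four times continuously differentiable between $x_0$ and $x$, and reading "the first four terms" as the terms of order 0 through 3 (an interpretation, since some of these terms may vanish for a particular $f$). The name $R_3$ for the remainder is a label chosen here.

```latex
f(x) = f(x_0) + f'(x_0)\,(x - x_0)
     + \frac{f''(x_0)}{2!}\,(x - x_0)^2
     + \frac{f'''(x_0)}{3!}\,(x - x_0)^3
     + R_3(x),
\qquad
R_3(x) = \frac{1}{3!} \int_{x_0}^{x} f^{(4)}(t)\,(x - t)^3 \, dt .
```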