2 Solved Problems on the Numerical Methods - Assignment 1 | MATH 609, Assignments of Mathematical Methods for Numerical Analysis and Optimization

Material Type: Assignment; Class: NUMERICAL ANALYSIS; Subject: MATHEMATICS; University: Texas A&M University; Term: Unknown 1989;

Uploaded on 02/10/2009

MATH 609-602: Numerical Methods

Lecturer: Prof. Wolfgang Bangerth, Blocker Bldg., Room 507D, (979) 845 6393

Teaching Assistant: Seungil Kim, Blocker Bldg., Room 507A, (979) 862 3259

Homework assignment 1

All problems are discussed in the lab on Wednesday. You do not need to hand anything in.

Problem 1 (Continuous vs. discrete). Functions $f(x)$ are usually defined over an entire domain $x \in I = (a, b) \subset \mathbb{R}$ and, if interesting, take values in an image $f(I) \subset \mathbb{R}$. Both domain and image are sets with infinitely many elements. On the other hand, computers can only represent numbers using a finite number of bits, most often as 32-bit (float, or REAL*4) or 64-bit (double, or REAL*8) IEEE floating point numbers, which store numbers in the form $\pm m \, 2^e$, where $0 \le m < 1$ is the mantissa

$$m = b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + \cdots + b_M 2^{-M} \qquad (1)$$

and e is the exponent and has the form

$$e = \pm\left(u_1 2^1 + u_2 2^2 + u_3 2^3 + \cdots + u_E 2^E\right). \qquad (2)$$

The coefficients $b_i$, $u_i$ are single-bit numbers, i.e. either 0 or 1. In the binary system, floating point numbers can therefore be written as $\pm 0.b_1 b_2 b_3 \ldots \times 2^{\pm u_1 u_2 u_3 \ldots u_E}$. In total, the representation needs $M$ bits for the mantissa, $E + 1$ bits for the exponent, and 2 bits for the two signs. Obviously, not all elements of $I$ and $f(I)$ can be represented. Write a short program to find

a) an approximation to the smallest and largest positive numbers that can be represented in float and double precision;

b) the smallest float and double floating point number you can add to 1 such that the result is different from 1.

c) In exact arithmetic, the system of linear equations

$$x_1 + x_2 = 2, \qquad x_1 + 10^{20} x_2 = 1 + 10^{20}$$

has the solution $x_1 = x_2 = 1$. Are there corresponding floating point numbers for $x_1, x_2$ that, when plugged into the left hand side of the equations, yield the exact values on the right hand side? If so, which? If not, is this a problem?
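One possible way to approach parts a) to c) is sketched below in Python. This is a minimal sketch, not a reference solution. It assumes that Python's built-in float is an IEEE 64-bit double and emulates single precision by round-tripping values through the standard `struct` module; the helper names (`to_float32`, `to_float64`, `smallest_positive`, `largest_power_of_two`, `machine_epsilon`, `absorbs_one`) are made up for this illustration. The searches only step through powers of two, so they yield approximations in the sense of part a), not the exact extreme values.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (an IEEE double) to the nearest 32-bit float."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # struct refuses values beyond the float32 range; treat as overflow.
        return math.copysign(math.inf, x)


def to_float64(x):
    """Python floats are already 64-bit doubles, so there is nothing to round."""
    return x


def smallest_positive(rnd):
    """Halve 1.0 until the next halving underflows to zero."""
    x = 1.0
    while rnd(x / 2.0) > 0.0:
        x = rnd(x / 2.0)
    return x


def largest_power_of_two(rnd):
    """Double 1.0 until the next doubling overflows to infinity."""
    x = 1.0
    while not math.isinf(rnd(x * 2.0)):
        x = rnd(x * 2.0)
    return x


def machine_epsilon(rnd):
    """Smallest power of two eps with 1 + eps != 1 in the given precision."""
    eps = 1.0
    while rnd(1.0 + eps / 2.0) != 1.0:
        eps = eps / 2.0
    return eps


def absorbs_one(big):
    """Part c): True if adding 1 to big leaves big unchanged in double precision."""
    return 1.0 + big == big


if __name__ == "__main__":
    for name, rnd in (("float", to_float32), ("double", to_float64)):
        print(name, smallest_positive(rnd), largest_power_of_two(rnd),
              machine_epsilon(rnd))
    print("1 + 1e20 == 1e20:", absorbs_one(1e20))
```

On an IEEE-conforming machine the epsilon search should return $2^{-23}$ for float and $2^{-52}$ for double. The last line hints at part c): the right hand side $1 + 10^{20}$ is itself not representable in double precision, which is worth keeping in mind when answering the question.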

Problem 2 (Floating point vs. real numbers). Let $\varepsilon$ be the smallest floating point number in double precision such that in computer arithmetic $1 + \varepsilon \neq 1$ (you determined $\varepsilon$ in Problem 1b). What are the floating point values of $(1 + \frac{\varepsilon}{2}) - 1$, $1 + (\frac{\varepsilon}{2} - 1)$, and $(1 - 1) + \frac{\varepsilon}{2}$? In what important way do exact and floating point arithmetic therefore differ?
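The three expressions can be evaluated directly. The following is a small sketch in Python, assuming that Python floats are IEEE doubles with round-to-nearest and taking $\varepsilon = 2^{-52}$ (the value reported by `sys.float_info.epsilon`, and the power of two a halving search for Problem 1b would find). The function name `three_groupings` is hypothetical.

```python
import sys


def three_groupings(eps):
    """Evaluate the same sum of 1, eps/2 and -1 with three different groupings."""
    half = eps / 2.0
    a = (1.0 + half) - 1.0   # 1 + eps/2 is rounded before the subtraction
    b = 1.0 + (half - 1.0)   # eps/2 - 1 is exactly representable
    c = (1.0 - 1.0) + half   # 0 + eps/2 involves no rounding at all
    return a, b, c


if __name__ == "__main__":
    a, b, c = three_groupings(sys.float_info.epsilon)
    print("(1 + eps/2) - 1 =", a)
    print("1 + (eps/2 - 1) =", b)
    print("(1 - 1) + eps/2 =", c)
```

In exact arithmetic all three values equal $\varepsilon/2$. Comparing the printed results shows whether the grouping of the additions matters in floating point arithmetic.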

Problem 3 (Taylor series). Derive the first four terms and integral remainder term of the Taylor series of

a) $f(x) = \sin x$ when expanded around $x_0 = 0$;

b) $f(x) = x \sin x$ when expanded around $x_0 = \pi/2$;

c) $f(x) = 4(x - 3)^2 (x + 2)$ when expanded around $x_0 = 1$. What happened to the remainder term and what does this mean for the accuracy of the Taylor expansion with only four terms?

d) $f(x) = x^x$ when expanded around $x_0 = 1$. (Note: You will first have to figure out how to differentiate $f(x)$. Use the identity $a^b = e^{b \ln a}$.)

You may use a computer algebra system like Maple to compute derivatives of $f(x)$, but not to generate the entire Taylor series.
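Hand-derived expansions can be checked numerically. The sketch below is an illustration, not part of the assignment. For part c) it compares $f$ with the candidate expansion $48 - 32(x-1) - 4(x-1)^2 + 4(x-1)^3$, whose coefficients were worked out by hand here and should be verified independently. For part d) it compares the candidate derivative $f'(x) = x^x(\ln x + 1)$, obtained from the identity $a^b = e^{b \ln a}$, with a central finite difference. All function names are hypothetical.

```python
import math


def f_c(x):
    """The cubic from part c)."""
    return 4.0 * (x - 3.0) ** 2 * (x + 2.0)


def taylor_c(x, x0=1.0):
    """Candidate four-term expansion of f_c around x0 = 1 (hand-computed)."""
    h = x - x0
    return 48.0 - 32.0 * h - 4.0 * h ** 2 + 4.0 * h ** 3


def f_d(x):
    """The function x**x from part d), defined for x > 0."""
    return math.exp(x * math.log(x))


def df_d(x):
    """Candidate derivative of x**x, from d/dx exp(x ln x)."""
    return f_d(x) * (math.log(x) + 1.0)


def central_difference(func, x, h=1e-6):
    """Second-order finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.0, 3.0, 5.5):
        print(x, f_c(x), taylor_c(x))
    print(df_d(1.0), central_difference(f_d, 1.0))
```

If the two columns for part c) agree even far from $x_0 = 1$, that says something about the remainder term of a cubic polynomial after four terms.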