Solving Nonlinear Equations: Existence, Uniqueness, and Methods, Study notes of Mathematical Methods for Numerical Analysis and Optimization

These notes discuss the challenges of solving nonlinear equations of the form f(x) = 0, where f is any known function. They cover existence and uniqueness conditions, the bisection method, and fixed-point iteration, and they also mention the Intermediate Value Theorem and the Inverse Function Theorem.


Uploaded on 09/17/2009

Jim Lambers
Math 105A
Summer Session I 2003-04
Lecture 4 Notes
These notes correspond to Sections 2.1 and 2.2 in the text.

Nonlinear Equations

To this point, we have only considered the solution of linear equations. We now explore the much more difficult problem of solving nonlinear equations of the form

$$f(x) = 0,$$

where $f : \mathbb{R}^n \to \mathbb{R}^m$ can be any known function. A solution $x$ of such a nonlinear equation is called a root of the equation, as well as a zero of the function $f$.

Existence and Uniqueness

For simplicity, we assume that the function $f : \mathbb{R}^n \to \mathbb{R}^m$ is continuous on the domain under consideration. Then, each equation $f_i(x) = 0$, $i = 1, \ldots, m$, defines a hypersurface in $\mathbb{R}^n$. The solution set of $f(x) = 0$ is the intersection of these hypersurfaces, if the intersection is not empty. It is not hard to see that there can be a unique solution, infinitely many solutions, or no solution at all.

For a general equation $f(x) = 0$, it is not possible to characterize the conditions under which a solution exists or is unique. However, in some situations, it is possible to determine existence analytically. For example, in one dimension, the Intermediate Value Theorem implies that if a continuous function $f(x)$ satisfies $f(a) \le 0$ and $f(b) \ge 0$ where $a < b$, then $f(x) = 0$ for some $x \in [a, b]$.

Similarly, it can be concluded that $f(x) = 0$ for some $x \in [a, b]$ if $(x - z) f(x) \ge 0$ for $x = a$ and $x = b$, where $z \in (a, b)$. This condition can be generalized to higher dimensions. If $f : \mathbb{R}^n \to \mathbb{R}^n$, $S \subset \mathbb{R}^n$ is an open, bounded set, and $(x - z)^T f(x) \ge 0$ for all $x$ on the boundary of $S$ and for some $z \in S$, then $f(x) = 0$ for some $x$ in the closure of $S$. Unfortunately, checking this condition can be difficult in practice.

One useful result from calculus that can be used to establish existence and, in some sense, uniqueness of a solution is the Inverse Function Theorem, which states that if $f : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable and the Jacobian of $f$ is nonsingular at a point $x_0$, then $f$ is invertible near $x_0$, and the equation $f(x) = y$ has a unique solution near $x_0$ for all $y$ near $f(x_0)$.

If the Jacobian of $f$ at a point $x_0$ is singular, then $f$ is said to be degenerate at $x_0$. Suppose that $x_0$ is a solution of $f(x) = 0$. Then, in one dimension, degeneracy means $f'(x_0) = 0$, and we say that $x_0$ is a multiple root of $f(x)$ (a double root if $f''(x_0) \ne 0$). Similarly, if $f^{(j)}(x_0) = 0$ for $j = 0, \ldots, m - 1$ and $f^{(m)}(x_0) \ne 0$, then $x_0$ is a root of multiplicity $m$ (here $m$ denotes the multiplicity, not the dimension of the range of $f$). We will see that degeneracy can cause difficulties when trying to solve nonlinear equations.
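The one-dimensional existence test based on the Intermediate Value Theorem can be tried numerically. The following is a minimal sketch, not part of the original notes: the helper name `find_brackets`, the grid size `n`, and the test function $x^3 - 2x - 5$ are assumptions chosen for illustration. It scans a uniform grid and reports subintervals on which a continuous function changes sign, each of which must contain a root.

```python
import math
from typing import Callable, List, Tuple


def find_brackets(f: Callable[[float], float], a: float, b: float,
                  n: int = 100) -> List[Tuple[float, float]]:
    """Return grid subintervals of [a, b] that are guaranteed to hold a root.

    A subinterval is reported when f has opposite signs at its endpoints
    or vanishes at its right endpoint. For continuous f, the Intermediate
    Value Theorem then guarantees a root in that subinterval.
    Limitations of this sketch: a zero exactly at the left endpoint a is
    not reported, and roots of even multiplicity (where f touches zero
    without changing sign) are usually missed.
    """
    brackets = []
    h = (b - a) / n
    left, f_left = a, f(a)
    for i in range(1, n + 1):
        right = a + i * h
        f_right = f(right)
        if f_left * f_right < 0.0 or f_right == 0.0:
            brackets.append((left, right))
        left, f_left = right, f_right
    return brackets


# Hypothetical examples: a cubic with one real root, and cos on [0, 7].
cubic_brackets = find_brackets(lambda x: x**3 - 2.0 * x - 5.0, -3.0, 3.0, n=60)
cos_brackets = find_brackets(math.cos, 0.0, 7.0, n=70)
print(cubic_brackets)
print(cos_brackets)
```

A sign-change scan only certifies existence on the reported subintervals; an empty result does not show that no root exists, which mirrors the remark that existence cannot be characterized in general.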

Sensitivity

Recall that the absolute condition number of a function $f(x)$ is approximated by $|f'(x)|$. In solving a nonlinear equation in one dimension, we are trying to solve an inverse problem: given the value $y = 0$, find $x$ such that $f(x) = y$. The corresponding forward problem is the evaluation of $f$ at $x$. It follows that the absolute condition number for solving $f(x) = 0$ is approximately $1/|f'(x_0)|$, where $x_0$ is the solution. This discussion can be generalized to higher dimensions, where the condition number is measured using the norm of the inverse of the Jacobian, $\|J_f(x_0)^{-1}\|$.
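To make the estimate $1/|f'(x_0)|$ concrete, the sketch below perturbs two functions by a small amount `eps` and measures how far the root moves. The functions $x^2 - 2$ and $(x - 1)^2$, the helper names, and the value of `eps` are assumptions chosen for illustration. For the simple root the shift is close to `eps / |f'(x0)|`; for the double root, where $f'(x_0) = 0$, the shift is of order `sqrt(eps)`, which is far larger.

```python
import math


def root_shift_simple(eps: float) -> float:
    """Shift of the root sqrt(2) of x**2 - 2 when the function is lowered by eps.

    The perturbed equation x**2 - 2 - eps = 0 has the positive root sqrt(2 + eps).
    """
    return math.sqrt(2.0 + eps) - math.sqrt(2.0)


def root_shift_double(eps: float) -> float:
    """Shift of the double root 1 of (x - 1)**2 when the function is lowered by eps.

    The perturbed equation (x - 1)**2 = eps has roots 1 + sqrt(eps) and 1 - sqrt(eps).
    """
    return math.sqrt(eps)


eps = 1e-8
# Predicted shift for the simple root: eps / |f'(x0)| with f'(x) = 2x, x0 = sqrt(2).
predicted = eps / abs(2.0 * math.sqrt(2.0))

print("simple root: shift =", root_shift_simple(eps), " predicted =", predicted)
print("double root: shift =", root_shift_double(eps), " (compare eps =", eps, ")")
```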

The Bisection Method

Suppose that $f(x)$ is a continuous function that changes sign on the interval $[a, b]$. Then, by the Intermediate Value Theorem, $f(x) = 0$ for some $x \in [a, b]$. How can we find the solution, knowing that it lies in this interval? The method of bisection attempts to reduce the size of the interval in which a solution is known to exist. Suppose that we evaluate $f(m)$, where $m = (a + b)/2$. If $f(m) = 0$, then we are done. Otherwise, $f$ must change sign on the interval $[a, m]$ or $[m, b]$, since $f(a)$ and $f(b)$ have different signs. Therefore, we can cut the size of our search space in half, and continue this process until the interval of interest is sufficiently small, in which case we must be close to a solution. The following algorithm implements this approach.

Algorithm (Bisection) Let $f$ be a continuous function on the interval $[a, b]$ such that $f(a)$ and $f(b)$ have opposite signs. The following algorithm computes an approximation $p^*$ to a number $p$ in $(a, b)$ such that $f(p) = 0$.

    for j = 1, 2, ... do
        p_j = (a + b)/2
        if f(p_j) = 0 or b - a is sufficiently small then
            p* = p_j
            return p*
        end
        if f(a) f(p_j) < 0 then
            b = p_j
        else
            a = p_j
        end
    end
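The bisection algorithm above can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the notes' own implementation: the function name `bisect`, the tolerance `tol`, and the iteration cap `max_iter` are choices made here. "Sufficiently small" is interpreted as half the interval length falling below `tol`, which bounds the distance from the returned midpoint to a root.

```python
import math
from typing import Callable


def bisect(f: Callable[[float], float], a: float, b: float,
           tol: float = 1e-10, max_iter: int = 200) -> float:
    """Approximate a root of a continuous f on [a, b] by bisection.

    Requires f(a) and f(b) to have opposite signs (or one of them to be zero).
    """
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0.0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        p = a + (b - a) / 2.0          # midpoint of the current interval
        fp = f(p)
        # Stop when p is a root or the root is within tol of p.
        if fp == 0.0 or (b - a) / 2.0 < tol:
            return p
        if fa * fp < 0.0:              # sign change on [a, p]
            b = p
        else:                          # sign change on [p, b]
            a, fa = p, fp
    return a + (b - a) / 2.0


# Hypothetical usage: the positive root of x**2 - 2 on [1, 2].
root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
print(root, abs(root - math.sqrt(2.0)))
```

Storing `fa` avoids re-evaluating `f(a)` on every pass, which is the only deviation from the pseudocode besides the explicit iteration cap.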

Fixed-Point Iteration

Given a continuous function $g$ that is known to have a fixed point in an interval $[a, b]$, we can try to find this fixed point by repeatedly evaluating $g$ at points in $[a, b]$ until we find a point $x$ for which $g(x) = x$. This is the essence of the method of fixed-point iteration, the implementation of which we now describe.

Algorithm (Fixed-Point Iteration) Let $g$ be a continuous function defined on the interval $[a, b]$. The following algorithm computes an approximation $x^*$ to a solution in $[a, b]$ of the equation $g(x) = x$.

    Choose an initial guess x_0 in [a, b].
    for k = 0, 1, 2, ... do
        x_{k+1} = g(x_k)
        if |x_{k+1} - x_k| is sufficiently small then
            x* = x_{k+1}
            return x*
        end
    end
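A direct Python rendering of the fixed-point algorithm is sketched below. The name `fixed_point`, the tolerance `tol`, the cap `max_iter`, and the choice $g(x) = \cos x$ with $x_0 = 1$ are assumptions made for illustration. The cap is added because, unlike bisection, this iteration is not guaranteed to terminate.

```python
import math
from typing import Callable, Tuple


def fixed_point(g: Callable[[float], float], x0: float,
                tol: float = 1e-12, max_iter: int = 1000) -> Tuple[float, int]:
    """Iterate x_{k+1} = g(x_k) until successive iterates differ by less than tol.

    Returns the final iterate and the number of evaluations of g used.
    Raises RuntimeError if the stopping test is not met within max_iter steps.
    """
    x = x0
    for k in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next, k + 1
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical usage: the fixed point of cos, starting from x0 = 1.
x_fp, n_steps = fixed_point(math.cos, 1.0)
print(x_fp, n_steps, abs(math.cos(x_fp) - x_fp))
```

Note that a small step $|x_{k+1} - x_k|$ indicates a small error only when the iteration is contracting, which is the subject of the discussion that follows.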

Under what circumstances will fixed-point iteration converge to the exact solution $x^*$? If we denote the error in $x_k$ by $e_k = x_k - x^*$, we can see from Taylor's Theorem and the fact that $g(x^*) = x^*$ that $e_{k+1} = g(x_k) - g(x^*) \approx g'(x^*) e_k$. Therefore, if $|g'(x^*)| \le k$, where $k < 1$ is a constant (not to be confused with the iteration index), then fixed-point iteration is locally convergent; that is, it converges if $x_0$ is chosen sufficiently close to $x^*$. This leads to the following result.

Theorem (Fixed-Point Theorem) Let $g$ be a continuous function on the interval $[a, b]$ that is differentiable on $(a, b)$. If $g(x) \in [a, b]$ for each $x \in [a, b]$, and if there exists a constant $k < 1$ such that

$$|g'(x)| \le k, \quad x \in (a, b),$$

then the sequence of iterates $\{x_k\}_{k=0}^{\infty}$ defined by $x_{k+1} = g(x_k)$ converges to the unique fixed point $x^*$ of $g$ in $[a, b]$, for any initial guess $x_0 \in [a, b]$.
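The hypotheses of the theorem can be sanity-checked numerically for a specific function. The sketch below uses $g(x) = \cos x$ on $[0, 1]$, which is an assumption chosen for illustration, and samples $g$ and $g'$ on a grid; this is a plausibility check, not a proof. It also compares the actual errors with the bound $|x_n - x^*| \le k^n \max(x_0 - a,\, b - x_0)$, a standard consequence of the Mean Value Theorem that is not stated in these notes, and prints the ratios $e_{n+1}/e_n$, which should approach $g'(x^*)$.

```python
import math


def g(x: float) -> float:
    return math.cos(x)


def dg(x: float) -> float:
    return -math.sin(x)


a, b = 0.0, 1.0
grid = [a + i * (b - a) / 1000 for i in range(1001)]

# Sampled check of the hypotheses: g maps [a, b] into itself, and |g'| <= k < 1.
maps_into = all(a <= g(x) <= b for x in grid)
k = max(abs(dg(x)) for x in grid)

# Reference value for the fixed point, obtained by iterating many times.
x_star = 1.0
for _ in range(200):
    x_star = g(x_star)

# Record the errors e_n = x_n - x_star for the first 25 iterates from x0 = 0.5.
x0 = 0.5
errors = []
x = x0
for n in range(25):
    errors.append(x - x_star)
    x = g(x)

bounds = [k**n * max(x0 - a, b - x0) for n in range(25)]
ratios = [errors[n + 1] / errors[n] for n in range(24)]

print("maps into [a, b]:", maps_into, " k =", k)
print("last error:", abs(errors[-1]), " bound:", bounds[-1])
print("last ratio:", ratios[-1], " g'(x*):", dg(x_star))
```

The bound uses the worst-case constant $k$ over the whole interval, so it is typically pessimistic compared with the observed ratio near $x^*$.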

It can be seen from the preceding discussion why $|g'(x)|$ must be bounded away from 1 on $(a, b)$, as opposed to the weaker condition $|g'(x)| < 1$ on $(a, b)$. If $|g'(x)|$ is allowed to approach 1 as $x$ approaches a point $c \in (a, b)$, then it is possible that the error $e_k$ might not approach zero as $k$ increases, in which case fixed-point iteration would not converge.

In general, when fixed-point iteration converges, the smaller the constant $k$ that bounds $|g'(x)|$, the faster the convergence, since each iteration reduces the error by roughly a factor of $|g'(x^*)|$. In the extreme case where derivatives of $g$ are equal to zero at the solution $x^*$, the method can converge much more rapidly. We will discuss convergence behavior of various methods for solving nonlinear equations in a later lecture.

Often, there are many ways to convert an equation of the form $f(x) = 0$ to one of the form $g(x) = x$, the simplest being $g(x) = x - \phi(x) f(x)$ for any function $\phi$ that does not vanish near the solution, so that the fixed points of $g$ are exactly the roots of $f$. However, it is important to ensure that the conversion yields a function $g$ for which fixed-point iteration will converge.
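The effect of the choice of $\phi$ can be seen on a small example. The sketch below takes $f(x) = x^2 - 2$ and three choices of $\phi$, all of which are assumptions made for illustration: $\phi = 1$, $\phi = 1/4$, and $\phi(x) = 1/f'(x)$, the last of which is the choice that yields Newton's method. For each resulting $g$ it measures how much one step changes a small error near the root $x^* = \sqrt{2}$; a factor above 1 means the iteration moves away from the root.

```python
import math
from typing import Callable


def f(x: float) -> float:
    return x * x - 2.0


def make_g(phi: Callable[[float], float]) -> Callable[[float], float]:
    """Build g(x) = x - phi(x) * f(x) for a given multiplier function phi."""
    return lambda x: x - phi(x) * f(x)


def amplification(g: Callable[[float], float], x_star: float,
                  delta: float = 1e-3) -> float:
    """Ratio |g(x_star + delta) - x_star| / delta, an estimate of |g'(x_star)|."""
    return abs(g(x_star + delta) - x_star) / delta


x_star = math.sqrt(2.0)

g_one = make_g(lambda x: 1.0)                 # g'(x*) = 1 - 2*sqrt(2), magnitude > 1
g_quarter = make_g(lambda x: 0.25)            # g'(x*) = 1 - sqrt(2)/2, magnitude < 1
g_newton = make_g(lambda x: 1.0 / (2.0 * x))  # phi = 1/f', so g'(x*) = 0

amp_one = amplification(g_one, x_star)
amp_quarter = amplification(g_quarter, x_star)
amp_newton = amplification(g_newton, x_star)

print("phi = 1     :", amp_one)
print("phi = 1/4   :", amp_quarter)
print("phi = 1/f'  :", amp_newton)
```

All three functions $g$ have $\sqrt{2}$ as a fixed point, yet only the last two give a locally convergent iteration, and the third converges fastest because $g'(x^*) = 0$.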