


These notes discuss the challenges of solving nonlinear equations of the form f(x) = 0, where f is a known function. They cover existence and uniqueness conditions, the bisection method, and fixed-point iteration, and they also mention the Intermediate Value Theorem and the Inverse Function Theorem.



Jim Lambers, Math 105A, Summer Session I 2003, Lecture 4 Notes
These notes correspond to Sections 2.1 and 2.2 in the text.
To this point, we have only considered the solution of linear equations. We now explore the much more difficult problem of solving nonlinear equations of the form
$f(x) = 0$,
where $f : \mathbb{R}^n \to \mathbb{R}^m$ can be any known function. A solution $x$ of such a nonlinear equation is called a root of the equation, as well as a zero of the function $f$.
For simplicity, we assume that the function $f : \mathbb{R}^n \to \mathbb{R}^m$ is continuous on the domain under consideration. Then each equation $f_i(x) = 0$, $i = 1, \ldots, m$, defines a hypersurface in $\mathbb{R}^n$. The solution set of $f(x) = 0$ is the intersection of these hypersurfaces, if the intersection is not empty. It is not hard to see that there can be a unique solution, infinitely many solutions, or no solution at all.
For a general equation $f(x) = 0$, it is not possible to characterize the conditions under which a solution exists or is unique. However, in some situations, existence can be determined analytically. For example, in one dimension, the Intermediate Value Theorem implies that if a continuous function $f$ satisfies $f(a) \le 0$ and $f(b) \ge 0$, where $a < b$, then $f(x) = 0$ for some $x \in [a, b]$. Equivalently, $f(x) = 0$ for some $x \in [a, b]$ if $(x - z) f(x) \ge 0$ for $x = a$ and $x = b$, where $z \in (a, b)$.
This condition can be generalized to higher dimensions, with $m = n$. If $S \subset \mathbb{R}^n$ is an open, bounded set, and $(x - z)^T f(x) \ge 0$ for all $x$ on the boundary of $S$ and for some $z \in S$, then $f(x) = 0$ for some $x$ in the closure of $S$. Unfortunately, checking this condition can be difficult in practice.
One useful result from calculus that can be used to establish existence and, in some sense, uniqueness of a solution is the Inverse Function Theorem. It states that if $f$ is continuously differentiable and the Jacobian of $f$ is nonsingular at a point $x_0$, then $f$ is invertible near $x_0$, and the equation $f(x) = y$ has a unique solution near $x_0$ for all $y$ near $f(x_0)$.
If the Jacobian of $f$ at a point $x_0$ is singular, then $f$ is said to be degenerate at $x_0$. Suppose that $x_0$ is a solution of $f(x) = 0$. Then, in one dimension, degeneracy means $f'(x_0) = 0$, and we say that $x_0$ is a multiple root of $f$; it is a double root if, in addition, $f''(x_0) \ne 0$. More generally, if $f^{(j)}(x_0) = 0$ for $j = 0, \ldots, m - 1$ and $f^{(m)}(x_0) \ne 0$, then $x_0$ is a root of multiplicity $m$. We will see that degeneracy can cause difficulties when trying to solve nonlinear equations.
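As a small worked illustration of multiplicity, consider the cubic below. The function is chosen here only as an example and does not come from the notes.

```latex
\begin{align*}
f(x)   &= (x - 1)^2 (x + 2) = x^3 - 3x + 2, & f(1)   &= 0, \\
f'(x)  &= 3x^2 - 3 = 3(x - 1)(x + 1),       & f'(1)  &= 0, \\
f''(x) &= 6x,                               & f''(1) &= 6 \neq 0.
\end{align*}
```

Hence $x_0 = 1$ is a degenerate (double) root, while $x_0 = -2$ is a simple root, since $f'(-2) = 9 \neq 0$.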
Recall that the absolute condition number of evaluating a function $f$ at $x$ is approximately $|f'(x)|$. In solving a nonlinear equation in one dimension, we are solving an inverse problem: given the value $y = 0$, find $x$ such that $f(x) = y$. The corresponding forward problem is the evaluation of $f$ at a given $x$. It follows that the condition number for solving $f(x) = 0$ is approximately $1/|f'(x_0)|$, where $x_0$ is the solution. This discussion can be generalized to higher dimensions, where the condition number is measured using the norm of the inverse of the Jacobian at $x_0$.
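The conditioning estimate can be checked numerically. The sketch below perturbs the right-hand side of the equation by a small amount and compares the resulting shift in the root with the predicted value, the perturbation divided by $|f'(x_0)|$. The test function $x^2 - 2$, the helper name `root_shift`, and the perturbation size `eps` are illustrative assumptions, not part of the notes.

```python
import math

def root_shift(eps):
    """Shift in the positive root when x**2 - 2 = 0 becomes x**2 - 2 = eps."""
    return math.sqrt(2.0 + eps) - math.sqrt(2.0)

eps = 1e-6                        # assumed size of the perturbation
x0 = math.sqrt(2.0)               # exact root of the unperturbed equation
predicted = eps / abs(2.0 * x0)   # eps / |f'(x0)|, using f'(x) = 2x
actual = root_shift(eps)
print(predicted, actual)
```

The two printed values should agree to first order in `eps`. A root at which $f'(x_0)$ is small would show a correspondingly larger shift.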
Suppose that $f(x)$ is a continuous function that changes sign on the interval $[a, b]$. Then, by the Intermediate Value Theorem, $f(x) = 0$ for some $x \in [a, b]$. How can we find the solution, knowing that it lies in this interval? The method of bisection attempts to reduce the size of the interval in which a solution is known to exist. Suppose that we evaluate $f(m)$, where $m = (a + b)/2$. If $f(m) = 0$, then we are done. Otherwise, $f$ must change sign on the interval $[a, m]$ or $[m, b]$, since $f(a)$ and $f(b)$ have different signs. Therefore, we can cut the size of our search space in half, and continue this process until the interval of interest is sufficiently small, in which case we must be close to a solution. The following algorithm implements this approach.
Algorithm (Bisection) Let $f$ be a continuous function on the interval $[a, b]$ that changes sign on $(a, b)$. The following algorithm computes an approximation $p^*$ to a number $p$ in $(a, b)$ such that $f(p) = 0$.
for j = 1, 2, ... do
    p_j = (a + b)/2
    if f(p_j) = 0 or b - a is sufficiently small then
        p* = p_j
        return p*
    end
    if f(a) f(p_j) < 0 then
        b = p_j
    else
        a = p_j
    end
end
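The bisection algorithm can be sketched in Python as follows. The function name `bisect`, the tolerance `tol`, the iteration cap `max_iter`, and the test function $x^2 - 2$ on $[1, 2]$ are assumptions made for illustration. The sketch stops when the half-width of the bracket falls below `tol`, which is one possible reading of "sufficiently small".

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        p = a + (b - a) / 2.0          # midpoint of the current bracket
        fp = f(p)
        if fp == 0.0 or (b - a) / 2.0 < tol:
            return p
        if fa * fp < 0:                # sign change lies in [a, p]
            b = p
        else:                          # sign change lies in [p, b]
            a, fa = p, fp
    return a + (b - a) / 2.0

root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)
```

Because each pass halves the bracket, the number of passes needed grows only logarithmically as `tol` shrinks. Caching `fa` avoids re-evaluating $f$ at the left endpoint on every pass.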
Given a continuous function g that is known to have a fixed point in an interval [a, b], we can try to find this fixed point by repeatedly evaluating g at points in [a, b] until we find a point x for which g(x) = x. This is the essence of the method of fixed-point iteration, the implementation of which we now describe.
Algorithm (Fixed-Point Iteration) Let $g$ be a continuous function defined on the interval $[a, b]$. The following algorithm computes an approximation $x^*$ to a solution of the equation $g(x) = x$ in $[a, b]$.
Choose an initial guess x_0 in [a, b].
for k = 0, 1, 2, ... do
    x_{k+1} = g(x_k)
    if |x_{k+1} - x_k| is sufficiently small then
        x* = x_{k+1}
        return x*
    end
end
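A minimal Python sketch of this iteration is given below. The name `fixed_point`, the tolerance, the iteration cap, and the choice $g(x) = \cos x$ with initial guess 0.5 are illustrative assumptions, not taken from the notes.

```python
import math

def fixed_point(g, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = g(x_k) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

x_star = fixed_point(math.cos, 0.5)
print(x_star, math.cos(x_star))
```

The two printed values should agree closely, since the returned point approximately satisfies $g(x) = x$. The iteration cap guards against the case where the iterates never settle.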
Under what circumstances will fixed-point iteration converge to the exact solution $x^*$? If we denote the error in $x_k$ by $e_k = x_k - x^*$, we can see from Taylor's Theorem and the fact that $g(x^*) = x^*$ that $e_{k+1} \approx g'(x^*) e_k$. Therefore, if $|g'(x^*)| \le k$, where $k < 1$, then fixed-point iteration is locally convergent; that is, it converges if $x_0$ is chosen sufficiently close to $x^*$. This leads to the following result.
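The Taylor step behind this error relation can be written out as follows. The sketch assumes that $g$ is twice continuously differentiable near $x^*$, which is stronger than the continuity assumed in the algorithm; $\xi_k$ denotes some point between $x_k$ and $x^*$.

```latex
\begin{align*}
e_{k+1} &= x_{k+1} - x^* = g(x_k) - g(x^*) \\
        &= g'(x^*)(x_k - x^*) + \tfrac{1}{2} g''(\xi_k)(x_k - x^*)^2 \\
        &= g'(x^*)\, e_k + O(e_k^2).
\end{align*}
```

When $e_k$ is small, the quadratic term is negligible compared with the linear one, provided $g'(x^*) \neq 0$.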
Theorem (Fixed-Point Theorem) Let $g$ be a continuous function on the interval $[a, b]$ that is differentiable on $(a, b)$. If $g(x) \in [a, b]$ for each $x \in [a, b]$, and if there exists a constant $k < 1$ such that
$|g'(x)| \le k, \quad x \in (a, b),$
then the sequence of iterates $\{x_k\}_{k=0}^{\infty}$ converges to the unique fixed point $x^*$ of $g$ in $[a, b]$, for any initial guess $x_0 \in [a, b]$.
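As a numerical illustration of the theorem, take $g(x) = \cos x$ on $[0, 1]$, an example chosen here rather than taken from the notes. This $g$ maps $[0, 1]$ into itself, and $|g'(x)| = |\sin x| \le \sin 1 < 1$ on $(0, 1)$. The sketch below checks that several initial guesses reach the same limit and that the error shrinks by a factor of at most $\sin 1$ per step. The helper name `iterate` and the iteration counts are assumptions.

```python
import math

def iterate(g, x0, n):
    """Return the list [x_0, x_1, ..., x_n] with x_{i+1} = g(x_i)."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

g = math.cos
k = math.sin(1.0)     # bound on |g'(x)| = |sin x| for x in (0, 1)

# Limits reached from several initial guesses in [0, 1].
limits = [iterate(g, x0, 200)[-1] for x0 in (0.0, 0.25, 0.5, 1.0)]
x_star = limits[0]    # used as a reference value for the fixed point

# Observed error reduction factors over the first 20 steps, starting from 1.0.
xs = iterate(g, 1.0, 20)
ratios = [abs(xs[i + 1] - x_star) / abs(xs[i] - x_star) for i in range(20)]
print(limits)
print(max(ratios), k)
```

The observed reduction factors stay below the bound $k$, consistent with the estimate $|e_{k+1}| \le k |e_k|$ that underlies the theorem.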
It can be seen from the preceding discussion why $|g'(x)|$ must be bounded away from 1 on $(a, b)$, as opposed to satisfying only the weaker condition $|g'(x)| < 1$ on $(a, b)$. If $|g'(x)|$ is allowed to approach 1 as $x$ approaches a point $c \in [a, b]$, then it is possible that the error $e_k$ might not approach zero as $k$ increases, in which case fixed-point iteration would not converge.
In general, when fixed-point iteration converges, it does so at a rate that varies inversely with the constant $k$ that bounds $|g'(x)|$: the smaller the bound, the faster the error decreases. In the extreme case where derivatives of $g$, beginning with $g'$, are equal to zero at the solution $x^*$, the method can converge much more rapidly. We will discuss convergence behavior of various methods for solving nonlinear equations in a later lecture.
Often, there are many ways to convert an equation of the form $f(x) = 0$ to one of the form $g(x) = x$, the simplest being $g(x) = x - \phi(x) f(x)$ for any function $\phi$ that does not vanish on the interval of interest, so that the fixed points of $g$ are exactly the roots of $f$. However, it is important to ensure that the conversion yields a function $g$ for which fixed-point iteration will converge.
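To illustrate how much the choice of $\phi$ matters, the following sketch applies three conversions to $f(x) = x^2 - 2$, whose positive root is $\sqrt{2}$. The function, the three choices of $\phi$, the starting point 1.4, and the helper name `run` are all illustrative assumptions. For $\phi = 1$ we have $|g'(\sqrt{2})| = |1 - 2\sqrt{2}| > 1$, so the iteration is not locally convergent. For $\phi = 1/4$ we have $|g'(\sqrt{2})| \approx 0.29$. For $\phi(x) = 1/f'(x)$ we have $g'(\sqrt{2}) = 0$.

```python
import math

def f(x):
    return x * x - 2.0

def run(g, x0, n):
    """Apply n steps of x <- g(x) starting from x0 and return the final iterate."""
    x = x0
    for _ in range(n):
        x = g(x)
    return x

root = math.sqrt(2.0)
conversions = {
    "phi = 1":       lambda x: x - f(x),               # |g'(root)| > 1
    "phi = 1/4":     lambda x: x - 0.25 * f(x),        # |g'(root)| about 0.29
    "phi = 1/f'(x)": lambda x: x - f(x) / (2.0 * x),   # g'(root) = 0
}

# Error after 5 steps from the same starting point for each conversion.
errors = {name: abs(run(g, 1.4, 5) - root) for name, g in conversions.items()}
for name, err in errors.items():
    print(name, err)
```

With the first conversion the error grows from its starting value, with the second it shrinks steadily, and with the third it drops to roughly machine precision within a few steps.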