






Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
A lecture note from the book 'scientific computing with case studies' by dianne p. O’leary. It focuses on the topic of nonlinear equations and the methods used to solve them, including newton's method, newton-like methods, and globally-convergent continuation methods. The document also includes case studies on determining orientations of a truss and parameters in an ecological model of a colony of flour beetles.
Typology: Study notes
1 / 12
This page cannot be seen from the preview
Don't miss anything!







Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/SCCSwebpage Lecture Notes for Unit VI Nonlinear Equations and Continuation Methods Dianne P. O’Leary ©^ c 2008
The problem
Given a function F : Rn^ → Rn, find a point x ∈ Rn^ such that
F(x) = 0.
Note: The one-dimensional case (n = 1) is covered in CMSC/AMSC 460. The best software for this problem is some variant on Richard Brent’s zeroin, available in Netlib. In Matlab, it is called fzero.
We’ll assume from now on that n > 1.
Goals
To develop algorithms for solving nonlinear systems of equations.
Note: Solving nonlinear equations is a close kin to solving optimization problems.
Important note: If F is a polynomial in the variables x, then use special purpose software that enables you to find all of the solutions reliably.
Example of a polynomial system:
x^2 y^3 + xy = 2 2 xy^2 + x^2 y + xy = 0
Pointers: Watson’s homotopy method; Traub’s software.
What we know
We already know a lot about solving nonlinear equations. The main tools are Newton’s method and Newton-like methods.
Differences from the methods we have studied:
∂Fi ∂xk
This matrix is generally not symmetric.
‖F(x)‖
as a merit function, but there are difficulties with this approach.
The Plan
Note: The nonlinear least squares problem, discussed in Section 24.2 and Chapter 13, is one approach to solving nonlinear equations for which
Newton-like methods
Now we can fit a linear function to F:
F(x(k)^ + p) ≈ F(x(k)) + J(x(k))p = 0 → J(x(k))p = −F(x(k)).
But if we still find the Newton direction from solving J(x(k))p(k)^ = −F(x(k)), then we are still fitting a linear function to F.
As for the line search, we will just eliminate it and set αk = 1, or we will use ‖F(x)‖ as a merit function.
The resulting algorithm is Newton’s method for nonlinear equations:
Until x(k)^ is a good enough solution,
Find a direction p(k)^ by solving J(x(k))p(k)^ = −F(x(k)). Set x(k+1)^ = x(k)^ + αkp(k)^ where αk = 1 or αk is determined by a line search.
Convergence result: Under some “standard assumptions” (2nd derivatives exist, simple zero, etc.), the algorithm is guaranteed to converge to a solution if started close enough to it, and the convergence rate is quadratic.
Newton-like methods
B(k+1)^ = B(k)^ +
(y − B(k)s)sT sT^ s
where y = F(x(k+1)) − F(x(k)) and s = x(k+1)^ − x(k). Then
B(k+1)s = y Secant condition
and if sT^ v = 0 , then B(k+1)v = B(k)v.
Convergence result: As in the minimization case, we get superlinear convergence if the direction is (asymptotically) close enough to the Newton direction.
Globally-convergent continuation methods
Globally-convergent continuation methods
The basic idea:
ρa(λ, x) = λF(x) + (1 − λ)Fa(x)
where λ is a real number in the interval [0, 1].
We have just seen one example of how to construct a homotopy function ρa(λ, x) so that solving ρa(λ, x) = 0 is easy when λ = 0 and solves the desired problem when λ = 1.
There are many other useful ways to define a homotopy, and we will use some alternatives in the case studies.
The basic algorithm
Initialize λ = 0 and x = solution to Fa(x) = 0.
While λ < 1 ,
Parameterized Sard’s Theorem: Let U = Rn^ × [0, 1) × Rn^ and define a function ρ : U → Rn^ in C^2 which is transversal to zero on U. (We’ll call its variables (a, λ, z).) Choose a point a ∈ Rn^ and define
ρa(λ, z) = ρ(a, λ, z).
Then the map ρa is transversal to zero on [0, 1) × Rn^ for almost all a.
What “for almost all” means:
Why we need this:
Our solution curve is almost always “nice”
Theorem: In addition to the hypotheses above, assume that the equation ρa(0, z) = 0 has a unique solution z 0. Then for almost all a ∈ Rm, there is a curve γ of solutions to ρa(λ, z) = 0 , emanating from (0, z 0 ), with the properties:
In summary, the curve is very well-behaved! In fact, we can hope to walk along it.
Where does the solution curve go?
Theorem: Under the hypotheses above,
In other words, the curve is going just where we want to go, and it gets there finitely, as long as the Jacobians are bounded away from rank-deficient matrices.
Picture. The above theorem means that we have a solution curve that starts at λ = 0 and goes to our desired solution at λ = 1. But there may be a countable number of other solution curves, some closed and some with two endpoints at λ = 1.
Note that we need to be careful not to step onto one of these other curves.
An example: convex optimization
Suppose we want to find x∗^ to solve the problem
min x f (x)
where f : Rn^ → R^1 is a C^2 convex function that is bounded below. (This last assumption assures that a bounded solution exists.)
We define the mapping
ρa(λ, x) = λ∇f (x) + (1 − λ)(x − a)
Will this homotopy generate a solution curve that terminates in (1, x∗)?
It is sufficient to verify the hypotheses of our three theorems:
Jx = λH(x) + (1 − λ)I ,
where H is the Hessian matrix of f. Now H is positive semi-definite since f is convex, so Jx has rank n for λ ∈ [0, 1), so the complete Jacobian has full-rank, as required.
Note: This is a nice example, and the result means that the continuation method is one approach to solving convex optimization problems, but such problems are generally too easy to require such heavy machinery. Use one of the Newton-type algorithms to solve convex optimization problems.
What we have accomplished
To solve the nonlinear system of equations F(x) = 0 , under the above hypotheses, all we need to do is to follow the solution curve
ρa(λ, x) = λF(x) + (1 − λ)Fa(x) = 0
until λ = 1.
Following the solution curve
We want to follow the solution curve from λ = 0 to λ = 1.
Note first that we don’t need to be too careful when following the curve – we don’t care about the values at any points in between λ = 0 and λ = 1.
Note second that we need to be careful enough that we don’t start walking along any other solution curve, or we will lose our way!
Method 1
Given a function ρa, we set λ = 0 and ˆx = a. Then
ρa(λ, ˆx) = 0.
Until λ = 1,
Increase λ a little bit. Solve ρa(λ, x) = 0 using your favorite algorithm (Newton-like, etc.) with ˆx as a starting guess. Call the solution ˆx.
Upon termination, we have computed ˆx so that ρa(1, ˆx) = 0 , so ˆx solves the problem F(x) = 0.
Issues:
Method 2
Issues of choosing a stepsize and tolerance are troublesome, and we also face them in Unit 5, in solving ordinary differential equations (ode’s). Can we use our ode machinery to solve the homotopy problem?
Let’s differentiate our curve. We could differentiate with respect to λ, but then the derivative would fail to exist at any turning point and we might get in trouble.
Instead, we’ll introduce a new independent variable and differentiate with respect to it. The most convenient one is s = arc length.
Now ρa(λ(s), x(s)) = 0 ,
so d ds ρa(λ(s), x(s)) = 0
and we have the intial conditions
λ(0) = 0 , x(0) = a.
When the solution curve reaches λ(s) = 1, we are finished; the resulting x(s) solves our nonlinear system.
For uniqueness, we normalize so that ∥ ∥ ∥ ∥(^
dλ ds
dx ds
What does this system of ode’s look like when Fa(x) = x − a?
dλ ds
F(x) + λJ(x)
dx ds
dλ ds
(x − a) + (1 − λ)
dx ds
dλ ds
∑^ n
i=
dxi ds
The system is implicit.
In principle, we could just plug the system into our favorite ode solver, and this is good for prototyping algorithms, but in practice the ode solver should be tailored to this problem: