


















Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
The stability analysis of boundary value problems (BVPs) of ordinary differential equations (ODEs) and compares it to initial value problems (IVPs). It covers the sensitivity of solutions to changes in the ODE function and the use of shooting methods to find solutions. The document also touches upon finite difference methods for solving BVPs.
Typology: Exams
1 / 26
This page cannot be seen from the preview
Don't miss anything!



















Related reading: Ascher and Petzold, Chapter 6 (a good discussion of stability ) and Chapter 7 (which includes details on multiple shooting and setting up Newton’s method for these problems).
An ODE boundary value problem consists of an ODE in some interval [a, b] and a set of ‘boundary conditions’ involving the data at both endpoints.
After converting to a first order system, any BVP can be written as a system of m-equations for a solution y(x) : R → Rm^ satisfying
dy dx
= F (x, y), x ∈ [a, b]
with boundary conditions B(y(a), y(b)) = ~ 0.
As with ODE IVPs, this ‘standard’ form is typically used for numerical computation (e.g. Matlab’s bvp4c requires the ODE written this way), although there are often simplifications that can be made given some structure.
For example, the two-point boundary value problem (i.e. a BVP with a second order ODE and two boundary conditions)
y′′^ = 4y − y′, x ∈ [a, b], y(a) = c, y′(b) = d
can be written in this form, where w = (y 1 , y 2 ) with w 1 = y and w 2 = y′^ and
w′ 1 = w 2 w′ 2 = 4w 1 − w 2
B(y(a), y(b)) =
w 1 (a) − c w 2 (b) − d
General boundary conditions can be difficult to deal with. The most common (and simplest) type are linear boundary conditions, where
Bay(a) + Bby(b) = c, Ba, Bb = m × m matrices.
The boundary conditions are separated if they can be written as
Bay(a) = ca, Bby(b) = cb
e.g. for the example above (where Ba = [ 1 00 0 ] and Bb = [ 0 00 1 ]).
For an IVP, the effect of the initial condition is carried forward by the ODE from the initial point. We can follow this data and take steps one at a time, which is natural and straightforward.
In contrast, a BVP is much harder to solve because the ODE carries information from the boundaries throughout the interval. We cannot start at one point and step forward, because that would ignore the other boundary data. Thus, more work is required.
The stability of BVPs is a substantial topic; a quick study will help to illustrate the issue. First, suppose we have a linear IVP
y′^ = Ay + p(t), y(0) = y 0
where A is an m × m matrix. If the ODE function is perturbed slightly,
y˜′^ = Ay˜ + ˜p(t), y(0) = 0
then the difference u = ˜y − y evolves according to
u′^ = Au + δ(t), u(0) = 0, δ := ˜p(t) − p(t).
From ODE theory, we know that the eigenvalues of A determine stability in the usual way (stable if Re(λ) < 0). A similar result holds for changes in the initial condition, and a linear stability analysis (as we did before) extends this to the non-linear case.
Now contrast with the BVP case. Consider a linear, separated BVP
y′^ = Ay + p(t), Bay(a) = ca, Bby(b) = cb
and a perturbed problem (assuming the BCs do not change for simplicity)
y˜′^ = Ay˜ + ˜p(t), Ba y˜(a) = ca, Bb ˜y(b) = cb.
Then the difference u = ˜y − y solves
u′^ = Au + δ(x), Bau(a) = Bbu(b) = 0.
is clearly not stable since one component grows exponentially (and this is also true in the other direction). The ‘instability’ of the IVP does not apply to the BVP because the bound- ary stops any unstable growth.
The point: The key practical result here is that the stability of the boundary value problem and the associated initial value problem (in whatever direction) are not necessarily the same. Thus, using ‘IVP’ techniques to solve the boundary value problems can introduce new (artificial) instability [we will see this shortly].
The second point is that the ‘stiffness’ condition for a BVP is different. The linear, constant coefficient BVP y′^ = λy, x ∈ [0, b] ...BCs...
may be considered stiff if b|Re(λ)| 1.
The condition here generalizes as in the IVP case. This means that a solution y = eλx may grow rapidly relative to the interval size (plug in x = b). Because there is no longer a ‘forward’ direction as in an IVP, stiffness does not require Re(λ) < 0.
Stiff BVPs can be challenging to solve because they often involve boundary layers, thin regions of rapid variation near the boundaries (or elsewhere).
One other subtlety is that existence/uniqueness is more complicated for BVPs.
Recall that for an IVP y = F (t, y), y(0) = y 0
there is a unique (local) solution under mild assumptions - namely, that F is continuous and Lipschitz in y. In contrast, a BVP may have: i) A unique solution ii) multiple solutions iii no solution at all
Obviously, a problem with no solution cannot be solved numerically. Case (iii) also has consequences for solvable problems - it means that even if a problem has a solution, nearby problems (which we may encounter in the approximation process) might not.
From a numerical perspective, (ii) is not a serious problem, since numerical methods will naturally select one solution in the computation. We just need to be careful to find the right solution - for instance, by using an initial guess close to the desired solution.
One important example is an eigenvalue problem such as
y′′^ = −λy, y(0) = y(π) = 0
which has solutions y(x) = sin(kx) for λ = k an integer and no solutions otherwise. Note that the eigenvalue problem can be regarded as a system of ODEs; set w 1 = y and w 2 = y′ and add a third trivial variable λ:
w′ 1 = w 2 , w′ 2 = −λw 1 , λ′^ = 0.
This is a first order system of three ODEs for (w 1 , w 2 , λ). The trick of replacing a param- eter with a trivial ODE is a ‘reformulation trick’, which we can use to solve for unknown constants.^2
2 Boundary value problems (shooting, part I)
To start, we consider a typical two-point boundary value problem
y′′^ = f (x, y, y′), x ∈ [a, b], y(a) = c, y(b) = d
for a function y(x). Unlike an initial value problem, there are conditions involving y at both endpoints of the interval, so we cannot just start at x = a and integrate up to x = b.
One approach (to be considered later) is to approximate y(x) by some finite difference or other approximation scheme, then arrive at a system for the discrete values y(xi) that can be solved all at once. The downside is that one needs to typically solve large linear systems that require some effort.
However, there is a cheaper way. Note that y(a) is specified. We can use an IVP solver to compute a solution y(x; s) given initial conditions
y(a; s) = c, y′(a; s) = s
where s is a ‘guess’ at the correct value of y′(a). This solution will hit x = b at some value that is probably not equal to the correct value, y(b) = d. However, we can then adjust the value of s, refining the guess until is is (nearly equal to) the right value.
This method is called shooting - the analogy here is to shooting a ball [insert relevant sport here] at a goal, determining the unknown correct velocity by throwing it too fast/too slow until it hits the goal exactly.
(^2) In particular, it means that if you have a routine to solve a BVP in standard form, you can also solve
BVPs with unknown parameters with the same code by adding trivial equations.
which will hopefully converge to the correct value s∗^ if the initial value is good enough. Explicitly, the iteration reads
sn+1 = sn −
y(b; sn) − d ∂y ∂s (b;^ sn)^
To summarize, when using Newton’s method, the steps are:
Note that the variational ODE for z(x; s) is solved on the same grid because it must know the IVP solution y(x; s) at the grid points. You could also just put the two systems together and solve them at once.
Also note that shooting really only requires
and any method of each type can be used. The appropriate choice will vary by problem.
The two-point boundary value problem
y′′^ = 2y^3 − 6 y − 2 x^3 , y(1) = 2, y(2) = 5/ 2 (2.1)
has a solution y(x) that happens to be given exactly by
y(x) = x +
x
, x ∈ [1, 2].
Let y(x; s) be the solution to the shooting IVP
y′′^ = 2y^3 − 6 y − 2 x^3 , y(1) = 2, y′(1) = s. (2.2)
The goal function is g(s) = y(2; s) − 5 / 2 (2.3)
so that the solution to the IVP with s such that g(s) = 0 is also a solution to the BVP (2.1).
For computation, set w 1 = y, w 2 = y′^ and z 1 = ∂y/∂s and z 2 = ∂y′/∂s. Then
w′ 1 = w 2 w′ 2 = 2w 13 − 6 w 1 − 2 x^3 z′ 1 = z 2 z′ 2 = (6w^21 − 6)z 1
with initial condition (w 1 , w 2 , z 1 , z 2 ) = (0, s, 0 , 1).
The goal function and its derivative are then
g(s) = w 1 (2; s) − 5 / 2 , g′(s) = w 3 (2; s).
Note that the first two equations can be solved first, and the variational ODE (the last two equations) can be solved after if the same grid is used.
The ODE has the feature that if y′(1) is too large (e.g. y′(1) > 1 /2) then the solution to the IVP (2.2) has a singularity in [1, 2]. Thus, it is necessary to choose a starting value s < 1 /2, and must be close enough to 0 that it never enters this singular region. Or, some other, safer method could be used (e.g. bisection).
It is important to note that the stability of shooting depends on the stability of the IVP, even though (as discussed) the actual stability of the BVP may be different.
In particular, a stable BVP may have a shooting IVP that is very stiff or otherwise ill- behaved. That is, by using shooting we may introduce numerical issues hat are not inherent to the problem, which is a significant disadvantage.
Moreover, note that the IVPs must be solved for guesses s that are not correct, so we may need to solve nearby problems that are worse than the true solution - e.g. if the IVP is singular for certain values of s.
There are two cheap ways to ‘fix’ shooting if the IVP is not well behaved.
First, if the IVP blows up only in one direction, we can try to shoot from the other side. [The application is left as an exercise].
A stronger technique is multiple shooting. The idea is that if the IVP wants to blow up after some distance < b for a BVP in the interval [0, b], we can ‘restart’ the solve by guessing values inside the interval and then connect the pieces. For the two-point problem
y′′^ = f (x, y, y′), y(a) = c, y(b) = d
3 Non-scalar problems
Consider a boundary value problem
y′^ = F (x, y), x ∈ [a, b], B(y(a), y(b)) = 0
with a first order system of m equations. The shooting IVP is
y′^ = F (x, y), x ∈ [a, b], y(a) = s
which has a solution y(x; s). Note that the guess is, at worst, a vector in Rm. If some components at x = a are known, we can reduce the dimension (e.g. if y 1 (a) = 0).
The goal function is then
G(s) = B(y(a; s), y(b; s)), G : Rm^ → Rm
again noting that its dimension may be less if it can be simplified.
We could use a derivative-free method to solve G(s) = 0 (and be done); the details of such methods will not be discussed here. To use Newton’s method, we need the Jacobian of G. Define the matrix-valued function (the variation of the solution with respect to s)
Z(x; s) =
∂y(x; s) ∂s
That is, Z has columns [z 1 , · · · , zm] where
zj =
∂y(x; s) ∂sj
= variation of y with respect to sj.
By differentiating the IVP, we see that Z solves the variational ODE
Z′^ = J(x, y) Z, Z(a; s) = I (3.1)
where J(x, y) = ∂F/∂y is the Jacobian of F with respect to y. Note that the ODE (3.1) is really a collection of m equations for
zj (x; s) =
∂y(a; s) ∂sj
, j = 1, · · · , m
which is how they would be solve numerically. Because the ODE is linear and the Jacobian is shared by all m equations, all the zj ’s can be computed efficiently at once.
With Z known, we can then calculate the Jacobian of the goal function needed for shooting.
The Van der Pol equation is (in system form)
x′ 1 = x 2 , x′ 2 = μ(1 − x^21 )x 2 − x 1
which describes a non-linear oscillator with negative damping when its amplitude is small.
For certain values of μ (when μ > 0) this system has a limit cycle - a unique periodic solution. This can be formulated as a boundary value problem. Let the period be T , and suppose that x 1 (0) = 0 (it is not hard to show that the limit cycle must cross the x 2 -axis, so we can pick that crossing point as the t = 0 point as the system is autonomous).
The BVP to solve is
x′ 1 = x 2 x′ 2 = μ(1 − x^21 )x 2 − x 1
x 1 (0) = 0 x 1 (0) = x 1 (T ) x 2 (0) = x 2 (T )
where T is also an unknown. We can convert to a nicer form using two tricks. First, scale to get a fixed domain; set τ = t/T as a new time variable. Second, regard T as an independent variable with T ′^ = 0 to get a first order system for (x 1 , x 2 , T ). The result is
dx 1 dτ
= T x 2 dx 2 dτ
= T (μ(1 − x^21 )x 2 − x 1 ) dT dτ
x 1 (0) = 0 x 1 (0) = x 1 (1) x 2 (0) = x 2 (1)
which is a first-order system in a standard form (with three BCs as expected). The fact that T ′^ = 0 means that the process can be simplified a bit.
To apply shooting to solve this BVP, we solve an IVP of the form
dx dτ
= T F (x), x(0) = (0, s) with a guess (s, T )
noting that the T ′^ = 0 equation doesn’t actually need to be included in the ODE as there is nothing to solve. The goal function is
G(s, T ) =
x 1 (1; s) x 2 (1; s) − s
If Newton’as method is to be used, we need the appropriate variational ODEs. Let
z =
∂x 1 ∂s
, w =
∂x 2 ∂T
4 Finite differences
Suppose, for example, we wish to solve the boundary value problem
y′′^ = f (x, y, y′), x ∈ [0, b] (BCs at x = 0 and x = b)
This problem can be solved by the following direct process:
Discretize the domain: Choose grid points {xn} for the approximation {un}
Discretize the ODE: Approximate derivatives with finite differences in the interior of the domain (where BCs do not apply).
Boundary conditions: Discretize the boundary conditions, and modify (2) as needed.
Solve the system of equations from (2) and (3) by the usual methods.
Unlike shooting, the entire solution is solved for at once, which avoids the potential insta- bilities in the auxiliary IVPs. The details depend, of course, on the choice of grid points (uniform or non-uniform) and the discretization (which formulas to use).
We solve the example problem (4.1) with ‘Dirichlet’ BCs
y(0) = c, y(b) = d (4.2)
using a uniform grid xn = hn, n = 0, 1 , · · · , N, h := b/N.
Since the values of y are known at n = 0 and n = N , our unknowns are (y 1 , · · · , yN − 1 ).
Using the standard centered difference formula for y′′^ and y′^ and replacing y(xn) with un (the approximation to be found) gives
un+1 − 2 un + un− 1 h^2
= f (xn, un,
un+1 − un− 1 2 h
), n = 1, · · · , N − 1 , (4.3)
and applying the boundary conditions gives (trivially)
u 0 = c, uN = d.
This is a system of N + 1 equations for the N + 1 values of {un} on the grid, or alternatively a system of N − 1 equations for the ‘interior’ values (u 1 , · · · , uN − 1 ). The latter is preferable for computation. The result is a system of equations for u = (u 1 , · · · , uN − 1 ),
G(u) = 0, G : RN^ −^1 → RN^ −^1.
The equations are obtained by taking (4.5) and ‘plugging in’ for the boundary values u 0 , uN :
G 1 (u) = u 2 − 2 u 1 + c − h^2 f (x 1 , u 1 ,
u 2 − c 2 h
Gn(u) = un+1 − 2 un + un− 1 − h^2 f (xn, un,
un+1 − un− 1 2 h
), n = 2, · · · , N − 2 ,
GN − 1 (u) = d − 2 uN − 1 + uN − 2 − h^2 f (xN − 1 , uN − 1 ,
d − uN − 2 2 h
This system can then be solved using Newton’s method. Note that the Jacobian J = ∂G/∂u is tridiagonal, i.e. has non-zero entries only on the center diagonal and one above/below; the linear system for Newton steps can then be solved efficiently (each linear solve requires O(N ) operations).
Note that if the BVP is linear then the system has the form Au = ~b, so it can be solved as a linear system. For example, if f = x^2 (so y′′^ = x^2 ) then
, ~b =
−c 0 .. . 0 −d
x^21 x^22 .. . x^2 N − 2 x^2 N − 1
Note that the boundary conditions add to the right hand side - every term that does not depend on u gets moved to the RHS (b) and every term involving u is on the LHS (Au).
The Airy function Ai(x) is a multiple of^5 the unique solution to
y′′^ = xy, y(0) = 1, lim x→∞
y(x) = 0
Because the ODE has another solution Bi(x) that grows rapidly (see homework 6), using shooting can be difficult (it is sensitive to the choice of y′(0), and solutions do not want to go to zero).
We solve the BVP instead, replacing infinity with a large value L. Letting
xn = nh, h = L/N,
we discretize as discussed above to get
un+1 − 2 un + un− 1 = h^2 xnun, n = 1, · · · , N − 1 (^5) to be precise, the Airy function has Ai(0) = 1/(3 2 / (^3) Γ(2/3)).
Similarly, uN +1 = uN − 1 + 2hc. (4.7)
Now we can apply the discretized formula (4.5) for the ODE at x 0 , since x−1 is now an available point, leading to the single formula
un+1 − 2 un + un− 1 h^2
= f (xn, un, u′ n), n = 0, · · · , N
where u− 1 and uN +1 are given by (4.6), (4.7). We now have a system of N + 1 equations for the N + 1 unknowns that can be solved.
Essentially, the boundary conditions were used to pad the domain with ‘ghost points’ that are not part of the solution, but are needed because the discretization extends past the end- points. Depending on the problem, the trick here can be more delicate, but we will omit a detailed discussion of the nuances here.
Now consider the boundary value problem (in first-order form)
y′^ = F (x, y), x ∈ [0, b]
with m equations (so y : R → Rm) with linear boundary conditions
B 0 y(0) + Bby(b) = c.
Consider a grid {xn} with hn = xn+1 − xn and x 0 = 0, xN = b and let xn+1/ 2 denote the midpoint of [xn, xn+1]. We can use the midpoint formula to discretize the ODE - we use a centered difference at each midpoint xn+1/ 2 , which uses the values at xn and xn+1:
un+1 − un hn
= F (xn+1/ 2 ,
un+1 + un 2
), n = 0, · · · , N − 1
noting that y at n + 1/2 has been approximated with an average.^6
The boundary conditions provide one more set of equations, so there are mN equations from the ODE and m from the BCs. This matches the m(N + 1) unknowns, so we have a complete system that can be solved via the usual methods (e.g. Newton).
Note also that the resulting linear systems for Newton’s method will have a nice struc- ture: except for the boundary conditions, the matrix will be banded - only elements a certain distance from the main diagonal are non-zero.
If we let
Jn+1/ 2 =
∂y
(xn+1/ 2 , un+1/ 2 )
(^6) Note that there are other ways the discretization could be done; the version here is used as an example.
and
Vn = −I −
hnJn+1/ 2 , Wn = I −
hnJn+1/ 2 ,
then the Jacobian for G(u) = 0 with
Gn(u) = un+1 − un − F (xn+1/ 2 , un+1/ 2 ), n = 0, · · · , N − 1
and GN (u) = B 0 u 0 + BbuN − c
is given in block form by
B 0 ‘ · · · 0 0 Bb
If the boundary conditions are separated, i.e. B˜ 0 y(0) = B˜by(b) = 0, we can split the boundary equations apart and put the x = 0 ones at the top rows and the x = b ones at the bottom, preserving the band structure:
0 · · · 0 0 B˜b
Because it is a ‘standard’ first order system, the method can be used on a variety of problems (including the second-order BVP of the previous section). The primary advantage is that the first order system allows us to discretize only first-order derivatives, which only take two points (xn and xn+1). This small stencil makes using a non-uniform grid simple, because the discretized ODE at each point xn+1/ 2 depends only only one of the grid spacings (xn+1 − xn).
The ‘centered’ difference is not always the natural way to discretize. As an illustrative example, consider
y′^ =
−λ 0 0 λ
y, y 1 (0) = 1, y 2 (3) = 1
Observe that:
5 The linear algebra
We will not cover the numerical linear algebra here in detail^7. The focus here is on high- lighting what one needs to know to use the algorithms.
A banded matrix A is a matrix such that
aij = 0 for − m 1 ≤ j − i ≤ m 2
where 0 < m 1 , m 2 and m = m 1 + m 2 + 1 is the ‘bandwidth’, which should be much smaller than the matrix dimensions. That is, A has non-zero diagonals only close to the center diagonal. A common special case that we have seen is a tridiagonal matrix, for which m 1 = m − 2 = 1 and the bandwidth is three:
β 1 γ 1 0 · · · 0 α 2 β 2 γ 2 · · · 0 0
0 · · · αN − 1 βN − 1 γN − 1 0 · · · 0 αN βN
An LU decomposition of a square matrix is a factorization
A = LU, L = lower triangular, U = upper triangular.
This factorization can be used to solve linear systems Ax = b in two steps:
(Factor) Compute L and U such that A = LU
(Solve) Solve Ly = b,then solve U x = y (with forward/back substitution)
Once (1) is done, step (2) can be done for many RHS’s b without re-factoring the matrix, so step (1) is often an initial ‘processing’ of the matrix to prepare it for solving linear systems.
In practice: When solving systems with LU decomposition, you should make use of a specialized solver. This means, typically, calling code for (1) (e,g, L, U = lu(A)) and then calling code for (2) to solve the system (e.g. x = solve(L,U,b)).
The LU factorization algorithm is straightforward (not described here), and the second step is trivial. If A is N × N , then in general the factorization takes O(N 3 ) steps and the second part takes O(N 2 ) steps. This means that in general, solving linear systems is expensive and scales poorly as the size of the matrix increases.
(^7) See, for instance, Numerical Recipes for a good practical source or Golub and Van Loan’s Matrix com-
putations for details and theory
However, if A is banded then the solution process is much more efficient. The result:
(Banded LU) Let A be a banded matrix with m 1 lower and m 2 upper bands and bandwidth m = m 1 + m 2 + 1. Then it can be factored as A = LU where
Computing A = LU takes O(m^2 N ) operations, as does solving Ax = b.
Because A only contains at most mN elements, we only need to store it in a ‘compressed’ form as an N × m array of diagonals. For example,
β 1 γ 1 0 · · · 0 α 2 β 2 γ 2 · · · 0 0
0 · · · αN − 1 βN − 1 γN − 1 0 · · · 0 αN βN
0 β 1 γ 1 α 2 β 2 γ 2 .. .
αN βN 0
Important note (padding): Because off-diagonals have less entries than the main diago- nal, there will be some unused entries in the array. A choice must be made on how to arrange the data (‘padding’ the array with zeros). The padded zeros are shown in red above, using the convention that the j-th row of A is the j-th row of the compressed array^8. As another example,
, m^1 = 1, m^2 = 2,^ =⇒^ A^ (array) =
Using the algorithms: A banded LU solver has two routines: a routine that computes A = LU given A in compress form, and a routine that solves Ax = b given L, U (also compressed) and b (see example code for typical usage). To solve Ax = b:
A brief note about other methods: If the matrix is sparse (has mostly non-zero en- tries, e.g. an N × N matrix with O(N ) entries) but not banded, then other techniques must
(^8) Matlab uses the opposite convention for square matrices where columns line up; see the documentation
of spdiags for details.