









Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
The problem of finding functions in problems like interpolation and regression. It considers similar problems involving filling in function values and solving more complex design problems. The theories of ordinary differential equations (ODEs) and partial differential equations (PDEs) study the case where we wish to find a function based on information about or relationships between its derivatives. This chapter focuses on the case of an ordinary differential equation and in particular initial value problems.
Typology: Lecture notes
1 / 15
This page cannot be seen from the preview
Don't miss anything!










We motivated the problem of interpolation in Chapter 11 by transitioning from analzying to finding functions. That is, in problems like interpolation and regression, the unknown is a function f , and the job of the algorithm is to fill in missing data.
We continue this discussion by considering similar problems involving filling in function val- ues. Here, our unknown continues to be a function f , but rather than simply guessing missing values we would like to solve more complex design problems. For example, consider the follow- ing problems:
In each of these cases, our unknown is a function f , but our criteria for success is more involved than “matches a given set of data points.”
The theories of ordinary differential equations (ODEs) and partial differential equations (PDEs) study the case where we wish to find a function f (~x) based on information about or relationships between its derivatives. Notice we already have solved a simple version of this problem in our discussion of quadrature: Given f ′(t), methods for quadrature provide ways of approximating f (t) using integration. In this chapter, we will consider the case of an ordinary differential equations and in particular initial value problems. Here, the unknown is a function f (t) : R → R n, and we given an equation satisfied by f and its derivatives as well as f ( 0 ); our goal is to predict f (t) for t > 0. We will provide several examples of ODEs appearing in the computer science literature and then will proceed to describe common solution techniques.
As a note, we will use the notation f ′^ to denote the derivative d f/dt of f : [0, ∞) → R n. Our goal will be to find f (t) given relationships between t, f (t), f ′(t), f ′′(t), and so on.
ODEs appear in nearly any part of scientific example, and it is not difficult to encounter practical situations requiring their solution. For instance, the basic laws of physical motion are given by an ODE:
Example 13.1 (Newton’s Second Law). Continuing from §5.1.2, recall that Newton’s Second Law of Motion states that F = ma, that is, the total force on an object is equal to its mass times its acceleration. If we simulate n particles simultaneously, then we can think of combining all their positions into a vector ~x ∈ R^3 n. Similarly, we can write a function ~F(t, ~x, ~x′) ∈ R^3 n^ taking the time, positions of the particles, and their velocities and returning the total force on each particle. This function can take into account interrelationships between particles (e.g. gravitational forces or springs), external effects like wind resistance (which depends on ~x′), external forces varying with time t, and so on. Then, to find the positions of all the particles as functions of time, we wish to solve the equation ~x′′^ = ~F(t, ~x, ~x′)/m. We usually are given the positions and velocities of all the particles at time t = 0 as a starting condition.
Example 13.2 (Protein folding). On a smaller scale, the equations governing motions of molecules also are ordinary differential equations. One particularly challenging case is that of protein folding, in which the geometry structure of a protein is predicted by simulating intermolecular forces over time. These forces take many often nonlinear forms that continue to challenge researchers in computational biology.
Example 13.3 (Gradient descent). Suppose we are wishing to minimize an energy function E(~x) over all ~x. We learned in Chapter 8 that −∇E(~x) points in the direction E decreases the most at a given ~x, so we did line search along that direction from ~x to minimize E locally. An alternative option popular in certain theories is to solve an ODE of the form ~x′^ = −∇E(~x); in other words, think of ~x as a function of time ~x(t) that is attempting to decrease E by walking downhill. For example, suppose we wish to solve A~x = ~b for symmetric positive definite A. We know from §10.1. that this is equivalent to minimizing E(~x) ≡ 12 ~x>^ A~x −~b>~x + c. Thus, we could attempt to solve the ODE ~x′^ = −∇ f (~x) = ~b − A~x. As t → ∞, we expect ~x(t) to better and better satisfy the linear system.
Example 13.4 (Crowd simulation). Suppose we are writing video game software requiring realistic sim- ulation of virtual crowds of humans, animals, spaceships, and the like. One strategy for generating plausible motion, illustrated in Figure NUMBER, is to use differential equations. Here, the velocity of a member of the crowd is determined as a function of its environment; for example, in human crowds the proximity of other humans, distance to obstacles, and so on can affect the direction a given agent is moving. These rules can be simple, but in the aggregate their interaction is complex. Stable integrators for differential equations underlie this machinery, since we do not wish to have noticeably unrealistic or unphysical behavior.
A full treatment of the theory of ordinary differential equations is outside the scope of our discus- sion, and we refer the reader to CITE for more details. This aside, we mention some highlights here that will be relevant to our development in future sections.
Here, we denote gi(t) : R → R n^ to contain n components of g. Then, g 1 (t) satisfies the original ODE. To see so, we just check that our equation above implies g 2 (t) = g′ 1 (t), g 3 (t) = g′ 2 (t) = g 1 ′′ (t), and so on. Thus, making these substitutions shows that the final row encodes the original ODE. The trick above will simplify our notation, but some care should be taken to understand that this approach does not trivialize computations. In particular, in may cases our function f (t) will only have a single output, but the ODE will be in several derivatives. We replace this case with one derivative and several outputs.
Example 13.5 (ODE expansion). Suppose we wish to solve y′′′^ = 3 y′′^ − 2 y′^ + y where y(t) : R → R. This equation is equivalent to:
d dt
y z w
y z w
Just as our trick above allows us to consider only first-order ODEs, we can restrict our notation even more to autonomous ODEs. These equations are of the form f ′(t) = F[ f (t)], that is, F no longer depends on t. To do so, we could define
g(t) ≡
f (t) g ¯(t)
Then, we can solve the following ODE for g instead:
g′(t) =
f ′(t) g ¯′(t)
F[ f (t), ¯g(t)] 1
In particular, ¯g(t) = t assuming we take ¯g( 0 ) = 0. It is possible to visualize the behavior of ODEs in many ways, illustrated in Figure NUMBER. For instance, if the unknown f (t) is a function of a single variable, then we can think of F[ f (t)] as providing the slope of f (t), as shown in Figure NUMBER. Alternatively, if f (t) has output in R^2 , we no longer can visualize the dependence on time t, but we can draw phase space, which shows the tangent of f (t) at each (x, y) ∈ R^2.
Before we can proceed to discretizations of the initial value problem, we should briefly acknowl- edge that not all differential equation problems are solvable. Furthermore, some differential equa- tions admit multiple solutions.
Example 13.6 (Unsolvable ODE). Consider the equation y′^ = 2 y/t, with y( 0 ) 6 = 0 given; notice we are not dividing by zero because y( 0 ) is prescribed. Rewriting as
1 y
dy dt
t
and integrating with respect to t on both sides shows:
ln |y| = 2 ln t + c
Or equivalently, y = Ct^2 for some C ∈ R. Notice that y( 0 ) = 0 in this expression, contradicting our initial conditions. Thus, this ODE has no solution with the given initial conditions.
Example 13.7 (Nonunique solutions). Now, consider the same ODE with y( 0 ) = 0. Consider y(t) given by y(t) = Ct^2 for any C ∈ R. Then, y′(t) = 2 Ct. Thus,
2 y t
2 Ct^2 t
= 2 Ct = y′(t),
showing that the ODE is solved by this function regardless of C. Thus, solutions of this problem are nonunique.
Thankfully, there is a rich theory characterizing behavior and stability of solutions to differen- tial equations. Our development in the next chapter will have a stronger set of conditions needed for existence of a solution, but in fact under weak conditions on f it is possible to show that an ODE f ′(t) = F[ f (t)] has a solution. For instance, one such theorem guarantees local existence of a solution:
Theorem 13.1 (Local existence and uniqueness). Suppose F is continuous and Lipschitz, that is, ‖F[~y] − F[~x]‖ 2 ≤ L‖~y − ~x‖ 2 for some L. Then, the ODE f ′(t) = F[ f (t)] admits exactly one solution for all t ≥ 0 regardless of initial conditions.
In our subsequent development, we will assume that the ODE we are attempting to solve satisfies the conditions of such a theorem; this assumption is fairly realistic in that at least locally there would have to be fairly degenerate behavior to break such weak assumptions.
One way to gain intuition for the behavior of ODEs is to examine behavior of solutions to some simple model equations that can be solved in closed form. These equations represent linearizations of more practical equations, and thus locally they model the type of behavior we can expect. We start with ODEs in a single variable. Given our simplifications in §13.2.1, the simplest equation we might expect to work with would be y′^ = F[y], where y(t) : R → R. Taking a linear approximation would yield equations of type y′^ = ay + b. Substituting ¯y ≡ y + b/a shows: y¯′^ = y′^ = ay + b = a( y¯ − b/a) + b = a y¯. Thus, in our model equations the constant b simply induces a shift, and for our phenomenological study in this section we can assume b = 0. By the argument above, we locally can understand behavior of y′^ = F[y] by studying the linear equation y′^ = ay. In fact, applying standard arguments from calculus shows that
y(t) = Ceat.
Obviously, there are three cases, illustrated in Figure NUMBER:
Solving this relationship for ~yk+ 1 shows
~yk+ 1 = ~yk + hF[~yk] + O(h^2 ) ≈ ~yk + hF[~yk].
Thus, the forward Euler scheme applies the formula on the right to estimate ~yk+ 1. It is one of the most efficient strategies for time-stepping, since it simply evaluates F and adds a multiple of the result to ~yk. For this reason, we call it an explicit method, that is, there is an explicit formula for ~yk+ 1 in terms of ~yk and F. Analyzing the accuracy of this method is fairly straightforward. Notice that our approximation of ~yk+ 1 is O(h^2 ), so each step induces quadratic error. We call this error localized truncation error because it is the error induced by a single step; the word “truncation” refers to the fact that we truncated a Taylor series to obtain this formula. Of course, our iterate ~yk already may be inaccurate thanks to accumulated truncation errors from previous iterations. If we integrate from t 0 to t with O(^1 /h) steps, then our total error looks like O(h); this estimate represents global truncation error, and thus we usually write that the forward Euler scheme is “first-order accurate.” The stability of this method requires somewhat more consideration. In our discussion, we will work out the stability of methods in the one-variable case y′^ = ay, with the intuition that similar statements carry over to multidimensional equations by replacing a with the spectral radius. In this case, we know yk+ 1 = yk + ahyk = ( 1 + ah)yk. In other words, yk = ( 1 + ah)ky 0. Thus, the integrator is stable when | 1 + ah| ≤ 1, since otherwise |yk| → ∞ exponentially. Assuming a < 0 (otherwise the problem is ill-posed), we can simplify:
| 1 + ah| ≤ 1 ⇐⇒ − 1 ≤ 1 + ah ≤ 1 ⇐⇒ − 2 ≤ ah ≤ 0
⇐⇒ 0 ≤ h ≤
|a| , since a < 0
Thus, forward Euler admits a time step restriction for stability given by our final condition on h. In other words, the output of forward Euler can explode even when y′^ = ay is stable if h is not small enough. Figure NUMBER illustrates what happens when this condition is obeyed or violated. In multiple dimensions, we can replace this restriction with an analogous one using the spectral radius of A. For nonlinear ODEs this formula gives a guide for stability at least locally in time; globally h may have to be adjusted if the Jacobian of F becomes worse conditioned.
Similarly, we could have applied the backward differencing scheme at ~yk+ 1 to design an ODE inte- grator: F[~yk+ 1 ] = ~y′(t) = ~yk+ 1 − ~yk h
Thus, we solve the following potentially nonlinear system of equations for ~yk+ 1 :
~yk = ~yk+ 1 − hF[~yk+ 1 ].
Because we have to solve this equation for ~yk+ 1 , backward Euler is an implicit integrator.
This method is first-order accurate like forward Euler by an identical proof. The stability of this method, however, contrasts considerably with forward Euler. Once again considering the model equation y′^ = ay, we write:
yk = yk+ 1 − hayk+ 1 =⇒ yk+ 1 = yk 1 − ha
Paralleling our previous argument, backward Euler is stable under the following condition: 1 | 1 − ha|
≤ 1 ⇐⇒ | 1 − ha| ≥ 1
⇐⇒ 1 − ha ≤ −1 or 1 − ha ≥ 1
⇐⇒ h ≤
a or h ≥ 0, for a < 0
Obviously we always take h ≥ 0, so backward Euler is unconditionally stable. Of course, even if backward Euler is stable it is not necessarily accurate. If h is too large, ~yk will approach zero far too quickly. When simulating cloth and other physical materials that require lots of high-frequency detail to be realistic, backward Euler may not be an effective choice. Furthermore, we have to invert F[·] to solve for ~yk+ 1. Example 13.8 (Backward Euler). Suppose we wish to solve ~y′^ = A~y for A ∈ R n×n. Then, to find ~yk+ 1 we solve the following system:
~yk = ~yk+ 1 − hA~yk+ 1 =⇒ ~yk+ 1 = (In×n − hA)−^1 ~yk.
Suppose ~yk is known at time tk and ~yk+ 1 represents the value at time tk+ 1 = tk + h. Suppose we also know ~yk+ (^1) / 2 halfway in between these two steps. Then, by our derivation of the centered differencing we know: ~yk+ 1 = ~yk + hF[~yk+ (^1) / 2 ] + O(h^3 ) From our derivation of the trapezoidal rule: F[~yk+ 1 ] + F[~yk] 2
= F[~yk+ (^1) / 2 ] + O(h^2 )
Substituting this relationship yields our first second-order integration scheme, the trapezoid method for integrating ODEs: ~yk+ 1 = ~yk + h F[~yk+ 1 ] + F[~yk] 2 Like backward Euler, this method is implicit since we must solve this equation for ~yk+ 1. Once again carrying out stability analysis on y′^ = ay, we find in this case time steps of the trapezoidal method solve yk+ 1 = yk +
ha(yk+ 1 + yk)
In other words,
yk =
1 + 12 ha 1 − 12 ha
)k y 0.
If we study stability behavior of Heun’s method for y′^ = ay for a < 0, we know:
yk+ 1 = yk +
h 2 (ayk + a(yk + hayk))
=
h 2 a( 2 + ha)
yk
1 + ha +
h^2 a^2
yk
Thus, the method is stable when
− 1 ≤ 1 + ha +
h^2 a^2 ≤ 1
⇐⇒ − 4 ≤ 2 ha + h^2 a^2 ≤ 0
The inequality on the right shows h ≤ (^) |^2 a| , and the one on the left is always true for h > 0 and
a < 0, so the stability condition is h ≤ (^) |^2 a|. Heun’s method is an example of a Runge-Kutta method derived by applying quadrature meth- ods to the integral above and substituting Euler steps into F[·]. Forward Euler is a first-order ac- curate Runge-Kutta method, and Heun’s method is second-order. A popular fourth-order Runge- Kutta method (abbreviated “RK4”) is given by:
~yk+ 1 = ~yk + h 6 (~k 1 + 2 ~k 2 + 2 ~k 3 +~k 4 )
where~k 1 = F (^) [~yk]
~k 2 = F
~yk +
h~k 1
~k 3 = F
~yk +
h~k 2
~k 4 = F
~yk + h~k 3
This method can be derived by applying Simpson’s rule for quadrature. Runge-Kutta methods are popular because they are explicit and thus easy to evaluate while providing high degrees of accuracy. The cost of this accuracy, however, is that F[·] must be eval- uated more times. Furthermore, Runge-Kutta strategies can be extended to implicit methods that can solve stiff equations.
One class of integrators that achieves strong accuracy when F[·] is approximately linear is to use our solution to the model equation explicitly. In particular, if we were solving the ODE ~y′^ = A~y, using eigenvectors of A (or any other method) we could find an explicit solution ~y(t) as explained in §13.2.3. We usually write ~yk+ 1 = eAh~yk, where eAh^ encodes our exponentiation of the eigenvalues (in fact we can find a matrix eAh^ from this expression that solves the ODE to time h). Now, if we write ~y′^ = A~y + G[~y],
where G is a nonlinear but small function, we can achieve fairly high accuracy by integrating the A part explicitly and then approximating the nonlinear G part separately. For example, the first-order exponential integrator applies forward Euler to the nonlinear G term:
~yk+ 1 = eAh~yk − A−^1 ( 1 − eAh)G[~yk]
Analysis revealing the advantages of this method is more complex than what we have written, but intuitively it is clear that these methods will behave particularly well when G is small.
The transformations in §13.2.1 enabled us to simplify notation in the previous section considerably by reducing all explicit ODEs to the form ~y′^ = F[~y]. In fact, while all explicit ODEs can be written this way, it is not clear that they always should. In particular, when we reduced k-th order ODEs to first-order ODEs, we introduced a number of variables representing the first through k − 1-st derivatives of the desired output. In fact, in our final solution we only care about the zeroth derivative, that is, the function itself, so orders of accuracy on the temporary variables are less important. From this perspective, consider the Taylor series
~y(tk + h) = ~y(tk) + h~y′(tk) +
h^2 2 ~y′′(tk) + O(h^3 )
If we only know ~y′^ up to O(h^2 ), this does not affect our approximation, since ~y′^ gets multiplied by h. Similarly, if we only know ~y′′^ up to O(h), this approximation will not affect the Taylor series terms above because it will get multiplied by h^2 / 2. Thus, we now consider “multivalue” meth- ods, designed to integrate ~y(k)(t) = F[t,~y′(t),~y′′(t),... ,~y(k−^1 )(t)] with different-order accuracy for different derivatives of the function ~y. Given the importance of Newton’s second law F = ma, we will restrict to the case ~y′′^ = F[t,~y,~y′]; many extensions exist for the less common k-th order case. We introduce a “velocity” vector ~v(t) = ~y′(t) and an “acceleration” vector~a. By our previous reduction, we wish to solve the following first-order system:
~y′(t) = ~v(t) ~v′(t) = ~a(t) ~a(t) = F[t,~y(t),~v(t)]
Our goal is to derive an integrator specifically tailored to this system.
We begin by deriving the famous class of Newmark integrators.^1 Denote ~yk, ~vk, and ~ak as the position, velocity, and acceleration vectors at time tk; our goal is to advance to time tk+ 1 ≡ tk + h.
(^1) We follow the development in http://www.stanford.edu/group/frg/course_work/AA242B/CA-AA242B-Ch7. pdf.
Thus, we can use our earlier relationship to show:
~yk+ 1 = ~yk + h~vk +
∫ (^) tk+ 1
tk
(tk+ 1 − t)~a(t) dt from before
= ~yk + h~vk +
− β
h^2 ~ak + β h^2 ~ak+ 1 + O(h^2 )
Here, we use β instead of γ (and absorb a factor of two in the process) because the γ we choose for approximating ~yk+ 1 does not have to be the same as the one we choose for approximating ~vk+ 1. After all this integration, we have derived the class of Newmark schemes, with two input parameters γ and β , which has first-order accuracy by the proof above:
~yk+ 1 = ~yk + h~vk +
− β
h^2 ~ak + β h^2 ~ak+ 1
~vk+ 1 = ~vk + ( 1 − γ )h~ak + γ h~ak+ 1 ~ak = F[tk,~yk,~vk]
Different choices of β and γ lead to different schemes. For instance, consider the following exam- ples:
~yk+ 1 = ~yk + h~vk +
h^2 ~ak ~vk+ 1 = ~vk + h~ak
This integrator is explicit and holds exactly when the acceleration is a constant function.
~yk+ 1 = ~yk + h~vk +
h^2 ~ak+ 1 ~vk+ 1 = ~vk + h~ak+ 1
Here, the velocity is stepped implicitly using backward Euler, giving first-order accuracy. The update of ~y, however, can be written
~yk+ 1 = ~yk +
h(~vk + ~vk+ 1 ),
showing that it locally is updated by the midpoint rule; this is our first example of a scheme where the velocity and position updates have different orders of accuracy. Even so, it is possible to show that this technique, however, is globally first-order accurate in ~y.
~xk+ 1 = ~xk +
h(~vk + ~vk+ 1 )
~vk+ 1 = ~vk +
h(~ak +~ak+ 1 )
~xk+ 1 = ~xk + h~vk +
h^2 ~ak
~vk+ 1 = ~vk +
h(~ak +~ak+ 1 )
The method earns its name because simplifying the equations above leads to the alternative form:
~vk+ 1 = ~yk+ 2 − ~yk 2 h ~ak+ 1 = ~yk+ 1 − 2 ~yk+ 1 + ~yk h^2
It is possible to show that Newmark’s methods are unconditionally stable when 4 β > 2 γ > 1 and that second-order accuracy occurs exactly when γ = 1 / 2 (CHECK).
A different way to achieve second-order accuracy in ~y is to use centered differences about the time tk+ (^1) / 2 ≡ tk + h/ 2 : ~yk+ 1 = ~yk + h~vk+ (^1) / 2
Rather than use Taylor arguments to try to move ~vk+ (^1) / 2 , we can simply store velocities ~v at half points on the grid of time steps. Then, we can use a similar update to step forward the velocities:
~vk+ (^3) / 2 = ~vk+ (^1) / 2 + h~ak+ 1.
Notice that this update actually is second-order accurate for ~x as well, since if we substitute our expressions for ~vk+ (^1) / 2 and ~vk+ (^3) / 2 we can write:
~ak+ 1 =
h^2
(~yk+ 2 − 2 ~yk+ 1 + ~yk)
Finally, a simple approximation suffices for the acceleration term since it is a higher-order term:
~ak+ 1 = F
tk+ 1 , ~xk+ 1 ,
(~vk+ (^1) / 2 + ~vk+ (^3) / 2 )
This expression can be substituted into the expression for ~vk+ (^3) / 2. When F[·] has no dependence on ~v, the method is fully explicit:
~yk+ 1 = ~yk + h~vk+ (^1) / 2 ~ak+ 1 = F[tk+ 1 ,~yk+ 1 ] ~vk+ (^3) / 2 = ~vk+ (^1) / 2 + h~ak+ 1
This is known as the leapfrog method of integration, thanks to the staggered grid of times and the fact that each midpoint is used to update the next velocity or position.