























Econometrica, Vol. 70, No. 2 (March, 2002), 583–
By Paul Milgrom and Ilya Segal^1
The standard envelope theorems apply to choice sets with convex and topological structure, providing sufficient conditions for the value function to be differentiable in a parameter and characterizing its derivative. This paper studies optimization with arbitrary choice sets and shows that the traditional envelope formula holds at any differentiability point of the value function. We also provide conditions for the value function to be, variously, absolutely continuous, left- and right-differentiable, or fully differentiable. These results are applied to mechanism design, convex programming, continuous optimization problems, saddle-point problems, problems with parameterized constraints, and optimal stopping problems.
Keywords: Envelope theorem, differentiable value function, sensitivity analysis, math programming, mechanism design.
1. Introduction
Traditional “envelope theorems” do two things: describe sufficient conditions for the value of a parameterized optimization problem to be differentiable in the parameter and provide a formula for the derivative. Economists initially used envelope theorems for concave optimization problems in demand theory. The theorems were used to analyze the effects of changing prices, incomes, and technology on the welfare and profits of consumers and firms. With households and firms choosing quantities of consumer goods and inputs, the choice sets had both the convex and topological structure required by the early envelope theorems.
In recent years, results that may be regarded as extensions of envelope theorems have frequently been used to study incentive constraints in contract theory and game theory,^2 to examine nonconvex production problems,^3 and to develop the theory of “monotone” or “robust” comparative statics.^4 The choice sets and objective functions in these applications generally lack the topological and convexity properties required by the traditional envelope theorems. At the same time, the analysis of these applications does not always require full differentiability of the value function everywhere. For example, contract theory considers
(^1) The second author is grateful to Michael Whinston, collaboration with whom inspired some of the ideas developed in this paper. We also thank the National Science Foundation for financial support, Federico Echenique and Luis Rayo for excellent research assistance, and Vincent Crawford, Ales Filipi, Peter Hammond, John Roberts, Chris Shannon, Steve Tadelis, Lixin Ye, and the referees for their comments and suggestions. (^2) There are many such examples, beginning with Mirrlees (1971). (^3) For example, see Milgrom and Roberts (1988). (^4) See Milgrom and Shannon (1994) and Athey, Milgrom, and Roberts (2000).
584 p. milgrom and i. segal
incentive mechanisms with arbitrary message spaces and arbitrary outcome functions. While an agent’s value function in such a mechanism need not be a differentiable function of his type, it can nevertheless be represented as an integral of the partial derivative of the agent’s payoff function with respect to his type. This representation constitutes an important step in the analysis of optimal contracts.
While some progress has been made in extending traditional envelope theorems to be useful in such modern applications, none has been general enough to encompass them all.^5 The core contributions of this paper are envelope theorems for maximization problems with arbitrary choice sets, in which such properties of the objective function as differentiability, concavity, or continuity in the choice variable cannot be utilized. First we show that the traditional envelope formula holds at any differentiability point of the value function. Then we provide a sufficient condition for the value function to be absolutely continuous. This condition ensures that the value function is differentiable almost everywhere and can be represented as an integral of its derivative. We also provide a sufficient condition for the value function to have right- and left-hand directional derivatives everywhere and characterize those derivatives. When the two directional derivatives are equal, the function is differentiable.
Associated with the new envelope theorems is a new intuition, distinct from the one offered in leading graduate economics textbooks.^6 In our approach, the choice set has no structure and is used merely as a set of indices to identify elements of a family of functions on the set $[0, 1]$ of possible parameter values. Figure 1 illustrates this approach for the case of a finite choice set $X = \{x_1, x_2, x_3\}$. The value function $V(t) = \max_{x \in X} f(x, t)$ is the “upper envelope” of the functions $f(x, t)$.
The figure illustrates several of its general properties when the choice set is finite and the objective function $f$ is continuously differentiable in the parameter $t$. First, the value function is differentiable almost everywhere and has directional derivatives everywhere. Its right-hand derivative at parameter value $t$ is everywhere equal to the largest of the partial derivatives $f_t(x, t)$ on the set of optimal choices at $t$, while the left-hand derivative is everywhere equal to the smallest of the partial derivatives. Consequently, $V$ is differentiable at $t$ if and only if the derivative is constant on the set of optimal choices. This occurs wherever the maximum is unique but, as the Figure shows, it can also happen at other points.
Our general envelope theorems, stated and proved in Section 2, expand upon this example. In Section 3, we explore several applications, utilizing the additional structure available in these applications. The first application is to problems of mechanism design. The second is to maximization problems that are concave in
(^5) The mathematical literature on “sensitivity analysis” has formulated several generalized Envelope Theorems—see Bonnans and Shapiro (2000, Section 4.3) for a recent survey. These results by and large rely on topological assumptions on the choice set and continuity of the objective function in the choice variable. We compare these results to ours in Section 3. (^6) See, for example, Mas-Colell, Whinston, and Green (1995) and Simon and Blume (1994).
$V$ is right-hand differentiable at $t$, then $V'(t+) \ge f_t(x^*, t)$. If $t \in (0, 1)$ and $V$ is differentiable at $t$, then $V'(t) = f_t(x^*, t)$.
Proof: Using (1) and (2), we see that for any $t' \in [0, 1]$,
$$f(x^*, t') - f(x^*, t) \le V(t') - V(t).$$
Taking $t' \in (t, 1)$, dividing both sides by $t' - t > 0$, and taking their limits as $t' \to t+$ yields $f_t(x^*, t) \le V'(t+)$ if the latter derivative exists. Taking instead $t' \in (0, t)$, dividing both sides by $t - t' > 0$, and taking their limits as $t' \to t-$ yields $f_t(x^*, t) \ge V'(t-)$ if the latter derivative exists. When $V$ is differentiable at $t \in (0, 1)$, we must have $V'(t) = V'(t-) = V'(t+) = f_t(x^*, t)$. Q.E.D.
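As a numerical illustration of Theorem 1, the following Python sketch uses a hypothetical three-element choice set (the functions are assumptions chosen for illustration, not taken from the paper). It compares one-sided difference quotients of $V$ with the partial derivatives $f_t(x^*, t)$, both at a point where the maximizer is unique and at a kink where two choices tie.

```python
# Hypothetical finite choice set X = {0, 1, 2}; each entry is (f(x, .), f_t(x, .)).
FAMILY = {
    0: (lambda t: 0.5 - t, lambda t: -1.0),
    1: (lambda t: 0.2, lambda t: 0.0),
    2: (lambda t: t * t, lambda t: 2.0 * t),
}


def V(t):
    """Value function: upper envelope of the family at parameter t."""
    return max(f(t) for f, _ in FAMILY.values())


def optimal_set(t, tol=1e-12):
    """Choices attaining V(t), up to a small numerical tolerance."""
    v = V(t)
    return [x for x, (f, _) in FAMILY.items() if f(t) >= v - tol]


def one_sided_quotient(t, h):
    """(V(t + h) - V(t)) / h; h > 0 approximates V'(t+), h < 0 approximates V'(t-)."""
    return (V(t + h) - V(t)) / h


def check_theorem_1(t, h=1e-6, slack=1e-6):
    """True if V'(t-) <= f_t(x*, t) <= V'(t+) holds numerically for every optimal x*."""
    right = one_sided_quotient(t, h)
    left = one_sided_quotient(t, -h)
    return all(
        left - slack <= FAMILY[x][1](t) <= right + slack
        for x in optimal_set(t)
    )


if __name__ == "__main__":
    for t in (0.3, 0.7):
        print(t, optimal_set(t), check_theorem_1(t))
```

In this sketch the two choices 0 and 1 tie at $t = 0.3$, so $V$ has a kink there and only the two inequalities of the theorem apply; at $t = 0.7$ the maximizer is unique and the difference quotients agree with $f_t$ up to discretization error.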
Theorem 1 is only useful when the value function $V$ is sufficiently well-behaved: for example, differentiable, directionally differentiable, or absolutely continuous. In the remainder of this section, we identify sufficient conditions for the value function to have these properties. These conditions do not exploit any structure of the choice set $X$, but treat it as merely a set of indices identifying elements of the family of functions $\{f(x, \cdot)\}_{x \in X}$ on the set $[0, 1]$ of possible parameter values. The conditions for the value function to be well behaved will involve certain properties that the functions $\{f(x, \cdot)\}_{x \in X}$ must satisfy uniformly.^9
In particular, the following result offers a sufficient condition for the value function to be absolutely continuous. In this case, the value function is differentiable almost everywhere and can be represented as an integral of its derivative:
Theorem 2: Suppose that $f(x, \cdot)$ is absolutely continuous for all $x \in X$. Suppose also that there exists an integrable function $b: [0, 1] \to \mathbb{R}_+$ such that $|f_t(x, t)| \le b(t)$ for all $x \in X$ and almost all $t \in [0, 1]$. Then $V$ is absolutely continuous. Suppose, in addition, that $f(x, \cdot)$ is differentiable for all $x \in X$, and that $X^*(t) \ne \emptyset$ almost everywhere on $[0, 1]$. Then for any selection $x^*(t) \in X^*(t)$,
$$V(t) = V(0) + \int_0^t f_t(x^*(s), s)\,ds. \tag{3}$$
Proof: Using (1), observe that for any $t', t'' \in [0, 1]$ with $t' < t''$,
$$|V(t'') - V(t')| \le \sup_{x \in X} |f(x, t'') - f(x, t')| = \sup_{x \in X} \left| \int_{t'}^{t''} f_t(x, t)\,dt \right| \le \int_{t'}^{t''} \sup_{x \in X} |f_t(x, t)|\,dt \le \int_{t'}^{t''} b(t)\,dt.$$
This implies that $V$ is absolutely continuous. Therefore, $V$ is differentiable almost everywhere, and $V(t) = V(0) + \int_0^t V'(s)\,ds$. If $f(x, t)$ is differentiable in $t$, then $V'(s)$ is given by Theorem 1 wherever it exists, and we obtain (3). Q.E.D.
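The integral representation (3) can be checked numerically on a small case. The sketch below assumes a hypothetical finite family (not from the paper) for which $|f_t| \le 2$ on $[0, 1]$, so the constant $b(t) = 2$ serves as the integrable bound, and compares $V(t) - V(0)$ with a midpoint-rule approximation of $\int_0^t f_t(x^*(s), s)\,ds$.

```python
# Hypothetical finite family indexed by X = (0, 1, 2); not taken from the paper.
X = (0, 1, 2)


def f(x, t):
    """Objective f(x, t), absolutely continuous and differentiable in t for each x."""
    return (0.5 - t, 0.2, t * t)[x]


def f_t(x, t):
    """Partial derivative of f in t; |f_t| <= 2 on [0, 1]."""
    return (-1.0, 0.0, 2.0 * t)[x]


def V(t):
    """Value function V(t) = max over x of f(x, t)."""
    return max(f(x, t) for x in X)


def selection(t):
    """One optimal choice x*(t); ties are broken in favor of the smallest index."""
    return max(X, key=lambda x: f(x, t))


def envelope_integral(t, n=20000):
    """Midpoint-rule approximation of the integral of f_t(x*(s), s) over [0, t]."""
    h = t / n
    mids = ((i + 0.5) * h for i in range(n))
    return h * sum(f_t(selection(s), s) for s in mids)


if __name__ == "__main__":
    for t in (0.25, 0.4, 1.0):
        print(t, V(t) - V(0.0), envelope_integral(t))
```

The selection is discontinuous in $t$ (it jumps when the optimal index changes), yet the two printed columns agree up to quadrature error, as the theorem predicts for this example.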
(^9) Mathematical concepts and results used in this paper can be found in Aliprantis and Border (1994), Royden (1988), Rockafellar (1970), and Apostol (1969).
envelope theorems 587
The integral representation (3) plays a key role in mechanism design (see Section 3). The role of the integrable bound in Theorem 2 is illustrated with the following example:
Example 1: Let $X = (0, +\infty)$ and $f(x, t) = g(t/x)$, where $g(z)$ is a differentiable function that achieves a unique maximum at $z = 1$, and $B \equiv \sup_{z \in (0, +\infty)} |z g'(z)| < +\infty$. (For example, $g(z) = z e^{-z}$ satisfies these conditions.) Observe that $\sup_{x \in X} |f_t(x, t)| = \sup_{x \in X} \left| \frac{1}{t} \cdot \frac{t}{x}\, g'(t/x) \right| = B/t$, which is not integrable on $[0, 1]$. By inspection, for all $t > 0$, $X^*(t) = \{t\}$, and $V(t) = g(1) > V(0) = g(0)$. Note that for any $t \in (0, 1]$, $f_t(x^*(t), t) = g'(1)/t = 0 = V'(t)$, illustrating Theorem 1. However, the conclusion of Theorem 2 does not hold, for $V$ is discontinuous at $t = 0$. It follows that the integrable bound assumed in Theorem 2 is not dispensable.
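A short Python sketch of Example 1 with $g(z) = z e^{-z}$ follows. It evaluates the objective at the optimal choice $x = t$, confirms the jump of the value function at $t = 0$, and estimates the constant $\sup_z |z g'(z)|$ on a grid. The grid, its range, and the name `B_GRID` are illustrative assumptions, and the absolute value in the supremum is assumed to match the bound in Theorem 2.

```python
import math


def g(z):
    """g(z) = z * exp(-z), maximized uniquely at z = 1."""
    return z * math.exp(-z)


def g_prime(z):
    """Derivative of g: (1 - z) * exp(-z)."""
    return (1.0 - z) * math.exp(-z)


def f(x, t):
    """Objective f(x, t) = g(t / x) for x in (0, +inf)."""
    return g(t / x)


def f_t(x, t):
    """Partial derivative of f in t: g'(t / x) / x."""
    return g_prime(t / x) / x


# Grid estimate of sup over z of |z g'(z)| (grid and range are assumptions).
B_GRID = max(abs(z * g_prime(z)) for z in (k / 1000.0 for k in range(1, 20001)))


def bound_integral(eps):
    """Integral of B / t over [eps, 1]: B * log(1 / eps), which diverges as eps -> 0."""
    return B_GRID * math.log(1.0 / eps)


if __name__ == "__main__":
    for t in (1e-9, 1e-3, 1.0):
        # At x = t the objective equals g(1) and the envelope derivative is zero.
        print(t, f(t, t), f_t(t, t))
    print("V(0) =", f(1.0, 0.0), " B estimate =", B_GRID)
    print("integral of B/t on [1e-6, 1] and [1e-12, 1]:", bound_integral(1e-6), bound_integral(1e-12))
```

Because $f_t(x^*(s), s) = 0$ for every $s > 0$, the right-hand side of (3) would be $V(0) = 0$, while $V(t) = g(1) > 0$ for $t > 0$; the sketch makes that gap visible.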
The assumptions of Theorem 2 do not ensure that the value function is differentiable everywhere, as the example depicted in Figure 1 makes clear. However, in the example the value function is right- and left-differentiable everywhere. This observation can be extended from finite to arbitrary choice sets, provided that the family of objective functions satisfies the following property:
Definition: The family of functions $\{f(x, \cdot)\}_{x \in X}$ is equidifferentiable at $t \in [0, 1]$ if $[f(x, t') - f(x, t)]/(t' - t)$ converges uniformly as $t' \to t$.
When the set $X$ is infinite, uniform convergence on $X$ is stronger than pointwise convergence, hence equidifferentiability is stronger than differentiability. A simple sufficient condition for the equidifferentiability of $\{f(x, \cdot)\}_{x \in X}$ is provided by the equicontinuity of $\{f_t(x, \cdot)\}_{x \in X}$ everywhere. Indeed, in this case the Mean Value Theorem allows us to write $[f(x, t') - f(x, t)]/(t' - t) = f_t(x, s)$ for some $s$ between $t$ and $t'$, and the equicontinuity condition implies that this expression converges uniformly to $f_t(x, t)$ as $t' \to t$.
Theorem 3: Suppose that the family of functions $\{f(x, \cdot)\}_{x \in X}$ is equidifferentiable at $t_0 \in [0, 1]$, that $\sup_{x \in X} |f_t(x, t_0)| < +\infty$, and that $X^*(t) \ne \emptyset$ for all $t$. Then $V$ is left- and right-hand differentiable at $t_0$. For any selection $x^*(t) \in X^*(t)$, the directional derivatives are
$$\begin{aligned} V'(t_0+) &= \lim_{t \to t_0+} f_t(x^*(t), t_0) \quad \text{for } t_0 < 1, \\ V'(t_0-) &= \lim_{t \to t_0-} f_t(x^*(t), t_0) \quad \text{for } t_0 > 0. \end{aligned} \tag{4}$$
$V$ is differentiable at $t_0 \in (0, 1)$ if and only if $f_t(x^*(t), t_0)$ is continuous in $t$ at $t = t_0$.
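The limits in Theorem 3 can be approximated directly by evaluating $f_t(x^*(t), t_0)$ at parameter values just to the right and left of $t_0$. The sketch below does this for a hypothetical finite family (an assumption for illustration; a finite family of differentiable functions is equidifferentiable, since uniform convergence over a finite set is pointwise convergence) and compares the result with difference quotients of $V$.

```python
# Hypothetical finite choice set; not taken from the paper.
X = (0, 1, 2)


def f(x, t):
    """Objective f(x, t) for the illustrative family."""
    return (0.5 - t, 0.2, t * t)[x]


def f_t(x, t):
    """Partial derivative of f in t."""
    return (-1.0, 0.0, 2.0 * t)[x]


def V(t):
    """Value function V(t) = max over x of f(x, t)."""
    return max(f(x, t) for x in X)


def selection(t):
    """One optimal choice x*(t) at parameter t."""
    return max(X, key=lambda x: f(x, t))


def directional_derivatives(t0, eps=1e-6):
    """Approximate the one-sided limits of f_t(x*(t), t0) as t -> t0- and t -> t0+."""
    left = f_t(selection(t0 - eps), t0)
    right = f_t(selection(t0 + eps), t0)
    return left, right


def difference_quotients(t0, h=1e-6):
    """Left and right difference quotients of V at t0, for comparison."""
    left = (V(t0 - h) - V(t0)) / (-h)
    right = (V(t0 + h) - V(t0)) / h
    return left, right


if __name__ == "__main__":
    for t0 in (0.3, 0.2 ** 0.5, 0.7):
        print(t0, directional_derivatives(t0), difference_quotients(t0))
```

At the two kinks of this family ($t_0 = 0.3$ and $t_0 = \sqrt{0.2}$) the left and right values differ, while at $t_0 = 0.7$ they coincide, matching the differentiability criterion stated in the theorem.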
It is easy to see that $V(t) = t \sin \log t$. Observe that $f(x, t)$ is differentiable in $t$ for all $x$, with $|f_t(x, t)| \le 2$ for all $(x, t)$. (In particular, the assumptions of Theorem 2 are satisfied.) However, $\{f(x, \cdot)\}_{x \in X}$ is not equidifferentiable at $t_0 = 0$:
$$\sup_{x \in X} \left| \frac{f(x, t) - f(x, 0)}{t - 0} - f_t(x, 0) \right| = \sin \log t + 1 \nrightarrow 0 \quad \text{as } t \to 0.$$
Observe that $V$ does not have a right-hand derivative at $t_0 = 0$, since
$$\limsup_{t \to 0+} V(t)/t = 1 \ne \liminf_{t \to 0+} V(t)/t = -1.$$
Therefore, we cannot dispense with the assumption of equidifferentiability in Theorem 3.
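The failure of right-hand differentiability at $t_0 = 0$ is easy to see numerically. The sketch below takes $V(t) = t \sin \log t$ for $t > 0$ and assumes $V(0) = 0$ (implied by the difference quotient $V(t)/t$ used in the text), then evaluates the quotient along two sequences converging to zero on which $\sin \log t$ equals $+1$ and $-1$ respectively.

```python
import math


def V(t):
    """Value function of the example: t * sin(log t) for t > 0, with V(0) = 0 assumed."""
    return t * math.sin(math.log(t)) if t > 0 else 0.0


def quotient(t):
    """Right-hand difference quotient (V(t) - V(0)) / (t - 0) at t0 = 0."""
    return (V(t) - V(0.0)) / t


def peaks_and_troughs(k_max=5):
    """Quotients along two sequences t -> 0+ where sin(log t) is +1 and -1."""
    highs = [quotient(math.exp(math.pi / 2 - 2 * math.pi * k)) for k in range(1, k_max + 1)]
    lows = [quotient(math.exp(-math.pi / 2 - 2 * math.pi * k)) for k in range(1, k_max + 1)]
    return highs, lows


if __name__ == "__main__":
    highs, lows = peaks_and_troughs()
    print("quotients near +1:", highs)
    print("quotients near -1:", lows)
```

Since the quotient keeps returning to both $+1$ and $-1$ arbitrarily close to zero, it has no limit, which is the content of the displayed limsup and liminf comparison.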
In conclusion of this section, observe that Theorems 2 and 3 can be applied when their assumptions hold only on the reduced choice set $X^*([0, 1]) = \bigcup_{s \in [0, 1]} X^*(s)$. Indeed, replacement of $X$ with $X^*([0, 1])$ will not affect the value function $V$ or the optimal choice correspondence $X^*$.
3. Applications
In this section we demonstrate how the general results outlined above can be applied to several important economic settings. The additional structure available in these settings can be utilized to verify the assumptions of Theorems 1–3, as well as to strengthen their conclusions.
3.1 Mechanism Design
Consider an agent whose utility function $f(x, t)$ over outcomes $x \in Y$ depends on his type $t \in [0, 1]$. The agent is offered a mechanism, described by a message set $M$ and an outcome function $h: M \to Y$. The mechanism induces the menu $X = \{h(m) : m \in M\} \subset Y$, i.e., the set of outcomes that are accessible to the agent. The agent’s equilibrium utility $V(t)$ in the mechanism is then given by (1), and the set $X^*(t)$ of the mechanism’s equilibrium outcomes is given by (2). Any selection $x^*(t) \in X^*(t)$ is a choice rule implemented by the mechanism. For this setting, Theorem 2 immediately implies the following corollary.
Corollary 1: Suppose that the agent’s utility function $f(x, t)$ is differentiable and absolutely continuous in $t$ for all $x \in Y$, and that $\sup_{x \in Y} |f_t(x, t)|$ is integrable on $[0, 1]$.^10 Then the agent’s equilibrium utility $V$ in any mechanism implementing a given choice rule $x^*$ must satisfy the integral condition (3).
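To make Corollary 1 concrete, the following sketch assumes a hypothetical menu of three (quantity, price) outcomes and a linear utility $f(x, t) = t q - p$; neither is taken from the paper. The implemented choice rule is a step function of the type, yet the agent's equilibrium utility still satisfies the integral condition (3), here with $f_t(x, t) = q$.

```python
# Hypothetical menu of outcomes x = (q, p): quantity q at price p.
MENU = ((0.0, 0.0), (0.5, 0.1), (1.0, 0.4))


def f(x, t):
    """Assumed linear utility of type t from outcome x = (q, p): t * q - p."""
    q, p = x
    return t * q - p


def f_t(x, t):
    """Partial derivative of utility in type: the quantity q."""
    return x[0]


def choice_rule(t):
    """An implemented choice rule x*(t): an optimal menu item for type t."""
    return max(MENU, key=lambda x: f(x, t))


def V(t):
    """Equilibrium utility of type t."""
    return f(choice_rule(t), t)


def integral_condition_gap(t, n=20000):
    """(V(t) - V(0)) minus the midpoint-rule integral of f_t(x*(s), s) over [0, t]."""
    h = t / n
    integral = h * sum(f_t(choice_rule((i + 0.5) * h), (i + 0.5) * h) for i in range(n))
    return (V(t) - V(0.0)) - integral


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0):
        print(t, choice_rule(t), V(t), integral_condition_gap(t))
```

In this menu the chosen quantity jumps at the types 0.2 and 0.6, so the choice rule is discontinuous, while the printed gaps stay at the level of quadrature error.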
(^10) The last assumption can be relaxed in some commonly studied mechanism design settings. For example, suppose that an outcome can be described as $x = (z, w)$, where $w \in \mathbb{R}$ is the monetary transfer to the agent and $z \in Z \subset \mathbb{R}$ is a nonmonetary decision. Suppose furthermore that the agent’s utility function takes the quasilinear form $f(z, w, t) = g(z, t) + w$, and that $g$ has strictly increasing differences in $(z, t)$ (equivalently, $f$ has the Spence-Mirrlees single-crossing property). Then
Deducing condition (3) is a key step in the analysis of mechanism design problems with continuous type spaces. Mirrlees (1971), Laffont and Maskin (1980), Fudenberg and Tirole (1991), and Williams (1999) derived and exploited this condition by restricting attention to (piecewise) continuously differentiable choice rules. This is not fully satisfactory, because a mechanism designer may find it optimal to implement a choice rule that is not piecewise continuously differentiable. For example, in the trade setting with linear utility (see, e.g., Myerson (1991, Section 6.5)), both the profit-maximizing and total surplus-maximizing choice rules are usually discontinuous.^11 At the same time, the integral condition (3) still holds in this setting and implies such important results as the Revenue Equivalence Theorem for auctions and the Myerson-Satterthwaite inefficiency theorem.
It should be noted that Corollary 1 can be applied to multidimensional type spaces as well. For example, suppose that the agent’s type space is $\Theta \subset \mathbb{R}^k$ and his utility function is $g: X \times \Theta \to \mathbb{R}$. Suppose that $\Theta$ is smoothly connected, that is, any two points $a, b \in \Theta$ are connected by a path described by a continuously differentiable function $\tau: [0, 1] \to \Theta$ such that $\tau(0) = a$ and $\tau(1) = b$. If $g$ is differentiable in $\theta \in \Theta$ and the gradient $g_\theta(x, \theta)$ is bounded on $X \times \Theta$, then the function $f(x, t) = g(x, \tau(t))$ satisfies the assumptions of Corollary 1. The Corollary then implies that if $V: \Theta \to \mathbb{R}$ is the agent’s value function in a mechanism implementing the choice rule $x^*: \Theta \to X$, then $V(b) - V(a)$ equals the path integral of the gradient $g_\theta(x^*(\theta), \theta)$ along the path connecting $a$ and $b$. Since this result holds for any smooth path in $\Theta$, $V$ is a potential function for the vector field $g_\theta(x^*(\theta), \theta)$, and is therefore determined by this field up to a constant (see, e.g., Apostol (1969)).^12
In addition to the integral representation (3), it is sometimes of interest to know that the agent’s equilibrium utility $V$ is differentiable. For example, suppose that, as in Segal and Whinston (2002), the agent chooses his type $t$, interpreted as investment, before participating in the mechanism.^13 Suppose the agent maxi-
the Monotone Selection Theorem (Milgrom and Shannon (1994)) implies that for any selection $x^*(t) = (z^*(t), w^*(t)) \in X^*(t)$, $z^*(t)$ is nondecreasing in $t$. Furthermore, under strictly increasing differences, $g_t(z, t)$ is nondecreasing in $z$, and therefore $f_t(x^*(s), t) = g_t(z^*(s), t) \in [g_t(z^*(0), t), g_t(z^*(1), t)]$ for all $s$. Therefore, $f_t(x, t)$ is uniformly bounded on $(x, t) \in X^*([0, 1]) \times [0, 1]$. This allows us to apply Theorem 2 on the reduced choice set $X^*([0, 1])$ and obtain the integral representation (3).
(^11) Myerson (1981) proves condition (3) utilizing the special structure of the linear setting. However, his proof does not readily generalize to other settings. While monotonicity of implementable decision rules is typically used to show that the value function is differentiable almost everywhere, this by itself does not imply that it equals the integral of the derivative. For example, it does not rule out the possibility that the value function is discontinuous. Even establishing continuity of the value function would not suffice: a counterexample is provided by the Cantor ternary function (see, e.g., Royden (1988)). Thus, establishing absolute continuity of the value function is an indispensable step for deriving (3).
(^12) Krishna and Maenner (2001) derive this result independently, but under unnecessary restrictions on the agent’s payoffs or the mechanism itself (their Hypotheses I and II).
(^13) Any cost of this investment is included in $f$.
is some $x^* \in X^*(t_0)$ such that $f_t(x^*, t_0)$ exists. Then $V$ is differentiable at $t_0$ and $V'(t_0) = f_t(x^*, t_0)$.
Proof: Take $t', t'', \lambda \in [0, 1]$. By the convexity of $X$ and the concavity of $f$, for any $x', x'' \in X$ we can write
$$f(\lambda x' + (1 - \lambda) x'', \lambda t' + (1 - \lambda) t'') \ge \lambda f(x', t') + (1 - \lambda) f(x'', t'').$$
Taking the supremum of both sides over $x', x'' \in X$, and using the convexity of $X$, we obtain $V(\lambda t' + (1 - \lambda) t'') \ge \lambda V(t') + (1 - \lambda) V(t'')$, and therefore $V$ is concave. This implies that $V$ is directionally differentiable at each $t \in (0, 1)$ and $V'(t-) \ge V'(t+)$ (see, e.g., Rockafellar (1970)). On the other hand, by Theorem 1, $V'(t_0-) \le f_t(x^*, t_0) \le V'(t_0+)$. Q.E.D.
The Benveniste and Scheinkman theorem established the differentiability of the value function in a class of infinite-horizon consumption problems with a parameterized initial endowment. In their setting, $X$ is the set of technologically feasible consumption paths, and the objective function is the consumer’s intertemporal utility, e.g., $f(x, t) = u(x_0 + t) + \sum_{s=1}^{\infty} \delta^s u(x_s)$.^16
3.3 Continuous Objective Functions on Compact Choice Sets
If $X$ is a nonempty compact space and $f(x, t)$ is upper semicontinuous in $x$, then $X^*(t)$ is nonempty. If, in addition, $f_t(x, t)$ is continuous in $(x, t)$, then all the assumptions of Theorems 2 and 3 are satisfied. Furthermore, in this case we can simplify the expressions for the directional derivatives of $V$ and the characterization of the differentiability points of $V$. These results can be summarized as follows:
Corollary 4: Suppose that $X$ is a nonempty compact space, $f(x, t)$ is upper semicontinuous in $x$, and $f_t(x, t)$ is continuous in $(x, t)$. Then
(i) $V$ is absolutely continuous and the integral representation (3) holds.
(ii) $V'(t+) = \max_{x \in X^*(t)} f_t(x, t)$ for any $t \in [0, 1)$ and $V'(t-) = \min_{x \in X^*(t)} f_t(x, t)$ for any $t \in (0, 1]$.
(iii) $V$ is differentiable at a given $t \in (0, 1)$ if and only if $\{f_t(x, t) : x \in X^*(t)\}$ is a singleton, and in that case $V'(t) = f_t(x, t)$ for all $x \in X^*(t)$.
Proof: The continuous function $f_t(x, t)$ is bounded on $X \times [0, 1]$, so the “integrable bound” condition of Theorem 2 is satisfied. Furthermore, since $f(x, t)$ is upper semicontinuous in $x$, $X^*(t)$ is a nonempty compact set for all $t$. Also, the absolute continuity of $f(x, t)$ in $t$ is implied by its continuous differentiability in $t$. Therefore, all assumptions of Theorem 2 are satisfied, which establishes part (i).
(^16) If, in addition to the technological constraints embodied in $X$, there is a constraint on feasible consumption $x_0 + t$ in the first period (e.g., $x_0 + t \ge 0$), then the present analysis applies on neighborhoods in the parameter set where the consumption constraint is nonbinding.
Next, the continuity of $f_t$ and the compactness of $X$ imply that the family of functions $\{f_t(x, \cdot)\}_{x \in X}$ is equicontinuous. As noted in Section 2, this implies that $\{f(x, \cdot)\}_{x \in X}$ is equidifferentiable at any $t$. Since $f_t$ is also bounded on $X \times [0, 1]$, all assumptions of Theorem 3 are satisfied. Therefore, $V$ has directional derivatives, which are given by (4).
Take $t_0 \in [0, 1)$. Berge’s Maximum Theorem (see, e.g., Aliprantis and Border (1994)) and the continuity of $f_t$ imply that for any selection $x^*(t) \in X^*(t)$, $\lim_{t \to t_0+} f_t(x^*(t), t_0) \le \max_{x \in X^*(t_0)} f_t(x, t_0)$. Combining with (4), we see that $V'(t_0+) \le \max_{x \in X^*(t_0)} f_t(x, t_0)$. Since Theorem 1 implies the reverse inequality, this establishes the first part of (ii). The second part is established similarly. Part (iii) follows immediately. Q.E.D.
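Corollary 4(ii) can be illustrated with a small linear program whose maximizer jumps between vertices. The sketch below is a made-up example (the objective, the simplex, and the grid resolution are assumptions): it maximizes $f(x, t) = t x_1 + (1 - t) x_2$ over a grid on the unit simplex, so that $V(t) = \max(t, 1 - t)$, and compares the max and min of $f_t = x_1 - x_2$ over the optimal set with one-sided difference quotients of $V$.

```python
# Grid on the simplex X = {(x1, x2): x1, x2 >= 0, x1 + x2 <= 1}; the LP is a made-up example.
N = 20
X = [(i / N, j / N) for i in range(N + 1) for j in range(N + 1 - i)]


def f(x, t):
    """Linear objective f(x, t) = t * x1 + (1 - t) * x2."""
    return t * x[0] + (1.0 - t) * x[1]


def f_t(x, t):
    """Partial derivative of f in t: x1 - x2."""
    return x[0] - x[1]


def V(t):
    """Value function; the grid contains the vertices, so this equals max(t, 1 - t)."""
    return max(f(x, t) for x in X)


def optimal_set(t, tol=1e-12):
    """Grid points attaining V(t), up to a small numerical tolerance."""
    v = V(t)
    return [x for x in X if f(x, t) >= v - tol]


def corollary_4_derivatives(t):
    """(V'(t-), V'(t+)) as predicted by Corollary 4(ii): min and max of f_t over X*(t)."""
    slopes = [f_t(x, t) for x in optimal_set(t)]
    return min(slopes), max(slopes)


if __name__ == "__main__":
    h = 1e-6
    for t in (0.5, 0.8):
        numeric = ((V(t - h) - V(t)) / (-h), (V(t + h) - V(t)) / h)
        print(t, corollary_4_derivatives(t), numeric)
```

At $t = 0.5$ the whole edge $x_1 + x_2 = 1$ is optimal and any selection of maximizers is discontinuous there, but the value function remains well behaved with left and right derivatives $-1$ and $+1$.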
A version of this result was first obtained by Danskin (1967). In the economic literature, the result was rediscovered by Kim (1993) and Sah and Zhao (1998).
Corollary 4 makes it clear that, contrary to the conventional wisdom in the economic literature, good behavior of the value function does not rely on good behavior of maximizers. For example, consider a bounded linear programming problem in a Euclidean space. At a parameter value at which there are multiple maximizers, any selection of maximizers is typically discontinuous in the parameter. Nevertheless, Corollary 4(i) establishes that the value function is absolutely continuous.
As another example, suppose that $X$ is a convex compact set in a Euclidean space described by a collection of inequality constraints, and that the objective function is strictly concave in $x$. Then the optimal choice is unique, and therefore by Corollary 4(iii) the value function is differentiable everywhere, even at parameter values where the maximizer is not differentiable (e.g., where the set of binding constraints changes). While the traditional envelope theorem derived from first-order conditions (see, e.g., Simon and Blume (1994)) cannot be used at such points, Corollary 4(iii) establishes that the envelope formula must still hold.
To understand the role of compactness in parts (ii) and (iii) of Corollary 4, consider the following example:
Example 3: Let $X = \{0\} \cup (\tfrac{1}{2}, 1]$, and
$$f(x, t) = \begin{cases} -(t - x)^2 & \text{for } x \in (\tfrac{1}{2}, 1], \\ \tfrac{1}{2} - t & \text{for } x = 0. \end{cases}$$
With the Euclidean topology on $X$, the example satisfies all the assumptions of Corollary 4 except for compactness of $X$.^17 Note that $X^*(t)$ is a singleton for all $t$: in particular, for $t \le \tfrac{1}{2}$, $X^*(t) = \{0\}$ and $V(t) = \tfrac{1}{2} - t$, while for $t > \tfrac{1}{2}$, $X^*(t) = \{t\}$ and $V(t) = 0$. Nevertheless, $V$ is not differentiable at $t = \tfrac{1}{2}$, and its right-hand derivative at this point does not satisfy the formula in Corollary 4(ii).
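The failure in Example 3 can be reproduced in a few lines. The sketch below encodes the value function in closed form (the supremum over the half-open interval is computed from the closest point of its closure, an implementation shortcut rather than anything in the paper) and compares the one-sided difference quotients at $t = \tfrac{1}{2}$ with the value $f_t(0, \tfrac{1}{2}) = -1$ that the formula in Corollary 4(ii) would predict from the unique maximizer $x = 0$.

```python
def sup_over_interval(t):
    """Supremum of -(t - x)^2 over x in (1/2, 1]; it is not attained when t <= 1/2."""
    x_closest = min(max(t, 0.5), 1.0)  # closest point of the closure [1/2, 1] to t
    return -(t - x_closest) ** 2


def V(t):
    """Value function on X = {0} union (1/2, 1]: the better of x = 0 and the interval."""
    return max(0.5 - t, sup_over_interval(t))


def f_t_at_zero(t):
    """Partial derivative in t of f(0, t) = 1/2 - t."""
    return -1.0


def one_sided_quotient(t, h):
    """(V(t + h) - V(t)) / h; h > 0 approximates V'(t+), h < 0 approximates V'(t-)."""
    return (V(t + h) - V(t)) / h


if __name__ == "__main__":
    t0 = 0.5
    print("left quotient: ", one_sided_quotient(t0, -1e-6))
    print("right quotient:", one_sided_quotient(t0, 1e-6))
    print("f_t at the unique maximizer x = 0:", f_t_at_zero(t0))
```

The left quotient matches $-1$, but the right quotient is $0$: just above $t = \tfrac{1}{2}$ the optimum switches to points of the interval that are not limits of maximizers inside $X$ at $t = \tfrac{1}{2}$, which is where compactness would have been needed.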
(^17) By changing the topology on $X$, the same example can be construed as one in which $X$ is compact but the continuity assumptions of Theorem 3 are violated.
Proof: The absolute continuity of $V(t) = \sup_{x \in X} \inf_{y \in Y} f(x, y, t)$ obtains by double application of the absolute continuity result of Theorem 2. Therefore, $V$ is differentiable almost everywhere and $V(t) = V(0) + \int_0^t V'(s)\,ds$.
Now, consider the graph of the saddle-point selection: $G \equiv \{(t, x^*(t), y^*(t)) : t \in [0, 1]\} \subset [0, 1] \times X \times Y$. Since the product topological space $[0, 1] \times X \times Y$ satisfies the second axiom of countability by our assumptions, the set of isolated points of $G$ is at most countable. Therefore, the set $S$ of points $t \in [0, 1]$ such that $(t, x^*(t), y^*(t))$ is not isolated in $G$ and $V'(t)$ exists has full measure on $[0, 1]$.
Take any point $t_0 \in S$ and let $(x_0, y_0) = (x^*(t_0), y^*(t_0))$. Since $(t_0, x_0, y_0)$ is not isolated in $G$, there exists a sequence $\{(t_k, x_k, y_k)\}_{k=1}^{\infty} \subset G$ such that $(t_k, x_k, y_k) \to (t_0, x_0, y_0)$ as $k \to \infty$ and $t_k \ne t_0$ for all $k$. Furthermore, the sequence can be chosen so that $t_k - t_0$ has a constant sign, and for definiteness let it be positive. By the definition of a saddle point, we can write
$$\frac{f(x_0, y_k, t_k) - f(x_0, y_k, t_0)}{t_k - t_0} \le \frac{V(t_k) - V(t_0)}{t_k - t_0} \le \frac{f(x_k, y_0, t_k) - f(x_k, y_0, t_0)}{t_k - t_0}.$$
Using equidifferentiability of $\{f(x, y, \cdot)\}_{(x, y) \in X \times Y}$, this implies
$$f_t(x_0, y_k, t_0) + \frac{o(t_k - t_0)}{t_k - t_0} \le \frac{V(t_k) - V(t_0)}{t_k - t_0} \le f_t(x_k, y_0, t_0) + \frac{o(t_k - t_0)}{t_k - t_0}.$$
As $k \to \infty$, by the continuity of $f_t(x, y, t)$ in $x$ and in $y$, both bounds converge to $f_t(x_0, y_0, t_0)$. Therefore, we must have $V'(t_0) = f_t(x^*(t_0), y^*(t_0), t_0)$. Since this formula holds for each $t_0$ in the set $S$, which has full measure in $[0, 1]$, we obtain the result. Q.E.D.
Note that in contrast to Theorem 2 for maximization programs, Theorem 4 utilizes topologies on the choice sets $X$, $Y$ and the continuity of $f_t(x, y, t)$ in these topologies. The following example demonstrates that these extra assumptions are indispensable:
Example 4: Let $X = Y = [0, 1]$, and
$$f(x, y, t) = \begin{cases} t - x & \text{if } x \ge y, \\ y - t & \text{otherwise.} \end{cases}$$
It can be verified that for each $t$, the function has a unique saddle point $(x^*(t), y^*(t)) = (t, t)$, and $V(t) = 0$. Note that $V'(t) = 0$, while $f_t(x^*(t), y^*(t), t) = 1$, for all $t$. Thus, the integral representation (5) does not hold. Note that all the assumptions of Theorem 4 but for those involving topologies on $X$ and $Y$ are satisfied. Observe that $f_t(x, y, t)$ is not continuous in $x$ or $y$ in the standard topology on $\mathbb{R}$. The function is trivially continuous in the discrete topology on $X$ and $Y$ (in which all points are isolated), but this topology does not satisfy the second countability axiom.
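The claims of Example 4 can be checked on a grid. The sketch below (grid resolution and helper names are arbitrary choices, not from the paper) verifies the two saddle-point inequalities at $(x, y) = (t, t)$, confirms that the saddle value is zero for several $t$, and computes the partial derivative of $f$ in $t$ at the saddle point by a finite difference with $(x, y)$ held fixed.

```python
def f(x, y, t):
    """Payoff of Example 4: t - x if x >= y, and y - t otherwise."""
    return t - x if x >= y else y - t


GRID = [k / 200.0 for k in range(201)]  # grid on [0, 1]; the resolution is an arbitrary choice


def is_saddle(x0, y0, t):
    """Check on the grid that x0 maximizes f(., y0, t) and y0 minimizes f(x0, ., t)."""
    v = f(x0, y0, t)
    return all(f(x, y0, t) <= v for x in GRID) and all(f(x0, y, t) >= v for y in GRID)


def saddle_value(t):
    """Value at the saddle point (t, t)."""
    return f(t, t, t)


def f_t_at_saddle(t, h=1e-6):
    """Finite-difference derivative of f in t with (x, y) held fixed at (t, t)."""
    return (f(t, t, t + h) - f(t, t, t)) / h


if __name__ == "__main__":
    for t in (0.25, 0.5, 0.75):
        print(t, is_saddle(t, t, t), saddle_value(t), f_t_at_saddle(t))
```

The saddle value is constant at zero, so its derivative is zero, whereas the partial derivative at the saddle point is one for every $t$; integrating the latter therefore cannot recover $V$.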
Under appropriate continuity assumptions, a saddle-point extension of Corollary 4 can also be obtained.^19
Theorem 5: Let $X$ and $Y$ be compact spaces and suppose that $f: X \times Y \times [0, 1] \to \mathbb{R}$ and $f_t: X \times Y \times [0, 1] \to \mathbb{R}$ are continuous functions. Suppose also that $X^*(t) \times Y^*(t) \ne \emptyset$ for all $t \in [0, 1]$. Then $V$ is directionally differentiable, and the directional derivatives are
$$V'(t+) = \max_{x \in X^*(t)} \min_{y \in Y^*(t)} f_t(x, y, t) = \min_{y \in Y^*(t)} \max_{x \in X^*(t)} f_t(x, y, t) \quad \text{for } t < 1,$$
$$V'(t-) = \min_{x \in X^*(t)} \max_{y \in Y^*(t)} f_t(x, y, t) = \max_{y \in Y^*(t)} \min_{x \in X^*(t)} f_t(x, y, t) \quad \text{for } t > 0.$$
Proof: Take $t_0 \in [0, 1)$, and a selection $(x^*(t), y^*(t)) \in X^*(t) \times Y^*(t)$. For any $t > t_0$ we can write
$$\frac{f(x^*(t_0), y^*(t), t) - f(x^*(t_0), y^*(t), t_0)}{t - t_0} \le \frac{V(t) - V(t_0)}{t - t_0} \le \frac{f(x^*(t), y^*(t_0), t) - f(x^*(t), y^*(t_0), t_0)}{t - t_0}.$$
Therefore, by the Mean Value Theorem,
$$f_t(x^*(t_0), y^*(t), s'(t)) \le \frac{V(t) - V(t_0)}{t - t_0} \le f_t(x^*(t), y^*(t_0), s''(t))$$
for some $s'(t), s''(t) \in [t_0, t]$. This implies that
$$\max_{x \in X^*(t_0)} f_t(x, y^*(t), s'(t)) \le \frac{V(t) - V(t_0)}{t - t_0} \le \min_{y \in Y^*(t_0)} f_t(x^*(t), y, s''(t)). \tag{6}$$
Berge’s Maximum Theorem implies that $\max_{x \in X^*(t_0)} f_t(x, y, t)$ is continuous in $(y, t)$ and $\min_{y \in Y^*(t_0)} f_t(x, y, t)$ is continuous in $(x, t)$. The theorem also implies that the saddle set correspondence, being the Nash equilibrium correspondence of a zero-sum game, is upper hemicontinuous (see, e.g., Fudenberg and Tirole (1991)). These two observations imply that
$$\lim_{t \to t_0+} \max_{x \in X^*(t_0)} f_t(x, y^*(t), s'(t)) \ge \min_{y \in Y^*(t_0)} \max_{x \in X^*(t_0)} f_t(x, y, t_0),$$
$$\lim_{t \to t_0+} \min_{y \in Y^*(t_0)} f_t(x^*(t), y, s''(t)) \le \max_{x \in X^*(t_0)} \min_{y \in Y^*(t_0)} f_t(x, y, t_0).$$
(^19) For the particular case where $X$ and $Y$ are unit simplexes representing the two players’ mixed strategies in a finite zero-sum game, and hence the payoff $f(x, y, t)$ is bilinear in $(x, y)$, the result has been obtained by Mills (1956).
in $(x, t)$, and there exists $\hat{x} \in X$ such that $g(\hat{x}, t) \gg 0$ for all $t \in [0, 1]$. Then:
(i) $V$ is absolutely continuous, and for any selection $(x^*(t), y^*(t)) \in X^*(t) \times Y^*(t)$,
$$V(t) = V(0) + \int_0^t L_t(x^*(s), y^*(s), s)\,ds.$$
(ii) $V$ is directionally differentiable, and its directional derivatives equal:
$$V'(t+) = \max_{x \in X^*(t)} \min_{y \in Y^*(t)} L_t(x, y, t) = \min_{y \in Y^*(t)} \max_{x \in X^*(t)} L_t(x, y, t) \quad \text{for } t < 1,$$
$$V'(t-) = \min_{x \in X^*(t)} \max_{y \in Y^*(t)} L_t(x, y, t) = \max_{y \in Y^*(t)} \min_{x \in X^*(t)} L_t(x, y, t) \quad \text{for } t > 0.$$
Proof: For all $t \in [0, 1]$, all $y^* \in Y^*(t)$, and each $i = 1, 2, \ldots, k$ we can write
$$V(t) \ge L(\hat{x}, y^*, t) \ge f(\hat{x}, t) + y_i^* g_i(\hat{x}, t),$$
where the first inequality is by the definition of the saddle value, and the second by nonnegativity of Lagrange multipliers. This implies that
$$y_i^* \le \bar{y}_i \equiv \sup_{t \in [0, 1]} \frac{V(t) - f(\hat{x}, t)}{g_i(\hat{x}, t)}.$$
Observe that $\bar{y}_i < +\infty$, since the numerator of the above fraction is bounded, and the denominator is bounded away from zero by the definition of $\hat{x}$ and continuity of $g(\hat{x}, \cdot)$. Therefore, the set $Y = \prod_{i=1}^{k} [0, \bar{y}_i] \subset \mathbb{R}^k$
A version of result (ii) was first obtained by Gol’shtein (1972). Also, note that in the particular case where $k = 1$, $g(x, t) = h(x) + t$, and $f(x, t) = f(x)$, it yields $V'(t+) = \min Y^*(t)$ and $V'(t-) = \max Y^*(t)$. This special case of Corollary 5(ii), which allows the interpretation of a Lagrange multiplier as the “price” of the constraint, is stated in Rockafellar (1970).
3.6 Smooth Pasting in Optimal Stopping Problems
Optimal stopping theory has become a standard tool in economics to model decisions involving “real” or financial options, such as when and whether to exercise an option to buy securities, convert a bond, harvest a crop, adopt a new technology, or terminate a research project (see, e.g., Dixit and Pindyck (1994)). In the usual formulation, the decision maker chooses a stopping time of a continuous time Markov process $z(\tau)$ with state space $\mathcal{Z}$ and paths that are right-continuous. The decision maker’s flow payoff at any time $\tau$ in state $z$ is $\pi(z)$. If at any time the process is stopped in state $z$, the decision maker receives a termination payoff of $\Omega(z)$.
Suppose the decision maker adopts the Markovian policy of terminating whenever the state lies in the closed set $S$. Define $T_S = \inf\{\tau : z(\tau) \in S\}$ to be the first time that the process enters the set $S$. The decision maker’s payoff is
$$\Pi_S = \int_0^{T_S} e^{-\rho s} \pi(z(s))\,ds + e^{-\rho T_S}\, \Omega(z(T_S)),$$
with expected payoff beginning in state $z_0$ of $f(S, z_0) = E[\Pi_S \mid z(0) = z_0]$. The optimal value function is $V(z_0) \equiv \sup_S f(S, z_0)$ and a policy $S^*$ is Markov optimal if for all $z_0$, $V(z_0) \equiv f(S^*, z_0)$.
Corollary 6: Suppose that $\Omega$ and $V$ are differentiable and that a Markov optimal strategy $S^*$ exists. Then, for all $z_0 \in S^*$, $V(z_0) = \Omega(z_0)$ and $V_z(z_0) = \Omega_z(z_0)$.^21
Proof: For any $z_0 \in S^*$, $S = \mathcal{Z}$ (“always stop immediately”) is an optimal policy beginning in $z_0$ and its value is $f(\mathcal{Z}, z_0) \equiv \Omega(z_0)$. Since $\Omega$ and $V$ are differentiable, the conclusion follows from Theorem 1. Q.E.D.
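The step from optimality to equal derivatives can be spelled out as a short derivation. The sketch below writes $\Omega$ for the termination payoff and $\mathcal{Z}$ for the state space; these symbol names are assumptions where the extracted text dropped the original glyphs. It additionally assumes that $z_0$ is an interior point of a one-dimensional state space, which the corollary's statement does not spell out.

```latex
% Notation assumed here: \Omega is the termination payoff, \mathcal{Z} the state space.
\begin{align*}
V(z) &= \sup_{S} f(S, z) \;\ge\; f(\mathcal{Z}, z) = \Omega(z)
  && \text{for every state } z, \\
V(z_0) &= f(S^*, z_0) = \Omega(z_0)
  && \text{for } z_0 \in S^*, \text{ since } T_{S^*} = 0 \text{ when starting in } S^*.
\end{align*}
% Hence D(z) = V(z) - \Omega(z) \ge 0 with D(z_0) = 0, so z_0 minimizes D.
% If z_0 is interior and V, \Omega are differentiable there, the first-order condition gives
\[
D_z(z_0) = 0 \quad\Longleftrightarrow\quad V_z(z_0) = \Omega_z(z_0).
\]
```

This is the same argument as in Theorem 1, with the initial state $z_0$ playing the role of the parameter and the policy "always stop immediately" playing the role of the optimal choice.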
This conclusion is known as “smooth pasting,” because it asserts that $V$ melds smoothly into $\Omega$. Economic models exploiting smooth pasting frequently assume that $z(\tau)$ is a Markov diffusion process satisfying the assumptions of Corollary 6. The conditions that imply differentiability of the function $f(S, z_0)$ in $z_0$ are subtle (see Fleming and Soner (1993)) and frequently depend on properties of both the stochastic process and the payoff functions, but not on the optimality of the stopping set $S$. Given this technical structure, the advantage of the present treatment of smooth pasting is that it separates the issue of the differentiability of the value of Markov policies from the issue of the equality of two derivatives, which under such differentiability follows simply from the optimality of the stopping rule $S^*$.
4. Conclusion
It is common for economic optimization models to include a variety of mathematical assumptions to ease the analysis. These include the assumptions of convexity, differentiability of certain functions, and sign restrictions on the derivatives that are used for comparative statics analysis. It has long been understood that one class of conclusions (those about the existence of prices supporting the optimum) depends only on the assumptions that are invariant to linear transformations of the choice variables, such as convexity. Similarly, as emphasized by Milgrom and Shannon (1994), directional comparative statics conclusions depend only on assumptions that are invariant to order-preserving transformations of the choice variables and the parameter,
(^21) If the process is nonstationary, the corollary still applies with time as a component of the state variable.
Mills, H. D. (1956): “Marginal Values of Matrix Games and Linear Programs,” in Linear Inequalities and Related Systems, Annals of Mathematical Studies, 38, ed. by H. W. Kuhn and A. W. Tucker. Princeton: Princeton University Press, 183–193.
Mirrlees, J. (1971): “An Exploration in the Theory of Optimum Income Taxation,” The Review of Economic Studies, 38, 175–208.
Myerson, R. B. (1981): “Optimal Auction Design,” Mathematics of Operations Research, 6, 58–73.
Myerson, R. B. (1991): Game Theory. Cambridge: Harvard University Press.
Rockafellar, R. T. (1970): Convex Analysis. Princeton: Princeton University Press.
Royden, H. L. (1988): Real Analysis, Third Edition. Englewood Cliffs: Prentice-Hall.
Sah, R., and J. Zhao (1998): “Some Envelope Theorems for Integer and Discrete Choice Variables,” International Economic Review, 39, 623–634.
Segal, I., and M. Whinston (2002): “The Mirrlees Approach to Mechanism Design with Renegotiation: Theory and Application to Hold-Up and Risk Sharing,” Econometrica, 70, 1–45.
Simon, C. P., and L. Blume (1994): Mathematics for Economists. New York: W. W. Norton & Co.
Williams, S. R. (1999): “A Characterization of Efficient, Bayesian Incentive-Compatible Mechanisms,” Economic Theory, 14, 155–180.