




























































NOTES FOR QUANTITATIVE METHODS:





























































SOME ADVANCED MATHEMATICS FROM AN ELEMENTARY POINT OF VIEW
MAXWELL B. STINCHCOMBE
Date: November 25, 2003. Fall Semester, 2003. Unique #30238.
[…] summability and valuation of streams of utilities, convex analysis and duality; further properties of $\mathbb{R}^\ell$ and related spaces (including compactness, continuity and measurability of functions on $\mathbb{R}^\ell$, summability of sequences, existence of optima, fixed point theorems, cdf’s, other metrics, other metric spaces, the Theorem of the Maximum); Probabilities and expectations (including domains, modes of convergence, convergence theorems, orders of stochastic dominance, conditional expectations and probabilities); Dynamic programming (including properties of sequence spaces and probabilities on them, Bellman and Euler equations, the role of the Theorem of the Maximum, growth models); Statistics (including specific distributions [uniform, gamma, beta, Gaussian, $t$, $F$, $\chi^2$, Poisson, negative exponential, Weibull, logistic], estimators and their properties [consistency, Glivenko-Cantelli, different kinds of “best” estimators, Bayesian estimators, MLE estimators, information inequalities, sufficiency, Blackwell-Rao], properties of hypothesis tests [types of errors and their associated distributions, the Neyman-Pearson Lemma]).
1.1. Lengths and measurements. $\mathbb{N}$ and $\mathbb{Q}$ from elementary school. As models of measurements of quantities, we’re done.
1.2. Why we want more. Irrationality of easily described lengths, the CLT and integration. Sequences in $\mathbb{Q}$, convergence implies settling down, but not the reverse. Subsequences. Cauchy sequences and $\mathbb{R}$ as the completion of $\mathbb{Q}$. Implications of completeness: decreasing and increasing bounded sequences have limits, equivalently, every bounded set has a sup and an inf. The idea of completion also shows up in the major limit theorem in statistics (i.e. the CLT).
1.3. Valuing sequences of rewards. This section is based on classic analyses as well as the more recent Marinacci, “An Axiomatic Approach to Complete Patience and Time Invariance,” Journal of Economic Theory 83, 105-144 (1998). Patience about finite sequences, $(r_1, r_2, \ldots, r_t)$, of rewards seems to be about being indifferent between all time permutations of the sequence. In the dynamic programming models used in game theory and macro, one often achieves infinite sequences of rewards. These may not be entirely believable, but they do a pretty good job of capturing the idea of an indefinite future.
1.3.1. Classic analyses. $\liminf_t r_t \le \limsup_t r_t$, with equality when the limit exists. $X^Y$ notation, e.g.’s $2^3$, $\mathbb{R}^\ell$, $\mathbb{R}^{\mathbb{N}}$. Summability for $x \in \mathbb{R}^{\mathbb{N}}$. $u : \mathbb{R} \to \mathbb{R}$, $u(x) := (u(x_t))_{t \in \mathbb{N}} \in \mathbb{R}^{\mathbb{N}}$, $u$ bounded. From here on, we’ll just use $x$ for the elements in $\mathbb{R}^{\mathbb{N}}$ and try to value them, thinking that they are bounded streams of utilities. $V_{\liminf}(x) := \liminf_t x_t$ (that is, $\liminf_t u(x_t)$), which always exists (by completeness). The infinite extension of a simple idea of patience is here: no permutation of the integers changes $V_{\liminf}(x)$. Thinking about a finite sequence of rewards, a useful definition of patience is indifference between all permutations of the reward sequence: having the good stuff early is just as desirable as having it late. If $\pi : \mathbb{N} \to \mathbb{N}$ is 1-1 and onto, then $\pi$ is a permutation of $\mathbb{N}$, and $x^\pi$ denotes the sequence $(x_{\pi(t)})_{t \in \mathbb{N}}$.
Theorem 1.3.1. For any permutation $\pi$, $V_{\liminf}(x) = V_{\liminf}(x^\pi)$.
[…] $\min\{p \cdot x : x \in C_u(\bar u)\}$, again, $p \gg 0$ a price vector. One of the things we will see is that $D_p \pi = y^*(p)$ where $y^*(p)$ is the solution to $\max\{p \cdot y : y \in Y\}$, and $D_p e(p, \bar u) = h(p, \bar u)$ where $h(p, \bar u)$ is the solution to $\min\{p \cdot x : x \in C_u(\bar u)\}$. $y^*(p)$ is the supply/demand function for the firm, $h(p, \bar u)$ is the Hicksian demand function.
1.4.1. Convexity of sets. Dfn convexity of sets, e.g.’s half-spaces $H_p^{\le}(r) := \{x : p \cdot x \le r\}$, triangles, squares, the set $Y$ above when $f'' < 0$. A function is concave when its subgraph is a convex set. In the production function example, you should recognize decreasing returns to scale (DRTS). When the epigraph is a convex set, the function is convex. There is no such creature as a concave set.
1.4.2. Three basic results.
Theorem 1.4.1. If $\{K_\alpha : \alpha \in A\}$ is a collection of convex sets, then $K := \cap_\alpha K_\alpha$ is convex.
Proof.
Corollary 1.4.1.1. If $u : \mathbb{R}^\ell_+ \to \mathbb{R}$ is concave, then for each level $\bar u$, $C_u(\bar u) := \{x \in \mathbb{R}^\ell_+ : u(x) \ge \bar u\}$ is a convex set.
Proof.
In intermediate micro, one starts with a utility function, $u$, that represents preferences, that is, $x \succsim y$ iff $u(x) \ge u(y)$, and then derives demand behavior, $x(p, m)$, from the solutions to
$$\max u(x) \text{ subject to } p \cdot x \le m,\ x \in \mathbb{R}^\ell_+.$$
The demand function, $x(p, m)$, is unaffected by monotonic transformations of $u$, that is, if $f : \mathbb{R} \to \mathbb{R}$ satisfies $[[r > s] \Rightarrow [f(r) > f(s)]]$ and $v(x) := f(u(x))$, then $x(p, m)$ also solves the problem
$$\max v(x) \text{ subject to } p \cdot x \le m,\ x \in \mathbb{R}^\ell_+.$$
Utility functions do not measure anything. They are no more than a convenient device to represent preferences. The property that $C_u(\bar u)$ is a convex set is preserved under monotonic transformations of $u$, that is, for every level $\bar v$, $C_v(\bar v) := \{x \in \mathbb{R}^\ell_+ : v(x) \ge \bar v\}$ is a convex set. This leads to a definition: with $K$ a convex subset of $\mathbb{R}^\ell$, $v : K \to \mathbb{R}$ is quasi-concave if for all $\bar v$, $\{x \in K : v(x) \ge \bar v\}$ is a convex set.
Two sets, $E, E'$, are disjoint if $E \cap E' = \emptyset$. Stronger than disjoint is being at a positive distance. For $A, B \subset \mathbb{R}^\ell$, $A + B := \{a + b : a \in A, b \in B\}$; draw some pictures, show that $A, B$ convex $\Rightarrow$ $A + B$ is convex. For $x \in \mathbb{R}^\ell$ and $\epsilon > 0$, $B(x, \epsilon) := \{y : \|x - y\| < \epsilon\}$. Two sets, $E, E' \subset \mathbb{R}^\ell$, are $\epsilon$-separated if $E + B(0, \epsilon)$ and $E' + B(0, \epsilon)$ are disjoint.
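Theorem 1.4.2 (Separating Hyperplane). If $K$ and $K'$ are disjoint convex subsets of $\mathbb{R}^\ell$, then $\exists p \in \mathbb{R}^\ell$, $p \ne 0$, such that $\forall x \in K$, $x' \in K'$, $p \cdot x \le p \cdot x'$. If $K$ and $K'$ are also $\epsilon$-separated, then $\exists p \in \mathbb{R}^\ell$, $p \ne 0$, and $\exists \delta > 0$ such that $\forall x \in K$, $x' \in K'$, $p \cdot x + \delta \le p \cdot x'$.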
Pictures of what this means, proof will come later. An interesting application uses the idea of a closed set, intuitively, one containing its boundary. In order to get to interesting economics, I am providing an interim definition of closed sets, one that applies only to convex sets. We will return to the idea of closed sets later.
Definition 1.4.3. The convex-closure of $K \subset \mathbb{R}^\ell$ is $\overline{K} := \bigcap \{H_p^{\le}(r) : K \subset H_p^{\le}(r)\}$. A set $K$ is convex-closed if $K = \overline{K}$.
The class of convex-closed sets is closed under intersection, that is, if Kα, α ∈ A, is a collection of convex-closed sets, then ∩αKα is convex-closed. Pictures. The SHThm gives us
Lemma 1.4.4. If $K$ is convex, then for all $\epsilon > 0$, $\overline{K} \subset K + B(0, \epsilon)$.
1.4.3. A worked example. $Y = \{(y_1, y_2) : y_1 \le 0,\ y_2 \le \sqrt{|y_1|}\}$, $\Pi_Y(p) := \sup\{p \cdot y : y \in Y\}$, find the input demand function, the supply function, the profit function, show that the profit function is convex, “application” to stability of prices, refer to homework on derivative tests. Given a convex profit function $\Pi(\cdot)$, $Y_\Pi := \{y : \forall p > 0,\ p \cdot y \le \Pi(p)\}$, relate to convex-closed sets, do the work in the example to show the basic duality result for profit functions,
$$Y = Y_{\Pi_Y}.$$
This means that I can give you a profit function and I have implicitly specified the technology, or I can give you a technology, and I have implicitly specified the profit function, and these two representations are (loosely) duals of each other. Applications of this idea to expenditure functions (recovering upper contour sets from expenditure functions and vice versa), cost functions (same idea).
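The worked example can be checked numerically. The following is a minimal sketch, assuming the technology is $y_2 \le \sqrt{|y_1|}$ and prices are strictly positive; the function names (`input_demand`, `netput`, `profit`, `gradient`, `in_Y_Pi`) and the price grid are illustrative choices, not part of the notes. It compares a finite-difference gradient of the profit function with the profit-maximizing netput vector, and tests membership in $Y_\Pi$ on a finite grid of prices (a necessary check only, since the definition quantifies over all prices).

```python
import math

def input_demand(p1, p2):
    """Input level x = |y1| solving max_x p2*sqrt(x) - p1*x (first-order condition)."""
    return (p2 / (2.0 * p1)) ** 2

def netput(p1, p2):
    """Profit-maximizing y*(p) = (y1, y2): negative input, positive output."""
    x = input_demand(p1, p2)
    return (-x, math.sqrt(x))

def profit(p1, p2):
    """Profit function p . y*(p); for this technology it equals p2**2 / (4*p1)."""
    y1, y2 = netput(p1, p2)
    return p1 * y1 + p2 * y2

def gradient(f, p1, p2, h=1e-6):
    """Central-difference approximation to the derivative of f with respect to p."""
    return ((f(p1 + h, p2) - f(p1 - h, p2)) / (2 * h),
            (f(p1, p2 + h) - f(p1, p2 - h)) / (2 * h))

def in_Y_Pi(y, prices):
    """Check p . y <= Pi(p) for every price vector in a finite list."""
    return all(p1 * y[0] + p2 * y[1] <= profit(p1, p2) + 1e-12 for p1, p2 in prices)

# Hypothetical price grid: all (p1, p2) with entries in {0.25, 0.5, ..., 5.0}.
grid = [(a / 4.0, b / 4.0) for a in range(1, 21) for b in range(1, 21)]

p = (2.0, 3.0)
print("y*(p)       =", netput(*p))
print("grad Pi(p)  =", gradient(profit, *p))
print("(-1, 1) passes the grid test:", in_Y_Pi((-1.0, 1.0), grid))
print("(-1, 2) passes the grid test:", in_Y_Pi((-1.0, 2.0), grid))
```

The point $(-1, 1)$ lies on the boundary of $Y$ and satisfies every inequality, while $(-1, 2)$ is infeasible and is ruled out by at least one price vector in the grid.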
1.4.4. Support functions. The inf-support function of a set $K$ is $\mu_K^{\inf}(p) := \inf\{p \cdot x : x \in K\}$. [Beware: this is the support function that most people use, not the next one.] The sup-support function of a set $K$ is $\mu_K^{\sup}(p) := \sup\{p \cdot x : x \in K\}$. Note that $\mu_K^{\sup}(p) = -\mu_K^{\inf}(-p)$, so these are essentially the same function. Conventions with $\pm\infty$ and $0 < \alpha < 1$ in the definition of concave and convex functions.
Theorem 1.4.5. An inf-support function is concave, a sup-support function is convex.
Corollary 1.4.5.1. An expenditure function is concave, and a profit function is convex.
Behavioral implications. The general duality theorem relating closed convex sets to their support functions is
Problem 1.7 (Cauchy-Schwarz inequality and dot products). Consider vectors $x = (x_k)_{k=1}^n$, $y = (y_k)_{k=1}^n$ and $z = (z_k)_{k=1}^n$ in $\mathbb{R}^n$. The (dot) product is defined by $x \cdot y := \sum_k x_k y_k$, sometimes written as $xy$. Following the logic of Pythagoras’s theorem, the length of a vector is $\|x\| := \sqrt{\sum_k x_k^2} = \sqrt{x \cdot x}$.
[…] $\left(\sum_k x_k y_k\right)^2 \le \left(\sum_k x_k^2\right)\left(\sum_k y_k^2\right)$. Under what conditions is the inequality satisfied as an equality?
[…] $\rho(x, y) := \sum_k |x_k - y_k|$ so that $\rho : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$. Show that $(\mathbb{R}^n, \rho)$ is a metric space.
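A numerical spot check is no substitute for the proofs the problem asks for, but it can catch a misread formula. The sketch below assumes $\rho$ is the sum of absolute coordinate differences and tests the Cauchy-Schwarz inequality and the triangle inequality for $\rho$ on random vectors; the helper names and the sample size are illustrative choices.

```python
import random

def dot(x, y):
    """Dot product: sum_k x_k * y_k."""
    return sum(a * b for a, b in zip(x, y))

def rho(x, y):
    """Candidate metric: sum_k |x_k - y_k|."""
    return sum(abs(a - b) for a, b in zip(x, y))

def cauchy_schwarz_gap(x, y):
    """(sum x_k^2)(sum y_k^2) - (sum x_k y_k)^2; the inequality says this is >= 0."""
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2

def rand_vec(n=4):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

random.seed(0)  # fixed seed so the check is reproducible
trials = [(rand_vec(), rand_vec(), rand_vec()) for _ in range(1000)]

# Small tolerances absorb floating-point rounding.
cs_ok = all(cauchy_schwarz_gap(x, y) >= -1e-12 for x, y, _ in trials)
triangle_ok = all(rho(x, z) <= rho(x, y) + rho(y, z) + 1e-12 for x, y, z in trials)

# Proportional vectors: the gap is exactly zero (integer arithmetic).
gap_parallel = cauchy_schwarz_gap([1, 2, 3], [2, 4, 6])

print(cs_ok, triangle_ok, gap_parallel)
```

The zero gap for the proportional pair hints at the answer to the equality question.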
1.6. Self-guided tour to differentiability and concavity. This section develops the negative semi-definiteness of the matrix of second derivatives as being equivalent to the concavity of a twice continuously differentiable function. It also develops the determinant test for negative semi-definiteness. Before reading this, you should know the equivalent of the math camp review of matrix multiplication and determinants. You are responsible for handing in the problems scattered throughout this section by the middle of the semester. I would recommend that you do it before that.
1.6.1. The two results. Before giving the results, we need some terminology. A function $f : C \to \mathbb{R}$ is strictly concave if $\forall x, y \in C$, $x \ne y$, and all $\alpha \in (0, 1)$, $f(\alpha x + (1 - \alpha) y) > \alpha f(x) + (1 - \alpha) f(y)$. A symmetric $n \times n$ matrix $A = (a_{ij})_{i,j=1,\ldots,n}$ is negative semi-definite if for all vectors $z \in \mathbb{R}^n$, $z^T A z \le 0$; it is negative definite if for all $z \ne 0$, $z^T A z < 0$.
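Theorem 1.6.1. A twice continuously differentiable $f : \mathbb{R}^n \to \mathbb{R}$ defined on an open, convex set $C$ is concave iff for all $x^\circ \in C$, $D_x^2 f(x^\circ)$ is negative semi-definite. It is strictly concave if for all $x^\circ \in C$, $D_x^2 f(x^\circ)$ is negative definite (the converse of this second statement can fail: $f(x) = -x^4$ is strictly concave but has a zero second derivative at $x^\circ = 0$).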
The principal sub-matrices of a symmetric $n \times n$ matrix $A = (a_{ij})_{i,j=1,\ldots,n}$ are the $m \times m$ matrices $(a_{ij})_{i,j=1,\ldots,m}$, $m \le n$. Thus, the 3 principal sub-matrices of the $3 \times 3$ matrix […] are $[3]$, […], and […].
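Theorem 1.6.2. A matrix $A$ is negative semi-definite (respectively negative definite) iff the sign of the determinant of the $m$’th principal sub-matrix is either $0$ or $(-1)^m$ (respectively, the sign of the determinant of the $m$’th principal sub-matrix is $(-1)^m$). It is positive semi-definite (respectively positive definite) if you replace “$(-1)^m$” with “$(+1)^m$” throughout. [Caution: in the semi-definite cases the sign condition has to hold for every sub-matrix formed from a common subset of rows and columns, not only for the leading ones defined above: $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$ has leading determinants $0$ and $0$ but is not negative semi-definite.]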
In the following two problems, use Theorems 1.6.1 and 1.6.2.
Problem 1.8. The function $f : \mathbb{R}^2_+ \to \mathbb{R}$ defined by $f(x, y) = x^\alpha y^\beta$, $\alpha, \beta > 0$, is strictly concave on $\mathbb{R}^2_{++}$ if $\alpha + \beta < 1$, and is concave on $\mathbb{R}^2_{++}$ if $\alpha + \beta = 1$.
Problem 1.9. The function $f : \mathbb{R}^2_+ \to \mathbb{R}$ defined by $f(x, y) = (x^p + y^p)^{1/p}$ is convex on $\mathbb{R}^2_{++}$ if $p \ge 1$ and is concave if $p \le 1$.
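Before attempting Problem 1.8 by hand, it can help to look at numbers. The sketch below evaluates the matrix of second derivatives of $x^\alpha y^\beta$ at a single point and reports the determinants of its two leading principal sub-matrices; the evaluation point $(2, 3)$, the exponent values, and the function names are arbitrary illustrative choices, and a check at one point proves nothing about all of $\mathbb{R}^2_{++}$.

```python
def hessian(alpha, beta, x, y):
    """Second-derivative matrix of f(x, y) = x**alpha * y**beta at (x, y)."""
    fxx = alpha * (alpha - 1) * x ** (alpha - 2) * y ** beta
    fxy = alpha * beta * x ** (alpha - 1) * y ** (beta - 1)
    fyy = beta * (beta - 1) * x ** alpha * y ** (beta - 2)
    return [[fxx, fxy], [fxy, fyy]]

def leading_minors_2x2(H):
    """Determinants of the 1x1 and 2x2 leading principal sub-matrices."""
    return H[0][0], H[0][0] * H[1][1] - H[0][1] * H[1][0]

# alpha + beta < 1: expect signs (-, +), i.e. negative definite at this point.
d1_drs, d2_drs = leading_minors_2x2(hessian(0.3, 0.5, 2.0, 3.0))

# alpha + beta = 1: expect a negative first minor and a (numerically) zero determinant.
d1_crs, d2_crs = leading_minors_2x2(hessian(0.4, 0.6, 2.0, 3.0))

print(d1_drs, d2_drs)
print(d1_crs, d2_crs)
```

Algebraically the determinant is $\alpha\beta(1 - \alpha - \beta)\, x^{2\alpha - 2} y^{2\beta - 2}$, which is why it vanishes when $\alpha + \beta = 1$.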
Throughout, all matrices have only real number entries. $|A|$ denotes the determinant of the square matrix $A$. Recall that $A$ is invertible, as a linear mapping, iff $|A| \ne 0$. (If these statements do not make sense to you, you missed the math camp and need to do some review.)
Problem 1.12. Remember, or look up, how to find determinants for 2 × 2 and 3 × 3 matrices.
A vector $x \ne 0$ is an eigenvector^1 and the number $\lambda \ne 0$ is an eigenvalue for $A$ if $Ax = \lambda x$. Note that $Ax = \lambda x$ iff $A(rx) = \lambda(rx)$ for all $r \ne 0$. Therefore, we can, and do, normalize eigenvectors by $\|x\| = 1$, which corresponds to setting $r = 1/\|x\|$. There is still some ambiguity, since we could just as well set $r = -1/\|x\|$. In general, one might need to consider $\lambda$’s and $x$’s that are complex, that is, $\lambda = a + bi$ with $i = \sqrt{-1}$. This means that $x$ will need to be complex too. To see why, read on.
Lemma 1.6.3. $Ax = \lambda x$, $x \ne 0$, iff $(A - \lambda I)x = 0$ iff $|A - \lambda I| = 0$.
Proof: You should know why this is true. If not, you need some more review.
Define $g(\lambda) = |A - \lambda I|$ so that $g$ is an $n$’th degree polynomial in $\lambda$. The fundamental theorem of algebra tells us that any $n$’th degree polynomial has $n$ roots, counting multiplicities, in the complex plane. To be a bit more concrete, this means that there are complex numbers $\lambda_i$, $i = 1, \ldots, n$ such that
$$g(y) = (\lambda_1 - y)(\lambda_2 - y) \cdots (\lambda_n - y).$$
The “counting multiplicities” phrase means that the λi need not be distinct.
Problem 1.13. Using the quadratic formula, show that if $A$ is a symmetric $2 \times 2$ matrix, then both of the eigenvalues of $A$ are real numbers. Give a $2 \times 2$ non-symmetric matrix with real entries having two imaginary eigenvalues. [This can be done with a matrix having only 0’s, 1’s and −1’s as entries.]
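The quadratic-formula computation can be sketched directly. This assumes the characteristic polynomial of a $2 \times 2$ matrix with rows $(a, b)$ and $(c, d)$ is $\lambda^2 - (a + d)\lambda + (ad - bc)$; the function name and the two example matrices are illustrative, and the second one is just one possible choice, not necessarily the one the problem has in mind.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of lambda^2 - (a + d)*lambda + (a*d - b*c) via the quadratic formula."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # complex square root, so negative discriminants are fine
    return (trace + root) / 2, (trace - root) / 2

# Symmetric example: the discriminant (a - d)^2 + 4*b^2 is non-negative, so the roots are real.
sym = eigenvalues_2x2(2, 1, 1, 2)

# Non-symmetric example with entries 0, 1 and -1: a rotation by 90 degrees.
rot = eigenvalues_2x2(0, 1, -1, 0)

print(sym)
print(rot)
```

For a symmetric matrix $b = c$, so the discriminant is $(a - d)^2 + 4b^2 \ge 0$, which is the observation the first part of the problem turns on.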
The conclusion about real eigenvalues in the previous problem is true for general symmetric $n \times n$ matrices, and we turn to this result. From your trigonometry class (or from someplace else), $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$ defines multiplication of complex numbers, and $(a + bi)^* := a - bi$ defines the complex conjugate of the number $(a + bi)$. Note that $rs = sr$ and $r = r^*$ iff $r$ is a real number for complex $r, s$. By direct calculation, $(rs)^* = r^* s^*$ for any pair of complex numbers $r, s$. Complex vectors are vectors with complex numbers as their entries. Their dot product is defined in the usual way, $x \cdot y := \sum_i x_i y_i$. Notationally, $x \cdot y$ may be written $x^T y$. The next proof uses the following problem.
(^1) “Eigen” is a German word meaning “own.”
Problem 1.14. If $r$ is a complex number, then $r r^* = 0$ iff $r = 0$. If $x$ is a complex vector, then $x^T x^* = 0$ iff $x = 0$.
Lemma 1.6.4. Every eigenvalue of a symmetric $A$ is real, and eigenvectors associated with distinct eigenvalues are real and orthogonal to each other.
Proof: The eigenvalue part: Suppose that $\lambda$ is an eigenvalue and $x$ an associated eigenvector so that
$$Ax = \lambda x. \tag{1}$$
Taking the complex conjugate of both sides,
$$Ax^* = \lambda^* x^* \tag{2}$$
because $A$ has only real entries.
$$[Ax = \lambda x] \Rightarrow [(x^*)^T A x = (x^*)^T \lambda x = \lambda x^T x^*],$$
$$[Ax^* = \lambda^* x^*] \Rightarrow [x^T A x^* = x^T \lambda^* x^* = \lambda^* x^T x^*].$$
Subtracting,
$$(x^*)^T A x - x^T A x^* = (\lambda - \lambda^*) x^T x^*.$$
Since the matrix $A$ is symmetric,
$$(x^*)^T A x - x^T A x^* = 0.$$
Since $x \ne 0$, $x^T x^* \ne 0$. Therefore,
$$[(\lambda - \lambda^*) x^T x^* = 0] \Rightarrow [(\lambda - \lambda^*) = 0],$$
which can only happen if $\lambda$ is a real number.
The eigenvector part: From the previous part, all eigenvalues are real. Since $A$ is real, this implies that all eigenvectors can also be taken to be real. Let $\lambda_i \ne \lambda_j$ be distinct eigenvalues and $x_i, x_j$ their associated eigenvectors so that
$$Ax_i = \lambda_i x_i, \quad Ax_j = \lambda_j x_j.$$
Pre-multiplying by the appropriate vectors,
$$x_j^T A x_i = \lambda_i x_j^T x_i, \quad x_i^T A x_j = \lambda_j x_i^T x_j.$$
We know that $x_i^T x_j = x_j^T x_i$ (by properties of dot products). Because $A$ is symmetric,
$$x_j^T A x_i = x_i^T A x_j.$$
Combining,
$$(\lambda_i - \lambda_j) x_j^T x_i = 0.$$
Since $(\lambda_i - \lambda_j) \ne 0$, we conclude that $x_i \cdot x_j = 0$, the orthogonality we were looking for.
The following uses basic linear algebra definitions.
Problem 1.15. If the symmetric $n \times n$ matrix $A$ has $n$ distinct eigenvalues, then its (normalized) eigenvectors form an orthonormal basis for $\mathbb{R}^n$.
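The orthogonality conclusion can be illustrated in the $2 \times 2$ case without any linear algebra library. The sketch below assumes a symmetric matrix with rows $(a, b)$ and $(b, d)$ and $b \ne 0$, for which $(b, \lambda - a)$ is an eigenvector for each eigenvalue $\lambda$; the function name and the example matrix are illustrative choices.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    trace, det = a + d, a * d - b * b
    root = math.sqrt(trace * trace - 4 * det)  # equals sqrt((a - d)^2 + 4*b^2) > 0
    pairs = []
    for lam in ((trace - root) / 2, (trace + root) / 2):
        v = (b, lam - a)               # solves (A - lam*I) v = 0 when b != 0
        length = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / length, v[1] / length)))
    return pairs

a, b, d = 2.0, 1.0, 3.0
(lam1, v1), (lam2, v2) = eig_sym_2x2(a, b, d)

# Residual of A v = lam v for the first pair, and the dot product of the two eigenvectors.
residual = (a * v1[0] + b * v1[1] - lam1 * v1[0],
            b * v1[0] + d * v1[1] - lam1 * v1[1])
inner = v1[0] * v2[0] + v1[1] * v2[1]

print(lam1, lam2)
print(residual, inner)
```

With $b \ne 0$ the two eigenvalues are automatically distinct, so the example falls under the hypothesis of the problem.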
Recall that the principal sub-matrices of a symmetric $n \times n$ matrix $A = (a_{ij})_{i,j=1,\ldots,n}$ are the $m \times m$ matrices $(a_{ij})_{i,j=1,\ldots,m}$, $m \le n$. The following is pretty obvious, but it’s useful anyway.
Problem 1.18. $A$ is negative definite iff for all $m \le n$ and all non-zero $x$ having only the first $m$ components not equal to $0$, $x^T A x < 0$.
Looking at $m = 1$, we must check if
$$(x_1, 0, 0, \ldots, 0) \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = a_{11} x_1^2 < 0.$$
This is true iff the first principal sub-matrix of $A$ has the same sign as $(-1)^m = (-1)^1 = -1$.
Looking at $m = 2$, we must check if
$$(x_1, x_2, 0, \ldots, 0) \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ 0 \\ \vdots \\ 0 \end{bmatrix} < 0.$$
This is true iff the matrix
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
is negative definite, which is true iff all of its eigenvalues are negative. There are two eigenvalues, the product of two negative numbers is positive, so the $m = 2$ case is handled by having the sign of the determinant of the $2 \times 2$ principal submatrix being $(-1)^2$.
Looking at $m = 3$, we must check if
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
is negative definite, which is true iff all of its eigenvalues are negative. There are three eigenvalues, the product of three negative numbers is negative, so the $m = 3$ case is handled by having the sign of the determinant of the $3 \times 3$ principal submatrix being $(-1)^3$.
Continue in this fashion, and you have a proof of Theorem 1.6.2. Your job is to fill in the details for the negative semi-definite, the positive definite, and the positive semi-definite cases as well.
Problem 1.19. Prove Theorem 1.6.2.
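The determinant test for the negative definite case is easy to mechanize for small matrices. The following is a minimal sketch, assuming a symmetric matrix stored as a list of rows; the cofactor-expansion determinant is only sensible for small $n$, the helper names are illustrative, and the random quadratic-form check is a sanity test rather than a proof.

```python
import random

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minors(A):
    """Determinants of the m x m upper-left sub-matrices, m = 1, ..., n."""
    return [det([row[:m] for row in A[:m]]) for m in range(1, len(A) + 1)]

def sign(v):
    return (v > 0) - (v < 0)

def negative_definite_by_minors(A):
    """True when the m'th leading minor has sign (-1)**m for every m."""
    return all(sign(d) == (-1) ** m for m, d in enumerate(leading_minors(A), start=1))

def quad_form(A, z):
    """z^T A z."""
    n = len(A)
    return sum(z[i] * A[i][j] * z[j] for i in range(n) for j in range(n))

A = [[-2, 1, 0], [1, -2, 1], [0, 1, -2]]   # passes the test
B = [[-1, 2], [2, -1]]                     # fails at m = 2

random.seed(1)
samples = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(500)]
all_negative = all(quad_form(A, z) < 0 for z in samples)

print(leading_minors(A), negative_definite_by_minors(A), all_negative)
print(leading_minors(B), negative_definite_by_minors(B), quad_form(B, [1, 1]))
```

For `B` the quadratic form at $z = (1, 1)$ is positive, matching the failed sign pattern.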
A metric on cdf’s is $\rho(F, G) = \inf\{\epsilon > 0 : \forall x \in \mathbb{R},\ G(x) \le F(x + \epsilon) + \epsilon \text{ and } F(x) \le G(x + \epsilon) + \epsilon\}$. Levy ribbons and the triangle inequality. Weak$^*$ convergence, aka convergence in distribution, turns out to be equivalent to $\rho$-convergence. Let $F_\infty$ be the cdf of $\delta_0$, look at $B(F, \epsilon)$, let $\Phi$ be the standard normal cdf, look at $B(\Phi, \epsilon)$. Example: $X_t$ iid, $\pm 1$ with probability $\tfrac12$ each, $F_T$ the cdf of $S_T := T^{-1} \sum_{t \le T} X_t$, $T$ an odd number (so that $S_T \ne 0$). By Tchebyshev we can show that for all $\epsilon > 0$, $P(|S_T - 0| > \epsilon) \to 0$, let $F_\infty$ be the cdf of $\delta_0$, and note that $F_T(0) \not\to F_\infty(0)$. Rather, $F_T(0) \equiv \tfrac12$. Dfn continuity point of $F$ using sequences. Dfn $F_n \to_{weak} F$ iff $F_n(x) \to F(x)$ for all continuity points of $F$. Compare CLT.
Theorem 2.2.1. $F_n \to_{weak} F$ iff $\rho(F_n, F) \to 0$.
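The coin-flip example can be computed exactly. The sketch below assumes $X_t = \pm 1$ with probability $\tfrac12$ each, so that $S_T = (2k - T)/T$ with $k$ binomial; `cdf_at` is an illustrative helper name. For odd $T$ the average is never $0$ and $F_T(0)$ is exactly $\tfrac12$; for even $T$ it is $\tfrac12 + \tfrac12 P(S_T = 0)$, which exceeds $\tfrac12$. Either way $F_T(0)$ stays away from $F_\infty(0) = 1$, while $F_T(x) \to F_\infty(x)$ at every $x \ne 0$.

```python
from math import comb

def cdf_at(T, x):
    """P(S_T <= x) where S_T is the average of T iid fair +1/-1 draws."""
    # S_T = (2k - T)/T when k of the T draws equal +1, and k ~ Binomial(T, 1/2).
    favourable = sum(comb(T, k) for k in range(T + 1) if 2 * k - T <= x * T)
    return favourable / 2 ** T

odd_values = [cdf_at(T, 0) for T in (1, 11, 101, 1001)]   # at the discontinuity of F_infinity
even_value = cdf_at(10, 0)                                # 1/2 + P(S_10 = 0)/2
right = cdf_at(1001, 0.1)                                 # continuity point, F_infinity = 1
left = cdf_at(1001, -0.1)                                 # continuity point, F_infinity = 0

print(odd_values, even_value, right, left)
```

The failure of convergence at $x = 0$ is harmless for weak convergence because $0$ is not a continuity point of $F_\infty$.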
2.3. Continuity. Metrics, $\tau_X$ is the collection of open sets, closedness, closed subsets of complete metric spaces are complete. Dfn: $f : X \to Y$ is cts if $f^{-1}(\tau_Y) \subset \tau_X$. Lemma: cts iff $f^{-1}$ of the closed sets is a subset of the closed sets iff $\epsilon$-$\delta$ iff sequence definition of continuity. Equality of topologies with different metrics, completeness does not survive change of metrics, $\rho(x, y) = |F(x) - F(y)|$, $F(r) = e^r/(1 + e^r)$.
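One way to see why completeness does not survive the change of metric is to look at the sequence $x_n = n$ under $\rho(x, y) = |F(x) - F(y)|$ with the logistic $F$. The sketch below is a numerical illustration only; the sample indices, the truncation of the tail at 30, and the candidate limit $3.0$ are arbitrary choices.

```python
import math

def F(r):
    """Logistic cdf, written as 1/(1 + e^{-r}), which equals e^r/(1 + e^r)."""
    return 1.0 / (1.0 + math.exp(-r))

def rho(x, y):
    return abs(F(x) - F(y))

# Largest rho-distance between x_n = n and later terms (up to 30): shrinks quickly,
# which is the Cauchy property in the rho metric.
tail = [max(rho(n, m) for m in range(n, 31)) for n in (5, 10, 20)]

# Distance from x_n to a fixed candidate limit: approaches 1 - F(3.0) > 0, not 0.
gap = [rho(n, 3.0) for n in (5, 10, 20)]

print(tail)
print(gap, 1.0 - F(3.0))
```

Since $F(n) \to 1$ and no real $x$ has $F(x) = 1$, the sequence is $\rho$-Cauchy without a $\rho$-limit in $\mathbb{R}$, even though $\rho$ and the usual metric have the same open sets.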
2.4. Compactness and the existence of optima.
2.5. The Theorem of the Maximum. For $E$ a subset of $X$, define $E^\epsilon = \cup_{x \in E} B(x, \epsilon)$, this is the $\epsilon$-ball around the set $E$. For compact $A, B \subset X$, define $m(A, B) = \inf\{\epsilon > 0 : A \subset B^\epsilon\}$. The Hausdorff distance between compact sets is
$$d(A, B) = \max\{m(A, B), m(B, A)\}.$$
Draw some pictures.
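For finite sets the picture can be replaced by a computation. The sketch below assumes finite subsets of the real line with the usual distance; for finite sets the infimum defining $m(A, B)$ equals the largest distance from a point of $A$ to its nearest point in $B$. The function names and the two example sets are illustrative.

```python
def m(A, B, dist):
    """inf{eps > 0 : A is contained in the eps-ball around B}, for finite A and B."""
    return max(min(dist(a, b) for b in B) for a in A)

def hausdorff(A, B, dist):
    """d(A, B) = max{m(A, B), m(B, A)}."""
    return max(m(A, B, dist), m(B, A, dist))

def abs_dist(x, y):
    return abs(x - y)

A = [0.0, 1.0]
B = [0.0, 3.0]

print(m(A, B, abs_dist), m(B, A, abs_dist), hausdorff(A, B, abs_dist))
```

The two one-sided quantities differ here (1 and 2), which is why the definition takes the maximum of both directions.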
Definition 2.5.1. A compact-valued, non-empty-valued correspondence Γ : X −→ Y is
Theorem 2.5.2 (Theorem of the Maximum). If $u : X \times Y \to \mathbb{R}$ is cts and $\Gamma : X \longrightarrow Y$ is cts, compact and non-empty valued, then $v(x) := \max\{u(x, y) : y \in \Gamma(x)\}$ is a cts function and $x \mapsto \{y \in \Gamma(x) : u(x, y) \ge u(x, \Gamma(x))\}$ is a uhc correspondence.
Applications: consumer choice theory, producer theory, general equilibrium, game theory. We will see this theorem in dynamic programming too.
2.6. The Separating Hyperplane Theorem. Hyperplanes, separation, the theorem, the proof. Applications: the 2’nd Welfare Theorem (existence of prices), existence of Lagrange multipliers [pass through saddle points and the simplest form of the Kuhn-Tucker theorem]. We did the basic duality theorem with applications to the recovery of preferences and technology from demand and supply behavior.
2.7. Problems.
Problem 2.1. $1_A(x)$ is the indicator function of a set $A$, taking the value $1$ when $x \in A$ and taking the value $0$ otherwise. Show that $(X, \rho)$ is a metric space when $X$ is non-empty and $\rho(x, y) = 1_{\{x \ne y\}}(x, y)$.
Problem 2.2. The closure of a subset E of a metric space (X, d) is denoted E, and is defined as the smallest closed set containing E. Show that the following are equivalent definitions of E:
{F : E ⊂ F, F closed}.