Foundations of Optimization - Lecture Notes, Lecture notes of Mathematics

Columbia Business School - First Year of the Doctoral Program in Decisions, Risk and Operations • Condensed Notes roughly following two courses I took - "Foundations of Optimization" (taught by Prof Ciamac Moallemi) and "Convex Optimization" (taught by Prof Garud Iyengar). These notes are also heavily based on Boyd and Vandenberghe's book "Convex Optimization" (available online) and Luenberger's "Optimization by Vector Space Methods". The chapter numbers in these notes refer to Boyd and Vandenberghe's text. Rough list of topics covered: convexity of sets and functions, formulation of convex programs (from linear programs to semi-definite programs), duality, applications, Hilbert and Banach spaces, minimum-norm problems in Banach spaces, the Hahn-Banach Theorem.

Typology: Lecture notes

Pre 2010

Available from 05/06/2023

Daniel Guetta
FOUNDATIONS OF OPTIMIZATION


Basics

• Optimization problems

o An optimization problem is
$$\text{minimize } f(x) \quad \text{subject to } x \in \mathcal{X}$$
where $f$ is the (real-valued) objective and $\mathcal{X}$ is the constraint set / feasible set / search space.
o $x^* \in \mathcal{X}$ is an optimal solution (global minimizer) if and only if $f(x^*) \le f(x) \ \ \forall x \in \mathcal{X}$.
o Maximizing $f(x)$ is equivalent to minimizing $-f(x)$.
o We consider problems in the following form:
$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{subject to} \quad & h_i(x) = 0 \quad \forall\, 1 \le i \le m \\ & g_i(x) \le 0 \quad \forall\, 1 \le i \le r \\ & x \in \mathbb{R}^n \end{aligned}$$
o We consider the following special cases of this problem:
  ▪ In linear programming, all functions are linear.
  ▪ In convex programming, the $f$ and $g_i$ are convex, and the $h_i$ are linear.
o If $\mathcal{X}$ is the feasible set of a problem, a point $x \in \mathcal{X}$ is a local minimum if there exists a neighborhood $N_r(x)$ such that $f(x) \le f(y) \ \ \forall y \in \mathcal{X} \cap N_r(x)$. It is an unconstrained local minimum if $f(x) \le f(y) \ \ \forall y \in N_r(x)$. (Strict equivalents exist.)

• Topology

o An open ball around a point $x \in \mathbb{R}^n$ with radius $r > 0$ is the set
$$N_r(x) = \{ y \in \mathbb{R}^n : \|x - y\| < r \}, \quad \text{where } \|x\| = \sqrt{\textstyle\sum_i x_i^2}$$
o A point $x \in \mathcal{X} \subset \mathbb{R}^n$ is an interior point if there exists an open ball such that $N_r(x) \subset \mathcal{X}$. A set $\mathcal{X} \subset \mathbb{R}^n$ is open if $\mathcal{X} = \operatorname{int} \mathcal{X}$.
o A point $x \in \mathbb{R}^n$ is a closure point of $\mathcal{X} \subset \mathbb{R}^n$ if, for every open ball $N_r(x)$, there exists $y \in \mathcal{X}$ with $y \in N_r(x)$. A set $\mathcal{X} \subset \mathbb{R}^n$ is closed if $\mathcal{X} = \operatorname{cl} \mathcal{X}$.
o The set of reals is both closed and open.
o Theorems:
  ▪ The union of open sets is open. The intersection of a finite number of open sets is open.
  ▪ The intersection of closed sets is closed. The union of a finite number of closed sets is closed.

• Analysis

o A sequence of vectors $\{x_k\} \subset \mathbb{R}^n$ converges to a limit $x \in \mathbb{R}^n$ if $\lim_{k \to \infty} \|x - x_k\| = 0$, and we say that $x_k \to x$.
o A set $\mathcal{X} \subset \mathbb{R}^n$ is (sequentially) compact if, given a sequence $\{x_k\} \subset \mathcal{X}$, there is a subsequence $\{x_{k_i}\}$ converging to an element $x \in \mathcal{X}$.
  ▪ Theorem (Heine-Borel): A set $\mathcal{X} \subset \mathbb{R}^n$ is compact if and only if it is closed and bounded.
  ▪ Theorem: A closed subset of a compact set is compact.
  ▪ Theorem: Suppose $\{\mathcal{X}_n\}$ is a sequence of non-empty, compact sets that are nested (ie: $\mathcal{X}_{n+1} \subset \mathcal{X}_n$). Then their intersection is non-empty.
o A real-valued function $f$ defined on a domain $\mathcal{X} \subset \mathbb{R}^n$ is continuous at the point $x \in \mathcal{X}$ if, for every sequence $\{x_k\} \subset \mathcal{X}$ with $x_k \to x$, $\lim_{k \to \infty} f(x_k) = f(x)$. $f$ is continuous if it is continuous at all points in $\mathcal{X}$.
o A function $f$ is coercive over a set $\mathcal{X} \subset \mathbb{R}^n$ if, for every sequence $\{x_k\} \subset \mathcal{X}$ with $\|x_k\| \to \infty$, we have $\lim_{k \to \infty} f(x_k) = \infty$.
o The inverse image of the set $\mathcal{A} \subset \mathbb{R}$ is defined by $f^{-1}(\mathcal{A}) = \{ x \in \mathcal{X} : f(x) \in \mathcal{A} \}$.
  ▪ Theorem: If $f$ is continuous, and $\mathcal{X}$ and $\mathcal{A}$ are both open (respectively, both closed), then $f^{-1}(\mathcal{A})$ is also open (respectively, closed). This is the standard way to prove that a set is open/closed.

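o The gradient of $f$ at a point $x$ is the vector of partial derivatives
$$\nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^\top \in \mathbb{R}^n, \qquad \frac{\partial f(x)}{\partial x_i} = \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h}$$
where $e_i$ is the $i$-th unit vector.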

$f$ is differentiable over an open set $\mathcal{U} \subset \mathcal{X}$ if it is differentiable at every point in the set. If, in addition, the components of the gradient are continuous over $\mathcal{U}$, then $f$ is continuously differentiable over $\mathcal{U}$.
o If, for a point $x \in \operatorname{int} \mathcal{X}$, each component of the gradient is differentiable, we say $f$ is twice differentiable at $x$, and we define the Hessian matrix $\nabla^2 f(x) \in \mathbb{R}^{n \times n}$ by
$$\left[ \nabla^2 f(x) \right]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j}$$

If $f$ is twice continuously differentiable in a neighborhood of $x$, then the Hessian is symmetric.
o Suppose $f$ is twice continuously differentiable over a neighborhood $N_r(x)$. Then for all $d \in N_r(0)$
$$f(x + d) = f(x) + \nabla f(x)^\top d + \tfrac{1}{2} d^\top \nabla^2 f(x)\, d + o\left( \|d\|^2 \right)$$
(Formally, this means that for every $C > 0$, there exists a neighborhood around $0$ such that the estimate of $f(x + d)$ differs from the real value by no more than $C \|d\|^2$.)
o Consider a vector-valued function $F : \mathcal{X} \to \mathbb{R}^m$, $\mathcal{X} \subset \mathbb{R}^n$, and a point $x \in \operatorname{int} \mathcal{X}$. We define the gradient to be the matrix $\nabla F(x) \in \mathbb{R}^{n \times m}$ with
$$\nabla F(x) = \left[ \nabla F_1(x), \ldots, \nabla F_m(x) \right], \qquad \left[ \nabla F(x) \right]_{ij} = \frac{\partial F_j(x)}{\partial x_i}$$
o The chain rule states that for interior points, if $h(x) = g(f(x))$, then
$$\nabla h(x) = \nabla f(x) \, \nabla g(f(x))$$

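• Linear algebra: Kernels and Images

o Consider a matrix $A \in \mathbb{R}^{m \times n}$. Then
  ▪ $\ker A = \{ x \in \mathbb{R}^n : Ax = 0 \}$
  ▪ $\operatorname{im} A = \{ y \in \mathbb{R}^m : y = Ax, \ x \in \mathbb{R}^n \}$
o Given a set $\mathcal{S} \subset \mathbb{R}^n$, $\mathcal{S}^\perp = \{ x \in \mathbb{R}^n : x^\top y = 0 \ \ \forall y \in \mathcal{S} \}$
o Lemma: $\operatorname{im} A = \left[ \ker(A^\top) \right]^\perp$. In other words, given $z \in \mathbb{R}^m$,
$$z = Ax \text{ for some } x \in \mathbb{R}^n \iff z^\top y = 0 \ \ \forall y \text{ with } A^\top y = 0$$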
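The lemma can be sanity-checked numerically on a small case. The sketch below is only an illustration: the matrix `A`, the kernel vector `y` and the test vectors are hand-picked assumptions, and the helper functions are written out so that no external library is needed.

```python
# Sketch: check im A = [ker(A^T)]^perp on a hand-picked 3x2 matrix.
# A, y and the test vectors are illustrative assumptions, not from the notes.

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 0],
     [0, 1],
     [1, 1]]

# A^T y = (y1 + y3, y2 + y3), so ker(A^T) is spanned by y = (1, 1, -1).
y = [1, 1, -1]
kernel_check = matvec(transpose(A), y)

# Every z = A x must be orthogonal to y.
images = [matvec(A, x) for x in ([1, 0], [0, 1], [2, -3], [-1, 5])]
inner_products = [dot(z, y) for z in images]

# A vector that is not orthogonal to ker(A^T) cannot be written as A x:
# here z3 != z1 + z2, a relation every image vector satisfies.
z_outside = [1, 1, -1]
outside_inner_product = dot(z_outside, y)

print(kernel_check, inner_products, outside_inner_product)
```

Checking a handful of vectors is of course not a proof; it only shows both directions of the equivalence at work on one example.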

• Sets, etc.

o Affine sets
  ▪ Definition: A set $\mathcal{X} \subset \mathbb{R}^n$ is affine if, for all points $x_1, x_2 \in \mathcal{X}$ and any scalar $\lambda \in \mathbb{R}$, $\lambda x_1 + (1 - \lambda) x_2 \in \mathcal{X}$.
  ▪ Example: The empty set, a line and any subspace are affine. Similarly, $\mathcal{X} = \{ x \in \mathbb{R}^n : Ax = b \}$ is affine.
  ▪ Definition: Given a set of points $\mathcal{X} \subset \mathbb{R}^n$, the affine hull $\operatorname{aff} \mathcal{X}$ is the set of points $\lambda_1 x_1 + \cdots + \lambda_k x_k$, where $k > 1$, $\{x_i\} \subset \mathcal{X}$ and $\lambda_1 + \cdots + \lambda_k = 1$. The affine hull is affine and is the smallest affine set containing $\mathcal{X}$.
o Convex sets
  ▪ Definition: The set $\mathcal{X}$ is convex if, for all points $x_1, x_2 \in \mathcal{X}$ and any scalar $\lambda \in (0, 1)$, $\lambda x_1 + (1 - \lambda) x_2 \in \mathcal{X}$. Clearly, affine sets are also convex.
  ▪ Definition: Given a set of points $\mathcal{X} \subset \mathbb{R}^n$, the convex hull $\operatorname{conv} \mathcal{X}$ is the set of points $\lambda_1 x_1 + \cdots + \lambda_k x_k$, where $k > 1$, $\{x_i\} \subset \mathcal{X}$, $\lambda_i \ge 0$ and $\lambda_1 + \cdots + \lambda_k = 1$.
  ▪ Theorem (scalar multiplication): If $\mathcal{X} \subset \mathbb{R}^n$ is convex and $a \in \mathbb{R}$, then $a\mathcal{X} = \{ ax : x \in \mathcal{X} \}$ is convex.
  ▪ Theorem (vector sum): If $\mathcal{X}, \mathcal{Y} \subset \mathbb{R}^n$ are convex sets, then the set $\mathcal{X} + \mathcal{Y} = \{ x + y : x \in \mathcal{X}, y \in \mathcal{Y} \}$ is also convex.
  ▪ Theorem (affine transformations): If $\mathcal{X} \subset \mathbb{R}^n$ is a convex set, $A \in \mathbb{R}^{m \times n}$ is a matrix and $b \in \mathbb{R}^m$ is a vector, then the set $\{ Ax + b : x \in \mathcal{X} \}$ is a convex subset of $\mathbb{R}^m$.
  ▪ Theorem: If $\{\mathcal{X}_i\}_{i \in \mathcal{I}}$ is an arbitrary collection of convex sets, then the intersection $\bigcap_{i \in \mathcal{I}} \mathcal{X}_i$ is also convex.
o Miscellaneous definitions
  ▪ The L2-norm (Euclidean norm): $\|x\|_2 = \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x^\top x}$
  ▪ $\|x\|_G = \sqrt{x^\top G x}$ (when $G \succ 0$ and symmetric)
  ▪ The $p$-norm: $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$ for $p > 1$
  ▪ $\|x\|_\infty = \max \{ |x_1|, \ldots, |x_n| \}$
  Given a norm, the (closed) ball with centre $x_0$ and radius $r > 0$ is $\{ x \in \mathbb{R}^n : \|x - x_0\| \le r \}$, and it is convex.

o Theorem (projection): Let $\mathcal{X} \subset \mathbb{R}^n$ be a closed and non-empty convex set, and consider the Euclidean norm. Fix the vector $x \in \mathbb{R}^n$. Consider the problem
$$\min \|z - x\| \quad \text{s.t. } z \in \mathcal{X} \subset \mathbb{R}^n$$
  ▪ For every $x \in \mathbb{R}^n$, the problem has a unique global minimum $x^*$ called the projection of $x$ onto $\mathcal{X}$.
  ▪ A vector $x' \in \mathcal{X}$ is equal to $x^*$ if and only if
$$(x - x') \cdot (z - x') \le 0 \quad \forall z \in \mathcal{X}$$
  Geometrically, the angle between $x' \to x$ and $x' \to z$ must be at least 90° for all points $z$ in the set.

[Figure: a point $x$ outside the set, its projection $x'$, and a generic point $z$ of the set.]

Proof: Existence follows from the fact that $\|z - x\|$ is coercive and $\mathcal{X}$ is closed. Uniqueness follows because minimizing $\|z - x\|$ is equivalent to minimizing $\|z - x\|^2 = z^\top z - 2 z^\top x + x^\top x$, which is strictly convex.

Now, writing $f(z) = \|z - x\|^2$, consider that $\nabla f(x^*) = 2(x^* - x)$. By necessary and sufficient conditions for convex optimization problems (derived later), the condition in the theorem must hold.
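The characterization in the projection theorem can be illustrated on a set whose projection is known in closed form. The sketch below assumes the set is the box $[0, 1]^3$, for which Euclidean projection is componentwise clipping; the point `x`, the box bounds and the random sample are all hypothetical choices made for this example.

```python
import random

def project_onto_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (componentwise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

random.seed(0)  # fixed seed so the sampled points are reproducible

x = [2.0, -0.5, 0.3]                      # point to project (hypothetical)
x_proj = project_onto_box(x, 0.0, 1.0)    # candidate projection x'
residual = [a - b for a, b in zip(x, x_proj)]

# Sample points z of the box and record the largest value of
# (x - x') . (z - x'); the theorem says it can never be positive.
samples = [[random.uniform(0.0, 1.0) for _ in x] for _ in range(1000)]
worst = max(
    dot(residual, [zi - pi for zi, pi in zip(z, x_proj)]) for z in samples
)

print(x_proj, worst)
```

A box is used only because its projection is trivial to compute; for a general closed convex set the projection itself requires solving the minimization problem.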

Application: Suppose we want to approximate $f(x)$ over a set of points $\{x_1, \ldots, x_m\}$ using $g(x) = \sum_{\ell=1}^k r_\ell \, \phi_\ell(x)$, where the $\phi_\ell$ are basis functions and $r$ is a vector of weights. One way to do this is to solve the problem
$$\min \sum_{i=1}^m \left[ f(x_i) - g(x_i) \right]^2 \quad \text{s.t. } g(\cdot) \text{ is a linear combination of } \{\phi_\ell(\cdot)\}$$
Consider the matrix $\Phi$ with $\Phi_{i,\ell} = \phi_\ell(x_i)$ and the vector $y$ with $y_i = f(x_i)$. This problem is equivalent to
$$\min \|y - z\| \quad \text{s.t. } z \in \{ \Phi r : r \in \mathbb{R}^k \}$$
This is a projection problem, and so a unique optimizer $z$ exists.
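As a rough illustration of this application, the sketch below fits $f(x) = x^2$ at four points with two basis functions, $\phi_1(x) = 1$ and $\phi_2(x) = x$. The data, the basis and the function name `fit_two_basis_weights` are assumptions made for this example, and the weights are obtained from the 2x2 normal equations $\Phi^\top \Phi r = \Phi^\top y$, which is one standard way of computing the projection when the columns of $\Phi$ are linearly independent.

```python
def fit_two_basis_weights(points, values, phi1, phi2):
    """Least-squares weights r for g = r[0]*phi1 + r[1]*phi2.

    Solves the 2x2 normal equations Phi^T Phi r = Phi^T y by Cramer's rule,
    which assumes the two columns of Phi are linearly independent.
    """
    col1 = [phi1(x) for x in points]
    col2 = [phi2(x) for x in points]
    a11 = sum(u * u for u in col1)
    a12 = sum(u * v for u, v in zip(col1, col2))
    a22 = sum(v * v for v in col2)
    b1 = sum(u * y for u, y in zip(col1, values))
    b2 = sum(v * y for v, y in zip(col2, values))
    det = a11 * a22 - a12 * a12
    r = [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]
    fitted = [r[0] * u + r[1] * v for u, v in zip(col1, col2)]
    return r, fitted, (col1, col2)

points = [0.0, 1.0, 2.0, 3.0]
values = [x * x for x in points]   # y_i = f(x_i) with f(x) = x^2
r, z, columns = fit_two_basis_weights(
    points, values, lambda x: 1.0, lambda x: x
)

# Because the feasible set is a subspace, the projection condition holds
# with equality: the residual y - z is orthogonal to every column of Phi.
residual = [y - zi for y, zi in zip(values, z)]
orthogonality = [sum(e * c for e, c in zip(residual, col)) for col in columns]

print(r, z, orthogonality)
```

The projection $z$ is unique; the weights $r$ are unique as well only when the columns of $\Phi$ are linearly independent, as assumed here.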

Existence of solutions

Theorem (sufficient conditions, Weierstrass): Consider the problem $\min f(x)$ s.t. $x \in \mathcal{X} \subset \mathbb{R}^n$. Then if
o $\mathcal{X}$ is non-empty
o $f$ is lower semicontinuous over $\mathcal{X}$
and one of the following conditions holds:

  1. $\mathcal{X}$ is compact
  2. $\mathcal{X}$ is closed, and $f$ is coercive
  3. There exists a scalar $\gamma$ such that the level set $\mathcal{X}(\gamma) = \{ x \in \mathcal{X} : f(x) \le \gamma \}$ is non-empty and compact

then the set of minimizing solutions of $f$ is non-empty and compact.
Proof:
o $1 \Rightarrow 3$: define $f^* = \inf_{x \in \mathcal{X}} f(x) \in \mathbb{R} \cup \{-\infty\}$ (this always exists). Then, given $\gamma > f^*$, the level set $\{ x \in \mathcal{X} : f(x) \le \gamma \}$ must be non-empty. By the lower semicontinuity of $f$, it is also closed. Thus, since $\mathcal{X}$ is compact, so is this set.
o $2 \Rightarrow 3$: Define $\mathcal{X}(\gamma) = \{ x \in \mathcal{X} : f(x) \le \gamma \}$. Since $f$ is coercive, $\mathcal{X}(\gamma)$ is non-empty and bounded for any $\gamma > f^*$. Furthermore, since the domain of $f$ (ie: $\mathcal{X}$) is

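  ▪ First order: fix $d \in \mathbb{R}^n$. Then
$$\lim_{\alpha \downarrow 0} \frac{f(x^* + \alpha d) - f(x^*)}{\alpha} = \nabla f(x^*) \cdot d$$
If $x^*$ is a local minimum, the quotient on the LHS must be non-negative for small enough $\alpha > 0$.

Thus, $\nabla f(x^*) \cdot d \ge 0$. Since $d$ is arbitrary (the same holds for $-d$), we must have $\nabla f(x^*) = 0$.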

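  ▪ Second order: fix $d \in \mathbb{R}^n$. For sufficiently small $\alpha$:
$$f(x^* + \alpha d) - f(x^*) = \alpha \nabla f(x^*)^\top d + \tfrac{1}{2} \alpha^2 d^\top \nabla^2 f(x^*)\, d + o(\alpha^2) = \tfrac{1}{2} \alpha^2 d^\top \nabla^2 f(x^*)\, d + o(\alpha^2)$$
where the last step uses the first-order condition $\nabla f(x^*) = 0$. If $x^*$ is a local minimum, the LHS must be non-negative for small enough $\alpha$, and so
$$\tfrac{1}{2} \alpha^2 d^\top \nabla^2 f(x^*)\, d + o(\alpha^2) \ge 0 \implies \tfrac{1}{2} d^\top \nabla^2 f(x^*)\, d + \frac{o(\alpha^2)}{\alpha^2} \ge 0$$
Taking limits as $\alpha \to 0$:
$$d^\top \nabla^2 f(x^*)\, d \ge 0$$
Since $d$ is arbitrary, this leads to our result.
o Theorem (sufficient conditions): Consider a point $x^* \in \operatorname{int} \mathcal{X}$. If $f$ is twice continuously differentiable in a neighborhood of $x^*$, and
$$\nabla f(x^*) = 0, \qquad \nabla^2 f(x^*) \succ 0$$
then $x^*$ is a strict unconstrained local minimum. The geometric interpretation is as above. The only difference is that we now require a positive definite instead of a positive semidefinite matrix.
Proof: Let $\lambda > 0$ be the smallest eigenvalue of $\nabla^2 f(x^*)$, and let $d \in N_r(0) \setminus \{0\}$. Then
$$\begin{aligned} f(x^* + d) - f(x^*) &= \nabla f(x^*)^\top d + \tfrac{1}{2} d^\top \nabla^2 f(x^*)\, d + o(\|d\|^2) \\ &= \tfrac{1}{2} d^\top \nabla^2 f(x^*)\, d + o(\|d\|^2) \\ &\ge \tfrac{\lambda}{2} \|d\|^2 + o(\|d\|^2) \\ &= \left( \frac{\lambda}{2} + \frac{o(\|d\|^2)}{\|d\|^2} \right) \|d\|^2 \end{aligned}$$
Now, for any $\gamma \in (0, \lambda)$, there exists $\varepsilon \in (0, r]$ such that
$$\frac{\lambda}{2} + \frac{o(\|d\|^2)}{\|d\|^2} \ge \frac{\gamma}{2} \quad \forall d \text{ with } \|d\| < \varepsilon$$
And this means that
$$f(x^* + d) \ge f(x^*) + \frac{\gamma}{2} \|d\|^2 > f(x^*)$$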
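The sufficient conditions can be checked by hand on a small example. The sketch below uses a hypothetical function $f(x_1, x_2) = (x_1 - 1)^2 + x_1^2 x_2^2 + x_2^2$ with candidate point $x^* = (1, 0)$; the function, the choice $\gamma = 1$ and the test radius are assumptions made for this illustration, and the gradient and Hessian are coded analytically.

```python
import math

def f(x1, x2):
    return (x1 - 1.0) ** 2 + (x1 * x2) ** 2 + x2 ** 2

def grad_f(x1, x2):
    return [2.0 * (x1 - 1.0) + 2.0 * x1 * x2 ** 2,
            2.0 * x1 ** 2 * x2 + 2.0 * x2]

def hess_f(x1, x2):
    return [[2.0 + 2.0 * x2 ** 2, 4.0 * x1 * x2],
            [4.0 * x1 * x2, 2.0 * x1 ** 2 + 2.0]]

def smallest_eigenvalue_2x2(H):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = H[0][0], H[0][1], H[1][1]
    return 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b ** 2)

x_star = (1.0, 0.0)
gradient = grad_f(*x_star)                          # first-order condition
lam_min = smallest_eigenvalue_2x2(hess_f(*x_star))  # second-order condition

# For gamma in (0, lam_min) the proof gives, for small enough d,
# f(x* + d) >= f(x*) + (gamma / 2) * ||d||^2. Check it on a small circle.
gamma, radius = 1.0, 0.1
growth_holds = all(
    f(x_star[0] + radius * math.cos(t), x_star[1] + radius * math.sin(t))
    >= f(*x_star) + 0.5 * gamma * radius ** 2
    for t in (2.0 * math.pi * k / 64 for k in range(64))
)

print(gradient, lam_min, growth_holds)
```

The circle test samples only 64 directions at one radius, so it illustrates the quadratic growth bound rather than verifying it.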

• Using the necessary conditions

o Verify there is a global minimum (using the existence theorem).
o Find the set of possible unconstrained local minima using $\nabla f(x) = 0$.
o Compare these points with all points on the boundary $\mathcal{X} \setminus \operatorname{int} \mathcal{X}$.
o Example: Consider $\min_{x \in \mathbb{R}^n} \tfrac{1}{2} x^\top G x - b^\top x$ with $G \succ 0$. By an earlier theorem, global minima must exist. Furthermore, $\mathcal{X} \setminus \operatorname{int} \mathcal{X}$ is empty, and so the global minimum must be an unconstrained local minimum. The first order necessary conditions immediately allow us to characterize that point as $G x^* - b = 0$.
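The three steps above can be sketched on a one-dimensional problem where the boundary matters. The example below is hypothetical: it minimizes $f(x) = x^3 - 3x$ over the interval $[-3, 3]$, with the stationary points found by hand from $f'(x) = 3x^2 - 3 = 0$.

```python
def f(x):
    return x ** 3 - 3.0 * x

# Step 1: X = [-3, 3] is compact and f is continuous, so a global
# minimum exists by the Weierstrass theorem.
lower, upper = -3.0, 3.0

# Step 2: interior stationary points, from f'(x) = 3x^2 - 3 = 0.
stationary = [x for x in (-1.0, 1.0) if lower < x < upper]

# Step 3: compare them with the boundary points of X.
boundary = [lower, upper]
candidates = stationary + boundary
x_best = min(candidates, key=f)

print(x_best, f(x_best))
```

Here the global minimum sits on the boundary, which is exactly the case the third step is meant to catch: the stationary point $x = 1$ is only a local minimum.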

• Sensitivity analysis

o Consider the problem $\min f(x, a)$ s.t. $x \in \mathbb{R}^n$. We let $x^*(a)$ be a local optimum, and $f^*(a) = f(x^*(a), a)$. The first-order conditions are
$$\nabla_x f(x^*(a), a) = 0$$
Taking the derivative with respect to $a$, we obtain
$$\nabla x^*(a) \, \nabla^2_{xx} f(x^*(a), a) + \nabla^2_{xa} f(x^*(a), a) = 0$$
From this expression, we can obtain expressions for the sensitivity of the optimum, and of the optimal value:
$$\nabla x^*(a) = -\nabla^2_{xa} f(x^*(a), a) \left\{ \nabla^2_{xx} f(x^*(a), a) \right\}^{-1}$$
$$\nabla f^*(a) = \nabla x^*(a) \, \nabla_x f(x^*(a), a) + \nabla_a f(x^*(a), a) = \nabla_a f(x^*(a), a)$$
where the last equality uses the first-order conditions.
o The implicit function theorem tells us when $x^*(a)$ and its derivative exist.
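Both sensitivity formulas can be checked against finite differences on a scalar example. The sketch below assumes the hypothetical objective $f(x, a) = \tfrac{1}{2} x^2 - a x + a^2$, whose minimizer $x^*(a) = a$ is known in closed form; the evaluation point `a0` and the step size are arbitrary choices.

```python
def f(x, a):
    return 0.5 * x * x - a * x + a * a

def x_star(a):
    """Minimizer of f(., a): the first-order condition x - a = 0 gives x = a."""
    return a

def f_star(a):
    """Optimal value as a function of the parameter."""
    return f(x_star(a), a)

def partial_a(x, a):
    """Partial derivative of f with respect to a, holding x fixed."""
    return -x + 2.0 * a

a0, step = 0.7, 1e-5

# Sensitivity of the optimum: -f_xa / f_xx = -(-1) / 1 = 1.
dx_formula = 1.0
dx_numeric = (x_star(a0 + step) - x_star(a0 - step)) / (2.0 * step)

# Sensitivity of the optimal value: the total derivative of f* should
# equal the partial derivative of f in a, evaluated at the optimum.
df_numeric = (f_star(a0 + step) - f_star(a0 - step)) / (2.0 * step)
df_formula = partial_a(x_star(a0), a0)

print(dx_numeric, dx_formula, df_numeric, df_formula)
```

The second comparison is the point of the derivation: the indirect effect of $a$ through $x^*(a)$ drops out because $\nabla_x f$ vanishes at the optimum.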

Constrained local optimality

• Consider the problem $\min f(x)$ s.t. $x \in \mathcal{X} \subset \mathbb{R}^n$. We are interested in characterizing local minima that are not in $\operatorname{int} \mathcal{X}$. We will assume, though, that $f$ is continuously differentiable in a neighborhood of the point considered.

$$x_k - x^* = \frac{\|x_k - x^*\|}{\|d\|} \left( d + \|d\| \, \zeta_k \right) = \frac{\|x_k - x^*\|}{\|d\|} \, d_k$$
And so we can re-write the above as
$$f(x_k) = f(x^*) + \frac{\|x_k - x^*\|}{\|d\|} \, \nabla f(\tilde{x}_k) \cdot d_k$$
Now, if $d \in \mathcal{D}(x^*)$ as well (ie: $d$ is also a descent direction), then $\nabla f(x^*) \cdot d < 0$. The strict inequality implies that this is also true in a neighborhood of $x^*$, and so for $k$ large enough, we get $f(x_k) < f(x^*)$. This contradicts the local minimality of $x^*$.
• Unfortunately, the tangent cone $\mathcal{T}(x^*)$ is hard to characterize algebraically, unless we focus on the particular example where $\mathcal{X}$ is the intersection of equality constraints.

Equality constrained optimization

• Consider the problem $\min f(x)$ s.t. $h(x) = 0$, $x \in \mathbb{R}^n$, where $h : \mathbb{R}^n \to \mathbb{R}^m$. We assume that $f$ and the $h_i$ are continuously differentiable in a neighborhood of the local minimum.
• In this particular case, we will show we can characterize the tangent cone $\mathcal{T}(x^*)$ in a simple way. The intuition behind our result is that for any feasible $x$, $d \in \mathbb{R}^n$ and $\alpha > 0$
$$h(x + \alpha d) \approx h(x) + \alpha \nabla h(x)^\top d = \alpha \nabla h(x)^\top d$$
So intuitively, one might expect any direction for which $\nabla h(x)^\top d = 0$ to maintain feasibility. We now formalize this statement.
• Definition: the cone of first-order feasible variations at $x^* \in \mathbb{R}^n$ is the set
$$\mathcal{V}(x^*) = \{ d \in \mathbb{R}^n : \nabla h(x^*)^\top d = 0 \} = \ker \left[ \nabla h(x^*)^\top \right]$$
Note that $d \in \mathcal{V}(x^*) \iff -d \in \mathcal{V}(x^*)$. As such, $\mathcal{V}(x^*)$ is actually a subspace of $\mathbb{R}^n$.
• Definition: A point $x^* \in \mathbb{R}^n$ is a regular point if it is feasible and the constraint gradients $\nabla h_i(x^*)$ are linearly independent. In other words, $\nabla h(x^*)$ has full column rank. If $m > n$, no regular points exist, and if $m = 1$, this reduces to $\nabla h_1(x^*) \ne 0$.
• Lemma (regularity): Let $x^*$ be a regular point. Then $\mathcal{T}(x^*) = \mathcal{V}(x^*)$.
Proof: The full proof is technical. The intuition behind it is:
o Consider the curve we would trace if we were sitting at a point $x^*$ and we started walking forward or backward while staying on the constraint (ie: while keeping the constraint satisfied). We'll start by showing that for any direction $d \in \mathcal{V}(x^*)$, there is such a path that starts by walking forward or backward along the direction $d$.
o Once we've established this, the result is relatively easy, because the path constitutes a "walk" fully contained in our set $\mathcal{X}$ which eventually ends up being in the direction $d$. It's therefore in $\mathcal{T}(x^*)$.
And now the painful details! First, let's find the curve in question:
o Begin by choosing $d \in \mathcal{V}(x^*)$. Given a scalar $t$, consider the curve $x(t) = x^* + t d$. This satisfies our requirement that we be moving either side of $x^*$, and that we start by going in direction $d$. However, there's no guarantee we stay on the constraints.
o Instead, consider the path $x(t) = x^* + t d + \nabla h(x^*) u(t)$ for some unknown vector $u(t) \in \mathbb{R}^m$. This seems sensible: we are correcting our path to reflect how $h$ might change. For $x(t)$ to be "valid", we require it to satisfy the $m$ equations
$$\boxed{\, h\big( x^* + t d + \nabla h(x^*) u(t) \big) = 0 \,}$$
For $t = 0$, $u(0) = 0$ is clearly a solution.

Now, take the gradient of the boxed equation with respect to $u$ and evaluate it at $(t, u) = (0, 0)$. We get
$$\nabla h(x^*)^\top \nabla h(x^*)$$
Since the columns of $\nabla h(x^*)$ are linearly independent, this matrix is invertible. The two results above allow us to use the implicit function theorem to deduce that a solution $u(t)$ to the boxed equation exists for all $t \in (-\tau, \tau)$, for some $\tau > 0$.

Thus, we have managed to find a curve $x(t)$ that keeps us on the constraints and that is defined over $t \in (-\tau, \tau)$ with $x(0) = x^*$ (this implies that the curve represents moving forward and backward from $x^*$).
o All we now need to prove is that the initial direction in which we move is $d$. To do that, differentiate the boxed equation above with respect to $t$ and evaluate at $t = 0$. We get

Or in other words, we require $\nabla f(x^*)$ to be in $\mathcal{V}(x^*)^\perp$:
$$\nabla f(x^*) \in \mathcal{V}(x^*)^\perp = \left[ \ker \nabla h(x^*)^\top \right]^\perp = \operatorname{im} \nabla h(x^*)$$
Or in other words, there exists $\lambda \in \mathbb{R}^m$ such that $\nabla f(x^*) = \nabla h(x^*) \lambda$.
Proof: Since $x^*$ is a local minimum, $\mathcal{T}(x^*) \cap \mathcal{D}(x^*) = \emptyset$, and since $x^*$ is regular, $\mathcal{V}(x^*) \cap \mathcal{D}(x^*) = \emptyset$. Now, assume $d \in \mathcal{V}(x^*)$. By what we have just said, $d \notin \mathcal{D}(x^*)$, and so $\nabla f(x^*) \cdot d \ge 0$. However, since we also have $-d \in \mathcal{V}(x^*)$, we must have $\nabla f(x^*) \cdot d = 0$. For the last part of the theorem, note that $\operatorname{im} A = (\ker A^\top)^\perp$, as proved in the introductory section of these notes.
• The last part of the previous theorem is important, because it provides a "simple" way to characterize the tangent cone, and a "recipe" to find optimal points. This can be formalized further using…

• …Lagrange Multipliers

o Theorem (necessary conditions): If $x^*$ is a local minimum that is a regular point, then there exists a unique vector $\lambda^* \in \mathbb{R}^m$, called a Lagrange multiplier, such that
$$\nabla f(x^*) + \nabla h(x^*) \lambda^* = \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) = 0$$
In addition, if $f$ and $h$ are twice continuously differentiable,
$$d^\top \Big( \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \Big) d \ge 0 \quad \forall d \in \mathcal{V}(x^*)$$

There is an interesting geometrical interpretation of the first-order condition. It effectively states that $\nabla f(x^*)$ [the direction in which we might increase our objective] must be a linear combination of the $\nabla h_i(x^*)$ [the perpendiculars to the constraints $h_i(x^*) = 0$]. Since we cannot move along any of those perpendiculars without leaving the constraints, we clearly cannot move along $\nabla f(x^*)$. Here is an example, in which $\nabla f(x)$ is constant:

Proof: The existence of $\lambda^*$ is simply a restatement of the previous theorem. The uniqueness of $\lambda^*$ follows from the fact that the columns of $\nabla h(x^*)$ are linearly independent. For the second-order condition, consider a $d \in \mathcal{V}(x^*)$, and use the first part of the regularity lemma to define a path $x(t)$ either side of $x^*$, which always stays on the constraints and such that $\dot{x}(0) = d$. Now, define $g(t) = f(x(t))$ and take a double derivative
$$\ddot{g}(t) = \dot{x}(t)^\top \nabla^2 f(x(t)) \, \dot{x}(t) + \ddot{x}(t)^\top \nabla f(x(t))$$
Since all points $x(t)$ satisfy the constraints of the problem, and $x^*$ is a local minimum, $t = 0$ must be an unconstrained local minimum of $g(t)$. Thus
$$\ddot{g}(0) = d^\top \nabla^2 f(x^*) \, d + \ddot{x}(0)^\top \nabla f(x^*) \ge 0$$
Next, consider $\ell(t) = \lambda^{*\top} h(x(t)) = 0$ and differentiate it twice, to get
$$\ddot{\ell}(0) = d^\top \Big( \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \Big) d + \ddot{x}(0)^\top \nabla h(x^*) \lambda^* = 0$$
Finally, add the last two equations, and apply the first order condition.
o We define the Lagrangian as
$$L(x, \lambda) = f(x) + \lambda^\top h(x)$$
The first and second order conditions then reduce to
$$\nabla_x L(x^*, \lambda^*) = 0, \qquad d^\top \nabla^2_{xx} L(x^*, \lambda^*) \, d \ge 0 \quad \forall d \in \mathcal{V}(x^*)$$
And the feasibility condition is given by
$$\nabla_\lambda L(x^*, \lambda^*) = 0$$

^2

{ x^ :^ h x ( )^^ =^0 }  f ( ) x (Darker shading implies larger value of f )

h x ( *)

x^ *
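The first and second order conditions can be worked through on a small hypothetical problem: minimize $f(x) = x_1 + x_2$ subject to $h(x) = x_1^2 + x_2^2 - 2 = 0$. The sketch below is an illustration under that assumption; the two candidate pairs $(x, \lambda)$ were found by hand from $\nabla f + \lambda \nabla h = 0$, and the Hessian of the Lagrangian is $2\lambda I$ for this particular $f$ and $h$.

```python
def grad_f(x):
    return [1.0, 1.0]                      # f(x) = x1 + x2

def h(x):
    return x[0] ** 2 + x[1] ** 2 - 2.0     # single constraint h(x) = 0

def grad_h(x):
    return [2.0 * x[0], 2.0 * x[1]]

def check_candidate(x, lam):
    """Return (stationarity residual, h(x), curvature of L along V(x))."""
    gh = grad_h(x)
    stationarity = [gf + lam * g for gf, g in zip(grad_f(x), gh)]
    # With m = 1 and n = 2, V(x) is spanned by a vector d perpendicular
    # to grad h(x).
    d = [-gh[1], gh[0]]
    # The Hessian of L is 2 * lam * I here, because f is linear and h is
    # a sum of squares, so d^T (Hessian) d = 2 * lam * ||d||^2.
    curvature = 2.0 * lam * (d[0] ** 2 + d[1] ** 2)
    return stationarity, h(x), curvature

at_minus_one = check_candidate([-1.0, -1.0], 0.5)
at_plus_one = check_candidate([1.0, 1.0], -0.5)

print(at_minus_one)
print(at_plus_one)
```

Both candidates are feasible and satisfy the first-order condition, but only $(-1, -1)$ has non-negative curvature along $\mathcal{V}(x^*)$; the point $(1, 1)$ violates the second-order necessary condition and so cannot be a local minimum.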

multipliers under weaker assumptions called constraint qualifications. If the constraints are linear, for example, Lagrange multipliers are guaranteed to exist. The weakest form of constraint qualification is quasiregularity, which requires that $\mathcal{T}(x^*) = \mathcal{V}(x^*)$.
o Theorem (sufficient conditions): Assume that $f$ and $h$ are both twice continuously differentiable, and that $x^* \in \mathbb{R}^n$ and $\lambda^* \in \mathbb{R}^m$ satisfy
$$\nabla_x L(x^*, \lambda^*) = 0, \qquad \nabla_\lambda L(x^*, \lambda^*) = 0, \qquad d^\top \nabla^2_{xx} L(x^*, \lambda^*) \, d > 0 \quad \forall d \in \mathcal{V}(x^*) \setminus \{0\}$$
Then $x^*$ is a strict local minimum.
Proof: The second condition above implies that $x^*$ is feasible. Suppose it is not a strict local minimum; then there exists a sequence $\{x_k\} \subset \mathbb{R}^n$ such that $x_k \ne x^*$ and $x_k \to x^*$ which lies entirely in the feasible region of the problem [ie: $h(x_k) = 0$] and $f(x_k) \le f(x^*)$. We define, for some $d$ (passing to a subsequence if necessary),
$$d_k = \frac{x_k - x^*}{\|x_k - x^*\|} \to d, \qquad \delta_k = \|x_k - x^*\| \to 0$$
Now, by the mean value theorem, there exists $\tilde{x}_k \in [x^*, x_k]$ with
$$h(x_k) - h(x^*) = \nabla h(\tilde{x}_k)^\top (x_k - x^*) = \nabla h(\tilde{x}_k)^\top (\delta_k d_k)$$
But since $x^*$ and $x_k$ are feasible, $h(x_k) = h(x^*) = 0$, so
$$\nabla h(\tilde{x}_k)^\top d_k = 0$$
Taking the limit as $k \to \infty$, we get $\nabla h(x^*)^\top d = 0$, and so $d \in \mathcal{V}(x^*)$. Now, we know that
$$h(x_k) = 0, \qquad f(x_k) - f(x^*) \le 0$$
Using a second order Taylor expansion (with remainder) with some set of $\hat{x}_i \in [x_k, x^*]$, we can re-write these as
$$h_i(x_k) = h_i(x^*) + \delta_k \nabla h_i(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \nabla^2 h_i(\hat{x}_i) \, d_k = 0$$
$$f(x_k) - f(x^*) = \delta_k \nabla f(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \nabla^2 f(\hat{x}_0) \, d_k \le 0$$
We can modify the first set of equations slightly by remembering that $h(x^*) = 0$, and multiplying both sides of the equation by $\lambda_i^*$. This gives
$$\lambda_i^* h_i(x_k) = \delta_k \lambda_i^* \nabla h_i(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \lambda_i^* \nabla^2 h_i(\hat{x}_i) \, d_k = 0$$
Adding these $m + 1$ equations, we get
$$\delta_k \Big( \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) \Big) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \Big( \nabla^2 f(\hat{x}_0) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(\hat{x}_i) \Big) d_k \le 0$$
that is,
$$\delta_k \nabla_x L(x^*, \lambda^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \Big( \nabla^2 f(\hat{x}_0) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(\hat{x}_i) \Big) d_k \le 0$$
Noting that, by the first order conditions, $\nabla_x L(x^*, \lambda^*) = 0$, and then dividing by $\tfrac{1}{2} \delta_k^2$ and taking the limit as $k \to \infty$, this becomes
$$d^\top \Big( \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \Big) d \le 0$$
But since $d \in \mathcal{V}(x^*) \setminus \{0\}$, this violates our assumed second order condition.
o We now consider an application of these conditions. Consider the program
$$\min_{x \in \mathbb{R}^n} \sigma^2 = x^\top G x \quad \text{s.t. } \mathbf{1}^\top x = 1, \ \ \boldsymbol{\mu}^\top x = \mu$$
which might represent minimizing the variance in a portfolio while keeping total sales equal to 1 unit, and keeping the expected return equal to a certain value $\mu$. The first-order conditions give
$$2 G x^* + \lambda_1^* \mathbf{1} + \lambda_2^* \boldsymbol{\mu} = 0, \qquad \mathbf{1}^\top x^* = 1, \qquad \boldsymbol{\mu}^\top x^* = \mu$$
From the first equation, we obtain

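$$\begin{aligned} x^* &= -\tfrac{1}{2} \lambda_1^* G^{-1} \mathbf{1} - \tfrac{1}{2} \lambda_2^* G^{-1} \boldsymbol{\mu} \\ \mathbf{1}^\top x^* &= -\tfrac{1}{2} \lambda_1^* \mathbf{1}^\top G^{-1} \mathbf{1} - \tfrac{1}{2} \lambda_2^* \mathbf{1}^\top G^{-1} \boldsymbol{\mu} = 1 \\ \boldsymbol{\mu}^\top x^* &= -\tfrac{1}{2} \lambda_1^* \boldsymbol{\mu}^\top G^{-1} \mathbf{1} - \tfrac{1}{2} \lambda_2^* \boldsymbol{\mu}^\top G^{-1} \boldsymbol{\mu} = \mu \end{aligned}$$
The last two equations are a system of equations for $(\lambda_1^*, \lambda_2^*)$:
$$-\frac{1}{2} \begin{pmatrix} \mathbf{1}^\top G^{-1} \mathbf{1} & \mathbf{1}^\top G^{-1} \boldsymbol{\mu} \\ \boldsymbol{\mu}^\top G^{-1} \mathbf{1} & \boldsymbol{\mu}^\top G^{-1} \boldsymbol{\mu} \end{pmatrix} \begin{pmatrix} \lambda_1^* \\ \lambda_2^* \end{pmatrix} = \begin{pmatrix} 1 \\ \mu \end{pmatrix}$$
This system is nonsingular provided that $G \succ 0$ and $\mathbf{1}$ and $\boldsymbol{\mu}$ are linearly independent. We then get
$$\begin{pmatrix} \lambda_1^* \\ \lambda_2^* \end{pmatrix} = \begin{pmatrix} \eta_1 + \zeta_1 \mu \\ \eta_2 + \zeta_2 \mu \end{pmatrix}$$
where the constants depend on $G$ and $\boldsymbol{\mu}$. Now, using the first equation in the FOCs, we obtain, for some vectors $v$ and $w$,
$$x^* = \mu v + w$$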
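The derivation above can be traced numerically. The sketch below is a simplified illustration that assumes a diagonal matrix $G$ (so that $G^{-1}$ is trivial to form) with three hypothetical assets; the function name `min_variance_portfolio`, the variances and the expected returns are all made up for this example. It solves the 2x2 system for the multipliers, recovers $x^*$ from the first-order condition, and checks that $x^*$ is affine in the target return.

```python
def min_variance_portfolio(g_diag, mu_vec, target):
    """Solve min x^T G x s.t. 1^T x = 1, mu^T x = target for diagonal G.

    Returns (x, lam1, lam2). Assumes G is positive definite and that the
    vector of ones and mu_vec are linearly independent.
    """
    inv = [1.0 / g for g in g_diag]                    # diagonal of G^{-1}
    a = sum(inv)                                       # 1^T G^{-1} 1
    b = sum(i * m for i, m in zip(inv, mu_vec))        # 1^T G^{-1} mu
    c = sum(i * m * m for i, m in zip(inv, mu_vec))    # mu^T G^{-1} mu
    det = a * c - b * b
    # Solve -1/2 [[a, b], [b, c]] [lam1, lam2]^T = [1, target]^T.
    lam1 = -2.0 * (c - b * target) / det
    lam2 = -2.0 * (a * target - b) / det
    # First-order condition: x = -1/2 G^{-1} (lam1 * 1 + lam2 * mu).
    x = [-0.5 * i * (lam1 + lam2 * m) for i, m in zip(inv, mu_vec)]
    return x, lam1, lam2

g_diag = [1.0, 2.0, 4.0]      # hypothetical variances
mu_vec = [1.0, 2.0, 3.0]      # hypothetical expected returns

x, lam1, lam2 = min_variance_portfolio(g_diag, mu_vec, 2.0)
budget = sum(x)
expected_return = sum(m * xi for m, xi in zip(mu_vec, x))

# x* = mu v + w is affine in the target, so the solutions at equally
# spaced targets satisfy x(1.5) + x(2.5) - 2 x(2.0) = 0.
x_lo, _, _ = min_variance_portfolio(g_diag, mu_vec, 1.5)
x_hi, _, _ = min_variance_portfolio(g_diag, mu_vec, 2.5)
affine_gap = max(abs(l + h - 2.0 * m) for l, h, m in zip(x_lo, x_hi, x))

print(x, budget, expected_return, affine_gap)
```

For a full (non-diagonal) $G$ the same steps apply, with the products involving $G^{-1}$ computed by solving linear systems instead of dividing by the diagonal.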