
Foundations of Optimization - Lecture Notes, Lecture notes of Mathematics

Mathematics

Columbia Business School - First Year of the Doctoral Program in Decisions, Risk and Operations • Condensed notes roughly following two courses I took: "Foundations of Optimization" (taught by Prof. Ciamac Moallemi) and "Convex Optimization" (taught by Prof. Garud Iyengar). These notes are also heavily based on Boyd and Vandenberghe's book "Convex Optimization" (available online) and Luenberger's "Optimization by Vector Space Methods". The chapter numbers in these notes refer to Boyd and Vandenberghe's text. Rough list of topics covered: convexity of sets and functions, formulation of convex programs (from linear programs to semi-definite programs), duality, applications, Hilbert and Banach spaces, minimum-norm problems in Banach spaces, the Hahn-Banach Theorem.

Typology: Lecture notes

Pre 2010

Available from 05/06/2023


Daniel Guetta

FOUNDATIONS OF OPTIMIZATION

Basics


• Optimization problems

o An optimization problem is

$$\text{minimize } f(x) \quad \text{subject to } x \in \mathcal{X},$$

where $f$ is the (real-valued) objective and $\mathcal{X}$ is the constraint set (also called the feasible set or search space).

o $x^*$ is an optimal solution (global minimizer) if and only if $x^* \in \mathcal{X}$ and $f(x^*) \le f(x) \ \ \forall x \in \mathcal{X}$.

o Maximizing $f(x)$ is equivalent to minimizing $-f(x)$.

o We consider problems in the following form

$$\begin{aligned} \text{minimize } \ & f(x) \\ \text{subject to } \ & h_i(x) = 0 \quad \forall\, 1 \le i \le m \\ & g_j(x) \le 0 \quad \forall\, 1 \le j \le r \\ & x \in \mathbb{R}^n \end{aligned}$$

o We consider the following subclasses of the problem

▪ In linear programming, all functions are linear.

▪ In convex programming, $f$ and the $g_j$ are convex, and the $h_i$ are linear.

o If $\mathcal{X}$ is the feasible set of a problem, a point $x \in \mathcal{X}$ is a local minimum if there exists a neighborhood $N_r(x)$ such that $f(x) \le f(y) \ \ \forall y \in \mathcal{X} \cap N_r(x)$. It is an unconstrained local minimum if $f(x) \le f(y) \ \ \forall y \in N_r(x)$. (Strict equivalents exist.)

• Topology

o An open ball around a point $x \in \mathbb{R}^n$ with radius $r > 0$ is the set $N_r(x) = \{ y \in \mathbb{R}^n : \|x - y\| < r \}$, where $\|x\| = \sqrt{\sum_i x_i^2}$.

o A point $x \in \mathcal{X} \subset \mathbb{R}^n$ is an interior point if there exists an open ball such that $N_r(x) \subset \mathcal{X}$. A set $\mathcal{X} \subset \mathbb{R}^n$ is open if $\mathcal{X} = \operatorname{int} \mathcal{X}$.

o A point $x \in \mathbb{R}^n$ is a closure point of $\mathcal{X} \subset \mathbb{R}^n$ if, for every open ball $N_r(x)$, there exists $y \in \mathcal{X}$ with $y \in N_r(x)$. A set $\mathcal{X} \subset \mathbb{R}^n$ is closed if $\mathcal{X} = \operatorname{cl} \mathcal{X}$.

o The set of reals is both closed and open.

o Theorems:

▪ The union of open sets is open. The intersection of a finite number of open sets is open.

▪ The intersection of closed sets is closed. The union of a finite number of closed sets is closed.

• Analysis

o A sequence of vectors $\{x_k\} \subset \mathbb{R}^n$ converges to a limit $x \in \mathbb{R}^n$ if $\lim_{k \to \infty} \|x - x_k\| = 0$, and we say that $x_k \to x$.

o A set $\mathcal{X} \subset \mathbb{R}^n$ is (sequentially) compact if, given a sequence $\{x_k\} \subset \mathcal{X}$, there is a subsequence $\{x_{k_i}\}$ converging to an element $x \in \mathcal{X}$.

▪ Theorem (Heine-Borel): A set $\mathcal{X} \subset \mathbb{R}^n$ is compact if and only if it is closed and bounded.

▪ Theorem: A closed subset of a compact set is compact.

▪ Theorem: Suppose $\{\mathcal{X}_n\}$ is a sequence of non-empty, compact sets that are nested (i.e. $\mathcal{X}_{n+1} \subset \mathcal{X}_n$). Then their intersection is non-empty.

o A real-valued function $f$ defined on a domain $\mathcal{X} \subset \mathbb{R}^n$ is continuous at the point $x \in \mathcal{X}$ if, for every sequence $\{x_k\} \subset \mathcal{X}$ with $x_k \to x$, $\lim_{k \to \infty} f(x_k) = f(x)$. $f$ is continuous if it is continuous at all points in $\mathcal{X}$.

o A function $f$ is coercive over a set $\mathcal{X} \subset \mathbb{R}^n$ if, for every sequence $\{x_k\} \subset \mathcal{X}$ with $\|x_k\| \to \infty$, we have $\lim_{k \to \infty} f(x_k) = \infty$.

o The inverse image of the set $\mathcal{A} \subset \mathbb{R}$ is defined by $f^{-1}(\mathcal{A}) = \{ x \in \mathcal{X} : f(x) \in \mathcal{A} \}$.

▪ Theorem: If $f$ is continuous, $\mathcal{X}$ is open [closed] and $\mathcal{A}$ is open [closed], then $f^{-1}(\mathcal{A})$ is also open [closed]. This is the standard way to prove that a set is open/closed.

o The gradient of $f$ at a point $x \in \operatorname{int} \mathcal{X}$ is

$$\nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^\top \in \mathbb{R}^n, \qquad \frac{\partial f(x)}{\partial x_i} = \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h},$$

where $e_i$ is the $i$-th unit vector. $f$ is differentiable over an open set $\mathcal{U} \subset \mathcal{X}$ if it is differentiable at every point in the set. If, in addition, the components of the gradient are continuous over $\mathcal{U}$, then $f$ is continuously differentiable over $\mathcal{U}$.

o If, for a point $x \in \operatorname{int} \mathcal{X}$, each component of the gradient is differentiable, we say $f$ is twice differentiable at $x$, and we define the Hessian matrix $\nabla^2 f(x) \in \mathbb{R}^{n \times n}$ by

$$\left[ \nabla^2 f(x) \right]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j}.$$

If $f$ is twice continuously differentiable in a neighborhood of $x$, then the Hessian is symmetric.

o Suppose that $f$ is twice continuously differentiable over a neighborhood $N_r(x)$. Then for all $d \in N_r(0)$

$$f(x + d) = f(x) + \nabla f(x)^\top d + \tfrac{1}{2} d^\top \nabla^2 f(x) \, d + o\!\left( \|d\|^2 \right).$$

(Formally, this means that for every $C > 0$, there exists a neighborhood around $0$ such that the estimate of $f(x + d)$ differs from the real value by no more than $C \|d\|^2$.)

o Consider a vector-valued function $F : \mathcal{X} \to \mathbb{R}^m$, $\mathcal{X} \subset \mathbb{R}^n$, and a point $x \in \operatorname{int} \mathcal{X}$. We define the gradient to be the matrix $\nabla F(x) \in \mathbb{R}^{n \times m}$ with

$$\nabla F(x) = \left[ \nabla F_1(x), \ldots, \nabla F_m(x) \right], \qquad \left[ \nabla F(x) \right]_{ij} = \frac{\partial F_j(x)}{\partial x_i}.$$

o The chain rule states that for interior points, if $h(x) = g(f(x))$, then $\nabla h(x) = \nabla f(x) \, \nabla g(f(x))$.
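The gradient definition and the second-order expansion above can be checked numerically. The following is a minimal sketch, assuming a hypothetical test function `f` with hand-derived `grad_f` and `hess_f` (none of these appear in the notes); it uses a central difference rather than the one-sided limit in the definition, purely for better accuracy.

```python
import numpy as np

def f(x):
    # Hypothetical smooth test function, chosen only for illustration.
    return x[0] ** 2 + 3.0 * x[0] * x[1] + np.sin(x[1])

def grad_f(x):
    # Analytic gradient of f above.
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + np.cos(x[1])])

def hess_f(x):
    # Analytic Hessian of f above (symmetric, as expected for a C^2 function).
    return np.array([[2.0, 3.0], [3.0, -np.sin(x[1])]])

def fd_gradient(func, x, h=1e-6):
    # Central-difference approximation of each partial derivative.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (func(x + h * e) - func(x - h * e)) / (2.0 * h)
    return g

x0 = np.array([0.7, -0.4])
d = np.array([1e-3, -2e-3])

# Error of the finite-difference gradient against the analytic one.
grad_err = np.max(np.abs(fd_gradient(f, x0) - grad_f(x0)))

# Error of the second-order Taylor estimate of f(x0 + d).
taylor = f(x0) + grad_f(x0) @ d + 0.5 * d @ hess_f(x0) @ d
taylor_err = abs(f(x0 + d) - taylor)
```

Shrinking `d` by a factor of 10 should shrink `taylor_err` by roughly a factor of 1000 for this function, which is consistent with the $o(\|d\|^2)$ remainder.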

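• Linear algebra – Kernels and Images

o Consider a matrix $A \in \mathbb{R}^{m \times n}$. Then

▪ $\ker A = \{ x \in \mathbb{R}^n : A x = 0 \}$

▪ $\operatorname{im} A = \{ y \in \mathbb{R}^m : y = A x, \ x \in \mathbb{R}^n \}$

o Given a set $\mathcal{S} \subset \mathbb{R}^n$, $\mathcal{S}^\perp = \{ x \in \mathbb{R}^n : x \cdot y = 0 \ \ \forall y \in \mathcal{S} \}$.

o Lemma: $\operatorname{im} A = \left[ \ker (A^\top) \right]^\perp$. In other words, given $z \in \mathbb{R}^m$,

$$z = A x \text{ for some } x \in \mathbb{R}^n \iff z \cdot y = 0 \ \ \forall y \text{ with } A^\top y = 0.$$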
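The lemma can be illustrated numerically. This is a sketch under stated assumptions: the matrix `A` is a random example, and the basis `K` of $\ker(A^\top)$ is taken from the singular value decomposition, which is one convenient choice rather than anything prescribed by the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))      # m = 5, n = 3 (illustrative sizes)

# Left singular vectors beyond the rank of A form an orthonormal
# basis of ker(A^T), i.e. of the orthogonal complement of im A.
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
K = U[:, rank:]                      # columns span ker(A^T)

z_in = A @ rng.standard_normal(3)    # a vector that lies in im A
z_out = z_in + K[:, 0]               # pushed out of im A along a kernel direction

# z is in im A exactly when it is orthogonal to every column of K.
max_dot_in = np.max(np.abs(K.T @ z_in))
max_dot_out = np.max(np.abs(K.T @ z_out))
```

Here `max_dot_in` is zero up to rounding, while `max_dot_out` is close to 1 because `K[:, 0]` is a unit vector.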

• Sets, etc…

o Affine sets

▪ Definition: A set $\mathcal{S} \subset \mathbb{R}^n$ is affine if, for all points $x_1, x_2 \in \mathcal{S}$ and a scalar $\lambda \in \mathbb{R}$, $\lambda x_1 + (1 - \lambda) x_2 \in \mathcal{S}$.

▪ Example: The empty set, a line and any subspace are affine. Similarly, $\{ x \in \mathbb{R}^n : A x = b \}$ is affine.

▪ Definition: Given a set of points $\mathcal{S} \subset \mathbb{R}^n$, the affine hull $\operatorname{aff} \mathcal{S}$ is the set of points $\lambda_1 x_1 + \cdots + \lambda_k x_k$, where $k \ge 1$, $\{x_i\} \subset \mathcal{S}$ and $\lambda_1 + \cdots + \lambda_k = 1$. The affine hull is affine and is the smallest affine set containing $\mathcal{S}$.

o Convex sets

▪ Definition: The set $\mathcal{S}$ is convex if, for all points $x_1, x_2 \in \mathcal{S}$ and a scalar $\lambda \in (0, 1)$, $\lambda x_1 + (1 - \lambda) x_2 \in \mathcal{S}$. Clearly, affine sets are also convex.

▪ Definition: Given a set of points $\mathcal{S} \subset \mathbb{R}^n$, the convex hull $\operatorname{conv} \mathcal{S}$ is the set of points $\lambda_1 x_1 + \cdots + \lambda_k x_k$, where $k \ge 1$, $\{x_i\} \subset \mathcal{S}$, $\lambda_i \ge 0$ and $\lambda_1 + \cdots + \lambda_k = 1$.

▪ Theorem (scalar multiplication): If $\mathcal{S} \subset \mathbb{R}^n$ is convex and $\alpha \in \mathbb{R}$, then $\alpha \mathcal{S} = \{ \alpha x : x \in \mathcal{S} \}$ is convex.

▪ Theorem (vector sum): If $\mathcal{S}, \mathcal{T} \subset \mathbb{R}^n$ are convex sets, then the set $\mathcal{S} + \mathcal{T} = \{ x + y : x \in \mathcal{S}, \ y \in \mathcal{T} \}$ is also convex.

▪ Theorem (affine transformations): If $\mathcal{S} \subset \mathbb{R}^n$ is a convex set, $A \in \mathbb{R}^{m \times n}$ is a matrix and $b \in \mathbb{R}^m$ is a vector, then the set $\{ A x + b : x \in \mathcal{S} \}$ is a convex subset of $\mathbb{R}^m$.

▪ Theorem: If $\{\mathcal{S}_i\}_{i \in \mathcal{I}}$ is an arbitrary collection of convex sets, then the intersection $\bigcap_{i \in \mathcal{I}} \mathcal{S}_i$ is also convex.

o Miscellaneous definitions

▪ The L2-norm (Euclidean norm): $\|x\|_2 = \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x \cdot x}$

▪ $\|x\|_\Gamma = \sqrt{x^\top \Gamma x}$ (when $\Gamma \succ 0$ and symmetric)

▪ The $p$-norm: $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$ for $p \ge 1$

▪ $\|x\|_\infty = \max \{ |x_1|, \ldots, |x_n| \}$

Given a norm, the (closed) ball with centre $x_0$ and radius $r > 0$ is $\{ x \in \mathbb{R}^n : \|x - x_0\| \le r \}$, and it is convex.

o Theorem (projection): Let $\mathcal{X} \subset \mathbb{R}^n$ be a closed and non-empty convex set, and consider the Euclidean norm. Fix the vector $x \in \mathbb{R}^n$. Consider the problem

$$\min \ \|z - x\| \quad \text{s.t. } z \in \mathcal{X} \subset \mathbb{R}^n.$$

▪ For every $x \in \mathbb{R}^n$, the problem has a unique global minimum $x^*$ called the projection of $x$ onto $\mathcal{X}$.

▪ A vector $x' \in \mathcal{X}$ is equal to $x^*$ if and only if

$$(x - x') \cdot (z - x') \le 0 \quad \forall z \in \mathcal{X}.$$

Geometrically, the angle between $x' \to x$ and $x' \to z$ must be at least $90^\circ$ for all points $z$ in the set.

[Figure: the set $\mathcal{X}$, a point $x$ outside it, its projection $x'$, and a point $z$ in the set.]

Proof: Existence follows from the fact that $\|z - x\|$ is coercive and $\mathcal{X}$ is closed. Uniqueness follows because minimizing $\|z - x\|$ is equivalent to minimizing $\|z - x\|^2 = z \cdot z - 2 z \cdot x + x \cdot x$, which is strictly convex.

Now, writing $f(z) = \|z - x\|^2$, consider that $\nabla f(x^*) = 2 (x^* - x)$. By necessary and sufficient conditions for convex optimization problems (derived later), the condition in the theorem must hold.
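The characterisation of the projection can be checked on a set where the projection is known in closed form. The sketch below assumes the box $[-1, 1]^3$ as the closed convex set (an illustrative choice, not one made in the notes), for which the Euclidean projection is a componentwise clip.

```python
import numpy as np

def project_box(x, lo, hi):
    # Euclidean projection onto the box {z : lo <= z_i <= hi}.
    return np.clip(x, lo, hi)

rng = np.random.default_rng(1)
lo, hi = -1.0, 1.0
x = np.array([2.0, -0.3, -1.7])          # point to project (outside the box)
x_star = project_box(x, lo, hi)

# Sample points z in the set and evaluate (x - x*) . (z - x*) for each.
Z = rng.uniform(lo, hi, size=(1000, 3))
inner = (Z - x_star) @ (x - x_star)
worst = inner.max()                      # should never be positive

# No sampled point of the set is closer to x than the projection.
dists = np.linalg.norm(Z - x, axis=1)
proj_dist = np.linalg.norm(x_star - x)
```

A random sample cannot prove the inequality for all $z$, but a positive `worst` would disprove it, so this is a cheap sanity check for a hand-written projection routine.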

Application: Suppose we want to approximate $f(x)$ over a set of points $\{x_1, \ldots, x_m\}$ using $g(x) = \sum_{\ell=1}^k r_\ell \, \phi_\ell(x)$, where the $\phi_\ell$ are basis functions and $r$ is a vector of weights. One way to do this is to solve the problem

$$\min \ \sum_{i=1}^m \left[ f(x_i) - g(x_i) \right]^2 \quad \text{s.t. } g(\cdot) \text{ is a linear combination of } \{ \phi_\ell(\cdot) \}.$$

Consider the matrix $\Phi$ with $\Phi_{i,\ell} = \phi_\ell(x_i)$ and the vector $y$ with $y_i = f(x_i)$. This problem is equivalent to

$$\min \ \|y - z\| \quad \text{s.t. } z \in \{ \Phi r : r \in \mathbb{R}^k \}.$$

This is a projection problem onto a subspace (a closed, non-empty convex set), and so a unique optimizer $z$ exists.
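As a concrete instance of this approximation problem, the sketch below fits a quadratic to samples of a function. The target `np.exp`, the monomial basis and the sample grid are all assumptions made for illustration; `np.linalg.lstsq` is used as one standard way to compute the projection.

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 20)              # sample points x_1, ..., x_m
y = np.exp(xs)                              # y_i = f(x_i) for a hypothetical f
F = np.vander(xs, N=3, increasing=True)     # columns 1, x, x^2: F[i, l] = phi_l(x_i)

# Weights r minimising ||y - F r||, i.e. projecting y onto im F.
r, *_ = np.linalg.lstsq(F, y, rcond=None)
z_star = F @ r                              # the projection of y
residual = y - z_star

# For a subspace the projection condition holds with equality:
# the residual is orthogonal to every column of F.
orth = np.max(np.abs(F.T @ residual))
```

Because the feasible set is a subspace, both $z$ and $2 z^* - z$ are feasible, so the inequality in the projection theorem tightens to orthogonality of the residual, which is exactly the normal equations.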

Existence of solutions

• Theorem – Sufficient Conditions (Weierstrass): Consider the problem $\min f(x)$ s.t. $x \in \mathcal{X} \subset \mathbb{R}^n$. Then if

o $\mathcal{X}$ is non-empty

o $f$ is lower semicontinuous over $\mathcal{X}$

and one of the following conditions holds:

1. $\mathcal{X}$ is compact
2. $\mathcal{X}$ is closed, and $f$ is coercive
3. There exists a scalar $\gamma$ such that the level set $\mathcal{X}(\gamma) = \{ x \in \mathcal{X} : f(x) \le \gamma \}$ is non-empty and compact

then the set of optimal minimizing solutions of $f$ is non-empty and compact.

Proof:

o 1 ⇒ 3: Define $f^* = \inf_{x \in \mathcal{X}} f(x) \in \mathbb{R} \cup \{-\infty\}$ (this always exists). Then, given $\gamma > f^*$, the level set $\{ x \in \mathcal{X} : f(x) \le \gamma \}$ must be non-empty. By the (lower semi-)continuity of $f$, it is also closed. Thus, since $\mathcal{X}$ is compact, so is this set.

o 2 ⇒ 3: Define $\mathcal{X}(\gamma) = \{ x \in \mathcal{X} : f(x) \le \gamma \}$. Since $f$ is coercive, $\mathcal{X}(\gamma)$ is non-empty and bounded for any $\gamma > f^*$. Furthermore, since the domain of $f$ (i.e. $\mathcal{X}$) is …

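… for a fixed direction $d \in \mathbb{R}^n$,

$$\lim_{\alpha \downarrow 0} \frac{f(x^* + \alpha d) - f(x^*)}{\alpha} = \nabla f(x^*) \cdot d.$$

If $x^*$ is an unconstrained local minimum, the quotient on the LHS must be non-negative for small enough $\alpha > 0$.

Thus, $\nabla f(x^*) \cdot d \ge 0$. Since $d$ is arbitrary (apply the same argument to $-d$), we must have $\nabla f(x^*) = 0$.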

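▪ Second order: fix $d \in \mathbb{R}^n$. For sufficiently small $\alpha$:

$$f(x^* + \alpha d) - f(x^*) = \alpha \nabla f(x^*)^\top d + \tfrac{1}{2} \alpha^2 d^\top \nabla^2 f(x^*) \, d + o(\alpha^2) = \tfrac{1}{2} \alpha^2 d^\top \nabla^2 f(x^*) \, d + o(\alpha^2),$$

where the second equality uses the first-order condition $\nabla f(x^*) = 0$.

If $x^*$ is an unconstrained local minimum, the LHS must be non-negative for small enough $\alpha$, and so

$$\tfrac{1}{2} d^\top \nabla^2 f(x^*) \, d + \frac{o(\alpha^2)}{\alpha^2} \ge 0.$$

Taking limits as $\alpha \to 0$:

$$d^\top \nabla^2 f(x^*) \, d \ge 0.$$

Since $d$ is arbitrary, this leads to our result.

o Theorem – sufficient conditions: Consider a point $x^* \in \operatorname{int} \mathcal{X}$. If $f$ is twice continuously differentiable in a neighborhood of $x^*$, and

$$\nabla f(x^*) = 0, \qquad \nabla^2 f(x^*) \succ 0,$$

then $x^*$ is a strict unconstrained local minimum. The geometric interpretation is as above; the only difference is that we now require a positive definite instead of a positive semidefinite matrix.

Proof: Let $\lambda > 0$ be the smallest eigenvalue of $\nabla^2 f(x^*)$, and let $d \in N_r(0) \setminus \{0\}$. Then

$$\begin{aligned} f(x^* + d) - f(x^*) &= \nabla f(x^*)^\top d + \tfrac{1}{2} d^\top \nabla^2 f(x^*) \, d + o\!\left( \|d\|^2 \right) \\ &\ge \tfrac{\lambda}{2} \|d\|^2 + o\!\left( \|d\|^2 \right) \\ &= \left( \frac{\lambda}{2} + \frac{o(\|d\|^2)}{\|d\|^2} \right) \|d\|^2. \end{aligned}$$

Now, for any $\gamma \in (0, \lambda)$, there exists $\varepsilon \in (0, r]$ such that

$$\frac{\lambda}{2} + \frac{o(\|d\|^2)}{\|d\|^2} \ge \frac{\gamma}{2} \quad \forall d \text{ with } \|d\| < \varepsilon.$$

And this means that

$$f(x^* + d) \ge f(x^*) + \frac{\gamma}{2} \|d\|^2 > f(x^*).$$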

• Using the necessary conditions

o Verify there is a global minimum (using the existence theorem).

o Find the set of possible unconstrained local minima using $\nabla f(x) = 0$.

o Compare these points with all points on the boundary $\mathcal{X} \setminus \operatorname{int} \mathcal{X}$.

o Example: Consider $\min_{x \in \mathbb{R}^n} \tfrac{1}{2} x^\top \Gamma x - b^\top x$ with $\Gamma \succ 0$. By an earlier theorem, global minima must exist. Furthermore, $\mathcal{X} \setminus \operatorname{int} \mathcal{X}$ is empty, and so the global minimum must be an unconstrained local minimum. The first-order necessary conditions immediately allow us to characterize that point as $\Gamma x^* - b = 0$.
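The quadratic example lends itself to a direct numerical check. In this sketch the matrix `G` (standing in for $\Gamma$) and the vector `b` are randomly generated assumptions; the candidate is obtained from the first-order condition and then compared against perturbed points.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
G = M @ M.T + 4.0 * np.eye(4)        # symmetric positive definite by construction
b = rng.standard_normal(4)

def q(x):
    # Objective of the example: 0.5 x^T G x - b^T x.
    return 0.5 * x @ G @ x - b @ x

# First-order necessary condition: G x* - b = 0.
x_star = np.linalg.solve(G, b)
grad_norm = np.linalg.norm(G @ x_star - b)

# Second-order sufficient condition: the Hessian G is positive definite.
min_eig = np.linalg.eigvalsh(G).min()

# Compare against randomly perturbed points; every gap should be positive.
trial = x_star + rng.standard_normal((200, 4))
gap = min(q(t) - q(x_star) for t in trial)
```

For this objective the gap equals $\tfrac{1}{2} d^\top \Gamma d$ exactly, so it is bounded below by half the smallest eigenvalue times $\|d\|^2$.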

• Sensitivity analysis

o Consider the problem $\min f(x, a)$ s.t. $x \in \mathbb{R}^n$. We let $x^*(a)$ be a local optimum, and $f^*(a) = f(x^*(a), a)$. The first-order conditions are

$$\nabla_x f(x^*(a), a) = 0.$$

Taking the derivative with respect to $a$, we obtain

$$\nabla x^*(a) \, \nabla^2_{xx} f(x^*(a), a) + \nabla^2_{xa} f(x^*(a), a) = 0.$$

From this expression, we can obtain expressions for the sensitivity of the optimum, and of the optimal value:

$$\nabla x^*(a) = -\nabla^2_{xa} f(x^*(a), a) \left\{ \nabla^2_{xx} f(x^*(a), a) \right\}^{-1}$$

$$\nabla f^*(a) = \nabla_a \left[ f(x^*(a), a) \right] = \nabla x^*(a) \, \nabla_x f(x^*(a), a) + \nabla_a f(x^*(a), a) = \nabla_a f(x^*(a), a),$$

where the last step uses the first-order conditions.

o The implicit function theorem tells us when $x^*(a)$ exists and is differentiable.
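Both sensitivity formulas can be verified on a small parametric problem. The sketch below assumes a hypothetical objective $f(x, a) = \tfrac{1}{2} x^\top G x - a \, c^\top x + a^2$ with a scalar parameter $a$; the matrix `G` and vector `c` are made up for illustration.

```python
import numpy as np

G = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive definite (illustrative)
c = np.array([1.0, -1.0])

def f(x, a):
    return 0.5 * x @ G @ x - a * (c @ x) + a ** 2

def x_star(a):
    # Solves the first-order condition grad_x f(x, a) = G x - a c = 0.
    return np.linalg.solve(G, a * c)

def f_star(a):
    return f(x_star(a), a)

a0, h = 0.8, 1e-5

# Sensitivity of the optimiser: here the mixed derivative is -c and the
# Hessian in x is G, so the formula gives G^{-1} c.
dx_formula = np.linalg.solve(G, c)
dx_fd = (x_star(a0 + h) - x_star(a0 - h)) / (2 * h)

# Sensitivity of the optimal value: the partial derivative of f with
# respect to a, evaluated at x*(a), with no contribution from dx*/da.
df_formula = -(c @ x_star(a0)) + 2 * a0
df_fd = (f_star(a0 + h) - f_star(a0 - h)) / (2 * h)
```

The second comparison is the envelope-type statement in the derivation: the indirect effect through $x^*(a)$ vanishes because $\nabla_x f = 0$ at the optimum.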

Constrained local optimality

• Consider the problem $\min f(x)$ s.t. $x \in \mathcal{X} \subset \mathbb{R}^n$. We are interested in characterizing local minima that are not in $\operatorname{int} \mathcal{X}$. We will assume, though, that $f$ is continuously differentiable in a neighborhood of the point considered.

…

$$x_k - x^* = \frac{\|x_k - x^*\|}{\|d\|} \left( d + \|d\| \, \zeta_k \right) = \frac{\|x_k - x^*\|}{\|d\|} \, d_k, \qquad d_k = d + \|d\| \, \zeta_k.$$

And so we can re-write the above as

$$f(x_k) = f(x^*) + \frac{\|x_k - x^*\|}{\|d\|} \, \nabla f(\tilde{x}_k) \cdot d_k.$$

Now, if $d \in \mathcal{D}(x^*)$ as well, then $\nabla f(x^*) \cdot d < 0$. The strict inequality implies that this is also true in a neighborhood of $x^*$, and so for $k$ large enough, we get $f(x_k) < f(x^*)$. This contradicts the local minimality of $x^*$.

• Unfortunately, the tangent cone $\mathcal{T}(x^*)$ is hard to characterize algebraically, unless we focus on the particular example where $\mathcal{X}$ is defined by equality constraints.

Equality constrained optimization

• Consider the problem $\min f(x)$ s.t. $h(x) = 0$, $x \in \mathbb{R}^n$, where $h : \mathbb{R}^n \to \mathbb{R}^m$. We assume that $f$ and the $h_i$ are continuously differentiable in a neighborhood of the local minimum.

• In this particular case, we will show we can characterize the tangent cone $\mathcal{T}(x^*)$ in a simple way. The intuition behind our result is that for any feasible $x$, $d \in \mathbb{R}^n$ and $\alpha > 0$

$$h(x + \alpha d) \approx h(x) + \alpha \nabla h(x)^\top d = \alpha \nabla h(x)^\top d.$$

So intuitively, one might expect any direction for which $\nabla h(x)^\top d = 0$ to maintain feasibility. We now formalize this statement…

• Definition: the cone of first-order feasible variations at $x^* \in \mathbb{R}^n$ is the set

$$\mathcal{V}(x^*) = \left\{ d \in \mathbb{R}^n : \nabla h(x^*)^\top d = 0 \right\} = \ker \nabla h(x^*)^\top.$$

Note that $d \in \mathcal{V}(x^*) \iff -d \in \mathcal{V}(x^*)$. As such, $\mathcal{V}(x^*)$ is actually a subspace of $\mathbb{R}^n$.

• Definition: A point $x^* \in \mathbb{R}^n$ is a regular point if it is feasible and the constraint gradients $\nabla h_i(x^*)$ are linearly independent. In other words, $\nabla h(x^*) \in \mathbb{R}^{n \times m}$ has full column rank. If $m > n$, no regular points exist, and if $m = 1$, this reduces to $\nabla h_1(x^*) \ne 0$.

• Lemma (regularity): Let $x^*$ be a regular point. Then $\mathcal{T}(x^*) = \mathcal{V}(x^*)$.

Proof: This theorem is hard. The intuition behind the proof is

o Consider the curve we would trace if we were sitting at a point $x^*$ and we started walking forward or backwards while staying on the constraint (i.e. while keeping the constraint satisfied). We'll start by showing that for any direction $d \in \mathcal{V}(x^*)$, there is such a path that starts by walking forward or backward along the direction $d$.

o Once we've established this, the result is relatively easy, because the path constitutes a "walk" fully contained in our feasible set which leaves $x^*$ in the direction $d$. The direction $d$ is therefore in $\mathcal{T}(x^*)$.

And now the painful details! First, let's find the curve in question:

o Begin by choosing $d \in \mathcal{V}(x^*)$. Given a scalar $t$, consider the curve $x(t) = x^* + t d$. This satisfies our requirement that we be moving either side of $x^*$, and that we start by going in direction $d$. However, there's no guarantee we stay on the constraints.

o Instead, consider the path $x(t) = x^* + t d + \nabla h(x^*) \, u(t)$ for some unknown vector $u(t) \in \mathbb{R}^m$. This seems sensible: we are correcting our path to reflect how $h$ might change. For $x(t)$ to be "valid", we require it to satisfy the $m$ equations

$$\boxed{ h\big( x^* + t d + \nabla h(x^*) \, u(t) \big) = 0 }$$

For $t = 0$, $u(0) = 0$ is clearly a solution.

Now, take the gradient of the boxed equation with respect to $u$ and evaluate it at $(t, u) = (0, 0)$. We get

$$\nabla h(x^*)^\top \nabla h(x^*).$$

Since the columns of $\nabla h(x^*)$ are linearly independent, this matrix is invertible. The two results above allow us to use the implicit function theorem to deduce that a solution $u(t)$ to the boxed equation exists for all $t \in (-\tau, \tau)$, for some $\tau > 0$.

Thus, we have managed to find a curve $x(t)$ that keeps us on the constraints and that is defined over $t \in (-\tau, \tau)$ with $x(0) = x^*$ (this implies that the curve represents moving forward and backward from $x^*$).

o All we now need to prove is that the initial direction in which we move is $d$. To do that, differentiate the boxed equation above with respect to $t$ and evaluate at $t = 0$. We get …

Or in other words, we require $\nabla f(x^*)$ to be in $\mathcal{V}(x^*)^\perp$:

$$\nabla f(x^*) \in \mathcal{V}(x^*)^\perp = \left[ \ker \nabla h(x^*)^\top \right]^\perp = \operatorname{im} \nabla h(x^*).$$

Or in other words, there exists $\lambda \in \mathbb{R}^m$ such that $\nabla f(x^*) = \nabla h(x^*) \, \lambda$.

Proof: Since $x^*$ is a local minimum, $\mathcal{T}(x^*) \cap \mathcal{D}(x^*) = \emptyset$, and since $x^*$ is regular, $\mathcal{V}(x^*) \cap \mathcal{D}(x^*) = \emptyset$. Now, assume $d \in \mathcal{V}(x^*)$. By what we have just said, $d \notin \mathcal{D}(x^*)$, and so $\nabla f(x^*) \cdot d \ge 0$. However, since we also have $-d \in \mathcal{V}(x^*)$, we must have $\nabla f(x^*) \cdot d = 0$. For the last part of the theorem, note that $\operatorname{im} A = (\ker A^\top)^\perp$, as proved in the introductory section of these notes.

• The last part of the previous theorem is important, because it provides a "simple" way to characterize the tangent cone, and a "recipe" to find optimal points. This can be formalized further using…

• …Lagrange Multipliers

o Theorem – necessary conditions: If $x^*$ is a local minimum that is a regular point, then there exists a unique vector $\lambda^* \in \mathbb{R}^m$, called a Lagrange multiplier, such that

$$\nabla f(x^*) + \nabla h(x^*) \, \lambda^* = \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) = 0.$$

In addition, if $f$ and $h$ are twice continuously differentiable,

$$d^\top \left( \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \right) d \ge 0 \quad \forall d \in \mathcal{V}(x^*),$$

where $\mathcal{V}(x^*) = \{ d : \nabla h(x^*)^\top d = 0 \}$.

There is an interesting geometrical interpretation of the first-order condition. It effectively states that $\nabla f(x^*)$ [the direction in which we might increase our objective] must be a linear combination of the $\nabla h_i(x^*)$ [the perpendiculars to the constraints $h_i(x^*) = 0$]. Since we cannot move along any of those perpendiculars without leaving the constraints, we clearly cannot move along $\nabla f(x^*)$. Here is an example, in which $\nabla f(x)$ is constant:

[Figure: the constraint curve $\{ x : h(x) = 0 \}$ in $\mathbb{R}^2$, with $\nabla h(x^*)$ and the constant $\nabla f(x)$ drawn at $x^*$. Darker shading implies a larger value of $f$.]

Proof: The existence of $\lambda^*$ is simply a restatement of the previous theorem. The uniqueness of $\lambda^*$ follows from the fact that the columns of $\nabla h(x^*)$ are linearly independent. For the second-order condition, consider a $d \in \mathcal{V}(x^*)$, and use the first part of the regularity lemma to define a path $x(t)$ either side of $x^*$, which always stays on the constraints and such that $\dot{x}(0) = d$. Now, define $g(t) = f(x(t))$ and take a double derivative

$$\ddot{g}(t) = \dot{x}(t)^\top \nabla^2 f(x(t)) \, \dot{x}(t) + \ddot{x}(t)^\top \nabla f(x(t)).$$

Since all points $x(t)$ satisfy the constraints of the problem, and $x^*$ is a local minimum, $t = 0$ must be an unconstrained local minimum of $g(t)$. Thus

$$\ddot{g}(0) = d^\top \nabla^2 f(x^*) \, d + \ddot{x}(0)^\top \nabla f(x^*) \ge 0.$$

Finally, consider $\ell(t) = \lambda^{*\top} h(x(t)) = 0$ and differentiate it twice, to get

$$\ddot{\ell}(0) = d^\top \left( \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \right) d + \ddot{x}(0)^\top \nabla h(x^*) \, \lambda^* = 0.$$

Finally, add the last two equations, and apply the first-order condition.

o We define the Lagrangian as

$$L(x, \lambda) = f(x) + \lambda \cdot h(x).$$

The first and second order conditions then reduce to

$$\nabla_x L(x^*, \lambda^*) = 0, \qquad d^\top \nabla^2_{xx} L(x^*, \lambda^*) \, d \ge 0 \quad \forall d \in \mathcal{V}(x^*).$$

And the feasibility condition is given by

$$\nabla_\lambda L(x^*, \lambda^*) = 0.$$
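A small worked instance in the spirit of the constant-gradient example: minimise the linear objective $c \cdot x$ on the unit circle $h(x) = \|x\|^2 - 1 = 0$. The data `c` and the closed-form candidate `x_star` are assumptions chosen for illustration; the sketch only verifies the multiplier conditions at that candidate.

```python
import numpy as np

c = np.array([3.0, 4.0])                 # grad f is the constant vector c
x_star = -c / np.linalg.norm(c)          # candidate minimiser on the unit circle
grad_h = 2.0 * x_star                    # gradient of h(x) = ||x||^2 - 1 at x*

# Recover lambda from grad f + lambda * grad h = 0 (least-squares formula,
# exact here because grad f and grad h are parallel at x*).
lam = -(grad_h @ c) / (grad_h @ grad_h)
stationarity = np.linalg.norm(c + lam * grad_h)
feasibility = abs(x_star @ x_star - 1.0)

# Hessian of the Lagrangian: hess f = 0 and hess h = 2 I.
hess_L = lam * 2.0 * np.eye(2)

# d spans the directions with grad_h . d = 0 (first-order feasible variations).
d = np.array([-x_star[1], x_star[0]])
tangency = abs(grad_h @ d)
curvature = d @ hess_L @ d
```

With these numbers the multiplier works out to $\|c\| / 2 = 2.5$, and the curvature along the feasible direction is strictly positive, so the second-order necessary condition holds with room to spare.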

… multipliers under weaker assumptions called constraint qualifications. If the constraints are linear, for example, Lagrange multipliers are guaranteed to exist. The weakest form of constraint qualification is quasiregularity, which requires that $\mathcal{T}(x^*) = \mathcal{V}(x^*)$.

o Theorem – Sufficient Conditions: Assume that $f$ and $h$ are both twice continuously differentiable, and that $x^* \in \mathbb{R}^n$ and $\lambda^* \in \mathbb{R}^m$ satisfy

$$\nabla_x L(x^*, \lambda^*) = 0, \qquad \nabla_\lambda L(x^*, \lambda^*) = 0, \qquad d^\top \nabla^2_{xx} L(x^*, \lambda^*) \, d > 0 \quad \forall d \in \mathcal{V}(x^*) \setminus \{0\}.$$

Then $x^*$ is a strict local minimum.

Proof: The second condition above states that $h(x^*) = 0$, so $x^*$ is feasible. Suppose it is not a strict local minimum; then there exists a sequence $\{x_k\} \subset \mathbb{R}^n$ such that $x_k \ne x^*$ and $x_k \to x^*$ which lies entirely in the feasible region of the problem [i.e. $h(x_k) = 0$] and $f(x_k) \le f(x^*)$. We define, passing to a subsequence if necessary, for some $d$ with $\|d\| = 1$,

$$d_k = \frac{x_k - x^*}{\|x_k - x^*\|} \to d, \qquad \delta_k = \|x_k - x^*\| \to 0.$$

Now, by the mean value theorem (applied to each component of $h$), there exists $\tilde{x}_k \in [x^*, x_k]$ with

$$h(x_k) - h(x^*) = \nabla h(\tilde{x}_k)^\top (x_k - x^*) = \nabla h(\tilde{x}_k)^\top (\delta_k d_k).$$

But since $x^*$ and $x_k$ are feasible, $h(x_k) = h(x^*) = 0$, so

$$\nabla h(\tilde{x}_k)^\top d_k = 0.$$

Taking the limit as $k \to \infty$, we get $\nabla h(x^*)^\top d = 0$, and so $d \in \mathcal{V}(x^*)$.

Now, we know that

$$h(x_k) = 0, \qquad f(x_k) - f(x^*) \le 0.$$

Using a second-order Taylor expansion (with remainder) with some set of $\hat{x}_i \in [x_k, x^*]$, we can re-write these as

$$h_i(x_k) = h_i(x^*) + \delta_k \nabla h_i(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \nabla^2 h_i(\hat{x}_i) \, d_k = 0$$

$$f(x_k) - f(x^*) = \delta_k \nabla f(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \nabla^2 f(\hat{x}_0) \, d_k \le 0$$

We can modify the first set of equations slightly by remembering that $h(x^*) = 0$, and multiplying both sides of the equation by $\lambda_i^*$. This gives

$$\lambda_i^* h_i(x_k) = \delta_k \lambda_i^* \nabla h_i(x^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \lambda_i^* \nabla^2 h_i(\hat{x}_i) \, d_k = 0.$$

Adding these $m + 1$ equations, we get

$$\delta_k \left( \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) \right) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \left( \nabla^2 f(\hat{x}_0) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(\hat{x}_i) \right) d_k \le 0,$$

that is,

$$\delta_k \nabla_x L(x^*, \lambda^*) \cdot d_k + \tfrac{1}{2} \delta_k^2 \, d_k^\top \left( \nabla^2 f(\hat{x}_0) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(\hat{x}_i) \right) d_k \le 0.$$

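Noting that, by the first-order conditions, $\nabla_x L(x^*, \lambda^*) = 0$, and then dividing by $\tfrac{1}{2} \delta_k^2$ and taking the limit as $k \to \infty$, this becomes

$$d^\top \left( \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) \right) d \le 0.$$

But since $d \in \mathcal{V}(x^*) \setminus \{0\}$, this violates our assumed second-order condition.

o We now consider an application of these conditions. Consider the program

$$\min_{x \in \mathbb{R}^n} \ \sigma^2 = x^\top \Gamma x \quad \text{s.t. } \mathbf{1}^\top x = 1, \quad \boldsymbol{\mu}^\top x = \mu,$$

which might represent minimizing the variance of a portfolio while keeping the total amount invested equal to 1 unit, and keeping the expected return equal to a certain value $\mu$. The first-order conditions give

$$2 \Gamma x^* + \lambda_1^* \mathbf{1} + \lambda_2^* \boldsymbol{\mu} = 0, \qquad \mathbf{1}^\top x^* = 1, \qquad \boldsymbol{\mu}^\top x^* = \mu.$$

From the first equation, we obtain

$$\begin{aligned} x^* &= -\tfrac{1}{2} \Gamma^{-1} \mathbf{1} \, \lambda_1^* - \tfrac{1}{2} \Gamma^{-1} \boldsymbol{\mu} \, \lambda_2^* \\ \mathbf{1}^\top x^* &= -\tfrac{1}{2} \mathbf{1}^\top \Gamma^{-1} \mathbf{1} \, \lambda_1^* - \tfrac{1}{2} \mathbf{1}^\top \Gamma^{-1} \boldsymbol{\mu} \, \lambda_2^* = 1 \\ \boldsymbol{\mu}^\top x^* &= -\tfrac{1}{2} \boldsymbol{\mu}^\top \Gamma^{-1} \mathbf{1} \, \lambda_1^* - \tfrac{1}{2} \boldsymbol{\mu}^\top \Gamma^{-1} \boldsymbol{\mu} \, \lambda_2^* = \mu \end{aligned}$$

The last two equations are a system of equations for $(\lambda_1^*, \lambda_2^*)$:

$$-\tfrac{1}{2} \begin{pmatrix} \mathbf{1}^\top \Gamma^{-1} \mathbf{1} & \mathbf{1}^\top \Gamma^{-1} \boldsymbol{\mu} \\ \boldsymbol{\mu}^\top \Gamma^{-1} \mathbf{1} & \boldsymbol{\mu}^\top \Gamma^{-1} \boldsymbol{\mu} \end{pmatrix} \begin{pmatrix} \lambda_1^* \\ \lambda_2^* \end{pmatrix} = \begin{pmatrix} 1 \\ \mu \end{pmatrix}$$

This system is nonsingular provided that $\Gamma \succ 0$ and $\mathbf{1}$ and $\boldsymbol{\mu}$ are linearly independent. We then get

$$\begin{pmatrix} \lambda_1^* \\ \lambda_2^* \end{pmatrix} = \begin{pmatrix} \eta_1 + \zeta_1 \mu \\ \eta_2 + \zeta_2 \mu \end{pmatrix},$$

where the constants $\eta_i, \zeta_i$ depend on $\Gamma$ and $\boldsymbol{\mu}$. Now, using the first equation in the FOCs, we obtain, for some vectors $v$ and $w$,

$$x^* = \mu v + w.$$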
