Nonlinear Optimization: Solving Equations & Estimating Logit Demand Models, Slides of Advanced Algorithms

These slides cover nonlinear unconstrained optimization, focusing on solving systems of nonlinear equations and estimating logit demand models. They explain why solving nonlinear systems of equations is hard, show how the Newton-Raphson method finds roots, and include worked examples and code.

Typology: Slides

2020/2021

Uploaded on 06/11/2021

markzck
1.204 Lecture 21

Nonlinear unconstrained optimization:
First order conditions: Newton’s method
Estimating a logit demand model

Nonlinear unconstrained optimization

  • Network equilibrium was a constrained nonlinear optimization problem
    • Nonnegativity constraints on flows
    • Equality constraints on O-D flows
    • Other variations (transit, variable demand) have inequality constraints
  • In these two lectures we examine unconstrained nonlinear optimization problems
    • No constraints of any sort on the problem; we just find the global minimum or maximum of the function
    • Lagrangians can be used to write constrained problems as unconstrained problems, but it’s usually best to handle the constraints explicitly for computation

Solving systems of nonlinear equations

  • One way to solve for max z(x), where x is a vector, is to find the first derivatives, set them equal to zero, and solve the resulting system of nonlinear equations
  • This is the simplest approach and, if the problem is convex (any line between two points on the boundary of the feasible space stays entirely in the feasible space), it is ‘good enough’
  • We will estimate a binary logit demand model with this approach in this lecture
    • We’ll use a true nonlinear unconstrained minimization algorithm in the next lecture, which is a better way

Solving nonlinear systems of equations is hard

  • Press, Numerical Recipes: “There are no good, general methods for solving systems of more than one nonlinear equation. Furthermore, it is not hard to see why (very likely) there never will be any good, general methods.”
  • “Consider the case of two dimensions, where we want to solve simultaneously”
    • f(x, y) = 0
    • g(x, y) = 0

Nonlinear minimization is easier

  • There are efficient general techniques for finding a minimum of a function of many variables
    • Minimization is not the same as finding roots of the n first order equations (∂z/∂x_i = 0 for all n variables x_i)
    • Components of the gradient vector (∂z/∂x_i) are not independent, arbitrary functions
    • They obey ‘integrability conditions’: you can always find a minimum by going downhill on a single surface
    • There is no analogous concept for finding the root of N nonlinear equations
  • We will cover unconstrained minimization methods in next lecture
    • Nonlinear root finder has easier code but is less capable
    • Nonlinear minimization has harder code but works well

From Press

Newton-Raphson for nonlinear system

  • We have n equations in n variables x_i:

$$f_i(x_0, x_1, x_2, \ldots, x_{n-1}) = 0$$

  • Near each x value, we can expand f_i using a Taylor series:

$$f_i(x + \delta x) = f_i(x) + \sum_{j=0}^{n-1} \frac{\partial f_i}{\partial x_j}\,\delta x_j + O(\delta x^2)$$

  • Matrix of partial derivatives is Jacobian J:

$$J_{ij} = \frac{\partial f_i}{\partial x_j}$$

  • In matrix notation our expansion is:

$$f(x + \delta x) = f(x) + J \cdot \delta x + O(\delta x^2)$$
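A hand-coded Jacobian is easy to get wrong, so it can help to compare it against a finite-difference approximation of the same partial derivatives. The sketch below is only an illustration: the class `JacobianCheck` and its methods are hypothetical names, not part of the lecture's code, and the test system is the two-equation example x^2 + xy = 10, y + 3xy^2 = 57 used in the lecture.

```java
// Hypothetical helper (not from the lecture): compares an analytical
// Jacobian with a central finite-difference approximation.
class JacobianCheck {
    // f0 = x^2 + x*y - 10, f1 = y + 3*x*y^2 - 57
    static double[] func(double[] x) {
        return new double[] {
            x[0] * x[0] + x[0] * x[1] - 10,
            x[1] + 3 * x[0] * x[1] * x[1] - 57
        };
    }

    // Analytical Jacobian: entry [i][j] is the derivative of f_i w.r.t. x_j
    static double[][] jacobian(double[] x) {
        return new double[][] {
            {2 * x[0] + x[1], x[0]},
            {3 * x[1] * x[1], 1 + 6 * x[0] * x[1]}
        };
    }

    // Central differences: J[i][j] ~ (f_i(x + h e_j) - f_i(x - h e_j)) / (2h)
    static double[][] numericalJacobian(double[] x, double h) {
        int n = x.length;
        double[][] j = new double[n][n];
        for (int col = 0; col < n; col++) {
            double[] xPlus = x.clone();
            double[] xMinus = x.clone();
            xPlus[col] += h;
            xMinus[col] -= h;
            double[] fPlus = func(xPlus);
            double[] fMinus = func(xMinus);
            for (int row = 0; row < n; row++)
                j[row][col] = (fPlus[row] - fMinus[row]) / (2 * h);
        }
        return j;
    }

    // Largest absolute entry-by-entry difference between two matrices
    static double maxAbsDifference(double[][] a, double[][] b) {
        double max = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                max = Math.max(max, Math.abs(a[i][j] - b[i][j]));
        return max;
    }

    public static void main(String[] args) {
        double[] x = {1.5, 3.5};
        double diff = maxAbsDifference(jacobian(x), numericalJacobian(x, 1e-5));
        System.out.println("Max abs difference: " + diff);
    }
}
```

A small difference suggests the analytical derivatives are coded consistently with the functions; a large one usually points to a sign or index error in one entry.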

Newton-Raphson

[Figure: Newton-Raphson on a curve f(x), showing the initial guess of the root and the slope ∂f/∂x. In multiple dimensions, simultaneously.]

Newton-Raphson, p. 2

  • Ignore O(δx^2) terms and set f(x+δx)= 0 to find a set of linear equations for the corrections δx to move each function in f closer to zero simultaneously:

$$J \cdot \delta x = -f$$

  • We solve this system using Gaussian elimination or LU decomposition
  • We add the corrections to the previous solution and iterate until we converge:

$$x' = x + \delta x$$

  • If high order derivatives are large or first derivative is small, Newton can fail miserably
    • Converges quickly if assumptions met
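The iteration above can be sketched in a few lines. The lecture's own `Newton.mnewt` is not shown in these slides, so the following is a minimal sketch under stated assumptions rather than the course implementation: the class `NewtonSketch`, the interface `SystemOfEquations`, the step-size tolerance, and the `exampleSystem` helper are all hypothetical names chosen for illustration. It solves J·δx = −f by Gaussian elimination with partial pivoting and stops when the largest correction falls below the tolerance.

```java
// Hypothetical sketch of Newton-Raphson for n equations in n unknowns.
class NewtonSketch {
    interface SystemOfEquations {
        double[] func(double[] x);        // residuals f(x)
        double[][] jacobian(double[] x);  // J[i][j] = df_i/dx_j
    }

    // Solves a*d = b by Gaussian elimination with partial pivoting.
    // Overwrites a and b.
    static double[] solveLinear(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int row = col + 1; row < n; row++)
                if (Math.abs(a[row][col]) > Math.abs(a[pivot][col])) pivot = row;
            if (Math.abs(a[pivot][col]) < 1e-14)
                throw new ArithmeticException("Jacobian is (nearly) singular");
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int row = col + 1; row < n; row++) {
                double factor = a[row][col] / a[col][col];
                b[row] -= factor * b[col];
                for (int k = col; k < n; k++)
                    a[row][k] -= factor * a[col][k];
            }
        }
        double[] d = new double[n];
        for (int row = n - 1; row >= 0; row--) {  // back substitution
            double sum = b[row];
            for (int k = row + 1; k < n; k++) sum -= a[row][k] * d[k];
            d[row] = sum / a[row][row];
        }
        return d;
    }

    // Repeats x' = x + dx, where J*dx = -f, until the step is below tol
    // or maxIter iterations have been used.
    static double[] solve(int maxIter, double[] x0, SystemOfEquations s, double tol) {
        double[] x = x0.clone();
        for (int iter = 0; iter < maxIter; iter++) {
            double[] f = s.func(x);
            double[] rhs = new double[f.length];
            for (int i = 0; i < f.length; i++) rhs[i] = -f[i];
            double[] dx = solveLinear(s.jacobian(x), rhs);
            double maxStep = 0;
            for (int i = 0; i < x.length; i++) {
                x[i] += dx[i];
                maxStep = Math.max(maxStep, Math.abs(dx[i]));
            }
            if (maxStep < tol) break;
        }
        return x;
    }

    // The lecture's example system: x^2 + xy = 10, y + 3xy^2 = 57
    static SystemOfEquations exampleSystem() {
        return new SystemOfEquations() {
            public double[] func(double[] x) {
                return new double[] {
                    x[0] * x[0] + x[0] * x[1] - 10,
                    x[1] + 3 * x[0] * x[1] * x[1] - 57
                };
            }
            public double[][] jacobian(double[] x) {
                return new double[][] {
                    {2 * x[0] + x[1], x[0]},
                    {3 * x[1] * x[1], 1 + 6 * x[0] * x[1]}
                };
            }
        };
    }

    public static void main(String[] args) {
        double[] root = solve(20, new double[] {1.5, 3.5}, exampleSystem(), 1e-10);
        for (double d : root) System.out.println(d);
    }
}
```

The sketch takes full Newton steps with no damping or line search, so it inherits the failure modes listed above when the starting point is poor or the Jacobian is nearly singular.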


SimpleModel

// Solve x^2 + xy = 10 and y + 3xy^2 = 57
public class SimpleModel implements MathFunction2 {
    // Residuals f(x); both entries are zero at a root
    public double[] func(double[] x) {
        double[] f= new double[x.length];
        f[0]= x[0]*x[0] + x[0]*x[1] - 10;
        f[1]= x[1] + 3*x[0]*x[1]*x[1] - 57;
        return f;
    }

    // Jacobian: j[i][k] is the partial derivative of f[i] with respect to x[k]
    public double[][] jacobian(double[] x) {
        int n= x.length;
        double[][] j= new double[n][n];
        j[0][0]= 2*x[0] + x[1];
        j[0][1]= x[0];
        j[1][0]= 3*x[1]*x[1];
        j[1][1]= 1 + 6*x[0]*x[1];
        return j;
    }
}

SimpleModelTest

public class SimpleModelTest {
    public static void main(String[] args) {
        SimpleModel s= new SimpleModel();
        int nTrial= 20;             // Max Newton iterations
        double[] x= {1.5, 3.5};     // Initial guess
        x= Newton.mnewt(nTrial, x, s);
        for (double d : x)
            System.out.println(d);
    }
}
// Finds solution {2, 3}
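As a check on the mechanics, the first Newton step for this system can be worked by hand from the initial guess (1.5, 3.5). This is an illustrative calculation, not output from the lecture's code, and the rounded values are approximate.

$$f(1.5, 3.5) = \begin{pmatrix} 1.5^2 + 1.5 \cdot 3.5 - 10 \\ 3.5 + 3 \cdot 1.5 \cdot 3.5^2 - 57 \end{pmatrix} = \begin{pmatrix} -2.5 \\ 1.625 \end{pmatrix}, \qquad J = \begin{pmatrix} 2x + y & x \\ 3y^2 & 1 + 6xy \end{pmatrix} = \begin{pmatrix} 6.5 & 1.5 \\ 36.75 & 32.5 \end{pmatrix}$$

Solving $J \cdot \delta x = -f$ with $\det J = 6.5 \cdot 32.5 - 1.5 \cdot 36.75 = 156.125$:

$$\delta x_0 = \frac{2.5 \cdot 32.5 - 1.5 \cdot (-1.625)}{156.125} \approx 0.53603, \qquad \delta x_1 = \frac{6.5 \cdot (-1.625) - 36.75 \cdot 2.5}{156.125} \approx -0.65612$$

$$x' = x + \delta x \approx (2.03603,\ 2.84388)$$

One step already moves the guess most of the way toward the root (2, 3).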

Logit demand models

  • Mode choice example for work trip
    • Individual has choice between transit and auto
    • We assume the utility of each choice is a linear function
      • U = β0 + β1*IVTT + β2*Walk + β3*Wait + β4*Cost
    • The probability p(i) that a traveler chooses mode i is

$$p(i) = \frac{e^{U_i}}{e^{U_i} + e^{U_j}} = \frac{1}{1 + e^{U_j - U_i}}$$

             In vehicle time   Walk time   Wait time   Cost
  Auto             20              3           0       1000
  Transit          15             17           4        150
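The utility and probability formulas can be sketched as follows, using the attribute values from the auto and transit rows of the table. The coefficient values in `main` are hypothetical placeholders chosen only so the code runs; the slides do not give coefficients for this example. The class and method names are also assumptions, and the `constant` argument (1 for auto, 0 for transit) is one possible way to make β0 apply to a single mode.

```java
// Hypothetical sketch of binary logit choice probabilities.
class BinaryLogitSketch {
    // U = b0*constant + b1*IVTT + b2*Walk + b3*Wait + b4*Cost
    static double utility(double[] beta, double constant, double ivtt,
                          double walk, double wait, double cost) {
        return beta[0] * constant + beta[1] * ivtt + beta[2] * walk
             + beta[3] * wait + beta[4] * cost;
    }

    // p(i) = 1 / (1 + exp(U_j - U_i)), equivalent to exp(U_i) / (exp(U_i) + exp(U_j))
    static double probability(double uI, double uJ) {
        return 1.0 / (1.0 + Math.exp(uJ - uI));
    }

    public static void main(String[] args) {
        // Hypothetical coefficients, for illustration only
        double[] beta = {0.5, -0.05, -0.1, -0.1, -0.001};
        double uAuto = utility(beta, 1, 20, 3, 0, 1000);
        double uTransit = utility(beta, 0, 15, 17, 4, 150);
        System.out.println("P(auto)    = " + probability(uAuto, uTransit));
        System.out.println("P(transit) = " + probability(uTransit, uAuto));
    }
}
```

Only the difference in utilities matters, which is why the binary example later in the lecture works with the difference in travel times.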

Estimating a logit model

[Figure: choice probability P_n(i) plotted against the utility difference V_in − V_jn. From Ben-Akiva, Lerman. Figure by MIT OpenCourseWare.]

Data for Simple Binary Example

[Table: 21 observations with columns Observation number, Auto time, Transit time, and Chosen alternative.]

Chosen alternative, observations 1 to 21: Transit, Transit, Auto, Transit, Transit, Auto, Auto, Transit, Transit, Transit, Transit, Auto, Auto, Transit, Auto, Auto, Transit, Auto, Auto, Transit, Auto

Simple Binary Example

                           β1    β2
  Auto utility, V_An        1    Auto travel time [min]
  Transit utility, V_Tn     0    Transit travel time [min]

Figure by MIT OpenCourseWare.

DemandModel: jacobian, logLikelihood

// Second derivatives of the log likelihood with respect to beta.
// Uses the probabilities p[i] already stored by an earlier evaluation
// at this beta (logLikelihood below fills p[]).
public double[][] jacobian(double[] beta) {
    int n= y.length;
    int k= beta.length;
    double[][] jac= new double[k][k];
    for (int j= 0; j < k; j++)
        for (int jj= 0; jj < k; jj++)
            for (int i= 0; i < n; i++)
                jac[j][jj] -= p[i]*(1-p[i])*x[i][j]*x[i][jj];
    return jac;
}

public double logLikelihood(double[] beta) {
    int n= y.length;
    int k= beta.length;
    double result= 0.0;
    for (int i= 0; i < n; i++) {
        // Compute utility
        double util= 0;
        for (int j= 0; j < k; j++)
            util += beta[j]*x[i][j];
        p[i]= 1/(1 + Math.exp(-util));    // Compute estimated prob
        result += y[i]*Math.log(p[i]) + (1-y[i])*Math.log(1-p[i]);
    }
    return result;
}
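The slides show the log likelihood and its second derivatives but not the first-order conditions that Newton's method drives to zero. Assuming the root finder is applied to the gradient of the log likelihood, the quantities fit together as follows, with $x_{ij}$ the j-th explanatory variable for observation i and $y_i$ equal to 1 when auto is chosen:

$$p_i = \frac{1}{1 + e^{-\sum_j \beta_j x_{ij}}}, \qquad L(\beta) = \sum_{i} \left[ y_i \ln p_i + (1 - y_i) \ln(1 - p_i) \right]$$

$$\frac{\partial L}{\partial \beta_j} = \sum_{i} (y_i - p_i)\, x_{ij} = 0 \quad \text{(first order conditions)}$$

$$\frac{\partial^2 L}{\partial \beta_j\, \partial \beta_k} = -\sum_{i} p_i (1 - p_i)\, x_{ij}\, x_{ik}$$

The last expression is what the `jacobian` method accumulates. Every term $p_i(1 - p_i)$ is positive, so the matrix of second derivatives is negative semidefinite and the log likelihood is concave in β.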

DemandModel: print

public void print(double log0, double logB, double[] beta, double[][] fjac) {
    int n= fjac.length;
    // 2nd derivatives give var-covar matrix:
    // variance = -(inverse of the matrix of second derivatives)
    double[][] variance= Gauss.invert(fjac);
    for (int i= 0; i < n; i++)
        for (int j= 0; j < n; j++)
            variance[i][j]= -variance[i][j];
    for (int i= 0; i < beta.length; i++)
        System.out.println("Coefficient "+ i + " : "+ beta[i]+
            " Std. dev. "+ Math.sqrt(variance[i][i]));
    System.out.println("\nLog likelihood(0) "+ log0);
    System.out.println("Log likelihood(B) " + logB);
    System.out.println("-2[L(0)-L(B)] " + -2.0*(log0-logB));
    System.out.println("Rho^2 " + (1.0 - logB/log0));
    System.out.println("Rho-bar^2 " + (1.0 - (logB-beta.length)/log0));
    System.out.println("\nVariance-covariance matrix");
    for (int i= 0; i < n; i++) {
        for (int j= 0; j < n; j++)
            System.out.print(variance[i][j]+" ");
        System.out.println();
    }
}
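The goodness-of-fit statistics printed by this method depend only on the two log likelihood values and the number of coefficients. The sketch below recomputes them in isolation; the class `FitStatisticsSketch` and its method names are hypothetical. It relies on one fact that follows from the `logLikelihood` code: with all coefficients zero every p[i] is 0.5, so L(0) = n·ln(0.5). The values n = 21, k = 2 and L(B) = -6.16604221246 are taken from the lecture's example.

```java
// Hypothetical sketch of the fit statistics reported for a binary logit model.
class FitStatisticsSketch {
    // With beta = 0 every probability is 0.5, so L(0) = n * ln(0.5)
    static double logLikelihoodAtZero(int nObs) {
        return nObs * Math.log(0.5);
    }

    // Likelihood ratio statistic: -2[L(0) - L(B)]
    static double likelihoodRatio(double log0, double logB) {
        return -2.0 * (log0 - logB);
    }

    // Rho^2 = 1 - L(B)/L(0)
    static double rhoSquared(double log0, double logB) {
        return 1.0 - logB / log0;
    }

    // Rho-bar^2 = 1 - (L(B) - k)/L(0), penalizing the k estimated coefficients
    static double rhoBarSquared(double log0, double logB, int k) {
        return 1.0 - (logB - k) / log0;
    }

    public static void main(String[] args) {
        double log0 = logLikelihoodAtZero(21);   // 21 observations
        double logB = -6.16604221246;            // value reported in the lecture's output
        System.out.println("Log likelihood(0) " + log0);
        System.out.println("-2[L(0)-L(B)] " + likelihoodRatio(log0, logB));
        System.out.println("Rho^2 " + rhoSquared(log0, logB));
        System.out.println("Rho-bar^2 " + rhoBarSquared(log0, logB, 2));
    }
}
```

Rho-bar^2 is always smaller than Rho^2 for the same model because it subtracts the number of coefficients from L(B) before taking the ratio.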


DemandModelTest

public class DemandModelTest {
    public static void main(String[] args) {
        double[][] x= { {1, 52.9 - 4.4},
                        {1, 4.1 - 28.5},
                        // And all other obs
                      };    // Note we use the difference in times
        // 0: transit chosen, 1: auto chosen
        double[] y= {0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1};

        DemandModel d= new DemandModel(x, y);
        int nTrial= 20;               // Max Newton iterations
        double[] beta= {0, 0};        // Initial guess
        double log0= d.logLikelihood(beta);
        // Minor tweak to Newton: add getFjac() method
        beta= NewtonForDemand.mnewt(nTrial, beta, d);
        double logB= d.logLikelihood(beta);
        d.print(log0, logB, beta, NewtonForDemand.getFjac());
    }
}

DemandModelTest output

Iteration 0 coeff 0 : -0.06081971708 coeff 1 : -0.
Iteration 1 coeff 0 : -0.14520466978 coeff 1 : -0.
Iteration 2 coeff 0 : -0.21506935954 coeff 1 : -0.
Iteration 3 coeff 0 : -0.23641429578 coeff 1 : -0.
Iteration 4 coeff 0 : -0.23757284839 coeff 1 : -0.
Iteration 5 coeff 0 : -0.23757544483 coeff 1 : -0.
Iteration 6 coeff 0 : -0.23757544484 coeff 1 : -0.

Coefficient 0 : -0.237575444848 Std. dev. 0.
Coefficient 1 : -0.053109827465 Std. dev. 0.

Log likelihood(0) -14.
Log likelihood(B) -6.16604221246
-2[L(0)-L(B)] 16.
Rho^2 0.
Rho-bar^2 0.

Variance-covariance matrix
0.56321517575 0.
0.00254981359 4.2610367391E-

Summary

  • This model is convex, so convergence is easier than for many nonlinear models
    • Demand model variations can be more difficult to solve
    • We cover direct minimization methods next lecture, some of which give more control in solving harder problems
  • We didn’t need a good first guess here, but we almost always do
    • Generate good first guesses through analytical approximations (as in lecture 23 and homework 8)