Optimization Techniques: Cost Functions, Bracketing Methods, and Gradient Descent - Prof. , Study notes of Computer Science

University of Maryland Computer Science

Prof. Ramalingam Chellappa

Various optimization techniques, including cost functions, bracketing methods in one and multiple dimensions, and gradient descent. It discusses how to find the global minimum and local minimum of a cost function, the importance of derivative information, and different methods for bracketing a minimum in one and multiple dimensions. The document also introduces the downhill simplex method and basic calculus concepts, such as the direction of maximum increase and critical points of a function.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

koofers-user-sx5 🇺🇸

Optimization - 2

CMSC828 D


Outline

Cost functions (last class)
Given a cost function we can calculate
- The global minimum
- A local minimum
Algorithms can be classified according to
- Derivative information available/not available or expensive
  - Derivatives via finite-differences
- Linear or nonlinear
- Local minimum or global minimum
- Differential or “statistical”
- Constrained or Unconstrained
Read Chapter 10-0 of Numerical Recipes.
Focus will not be on details but on educated use of these routines as black boxes.

Bracketing a minimum in multiple dimensions

Smallest region bounded by a group of points in
- 1D is bounded by two points (a line segment)
- 2D is bounded by three points (a triangle)
- 3D by four points (a tetrahedron)
- In N D by N+1 points (a simplex)
Can find a direction of a decreasing function in
- 1D by the line from point with higher value to lower
- 2D by joining point with highest value through point with average value on the opposite side of the triangle
- And so on for N D
However, a bracket of a minimum cannot be guaranteed in N D

Downhill Simplex Method (Nelder-Mead)

Reflection: Project along the direction of decrease with size 1.
Reflection and expansion: If decrease is large, try a step of size 2.
Contraction: Result of reflection is bad, so try a simple reduction within simplex.
Multiple contraction: Used if the contraction does not give a better result than the lowest point.
Conclude: Stop when the volume of the simplex falls below tolerance.
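
The moves listed above can be sketched in a few lines of Python. This is a minimal illustration, not the Numerical Recipes routine: the function name `nelder_mead`, the contraction factor 0.5, and the use of the largest distance to the best point as a stand-in for the simplex volume are all assumptions. It also uses the common variant in which the multiple contraction is triggered when the contraction fails to improve on the worst point.

```python
def nelder_mead(f, simplex, tol=1e-10, max_iter=2000):
    """Minimize f from a starting simplex: a list of N+1 points in N dimensions."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)                      # best point first, worst point last
        best, worst = pts[0], pts[-1]
        # Size proxy (assumption): largest coordinate distance to the best point.
        size = max(max(abs(a - b) for a, b in zip(p, best)) for p in pts[1:])
        if size < tol:
            break
        # Centroid of the face opposite the worst point.
        centroid = [sum(p[i] for p in pts[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line from the worst point through the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)               # reflection, step of size 1
        if f(reflected) < f(best):
            expanded = along(2.0)            # decrease was large: try size 2
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(pts[-2]):
            pts[-1] = reflected
        else:
            contracted = along(-0.5)         # reduction within the simplex
            if f(contracted) < f(worst):
                pts[-1] = contracted
            else:
                # Multiple contraction: pull every point halfway toward the best.
                pts = [best] + [[b + 0.5 * (x - b) for x, b in zip(p, best)]
                                for p in pts[1:]]
    return min(pts, key=f)


def bowl(p):
    # Hypothetical test function with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


print(nelder_mead(bowl, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]))
```

Only function values are used, which is why the method suits problems where derivatives are unavailable or expensive.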

Newton’s Method

If f ( x ) is a scalar valued function of n variables x, the first-order expansion (summing over the repeated index i) gives

$$f(\mathbf{x} + \mathbf{h}) = f(x_i + h_i) = f(x_i) + h_i \frac{\partial f}{\partial x_i}(x_i) = 0$$

- No way to get n equations from one equation above
- Use steepest descent methods
However, in optimization problems we are usually solving for the minimum of a scalar valued function of multiple variables f ( x ), where x is an n dimensional vector
We need to solve an equation of the type g ( x ) = ∇ f = 0
Same prescription works but now ∇g is a matrix called the Jacobian matrix

$$g_j(\mathbf{x} + \mathbf{h}) = g_j(x_i + h_i) = g_j(x_i) + h_i \frac{\partial g_j}{\partial x_i}(x_i) = 0$$

Solve the equation to get corrections and iterate
However, note that we are actually computing the Hessian of f
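
As a rough illustration of the iteration above, the sketch below applies Newton's method to g = ∇f = 0 for a function of two variables, solving the 2x2 system for the correction h by Cramer's rule. The names `newton_minimize`, `grad`, and `hess`, and the test function, are hypothetical and chosen only for this example.

```python
def newton_minimize(grad, hess, x, tol=1e-10, max_iter=50):
    """Newton iteration for grad(x) = 0 in two variables."""
    for _ in range(max_iter):
        g1, g2 = grad(x)
        (a, b), (c, d) = hess(x)             # Jacobian of g, i.e. the Hessian of f
        det = a * d - b * c
        # Solve H h = -g for the correction h (Cramer's rule, 2x2 only).
        h1 = (-g1 * d + g2 * b) / det
        h2 = (-g2 * a + g1 * c) / det
        x = [x[0] + h1, x[1] + h2]
        if abs(h1) + abs(h2) < tol:
            break
    return x


# Hypothetical test function f = (x-1)^2 + (x-1)^4 + (y+2)^2, minimum at (1, -2).
def grad(p):
    u = p[0] - 1.0
    return [2.0 * u + 4.0 * u ** 3, 2.0 * (p[1] + 2.0)]


def hess(p):
    u = p[0] - 1.0
    return [[2.0 + 12.0 * u ** 2, 0.0], [0.0, 2.0]]


print(newton_minimize(grad, hess, [3.0, 0.0]))
```

Each iteration needs both the gradient and the Hessian, which is the cost the later slides try to avoid.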

Gradient Descent

We have a function f and an estimate of its gradient ∇ f
Decrease f by stepping a quantity h along the direction of −∇ f
- begin initialize x, tol, k = 0
  - do k ← k + 1
  - x ← x − h_k ∇ f(x)
  - until |h_k ∇ f(x)| < tol
  - return x
- end
Determining h is not easy
- Called “learning rate” in AI
- Hard to determine h
  - If h is too small algorithm will be too slow to converge. If it is too large the procedure will diverge
  - Can select it using a line search or using a Newton method.
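
The loop and the sensitivity to h can be tried out with a small sketch. It assumes a fixed step h rather than a per-iteration h_k, and the names `gradient_descent` and `grad_f` and the quadratic test function are illustrative only.

```python
import math


def gradient_descent(grad, x, h=0.1, tol=1e-8, max_iter=10000):
    """Repeat x <- x - h * grad(x) until the step length drops below tol."""
    k = 0
    while k < max_iter:
        k += 1
        step = [h * g for g in grad(x)]
        x = [xi - si for xi, si in zip(x, step)]
        if math.sqrt(sum(s * s for s in step)) < tol:
            break
    return x, k


def grad_f(p):
    # Gradient of the hypothetical cost f = (x-1)^2 + 10*(y+2)^2, minimum at (1, -2).
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]


x_good, iterations = gradient_descent(grad_f, [0.0, 0.0], h=0.05)
print("h = 0.05:", x_good, "after", iterations, "iterations")

# With h = 0.11 the y error is multiplied by |1 - 20 * 0.11| = 1.2 per step.
x_bad, _ = gradient_descent(grad_f, [0.0, 0.0], h=0.11, max_iter=50)
print("h = 0.11:", x_bad)
```

For this cost the largest curvature is 20, so any fixed step above 2/20 = 0.1 makes the y component grow instead of shrink.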

Function Evaluations

Often evaluating the function is hard
- Crash a car to measure a data point
Analytical expressions for the derivatives are harder, and very much prone to programming error.
Analytical derivatives should always be compared with finite difference estimates for accuracy
Often derivatives are evaluated using finite differences.
Recall $f'(x) \approx h^{-1}\,(f(x+h) - f(x))$ => 2 function evaluations
For an n dimensional function we need at least n+1 function evaluations to get the derivative
However, recall that this forward difference is the least accurate finite-difference estimate
Promising research area: Use chain rule and semantic parsing of functions to perform automatic differentiation
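
A minimal sketch of the forward-difference gradient and of the recommended check against an analytical derivative follows. The helper name `forward_difference_gradient`, the step h = 1e-6, and the test function are assumptions made for the example.

```python
import math


def forward_difference_gradient(f, x, h=1e-6):
    """Estimate the gradient of f at x with n+1 function evaluations."""
    f0 = f(x)                                # one evaluation shared by all components
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((f(shifted) - f0) / h)   # one extra evaluation per dimension
    return grad


calls = [0]


def f(p):
    # Hypothetical test function; the counter records how often it is evaluated.
    calls[0] += 1
    return p[0] ** 2 * p[1] + math.sin(p[1])


def analytical_gradient(p):
    return [2.0 * p[0] * p[1], p[0] ** 2 + math.cos(p[1])]


point = [1.5, 0.7]
estimate = forward_difference_gradient(f, point)
exact = analytical_gradient(point)
print("evaluations:", calls[0])
print("max difference:", max(abs(a - b) for a, b in zip(estimate, exact)))
```

A large difference between the two gradients usually points to a bug in the analytical expression rather than in the finite-difference estimate.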

Powell’s method

Sometimes it is not possible to estimate the derivative ∇ f to obtain the direction in a steepest descent method
First guess: minimize along one coordinate axis, then along the other, and so on. Repeat.
Can be very slow to converge
Conjugate directions: Directions which are independent of each other so that minimizing along each one does not move away from the minimum in the other directions.
Powell introduced a method to obtain conjugate directions without computing the derivative.

Conjugate gradient and quasi-Newton

Use the fact that there is a routine available to calculate f and the Jacobian ∇ f to calculate iteratively approximations to the minimum
Conjugate gradients performs minimizations in conjugate directions without constructing A
Quasi Newton methods construct approximations to $A^{-1}$ iteratively
Black boxes, as far as this course is concerned.
Generally only worth it when we are in the vicinity of a minimum.
For nonlinear problems they often converge to a local minimum away from the true one.
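
To make "without constructing A" concrete, the sketch below runs conjugate gradients on a quadratic cost, assuming (as in Numerical Recipes) that A denotes the matrix of the local quadratic form f(x) = ½ xᵀA x − bᵀx. The matrix is reached only through a hypothetical `apply_A` callback that returns the product A·d; this is the linear special case, not the general nonlinear routine.

```python
def conjugate_gradient(apply_A, b, x, tol=1e-12):
    """Minimize 0.5*x.A.x - b.x using only products A*d, never A itself."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    r = [bi - ai for bi, ai in zip(b, apply_A(x))]   # residual = minus the gradient
    d = list(r)                                      # first direction: steepest descent
    rr = dot(r, r)
    for _ in range(len(b)):                          # at most n conjugate directions
        if rr < tol:
            break
        Ad = apply_A(d)
        alpha = rr / dot(d, Ad)                      # exact line minimum along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rr_new = dot(r, r)
        # Next direction is conjugate to the previous ones.
        d = [ri + (rr_new / rr) * di for ri, di in zip(r, d)]
        rr = rr_new
    return x


def apply_A(v):
    # Hypothetical example: product with the matrix [[4, 1], [1, 3]].
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


print(conjugate_gradient(apply_A, [1.0, 2.0], [2.0, 1.0]))
```

For an n dimensional quadratic the loop needs at most n line minimizations, because minimizing along each new conjugate direction does not undo the progress made along the earlier ones.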

Levenberg Marquardt

Return to problem of model fitting by minimizing
As before set
Observation: steepest descent methods move faster (per function evaluation) far away from the minimum while Newton methods do well near it.
Idea: combine them so that the method adapts according to the location in parameter space.
Usually for model fitting it is not too difficult to calculate derivatives

LM Algorithm

When the algorithm has converged, set λ=0 and compute the final solution
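
The slide's own equations are not available in this preview, so the following is only a sketch of a Levenberg-Marquardt style loop under stated assumptions: a sum-of-squares cost, a hypothetical two-parameter model y = a·exp(b·t), damping applied by scaling the diagonal of JᵀJ by (1 + λ), and λ changed by factors of 10. It covers the damped iteration only; the final λ = 0 evaluation is omitted.

```python
import math


def levenberg_marquardt(ts, ys, a, b, lam=1e-3, iters=50):
    """Fit y = a * exp(b * t) by least squares (two parameters only)."""
    def cost(a, b):
        return sum((y - a * math.exp(b * t)) ** 2 for t, y in zip(ts, ys))

    current = cost(a, b)
    for _ in range(iters):
        # Accumulate J^T J (2x2) and J^T r, where r = y - model.
        s11 = s12 = s22 = g1 = g2 = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(b * t)
            r = y - a * e
            ja, jb = e, a * t * e            # d(model)/da and d(model)/db
            s11 += ja * ja
            s12 += ja * jb
            s22 += jb * jb
            g1 += ja * r
            g2 += jb * r
        # Large lam gives a short gradient-like step, small lam a Newton-like step.
        m11, m22 = s11 * (1.0 + lam), s22 * (1.0 + lam)
        det = m11 * m22 - s12 * s12
        da = (g1 * m22 - g2 * s12) / det
        db = (g2 * m11 - g1 * s12) / det
        trial = cost(a + da, b + db)
        if trial < current:                  # accept the step and trust Newton more
            a, b, current = a + da, b + db, trial
            lam /= 10.0
        else:                                # reject the step and damp harder
            lam *= 10.0
    return a, b


# Hypothetical noise-free data generated from a = 2, b = -1.5.
ts = [0.25 * i for i in range(9)]
ys = [2.0 * math.exp(-1.5 * t) for t in ts]
print(levenberg_marquardt(ts, ys, a=1.0, b=0.0))
```

The single parameter λ is what lets the method adapt to the location in parameter space: rejected steps push it toward steepest descent, accepted steps toward Newton.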

Constrained optimization

We have to optimize f(x) subject to g(x)=0
- Makes sense if g(x)=0 leaves a few degrees of freedom ( N-M )
Approach 1 (Eliminate constraints)
- Eliminate variables using constraint equations and solve a reduced problem $f(x^*) = 0$
- Not practical, except for simple problems
Approach 2 (Penalty function)
- Construct a new minimization function f(x)+Pg(x) where P>>1
- If constraint is violated the minimization function increases rapidly, forcing the optimization routine to solutions where it is not violated
Approach 3 (Lagrange Multipliers)
- Solution has to lie on the surface of g(x)=0
- Can’t have ∇ f =0 anymore
- However we require ∇ f parallel to ∇ g, i.e. ∇ f = λ ∇ g for some multiplier λ
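
Approach 2 can be tried on a small hypothetical problem: minimize f = x² + y² subject to g = x + y − 1 = 0, whose constrained minimum is (0.5, 0.5). The sketch below is an illustration under assumptions: it penalizes with P·g(x)² rather than P·g(x), so that the penalty is non-negative whichever way the constraint is violated, and it minimizes with plain gradient descent.

```python
def minimize_with_penalty(P, x=2.0, y=-1.0, iters=5000):
    """Gradient descent on x^2 + y^2 + P * (x + y - 1)^2."""
    h = 1.0 / (2.0 + 4.0 * P)        # safe step: inverse of the largest curvature
    for _ in range(iters):
        g = x + y - 1.0              # constraint violation
        dx = 2.0 * x + 2.0 * P * g
        dy = 2.0 * y + 2.0 * P * g
        x, y = x - h * dx, y - h * dy
    return x, y


for P in (1.0, 10.0, 100.0):
    x, y = minimize_with_penalty(P)
    print("P =", P, "x =", x, "y =", y, "violation =", x + y - 1.0)
```

The constraint is only satisfied approximately, with a violation that shrinks as P grows, while the largest curvature 2 + 4P grows with P and forces smaller steps. That trade-off is one reason to prefer Lagrange multipliers when an exact solution on g(x)=0 is needed.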

Linear programming

Black box in this course
Solve problems with systems of linear equality and inequality constraints
