Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Notes on Numerical Optimization | Lecture Notes 24 | MATH 693A, Study notes of Mathematics

San Diego State University (SDSU)Mathematics

Material Type: Notes; Subject: Mathematics; University: San Diego State University; Term: Spring 2004;

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-xbe 🇺🇸

10 documents

1 / 19

This page cannot be seen from the preview

Don't miss anything!

Summary

Orthogonal Distance Regression

Numerical Optimization

Lecture Notes #24

Nonlinear Least Squares Problems — Orthogonal Distance

Regression

Peter Blomgren,

h[email protected]i

Department of Mathematics and Statistics

Dynamical Systems Group

Computational Sciences Research Center

San Diego State University

San Diego, CA 92182-7720

http://terminus.sdsu.edu/

Fall 2009

Peter Blomgren, h[email protected]iOrthogonal Distance Regression — (1/19)

Discover Study notes of Mathematics San Diego State University (SDSU)

Partial preview of the text

Download Notes on Numerical Optimization | Lecture Notes 24 | MATH 693A and more Study notes Mathematics in PDF only on Docsity!

Summary Orthogonal Distance Regression

Numerical Optimization

Lecture Notes # Nonlinear Least Squares Problems — Orthogonal Distance Regression

Peter Blomgren, 〈[email protected]〉 Department of Mathematics and Statistics Dynamical Systems Group Computational Sciences Research Center San Diego State University San Diego, CA 92182- http://terminus.sdsu.edu/

Fall 2009

Summary Orthogonal Distance Regression

Outline

(^1) Summary Linear Least Squares Nonlinear Least Squares

(^2) Orthogonal Distance Regression Error Models Weigthed Least Squares / Orthogonal Distance Regression ODR = Nonlinear Least Squares, Exploiting Structure

Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares

Summary: Nonlinear Least Squares 1 of 4

Problem: Nonlinear Least Squares

¯x∗^ = arg min ¯x∈Rn^ [f (¯x)] = arg min ¯x∈Rn

∑^ m

rj (¯x)^2

 (^) , m ≥ n,

where the residuals rj (¯x) are of the form rj (¯x) = yj − Φ(¯x; tj ). Here, yj are the measurements taken at the locations/times tj , and Φ(¯x; tj ) is our model.

The key approximation for the Hessian

∇^2 f(¯x) = J(¯x)T^ J(¯x) +

∑^ m

rj (¯x)∇^2 rj (¯x) ≈ J(¯x)TJ(¯x).

Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares

Summary: Nonlinear Least Squares 2 of 4

Line-search algorithm: Gauss-Newton, with the subproblem: [ J(¯xk )T^ J(¯xk )

]

¯pGN k = −∇f (¯xk ).

Guaranteed descent direction, fast convergence (as long as the Hessian approximation holds up) equivalence to a linear least squares problem (used for efficient, stable solution).

Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares

Summary: Nonlinear Least Squares 4 of 4

Hybrid Algorithms:

When implementing Gauss-Newton or Levenberg-Marquardt, we should implement a safe-guard for the large residual case, where the Hessian approximation fails.
If, after some reasonable number of iterations, we realize that the residuals are not going to zero, then we are better off switching to a general-purpose algorithm for non-linear opti- mization, such as a quasi-Newton (BFGS), or Newton method.

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

Fixed Regressor Models vs. Errors-In-Variables Models

So far we have assumed that there are no errors in the variables describing where / when the measurements are made, i.e. in the data set {tj , yj } where tj denote times of measurement, and yj the measured value, we have assumed that tj are exact, and the measurement errors are in yj. Under this assumption, the discrepancies between the model and the measured data are

ǫj = yj − Φ(¯x; tj ), i = 1, 2 ,... , m

Today, we will take a look at the situation where we take errors in tj into account — these models are known as errors-in-variables models, and their solutions in the linear case are referred to as total least squares optimization, or in the non-linear case as orthogonal distance regression.

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

Orthogonal Distance Regression

For the mathematical formulation of orthogonal distance regression we introduce perturbations (errors) δj for the variables tj , in addition to the errors ǫj for the yj ’s. We relate the measurements and the model in to following way

ǫj = yj − Φ(¯x; tj + δj ),

and define the minimization problem:

(¯x∗, ¯δ∗) = arg min ¯x,¯δ

∑^ m

j= 1

[

w^2 j

[

yj − Φ(¯x; tj + δj)

] 2

d^2 j δj^2

]

where d¯ and w¯ are two vectors of weights which denote the relative significance of the error terms.

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

Orthogonal Distance Regression: The Weights

The weight-vectors ¯d and w¯ must either be supplied by the modeler, or estimated in some clever way. If all the weights are the same wj = dj = C, then each term in the sum is simply the shortest distance between the point (tj , yj ) and curve Φ(¯x; t) (as illustrated in the previous figure). In order to get the orthogonal-looking figure, I set wj = 1/ 0 .5 and dj = 1/4, thus adjusting for the different scales in the t- and y -directions. The shortest path between the point and the curve will be normal (orthogonal) to the curve at the point of intersection. We can think of the scaling (weighting) as adjusting for measuring time in fortnights, seconds, milli-seconds, micro-seconds, or nano-seconds...

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

Orthogonal Distance Regression → Least Squares

If we take a cold hard stare at the expression

(¯x∗, ¯δ∗) = arg min ¯x,¯δ

∑^2 m

rj (¯x, ¯δ)^2 = arg min ¯x,¯δ

‖¯r(¯x, δ¯)‖^22

We realize that this is now a standard least squares problem with 2 m terms and (n + m) unknowns — {¯x, ¯δ}. We can use any of the techniques we have previously explored for the solution of the nonlinear least squares problem. However, a straight-forward implementation of these strategies may prove to be quite expensive, since the number of parameters have doubled to 2m and the number of independent variables have grown from n to (n + m). Recall that usually m ≫ n, so this is a drastic growth of the problem.

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

ODR → Least Squares: Exploiting Structure

Fortunately we can save a lot of work by exploiting the structure of the Jacobian of the Least Squares problem originating from the orthogonal distance regression — many entries are zero!

∂rj ∂δi = wj^ ∂[yj^ −^ Φ(¯x;^ tj^ +^ δj^ )] ∂δi = 0, ∀i, j ≤ m, i 6 = j ∂rj ∂δi = ∂[dj−mδj−m] ∂δi =

{ 0 i 6 = (j − m), j > m dj−m i = (j − m), j > m ∂rj ∂xi = ∂[dj−mδj−m] ∂xi = 0, i = 1, 2 ,... , n, j > m

Let vj = wj^ ∂[yj^ −^ Φ(¯x;^ tj^ +^ δj^ )] ∂δj , and let D = diag(¯d), and V = diag(¯v), then we can write the Jacobian of the residual function in matrix form...

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

ODR → Least Squares → Levenberg-Marquardt 1 of 2

If we partition the step vector ¯p, and the residual vector ¯r into

¯p =

[

¯px ¯pδ

]

, ¯r =

[

˜r 1 ˜r 2

]

where p¯x ∈ Rn, ¯pδ ∈ Rm, and ˜r 1 ,˜r 2 ∈ Rm, then e.g. we can write the Levenberg-Marquardt subproblem in partitioned form

[̂ JT^ ̂J + λIn ̂JT^ V V ̂J V 2 + D^2 + λIm

] [

¯px ¯pδ

]

[̂

JT˜r 1 V˜r 1 + D˜r 2

]

Since the (2, 2)-block V 2 + D^2 + λIm is diagonal, we can eliminate the ¯pδ variables from the system...

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

ODR → Least Squares → Levenberg-Marquardt 2 of 2

[̂ JT^ ̂J + λIn ̂JT^ V V ̂J V 2 + D^2 + λIm

] [ ¯px ¯pδ

] = −

[̂ JT˜r 1 V˜r 1 + D˜r 2

]

¯pδ = −

[ V 2 + D^2 + λIm

]− 1 [ (V˜r 1 + D˜r 2 ) + V ̂J¯px

]

This leads to the n × n-system A¯px = ¯b, where

A =

[̂ JT^ ̂J + λIn − ̂JT^ V

[ V 2 + D^2 + λIm

]− 1 V ̂J

]

¯b =

[ − ̂JT˜r 1 + ̂JT^ V

[ V 2 + D^2 + λIm

]− 1 [ V˜r 1 + D˜r 2

]] .

Hence, the total cost of finding the LM-step is only marginally more expensive than for the standard least squares problem.

Summary Orthogonal Distance Regression

Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure

Software and References

MINPACK Implements the Levenberg-Marquardt algorithm. Available for free from http://www.netlib.org/minpack/. ODRPACK Implements^ the^ orthogonal^ distance^ regres- sion algorithm. Available for free from http://www.netlib.org/odrpack/. Other The NAG (Numerical Algorithms Group) library and HSL (formerly the Harwell Subroutine Library), implement several robust nonlinear least squares implementations. GvL Golub and van Loan’s^ Matrix Computations, 3rd edition (chapter 5) has a comprehensive discussion on orthogonal- ization and least squares; explaining in gory detail much of the linear algebra (e.g. the SVD and QR-factorization) we swept under the rug.

Notes on Numerical Optimization | Lecture Notes 24 | MATH 693A, Study notes of Mathematics

Related documents

Partial preview of the text

Download Notes on Numerical Optimization | Lecture Notes 24 | MATH 693A and more Study notes Mathematics in PDF only on Docsity!

Numerical Optimization

]

[

[

] 2

]

[

]

[

]

] [

]

[̂

]