











Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Notes; Subject: Mathematics; University: San Diego State University; Term: Spring 2004;
Typology: Study notes
1 / 19
This page cannot be seen from the preview
Don't miss anything!












Summary Orthogonal Distance Regression
Lecture Notes # Nonlinear Least Squares Problems — Orthogonal Distance Regression
Peter Blomgren, 〈[email protected]〉 Department of Mathematics and Statistics Dynamical Systems Group Computational Sciences Research Center San Diego State University San Diego, CA 92182- http://terminus.sdsu.edu/
Fall 2009
Summary Orthogonal Distance Regression
Outline
(^1) Summary Linear Least Squares Nonlinear Least Squares
(^2) Orthogonal Distance Regression Error Models Weigthed Least Squares / Orthogonal Distance Regression ODR = Nonlinear Least Squares, Exploiting Structure
Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares
Summary: Nonlinear Least Squares 1 of 4
Problem: Nonlinear Least Squares
¯x∗^ = arg min ¯x∈Rn^ [f (¯x)] = arg min ¯x∈Rn
∑^ m
j=
rj (¯x)^2
(^) , m ≥ n,
where the residuals rj (¯x) are of the form rj (¯x) = yj − Φ(¯x; tj ). Here, yj are the measurements taken at the locations/times tj , and Φ(¯x; tj ) is our model.
The key approximation for the Hessian
∇^2 f(¯x) = J(¯x)T^ J(¯x) +
∑^ m
j=
rj (¯x)∇^2 rj (¯x) ≈ J(¯x)TJ(¯x).
Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares
Summary: Nonlinear Least Squares 2 of 4
Line-search algorithm: Gauss-Newton, with the subproblem: [ J(¯xk )T^ J(¯xk )
¯pGN k = −∇f (¯xk ).
Guaranteed descent direction, fast convergence (as long as the Hessian approximation holds up) equivalence to a linear least squares problem (used for efficient, stable solution).
Summary Orthogonal Distance Regression Linear Least Squares Nonlinear Least Squares
Summary: Nonlinear Least Squares 4 of 4
Hybrid Algorithms:
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
Fixed Regressor Models vs. Errors-In-Variables Models
So far we have assumed that there are no errors in the variables describing where / when the measurements are made, i.e. in the data set {tj , yj } where tj denote times of measurement, and yj the measured value, we have assumed that tj are exact, and the measurement errors are in yj. Under this assumption, the discrepancies between the model and the measured data are
ǫj = yj − Φ(¯x; tj ), i = 1, 2 ,... , m
Today, we will take a look at the situation where we take errors in tj into account — these models are known as errors-in-variables models, and their solutions in the linear case are referred to as total least squares optimization, or in the non-linear case as orthogonal distance regression.
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
Orthogonal Distance Regression
For the mathematical formulation of orthogonal distance regression we introduce perturbations (errors) δj for the variables tj , in addition to the errors ǫj for the yj ’s. We relate the measurements and the model in to following way
ǫj = yj − Φ(¯x; tj + δj ),
and define the minimization problem:
(¯x∗, ¯δ∗) = arg min ¯x,¯δ
∑^ m
j= 1
w^2 j
yj − Φ(¯x; tj + δj)
where d¯ and w¯ are two vectors of weights which denote the relative significance of the error terms.
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
Orthogonal Distance Regression: The Weights
The weight-vectors ¯d and w¯ must either be supplied by the modeler, or estimated in some clever way. If all the weights are the same wj = dj = C, then each term in the sum is simply the shortest distance between the point (tj , yj ) and curve Φ(¯x; t) (as illustrated in the previous figure). In order to get the orthogonal-looking figure, I set wj = 1/ 0 .5 and dj = 1/4, thus adjusting for the different scales in the t- and y -directions. The shortest path between the point and the curve will be normal (orthogonal) to the curve at the point of intersection. We can think of the scaling (weighting) as adjusting for measuring time in fortnights, seconds, milli-seconds, micro-seconds, or nano-seconds...
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
Orthogonal Distance Regression → Least Squares
If we take a cold hard stare at the expression
(¯x∗, ¯δ∗) = arg min ¯x,¯δ
∑^2 m
i=
rj (¯x, ¯δ)^2 = arg min ¯x,¯δ
‖¯r(¯x, δ¯)‖^22
We realize that this is now a standard least squares problem with 2 m terms and (n + m) unknowns — {¯x, ¯δ}. We can use any of the techniques we have previously explored for the solution of the nonlinear least squares problem. However, a straight-forward implementation of these strategies may prove to be quite expensive, since the number of parameters have doubled to 2m and the number of independent variables have grown from n to (n + m). Recall that usually m ≫ n, so this is a drastic growth of the problem.
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
ODR → Least Squares: Exploiting Structure
Fortunately we can save a lot of work by exploiting the structure of the Jacobian of the Least Squares problem originating from the orthogonal distance regression — many entries are zero!
∂rj ∂δi = wj^ ∂[yj^ −^ Φ(¯x;^ tj^ +^ δj^ )] ∂δi = 0, ∀i, j ≤ m, i 6 = j ∂rj ∂δi = ∂[dj−mδj−m] ∂δi =
{ 0 i 6 = (j − m), j > m dj−m i = (j − m), j > m ∂rj ∂xi = ∂[dj−mδj−m] ∂xi = 0, i = 1, 2 ,... , n, j > m
Let vj = wj^ ∂[yj^ −^ Φ(¯x;^ tj^ +^ δj^ )] ∂δj , and let D = diag(¯d), and V = diag(¯v), then we can write the Jacobian of the residual function in matrix form...
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
ODR → Least Squares → Levenberg-Marquardt 1 of 2
If we partition the step vector ¯p, and the residual vector ¯r into
¯p =
¯px ¯pδ
, ¯r =
˜r 1 ˜r 2
where p¯x ∈ Rn, ¯pδ ∈ Rm, and ˜r 1 ,˜r 2 ∈ Rm, then e.g. we can write the Levenberg-Marquardt subproblem in partitioned form
[̂ JT^ ̂J + λIn ̂JT^ V V ̂J V 2 + D^2 + λIm
¯px ¯pδ
JT˜r 1 V˜r 1 + D˜r 2
Since the (2, 2)-block V 2 + D^2 + λIm is diagonal, we can eliminate the ¯pδ variables from the system...
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
ODR → Least Squares → Levenberg-Marquardt 2 of 2
[̂ JT^ ̂J + λIn ̂JT^ V V ̂J V 2 + D^2 + λIm
] [ ¯px ¯pδ
] = −
[̂ JT˜r 1 V˜r 1 + D˜r 2
]
¯pδ = −
[ V 2 + D^2 + λIm
]− 1 [ (V˜r 1 + D˜r 2 ) + V ̂J¯px
]
This leads to the n × n-system A¯px = ¯b, where
A =
[̂ JT^ ̂J + λIn − ̂JT^ V
[ V 2 + D^2 + λIm
]− 1 V ̂J
]
¯b =
[ − ̂JT˜r 1 + ̂JT^ V
[ V 2 + D^2 + λIm
]− 1 [ V˜r 1 + D˜r 2
]] .
Hence, the total cost of finding the LM-step is only marginally more expensive than for the standard least squares problem.
Summary Orthogonal Distance Regression
Error Models Weigthed Least Squares / Orthogonal Distance RegressionODR = Nonlinear Least Squares, Exploiting Structure
Software and References
MINPACK Implements the Levenberg-Marquardt algorithm. Available for free from http://www.netlib.org/minpack/. ODRPACK Implements^ the^ orthogonal^ distance^ regres- sion algorithm. Available for free from http://www.netlib.org/odrpack/. Other The NAG (Numerical Algorithms Group) library and HSL (formerly the Harwell Subroutine Library), implement several robust nonlinear least squares implementations. GvL Golub and van Loan’s^ Matrix Computations, 3rd edition (chapter 5) has a comprehensive discussion on orthogonal- ization and least squares; explaining in gory detail much of the linear algebra (e.g. the SVD and QR-factorization) we swept under the rug.