












An in-depth explanation of the least-squares method for estimating the coefficient vector β in a linear regression model. It covers the derivation of the least-squares coefficient, the statistical accuracy of the estimate, and the multivariate case. The document also discusses the positive definiteness of the Hessian matrix and the level sets of the criterion function.













Dr. Scott, Stat 410
October 13, 2005
For our model $Y = X\beta + \epsilon$,
$$\min_{\beta}\; SS(\beta) = \epsilon^t \epsilon = (Y - X\beta)^t (Y - X\beta) = Y^t Y - 2\beta^t X^t Y + \beta^t X^t X \beta$$
The least-squares coefficient solves
$$\nabla_{\beta}\, SS(\beta) = -2 X^t Y + 2 X^t X \beta = 0 \quad\Rightarrow\quad \hat\beta = (X^t X)^{-1} X^t Y$$
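As a small numerical illustration of the normal equations, the sketch below solves $(X^t X)\beta = X^t Y$ for a made-up data set and cross-checks the answer against NumPy's least-squares routine. The data values and variable names are assumptions chosen for illustration; they are not from the lecture.

```python
import numpy as np

# Hypothetical toy data (not from the notes): intercept plus one predictor.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])
X = np.column_stack([np.ones_like(x), x])  # design matrix, shape (6, 2)

# Solve the normal equations (X^t X) beta = X^t Y without forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's built-in least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The gradient -2 X^t Y + 2 X^t X beta should vanish at beta_hat.
gradient = -2 * X.T @ Y + 2 * X.T @ X @ beta_hat

print("beta_hat:", beta_hat)
print("gradient at beta_hat:", gradient)
```

Using `np.linalg.solve` rather than an explicit inverse is a common numerical choice; the formula in the notes is the same solution written in closed form.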
$$\text{Hessian} = \nabla\nabla^t\, SS(\beta) = 2 X^t X$$
which is pos. def. ⇒ minimizer!!
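One way to check positive definiteness numerically is to look at the eigenvalues of $2X^tX$, or to attempt a Cholesky factorization, which only succeeds for positive definite matrices. The sketch below assumes a hypothetical design matrix with full column rank.

```python
import numpy as np

# Hypothetical design matrix (intercept plus one predictor), full column rank.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])

hessian = 2 * X.T @ X  # Hessian of SS(beta)

# All eigenvalues of a symmetric positive definite matrix are > 0.
eigenvalues = np.linalg.eigvalsh(hessian)
is_positive_definite = bool(np.all(eigenvalues > 0))

# Cholesky raises numpy.linalg.LinAlgError if the matrix is not pos. def.
chol = np.linalg.cholesky(hessian)

print("eigenvalues:", eigenvalues)
print("positive definite:", is_positive_definite)
```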
Cute Example
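With $\hat\beta = (\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k, \ldots, \hat\beta_{p-1})^t$ and $e_k$ the unit vector that picks out component $k$,
$$\mathrm{Var}(\hat\beta_k) = \mathrm{Var}(e_k^t \hat\beta) = e_k^t\, \mathrm{Var}(\hat\beta)\, e_k = \sigma^2\, e_k^t (X^t X)^{-1} e_k = \sigma^2 \left[(X^t X)^{-1}\right]_{kk}$$
as we have seen before.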
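The selection trick with $e_k$ can be checked directly: the quadratic form $e_k^t M e_k$ simply reads off the $k$-th diagonal entry of $M$. The sketch below assumes a hypothetical design matrix and an assumed error variance `sigma2`; neither comes from the notes.

```python
import numpy as np

# Hypothetical design matrix and assumed error variance (illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
sigma2 = 4.0

# Var(beta_hat) = sigma^2 (X^t X)^{-1}
cov_beta = sigma2 * np.linalg.inv(X.T @ X)

# Pick out the slope (k = 1) with the unit vector e_k.
k = 1
e_k = np.zeros(X.shape[1])
e_k[k] = 1.0
var_k = e_k @ cov_beta @ e_k

print("Var(beta_hat_k) via e_k:", var_k)
print("k-th diagonal entry:   ", cov_beta[k, k])
```

For this design the slope variance also equals $\sigma^2 / \sum (x_i - \bar x)^2 = 4 / 17.5$, the familiar simple-regression formula.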
Familiar Example p = 1
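$Y = X\beta + \epsilon$ with $X = (1, 1, \ldots, 1)^t$, a column of $n$ ones.
Thus
$$\hat\beta = \underbrace{(X^t X)^{-1}}_{n^{-1}}\; \underbrace{X^t Y}_{n\bar y} = \bar y$$
$$\sigma^2(\hat\beta) = \sigma^2(\bar y) = \sigma^2 (X^t X)^{-1} = \frac{\sigma^2}{n}$$
as usual.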
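The $p = 1$ case is small enough to work through by hand. The sketch below does so in plain Python with three made-up observations and an assumed error variance, so that $X^tX = n$ and $X^tY = n\bar y$ can be seen directly.

```python
# Hypothetical observations and error variance (illustration only).
y = [2.0, 4.0, 9.0]
sigma2 = 3.0

n = len(y)
ones = [1.0] * n  # the single column of X

xtx = sum(o * o for o in ones)               # X^t X = n
xty = sum(o * yi for o, yi in zip(ones, y))  # X^t Y = n * ybar

beta_hat = xty / xtx         # (X^t X)^{-1} X^t Y = ybar
var_beta_hat = sigma2 / xtx  # sigma^2 (X^t X)^{-1} = sigma^2 / n

print("beta_hat:", beta_hat)
print("Var(beta_hat):", var_beta_hat)
```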
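Note:
$$\begin{aligned}
\sum (y_i - \beta)^2 &= \sum (y_i - \hat\beta + \hat\beta - \beta)^2 \\
&= \sum \left[ (y_i - \hat\beta)^2 + 2 (y_i - \hat\beta)(\hat\beta - \beta) + (\hat\beta - \beta)^2 \right] \\
&= \sum (y_i - \bar y)^2 + n (\bar y - \beta)^2
\end{aligned}$$
exactly!! The cross term drops out because $\hat\beta = \bar y$ and $\sum (y_i - \bar y) = 0$.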
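The identity can be confirmed numerically for any trial value of $\beta$. The sketch below uses three made-up observations; the function names `ss` and `ss_decomposed` are hypothetical helpers introduced only for this check.

```python
# Hypothetical observations (illustration only).
y = [2.0, 4.0, 9.0]
n = len(y)
ybar = sum(y) / n

def ss(beta):
    """Criterion: sum of squared deviations of the y_i from beta."""
    return sum((yi - beta) ** 2 for yi in y)

def ss_decomposed(beta):
    """Same criterion as residual sum of squares plus n (ybar - beta)^2."""
    return sum((yi - ybar) ** 2 for yi in y) + n * (ybar - beta) ** 2

for beta in (-1.0, 3.0, 5.0, 7.5):
    print(beta, ss(beta), ss_decomposed(beta))
```

The second form makes it plain that the criterion is a parabola in $\beta$ with minimum at $\bar y$ and curvature governed by $n$.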
Dual: n = 100 and n = 400 (see sketches)
Tentative conclusion:
steeper criterion ⇒ more accurate parameters
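For the $p = 1$ case, the tentative conclusion can be made concrete: the second derivative of $\sum (y_i - \beta)^2$ with respect to $\beta$ is $2n$, while the standard error of $\hat\beta = \bar y$ is $\sigma/\sqrt{n}$. The sketch below compares the two sample sizes mentioned above, assuming $\sigma = 1$ purely for illustration.

```python
import math

sigma = 1.0  # assumed error standard deviation (illustration only)

def curvature(n):
    """Second derivative of SS(beta) = sum (y_i - beta)^2 with respect to beta."""
    return 2 * n

def std_error(n):
    """Standard error of beta_hat = ybar."""
    return sigma / math.sqrt(n)

for n in (100, 400):
    print(n, curvature(n), std_error(n))
```

Quadrupling $n$ makes the criterion four times as steep and halves the standard error.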
Multivariate β
$$g(\beta) = \epsilon^t \epsilon, \qquad Y = X\beta + \epsilon, \qquad \hat\beta \sim N\!\left(\beta,\; \sigma^2 (X^t X)^{-1}\right)$$
Multivariate Taylor’s series:
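$$g(\beta) = g(\hat\beta) + (\beta - \hat\beta)^t\, \nabla g(\hat\beta) + \tfrac{1}{2} (\beta - \hat\beta)^t\, \nabla^2 g(\hat\beta)\, (\beta - \hat\beta) + \cdots$$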
For our least-squares problem:
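$$\begin{aligned}
g(\beta) &= g(\hat\beta) + 0 + (\beta - \hat\beta)^t (X^t X) (\beta - \hat\beta) \\
&= Y^t (I - H) Y + (\beta - \hat\beta)^t (X^t X) (\beta - \hat\beta)
\end{aligned}$$
where $H = X (X^t X)^{-1} X^t$ is the hat matrix. The linear term vanishes because $\nabla g(\hat\beta) = 0$, the factor $\tfrac{1}{2}$ cancels against the Hessian $2 X^t X$, and the expansion is exact because $g$ is quadratic.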
(see sketch)
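Because the criterion is exactly quadratic, the expansion around $\hat\beta$ can be verified at any trial point. The sketch below uses made-up data; `g` and `g_quadratic` are hypothetical helper names, and $H$ is taken to be the usual hat matrix $X(X^tX)^{-1}X^t$.

```python
import numpy as np

# Hypothetical toy data (illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])
X = np.column_stack([np.ones_like(x), x])
n = len(Y)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
H = X @ np.linalg.solve(X.T @ X, X.T)  # hat matrix X (X^t X)^{-1} X^t

def g(beta):
    """Criterion computed directly as (Y - X beta)^t (Y - X beta)."""
    r = Y - X @ beta
    return r @ r

def g_quadratic(beta):
    """Criterion via the expansion around beta_hat."""
    d = beta - beta_hat
    return Y @ (np.eye(n) - H) @ Y + d @ (X.T @ X) @ d

beta_trial = np.array([0.5, 1.5])  # arbitrary trial value
print(g(beta_trial), g_quadratic(beta_trial))
```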
Facts about Positive Definite Matrices
$A = X^t X$ symmetric
Look at the quadratic form
$$y^t A y = y^t X^t \underbrace{X y}_{w} = w^t w \ge 0 \quad \forall y$$
Look at eigenvalues/eigenvectors:
$$A v_k = \lambda_k v_k, \qquad k = 1, \ldots, p$$
and
$$v_k^t v_k = 1, \qquad v_k^t v_\ell = 0 \quad (k \ne \ell).$$
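Assume $\lambda_1 > \lambda_2 > \cdots > \lambda_p$. Consider
$$v_k^t (A v_k) = v_k^t (\lambda_k v_k) = \lambda_k (v_k^t v_k) = \lambda_k$$
so, in fact, all the $\lambda_k \ge 0$, since the quadratic form is never negative; they are strictly positive when $A$ is positive definite.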
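These eigenvalue facts can be checked with `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order. The sketch below reorders them to match the convention $\lambda_1 > \lambda_2 > \cdots$; the design matrix is a made-up example.

```python
import numpy as np

# Hypothetical design matrix (illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
A = X.T @ X

# eigh returns ascending eigenvalues; flip to get lambda_1 > lambda_2 > ...
eigenvalues, V = np.linalg.eigh(A)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
V = V[:, order]  # column k is the eigenvector v_k

# Orthonormality: V^t V should be the identity.
gram = V.T @ V

# v_k^t A v_k should reproduce lambda_k.
rayleigh = np.array([V[:, k] @ A @ V[:, k] for k in range(A.shape[0])])

print("eigenvalues:", eigenvalues)
print("v_k^t A v_k:", rayleigh)
```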
Level Sets
$$g(\beta) = g(\hat\beta) + (\beta - \hat\beta)^t A\, (\beta - \hat\beta)$$
Find values of $\beta$ satisfying
$$(\beta - \hat\beta)^t A\, (\beta - \hat\beta) = c$$
(see sketch)
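Suppose
$$A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_p \end{pmatrix}$$
By inspection, $A e_k = a_k e_k$, so the eigenvectors are the coordinate axes. Level sets satisfy
$$y^t A y = \sum_{k=1}^{p} a_k y_k^2 = \sum_{k=1}^{p} \frac{y_k^2}{1/a_k} = c$$
which is an ellipse.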
(see sketch)
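For a diagonal $A$, the level set crosses the $k$-th coordinate axis at distance $\sqrt{c/a_k}$ from the origin. The sketch below checks this in two dimensions with assumed values for $a_1$, $a_2$ and $c$, and confirms that a point parametrized by an angle lies on the level set.

```python
import math

# Hypothetical diagonal entries and level (illustration only).
a = [4.0, 1.0]
c = 4.0

# Semi-axis lengths along the coordinate directions e_1 and e_2.
semi_axes = [math.sqrt(c / a_k) for a_k in a]

def quad_form(y):
    """y^t A y for diagonal A."""
    return sum(a_k * y_k ** 2 for a_k, y_k in zip(a, y))

# A generic point on the ellipse, parametrized by an arbitrary angle.
theta = 0.7
point = [semi_axes[0] * math.cos(theta), semi_axes[1] * math.sin(theta)]

print("semi-axes:", semi_axes)
print("quadratic form at point:", quad_form(point))
```

The larger diagonal entry gives the shorter axis, which anticipates the eigenvalue statement for general $A$.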
Next, find the point $y$ on the level set in the direction $v_1$; thus $y$ has the form $\alpha v_1$:
$$y^t A y = c \quad\Rightarrow\quad \alpha^2\, \underbrace{v_1^t\, \underbrace{A v_1}_{\lambda_1 v_1}}_{\lambda_1} = c$$
so that
$$\alpha^2 = \frac{c}{\lambda_1}$$
Since $\lambda_1$ is the largest eigenvalue, this is the shortest axis of the ellipse. Changes in $\beta$ in that direction give the quickest increase in the criterion function, so $\beta$ is determined most accurately in that direction.
In general, points $y_k = \alpha_k v_k$ on the level set satisfy
$$\alpha_k^2\, v_k^t A v_k = c \quad\Rightarrow\quad \alpha_k = \sqrt{\frac{c}{\lambda_k}}, \qquad y_k = \sqrt{\frac{c}{\lambda_k}}\; v_k$$
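The axis endpoints $y_k = \sqrt{c/\lambda_k}\, v_k$ can be computed from an eigendecomposition and checked against the level-set equation. The matrix `A` below is an assumed symmetric positive definite example, not one taken from the notes.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix and level (illustration only).
A = np.array([[6.0, 15.0],
              [15.0, 55.0]])
c = 1.0

# eigh returns ascending eigenvalues; reverse so lambda_1 is the largest.
lam, V = np.linalg.eigh(A)
lam, V = lam[::-1], V[:, ::-1]

alphas = np.sqrt(c / lam)  # half-lengths of the ellipse axes
points = V * alphas        # column k is y_k = alpha_k * v_k

# Each y_k should satisfy y_k^t A y_k = c.
values = np.array([points[:, k] @ A @ points[:, k] for k in range(2)])

print("axis half-lengths:", alphas)
print("y_k^t A y_k:", values)
```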
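or get same result by recalling
$$\beta - \hat\beta \sim N\!\left(0,\; \sigma^2 (X^t X)^{-1}\right)$$
has level sets
$$(\beta - \hat\beta)^t \left[\sigma^2 (X^t X)^{-1}\right]^{-1} (\beta - \hat\beta) = c$$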
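Look in the direction $\beta - \hat\beta = \alpha_k v_k$, then
$$\alpha_k^2\, v_k^t \left(\sigma^2 (X^t X)^{-1}\right)^{-1} v_k = c$$
$$\frac{\alpha_k^2}{\sigma^2}\, \underbrace{v_k^t (X^t X)\, v_k}_{\lambda_k} = c$$
so
$$\frac{\alpha_k^2}{\sigma^2} = \frac{c}{\lambda_k} \quad\Rightarrow\quad \alpha_k = \sigma \sqrt{\frac{c}{\lambda_k}}$$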
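The same check works for the level sets of the normal density: the inverse covariance is $X^tX/\sigma^2$, so the axes point along the same eigenvectors and are stretched by $\sigma$. The sketch below assumes a hypothetical $X^tX$ and an assumed value of $\sigma$.

```python
import numpy as np

# Hypothetical X^t X, error standard deviation, and level (illustration only).
A = np.array([[6.0, 15.0],
              [15.0, 55.0]])
sigma = 2.0
c = 1.0

cov = sigma ** 2 * np.linalg.inv(A)  # Var(beta_hat) = sigma^2 (X^t X)^{-1}
precision = np.linalg.inv(cov)       # should equal X^t X / sigma^2

# Eigenvectors of X^t X, ordered so lambda_1 is the largest.
lam, V = np.linalg.eigh(A)
lam, V = lam[::-1], V[:, ::-1]

alphas = sigma * np.sqrt(c / lam)    # alpha_k = sigma * sqrt(c / lambda_k)

# Each alpha_k v_k should lie on the level set of the density.
values = np.array([
    (alphas[k] * V[:, k]) @ precision @ (alphas[k] * V[:, k])
    for k in range(2)
])

print("axis half-lengths:", alphas)
print("level-set values:", values)
```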
(see sketch) Note the same orientation in the end.
Well, now for the computer demos...