Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Variance Components - Mathematics and Statistics - Study Notes, Study notes of Mathematical Statistics

Alliance University Mathematical Statistics

This document has following main points Variance Components, Weights, Model, Minimum Norm Quadratic Unbiased Estimate, Maximum Likelihood Estimate, The Fisher Information Matrix, Iteration Procedure

Typology: Study notes

2011/2012

Uploaded on 10/31/2012

sangawar 🇮🇳

4.5

(4)

118 documents

1 / 22

This page cannot be seen from the preview

Don't miss anything!

1

Variance Components (VARCOMP)

The Variance Components procedure provides estimates for variances of random

effects under a general linear model framework. Four types of estimation methods

are available in this procedure.

Notation

The following notation is used throughout this chapter. Unless otherwise stated, all

vectors are column vectors and all quantities are known.

n Number of observations, n≥1

k Number of random effects, k≥0

m0 Number of parameters in the fixed effects, m00≥

mi Number of parameters in the ith random effect, mi

≥

0, ik=1, ,K

m Total number of parameters, mm m m

k

=+++

01

L

σ

i

2 Unknown variance of the ith random effect,

σ

i

20≥, ik=1, ,K

σ

e

2 Unknown variance of the residual term, same as

σ

k+1

2

,

σ

e

20>

γ

i

2 Unknown variance ratio of the ith random effect,

γσσ

i

ie

2

22

=,

γ

i

20≥,

ik=1, ,K, and

γ

k+=

1

21

y The length n vector of observations

e The length n vector of residuals

Xi The nm

i

× design matrix, ik=01,, ,K

β0 The length m0 vector of parameters of the fixed effects

βi The length mi vector of parameters of the ith random effect, ik=1, ,K

Unless otherwise stated, a

p

× identity matrix is denoted as Ip, a

p

q× zero

matrix is denoted as 0pq×, and a zero vector of length p is denoted as 0p.

Discover Study notes of Mathematical Statistics Alliance University

Partial preview of the text

Download Variance Components - Mathematics and Statistics - Study Notes and more Study notes Mathematical Statistics in PDF only on Docsity!

1

The Variance Components procedure provides estimates for variances of random effects under a general linear model framework. Four types of estimation methods are available in this procedure.

Notation

The following notation is used throughout this chapter. Unless otherwise stated, all vectors are column vectors and all quantities are known.

n Number of observations, n ≥ 1

k (^) Number of random effects, k ≥ 0 m 0 Number of parameters in the fixed effects, m 0 ≥ 0 mi Number of parameters in the i th random effect, mi ≥ 0 , i = 1, K, k

m Total number of parameters, m = m 0 + m 1 + L+ mk

σ (^) i^2 Unknown variance of the i th random effect, σ (^) i^2 ≥ 0 , i = 1, K, k σ (^) e^2 Unknown variance of the residual term, same as σ (^) k^2 +^1 , σ (^) e^2 > 0 γ (^) i^2 Unknown variance ratio of the i th random effect, γ (^) i^2^ = σ (^) i^2^ σ e^2 , γ (^) i^2 ≥ 0 , i = 1, K, k , and γ (^) k^2 +^1 = 1 y (^) The length n vector of observations

e The length n vector of residuals

X i The n × mi design matrix, i = 0 1, , K, k β 0 The length m 0 vector of parameters of the fixed effects β i The length mi vector of parameters of the i th random effect, i = 1, K, k Unless otherwise stated, a p × p identity matrix is denoted as I (^) p , a p × q zero matrix is denoted as (^0) p × q , and a zero vector of length p is denoted as (^0) p.

Weights

For the sake of clarity and simplicity, the algorithms described in this chapter assume unit frequency weight and unit regression weight for all cases. Weights can be applied as described in the following two sections.

Frequency Weight

The WEIGHT command specifies frequency weights.

Cases with nonpositive frequency are excluded from all calculations in the procedure.
Non-integral frequency weight is rounded to the nearest integer.
The total sample size is equal to the sum of positive rounded frequency weights.

Regression Weight

The REGWGT subcommand specifies regression weights. Suppose the l th 0 l = 1, K, n 5 case has a regression weight wi > 0 (cases with nonpositive regression weights are excluded from all calculations in the procedure). Let W = diag 1 w (^) 1 , K, w (^) n 6 be the n × n diagonal weight matrix. Then the VARCOMP procedure will perform all calculations as if y is physically transformed to W y

(^12)

and X i to W X

(^12) i ,^ i^ =^ 0 1, ,^ K,^ k ; and then the pertinent algorithm is applied to the transformed data.

Model

The mixed model is represented, following Rao (1973), as

y = X + X + e

(^0 0) ∑ 1

β (^) i β i i

k

and

q i^ k i (^) i k = i ′^ = = +

% & '

SSQ , ,

SSQ

X Ry Ry

1 6 0 5

K

where SSQ( A ) is the sum of squares of all elements of a matrix A.

MINQUE(0)

The prior values are α (^) i = 0 , i = 1, K, k , and α (^) k + 1 = 1. Under this set of prior values, V^ =^ I n and^ R^ =^ I^ n −^ X^ 0 1 X X ′ 0^ 0 6 − X ′ 0. Since this^ R^ is an idempotent matrix, some of the elements of S and q can be simplified to

s i k s j k s n q

i k i i k j j j k k k

, , ,

trace , , ; trace , , ; rank

1 1 1 1 0 1

X RX

X

y Ry

1 6 3 8 1 6

K

Using the algorithm by Goodnight (1978), the elements of S and q are obtained without explicitly computing R. The steps are described as follows: Step 1. Form the symmetric matrix:

′ ′ ′ ′ ′ ′ ′ ′

!

"

$

# # # #

X X X X X X X y X X X X X X X y

X X X X X X X y y X y X y X y y

0 0 0 1 0 0 1 0 1 1 1 1

0 1 0 1

L

M M M M

L

k k

k k k k k k

Step 2. Sweep the above matrix by pivoting on each diagonal of X X ′ 0 0. This produces the following matrix:

G GX X GX X GX y X X G X RX X RX X Ry

X X G X RX X RX X Ry y X G y RX y RX y Ry

!

"

$

# # # #

0 1 0 0 1 0 1 1 1 1

0 1 0 1

L

M M M M

L

k k

k k k k k k

where G = 1 X X ′ 0 06 −. In the process of computing the above matrix, the rank of X 0 is obtained as the number of nonzero pivots found.

Step 3. Form S and q. The MINQUE(0) of σ are σ =$ S q −.

MINQUE(1)

The prior values are α (^) i = 1 , i = 1 , K, k + 1. Under this set of prior values,

V = X X ′

∑ i^ i i

k

1

. Using Giesbrecht (1983), the matrix S and the vector q are obtained

through an iterative procedure. The steps are described as follows: Step 1. Construct the augmented matrix A = X (^) 0 X (^) 1 L X (^) k y. Then compute the 0 m + 1 5 0× m + 15 matrix T 0 (^) k + 1 5 = A A ′.

Step 2. Define H (^) l X X i i i l

k 0 5 =^ ′ =

∑

1 , and T 0 5 (^) l = A H ′ (^) 0 5− l^1 A^ , l = 1, K, k. Update T 0 (^) l + 15 to

T 0 5 (^) l using the W Transform given in Goodnight and Hemmerle (1979). The updating formula is

T 0 5 (^) l = T 0 (^) l + 5 − A H ′ (^) 0 − l^ + 5 X (^) l 4 I (^) m (^) l + X H l ′ (^) 0 − l + 5 X (^) l 9 X H l ′ (^) 0 l 5 A

−

− 1 1

1 1

1

Likelihood Function

The likelihood function is

L ≡ L = −^ n^ e − − ′^ − e

− (^) − 0 5 0 5θ 2 2 2 1 β 6 1 β 6

1 2 (^1) 2 0 0 π σ V exp y X V^1 y X 0 0 σ^2.

The log-likelihood function is

l L n^ n^ e e

= log = − 2 log 2 − 2 log − 12 log − 1 − ′^ − −. 2

2 2 0 0 π σ^1 0 σ

0 5 4 9 V 1 y X β 6 V 1 y X β 6

The Gradient Vector

∂ ∂ (^) σ ∂ ∂ γ σ ∂ ∂ σ σ σ

l

l (^) i k

l n

e

i e

i i i i

e e e

β 0 2 0

1

2 2

1 1 1

2 4

1 2

= ′^ ′ − ′ =

= ′^ −

−

− − −

−

X V r

r V X X V r X V X

r V r

tr , , , ,

4 9 K

where r = y − X 0 β 0. The gradient vector is

dl d

l

l e

θ

β

γ

!

"

$

# # # # #

∂ σ

0 2

2

The Hessian Matrix

∂ ∂ (^) σ

2 0 0 2

0 1 0

l 1 β β (^) e

= − X V ′ − X

∂ ∂ γ ∂ σ ∂ ∂ γ ∂ γ σ

2 2 0 2

(^1 1 )

2 2 2

1 1 2

1 1 1

l (^) i k

l (^) i k j k

i e

i i

i j

i j j i e

i i j j

β

− −

− − − − −

r V X X V X

X V X X V X r V X X V X X V r

tr , , ; , , ,

K

4 9 K^ K

∂ σ ∂ σ ∂ ∂ σ ∂ γ σ ∂ ∂ σ ∂ σ σ σ

2 2 0 4

(^1 )

2 2 2 4

1 1

2 2 2 4 6

1

l

l (^) j k

l n

e e

e j e

j j

e e e e

β

−

− −

−

r V X

r V X X V r

r V r

, K,

The Hessian matrix is

d l d d

l l l

e

e e e e

2

2 0 0

2 0 2

2

2 0 2 (^2 )

2 2 2

2 2 2 2 (^2 )

2 2 2

θ θ

β β (^) β γ β

γ β γ γ γ

β γ

!

"

$

# # # # # #

∂ ∂ σ ∂ ∂ ∂

∂ ∂ σ ∂ ∂ σ ∂

∂ σ ∂

∂ σ ∂ σ

E l

E l^ j k

E l^ n

e

m

e j e

j j

e e e

∂ σ ∂

∂ ∂ σ ∂ γ σ ∂ ∂ σ ∂ σ σ

2 (^2 ) 2 2 2 2

1

2 2 2 4

0

1 2

β

^

= −^ ′^ =

^

−

tr 4 X V X 9 , K, ,

The Fisher Information matrix is

E d l d d

l

E l^ E l

m m m m

m m m (^) e

m e e e

2

2 0 0 2 2 2

2 2 2 2 2 2

2 2 2

0 0 0

0 0

0

θ θ

β β

γ γ γ

γ

^

′

^

!

"

$

# # # # # #

× −

− ×

∂ ∂ σ ∂ ∂ σ ∂

∂ σ ∂ σ

1 6

where

E l

E l^ E l

k

k k k

∂ γ ∂ γ

2 2 2

2 12 12

2 12 2

2 2 12

2 2 2

γ γ

^

!

"

$

# # # # # #

L

M O M

L

and

E l

e

e k

∂ σ ∂

∂ σ ∂ γ

2 2 2

2 2 12

2 2 2

γ

^

!

"

$

# # # # # #

M

Iteration Procedure

Initial Values

Fixed Effect Parameters : β$ 0 = 1 X X 0 ′ 0 (^) 6 − X y 0 ′.
Random Effect Variance Components : For the i th 0 i = 1, K, k 5 random effect, compute β$ i = 1 X X (^) i ′ (^) i (^) 6 − X y i ′. Then assign the variance of the mi elements of β$ i using divisor 1 mi − 16 to the estimate σ$ (^) i^2 if mi ≥ 2 ; otherwise σ$ (^) i^2 = 0.
Residual Variance : σ$ (^) e^2^ = r r ′ n where X = X (^) 0 X (^) 1 L X k and r = y − 0 X X ′ 5 − X y ′. If σ$ (^) e^2 = 0 but k ≥ 1 then reset σ$ (^) e^2 = 10 −^8 so that the iteration can continue. The variance ratios are then computed as γ$ (^) i^2^ = σ$^ i^2^ σ$ e^2 , i = 1, K, k. Following the same method in which the residual variance is initialized, σ$ (^) e^2 > 0 for k ≥ 1.

Updating

At the s th iteration 0 s = 0 1, ,K 5 , the parameter vector is updated as

θ^ $^0 s + 1 5 = θ$^ 0 5 s + ρ∆θ$0 5 s

where ∆ θ$ (^) 0 5 s is the value of increment ∆θ evaluated at θ = θ$^ 0 5 s , and ρ > 0 is a step size such that l (^) 4 θ $^0 s + 1 5 9 ≥ l 4 θ $0 5 s 9. The increment vector depends on the choice of step type—Newton-Raphson versus Fisher scoring. The step size is determined by the step-halving technique with ρ = 1 initially and a maximum of 10 halvings.

Covariance Matrix

Let θ$ be the vector of maximum likelihood estimates. Their covariance matrix is given by

cov $ $

θ 4 9 θ θ (^) θ θ

^

− E d l d d

2 1

Let

ψ

β

=

!

"

$

# # # #

0 1

2

2 2

σ

σ σ

M

k e

be the original parameters. Their maximum likelihood estimates are given by

ψ

β

=

!

"

$

# # # #

0 2 12

2 2 2

σ γ

σ γ σ

e

e k e

M

and their covariance matrix is estimated by

cov (^) 2 7ψ$^ = J cov (^) 4 9θ$ J ′

where

J

I 0 0

= 0 I

!

"

$

× ×

m m k m k m e k

0 0 0 0

2 0 0 1

σ γ

which is the 1 m (^) 0 + k + (^1) 6 1 × m (^) 0 + k + 16 Jacobian matrix of transforming θ to ψ.

Restricted Maximum Likelihood Estimate (REML)

The restricted maximum likelihood method finds a linear transformation on y such that the resulting vector does not involve the fixed effect parameter vector β 0 regardless of their values. It has been shown that these linear combinations are the residuals obtained after a linear regression on the fixed effects. Suppose r is the

rank of X 0 ; then there are at most n − r linearly independent combinations. Let K

be an n × 0 n − r 5 matrix whose columns are these linearly independent combinations. Then the properties of K are (Searle et al., 1992, Chapter 6):

′ = ′ =

K X 0 − ×

K TM

(^0 0) n r 5 m 0

where T is a 0 n − r 5 × n matrix with linearly independent rows and

M = I (^) n − X (^) 0 1 X X ′ 0 (^) 0 6 − X ′ 0

It can be shown that REML estimation is invariant to K (Searle et al., 1992, Chapter 6); thus, we can choose K such that K K ′ = I n − r to simplify calculations. It follows that the distribution of K y ′ is N (^) n − r 4 (^0) , σ (^) e^2 K VK ′ 9.

Parameters

The parameter vector is θ = γ

!

" $

2 σ (^2) e^ where^ γ^

2

12

2

!

"

$

γ

M

k

The Hessian Matrix

∂ γ ∂ γ σ ∂ ∂ σ ∂ γ σ ∂ ∂ σ ∂ σ σ σ

2 2 2 2 2 2 2 4 2 2 2 6 4

l (^) i k j k

l (^) j k

l n r

i j e

i i j j i j j i

e j e

j j

e e e e

y RX X RX X Ry X RX X RX

y RX X Ry

y Ry

tr , , ; , ,

3 8 K^ K

K

The Hessian matrix is

d l d d

l l

e

e e e

2

2 2 2

2 2 2 2 2 2

2 2 2

θ θ

γ γ γ

γ

!

"

$

∂ ∂ ∂

∂ ∂ ∂ σ ∂ ∂ σ ∂

∂ ∂ σ ∂ σ

where

∂ γ ∂ γ

2 2 2

2 12 12

2 12 2

2 2 12

2 2 2

l

l l

k

k k k

γ γ

!

"

$

# # # # # #

L

M O M

L

and^ ∂ ∂ σ ∂

∂ σ ∂ γ

2 2 2

2 2 12

2 2 2

l

e

e k

γ

!

"

$

# # # # # #

M

The Fisher Information Matrix

Since K X ′ (^) 0 = (^0) 0 n − r 5 × m 0 and trace 0 RV 5 = n − r , the expected second derivatives are

E l^ i k j k i j

i j j i

∂ γ ∂ γ

2 2 2

= −^ tr^3 X RX X RX^ ′^ ′^8 =^ ,^ K,^ ,^ = ,^ K,

E l^ j k

E l^ n^ r

e j e

j j

e e e

∂ ∂ σ ∂ γ σ ∂ ∂ σ ∂ σ σ

2 2 2 2

2 2 2 4

= −^ ′^ =

^

tr 3 X RX 8 , K,

The Fisher Information matrix is

E d l d d

E l^ E l

e

e e e

2

2 2 2

2 2 2 2 2 2

2 2 2

θ θ

γ γ γ

γ

^

!

"

$

∂ ∂ σ ∂ ∂ σ ∂

∂ σ ∂ σ

where

E l

E l^ E l

k

k k k

∂ γ ∂ γ

2 2 2

2 12 12

2 12 2

2 2 12

2 2 2

γ γ

^

!

"

$

# # # # # #

L

M O M

L

and

E l

e

e k

∂ σ ∂

∂ σ ∂ γ

2 2 2

2 2 12

2 2 2

γ

^

!

"

$

# # # # # #

M

Otherwise, the Fisher scoring step is used instead. The increment vector for each type of step is:

Newton-Raphson Step :∆θ θ θ θ

^

− d l d d

dl d

2 1 .

Fisher Scoring Step : ∆θ θ θ θ

^

− E d l d d

dl d

2 1 .

Convergence Criteria

Given the convergence criterion ε > 0 , the iteration is considered converged when the following criteria are satisfied:

l (^) 4 θ $^0 s + 1 5 9 4− l θ $^ 0 5 s (^) 9 < ε × max 1 , l 4 θ $0 5 s 9 , and
ρ 4 θ $^0 s + 1 5 − θ$^ 0 5 s (^) 9 < ε×max 41 , θ$0 5 s 9 where a is the sum of absolute values of elements of the vector a.

Negative Variance Estimates

Negative variance estimates can occur at the end of an iteration. An ad hoc method is to set those estimates to zero before the next iteration.

Covariance Matrix

Let θ$ be the vector of maximum likelihood estimates. Their covariance matrix is given by

cov $ $

θ 4 9 θ θ (^) θ θ

^

− E d l d d

2 1

Let

ψ =

!

"

$

σ

σ σ

12

2 2

M

k e

be the original parameters. Their maximum likelihood estimates are given by

ψ =

!

"

$

σ γ

σ γ σ

e

e k e

2 12

2 2 2

M

and their covariance matrix is estimated by

cov (^) 2 7ψ$^ = J cov (^) 4 9θ$ J ′

where

J = I

!

" $

σ (^) e^2 (^) k 0 1

γ

which is the 0 k + 1 5 0× k + 15 Jacobian matrix of transforming θ to ψ.

ANOVA Estimate

The ANOVA variance component estimates are obtained by equating the expected mean squares of the random effects to their observed mean squares. The VARCOMP procedure offers two types of sum of squares: Type I and Type III (see Appendix 11 for details).

Variance Components - Mathematics and Statistics - Study Notes, Study notes of Mathematical Statistics

Related documents

Partial preview of the text

Download Variance Components - Mathematics and Statistics - Study Notes and more Study notes Mathematical Statistics in PDF only on Docsity!

Notation

n Number of observations, n ≥ 1

m Total number of parameters, m = m 0 + m 1 + L+ mk

e The length n vector of residuals

Weights

Model

y = X + X + e

SSQ , ,

SSQ

K

X RX

X RX

X

K

K

# # # #

L

L

M M M M

L

L

# # # #

L

L

M M M M

L

L

Step 3. Form S and q. The MINQUE(0) of σ are σ =$ S q −.

V = X X ′

= ′^ ′ − ′ =

= ′^ −

# # # # #

= − X V ′ − X

K

, K,

# # # # # #

# # # # # #

# # # # # #

M O M

L

# # # # # #

M



# # # #

M

# # # #

M

J

I 0 0

= 0 I

Restricted Maximum Likelihood Estimate (REML)

rank of X 0 ; then there are at most n − r linearly independent combinations. Let K

K X 0 − ×

K TM

M

K

# # # # # #

L

M O M

L

# # # # # #

# # # # # #

M O M

L

# # # # # #

M



M

M

J = I

ANOVA Estimate