Multinomial Loglinear and Logit Models in GENLOG: Estimation and Analysis, Study notes of Mathematical Statistics

Alliance University Mathematical Statistics

An overview of multinomial loglinear and logit models in GENLOG, focusing on normalizing constants, cell structure values, likelihood equations, the Hessian matrix, and estimation methods. It covers maximum-likelihood estimates, the Newton-Raphson method, initial values, and stopping criteria.

Typology: Study notes

2011/2012

Uploaded on 10/31/2012

sangawar 🇮🇳

GENLOG

Multinomial Loglinear and Logit Models

This chapter describes the algorithms used to calculate maximum-likelihood estimates for the multinomial loglinear model and the multinomial logit model. This algorithm is applicable only to aggregated data.

Notation

The following notation is used throughout this chapter unless otherwise stated:

A Generic categorical independent (explanatory) variable. Its categories are indexed by an array of integers.

B Generic categorical dependent (response) variable. Its categories are indexed by an array of integers.

r Number of categories of B.

c Number of categories of A.

p Number of nonredundant (nonaliased) parameters.

i Generic index for the category of B.

j Generic index for the categories of A.

k Generic index for the parameter.

$n_{ij}$ Observed count in the ith response of B and the jth setting of A.

$N_j$ Marginal total count at the jth setting of A. It is equal to $\sum_{i=1}^{r} n_{ij}$.

$N$ Total observed count. It is equal to $\sum_{j=1}^{c} \sum_{i=1}^{r} n_{ij}$.

$m_{ij}$ Expected count.


$\pi_{ij}$ Probability of having an observation in the ith response of B and the jth setting of A. $0 \le \pi_{ij} \le 1$ and $\sum_{j=1}^{c} \sum_{i=1}^{r} \pi_{ij} = 1$.

$z_{ij}$ Cell structure value.

$\alpha_j$ jth normalizing constant.

$\beta_k$ kth nonredundant parameter.

$\boldsymbol{\beta}$ A vector of $(\beta_1, \ldots, \beta_p)'$.

$x_{ijk}$ An element in the ith row and the kth column of the design matrix for the jth setting.

The same notation is used for both loglinear and logit models so that the methods are presented in a unified way. Conceptually, one can consider a loglinear model as a special case of a logit model where the explanatory variable has only one level (that is, c = 1).

Components of the Model

There are two components in a loglinear model: the random component and the systematic component.

Random Component

The random component describes the joint distribution of the counts.

The counts $(n_{1j}, \ldots, n_{rj})$ at the jth setting of A have the multinomial $(N_j, \pi_{1j}, \ldots, \pi_{rj})$ distribution.

The counts $n_{ij}$ and $n_{i'j'}$ are independent if $j \ne j'$.

Normalizing Constants

$$ \alpha_j = \log\left( \frac{N_j}{\sum_{i=1}^{r} z_{ij} e^{\nu_{ij}}} \right), \qquad j = 1, \ldots, c \tag{2} $$
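A minimal Python sketch of equation (2) follows. It assumes the linear predictor is $\nu_{ij} = \sum_k x_{ijk} \beta_k$ and that expected counts follow the form used in step 5 of the Algorithm section, $m_{ij} = N_j z_{ij} e^{\nu_{ij}} / \sum_t z_{tj} e^{\nu_{tj}}$. The function name `expected_counts`, the nested-list layout `x[i][j][k]`, and the choice to skip structural zeros in the sum are illustrative assumptions, not part of the SPSS implementation.

```python
import math

def expected_counts(beta, x, z, N):
    """Return (alpha, m) for a parameter vector beta.

    x[i][j][k] is the design value, z[i][j] the cell structure value,
    and N[j] the marginal total at the jth setting.
    """
    r, c = len(z), len(z[0])
    alpha = []
    m = [[0.0] * c for _ in range(r)]
    for j in range(c):
        # e_term[i] = z_ij * exp(nu_ij), with nu_ij = sum_k x_ijk * beta_k.
        # Structural zeros (z_ij <= 0) are skipped; this is an assumption.
        e_term = [
            z[i][j] * math.exp(sum(xv * b for xv, b in zip(x[i][j], beta)))
            if z[i][j] > 0 else 0.0
            for i in range(r)
        ]
        total = sum(e_term)
        alpha.append(math.log(N[j] / total))    # equation (2)
        for i in range(r):
            m[i][j] = N[j] * e_term[i] / total  # stays 0 for structural zeros
    return alpha, m
```

With this normalization the fitted counts at each setting sum to $N_j$, which is what the multinomial sampling scheme requires.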

Cell Structure Values

The cell structure values play two roles in SPSS loglinear procedures, depending on their signs. If $z_{ij} > 0$, it is a usual weight for the corresponding cell and $\log(z_{ij})$ is sometimes called the offset. If $z_{ij} \le 0$, a structural zero is imposed on the cell $(B = i, A = j)$. Contingency tables containing at least one structural zero are called incomplete tables. If $n_{ij} = 0$ but $z_{ij} > 0$, the cell $(B = i, A = j)$ contains a sampling zero. Although SPSS still considers a structural zero part of the contingency table, it is not used in fitting the model. Cellwise statistics are not computed for structural zeros.

Maximum-Likelihood Estimation

The multinomial log-likelihood is

$$ L(\beta_1, \ldots, \beta_p) = \text{constant} + \sum_{j=1}^{c} \sum_{i=1}^{r} n_{ij} \log(m_{ij}) \tag{3} $$

Likelihood Equations

It can be shown that

$$ \frac{\partial L}{\partial \beta_k} = \sum_{j=1}^{c} \sum_{i=1}^{r} (n_{ij} - m_{ij}) x_{ijk} $$

for $k = 1, \ldots, p$.

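Let $\mathbf{g}(\boldsymbol{\beta}) = (g_1(\boldsymbol{\beta}), \ldots, g_p(\boldsymbol{\beta}))'$ be the $p \times 1$ gradient vector with

$$ g_k(\boldsymbol{\beta}) = \frac{\partial L}{\partial \beta_k} $$

The maximum-likelihood estimates $\hat{\boldsymbol{\beta}} = (\hat\beta_1, \ldots, \hat\beta_p)'$ are regarded as a solution to the vector of likelihood equations:

$$ \mathbf{g}(\boldsymbol{\beta}) = \mathbf{0} \tag{4} $$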
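The gradient in the likelihood equations can be sketched in a few lines of Python. The function name `gradient` and the nested-list layout (`n[i][j]`, `m[i][j]`, `x[i][j][k]`) are assumptions made for illustration only.

```python
def gradient(n, m, x):
    """Return [g_1, ..., g_p] with g_k = sum_j sum_i (n_ij - m_ij) * x_ijk."""
    r, c, p = len(n), len(n[0]), len(x[0][0])
    return [
        sum((n[i][j] - m[i][j]) * x[i][j][k] for i in range(r) for j in range(c))
        for k in range(p)
    ]
```

At the maximum-likelihood estimates every component of this vector is zero, which is exactly the condition stated in (4).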

Hessian Matrix

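The likelihood equations are nonlinear functions of $\boldsymbol{\beta}$. Solving them for $\hat{\boldsymbol{\beta}}$ requires an iterative method. The Newton-Raphson method is used. It can be shown that

$$ \frac{\partial^2 L}{\partial \beta_k \, \partial \beta_l} = - \sum_{j=1}^{c} \sum_{i=1}^{r} m_{ij} (x_{ijk} - \theta_{jk})(x_{ijl} - \theta_{jl}) $$

where

$$ \theta_{jk} = \frac{1}{N_j} \sum_{i=1}^{r} m_{ij} x_{ijk}, \qquad j = 1, \ldots, c \text{ and } k = 1, \ldots, p \tag{5} $$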

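Let $\mathbf{H}(\boldsymbol{\beta})$ be the $p \times p$ information matrix, where $-\mathbf{H}(\boldsymbol{\beta})$ is the Hessian matrix of (3). The elements of $\mathbf{H}(\boldsymbol{\beta})$ are

$$ h_{kl} = - \frac{\partial^2 L}{\partial \beta_k \, \partial \beta_l}, \qquad k = 1, \ldots, p \text{ and } l = 1, \ldots, p \tag{6} $$

Note: $\mathbf{H}(\boldsymbol{\beta})$ is a symmetric positive-definite matrix. The asymptotic covariance matrix of $\hat{\boldsymbol{\beta}}$ is estimated by $\mathbf{H}^{-1}(\boldsymbol{\beta})$.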
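As an illustration, the information matrix can be assembled directly from (5) and (6). The sketch below uses plain Python lists; the name `information_matrix` and the data layout are hypothetical.

```python
def information_matrix(m, x, N):
    """Return H with h_kl = sum_j sum_i m_ij (x_ijk - theta_jk)(x_ijl - theta_jl)."""
    r, c, p = len(m), len(m[0]), len(x[0][0])
    H = [[0.0] * p for _ in range(p)]
    for j in range(c):
        # theta_jk from equation (5)
        theta = [sum(m[i][j] * x[i][j][k] for i in range(r)) / N[j] for k in range(p)]
        for i in range(r):
            dev = [x[i][j][k] - theta[k] for k in range(p)]
            for k in range(p):
                for l in range(p):
                    H[k][l] += m[i][j] * dev[k] * dev[l]
    return H
```

For a two-category response with a single indicator column and fitted counts 6 and 2, this gives $8 \times 0.75 \times 0.25 = 1.5$, the familiar binomial information.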

Initial Values

SPSS uses the $\boldsymbol{\beta}^{(0)}$ which corresponds to a saturated model as the initial value for $\boldsymbol{\beta}$. Then the initial estimates for the expected cell counts are

$$ m_{ij}^{(0)} = \begin{cases} n_{ij} + \Delta & \text{if } z_{ij} > 0 \\ 0 & \text{if } z_{ij} \le 0 \end{cases} \tag{9} $$

where $\Delta \ge 0$ is a constant. Note: For saturated models, SPSS adds $\Delta$ to $n_{ij}$ if $z_{ij} > 0$. This is done to avoid numerical problems in case some observed counts are 0. We advise users to set $\Delta$ to 0 whenever all observed counts (other than structural zeros) are positive.
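A short sketch of the initial expected counts in (9). The function name `initial_counts` is hypothetical, and `delta` is passed explicitly because the text does not state a default value.

```python
def initial_counts(n, z, delta):
    """Return m0 with m0_ij = n_ij + delta if z_ij > 0, else 0 (equation (9))."""
    return [
        [(n_ij + delta) if z_ij > 0 else 0.0 for n_ij, z_ij in zip(n_row, z_row)]
        for n_row, z_row in zip(n, z)
    ]
```

For example, `initial_counts([[0, 4], [3, 2]], [[1, 1], [1, 0]], 0.5)` shifts the sampling zero to 0.5 while the structural-zero cell stays at 0.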

The initial values for other quantities are

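$$ \theta_{jk}^{(0)} = \frac{1}{N_j} \sum_{i=1}^{r} m_{ij}^{(0)} x_{ijk} \tag{10} $$

and

$$ \eta_{ij}^{(0)} = \begin{cases} \log\left( m_{ij}^{(0)} / z_{ij} \right) + \left( n_{ij} - m_{ij}^{(0)} \right) / m_{ij}^{(0)} & \text{if } z_{ij} > 0 \text{ and } m_{ij}^{(0)} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{11} $$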

Stopping Criteria

SPSS checks the following conditions for convergence:

1. $\max_{i,j} \left| m_{ij}^{(s+1)} - m_{ij}^{(s)} \right| / m_{ij}^{(s)} < \epsilon$ provided that $m_{ij}^{(s)} > 0$

2. $\max_{i,j} \left| m_{ij}^{(s+1)} - m_{ij}^{(s)} \right| < \epsilon$

3. $\left( \sum_{k=1}^{p} g_k(\hat{\boldsymbol{\beta}})^2 / p \right)^{1/2} < \epsilon$

The iteration is said to have converged if either conditions 1 and 3 or conditions 2 and 3 are satisfied. If $p = 0$, then condition 3 is automatically satisfied. The iteration is said to be not converged if neither pair of conditions is satisfied within the maximum number of iterations.
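The convergence rule can be sketched as a small Python predicate. Conditions 1 and 2 follow the text directly. Condition 3 is garbled in the source, so the sketch assumes it compares the root-mean-square of the gradient components with $\epsilon$; that reading, along with the function name `converged`, is an assumption.

```python
import math

def converged(m_new, m_old, g, eps):
    """Return True if (condition 1 and 3) or (condition 2 and 3) holds."""
    pairs = [
        (a, b)
        for row_new, row_old in zip(m_new, m_old)
        for a, b in zip(row_new, row_old)
    ]
    # Condition 1: largest relative change, over cells with m_old > 0.
    rel = max((abs(a - b) / b for a, b in pairs if b > 0), default=0.0)
    # Condition 2: largest absolute change.
    absolute = max((abs(a - b) for a, b in pairs), default=0.0)
    # Condition 3 (assumed form): RMS gradient; automatically true when p = 0.
    p = len(g)
    cond3 = p == 0 or math.sqrt(sum(v * v for v in g) / p) < eps
    return cond3 and (rel < eps or absolute < eps)
```

Because condition 3 appears in both accepted pairs, the rule reduces to "condition 3 and (condition 1 or condition 2)", which is how the last line is written.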

Algorithm

The iteration process uses the following steps:

1. Calculate $m_{ij}^{(0)}$ using (9), $\theta_{jk}^{(0)}$ using (10), and $\eta_{ij}^{(0)}$ using (11).

2. Set $s = 0$.

3. Calculate $\mathbf{H}(\boldsymbol{\beta}^{(s)})$ using (6) evaluated at $m_{ij} = m_{ij}^{(s)}$; calculate $\mathbf{q}(\boldsymbol{\beta}^{(s)})$ using (7) evaluated at $\eta_{ij} = \eta_{ij}^{(s)}$.

4. Solve for $\boldsymbol{\beta}^{(s+1)}$ using (8).

5. Calculate $\nu_{ij}^{(s+1)} = \sum_{k=1}^{p} x_{ijk} \beta_k^{(s+1)}$ and

$$ m_{ij}^{(s+1)} = \begin{cases} N_j z_{ij} e^{\nu_{ij}^{(s+1)}} \Big/ \sum_{t=1}^{r} z_{tj} e^{\nu_{tj}^{(s+1)}} & \text{if } z_{ij} > 0 \\ 0 & \text{if } z_{ij} \le 0 \end{cases} $$

6. Check whether the stopping criteria are satisfied. If yes, stop iteration and declare convergence. Otherwise continue.

7. Increase $s$ by 1 and check whether the maximum iteration has been reached. If yes, stop iteration and declare the process not converged. Otherwise repeat steps 3-7.
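The iteration can be illustrated with a compact Python sketch. Equations (7) and (8) are not present in this copy, so the sketch substitutes the textbook Newton-Raphson update $\boldsymbol{\beta}^{(s+1)} = \boldsymbol{\beta}^{(s)} + \mathbf{H}^{-1} \mathbf{g}$ and starts from $\boldsymbol{\beta} = \mathbf{0}$ rather than from the saturated-model start described above. The names `solve` and `fit`, the simplified stopping test (condition 2 plus an assumed root-mean-square gradient check), and the default `eps` and `max_iter` are all assumptions for illustration.

```python
import math

def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(p):
        piv = max(range(col, p), key=lambda q: abs(M[q][col]))
        M[col], M[piv] = M[piv], M[col]
        for q in range(col + 1, p):
            f = M[q][col] / M[col][col]
            for t in range(col, p + 1):
                M[q][t] -= f * M[col][t]
    y = [0.0] * p
    for k in reversed(range(p)):
        y[k] = (M[k][p] - sum(M[k][t] * y[t] for t in range(k + 1, p))) / M[k][k]
    return y

def fit(n, x, z, eps=1e-8, max_iter=25):
    """Return (beta, m, converged) for counts n[i][j], design x[i][j][k], structure z[i][j]."""
    r, c, p = len(n), len(n[0]), len(x[0][0])
    N = [sum(n[i][j] for i in range(r) if z[i][j] > 0) for j in range(c)]
    beta = [0.0] * p  # assumed start; the source starts from a saturated fit
    m_old, m = None, None
    for _ in range(max_iter):
        # Step 5 form: m_ij = N_j z_ij e^{nu_ij} / sum_t z_tj e^{nu_tj}
        m = [[0.0] * c for _ in range(r)]
        for j in range(c):
            e = [z[i][j] * math.exp(sum(x[i][j][k] * beta[k] for k in range(p)))
                 if z[i][j] > 0 else 0.0 for i in range(r)]
            total = sum(e)
            for i in range(r):
                m[i][j] = N[j] * e[i] / total if total > 0 else 0.0
        # Gradient (likelihood equations) and information matrix (5)-(6)
        g = [sum((n[i][j] - m[i][j]) * x[i][j][k]
                 for i in range(r) for j in range(c) if z[i][j] > 0) for k in range(p)]
        H = [[0.0] * p for _ in range(p)]
        for j in range(c):
            if N[j] <= 0:
                continue
            theta = [sum(m[i][j] * x[i][j][k] for i in range(r)) / N[j] for k in range(p)]
            for i in range(r):
                for k in range(p):
                    for l in range(p):
                        H[k][l] += m[i][j] * (x[i][j][k] - theta[k]) * (x[i][j][l] - theta[l])
        if m_old is not None:
            change = max(abs(m[i][j] - m_old[i][j]) for i in range(r) for j in range(c))
            rms = math.sqrt(sum(v * v for v in g) / p) if p else 0.0
            if change < eps and rms < eps:
                return beta, m, True
        beta = [b + d for b, d in zip(beta, solve(H, g))]  # Newton step (assumed form)
        m_old = m
    return beta, m, False
```

For a single setting with counts 6 and 2 and one indicator column, `fit([[6.0], [2.0]], [[[1.0]], [[0.0]]], [[1.0], [1.0]])` should converge to $\beta_1 = \log 3$ and reproduce the observed counts.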

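The Pearson chi-square statistic is

$$ X^2 = \sum_{j=1}^{c} \sum_{i=1}^{r} X_{ij}^2 $$

where

$$ X_{ij}^2 = \frac{(n_{ij} - \hat{m}_{ij})^2}{\hat{m}_{ij}} \qquad \text{if } z_{ij} > 0 \text{ and } \hat{m}_{ij} > 0 $$

[The source also assigns SYSMIS or 0 to the remaining cells, depending on $z_{ij}$, $n_{ij}$, and $\hat{m}_{ij}$; those conditions are not recoverable from this copy.]

If any $X_{ij}^2$ is system missing, then $X^2$ is also system missing. The likelihood-ratio chi-square statistic is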

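$$ G^2 = \sum_{j=1}^{c} \sum_{i=1}^{r} G_{ij}^2 $$

where

$$ G_{ij}^2 = 2 n_{ij} \log\left( n_{ij} / \hat{m}_{ij} \right) \qquad \text{if } z_{ij} > 0,\ n_{ij} > 0, \text{ and } \hat{m}_{ij} > 0 $$

[The source also assigns SYSMIS or 0 to the remaining cells, depending on $z_{ij}$, $n_{ij}$, and $\hat{m}_{ij}$; those conditions are not recoverable from this copy.]

If any $G_{ij}^2$ is system missing, then $G^2$ is also system missing.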
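The two goodness-of-fit statistics can be sketched for the ordinary cells only. The sketch below assumes cells with $z_{ij} \le 0$ are skipped, treats a sampling zero ($n_{ij} = 0$) as contributing 0 to $G^2$, and raises an error for a cell with $z_{ij} > 0$ but $\hat{m}_{ij} \le 0$ instead of modelling the SYSMIS rules. The function name `chi_square_statistics` is hypothetical.

```python
import math

def chi_square_statistics(n, m_hat, z):
    """Return (X2, G2) summed over cells with z_ij > 0 and m_hat_ij > 0."""
    X2 = 0.0
    G2 = 0.0
    for n_row, m_row, z_row in zip(n, m_hat, z):
        for n_ij, m_ij, z_ij in zip(n_row, m_row, z_row):
            if z_ij <= 0:
                continue  # structural zero: no cellwise statistic
            if m_ij <= 0:
                raise ValueError("cell with z > 0 and m_hat <= 0 is outside this sketch")
            X2 += (n_ij - m_ij) ** 2 / m_ij
            if n_ij > 0:  # assumed: a sampling zero adds nothing to G2
                G2 += 2.0 * n_ij * math.log(n_ij / m_ij)
    return X2, G2
```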

Degrees of Freedom

The degrees of freedom for each statistic is defined as $a = c(r - 1) - p - E$, where $E$ is the number of cells with $z_{ij} \le 0$ or $\hat{m}_{ij} = 0$.

Significance Level

The significance level (or the p value) for the Pearson chi-square statistic is $\text{Prob}(\chi_a^2 > X^2)$ and that for the likelihood-ratio chi-square statistic is $\text{Prob}(\chi_a^2 > G^2)$. In both cases, $\chi_a^2$ is the central chi-square distribution with $a$ degrees of freedom.

Analysis of Dispersion (Logit Models Only)

SPSS provides the analysis of dispersion based on two types of dispersion: entropy and concentration. The following definitions are used:

S(A) Dispersion due to the model

S(B|A) Dispersion due to residuals

S(B) Total dispersion

R = S(A)/S(B) Measure of association

where $S(A) + S(B|A) = S(B)$. Also define

$$ \hat{\pi}_i = \frac{\sum_{j=1}^{c} \hat{m}_{ij}}{\sum_{j=1}^{c} N_j} $$

$$ \hat{\pi}_{i|j} = \frac{\hat{m}_{ij}}{N_j} $$

The bounds are $0 \le \hat{\pi}_i \le 1$ and $0 \le \hat{\pi}_{i|j} \le 1$.

Entropy

$$ S(B) = N \sum_{i=1}^{r} S_i(B) $$

where

$$ S_i(B) = \begin{cases} -\hat{\pi}_i \log(\hat{\pi}_i) & \text{if } \hat{\pi}_i > 0 \\ 0 & \text{if } \hat{\pi}_i = 0 \end{cases} $$

Residuals

Goodness-of-fit statistics provide only broad summaries of how models fit data. The pattern of lack of fit is revealed in cell-by-cell comparisons of observed and fitted cell counts.

Simple Residuals

The simple residual of the (i,j)th cell is

$$ r_{ij} = \begin{cases} n_{ij} - \hat{m}_{ij} & \text{if } z_{ij} > 0 \\ \text{SYSMIS} & \text{if } z_{ij} \le 0 \end{cases} $$

Standardized Residuals

The standardized residual for the (i,j)th cell is

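$$ r_{ij}^{S} = \begin{cases} \dfrac{n_{ij} - \hat{m}_{ij}}{\sqrt{\hat{m}_{ij} \left( 1 - \hat{m}_{ij} / N_j \right)}} & \text{if } z_{ij} > 0 \text{ and } 0 < \hat{m}_{ij} < N_j \\ 0 & \text{if } z_{ij} > 0 \text{ and } n_{ij} = \hat{m}_{ij} \\ \text{SYSMIS} & \text{otherwise} \end{cases} $$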

The standardized residuals are also known as Pearson residuals even though $\sum_{j=1}^{c} \sum_{i=1}^{r} \left( r_{ij}^{S} \right)^2 \ne X^2$. Although the standardized residuals are asymptotically normal, their asymptotic variances are less than 1.

Adjusted Residuals

The adjusted residual is the simple residual divided by its estimated standard error. Its definition and applications first appeared in Haberman (1973) and reappeared on page 454 of Haberman (1979). This statistic for the (i,j)th cell is

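$$ r_{ij}^{A} = \begin{cases} \left( n_{ij} - \hat{m}_{ij} \right) / s_{ij} & \text{if } z_{ij} > 0 \text{ and } \hat{m}_{ij} > 0 \\ 0 & \text{if } z_{ij} > 0 \text{ and } \hat{m}_{ij} = n_{ij} \\ \text{SYSMIS} & \text{otherwise} \end{cases} $$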

where

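$$ s_{ij} = \sqrt{ \hat{m}_{ij} \left( 1 - \frac{\hat{m}_{ij}}{N_j} - \hat{m}_{ij} \sum_{k=1}^{p} \sum_{l=1}^{p} (x_{ijk} - \theta_{jk})(x_{ijl} - \theta_{jl}) h^{kl} \right) } $$

$h^{kl}$ is the (k,l)th element of $\mathbf{H}^{-1}(\hat{\boldsymbol{\beta}})$. The adjusted residuals are asymptotically standard normal.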

Deviance Residuals

Pierce and Schafer (1986) and McCullagh and Nelder (1989) define the signed square root of the individual contribution to the G^2 statistic as the deviance residual. This statistic for the (i,j)th cell is

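$$ r_{ij}^{D} = \text{sign}\left( n_{ij} - \hat{m}_{ij} \right) \sqrt{d_{ij}} $$

where

$$ d_{ij} = \begin{cases} 2 \left[ n_{ij} \log\left( n_{ij} / \hat{m}_{ij} \right) - \left( n_{ij} - \hat{m}_{ij} \right) \right] & \text{if } z_{ij} > 0,\ \hat{m}_{ij} > 0, \text{ and } n_{ij} > 0 \\ 2 \hat{m}_{ij} & \text{if } z_{ij} > 0,\ \hat{m}_{ij} > 0, \text{ and } n_{ij} = 0 \\ 0 & \text{if } z_{ij} > 0,\ \hat{m}_{ij} = 0, \text{ and } n_{ij} = 0 \\ \text{SYSMIS} & \text{otherwise} \end{cases} $$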

For multinomial sampling, the individual contribution to the $G^2$ statistic is only $2 n_{ij} \log(n_{ij} / \hat{m}_{ij})$, but this is negative when $n_{ij} < \hat{m}_{ij}$. Thus, an extra term $-2(n_{ij} - \hat{m}_{ij})$ is added to it so that $d_{ij} \ge 0$ for all i and j. However, we still have

$$ \sum_{j=1}^{c} \sum_{i=1}^{r} \left( r_{ij}^{D} \right)^2 = G^2 $$
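A small sketch of the deviance residual for a single ordinary cell, assuming $z_{ij} > 0$ and $\hat{m}_{ij} > 0$ and using the Poisson-type contribution $d_{ij} = 2[n_{ij} \log(n_{ij}/\hat{m}_{ij}) - (n_{ij} - \hat{m}_{ij})]$, with $d_{ij} = 2\hat{m}_{ij}$ when $n_{ij} = 0$. The function name `deviance_residual` is hypothetical and the SYSMIS cases are not modelled.

```python
import math

def deviance_residual(n_ij, m_ij):
    """Signed square root of d_ij for a cell with z_ij > 0 and m_ij > 0."""
    if n_ij > 0:
        d = 2.0 * (n_ij * math.log(n_ij / m_ij) - (n_ij - m_ij))
    else:
        d = 2.0 * m_ij  # limit of the expression above as n_ij -> 0
    d = max(d, 0.0)     # guard against a tiny negative rounding error
    if n_ij == m_ij:
        return 0.0
    return math.copysign(math.sqrt(d), n_ij - m_ij)
```

Within one setting the simple residuals sum to zero, so the extra terms cancel and the squared deviance residuals add up to $G^2$; with observed counts (6, 2) and fitted counts (4, 4) both sides equal $2(6 \log 1.5 + 2 \log 0.5)$.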

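where

$$ V = \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij}^2 \hat{m}_{ij} - \sum_{j=1}^{c} \frac{1}{N_j} \left( \sum_{i=1}^{r} d_{ij} \hat{m}_{ij} \right)^2 - \sum_{k=1}^{p} \sum_{l=1}^{p} f_k f_l h^{kl} $$

$$ f_k = \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \hat{m}_{ij} \left( x_{ijk} - \theta_{jk} \right) $$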

Generalized Log-Odds Ratio

Consider a linear combination of the natural logarithm of cell counts

$$ \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(m_{ij}) \tag{12} $$

where $d_{ij}$ are real numbers with the restriction

$$ \sum_{i=1}^{r} d_{ij} = 0, \qquad j = 1, \ldots, c $$

The quantity in (12) is estimated by

$$ \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(\hat{m}_{ij}) = \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(z_{ij}) + \sum_{j=1}^{c} \sum_{i=1}^{r} \sum_{k=1}^{p} d_{ij} x_{ijk} \hat{\beta}_k \tag{13} $$

The variance of (13) is

$$ \text{var}\left( \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(\hat{m}_{ij}) \right) = \sum_{k=1}^{p} \sum_{l=1}^{p} w_k w_l h^{kl} \tag{14} $$

where

$$ w_k = \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} x_{ijk}, \qquad k = 1, \ldots, p $$
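The estimate in (13) and its variance in (14) can be sketched as follows. The function name `generalized_log_odds_ratio`, the nested-list layout, and passing $\mathbf{H}^{-1}$ as a precomputed argument `H_inv` are illustrative assumptions; cells with $d_{ij} = 0$ are skipped so that a zero fitted count in an unused cell does not break the logarithm.

```python
import math

def generalized_log_odds_ratio(d, m_hat, x, H_inv):
    """Return (estimate, variance) for sum_j sum_i d_ij * log(m_hat_ij)."""
    r, c, p = len(d), len(d[0]), len(H_inv)
    estimate = sum(
        d[i][j] * math.log(m_hat[i][j])
        for i in range(r) for j in range(c) if d[i][j] != 0
    )
    # w_k = sum_j sum_i d_ij * x_ijk
    w = [sum(d[i][j] * x[i][j][k] for i in range(r) for j in range(c)) for k in range(p)]
    # Equation (14): quadratic form in w with the elements h^{kl} of H^{-1}
    variance = sum(w[k] * w[l] * H_inv[k][l] for k in range(p) for l in range(p))
    return estimate, variance
```

With fitted counts 6 and 2, coefficients $d = (1, -1)$, and $h^{11} = 1/1.5$, the estimate is $\log 3$ and the variance is $2/3$, which matches the familiar $1/6 + 1/2$ for a log odds.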

Wald Statistic

The null hypothesis is

$$ H_0: \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(m_{ij}) = 0 $$

The Wald statistic is

$$ W = \frac{\left( \sum_{j=1}^{c} \sum_{i=1}^{r} d_{ij} \log(\hat{m}_{ij}) \right)^2}{\sum_{k=1}^{p} \sum_{l=1}^{p} w_k w_l h^{kl}} $$

Under $H_0$, $W$ is asymptotically distributed as chi-square with 1 degree of freedom. The significance level is $\text{Prob}(\chi_1^2 \ge W)$. Note: $W$ will be system missing if (14) is 0.
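A brief sketch of the Wald test, taking the estimate (13) and variance (14) as inputs. For 1 degree of freedom the chi-square tail probability equals $\text{erfc}(\sqrt{W/2})$, so only the standard library is needed. The function name `wald_test` and the use of `None` to stand in for a system-missing value are assumptions.

```python
import math

def wald_test(estimate, variance):
    """Return (W, p_value), or None when the variance (14) is 0."""
    if variance == 0:
        return None  # stands in for SYSMIS
    W = estimate ** 2 / variance
    # Prob(chi-square with 1 df >= W) = Prob(|Z| >= sqrt(W)) = erfc(sqrt(W / 2))
    p_value = math.erfc(math.sqrt(W / 2.0))
    return W, p_value
```

For example, an estimate whose square is 3.841459 times its variance gives a significance level of about 0.05.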

The cell count is

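[Equation not recoverable from the source: piecewise definition of the cell count $n_{ij}$ as a sum of $n_{ijs}^{+}$ over $1 \le s \le v_{ij}$.]

where

[Equation not recoverable from the source: piecewise definition of $n_{ijs}^{+}$ in terms of $n_{ijs}$ and $z_{ijs}$.]

and $\sum_{1 \le s \le v_{ij}}$ means summation over the range of $s$ with the terms $z_{ijs} > 0$. The cell weight value is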

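[Equation not recoverable from the source: piecewise definition of the cell weight $z_{ij}$ involving $\sum_{1 \le s \le v_{ij}} n_{ijs}^{+} z_{ijs}$, $\sum_{1 \le s \le v_{ij}} z_{ijs}$, $n_{ij}$, and $v_{ij}$.]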

If no variable is specified as the cell weight variable, then all cases have unit cell weights by default.

The cell covariate value is

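[Equation not recoverable from the source: piecewise definition of the cell covariate $x_{ij}$ involving $\sum_{1 \le s \le v_{ij}} n_{ijs}^{+} x_{ijs}$, $\sum_{1 \le s \le v_{ij}} x_{ijs}$, $n_{ij}$, and $v_{ij}$.]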

The cell GRESID coefficient is

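[Equation not recoverable from the source: piecewise definition of the cell GRESID coefficient $c_{ij}$ involving $\sum_{1 \le s \le v_{ij}} n_{ijs}^{+} c_{ijs}$, $\sum_{1 \le s \le v_{ij}} c_{ijs}$, $n_{ij}$, and $v_{ij}$.]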

There are no defaults for the GRESID coefficients.

The cell GLOR coefficient is

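[Equation not recoverable from the source: piecewise definition of the cell GLOR coefficient $e_{ij}$ involving $\sum_{1 \le s \le v_{ij}} n_{ijs}^{+} e_{ijs}$, $\sum_{1 \le s \le v_{ij}} e_{ijs}$, $n_{ij}$, and $v_{ij}$.]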

There are no defaults for the GLOR coefficients.
