















An in-depth explanation of the computations and statistical inference involved in analysis of variance (ANOVA) and the multivariate general linear model (GLM). Topics covered include the formation of augmented row vectors, computation of the cross-product and sums-of-squares matrices, the t statistic and partial eta squared, the noncentrality parameter and observed power, correlation, standard errors, Cook's distance and leverage, and Wilks' lambda. The document also discusses Pillai's trace, Roy's largest root, and the likelihood ratio test statistic used in Bartlett's test of sphericity.
















GLM (general linear model) is a general procedure for analysis of variance and covariance, as well as regression. It can be used for both univariate and multivariate designs. Repeated measures analysis is also available. Algorithms that apply only to repeated measures are in the chapter GLM Repeated Measures. For information on post hoc tests, see Appendix 10. For sums of squares, see Appendix 11. For distribution functions, see Appendix 12. For Box’s M test, see Appendix 14.
The following notation is used throughout this chapter. Unless otherwise stated, all vectors are column vectors and all quantities are known.
$n$  Number of cases.
$N$  Effective sample size.
$p$  Number of parameters (including the constant, if it exists) in the model.
$r$  Number of dependent variables in the model.
$Y$  $n \times r$ matrix of dependent variables. The rows are the cases and the columns are the dependent variables. The $i$th row is $y_i'$, $i = 1, \ldots, n$.
$X$  $n \times p$ design matrix. The rows are the cases and the columns are the parameters. The $i$th row is $x_i'$, $i = 1, \ldots, n$.
$r_X$  Number of nonredundant columns in the design matrix. Also the rank of the design matrix.
$w_i$  Regression weight of the $i$th case.
$f_i$  Frequency weight of the $i$th case.
$B$  $p \times r$ unknown parameter matrix. The columns are the dependent variables. The $j$th column is $b_j$, $j = 1, \ldots, r$.
$\Sigma$  $r \times r$ unknown covariance matrix of the rows of $Y$. The $(i, j)$th element is $\sigma_{ij}$, $i = 1, \ldots, r$, $j = 1, \ldots, r$.
Model
The model is $Y = XB$, and $y_i'$ is independently distributed as an $r$-dimensional normal distribution with mean $x_i'B$ and covariance matrix $w_i^{-1}\Sigma$. The $i$th case is ignored if $w_i \le 0$.
Frequency Weight and Total Sample Size
The frequency weight $f_i$ is the number of replications represented by an SPSS case; therefore, the weight must be a non-negative integer. It is computed by rounding the value in the SPSS weight variable to the nearest integer. The total sample size is

$$N = \sum_{i=1}^{n} f_i \, I(w_i > 0),$$

where $I(w_i > 0) = 1$ if $w_i > 0$ and is equal to 0 otherwise.
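The total sample size rule can be sketched in a few lines of Python. This is only an illustration: the function name `effective_sample_size` is hypothetical, and the round-half-up tie rule is an assumption, since the text says only that the weight value is rounded to the nearest integer.

```python
import math

def effective_sample_size(freq_values, reg_weights):
    """Return N = sum of f_i over the cases whose regression weight w_i > 0."""
    total = 0
    for raw_f, w in zip(freq_values, reg_weights):
        # Round to the nearest integer; rounding ties upward is an assumption.
        f = math.floor(raw_f + 0.5)
        # Cases with a non-positive regression weight are ignored.
        if f > 0 and w > 0:
            total += f
    return total

# Four cases; the third has w_i = 0 and therefore contributes nothing.
print(effective_sample_size([2, 1.4, 3.6, 1], [1.0, 0.5, 0.0, 2.0]))  # 4
```

Here the rounded frequencies are 2, 1, 4, and 1, and only the first, second, and fourth cases count.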
The Cross-Product and Sums-of-Squares Matrices
To prepare for the SWEEP operation, an augmented row vector of length $(p + r)$ is formed:

$$z_i' = (x_i', \; y_i')$$

Then the $(p + r) \times (p + r)$ matrix is computed:

$$Z'WZ = \sum_{i=1}^{n} f_i w_i z_i z_i'.$$

This matrix is partitioned as

$$Z'WZ = \begin{bmatrix} X'WX & X'WY \\ Y'WX & Y'WY \end{bmatrix}$$
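As a rough illustration of how the augmented cross-product matrix accumulates case by case, the following Python sketch builds $Z'WZ$ from plain nested lists. The function name `cross_product_matrix` and the toy data are assumptions made for the example; skipping cases with $w_i \le 0$ follows the rule stated in the Model section.

```python
def cross_product_matrix(X, Y, f, w):
    """Accumulate Z'WZ = sum_i f_i * w_i * z_i z_i', with z_i' = (x_i', y_i')."""
    size = len(X[0]) + len(Y[0])
    zwz = [[0.0] * size for _ in range(size)]
    for x_i, y_i, f_i, w_i in zip(X, Y, f, w):
        if w_i <= 0:
            continue  # cases with non-positive regression weight are ignored
        z = list(x_i) + list(y_i)  # augmented row vector of length p + r
        for a in range(size):
            for b in range(size):
                zwz[a][b] += f_i * w_i * z[a] * z[b]
    return zwz

# Toy data: intercept plus one predictor (p = 2) and one dependent variable (r = 1).
X = [[1, 0], [1, 1], [1, 2]]
Y = [[1.0], [2.0], [4.0]]
zwz = cross_product_matrix(X, Y, f=[1, 1, 1], w=[1, 1, 1])
for row in zwz:
    print(row)
```

The upper-left 2 × 2 block of the result is $X'WX$, the last column holds $X'WY$, and the bottom-right element is $Y'WY$.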
The t Statistic
For testing $H_0: b_{ij} = 0$ versus $H_1: b_{ij} \ne 0$, the $t$ statistic is

$$t = \begin{cases} \hat{b}_{ij} / \mathrm{se}(\hat{b}_{ij}) & \text{if the standard error is positive} \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$

The significance value for this statistic is $2\,\bigl(1 - \mathrm{CDF.T}(|t|, N - r_X)\bigr)$, where CDF.T is the SPSS function for the cumulative $t$ distribution.
Partial Eta Squared Statistic
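$$\eta^2 = \begin{cases} \dfrac{\hat{b}_{ij}^2}{\hat{b}_{ij}^2 + (N - r_X)\,\mathrm{var}(\hat{b}_{ij})} & \text{if } r_X \ne N \text{ and the denominator is positive} \\ 1 & \text{if } r_X = N \text{ but } \hat{b}_{ij} \ne 0 \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$

The value should be within $0 \le \eta^2 \le 1$.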
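The three cases of the partial eta squared rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, `None` stands in for SYSMIS, and the formula used is the estimate squared divided by the sum of the estimate squared and $(N - r_X)$ times its variance.

```python
def partial_eta_squared(b, var_b, N, r_x):
    """Partial eta squared for one parameter estimate; None stands in for SYSMIS."""
    if r_x != N:
        denom = b * b + (N - r_x) * var_b
        return b * b / denom if denom > 0 else None
    # Saturated case (r_X = N): defined as 1 only when the estimate is nonzero.
    return 1.0 if b != 0 else None

# Example: estimate 2 with standard error 0.5 (variance 0.25), N = 20, r_X = 4.
print(partial_eta_squared(2.0, 0.25, 20, 4))
```

In this example $t = 2 / 0.5 = 4$, and the result equals $t^2 / (t^2 + N - r_X) = 16 / 32$, which shows how the statistic relates to the $t$ statistic for the same parameter.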
Noncentrality Parameter
$$c = |t|$$
Observed Power
$$p = \begin{cases} 1 - \mathrm{NCDF.T}(t_c, N - r_X, c) + \mathrm{NCDF.T}(-t_c, N - r_X, c) & \text{if } r_X \ne N \\ \text{SYSMIS} & \text{if } r_X = N \text{ or any arguments to NCDF.T or IDF.T are SYSMIS} \end{cases}$$

where $t_c = \mathrm{IDF.T}(1 - \alpha/2, N - r_X)$ and $\alpha$ is the user-specified chance of Type I error ($0 < \alpha < 1$). NCDF.T and IDF.T are the SPSS functions for the cumulative noncentral $t$ distribution and for the inverse cumulative $t$ distribution, respectively. The default value is $\alpha = 0.05$. The observed power should be within $0 \le p \le 1$.
Confidence Interval
For the $p\%$ level, the individual univariate confidence interval for the parameter is

$$\hat{b}_{ij} \pm t_\alpha \, \mathrm{se}(\hat{b}_{ij})$$

where $t_\alpha = \mathrm{IDF.T}\bigl(0.5\,(1 + p/100), N - r_X\bigr)$ for $i = 1, \ldots, p$; $j = 1, \ldots, r$. The default value of the confidence level $p$ is 95 ($0 < p < 100$).
Correlation
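$$\mathrm{corr}(\hat{b}_{ij}, \hat{b}_{rs}) = \begin{cases} \dfrac{\hat{\sigma}_{js}\, g_{ir}}{\mathrm{se}(\hat{b}_{ij}) \times \mathrm{se}(\hat{b}_{rs})} & \text{if the standard errors are positive} \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$

for $i, r = 1, \ldots, p$; $j, s = 1, \ldots, r$.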
Estimated Marginal Means
Estimated marginal means (EMMEANS) are computed as the generic $l'\hat{B}m$ expression with appropriate $l$ and $m$ vectors. $l$ is a column vector of length $p$ and $m$ is a column vector of length $r$. Since the $l$ vector is chosen to be always estimable, the quantity $l'\hat{B}m$ is in fact the estimated modified marginal means (Searle, Speed, and Milliken, 1980). When covariates (or products of covariates) are present in the effects, the overall means of the covariates (or products of covariates) are used in the $l$ matrix. Suppose X and Y are covariates and they appear as XY in an effect; then the mean of XY is used instead of the product of the mean of X and the mean of Y.
For each level combination of the between-subjects factors in TABLES, identify the nonmissing cases with positive case weights and positive regression weights that are associated with the current level combination. Suppose the cases are classified by three between-subjects factors: A, B, and C. Now A and B are specified in TABLES and the current level combination is A=1 and B=2. A case in the cell A=1, B=2, and C=3 is associated with the current level combination,
The t statistic is
$$t = \begin{cases} l'\hat{B}m \,/\, \mathrm{se}(l'\hat{B}m) & \text{if } \mathrm{se}(l'\hat{B}m) > 0 \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$

If the $t$ statistic is not system missing, then the significance is computed based on a $t$ distribution with $N - r_X$ degrees of freedom.
Between-Subjects Factor
Suppose the $l$ vectors are indexed by the level of the between-subjects factor as $l_{i_1, \ldots, i_b}$, $i_s = 1, \ldots, n_s$ and $s = 1, \ldots, b$, where $n_s$ is the number of levels of between-subjects factor $s$ and $b$ is the number of between-subjects factors specified inside TABLES. The difference in estimated marginal means of level $i_s$ and level $i_s'$ of between-subjects factor $s$ at fixed levels of other between-subjects factors is

$$\bigl(l_{i_1, \ldots, i_{s-1}, i_s, i_{s+1}, \ldots, i_b} - l_{i_1, \ldots, i_{s-1}, i_s', i_{s+1}, \ldots, i_b}\bigr)' \hat{B} m \quad \text{for } i_s, i_s' = 1, \ldots, n_s;\ i_s \ne i_s'.$$

The standard error of the difference is computed by substituting $l_{i_1, \ldots, i_{s-1}, i_s, i_{s+1}, \ldots, i_b} - l_{i_1, \ldots, i_{s-1}, i_s', i_{s+1}, \ldots, i_b}$ for $l$ in (1).
Within-Subjects Factor
Suppose the $m$ vectors are indexed by level of the within-subjects factor as $m_{j_1, \ldots, j_w}$, $j_s = 1, \ldots, n_s$ and $s = 1, \ldots, w$, where $n_s$ is the number of levels of within-subjects factor $s$ and $w$ is the number of within-subjects factors specified inside TABLES. The difference in estimated marginal means of level $j_s$ and level $j_s'$ of within-subjects factor $s$ at fixed levels of other within-subjects factors is

$$l'\hat{B}\bigl(m_{j_1, \ldots, j_{s-1}, j_s, j_{s+1}, \ldots, j_w} - m_{j_1, \ldots, j_{s-1}, j_s', j_{s+1}, \ldots, j_w}\bigr) \quad \text{for } j_s, j_s' = 1, \ldots, n_s;\ j_s \ne j_s'.$$

The standard error of the difference is computed by substituting $m_{j_1, \ldots, j_{s-1}, j_s, j_{s+1}, \ldots, j_w} - m_{j_1, \ldots, j_{s-1}, j_s', j_{s+1}, \ldots, j_w}$ for $m$ in (1).
The $(1 - \alpha) \times 100\%$ confidence interval is

$$l'\hat{B}m \pm t_{1-\alpha/2;\, N - r_X} \times \mathrm{se}(l'\hat{B}m)$$

and $t_{1-\alpha/2;\, N - r_X}$ is the $(1 - \alpha/2) \times 100\%$ percentile of a $t$ distribution with $N - r_X$ degrees of freedom. No confidence interval is computed if $N - r_X \le 0$.
Saved Values
Temporary variables can be added to the working data file. These include predicted values, residuals, and diagnostics.
The $n \times r$ matrix of predicted values is $\hat{Y} = X\hat{B}$. The $i$th row of $\hat{Y}$ is $\hat{y}_i' = x_i'\hat{B}$, $i = 1, \ldots, n$. Let the elements of $\hat{Y}$ be $\hat{y}_{ij}$ and the elements of $XGX'$ be $\pi_{ij}$.
The standard error of $\hat{y}_{ij}$ is

$$\mathrm{se}(\hat{y}_{ij}) = \sqrt{\hat{\sigma}_{jj}\, \pi_{ii}} \quad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, r$$

The weighted predicted value of the $i$th case is $\sqrt{w_i}\, \hat{y}_i'$.
The $n \times r$ matrix of residuals is $\hat{E} = Y - \hat{Y}$.
The $i$th row of $\hat{E}$ is $\hat{e}_i' = y_i' - \hat{y}_i'$, $i = 1, \ldots, n$.
Let the elements of $\hat{E}$ be $\hat{e}_{ij}$; then

$$\hat{e}_{ij} = y_{ij} - \hat{y}_{ij} \quad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, r$$

The weighted residual is $\sqrt{w_i}\, \hat{e}_i'$.
Cook’s Distance
Cook's Distance $D$ measures the change to the solution that results from omitting each observation. The formula is

$$D_{ij} = \left( \frac{\hat{e}_{ij}}{\sqrt{\hat{\sigma}_{jj}\,(1/w_i - \pi_{ii})}} \right)^{2} \left( \frac{\pi_{ii}}{1/w_i - \pi_{ii}} \right) \frac{1}{r_X}$$

for $i = 1, \ldots, n$; $j = 1, \ldots, r$. This formula is equivalent to

$$D_{ij} = \left( \frac{\hat{e}_{ij}}{\mathrm{se}(\hat{e}_{ij})} \right)^{2} \left( \frac{\mathrm{se}(\hat{y}_{ij})}{\mathrm{se}(\hat{e}_{ij})} \right)^{2} \Big/ \, r_X$$

provided $w_i > 0$ and $\mathrm{se}(\hat{e}_{ij}) > 0$.
When $w_i \le 0$ or $\mathrm{se}(\hat{e}_{ij}) = 0$, $D_{ij}$ is system missing.
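The equivalence of the two Cook's distance expressions can be checked numerically with a short Python sketch. The function names are hypothetical, `None` stands in for the system-missing value, and the sketch assumes $\mathrm{se}(\hat{e}_{ij}) = \sqrt{\hat{\sigma}_{jj}(1/w_i - \pi_{ii})}$ and $\mathrm{se}(\hat{y}_{ij}) = \sqrt{\hat{\sigma}_{jj}\pi_{ii}}$, which is what makes the two forms agree.

```python
import math

def cooks_distance(e_ij, sigma_jj, pi_ii, w_i, r_x):
    """Cook's distance from the residual, error variance, and leverage term."""
    if w_i <= 0:
        return None  # system missing
    gap = 1.0 / w_i - pi_ii
    if sigma_jj * gap <= 0:
        return None  # se(e_ij) would not be positive
    return (e_ij ** 2 / (sigma_jj * gap)) * (pi_ii / gap) / r_x

def cooks_distance_from_se(e_ij, se_e, se_y, r_x):
    """Equivalent form written with the two standard errors."""
    return (e_ij / se_e) ** 2 * (se_y / se_e) ** 2 / r_x

# Hypothetical values for one case and one dependent variable.
e_ij, sigma_jj, pi_ii, w_i, r_x = 1.5, 4.0, 0.2, 1.0, 3
se_e = math.sqrt(sigma_jj * (1.0 / w_i - pi_ii))  # assumed form of se(e_ij)
se_y = math.sqrt(sigma_jj * pi_ii)                # assumed form of se(y_ij)
d_direct = cooks_distance(e_ij, sigma_jj, pi_ii, w_i, r_x)
d_from_se = cooks_distance_from_se(e_ij, se_e, se_y, r_x)
print(d_direct, d_from_se)
```

Both calls return the same value up to floating point error, because the variance $\hat{\sigma}_{jj}$ cancels in the ratio of the two standard errors.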
Leverage (Uncentered)
The leverage for the $i$th case ($i = 1, \ldots, n$) for all dependent variables is

$$\mathrm{LEVER}_i = \begin{cases} w_i\, \pi_{ii} & \text{if } w_i > 0 \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$
Hypothesis Testing
Let L be an l × p known matrix, M be an r × m known matrix and K be an l × m known matrix. The test hypotheses H 0 : LBM = K versus H 1 : LBM ≠ K are testable if and only if LB is estimable. The following results apply to testable hypotheses only. Nontestable hypotheses are excluded.
The hypothesis SSCP matrix is $S_H = (L\hat{B}M - K)'(LGL')^{-1}(L\hat{B}M - K)$ and the error SSCP matrix is $S_E = M'SM$.
Four test statistics, based on the eigenvalues of $S_E^{-1}S_H$, are available: Wilks' lambda, Hotelling-Lawley trace, Pillai's trace, and Roy's largest root. Let the eigenvalues of $S_E^{-1}S_H$ be $\lambda_1 \ge \cdots \ge \lambda_{r_E} \ge 0$ and $\lambda_{r_E+1}, \ldots, \lambda_m = 0$, and let $r_E = \mathrm{rank}(S_E)$; $s = \min(l, r_E)$; $n_e = n - r_X$; $m^* = \tfrac{1}{2}\bigl(|r_E - l| - 1\bigr)$; $n^* = \tfrac{1}{2}(n_e - r_E - 1)$.
Wilks' lambda is

$$\Lambda = \frac{\det(S_E)}{\det(S_H + S_E)} = \prod_{k=1}^{m} \frac{1}{1 + \lambda_k}$$
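When $H_0$ is true, the $F$ statistic

$$F = \frac{1 - \Lambda^{1/\tau}}{\Lambda^{1/\tau}} \times \frac{\varsigma\tau - 2\upsilon}{l\, r_E}$$

follows asymptotically an $F$ distribution, where

$$\varsigma = n_e - \tfrac{1}{2}(r_E - l + 1)$$

$$\upsilon = \tfrac{1}{4}(l\, r_E - 2)$$

$$\tau = \begin{cases} \sqrt{\dfrac{l^2 r_E^2 - 4}{l^2 + r_E^2 - 5}} & \text{if } l^2 + r_E^2 - 5 > 0 \\ 1 & \text{otherwise} \end{cases}$$

The degrees of freedom are $(l\, r_E,\ \varsigma\tau - 2\upsilon)$. The $F$ statistic is exact if $s = 1, 2$. See Rao (1951) and Section 8c.5 of Rao (1973) for details.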
The eta-squared statistic is $\eta^2 = 1 - \Lambda^{1/s}$.
The noncentrality parameter is $\lambda = (\varsigma\tau - 2\upsilon)\,\eta^2 / (1 - \eta^2)$.
The power is $1 - \mathrm{NCDF.F}(F_\alpha,\ l\, r_E,\ \varsigma\tau - 2\upsilon,\ \lambda)$, where $F_\alpha$ is the upper $100\alpha$ percentage point of the central $F$ distribution, and $\alpha$ is user-specified on the ALPHA keyword on the CRITERIA subcommand.
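To make the Wilks' lambda computations concrete, the Python sketch below goes from the eigenvalues of $S_E^{-1}S_H$ to the approximate $F$ statistic and its degrees of freedom. It is an illustration only: the function name and the example inputs are hypothetical, and the formulas coded are Rao's approximation with $\varsigma = n_e - \tfrac{1}{2}(r_E - l + 1)$, $\upsilon = \tfrac{1}{4}(l r_E - 2)$, and $\tau = \sqrt{(l^2 r_E^2 - 4)/(l^2 + r_E^2 - 5)}$ when the denominator is positive, and 1 otherwise.

```python
import math

def wilks_lambda_f(eigenvalues, l, r_e, n_e):
    """Return (Lambda, F, df1, df2) using Rao's F approximation."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)  # Lambda is the product of 1 / (1 + lambda_k)
    zeta = n_e - 0.5 * (r_e - l + 1)
    upsilon = 0.25 * (l * r_e - 2)
    d = l * l + r_e * r_e - 5
    tau = math.sqrt((l * l * r_e * r_e - 4) / d) if d > 0 else 1.0
    df1 = l * r_e
    df2 = zeta * tau - 2 * upsilon
    root = lam ** (1.0 / tau)
    f = (1 - root) / root * df2 / df1
    return lam, f, df1, df2

# Hypothetical example: two nonzero eigenvalues, l = 2, r_E = 2, n_e = 20.
lam, f, df1, df2 = wilks_lambda_f([1.0, 0.25], l=2, r_e=2, n_e=20)
print(lam, f, df1, df2)
```

With these inputs $\Lambda = 0.5 \times 0.8 = 0.4$, $\tau = 2$, and the degrees of freedom are $(4, 38)$. Turning $F$ into a significance value or a power would need central and noncentral $F$ distribution functions, which are outside the scope of this sketch.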
For Pillai's trace $V$, the $F$ statistic follows asymptotically an $F$ distribution with degrees of freedom $\bigl(s(2m^* + s + 1),\ s(2n^* + s + 1)\bigr)$. The $F$ statistic is exact if $s = 1$.
The eta-squared statistic is $\eta^2 = V / s$.
The noncentrality parameter is $\lambda = s(2n^* + s + 1)\,\eta^2 / (1 - \eta^2)$.
The power is $1 - \mathrm{NCDF.F}\bigl(F_\alpha,\ s(2m^* + s + 1),\ s(2n^* + s + 1),\ \lambda\bigr)$, where $F_\alpha$ is the upper $100\alpha$ percentage point of the central $F$ distribution and $\alpha$ is user-specified on the ALPHA keyword on the CRITERIA subcommand.
Roy's largest root is

$$\Theta = \lambda_1$$

which is the largest eigenvalue of $S_E^{-1}S_H$. When $H_0$ is true, the $F$ statistic is

$$F = \Theta\,(n_e - \omega + r_H)/\omega$$

where $\omega = \max(l, r_E)$. This statistic is an upper bound of $F$ that yields a lower bound on the significance level. The degrees of freedom are $(\omega,\ n_e - \omega + r_H)$. The $F$ statistic is exact if $s = 1$.
The eta-squared statistic is $\eta^2 = \Theta / (1 + \Theta)$.
The noncentrality parameter is $\lambda = (n_e - \omega + r_H)\,\eta^2 / (1 - \eta^2)$.
The power is $1 - \mathrm{NCDF.F}(F_\alpha,\ \omega,\ n_e - \omega + r_H,\ \lambda)$, where $F_\alpha$ is the upper $100\alpha$ percentage point of the central $F$ distribution and $\alpha$ is user-specified on the ALPHA keyword on the CRITERIA subcommand.
$$F = \frac{S_{H;i} / l}{S_{E;i} / (n - r_X)}, \quad i = 1, \ldots, m$$

where $S_{H;i}$ and $S_{E;i}$ are the $i$th diagonal elements of the matrices $S_H$ and $S_E$ respectively. Under the null hypothesis, the $F$ statistic has an $F$ distribution with degrees of freedom $(l,\ n - r_X)$.
The partial eta-squared statistic is $\eta^2 = S_{H;i} / (S_{H;i} + S_{E;i})$.
The noncentrality parameter is $\lambda = (n - r_X)\, S_{H;i} / S_{E;i}$.
The power is $1 - \mathrm{NCDF.F}(F_\alpha,\ l,\ n - r_X,\ \lambda)$, where $F_\alpha$ is the upper $100\alpha$ percentage point of the central $F$ distribution and $\alpha$ is user-specified on the ALPHA keyword on the CRITERIA subcommand.
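The per-variable quantities in this test depend only on the $i$th diagonal elements of the two SSCP matrices, so they are easy to sketch. The function below is a hypothetical illustration (the name `univariate_test` and the example numbers are not from the source) of the $F$ statistic, partial eta squared, and noncentrality parameter described above.

```python
def univariate_test(s_h, s_e, l, n, r_x):
    """Return (F, partial eta squared, noncentrality) for one dependent variable.

    s_h and s_e are the i-th diagonal elements of S_H and S_E.
    """
    df_error = n - r_x
    f = (s_h / l) / (s_e / df_error)
    eta_sq = s_h / (s_h + s_e)
    noncentrality = df_error * s_h / s_e
    return f, eta_sq, noncentrality

# Hypothetical example: S_H;i = 30, S_E;i = 60, l = 2, n = 23, r_X = 3.
f_stat, eta_sq, nc = univariate_test(30.0, 60.0, l=2, n=23, r_x=3)
print(f_stat, eta_sq, nc)
```

In this example $F = 15 / 3 = 5$ and the noncentrality parameter is 10, which equals $l \times F$, as follows directly from the two formulas.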
Bartlett’s Test of Sphericity
Bartlett’s test of sphericity is printed when the Residual SSCP matrix is requested.
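In Bartlett's test of sphericity the null hypothesis is $H_0: \Sigma = \sigma^2 I_r$ versus the alternative hypothesis $H_1: \Sigma \ne \sigma^2 I_r$, where $\sigma^2 > 0$ is unspecified and $I_r$ is an $r \times r$ identity matrix.
The likelihood ratio test statistic is

$$\lambda = \begin{cases} \dfrac{|S|^{n/2}}{\bigl(\mathrm{trace}(S)/r\bigr)^{nr/2}} & \text{if } \mathrm{trace}(S) > 0 \\ \text{SYSMIS} & \text{if } \mathrm{trace}(S) \le 0 \end{cases}$$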
$$1 - \mathrm{CDF.CHISQ}(c, f) - \omega_2 \bigl[\mathrm{CDF.CHISQ}(c, f + 4) - \mathrm{CDF.CHISQ}(c, f)\bigr]$$

where CDF.CHISQ is the SPSS function for the cumulative chi-square distribution. The significance is reset to zero whenever the computed value is less than zero due to floating point imprecision.
Custom Hypothesis Tests
The TEST subcommand offers custom hypothesis tests. The hypothesis term is any effect specified (either explicitly or implicitly) in the DESIGN subcommand. The error term can be a linear combination of effects that are specified in the DESIGN subcommand or a sum of squares with specified degrees of freedom. The TEST subcommand is available only for univariate analysis; therefore, an F statistic is computed. When the error term is a linear combination of effects and no value for degrees of freedom is specified, the error degrees of freedom is approximated by the Satterthwaite (1946) method.
The following notation is used in this section:
$S$  Number of effects in the linear combination.
$q_s$  Coefficient of the $s$th effect in the linear combination, $s = 1, \ldots, S$.
$l_s$  Degrees of freedom of the $s$th effect in the linear combination, $s = 1, \ldots, S$.
$MS_s$  Mean square of the $s$th effect in the linear combination, $s = 1, \ldots, S$.
$Q$  Linear combination of effects.
$l_Q$  Degrees of freedom of the linear combination.
$MS_Q$  Mean square of the linear combination.
Mean Squares
If the error term is a linear combination of effects, the error mean square is

$$MS_Q = \sum_{s=1}^{S} q_s \times MS_s$$

If the user supplied the mean squares, $MS_Q$ is equal to the number specified after the keyword VS. If $MS_Q < 0$, the custom error term is invalid; $MS_Q$ is set to the system-missing value and an error message is issued.
If $MS_Q \ge 0$ and the user did not supply the error degrees of freedom, then the error degrees of freedom is approximated using the Satterthwaite (1946) method. Define

$$d_s = \begin{cases} (q_s \times MS_s)^2 / l_s & \text{if } l_s > 0 \\ 0 & \text{otherwise} \end{cases}$$

Then $D = \sum_{s=1}^{S} d_s$. The approximate error degrees of freedom is

$$l_Q = \begin{cases} (MS_Q)^2 / D & \text{if } D > 0 \\ \text{SYSMIS} & \text{otherwise} \end{cases}$$

If $MS_Q \ge 0$ and the user supplied the error degrees of freedom, $l_Q$ is equal to the number following the keyword DF. If $l_Q < 0$, the custom degrees of freedom is invalid. In this case, $l_Q$ is equal to the system-missing value and an error message is issued.
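The Satterthwaite approximation for a linear combination of mean squares can be sketched as follows. The function name `satterthwaite_error_term` and the example values are hypothetical, and `None` stands in for the system-missing value; the sketch covers only the case where the error term is a linear combination of effects and no degrees of freedom are supplied.

```python
def satterthwaite_error_term(q, ms, df):
    """Return (MS_Q, l_Q) for the linear combination sum_s q_s * MS_s."""
    ms_q = sum(q_s * ms_s for q_s, ms_s in zip(q, ms))
    if ms_q < 0:
        return None, None  # invalid custom error term
    # d_s = (q_s * MS_s)^2 / l_s when l_s > 0, and 0 otherwise.
    d_total = sum((q_s * ms_s) ** 2 / l_s
                  for q_s, ms_s, l_s in zip(q, ms, df) if l_s > 0)
    l_q = ms_q ** 2 / d_total if d_total > 0 else None
    return ms_q, l_q

# Hypothetical error term: MS_1 + MS_2 - MS_3 with 2, 4, and 8 degrees of freedom.
ms_q, l_q = satterthwaite_error_term(q=[1, 1, -1], ms=[10.0, 8.0, 4.0], df=[2, 4, 8])
print(ms_q, l_q)
```

Here $MS_Q = 14$ and $D = 50 + 16 + 2 = 68$, so the approximate error degrees of freedom is $196 / 68$, a non-integer value, which is typical of this approximation.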
$\sigma_e^2$  Unknown variance of the residual term, $\sigma_e^2 > 0$.
$X_i$  The $n \times p_i$ design matrix, $i = 0, 1, \ldots, k$.
$\beta_0$  The length $p_0$ vector of parameters of the fixed effects.
$\beta_i$  The length $p_i$ vector of parameters of the $i$th random effect, $i = 1, \ldots, k$.
$L$  The $s \times p$ full row rank matrix. The rows are estimable functions. $s \ge 1$.
Relationships between these symbols and those defined at the beginning of the chapter are:

$$X = [\, X_0 \;\; X_1 \;\; \cdots \;\; X_k \,], \qquad B = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}$$
The mixed model is represented, following Rao (1973), as

$$Y = X_0 \beta_0 + \sum_{i=1}^{k} X_i \beta_i + e$$
The random vectors $\beta_1, \ldots, \beta_k$ and $e$ are assumed to be jointly independent. Moreover, the random vector $\beta_i$ is distributed as $N_{p_i}(0, \sigma_i^2 I_{p_i})$ for $i = 1, \ldots, k$ and the residual vector $e$ is distributed as $N_n(0, \sigma_e^2 W^{-1})$. Thus,

$$E(Y) = X_0 \beta_0$$

$$\mathrm{cov}(Y) = \sum_{i=1}^{k} \sigma_i^2 X_i X_i' + \sigma_e^2 W^{-1}$$
For the estimable function $L$, the expected hypothesis sum of squares is

$$\beta_0' X_0' W^{1/2} A_L W^{1/2} X_0 \beta_0 + \sum_{i=1}^{k} \sigma_i^2 \, \mathrm{trace}\bigl(X_i' W^{1/2} A_L W^{1/2} X_i\bigr) + \sigma_e^2 \, \mathrm{trace}(A_L)$$

where

$$A_L = W^{1/2} X G L' (LGL')^{-1} L G X' W^{1/2}$$

Since $L = LGX'WX$, $\mathrm{trace}(A_L) = s$ and $X'W^{1/2} A_L W^{1/2} X = L'(LGL')^{-1}L$. The matrix $X'W^{1/2} A_L W^{1/2} X$ can therefore be computed in the following way:
Now we have $X'W^{1/2} A_L W^{1/2} X = CC'$. If the rows of $C$ are partitioned into the same-size submatrices as those contained in $X$, that is,

$$C = \begin{bmatrix} C_0 \\ C_1 \\ \vdots \\ C_k \end{bmatrix}$$

where $C_i$ is a $p_i \times s$ submatrix, then $X_i' W^{1/2} A_L W^{1/2} X_i = C_i C_i'$, $i = 0, 1, \ldots, k$.