Ch. 5: Model Fitting Issues
I. Model Fitting: Model fitting is a process of hypothesis testing.
A. $H_0: \Sigma = \Sigma(\theta)$.
B. Because the goal is to show that the model fits, we hope not to reject $H_0$.
C. Test Statistic
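- The Wishart density of $S$ is
$$W(S; \Sigma, n) = \frac{e^{-(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1})}\,|nS|^{(1/2)(n-p-1)}}{|\Sigma|^{(1/2)n}\,2^{(1/2)np}\,\pi^{(1/4)p(p-1)}\,\prod_{i=1}^{p}\Gamma\left((1/2)(n+1-i)\right)}$$
$$= \frac{e^{-(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1})}}{|\Sigma|^{(1/2)n}} \cdot C$$
$$= e^{-(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1})}\,|\Sigma|^{-(1/2)n}\,C,$$
where $C$ collects every factor that does not depend on $\Sigma$.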
- Let the likelihood ratio ($L$) between a given model and a perfect model (one with $\Sigma = S$) be
$$L = \frac{\text{Likelihood for a given model}}{\text{Likelihood for a perfect model}} = \frac{e^{-(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1})}\,|\Sigma|^{-(1/2)n}\,C}{e^{-(1/2)\,n\,\mathrm{tr}(SS^{-1})}\,|S|^{-(1/2)n}\,C}$$
$$= e^{-(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1})}\,|\Sigma|^{-(1/2)n}\,e^{(1/2)\,n\,\mathrm{tr}(SS^{-1})}\,|S|^{(1/2)n}.$$
- Let the natural log of $L$ be $l = \ln(L)$:
$$l = -(1/2)\,n\,\mathrm{tr}(S\Sigma^{-1}) - (1/2)\,n\log|\Sigma| + (1/2)\,n\,\mathrm{tr}(SS^{-1}) + (1/2)\,n\log|S|$$
$$= -(1/2)\,n\,[\mathrm{tr}(S\Sigma^{-1}) + \log|\Sigma| - \log|S| - \mathrm{tr}(SS^{-1})]$$
$$= -(1/2)\,n\,[\mathrm{tr}(S\Sigma^{-1}) + \log|\Sigma| - \log|S| - (p+q)]$$
($\because SS^{-1}$ is a $(p+q)\times(p+q)$ identity matrix whose trace is the sum of the $(p+q)$ 1's on the diagonal.)
- If we multiply $l$ by $-2$, the result is a $\chi^2$ statistic for the test of fit between a given model and a perfect model.
- Thus,
$$\chi^2 = -2l = -2\left(-(1/2)\,n\,[\mathrm{tr}(S\Sigma^{-1}) + \log|\Sigma| - \log|S| - (p+q)]\right)$$
$$= n\,[\mathrm{tr}(S\Sigma^{-1}) + \log|\Sigma| - \log|S| - (p+q)] = nF$$
with $df = (1/2)(p+q)(p+q+1) - t$, where $t$ is the total number of estimated coefficients, and $p$ and $q$ are the numbers of observed endogenous and exogenous variables.
- Smaller $\chi^2$ values indicate better fitting models, and a nonsignificant $\chi^2$ is desirable because it means the model's predicted $\Sigma$ is sufficiently close to the observed covariance matrix, $S$, that the remaining differences can be attributed to sampling error.
- About df: the quantity $(1/2)(p+q)(p+q+1)$ can be written as $(1/2)(p+q)(p+q) + (1/2)(p+q)$. The first term is half of all $(p+q)^2$ entries in the $(p+q)\times(p+q)$ covariance matrix, which amounts to the covariances in the lower triangle plus half of the variances on the main diagonal. The second term adds the other half of the variances. Together they count the distinct variances and covariances.
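As a rough numerical check of the derivation above, the following Python sketch evaluates $F$, $\chi^2 = nF$, and df for a two-variable case. The function names (`f_ml`, `chi_square`, `degrees_of_freedom`), the 2×2 helper routines, and the example matrices are illustrative assumptions and are not taken from the notes or from any SEM package.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """tr(AB) = sum over i, j of a[i][j] * b[j][i]."""
    size = len(a)
    return sum(a[i][j] * b[j][i] for i in range(size) for j in range(size))

def f_ml(S, Sigma):
    """F = tr(S Sigma^-1) + log|Sigma| - log|S| - (p+q); 2x2 matrices only."""
    k = len(S)  # k = p + q observed variables (2 in this sketch)
    return (trace_of_product(S, inv2(Sigma))
            + math.log(det2(Sigma)) - math.log(det2(S)) - k)

def chi_square(S, Sigma, n):
    """chi-square = n * F, with n as used in the notes."""
    return n * f_ml(S, Sigma)

def degrees_of_freedom(k, t):
    """df = k(k+1)/2 - t, with k = p + q and t estimated coefficients."""
    return k * (k + 1) // 2 - t

# Hypothetical data: two standardized variables correlated .5, tested
# against a model that fixes their covariance to zero (t = 2 variances).
S = [[1.0, 0.5], [0.5, 1.0]]
Sigma_model = [[1.0, 0.0], [0.0, 1.0]]
n = 100

chi2 = chi_square(S, Sigma_model, n)
df = degrees_of_freedom(k=2, t=2)
print(round(chi2, 3), df)
```

When the model reproduces $S$ exactly, `f_ml(S, S)` is zero up to rounding, which matches the idea that the perfect model has $\chi^2 = 0$.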
II. Model Specification
A. Regression coefficients: $B$, $\Gamma$, $\Lambda_X$, and $\Lambda_Y$.
B. Four covariance matrices: $\Phi$, $\Psi$, $\Theta_\delta$, and $\Theta_\varepsilon$.
C. Models should be theory-driven.
III. Model Estimation
A. Maximum Likelihood Method for $\Sigma_{YY}$, $\Sigma_{YX}$, and $\Sigma_{XX}$.
B. The Newton-Raphson method is used for the numerical analysis.
IV. Model Fitting Indices (Ch. 3 of Byrne's AMOS book)
A. Basic Models
- Default model: the model you specified.
- Saturated model: a model in which the number of estimated parameters equals the number of data points (variances and covariances of the observed variables: the just-identified model). It is the least restricted model.
- Independence model: a model of complete independence of all variables in the model (all correlations among variables are zero). It is the most restricted model.
B. $\chi^2$ test (Absolute fit index)
- The $\chi^2$ test is sensitive to sample size, and the statistic is referred to the central $\chi^2$ distribution.
- RMSEA (Root Mean Square Error of Approximation): one of the most informative criteria in covariance structural modeling. It indexes the error of approximation in the population, that is, the discrepancy between the population covariance matrix and the model fitted with optimally chosen parameter values. Values less than .05 indicate good fit, and values up to .08 indicate reasonable fit. A 90% confidence interval is also computed.
- PCLOSE (Probability of Closeness of the fit in the population): the p-value for the test of close fit (RMSEA no greater than .05 in the population). It should be greater than .50.
- AIC (Akaike's Information Criterion): takes into account both statistical goodness of fit and the number of estimated parameters, but not sample size. $AIC = \chi^2_{model} + 2t$, where $t$ is the number of estimated parameters. The smaller, the better. There is no criterion for "small enough"; the value is judged only by comparison with other models, such as the saturated and independence models.
- CAIC (Consistent Version of the AIC): proposed by Bozdogan (1987). $CAIC = \chi^2_{model} + (\ln N + 1)\,t$, where $N$ is the sample size and $t$ is the number of estimated parameters. It considers both the number of parameters and sample size.
- BCC (Browne-Cudeck Criterion, 1989) and BIC (Bayes Information Criterion, Raftery, 1993): reflect the extent to which parameter estimates from the original sample will cross-validate in future samples. BCC and BIC impose greater penalties on model complexity than AIC and CAIC.
- ECVI (Expected Cross-Validation Index): the discrepancy between the fitted covariance matrix in the analyzed sample and the expected covariance matrix that would be obtained in another sample of equivalent size. It can take any value, so it has no fixed range.
- Hoelter .05 and .01: adequacy of sample size. A value over 200 indicates that the model adequately represents the sample data.
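The indices in this list can be sketched in a few lines of Python. The notes do not state an RMSEA formula, so the sketch assumes the commonly used point estimate $\sqrt{\max(\chi^2 - df, 0) / (df \cdot n)}$ with $n = N - 1$. It also assumes the AMOS-style definitions $AIC = \chi^2 + 2t$ and $CAIC = \chi^2 + (\ln N + 1)\,t$, with $t$ the number of estimated parameters. All function names and input values are hypothetical.

```python
import math

def rmsea(chi2, df, sample_size):
    """Point estimate sqrt(max(chi2 - df, 0) / (df * n)), assuming n = N - 1."""
    n = sample_size - 1
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))

def aic(chi2, t):
    """AIC = chi2 + 2t, with t the number of estimated parameters."""
    return chi2 + 2 * t

def caic(chi2, t, sample_size):
    """CAIC = chi2 + (ln N + 1) * t, which adds a sample-size penalty."""
    return chi2 + (math.log(sample_size) + 1) * t

# Hypothetical model results, for illustration only.
chi2, df, t, N = 85.0, 40, 15, 300

print("RMSEA:", round(rmsea(chi2, df, N), 3))
print("AIC:  ", aic(chi2, t))
print("CAIC: ", round(caic(chi2, t, N), 2))
```

Because AIC and CAIC have no absolute cutoff, these functions are only useful when applied to two or more models fitted to the same data and compared.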
V. Model misfit indices
A. Residuals
- The discrepancy between $\Sigma(\theta)$ and $S$ for each pair of observed variables. With $N$ observed variables, the covariance matrix has $N(N+1)/2$ distinct elements, so there can be that many residuals.
- A perfect fit would give us a residual matrix of zero (Null matrix).
- Two types of residuals: unstandardized and standardized.
- Unstandardized residuals are dependent on the measurement unit and are hard to interpret.
- Standardized residuals are z-scores of the unstandardized residuals.
- If the absolute value of a standardized residual is greater than 2.58 (the .01 level), it is considered large.
- Residuals are not part of the default output; request them through the Analysis Properties icon.
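A minimal sketch of the residual checks described above, in plain Python. It computes the unstandardized residual matrix $S - \Sigma(\theta)$ and flags standardized residuals beyond the 2.58 cutoff. Computing the standardized values themselves requires standard errors that the notes do not cover, so the `z` matrix is supplied as input. All matrices and function names are hypothetical.

```python
def residual_matrix(S, Sigma):
    """Element-wise S - Sigma(theta) for square matrices as nested lists."""
    size = len(S)
    return [[S[i][j] - Sigma[i][j] for j in range(size)] for i in range(size)]

def large_standardized_residuals(z, cutoff=2.58):
    """Return (row, col, value) for lower-triangle entries with |z| > cutoff."""
    return [(i, j, z[i][j])
            for i in range(len(z)) for j in range(i + 1)
            if abs(z[i][j]) > cutoff]

# Hypothetical observed and model-implied covariance matrices (3 variables).
S = [[1.00, 0.45, 0.30],
     [0.45, 1.00, 0.50],
     [0.30, 0.50, 1.00]]
Sigma_hat = [[1.00, 0.40, 0.32],
             [0.40, 1.00, 0.50],
             [0.32, 0.50, 1.00]]
residuals = residual_matrix(S, Sigma_hat)

# Hypothetical standardized residuals, as a program might report them.
z = [[0.00, 2.90, -0.40],
     [2.90, 0.00, 0.10],
     [-0.40, 0.10, 0.00]]
flagged = large_standardized_residuals(z)
print(flagged)
```

Only the lower triangle is scanned because the matrix is symmetric, so each of the $N(N+1)/2$ distinct residuals is checked once.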
B. Modification Index (MI)
- The MI value represents the expected drop in the overall $\chi^2$ value if the fixed parameter were freely estimated in a subsequent run.
- All freely estimated parameters automatically have the MI values equal to zero.
- The $\chi^2$ value for each fixed parameter has one (1) degree of freedom.
- A high MI value (e.g., > 10 or 20) for a covariance indicates a possible correlation between two variables and may call for a correlation path in the model.
- A high MI value for a regression weight may indicate a cross-loading of the factor on the variable and may call for moving the variable to another factor.
- MI values are not part of the default output; go to the Analysis Properties icon and specify the threshold value.
C. Par Change (Expected Parameter Change, EPC)
- It is the estimated change (positive or negative) in the value of each fixed parameter if the model were reparameterized by freeing that parameter, as suggested by its MI value.
- It serves as an index of how sensitive the fit is to each parameter estimate.
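One way to work with MI and Par Change output is to list the fixed parameters whose MI exceeds a chosen threshold, largest first, and read the expected parameter change alongside. The sketch below assumes the values have already been copied out of the program's output. The parameter labels, numbers, and the function name `flag_modification_indices` are hypothetical.

```python
def flag_modification_indices(candidates, threshold=10.0):
    """Return candidates with MI above the threshold, largest MI first."""
    flagged = [c for c in candidates if c["mi"] > threshold]
    return sorted(flagged, key=lambda c: c["mi"], reverse=True)

# Hypothetical MI and Par Change values for three fixed parameters.
candidates = [
    {"parameter": "item5 <- F2", "mi": 12.7, "par_change": -0.18},
    {"parameter": "e1 <-> e2", "mi": 24.3, "par_change": 0.21},
    {"parameter": "e3 <-> e4", "mi": 4.1, "par_change": 0.06},
]

for c in flag_modification_indices(candidates):
    print(c["parameter"], c["mi"], c["par_change"])
```

A flagged parameter is only a candidate for modification; whether to free it remains a substantive decision.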
VI. Analysis Strategy
A. Discarding the model
- If the proposed model does not fit the data set and has low fit indices (e.g., a significant $\chi^2$ value, GFI lower than .80, and RMSEA greater than .10), we need to discard the model and start all over again.
- We need to come up with another theory-driven model for a separate data set.
- By not modifying the model we maintain the confirmatory nature of the analysis.
B. Changing factor structure
- This strategy applies when the proposed model does not fit the data set but the fit indices are relatively high (e.g., GFI between .85 and .89 and RMSEA between .05 and .08), with high MI values and standardized residuals for several parameters.
- By changing factor structure within the “safe-zone” and by reanalyzing the data, we can obtain a better fit of the model for the data set.
- By running a post hoc analysis with modified paths, we change the analysis from confirmatory to exploratory.
C. Deleting unnecessary variables (Parceling)
- The rule of thumb is “five variables per factor.”
- If we have too many variables for a factor, some low loading variables may be deleted.
- We would like to select a strong group of variables for each factor.
- This procedure is also an exploratory analysis in nature.
D. Compare the two models with $\Delta\chi^2 = \chi^2_1 - \chi^2_2$ with $\Delta df = df_1 - df_2$.
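The model comparison in D can be sketched as a $\chi^2$ difference test. The sketch assumes that model 1 is the more restricted of two nested models, so that $df_1 > df_2$, and that $\Delta\chi^2$ is referred to a central $\chi^2$ distribution with $\Delta df$ degrees of freedom. The tail probability is computed with a standard recurrence for integer df using only Python's `math` module. Function names and fit values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square, for integer df >= 1.

    Starts from the closed form for df = 1 or df = 2 and steps up by 2 with
    Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        q, k = math.exp(-x / 2), 2
    else:
        q, k = math.erfc(math.sqrt(x / 2)), 1
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

def chi_square_difference(chi2_1, df_1, chi2_2, df_2):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta_chi2 = chi2_1 - chi2_2
    delta_df = df_1 - df_2
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)

# Hypothetical fit results: model 1 is the more restricted model.
delta_chi2, delta_df, p_value = chi_square_difference(45.2, 20, 30.1, 18)
print(round(delta_chi2, 1), delta_df, round(p_value, 4))
```

A small p-value suggests that the less restricted model fits significantly better, so the added parameters are worth keeping.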