Factor Analysis - Applied Multivariate Methods - Lecture Notes (STAT 579)
Factor Analysis
Why Perform Factor Analysis?
You suspect that the variables you observe (manifest variables) are functions of variables that you cannot observe directly (latent variables).
- Identify the latent variables to learn something interesting about the behavior of your population.
- Identify relationships between different latent variables.
- Show that a small number of latent variables underlies the process or behavior you have measured to simplify your theory.
- Explain inter-correlations among observed variables.
Objectives of Factor Analysis
- To determine whether the p response variables can be partitioned into m subsets, with high correlations within a subset, and low correlations between subsets.
- To determine this new set of m uncorrelated variables (factors) in such a way that these new variables (factors) are interpretable.
Exploratory Factor Analysis
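[Figure: diagram of an exploratory factor model. Common factors: F1 (consumer confidence) and F2 (buying power). Manifest variables: New Home Buys, Durable Goods Buys, Borrowing, Income, Import Purchases. Unique factors: u1, u2, u3, u4, u5.]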
Factor Analysis Model
$$x_{p \times 1} = \Lambda_{p \times m}\, f_{m \times 1} + \eta_{p \times 1}$$
- $\Lambda$: factor loadings
- $f$: common factors
- $\eta$: specific factors
Factor Analysis Model
- f and η are independent
- x is either centered or standardized
Assumptions of the Common Factor Model
- The unique factors (residuals) are uncorrelated with each other.
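- The unique factors (residuals) are uncorrelated with the common (latent) factors.
Under these constraints, with standardized variables and uncorrelated common factors of unit variance, the correlation matrix decomposes as
R = ΛΛ′ + Ψ
where Ψ is the diagonal covariance matrix of the unique factors.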
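The decomposition can be checked numerically. The sketch below is only an illustration: the loadings are hypothetical, the common factors are assumed to be uncorrelated with unit variance, and the uniquenesses are chosen so that every variable has unit variance. It simulates x = Λf + η and compares the sample correlation matrix with ΛΛ′ + Ψ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loadings: p = 5 standardized variables, m = 2 common factors.
Lambda = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.8],
    [0.2, 0.7],
])

# Uniquenesses chosen so that each variable has unit variance.
psi = 1.0 - np.sum(Lambda**2, axis=1)

# Correlation matrix implied by the model: R = Lambda Lambda' + Psi.
R_model = Lambda @ Lambda.T + np.diag(psi)

# Simulate x = Lambda f + eta with f ~ N(0, I) independent of eta ~ N(0, Psi).
n = 200_000
f = rng.standard_normal((n, 2))
eta = rng.standard_normal((n, 5)) * np.sqrt(psi)
x = f @ Lambda.T + eta

# The sample correlation matrix should be close to the model-implied one.
R_sample = np.corrcoef(x, rowvar=False)
max_abs_diff = np.max(np.abs(R_sample - R_model))
print(max_abs_diff)
```

The difference shrinks as n grows; it is sampling error, not a failure of the identity.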
PCA versus Factor Analysis
PCA:
- The components are derived from the variables and explain 100% of the variation in the data.
- All components together account for 100% of the variance.
Factor Analysis:
- The variables reflect the common (latent) factors, which explain the shared variation in the manifest variables.
- The extracted factors need not account for 100% of the variance.
Factor Analysis – Details
If $\Lambda$ and $\Psi$ exist, so that
$$\Sigma = \Lambda\Lambda' + \Psi,$$
then, since $\Psi$ is a diagonal matrix, the common factors $f$ completely explain the covariances $\sigma_{ij}$, $i \ne j$.
Factor Analysis – Details
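$$\sigma_{jj} = \operatorname{Var}(x_j) = \sum_{k=1}^{m} \lambda_{jk}^2 + \psi_j$$
The proportion of variance of $x_j$ that is explained by the common factors is
$$\frac{h_j^2}{\sigma_{jj}}, \qquad \text{where } h_j^2 = \sum_{k=1}^{m} \lambda_{jk}^2 \text{ is the communality of } x_j.$$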
Factor Analysis – Details
$$\operatorname{Cov}(x_j, f_k) = \lambda_{jk} = \text{loading of the } j\text{th response variable on the } k\text{th factor}$$
$$\sigma_{ij} = \sum_{k=1}^{m} \lambda_{ik}\lambda_{jk}, \qquad i \ne j$$
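A small numerical illustration of these identities follows. The loadings and specific variances are hypothetical; the sketch assumes Σ = ΛΛ′ + Ψ with uncorrelated, unit-variance common factors.

```python
import numpy as np

# Hypothetical loadings (p = 3 variables, m = 2 factors) and specific variances.
Lambda = np.array([
    [2.0, 0.5],
    [1.5, 1.0],
    [0.5, 2.0],
])
psi = np.array([1.0, 0.75, 0.75])

# Covariance matrix implied by the model: Sigma = Lambda Lambda' + Psi.
Sigma = Lambda @ Lambda.T + np.diag(psi)

# Communality h_j^2: sum of squared loadings in row j (4.25, 3.25, 4.25 here).
communality = np.sum(Lambda**2, axis=1)

# Variance decomposition: sigma_jj = h_j^2 + psi_j (5.25, 4.0, 5.0 here).
sigma_jj = np.diag(Sigma)

# Proportion of each variable's variance explained by the common factors.
proportion = communality / sigma_jj

# An off-diagonal covariance depends on the loadings only:
# sigma_12 = sum_k lambda_1k * lambda_2k = 2.0 * 1.5 + 0.5 * 1.0 = 3.5.
sigma_12 = Lambda[0] @ Lambda[1]

print(communality, sigma_jj, proportion, sigma_12)
```

Because Ψ is diagonal, it contributes nothing to the off-diagonal entries, which is why the common factors alone account for the covariances.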
Factor Analysis using Ρ
$$\operatorname{Corr}(x_j, f_k) = \operatorname{Cov}(z_j, f_k) = \lambda_{jk}$$
where $z_j = (x_j - \mu_j)/\sqrt{\sigma_{jj}}$.
Communality of $x_j$:
$$h_j^2 = \sum_{k=1}^{m} \lambda_{jk}^2$$
Solving the Factor Analysis Equations
- In many cases, solutions do not exist at all.
- For example,
  - estimates of λ² are negative, or
  - estimates of ψ are negative.
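Both failure modes can be seen in the smallest exactly solvable case, p = 3 variables and m = 1 factor, where the equations r_ij = λ_i λ_j (i ≠ j) can be solved in closed form. The sketch below is an illustration only; the function name and both correlation matrices are hypothetical choices, not taken from the notes.

```python
import numpy as np

def one_factor_solution(R):
    """Exact one-factor solution for a 3 x 3 correlation matrix.

    With p = 3 and m = 1, r_ij = lambda_i * lambda_j for i != j gives
    lambda_1^2 = r_12 * r_13 / r_23 (and similarly for the others);
    the specific variances are psi_j = 1 - lambda_j^2.
    """
    r12, r13, r23 = R[0, 1], R[0, 2], R[1, 2]
    lam_sq = np.array([r12 * r13 / r23, r12 * r23 / r13, r13 * r23 / r12])
    psi = 1.0 - lam_sq
    return lam_sq, psi

# Case A: a valid correlation matrix whose solution has a negative psi.
R_a = np.array([
    [1.0, 0.9, 0.7],
    [0.9, 1.0, 0.4],
    [0.7, 0.4, 1.0],
])
lam_sq_a, psi_a = one_factor_solution(R_a)   # lambda_1^2 = 1.575, psi_1 = -0.575

# Case B: a valid correlation matrix whose solution has negative lambda^2.
R_b = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, -0.4],
    [0.5, -0.4, 1.0],
])
lam_sq_b, psi_b = one_factor_solution(R_b)   # lambda_1^2 = -0.625

print(lam_sq_a, psi_a)
print(lam_sq_b, psi_b)
```

In both cases the equations have an algebraic solution, but it is not admissible as a set of variances, so no proper factor solution exists.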
Choosing the Number of Factors m
- Initially, do a PCA, and start with m equal to one of the following:
  - the number of PCs that explain most (≥ 70%) of the variance;
  - the number of components with eigenvalues > 1 (if the PCA is on the correlation matrix);
  - the number determined from a scree plot (find the elbow, and select all components occurring in the sharp descent before leveling off).
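The first two starting rules are easy to compute from the eigenvalues of the correlation matrix. The sketch below uses a hypothetical block-structured correlation matrix (two uncorrelated groups of variables); the variable names are illustrative.

```python
import numpy as np

# Hypothetical correlation matrix: variables 1-3 form one group, 4-5 another.
R = np.array([
    [1.0, 0.6, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.6, 0.0, 0.0],
    [0.6, 0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.0, 0.6, 1.0],
])

# Eigenvalues of R in decreasing order (these are the PC variances).
eigvals = np.linalg.eigvalsh(R)[::-1]

# Rule 1: smallest m whose components explain at least 70% of the variance.
cumulative = np.cumsum(eigvals) / eigvals.sum()
m_variance = int(np.argmax(cumulative >= 0.70) + 1)

# Rule 2: number of eigenvalues greater than 1 (correlation-matrix PCA only).
m_kaiser = int(np.sum(eigvals > 1.0))

print(eigvals)        # approximately 2.2, 1.6, 0.4, 0.4, 0.4
print(m_variance, m_kaiser)
```

For this matrix both rules suggest m = 2. A scree plot of `eigvals` against component number would show the elbow after the second component.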
Choosing the Number of Factors m
- After an FA, drop trivial factors (factors that load on only one response variable).
Choosing the Number of Factors m
- Determine the minimum number of factors that account for 100% of the common variance
- Interpretability criteria
  - At least three items load on each factor
  - Variables within a factor share conceptual meaning
  - Variables between factors measure different constructs
  - Rotated factors demonstrate simple structure
Methods for Solving the Factor Analysis Equations
- Maximum Likelihood Method
  - An iterative procedure that is less efficient computationally
  - Yields better estimates than the Principal Factor method in large samples
  - Produces a statistical test for the number of factors
  - A factor extraction method that produces parameter estimates that are most likely to have produced the observed correlation matrix if the sample is from a multivariate normal distribution. The correlations are weighted by the inverse of the uniqueness of the variables, and an iterative algorithm is employed.
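As a rough sketch of what the maximum likelihood method minimizes, the function below evaluates the usual normal-theory discrepancy F = ln|Σ| − ln|R| + tr(RΣ⁻¹) − p for Σ = ΛΛ′ + Ψ. It only evaluates F for given parameters and does not perform the iterative minimization. The function names and the example loadings are hypothetical. The degrees of freedom shown are the standard ones for the likelihood ratio test of m factors, [(p − m)² − (p + m)]/2.

```python
import numpy as np

def ml_discrepancy(R, Lambda, psi):
    """Normal-theory ML discrepancy between R and Sigma = Lambda Lambda' + Psi."""
    Sigma = Lambda @ Lambda.T + np.diag(psi)
    p = R.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_r = np.linalg.slogdet(R)
    return logdet_sigma - logdet_r + np.trace(R @ np.linalg.inv(Sigma)) - p

def lr_test_df(p, m):
    """Degrees of freedom of the likelihood ratio test for m factors."""
    return ((p - m) ** 2 - (p + m)) // 2

# Hypothetical two-factor model for five standardized variables.
Lambda = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.8],
    [0.2, 0.7],
])
psi = 1.0 - np.sum(Lambda**2, axis=1)
R = Lambda @ Lambda.T + np.diag(psi)

f_true = ml_discrepancy(R, Lambda, psi)          # zero: the model reproduces R
f_wrong = ml_discrepancy(R, 0.9 * Lambda, psi)   # positive: a worse fit
print(f_true, f_wrong, lr_test_df(5, 2))
```

In practice the test statistic is a sample-size multiple of the minimized F, referred to a chi-square distribution with `lr_test_df(p, m)` degrees of freedom.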
Methods for Solving the Factor Analysis Equations
- Unweighted Least Squares Method
  - A factor extraction method that minimizes the sum of the squared differences between the observed and reproduced correlation matrices, ignoring the diagonals.
- Generalized Least Squares Method
  - A factor extraction method that minimizes the sum of the squared differences between the observed and reproduced correlation matrices. Correlations are weighted by the inverse of their uniqueness, so that variables with high uniqueness are given less weight than those with low uniqueness.
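The unweighted least squares criterion can be written in a few lines. The sketch below only evaluates the criterion for a given loading matrix (it does not minimize it); the function name and loadings are hypothetical.

```python
import numpy as np

def uls_discrepancy(R, Lambda):
    """Sum of squared off-diagonal differences between R and Lambda Lambda'.

    The diagonal is ignored, so the uniquenesses do not enter the criterion.
    Each pair (i, j) is counted twice because R is symmetric.
    """
    residual = R - Lambda @ Lambda.T
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    return np.sum(residual[off_diagonal] ** 2)

# Hypothetical two-factor model for five standardized variables.
Lambda = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.8],
    [0.2, 0.7],
])
R = Lambda @ Lambda.T
np.fill_diagonal(R, 1.0)

uls_true = uls_discrepancy(R, Lambda)         # 0: off-diagonals reproduced exactly
uls_wrong = uls_discrepancy(R, 0.9 * Lambda)  # positive
print(uls_true, uls_wrong)
```

A generalized least squares version would weight the residuals before squaring, as described above; that weighting is not implemented here.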
Methods for Solving the Factor Analysis Equations
- Alpha
  - A factor extraction method that considers the variables in the analysis to be a sample from the universe of potential variables. It maximizes the alpha reliability of the factors.
- Image
  - A factor extraction method developed by Guttman and based on image theory. The common part of the variable, called the partial image, is defined as its linear regression on remaining variables, rather than a function of hypothetical factors.
Principal Factor Method on R
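Model: $R = \Lambda\Lambda' + \Psi$
Estimate $\Psi$ with
$$\hat{\Psi}^{(1)} = \operatorname{diag}(\hat{\psi}_1, \ldots, \hat{\psi}_p), \qquad \hat{\psi}_j = 1 - R^2_{x_j,\ \text{all the other } x\text{'s}}$$
or equivalently, let the communality
$$h_j^2 = R^2_{x_j,\ \text{all the other } x\text{'s}} = SMC_j$$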
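The squared multiple correlations can be read off the inverse of the correlation matrix, using the standard identity SMC_j = 1 − 1/r^{jj}, where r^{jj} is the j-th diagonal element of R⁻¹. The sketch below checks this against the R² from regressing each variable on the others; the correlation matrix and function names are hypothetical.

```python
import numpy as np

def smc(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def smc_by_regression(R, j):
    """R^2 from regressing variable j on the remaining variables."""
    others = [k for k in range(R.shape[0]) if k != j]
    r_j = R[others, j]
    R_oo = R[np.ix_(others, others)]
    return r_j @ np.linalg.solve(R_oo, r_j)

# Hypothetical correlation matrix.
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

prior_communalities = smc(R)            # initial h_j^2
psi_hat = 1.0 - prior_communalities     # initial diagonal of Psi
direct = np.array([smc_by_regression(R, j) for j in range(3)])
print(prior_communalities, psi_hat)
```

The inverse-based form is the convenient one in practice because it gives all p values at once.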
Principal Factor Method on R
- Estimation using the adjusted squared multiple correlation:
$$h_j^2 = ASMC_j = SMC_j \times \frac{\sum_{i=1}^{p} \max_{k \ne i} |r_{ik}|}{\sum_{i=1}^{p} SMC_i}$$
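A minimal sketch of the adjustment, assuming it rescales the squared multiple correlations so that their sum equals the sum of each variable's largest absolute correlation with another variable. The function name and correlation matrix are hypothetical.

```python
import numpy as np

def asmc(R):
    """Adjusted SMCs: proportional to the SMCs, rescaled to a target sum.

    Assumption: the target is the sum over variables of the maximum
    absolute off-diagonal correlation in that variable's row.
    """
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    abs_off_diagonal = np.abs(R - np.diag(np.diag(R)))
    max_abs_corr = abs_off_diagonal.max(axis=1)
    return smc * max_abs_corr.sum() / smc.sum()

# Hypothetical correlation matrix; row maxima are 0.5, 0.5, 0.4.
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

asmc_values = asmc(R)
smc_values = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
print(asmc_values, asmc_values.sum())   # the sum is 1.4
```

The adjusted values keep the same ordering as the SMCs; only their overall scale changes.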
Principal Factor Method on R
To obtain a unique solution, let
$$\Lambda'\Lambda = D = \operatorname{diag}(d_1, \ldots, d_m)$$
Also, let $\Lambda = [\lambda_1, \ldots, \lambda_m]$, where $\lambda_k$ is the $k$th column of $\Lambda$.
Principal Factor Method on R
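$$\Lambda\Lambda'\Lambda = (R - \Psi)\Lambda$$
$$\Lambda D = (R - \Psi)\Lambda$$
$$d_k \lambda_k = (R - \Psi)\lambda_k, \qquad k = 1, \ldots, m$$
$d_1, \ldots, d_m$ are the eigenvalues of $R - \Psi$, and $\lambda_1, \ldots, \lambda_m$ are the corresponding eigenvectors.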
Principal Factor Method on R
- Choose the λ's corresponding to the m largest nonnegative eigenvalues of R − Ψ.
- May have to reduce m if fewer than m eigenvalues are nonnegative.
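The steps of the principal factor method can be sketched as a single (non-iterated) pass: estimate the communalities by the SMCs, form the reduced matrix R − Ψ̂, and scale its leading eigenvectors so that Λ′Λ = D. The function name and the example model are hypothetical, and the sketch keeps strictly positive eigenvalues only, which is an assumption made so that the square roots are nonzero.

```python
import numpy as np

def principal_factor(R, m):
    """One pass of the principal factor method with SMC prior communalities."""
    # Prior communalities: squared multiple correlations.
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

    # Reduced correlation matrix R - Psi_hat: R with the SMCs on the diagonal.
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)

    # Eigen-decomposition, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(R_reduced)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Reduce m if fewer than m eigenvalues are positive.
    m = min(m, int(np.sum(eigvals > 0)))
    d = eigvals[:m]

    # Scale unit eigenvectors so that each column satisfies lambda_k' lambda_k = d_k.
    loadings = eigvecs[:, :m] * np.sqrt(d)
    psi = 1.0 - np.sum(loadings**2, axis=1)
    return loadings, d, psi

# Hypothetical two-factor correlation structure for five variables.
true_loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.8],
    [0.2, 0.7],
])
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

loadings, d, psi = principal_factor(R, m=2)
print(loadings)
print(d, psi)
```

The estimated loadings are determined only up to rotation and sign, so they need not resemble `true_loadings` column by column. An iterated version would replace the prior communalities by the row sums of squared loadings and repeat until they stabilize.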
Methods of Orthogonal Factor Rotation
- Varimax
  - An orthogonal rotation method that minimizes the number of variables that have high loadings on each factor. It simplifies the interpretation of the factors.
Methods of Orthogonal Factor Rotation
- Quartimax
  - A rotation method that minimizes the number of factors needed to explain each variable. It simplifies the interpretation of the observed variables.
- Orthomax
  - A class of rotations that includes quartimax
Methods of Orthogonal Factor Rotation
- Equamax
  - A rotation method that is a combination of the varimax method, which simplifies the factors, and the quartimax method, which simplifies the variables. The number of variables that load highly on a factor and the number of factors needed to explain a variable are minimized.
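These rotations can be illustrated with one function for the orthomax family, where a parameter gamma selects the criterion: gamma = 0 gives quartimax, gamma = 1 gives varimax, and gamma = m/2 gives equamax. The sketch below uses a commonly seen SVD-based iteration; it is not necessarily the algorithm any particular package uses, it applies no Kaiser normalization, and the function names and loadings are hypothetical.

```python
import numpy as np

def orthomax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation of a p x m loading matrix L.

    Returns the rotated loadings L @ T and the orthogonal matrix T.
    """
    p, m = L.shape
    T = np.eye(m)
    objective = 0.0
    for _ in range(max_iter):
        B = L @ T
        # Gradient of the orthomax criterion with respect to the rotation.
        G = L.T @ (B**3 - (gamma / p) * B @ np.diag(np.sum(B**2, axis=0)))
        # Project the gradient onto the orthogonal matrices.
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        new_objective = s.sum()
        if objective != 0.0 and new_objective < objective * (1.0 + tol):
            break
        objective = new_objective
    return L @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return np.sum(np.var(L**2, axis=0))

# Hypothetical loadings with clear structure, then deliberately rotated by 30 degrees.
simple = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.6, 0.3],
    [0.1, 0.8],
    [0.2, 0.7],
])
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
unrotated = simple @ Q

rotated, T = orthomax(unrotated, gamma=1.0)   # varimax
print(np.round(rotated, 3))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because T is orthogonal, the communalities (row sums of squared loadings) are unchanged by the rotation; only the distribution of each variable's loading across factors changes.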
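Simple Structure of the Matrix of Factor Loadings
The matrix of factor loadings is
$$\Lambda = \begin{bmatrix} \lambda_{11} & \lambda_{12} & \cdots & \lambda_{1m} \\ \lambda_{21} & \lambda_{22} & \cdots & \lambda_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_{p1} & \lambda_{p2} & \cdots & \lambda_{pm} \end{bmatrix}$$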