Factor Analysis - Applied Multivariate Methods - Lecture Notes | STAT 579, Exams of Descriptive statistics

Material Type: Exam; Class: Applied Multivariate Methods; Subject: Statistics; University: University of Tennessee - Knoxville; Term: Unknown 1989;


Uploaded on 08/26/2009


Factor Analysis

Why Perform Factor Analysis?

You suspect that the variables you observe (manifest variables) are functions of variables that you cannot observe directly (latent variables).

  • Identify the latent variables to learn something interesting about the behavior of your population.
  • Identify relationships between different latent variables.
  • Show that a small number of latent variables underlies the process or behavior you have measured to simplify your theory.
  • Explain inter-correlations among observed variables.

Objectives of Factor Analysis

  • To determine whether the p response variables can be partitioned into m subsets, with high correlations within a subset, and low correlations between subsets.
  • To determine this new set of m uncorrelated variables (factors) in such a way that these new variables (factors) are interpretable.

Exploratory Factor Analysis

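[Figure: path diagram of an exploratory factor model. Two common factors (F1: Consumer confidence; F2: Buying power) underlie five manifest variables (New Home Buys, Durable Goods Buys, Borrowing, Income, Import Purchases), and each manifest variable has its own unique factor (u1, ..., u5).]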

Factor Analysis Model

$$x_{p \times 1} = \Lambda_{p \times m}\, f_{m \times 1} + \eta_{p \times 1}$$

where $\Lambda$ holds the factor loadings, $f$ the common factors, and $\eta$ the specific factors.

  • f and η are independent
  • x is either centered or standardized

Assumptions of the Common Factor Model

  • The unique factors (residuals) are uncorrelated with each other.
  • The unique factors (residuals) are uncorrelated with the common (latent) factors.

Under these constraints (and with uncorrelated, unit-variance common factors), the correlation matrix can be written as

$$R = \Lambda \Lambda' + \Psi$$
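To make the decomposition concrete, here is a minimal sketch that builds the correlation matrix implied by a given loading matrix. The loading values and the function name `implied_correlation` are hypothetical, chosen only for illustration; the sketch assumes standardized variables and uncorrelated, unit-variance common factors.

```python
# Hypothetical loadings for p = 4 standardized variables on m = 2 factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]

def implied_correlation(loadings):
    """Return (R, psi) with R = Lambda Lambda' + Psi and a unit diagonal."""
    p = len(loadings)
    # Common part: entry (i, j) is sum_k lambda_ik * lambda_jk.
    common = [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
               for j in range(p)] for i in range(p)]
    # Uniquenesses are whatever is needed to bring each diagonal entry to 1.
    psi = [1.0 - common[j][j] for j in range(p)]
    R = [[common[i][j] + (psi[i] if i == j else 0.0) for j in range(p)]
         for i in range(p)]
    return R, psi

R, psi = implied_correlation(loadings)
for row in R:
    print([round(r, 2) for r in row])
print("uniquenesses:", [round(u, 2) for u in psi])
```

Because Ψ is diagonal, it only affects the diagonal of R; every off-diagonal correlation comes from the loadings alone.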

PCA versus Factor Analysis

  • PCA
    • The components are derived from the variables and explain 100% of the variation in the data.
    • 100% of the variance is accounted for by all components together.
  • Factor Analysis
    • The variables reflect the common (latent) factors, which explain the shared variation in the manifest variables.
    • The extracted factors need not account for 100% of the variance.

Factor Analysis – Details

If $\Lambda$ and $\Psi$ exist, so that

$$\Sigma = \Lambda \Lambda' + \Psi$$

then, since $\Psi$ is a diagonal matrix, the common factors $f$ completely explain the covariances $\sigma_{ij}$, $i \ne j$.

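The variance of each response variable decomposes as

$$\mathrm{Var}(x_j) = \sigma_{jj} = \sum_{k=1}^{m} \lambda_{jk}^2 + \psi_j$$

The proportion of variance of $x_j$ that is explained by the common factors is

$$h_j^2 = \frac{\sum_{k=1}^{m} \lambda_{jk}^2}{\sigma_{jj}} = \text{communality of } x_j$$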

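The covariances between response variables, and between a response variable and a factor, are

$$\sigma_{ij} = \sum_{k=1}^{m} \lambda_{ik} \lambda_{jk} \quad (i \ne j)$$

$$\mathrm{Cov}(x_j, f_k) = \lambda_{jk} = \text{loading of the } j\text{th response variable on the } k\text{th factor}$$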

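Factor Analysis using Ρ

When the correlation matrix Ρ is analyzed, the model applies to the standardized variables:

$$\mathrm{Corr}(z_j, f_k) = \lambda_{jk}$$

$$\text{Communality of } x_j: \quad h_j^2 = \sum_{k=1}^{m} \lambda_{jk}^2$$

where $z_j = x_j / \sqrt{\sigma_{jj}}$ is the standardized form of the (centered) variable $x_j$.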
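As a small worked illustration on the correlation scale, the sketch below computes each variable's communality as the sum of its squared loadings, and the variance attributable to each factor as the sum of squared loadings in that column. The loading values are hypothetical, not taken from the lecture.

```python
# Hypothetical loadings for p = 4 standardized variables on m = 2 factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]

# Communality of variable j: sum over factors of lambda_jk squared.
communalities = [sum(l * l for l in row) for row in loadings]

# On the correlation scale each variance is 1, so uniqueness = 1 - communality.
uniquenesses = [1.0 - h2 for h2 in communalities]

# Variance explained by factor k: sum over variables of lambda_jk squared.
n_factors = len(loadings[0])
factor_variance = [sum(row[k] ** 2 for row in loadings) for k in range(n_factors)]

print("communalities:  ", [round(h, 2) for h in communalities])
print("uniquenesses:   ", [round(u, 2) for u in uniquenesses])
print("factor variance:", [round(v, 2) for v in factor_variance])
```

The communalities and the factor variances add up to the same total, since both are sums over all squared loadings.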

Solving the Factor Analysis Equations

  • In many cases, no proper solution exists at all.
  • For example, the equations may give
    • negative estimates of $\lambda^2$, or
    • negative estimates of $\psi$.

Choosing the Number of Factors m

  • Initially, run a PCA and start with m equal to
    • the number of PCs that explain most (≥ 70%) of the variance,
    • the number of components with eigenvalues > 1 (if the PCA is on the correlation matrix), or
    • the number determined from a scree plot (find the elbow, and select all components occurring in the sharp descent before the plot levels off).

  • After a FA, drop trivial factors (factors that load on only one response variable).

  • Determine the minimum number of factors that account for 100% of the common variance
  • Interpretability criteria
    • At least three items load on each factor
    • Variables within a factor share conceptual meaning
    • Variables between factors measure different constructs
    • Rotated factors demonstrate simple structure
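The two numeric starting rules for m (the variance threshold and the eigenvalue-greater-than-1 rule) can be sketched as follows. The eigenvalues are hypothetical values for a 5-variable correlation matrix, and the function names are illustrative only; the scree-plot rule is visual and is not attempted here.

```python
def m_by_variance(eigenvalues, threshold=0.70):
    """Smallest m whose leading components reach the variance threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for m, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return m
    return len(eigenvalues)

def m_by_eigenvalue_rule(eigenvalues):
    """Number of components with eigenvalue > 1 (PCA on the correlation matrix)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

# Hypothetical eigenvalues; they sum to p = 5, as for a correlation matrix.
eigenvalues = [2.9, 1.4, 0.4, 0.2, 0.1]

print("m from 70% rule:       ", m_by_variance(eigenvalues))
print("m from eigenvalue rule:", m_by_eigenvalue_rule(eigenvalues))
```

Either value is only a starting point; the interpretability criteria above still decide whether m should be changed after the factor analysis is run.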

Methods for Solving the Factor Analysis Equations

  • Maximum Likelihood Method
    • An iterative procedure that is less efficient computationally
    • Yields better estimates than the Principal Factor method in large samples
    • Produces a statistical test for the number of factors
    • A factor extraction method that produces parameter estimates that are most likely to have produced the observed correlation matrix if the sample is from a multivariate normal distribution. The correlations are weighted by the inverse of the uniqueness of the variables, and an iterative algorithm is employed.

  • Unweighted Least Squares Method
    • A factor extraction method that minimizes the sum of the squared differences between the observed and reproduced correlation matrices ignoring the diagonals.
  • Generalized Least Squares Method
    • A factor extraction method that minimizes the sum of the squared differences between the observed and reproduced correlation matrices. Correlations are weighted by the inverse of their uniqueness, so that variables with high uniqueness are given less weight than those with low uniqueness.

  • Alpha
    • A factor extraction method that considers the variables in the analysis to be a sample from the universe of potential variables. It maximizes the alpha reliability of the factors.
  • Image
    • A factor extraction method developed by Guttman and based on image theory. The common part of the variable, called the partial image, is defined as its linear regression on remaining variables, rather than a function of hypothetical factors.
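To illustrate the unweighted least squares criterion described above, here is a minimal sketch that evaluates it for a candidate loading matrix: the sum of squared differences between observed and reproduced correlations, ignoring the diagonal. It only evaluates the criterion and does not minimize it; the function name and the numbers are hypothetical. Each off-diagonal pair is counted once, which changes the value by a factor of 2 but not the minimizer.

```python
def uls_discrepancy(R, loadings):
    """Sum of squared off-diagonal residuals between R and Lambda Lambda'."""
    p = len(R)
    total = 0.0
    for i in range(p):
        for j in range(i + 1, p):  # off-diagonal pairs only
            reproduced = sum(a * b for a, b in zip(loadings[i], loadings[j]))
            total += (R[i][j] - reproduced) ** 2
    return total

# Hypothetical observed correlations and a one-factor candidate solution.
R = [
    [1.00, 0.60, 0.48],
    [0.60, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]
loadings = [[0.8], [0.7], [0.6]]

print(uls_discrepancy(R, loadings))
```

Here only the first pair is misfit (0.60 observed against 0.8 × 0.7 = 0.56 reproduced), so the criterion is about 0.04² = 0.0016.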

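Principal Factor Method on R

Model: $R = \Lambda \Lambda' + \Psi$

Estimate $\Psi$ with

$$\hat{\Psi}^{(1)} = \mathrm{diag}\left(1 - R^2_{x_1,\ \text{all the other } x\text{'s}},\ \ldots,\ 1 - R^2_{x_p,\ \text{all the other } x\text{'s}}\right)$$

or equivalently, let the communality

$$\hat{h}_j^2 = R^2_{x_j,\ \text{all the other } x\text{'s}} = SMC_j$$

where $SMC_j$ is the squared multiple correlation of $x_j$ with all the other $x$'s.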

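  • Estimation using the adjusted squared multiple correlation (ASMC):

$$\hat{h}_j^2 = ASMC_j = \frac{\sum_{i=1}^{p} \max_{l \ne i} |r_{il}|}{\sum_{i=1}^{p} SMC_i}\; SMC_j$$

That is, the SMCs are rescaled so that their sum equals the sum of the maximum absolute correlations.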

To obtain a unique solution, let

$$\Lambda' \Lambda = D = \mathrm{diag}(d_1, \ldots, d_m)$$

Also, let $\Lambda = [\lambda_1, \ldots, \lambda_m]$, where $\lambda_k$ is the $k$th column of $\Lambda$.

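Since $R - \Psi = \Lambda \Lambda'$,

$$(R - \Psi)\Lambda = \Lambda \Lambda' \Lambda = \Lambda D$$

$$(R - \Psi)\lambda_k = d_k \lambda_k, \quad k = 1, \ldots, m$$

so $d_1, \ldots, d_m$ are the eigenvalues of $R - \Psi$ and $\lambda_1, \ldots, \lambda_m$ are the corresponding eigenvectors.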

  • Choose λ’s corresponding to the m largest nonnegative eigenvalues
  • May have to reduce m
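The principal factor steps can be sketched with NumPy as below. This is an illustrative sketch, not the lecture's own code: the function name `principal_factor` and the test loadings are hypothetical, the SMCs are computed from the standard identity SMC_j = 1 − 1/(R⁻¹)_jj, and the optional re-estimation of communalities over several passes (`n_iter` > 1) is an added assumption. Each eigenvector is scaled to squared length d_k so that Λ′Λ = D.

```python
import numpy as np

def principal_factor(R, m, n_iter=1):
    """Principal factor extraction on a correlation matrix R with m factors."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlations (SMC).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # R - Psi
        d, vecs = np.linalg.eigh(reduced)        # eigenvalues in ascending order
        top = np.argsort(d)[::-1][:m]            # indices of the m largest
        d, vecs = d[top], vecs[:, top]
        if np.any(d < 0):
            raise ValueError("fewer than m nonnegative eigenvalues; reduce m")
        loadings = vecs * np.sqrt(d)             # scale so Lambda' Lambda = D
        h2 = np.sum(loadings ** 2, axis=1)       # updated communalities
    return loadings, 1.0 - h2

# Hypothetical two-factor structure used to build a test correlation matrix.
true_loadings = np.array([
    [0.8, 0.1], [0.7, 0.2], [0.6, 0.1],
    [0.1, 0.9], [0.2, 0.7], [0.1, 0.6],
])
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

loadings, psi = principal_factor(R, m=2, n_iter=25)
print(np.round(loadings, 3))
print(np.round(psi, 3))
```

The extracted loadings are unrotated, so they need not resemble `true_loadings` column by column; only ΛΛ′ is determined by the method.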

Methods of Orthogonal Factor Rotation

  • Varimax
    • An orthogonal rotation method that minimizes the number of variables that have high loadings on each factor. It simplifies the interpretation of the factors.
  • Quartimax
    • A rotation method that minimizes the number of factors needed to explain each variable. It simplifies the interpretation of the observed variables.
  • Orthomax
    • A class of rotations that includes quartimax, varimax, and equamax as special cases.
  • Equamax
    • A rotation method that is a combination of the varimax method, which simplifies the factors, and the quartimax method, which simplifies the variables. The number of variables that load highly on a factor and the number of factors needed to explain a variable are minimized.
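The rotations above can be sketched as one orthomax routine with a weight `gamma`: 0 gives quartimax, 1 gives varimax, and m/2 gives equamax. The sketch below uses a commonly used SVD-based iteration; it is one possible implementation under these assumptions, not necessarily the algorithm used in any particular package, and the function name and loading values are hypothetical.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=200, tol=1e-10):
    """Orthogonally rotate a p x m loading matrix; returns (rotated, T)."""
    L = np.asarray(loadings, dtype=float)
    p, m = L.shape
    T = np.eye(m)                 # current rotation matrix
    prev = 0.0
    for _ in range(max_iter):
        B = L @ T                 # currently rotated loadings
        # Gradient-like matrix of the orthomax criterion with respect to T.
        G = L.T @ (B ** 3 - (gamma / p) * B * np.sum(B ** 2, axis=0))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                # nearest orthogonal matrix to G
        crit = s.sum()
        if crit <= prev * (1.0 + tol):   # stop when no further improvement
            break
        prev = crit
    return L @ T, T

# Hypothetical unrotated loadings for p = 4 variables and m = 2 factors.
unrotated = np.array([
    [0.64, 0.49],
    [0.64, 0.35],
    [0.71, -0.57],
    [0.57, -0.28],
])

varimax_loadings, T = orthomax(unrotated, gamma=1.0)
quartimax_loadings, _ = orthomax(unrotated, gamma=0.0)
equamax_loadings, _ = orthomax(unrotated, gamma=unrotated.shape[1] / 2)
print(np.round(varimax_loadings, 3))
```

Because T is orthogonal, the rotation changes how variance is spread across factors but leaves every communality unchanged.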

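Simple Structure of the Matrix of Factor Loadings

The matrix of factor loadings is

$$\Lambda = \begin{bmatrix}
\lambda_{11} & \lambda_{12} & \cdots & \lambda_{1m} \\
\lambda_{21} & \lambda_{22} & \cdots & \lambda_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
\lambda_{p1} & \lambda_{p2} & \cdots & \lambda_{pm}
\end{bmatrix}$$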