Canonical Correlation Analysis - Applied Multivariate Analysis | STAT 636, Study notes of Descriptive statistics

Material Type: Notes; Class: APPLIED MULTIVARIATE ANLY; Subject: STATISTICS; University: Texas A&M University; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009



LECTURE 9

CANONICAL CORRELATION ANALYSIS

Introduction

The concept of canonical correlation arises when we want to quantify the associations between two sets of variables.

For example, suppose that the first set of variables, labeled 'arithmetic', records $x_1$, the speed of an individual in working problems, and $x_2$, the accuracy. The second set of variables, labeled 'reading', consists of $x_3$, reading speed, and $x_4$, comprehension. We can examine the six pairwise correlations, but in addition we ask whether it makes sense to say that arithmetic is correlated with reading.

The answer is given by considering a linear combination of the arithmetic variables, say, u and a linear combination of the reading variables, say v and using their correlation to represent the association between the groups. Thus we construct

$$u = a_1 x_1 + a_2 x_2 \quad\text{and}\quad v = b_1 x_3 + b_2 x_4$$

and we seek coefficients so that this correlation is maximized. (NOTE: Every text I know of uses u and v for these variables. SAS PROC CANCORR uses v and w. That is OK but don't get confused.)

Development

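Suppose we have a vector of variables, $\mathbf{x}$, that consists of two sets of variables, $\mathbf{x}_1$ and $\mathbf{x}_2$, where $\mathbf{x}_1$ has length $p_1$ and $\mathbf{x}_2$ has length $p_2$. Assume that $p_1 \le p_2$. To develop the notation, let

$$\mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}, \qquad E[\mathbf{x}] = \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \qquad \mathrm{Var}(\mathbf{x}) = \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.$$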

The matrix $\Sigma_{12}$ gives the covariances between the variables in set one and set two; in correlation form it gives the correlations. When $p_1$ and $p_2$ are moderately large, examining the $p_1 p_2$ correlations and drawing conclusions is not an easy task. As an alternative, we consider linear combinations

$$u = \mathbf{a}^T \mathbf{x}_1 \quad\text{and}\quad v = \mathbf{b}^T \mathbf{x}_2.$$

Note that

$$\mathrm{Var}[u] = \mathbf{a}^T \Sigma_{11} \mathbf{a}, \qquad \mathrm{Var}[v] = \mathbf{b}^T \Sigma_{22} \mathbf{b}, \qquad \mathrm{Cov}[u, v] = \mathbf{a}^T \Sigma_{12} \mathbf{b}.$$

We want to determine the vectors a and b so that

$$\mathrm{Corr}[u, v] = \frac{\mathbf{a}^T \Sigma_{12} \mathbf{b}}{\sqrt{\mathbf{a}^T \Sigma_{11} \mathbf{a}}\,\sqrt{\mathbf{b}^T \Sigma_{22} \mathbf{b}}}$$

is as large as possible. To this end, we determine a and b as the solution to the problem

$$\text{maximize } \mathbf{a}^T \Sigma_{12} \mathbf{b} \quad \text{subject to: } \mathbf{a}^T \Sigma_{11} \mathbf{a} = 1, \quad \mathbf{b}^T \Sigma_{22} \mathbf{b} = 1.$$
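The correlation formula can be evaluated directly from the covariance blocks. The following is a minimal sketch, assuming NumPy; the function name `corr_uv` and the weight vectors are illustrative choices, and the blocks are those of the reading-arithmetic correlation matrix from the Example section. It also shows that rescaling the weights leaves the correlation unchanged, which is why the unit-variance constraints cost nothing.

```python
import numpy as np

def corr_uv(a, b, S11, S12, S22):
    """Correlation of u = a'x1 and v = b'x2 computed from the covariance blocks."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cov_uv = a @ S12 @ b          # Cov[u, v] = a' S12 b
    var_u = a @ S11 @ a           # Var[u]    = a' S11 a
    var_v = b @ S22 @ b           # Var[v]    = b' S22 b
    return cov_uv / np.sqrt(var_u * var_v)

# Blocks of the correlation matrix used in the Example section
S11 = np.array([[1.0, 0.4], [0.4, 1.0]])
S12 = np.array([[0.5, 0.6], [0.3, 0.4]])
S22 = np.array([[1.0, 0.2], [0.2, 1.0]])

a = np.array([1.0, 1.0])   # arbitrary illustrative weights, not the optimal ones
b = np.array([1.0, 1.0])

r = corr_uv(a, b, S11, S12, S22)
r_scaled = corr_uv(3.0 * a, 0.5 * b, S11, S12, S22)  # positive rescaling of a and b
print(r, r_scaled)
```

Because the two printed values agree, fixing the variances of u and v at one only removes the arbitrary scale of the weights.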

The variables so determined are called the first pair of canonical variables, $u_1$ and $v_1$. The second pair of canonical variables, $u_2$ and $v_2$, are similarly determined as linear combinations of $\mathbf{x}_1$ and $\mathbf{x}_2$ with unit variance and maximum correlation among all such variables that are uncorrelated with the first pair. This reminds us of the discussion of principal components and leads to the determination of eigenvalues and eigenvectors.

The solution leads us to the stationary equations,

$$\Sigma_{12} \mathbf{b} - \lambda \Sigma_{11} \mathbf{a} = \mathbf{0}$$

$$\Sigma_{21} \mathbf{a} - \theta \Sigma_{22} \mathbf{b} = \mathbf{0}$$

Multiplying the first equation by $\mathbf{a}^T$ and the second by $\mathbf{b}^T$ shows that

$$\lambda = \theta = \mathbf{a}^T \Sigma_{12} \mathbf{b}.$$

We thus seek $\lambda$ so that

$$\begin{vmatrix} -\lambda \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda \Sigma_{22} \end{vmatrix} = 0.$$
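One way to reach the stationary equations and the determinant condition is sketched below. It assumes the constraints are handled with Lagrange multipliers written as $\lambda/2$ and $\theta/2$; the notes do not show this step, and the function $L$ is introduced here only for illustration.

$$
\begin{aligned}
L(\mathbf{a}, \mathbf{b}) &= \mathbf{a}^T \Sigma_{12} \mathbf{b} - \tfrac{\lambda}{2}\left(\mathbf{a}^T \Sigma_{11} \mathbf{a} - 1\right) - \tfrac{\theta}{2}\left(\mathbf{b}^T \Sigma_{22} \mathbf{b} - 1\right) \\
\frac{\partial L}{\partial \mathbf{a}} &= \Sigma_{12} \mathbf{b} - \lambda \Sigma_{11} \mathbf{a} = \mathbf{0}, \qquad
\frac{\partial L}{\partial \mathbf{b}} = \Sigma_{21} \mathbf{a} - \theta \Sigma_{22} \mathbf{b} = \mathbf{0} \\
\mathbf{a}^T \Sigma_{12} \mathbf{b} &= \lambda\, \mathbf{a}^T \Sigma_{11} \mathbf{a} = \lambda, \qquad
\mathbf{b}^T \Sigma_{21} \mathbf{a} = \theta\, \mathbf{b}^T \Sigma_{22} \mathbf{b} = \theta \\
\begin{bmatrix} -\lambda \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda \Sigma_{22} \end{bmatrix}
\begin{bmatrix} \mathbf{a} \\ \mathbf{b} \end{bmatrix} &= \mathbf{0} \quad \text{(using } \theta = \lambda \text{)}
\end{aligned}
$$

Since $\mathbf{b}^T \Sigma_{21} \mathbf{a} = \mathbf{a}^T \Sigma_{12} \mathbf{b}$, the third line gives $\lambda = \theta$. The last line is a homogeneous linear system, and it has a nonzero solution only when the determinant of its coefficient matrix is zero.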

The following result is useful: If the matrix $A$ is written in partitioned form as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$

then

$$|A| = |A_{11}|\,|A_{22} - A_{21} A_{11}^{-1} A_{12}| = |A_{22}|\,|A_{11} - A_{12} A_{22}^{-1} A_{21}|.$$
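The partitioned-determinant identity is easy to check numerically. The following is a minimal sketch, assuming NumPy and a randomly generated positive definite matrix (so that both diagonal blocks are invertible); the block sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
p1, p2 = 2, 3

# Symmetric positive definite test matrix, so A11 and A22 are invertible
M = rng.standard_normal((p1 + p2, p1 + p2))
A = M @ M.T + np.eye(p1 + p2)

A11, A12 = A[:p1, :p1], A[:p1, p1:]
A21, A22 = A[p1:, :p1], A[p1:, p1:]

det_full = np.linalg.det(A)
# First form: |A11| |A22 - A21 A11^{-1} A12|
det_form1 = np.linalg.det(A11) * np.linalg.det(A22 - A21 @ np.linalg.solve(A11, A12))
# Second form: |A22| |A11 - A12 A22^{-1} A21|
det_form2 = np.linalg.det(A22) * np.linalg.det(A11 - A12 @ np.linalg.solve(A22, A21))

print(det_full, det_form1, det_form2)
```

All three printed values should agree up to floating-point rounding.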

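Applying the second form of this to our matrix we have

$$
\begin{aligned}
\begin{vmatrix} -\lambda \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda \Sigma_{22} \end{vmatrix}
&= |-\lambda \Sigma_{22}|\,\left|-\lambda \Sigma_{11} + \lambda^{-1} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}\right| \\
&= \lambda^{p_2 - p_1}\,|-\Sigma_{22}|\,\left|\Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} - \lambda^2 \Sigma_{11}\right| \\
&= \lambda^{p_2 - p_1}\,|-\Sigma_{22}|\,|\Sigma_{11}|\,\left|\Sigma_{11}^{-1} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} - \lambda^2 I\right|.
\end{aligned}
$$

The factor $\lambda^{p_2 - p_1}$ comes from pulling $\lambda$ out of the two determinants and equals one when $p_1 = p_2$. For $\lambda \neq 0$ the determinant therefore vanishes exactly when $\lambda^2$ is an eigenvalue of $\Sigma_{11}^{-1} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$.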

We can continue this for all non-zero eigenvalues.
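The link between the determinant equation and the eigenvalue problem can be checked numerically. The following is a minimal sketch, assuming NumPy and using the blocks of the correlation matrix from the Example section in place of the covariance blocks; the helper name `block_det` is illustrative.

```python
import numpy as np

S11 = np.array([[1.0, 0.4], [0.4, 1.0]])
S12 = np.array([[0.5, 0.6], [0.3, 0.4]])
S22 = np.array([[1.0, 0.2], [0.2, 1.0]])
S21 = S12.T

def block_det(lam):
    """Determinant of [[-lam*S11, S12], [S21, -lam*S22]]."""
    top = np.hstack([-lam * S11, S12])
    bottom = np.hstack([S21, -lam * S22])
    return np.linalg.det(np.vstack([top, bottom]))

# Eigenvalues of S11^{-1} S12 S22^{-1} S21 are the candidate values of lambda^2
lam_sq = np.linalg.eigvals(np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S21)).real
lam_sq = np.sort(lam_sq)[::-1]
lams = np.sqrt(lam_sq)

# The block determinant should vanish at each lambda, but not at an arbitrary value
dets_at_roots = [block_det(lam) for lam in lams]
det_elsewhere = block_det(0.5)
print(lams, dets_at_roots, det_elsewhere)
```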

Summary

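The canonical variable pairs, $u_i = \mathbf{a}_i^T \mathbf{x}_1$ and $v_i = \mathbf{b}_i^T \mathbf{x}_2$, as determined have the following properties:

$$\mathrm{Corr}(u_i, v_i) = \lambda_i, \qquad \mathrm{Corr}(u_i, u_j) = 0, \qquad \mathrm{Corr}(v_i, v_j) = 0, \qquad \mathrm{Corr}(u_i, v_j) = 0 \quad \text{for } i \neq j.$$

These properties can be summarized by the correlation matrix

$$R_{uv} = \begin{bmatrix} I_{p_1} & \mathrm{Diag}(\lambda_i) \\ \mathrm{Diag}(\lambda_i) & I_{p_2} \end{bmatrix}$$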

Example

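Returning to the reading-arithmetic example, suppose the sample correlation matrix is given by

$$R = \begin{bmatrix} 1 & .4 & .5 & .6 \\ .4 & 1 & .3 & .4 \\ .5 & .3 & 1 & .2 \\ .6 & .4 & .2 & 1 \end{bmatrix}, \qquad R_{11} = \begin{bmatrix} 1 & .4 \\ .4 & 1 \end{bmatrix}, \qquad R_{12} = \begin{bmatrix} .5 & .6 \\ .3 & .4 \end{bmatrix},$$

$$R_{22} = \begin{bmatrix} 1 & .2 \\ .2 & 1 \end{bmatrix}, \qquad R_{21} = \begin{bmatrix} .5 & .3 \\ .6 & .4 \end{bmatrix}.$$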

Note that it is best to apply the results to standardized data and hence we use the correlation matrix. We may then compute

$$A = R_{11}^{-1} R_{12} R_{22}^{-1} R_{21} = \begin{bmatrix} .452 & .289 \\ .146 & .095 \end{bmatrix}$$

and

$$B = R_{22}^{-1} R_{21} R_{11}^{-1} R_{12} = \begin{bmatrix} .206 & .251 \\ .278 & .340 \end{bmatrix}.$$

The eigenvalues of these two matrices are the same, that is, $\lambda_1^2 = .5457$ and $\lambda_2^2 = .0009$. The eigenvectors of $A$ and $B$ are the columns of the matrices VecA and VecB, respectively.
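The two matrix products and their shared eigenvalues can be reproduced in a few lines. The following is a minimal sketch, assuming NumPy; it uses `np.linalg.solve` instead of forming explicit inverses, which is a numerical convenience rather than anything the notes prescribe.

```python
import numpy as np

R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R21 = R12.T

# A = R11^{-1} R12 R22^{-1} R21 and B = R22^{-1} R21 R11^{-1} R12
A = np.linalg.solve(R11, R12) @ np.linalg.solve(R22, R21)
B = np.linalg.solve(R22, R21) @ np.linalg.solve(R11, R12)

# Squared canonical correlations, sorted from largest to smallest
eig_A = np.sort(np.linalg.eigvals(A).real)[::-1]
eig_B = np.sort(np.linalg.eigvals(B).real)[::-1]

print(np.round(A, 3))
print(np.round(B, 3))
print(np.round(eig_A, 4), np.round(eig_B, 4))
```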

Recall that we have specified that the variances of the $u_i$ and $v_i$ must be one. That is,

$$\mathbf{a}_i^T R_{11} \mathbf{a}_i = 1 \quad\text{and}\quad \mathbf{b}_i^T R_{22} \mathbf{b}_i = 1.$$

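The eigenvectors as determined are normalized to have length one but do not satisfy this condition, so they must be rescaled. The scaled eigenvectors are given by

$$A = \mathrm{VecA}\,\left[\mathrm{Diag}\!\left(\mathrm{VecA}^T R_{11}\, \mathrm{VecA}\right)\right]^{-1/2} \quad\text{and}\quad B = \mathrm{VecB}\,\left[\mathrm{Diag}\!\left(\mathrm{VecB}^T R_{22}\, \mathrm{VecB}\right)\right]^{-1/2},$$

where $A$ and $B$ now denote the scaled coefficient matrices rather than the matrix products above. Thus,

$$A = \begin{bmatrix} .856 & -.677 \\ .278 & 1.056 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} .545 & -.863 \\ .737 & .706 \end{bmatrix}.$$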

It follows that, in terms of the standardized variables $z_j$, the first canonical pair is defined by

$$u_1 = .856 z_1 + .278 z_2, \qquad v_1 = .545 z_3 + .737 z_4$$

with correlation $\rho_1 = \sqrt{.5457} = .739$.

The second canonical pair is defined by

$$u_2 = -.677 z_1 + 1.056 z_2, \qquad v_2 = -.863 z_3 + .706 z_4$$

with correlation $\rho_2 = \sqrt{.0009} = .03$.

We see that the first pair captures most of the relation between arithmetic and reading. The canonical variate for arithmetic, $u_1$, places over three times as much weight on speed as it does on accuracy, and the canonical variate for reading, $v_1$, puts more weight on comprehension than on speed, in the proportion 4:3. Note that this does not say, for example, that speed is three times as important as accuracy in arithmetic. It simply says that if we are asking for a measure of the relation between arithmetic and reading, these functions provide the essential component of that relation.
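The whole example can be reproduced numerically. The following is a minimal sketch, assuming NumPy; the helper `scaled_eigenvectors` and the sign convention (last coefficient of each vector made positive) are illustrative assumptions, since eigenvectors are determined only up to sign.

```python
import numpy as np

R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R21 = R12.T

def scaled_eigenvectors(M, R):
    """Eigenvectors of M, sorted by decreasing eigenvalue, scaled so that c' R c = 1."""
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(vals.real)[::-1]
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Divide each column c by sqrt(c' R c) so the canonical variate has unit variance
    vecs = vecs / np.sqrt(np.diag(vecs.T @ R @ vecs))
    # Sign convention (an assumption): make the last coefficient of each column positive
    vecs = vecs * np.sign(vecs[-1, :])
    return vals, vecs

M_A = np.linalg.solve(R11, R12) @ np.linalg.solve(R22, R21)
M_B = np.linalg.solve(R22, R21) @ np.linalg.solve(R11, R12)

lam_sq, A = scaled_eigenvectors(M_A, R11)   # columns of A: weights for u_1, u_2
_, B = scaled_eigenvectors(M_B, R22)        # columns of B: weights for v_1, v_2
rho = np.sqrt(lam_sq)                       # canonical correlations

print(np.round(A, 3))
print(np.round(B, 3))
print(np.round(rho, 3))
# Cross-correlations of the canonical variates; off-diagonal entries should be near zero
print(np.round(A.T @ R12 @ B, 3))
```

The columns of `A` and `B` should match the coefficients of the canonical pairs above to about three decimals, with small differences attributable to rounding in the notes.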

Interpretation of Canonical Variables

In general, the canonical variables are artificial and may have no physical meaning. The interpretation is often aided by computing the correlation between the original variables and the canonical variables. To do this, note that the canonical variables are related to the original variables by the equations,

$$\mathbf{u} = A^T \mathbf{z}_1 \quad\text{and}\quad \mathbf{v} = B^T \mathbf{z}_2$$
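Because the $z_j$ are standardized and the canonical variables have unit variance, covariances between them are already correlations, so for instance $\mathrm{Corr}(\mathbf{u}, \mathbf{z}_1) = A^T R_{11}$ and $\mathrm{Corr}(\mathbf{u}, \mathbf{z}_2) = A^T R_{12}$. The following is a minimal sketch of that computation, assuming NumPy and the rounded coefficient matrices from the example; the variable names are illustrative.

```python
import numpy as np

R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R21 = R12.T

# Scaled coefficient matrices from the example (rounded to three decimals in the notes)
A = np.array([[0.856, -0.677], [0.278, 1.056]])
B = np.array([[0.545, -0.863], [0.737, 0.706]])

# Row i of each result holds the correlations of the i-th canonical variable
# with the original standardized variables of the indicated set.
corr_u_z1 = A.T @ R11   # arithmetic variates with arithmetic variables
corr_v_z2 = B.T @ R22   # reading variates with reading variables
corr_u_z2 = A.T @ R12   # arithmetic variates with reading variables
corr_v_z1 = B.T @ R21   # reading variates with arithmetic variables

print(np.round(corr_u_z1, 2))
print(np.round(corr_v_z2, 2))
print(np.round(corr_u_z2, 2))
print(np.round(corr_v_z1, 2))
```

Large entries in a row indicate which original variables a canonical variable most closely tracks, which is often easier to read than the raw weights.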