Partial Least Squares Regression: A Simple Two-Dimensional Example - Prof. Nam Sun Wang, Study notes of Chemistry

This document explains partial least squares regression with a simple two-dimensional example. The instructor, Nam Sun Wang, outlines the steps: mean-centering, variance-scaling, finding eigenvalues and eigenvectors, computing score and loading vectors, and regression. The document also includes checks for orthogonality and the residual matrices.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

Partial Least Squares Regression -- a simple two-dimensional example.
Worksheet file: pca5.mcd
Instructor: Nam Sun Wang
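Step 0. Generate X and Y Data. The two independent variables $x^{\langle 0 \rangle}$ and $x^{\langle 1 \rangle}$ are somewhat dependent. Although $x^{\langle 0 \rangle}$ is about 2 times $x^{\langle 1 \rangle}$, this difference in magnitude is absorbed during variance-scaling.

Number of points: $N := 50$, $i := 0..N$

Dimension: $m := 1$, $j := 0..m$

Each point $X^{\langle i \rangle}$ is built from uniform random terms of the form $(\mathrm{rnd}(1) - 0.5)$, combined with the multipliers 2, 1, and 0, and the result is transposed, $X := X^T$, so that row $i$ of $X$ holds point $i$.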
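The dependent variable $Y$ depends on $x^{\langle 0 \rangle}$ and $x^{\langle 1 \rangle}$ with a 2:1 ratio. The relative contribution to $Y$ is proportional to the standard deviation $x_{\mathrm{stdev}}$ and the multiplicative coefficient $a$.

$$a := \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \qquad y(x) := x \cdot a$$

$$Y_i := a_0 \cdot X_{i,0} + a_1 \cdot X_{i,1} + (\mathrm{rnd}(0.1) - 0.05)$$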
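
The data generation of Step 0 can be sketched in plain Python as follows. This is only an illustration: the rule that couples `x0` to `x1` is an assumption chosen to make `x0` roughly twice `x1`, the helper `rnd` is a hypothetical stand-in for Mathcad's `rnd`, and the seed is arbitrary. Only the formula for `Y` is taken directly from the text.

```python
import random

random.seed(0)                  # arbitrary seed, only for repeatability
N = 50                          # the worksheet uses i = 0..N, so N + 1 points
a = [2.0, 1.0]                  # coefficients of y(x) = x . a


def rnd(scale):
    """Stand-in for Mathcad's rnd(scale): uniform on [0, scale)."""
    return scale * random.random()


X, Y = [], []
for i in range(N + 1):
    x1 = rnd(1) - 0.5                   # uniform on [-0.5, 0.5)
    x0 = 2 * x1 + (rnd(1) - 0.5)        # ASSUMED coupling: roughly 2 * x1 plus noise
    X.append([x0, x1])
    Y.append(a[0] * x0 + a[1] * x1 + (rnd(0.1) - 0.05))

print(len(X), "points; first row:", X[0], "->", Y[0])
```

Each row of `X` is one point, which matches the layout of $X$ after the transpose in the worksheet.
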
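Step 1a. Mean-Centering:

$$x_{\mathrm{mean},j} := \mathrm{mean}(X^{\langle j \rangle}), \qquad x_{\mathrm{mean}} := x_{\mathrm{mean}}^T, \qquad X^{\langle j \rangle} := X^{\langle j \rangle} - (x_{\mathrm{mean}})_{0,j}$$

$$x_{\mathrm{mean}} = (4.487 \times 10^{-4} \quad 0.015)$$

$$y_{\mathrm{mean}} := \mathrm{mean}(Y), \qquad Y := Y - y_{\mathrm{mean}}, \qquad y_{\mathrm{mean}} = 0.02$$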
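Step 1b. Variance-Scaling:

$$x_{\mathrm{stdev},j} := \mathrm{stdev}(X^{\langle j \rangle}), \qquad x_{\mathrm{stdev}} := x_{\mathrm{stdev}}^T, \qquad X^{\langle j \rangle} := \frac{X^{\langle j \rangle}}{(x_{\mathrm{stdev}})_{0,j}}$$

$$x_{\mathrm{stdev}} = (0.913 \quad 0.358)$$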
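
Steps 1a and 1b can be sketched together as below. The four data points are made up for illustration, and the use of the population standard deviation (`pstdev`, dividing by the number of points) is an assumption about how `stdev` behaves in the worksheet; swap in `statistics.stdev` if the sample form is intended.

```python
from statistics import mean, pstdev

# Made-up data: four points, two variables (rows are points).
X = [[2.0, 1.1], [-1.0, -0.4], [0.5, 0.2], [-1.5, -0.9]]
Y = [5.0, -2.5, 1.3, -3.8]

columns = list(zip(*X))
x_mean = [mean(c) for c in columns]
x_stdev = [pstdev(c) for c in columns]      # population form, an assumption
y_mean = mean(Y)

# Steps 1a and 1b applied together to X; Y is only mean-centered.
X_scaled = [[(v - m) / s for v, m, s in zip(row, x_mean, x_stdev)] for row in X]
Y_centered = [y - y_mean for y in Y]

print(x_mean, x_stdev, y_mean)
```

After this step every column of `X_scaled` has zero mean and unit standard deviation, which is why the factor of 2 between the two variables no longer matters.
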
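Step 2. Find the eigenvalues and eigenvectors of the mean-centered and variance-scaled covariance matrix $X^T Y Y^T X$.

Covariance matrix:

$$W := X^T \cdot Y \cdot Y^T \cdot X, \qquad |W| = 4.195 \times 10^{-8} \leftarrow \text{Singular.}$$

$$W = \begin{pmatrix} 1.047 \times 10^{4} & 6.59 \times 10^{3} \\ 6.59 \times 10^{3} & 4.149 \times 10^{3} \end{pmatrix}$$

Eigenvalue/eigenvector:

$$\lambda := \mathrm{reverse}(\mathrm{sort}(\mathrm{eigenvals}(W))), \qquad \lambda^T = (1.462 \times 10^{4} \quad 3.052 \times 10^{-12})$$

$$w^{\langle j \rangle} := \mathrm{eigenvec}(W, \lambda_j), \qquad w^{\langle 0 \rangle} = \begin{pmatrix} 0.846 \\ 0.533 \end{pmatrix}$$

Because $W$ is symmetric, its eigenvectors are orthogonal, so the second eigenvector has entries of magnitude 0.533 and 0.846 with opposite signs: $w^{\langle 1 \rangle} = \pm(-0.533,\ 0.846)^T$.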
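
As a rough check of Step 2, the closed-form eigen-decomposition of a symmetric 2x2 matrix can be applied to the printed $W$. This is a sketch, not the worksheet's own method (which calls `eigenvals` and `eigenvec`). Because the entries of $W$ are rounded to about four digits, the small eigenvalue comes out near 1 rather than near $10^{-12}$; the large eigenvalue and its eigenvector are recovered well.

```python
import math

# W as printed in the worksheet (entries rounded to about four digits).
W = [[1.047e4, 6.59e3],
     [6.59e3, 4.149e3]]

trace = W[0][0] + W[1][1]
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
disc = math.sqrt(trace * trace - 4.0 * det)
lam = [(trace + disc) / 2.0, (trace - disc) / 2.0]   # largest eigenvalue first

# Eigenvector for the largest eigenvalue: (W01, lam0 - W00), normalized.
vx, vy = W[0][1], lam[0] - W[0][0]
length = math.hypot(vx, vy)
w0 = [vx / length, vy / length]

print(lam, w0)
```

The matrix is singular because $W = (X^T Y)(X^T Y)^T$ has rank one when $Y$ is a single column, so only one eigenvalue is nonzero.
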
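Correct the weighting vector $w^{\langle 0 \rangle}$.

$$w^{\langle 0 \rangle} := w^{\langle 0 \rangle} \cdot \frac{\left| X^T \cdot X \cdot w^{\langle 0 \rangle} \right|}{w^{\langle 0 \rangle T} \cdot X^T \cdot X \cdot w^{\langle 0 \rangle}}$$

$$w^{\langle 0 \rangle} = \begin{pmatrix} 0.855 \\ 0.539 \end{pmatrix}, \qquad \left| w^{\langle 0 \rangle} \right| = 1.011$$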
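
One way to read this correction, assuming the worksheet follows the common NIPALS convention of rescaling the weight vector by the length of the unnormalized X loading (the worksheet does not say so explicitly), is the following short derivation. Here $t_{\mathrm{old}}$ and $p_{\mathrm{old}}$ are illustrative names that do not appear in the worksheet.

```latex
\begin{aligned}
t_{\mathrm{old}} &= X w, \qquad
p_{\mathrm{old}} = \frac{X^T t_{\mathrm{old}}}{t_{\mathrm{old}}^T t_{\mathrm{old}}}
                 = \frac{X^T X w}{w^T X^T X w}, \\
\left| p_{\mathrm{old}} \right| &= \frac{\left| X^T X w \right|}{w^T X^T X w}, \qquad
w_{\mathrm{new}} = w \left| p_{\mathrm{old}} \right|, \qquad
t_{\mathrm{new}} = X w_{\mathrm{new}} = t_{\mathrm{old}} \left| p_{\mathrm{old}} \right| .
\end{aligned}
```

Under this reading the scalar 1.011 is the length of the unnormalized loading, and the rescaled score satisfies $t_{\mathrm{new}}^T t_{\mathrm{new}} = |X^T t_{\mathrm{new}}|$, which is what allows the loading to be normalized to unit length later without changing the product of score and loading.
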
Step 3. Score and loading vectors for X and Y.

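score (column) vector for X: $t^{\langle 0 \rangle} := X \cdot w^{\langle 0 \rangle}$

loading (row) vector for Y: $q^{\langle 0 \rangle} := Y^T \cdot t^{\langle 0 \rangle}$, normalize: $q^{\langle 0 \rangle} := \dfrac{q^{\langle 0 \rangle}}{\left| q^{\langle 0 \rangle} \right|}$, $q = 1$

score (column) vector for Y: $u^{\langle 0 \rangle} := Y \cdot q^{\langle 0 \rangle}$

loading (row) vector for X: $p^{\langle 0 \rangle} := X^T \cdot t^{\langle 0 \rangle}$, normalize: $p^{\langle 0 \rangle} := \dfrac{p^{\langle 0 \rangle}}{\left| p^{\langle 0 \rangle} \right|}$, $p = \ldots$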
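
The four vectors of Step 3 can be sketched in plain Python. The data are made up and already mean-centered, the helper names `dot`, `unit`, and `col` are hypothetical, and the weight correction is left out because it only rescales `w0` and `t0` without changing any direction.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]


def col(M, j):
    return [row[j] for row in M]


# Made-up, already mean-centered data (each column sums to zero).
X = [[1.2, 0.9], [-0.8, -1.1], [0.5, 1.0], [-0.9, -0.8]]
Y = [3.1, -2.6, 2.1, -2.6]

# For a single response, the dominant eigenvector of X^T Y Y^T X is X^T Y / |X^T Y|.
w0 = unit([dot(col(X, j), Y) for j in range(2)])

t0 = [dot(row, w0) for row in X]                      # X score: t = X w
q0 = unit([dot(Y, t0)])                               # Y loading: Y^T t, normalized
u0 = [y * q0[0] for y in Y]                           # Y score: u = Y q
p0 = unit([dot(col(X, j), t0) for j in range(2)])     # X loading: X^T t, normalized

print("w0 =", w0, "q0 =", q0, "p0 =", p0)
```

With a single response column, `q0` is a one-element vector that normalizes to 1, so the Y score `u0` is just `Y` itself. This agrees with the worksheet's $q = 1$.
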

A comparison of different vectors.

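Eigenvalue/eigenvector of the covariance matrix $X^T \cdot X$ -- Principal Component Regression (PCR).

$V := X^T \cdot X$

$\lambda_v := \mathrm{reverse}(\mathrm{sort}(\mathrm{eigenvals}(V)))$, $\lambda_v^T = \ldots$

$v^{\langle j \rangle} := \mathrm{eigenvec}(V, \lambda_{v,j})$, $v = \ldots$

[Figure: scatter plot of $X^{\langle 1 \rangle}$ against $X^{\langle 0 \rangle}$, both axes from -3 to 3, with $xx := -3..3$. Traces: Data ($X^{\langle 1 \rangle}_i$ vs. $X^{\langle 0 \rangle}_i$); One Particular Point ($X^{\langle 1 \rangle}_0$ vs. $X^{\langle 0 \rangle}_0$); Principal Component v (line $\frac{v^{\langle 0 \rangle}_1}{v^{\langle 0 \rangle}_0} \cdot xx$); Relative Contribution to Y (line $\frac{a_1 \cdot (x_{\mathrm{stdev}})_{0,1}}{a_0 \cdot (x_{\mathrm{stdev}})_{0,0}} \cdot xx$); Weighting Vector w (line $\frac{w^{\langle 0 \rangle}_1}{w^{\langle 0 \rangle}_0} \cdot xx$); Loading Vector p (line $\frac{p^{\langle 0 \rangle}_1}{p^{\langle 0 \rangle}_0} \cdot xx$).]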
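
To see why the principal component and the PLS weighting vector point in different directions, it may help to work out the PCR eigenvectors for two variance-scaled variables. The sketch below assumes both columns of $X$ have the same sum of squares $s$ after Step 1b and writes $r$ for their correlation; both symbols are introduced here for illustration.

```latex
V = X^T X = s \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
V \begin{pmatrix} 1 \\ 1 \end{pmatrix} = s(1 + r) \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
V \begin{pmatrix} 1 \\ -1 \end{pmatrix} = s(1 - r) \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
```

For $r > 0$ the first principal component is therefore $(1, 1)^T / \sqrt{2}$, a line of slope 1 regardless of $Y$. The PLS weighting vector from Step 2 has slope $0.533 / 0.846 \approx 0.63$ instead, because it is pulled toward the variable that contributes more to $Y$.
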

Step 4. Regression.

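$b_0 := \mathrm{slope}(t^{\langle 0 \rangle}, u^{\langle 0 \rangle})$, $b = 1.623$, $tt := -3, -2.9 .. 3$

[Figure: 0th score of y (vertical axis, -5 to 5) against 0th score of x (horizontal axis, -2 to 2). Traces: $u^{\langle 0 \rangle}_i$ vs. $t^{\langle 0 \rangle}_i$ (all points); $u^{\langle 0 \rangle}_0$ vs. $t^{\langle 0 \rangle}_0$ (one particular point); $b_0 \cdot tt$ vs. $tt$ (regression line).]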
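
The regression coefficient in Step 4 is an ordinary least-squares slope of the Y score against the X score. A minimal sketch is given below; the function is written to mirror what Mathcad's `slope` is understood to compute (a line fitted with an intercept), and the score values are made up.

```python
def slope(vx, vy):
    """Least-squares slope of vy against vx, fitted with an intercept."""
    n = len(vx)
    mx = sum(vx) / n
    my = sum(vy) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(vx, vy))
    sxx = sum((x - mx) ** 2 for x in vx)
    return sxy / sxx


# Made-up scores: u is roughly 1.6 times t.
t = [-1.5, -0.5, 0.0, 0.7, 1.3]
u = [-2.4, -0.9, 0.1, 1.1, 2.1]
b = slope(t, u)
print(b)
```

Because the scores are built from mean-centered data, their means are zero and the slope reduces to $(t \cdot u) / (t \cdot t)$.
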

Step 5. Compute the residual matrices

$E := X - t^{\langle 0 \rangle} \cdot p^{\langle 0 \rangle T}$

$F := Y - t^{\langle 0 \rangle} \cdot q^{\langle 0 \rangle T} \cdot b_0$
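
The residual step can be sketched as below, together with an orthogonality check of the kind the document's summary mentions. The data are made up and already mean-centered, the helper names are hypothetical, and the weight correction is written in the form reconstructed from the garbled worksheet formula, so treat it as an assumption.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def col(M, j):
    return [row[j] for row in M]


# Made-up, already mean-centered data (each column sums to zero).
X = [[1.2, 0.9], [-0.8, -1.1], [0.5, 1.0], [-0.9, -0.8]]
Y = [3.1, -2.6, 2.1, -2.6]
m = 2

c = [dot(col(X, j), Y) for j in range(m)]             # X^T Y
w = [v / math.sqrt(dot(c, c)) for v in c]             # unit weight vector
t = [dot(row, w) for row in X]
xt = [dot(col(X, j), t) for j in range(m)]            # X^T t

# Rescale w and t so that t.t equals |X^T t| (the weight correction).
s = math.sqrt(dot(xt, xt)) / dot(t, t)
w = [s * v for v in w]
t = [s * v for v in t]

p = [v / math.sqrt(dot(xt, xt)) for v in xt]          # unit loading for X
b = dot(t, Y) / dot(t, t)                             # slope of u = Y q on t, with q = 1

E = [[row[j] - ti * p[j] for j in range(m)] for row, ti in zip(X, t)]
F = [y - b * ti for y, ti in zip(Y, t)]

print([dot(col(E, j), t) for j in range(m)], dot(F, t))   # all close to zero
```

Both residuals come out orthogonal to the score `t`. Without the rescaling of `w`, subtracting `t` times a unit-length loading would not remove the full component from `X`.
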

Step 6. Goodness of fit.

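$sse_{\mathrm{old}} := Y \cdot Y$, $sse_{\mathrm{old}} = \ldots$

$sse := F \cdot F$, $sse = 11.\ldots$

$r2 := \dfrac{sse_{\mathrm{old}} - sse}{sse_{\mathrm{old}}}$, $r2 = 94.322\ \%$

$r := \sqrt{r2}$, $r = 97.119\ \%$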
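
The goodness-of-fit measure is simple enough to sketch directly. The function name and the sums of squares below are made up for illustration; the last line only checks that the printed values of $r$ are the square roots of the printed values of $r2$.

```python
import math


def goodness_of_fit(sse_old, sse):
    """Return (r2, r): the explained fraction of the sum of squares and its square root."""
    r2 = (sse_old - sse) / sse_old
    return r2, math.sqrt(r2)


# Made-up sums of squares: 200 before fitting, 12 left in the residual.
r2, r = goodness_of_fit(200.0, 12.0)
print(f"r2 = {r2:.3%}, r = {r:.3%}")

# Consistency check on the worksheet's printed values: r should be sqrt(r2).
print(math.sqrt(0.94322), math.sqrt(0.9998))
```
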

2nd Iteration. We start with the residual matrices E (independent variables) and F (dependent variable).

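Covariance matrix: $W := E^T \cdot F \cdot F^T \cdot E$, $W = \ldots$

Eigenvalue: $\lambda := \mathrm{reverse}(\mathrm{sort}(\mathrm{eigenvals}(W)))$, $\lambda^T = (318.022 \quad 0)$

Eigenvector: $w^{\langle 1 \rangle} := \mathrm{eigenvec}(W, \lambda_0)$, $w = \ldots$

score (column) vector for X: $t^{\langle 1 \rangle} := E \cdot w^{\langle 1 \rangle}$

loading (row) vector for Y: $q^{\langle 1 \rangle} := F^T \cdot t^{\langle 1 \rangle}$, normalize: $q^{\langle 1 \rangle} := \dfrac{q^{\langle 1 \rangle}}{\left| q^{\langle 1 \rangle} \right|}$, $q = (1 \quad 1)$

score (column) vector for Y: $u^{\langle 1 \rangle} := F \cdot q^{\langle 1 \rangle}$

loading (row) vector for X: $p^{\langle 1 \rangle} := E^T \cdot t^{\langle 1 \rangle}$, normalize: $p^{\langle 1 \rangle} := \dfrac{p^{\langle 1 \rangle}}{\left| p^{\langle 1 \rangle} \right|}$, $p = \ldots$

Regression: $b_1 := \mathrm{slope}(t^{\langle 1 \rangle}, u^{\langle 1 \rangle})$, $b = \ldots$, $tt := -3, -2.9 .. 3$

[Figure: 1st score of y (vertical axis, -4 to 4) against 1st score of x (horizontal axis, -2 to 2). Traces: $u^{\langle 1 \rangle}_i$ vs. $t^{\langle 1 \rangle}_i$ (all points); $u^{\langle 1 \rangle}_0$ vs. $t^{\langle 1 \rangle}_0$ (one particular point); $b_1 \cdot tt$ vs. $tt$ (regression line).]

Residual matrices: $E := E - t^{\langle 1 \rangle} \cdot p^{\langle 1 \rangle T}$

$F := F - t^{\langle 1 \rangle} \cdot q^{\langle 1 \rangle T} \cdot b_1$

Goodness of fit: $sse := F \cdot F$, $sse = 0.\ldots$

$r2 := \dfrac{sse_{\mathrm{old}} - sse}{sse_{\mathrm{old}}}$, $r2 = 99.98\ \%$

$r := \sqrt{r2}$, $r = 99.99\ \%$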

Step 7. Regression Model. (Be sure to take care of both mean-centering and variance-scaling.)

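Variance-scaling: $(x_{\mathrm{stdev\_inv}})_{j,j} := \dfrac{1}{(x_{\mathrm{stdev}})_{0,j}}$, $I_{j,j} := 1$ ... an identity matrix.

$$y_{\mathrm{regress}}(x) := (x - x_{\mathrm{mean}}) \cdot x_{\mathrm{stdev\_inv}} \cdot \left[ w^{\langle 0 \rangle} \cdot q^{\langle 0 \rangle T} \cdot b_0 + \left( I - w^{\langle 0 \rangle} \cdot p^{\langle 0 \rangle T} \right) \cdot w^{\langle 1 \rangle} \cdot q^{\langle 1 \rangle T} \cdot b_1 \right] + y_{\mathrm{mean}}$$

Let us examine the slope and intercept with 2 terms (j=0,1).

slope:

$$x_{\mathrm{stdev\_inv}} \cdot \left[ w^{\langle 0 \rangle} \cdot q^{\langle 0 \rangle T} \cdot b_0 + \left( I - w^{\langle 0 \rangle} \cdot p^{\langle 0 \rangle T} \right) \cdot w^{\langle 1 \rangle} \cdot q^{\langle 1 \rangle T} \cdot b_1 \right] = \ldots \quad \longleftrightarrow \quad a = \ldots$$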

Examples:

$y_{\mathrm{regress}}((5 \quad 0.5)) = 10.508 \longleftrightarrow y((5 \quad 0.5)) = 10.5 \leftarrow$ O.K.

$y_{\mathrm{regress}}((5 \quad -0.5)) = 9.497 \longleftrightarrow y((5 \quad -0.5)) = 9.5 \leftarrow$ O.K.
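
The whole procedure, from scaling through two PLS components to prediction on a raw `x`, can be sketched in plain Python as below. This is an illustration under stated assumptions rather than a transcription of the worksheet: the function name `pls1_fit` and the helper names are hypothetical, the rule coupling `x0` to `x1` is assumed, the population standard deviation is assumed, the weight correction is applied in every pass, and the single-response shortcut $w \propto E^T F$ replaces the explicit eigenvector call.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def col(M, j):
    return [row[j] for row in M]


def pls1_fit(X, y, n_comp=2):
    """Fit a single-response PLS model; return a function that predicts y for raw x."""
    n, m = len(X), len(X[0])
    x_mean = [sum(col(X, j)) / n for j in range(m)]
    x_std = [math.sqrt(sum((v - x_mean[j]) ** 2 for v in col(X, j)) / n)
             for j in range(m)]
    y_mean = sum(y) / n
    E = [[(row[j] - x_mean[j]) / x_std[j] for j in range(m)] for row in X]
    F = [v - y_mean for v in y]
    comps = []
    for _ in range(n_comp):
        c = [dot(col(E, j), F) for j in range(m)]          # E^T F
        c_len = math.sqrt(dot(c, c))
        w = [v / c_len for v in c]                         # dominant eigenvector
        t = [dot(row, w) for row in E]
        xt = [dot(col(E, j), t) for j in range(m)]         # E^T t
        xt_len = math.sqrt(dot(xt, xt))
        s = xt_len / dot(t, t)                             # weight correction factor
        w = [s * v for v in w]
        t = [s * v for v in t]
        p = [v / xt_len for v in xt]                       # unit X loading
        b = dot(t, F) / dot(t, F if False else t)          # regression slope (q = 1)
        E = [[row[j] - ti * p[j] for j in range(m)] for row, ti in zip(E, t)]
        F = [fi - b * ti for fi, ti in zip(F, t)]
        comps.append((w, p, b))

    def predict(x):
        e = [(x[j] - x_mean[j]) / x_std[j] for j in range(m)]   # center and scale
        y_hat = y_mean
        for w, p, b in comps:
            t = dot(e, w)
            y_hat += b * t
            e = [e[j] - t * p[j] for j in range(m)]             # deflate the new point
        return y_hat

    return predict


random.seed(1)                                   # arbitrary seed
X, y = [], []
for _ in range(51):
    x1 = random.random() - 0.5
    x0 = 2 * x1 + (random.random() - 0.5)        # ASSUMED coupling between x0 and x1
    X.append([x0, x1])
    y.append(2 * x0 + 1 * x1 + (0.1 * random.random() - 0.05))

predict = pls1_fit(X, y)
print(predict([5, 0.5]), predict([5, -0.5]))     # compare with 10.5 and 9.5
```

The loop in `predict` is the recursive form of the worksheet's expression: the second score is computed from the deflated point $e (I - w^{\langle 0 \rangle} p^{\langle 0 \rangle T})$. With two components in two dimensions the model uses the full rank of $X$, so on noise-free data it reproduces $y = 2 x_0 + x_1$ up to rounding.
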