Matrix Decompositions, Study notes of Linear Algebra

Different ways of decomposing matrices, including the singular value decomposition (SVD) and QR-decomposition. It explains the utility of these decompositions and their applications, such as finding solutions to least squares problems and compressing/transmitting data efficiently. The document also provides theorems and definitions related to SVD, including the definition of singular values. It includes examples and exercises for practice.

Typology: Study notes

2021/2022

Available from 03/12/2023

SteveFisher

MATRIX DECOMPOSITIONS

Structure

6.1 Introduction
    Objectives
6.2 Singular Value Decomposition
6.3 Applications of SVD
    Least Squares Solution
    Data Compression
6.4 QR-decomposition
6.5 Summary
6.6 Solutions/Answers

6.1 INTRODUCTION

In your undergraduate studies you would have studied the LU decomposition of a matrix. In Block 1, you have also seen that diagonalisable matrices can be decomposed as $P^{-1}DP$, where $P$ is non-singular and $D$ is a diagonal matrix. In this unit, we will consider some other ways of decomposing matrices, and the utility of such decompositions.

To start with, in Sec. 6.2, we focus on a decomposition of any matrix as a product of three matrices. This is called the singular value decomposition, since the middle matrix in the decomposition of $A$ has the singular values of $A$ along its principal diagonal, and zero entries elsewhere.

This way of decomposing a matrix/linear transformation has various applications. We take up some of these in Sec. 6.3. The first application pertains to a method for finding solutions to the least squares problem, which you first encountered in Unit 4. This method is related to the one done in Sec. 4.5, but is more efficient in practice. The second application that we discuss is for transmitting data efficiently, particularly images.

Sec. 6.4 is about the QR-decomposition of matrices, i.e., any $m \times n$ matrix with real entries and of rank $n$ can be written as a product of a matrix $Q$ with orthonormal columns and an upper triangular matrix $R$. We also discuss some advantages of a QR-decomposition of a matrix, one such being another efficient method for obtaining the least squares solution.

Objectives

After studying this unit, you should be able to

compute the singular values of a linear transformation/matrix, and obtain its singular value decomposition;

apply the singular value decomposition for obtaining least squares solutions;

explain how the singular value decomposition can be applied for compressing and transmitting data efficiently;

obtain the QR-decomposition of an $m \times n$ matrix over $\mathbb{R}$ of rank $n$;

apply the QR-decomposition of a matrix to obtain least squares solutions.

Applications of Unitary Matrices

6.2 SINGULAR VALUE DECOMPOSITION

So far you have studied that $A \in M_n(\mathbb{C})$ can be (Schur) decomposed as $A = U^*BU$, where $B$ is upper triangular and $U$ is unitary. Let us try and see if we can generalise this for any $m \times n$ matrix over $\mathbb{C}$. In fact, we do get a generalisation in terms of the singular values of the matrix. To appreciate this, you would need some background, which we provide below.

Theorem 1: For any $m \times n$ matrix $A$, $A^*A$ and $AA^*$ are positive semi-definite.

Proof: Since $A^*A$ is Hermitian, $A^*A = UDU^*$, where $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, $\lambda_i$ being the eigenvalues of $A^*A$. At least one of the eigenvalues, say $\lambda$, is non-zero (because otherwise $A^*A$, and hence $A$, would be zero). Let $A^*Av = \lambda v$, with $v \neq 0$. Then $\lambda v^*v = v^*(\lambda v) = v^*A^*Av = (Av)^*(Av) \geq 0 \Rightarrow \lambda \geq 0$. Thus, all the eigenvalues of $A^*A$ are non-negative. So, $A^*A$ is positive semi-definite.

Similarly, you can show that $AA^*$ is positive semi-definite.

Theorem 2: $\lambda$ ($> 0$) is an eigenvalue of $AA^*$ iff it is an eigenvalue of $A^*A$.

Proof: $\lambda$ is an eigenvalue of $AA^*$

$\Rightarrow \exists\, v \neq 0$ such that $AA^*v = \lambda v$

$\Rightarrow A^*A(A^*v) = \lambda(A^*v)$ and $A^*v \neq 0$ (why?)

$\Rightarrow \lambda$ is an eigenvalue of $A^*A$.

Exchanging $AA^*$ with $A^*A$, and applying the same argument, we get the result.

Remark: If $A$ is of size $m \times n$, then the sizes of $A^*A$ and $AA^*$ are $n \times n$ and $m \times m$, respectively. So, $A^*A$ will have $n$ eigenvalues, while $AA^*$ will have $m$ eigenvalues. Suppose $m < n$. Then $A^*A$ will have the $m$ eigenvalues of $AA^*$, and its other $(n - m)$ eigenvalues will be zero. However, the positive eigenvalues of both matrices, and their algebraic multiplicities, will be the same.

This remark leads us to the following definition.

Definition: The singular values of an $m \times n$ matrix $A$ are the square roots of the positive eigenvalues of $A^*A$ (or $AA^*$), repeated with their multiplicities.

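For instance, to obtain the singular values of the $3 \times 2$ matrix $A = \begin{bmatrix} \ldots \\ 2 \;\; 2 \\ \ldots \end{bmatrix}$, we consider the eigenvalues of $AA^*$ or $A^*A$. Since $A^*A$ is of smaller size, we take that. Now $A^*A = \begin{bmatrix} 9 & 9 \\ 9 & 9 \end{bmatrix}$. Its eigenvalues are 18, 0. Therefore, the eigenvalues of $AA^*$ are 18, 0, 0. So the only singular value of $A$ is $\sqrt{18} = 3\sqrt{2}$.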
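The computation above can be sketched in a few lines of plain Python. The matrix `A` below is a hypothetical one, chosen only because it gives $A^tA = \begin{bmatrix} 9 & 9 \\ 9 & 9 \end{bmatrix}$ as in the text; the helper names are illustrative, and the eigenvalue formula used works only for real symmetric $2 \times 2$ matrices.

```python
import math

def gram(A):
    """Return A^t A for a real matrix A given as a list of rows."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]

def eigenvalues_sym_2x2(M):
    """Eigenvalues of a real symmetric 2 x 2 matrix, largest first."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    return [(trace + disc) / 2, (trace - disc) / 2]

def singular_values_two_columns(A, tol=1e-12):
    """Square roots of the positive eigenvalues of A^t A (A has two columns)."""
    return [math.sqrt(ev) for ev in eigenvalues_sym_2x2(gram(A)) if ev > tol]

# Hypothetical matrix (an assumption), chosen so that A^t A = [[9, 9], [9, 9]].
A = [[1, 1], [2, 2], [2, 2]]
print(gram(A))                         # the 2 x 2 matrix A^t A
print(singular_values_two_columns(A))  # one positive value, sqrt(18)
```

Working with the smaller of $A^*A$ and $AA^*$, as the text suggests, is what keeps the eigenvalue problem $2 \times 2$ here.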

3) Note that in the SVD $A = USV^*$, with $\operatorname{rank}(A) = r$, $U$ is the matrix of eigenvectors of $AA^*$ and $V$ is the matrix of eigenvectors of $A^*A$.

4) In the SVD, if $V$ has entries from $\mathbb{R}$ only, then $V^* = V^t$.

Let us look at some examples.

Example 1: Find the SVD for $A = [1\ \ 1\ \ 3\ \ 5]^t$.

Solution: As given in Remark 1 above, we take …

Then $A = USV^*$.

Example 2: Find the SVD for $A = [1\ \ 1\ \ 3\ \ 5]$.

Solution: From Example 1, we have $A^t = USV^t$ (with $U$, $S$, $V$ as defined in Example 1). So, $A = [1]\,[6\ \ 0\ \ 0\ \ 0]\,U^t$ is the SVD.
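As a check on Example 2, the singular value and the first column of $U$ can be worked out directly from the row vector $A$ given there. This is only a sketch: the remaining three columns of $U$ can be any orthonormal completion and are not shown.

```latex
\begin{aligned}
AA^t &= [\,1^2 + 1^2 + 3^2 + 5^2\,] = [36], \qquad \sigma_1 = \sqrt{36} = 6,\\
u_1 &= \tfrac{1}{6}\,[1\ \ 1\ \ 3\ \ 5]^t \quad \text{(first column of } U\text{)},\\
[1]\,[6\ \ 0\ \ 0\ \ 0]\,U^t &= 6\,u_1^t = [1\ \ 1\ \ 3\ \ 5] = A.
\end{aligned}
```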

Example 3: Find the SVD for $A = $ …

Solution: Here $\operatorname{rank}(A) = 2$ and $AA^* = $ …, with eigenvalues 3, 1, and corresponding eigenvectors … and …. So we take $U = $ …

$A^*A = \begin{bmatrix} \ldots \\ -1 \;\; 2 \;\; -1 \\ \ldots \end{bmatrix}$ has eigenvalues 3, 1, 0, with corresponding linearly independent eigenvectors …

So, we take $V = $ …

Then $A = USV^t$.

Example 4: Find the SVD for $A = \begin{bmatrix} \ldots \\ 4 \;\; 4 \\ \ldots \end{bmatrix}$.

Solution: $A^*A = \begin{bmatrix} 45 & 36 \\ 36 & 45 \end{bmatrix}$, and its eigenvalues are 81 and 9.

(Can you find these eigenvalues without computing the characteristic polynomial? Note that

i) each row sum is 81, so one of the eigenvalues is 81, and

ii) the trace is 90, so the other eigenvalue should be 9.)

This shows that the singular values of $A$ are 9 and 3, and $\operatorname{rank}(A) = 2$.

The eigenvectors of $A^*A$ corresponding to 81 and 9 are $v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, respectively. So, take $V = \dfrac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$.

Now $U$ is a $3 \times 3$ matrix whose first column is a normalised eigenvector of $AA^*$ corresponding to the eigenvalue 81, and whose second column is a normalised eigenvector of $AA^*$ corresponding to the eigenvalue 9. To obtain these eigenvectors we would need to find the matrix $AA^*$. But here we apply another method, which we have seen in the proof of Theorem 3, that is, if $v$ is an eigenvector of $A^*A$ corresponding to a non-zero eigenvalue, then $Av$ is an eigenvector of $AA^*$.
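The shortcut just described (if $v$ is an eigenvector of $A^*A$ for a non-zero eigenvalue, then $Av$ is an eigenvector of $AA^*$ for the same eigenvalue) can be checked numerically. The sketch below uses a hypothetical real matrix `A`, assumed only because its $A^tA$ has row sums 81 and trace 90 as in the example; it is not claimed to be the matrix of Example 4.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product X Y."""
    cols = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Hypothetical 3 x 2 matrix (an assumption) with A^t A = [[45, 36], [36, 45]].
A = [[2, 5], [4, 4], [5, 2]]
AtA = matmul(transpose(A), A)
AAt = matmul(A, transpose(A))

checks = []
for lam, v in [(81, [1, 1]), (9, [1, -1])]:
    w = matvec(A, v)  # candidate eigenvector of A A^t
    is_eigvec_of_AtA = matvec(AtA, v) == [lam * x for x in v]
    is_eigvec_of_AAt = matvec(AAt, w) == [lam * x for x in w]
    checks.append((lam, w, is_eigvec_of_AtA, is_eigvec_of_AAt))
    print(lam, w, is_eigvec_of_AtA, is_eigvec_of_AAt)
```

Normalising each `w` then gives a column of $U$ without ever solving an eigenvalue problem for the larger matrix $AA^*$.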

So, the eigenvectors of $AA^*$ corresponding to these eigenvalues are $Av_1 = $ … and $Av_2 = $ …, respectively. Hence, the corresponding normalised eigenvectors are …

Now, the third column of $U$ is the eigenvector corresponding to the eigenvalue 0 of $AA^*$.

$AA^*x = 0$ gives us $x = $ … So, a normalised eigenvector …

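Let $A = USV^*$ be the SVD of $A$. Then

$$\|Ax - b\| = \|U(SV^*x - U^*b)\| = \|U(Sy - c)\|, \quad \text{where } y = V^*x \text{ and } c = U^*b.$$

Now,

$$\|U(Sy - c)\|^2 = \operatorname{tr}\big(U(Sy - c)\,[U(Sy - c)]^*\big) = \operatorname{tr}\big(U(Sy - c)(Sy - c)^*U^*\big) = \operatorname{tr}\big((Sy - c)(Sy - c)^*U^*U\big) = \|Sy - c\|^2.$$

Therefore, $\|Ax - b\| = \|Sy - c\|$.

So, $x$ minimises $\|Ax - b\|$ iff $y$ ($= V^*x$) minimises $\|Sy - c\|$.

Now, assume $m > n$ and let $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ be the singular values of $A$. Then

$$Sy - c = \begin{bmatrix} \sigma_1 y_1 - c_1 \\ \vdots \\ \sigma_n y_n - c_n \\ -c_{n+1} \\ \vdots \\ -c_m \end{bmatrix}, \quad \text{where we write } y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \text{ and take } \sigma_i = 0 \text{ for } i > r.$$

Hence, $\|Sy - c\|$ is the least if we choose $y_i$ such that $\sigma_i y_i - c_i = 0$ for $i = 1, \ldots, r$ and $y_i = 0$ for $i = r+1, \ldots, n$. In this case,

$$\|Ax - b\| = \|Sy - c\| = \sqrt{|c_{r+1}|^2 + \cdots + |c_m|^2}.$$

Let us consider an example.

Example 5: Obtain the least squares solution to $Ax = b$, where $A = \begin{bmatrix} \ldots \\ 2 \;\; 2 \\ \ldots \end{bmatrix}$ and $b = $ …

Solution: In Sec. 6.2, you found the singular value of $A$ as $3\sqrt{2}$. Check that the SVD for $A$ is $USV^*$, where …

Now, $c = U^*b = $ …

We take $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, where $y_1 = \dfrac{c_1}{\sigma_1}$ (with $\sigma_1 = 3\sqrt{2}$), $y_2 = 0$.

$\therefore x = Vy = $ …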
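The recipe above ($c = U^*b$, then $y_i = c_i/\sigma_i$ for $i \leq r$ and $y_i = 0$ otherwise, then $x = Vy$) can be sketched for a rank 1 matrix whose SVD is easy to write by hand. The matrix `A`, the vector `b` and the vectors `u1`, `v1` below are assumptions made for illustration; they are not the data of Example 5.

```python
import math

# Hypothetical data (assumptions): a rank 1 matrix and a right-hand side.
A = [[1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
b = [1.0, 2.0, 3.0]

# Hand-computed SVD data for this A: A = sigma * u1 * v1^t, so r = 1.
sigma = 3 * math.sqrt(2)
u1 = [1 / 3, 2 / 3, 2 / 3]                    # first column of U
v1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # first column of V

# Check that sigma * u1 * v1^t really reproduces A.
rebuilt = [[sigma * ui * vj for vj in v1] for ui in u1]

# Step 1: c = U^t b (only c_1 is needed, because r = 1).
c1 = sum(ui * bi for ui, bi in zip(u1, b))
# Step 2: y_1 = c_1 / sigma_1, and y_i = 0 for i > r.
y = [c1 / sigma, 0.0]
# Step 3: x = V y; since y_2 = 0, only the first column of V contributes.
x = [y[0] * vj for vj in v1]

# A least squares solution must satisfy the normal equations A^t (A x - b) = 0.
residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
normal_eq = [sum(A[i][j] * residual[i] for i in range(3)) for j in range(2)]
print(x, normal_eq)
```

Setting $y_2 = 0$ is what singles out the least squares solution of minimum norm; any other value of $y_2$ gives the same residual.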

Try a related exercise now.

E6) Use the SVD of $A$ (in Example 4) to obtain a least squares solution for $Ax = $ …

Over here, a remark about the practical aspect of this approach to solving a least

squares problem may be appropriate. In practice, the entries in A are subject to

measurement errors. So, very small singular values of A can be treated as zero, without too much difference in output, but with a lot of saving in computation time.

Let us now look at an alternative approach for obtaining the least squares solution, which uses the following concept.

Definition: 1) A generalised inverse of an $m \times n$ matrix $A$ is an $n \times m$ matrix $G$ such that $AGA = A$.

2) An $n \times m$ matrix $G$ is called the Moore-Penrose inverse of an $m \times n$ matrix $A$, and is denoted by $A^+$, if $G$ satisfies

i) $AGA = A$;

ii) $GAG = G$; and

iii) $AG$ and $GA$ are Hermitian.

(This inverse was independently defined by E.H. Moore (in 1920) and Roger Penrose (in 1955).)

Note: 1) A matrix can have more than one generalised inverse. For example, if $A = \begin{bmatrix} 1 & \ldots \\ 0 & 0 \end{bmatrix}$, then $A$ itself, as well as the identity matrix $I_2$, are generalised inverses of $A$.

2) If $A$ is an $n \times n$ matrix which is invertible, then $A^{-1}$ is the unique generalised inverse of $A$.

3) It can be shown that the Moore-Penrose inverse of a matrix, if it exists, is unique.

Now let us consider $A = USV^*$, the SVD of $A$, where $S$ is an $m \times n$ matrix. Let the singular values of $A$ be $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$. We define $S^+ = [a_{ij}]$, an $n \times m$ matrix, with $a_{ii} = \dfrac{1}{\sigma_i}$ for $i = 1, \ldots, r$ and all other $a_{ij} = 0$.

Then $S^+$ is the Moore-Penrose inverse of $S$. You can check that $A^+ = VS^+U^*$.
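The claim that $S^+$ is the Moore-Penrose inverse of $S$ can be checked against the three defining conditions. Below is a minimal sketch for real matrices (so "Hermitian" reduces to "symmetric"); the matrix `S` and the helper names are hypothetical.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product X Y."""
    cols = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices of the same size."""
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def pinv_of_diagonal(S, tol=1e-12):
    """S^+ for an m x n matrix S that is zero off its principal diagonal."""
    m, n = len(S), len(S[0])
    T = [[0.0] * m for _ in range(n)]
    for i in range(min(m, n)):
        if abs(S[i][i]) > tol:          # invert only the non-zero singular values
            T[i][i] = 1.0 / S[i][i]
    return T

def penrose_conditions_hold(S, T):
    """Check STS = S, TST = T, and that ST and TS are symmetric."""
    ST, TS = matmul(S, T), matmul(T, S)
    return (close(matmul(ST, S), S) and close(matmul(TS, T), T)
            and close(ST, transpose(ST)) and close(TS, transpose(TS)))

# Hypothetical 3 x 2 matrix S (an assumption) with singular values 4 and 2.
S = [[4.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
S_plus = pinv_of_diagonal(S)
print(S_plus)
print(penrose_conditions_hold(S, S_plus))
```

Note that `S_plus` has the transposed shape ($n \times m$), as the definition requires.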

Now we see how this can be used to find a least squares solution.

Theorem 5: Let $A = USV^*$ be the SVD of the $m \times n$ matrix $A$ of rank $r$. Then

i) $x_0 = A^+b$ minimises $\|Ax - b\|$ with respect to $x$.

ii) Among all the least squares solutions of $Ax = b$, $x_0$ has the minimum norm.

To understand the process, let us take a much smaller matrix, say $A$, of size $50 \times 30$. This means that we are sending 1500 data points. Suppose that we have the further information about this matrix that it is of rank 5. So the matrix has 5 linearly independent rows (or columns), and the remaining rows/columns are linear combinations of these. So, instead of sending 1500 data points, we can send far fewer data points, and yet not lose any information. In this process we are compressing the data.

Let us see how this can be done. In this case there are 5 linearly independent columns, and all the other columns of $A$ are linear combinations of these columns. Suppose the linearly independent columns of $A$ are at the positions $j_1, \ldots, j_5$, where $j_1 < j_2 < \cdots < j_5$.

Now, suppose $j_1 \neq 1$, and the first column of $A$ is $a_1 C_{j_1} + a_2 C_{j_2} + \cdots + a_5 C_{j_5}$, where $C_{j_k}$ denotes the column of $A$ at position $j_k$. Then instead of sending the 50 entries of the first column, we can send just the 5 entries $a_1, a_2, \ldots, a_5$. Similarly for the other 24 columns. If we do this, what is the minimum amount of data we need to obtain $A$? We need the five linearly independent columns with 50 entries each, 5 numbers for their positions, and a set of 5 numbers (the coefficients) for each of the remaining 25 columns. Therefore, instead of 1500 data points, we need to send $(5 \times 50) + 5 + (25 \times 5) = 380$ data points only, which, as you can see, is a considerable saving! So, this method is useful. However, finding the linearly independent rows or columns of a large matrix is difficult, and hence this is not a preferred method. This is where the singular value decomposition is very useful.

Very efficient software for determining the SVD of a matrix is available. The idea used for creating the software can be illustrated with the help of the same matrix $A$ of size $50 \times 30$ and of rank 5. Its SVD is of the form

$$A = \sigma_1 u_1 v_1^* + \sigma_2 u_2 v_2^* + \cdots + \sigma_5 u_5 v_5^*.$$

Now, instead of transmitting the entries of $A$ we transmit the entries of the right side of the equation above. Thus, to generate all the entries of $A$, we need the 5 non-zero singular values of $A$, $5 \times 50$ entries for the $u_i$s and $5 \times 30$ entries for the $v_i$s. Thus, the total number of entries to be transmitted is just $5 + 250 + 150 = 405$. This is more than the 380 required earlier, but there is a considerable saving in computation time vis-à-vis obtaining the linearly independent rows/columns of $A$.
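The two counts discussed above can be reproduced with a small sketch. The function names are hypothetical; the formulas are the ones used in the text for an $m \times n$ matrix of rank $r$.

```python
def column_scheme_count(m, n, r):
    """Entries sent when transmitting r independent columns,
    their r positions, and r coefficients for each of the other n - r columns."""
    return r * m + r + (n - r) * r

def svd_scheme_count(m, n, r):
    """Entries sent when transmitting r singular values,
    r vectors u_i of length m and r vectors v_i of length n."""
    return r + r * m + r * n

m, n, r = 50, 30, 5
print(m * n)                         # 1500 entries in the full matrix
print(column_scheme_count(m, n, r))  # 380
print(svd_scheme_count(m, n, r))     # 405
```

Both counts grow like $r(m + n)$ rather than $mn$, which is why the saving becomes larger as the matrix grows while the rank stays small.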

Now, while sending images, it is found that neighbouring pixels are often correlated, and therefore contain redundant information. So, if we remove the redundant information, our transmission will be more efficient. This is done by using appropriate approximations to the matrix representing the image, i.e., the data matrix. Suppose an $m \times n$ matrix $A$ represents the image, and its SVD is

$$A = USV^* = \sigma_1 u_1 v_1^* + \cdots + \sigma_r u_r v_r^*.$$

Then, each singular value represents the energy level in a certain direction. So, the larger singular values will be able to give us a good enough reconstruction of the image. Accordingly, we take a smaller rank approximation of $A$ by ignoring the very small singular values. To what extent can this distort the image received? If neglecting very small singular values gives a small error, we can afford to ignore some singular values. The advantage of this will be that we would send far fewer entries, which would be cheaper and quicker!

Suppose the singular values of $A$ are $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r$. Suppose we take only the first $p$ singular values. Then we get another matrix, $A_p = US_pV^*$, called the rank $p$ approximation of $A$.

Here $S_p$ is the same as $S$, except that $\sigma_{p+1}, \ldots, \sigma_r$ have been replaced by 0. Now, the error in doing so is

$$\|A - A_p\|^2 = \|U(S - S_p)V^*\|^2 = \sigma_{p+1}^2 + \cdots + \sigma_r^2.$$

Thus, neglecting the very small singular values $\sigma_{p+1}, \ldots, \sigma_r$ does not give much error.

This process of reconstructing images is also known as applying the KLT (Karhunen-Loève Transform).
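The effect of dropping small singular values can be sketched on a tiny matrix built from a known SVD. The singular values and the orthonormal vectors below are assumptions chosen for easy arithmetic, and the error is measured in the norm $\|X\|^2 = \operatorname{tr}(XX^*)$ used earlier in this unit.

```python
import math

# Hypothetical SVD data (assumptions): singular values in decreasing order,
# with orthonormal vectors u_i and v_i stored as lists.
sigmas = [5.0, 1.0]
u_vectors = [[0.6, 0.8], [-0.8, 0.6]]
v_vectors = [[1.0, 0.0], [0.0, 1.0]]

def rank_p_approximation(p):
    """Sum of the first p terms sigma_i * u_i * v_i^t."""
    m, n = len(u_vectors[0]), len(v_vectors[0])
    total = [[0.0] * n for _ in range(m)]
    for s, u, v in list(zip(sigmas, u_vectors, v_vectors))[:p]:
        for i in range(m):
            for j in range(n):
                total[i][j] += s * u[i] * v[j]
    return total

def norm(X):
    """Square root of the sum of squares of the entries, i.e. sqrt(tr(X X^t))."""
    return math.sqrt(sum(x * x for row in X for x in row))

A = rank_p_approximation(2)    # the full matrix
A1 = rank_p_approximation(1)   # its rank 1 approximation
difference = [[a - c for a, c in zip(ra, rc)] for ra, rc in zip(A, A1)]
error = norm(difference)
print(A, A1, error)
```

Here the error equals the single dropped singular value; in general it is the square root of the sum of the squares of the dropped singular values.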

To see examples of how the image reconstruction alters using rank 1, rank 2 or other rank approximations, please check internet websites like …

Try a related exercise now.

E11) Find a rank 1 approximation of …, and the error involved in using this instead of the given matrix for data reconstruction.

Let us now study another useful way of decomposing certain $m \times n$ matrices over $\mathbb{R}$.

6.4 QR-DECOMPOSITION

In your undergraduate studies you studied and applied the Gram-Schmidt orthogonalisation process. We will now study the matrix form of this process. We will assume that the matrices are over $\mathbb{R}$.

Let $A$ be an $m \times n$ matrix whose columns $u_1, u_2, \ldots, u_n$ are linearly independent. So, $\{u_1, u_2, \ldots, u_n\}$ forms a basis of $\operatorname{Col}(A)$, the column space of $A$. Applying the Gram-Schmidt process to the basis elements, we obtain an orthogonal basis $\{v_1, v_2, \ldots, v_n\}$ of $\operatorname{Col}(A)$, where $v_1 = u_1$, $v_2 = u_2 - \dfrac{\langle u_2, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1$, and so on. This, then, gives us an orthonormal basis $\{q_1, q_2, \ldots, q_n\}$, where $q_i = \dfrac{v_i}{\|v_i\|}$.

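Let $Q$ be the $m \times n$ matrix $[q_1\ \ q_2\ \ \ldots\ \ q_n]$. Now, by the Gram-Schmidt process, we know that

$$u_i = \sum_{j=1}^{i} r_{ji}\, q_j = \sum_{j=1}^{n} r_{ji}\, q_j \quad \forall\ i = 1, \ldots, n, \quad \text{where } r_{i+1,i} = 0 = r_{i+2,i} = \cdots = r_{ni}.$$

Take $r_i = \begin{bmatrix} r_{1i} \\ r_{2i} \\ \vdots \\ r_{ni} \end{bmatrix}$. Then $R = [r_1\ \ r_2\ \ \ldots\ \ r_n]$ is an upper triangular matrix such that

$$A = [u_1\ \ u_2\ \ \ldots\ \ u_n] = QR.$$

Note that $R$ is non-singular, since its diagonal entries $r_{ii} = \|v_i\|$ are non-zero.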
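The construction of $Q$ and $R$ described above can be sketched in plain Python. This is the classical Gram-Schmidt process, written for real matrices with linearly independent columns; the function name and the matrix `A` are hypothetical, and no attempt is made at numerical robustness.

```python
import math

def dot(u, v):
    """Standard inner product on R^m."""
    return sum(a * b for a, b in zip(u, v))

def qr_gram_schmidt(A):
    """Return (Q, R) with A = QR, for A given as a list of rows."""
    cols = [list(c) for c in zip(*A)]          # the columns u_1, ..., u_n
    n = len(cols)
    q_vectors = []
    R = [[0.0] * n for _ in range(n)]
    for i, u in enumerate(cols):
        v = list(u)
        for j in range(i):
            R[j][i] = dot(q_vectors[j], u)     # r_ji = <u_i, q_j>
            v = [vk - R[j][i] * qk for vk, qk in zip(v, q_vectors[j])]
        R[i][i] = math.sqrt(dot(v, v))         # r_ii = ||v_i||
        if R[i][i] < 1e-12:
            raise ValueError("columns are not linearly independent")
        q_vectors.append([vk / R[i][i] for vk in v])
    Q = [list(row) for row in zip(*q_vectors)] # q_i become the columns of Q
    return Q, R

# Hypothetical 3 x 2 matrix (an assumption) with linearly independent columns.
A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
Q, R = qr_gram_schmidt(A)
print(Q)
print(R)
```

The entries of `R` below the diagonal are never written, which is exactly the statement $r_{i+1,i} = \cdots = r_{ni} = 0$ in the text.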

With this, we have just proved the following result.


E14) Suppose A = QR, where R is invertible. Show that Col (A) = Col (Q).

What is the use of the QR-decomposition? It mainly helps us simplify the process of solving the least squares problem $Ax = b$, where the columns of $A$ are linearly independent. This is because if $A = QR$, then $A^tA = R^tQ^tQR = R^tR$. Then the least squares solution of $Ax = b$ is the solution of the system $Rx = Q^tb$ (see Theorem 9, Unit 4). And, this is just backward substitution since $R$ is upper triangular. Let us consider an example.

Example 7: Find a least squares solution of $Ax = b$, where $A$ is as in Example 6 and $b = [2\ \ 3\ \ 1\ \ 2]^t$.

Solution: $Rx = Q^tb$ gives …

So a least squares solution is …
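The method used in Example 7 (form $Q^tb$, then solve $Rx = Q^tb$ by backward substitution) can be sketched as follows. The matrix `A` and the vector `b` are hypothetical and are not the data of Example 6 or Example 7; the QR step is a plain classical Gram-Schmidt process, included only to keep the sketch self-contained.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def qr_columns(A):
    """Classical Gram-Schmidt: return the orthonormal vectors q_i and R."""
    cols = [list(c) for c in zip(*A)]
    n = len(cols)
    qs, R = [], [[0.0] * n for _ in range(n)]
    for i, u in enumerate(cols):
        v = list(u)
        for j in range(i):
            R[j][i] = dot(qs[j], u)
            v = [vk - R[j][i] * qk for vk, qk in zip(v, qs[j])]
        R[i][i] = math.sqrt(dot(v, v))
        qs.append([vk / R[i][i] for vk in v])
    return qs, R

def back_substitute(R, c):
    """Solve Rx = c for an upper triangular, non-singular R."""
    n = len(c)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - known) / R[i][i]
    return x

def least_squares_qr(A, b):
    """Least squares solution of Ax = b via Rx = Q^t b."""
    qs, R = qr_columns(A)
    c = [dot(q, b) for q in qs]     # Q^t b
    return back_substitute(R, c)

# Hypothetical data (assumptions): b does not lie in the column space of A.
A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
b = [1.0, 0.0, 0.0]
x = least_squares_qr(A, b)
print(x)
```

Compared with solving the normal equations $A^tAx = A^tb$ directly, this route never forms $A^tA$; only a triangular system has to be solved.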

Here are some related exercises.

E15) Find the least squares solution to $Ax = b$ using the QR-decomposition of …

E16) Use the QR-decomposition of $A$ to solve E5.

We will stop our discussions on matrix decompositions here. Let us take a brief look

at what we have studied so far.

6.5 SUMMARY

In this unit, we have covered the following points:

1. How to obtain the singular value decomposition (SVD) of an $m \times n$ matrix or a linear transformation.

2. Applying the SVD of $A$, and of its Moore-Penrose inverse $A^+$, to find least squares solutions for $Ax = b$.

3. Applying the SVD for transmitting data efficiently after compressing.

4. How to obtain the QR-decomposition of an $m \times n$ matrix over $\mathbb{R}$ of rank $n$.

5. Applying the QR-decomposition of $A$ to find least squares solutions for $Ax = b$.

Now, please go back to the objectives of this unit, given in Sec. 6.1. See if you think you have achieved each of them. One way to judge this is to check whether you have solved each exercise in this unit on your own. You can also match your solutions with those given in Sec. 6.6, to make sure your reasoning and answers are correct.

6.6 SOLUTIONS/ANSWERS

E1) $AA^t = $ … Its eigenvalues are 360 and 90.

So, the singular values of $A$ are $6\sqrt{10}$ and $3\sqrt{10}$. We obtain $u_1$ and $u_2$ as normalised eigenvectors corresponding to 360 and 90.

So $u_1 = $ …, $u_2 = $ …

$A^tA = $ … Its eigenvalues will have to be 360, 90, 0 (Why?) The corresponding eigenvectors will be …

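E2) Let $A = U^*BU$, where $U$ is unitary and $B$ is upper triangular with $\lambda_1, \ldots, \lambda_n$ along the diagonal.

Then $\operatorname{tr}(A^*A) = \operatorname{tr}(U^*B^*BU) = \operatorname{tr}(B^*B) = \sum_{j=1}^{n} \sum_{i=1}^{n} |b_{ij}|^2 \geq \sum_{i=1}^{n} |\lambda_i|^2$.

Also $\operatorname{tr}(A^*A) = \operatorname{tr}(AA^*)$, since $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ for any $A$, $B$ for which $AB$ and $BA$ are defined.

Finally, $\operatorname{tr}(AA^*) = \sum_{i=1}^{r} \sigma_i^2$, since the trace of a matrix is the sum of its eigenvalues. Note that the remaining eigenvalues of $AA^*$ are zero.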

E3) We assume $A$ to be $m \times n$, of rank $r$, and let $A = USV^*$ be the SVD.

Then $x^*Ay = x^*USV^*y = w^*St$, where $w = U^*x$, $t = V^*y$. Hence $|x^*Ay| = |w^*St|$ …

Since $n = 2 = r$ in this case, $x_0$ is the only least squares solution of $Ax = b$.

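E8) Assume that $S$ is $m \times n$, with $m < n$. Then $S = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}$, where $D = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$.

Let $T = \begin{bmatrix} D^{-1} & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}$.

Then check that

i) $STS = S$, ii) $TST = T$, iii) $ST$ and $TS$ are Hermitian. Therefore, $T = S^+$.

Similarly, you should use this to check that $VS^+U^*$ is the Moore-Penrose inverse of $USV^*$.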

E10) $A^tA = [9]$. So the singular value of $A$ is 3.

Then the SVD for $A$ is $U \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix} [1]$.

So $A^+ = [1]\, \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \end{bmatrix} U^t$.

ii) Then $x_0 = A^+b = $ …

The singular values of $A$ are 3, 1. So its SVD is …

Now, the rank 1 approximation is $A_1 = US_1V^t$, where $S_1$ is obtained from $S$ by replacing the singular value 1 by 0, and …

i) By the Gram-Schmidt orthogonalisation process, we get …

since