Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Numerical Linear Algebra, Lecture Notes - Mathematics - 2, Study notes of Mathematical Methods for Numerical Analysis and Optimization

University of Oxford Mathematical Methods for Numerical Analysis and Optimization

Least Square Approximation, Singular Value Decomposition, QR decomposition , Back substitution

Typology: Study notes

2010/2011

Uploaded on 09/09/2011

luber-1 🇬🇧

4.8

(12)

293 documents

1 / 9

This page cannot be seen from the preview

Don't miss anything!

4 Least-Squares Approximation

We will first deal with the problem of solving an overdetermined system of the form

Ax=b,(5)

where A∈Rm×nand b∈Rmare given and m>n. The latter means that we have more

equations than unknowns, which makes the system overdetermined. Hence, instead, we

may try to change the problem to solving

min

x∈Rn"Ax−b"2.(6)

For the time being, we will assume that the rank of the matrix rank(A)=dim{Ax:x∈

Rn}is rank(A)=n, i.e. the matrix has full rank.

Then, to solve (6) we can look for critical points of the function

f(x)="Ax−b"2

2=xTATAx−2bTAx+bTb.

Setting the gradient to zero yields

0=∇f(x)=2ATAx−2ATb=0.

Since rank(A)=n, we also have rank(ATA)=n(as we will prove later), which means

that the matrix ATAis invertible. Thus we have in principle the following result.

Theorem 4.1 (Normal Equations) The solution of the least-squares problem (6) with

a matrix A∈Rm×n, which has full rank, is the unique solution of the Normal Equations

(ATA)x=ATb.

Proof: The solution to the normal equations is indeed a critical point. Furthermore, since

the Hessian of the function fis given by the positive definite matrix ATA, we have a local

minimum, which is actually a global one, since fis a convex function. !

The normal equations are usually not a good way of computing the solution to the least-

squares problem since the involved matrix ATAis usually badly conditioned.

We are now going to investigate two better ways of computing the solution to the least-

squares problem. To this end, the following result concerning orthogonal matrices is ex-

tremely helpful.

Definition 4.2 A matrix Q∈Rn×nis called orthogonal if QTQ=I.

Note that an orthogonal matrix is always invertible and that its rows/columns form an

orthonormal basis of Rn.

10

Discover Study notes of Mathematical Methods for Numerical Analysis and Optimization University of Oxford

Partial preview of the text

Download Numerical Linear Algebra, Lecture Notes - Mathematics - 2 and more Study notes Mathematical Methods for Numerical Analysis and Optimization in PDF only on Docsity!

4 Least-Squares Approximation

We will first deal with the problem of solving an overdetermined system of the form

Ax = b, (5)

where A ∈ R

m×n and b ∈ R

m are given and m > n. The latter means that we have more

equations than unknowns, which makes the system overdetermined. Hence, instead, we

may try to change the problem to solving

min

x∈R n

‖Ax − b‖ 2

For the time being, we will assume that the rank of the matrix rank(A) = dim{Ax : x ∈

R

n } is rank(A) = n, i.e. the matrix has full rank.

Then, to solve (6) we can look for critical points of the function

f (x) = ‖Ax − b‖

2

= x

T A

T Ax − 2 b

T Ax + b

T b.

Setting the gradient to zero yields

0 = ∇f (x) = 2A

T Ax − 2 A

T b = 0.

Since rank(A) = n, we also have rank(A

T A) = n (as we will prove later), which means

that the matrix A

T A is invertible. Thus we have in principle the following result.

Theorem 4.1 (Normal Equations) The solution of the least-squares problem (6) with

a matrix A ∈ R

m×n , which has full rank, is the unique solution of the Normal Equations

(A

T A)x = A

T b.

Proof: The solution to the normal equations is indeed a critical point. Furthermore, since

the Hessian of the function f is given by the positive definite matrix A

T A, we have a local

minimum, which is actually a global one, since f is a convex function.!

The normal equations are usually not a good way of computing the solution to the least-

squares problem since the involved matrix A

T A is usually badly conditioned.

We are now going to investigate two better ways of computing the solution to the least-

squares problem. To this end, the following result concerning orthogonal matrices is ex-

tremely helpful.

Definition 4.2 A matrix Q ∈ R

n×n is called orthogonal if Q

T Q = I.

Note that an orthogonal matrix is always invertible and that its rows/columns form an

orthonormal basis of R

n .

Proposition 4.3 Let Q ∈ R

n×n be orthogonal and A ∈ R

n×n then ‖Qx‖ 2 = ‖x‖ 2 for all

x ∈ R

n and ‖QA‖ 2 = ‖A‖ 2.

Proof: We have

‖Qx‖

2

2 = (Qx)

T (Qx) = x

T Q

T Qx = x

T x = ‖x‖

2

Similarly,

‖QA‖ 2 = sup

‖x‖ 2 =

‖QAx‖ 2 = sup

‖x‖ 2 =

‖Ax‖ 2 = ‖A‖ 2.

5 Singular Value Decomposition

One way of interpreting the singular value decomposition, which we want to discuss now,

is to find new orthonormal basis vectors in R

m and R

n , which allow to rewrite the linear

mapping A : R

n → R

m in a geometrically simpler form.

To understand it, we need to recall a few facts from linear algebra. We already introduced

the rank of a matrix. We also need the null space

ker(A) = {x ∈ R

n : Ax = 0}

and the intrinsic relations

n = dim ker(A) + rank(A), rank(A) = rank(A

T ).

Note that an element x ∈ ker(A) satisfies Ax = 0 and hence also A

T Ax = 0, i.e. it belongs

to ker(A

T A). If, on the other hand x ∈ ker(A

T A) then A

T Ax = 0, thus 0 = x

T (A

T Ax) =

‖Ax‖

2 2 and hence x ∈ ker(A). This means

ker(A) = ker(A

T A).

However, the dimension formula above immediately gives

rank(A) = rank(A

T ) = rank(A

T A) = rank(AA

T ),

This means that the symmetric and positive semi-definite matrices A

T A ∈ R

n×n and

AA

T ∈ R

m×m have the same rank.

As before, we choose for A

T A a system of eigenvalues λ 1 ≥ λ 2 ≥... ≥ λ n ≥ 0 and

associated eigenvectors v 1 ,... , v n

∈ R

n satisfying A

T Av j = λ j v j and v

T

j

v k = δ jk

. Exactly

the first r = rank(A) eigenvalues are positive and the remaining ones are zero.

According to the considerations we just made, also the matrix AA

T must have exactly r

positive eigenvalues. We will now see that, interestingly, the positive eigenvalues of AA

T

Theorem 5.2 The general solution to the least-squares problem (6) is given by

x =

r ∑

j=

cj

σ j

v j

m ∑

j=r+

α j v j =: x

m ∑

j=r+

α j v j

Here, σ j denotes the singular values of the matrix A and v j are the columns of the matrix V

from the singular value decomposition. Moreover, c = U

T b, and the α j are free to choose.

From all possible solutions, the solution x

is the solution having minimal Euclidean norm.

Proof: The representation of the general form follows from x = V y. The solution x

has

minimal norm, since we have for a general solution that

‖x‖

2

= ‖x

‖

2

m ∑

j=r+

α

2

j

≥ ‖x

‖

2

using the fact that the v j form an orthonormal system.!

Example 5.3 Let us consider the system Ax = b with the following input:

A =

, b =

We can hopefully immediately see that the norm-minimal solution is given by x

= (0, 2)

T

and that the unavoidable error is (

2

3

2 )

1 / 2. Now, we have

A

T A =

so that λ 1 = 4 and λ 2 = 0. Hence, A has the singular value σ 1 = 2. Corresponding

eigenvectors are v 1 = e 2 and v 2 = e 1

. This defines the other eigenvector u 1

1

σ 1

Av 1

T , and hence we can choose u 2 = e 2 and u 3 = e 3. Thus, we have

V =

, U = I =

Furthermore, c = U

T b = b and y 1 = c 1 /2 = 2, y 2 = 0, y 3 = 0, which gives x

= 2v 1

T .

There are efficient methods available to compute the singular value decomposition, like

the Golub-Reinsch bidiagonalisation algorithm. However, we will not discuss them here.

Instead, we will discuss an alternative decomposition, which will be useful in various ways.

6 The QR decomposition

The goal of this section is to prove, for every matrix A ∈ R

m×n , m ≥ n, the existence of a

factorisation of the form

A = QR,

where Q ∈ R

m×m is orthogonal and R ∈ R

m×n is an upper triangular matrix.

Definition 6.1 A matrix R ∈ R

m×n is called upper triangular if rij = 0 for all i > j, i.e.

if

R =

We will do this using elementary reflections.

Definition 6.2 A Householder matrix is a matrix of the form

H(w) = I − 2 ww

T ∈ R

m×m ,

where w ∈ R

m satisfies either ‖w‖ 2 = 1 or w = 0.

Note that:

H(w) is symmetric.
H(w)H(w) = I (check), so that H(w) is orthogonal. Hence, a product of House-

holder matrices is orthogonal.

det(H(w)) = −1 for w )= 0.
Storing H(w) only requires storing the m elements of w.
The general form of a Householder matrix is given by

H(w) = I − 2

ww

T

w

T w

for an arbitrary vector w )= 0. Here, we find both the inner product w

T w ∈ R and

the outer product ww

T ∈ R

m×m .

such that ‖v‖ 2 = 2a

T w. Hence, the application of H(w) to a becomes

H(w)a = a − 2(a

T w)w = a − ‖v‖ 2

v

‖v‖ 2

= a − v = ‖a‖ 2 e 1

Note that we could also project onto −e 1 by using v = a + ‖a‖ 2 e 1

. In general, one should

pick the sign to avoid numerical cancellation, i.e. one should use v = a + sign(a 1 )‖a‖ 2 e 1.

The cost of a matrix-vector multiplication is usually O(m

2 ). Here, however, we can com-

pute the product between a Householder matrix and a vector in O(m) time. To compute

H(w)a = a − 2(a

T w)w, we first compute the number λ = a

T w in O(m) time, then multi-

ply this number with each component of w to form the vector λw and finally we subtract

twice this vector from the vector a. All of this can be done in O(m) time.

Theorem 6.4 For every A ∈ R

m×m , m ≥ n, there exist an orthogonal matrix Q ∈ R

m×m

and an upper triangular matrix R ∈ R

m×n such that A = QR.

Proof: Let a 1

∈ R

m be the first column of A. We find a first Householder matrix H 1

R

m×m such that

H

1

A =

α 1

. A

1

with an (m − 1) × (m − 1) matrix A 1. Considering the first column a˜ 2 ∈ R

m− 1 of this

matrix, we can find a second Householder matrix

H

2

∈ R

(m−1)×(m−1) such that

H

2 a˜ 2

α 2 e 1

∈ R

m− 1

. The matrix

H

2

H

2

is easily seen to be orthogonal and we derive

H

2

H

1

A =

α 1 ∗... ∗

0 α 2

. A 2

and we can proceed in this form to establish

H

n− 1

H

n− 2

· · · H

2

H

1

A = R,

where R is an upper triangular matrix. Since all H j are orthogonal so is there product and

also their inverse.!

Note that the proof is actually constructive and describes a numerical method for deriving

the QR factorisation of a matrix A.

Returning once again to our least-squares problem (6), we see that we can now produce a

solution by noting that

‖Ax − b‖

2

= ‖QRx − b‖

2

= ‖Rx − c‖

2

R

1

x −

c 1

c 2

2

= ‖R

1 x − c 1

2

‖c 2

2

where we have set c = Q

T b ∈ R

m and where R 1 ∈ R

n×n denotes the upper part of the

upper triangular matrix R.

The solution to the least-squares problem is therefore given by the solution of

R

1 x = c 1

where R 1

∈ R

n×n is a square upper-triangular matrix. This system can be solved quite

easily, as we will discuss in the next section.

7 Back Substitution

We are dealing now with the following problem. We seek a solution x ∈ R

n of the equation

Rx = b, (7)

where b ∈ R

n is given and R ∈ R

n×n is an upper triangular matrix.

The system has a unique solution if the determinant of R is different from zero. Since

we have an upper triangular matrix, the determinant equals the product of all diagonal

entries. Hence, R is invertible if rii )= 0 for 1 ≤ i ≤ n.

Assuming this, we see that the last equation of (7) is simply given by

rnnxn = bn,

which resolves to x n = b n /r nn

. Next, the equation from row n − 1 is

rn− 1 ,n− 1 xn− 1 + rn− 1 ,nxn = bn− 1.

But we already know x n

. Hence, we can compute x n− 1 = (b n− 1 − r n− 1 ,n x n )/r n− 1 ,n− 1

The general algorithm for computing the solution in this situation is called back substitu-

tion.

Numerical Linear Algebra, Lecture Notes - Mathematics - 2, Study notes of Mathematical Methods for Numerical Analysis and Optimization

Related documents

Partial preview of the text

Download Numerical Linear Algebra, Lecture Notes - Mathematics - 2 and more Study notes Mathematical Methods for Numerical Analysis and Optimization in PDF only on Docsity!

4 Least-Squares Approximation

R

(A

5 Singular Value Decomposition

AA

∈ R

A =

A

V =

, U = I =

6 The QR decomposition

A = QR,

R =

∈ R

R

H

A =

. A

H

∈ R

H

∈ R

H

H

H

H

A =

. A 2

H

H

· · · H

H

A = R,

R

= ‖R

R

∈ R

7 Back Substitution