Important Definitions and Theorems Reference Sheet, Study notes of Linear Algebra

A reference sheet containing definitions and theorems given in Math 1553. It includes definitions of solution to a system of linear equations, elementary row operations, row equivalent matrices, inconsistent and consistent systems of equations, row echelon form, reduced row echelon form, free variables, and parametric form for the general solution. The document also includes a theorem stating that every matrix is row equivalent to one and only one matrix in reduced row echelon form. It ends with a fact stating the three possibilities for the solution set of a linear system with augmented matrix A.

Typology: Study notes

2021/2022

Uploaded on 05/11/2023

magicphil

IMPORTANT DEFINITIONS AND THEOREMS

REFERENCE SHEET

This is a (not quite comprehensive) list of definitions and theorems given in Math 1553. Pay particular attention to the ones in red.

Study Tip

For each definition, find an example of something that satisfies the requirements of the definition, and an example of something that does not. For each theorem, find an example of something that satisfies the hypotheses of the theorem, and an example of something that does not satisfy the conclusions (or the hypotheses, of course) of the theorem. This is great conceptual practice.

CHAPTER 1

SECTION 1.1.

Definition. A solution to a system of linear equations is a list of numbers making all of the equations true.

Definition. The elementary row operations are the following matrix operations:

  • Multiply all entries in a row by a nonzero number (scale).
  • Add (a multiple of) each entry of one row to the corresponding entry in another (row replacement).
  • Swap two rows.

Definition. Two matrices are called row equivalent if one can be obtained from the other by doing some number of elementary row operations.

Definition. A system of equations is called inconsistent if it has no solution. It is consistent otherwise.

SECTION 1.2.

Definition. A matrix is in row echelon form if

(1) All zero rows are at the bottom.
(2) Each leading nonzero entry of a row is to the right of the leading entry of the row above.
(3) Below a leading entry of a row, all entries are zero.

Definition. A pivot is the first nonzero entry of a row of a matrix in row echelon form.

Definition. A matrix is in reduced row echelon form if it is in row echelon form, and in addition,

(4) The pivot in each nonzero row is equal to 1.
(5) Each pivot is the only nonzero entry in its column.

Theorem. Every matrix is row equivalent to one and only one matrix in reduced row echelon form.

Definition. Consider a consistent linear system of equations in the variables $x_1, \dots, x_n$. Let $A$ be the reduced row echelon form of the matrix for this system. We say that $x_i$ is a free variable if its corresponding column in $A$ is not a pivot column.

Definition. The parametric form for the general solution to a system of equations is a system of equations for the non-free variables in terms of the free variables. For instance, if $x_2$ and $x_4$ are free,

$$x_1 = 2 - 3x_4, \qquad x_3 = -1 - 4x_4$$

is a parametric form.

Theorem. Every solution to a consistent linear system is obtained by substituting (unique) values for the free variables in the parametric form.

Fact. There are three possibilities for the solution set of a linear system with augmented matrix $A$:

(1) The system is inconsistent: it has zero solutions, and the last column of $A$ is a pivot column.
(2) The system has a unique solution: every column of $A$ except the last is a pivot column.
(3) The system has infinitely many solutions: the last column isn't a pivot column, and some other column isn't either. These other non-pivot columns correspond to free variables.
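The three cases can be checked mechanically by row reducing the augmented matrix and looking at where the pivots land. The following is a minimal sketch, not part of the original sheet: the function names `rref` and `classify` are hypothetical, and exact arithmetic with `fractions.Fraction` is an assumption made to avoid rounding issues.

```python
from fractions import Fraction

def rref(matrix):
    """Return (R, pivot_cols): the reduced row echelon form and its pivot columns (0-based)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_cols = []
    r = 0  # row where the next pivot will be placed
    for c in range(n_cols):
        # Look for a nonzero entry in column c, at or below row r.
        hit = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if hit is None:
            continue  # no pivot in this column
        rows[r], rows[hit] = rows[hit], rows[r]           # swap two rows
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]             # scale so the pivot is 1
        for i in range(n_rows):                           # row replacement
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == n_rows:
            break
    return rows, pivot_cols

def classify(augmented):
    """Classify the solution set of the system with the given augmented matrix."""
    _, pivots = rref(augmented)
    last = len(augmented[0]) - 1          # index of the augmented (last) column
    if last in pivots:
        return "inconsistent"
    if len(pivots) == last:               # every other column is a pivot column
        return "unique"
    return "infinitely many"

print(classify([[1, 1, 2], [1, -1, 0]]))  # x + y = 2, x - y = 0
```

Because the reduced row echelon form of a matrix is unique, any valid order of row operations would lead `rref` to the same matrix; the column-by-column order used here is just one convenient choice.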

SECTION 1.3.

Definition. $\mathbb{R}^n$ = all ordered $n$-tuples of real numbers $(x_1, x_2, x_3, \dots, x_n)$.

Definition. A vector is an arrow with a given length and direction.

Definition. A scalar is another name for a real number (to distinguish it from a vector).

Review. Parallelogram law for vector addition.

Definition. A linear combination of vectors $v_1, v_2, \dots, v_n$ is a vector of the form

$$c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$$

where $c_1, c_2, \dots, c_n$ are scalars, called the weights or coefficients of the linear combination.

Definition. A vector equation is an equation involving vectors. (It is equivalent to a list of equations involving only scalars.)

Definition. The span of a set of vectors $v_1, v_2, \dots, v_p$ is the set of all linear combinations of these vectors:

$$\operatorname{Span}\{v_1, \dots, v_p\} = \{\, x_1 v_1 + \cdots + x_p v_p \mid x_1, \dots, x_p \text{ in } \mathbb{R} \,\}.$$

SECTION 1.7.

Definition. A set of vectors $\{v_1, v_2, \dots, v_p\}$ in $\mathbb{R}^n$ is linearly independent if the vector equation

$$x_1 v_1 + x_2 v_2 + \cdots + x_p v_p = 0$$

has only the trivial solution $x_1 = x_2 = \cdots = x_p = 0$.

Definition. A set of vectors $\{v_1, v_2, \dots, v_p\}$ in $\mathbb{R}^n$ is linearly dependent if the vector equation

$$x_1 v_1 + x_2 v_2 + \cdots + x_p v_p = 0$$

has a nontrivial solution (not all $x_i$ are zero). Such a solution is a linear dependence relation.

Theorem. A set of vectors $\{v_1, v_2, \dots, v_p\}$ is linearly dependent if and only if one of the vectors is in the span of the other ones.

Fact. Say $v_1, v_2, \dots, v_n$ are in $\mathbb{R}^m$. If $n > m$ then $\{v_1, v_2, \dots, v_n\}$ is linearly dependent.

Fact. If one of $v_1, v_2, \dots, v_n$ is zero, then $\{v_1, v_2, \dots, v_n\}$ is linearly dependent.

Theorem. Let $v_1, v_2, \dots, v_n$ be vectors in $\mathbb{R}^m$, and let $A$ be the $m \times n$ matrix with columns $v_1, v_2, \dots, v_n$. The following are equivalent:

(1) The set $\{v_1, v_2, \dots, v_n\}$ is linearly independent.
(2) No one vector is in the span of the others.
(3) For every $j$ between 1 and $n$, $v_j$ is not in $\operatorname{Span}\{v_1, v_2, \dots, v_{j-1}\}$.
(4) $Ax = 0$ only has the trivial solution.
(5) $A$ has a pivot in every column.
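Condition (5) gives a direct computational test: put the vectors in as the columns of a matrix and count pivots. The sketch below is illustrative only; `count_pivots` and `is_linearly_independent` are hypothetical names, and it reduces only to row echelon form, which is enough to locate the pivots.

```python
from fractions import Fraction

def count_pivots(matrix):
    """Count the pivots of a matrix by reducing it to row echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = 0
    for c in range(len(rows[0])):
        hit = next((i for i in range(pivots, len(rows)) if rows[i][c] != 0), None)
        if hit is None:
            continue  # column c has no pivot
        rows[pivots], rows[hit] = rows[hit], rows[pivots]
        for i in range(pivots + 1, len(rows)):
            factor = rows[i][c] / rows[pivots][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    return pivots

def is_linearly_independent(vectors):
    """True when the matrix with the given vectors as columns has a pivot in every column."""
    A = [list(row) for row in zip(*vectors)]  # vectors become the columns of A
    return count_pivots(A) == len(vectors)

print(is_linearly_independent([(1, 0, 0), (0, 1, 0)]))  # True
print(is_linearly_independent([(1, 2), (2, 4)]))        # False: second is twice the first
```

The same function illustrates the two Facts above: three vectors in $\mathbb{R}^2$, or any list containing the zero vector, can never produce a pivot in every column.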

SECTION 1.8.

Definition. A transformation (or function or map) from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a rule $T$ that assigns to each vector $x$ in $\mathbb{R}^n$ a vector $T(x)$ in $\mathbb{R}^m$.

  • $\mathbb{R}^n$ is called the domain of $T$.
  • $\mathbb{R}^m$ is called the codomain of $T$.
  • For $x$ in $\mathbb{R}^n$, the vector $T(x)$ in $\mathbb{R}^m$ is the image of $x$ under $T$. Notation: $x \mapsto T(x)$.
  • The set of all images $\{T(x) \mid x \text{ in } \mathbb{R}^n\}$ is the range of $T$.

Notation. $T\colon \mathbb{R}^n \to \mathbb{R}^m$ means $T$ is a transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.

Definition. Let $A$ be an $m \times n$ matrix. The matrix transformation associated to $A$ is the transformation $T\colon \mathbb{R}^n \to \mathbb{R}^m$ defined by $T(x) = Ax$.

  • The domain is $\mathbb{R}^n$, where $n$ is the number of columns of $A$.
  • The codomain is $\mathbb{R}^m$, where $m$ is the number of rows of $A$.
  • The range is the span of the columns of $A$.

Review. Geometric transformations: projection, reflection, rotation, dilation, shear.

Definition. A linear transformation is a transformation $T$ satisfying

$$T(u + v) = T(u) + T(v) \quad\text{and}\quad T(cv) = cT(v)$$

for all vectors $u, v$ and all scalars $c$.

SECTION 1.9.

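Definition. The unit coordinate vectors in $\mathbb{R}^n$ are

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix},\quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \\ 0 \end{pmatrix},\quad \dots,\quad e_{n-1} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ 0 \end{pmatrix},\quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.$$

Fact. If $A$ is a matrix, then $Ae_i$ is the $i$th column of $A$.

Definition. Let $T\colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. The standard matrix for $T$ is

$$\begin{pmatrix} | & | & & | \\ T(e_1) & T(e_2) & \cdots & T(e_n) \\ | & | & & | \end{pmatrix}.$$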

Theorem. If T is a linear transformation, then it is the matrix transformation associated to its standard matrix.

Definition. A transformation $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is onto (or surjective) if the range of $T$ is equal to $\mathbb{R}^m$ (its codomain). In other words, each $b$ in $\mathbb{R}^m$ is the image of at least one $x$ in $\mathbb{R}^n$.

Theorem. Let $T\colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation with matrix $A$. Then the following are equivalent:

  • $T$ is onto
  • $T(x) = b$ has a solution for every $b$ in $\mathbb{R}^m$
  • $Ax = b$ is consistent for every $b$ in $\mathbb{R}^m$
  • The columns of $A$ span $\mathbb{R}^m$
  • $A$ has a pivot in every row.

Definition. A transformation $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is one-to-one (or into, or injective) if different vectors in $\mathbb{R}^n$ map to different vectors in $\mathbb{R}^m$. In other words, each $b$ in $\mathbb{R}^m$ is the image of at most one $x$ in $\mathbb{R}^n$.

Theorem. Let $T\colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation with matrix $A$. Then the following are equivalent:

  • $T$ is one-to-one
  • $T(x) = b$ has one or zero solutions for every $b$ in $\mathbb{R}^m$
  • $Ax = b$ has a unique solution or is inconsistent for every $b$ in $\mathbb{R}^m$
  • $Ax = 0$ has a unique solution
  • The columns of $A$ are linearly independent
  • $A$ has a pivot in every column.

SECTION 2.2.

Definition. A square matrix $A$ is invertible (or nonsingular) if there is a matrix $B$ of the same size, such that $AB = I_n$ and $BA = I_n$.

In this case we call $B$ the inverse of $A$, and we write $A^{-1} = B$.

Theorem. If $A$ is invertible, then $Ax = b$ has exactly one solution for every $b$, namely:

$$x = A^{-1} b.$$

Fact. Suppose that $A$ and $B$ are invertible $n \times n$ matrices.

(1) $A^{-1}$ is invertible and its inverse is $(A^{-1})^{-1} = A$.
(2) $AB$ is invertible and its inverse is $(AB)^{-1} = B^{-1} A^{-1}$.
(3) $A^T$ is invertible and $(A^T)^{-1} = (A^{-1})^T$.

Theorem. Let $A$ be an $n \times n$ matrix. Here's how to compute $A^{-1}$.

(1) Row reduce the augmented matrix $(A \mid I_n)$.
(2) If the result has the form $(I_n \mid B)$, then $A$ is invertible and $B = A^{-1}$.
(3) Otherwise, $A$ is not invertible.

Theorem. An $n \times n$ matrix $A$ is invertible if and only if it is row equivalent to $I_n$. In this case, the sequence of row operations taking $A$ to $I_n$ also takes $I_n$ to $A^{-1}$.
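The row-reduction recipe for the inverse translates almost line for line into code. This is a rough sketch under stated assumptions: the function name `inverse` is hypothetical, the input is assumed to be a square list of lists, and `Fraction` is used so the result is exact.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce (A | I_n); return A^{-1} as a list of rows, or None if A is not invertible."""
    n = len(A)
    # Build the augmented matrix (A | I_n).
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        hit = next((i for i in range(c, n) if aug[i][c] != 0), None)
        if hit is None:
            return None  # column c has no pivot, so A is not row equivalent to I_n
        aug[c], aug[hit] = aug[hit], aug[c]               # swap
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]               # scale the pivot to 1
        for i in range(n):                                # clear the rest of column c
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    # The left block is now I_n; the right block is the inverse.
    return [row[n:] for row in aug]

print(inverse([[2, 1], [5, 3]]))   # entries 3, -1, -5, 2 (as Fractions)
print(inverse([[1, 2], [2, 4]]))   # None: the second row is a multiple of the first
```

Every step applied to the left block is applied to the right block at the same time, which is exactly the statement that the row operations taking $A$ to $I_n$ also take $I_n$ to $A^{-1}$.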

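Definition. The determinant of a $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is

$$\det(A) = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc.$$

Fact. If $A$ is a $2 \times 2$ matrix, then $A$ is invertible if and only if $\det(A) \neq 0$. In this case,

$$A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$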

Definition. An elementary matrix is a square matrix $E$ which differs from the identity matrix by exactly one row operation.

Fact. If E is the elementary matrix for a row operation, and A is a matrix, then EA differs from A by the same row operation.

SECTION 2.3.

Definition. A transformation $T\colon \mathbb{R}^n \to \mathbb{R}^n$ is invertible if there exists another transformation $U\colon \mathbb{R}^n \to \mathbb{R}^n$ such that

$$T \circ U(x) = x \quad\text{and}\quad U \circ T(x) = x$$

for all $x$ in $\mathbb{R}^n$. In this case we say $U$ is the inverse of $T$, and we write $U = T^{-1}$.

Fact. A transformation $T$ is invertible if and only if it is both one-to-one and onto.

Theorem. If $T$ is an invertible linear transformation with matrix $A$, then $T^{-1}$ is an invertible linear transformation with matrix $A^{-1}$.

I’ll keep all of the conditions of the IMT right here, even though we don’t encounter some until later:

The Invertible Matrix Theorem. Let $A$ be a square $n \times n$ matrix, and let $T\colon \mathbb{R}^n \to \mathbb{R}^n$ be the linear transformation $T(x) = Ax$. The following statements are equivalent.

(1) $A$ is invertible.
(2) $T$ is invertible.
(3) $A$ is row equivalent to $I_n$.
(4) $A$ has $n$ pivots.
(5) $Ax = 0$ has only the trivial solution.
(6) The columns of $A$ are linearly independent.
(7) $T$ is one-to-one.
(8) $Ax = b$ is consistent for all $b$ in $\mathbb{R}^n$.
(9) The columns of $A$ span $\mathbb{R}^n$.
(10) $T$ is onto.
(11) $A$ has a left inverse (there exists $B$ such that $BA = I_n$).
(12) $A$ has a right inverse (there exists $B$ such that $AB = I_n$).
(13) $A^T$ is invertible.
(14) The columns of $A$ form a basis for $\mathbb{R}^n$.
(15) $\operatorname{Col} A = \mathbb{R}^n$.
(16) $\dim \operatorname{Col} A = n$.
(17) $\operatorname{rank} A = n$.
(18) $\operatorname{Nul} A = \{0\}$.
(19) $\dim \operatorname{Nul} A = 0$.
(20) $\det(A) \neq 0$.
(21) The number 0 is not an eigenvalue of $A$.

SECTION 2.8.

Definition. A subspace of $\mathbb{R}^n$ is a subset $V$ of $\mathbb{R}^n$ satisfying:

(1) The zero vector is in $V$.
(2) If $u$ and $v$ are in $V$, then $u + v$ is also in $V$.
(3) If $u$ is in $V$ and $c$ is in $\mathbb{R}$, then $cu$ is in $V$.

Definition. If $V = \operatorname{Span}\{v_1, v_2, \dots, v_n\}$, we say that $V$ is the subspace generated by or spanned by the vectors $v_1, v_2, \dots, v_n$.

Theorem. A subspace is a span, and a span is a subspace.

Definition. The column space of a matrix A is the subspace spanned by the columns of A. It is written Col A.

Definition. The null space of $A$ is the set of all solutions of the homogeneous equation $Ax = 0$:

$$\operatorname{Nul} A = \{\, x \mid Ax = 0 \,\}.$$

Example. The following are the most important examples of subspaces in this class (some won’t appear until later):

CHAPTER 3

SECTION 3.1.

Definition. The $ij$ minor of an $n \times n$ matrix $A$ is the $(n-1) \times (n-1)$ matrix $A_{ij}$ you get by deleting the $i$th row and the $j$th column from $A$.

Definition. The $ij$ cofactor of $A$ is $C_{ij} = (-1)^{i+j} \det A_{ij}$.

Definition. The determinant of an $n \times n$ matrix $A$ can be calculated using cofactor expansion along any row or column:

$$\det A = \sum_{j=1}^{n} a_{ij} C_{ij} \quad\text{for any fixed } i$$

$$\det A = \sum_{i=1}^{n} a_{ij} C_{ij} \quad\text{for any fixed } j$$
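Cofactor expansion is naturally recursive, since each cofactor is itself a smaller determinant. A minimal sketch follows; the names `minor` and `det` are hypothetical, and the code uses 0-based indices, which leaves the sign $(-1)^{i+j}$ unchanged because shifting both indices by one changes the exponent by two.

```python
def minor(A, i, j):
    """The matrix A_ij: delete row i and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A, row=0):
    """Determinant by cofactor expansion along the given row (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Sum of a_ij * C_ij over the chosen row, with C_ij = (-1)^(i+j) * det(A_ij).
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(n))

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(det(A, row=0), det(A, row=1), det(A, row=2))  # the same value for every row
```

This is fine for small matrices, but the number of terms grows like $n!$, so it is not a practical method for large $n$.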

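Theorem. There are special formulas for determinants of $2 \times 2$ and $3 \times 3$ matrices:

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

$$\det\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}$$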

Theorem. The determinant of an upper-triangular or lower-triangular matrix is the product of the diagonal entries.

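Theorem. If $A$ is an invertible $n \times n$ matrix, then

$$A^{-1} = \frac{1}{\det A} \begin{pmatrix} C_{11} & C_{21} & C_{31} & \cdots & C_{n1} \\ C_{12} & C_{22} & C_{32} & \cdots & C_{n2} \\ C_{13} & C_{23} & C_{33} & \cdots & C_{n3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & C_{3n} & \cdots & C_{nn} \end{pmatrix}.$$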

SECTION 3.2.

Definition. The determinant is a function

$$\det\colon \{\text{square matrices}\} \longrightarrow \mathbb{R}$$

with the following defining properties:

(1) $\det(I_n) = 1$.
(2) If we do a row replacement on a matrix (add a multiple of one row to another), the determinant does not change.
(3) If we swap two rows of a matrix, the determinant scales by $-1$.
(4) If we scale a row of a matrix by $k$, the determinant scales by $k$.

Theorem. You can use the defining properties of the determinant to compute the determinant of any matrix using row reduction.

Magical Properties of the Determinant.

(1) There is one and only one function $\det\colon \{\text{square matrices}\} \to \mathbb{R}$ satisfying the defining properties (1)–(4).
(2) $A$ is invertible if and only if $\det(A) \neq 0$.
(3) If we row reduce $A$ without row scaling, then
    $$\det(A) = (-1)^{\#\text{swaps}} \cdot (\text{product of diagonal entries in REF}).$$
(4) The determinant can be computed using any of the $2n$ cofactor expansions.
(5) $\det(AB) = \det(A)\det(B)$ and $\det(A^{-1}) = \det(A)^{-1}$.
(6) $\det(A) = \det(A^T)$.
(7) $|\det(A)|$ is the volume of the parallelepiped defined by the columns of $A$.
(8) If $A$ is an $n \times n$ matrix with transformation $T(x) = Ax$, and $S$ is a subset of $\mathbb{R}^n$, then the volume of $T(S)$ is $|\det(A)|$ times the volume of $S$. (Even for curvy shapes $S$.)
(9) The determinant is multi-linear in the columns (or rows) of a matrix.
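Property (3) gives a much faster way to compute determinants than cofactor expansion. The sketch below is an illustration rather than a prescribed method: `det_by_row_reduction` is a hypothetical name, and it uses only swaps and row replacements so that the sign bookkeeping reduces to counting swaps.

```python
from fractions import Fraction

def det_by_row_reduction(A):
    """det(A) = (-1)^(#swaps) * (product of the diagonal entries of a row echelon form)."""
    rows = [[Fraction(x) for x in row] for row in A]
    n = len(rows)
    swaps = 0
    for c in range(n):
        hit = next((i for i in range(c, n) if rows[i][c] != 0), None)
        if hit is None:
            return Fraction(0)  # no pivot in column c, so a diagonal entry of the REF is 0
        if hit != c:
            rows[c], rows[hit] = rows[hit], rows[c]   # a swap scales det by -1
            swaps += 1
        for i in range(c + 1, n):                     # row replacement leaves det unchanged
            factor = rows[i][c] / rows[c][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[c])]
    product = Fraction(1)
    for i in range(n):
        product *= rows[i][i]
    return (-1) ** swaps * product

print(det_by_row_reduction([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
print(det_by_row_reduction([[0, 1], [1, 0]]))   # one swap
```

Row scaling is deliberately avoided; including it would require dividing the final product by each scale factor, per defining property (4).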

CHAPTER 5

SECTION 5.1.

Definition. Let $A$ be an $n \times n$ matrix.

(1) An eigenvector of $A$ is a nonzero vector $v$ in $\mathbb{R}^n$ such that $Av = \lambda v$, for some $\lambda$ in $\mathbb{R}$. In other words, $Av$ is a multiple of $v$.
(2) An eigenvalue of $A$ is a number $\lambda$ in $\mathbb{R}$ such that the equation $Av = \lambda v$ has a nontrivial solution.

If $Av = \lambda v$ for $v \neq 0$, we say $\lambda$ is the eigenvalue for $v$, and $v$ is an eigenvector for $\lambda$.

Fact. The eigenvalues of a triangular matrix are the diagonal entries.

Fact. A matrix is invertible if and only if zero is not an eigenvalue.

Fact. Eigenvectors with distinct eigenvalues are linearly independent.

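Definition. Let $A$ be an $n \times n$ matrix and let $\lambda$ be an eigenvalue of $A$. The $\lambda$-eigenspace of $A$ is the set of all eigenvectors of $A$ with eigenvalue $\lambda$, plus the zero vector:

$$\lambda\text{-eigenspace} = \{\, v \text{ in } \mathbb{R}^n \mid Av = \lambda v \,\} = \{\, v \text{ in } \mathbb{R}^n \mid (A - \lambda I)v = 0 \,\} = \operatorname{Nul}(A - \lambda I).$$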

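Procedure. How to diagonalize an $n \times n$ matrix $A$:

(1) Find the eigenvalues of $A$ using the characteristic polynomial.
(2) For each eigenvalue $\lambda$ of $A$, compute a basis $\mathcal{B}_\lambda$ for the $\lambda$-eigenspace.
(3) If there are fewer than $n$ total vectors in the union of all of the eigenspace bases $\mathcal{B}_\lambda$, then the matrix is not diagonalizable.
(4) Otherwise, the $n$ vectors $v_1, v_2, \dots, v_n$ in your eigenspace bases are linearly independent, and $A = PDP^{-1}$ for

$$P = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},$$

where $\lambda_i$ is the eigenvalue for $v_i$.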
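As a small check of the procedure, the sketch below assembles $P$ and $D$ for one hand-picked example and confirms that $PDP^{-1}$ recovers $A$. The matrix $A$, its eigenpairs, and the helper names `matmul` and `inverse_2x2` are all assumptions chosen for illustration; the eigenpairs were found by hand from the characteristic polynomial $\lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3)$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the formula (1/det) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1],
     [1, 2]]
# (eigenvalue, eigenvector) pairs, one basis vector per eigenspace.
eigenpairs = [(1, (1, -1)), (3, (1, 1))]

# Columns of P are the eigenvectors; D has the matching eigenvalues on the diagonal.
P = [[v[i] for _, v in eigenpairs] for i in range(2)]
D = [[lam if i == j else 0 for j in range(2)] for i, (lam, _) in enumerate(eigenpairs)]

reconstructed = matmul(matmul(P, D), inverse_2x2(P))
print(reconstructed == A)  # True
```

The order of the columns of $P$ must match the order of the diagonal entries of $D$; any consistent ordering works.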

Definition. Let $\lambda$ be an eigenvalue of a square matrix $A$. The geometric multiplicity of $\lambda$ is the dimension of the $\lambda$-eigenspace.

Theorem. Let $\lambda$ be an eigenvalue of a square matrix $A$. Then

$$1 \le (\text{the geometric multiplicity of } \lambda) \le (\text{the algebraic multiplicity of } \lambda).$$

Corollary. Let $\lambda$ be an eigenvalue of a square matrix $A$. If the algebraic multiplicity of $\lambda$ is 1, then the geometric multiplicity is also 1.

The Diagonalization Theorem (Alternate Form). Let $A$ be an $n \times n$ matrix. The following are equivalent:

(1) $A$ is diagonalizable.
(2) The sum of the geometric multiplicities of the eigenvalues of $A$ equals $n$.
(3) The sum of the algebraic multiplicities of the eigenvalues of $A$ equals $n$, and the geometric multiplicity equals the algebraic multiplicity of each eigenvalue.

STOCHASTIC MATRICES.

Definition. A square matrix A is stochastic if all of its entries are nonnegative, and the sum of the entries of each column is 1.

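Fact. Every stochastic matrix has eigenvalue 1.

Fact. If $\lambda \neq 1$ is an eigenvalue of a stochastic matrix, then $|\lambda| \le 1$; when all entries of the matrix are positive, the inequality is strict, $|\lambda| < 1$.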

Definition. A square matrix A is positive if all of its entries are positive.

Definition. A steady state for a stochastic matrix A is an eigenvector w with eigenvalue 1, such that all entries are positive and sum to 1.

Perron–Frobenius Theorem. If $A$ is a positive stochastic matrix, then it admits a unique steady state vector $w$. Moreover, for any vector $v_0$ with entries summing to some number $c$, the iterates $v_1 = Av_0$, $v_2 = Av_1$, $\dots$, $v_n = Av_{n-1}$, $\dots$ approach $cw$ as $n$ gets large.
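The convergence claim is easy to watch numerically. The following sketch uses a made-up positive stochastic matrix (not taken from the course) whose steady state $w = (2/3, 1/3)$ was worked out by hand from $(A - I)w = 0$; the helper names `apply` and `iterate` are hypothetical.

```python
from fractions import Fraction

# A positive stochastic matrix: all entries positive, each column sums to 1.
A = [[Fraction(3, 4), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(1, 2)]]

def apply(A, v):
    """Matrix-vector product Av."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def iterate(A, v0, steps):
    """Return v_steps, where v_{k+1} = A v_k."""
    v = list(v0)
    for _ in range(steps):
        v = apply(A, v)
    return v

w = [Fraction(2, 3), Fraction(1, 3)]   # steady state: Aw = w, entries positive, sum 1
v0 = [0, 3]                            # entries sum to c = 3
v30 = iterate(A, v0, 30)

print(apply(A, w) == w)                # True
print([float(x) for x in v30])         # very close to c*w = (2, 1)
```

Multiplying by a stochastic matrix preserves the sum of the entries, which is why the limit is $cw$ with $c$ the sum of the entries of $v_0$.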

SECTION 5.5.

Review. Arithmetic in the complex numbers.

The Fundamental Theorem of Algebra. Every polynomial of degree $n$ has exactly $n$ complex roots, counted with multiplicity.

Fact. Complex roots of real polynomials come in conjugate pairs.

Fact. If $\lambda$ is an eigenvalue of a real matrix with eigenvector $v$, then $\overline{\lambda}$ is also an eigenvalue, with eigenvector $\overline{v}$.

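Theorem. Let $A$ be a real $2 \times 2$ matrix with complex (non-real) eigenvalue $\lambda$, and let $v$ be an eigenvector. Then

$$A = PCP^{-1}$$

where

$$P = \begin{pmatrix} | & | \\ \operatorname{Re} v & \operatorname{Im} v \\ | & | \end{pmatrix} \quad\text{and}\quad C = \begin{pmatrix} \operatorname{Re}\lambda & \operatorname{Im}\lambda \\ -\operatorname{Im}\lambda & \operatorname{Re}\lambda \end{pmatrix}.$$

The matrix $C$ is a composition of rotation by $-\arg(\lambda)$ and scaling by $|\lambda|$.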
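Here is one concrete instance of the theorem, checked with Python's built-in complex numbers. The matrix, the eigenvalue $\lambda = 1 - 2i$, and the eigenvector $v = (1 - 2i, 1)$ are an illustrative example computed by hand, not one from the original sheet; the check verifies $AP = PC$, which is equivalent to $A = PCP^{-1}$ because this $P$ is invertible.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[2, -5],
     [1, 0]]
lam = 1 - 2j              # a root of the characteristic polynomial x^2 - 2x + 5
v = [1 - 2j, 1 + 0j]      # an eigenvector for lam

# Confirm Av = lam * v entry by entry.
Av = [sum(a * z for a, z in zip(row, v)) for row in A]
lam_v = [lam * z for z in v]

# Columns of P are Re v and Im v; C is built from Re(lam) and Im(lam).
P = [[z.real, z.imag] for z in v]
C = [[lam.real, lam.imag],
     [-lam.imag, lam.real]]

print(Av == lam_v)                       # True
print(matmul(A, P) == matmul(P, C))      # True
print(abs(lam))                          # the scaling factor |lam| = sqrt(5)
```

Using the conjugate eigenvalue $1 + 2i$ with the conjugate eigenvector would give a different but equally valid pair $P$, $C$.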

Theorem. Let $A$ be a real $n \times n$ matrix. Suppose that for each (real or complex) eigenvalue, the dimension of the eigenspace equals the algebraic multiplicity. Then $A = PCP^{-1}$, where $P$ and $C$ are as follows:

(1) $C$ is block diagonal, where the blocks are $1 \times 1$ blocks containing the real eigenvalues (with their multiplicities), or $2 \times 2$ blocks containing the matrices
    $$\begin{pmatrix} \operatorname{Re}\lambda & \operatorname{Im}\lambda \\ -\operatorname{Im}\lambda & \operatorname{Re}\lambda \end{pmatrix}$$
    for each complex eigenvalue $\lambda$ (with multiplicity).
(2) The columns of $P$ form bases for the eigenspaces for the real eigenvectors, or come in pairs $(\operatorname{Re} v \;\; \operatorname{Im} v)$ for the complex eigenvectors.

CHAPTER 6

SECTION 6.1.

Definition. The dot product of two vectors $x, y$ in $\mathbb{R}^n$ is

$$x \cdot y = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \cdot \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \overset{\text{def}}{=} x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$

Thinking of $x, y$ as column vectors, this is the same as the number $x^T y$.

Definition. The length or norm of a vector $x$ in $\mathbb{R}^n$ is

$$\|x\| = \sqrt{x \cdot x}.$$

Fact. If $x$ is a vector and $c$ is a scalar, then $\|cx\| = |c| \cdot \|x\|$.

SECTION 6.2.

Definition. Let $L = \operatorname{Span}\{u\}$ be a line in $\mathbb{R}^n$, and let $x$ be in $\mathbb{R}^n$. The orthogonal projection of $x$ onto $L$ is the point

$$\operatorname{proj}_L(x) = \frac{x \cdot u}{u \cdot u}\, u.$$

Definition. A set of nonzero vectors is orthogonal if each pair of vectors is orthogonal. It is orthonormal if, in addition, each vector is a unit vector.

Lemma. A set of orthogonal vectors is linearly independent. Hence it is a basis for its span.

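Theorem. Let $\mathcal{B} = \{u_1, u_2, \dots, u_m\}$ be an orthogonal set, and let $x$ be a vector in $W = \operatorname{Span} \mathcal{B}$. Then

$$x = \sum_{i=1}^{m} \frac{x \cdot u_i}{u_i \cdot u_i}\, u_i = \frac{x \cdot u_1}{u_1 \cdot u_1}\, u_1 + \frac{x \cdot u_2}{u_2 \cdot u_2}\, u_2 + \cdots + \frac{x \cdot u_m}{u_m \cdot u_m}\, u_m.$$

In other words, the $\mathcal{B}$-coordinates of $x$ are

$$\left( \frac{x \cdot u_1}{u_1 \cdot u_1},\ \frac{x \cdot u_2}{u_2 \cdot u_2},\ \dots,\ \frac{x \cdot u_m}{u_m \cdot u_m} \right).$$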

SECTION 6.3.

Definition. Let $W$ be a subspace of $\mathbb{R}^n$, and let $\{u_1, u_2, \dots, u_m\}$ be an orthogonal basis for $W$. The orthogonal projection of a vector $x$ onto $W$ is

$$\operatorname{proj}_W(x) \overset{\text{def}}{=} \sum_{i=1}^{m} \frac{x \cdot u_i}{u_i \cdot u_i}\, u_i.$$

Fact. Let $W$ be a subspace of $\mathbb{R}^n$. Every vector $x$ can be decomposed uniquely as

$$x = x_W + x_{W^\perp}$$

where $x_W$ is the closest vector to $x$ in $W$, and $x_{W^\perp}$ is in $W^\perp$.

Theorem. Let $W$ be a subspace of $\mathbb{R}^n$, and let $x$ be a vector in $\mathbb{R}^n$. Then $\operatorname{proj}_W(x)$ is the closest point to $x$ in $W$. Therefore

$$x_W = \operatorname{proj}_W(x) \quad\text{and}\quad x_{W^\perp} = x - \operatorname{proj}_W(x).$$

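Best Approximation Theorem. Let $W$ be a subspace of $\mathbb{R}^n$, and let $x$ be a vector in $\mathbb{R}^n$. Then $y = \operatorname{proj}_W(x)$ is the closest point in $W$ to $x$, in the sense that

$$\operatorname{dist}(x, y') \ge \operatorname{dist}(x, y) \quad\text{for all } y' \text{ in } W.$$

Definition. We can think of orthogonal projection as a transformation:

$$\operatorname{proj}_W\colon \mathbb{R}^n \longrightarrow \mathbb{R}^n, \qquad x \mapsto \operatorname{proj}_W(x).$$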

Theorem. Let $W$ be a subspace of $\mathbb{R}^n$.

(1) $\operatorname{proj}_W$ is a linear transformation.
(2) For every $x$ in $W$, we have $\operatorname{proj}_W(x) = x$.
(3) For every $x$ in $W^\perp$, we have $\operatorname{proj}_W(x) = 0$.
(4) The range of $\operatorname{proj}_W$ is $W$.

Fact. Let $W$ be an $m$-dimensional subspace of $\mathbb{R}^n$, let $\operatorname{proj}_W\colon \mathbb{R}^n \to W$ be the projection, and let $A$ be the matrix for $\operatorname{proj}_W$.

(1) $A$ is diagonalizable with eigenvalues 0 and 1; it is similar to the diagonal matrix with $m$ ones and $n - m$ zeros on the diagonal.
(2) $A^2 = A$.

SECTION 6.4.

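The Gram–Schmidt Process. Let $\{v_1, v_2, \dots, v_m\}$ be a basis for a subspace $W$ of $\mathbb{R}^n$. Define:

(1) $u_1 = v_1$
(2) $u_2 = v_2 - \operatorname{proj}_{\operatorname{Span}\{u_1\}}(v_2) = v_2 - \dfrac{v_2 \cdot u_1}{u_1 \cdot u_1}\, u_1$
(3) $u_3 = v_3 - \operatorname{proj}_{\operatorname{Span}\{u_1, u_2\}}(v_3) = v_3 - \dfrac{v_3 \cdot u_1}{u_1 \cdot u_1}\, u_1 - \dfrac{v_3 \cdot u_2}{u_2 \cdot u_2}\, u_2$
    $\vdots$
(m) $u_m = v_m - \operatorname{proj}_{\operatorname{Span}\{u_1, u_2, \dots, u_{m-1}\}}(v_m) = v_m - \displaystyle\sum_{i=1}^{m-1} \frac{v_m \cdot u_i}{u_i \cdot u_i}\, u_i$

Then $\{u_1, u_2, \dots, u_m\}$ is an orthogonal basis for the same subspace $W$.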
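The process can be sketched in a few lines of code. This is an illustrative implementation with hypothetical names (`dot`, `gram_schmidt`); it assumes the input vectors are linearly independent, since otherwise some $u_i$ would be zero and the division by $u_i \cdot u_i$ would fail.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(basis):
    """Return an orthogonal basis u_1, ..., u_m for the span of the given basis."""
    ortho = []
    for v in basis:
        u = [Fraction(x) for x in v]
        for w in ortho:
            # Subtract the projection of v onto the line spanned by w.
            coeff = dot(v, w) / dot(w, w)
            u = [a - coeff * b for a, b in zip(u, w)]
        ortho.append(u)
    return ortho

u1, u2, u3 = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
print(u3)                                  # entries -2/3, 2/3, 2/3
print(dot(u1, u2), dot(u1, u3), dot(u2, u3))   # all zero
```

Dividing each $u_i$ by its length would give an orthonormal basis, at the cost of introducing square roots; the sketch stops at the orthogonal basis so the arithmetic stays exact.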

QR Factorization Theorem. Let A be a matrix with linearly independent columns. Then

A = QR

where Q has orthonormal columns and R is upper-triangular with positive diagonal entries.

Review. Procedure for computing Q and R given A.

SECTION 6.5.

Definition. A least squares solution to $Ax = b$ is a vector $\hat{x}$ in $\mathbb{R}^n$ such that

$$\|b - A\hat{x}\| \le \|b - Ax\|$$

for all $x$ in $\mathbb{R}^n$.

Theorem. The least squares solutions to $Ax = b$ are the solutions to

$$(A^T A)\hat{x} = A^T b.$$

Theorem. If $A$ has orthogonal columns $v_1, v_2, \dots, v_n$, then the least squares solution to $Ax = b$ is

$$\hat{x} = \left( \frac{b \cdot v_1}{v_1 \cdot v_1},\ \frac{b \cdot v_2}{v_2 \cdot v_2},\ \dots,\ \frac{b \cdot v_n}{v_n \cdot v_n} \right).$$

Theorem. Let $A$ be an $m \times n$ matrix. The following are equivalent:

(1) $Ax = b$ has a unique least squares solution for all $b$ in $\mathbb{R}^m$.
(2) The columns of $A$ are linearly independent.
(3) $A^T A$ is invertible.

In this case, the least squares solution is $(A^T A)^{-1}(A^T b)$.
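To make the normal equations concrete, the sketch below fits a line $y = c + mx$ through three made-up data points $(0, 6)$, $(1, 0)$, $(2, 0)$ by solving $(A^T A)\hat{x} = A^T b$. The data and the helper names (`transpose`, `matmul`, `solve_2x2`) are assumptions for illustration; because $A$ has two columns, $A^T A$ is $2 \times 2$ and the inverse formula for $2 \times 2$ matrices suffices.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def solve_2x2(M, rhs):
    """Solve Mx = rhs for an invertible 2x2 matrix M using x = M^{-1} rhs."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("A^T A is not invertible: columns of A are linearly dependent")
    return [Fraction(d * rhs[0] - b * rhs[1], det),
            Fraction(-c * rhs[0] + a * rhs[1], det)]

# Rows of A are (1, x_i); b holds the y_i. Unknowns are (c, m) in y = c + m*x.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]

At = transpose(A)
AtA = matmul(At, A)
Atb = [sum(a * y for a, y in zip(row, b)) for row in At]
x_hat = solve_2x2(AtA, Atb)

# The residual b - A x_hat should be orthogonal to every column of A.
residual = [y - sum(a * x for a, x in zip(row, x_hat)) for row, y in zip(A, b)]
print(x_hat)      # intercept 5, slope -3
print(residual)   # 1, -2, 1
```

The orthogonality of the residual to the columns of $A$ is another way of saying that $A\hat{x}$ is the projection of $b$ onto $\operatorname{Col} A$.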

Review. Examples of best fit problems using least squares.