












































These lecture notes cover the basic concepts of matrices and vectors, linear systems of equations, vector space, linear independence, basis, and span, matrix properties, eigenvector and eigenvalue, matrix inversion, spectral mapping theorem, matrix exponentials, inner product, and norms. The notes are specifically designed for the course Linear Algebra for Controls at the University of Washington and are taught by Xu Chen, the Bryan T. McMinn Endowed Research Professorship Associate Professor in the Department of Mechanical Engineering. The notes are comprehensive and include exercises for practice.
Typology: Lecture notes













































1 Basic concepts of matrices and vectors
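A linear equation set
$$\begin{aligned} 3x_1 + 4x_2 + 10x_3 &= 6 \\ x_1 + 4x_2 - 10x_3 &= 5 \\ 4x_2 + 10x_3 &= -1 \end{aligned} \tag{1}$$
can be simply written as
$$\begin{bmatrix} 3 & 4 & 10 \\ 1 & 4 & -10 \\ 0 & 4 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ -1 \end{bmatrix} \tag{2}$$
(2) writes $x_1$, $x_2$, and $x_3$ just once rather than two or three times as in (1). There are only three unknowns in the above linear equation set. The notational simplicity and the many algebraic conveniences that will arise, however, are significant when we have thousands of unknowns.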
Formally, we write an m × n matrix A as
$$A = [a_{jk}] = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & \dots & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}$$
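Here A has m rows and n columns.
Two matrices $A = [a_{jk}]$ and $B = [b_{jk}]$ of the same size are equal if they are such that $a_{jk} = b_{jk}$ for any j and k.
If m = n, A is a square matrix, and $a_{11}, a_{22}, \dots, a_{nn}$ are the diagonal entries of A.
These entries form the main diagonal.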
$$b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
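Example (Matrix and quadratic forms). We can use matrices to express general quadratic functions of vectors. For instance
$$f(x) = x^T A x + 2 b^T x + c$$
is equivalent to
$$f(x) = \begin{bmatrix} x \\ 1 \end{bmatrix}^T \begin{bmatrix} A & b \\ b^T & c \end{bmatrix} \begin{bmatrix} x \\ 1 \end{bmatrix}$$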
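The equivalence of the two quadratic-form expressions can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`matvec`, `dot`, `quadratic_direct`, `quadratic_block`) and the sample values of A, b, c, and x are assumptions made for illustration, not part of the notes.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def quadratic_direct(A, b, c, x):
    """Evaluate x^T A x + 2 b^T x + c term by term."""
    return dot(x, matvec(A, x)) + 2 * dot(b, x) + c

def quadratic_block(A, b, c, x):
    """Evaluate [x; 1]^T [[A, b], [b^T, c]] [x; 1]."""
    # Append b as the last column of A, then b^T and c as the last row.
    M = [row + [bi] for row, bi in zip(A, b)] + [b + [c]]
    z = x + [1]  # augmented vector [x; 1]
    return dot(z, matvec(M, z))

# Illustrative values (assumed for this sketch)
A = [[2, 1], [1, 3]]
b = [4, -1]
c = 5
x = [1, 2]

print(quadratic_direct(A, b, c, x), quadratic_block(A, b, c, x))
```

Both expressions evaluate to the same number for any choice of x, because the two cross terms $x^T b$ and $b^T x$ in the block form are equal scalars and add up to $2 b^T x$.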
The sum of two matrices A and B (of the same size) is
$$A + B = [a_{jk} + b_{jk}]$$
The product between an m × n matrix A and a scalar c is
$$cA = [c a_{jk}]$$
i.e., each entry of A is multiplied by c to generate the corresponding entry of cA.
The matrix product C = AB is meaningful only if the column number of A equals the row number of B. The computation is done as shown in the following example:
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix}$$
where
$$c_{21} = a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} = [a_{21}, a_{22}, a_{23}] \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}$$
that is, "second row of A" × "first column of B".
More generally:
$$c_{jk} = a_{j1} b_{1k} + a_{j2} b_{2k} + \dots + a_{jn} b_{nk} = [a_{j1}, a_{j2}, \dots, a_{jn}] \begin{bmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{bmatrix}$$
namely, the jk entry of C is obtained by multiplying each entry in the jth row of A by the corresponding entry in the kth column of B and then adding these n products. This is called a multiplication of rows into columns.
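The "rows into columns" rule translates directly into code. Below is a minimal sketch in plain Python; the function name `matmul` and the numeric entries of A and B are illustrative assumptions, chosen only to match the 4 × 3 times 3 × 2 shape of the example.

```python
def matmul(A, B):
    """Return C = AB, where c_jk is row j of A times column k of B."""
    n = len(B)  # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    return [
        [sum(A[j][i] * B[i][k] for i in range(n)) for k in range(p)]
        for j in range(len(A))
    ]

# Illustrative 4x3 and 3x2 matrices (assumed values)
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]]
B = [[1, 0],
     [0, 1],
     [2, -1]]

C = matmul(A, B)
# c_21 is "second row of A" times "first column of B": 4*1 + 5*0 + 6*2
print(C[1][0])
```

Note that Python indices start at 0, so the entry written $c_{21}$ in the text is `C[1][0]` in the code.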
2 Linear systems of equations
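A linear system of m equations in n unknowns $x_1, \dots, x_n$ is a set of equations of the form
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n &= b_2 \\ &\;\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n &= b_m \end{aligned} \tag{4}$$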
If all of $b_1, \dots, b_m$ are zero, (4) is called a homogeneous system; otherwise it is a nonhomogeneous system.
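The m equations (4) can be written as a single vector equation
$$Ax = b$$
where
$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$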
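Gauss$^1$ elimination is a systematic method to solve linear equations. Consider
$$\underbrace{\begin{bmatrix} 1 & -1 & 1 \\ -1 & 1 & -1 \\ 0 & 10 & 25 \\ 20 & 10 & 0 \end{bmatrix}}_{A} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 \\ 0 \\ 90 \\ 80 \end{bmatrix}}_{b}$$
with the augmented matrix
$$\begin{bmatrix} A & b \end{bmatrix} = \left[\begin{array}{ccc|c} 1 & -1 & 1 & 0 \\ -1 & 1 & -1 & 0 \\ 0 & 10 & 25 & 90 \\ 20 & 10 & 0 & 80 \end{array}\right]$$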
$^1$ Johann Carl Friedrich Gauss, 1777-1855, German mathematician: contributed significantly to many fields, including number theory, algebra, statistics, analysis, differential geometry, geodesy, geophysics, electrostatics, astronomy, matrix theory, and optics.
Gauss was an ardent perfectionist. He was never a prolific writer, refusing to publish work which he did not consider complete and above criticism. Mathematical historian Eric Temple Bell estimated that, had Gauss published all of his discoveries in a timely manner, he would have advanced mathematics by fifty years.
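Taking the first row as the pivot row, we add it to the second row and add −20 times it to the fourth row, which gives
$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 10 & 25 & 90 \\ 0 & 30 & -20 & 80 \end{array}\right] \quad \begin{array}{l} \text{pivot row} \\ \text{row 2} \to \text{add pivot row} \\ \\ \text{row 4} \to \text{add } -20 \times \text{pivot row} \end{array}$$
What we have done is use the pivot row to eliminate $x_1$ in the other equations. At this stage, the linear equations look like
$$\begin{align} x_1 - x_2 + x_3 &= 0 \tag{5} \\ 0 &= 0 \tag{6} \\ 10 x_2 + 25 x_3 &= 90 \tag{7} \\ 30 x_2 - 20 x_3 &= 80 \tag{8} \end{align}$$
Re-arranging yields
$$\begin{align} x_1 - x_2 + x_3 &= 0 \tag{9} \\ 10 x_2 + 25 x_3 &= 90 \tag{10} \\ 30 x_2 - 20 x_3 &= 80 \tag{11} \end{align}$$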
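Moving on, we can get rid of $x_2$ in the third equation by adding to it −3 times the second equation. Correspondingly, in the augmented matrix we have
$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 0 \\ 0 & 10 & 25 & 90 \\ 0 & 0 & -95 & -190 \end{array}\right] \xrightarrow{\text{normalizing}} \underbrace{\left[\begin{array}{ccc|c} 1 & -1 & 1 & 0 \\ 0 & 1 & 2.5 & 9 \\ 0 & 0 & 1 & 38/19 \end{array}\right]}_{\text{the row echelon form}}$$
namely
$$\begin{aligned} x_3 &= 38/19 \\ x_2 + 2.5 x_3 &= 9 \\ x_1 - x_2 + x_3 &= 0 \end{aligned}$$
The unknowns can now be readily obtained by back substitution: $x_3 = 38/19 = 2$, $x_2 = 9 - 2.5 x_3 = 4$, $x_1 = x_2 - x_3 = 2$.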
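The elimination and back-substitution steps can be sketched in code. The following is a minimal illustration in plain Python, applied to the three re-arranged equations (9) to (11); the function name `gauss_solve` is an assumption, and the sketch handles only square systems with a unique solution. Exact rational arithmetic (`fractions.Fraction`) is used so that no rounding error enters.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve a square system A x = b by elimination and back substitution."""
    n = len(A)
    # Augmented matrix [A b] with exact rational entries
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Pick a pivot row with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the current unknown from the rows below the pivot row
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    # Back substitution, starting from the last equation
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Coefficients of equations (9)-(11)
A = [[1, -1, 1],
     [0, 10, 25],
     [0, 30, -20]]
b = [0, 90, 80]

x = gauss_solve(A, b)
print([str(v) for v in x])  # x1, x2, x3
```

For this system the sketch returns $x_1 = 2$, $x_2 = 4$, $x_3 = 2$, which can be verified by substituting back into (9) to (11).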
3 Vector space, linear independence, basis, and span
Given a set of m vectors $a_1, a_2, \dots, a_m$ with the same size,
$$k_1 a_1 + k_2 a_2 + \dots + k_m a_m$$
is called a linear combination of the vectors, where $k_1, \dots, k_m$ are scalars. If
$$a_1 = k_2 a_2 + k_3 a_3 + \dots + k_m a_m$$
then $a_1$ is said to be linearly dependent on $a_2, a_3, \dots, a_m$. The set
$$\{a_1, a_2, \dots, a_m\} \tag{13}$$
is then a linearly dependent set. The same idea holds if $a_2$ or any other vector in the set (13) is linearly dependent on the others.
Generalizing, if
$$k_1 a_1 + k_2 a_2 + \dots + k_m a_m = 0$$
holds if and only if
$$k_1 = k_2 = \dots = k_m = 0$$
then the vectors in (13) are linearly independent. Otherwise, the equation also holds with scalars that are not all zero, and the vectors are linearly dependent. This is saying that at least one of the vectors can be expressed as a linear combination of the other vectors.
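One practical way to test a set of vectors for linear independence is to stack them as rows of a matrix, eliminate, and count the nonzero rows that remain: the set is independent exactly when that count equals the number of vectors. The sketch below is a minimal plain-Python illustration; the function names `rank` and `is_linearly_independent` and the sample vectors are assumptions made for this example.

```python
from fractions import Fraction

def rank(vectors):
    """Count the pivots left after row elimination on the stacked vectors."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """A set is independent when no vector is lost during elimination."""
    return rank(vectors) == len(vectors)

print(is_linearly_independent([[1, 0], [0, 1]]))          # two vectors in the plane
print(is_linearly_independent([[1, 0], [0, 1], [1, 1]]))  # third is the sum of the first two
```

In the second call the third vector equals the sum of the first two, so the equation $k_1 a_1 + k_2 a_2 + k_3 a_3 = 0$ holds with $k_1 = k_2 = 1$, $k_3 = -1$, and the set is dependent.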
Why is linear independence important? If a set of vectors is linearly dependent, then we
can get rid of one or perhaps more of the vectors until we get a linearly independent set. This set is
then the smallest “truly essential” set with which we can work.
Consider a set of n linearly independent vectors, $a_1, a_2, \dots, a_n$, each with n components. All the possible linear combinations of $a_1, a_2, \dots, a_n$ form the vector space $\mathbb{R}^n$. This is the span of the n vectors.
Definition 2 (Basis). A basis of V is a set B of vectors in V, such that any v ∈ V can be uniquely
expressed as a finite linear combination of vectors in B.
Example 3. In $\mathbb{R}^2$,
v 1 =
, v 2 =
is a linearly independent set and forms a basis.
v 1 =
, v 2 =
, v 3 =
is not a linearly independent set.
4 Matrix properties
Definition 4 (Rank). The rank of a matrix A is the maximum number of linearly independent row or
column vectors.
Theorem. Row or column operations do not change the rank of a matrix.
With the concept of linear dependence, many matrix-matrix operations can be understood from the
view point of vector manipulations.
Example (Dyad). $A = uv^T$ is called a dyad, where u and v are vectors of proper dimensions. It is a rank 1 matrix (for nonzero u and v), as can be seen from the fact that every column of $A = uv^T$ is a multiple of the vector u, where the weights are the entries of v.
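To make the rank-1 claim concrete, the dyad can be written out column by column. The following is a short worked expansion, assuming u has m entries and v has n entries (the dimensions are an assumption for this sketch):

```latex
A = u v^T
  = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_m \end{bmatrix}
    \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix}
  = \begin{bmatrix} v_1 u & v_2 u & \dots & v_n u \end{bmatrix}
```

Every column is a scalar multiple of the single vector u, so at most one column is linearly independent; if u and v are both nonzero, the rank is exactly 1.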
Fact. For $A, B \in \mathbb{R}^{n \times n}$, if rank(A) = n then AB = 0 implies B = 0. If AB = 0 but A ≠ 0 and B ≠ 0, then rank(A) < n and rank(B) < n.
Definition 5 (Range space). The range space of a matrix A, denoted as R (A), is the span of all the
column vectors of A.
Definition 6 (Null space). The null space of a matrix $A \in \mathbb{R}^{n \times n}$, denoted as N(A), is the vector space
$$\{x \in \mathbb{R}^n : Ax = 0\}$$
The dimension of the null space is called nullity of the matrix.
Fact 7. The following is true:
T
T
T
Determinants were originally introduced for solving linear equations in the form of Ax = y, with a square A. They are cumbersome to compute for high-order matrices, but their definitions and concepts are nevertheless very important.
We review only the computations for second- and third-order matrices:
$$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$$
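For the third-order case, one standard way to compute the determinant is cofactor expansion along the first row, which reduces it to three second-order determinants. The expansion below is a sketch of that standard formula, written in the $a_{jk}$ notation used earlier; it is not necessarily the form used in the original notes.

```latex
\det \begin{bmatrix}
  a_{11} & a_{12} & a_{13} \\
  a_{21} & a_{22} & a_{23} \\
  a_{31} & a_{32} & a_{33}
\end{bmatrix}
= a_{11} \left( a_{22} a_{33} - a_{23} a_{32} \right)
- a_{12} \left( a_{21} a_{33} - a_{23} a_{31} \right)
+ a_{13} \left( a_{21} a_{32} - a_{22} a_{31} \right)
```

Each bracket is the second-order determinant obtained by deleting the first row and the column of the entry that multiplies it, with alternating signs.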
$$x = x_0 + z, \quad Az = 0 \tag{15}$$
where $x_0$ is any (fixed) solution of (14) and z runs through all the homogeneous solutions of Az = 0; namely, z runs through all vectors in the null space of A.
You should be familiar with solving 2nd or 3rd-order linear equations by hand.
6 Eigenvector and eigenvalue
Think of Ax this way: A defines a linear operator; Ax is a vector produced by feeding the vector x to this linear operator. In the two-dimensional case, we can look at Fig. 1. Certainly, Ax does not (at all) need to be in the same direction as x. An example is
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$
which gives that
$$A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ 0 \end{bmatrix}$$
namely, Ax is x projected on the first axis in the two-dimensional vector space, which will not be in the same direction as x as long as $x_2 \neq 0$.
Figure 1: Example relationship between x and Ax
From here comes the concept of eigenvectors and eigenvalues. It says that there are certain “special directions/vectors” (denoted as $v_1$ and $v_2$ in our two-dimensional example) for A such that $A v_i = \lambda_i v_i$. Thus $A v_i$ is on the same line as the original vector $v_i$, just scaled by the eigenvalue $\lambda_i$. It can be shown that if $\lambda_1 \neq \lambda_2$, then $v_1$ and $v_2$ are linearly independent (your homework). This is saying that any vector in $\mathbb{R}^2$ can be decomposed as
$$x = a_1 v_1 + a_2 v_2$$
Therefore
$$Ax = a_1 A v_1 + a_2 A v_2 = a_1 \lambda_1 v_1 + a_2 \lambda_2 v_2$$
Knowing $\lambda_i$ and $v_i$ thus directly tells us what Ax looks like. More importantly, we have decomposed Ax into small modules that are often more convenient for analyzing the system properties. Figs. 2 and 3 demonstrate the above idea graphically.
Remark 11. The above geometric interpretations are for matrices with distinct real eigenvalues.
After obtaining an eigenvalue λ, we can find the associated eigenvector by solving (17). This is
nothing but solving a homogeneous system.
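Example 12. Consider
$$A = \begin{bmatrix} -5 & 2 \\ 2 & -2 \end{bmatrix}$$
Then
$$\det(A - \lambda I) = 0 \;\Rightarrow\; \det \begin{bmatrix} -5 - \lambda & 2 \\ 2 & -2 - \lambda \end{bmatrix} = 0 \;\Rightarrow\; (5 + \lambda)(2 + \lambda) - 4 = 0 \;\Rightarrow\; \lambda = -1 \text{ or } -6$$
So A has two eigenvalues: −1 and −6. The characteristic polynomial of A is $\lambda^2 + 7\lambda + 6$.
To obtain the eigenvector associated to λ = −1, we solve
$$(A - \lambda I) x = 0 \;\Leftrightarrow\; \begin{bmatrix} -5 + 1 & 2 \\ 2 & -2 + 1 \end{bmatrix} x = \begin{bmatrix} -4 & 2 \\ 2 & -1 \end{bmatrix} x = 0$$
One solution is
$$x = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
As an exercise, show that an eigenvector associated to λ = −6 is
$$x = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$$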
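The computation in Example 12 can be checked numerically. The sketch below is a minimal plain-Python illustration: it takes the matrix implied by the determinant expression above, finds the roots of the characteristic polynomial with the quadratic formula, and verifies $Av = \lambda v$. The function names and the candidate eigenvectors `[1, 2]` and `[2, -1]` are assumptions chosen for this example; any nonzero multiple of an eigenvector is also an eigenvector.

```python
import math

def eigenvalues_2x2(A):
    """Roots of lambda^2 - trace*lambda + det = 0 (real eigenvalues only)."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex; not handled in this sketch")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[-5, 2],
     [2, -2]]

lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)

# Candidate eigenvectors (assumed for this sketch): check A v = lambda v
v1, v2 = [1, 2], [2, -1]
print(matvec(A, v1), [lam1 * c for c in v1])
print(matvec(A, v2), [lam2 * c for c in v2])
```

The two candidate vectors are orthogonal, which is expected here because the matrix is symmetric.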
Example 13 (Multiple eigenvectors). Obtain the eigenvalues and eigenvectors of
Analogous procedures give that
$$\lambda_1 = 5, \quad \lambda_2 = \lambda_3 = -3$$
So there are repeated eigenvalues. For $\lambda_2 = \lambda_3 = -3$, the characteristic matrix is
The second row is the first row multiplied by 2. The third row is the negative of the first row. So the characteristic matrix has only rank 1. The characteristic equation
$$(A - \lambda_2 I) x = 0$$
has two linearly independent solutions
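The structure described in Example 13 can be illustrated numerically. The matrix used below is a hypothetical choice, assumed for this sketch because it is consistent with the stated properties (eigenvalues 5, −3, −3, and a characteristic matrix whose second row is twice the first and whose third row is the negative of the first); it is not confirmed to be the matrix of the original example. The candidate eigenvectors are likewise assumptions.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical matrix (an assumption), consistent with the stated properties
A = [[-2, 2, -3],
     [2, 1, -6],
     [-1, -2, 0]]
lam = -3  # the repeated eigenvalue

# Characteristic matrix A - lam * I
C = [[A[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]

# Row relations described in the text
row2_is_double = C[1] == [2 * c for c in C[0]]
row3_is_negative = C[2] == [-c for c in C[0]]
print(C, row2_is_double, row3_is_negative)

# Two candidate solutions of (A - lam I) x = 0 (assumed for this sketch)
candidates = [[-2, 1, 0], [3, 0, 1]]
residuals = [matvec(C, x) for x in candidates]
print(residuals)  # all-zero residuals mean both vectors solve the equation
```

Because the characteristic matrix has rank 1 and acts on a three-dimensional space, its null space has dimension 2, which is why two linearly independent eigenvectors exist for the repeated eigenvalue.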
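Theorem 14 (Eigenvalue and determinant). Let $A \in \mathbb{R}^{n \times n}$. Then
$$\det A = \prod_{i=1}^{n} \lambda_i$$
Proof. Letting λ = 0 in the characteristic polynomial
$$p(\lambda) = \det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \dots (\lambda_n - \lambda)$$
gives
$$\det(A) = p(0) = \prod_{i=1}^{n} \lambda_i$$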
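Example 15. For the two-dimensional case
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \;\Rightarrow\; p(\lambda) = \det(A - \lambda I) = (a_{11} - \lambda)(a_{22} - \lambda) - a_{12} a_{21}$$
On the other hand
$$p(\lambda) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)$$
Matching the coefficients we get
$$\lambda_1 + \lambda_2 = a_{11} + a_{22}$$
$$\lambda_1 \lambda_2 = a_{11} a_{22} - a_{12} a_{21}$$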
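The coefficient matching can be spelled out by expanding both expressions for $p(\lambda)$ into powers of $\lambda$. The following is a short worked derivation using only the symbols of Example 15:

```latex
\begin{aligned}
p(\lambda) &= (a_{11} - \lambda)(a_{22} - \lambda) - a_{12} a_{21}
            = \lambda^2 - (a_{11} + a_{22}) \lambda + (a_{11} a_{22} - a_{12} a_{21}) \\
p(\lambda) &= (\lambda_1 - \lambda)(\lambda_2 - \lambda)
            = \lambda^2 - (\lambda_1 + \lambda_2) \lambda + \lambda_1 \lambda_2
\end{aligned}
```

Comparing the coefficients of $\lambda$ gives the sum relation, and comparing the constant terms gives the product relation. The first says that the eigenvalues sum to the trace of A; the second is the two-dimensional case of Theorem 14.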
If $A^{-1}$ exists, A is called nonsingular; otherwise, A is singular.
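Theorem 18 (Diagonalization of a Matrix). Let an n × n matrix A have a basis of eigenvectors $\{x_1, x_2, \dots, x_n\}$, associated to its n distinct eigenvalues $\{\lambda_1, \lambda_2, \dots, \lambda_n\}$, respectively. Then
$$A = XDX^{-1} = [x_1, x_2, \dots, x_n] \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \dots & 0 & \lambda_n \end{bmatrix} [x_1, x_2, \dots, x_n]^{-1} \tag{20}$$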
Also,
$$A^m = X D^m X^{-1}, \quad (m = 2, 3, \dots). \tag{21}$$
Remark 19. From (21), you can find some intuition about the benefit of (20): $A^m$ can be tedious to compute while $D^m$ is very simple!
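Proof. From Theorem 16, the n linearly independent eigenvectors of A form a basis. Write
$$A x_1 = \lambda_1 x_1, \quad A x_2 = \lambda_2 x_2, \quad \dots, \quad A x_n = \lambda_n x_n$$
as
$$A [x_1, x_2, \dots, x_n] = [x_1, x_2, \dots, x_n] \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \dots & 0 & \lambda_n \end{bmatrix}$$
The matrix $[x_1, x_2, \dots, x_n]$ is square. Linear independence of the eigenvectors implies that $[x_1, x_2, \dots, x_n]$ is invertible. Multiplying by $[x_1, x_2, \dots, x_n]^{-1}$ on the right of both sides gives (20).
(21) then immediately follows, as
$$A^m = \left( XDX^{-1} \right)^m = XDX^{-1} \, XDX^{-1} \dots XDX^{-1} = X D^m X^{-1}$$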
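The identity $A^m = X D^m X^{-1}$ can be checked on a small case. The sketch below is a minimal plain-Python illustration for 2 × 2 matrices; the function names are assumptions, and the matrix, eigenvalues, and eigenvectors are taken as the 2 × 2 example with eigenvalues −1 and −6 (eigenvectors assumed to be `[1, 2]` and `[2, -1]`, stored as the columns of `X`). Exact rational arithmetic avoids rounding.

```python
from fractions import Fraction

def matmul(A, B):
    """Return the matrix product AB (rows into columns)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def power_by_multiplication(A, m):
    """Compute A^m with m successive matrix products."""
    result = [[1, 0], [0, 1]]
    for _ in range(m):
        result = matmul(result, A)
    return result

def power_by_diagonalization(X, eigenvalues, m):
    """Compute X D^m X^{-1}; D^m only needs scalar powers."""
    D_m = [[eigenvalues[0] ** m, 0], [0, eigenvalues[1] ** m]]
    return matmul(matmul(X, D_m), inverse_2x2(X))

A = [[-5, 2], [2, -2]]
X = [[1, 2], [2, -1]]   # eigenvectors as columns (assumed for this sketch)
eigenvalues = [-1, -6]  # in the same order as the columns of X
m = 5

direct = power_by_multiplication(A, m)
via_diag = power_by_diagonalization(X, eigenvalues, m)
print(direct == via_diag)
```

The diagonalization route needs only two matrix products and one inverse regardless of m, whereas repeated multiplication needs m products.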
Example 20. Let
The matrix has eigenvalues at 1 and -1, with associated eigenvectors
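Then $A = XDX^{-1}$, where $D = \mathrm{diag}(1, -1)$ and the columns of X are the associated eigenvectors.
Now suppose we are to compute $A^{3000}$. Since $1^{3000} = (-1)^{3000} = 1$, we have $D^{3000} = I$, and we just need to do
$$A^{3000} = X D^{3000} X^{-1} = X X^{-1} = I$$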
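As a concrete illustration of this shortcut, consider a hypothetical 2 × 2 matrix with eigenvalues 1 and −1. The matrix and eigenvectors below are assumptions chosen for this sketch; they are not confirmed to be those of the original example.

```latex
A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\;\Rightarrow\;
A^{3000} = X \begin{bmatrix} 1^{3000} & 0 \\ 0 & (-1)^{3000} \end{bmatrix} X^{-1}
         = X I X^{-1} = I
```

Here $A [1, 1]^T = [1, 1]^T$ and $A [1, -1]^T = -[1, -1]^T$, so the columns of X are eigenvectors for 1 and −1 respectively. For an odd power the same argument gives $A^m = X D X^{-1} = A$.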
7 Similarity transformation
Definition 21 (Similar Matrices. Similarity Transformation). An n × n matrix $\hat{A}$ is called similar to an n × n matrix A if
$$\hat{A} = T^{-1} A T$$
for some nonsingular n × n matrix T. This transformation, which gives $\hat{A}$ from A, is called a similarity transformation.
Let $S_1$ and $S_2$ be two vector spaces of the same dimension. Take the same point P. Let u be its coordinate in $S_1$ and $\hat{u}$ be its coordinate in $S_2$. These coordinates in the two vector spaces are related by some linear transformation T:
$$u = T \hat{u}, \quad \hat{u} = T^{-1} u$$
Consider Fig. 4. Let the point P go through a linear transformation A in the vector space $S_1$ to generate an output point $P_o$. $P_o$ is physically the same point in both $S_1$ and $S_2$. However, the coordinates of $P_o$ are different: if we see it from “standing inside” $S_1$, then
$$y = Au$$
If we see it in $S_2$, then the coordinate is some other value $\hat{y}$.
Figure 4: Same points in different vector spaces
What does the linear transformation A mathematically “look like” in $S_2$?
Result:
$$\hat{y} = T^{-1} y = T^{-1} A u = \left( T^{-1} A T \right) \hat{u}$$
That is, in $S_2$ the same transformation is represented by the similar matrix $\hat{A} = T^{-1} A T$.
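The change-of-coordinates result can be verified on a small numerical case. The sketch below is a minimal plain-Python illustration for 2 × 2 matrices; the function names and the particular values of A, T, and $\hat{u}$ are assumptions chosen for this example (T only needs to be nonsingular).

```python
from fractions import Fraction

def matmul(A, B):
    """Return the matrix product AB (rows into columns)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("T must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative values (assumed for this sketch)
A = [[1, 2], [3, 4]]   # the transformation as seen in S1
T = [[2, 1], [1, 1]]   # relates coordinates: u = T u_hat
u_hat = [3, -1]        # coordinates of P in S2

T_inv = inverse_2x2(T)
u = matvec(T, u_hat)            # coordinates of P in S1
y = matvec(A, u)                # output point, seen in S1
y_hat_direct = matvec(T_inv, y) # the same output point, seen in S2

A_hat = matmul(matmul(T_inv, A), T)         # similar matrix T^{-1} A T
y_hat_via_similar = matvec(A_hat, u_hat)    # apply A_hat directly in S2

print(y_hat_direct == y_hat_via_similar)
```

Both routes give the same $\hat{y}$: mapping to $S_1$, applying A, and mapping back is the same as applying $\hat{A}$ directly to the $S_2$ coordinates.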