















































3.1 SYSTEMS OF LINEAR EQUATIONS
















































In this chapter we introduce matrices via the theory of simultaneous linear
equations. This method has the advantage of leading in a natural way to the
concept of the reduced row-echelon form of a matrix. In addition, we will
formulate some of the basic results dealing with the existence and uniqueness
of solutions of systems of linear equations. In Chapter 5 we will arrive at the
same matrix algebra from the viewpoint of linear transformations.
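Let $a_1, \dots, a_n, y$ be elements of a field $F$, and let $x_1, \dots, x_n$ be unknowns
(also called variables or indeterminates). Then an equation of the form

$$a_1 x_1 + \cdots + a_n x_n = y$$

is called a linear equation in $n$ unknowns (over $F$). The scalars $a_i$ are called
the coefficients of the unknowns, and $y$ is called the constant term of the
equation. A vector $(c_1, \dots, c_n) \in F^n$ is called a solution vector of this
equation if and only if

$$a_1 c_1 + \cdots + a_n c_n = y$$

in which case we say that $(c_1, \dots, c_n)$ satisfies the equation. The set of all
such solutions is called the solution set (or the general solution).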
Now consider the following system of $m$ linear equations in $n$ unknowns:

$$\begin{aligned}
a_{11} x_1 + \cdots + a_{1n} x_n &= y_1 \\
a_{21} x_1 + \cdots + a_{2n} x_n &= y_2 \\
&\;\;\vdots \\
a_{m1} x_1 + \cdots + a_{mn} x_n &= y_m
\end{aligned}$$

We abbreviate this system by

$$\sum_{j=1}^{n} a_{ij} x_j = y_i, \qquad i = 1, \dots, m.$$
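If we let $S_i$ denote the solution set of the equation $\sum_j a_{ij} x_j = y_i$ for each $i$, then
the solution set $S$ of the system is given by the intersection $S = \bigcap_i S_i$. In other
words, a vector $(c_1, \dots, c_n) \in F^n$ is a solution of the system if and only if it is a
solution of each of the $m$ equations in the system.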
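Example 3.1 Consider this system of two equations in three unknowns over
the real field $\mathbb{R}$:

$$\begin{aligned}
2x_1 - 3x_2 + x_3 &= 6 \\
x_1 + 5x_2 - 2x_3 &= 12
\end{aligned}$$

The vector $(3, 1, 3) \in \mathbb{R}^3$ is not a solution of this system because

$$2(3) - 3(1) + 3 = 6$$

while

$$3 + 5(1) - 2(3) = 2 \neq 12.$$

However, the vector $(5, 1, -1) \in \mathbb{R}^3$ is a solution since

$$2(5) - 3(1) + (-1) = 6$$

and

$$5 + 5(1) - 2(-1) = 12. \qquad ∆$$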
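The check carried out in Example 3.1 is easy to mechanize. The following is a minimal sketch: the helper name `is_solution` is hypothetical, and the coefficients of the system are an assumption chosen so that $(5, 1, -1)$ is a solution while $(3, 1, 3)$ satisfies only the first equation.

```python
from fractions import Fraction

def is_solution(coeffs, consts, vec):
    """Return True if vec satisfies every equation sum_j a_ij * x_j = y_i."""
    return all(
        sum(Fraction(a) * Fraction(c) for a, c in zip(row, vec)) == Fraction(y)
        for row, y in zip(coeffs, consts)
    )

# Assumed system (illustrative coefficients):
#   2*x1 - 3*x2 +   x3 = 6
#     x1 + 5*x2 - 2*x3 = 12
A = [[2, -3, 1], [1, 5, -2]]
y = [6, 12]

print(is_solution(A, y, (3, 1, 3)))    # first equation holds, second fails
print(is_solution(A, y, (5, 1, -1)))   # both equations hold
```

Because a solution of the system must lie in every individual solution set $S_i$, the function simply tests each equation in turn, which mirrors the description of $S$ as an intersection.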
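Associated with a system of linear equations are two rectangular arrays of
significance. For the system $\sum_j a_{ij} x_j = y_i$, we define the matrix of coefficients
$A$ as the array

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}$$

and the augmented matrix aug $A$ as the array obtained from $A$ by adjoining the
column of constants $y_1, \dots, y_m$ on the right.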
$$\begin{aligned}
x_1 + \tfrac{1}{2} x_2 - x_3 &= -\tfrac{3}{2} \\
x_1 - 3x_2 + x_3 &= 8 \\
4x_1 - x_2 - 2x_3 &= 3
\end{aligned}$$

Multiply the first equation by $-1$ and add it to the second to obtain a new
second equation, then multiply the first by $-4$ and add it to the third to obtain a
new third equation:

$$\begin{aligned}
x_1 + \tfrac{1}{2} x_2 - x_3 &= -\tfrac{3}{2} \\
-\tfrac{7}{2} x_2 + 2x_3 &= \tfrac{19}{2} \\
-3x_2 + 2x_3 &= 9
\end{aligned}$$

Multiply the second by $-2/7$ to get the coefficient of $x_2$ equal to 1, then
multiply this new second equation by 3 and add to the third:

$$\begin{aligned}
x_1 + \tfrac{1}{2} x_2 - x_3 &= -\tfrac{3}{2} \\
x_2 - \tfrac{4}{7} x_3 &= -\tfrac{19}{7} \\
\tfrac{2}{7} x_3 &= \tfrac{6}{7}
\end{aligned}$$

Multiply the third by $7/2$, then add $4/7$ times this new equation to the second:

$$\begin{aligned}
x_1 + \tfrac{1}{2} x_2 - x_3 &= -\tfrac{3}{2} \\
x_2 &= -1 \\
x_3 &= 3
\end{aligned}$$

Add the third equation to the first, then add $-1/2$ times the second equation to
the new first to obtain

$$x_1 = 2, \qquad x_2 = -1, \qquad x_3 = 3.$$

This is now a solution of our system of equations. While this system could
have been solved in a more direct manner, we wanted to illustrate the
systematic approach that will be needed below. ∆
Two systems of linear equations are said to be equivalent if they have
equal solution sets. That each successive system of equations in Example 3.2
is indeed equivalent to the previous system is guaranteed by the following
theorem.
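Theorem 3.1 The system of two equations in $n$ unknowns over a field $F$

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2
\end{aligned} \tag{1}$$

with $a_{11} \neq 0$ is equivalent to the system

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a'_{22} x_2 + \cdots + a'_{2n} x_n &= b'_2
\end{aligned} \tag{2}$$

in which

$$a'_{2i} = a_{21} a_{1i} - a_{11} a_{2i}$$

for each $i = 1, \dots, n$ and

$$b'_2 = a_{21} b_1 - a_{11} b_2.$$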
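As a concrete instance of the theorem, the sketch below builds the second equation of system (2) from a pair of equations using the formulas $a'_{2i} = a_{21} a_{1i} - a_{11} a_{2i}$ and $b'_2 = a_{21} b_1 - a_{11} b_2$. The function names and the sample equations are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def eliminate_first_unknown(eq1, eq2):
    """Each equation is (coeffs, const). Return the second equation of (2):
    a'_{2i} = a_21 * a_1i - a_11 * a_2i and b'_2 = a_21 * b_1 - a_11 * b_2."""
    (a1, b1), (a2, b2) = eq1, eq2
    if a1[0] == 0:
        raise ValueError("the theorem requires a_11 != 0")
    new_coeffs = [a2[0] * a1i - a1[0] * a2i for a1i, a2i in zip(a1, a2)]
    new_const = a2[0] * b1 - a1[0] * b2
    return new_coeffs, new_const

def satisfies(eq, vec):
    """True if vec satisfies the single equation eq = (coeffs, const)."""
    coeffs, const = eq
    return sum(Fraction(a) * Fraction(x) for a, x in zip(coeffs, vec)) == const

eq1 = ([2, 1, -1], 3)    # hypothetical: 2*x1 + x2 -   x3 = 3
eq2 = ([4, -1, 3], 5)    # hypothetical: 4*x1 - x2 + 3*x3 = 5
eq2_new = eliminate_first_unknown(eq1, eq2)
print(eq2_new)           # the coefficient of x1 is now 0
```

Note that the coefficient of $x_1$ in the new equation is $a_{21} a_{11} - a_{11} a_{21} = 0$, which is why system (2) shows no $x_1$ term in its second equation.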
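Proof Let us define

$$L_i = \sum_{j=1}^{n} a_{ij} x_j$$

so that (1) may be written as the system

$$\begin{aligned}
L_1 &= b_1 \\
L_2 &= b_2
\end{aligned} \tag{1'}$$

while (2) is just

$$\begin{aligned}
L_1 &= b_1 \\
a_{21} L_1 - a_{11} L_2 &= a_{21} b_1 - a_{11} b_2
\end{aligned} \tag{2'}$$

If $(x_1, \dots, x_n) \in F^n$ is a solution of (1'), then the two equations

$$\begin{aligned}
a_{21} L_1 &= a_{21} b_1 \\
a_{11} L_2 &= a_{11} b_2
\end{aligned}$$

and hence also

$$a_{21} L_1 - a_{11} L_2 = a_{21} b_1 - a_{11} b_2$$

are all true equations. Therefore every solution of (1') also satisfies (2').
Conversely, suppose that we have a solution $(x_1, \dots, x_n)$ to the system
(2'). Then clearly

$$a_{21} L_1 = a_{21} b_1$$

is a true equation. Hence, subtracting the second of (2') from this gives us

$$a_{11} L_2 = a_{11} b_2,$$

and since $a_{11} \neq 0$ it follows that $L_2 = b_2$ is also a true equation. Thus every
solution of (2') also satisfies (1'). ∎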
$$\begin{aligned}
(1 + i/2)\,x + 8y - iz - t &= 0 \\
(2/3)\,x - (1/2)\,y + z + 7t &= 0
\end{aligned}$$
The important point to realize in Example 3.2 is that we solved a system of
linear equations by performing some combination of the following operations:
(a) Change the order in which the equations are written.
(b) Multiply each term in an equation by a nonzero scalar.
(c) Multiply one equation by a nonzero scalar and then add this new
equation to another equation in the system.
Note that (a) was not used in Example 3.2, but it would have been necessary if
the coefficient of $x_1$ in the first equation had been 0. The reason for this is that
we want the equations put into echelon form as defined below.
We now see how to use the matrix aug $A$ as a tool in solving a system of
linear equations. In particular, we define the following so-called elementary
row operations (or transformations) as applied to the augmented matrix:
(α) Interchange two rows.
(β) Multiply one row by a nonzero scalar.
(γ) Add a scalar multiple of one row to another.
It should be clear that operations (α) and (β) have no effect on the solution set
of the system and, in view of Theorem 3.1, that operation (γ) also has no
effect.
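The three elementary row operations can be sketched as small functions acting on an augmented matrix stored as a list of rows. The function names and the sample system are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here to avoid rounding.

```python
from fractions import Fraction

def swap_rows(M, i, j):
    """Type (alpha): interchange rows i and j (0-based); returns a new matrix."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale_row(M, i, k):
    """Type (beta): multiply row i by a nonzero scalar k."""
    if k == 0:
        raise ValueError("scalar must be nonzero")
    M = [row[:] for row in M]
    M[i] = [Fraction(k) * x for x in M[i]]
    return M

def add_multiple(M, src, dst, k):
    """Type (gamma): add k times row src to row dst."""
    M = [row[:] for row in M]
    M[dst] = [d + Fraction(k) * s for s, d in zip(M[src], M[dst])]
    return M

def satisfies(aug, vec):
    """True if vec solves every equation encoded by the augmented matrix."""
    return all(sum(a * x for a, x in zip(row[:-1], vec)) == row[-1] for row in aug)

# Hypothetical system: x + y = 3, x - y = 1, whose solution is (2, 1).
aug = [[Fraction(1), Fraction(1), Fraction(3)],
       [Fraction(1), Fraction(-1), Fraction(1)]]
step = add_multiple(scale_row(swap_rows(aug, 0, 1), 0, 5), 0, 1, -2)
print(satisfies(aug, (2, 1)), satisfies(step, (2, 1)))
```

Each function returns a new matrix rather than modifying its argument, so the original and transformed systems can be compared side by side.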
The next two examples show what happens both in the case where there is
no solution to a system of linear equations, and in the case of an infinite
number of solutions. In performing these operations on a matrix, we will let $R_i$
denote the $i$th row. We leave it to the reader to repeat Example 3.2 using this
notation.
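Example 3.3 Consider this system of linear equations over the field $\mathbb{R}$:

$$\begin{aligned}
x + 3y + 2z &= 7 \\
2x + y - z &= 5 \\
-x + 2y + 3z &= 4
\end{aligned}$$

The augmented matrix is

$$\begin{pmatrix}
1 & 3 & 2 & 7 \\
2 & 1 & -1 & 5 \\
-1 & 2 & 3 & 4
\end{pmatrix}$$

and the reduction proceeds as follows. We first perform the following
elementary row operations:

$$\begin{matrix} \\ R_2 - 2R_1 \to \\ R_3 + R_1 \to \end{matrix}
\begin{pmatrix}
1 & 3 & 2 & 7 \\
0 & -5 & -5 & -9 \\
0 & 5 & 5 & 11
\end{pmatrix}$$

Now, using this matrix, we obtain

$$\begin{matrix} \\ -R_2 \to \\ R_3 + R_2 \to \end{matrix}
\begin{pmatrix}
1 & 3 & 2 & 7 \\
0 & 5 & 5 & 9 \\
0 & 0 & 0 & 2
\end{pmatrix}$$

It is clear that the equation $0z = 2$ has no solution, and hence this system has
no solution. ∆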
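The reduction in Example 3.3 can be replayed step by step. In this sketch the helper names are hypothetical, and the signs of the coefficients are taken as read from the system above (an assumption, although it is consistent with the final equation $0z = 2$).

```python
from fractions import Fraction

# Augmented matrix of the system in Example 3.3.
aug = [[Fraction(v) for v in row] for row in [
    [1, 3, 2, 7],     #  x + 3y + 2z = 7
    [2, 1, -1, 5],    # 2x +  y -  z = 5
    [-1, 2, 3, 4],    # -x + 2y + 3z = 4
]]

def add_multiple(M, src, dst, k):
    """Add k times row src to row dst, in place (rows are 0-based)."""
    M[dst] = [d + k * s for s, d in zip(M[src], M[dst])]

def is_inconsistent(M):
    """True if some row reads 0 = c with c nonzero."""
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in M)

add_multiple(aug, 0, 1, -2)  # R2 - 2*R1
add_multiple(aug, 0, 2, 1)   # R3 + R1
add_multiple(aug, 1, 2, 1)   # R3 + R2

print(aug[2], is_inconsistent(aug))
```

A row whose coefficient part is zero but whose constant is nonzero is exactly the signal that the solution set is empty.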
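Example 3.4 Let us solve the following system over the field $\mathbb{R}$:

$$\begin{aligned}
x_1 - 2x_2 + 2x_3 - x_4 &= -14 \\
3x_1 + 2x_2 - x_3 + 2x_4 &= 17 \\
2x_1 + 3x_2 - x_3 - x_4 &= 18 \\
-2x_1 + 5x_2 - 3x_3 - 3x_4 &= 26
\end{aligned}$$

We have the matrix aug $A$ given by

$$\begin{pmatrix}
1 & -2 & 2 & -1 & -14 \\
3 & 2 & -1 & 2 & 17 \\
2 & 3 & -1 & -1 & 18 \\
-2 & 5 & -3 & -3 & 26
\end{pmatrix}$$

and hence we obtain the sequence

$$\begin{matrix} \\ R_2 - 3R_1 \to \\ R_3 - 2R_1 \to \\ R_4 + 2R_1 \to \end{matrix}
\begin{pmatrix}
1 & -2 & 2 & -1 & -14 \\
0 & 8 & -7 & 5 & 59 \\
0 & 7 & -5 & 1 & 46 \\
0 & 1 & 1 & -5 & -2
\end{pmatrix}$$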
A matrix is said to be in reduced row-echelon form if it has the following
properties (which are more difficult to state precisely than they are to
understand):
(1) All zero rows (if any) occur below all nonzero rows.
(2) The first nonzero entry (reading from the left) in each row is equal to 1.
(3) If the first nonzero entry in the $i$th row is in the $j_i$th column, then
every other entry in the $j_i$th column is 0.
(4) If the first nonzero entry in the $i$th row is in the $j_i$th column, then
$j_1 < j_2 < \cdots$.
We will call the first (or leading ) nonzero entries in each row of a row-
echelon matrix the distinguished elements of the matrix. Thus, a matrix is in
reduced row-echelon form if the distinguished elements are each equal to 1,
and they are the only nonzero entries in their respective columns.
Example 3.5 The matrix
is in row-echelon form but not in reduced row-echelon form. However, the
matrix
is in reduced row-echelon form. Note that the distinguished elements of the
first matrix are the numbers 1, 5 and 7, and the distinguished elements of the
second matrix are the numbers 1, 1 and 1. ∆
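Properties (1) through (4) can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and the two matrices are assumptions chosen to have distinguished elements 1, 5, 7 and 1, 1, 1 respectively, matching the description in Example 3.5.

```python
def leading_columns(M):
    """Column index of the first nonzero entry of each row (None for a zero row)."""
    return [next((j for j, x in enumerate(row) if x != 0), None) for row in M]

def distinguished_elements(M):
    """Leading nonzero entries of the nonzero rows, top to bottom."""
    return [row[j] for row, j in zip(M, leading_columns(M)) if j is not None]

def is_row_echelon(M):
    lead = leading_columns(M)
    nonzero = [j for j in lead if j is not None]
    zero_rows_last = lead[:len(nonzero)] == nonzero                 # property (1)
    increasing = all(a < b for a, b in zip(nonzero, nonzero[1:]))   # property (4)
    return zero_rows_last and increasing

def is_reduced_row_echelon(M):
    if not is_row_echelon(M):
        return False
    for i, j in enumerate(leading_columns(M)):
        if j is None:
            continue
        if M[i][j] != 1:                                            # property (2)
            return False
        if any(M[k][j] != 0 for k in range(len(M)) if k != i):      # property (3)
            return False
    return True

# Assumed illustrative matrices (not necessarily those printed in the text).
E = [[1, 2, -3, 0, 1], [0, 0, 5, 2, -4], [0, 0, 0, 7, 3], [0, 0, 0, 0, 0]]
R = [[1, 0, 5, 0, 2], [0, 1, 2, 0, 4], [0, 0, 0, 1, 7], [0, 0, 0, 0, 0]]

print(distinguished_elements(E), is_row_echelon(E), is_reduced_row_echelon(E))
print(distinguished_elements(R), is_row_echelon(R), is_reduced_row_echelon(R))
```

Here `E` is in row-echelon form but fails property (2), while `R` satisfies all four properties.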
The algorithm detailed in the proof of our next theorem introduces a
technique generally known as Gaussian elimination.
Theorem 3.3 Every $m \times n$ matrix $A$ is row equivalent to a reduced
row-echelon matrix.
Proof This is essentially obvious from Example 3.4. The detailed description
which follows is an algorithm for determining the reduced row-echelon form
of a matrix.
Suppose that we first put $A$ into the form where the leading entry in each
nonzero row is equal to 1, and where every other entry in the column
containing this first nonzero entry is equal to 0. (This is called simply the
row-reduced form of $A$.) If this can be done, then all that remains is to perform a
finite number of row interchanges to achieve the final desired reduced
row-echelon form.
To obtain the row-reduced form we proceed as follows. First consider row
1. If row 1 is equal to 0 there is nothing to do. If
row 1 is nonzero, then let $j_1$ be the smallest positive integer for which $a_{1 j_1} \neq 0$
and multiply row 1 by $(a_{1 j_1})^{-1}$. Next, for each $i \neq 1$ we add $-a_{i j_1}$ times row 1 to
row $i$. This leaves us with the leading entry $a_{1 j_1}$ of row 1 equal to 1, and every
other entry in the $j_1$th column equal to 0.
Now consider row 2 of the matrix we are left with. Again, if row 2 is equal
to 0 there is nothing to do. If row 2 is nonzero, assume that the first nonzero
entry occurs in column $j_2$ (where $j_2 \neq j_1$ by the results of the previous
paragraph). Multiply row 2 by $(a_{2 j_2})^{-1}$ so that the leading entry in row 2 is equal to
1, and then add $-a_{i j_2}$ times row 2 to row $i$ for each $i \neq 2$. Note that these
operations have no effect on either column $j_1$, or on columns $1, \dots, j_1$ of row 1.
It should now be clear that we can continue this process a finite number of
times to achieve the final row-reduced form. We leave it to the reader to take
an arbitrary matrix $(a_{ij})$ and apply successive elementary row transformations
to achieve the desired final form. ∎
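The two stages described in the proof (first the row-reduced form, then row interchanges) can be sketched as follows. This is a minimal illustration rather than an optimized routine: the function name `rref` and the sample matrix are hypothetical, and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def rref(A):
    """Reduced row-echelon form, following the two stages of the proof."""
    M = [[Fraction(x) for x in row] for row in A]
    # Stage 1: row-reduced form, processing the rows in order.
    for i in range(len(M)):
        j = next((c for c, x in enumerate(M[i]) if x != 0), None)
        if j is None:
            continue                          # zero row: nothing to do
        pivot = M[i][j]
        M[i] = [x / pivot for x in M[i]]      # leading entry becomes 1
        for k in range(len(M)):
            if k != i and M[k][j] != 0:       # clear the rest of column j
                factor = M[k][j]
                M[k] = [a - factor * b for a, b in zip(M[k], M[i])]
    # Stage 2: row interchanges, ordering rows by leading column
    # (zero rows receive the largest key and therefore end up last).
    def lead(row):
        return next((c for c, x in enumerate(row) if x != 0), len(row))
    return sorted(M, key=lead)

A = [[0, 2, 4], [1, 1, 1], [2, 2, 2]]   # hypothetical example
print(rref(A))
```

Sorting the rows by the position of their leading entry is just a compact way of carrying out the finite number of row interchanges mentioned in the proof.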
While we have shown that every matrix is row equivalent to at least one
reduced row-echelon matrix, it is not obvious that this equivalence is unique.
However, we shall show in the next section that this reduced row-echelon
matrix is in fact unique. Because of this, the reduced row-echelon form of a
matrix is often called the row canonical form.
Exercises
matrices.
then to row canonical form:
for some $x \in (a, b)$. (The determinant of $W(x)$ is called the Wronskian of
the set of functions $\{f_i\}$.)
Show that each of the following sets of functions is linearly independent:
(c) $f_1(x) = -x^2 + x + 1$, $f_2(x) = x^2 + 2x$, $f_3(x) = x^2 - 1$.
(d) $f_1(x) = \exp(-x)$, $f_2(x) = x$, $f_3(x) = \exp(2x)$.
(e) $f_1(x) = \exp(x)$, $f_2(x) = \sin x$, $f_3(x) = \cos x$.
Determine the values of $Y = (y_1, y_2, y_3)$ for which the system $\sum_j a_{ij} x_j = y_i$
has a solution.
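We will find it extremely useful to consider the rows and columns of an
arbitrary $m \times n$ matrix as vectors in their own right. In particular, the rows of $A$
are to be viewed as vector $n$-tuples $A_1, \dots, A_m$ where each $A_i = (a_{i1}, \dots, a_{in}) \in F^n$.
Similarly, the columns of $A$ are to be viewed as vector $m$-tuples $A^1, \dots, A^n$
where each $A^j = (a_{1j}, \dots, a_{mj}) \in F^m$. For notational clarity we
should write $A^j$ as the column vector

$$A^j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}$$

but it is typographically easier to write this horizontally whenever possible.
Note that we label the row vectors of $A$ by subscripts, and the columns of $A$
by superscripts.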
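Since each row $A_i$ is an element of $F^n$, the rows of $A$ generate a vector
space $V$ over $F$. In other words, $V$ is the
space spanned by the rows $A_i$, and hence any $v \in V$ may be written as

$$v = \sum_{i=1}^{m} c_i A_i$$

where each $c_i \in F$. The space $V$ (which is a subspace of $F^n$) is
called the row space of $A$. The dimension of $V$ is called the row rank of $A$,
and will be denoted by rr($A$). Since $V$ is a subspace of $F^n$ and dim $F^n = n$, it
follows that rr($A$) = dim $V \leq n$. On the other hand, $V$ is spanned by the $m$
vectors $A_i$, so that we must have dim $V \leq m$. It then follows that rr($A$) $\leq$
min{$m$, $n$}.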
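In an exactly analogous manner, we define the column space $W$ of a
matrix $A$ as the subspace of $F^m$ spanned by the $n$ column vectors $A^j$. Thus
any $w \in W$ is given by

$$w = \sum_{j=1}^{n} b_j A^j.$$

The column rank of $A$, denoted by cr($A$), is given by cr($A$) = dim $W$ and, as
above, we must have cr($A$) $\leq$ min{$m$, $n$}.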
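To make the two definitions concrete, the following sketch computes rr($A$) and cr($A$) as the dimension of the span of the rows and of the columns, respectively, using elimination over the rationals. The function names and the sample matrix are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the span of the given vectors, via Gaussian elimination."""
    M = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(M[0]) if M else 0):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot available in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def row_rank(A):
    return rank(A)

def column_rank(A):
    # zip(*A) yields the columns A^1, ..., A^n as tuples
    return rank([list(col) for col in zip(*A)])

A = [[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 1, 0]]   # hypothetical 3 x 4 matrix
m, n = len(A), len(A[0])
print(row_rank(A), column_rank(A), min(m, n))
```

Both quantities are bounded by min{$m$, $n$}, as argued above; for this particular matrix they also happen to coincide.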
An obvious question is whether a sequence of elementary row operations
changes either the row space or the column space of a matrix. A moment's
thought should convince you that the row space should not change, but it may
not be clear exactly what happens to the column space. These questions are
answered in our next theorem. While the following proof appears to be rather
long, it is actually quite simple to understand.
Theorem 3.4 Let $A$ and $\tilde{A}$ be row equivalent $m \times n$ matrices. Then the row
space of $A$ is equal to the row space of $\tilde{A}$, and hence rr($A$) = rr($\tilde{A}$).
Furthermore, we also have cr($A$) = cr($\tilde{A}$). (However, note that the column
space of $A$ is not necessarily the same as the column space of $\tilde{A}$.)
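Proof Let $V$ be the row space of $A$ and $\tilde{V}$ the row space of $\tilde{A}$. Since $A$ and $\tilde{A}$
are row equivalent, $A$ may be obtained from $\tilde{A}$ by applying successive
elementary row operations. But then each row of $A$ is a linear combination of
rows of $\tilde{A}$, and hence $V \subseteq \tilde{V}$. On the other hand, $\tilde{A}$ may be obtained from $A$
in a similar manner so that $\tilde{V} \subseteq V$. Therefore $V = \tilde{V}$ and hence rr($A$) = rr($\tilde{A}$).
Now let $W$ be the column space of $A$ and $\tilde{W}$ the column space of $\tilde{A}$.
Under elementary row operations, it will not be true in general that $W = \tilde{W}$,
but we will show it is still always true that dim $W$ = dim $\tilde{W}$. Let us define the
mapping $f: W \to \tilde{W}$ by

$$f\Big(\sum_{i=1}^{n} c_i A^i\Big) = \sum_{i=1}^{n} c_i \tilde{A}^i.$$

Since a vector of $W$ may be expressed as a combination of the columns in more than
one way, $f$ is well-defined only if $\sum c_i A^i = 0$ implies $\sum c_i \tilde{A}^i = 0$.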
We first consider a transformation of type α. For definiteness, we
interchange rows 1 and 2, although it will be obvious that any pair of rows will
work. In other words, we define $\tilde{A}_1 = A_2$, $\tilde{A}_2 = A_1$ and $\tilde{A}_j = A_j$ for $j = 3, \dots, m$.
Therefore

$$f\Big(\sum c_i A^i\Big) = \sum c_i \tilde{A}^i = \Big(\sum c_i a_{2i},\ \sum c_i a_{1i},\ \sum c_i a_{3i},\ \dots,\ \sum c_i a_{mi}\Big).$$

If

$$\sum c_i A^i = 0$$

then

$$\sum_i c_i a_{ji} = 0$$

for every $j = 1, \dots, m$ and hence we see that $f(\sum c_i A^i) = 0$. This shows that $f$ is
well-defined for type α transformations. Conversely, if

$$f\Big(\sum c_i A^i\Big) = 0$$

then we see that again

$$\sum_i c_i a_{ji} = 0$$

for every $j = 1, \dots, m$ since each component in the expression $\sum c_i \tilde{A}^i = 0$
must equal 0. Hence $\sum c_i A^i = 0$ if and only if $f(\sum c_i A^i) = 0$, and hence Ker $f$ =
{0} for type α transformations (which also shows that $f$ is well-defined).
We leave it to the reader (see Exercise 3.3.1) to show that $f$ is well-defined
and Ker $f$ = {0} for transformations of type β, and we go on to those of type γ.
Again for definiteness, we consider the particular transformation $\tilde{A}_1 = A_1 + kA_2$
and $\tilde{A}_j = A_j$ for $j = 2, \dots, m$. Then

$$\begin{aligned}
f\Big(\sum_i c_i A^i\Big) = \sum_i c_i \tilde{A}^i
&= \sum_i c_i \big(a_{1i} + k a_{2i},\ a_{2i},\ \dots,\ a_{mi}\big) \\
&= \Big(\sum_i c_i a_{1i} + k \sum_i c_i a_{2i},\ \sum_i c_i a_{2i},\ \dots,\ \sum_i c_i a_{mi}\Big).
\end{aligned}$$

If

$$\sum c_i A^i = 0$$

then

$$\sum_i c_i a_{ji} = 0$$

for every $j = 1, \dots, m$ so that $\sum c_i \tilde{A}^i = 0$ and $f$ is well-defined for type γ
transformations. Conversely, if

$$\sum c_i \tilde{A}^i = 0$$

then

$$\sum_i c_i a_{ji} = 0$$

for $j = 2, \dots, m$, and the first component then shows that $\sum_i c_i a_{1i} = 0$ also. Thus $\sum c_i \tilde{A}^i = 0$
implies that $\sum c_i A^i = 0$, and hence $\sum c_i A^i = 0$ if and only if $f(\sum c_i A^i) = 0$. This
shows that Ker $f$ = {0} for type γ transformations also, and $f$ is well-defined.
In summary, by constructing an explicit isomorphism in each case, we
have shown that the column spaces $W$ and $\tilde{W}$ are isomorphic under all three
types of elementary row operations, and hence it follows that the column
spaces of row equivalent matrices must have the same dimension. ∎
Corollary If $\tilde{A}$ is the row-echelon form of $A$, then $\sum c_i A^i = 0$ if and only if
$\sum c_i \tilde{A}^i = 0$.
Proof This was shown explicitly in the proof of Theorem 3.4 for each type of
elementary row operation. ∎
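The content of the corollary is that row operations preserve every linear relation among the columns. The sketch below applies one operation of each type to a sample matrix and compares which coefficient vectors $c$ give $\sum c_i A^i = 0$ before and after; the matrix, the helper names and the chosen operations are all hypothetical.

```python
from fractions import Fraction

def column_combination(M, c):
    """Return sum_i c_i * (i-th column of M) as a list of entries."""
    return [sum(ci * x for ci, x in zip(c, row)) for row in M]

def is_zero(v):
    return all(x == 0 for x in v)

# Hypothetical matrix whose third column is the sum of the first two.
A = [[Fraction(x) for x in row] for row in [[1, 2, 3], [4, 5, 9], [7, 8, 15]]]

# Apply one operation of each type to obtain a row-equivalent matrix A_tilde.
A_tilde = [row[:] for row in A]
A_tilde[0], A_tilde[1] = A_tilde[1], A_tilde[0]                    # type alpha
A_tilde[2] = [Fraction(1, 3) * x for x in A_tilde[2]]              # type beta
A_tilde[0] = [a + 2 * b for a, b in zip(A_tilde[0], A_tilde[1])]   # type gamma

for c in [(1, 1, -1), (2, 2, -2), (1, 0, 0), (1, -1, 1)]:
    before = is_zero(column_combination(A, c))
    after = is_zero(column_combination(A_tilde, c))
    print(c, before, after)   # the two flags agree for every c
```

The individual columns change under the operations, so the column space itself may change, but the set of vanishing combinations does not.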
In Theorem 3.3 we showed that every matrix is row equivalent to a
reduced row-echelon matrix, and hence (by Theorem 3.4) any matrix and its
row canonical form have the same row space. Note, though, that if the original
matrix has more rows than the dimension of its row space, then the rows
cannot all be linearly independent. However, we now show that the
nonzero rows of the row canonical form are in fact linearly independent, and
hence form a basis for the row space.
Theorem 3.5 The nonzero row vectors of an $m \times n$ reduced row-echelon
matrix $R$ form a basis for the row space of $R$.
Proof From the four properties of a reduced row-echelon matrix, we see that
if $R$ has $r$ nonzero rows, then there exist integers $j_1, \dots, j_r$ with each $j_i \leq n$
and $j_1 < \cdots < j_r$ such that $R$ has a 1 in the $i$th row and $j_i$th column, and every
other entry in the $j_i$th column is 0 (it may help to refer to Example 3.5 for
visualization). If we denote these nonzero row vectors by $R_1, \dots, R_r$ then any
arbitrary vector

$$v = \sum_{i=1}^{r} c_i R_i$$

has $c_i$ as its $j_i$th coordinate (note that $v$ may have more than $r$ coordinates if $r < n$).
Therefore, if $v = 0$ we must have each coordinate of $v$ equal to 0, and
hence $c_i = 0$ for each $i = 1, \dots, r$. But this means that the $R_i$ are linearly
independent, and since $\{R_i\}$ spans the row space by definition, we see that
they must in fact form a basis. ∎
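The key observation of the proof, that $v = \sum c_i R_i$ carries $c_i$ in its $j_i$th coordinate, can be seen directly in a small computation. The matrix below is a hypothetical reduced row-echelon matrix and the helper name `combine` is an assumption.

```python
# Hypothetical reduced row-echelon matrix with leading columns j = 0, 1, 3 (0-based).
R = [[1, 0, 5, 0, 2],
     [0, 1, 2, 0, 4],
     [0, 0, 0, 1, 7],
     [0, 0, 0, 0, 0]]

nonzero_rows = [row for row in R if any(row)]
pivots = [next(j for j, x in enumerate(row) if x != 0) for row in nonzero_rows]

def combine(c):
    """Return the vector v = sum_i c_i * R_i over the nonzero rows R_i."""
    return [sum(ci * row[j] for ci, row in zip(c, nonzero_rows))
            for j in range(len(R[0]))]

c = (3, -2, 7)
v = combine(c)
print(v)
print([v[j] for j in pivots])   # the coefficients can be read back off v
```

Since the coefficients can be recovered from the coordinates of $v$ at the leading columns, $v = 0$ forces every $c_i = 0$, which is exactly linear independence.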
Theorem 3.4). But $b_{i j_1} = 0$ for every $i$, and hence $a_{1 j_1} = 0$ which contradicts the
assumption that $a_{1 j_1}$ is a distinguished element of $A$ (and must be nonzero by
definition). We are thus forced to conclude that $j_1 \geq k_1$. However, we could
clearly have started with the assumption that $k_1 < j_1$, in which case we would
have been led to conclude that $k_1 \geq j_1$. This shows that we must actually have
$j_1 = k_1$.
Now let $A'$ and $B'$ be the matrices which result from deleting the first row
of $A$ and $B$ respectively. If we can show that $A'$ and $B'$ have the same row
space, then they will also satisfy the hypotheses of the theorem, and our
conclusion follows at once by induction.
Let $R = (a_1, a_2, \dots, a_n)$ be any row of $A'$ (and hence a row of $A$), and let
$B_1, \dots, B_m$ be the rows of $B$. Since $A$ and $B$ have the same row space, we
again have

$$R = \sum_{i=1}^{m} d_i B_i$$

for some set of scalars $d_i$. Since $R$ is not the first row of $A$ and $A'$ is in
row-echelon form, it follows that $a_i = 0$ for $i = j_1 = k_1$. In addition, the fact that $B$ is
in row-echelon form means that every entry in the $k_1$th column of $B$ must be 0
except for the first, i.e., $b_{1 k_1} \neq 0$, $b_{2 k_1} = \cdots = b_{m k_1} = 0$. But then

$$0 = a_{k_1} = d_1 b_{1 k_1} + d_2 b_{2 k_1} + \cdots + d_m b_{m k_1} = d_1 b_{1 k_1}$$

which implies that $d_1 = 0$ since $b_{1 k_1} \neq 0$. This shows that $R$ is actually a linear
combination of the rows of $B'$, and hence (since $R$ was arbitrary) the row
space of $A'$ must be a subspace of the row space of $B'$. This argument can
clearly be repeated to show that the row space of $B'$ is a subspace of the row
space of $A'$, and hence we have shown that $A'$ and $B'$ have the same row
space. ∎
Theorem 3.7 Let $A = (a_{ij})$ and $B = (b_{ij})$ be reduced row-echelon matrices.
Then $A$ and $B$ have the same row space if and only if they have the same
nonzero rows.
Proof Since it is obvious that $A$ and $B$ have the same row space if they have
the same nonzero rows, we need only prove the converse. So, suppose that $A$
and $B$ have the same row space. Then if $A_i$ is an arbitrary nonzero row of $A$,
we may write

$$A_i = \sum_r c_r B_r \tag{1}$$

where the $B_r$ are the nonzero rows of $B$. The proof will be finished if we can
show that $c_r = 0$ for $r \neq i$ and $c_i = 1$.
To show that $c_i = 1$, let $a_{i j_i}$ be the first nonzero entry in $A_i$, i.e., $a_{i j_i}$ is the
distinguished element of the $i$th row of $A$. Looking at the $j_i$th component of
(1) we see that

$$a_{i j_i} = \sum_r c_r b_{r j_i} \tag{2}$$

(see the proof of Theorem 3.4). From Theorem 3.6 we know that $b_{i j_i}$ is the
distinguished element of the $i$th row of $B$, and hence it is the only nonzero
entry in the $j_i$th column of $B$ (by definition of a reduced row-echelon matrix).
This means that (2) implies $a_{i j_i} = c_i b_{i j_i}$. In fact, it must be true that $a_{i j_i} = b_{i j_i} = 1$
since $A$ and $B$ are reduced row-echelon matrices, and therefore $c_i = 1$.
Now let $b_{k j_k}$ be the first nonzero entry of $B_k$ (where $k \neq i$). From (1) again
we have

$$a_{i j_k} = \sum_r c_r b_{r j_k} \tag{3}$$

Since $B$ is a reduced row-echelon matrix, $b_{k j_k} = 1$ is the only nonzero entry in
the $j_k$th column of $B$, and hence (3) shows us that $a_{i j_k} = c_k b_{k j_k}$. But from
Theorem 3.6, $a_{k j_k}$ is a distinguished element of $A$, and hence the fact that $A$ is
row-reduced means that $a_{i j_k} = 0$ for $i \neq k$. This forces us to conclude that $c_k = 0$
for $k \neq i$ as claimed. ∎
Suppose that two people are given the same matrix $A$ and asked to
transform it to reduced row-echelon form $R$. The chances are quite good that they
will each perform a different sequence of elementary row operations to
achieve the desired result. Let $R$ and $R'$ be the reduced row-echelon matrices
that our two students obtain. We claim that $R = R'$. Indeed, since row
equivalence defines an equivalence relation, we see from Theorem 3.4 that the row
spaces of $R$ and $R'$ will be the same. Therefore Theorem 3.7 shows us that the
rows of $R$ must equal the rows of $R'$. Hence we are justified in calling the
reduced row-echelon form of a matrix the row canonical form as mentioned
earlier.
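The uniqueness claim can be illustrated numerically: starting from a matrix and from a scrambled but row-equivalent version of it, a reduction routine should arrive at the same matrix. The sketch below assumes a simple `rref` helper written for this illustration; the matrices are hypothetical.

```python
from fractions import Fraction

def rref(A):
    """Reduced row-echelon form over the rationals (illustrative helper)."""
    M = [[Fraction(x) for x in row] for row in A]
    for i in range(len(M)):
        j = next((c for c, x in enumerate(M[i]) if x != 0), None)
        if j is None:
            continue                          # zero row
        pivot = M[i][j]
        M[i] = [x / pivot for x in M[i]]      # make the leading entry 1
        for k in range(len(M)):
            if k != i and M[k][j] != 0:       # clear the rest of the column
                factor = M[k][j]
                M[k] = [a - factor * b for a, b in zip(M[k], M[i])]
    def lead(row):
        return next((c for c, x in enumerate(row) if x != 0), len(row))
    return sorted(M, key=lead)                # row interchanges

A = [[2, 1, -2, -3],
     [1, -3, 1, 8],
     [4, -1, -2, 3]]

# B is row equivalent to A: rows reordered, one scaled, one modified by a multiple of another.
B = [A[1],
     [3 * x for x in A[0]],
     [a + 2 * b for a, b in zip(A[2], A[0])]]

print(rref(A) == rref(B))
```

The two inputs force the routine through different sequences of operations, yet both reductions end at the same row canonical form.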
Exercises
space: