Linear Algebra, Lecture Notes - Mathematics - Prof Peter M Neumann, Study notes of Mathematics

Permutations, Determinants of square matrix, eigenvalues and eigenvectors

Typology: Study notes

2010/2011

Uploaded on 09/09/2011


Version of 6.ii.2011; revised 18 March 2011

Oxford University Mathematical Institute

Notes on Linear Algebra II for Mathematical Moderations

by Peter M. Neumann (Queen’s College, Oxford)

Preface

These notes are intended as a rough guide to the eight-lecture course Linear Algebra II which is a part of the Oxford 1st year undergraduate course in mathematics. Please do not expect a polished account. They are lecture notes, not a carefully checked textbook. Nevertheless, I hope they may be of some help.

The course is designed to build on Linear Algebra I, the Michaelmas Term course. The synopsis for that course includes the following—though not exactly in this order:

  • vector spaces over the real or complex numbers, subspaces, linear independence and linear dependence, the span of a (finite) set of vectors, spanning sets;
  • algebra of matrices, the space of m × n matrices;
  • matrix representation of a system of linear equations;
  • elementary row operations on matrices, echelon form and row-reduction; invariance of the row space under row operations, row rank;
  • definition of a basis, application of elementary row operations to finding a basis of a vector space; reduction of a spanning set and extension of a linearly independent set to a basis; the theorem that all bases have the same size, the dimension of a vector space, co-ordinates with respect to a basis;
  • sums and intersections of subspaces, a formula for the dimension of the sum;
  • linear maps (or transformations), the image and kernel of a linear map, the Rank-Nullity Theorem;
  • the matrix representation of a linear transformation with respect to fixed bases, change of basis and co-ordinate systems;
  • composition of linear maps and product of matrices;
  • significance of image, kernel, rank and nullity for systems of linear equations, solution by Gaussian Elimination, bases of the solution space of homogeneous equations, applications to finding bases of vector spaces;
  • invertible matrices, use of row operations to decide invertibility and to calculate inverses;
  • column space and column rank, equality of row rank and column rank.

Although it is appreciated that the reader may not yet have become familiar and comfortable with these concepts and ideas, we must take them as understood and move on. The synopsis for Linear Algebra II is as follows.


  • Permutations of a finite set, two-line notation; composition of permutations. Cycles and cycle notation. Transpositions; every permutation may be expressed as a product of transpositions; the parity of a permutation and its relation to cycle structure.
  • Determinants of square matrices. Simple properties of the determinant function; determinants, area and volume.
  • Determinants and row-operations on square matrices; computation of determinant by reduction of matrices to row echelon form.
  • Multiplicativity of the determinant function. Determinants and invertibility of square matrices.
  • Review of the matrix of a linear transformation of a vector space to itself with respect to a given basis. The determinant of a linear transformation of a finite-dimensional vector space to itself.
  • Eigenvalues and eigenvectors of square matrices and of linear maps from a vector space to itself. The characteristic polynomial of a square matrix or a linear map of a finite-dimensional vector space to itself.
  • The linear independence of eigenvectors associated with distinct eigenvalues, diagonalisability of matrices.
  • Application to the solution of systems of first order linear differential equations.

The synopsis provides a good guide to the content of the lectures, but it should never be forgotten that it is the syllabus which defines the course. It is the syllabus which should guide student learning, it is the syllabus which specifies to the examiners what they should test, it is the syllabus which suggests to tutors what they should teach, it is the syllabus which provides the basis of the lecture synopses.

A set of four exercise sheets goes with this lecture course. The questions they contain will be found embedded in these notes along with a number of supplementary exercises.

I am most grateful to Professor Victor Flynn who pointed out an error in the proof of Theorem 2.6, now (18 March 2011) corrected. Further feedback will be warmly welcomed: please let me know of any errors, infelicities and obscurities. Email me at [email protected] or write a note to me at Queen’s College.

Π MN: Queen’s: Version of 6.ii.2011; revised 18 March 2011


1 Permutations

We begin with permutations for two reasons: first, permutations enter into an important description of determinants of square matrices; secondly, they are needed in group theory which is to be introduced in the second half of Hilary Term.

The word permutation has two meanings nowadays. In the first, a permutation of a finite set S is an arrangement, that is an ordered list of the members of S. Thus in this sense the permutations of the set {1, 2, 3} are 1 2 3, 1 3 2, 2 1 3, 2 3 1, 3 1 2, and 3 2 1, six of them. This is the meaning of the word in the phrase ‘permutations and combinations’ which occurs in school mathematics.

The second meaning, the most common one in modern mathematics and the one that is intended in the Oxford syllabus, is this. Let S be a set.

Definition 1.1. A permutation of S is a bijective (one-to-one and onto) map S → S. The set of all permutations of S will be denoted Sym(S) or Sym S.

Throughout these lectures the set S will be finite, usually with n members. If S = {1, 2, …, n}, or if it does not matter what set of size n is under consideration, we will usually write Sym(n) instead of Sym(S) or Sym S. Many authors use $S_n$ or $\Sigma_n$.

Observation 1.2. The set Sym(n) has size n!.

The reason is that for an element of Sym(n) there are n possibilities for the image of the first element of S, then n − 1 possibilities for the image of the second element, and so on. Since these choices are independent there are n × (n − 1) × ⋯ × 2 × 1, that is n!, possibilities in all.
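As an informal check (not part of the original notes), the count in Observation 1.2 can be confirmed by brute force for small n. The sketch below assumes Python; the helper name `sym` is made up for illustration.

```python
from itertools import permutations
from math import factorial

def sym(n):
    """All bijections of {1, ..., n}, each stored as a dict x -> image of x."""
    points = range(1, n + 1)
    return [dict(zip(points, images)) for images in permutations(points)]

# Compare the number of bijections with n! for a few small values of n.
for n in range(1, 6):
    assert len(sym(n)) == factorial(n)
```

This is of course only evidence for small n; the counting argument above is what proves the statement in general.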

We may think of a permutation as a rearrangement of S. Thus starting with any arrangement we get another arrangement. This leads to Cauchy’s two-line notation for permutations. The top line is a list of the elements of the set S, below is the list of their images under the bijection∗. Thus, for example,

$$\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$

(or the same array with its columns written in any other order) is acceptable notation for the bijection 1 ↦ 3, 2 ↦ 1, 3 ↦ 2 of the set {1, 2, 3}.

One of the facts to be found in the Michaelmas Term course Introduction to Pure Mathematics is that a map is bijective if and only if it has a (two-sided) inverse. Another is that if ρ, σ are bijective maps then also the composite map ρ σ is bijective. This means that Sym(S) is what is called a group. Groups are to be introduced and studied in your next algebra course. I anticipate the terminology here only because, being very familiar with it, I am likely to slip up from time to time and unintentionally use the term ‘symmetric group’ as a name for the set Sym(S).

In algebraic work it is common to write permutations to the right of their arguments, and I shall do so in these lectures. Thus the image of x ∈ S under the map ρ : S → S will be written xρ. This has the disadvantage that it is different from what is common in Analysis, Geometry and Applied Mathematics, but it has the advantage that the

∗A.-L. Cauchy introduced this notation in 1815. Curiously, when he returned to the subject of permutations—or substitutions as they were then called—in a series of papers in 1845, he turned the notation the other way up.
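To make the right-hand convention concrete, here is a small sketch, assuming Python and storing a permutation as a dict sending x to xρ. The names `compose` and `inverse` are illustrative only. With maps written on the right, the composite ρσ means "apply ρ first, then σ".

```python
def compose(rho, sigma):
    """Return rho sigma in the right-action convention: x -> (x rho) sigma."""
    return {x: sigma[rho[x]] for x in rho}

def inverse(rho):
    """Return the two-sided inverse of the bijection rho."""
    return {image: x for x, image in rho.items()}

rho = {1: 3, 2: 1, 3: 2}      # the bijection 1 -> 3, 2 -> 1, 3 -> 2
sigma = {1: 2, 2: 1, 3: 3}    # swaps 1 and 2, fixes 3
identity = {1: 1, 2: 2, 3: 3}

# Composition of permutations is not commutative in general.
assert compose(rho, sigma) != compose(sigma, rho)
# A bijection composed with its inverse (on either side) is the identity.
assert compose(rho, inverse(rho)) == identity
assert compose(inverse(rho), rho) == identity
```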

The number m is known as the length of the cycle. Note that cycles of length 1 are the same as fixed points of ρ, that is members x of S such that xρ = x.

Theorem 1.3. Let S be a finite set and let ρ ∈ SymS. Every member of S lies in one and only one cycle of ρ.

A proof will be given in the algebra course An Introduction to Groups, Rings and Fields later this term. For the moment we will take it for granted—though you may well like to discover and write out a proof for yourself.

Theorem 1.3 leads to cycle notation, a very economical notation for permutations of the finite set S. Cycle notation for the permutation ρ is simply a list of the distinct cycles of ρ. Conventionally, however, cycles of length 1, that is fixed points of ρ, are not included in the list. Thus instead of (1)(2 3) we simply write (2 3) for the permutation in Sym(3) which in Cauchy’s two-line notation is $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$. This convention would lead to the identity permutation ‘make no change’ being represented by an empty list, invisible on the page, and so it is in fact denoted ι or 1. It is worth noting that in cycle notation the inverse is easy to write down. Just reverse the order of the members of each cycle:

$$\bigl((a_0\, a_1 \ldots a_{k-2}\, a_{k-1})(b_0\, b_1 \ldots b_{l-2}\, b_{l-1}) \cdots (d_0\, d_1 \ldots d_{p-2}\, d_{p-1})\bigr)^{-1} = (a_{k-1}\, a_{k-2} \ldots a_1\, a_0)(b_{l-1}\, b_{l-2} \ldots b_1\, b_0) \cdots (d_{p-1}\, d_{p-2} \ldots d_1\, d_0).$$
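The passage from a permutation to its cycle notation, and the rule for inverses, can be sketched as follows. This is an illustration in Python under the assumption that a permutation of {1, …, n} is stored as a dict x ↦ xρ; the function names `cycles` and `from_cycles` are hypothetical.

```python
def cycles(rho):
    """Disjoint cycles of rho, omitting fixed points as in cycle notation."""
    seen, result = set(), []
    for start in sorted(rho):
        cycle, x = [], start
        while x not in seen:          # follow start, start*rho, start*rho^2, ...
            seen.add(x)
            cycle.append(x)
            x = rho[x]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

def from_cycles(cycle_list, n):
    """The permutation of {1, ..., n} whose non-trivial cycles are given."""
    rho = {x: x for x in range(1, n + 1)}
    for cycle in cycle_list:
        for i, x in enumerate(cycle):
            rho[x] = cycle[(i + 1) % len(cycle)]
    return rho

rho = from_cycles([(1, 4, 2), (3, 5)], 5)
# Reverse the members of each cycle to obtain a candidate for the inverse.
candidate = from_cycles([tuple(reversed(c)) for c in cycles(rho)], 5)
assert all(candidate[rho[x]] == x for x in rho)
```

The final assertion checks, for this one example, that reversing each cycle really does undo ρ.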

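Example. For Sym(3) the following table compares two-line notation with cycle notation:

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = \iota, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = (1\ 2), \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = (1\ 3),$$

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} = (2\ 3), \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = (1\ 2\ 3), \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = (1\ 3\ 2).$$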

Exercise 1.2. In Sym(4) let α := (1 2 3 4), β := (2 3 4), and γ := (1 3)(2 4). (a) Write down two-line forms for α, β, and γ. (b) Calculate $\alpha\beta$, $\gamma\beta^{-1}$, and $\alpha\beta^{-1}\gamma$, expressing them in cycle notation.

The cycle-type of a permutation ρ ∈ Sym(n) is a list of the number of m-cycles that occur in ρ for each m. Thus, for example, there are three cycle-types in Sym(3): three 1-cycles (the identity ι), one 1-cycle and one 2-cycle, one 3-cycle.

Exercise 1.3. What are the possible cycle-types in Sym(4)? And how many permutations are there in Sym(4) of each possible cycle-type?

Often it is useful to have notation for the set of members of S fixed by the permutation ρ and for the set, known as the support of ρ, of those moved by ρ: thus

Fix ρ := {x ∈ S | xρ = x} and Supp ρ := {x ∈ S | xρ ≠ x}.

Exercise 1.4. Let ρ ∈ Sym(n), let p be a prime number, and let r be the remainder when n is divided by p (so that 0 ≤ r < p and n = qp + r for some integer q). (i) Show that $\rho^p = \iota$ if and only if the cycles of ρ all have lengths 1 or p. (ii) Show that if $\rho^p = \iota$ then |Supp ρ| is a multiple of p and |Fix ρ| ≡ r (mod p).

A permutation in Sym(n) that has n − k fixed points and one k -cycle is often itself referred to as a k -cycle. This is perhaps a mild abuse of language, but it is comfortable and convenient, and the context will always clarify the sense in which the term k -cycle is being used. A particularly important case is that in which k = 2:

Definition 1.4. A transposition in SymS is a permutation that has n − 2 fixed points (1-cycles) and one 2-cycle.

Exercise 1.5. Let $\tau_1, \tau_2$ be transpositions in Sym(n). Show that $\tau_1\tau_2 = \iota$ or $(\tau_1\tau_2)^2 = \iota$ or $(\tau_1\tau_2)^3 = \iota$. What condition on $\tau_1$ and $\tau_2$ determines which of these three possibilities occurs? And what is the cycle-structure of $\tau_1\tau_2$ in each case?

Theorem 1.5. Every permutation in Sym(n) may be expressed as a product (that is a composite) of some transpositions.

Proof. Note first that if $\tau_i := (x_0\ x_i)$ then (as may easily be checked by direct calculation)

$$\tau_1 \tau_2 \cdots \tau_{m-1} = (x_0\ x_1)(x_0\ x_2) \cdots (x_0\ x_{m-1}) = (x_0\ x_1 \ldots x_{m-1}),$$

and so any m-cycle may be expressed as a product of m − 1 transpositions. Now an arbitrary permutation ρ ∈ Sym(n) is a product of cycles, and since each of these is a product of transpositions ρ itself may be expressed as a product of transpositions, as the theorem states.
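The "direct calculation" in this proof can be imitated mechanically. The following is a minimal sketch, assuming Python and the right-hand convention of these notes (a product is applied from left to right); the function names are invented for the illustration.

```python
def cycle_to_transpositions(cycle):
    """Write (x0 x1 ... x_{m-1}) as the list (x0 x1), (x0 x2), ..., (x0 x_{m-1})."""
    x0 = cycle[0]
    return [(x0, xi) for xi in cycle[1:]]

def apply_product(transpositions, x):
    """Image of x under a product of transpositions, applied left to right."""
    for a, b in transpositions:
        if x == a:
            x = b
        elif x == b:
            x = a
    return x

cycle = (2, 5, 3, 1)                       # an arbitrary 4-cycle
product = cycle_to_transpositions(cycle)   # m - 1 = 3 transpositions
for i, x in enumerate(cycle):
    # The product must send each member of the cycle to its successor.
    assert apply_product(product, x) == cycle[(i + 1) % len(cycle)]
```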

Exercise 1.6. Give an alternative proof of Theorem 1.5 by using induction on |Supp(ρ)| to show that any ρ ∈ Sym(n) may be expressed as a product of n − 1 or fewer transpositions.

The expression of a permutation ρ ∈ Sym(n) as a product of transpositions is far from unique. For example, in Sym(3) the identity ι may be expressed as the empty product (the product of 0 transpositions), but also

ι = (1 2)(1 2) = (1 3)(1 3) = (2 3)(2 3) = (1 2)(2 3)(1 2)(2 3)(1 2)(2 3),

and of course there are many more such expressions. There is, however, an apparently feeble but actually very powerful fact about different such expressions. Namely, no permutation can be expressed in one way as a product of an even number of transpositions and also in another way as a product of an odd number of transpositions. Thus if ρ may be expressed as a product of an even number of transpositions then every expression of ρ as a product of transpositions contains an even number of them; likewise, if ρ may be expressed as a product of an odd number of transpositions then every expression of ρ as a product of transpositions contains an odd number of them. That is the point of the following theorem.

Theorem 1.6. For ρ ∈ Sym(n), if $\rho = \tau_1 \tau_2 \cdots \tau_r = \tau'_1 \tau'_2 \cdots \tau'_s$, where $\tau_1, \tau_2, \ldots, \tau_r$ and $\tau'_1, \tau'_2, \ldots, \tau'_s$ are transpositions, then r ≡ s (mod 2).

Proof. We use the ideas in the original proof given by A.-L. Cauchy in 1815. For ρ ∈ Sym(n) define k(ρ) to be the total number of cycles, including 1-cycles (fixed

Proof. Let E := {ρ ∈ Sym(n) | ρ is even} and O := {ρ ∈ Sym(n) | ρ is odd}. Let τ be any transposition and define the mapping f : Sym(n) → Sym(n) by f : ρ ↦ ρτ. Then $f^2$ is the identity mapping, so f is invertible (its inverse is f itself) and therefore f is bijective. Also E f ⊆ O and O f ⊆ E. It follows immediately that the restriction of f to E is a bijection E → O and therefore |E| = |O|. Then, since Sym(n), which has size n!, is the disjoint union of E and O, it follows that $|E| = |O| = \tfrac{1}{2}\, n!$.

For (2) note first that if $\rho = \tau_1 \tau_2 \cdots \tau_r$ where the $\tau_i$ are transpositions then, as may easily be checked, $\rho^{-1} = \tau_r \cdots \tau_2 \tau_1$. Therefore $\operatorname{sign}(\rho^{-1}) = \operatorname{sign}(\rho)$. Also, if $\rho = \tau_1 \tau_2 \cdots \tau_r$ where the $\tau_i$ are transpositions and $\sigma = \tau'_1 \tau'_2 \cdots \tau'_s$ where the $\tau'_i$ are transpositions, then $\rho\sigma = \tau_1 \cdots \tau_r \tau'_1 \cdots \tau'_s$, and so

$$\operatorname{sign}(\rho\sigma) = (-1)^{r+s} = (-1)^r (-1)^s = \operatorname{sign}(\rho)\operatorname{sign}(\sigma),$$

as required.
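Both facts just proved can be checked by brute force in a small case. The sketch below assumes Python and stores a permutation of {1, …, n} as the tuple of images (1ρ, 2ρ, …, nρ). The helper `sign` finds the parity by undoing the permutation with swaps and counting them, which is one way (chosen here for illustration, not taken from the notes) of expressing it as a product of transpositions.

```python
from itertools import permutations
from math import factorial

def sign(images):
    """+1 or -1 according to the parity of a swap count that sorts `images`."""
    p, swaps = list(images), 0
    for i in range(len(p)):
        while p[i] != i + 1:
            j = p[i] - 1               # position where the value p[i] belongs
            p[i], p[j] = p[j], p[i]    # one transposition
            swaps += 1
    return -1 if swaps % 2 else 1

def compose(rho, sigma):
    """rho sigma in the right-action convention: i -> (i rho) sigma."""
    return tuple(sigma[x - 1] for x in rho)

n = 4
sym_n = list(permutations(range(1, n + 1)))
even = [p for p in sym_n if sign(p) == 1]
odd = [p for p in sym_n if sign(p) == -1]

assert len(even) == len(odd) == factorial(n) // 2
tau = (2, 1, 3, 4)                                   # the transposition (1 2)
assert sorted(compose(p, tau) for p in even) == sorted(odd)
assert all(sign(compose(r, s)) == sign(r) * sign(s) for r in sym_n for s in sym_n)
```

The middle assertion is the map f : ρ ↦ ρτ of the proof, seen to carry the even permutations bijectively onto the odd ones.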

The parity of a permutation can easily be calculated from its cycle structure:

Observation 1.9. Consider permutations in Sym(n). (1) An m-cycle is an even permutation if m is odd, an odd permutation if m is even. (2) Define the index of a permutation ρ ∈ Sym(n) by

$$\operatorname{ind}(\rho) := (m_1 - 1) + \cdots + (m_k - 1),$$

where $m_1, \ldots, m_k$ are the lengths of its cycles. Then $\operatorname{sign}(\rho) = (-1)^{\operatorname{ind}(\rho)}$. That is, ρ is an even permutation if ind(ρ) is even, and ρ is an odd permutation if ind(ρ) is odd.

Proof. We saw in the proof of Theorem 1.5 that

$$(x_0\ x_1 \ldots x_{m-1}) = (x_0\ x_1)(x_0\ x_2) \cdots (x_0\ x_{m-1}).$$

Thus an m-cycle may be expressed as a product of m − 1 transpositions and so it is even if m is odd and it is odd if m is even. Part (2) follows from Theorem 1.8(2) together with the fact that if γ is an m-cycle then $\operatorname{sign}(\gamma) = (-1)^{\operatorname{ind}(\gamma)}$.
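Observation 1.9 gives a quick recipe for computing parity. Here is a minimal sketch of it, assuming Python and the tuple-of-images representation (1ρ, 2ρ, …, nρ); the names `cycle_lengths`, `index` and `sign` are chosen for the illustration.

```python
def cycle_lengths(images):
    """Lengths of all cycles of i -> images[i - 1], including 1-cycles."""
    seen, lengths = set(), []
    for start in range(1, len(images) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = images[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def index(images):
    """ind(rho) = (m_1 - 1) + ... + (m_k - 1)."""
    return sum(m - 1 for m in cycle_lengths(images))

def sign(images):
    return (-1) ** index(images)

rho = (2, 3, 1, 5, 4)    # in cycle notation (1 2 3)(4 5)
print(cycle_lengths(rho), index(rho), sign(rho))
```

Since fixed points contribute 0 to the index, it makes no difference whether 1-cycles are counted.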

Exercise 1.7. In Sym(11) let

ρ := (1 4 2 3)(6 8 9 10 11), σ := (1 2)(4 5)(10 11), τ := (1 2 3)(4 5 6)(7 8)(9 10). What are the parities of ρ, σ, τ, $\rho\sigma$, $\sigma\tau^{-1}$, and $\rho\sigma^{-1}\tau$?

Exercise 1.8. Let ρ ∈ Sym(n). (i) Suppose that $\rho^m = \iota$ for some odd integer m. Show that ρ must be even. (ii) For which values of n does the converse (that if ρ is even then $\rho^m = \iota$ for some odd integer m) hold?

Exercise 1.9. Consider permutations in Sym(n). Show that ind(ρ) (as defined in Observation 1.9) is the smallest number m such that ρ may be expressed as a product of m transpositions. Deduce that ind(ρσ) ≤ ind(ρ) + ind(σ). [Hint. The technique in the proof of Theorem 1.6 may be used to show that if ρ can be expressed as a product of r transpositions then r ≥ ind(ρ).]

Exercise 1.10. Let ρ, σ ∈ Sym(n). Suppose that ρ^2 = σ^2 = ι and ρ σ = γ where γ is the n-cycle (1 2 3... n − 1 n). Show that if n is even then one of Fix(ρ), Fix(σ) is empty, the other has size 2 (notation is as defined on p. 3), whereas if n is odd then each of Fix(ρ), Fix(σ) has exactly one member. [Hint. Use the result of the previous exercise.]

Exercise 1.11. For any function $f(x_1, x_2, \ldots, x_n)$ and any ρ ∈ Sym(n) we define the function $f^{\rho}$ by permuting the variables in f. To be precise,

$$f^{\rho}(x_1, x_2, \ldots, x_n) := f(x_{1\rho^{-1}}, x_{2\rho^{-1}}, \ldots, x_{n\rho^{-1}}).$$

(The formula involves $\rho^{-1}$ rather than simply ρ for a technical reason which is of no relevance here.) Define a function ∆ of n variables by

$$\Delta(x_1, x_2, \ldots, x_n) := \prod_{1 \le i < j \le n} (x_j - x_i).$$

Show that if τ is a transposition in Sym(n) then $\Delta^{\tau}(x_1, \ldots, x_n) = -\Delta(x_1, \ldots, x_n)$. Hence show that if $\rho = \tau_1 \tau_2 \cdots \tau_r$ where $\tau_1, \tau_2, \ldots, \tau_r$ are transpositions then $\Delta^{\rho}(x_1, \ldots, x_n) = (-1)^r \Delta(x_1, \ldots, x_n)$. Deduce that no permutation is both even and odd.

[Note. This is a popular alternative proof of Theorem 1.6 that is to be found in many textbooks.]

Exercise 1.12. Let S be a set (finite or infinite) and let ρ : S → S be a permutation (that is, a bijection). Define a binary relation ∼ on S as follows:

for x, y ∈ S, x ∼ y :⇔ ∃ r ∈ Z : $y\rho^r = x$.

Show that ∼ is an equivalence relation on S.

2 Determinants of square matrices

To each square matrix is associated a scalar called its determinant. There are many possible approaches to determinants. The one below is classical. It is important to realise that the determinant is defined for an m × n matrix only when m = n, that is, only when the matrix is square.

In this section it does not matter much what the domain of numbers is from which the scalars or the entries of matrices are taken. For simplicity, and for consistency with the Michaelmas Term course Linear Algebra I , when reading these notes you can take it that linear algebra involves vector spaces over R and that matrices have real coefficients. Nevertheless, as I have said, the coefficient domain does not matter (as long as it is a field, that is, as long as it permits division by non-zero numbers as well as addition, subtraction and multiplication). Therefore in my writing I shall be coy about the exact nature of the scalars in order that generalisation to linear algebra over C or Q or other fields may be achieved painlessly when the time comes.

Definition 2.1. Let A be an n × n matrix with coefficients $a_{ij}$. The determinant of A is defined by the equation

$$\det A := \sum_{\rho \in \operatorname{Sym}(n)} \operatorname{sign}(\rho) \prod_{i=1}^{n} a_{i,\,i\rho}.$$

Thus

$$\det A = \sum_{\alpha} \sigma(\alpha)\, a_{1 j_1} a_{2 j_2} \cdots a_{n j_n},$$

where α ranges over all the n! arrangements $j_1, j_2, \ldots, j_n$ of 1, 2, …, n, and σ(α) is 1 or −1 according as the permutation $\begin{pmatrix} 1 & 2 & \ldots & n \\ j_1 & j_2 & \ldots & j_n \end{pmatrix}$ is even or odd. Thus the determinant is obtained by taking one entry from each row of the matrix in such a way that the chosen entries lie in different columns, doing this in each of the n! possible ways, multiplying these n numbers together, multiplying by ±1 and adding all these ingredients together.

Examples: $\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$; $\det I_n = 1$, where $I_n$ is the n × n identity matrix.
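Definition 2.1 can be transcribed almost word for word into a program. The sketch below assumes Python, uses 0-based indices, and computes the sign of a permutation by counting inversions, a standard equivalent of the parity that is not the definition used in these notes. The names `sign` and `det` are illustrative. It is hopelessly slow for large n, since the sum has n! terms.

```python
from itertools import permutations

def sign(images):
    """Sign of the permutation i -> images[i], via the parity of its inversions."""
    n = len(images)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if images[i] > images[j])
    return -1 if inversions % 2 else 1

def det(a):
    """Sum over all permutations of sign(rho) * a[0][0 rho] * ... * a[n-1][(n-1) rho]."""
    n = len(a)
    total = 0
    for images in permutations(range(n)):
        term = sign(images)
        for i in range(n):
            term *= a[i][images[i]]    # one entry from each row, distinct columns
        total += term
    return total

print(det([[1, 2], [3, 4]]))                    # ad - bc for a 2 x 2 matrix
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))   # an arbitrary 3 x 3 example
```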

Exercise 2.1. Consider the geometry of R^2. Let $a := (a_1, a_2)$ ∈ R^2 and $b := (b_1, b_2)$ ∈ R^2. Show that, up to sign, $\det\begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}$ is the area of the parallelogram with vertices at O(0, 0), a, b, and a + b.

By way of further examples we dwell for a moment on determinants of 3 × 3 matrices.

Exercise 2.2. Check carefully that

$$\det\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} = a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1.$$

Writing the right hand side as $a_1(b_2 c_3 - b_3 c_2) + a_2(b_3 c_1 - b_1 c_3) + a_3(b_1 c_2 - b_2 c_1)$ we see that it is the same as the scalar (or dot) product of the (row) vectors

$(a_1, a_2, a_3)$ and $(b_2 c_3 - b_3 c_2,\ b_3 c_1 - b_1 c_3,\ b_1 c_2 - b_2 c_1)$

in R^3. The second of these is the vector product (or cross product) of the two row vectors $(b_1, b_2, b_3)$ and $(c_1, c_2, c_3)$ in R^3. Now a · (b × c) is, by definition, the so-called scalar triple product [a, b, c] of the three vectors $a := (a_1, a_2, a_3)$, $b := (b_1, b_2, b_3)$, $c := (c_1, c_2, c_3)$. Recall that b × c is a vector that is perpendicular to the plane spanned by b and c (at least in the general case where b and c are linearly independent of each other) and that its magnitude is |b||c| sin α, where α is the angle between b and c. Thus its magnitude is double the area of the triangle with vertices at 0, b, c, and so it equals the area of the parallelogram with vertices 0, b, c and b + c. Forming the scalar product with a then gives a number whose magnitude is the volume of the parallelepiped spanned by a, b and c, that is, the parallelepiped with vertices at 0, a, b, c, a + b, a + c, b + c and a + b + c. Let’s call this vol(a, b, c). Then we can summarise this discussion as follows:

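Theorem 2.2. If $a := (a_1, a_2, a_3)$, $b := (b_1, b_2, b_3)$ and $c := (c_1, c_2, c_3)$ in R^3 then

$$\det\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} = [a, b, c]$$

and

$$\operatorname{vol}(a, b, c) = \left|\, \det\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \right|.$$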
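A numerical spot check of Theorem 2.2 may be reassuring. The sketch below assumes Python; the vectors are arbitrary, and `det3` simply transcribes the six-term formula of Exercise 2.2 (it is a check on one example, not a proof).

```python
def cross(b, c):
    """Vector product b x c in R^3."""
    return (b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0])

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def det3(a, b, c):
    """Determinant of the 3 x 3 matrix with rows a, b, c (six-term formula)."""
    return (a[0] * b[1] * c[2] - a[0] * b[2] * c[1]
            + a[1] * b[2] * c[0] - a[1] * b[0] * c[2]
            + a[2] * b[0] * c[1] - a[2] * b[1] * c[0])

a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)
triple = dot(a, cross(b, c))        # the scalar triple product [a, b, c]
assert det3(a, b, c) == triple
volume = abs(det3(a, b, c))         # vol(a, b, c)
```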

As one further example we record the following:

Observation 2.3. If the n × n matrix A is upper triangular, that is $a_{ij} = 0$ when i > j, then $\det A = a_{11} a_{22} \cdots a_{nn}$.

Here is the reason. Consider the summand $\prod_{i=1}^{n} a_{i,\,i\rho}$ in the determinant. If it is to be non-zero then $a_{i,\,i\rho} \ne 0$ for every value of i, and therefore i ≤ iρ for all i between 1 and n. It follows that nρ = n; then since ρ is bijective, also (n − 1)ρ = n − 1; and so on. In fact iρ = i for 1 ≤ i ≤ n, that is, ρ = ι. Thus all terms in the determinant are 0 except possibly the one corresponding to the identity permutation, and this term is $+a_{11} a_{22} \cdots a_{nn}$. Hence $\det A = a_{11} a_{22} \cdots a_{nn}$ as claimed.

Exercise 2.3. Suppose that A is lower triangular, that is, $a_{ij} = 0$ when i < j. Find det A.

Exercise 2.4. Suppose that the n × n matrix A has the form $\begin{pmatrix} U & V \\ W & X \end{pmatrix}$ in which U, V, W, X are $n_1 \times n_1$, $n_1 \times n_2$, $n_2 \times n_1$ and $n_2 \times n_2$ matrices respectively, where $n_1 + n_2 = n$. Show that if V = 0 then det A = det U det X.

Exercise 2.5. Let A be an n × n matrix. Show that if one of its rows is the zero (row) vector, or if one of its columns is the zero (column) vector, then det A = 0.

Exercise 2.6. Let A be an n × n matrix and λ a scalar (that is, a number). Show that $\det(\lambda A) = \lambda^n \det A$.

Observation 2.4. Let A be an n × n matrix with coefficients $a_{ij}$. Then

$$\det A = \sum_{\rho \in \operatorname{Sym}(n)} \operatorname{sign}(\rho) \prod_{j=1}^{n} a_{j\rho,\,j}.$$

That is, $\det A = \sum_{\alpha} \sigma(\alpha)\, a_{i_1 1} a_{i_2 2} \cdots a_{i_n n}$, where α ranges over all the n! arrangements $i_1, i_2, \ldots, i_n$ of 1, 2, …, n, and σ(α) is 1 or −1 according as the permutation $\begin{pmatrix} 1 & 2 & \ldots & n \\ i_1 & i_2 & \ldots & i_n \end{pmatrix}$ is even or odd.

For, interchange of two equal rows (or columns) leaves A unchanged, but replaces detA by −detA. Thus detA = −detA and therefore detA = 0.

We define the (i, j)th minor $A_{ij}$ of A to be the (n − 1) × (n − 1) matrix obtained by deleting the ith row and the jth column from A.

Theorem 2.8. If n = 1 then $\det A = a_{11}$; if n > 1 then

$$\det A = \sum_{j=1}^{n} (-1)^{1+j}\, a_{1j} \det A_{1j}.$$

Proof. Write $\det A = a_{11} A_1 + a_{12} A_2 + \cdots + a_{1n} A_n$ where $A_1, A_2, \ldots, A_n$ are certain functions of the entries of A. Let’s first examine $A_1$. Terms in the expression for det A that involve $a_{11}$ are those of the form $\operatorname{sign}(\rho) \prod_{i=1}^{n} a_{i,\,i\rho}$ in which 1ρ = 1. Thus the relevant permutations fix 1 and permute {2, …, n}. They are even or odd as permutations of this set of size n − 1 according to whether they are even or odd as members of Sym(n). Thus

$$A_1 = \sum_{\rho \in \operatorname{Sym}(\{2, \ldots, n\})} \operatorname{sign}(\rho) \prod_{i=2}^{n} a_{i,\,i\rho},$$

and so $A_1 = \det(A_{11})$.

To identify $A_m$ we work with a new matrix $A'$. It is the matrix obtained from A by permuting its first m columns cyclically, that is to say according to the m-cycle γ := (1 2 … m). By Theorem 2.6, $\det A' = \operatorname{sign}(\gamma) \det A$. Let $a'_{ij}$ be the (i, j)-entry of $A'$. Thus $a'_{ij} = a_{ij}$ if j > m, $a'_{ij} = a_{i(j-1)}$ if 2 ≤ j ≤ m, and $a'_{i1} = a_{im}$. We have already seen that what multiplies $a'_{11}$ in the expression for $\det A'$ is $\det A'_{11}$, where $A'_{11}$ is the (1, 1)th minor of $A'$. Clearly, however, $A'_{11} = A_{1m}$. Therefore what multiplies $a_{1m}$ in the expression for $\operatorname{sign}(\gamma) \det A$ is $\det A_{1m}$. Thus $A_m = \operatorname{sign}(\gamma) \det A_{1m}$. Observation 1.9 says that sign(γ) is 1 if m is odd and it is −1 if m is even. Hence $\operatorname{sign}(\gamma) = (-1)^{1+m}$, and so (returning now to j as column index, rather than m)

$$\det A = \sum_{j=1}^{n} a_{1j} A_j = \sum_{j=1}^{n} (-1)^{1+j}\, a_{1j} \det A_{1j},$$

as the theorem states.

Exercise 2.8. Construct a proof of Corollary 2.7 as follows. Let A be an n × n matrix with two rows the same. We seek to prove that det A = 0. Check that this is true for n = 2. Now suppose that n > 2 and that the determinant of an (n − 1) × (n − 1) matrix that has two equal rows is zero. By Theorem 2.6, without loss of generality we may suppose that the (n − 1)th and nth rows are equal. Now use Theorem 2.8 and the induction hypothesis to derive that det A = 0.

[Note. There is a slight disadvantage to the original proof of Corollary 2.7 in that it does not generalise to matrices with coefficients from arithmetic systems in which a = −a for all elements a. One such system is $Z_2$, the domain consisting of just the two elements 0 and 1 with addition and multiplication modulo 2. Linear algebra over domains other than R (and C) is important, and although it is not treated in the Mods syllabus, it is the subject of an important second year algebra course. The proof sketched in this exercise works for arbitrary coefficient domains.]

Some authors take Theorem 2.8 as an inductive definition of the determinant function and infer that the equation of Definition 2.1 holds. The second assertion in the statement of Theorem 2.8 is known as expansion of detA by its first row. In fact one can expand by any row or any column. That is the force of our next result.
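Read as an inductive definition, Theorem 2.8 translates directly into a recursive procedure. The following is a minimal sketch, assuming Python, matrices stored as lists of rows, and 0-based indices (so that the sign $(-1)^{1+j}$ becomes `(-1) ** j`); the names `minor` and `det` are illustrative.

```python
def minor(a, i, j):
    """The matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(n))

a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
print(det(a))
```

Like the defining sum over Sym(n), this recursion does work proportional to n!, so it is a way of understanding the determinant rather than of computing it for large matrices.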

Theorem 2.9. Let A be an n × n matrix with entries $a_{ij}$ and minors $A_{ij}$. If 1 ≤ r ≤ n then

$$\det A = \sum_{j=1}^{n} (-1)^{r+j}\, a_{rj} \det A_{rj}$$

(expansion of the determinant by its rth row).

If 1 ≤ s ≤ n then

$$\det A = \sum_{i=1}^{n} (-1)^{i+s}\, a_{is} \det A_{is}$$

(expansion of the determinant by its sth column).

Proof. Let $A'$ be the matrix obtained from A by permuting its first r rows cyclically, so that the first row of $A'$ is the rth row of A. This permutation has sign $(-1)^{r-1}$ and so $\det A' = (-1)^{r-1} \det A$. Expansion of $\det A'$ by its first row says that $\det A' = \sum_{j=1}^{n} (-1)^{j+1}\, a'_{1j} \det A'_{1j}$, where $a'_{ij}$, $A'_{ij}$ are the entries and minors of $A'$. But $a'_{1j} = a_{rj}$ and $A'_{1j} = A_{rj}$. Therefore

$$\det A = (-1)^{r-1} \det A' = (-1)^{r-1} \sum_{j=1}^{n} (-1)^{j+1}\, a'_{1j} \det A'_{1j} = \sum_{j=1}^{n} (-1)^{r+j}\, a_{rj} \det A_{rj},$$

as required.

To prove the equation for column expansion we use Corollary 2.5 and expand $\det(A^{\mathrm{tr}})$ by its sth row.
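As a sanity check on Theorem 2.9, one can expand a sample determinant by every row and every column and confirm that the answers agree. The sketch assumes Python and 0-based indices, for which the sign $(-1)^{r+j}$ is unchanged; the helper names are hypothetical.

```python
def minor(a, i, j):
    """Delete row i and column j (0-based) from the matrix a."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def expand_by_row(a, r):
    return sum((-1) ** (r + j) * a[r][j] * det(minor(a, r, j)) for j in range(len(a)))

def expand_by_column(a, s):
    return sum((-1) ** (i + s) * a[i][s] * det(minor(a, i, s)) for i in range(len(a)))

a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
assert all(expand_by_row(a, r) == det(a) for r in range(3))
assert all(expand_by_column(a, s) == det(a) for s in range(3))
```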

Theorem 2.10. Let A be an n × n matrix with entries $a_{ij}$ and minors $A_{ij}$ and let 1 ≤ r ≤ n, 1 ≤ s ≤ n. If r ≠ s then

$$\sum_{j=1}^{n} (-1)^{s+j}\, a_{rj} \det A_{sj} = 0 \quad\text{and}\quad \sum_{i=1}^{n} (-1)^{i+s}\, a_{ir} \det A_{is} = 0.$$

Proof. Let $A'$ be the matrix obtained from A by replacing the sth row with the rth row and leaving all other rows unchanged. The sum $\sum_{j=1}^{n} (-1)^{s+j}\, a_{rj} \det A_{sj}$ is the expansion of $\det(A')$ by its sth row. But the rth and sth rows of $A'$ are the same and therefore $\det(A') = 0$ by Corollary 2.7 (see also Exercise 2.8). This proves the first equation. The second may be proved in a similar way or by applying the first to the transpose $A^{\mathrm{tr}}$ of A.

The information in the previous two theorems may be encapsulated in a pair of matrix equations.

Definition 2.11. Let A be an n × n matrix with entries $a_{ij}$ and minors $A_{ij}$. The matrix

$$A^{*} := \bigl( (-1)^{i+j} \det(A_{ij}) \bigr)$$

is known as the adjoint of A. Its entries $(-1)^{i+j} \det(A_{ij})$ are known as the cofactors of A.

Theorem 2.12. Let A be an n × n matrix with adjoint $A^{*}$. Then

$$A\, (A^{*})^{\mathrm{tr}} = (A^{*})^{\mathrm{tr}} A = (\det A)\, I_n,$$

where $I_n$ is the n × n identity matrix.
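Theorem 2.12 packages Theorems 2.9 and 2.10 together: the diagonal entries of the products are row or column expansions of det A, and the off-diagonal entries are the vanishing sums. Below is a small sketch that checks the identity on one integer matrix, assuming Python; all function names are made up for the example, and `adjoint` follows Definition 2.11 (the matrix of cofactors, before transposing).

```python
def minor(a, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjoint(a):
    """Matrix whose (i, j) entry is the cofactor (-1)^(i+j) det(A_ij)."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
d = det(a)
adj_tr = transpose(adjoint(a))
scalar = [[d if i == j else 0 for j in range(3)] for i in range(3)]
assert matmul(a, adj_tr) == scalar
assert matmul(adj_tr, a) == scalar
```

When d is non-zero, dividing `adj_tr` by d gives the inverse of the matrix; the example uses integers so that the check is exact.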

(2) Let $\rho_1, \rho_2, \ldots, \rho_k$ be a sequence of elementary row operations that produces from A a matrix R in row-reduced echelon form. Define numbers $d_i$ as follows:

$$d_i := \begin{cases} -1 & \text{if } \rho_i \text{ interchanges two rows,} \\ \lambda & \text{if } \rho_i \text{ multiplies a row by the non-zero number } \lambda, \\ 1 & \text{if } \rho_i \text{ adds a multiple of one row to another row.} \end{cases}$$

If the last row of R is the 0 row-vector then det A = 0; otherwise R is the identity matrix $I_n$ and $\det A = (d_1 d_2 \cdots d_k)^{-1}$.

Proof. First recall that to each elementary row operation on the matrix A there corresponds a matrix E (known as an elementary matrix) such that if $A'$ is the result of applying the row operation to A then $A' = E A$. In fact, E is the matrix obtained by applying the row operation to the identity matrix $I_n$. Thus

  • if the operation is interchange of rows r and s, then E is the matrix whose entries are 1 in the (i, i) position if i ≠ r and i ≠ s, 1 in the (r, s) and (s, r) positions, and 0 elsewhere;
  • if the operation is multiplication of row r by a non-zero number λ then E is the matrix with entry λ in the (r, r) position, entries 1 in the (i, i) position for i ≠ r, and entries 0 in all off-diagonal positions;
  • if the operation is addition of λ times row r to row s, then E has entries 1 on the diagonal, entry λ in the (s, r) position, and entries 0 in all other positions.

It is routine to check that in the first case detE = −1, in the second case detE = λ, and in the third detE = 1. Therefore in all cases det(E A) = detE detA by Theorem 2.14.

Now for $1 \leq i \leq k$ let $E_i$ be the elementary matrix corresponding to the row operation $\rho_i$. Then $R = E_k E_{k-1} \cdots E_2 E_1 A$. Now use what was proved: $\det(E_1 A) = \det(E_1) \det A$; then

\[ \det(E_2 E_1 A) = \det(E_2) \det(E_1 A) = \det(E_2) \det(E_1) \det A ; \]

and so on. Ultimately what emerges is that

\[ \det R = \det(E_k) \det(E_{k-1}) \cdots \det(E_1) \det A . \]

Note, however, that $\det(E_i) = d_i$. Therefore $\det R = d_k d_{k-1} \cdots d_1 \det A$.

Recall now that $R$ is in row-reduced echelon form. What this means is that in $R$ all non-zero rows precede any zero rows; in each non-zero row the first non-zero entry is $1$ and the entries above this $1$ are $0$; and if $r < s$ then the leading entry in the $r$th row comes to the left of the leading entry in the $s$th row. There are therefore two possibilities. Either the last row of $R$ is the $0$ vector or all rows of $R$ are non-zero. If $R$ has a zero row then $\det R = 0$ (see Exercise 2.5) and so $\det A = 0$. Moreover, this occurs if and only if the rows of $A$ are linearly dependent. This proves clause (1) of the theorem.

If all rows of $R$ are non-zero then, since $R$ is in row-reduced echelon form and has the same number of columns as rows, $R = I_n$. Then $\det R = 1$, whence

\[ d_k d_{k-1} \cdots d_1 \det A = 1 \]

and so $\det A = (d_1 d_2 \cdots d_k)^{-1}$, as stated in clause (2).
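Clause (2) translates directly into a method of computation: reduce the matrix to row-reduced echelon form, keep a running product of the numbers $d_i$, and invert that product at the end. The Python sketch below is one possible way to organise this, not a prescribed algorithm; the function name `det_by_row_reduction` and the test matrices are assumptions made for the illustration, and exact rational arithmetic from the standard `fractions` module is used to avoid rounding.

```python
from fractions import Fraction

def det_by_row_reduction(a):
    """Return det(a) as (d_1 d_2 ... d_k)^(-1), or 0 if the reduction stalls."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]   # exact working copy
    d_product = Fraction(1)                         # running product of the d_i
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            # No pivot available: the matrix is singular, so R has a zero row.
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d_product *= -1                         # interchange: d_i = -1
        lam = 1 / m[col][col]
        if lam != 1:
            m[col] = [lam * x for x in m[col]]
            d_product *= lam                        # scaling by lam: d_i = lam
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = -m[r][col]
                m[r] = [x + factor * y for x, y in zip(m[r], m[col])]
                # Adding a multiple of one row to another: d_i = 1.
    # At this point m is the identity matrix, so det(a) = 1 / (d_1 ... d_k).
    return 1 / d_product

print(det_by_row_reduction([[2, 0, 1], [1, 3, -1], [0, 5, 4]]))
print(det_by_row_reduction([[1, 2], [2, 4]]))
print(det_by_row_reduction([[0, 1], [1, 0]]))
```

Only interchanges and scalings change the running product; the many row additions that do most of the work leave it untouched.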

Note. An $n \times n$ matrix $A$ for which $\det A = 0$ is said to be singular. If $A$ is non-singular (that is, $\det A \neq 0$) then by Corollary 2.13 it is invertible. By Theorem 2.15(1), if $A$ is singular then its rows are linearly dependent and so it does not have an inverse. Thus a square matrix is invertible if and only if it is non-singular.

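Exercise 2.9. Use row operations to calculate $\det A_1$ and $\det A_2$ where

\[ A_1 := \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \\ 3 & 4 & 1 & 2 \\ 4 & 3 & 2 & 1 \end{pmatrix} \quad \text{and} \quad A_2 := \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \\ 3 & 4 & 1 & 2 \\ 2 & 3 & 4 & 1 \end{pmatrix} . \]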

Row operations can also be used to prove the following very important theorem.

Theorem 2.16. If $A$, $B$ are $n \times n$ matrices then $\det(A B) = \det A \, \det B$.

Proof. As in the preceding theorem and its proof let $\rho_1, \rho_2, \ldots, \rho_k$ be elementary row operations that produce from $A$ a matrix $R$ in row-reduced echelon form. Let $E_i$ be the elementary matrix corresponding to $\rho_i$. Then $R = E_k E_{k-1} \cdots E_2 E_1 A$.

For each $i$ the matrix $E_i$ has an inverse $F_i$. If $E_i$ corresponds to a transposition of rows then $F_i = E_i$; if $E_i$ corresponds to multiplication of a row by the non-zero number $\lambda$ then $F_i$ is the same as $E_i$ except that the entry $\lambda$ is replaced by $\lambda^{-1}$; and if $E_i$ corresponds to addition of $\lambda$ times the $r$th row to the $s$th row then $F_i$ is the same as $E_i$ except that the entry $\lambda$ is replaced by $-\lambda$. Thus $F_1, F_2, \ldots, F_k$ are again elementary matrices. Multiplying each side of the equation $R = E_k E_{k-1} \cdots E_2 E_1 A$ by $F_1 F_2 \cdots F_{k-1} F_k$ we find that on the right $F_k$ cancels $E_k$, then $F_{k-1}$ cancels $E_{k-1}$, and so on. The outcome is that $A = F_1 F_2 \cdots F_k R$. Therefore $A B = F_1 F_2 \cdots F_k R B$. As in the proof of Theorem 2.15, since the matrices $F_i$ are elementary we know that $\det(F_i B') = \det(F_i) \det(B')$ for any $n \times n$ matrix $B'$ and it follows that

\[ \det(A B) = \det(F_1) \det(F_2) \cdots \det(F_k) \det(R B) . \]

If the last row of $R$ is the $0$ row-vector then also the last row of $R B$ is the zero row-vector and so $\det(R B) = 0$, whence $\det(A B) = 0$. But in this case the rows of $A$ are linearly dependent and so $\det A = 0$, whence also $\det A \, \det B = 0$. Thus in this case $\det A \, \det B = \det(A B)$.

If, on the other hand, the last row of $R$ is non-zero then $R = I_n$, so $R B = B$, and then

\[ \det(A B) = \det(F_1) \det(F_2) \cdots \det(F_k) \det B . \]

But $\det(F_1) \det(F_2) \cdots \det(F_k) = \det A$ (take $B = I_n$ in the equation just displayed), and therefore $\det(A B) = \det A \, \det B$ also in this case. This completes the proof of the theorem.
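A numerical check is no substitute for the proof, but it may help to see both cases of the argument on concrete matrices. In the following Python sketch the matrices `A`, `B` and `S` are arbitrary choices made for the illustration (`S` has two proportional rows, so it is singular), and `det` is a naive expansion along the first row, adequate only for small examples.

```python
def det(a):
    """Determinant by expansion along the first row; the empty matrix has det 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # non-singular
B = [[1, 2, 0], [0, 1, 4], [3, 0, 1]]    # non-singular
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]    # singular: row 2 is twice row 1

# Non-singular case: det(A B) agrees with det A det B.
print(det(matmul(A, B)), det(A) * det(B))

# Singular case: both sides are 0.
print(det(matmul(S, B)), det(S) * det(B))
```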

There is a simple consequence of this theorem that is important enough to be worth making explicit:

Corollary 2.17. Let $A$ be an $n \times n$ invertible matrix. Then

\[ \det(A^{-1}) = (\det A)^{-1} . \]

For, by the theorem, $\det A \, \det(A^{-1}) = \det(A A^{-1}) = \det I_n = 1$.
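For a $2 \times 2$ matrix the corollary can be seen at once from the familiar explicit formula for the inverse. The short Python sketch below uses that formula; the function names `det2` and `inverse2` and the sample matrix are assumptions made for this illustration only.

```python
from fractions import Fraction

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse2(m):
    """Inverse of a non-singular 2 x 2 matrix, in exact rational arithmetic."""
    d = Fraction(det2(m))
    if d == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

A = [[3, 1], [4, 2]]
A_inv = inverse2(A)

print(det2(A), det2(A_inv))        # the two determinants are reciprocal
print(det2(A) * det2(A_inv) == 1)
```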

The concept of determinant applies as well to linear transformations of a finite-dimensional vector space to itself as to square matrices. That phrase ‘linear transformation of a vector space to itself’, or its equivalent ‘linear mapping of a vector space to itself’, is a bit of a mouthful and so we introduce the following abbreviation (which is standard in more advanced algebra):