Matrix Properties and Special Types: Diagonal Matrices, Determinants, and Quadratic Forms, Notes on Mathematical Methods

Various properties and uses of matrices, including diagonal matrices, determinants, Hermitian matrices, and quadratic forms. Topics covered include matrix multiplication, eigenvalues and eigenvectors, and the relationship between matrices and quadratic forms.

Type: Notes

2020/2021

Uploaded on 28/08/2021

joarisrosa 🇩🇴


1 Matrices and vector spaces

In so far as vector algebra is concerned (see the summary in Section A.9 of Appendix A), a vector can be considered as a geometrical object which has both a magnitude and a direction, and may be thought of as an arrow fixed in our familiar three-dimensional space. This space, in turn, may be defined by reference to, say, the fixed stars. This geometrical definition of a vector is both useful and important since it is independent of any coordinate system with which we choose to label points in space.

In most specific applications, however, it is necessary at some stage to choose a coordinate system and to break down a vector into its component vectors in the directions of increasing coordinate values. Thus for a particular Cartesian coordinate system (for example) the component vectors of a vector $\mathbf{a}$ will be $a_x\mathbf{i}$, $a_y\mathbf{j}$ and $a_z\mathbf{k}$ and the complete vector will be

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}. \tag{1.1}$$

Although for many purposes we need consider only real three-dimensional space, the notion of a vector may be extended to more abstract spaces, which in general can have an arbitrary number of dimensions N. We may still think of such a vector as an “arrow” in this abstract space, so that it is again independent of any (N-dimensional) coordinate system with which we choose to label the space. As an example of such a space, which, though abstract, has very practical applications, we may consider the description of a mechanical or electrical system. If the state of a system is uniquely specified by assigning values to a set of N variables, which could include angles or currents, for example, then that state can be represented by a vector in an N-dimensional space, the vector having those values as its components.^1

In this chapter we first discuss general vector spaces and their properties. We then go on to consider the transformation of one vector into another by a linear operator. This leads naturally to the concept of a matrix, a two-dimensional array of numbers. The properties of matrices are then developed and we conclude with a discussion of how to use these properties to solve systems of linear equations and study some oscillatory systems.

1 This is an approach often used in control engineering.


1.1 Vector spaces

A set of objects (vectors) $\mathbf{a}, \mathbf{b}, \mathbf{c}, \ldots$ is said to form a linear vector space $V$ if:

(i) the set is closed under commutative and associative addition, so that

$$\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}, \tag{1.2}$$

$$(\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c}); \tag{1.3}$$

(ii) the set is closed under multiplication by a scalar (any complex number) to form a new vector $\lambda\mathbf{a}$, the operation being both distributive and associative so that

$$\lambda(\mathbf{a} + \mathbf{b}) = \lambda\mathbf{a} + \lambda\mathbf{b}, \tag{1.4}$$

$$(\lambda + \mu)\mathbf{a} = \lambda\mathbf{a} + \mu\mathbf{a}, \tag{1.5}$$

$$\lambda(\mu\mathbf{a}) = (\lambda\mu)\mathbf{a}, \tag{1.6}$$

where $\lambda$ and $\mu$ are arbitrary scalars;

(iii) there exists a null vector $\mathbf{0}$ such that $\mathbf{a} + \mathbf{0} = \mathbf{a}$ for all $\mathbf{a}$;

(iv) multiplication by unity leaves any vector unchanged, i.e. $1 \times \mathbf{a} = \mathbf{a}$;

(v) all vectors have a corresponding negative vector $-\mathbf{a}$ such that $\mathbf{a} + (-\mathbf{a}) = \mathbf{0}$.

It follows from (1.5) with $\lambda = 1$ and $\mu = -1$ that $-\mathbf{a}$ is the same vector as $(-1) \times \mathbf{a}$.

We note that if we restrict all scalars to be real then we obtain a real vector space (an example of which is our familiar three-dimensional space); otherwise, in general, we obtain a complex vector space. We note that it is common to use the terms “vector space” and “space”, instead of the more formal “linear vector space”.

The span of a set of vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{s}$ is defined as the set of all vectors that may be written as a linear sum of the original set, i.e. all vectors

$$\mathbf{x} = \alpha\mathbf{a} + \beta\mathbf{b} + \cdots + \sigma\mathbf{s} \tag{1.7}$$

that result from the infinite number of possible values of the (in general complex) scalars $\alpha, \beta, \ldots, \sigma$. If $\mathbf{x}$ in (1.7) is equal to $\mathbf{0}$ for some choice of $\alpha, \beta, \ldots, \sigma$ (not all zero), i.e. if

$$\alpha\mathbf{a} + \beta\mathbf{b} + \cdots + \sigma\mathbf{s} = \mathbf{0}, \tag{1.8}$$

then the set of vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{s}$ is said to be linearly dependent. In such a set at least one vector is redundant, since it can be expressed as a linear sum of the others. If, however, (1.8) is not satisfied by any set of coefficients (other than the trivial case in which all the coefficients are zero) then the vectors are linearly independent, and no vector in the set can be expressed as a linear sum of the others.

If, in a given vector space, there exist sets of N linearly independent vectors, but no set of N + 1 linearly independent vectors, then the vector space is said to be N-dimensional. In this chapter we will limit our discussion to vector spaces of finite dimensionality.
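The test for linear dependence in (1.8) can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the vectors are given by lists of real, rational components in some basis, and it uses Gaussian elimination (a technique the chapter has not introduced at this point) to count how many of them are independent. The function names `rank` and `linearly_independent` are illustrative choices.

```python
from fractions import Fraction


def rank(vectors):
    """Number of linearly independent vectors in the list (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Remove this column's contribution from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def linearly_independent(vectors):
    """True if (1.8) holds only when all the coefficients are zero."""
    return rank(vectors) == len(vectors)


basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
redundant = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]  # third vector = first + second
print(linearly_independent(basis))      # True
print(linearly_independent(redundant))  # False
```

Exact `Fraction` arithmetic is used so that a zero pivot is recognized exactly; with floating-point components a tolerance would be needed instead.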


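a $^*$ superscript denotes complex conjugation):^3

$$\langle \mathbf{a}|\mathbf{b}\rangle = \langle \mathbf{b}|\mathbf{a}\rangle^*, \tag{1.12}$$

$$\langle \mathbf{a}|\lambda\mathbf{b} + \mu\mathbf{c}\rangle = \lambda\langle \mathbf{a}|\mathbf{b}\rangle + \mu\langle \mathbf{a}|\mathbf{c}\rangle, \tag{1.13}$$

$$\langle \lambda\mathbf{a} + \mu\mathbf{b}|\mathbf{c}\rangle = \lambda^*\langle \mathbf{a}|\mathbf{c}\rangle + \mu^*\langle \mathbf{b}|\mathbf{c}\rangle, \tag{1.14}$$

$$\langle \lambda\mathbf{a}|\mu\mathbf{b}\rangle = \lambda^*\mu\langle \mathbf{a}|\mathbf{b}\rangle. \tag{1.15}$$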

Following the analogy with the dot product in three-dimensional real space, two vectors in a general vector space are defined to be orthogonal if $\langle \mathbf{a}|\mathbf{b}\rangle = 0$. In the same way, the norm of a vector $\mathbf{a}$, defined by $\|\mathbf{a}\| = \langle \mathbf{a}|\mathbf{a}\rangle^{1/2}$, is clearly a generalization of the length or modulus $|\mathbf{a}|$ of a vector $\mathbf{a}$ in three-dimensional space. In a general vector space $\langle \mathbf{a}|\mathbf{a}\rangle$ can be positive or negative; however, we will be concerned only with spaces in which $\langle \mathbf{a}|\mathbf{a}\rangle \geq 0$ and which are therefore said to have a positive semi-definite norm. In such a space $\langle \mathbf{a}|\mathbf{a}\rangle = 0$ implies $\mathbf{a} = \mathbf{0}$.

It is usual when working with an N-dimensional vector space to use a basis $\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \ldots, \hat{\mathbf{e}}_N$ that has the desirable property of being orthonormal (the basis vectors are mutually orthogonal and each has unit norm), i.e. a basis that has the property

$$\langle \hat{\mathbf{e}}_i|\hat{\mathbf{e}}_j\rangle = \delta_{ij}. \tag{1.16}$$

Here $\delta_{ij}$ is the Kronecker delta symbol, defined by the properties

$$\delta_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$

Using the above basis, any two vectors $\mathbf{a}$ and $\mathbf{b}$ can be written as

$$\mathbf{a} = \sum_{i=1}^{N} a_i\hat{\mathbf{e}}_i \quad \text{and} \quad \mathbf{b} = \sum_{i=1}^{N} b_i\hat{\mathbf{e}}_i.$$

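Furthermore, in such an orthonormal basis we have, for any $\mathbf{a}$,

$$\langle \hat{\mathbf{e}}_j|\mathbf{a}\rangle = \sum_{i=1}^{N} \langle \hat{\mathbf{e}}_j|a_i\hat{\mathbf{e}}_i\rangle = \sum_{i=1}^{N} a_i\langle \hat{\mathbf{e}}_j|\hat{\mathbf{e}}_i\rangle = a_j. \tag{1.17}$$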

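Thus the components of $\mathbf{a}$ are given by $a_i = \langle \hat{\mathbf{e}}_i|\mathbf{a}\rangle$. Note that this is not true unless the basis is orthonormal. We can write the inner product of $\mathbf{a}$ and $\mathbf{b}$ in terms of their components in an orthonormal basis as

$$\begin{aligned}
\langle \mathbf{a}|\mathbf{b}\rangle &= \langle a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + \cdots + a_N\hat{\mathbf{e}}_N \,|\, b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + \cdots + b_N\hat{\mathbf{e}}_N\rangle \\
&= \sum_{i=1}^{N} a_i^* b_i \langle \hat{\mathbf{e}}_i|\hat{\mathbf{e}}_i\rangle + \sum_{i=1}^{N}\sum_{j \neq i}^{N} a_i^* b_j \langle \hat{\mathbf{e}}_i|\hat{\mathbf{e}}_j\rangle \\
&= \sum_{i=1}^{N} a_i^* b_i,
\end{aligned}$$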

3 It is a useful exercise in close analysis to deduce properties (1.14) and (1.15), on a justified step-by-step basis, using only those given in (1.12) and (1.13) and the general properties of complex conjugation.


where the second equality follows from (1.15) and the third from (1.16). This is clearly a generalization of the expression for the dot product of vectors in three-dimensional space. The extension of the above results to the case where the base vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N$ are not orthonormal is more mathematically complicated and given in Appendix B.
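As a concrete check of the component form of the inner product, the short sketch below evaluates $\sum_i a_i^* b_i$ for two complex vectors and confirms properties (1.12) and (1.15) numerically. It assumes the components are given in an orthonormal basis; the function names `inner` and `norm` and the sample vectors are illustrative choices, not taken from the text.

```python
def inner(a, b):
    """Inner product <a|b> = sum_i conj(a_i) * b_i in an orthonormal basis."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def norm(a):
    """Norm ||a|| = <a|a>^(1/2); <a|a> is real and non-negative."""
    return inner(a, a).real ** 0.5


a = [1 + 2j, 3j]
b = [2, 1 - 1j]
lam, mu = 2 - 1j, 0.5j

# Property (1.12): <a|b> is the complex conjugate of <b|a>.
print(inner(a, b), inner(b, a).conjugate())

# Property (1.15): <lam a|mu b> = conj(lam) * mu * <a|b>.
lhs = inner([lam * x for x in a], [mu * y for y in b])
rhs = lam.conjugate() * mu * inner(a, b)
print(abs(lhs - rhs) < 1e-12)
```

Note that the conjugate is taken on the components of the first vector, matching (1.14) and (1.15).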

1.1.3 Some useful inequalities

For a set of objects (vectors) forming a linear vector space in which 〈 a | a 〉 ≥ 0 for all a , there are a number of inequalities that often prove useful. Here we only list them; for the corresponding proofs the reader is referred to Appendix C.

(i) Schwarz’s inequality states that

$$|\langle \mathbf{a}|\mathbf{b}\rangle| \leq \|\mathbf{a}\|\,\|\mathbf{b}\|, \tag{1.18}$$

where the equality holds when $\mathbf{a}$ is a scalar multiple of $\mathbf{b}$, i.e. when $\mathbf{a} = \lambda\mathbf{b}$. It is important here to distinguish between the absolute value of a scalar, $|\lambda|$, and the norm of a vector, $\|\mathbf{a}\|$.

(ii) The triangle inequality states that

$$\|\mathbf{a} + \mathbf{b}\| \leq \|\mathbf{a}\| + \|\mathbf{b}\| \tag{1.19}$$

and is the intuitive analogue of the observation that the length of any one side of a triangle cannot be greater than the sum of the lengths of the other two sides.

(iii) Bessel’s inequality states that if $\hat{\mathbf{e}}_i$, $i = 1, 2, \ldots, N$ form an orthonormal basis in an N-dimensional vector space, then

$$\|\mathbf{a}\|^2 \geq \sum_{i}^{M} |\langle \hat{\mathbf{e}}_i|\mathbf{a}\rangle|^2, \tag{1.20}$$

where the equality holds if M = N. If M < N then inequality results, unless the basis vectors omitted all have $a_i = 0$. This is the analogue of $|\mathbf{v}|^2$ for a three-dimensional vector $\mathbf{v}$ being equal to the sum of the squares of all its components, and if any are omitted the sum may fall short of $|\mathbf{v}|^2$.

To these inequalities can be added one equality that sometimes proves useful. The parallelogram equality reads

$$\|\mathbf{a} + \mathbf{b}\|^2 + \|\mathbf{a} - \mathbf{b}\|^2 = 2\left(\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2\right)$$

and may be proved straightforwardly from the properties of the inner product.
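The inequalities above can be spot-checked numerically. The sketch below is only an illustration on one pair of randomly chosen complex vectors and proves nothing; it assumes components in an orthonormal basis, so that $\langle \hat{\mathbf{e}}_i|\mathbf{a}\rangle = a_i$, and the helper names and the small tolerance `TOL` are assumptions made for the example.

```python
import random


def inner(a, b):
    """Inner product <a|b> in an orthonormal basis."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def norm(a):
    """Norm ||a|| = <a|a>^(1/2)."""
    return inner(a, a).real ** 0.5


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def random_vector(n):
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]


random.seed(0)  # fixed seed so the run is repeatable
N, M = 4, 2
TOL = 1e-12     # allowance for floating-point rounding
a, b = random_vector(N), random_vector(N)

schwarz_ok = abs(inner(a, b)) <= norm(a) * norm(b) + TOL              # (1.18)
triangle_ok = norm(add(a, b)) <= norm(a) + norm(b) + TOL              # (1.19)
bessel_ok = sum(abs(x) ** 2 for x in a[:M]) <= norm(a) ** 2 + TOL     # (1.20), M < N
parallelogram_gap = abs(
    norm(add(a, b)) ** 2 + norm(sub(a, b)) ** 2
    - 2 * (norm(a) ** 2 + norm(b) ** 2)
)

print(schwarz_ok, triangle_ok, bessel_ok, parallelogram_gap < TOL)
```

Setting `M = N` in the Bessel check turns the inequality into an equality, up to rounding.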

1.2 Linear operators

We now discuss the action of linear operators on vectors in a vector space. A linear operator $\mathcal{A}$ associates with every vector $\mathbf{x}$ another vector

$$\mathbf{y} = \mathcal{A}\mathbf{x},$$

in such a way that, for two vectors $\mathbf{a}$ and $\mathbf{b}$,

$$\mathcal{A}(\lambda\mathbf{a} + \mu\mathbf{b}) = \lambda\mathcal{A}\mathbf{a} + \mu\mathcal{A}\mathbf{b},$$


where in the last equality we see that the action of two linear operators in succession is associative. However, the product of two general linear operators is not commutative, i.e.

$$\mathcal{A}\mathcal{B}\mathbf{x} \neq \mathcal{B}\mathcal{A}\mathbf{x}$$

in general.^4

In an obvious way we define the null (or zero) and identity operators by

$$\mathcal{O}\mathbf{x} = \mathbf{0} \quad \text{and} \quad \mathcal{I}\mathbf{x} = \mathbf{x},$$

for any vector $\mathbf{x}$ in our vector space. Two operators $\mathcal{A}$ and $\mathcal{B}$ are equal if $\mathcal{A}\mathbf{x} = \mathcal{B}\mathbf{x}$ for all vectors $\mathbf{x}$. Finally, if there exists an operator $\mathcal{A}^{-1}$ such that

$$\mathcal{A}\mathcal{A}^{-1} = \mathcal{A}^{-1}\mathcal{A} = \mathcal{I}$$

then $\mathcal{A}^{-1}$ is the inverse of $\mathcal{A}$. Some linear operators do not possess an inverse and are called singular, whilst those operators that do have an inverse are termed non-singular.

1.3 Matrices

We have seen that in a particular basis $\mathbf{e}_i$ both vectors and linear operators can be described in terms of their components with respect to the basis. These components may be displayed as an array of numbers called a matrix. In general, if a linear operator $\mathcal{A}$ transforms vectors from an N-dimensional vector space, for which we choose a basis $\mathbf{e}_j$, $j = 1, 2, \ldots, N$, into vectors belonging to an M-dimensional vector space, with basis $\mathbf{f}_i$, $i = 1, 2, \ldots, M$, then we may represent the operator $\mathcal{A}$ by the matrix

$$\mathsf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M1} & A_{M2} & \cdots & A_{MN} \end{pmatrix}.$$

The matrix elements $A_{ij}$ are the components of the linear operator with respect to the bases $\mathbf{e}_j$ and $\mathbf{f}_i$; the component $A_{ij}$ of the linear operator appears in the ith row and jth column of the matrix. The array has M rows and N columns and is thus called an M × N matrix. If the dimensions of the two vector spaces are the same, i.e. M = N (for example, if they are the same vector space) then we may represent $\mathcal{A}$ by an N × N or square matrix of order N. The component $A_{ij}$, which in general may be complex, is also commonly denoted by $(\mathsf{A})_{ij}$.

In a similar way we may denote a vector $\mathbf{x}$ in terms of its components $x_i$ in a basis $\mathbf{e}_i$, $i = 1, 2, \ldots, N$, by the array

$$\mathsf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix},$$

4 Consider a two-dimensional linear vector space in which a typical vector is $\mathbf{x} = (x_1, x_2)$, with linear operators $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ defined by $\mathcal{A}\mathbf{x} = (2x_1 + x_2, x_2)$, $\mathcal{B}\mathbf{x} = (x_1, x_1 + 2x_2)$ and $\mathcal{C}\mathbf{x} = (x_1 - x_2, 2x_2)$. Show that, although $\mathcal{A}$ and $\mathcal{C}$ commute, $\mathcal{A}$ and $\mathcal{B}$ do not.


which is a special case of (1.24) and is called a column matrix (or conventionally, and slightly confusingly, a column vector or even just a vector – strictly speaking the term “vector” refers to the geometrical entity $\mathbf{x}$). The column matrix $\mathsf{x}$ can also be written as

$$\mathsf{x} = (x_1 \quad x_2 \quad \cdots \quad x_N)^{\mathrm{T}},$$

which is the transpose of a row matrix (see Section 1.6). We note that in a different basis $\mathbf{e}'_i$ the vector $\mathbf{x}$ would be represented by a different column matrix containing the components $x'_i$ in the new basis, i.e.

$$\mathsf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ \vdots \\ x'_N \end{pmatrix}.$$

Thus, we use $\mathsf{x}$ and $\mathsf{x}'$ to denote different column matrices which, in different bases $\mathbf{e}_i$ and $\mathbf{e}'_i$, represent the same vector $\mathbf{x}$. In many texts, however, this distinction is not made and $\mathbf{x}$ (rather than $\mathsf{x}$) is equated to the corresponding column matrix; if we regard $\mathbf{x}$ as the geometrical entity, however, this can be misleading and so we explicitly make the distinction. A similar argument follows for linear operators; the same linear operator $\mathcal{A}$ is described in different bases by different matrices $\mathsf{A}$ and $\mathsf{A}'$, containing different matrix elements.

1.4 Basic matrix algebra

The basic algebra of matrices may be deduced from the properties of the linear operators that they represent. In a given basis the action of two linear operators $\mathcal{A}$ and $\mathcal{B}$ on an arbitrary vector $\mathbf{x}$ (see towards the end of Section 1.2), when written in terms of components using (1.23), is given by

$$\sum_j (\mathsf{A} + \mathsf{B})_{ij}x_j = \sum_j A_{ij}x_j + \sum_j B_{ij}x_j,$$

$$\sum_j (\lambda\mathsf{A})_{ij}x_j = \lambda\sum_j A_{ij}x_j,$$

$$\sum_j (\mathsf{AB})_{ij}x_j = \sum_k A_{ik}(\mathsf{Bx})_k = \sum_j\sum_k A_{ik}B_{kj}x_j.$$

Now, since $\mathbf{x}$ is arbitrary, we can immediately deduce the way in which matrices are added or multiplied, i.e.^5

$$(\mathsf{A} + \mathsf{B})_{ij} = A_{ij} + B_{ij}, \tag{1.25}$$

5 Express the operators appearing in footnote 4 in matrix form and then use (1.27) to demonstrate their commutation or otherwise. Do operators $\mathcal{B}$ and $\mathcal{C}$ commute?


The following example illustrates these three elementary properties or definitions.

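Example The matrices A, B and C are given by

$$\mathsf{A} = \begin{pmatrix} 2 & -1 \\ 3 & 1 \end{pmatrix}, \quad \mathsf{B} = \begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix}, \quad \mathsf{C} = \begin{pmatrix} -2 & 1 \\ -1 & 1 \end{pmatrix}.$$

Find the matrix D = A + 2B − C.

Dealing separately with the elements in each particular position in the various matrices, we have

$$\begin{aligned}
\mathsf{D} &= \begin{pmatrix} 2 & -1 \\ 3 & 1 \end{pmatrix} + 2\begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix} - \begin{pmatrix} -2 & 1 \\ -1 & 1 \end{pmatrix} \\
&= \begin{pmatrix} 2 + 2 \times 1 - (-2) & -1 + 2 \times 0 - 1 \\ 3 + 2 \times 0 - (-1) & 1 + 2 \times (-2) - 1 \end{pmatrix} = \begin{pmatrix} 6 & -2 \\ 4 & -4 \end{pmatrix}.
\end{aligned}$$

As a reminder, we note that for the question to have had any meaning, A, B and C all had to have the same dimensions, 2 × 2 in practice; the answer, D, is also 2 × 2.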
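The element-by-element arithmetic of this example can be reproduced in a few lines. This is a minimal sketch in which matrices are stored as lists of rows; the helper names `add` and `scale` are illustrative and not from the text.

```python
def add(A, B):
    """Element-wise sum of two matrices with the same dimensions."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(lam, A):
    """Multiply every element of A by the scalar lam."""
    return [[lam * x for x in row] for row in A]


A = [[2, -1], [3, 1]]
B = [[1, 0], [0, -2]]
C = [[-2, 1], [-1, 1]]

# D = A + 2B - C, built from the two elementary operations.
D = add(add(A, scale(2, B)), scale(-1, C))
print(D)  # [[6, -2], [4, -4]]
```

The dimension check in `add` mirrors the reminder above: the sum is only defined when all the matrices have the same dimensions.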

From the above considerations we see that the set of all, in general complex, M × N matrices (with fixed M and N) provides an example of a linear vector space – one whose elements have no obvious “arrow-like” qualities. The space is of dimension MN. One basis for it is the set of M × N matrices $\mathsf{E}^{(p,q)}$ with the property that $E^{(p,q)}_{ij} = 1$ if i = p and j = q whilst $E^{(p,q)}_{ij} = 0$ for all other values of i and j, i.e. each matrix has only one non-zero entry, and that equals unity. Here the pair (p, q) is simply a label that picks out a particular one of the matrices $\mathsf{E}^{(p,q)}$, the total number of which is MN.

1.4.2 Multiplication of matrices

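Let us consider again the “transformation” of one vector into another, $\mathbf{y} = \mathcal{A}\mathbf{x}$, which, from (1.23), may be described in terms of components with respect to a particular basis as

$$y_i = \sum_{j=1}^{N} A_{ij}x_j \quad \text{for } i = 1, 2, \ldots, M. \tag{1.31}$$

Writing this in matrix form as $\mathsf{y} = \mathsf{Ax}$ we have

$$\begin{pmatrix} y_1 \\ \boxed{y_2} \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ \boxed{A_{21}} & \boxed{A_{22}} & \cdots & \boxed{A_{2N}} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M1} & A_{M2} & \cdots & A_{MN} \end{pmatrix} \begin{pmatrix} \boxed{x_1} \\ \boxed{x_2} \\ \vdots \\ \boxed{x_N} \end{pmatrix},$$

where we have highlighted with boxes the components used to calculate the element $y_2$: using (1.31) for i = 2,

$$y_2 = A_{21}x_1 + A_{22}x_2 + \cdots + A_{2N}x_N.$$

All the other components $y_i$ are calculated similarly.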


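If, instead, we operate with $\mathsf{A}$ on a basis vector $\mathsf{e}_j$ having all components zero except for the jth, which equals unity, then we find

$$\mathsf{Ae}_j = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M1} & A_{M2} & \cdots & A_{MN} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{Mj} \end{pmatrix},$$

and so confirm our identification of the matrix element $A_{ij}$ as the ith component of $\mathsf{Ae}_j$ in this basis.

From (1.27) we can extend our discussion to the product of two matrices $\mathsf{P} = \mathsf{AB}$, where $\mathsf{P}$ is the matrix of the quantities formed by the operation of the rows of $\mathsf{A}$ on the columns of $\mathsf{B}$, treating each column of $\mathsf{B}$ in turn as the vector $\mathbf{x}$ represented in component form in (1.31). It is clear that, for this to be a meaningful definition, the number of columns in $\mathsf{A}$ must equal the number of rows in $\mathsf{B}$. Thus the product $\mathsf{AB}$ of an M × N matrix $\mathsf{A}$ with an N × R matrix $\mathsf{B}$ is itself an M × R matrix $\mathsf{P}$, where

$$P_{ij} = \sum_{k=1}^{N} A_{ik}B_{kj} \quad \text{for } i = 1, 2, \ldots, M, \; j = 1, 2, \ldots, R.$$

For example, $\mathsf{P} = \mathsf{AB}$ may be written in matrix form

$$\begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \\ B_{31} & B_{32} \end{pmatrix},$$

where

$$\begin{aligned}
P_{11} &= A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31}, \\
P_{21} &= A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31}, \\
P_{12} &= A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32}, \\
P_{22} &= A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32}.
\end{aligned}$$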
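The row-on-column rule for $P_{ij}$ translates directly into code. The sketch below is one straightforward way to write it, with matrices stored as lists of rows; the function name `matmul` and the numerical entries are assumptions chosen for illustration. It multiplies a 2 × 3 matrix by a 3 × 2 matrix in both orders.

```python
def matmul(A, B):
    """Product P = AB with P_ij = sum_k A_ik * B_kj."""
    n = len(B)  # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    r = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
        for i in range(len(A))
    ]


A = [[1, 2, 3],
     [4, 5, 6]]    # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, -1]]      # 3 x 2

P = matmul(A, B)   # 2 x 2
Q = matmul(B, A)   # 3 x 3
print(P)  # [[7, -1], [16, -1]]
print(Q)  # [[1, 2, 3], [4, 5, 6], [-2, -1, 0]]
```

Both products exist here only because the inner dimensions match in each order; the results do not even have the same shape, so the order of the factors matters.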

Multiplication of more than two matrices follows naturally and is associative. So, for example,

A(BC) ≡ (AB)C, (1.33)

provided, of course, that all the products are defined. As mentioned above, if A is an M × N matrix and B is an N × M matrix then two product matrices are possible, i.e.

P = AB and Q = BA.

These are clearly not the same, since P is an M × M matrix whilst Q is an N × N matrix. Thus, particular care must be taken to write matrix products in the intended order; P = AB


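It is clear that, in order for the above products to be defined, the identity matrix must be square. The N × N identity matrix (often denoted by $\mathsf{I}_N$) has the form

$$\mathsf{I}_N = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$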

1.5 Functions of matrices

If a matrix $\mathsf{A}$ is square then, as mentioned above, one can define powers of $\mathsf{A}$ in a straightforward way. For example $\mathsf{A}^2 = \mathsf{AA}$, $\mathsf{A}^3 = \mathsf{AAA}$, or in the general case

$$\mathsf{A}^n = \mathsf{AA}\cdots\mathsf{A} \quad (n \text{ times}),$$

where n is a positive integer. Having defined powers of a square matrix $\mathsf{A}$, we may construct functions of $\mathsf{A}$ of the form

$$\mathsf{S} = \sum_n a_n\mathsf{A}^n,$$

where the $a_n$ are simple scalars and the number of terms in the summation may be finite or infinite. In the case where the sum has an infinite number of terms, the sum has meaning only if it converges. A common example of such a function is the exponential of a matrix, which is defined by

$$\exp\mathsf{A} = \sum_{n=0}^{\infty} \frac{\mathsf{A}^n}{n!}.$$

This definition can, in turn, be used to define other functions such as $\sin\mathsf{A}$ and $\cos\mathsf{A}$.^7
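The series definition of the exponential can be evaluated approximately by truncating the sum. The sketch below does this with plain lists of rows and applies it to the diagonal matrix of footnote 7; the function name `expm_series` and the cut-off of 30 terms are assumptions made for illustration, and a truncated power series is not how production libraries usually compute matrix exponentials.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def expm_series(A, terms=30):
    """Approximate exp A by the first `terms` terms of sum_n A^n / n!."""
    n = len(A)
    result = identity(n)  # the n = 0 term
    term = identity(n)
    for k in range(1, terms):
        # Build A^k / k! from the previous term A^(k-1) / (k-1)!.
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


# Footnote 7: A_11 = A_33 = 1, A_22 = -1, all other elements zero.
A = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]
iA = [[1j * x for x in row] for row in A]

E = expm_series(iA)
trace = sum(E[i][i] for i in range(3))
expected = 3 * math.cos(1) + 1j * math.sin(1)
print(abs(trace - expected) < 1e-12)
```

For this small matrix the terms fall off factorially, so 30 terms are ample; matrices with large elements would need more terms or a different method.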

1.6 The transpose of a matrix

In the next few sections we will consider some of the quantities that characterize any given matrix and also some other matrices that can be derived from the original. A tabulation of these derived quantities and matrices is given in the end-of-chapter Summary. We start with the transposed matrix.

We have seen that the components of a linear operator in a given coordinate system can be written in the form of a matrix $\mathsf{A}$. We will also find it useful, however, to consider the different (but clearly related) matrix formed by interchanging the rows and columns of $\mathsf{A}$. The matrix is called the transpose of $\mathsf{A}$ and is denoted by $\mathsf{A}^{\mathrm{T}}$. It is obvious that if $\mathsf{A}$ is an M × N matrix then its transpose $\mathsf{A}^{\mathrm{T}}$ is an N × M matrix.

7 For the 3 × 3 matrix $\mathsf{A}$ that has $A_{11} = A_{33} = 1$, $A_{22} = -1$ and all other $A_{ij} = 0$, show that the trace of $\exp i\mathsf{A}$, i.e. the sum of its diagonal elements, is equal to $3\cos 1 + i\sin 1$.


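Example Find the transpose of the matrix

$$\mathsf{A} = \begin{pmatrix} 3 & 1 & 2 \\ 0 & 4 & 1 \end{pmatrix}.$$

By interchanging the rows and columns of $\mathsf{A}$ we immediately obtain

$$\mathsf{A}^{\mathrm{T}} = \begin{pmatrix} 3 & 0 \\ 1 & 4 \\ 2 & 1 \end{pmatrix}.$$

As it must be, given that $\mathsf{A}$ is a 2 × 3 matrix, $\mathsf{A}^{\mathrm{T}}$ is a 3 × 2 matrix.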

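As mentioned in Section 1.3, the transpose of a column matrix is a row matrix and vice versa. An important use of column and row matrices is in the representation of the inner product of two real vectors in terms of their components in a given basis. This notion is discussed fully in the next section, where it is extended to complex vectors.

The transpose of the product of two matrices, $(\mathsf{AB})^{\mathrm{T}}$, is given by the product of their transposes taken in the reverse order, i.e.

$$(\mathsf{AB})^{\mathrm{T}} = \mathsf{B}^{\mathrm{T}}\mathsf{A}^{\mathrm{T}}. \tag{1.37}$$

This is proved as follows:

$$(\mathsf{AB})^{\mathrm{T}}_{ij} = (\mathsf{AB})_{ji} = \sum_k A_{jk}B_{ki} = \sum_k (\mathsf{A}^{\mathrm{T}})_{kj}(\mathsf{B}^{\mathrm{T}})_{ik} = \sum_k (\mathsf{B}^{\mathrm{T}})_{ik}(\mathsf{A}^{\mathrm{T}})_{kj} = (\mathsf{B}^{\mathrm{T}}\mathsf{A}^{\mathrm{T}})_{ij},$$

and the proof can be extended to the product of several matrices to give^8

$$(\mathsf{ABC}\cdots\mathsf{G})^{\mathrm{T}} = \mathsf{G}^{\mathrm{T}}\cdots\mathsf{C}^{\mathrm{T}}\mathsf{B}^{\mathrm{T}}\mathsf{A}^{\mathrm{T}}.$$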
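Result (1.37) is easy to confirm on a small case. The sketch below reuses the 2 × 3 matrix from the transpose example; the 3 × 2 matrix `B` is a made-up illustration, and the helper names `transpose` and `matmul` are not from the text.

```python
def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product P = AB with P_ij = sum_k A_ik * B_kj."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[3, 1, 2],
     [0, 4, 1]]    # the 2 x 3 matrix of the example above
B = [[1, 0],
     [2, -1],
     [0, 5]]       # hypothetical 3 x 2 matrix

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T
print(transpose(A))  # [[3, 0], [1, 4], [2, 1]]
print(lhs == rhs)    # True
```

Note that `matmul(transpose(A), transpose(B))`, with the transposes in the original order, would give a 3 × 3 matrix here and so cannot equal the 2 × 2 matrix `lhs`.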

1.7 The complex and Hermitian conjugates of a matrix

Two further matrices that can be derived from a given general M × N matrix are the complex conjugate, denoted by $\mathsf{A}^*$, and the Hermitian conjugate, denoted by $\mathsf{A}^\dagger$. The complex conjugate of a matrix $\mathsf{A}$ is the matrix obtained by taking the complex conjugate of each of the elements of $\mathsf{A}$, i.e.

$$(\mathsf{A}^*)_{ij} = (A_{ij})^*.$$

Obviously if a matrix is real (i.e. it contains only real elements) then $\mathsf{A}^* = \mathsf{A}$.

8 Convince yourself that, even if $\mathsf{A}, \mathsf{B}, \mathsf{C}, \ldots, \mathsf{G}$ are not necessarily square matrices, but are compatible and the product $\mathsf{ABC}\cdots\mathsf{G}$ is meaningful, then their transposes are such that the product given on the RHS is also meaningful.


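Taking the Hermitian conjugate of $\mathsf{a}$, to give a row matrix, and multiplying (on the right) by $\mathsf{b}$ we obtain

$$\mathsf{a}^\dagger\mathsf{b} = (a_1^* \quad a_2^* \quad \cdots \quad a_N^*) \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{pmatrix} = \sum_{i=1}^{N} a_i^* b_i, \tag{1.40}$$

which is the expression for the inner product $\langle \mathbf{a}|\mathbf{b}\rangle$ in that basis.^9 We note that for real vectors (1.40) reduces to $\mathsf{a}^{\mathrm{T}}\mathsf{b} = \sum_{i=1}^{N} a_i b_i$.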

1.8 The trace of a matrix

For a given matrix $\mathsf{A}$, in the previous two sections we have considered various other matrices that can be derived from it. However, sometimes one wishes to derive a single number from a matrix. The simplest example is the trace (or spur) of a square matrix, which is denoted by $\operatorname{Tr}\mathsf{A}$. This quantity is defined as the sum of the diagonal elements of the matrix,

$$\operatorname{Tr}\mathsf{A} = A_{11} + A_{22} + \cdots + A_{NN} = \sum_{i=1}^{N} A_{ii}. \tag{1.41}$$

At this point, the definition may seem arbitrary, but as will be seen in this section, as well as later in the chapter, the trace of a matrix has properties that characterize the linear operator it represents, and are independent of the basis chosen for that representation. It is clear that taking the trace is a linear operation so that, for example,

$$\operatorname{Tr}(\mathsf{A} \pm \mathsf{B}) = \operatorname{Tr}\mathsf{A} \pm \operatorname{Tr}\mathsf{B}.$$

A very useful property of traces is that the trace of the product of two matrices is independent of the order of their multiplication; this result holds whether or not the matrices commute and is proved as follows:

$$\operatorname{Tr}\mathsf{AB} = \sum_{i=1}^{N} (\mathsf{AB})_{ii} = \sum_{i=1}^{N}\sum_{j=1}^{N} A_{ij}B_{ji} = \sum_{i=1}^{N}\sum_{j=1}^{N} B_{ji}A_{ij} = \sum_{j=1}^{N} (\mathsf{BA})_{jj} = \operatorname{Tr}\mathsf{BA}. \tag{1.42}$$

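The result can be extended to the product of several matrices. For example, from (1.42), we immediately find

$$\operatorname{Tr}\mathsf{ABC} = \operatorname{Tr}\mathsf{BCA} = \operatorname{Tr}\mathsf{CAB},$$

which shows that the trace of a multiple product is invariant under cyclic permutations of the matrices in the product. Other easily derived properties of the trace are, for example, $\operatorname{Tr}\mathsf{A}^{\mathrm{T}} = \operatorname{Tr}\mathsf{A}$ and $\operatorname{Tr}\mathsf{A}^\dagger = (\operatorname{Tr}\mathsf{A})^*$.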
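The order-independence (1.42) and the cyclic property can be seen on a small example. In the sketch below the three 2 × 2 integer matrices are arbitrary illustrations and the helper names `trace` and `matmul` are not from the text. The last line shows that a non-cyclic reordering generally changes the trace.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 3]]

# A and B do not commute, yet Tr AB = Tr BA.
print(matmul(A, B) == matmul(B, A))                # False
print(trace(matmul(A, B)), trace(matmul(B, A)))    # 5 5

# Cyclic permutations leave the trace of a triple product unchanged ...
print(trace(matmul(matmul(A, B), C)),
      trace(matmul(matmul(B, C), A)),
      trace(matmul(matmul(C, A), B)))              # 2 2 2

# ... but a non-cyclic reordering need not.
print(trace(matmul(matmul(A, C), B)))              # 16
```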

9 It also follows that $\mathsf{a}^\dagger\mathsf{a} = \sum_{i=1}^{N} a_i^* a_i = \sum_{i=1}^{N} |a_i|^2$ is real for any vector $\mathbf{a}$, whether or not it has complex components.


1.9 The determinant of a matrix

For a given matrix $\mathsf{A}$, the determinant $\det\mathsf{A}$ (like the trace) is a single number (or algebraic expression) that depends upon the elements of $\mathsf{A}$. Also like the trace, the determinant is defined only for square matrices. If, for example, $\mathsf{A}$ is a 3 × 3 matrix then its determinant, of order 3, is denoted by

$$\det\mathsf{A} = |\mathsf{A}| = \begin{vmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{vmatrix}, \tag{1.43}$$

i.e. the round or square brackets are replaced by vertical bars, similar to (large) modulus signs, but not to be confused with them.

In order to calculate the value of a general determinant of order n, we first define that of an order-1 determinant. We would not normally refer to an array with only one element as a matrix, but formally it is a 1 × 1 matrix, and it is useful to think of it as such for the present purposes. The determinant of such a matrix is defined to be the value of its single entry. Notice that, although when it is written in determinantal form it looks exactly like a modulus sign, $|a_{11}|$, it must not be treated as such, and, for example, a 1 × 1 matrix with a single entry −7 has determinant −7, not 7.

In order to define the determinant of an n × n matrix we will need to introduce the notions of the minor and the cofactor of an element of a matrix. We will then see that we can use the cofactors to write an order-3 determinant as the weighted sum of three order-2 determinants; these, in turn, will each be formally expanded in terms of two order-1 determinants.^10

The minor $M_{ij}$ of the element $A_{ij}$ of an N × N matrix $\mathsf{A}$ is the determinant of the (N − 1) × (N − 1) matrix obtained by removing all the elements of the ith row and jth column of $\mathsf{A}$; the associated cofactor, $C_{ij}$, is found by multiplying the minor by $(-1)^{i+j}$. The following example illustrates this.

Example Find the cofactor of the element $A_{23}$ of the matrix

$$\mathsf{A} = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}.$$

Removing all the elements of the second row and third column of $\mathsf{A}$ and forming the determinant of the remaining terms gives the minor

$$M_{23} = \begin{vmatrix} A_{11} & A_{12} \\ A_{31} & A_{32} \end{vmatrix}.$$

10 Though in practice the values of order-2 determinants are nearly always computed directly.


where the final expression gives the form in which the determinant is usually remembered and is the form that is obtained immediately by considering the Laplace expansion using the first row of the determinant. The last equality, which essentially rearranges a Laplace expansion using the second row into one using the first row, supports our assertion that the value of the determinant is unaffected by which row or column is chosen for the expansion. An alternative, but equivalent, view is contained in the next example.

Example Suppose the rows of a real $3 \times 3$ matrix A are interpreted as the components, in a given basis, of three (three-component) vectors a, b and c. Show that the determinant of A can be written as $|A| = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$.

If the rows of A are written as the components in a given basis of three vectors a, b and c, we have from (1.45) that

\[
|A| = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1(b_2 c_3 - b_3 c_2) + a_2(b_3 c_1 - b_1 c_3) + a_3(b_1 c_2 - b_2 c_1).
\]

From the general expression for a scalar triple product, it follows that we may write the determinant as

\[
|A| = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}). \tag{1.46}
\]

In other words, $|A|$ is, up to sign, the volume of the parallelepiped defined by the vectors a, b and c. One could equally well interpret the columns of the matrix A as the components of three vectors, and result (1.46) would still hold. This result provides a more memorable (and more meaningful) expression than (1.45) for the value of a $3 \times 3$ determinant. Indeed, using this geometrical interpretation, we see immediately that, if the vectors a, b and c are not linearly independent, then the value of the determinant vanishes:^11

\[
|A| = 0.
\]
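Result (1.46) is easy to confirm numerically. The sketch below is an illustrative Python check, not part of the original text; the function names and the sample vectors are assumptions made for the example. It evaluates the first-row expansion and the scalar triple product separately and compares them, and then repeats the calculation for a linearly dependent set of rows.

```python
def dot(u, v):
    """Scalar product of two three-component vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(b, c):
    """Vector product b x c."""
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]

def det3_first_row(a, b, c):
    """Order-3 determinant with rows a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# Hypothetical sample vectors, chosen only for illustration.
a, b, c = [1, 2, 3], [0, 1, 4], [5, 6, 0]
print(det3_first_row(a, b, c), dot(a, cross(b, c)))   # both equal 1

# A linearly dependent set: c_dep = 2a - b lies in the plane of a and b.
c_dep = [2 * x - y for x, y in zip(a, b)]
print(det3_first_row(a, b, c_dep))                    # 0
```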

The evaluation of determinants of order greater than 3 follows the same general method as that presented above, in that it relies on successively reducing the order of the determinant by writing it as a Laplace expansion. Thus, a determinant of order 4 is first written as a sum of four determinants of order 3, which are then evaluated using the above method. For higher-order determinants, one cannot write down directly a simple geometrical expression for $|A|$ analogous to that given in (1.46). Nevertheless, it is still true that if the rows or columns of the $N \times N$ matrix A are interpreted as the components in a given basis of $N$ ($N$-component) vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_N$, then the determinant $|A|$ vanishes if these vectors are not all linearly independent.
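The successive reduction in order described above maps naturally onto a recursive function. The following Python sketch is an illustration under stated assumptions, not the text's own procedure: the names `submatrix` and `det` and the two sample matrices are invented for the example, and the expansion is always taken along the first row. Its cost grows factorially with $N$, so it is suitable only for small matrices.

```python
def submatrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]   # order-1 determinant: the single entry, sign included
    # Each term is an element of the first row times its cofactor.
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(n))

# Hypothetical order-4 example: a sum of four order-3 determinants.
M = [[1, 0, 2, -1],
     [3, 0, 0, 5],
     [2, 1, 4, -3],
     [1, 0, 5, 0]]
print(det(M))   # 30

# Rows that are not linearly independent (row 4 = row 1 + row 2) give zero.
D = [[1, 2, 3, 4],
     [0, 1, 0, 2],
     [2, 0, 1, 1],
     [1, 3, 3, 6]]
print(det(D))   # 0
```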

1.9.1 Properties of determinants

A number of properties of determinants follow straightforwardly from the definition of det A; their use will often reduce the labor of evaluating a determinant. We present them here without specific proofs, though they all follow readily from the alternative form for a determinant, given in equation (1.138) on p. 79, and expressed in terms of the Levi–Civita symbol $\epsilon_{ijk}$ (see Problem 1.37).

11 At least one of the vectors can be expressed in terms of the other two; consequently, (i) they all lie in a plane, and (ii) the parallelepiped they define has zero volume.

(i) Determinant of the transpose. The transpose matrix $A^{\mathrm{T}}$ (which, we recall, is obtained by interchanging the rows and columns of A) has the same determinant as A itself, i.e.

\[
|A^{\mathrm{T}}| = |A|. \tag{1.47}
\]

It follows that any theorem established for the rows of A will apply to the columns as well, and vice versa.

(ii) Determinant of the complex and Hermitian conjugate. It is clear that the matrix $A^{*}$ obtained by taking the complex conjugate of each element of A has the determinant $|A^{*}| = |A|^{*}$. Combining this result with (1.47), we find that

\[
|A^{\dagger}| = |(A^{*})^{\mathrm{T}}| = |A^{*}| = |A|^{*}. \tag{1.48}
\]

(iii) Interchanging two rows or two columns. If two rows (columns) of A are interchanged, its determinant changes sign but is unaltered in magnitude.

(iv) Removing factors. If all the elements of a single row (column) of A have a common factor, $\lambda$, then this factor may be removed; the value of the determinant is given by the product of the remaining determinant and $\lambda$. Clearly this implies that if all the elements of any row (column) are zero then $|A| = 0$. It also follows that if every element of the $N \times N$ matrix A is multiplied by a constant factor $\lambda$ then

\[
|\lambda A| = \lambda^{N} |A|. \tag{1.49}
\]

(v) Identical rows or columns. If any two rows (columns) of A are identical or are multiples of one another, then it can be shown that $|A| = 0$.

(vi) Adding a constant multiple of one row (column) to another. The determinant of a matrix is unchanged in value by adding to the elements of one row (column) any fixed multiple of the elements of another row (column).

(vii) Determinant of a product. If A and B are square matrices of the same order then

\[
|AB| = |A||B| = |BA|. \tag{1.50}
\]

A simple extension of this property gives, for example,

\[
|AB \cdots G| = |A||B| \cdots |G| = |A||G| \cdots |B| = |A \cdots GB|,
\]

which shows that the determinant is invariant under permutation of the matrices in a multiple product.
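Although the properties are stated here without proof, each can be spot-checked on a small example. The following Python sketch is only an illustration, not a proof; the helper names, the matrices A, B and C, and the constant `lam` are assumptions chosen for the example. Integer entries are used so that every comparison is exact.

```python
def submatrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical test matrices.
A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
B = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
C = [[1 + 2j, 2 + 0j], [3j, 4 - 1j]]          # complex matrix for property (ii)
C_conj = [[z.conjugate() for z in row] for row in C]
N, lam = len(A), 3

checks = {
    "(i) transpose": det(transpose(A)) == det(A),
    "(ii) complex conjugate": det(C_conj) == det(C).conjugate(),
    "(ii) Hermitian conjugate": det(transpose(C_conj)) == det(C).conjugate(),
    "(iii) row interchange": det([A[1], A[0], A[2]]) == -det(A),
    "(iv) factor from one row": det([[lam * x for x in A[0]], A[1], A[2]]) == lam * det(A),
    "(iv) |lam A| = lam^N |A|": det([[lam * x for x in row] for row in A]) == lam ** N * det(A),
    "(v) identical rows": det([A[0], A[0], A[2]]) == 0,
    "(vi) add multiple of a row": det([A[0], [y + 5 * x for x, y in zip(A[0], A[1])], A[2]]) == det(A),
    "(vii) product": det(matmul(A, B)) == det(A) * det(B) == det(matmul(B, A)),
}
for name, ok in checks.items():
    print(name, ok)
```

For these matrices $|A| = -54$ and $|B| = 25$, so both $|AB|$ and $|BA|$ equal $-1350$ even though AB and BA are different matrices.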

1.9.2 Evaluation of determinants

There is no explicit procedure for using the above results in the evaluation of any given determinant, and judging the quickest route to an answer is a matter of experience. A general guide is to try to reduce all terms but one in a row or column to zero and hence in effect to obtain a determinant of smaller size. The steps taken in evaluating the determinant in the example below are certainly not the fastest, but they have been chosen in order to illustrate the use of most of the properties listed above.
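Separately from the worked example, the general guide of reducing all terms but one in a column to zero can be applied systematically, one column at a time. The Python sketch below is an assumption-laden illustration rather than the method used in the text: it uses property (vi) to clear the entries below each diagonal element, property (iii) to track the sign change whenever two rows are interchanged, and the fact that the determinant of the resulting triangular matrix is the product of its diagonal elements (which follows from repeated Laplace expansion along the first column). The function name and the sample matrix are invented for the example, and exact rational arithmetic avoids rounding errors.

```python
from fractions import Fraction

def det_by_row_reduction(m):
    """Determinant via row operations, using exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    product = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)           # nothing to pivot on: the determinant vanishes
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                 # property (iii): a row interchange flips the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Property (vi): subtracting a multiple of another row leaves |A| unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
        product *= a[col][col]
    return sign * product

# Hypothetical order-4 example.
M = [[1, 0, 2, -1],
     [3, 0, 0, 5],
     [2, 1, 4, -3],
     [1, 0, 5, 0]]
print(det_by_row_reduction(M))   # 30
```

In contrast to a full Laplace expansion, whose cost grows factorially with the order of the determinant, this reduction needs a number of arithmetic operations that grows only as the cube of the order.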