Vector Space, Lecture Notes - Mathematics - Prof Brian Stewart, Study notes of Mathematics

Fields and vector space, Quotient space, Projections, Dual Space, Dual spaces, dual transformations, eigenvalues and eigenvectors, determinants characteristic polynomial, minimum polynomial, inner product spaces

Typology: Study notes

2010/2011

Uploaded on 09/09/2011

A brief word to the reader
These are lecture notes, not a text book.
I have posted separately on the webpage a brief synopsis of the first year course on Linear
Algebra.
They are (initially) a guide to the lecturer; if I deviate significantly from the text I will
add a note on the course webpage.
The sections do not necessarily correspond to the lectures.
I’d be grateful if any corrections and suggested amendments could be sent to me by email.
I won’t keep changing the notes on the webpage; if there are any significant errors I will
mention these in my lectures.
Brian Stewart
10th August 2010

Contents

  • 1 Fields and Vector Spaces
    • 1.1 Just for the record: definition of ‘field’
    • 1.2 Some Examples
      • 1.2.1 The rational field Q
      • 1.2.2 The real field R
      • 1.2.3 The complex field C
      • 1.2.4 The field of two elements F2
      • 1.2.5 The prime field Fp
      • 1.2.6 Non-examples
    • 1.3 Definition of ‘vector space’
    • 1.4 Subspaces
      • 1.4.1 The X + Y lemma
    • 1.5 A key idea: Direct Sums
      • 1.5.1 Bases
      • 1.5.2 Another viewpoint
      • 1.5.3 Three or more subspaces
  • 2 Quotient Spaces
    • 2.1 Recap of abelian groups
      • 2.1.1 Quotients in abelian groups
      • 2.1.2 Homomorphisms of abelian groups
      • 2.1.3 The Isomorphism Theorem for abelian groups
    • 2.2 Quotients of vector spaces
    • 2.3 The Isomorphism Theorem
    • 2.4 Bases and Dimension
    • 2.5 Induced Linear Transformations
  • 3 Projections
    • 3.1 The geometric description
    • 3.2 An algebraic approach
    • 3.3 The algebraic characterisation of projections
    • 3.4 Matrices for projections
    • 3.5 The complementary projection
    • 3.6 The algebraic version
    • 3.7 Sums of idempotents
  • 4 Dual Spaces
    • 4.1 The Canonical Example
    • 4.2 Definition
    • 4.3 The dual basis
      • 4.3.1 Example
    • 4.4 Annihilators
    • 4.5 The second dual
  • 5 Dual Transformations
    • 5.1 Definition
    • 5.2 Duals and linearity
    • 5.3 Duals and composition
    • 5.4 The Matrix of a dual map
  • 6 Eigenvectors and Eigenvalues
    • 6.1 Definitions
    • 6.2 Eigenvalues of f(T)
    • 6.3 Eigenspaces
    • 6.4 The independence of two eigenspaces
    • 6.5 The independence of eigenspaces
  • 7 Determinants and the Characteristic Polynomial
    • 7.1 Determinants of matrices
    • 7.2 Determinants of linear transformations
    • 7.3 Definition of Characteristic Polynomial
    • 7.4 Characteristic Polynomials and Eigenvalues
    • 7.5 Cayley–Hamilton Theorem
  • 8 Minimum Polynomial
    • 8.1 The polynomials satisfied by T
    • 8.2 The minimum polynomial is non-zero
    • 8.3 The minimum polynomial and eigenvalues
    • 8.4 How to Work Examples
      • 8.4.1 An explicit example
    • 8.5 The minimum polynomial of the dual transformation
  • 9 Primary Decomposition
    • 9.1 Invariant subspaces
    • 9.2 T-invariant subspaces and matrices for T
    • 9.3 The Primary Decomposition Theorem
    • 9.4 Diagonalisability
    • 9.5 Triangular form
    • 9.6 Cayley–Hamilton over C
    • 9.7 The Jordan Canonical Form
  • 10 Inner Product Spaces
    • 10.1 The real case
    • 10.2 The complex case
    • 10.3 Examples
    • 10.4 The Matrix of an inner product
    • 10.5 Length
  • 11 Orthogonality
    • 11.1 Definitions
    • 11.2 The algebra of orthogonal complements
    • 11.3 Orthonormal Bases
    • 11.4 Inner Product Spaces and Duality
    • 11.5 U⊥⊥ ≠ U
  • 12 The Gram–Schmidt Process
      • 12.1.1 An example: Laguerre’s Polynomials
      • 12.1.2 An application
      • 12.1.3 A geometrical description of a group of matrices
  • 13 Inequalities
    • 13.1 Bessel’s Inequality
      • 13.1.1 An example
      • 13.1.2 Another example
    • 13.2 The Cauchy–Schwarz Inequality
      • 13.2.1 An Application
      • 13.2.2 Another Application
    • 13.3 Trigonometry
  • 14 Adjoints
    • 14.1 Dual maps and the inner product
    • 14.2 Definition of the adjoint map
      • 14.2.1 ‘Existence’
      • 14.2.2 ‘Uniqueness’
      • 14.2.3 Linearity of the adjoint map
    • 14.3 Examples
      • 14.3.1 Shift maps
      • 14.3.2 Derivatives
    • 14.4 Self-adjoint transformations
      • 14.4.1 Example: Perpendicular Projections
  • 15 Properties of Adjoints
    • 15.1 Additive and scalar properties
    • 15.2 Multiplicative properties
    • 15.3 The adjoint of the adjoint
    • 15.4 The minimal polynomial of the adjoint
      • 15.4.1 Two corollaries
    • 15.5 Kernels and Images of Adjoints
  • 16 Adjoints and their matrices
    • 16.1 Matrix representation of the adjoint
    • 16.2 The characteristic polynomial of T*
    • 16.3 Reality of the eigenvalues of a self-adjoint transformation
  • 17 The Spectral Theorem
    • 17.1 The complex case
    • 17.2 The real case
    • 17.3 An old fashioned interpretation
    • 17.4 A return to abstraction
  • 18 Application of the Spectral Theorem: Real Quadratic Forms
    • 18.1 Orthogonal classification of central quadrics
    • 18.2 Maxima and minima of quadratic functions
    • 18.3 Rank and Signature
    • 18.4 Classification of quadrics
    • 18.5 Simultaneous reduction of two forms
    • 18.6 Two inner products
  • A Automorphisms of Inner Product Spaces

1 Fields and Vector Spaces

In the first-year courses on Linear Algebra we focussed in practice on ‘real vector spaces’. Although this was a very natural place to begin, what we learned then can be extended with almost no effort to a far wider range of situations, namely the wider class of ‘vector spaces over a field K’.

What do we mean by a ‘field K’? The good news is you already have (from the introduction to analysis) a very good feeling for what a field is; and also (from the abstract algebra course) a formal definition. Operationally, a field is an algebraic system having the same ‘algebraic properties’ as the real numbers. A field will therefore share all the properties of the real number system R which can be derived from these algebraic properties: those which don’t depend on ‘order’ or ‘completeness’.

Although in the web-page notes there will be a formal definition, that’s not how to think of it. As far as this course goes we need to remember that in a field we can add, subtract, multiply, and divide (except by 0), and that the natural commutativity, associativity and distributivity conditions hold. So for now jump over the axioms, and see the examples!

1.1 Just for the record: definition of ‘field’

A field is a set K together with certain operations and special elements for which the following properties hold.

Algebraic Properties

For every pair of elements a, b ∈ K there is a unique element a + b, called their ‘sum’.
For every pair of elements a, b ∈ K there is a unique element a · b, called their ‘product’.
For every element a ∈ K there is a unique element −a, called its ‘negative’.
For every element a ∈ K, with a ≠ 0, there is a unique element 1/a, called its ‘reciprocal’.
There is a special element 0 ∈ K called the ‘zero’.
There is a special element 1 ∈ K called the ‘unit element’.

The following hold for all elements a, b, c:

A1 a + b = b + a [+ is commutative]
A2 a + (b + c) = (a + b) + c [+ is associative]
A3 a + 0 = a [zero and addition]
A4 a + (−a) = 0 [negatives and addition]
M1 a · b = b · a [· is commutative]
M2 a · (b · c) = (a · b) · c [· is associative]
M3 a · 1 = a [the unit element and multiplication]
M4 If a ≠ 0 then a · (1/a) = 1 [reciprocals and multiplication]
D a · (b + c) = a · b + a · c [· distributes over +]
Z 0 ≠ 1 [to avoid total collapse]

Notation We write

ab for a · b; a − b for a + (−b); a/b for a · (1/b); a⁻¹ for 1/a.
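The axioms can be checked mechanically for a small finite system. The following is a minimal sketch, assuming the candidate field is the integers modulo n with the usual modular addition and multiplication; the function name `is_field_mod` is invented for this illustration and does not come from the notes.

```python
from itertools import product


def is_field_mod(n):
    """Brute-force check of the field axioms for the integers modulo n."""
    if n < 2:
        return False  # axiom Z (0 != 1) fails when there is only one element
    elems = range(n)

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    for a, b, c in product(elems, repeat=3):
        if add(a, b) != add(b, a):                            # A1
            return False
        if add(a, add(b, c)) != add(add(a, b), c):            # A2
            return False
        if mul(a, b) != mul(b, a):                            # M1
            return False
        if mul(a, mul(b, c)) != mul(mul(a, b), c):            # M2
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):    # D
            return False
    for a in elems:
        if add(a, 0) != a or mul(a, 1) != a:                  # A3, M3
            return False
        if add(a, (-a) % n) != 0:                             # A4
            return False
        # M4: every non-zero element needs a reciprocal
        if a != 0 and not any(mul(a, b) == 1 for b in elems):
            return False
    return True
```

With this sketch, `is_field_mod(5)` succeeds while `is_field_mod(4)` fails at M4, because 2 has no reciprocal modulo 4.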

1.2 Some Examples

1.2.1 The rational field Q

The rational numbers Q, with the usual operations, form a field. To prove this would, of course, involve us in pages of detailed inductions. We’re not going to do that; we can just continue to accept that Q satisfies these properties.

(b)(i) a special vector 0 ∈ V called the ‘zero vector’;
(b)(ii) for every vector u, a vector −u called the ‘additive inverse’ of u;
(b)(iii) a binary operation + : V × V → V, (a, b) ↦ a + b called the ‘vector sum’;
(b)(iv) for each element λ ∈ K an operation λ· : V → V, v ↦ λ · v called ‘scalar multiplication by λ’.

The operations must satisfy the following laws for all u, v, w ∈ V and all λ, μ ∈ K.

V1 u + v = v + u;
V2 (u + v) + w = u + (v + w);
V3 u + 0 = u;
V4 u + (−u) = 0

SM1 λ · (u + v) = λ · u + λ · v;
SM2 (λ + μ) · u = λ · u + μ · u;
SM3 λ · (μ · u) = (λμ) · u;
SM4 1 · u = u

Notation We write λu for λ · u whenever it won’t confuse us.

1.4 Subspaces

We know what it means to say that ‘U is a subspace of V’, which we always denote by U ≤ V. The following is how I am always going to test for subspacehood:

Lemma (The Subspace Test). Let V be a vector space over the field K. Let U ⊆ V. Then U ≤ V provided that

(i) 0 ∈ U;

(ii) u₁, u₂ ∈ U, α₁, α₂ ∈ K ⟹ α₁u₁ + α₂u₂ ∈ U.

1.4.1 The X + Y lemma

The following result tells us how the various subspaces fit together.

Lemma (The X + Y lemma). Let V be a vector space over the field K, and let X ≤ V, Y ≤ V. Then

(i) X ∩ Y := {v ∈ V | v ∈ X and v ∈ Y} ≤ V;

(ii) X + Y := {v ∈ V | v = x + y for some x ∈ X and y ∈ Y} ≤ V;

(iii) and moreover, if V is finite-dimensional, dim(X + Y) = dim X + dim Y − dim(X ∩ Y).
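The dimension formula in (iii) can be tested numerically on a small case. The following is a minimal sketch over Q, assuming subspaces are given by lists of spanning vectors; the helper `rank` and the particular subspaces `X`, `Y` of Q³ are choices made for this illustration only.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the span of the given vectors (Gaussian elimination over Q)."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Illustrative subspaces of Q^3 (not taken from the notes).
X = [[1, 0, 0], [0, 1, 0]]      # the plane z = 0
Y = [[1, 1, 0], [0, 0, 1]]      # the plane x = y
INTERSECTION = [[1, 1, 0]]      # the line x = y, z = 0, lying in both planes

# Spanning vectors of X + Y are obtained by concatenating the two lists.
dim_sum = rank(X + Y)
```

Here `dim_sum` equals `rank(X) + rank(Y) - rank(INTERSECTION)`, as the lemma predicts. The intersection is supplied by hand rather than computed, so this only checks the formula on one example.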

1.5 A key idea: Direct Sums

‘Divide and rule’ is a key technique, and many of the results in this course can be best understood as providing effective ways of splitting large, complex problems into smaller, more tractable ones.

Definition. Let V be a vector space over a field K, and suppose X, Y ≤ V are such that

(i) V = X + Y; and

(ii) X ∩ Y = {0}.

Then we say that ‘V is the direct sum of the subspaces X and Y’ and we write V = X ⊕ Y.

Often we write ‘let V = X ⊕ Y’ to mean that V is the direct sum of the subspaces X and Y. Note that it’s clear that V = X ⊕ Y if and only if V = Y ⊕ X.

1.5.1 Bases

Suppose that V = X ⊕ Y, and that $\mathcal{X}$ is a basis of X and $\mathcal{Y}$ is a basis of Y. Then, as we saw last year, it is easy to prove that $\mathcal{X} \cup \mathcal{Y}$ is a basis of X ⊕ Y.

1.5.2 Another viewpoint

There is another way to think of direct sums which is often helpful.

Lemma. Let V be a vector space over a field K, and suppose X, Y ≤ V. Then V = X ⊕ Y if and only if every element v ∈ V can be expressed uniquely as v = x + y for some x ∈ X, y ∈ Y.

1.5.3 Three or more subspaces

How can this idea be generalised to three or more subspaces? One natural way is to write V = X ⊕ Y ⊕ Z to mean that there is a subspace U ≤ V such that V = U ⊕ Z and U = X ⊕ Y. It is then tedious but not difficult to check such things as (X ⊕ Y) ⊕ Z = X ⊕ (Y ⊕ Z). In practice it is usually easier to adopt the alternative viewpoint and to use this definition:

Definition. Let V be a vector space over a field K, and let $X_i$, for $i = 1, \dots, r$, be subspaces of V. We write $V = \bigoplus_{i=1}^{r} X_i$ to mean that every vector $v \in V$ can be expressed uniquely as $v = \sum_{i=1}^{r} x_i$, for some $x_i \in X_i$, $i = 1, \dots, r$.

2.2 Quotients of vector spaces

Let V be a vector space and U a subspace of V. Then, ignoring the scalar multiplication, V is an abelian group and U a subgroup. So we can form the (group) quotient V/U, whose elements are the cosets $\bar v := v + U$. You will not be surprised to hear that we can make V/U into a vector space.

First we define a scalar multiplication. For each scalar λ we define $\lambda \cdot \bar v := \overline{\lambda \cdot v}$. We check that this is unambiguous: if $\bar v'$ is another name for $\bar v$ then it must be that $v - v' \in U$; as U is a subspace and not just a subgroup we have that $\lambda \cdot (v - v') \in U$, so that $\lambda \cdot v - \lambda \cdot v' \in U$, as required to ensure $\overline{\lambda \cdot v} = \overline{\lambda \cdot v'}$.

Next we must check that V/U with this scalar multiplication satisfies the axioms SM1–SM4 of the definition. This is trivial. So now we have a quotient space V/U.

2.3 The Isomorphism Theorem

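Let α : V → W be a linear transformation. Then we know that the kernel ker α and image im α are subspaces. In particular we can form the quotient space V / ker α. The Isomorphism Theorem for abelian groups lets us write V / ker α ≃ im α as groups, via the map $\bar\alpha : \bar v \mapsto \alpha(v)$. It is very easy to check that the isomorphism $\bar\alpha$ is actually a vector space homomorphism (linear transformation):

$$\bar\alpha(\lambda_1 \cdot \bar v_1 + \lambda_2 \cdot \bar v_2) = \bar\alpha(\overline{\lambda_1 \cdot v_1 + \lambda_2 \cdot v_2}) = \alpha(\lambda_1 \cdot v_1 + \lambda_2 \cdot v_2) = \lambda_1 \cdot \alpha(v_1) + \lambda_2 \cdot \alpha(v_2) = \lambda_1 \cdot \bar\alpha(\bar v_1) + \lambda_2 \cdot \bar\alpha(\bar v_2)$$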

where the reasons for the equalities are to be supplied. So we have the Isomorphism Theorem for Vector Spaces:

V / ker α ≃ α(V ).

2.4 Bases and Dimension

Let V be a vector space of finite dimension, and let U be a subspace. Pick a basis $e_1, e_2, \dots, e_k$ of U and then extend this linearly independent set to a basis $e_1, e_2, \dots, e_k, \dots, e_n$ of V.

Note that $\{\bar e_{k+1}, \dots, \bar e_n\}$ spans V/U: for any $v \in V$ we have that $v = \sum_{j=1}^{n} a_j e_j$, so that $\bar v = \sum_{j=k+1}^{n} a_j \bar e_j$, as $\bar e_1 = \bar e_2 = \dots = \bar e_k = \bar 0$ in V/U.

Moreover $\{\bar e_{k+1}, \dots, \bar e_n\}$ is linearly independent in V/U. For suppose that $\sum_{j=k+1}^{n} a_j \bar e_j = \bar 0$; then $\overline{\sum_{j=k+1}^{n} a_j e_j} = \bar 0$, and so $\sum_{j=k+1}^{n} a_j e_j \in U$ and so is expressible in terms of $e_1, \dots, e_k$. So we have $\sum_{j=1}^{n} a_j e_j = 0$ for certain $a_1, \dots, a_k$. As the $e_j$ form a basis all the $a_j$ are zero.

In summary, dim(V/U) = dim V − dim U.

Suppose we apply this result to the Isomorphism Theorem for Vector Spaces: we get

dim V − dim ker α = dim im α,

or the Rank–Nullity Theorem.

Of course the Isomorphism Theorem is stronger: it gives a result even when the spaces are not finite dimensional.

2.5 Induced Linear Transformations

Again let V be a vector space, and let U be a subspace of V. Suppose that T : V → V is a linear transformation.

Consider the quotient space V/U. Can we force T to be a linear map on this? Minimally we will need to have $\bar T(\bar 0) = \bar 0$. So let us assume that T(U) ≤ U; later we will call such a subspace ‘T-invariant’. Now define $\bar T : V/U \to V/U$ by $\bar T(\bar v) = \overline{T(v)}$. Is this unambiguous? That is, if $\bar v = \bar v'$ have we made sure that $\overline{T(v)} = \overline{T(v')}$? The answer is yes; this needs checked. Have we defined a linear map? The answer is yes; this needs checked.

We end by describing the matrices of T and $\bar T$ with respect to certain bases. As before, choose $e_1, \dots, e_k$ as a basis of U, and extend this to a basis $e_1, \dots, e_n$ of V. As T(U) ≤ U we will have that $T(e_j) = \sum_{i=1}^{k} a_{ij} e_i$ for $j = 1, \dots, k$, so that in the matrix for T the entries in the first k columns are zero below the k-th row; the matrix of T is

$$\begin{bmatrix} A & B \\ O & D \end{bmatrix}.$$

The top left k × k block is easily identified; it is just the matrix of the transformation S : U → U given by S(x) = T(x). The bottom right block is also identifiable. We have, for any j = k + s, that

$$T(e_{k+s}) = \sum_i b_{is} e_i + \sum_i d_{is} e_{k+i}$$

which in the quotient V/U gives

$$\bar T(\bar e_{k+s}) = \sum_i d_{is} \bar e_{k+i}.$$

That is, D is the matrix of the induced $\bar T$ with respect to the ‘obvious’ basis.
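The block structure can be read off a concrete matrix. The following is a minimal sketch, assuming T is given by its matrix with respect to a basis whose first k vectors span U (columns are images of basis vectors); the helper names and the 3 × 3 matrix are hypothetical, chosen only to illustrate the shape.

```python
def split_blocks(matrix, k):
    """Split a square matrix (list of rows) into blocks A, B, C, D at index k."""
    a = [row[:k] for row in matrix[:k]]   # top left, k x k
    b = [row[k:] for row in matrix[:k]]   # top right
    c = [row[:k] for row in matrix[k:]]   # bottom left
    d = [row[k:] for row in matrix[k:]]   # bottom right
    return a, b, c, d


def is_invariant(matrix, k):
    """True when the span of the first k basis vectors is T-invariant,
    i.e. when the bottom left block is zero."""
    _, _, c, _ = split_blocks(matrix, k)
    return all(entry == 0 for row in c for entry in row)


# Hypothetical T on Q^3 with U = span{e_1}, so k = 1.
T = [[2, 1, 0],
     [0, 3, 1],
     [0, 0, 4]]
A, B, C, D = split_blocks(T, 1)
```

In this example `A` is the 1 × 1 matrix of the restriction of T to U, and `D` is the 2 × 2 matrix of the induced map on the quotient with respect to the cosets of e₂ and e₃.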

3 Projections

If we have studied mechanics we know that often we can get all the useful information by looking at the problem in the right way, and ‘resolving’ in various directions. That is, we project on to one of the axes, throwing away what is currently irrelevant. Map makers do the same sort of thing; rather than carry around a solid model of Oxford or wherever we project on to a plane, and neglect, for the moment, our height above sea level. These are just small examples of an important general technique used in almost every part of mathematics. This section is devoted to describing the technique in general, and in an economical way.

For the rest of the section let V be a vector space over the field K.

3.1 The geometric description

Suppose that V = X ⊕ Y. Then every vector v ∈ V can be expressed uniquely as v = x + y, where x ∈ X and y ∈ Y. We can then define a map P(X,Y) : V → V by P(X,Y)(v) = x. We call this map ‘the projection of V along Y on to X’. What properties does such a map enjoy? Firstly,

3.1.1. P(X,Y ) is linear.

It is trivial to check this; we don’t intend to write up every such trivial check, but do this one as an example. Let v₁, v₂ ∈ V and λ₁, λ₂ ∈ K. For unique x₁, x₂ ∈ X and y₁, y₂ ∈ Y we have v₁ = x₁ + y₁ and v₂ = x₂ + y₂. Hence λ₁v₁ + λ₂v₂ = (after some manipulation) (λ₁x₁ + λ₂x₂) + (λ₁y₁ + λ₂y₂). As X, Y are subspaces, (λ₁x₁ + λ₂x₂) ∈ X and (λ₁y₁ + λ₂y₂) ∈ Y. Hence

P(X,Y)(λ₁v₁ + λ₂v₂) = λ₁x₁ + λ₂x₂ = λ₁P(X,Y)(v₁) + λ₂P(X,Y)(v₂)

as required.

It’s also clear that repeating the process does nothing more:

3.1.2. P(X,Y)² = P(X,Y).

Write v = x + y as usual², then P(X,Y)²(v) = P(X,Y)(x) = P(X,Y)(x + 0) = x = P(X,Y)(v). We also, as we should for every linear transformation, compute:

3.1.3. (i) ker P(X,Y ) = Y ; (ii) im P(X,Y ) = X.

These are immediate: v = x + y is in the kernel if and only if y = 0; the image is clearly a subset of X, and as x = P(X,Y )(x + 0) all of X is in the image.
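A small computation makes the geometric description concrete. The following is a minimal sketch in Q², assuming X and Y are lines spanned by single vectors; the function `project_along` and the sample vectors are illustrative choices, not part of the notes.

```python
from fractions import Fraction


def project_along(x_dir, y_dir, v):
    """Projection of v along Y = span{y_dir} on to X = span{x_dir} in Q^2.

    Solves v = s * x_dir + t * y_dir by Cramer's rule and returns s * x_dir.
    """
    (x1, x2), (y1, y2), (v1, v2) = x_dir, y_dir, v
    det = Fraction(x1 * y2 - x2 * y1)
    if det == 0:
        raise ValueError("X and Y do not give a direct sum decomposition")
    s = (v1 * y2 - v2 * y1) / det  # coefficient of x_dir
    return (s * x1, s * x2)


# X is the horizontal axis, Y is the diagonal line through (1, 1).
x_dir, y_dir = (1, 0), (1, 1)
p = project_along(x_dir, y_dir, (3, 2))   # (3, 2) = (1, 0) + 2 * (1, 1)
```

Applying `project_along` again to `p` returns `p` unchanged, which is 3.1.2 in this example, and any multiple of `y_dir` is sent to the zero vector, in line with 3.1.3(i).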

3.2 An algebraic approach

Suppose now that we have any linear transformation E : V → V , which satisfies E^2 = E. Such a linear transformation is called ‘an idempotent linear transformation’. We want to find X and Y such that E = P(X,Y ). What are likely guesses for these? Well X has got to be the image of P(X,Y )—when we find this projection!—so let’s define X := im E. Similarly the only sensible choice is Y := ker E. First we show that V is a direct sum of the right sort.

3.2.1. V = im E ⊕ ker E.

² When I say this, I mean that x ∈ X and y ∈ Y. It is usually safer to spell that out if you suspect that your readers/examiners are ultra-careful.

Suppose z ∈ im E ∩ ker E. Then z = E(v) for some v ∈ V , and E(z) = 0. But E^2 = E, so we can put these together and get

z = E(v) = E^2(v) = E(E(v)) = E(z) = 0

so that im E ∩ ker E = {0}. Moreover, if v ∈ V, then E(v − E(v)) = E(v) − E^2(v) = 0; that is, v − E(v) ∈ ker E. As E(v) is patently in im E that gives us v ∈ im E + ker E, or V = im E + ker E. We have checked the two requirements to ensure that V = im E ⊕ ker E. Next we show that E is the correct projection.

3.2.2. E = P(im E,ker E).

Consider v ∈ V, and as above, write v = E(v) + (v − E(v)). We have just seen that this decomposes v into a (unique) sum of vectors E(v) ∈ im E and v − E(v) ∈ ker E. So by definition of P(im E,ker E) we have that P(im E,ker E)(v) = E(v) as we require.
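The decomposition v = E(v) + (v − E(v)) is easy to trace on a matrix. The following is a minimal sketch, assuming E is given as a square matrix acting on column vectors; the particular matrix `E` and the helper names are hypothetical.

```python
def mat_vec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split(e, v):
    """Write v = x + y with x = E(v) in im E and y = v - E(v) in ker E."""
    x = mat_vec(e, v)
    y = [vi - xi for vi, xi in zip(v, x)]
    return x, y


# A hypothetical idempotent on Q^2: E(a, b) = (a - b, 0).
E = [[1, -1],
     [0, 0]]
x, y = split(E, [3, 2])
```

For this `E` the pieces are `x = [1, 0]` and `y = [2, 2]`; applying `E` to `y` gives the zero vector and applying it to `x` gives `x` back, as 3.2.1 and 3.2.2 require.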

3.3 The algebraic characterisation of projections

The detailed results of the previous two subsections can be summarised so:

Theorem (Algebraic characterisation of projections). The set of idempotent linear transformations on V coincides with the set of projections on V.

3.4 Matrices for projections

Suppose that E is a projection on V , so that E^2 = E. How does E act on the vectors in im E and in ker E?

3.4.1. For every v ∈ im E we have that E(v) = v.

Such a v can be written v = E(w) and then E(v) = E(E(w)) = E^2 (w) = E(w) = v.

3.4.2. For every v ∈ ker E we have that E(v) = 0.

This is just the definition of the kernel. In the finite dimensional case we can then say something about matrices:

Theorem. Let V be an n-dimensional vector space over K, and suppose that E is an idempotent linear transformation on V (or equivalently, a projection). Then there exists a basis A of V with respect to which the matrix of E is given by

$$M_{A}^{A}(E) = \begin{bmatrix} I_r & O_{r,n-r} \\ O_{n-r,r} & O_{n-r,n-r} \end{bmatrix}$$

where r = rk E.

Here $O_{p,q}$ denotes a p × q block of 0’s. All we need to do is take the union of any basis of im E and any basis of ker E; this will be a basis of their direct sum. We’ve just proved that E acts on the elements of this basis in the way the theorem asserts.

From the above our condition is certainly sufficient. To establish necessity, note that E₁E₂ + E₂E₁ = 0 implies E₁(E₁E₂ + E₂E₁) = E₁0 = 0, and also that (E₁E₂ + E₂E₁)E₁ = 0E₁ = 0. Hence we get E₁E₂ = −E₁E₂E₁ = E₂E₁, so that we have 2E₁E₂ = 0. As 2 = 1 + 1 ≠ 0 we have an inverse for 2 in K and can deduce that E₁E₂ = 0. It follows at once that E₂E₁ = 0.

For completeness we compute the kernel and image of the sum.

3.7.3. Suppose that in K it is not the case that 1 + 1 = 0. Let E₁, E₂, and E₁ + E₂ be idempotents. Then (i) ker(E₁ + E₂) = ker E₁ ∩ ker E₂ and (ii) im(E₁ + E₂) = im E₁ + im E₂.

For (i) let z ∈ ker E₁ ∩ ker E₂; then E₁(z) = E₂(z) = 0. Then (E₁ + E₂)(z) = 0 + 0 = 0. For the other direction, let z ∈ ker(E₁ + E₂); then (E₁ + E₂)(z) = 0. Hence

0 = E₁(0) = E₁((E₁ + E₂)(z)) = E₁²(z) + E₁E₂(z) = E₁(z) + 0 = E₁(z)

so that z ∈ ker E₁ as required; similarly z ∈ ker E₂. For (ii) let w ∈ im(E₁ + E₂); then w = (E₁ + E₂)(v) for some v. That can be expressed as w = E₁(v) + E₂(v) ∈ im E₁ + im E₂ so one inclusion is done. Conversely suppose that w ∈ im E₁ + im E₂; then w = E₁(v₁) + E₂(v₂). (Thought: gulp! but hang on,... the elements in the image are images of themselves under the projection. So let’s see... ). Now

(E₁ + E₂)(w) = E₁E₁(v₁) + E₂E₁(v₁) + E₁E₂(v₂) + E₂E₂(v₂) = E₁(v₁) + 0 + 0 + E₂(v₂) = w

so that w ∈ im (E 1 + E 2 ) and we are done.
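These statements about sums of idempotents can be checked on explicit matrices. The following is a minimal sketch over Q, where 1 + 1 ≠ 0 holds, assuming the criterion being proved is that E₁ + E₂ is idempotent exactly when E₁E₂ = E₂E₁ = 0; the matrices `E1`, `E2`, `P` and the helper names are invented for the illustration.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_zero(m):
    return all(x == 0 for row in m for x in row)


def is_idempotent(m):
    return mat_mul(m, m) == m


# Two hypothetical idempotents on Q^3 whose products vanish in both orders.
E1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
E2 = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
S = mat_add(E1, E2)

# A pair that fails: P + P = 2P is not idempotent, and P * P = P is not zero.
P = [[1, 0], [0, 0]]
```

Here `S` is idempotent, and the column vector (0, 1, −1) is killed by `E1`, by `E2` and by `S`, which is consistent with 3.7.3(i).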

Note. The results of this section can be generalised to the sum of r idempotents; and if we exploit also the symmetry between E and I − E we can say things about differences of idempotents as well.

4 Dual Spaces

What we are to do in this section is motivated by our experience of ‘finding the coordinates’ of a point (vector). What is going on abstractly? When we find in R³ the ‘coordinate in the x-direction’ what we find is a map R³ → R, and not just any old map, but a linear map: the first coordinate of (a + b) is the sum of the first coordinates of a and b, and so on. So we are going to study the linear maps from a vector space to the base field. But before we even start let’s look at an example.

4.1 The Canonical Example

Let V = Rⁿ be the vector space of column n-tuples of real numbers. What are the linear maps Rⁿ → R? We already know all about these: they are given by 1 × n matrices, that is by row n-tuples. Each row n-tuple gives us a linear map so

$$(a_1, a_2, \dots, a_n) : \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \mapsto a_1 x_1 + a_2 x_2 + \dots + a_n x_n.$$

Already there is a lot to think about here: we get another vector space, it has the same dimension, and if only we could pin it down there is clearly some sort of reciprocity—the x’s are defining linear maps on the a’s as well.
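The canonical example translates directly into code. The following is a minimal sketch, assuming vectors and row tuples are both stored as Python tuples of numbers; the name `functional` is introduced here for illustration.

```python
def functional(a):
    """Return the linear map x |-> a_1 x_1 + ... + a_n x_n defined by the row n-tuple a."""
    return lambda x: sum(ai * xi for ai, xi in zip(a, x))


a = (1, 2, 3)
f = functional(a)
x = (4, 0, -1)
y = (1, 1, 1)

# Linearity on one sample: f(2x + 3y) compared with 2 f(x) + 3 f(y).
combo = tuple(2 * xi + 3 * yi for xi, yi in zip(x, y))
```

The reciprocity mentioned above shows up as a symmetry: `functional(a)(x)` and `functional(x)(a)` are the same number.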

4.2 Definition

Let V be a vector space over the field K. The dual space V′ is the vector space over K given as follows:

(a) the set V′ is {f : V → K | f is linear};

(b)(i) the ‘zero vector’ is the map v ↦ 0;
(b)(ii) the ‘additive inverse’ of f is the map v ↦ −f(v);
(b)(iii) the ‘vector sum’ of f and g is the map v ↦ f(v) + g(v);
(b)(iv) the ‘scalar multiplication of f by λ ∈ K’ is the map v ↦ λ · f(v).

First, note that 0, −f, f + g, and λ · f are all members of V′. Now, with these operations, is V′ truly a vector space? One way to answer this is to check the axioms. This is tedious, and we will do only one of them.

Lemma (Axiom SM2 for V ′). Let λ, μ ∈ K, and let f ∈ V ′. Then (λ + μ) · f = λ · f + μ · f.

To prove this recall that maps are equal if and only if they yield the same values. So let v ∈ V. Then

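$$\begin{aligned}
\mathrm{LHS}(v) &= ((\lambda + \mu) \cdot f)(v) \\
&= (\lambda + \mu) \cdot f(v) && \text{by defn (b)(iv)} \\
&= \lambda \cdot f(v) + \mu \cdot f(v) && \text{distributivity in } K \text{ (axioms D and M1)} \\
&= (\lambda \cdot f)(v) + (\mu \cdot f)(v) && \text{by defn (b)(iv) twice} \\
&= ((\lambda \cdot f) + (\mu \cdot f))(v) && \text{by defn (b)(iii)} \\
&= \mathrm{RHS}(v).
\end{aligned}$$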

Another, cheaper, way to show we have a vector space is to use one of the examples³ from the first year: the set of all maps φ : V → K with the same addition and scalar multiplication

³ See Dr Henke’s notes, example 3.6. Although ostensibly this only deals with the case K = R it is trivial to see that the more general result holds.

4.3.1 Example

Let V := Rⁿ be the vector space of real column n-tuples.

Take for E the usual basis $e_1 := (1, 0, \dots, 0)^T$, $e_2 := (0, 1, \dots, 0)^T$, ..., $e_n := (0, 0, \dots, 1)^T$.

It is easy to check that the dual basis is given by e′₁ = (1, 0, ..., 0); e′₂ = (0, 1, ..., 0); ...; e′ₙ = (0, 0, ..., 1).

On the other hand, the basis dual to the basis $F := \{(1, 0)^T, (1, 1)^T\}$ of R² is

F′ := {(1, −1), (0, 1)}.

Note. There’s an implicit warning here that we should not be misled by notation: e′₁ depends not just on e₁ but on the whole basis E.
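Dual bases in Kⁿ can be computed by inverting a matrix: if the basis vectors are the columns of M, the dual basis consists of the rows of M⁻¹, because row i of M⁻¹ times column j of M is 1 when i = j and 0 otherwise. The following is a minimal sketch over Q using Gauss–Jordan elimination; the function `dual_basis` is a name chosen here, and the sample basis (1, 0)ᵀ, (1, 1)ᵀ is an assumption, picked because it is the basis of R² whose dual is the F′ quoted above.

```python
from fractions import Fraction


def dual_basis(basis):
    """Return the dual basis, as row tuples, of a basis of Q^n.

    Column j of M is basis[j]; the result is the list of rows of the inverse of M.
    """
    n = len(basis)
    # Augmented matrix [M | I].
    aug = [[Fraction(basis[j][i]) for j in range(n)] +
           [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        p = next((i for i in range(c, n) if aug[i][c] != 0), None)
        if p is None:
            raise ValueError("the given vectors do not form a basis")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]          # scale pivot row to 1
        for i in range(n):
            if i != c and aug[i][c] != 0:           # clear the rest of the column
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]


F = [(1, 0), (1, 1)]
F_dual = dual_basis(F)
```

Changing only the second basis vector changes the first dual vector as well: the dual of {(1, 0)ᵀ, (0, 1)ᵀ} starts with (1, 0), while `F_dual` starts with (1, −1), which illustrates the warning in the Note.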

4.4 Annihilators

Throughout this subsection we let V be a vector space over a field K.

Definition. The annihilator of a subspace X of V, written X◦, is defined to be the set {f ∈ V′ | f(x) = 0 for all x ∈ X}.

Note. Again a word of warning; we are working in a fixed space V, so that X◦ depends not just on X but also on this ‘universal’ space V. Just occasionally we work in a situation when X ≤ V₁ ≤ V₂; when we speak of the annihilator of X we may have to make clear whether we mean in V₁′ or V₂′; the notation doesn’t help us.

4.4.1. Let X ≤ V. Then X◦ ≤ V′.

We use the subspace test and the fact that (λf + μg)(x) = λf(x) + μg(x) = 0 whenever f(x) = g(x) = 0.

4.4.2. {0}◦ = V′ and V◦ = {0}.

4.4.3. Let X₁ ≤ X₂ ≤ V. Then X₂◦ ≤ X₁◦.

Clearly if f ‘kills’ every vector in X₂ it kills every vector in the smaller subspace X₁.

4.4.4. Let X₁, X₂ ≤ V. Then (X₁ + X₂)◦ = X₁◦ ∩ X₂◦.

As X₁, X₂ ≤ (X₁ + X₂) we can use the previous result to get (X₁ + X₂)◦ ≤ X₁◦ ∩ X₂◦. If contrariwise f ∈ X₁◦ ∩ X₂◦, then f kills every element of X₁ and every element of X₂; by linearity it kills every element of X₁ + X₂.

4.4.5. Let V be finite-dimensional. Let X₁, X₂ ≤ V. Then (X₁ ∩ X₂)◦ = X₁◦ + X₂◦.

This is an exercise, best left until we have dealt with “double duals”.

4.4.6. Let X ≤ V, and suppose that $\{e_1, \dots, e_r\}$ is a basis of X, with $\{e_1, \dots, e_n\}$ a basis of V. Then $\{e'_{r+1}, \dots, e'_n\}$ is a basis of X◦.

Let f ∈ X◦. Then we can express it in terms of the dual basis, and get $f = \sum_{j=1}^{n} \beta_j e'_j$. Then for $i = 1, \dots, r$ we have that $0 = f(e_i) = \sum_{j=1}^{n} \beta_j e'_j(e_i) = \beta_i$, so that $\{e'_{r+1}, \dots, e'_n\}$ spans X◦; the linear independence of this subset of the dual basis is clear.

4.4.7. If V is of finite dimension, and X ≤ V, then dim V = dim X + dim X◦.

We can in this case find a basis of X, extend to a basis of V and use the previous result.
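In Qⁿ, with functionals written as row n-tuples, the annihilator of X is the set of rows a with a · x = 0 for every x in a spanning set of X, so it can be found as a null space. The following is a minimal sketch under that assumption; `null_space` and the sample subspace are illustrative and not taken from the notes.

```python
from fractions import Fraction


def null_space(rows, n):
    """Basis of {a in Q^n : r . a = 0 for every row r}, via reduced row echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots = []          # pivot column of each reduced row, in order
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -m[row_idx][free]   # solve for each pivot variable
        basis.append(vec)
    return basis


# X = span{(1, 1, 0)} inside Q^3; its annihilator is {a : a_1 + a_2 = 0}.
X_SPAN = [[1, 1, 0]]
ANNIHILATOR = null_space(X_SPAN, 3)
```

For this X the computed annihilator has two basis vectors, so dim X + dim X◦ = 1 + 2 = 3 = dim V, in agreement with 4.4.7.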

4.5 The second dual

We have proved that when V is n-dimensional over K then its dual V′ is another vector space of the same dimension. But aren’t all vector spaces of dimension n ‘isomorphic’? Aren’t they all just the same thing, essentially the space of n-tuples? So isn’t the dual space just like the original vector space?

This is indeed true, but only up to a point. What we have seen about annihilators suggests that there’s more going on than that: the subspace structure seems to be getting inverted. Our canonical example is a better guide. The distinction between column and row vectors matters; but (at least in finite dimension) the original space is the dual of its dual.

Throughout this section let V be a vector space over a field K.

4.5.1. For each $v \in V$ define $\theta_v : V' \to K$ by $\theta_v(f) = f(v)$. Then $\theta_v \in V''$.

Let $f_1, f_2 \in V'$ and $\beta_1, \beta_2 \in K$. Then for any $v \in V$

$$\theta_v(\beta_1 f_1 + \beta_2 f_2) = (\beta_1 f_1 + \beta_2 f_2)(v) = \beta_1 f_1(v) + \beta_2 f_2(v) = \beta_1 \theta_v(f_1) + \beta_2 \theta_v(f_2)$$

as required. (Exercise: justify the = signs.)

4.5.2. The mapping $\theta : V \to V''$ given by $\theta : v \mapsto \theta_v$ is linear.

Let $v_1, v_2 \in V$ and $\alpha_1, \alpha_2 \in K$. Then for any $f \in V'$

$$\theta_{\alpha_1 v_1 + \alpha_2 v_2}(f) = f(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 f(v_1) + \alpha_2 f(v_2) = \alpha_1 \theta_{v_1}(f) + \alpha_2 \theta_{v_2}(f)$$

as required.

4.5.3. Let V be of finite dimension, and suppose that f(v) = 0 for all f ∈ V′; then v = 0.

If v ≠ 0 then we can choose a basis {e₁ = v, e₂, ..., eₙ}, and then e′₁(v) = 1 ≠ 0. (The hypothesis that V is finite-dimensional is not necessary, but in the general case we have not proved the essential fact that we need, that every LI set extends to a basis.)

4.5.4. If V is finite dimensional, then the mapping $\theta : V \to V''$ given by $\theta : v \mapsto \theta_v$ is injective.

The kernel (or null space) consists of the vectors v for which f(v) = 0 for every f ∈ V′. Hence v = 0.

4.5.5. If V is finite dimensional, then the mapping $\theta : V \to V''$ given by $\theta : v \mapsto \theta_v$ is surjective.

First we have that dim V′′ = dim V′ = dim V. By the previous result the nullity n(θ) = 0, so by the Rank–Nullity Theorem, rk θ = dim V. Putting these together, rk θ = dim V′′ and so θ is onto. (The hypothesis that V is finite-dimensional is essential, although we don’t prove that.) All this can be summarised in this result:

Theorem (Natural Isomorphism of V and V′′). Let V be a finite dimensional vector space over a field K. Then there is a natural isomorphism between V and V′′.

The word ‘natural’ in these circumstances means that we can define it—as indeed we did for θ—without reference to any particular basis.

From now on, whenever V is finite-dimensional, we identify V′′ with V by means of θ: that is, we treat v ∈ V as being both a vector and also the linear map f ↦ f(v) defined on V′. In particular, if H ≤ V′ we can investigate H◦ := {v ∈ V | f(v) = 0 for all f ∈ H}.
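The evaluation map is short enough to write down directly. The following is a minimal sketch, assuming functionals on Qⁿ are modelled as Python callables taking a tuple; the name `theta` mirrors θ above, while the sample functionals `f` and `g` are invented for the illustration.

```python
def theta(v):
    """Return theta_v, the element of V'' that evaluates a functional at v."""
    return lambda f: f(v)


# Two sample functionals on Q^2, standing in for elements of V'.
def f(x):
    return 2 * x[0] - x[1]


def g(x):
    return x[0] + 5 * x[1]


v = (3, 1)
theta_v = theta(v)


def combination(x):
    """The functional 2f + 3g, built pointwise as in definition (b)(iii)-(iv)."""
    return 2 * f(x) + 3 * g(x)
```

Evaluating `theta_v(combination)` gives the same number as `2 * theta_v(f) + 3 * theta_v(g)`, which is the linearity shown in 4.5.1. No basis is chosen anywhere in `theta`, which is the sense of ‘natural’ used above.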