
















This document covers the concepts of commutative semigroups, groups, and vector spaces. It explains the definitions of semigroups, commutative semigroups, binary operations, identities, invertible elements, and the existence of inverses for each element in a group. Additionally, it discusses vector spaces, unitary s-modules, and linear combinations of vectors. The document also includes a proof of the isomorphism between a finite-dimensional vector space and the vector space of ordered n-tuples of scalars.

















This chapter briefly presents the algebraic structure of vector spaces starting with the rudimentary concepts of groups and fields. This chapter should be read along with Chapter 4 Part A of Naylor & Sell. Specifically, both solved examples and exercises in Naylor & Sell are very useful.
Let S be a nonempty set. With $m \in \mathbb{N}$, an m-ary operation ⊛ is a function ⊛ : $S^m \to S$, i.e., a mapping of $\underbrace{S \times S \times \cdots \times S}_{m \text{ times}}$ into S. The most commonly encountered operations are binary, i.e., m = 2, where ⊛ : S × S → S. This section focuses on binary algebras, which are algebraic systems with a single binary operation. We first move from a primitive algebra called a semigroup to a monoid, and then to a highly structured algebraic system called a group. Concepts of groups are extensively used in Physics and Engineering.
Definition 1.1. (Binary Algebra) Let S be a nonempty set. The set S together with a binary operation ⊛ : S × S → S is called a binary algebra and is denoted as (S; ⊛).
Definition 1.2. (Semigroup) A binary algebra (S; ⊛) is called a semigroup if it satisfies the associativity property, i.e., (α ⊛ β) ⊛ γ = α ⊛ (β ⊛ γ) ∀α, β, γ ∈ S. A semigroup is called commutative if α ⊛ β = β ⊛ α ∀α, β ∈ S.
Definition 1.3. (Identity Element) An element $1_\ell \in S$ is said to be a left identity of the binary algebra (S; ⊛) if $1_\ell ⊛ \alpha = \alpha$ ∀α ∈ S. Similarly, an element $1_r \in S$ is said to be a right identity of the binary algebra (S; ⊛) if $\alpha ⊛ 1_r = \alpha$ ∀α ∈ S. Finally, an element 1 ∈ S is said to be an identity of the binary algebra (S; ⊛) if 1 ⊛ α = α ⊛ 1 = α ∀α ∈ S.
Definition 1.4. (Zero Element) An element $0_\ell \in S$ is said to be a left zero of the binary algebra (S; ⊛) if $0_\ell ⊛ \alpha = 0_\ell$ ∀α ∈ S. Similarly, an element $0_r \in S$ is said to be a right zero of the binary algebra (S; ⊛) if $\alpha ⊛ 0_r = 0_r$ ∀α ∈ S. Finally, an element 0 ∈ S is said to be a zero of the binary algebra (S; ⊛) if 0 ⊛ α = α ⊛ 0 = 0 ∀α ∈ S.
Definition 1.5. (Monoid) A semigroup with an identity element is called a monoid. That is, a monoid is a binary algebra that satisfies the associativity property and has an identity element. A monoid is called commutative if α⊛β = β⊛α ∀α, β ∈ S.
Definition 1.6. (Invertible Element) Let (S; ⊛) be a monoid with an identity element 1. Then, an element α ∈ S is called left invertible if there exists $\alpha_\ell^{-1} \in S$ such that $\alpha_\ell^{-1} ⊛ \alpha = 1$. Similarly, an element α ∈ S is called right invertible if there exists $\alpha_r^{-1} \in S$ such that $\alpha ⊛ \alpha_r^{-1} = 1$. An element α ∈ S is called invertible if there exists $\alpha^{-1} \in S$ such that $\alpha^{-1} ⊛ \alpha = \alpha ⊛ \alpha^{-1} = 1$.
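Definition 1.7. (Group) A monoid whose every element is invertible is called a group. That is, a binary algebra (S; ⊛) is called a group if the following properties are satisfied: (i) associativity, i.e., (α ⊛ β) ⊛ γ = α ⊛ (β ⊛ γ) ∀α, β, γ ∈ S; (ii) existence of an identity, i.e., there exists 1 ∈ S such that 1 ⊛ α = α ⊛ 1 = α ∀α ∈ S; and (iii) existence of inverses, i.e., for each α ∈ S, there exists $\alpha^{-1} \in S$ such that $\alpha^{-1} ⊛ \alpha = \alpha ⊛ \alpha^{-1} = 1$.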
Definition 1.8. (Abelian Group) A group (S; ⊛) is called commutative (also called Abelian) if the following additional property: α ⊛ β = β ⊛ α ∀α, β ∈ S is satisfied. It is customary to denote the binary operator ⊛ as + for commutative groups. That is, (S, +) indicates an Abelian group. The inverse of an element α is denoted as −α and the identity as 0.
Example 1.1. Let S = R ≜ (−∞, ∞) and let + be the usual addition operation. Then, (S, +) is an Abelian group whose identity element is 0.
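The definitions of semigroup, monoid, group, and Abelian group can be checked mechanically on a finite set. The following is a minimal sketch, not part of the original notes: the helper names `is_group` and `is_abelian` are hypothetical, and the integers modulo 5 are used as a finite stand-in for the real line of Example 1.1, which cannot be enumerated.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the group axioms for a finite set under op."""
    elems = list(elements)
    s = set(elems)
    # Closure: op must map S x S into S (i.e., it is a binary operation on S).
    if any(op(a, b) not in s for a, b in product(elems, repeat=2)):
        return False
    # Associativity: (a*b)*c == a*(b*c) for all triples (semigroup property).
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elems, repeat=3)):
        return False
    # Identity: some e with e*a == a == a*e for every a (monoid property).
    identities = [e for e in elems
                  if all(op(e, a) == a == op(a, e) for a in elems)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a*b == e == b*a (group property).
    return all(any(op(a, b) == e == op(b, a) for b in elems) for a in elems)

def is_abelian(elements, op):
    """Check commutativity a*b == b*a for all pairs."""
    elems = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elems, repeat=2))

def add_mod5(a, b):
    return (a + b) % 5

def mul_mod5(a, b):
    return (a * b) % 5

print(is_group(range(5), add_mod5), is_abelian(range(5), add_mod5))
print(is_group(range(5), mul_mod5))      # 0 has no multiplicative inverse
print(is_group(range(1, 5), mul_mod5))   # nonzero residues modulo a prime
```

The same set fails to be a group under multiplication modulo 5 only because 0 is not invertible; removing 0 restores the group property, which anticipates the definition of a field later in the chapter.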
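Definition 2.1. (Ring) Let (S; +, •) be an algebraic system, i.e., + : S × S → S and • : S × S → S. Then, (S; +, •) is called a ring if (S; +) is an Abelian group (with 0 ∈ S as the additive identity element) and if, for every α, β, γ ∈ S, the following conditions are satisfied: (i) associativity of •, i.e., (α • β) • γ = α • (β • γ); and (ii) distributivity of • over +, i.e., α • (β + γ) = (α • β) + (α • γ) and (α + β) • γ = (α • γ) + (β • γ).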
A ring is called commutative if, in addition, it satisfies the commutative property, i.e., α • β = β • α ∀α, β ∈ S.
Remark 2.1. A ring (S; +, • ) may or may not satisfy the following properties:
Definition 2.2. (Ideals) Let (S; +, •) be a ring and let D be a nonempty subset of S that is closed under the operations of + and • in S. If (D; +, •) is itself a ring, then it is called a subring of (S; +, •). A subring (D; +, •) of a ring (S; +, •) is called a left ideal if (s ∈ S and x ∈ D) ⇒ s • x ∈ D, and is called a right ideal if (s ∈ S and x ∈ D) ⇒ x • s ∈ D. The subring (D; +, •) is called an ideal if it is both a left ideal and a right ideal.
Remark 2.2. Just as normal groups play an important role in the theory of groups (see Definition 1.10), so do ideals in the theory of rings.
Definition 2.3. (Zero-divisor) Let (S; +, •) be a ring. Then, an element α ∈ S \ {0} is called a left [resp. right] zero-divisor if there exists β ∈ S \ {0} such that α • β = 0 [resp. β • α = 0]. An element of S that is both a left and a right zero-divisor is called a zero-divisor.
Definition 2.4. (Invertible Element) Let (S; +, •) be a ring with 1 as the multiplicative identity (i.e., the identity with respect to the operation •). Then, an element α ∈ S is called left [resp. right] invertible if there exists β ∈ S [resp. γ ∈ S] such that β • α = 1 [resp. α • γ = 1]. The element β [resp. γ] is called a left [resp. right] inverse of α. If an element α ∈ S is both left and right invertible, then α is called an invertible element.
Definition 2.5. (Integral Domain) Let (S; +, •) be a commutative ring with the multiplicative identity 1 ≠ 0 and no zero-divisors. Then, (S; +, •) is called an integral domain.
Definition 2.6. (Division Ring) Let (S; +, •) be a ring with the multiplicative identity 1 ≠ 0 and let (S \ {0}, •) be a group with 1 as the identity. Then, (S; +, •) is called a division ring.
Definition 2.7. (Field) A commutative division ring is called a field.
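As a small illustration of zero-divisors and integral domains, the sketch below enumerates the zero-divisors of the ring of integers modulo n. This ring and the function names `zero_divisors` and `is_integral_domain_Zn` are assumptions introduced here for illustration; they do not appear in the notes.

```python
def zero_divisors(n):
    """Nonzero a in Z_n for which a*b == 0 (mod n) for some nonzero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

def is_integral_domain_Zn(n):
    # Z_n is a commutative ring with identity 1; for n >= 2 we have 1 != 0,
    # so only the absence of zero-divisors remains to be checked.
    return n >= 2 and not zero_divisors(n)

print(zero_divisors(6))            # [2, 3, 4]
print(is_integral_domain_Zn(6))    # False
print(is_integral_domain_Zn(7))    # True
```

In Z_6, for instance, 2 • 3 = 0 although neither factor is zero, so Z_6 is not an integral domain, whereas Z_7 has no zero-divisors.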
Remark 2.3. A field (F ; +, • ) has the following properties.
Example 2.1. (R; +, •) and (C; +, •) are fields that are very commonly used in engineering analysis. Note that (Q; +, •) is also a field but it is seldom used because, as we have seen, Q is not a complete set whereas R and C are. Note that (Z; +, •) is a commutative ring but it is not a field because no element of Z \ {−1, 1} has a multiplicative inverse in Z.
Example 2.2. Fields can be both infinite and finite. Examples of infinite fields are (R; +, •) and (C; +, •). An example of a finite field (also called a Galois field) is $GF(2) \triangleq (\{0, 1\}; \oplus_2, \otimes_2)$, where $\oplus_2$ is the addition operation modulo 2 and $\otimes_2$ is the multiplication operation modulo 2. This is the smallest field.
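Because GF(2) has only two elements, its field properties can be verified exhaustively. The following sketch is an illustration only; the names `add2`, `mul2`, and `gf2_is_field` are hypothetical.

```python
from itertools import product

GF2 = (0, 1)

def add2(a, b):
    return (a + b) % 2    # addition modulo 2 (exclusive OR)

def mul2(a, b):
    return (a * b) % 2    # multiplication modulo 2 (logical AND)

def gf2_is_field():
    """Exhaustively verify the field properties on {0, 1}."""
    for a, b, c in product(GF2, repeat=3):
        # Associativity and commutativity of both operations.
        if add2(add2(a, b), c) != add2(a, add2(b, c)):
            return False
        if mul2(mul2(a, b), c) != mul2(a, mul2(b, c)):
            return False
        if add2(a, b) != add2(b, a) or mul2(a, b) != mul2(b, a):
            return False
        # Distributivity of multiplication over addition.
        if mul2(a, add2(b, c)) != add2(mul2(a, b), mul2(a, c)):
            return False
    # Identities: 0 for addition, 1 for multiplication.
    if any(add2(0, a) != a or mul2(1, a) != a for a in GF2):
        return False
    # Every element has an additive inverse.
    if not all(any(add2(a, b) == 0 for b in GF2) for a in GF2):
        return False
    # Every nonzero element has a multiplicative inverse.
    return all(any(mul2(a, b) == 1 for b in GF2) for a in GF2 if a != 0)

print(gf2_is_field())
```

Note that in GF(2) each element is its own additive inverse, since 1 ⊕ 1 = 0.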
Example 2.3. A polynomial over a field (F; +, •) is defined as an expression of the form $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$, where $a_i \in F$ and n is a non-negative integer; x is called the indeterminate. Examples of the indeterminate include real numbers, complex numbers, and square matrices.
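A polynomial with a scalar indeterminate can be evaluated with Horner's rule, which uses only the field operations + and •. This is a minimal sketch under the assumption that the coefficients are supplied in ascending order of power; the function name `horner` is hypothetical.

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n, with coeffs = [a_0, ..., a_n]."""
    result = 0
    # Work from the highest power down: (...((a_n) x + a_{n-1}) x + ...) x + a_0.
    for a in reversed(coeffs):
        result = result * x + a
    return result

print(horner([4, -3, 1], 2))    # 4 - 3*2 + 2**2
print(horner([1, 0, 1], 1j))    # 1 + x^2 evaluated at the imaginary unit
```

The same coefficients give a real value for a real indeterminate and a complex value for a complex one. A square-matrix indeterminate would additionally require replacing $a_0$ by $a_0$ times the identity matrix, which this sketch does not handle.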
The elements of V are called vectors and the elements of F are called scalars. Often the multiplicative operators • and ⊗ are simply omitted, i.e., we write α • β as αβ and α ⊗ x as αx. However, it is important to distinguish between the operators of the scalar addition + and the vector addition ⊕. We denote the zero (i.e., additive identity) of the field (F; +, •) as 0 and the zero vector (i.e., the identity of the Abelian group (V; ⊕)) as $\mathbf{0}$.
Remark 3.2. It follows from the definition of a vector space that ∀x ∈ V ∀α ∈ F
Remark 3.3. Every vector space must contain the zero vector 0. That is, the set V in a vector space (V ; ⊕) can never be empty. A geometric interpretation of this fact is that every coordinate frame of a vector space must have the origin.
Definition 3.3. (Subspace) Let (V ; ⊕) be a vector space over a field (F ; +, • ). Let U ⊆ V such that (U ; ⊕) is a vector space over a field (F ; +, • ). Then, (U ; ⊕) is a subspace of (V ; ⊕). In addition, if U is a proper subset of V , then (U ; ⊕) is a proper subspace of (V ; ⊕).
Remark 3.4. If (U ; ⊕) is a subspace of (V ; ⊕), then it follows that (U ; ⊕) is a subgroup of the group (V ; ⊕).
Proposition 3.1. Let U be a nonempty subset of V. Then, (U; ⊕) is a subspace of a vector space (V; ⊕) over a field (F; +, •) if and only if the following condition holds: (αx ⊕ y) ∈ U ∀x, y ∈ U ∀α ∈ F.
Proof. The proof follows from the definition of a subspace.
Definition 3.4. (Linear Combination) Let (V; ⊕) be a vector space over a field (F; +, •) and let S be a nonempty (finite or countably infinite or uncountable) set of vectors. Then, x ∈ V is said to be a linear combination of vectors in S if there exists a finite set of vectors $\{u^1, \cdots, u^n\} \subseteq S$ and a finite set of scalars $\{\alpha_1, \cdots, \alpha_n\}$, where n ∈ N, such that $x = \bigoplus_{j=1}^{n} \alpha_j u^j$.
Definition 3.5. (Linear Dependence) Let (V; ⊕) be a vector space over a field (F; +, •) and let S be a nonempty (finite or countably infinite or uncountable) set of vectors. Then, the vectors in S are said to be linearly independent if, for each x ∈ S, x is not a linear combination of the vectors in S \ {x} (i.e., the set S with x removed). The set S in the vector space (V; ⊕) is linearly dependent if it is not linearly independent, i.e., there exists a vector x ∈ S such that x is a linear combination of vectors in S \ {x}.
Theorem 3.1. Let (V; ⊕) be a vector space over a field (F; +, •) and let S be a nonempty (finite or countably infinite or uncountable) set of vectors. Then, S is linearly independent if and only if, for each nonempty finite subset of S, say, $\{u^1, \cdots, u^n\}$, the only n-tuple of scalars satisfying the equation $\bigoplus_{j=1}^{n} \alpha_j u^j = \mathbf{0}$ is the trivial solution $\alpha_1 = \cdots = \alpha_n = 0$.
Proof. See Naylor & Sell pp. 177-178.
Remark 3.5. A (nonempty) set S in a vector space (V; ⊕) is linearly dependent if and only if there exists a nonempty finite subset of S, say, $\{u^1, \cdots, u^n\}$, and scalars $\alpha_1, \cdots, \alpha_n$, where not all $\alpha_i$'s are zero, such that $\bigoplus_{j=1}^{n} \alpha_j u^j = \mathbf{0}$.
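For a finite set of vectors in $Q^n$, the criterion of Theorem 3.1 can be tested by Gaussian elimination: the set is linearly independent exactly when the rank equals the number of vectors. The sketch below is illustrative only; it assumes rational entries (handled exactly with `fractions.Fraction`), and the names `rank` and `is_linearly_independent` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors over Q via Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    # Only the trivial combination gives the zero vector iff rank == count.
    return rank(vectors) == len(vectors)

print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # third = first + second
print(is_linearly_independent([(1, 2), (3, 4)]))
```

In the first set, the third vector is the sum of the first two, so a nontrivial combination with coefficients (1, 1, −1) yields the zero vector.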
Definition 3.6. (Spanning) Let (V ; ⊕) be a vector space over a field (F ; +, • ) and let S be a nonempty (finite or countably infinite or uncountable) set of vectors. Then, the set of all (finite) linear combinations of vectors in S is the space spanned by S and is denoted as span(S).
Remark 3.6. It follows from Definition 3.6 that span(S) is the smallest subspace of (V ; ⊕) containing the set S, i.e., span(S) is the intersection of all subspaces of (V ; ⊕) that contain the set S.
numbers C forms a one-dimensional vector space over the complex field C. The implications of this fact are illustrated below by a physical example. Let an unforced single-degree-of-freedom underdamped linear time-invariant mass-spring-damper (or inductance-capacitance-resistance) system, with non-zero initial conditions, be governed by the following equation.
$$\frac{d^2 y(t)}{dt^2} + 2\xi\omega_n \frac{dy(t)}{dt} + \omega_n^2\, y(t) = 0$$
where y(t) is the time-dependent displacement (or charge); $\omega_n$ is the natural frequency; and ξ is the damping coefficient (0 ≤ ξ < 1). A state-space representation of the above equation in the two-dimensional vector space $R^2$ over the real field R is given below.
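$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -\xi\omega_n & \sqrt{1-\xi^2}\,\omega_n \\ -\sqrt{1-\xi^2}\,\omega_n & -\xi\omega_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
where a choice for the states is $x_1 = \xi\omega_n y + \frac{dy}{dt}$ and $x_2 = -\sqrt{1-\xi^2}\,\omega_n y$, and the state transition matrix
$$\begin{bmatrix} -\xi\omega_n & \sqrt{1-\xi^2}\,\omega_n \\ -\sqrt{1-\xi^2}\,\omega_n & -\xi\omega_n \end{bmatrix} \in R^{2\times 2}$$
maps $R^2$ into $R^2$.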
Let us construct a vector space over the complex field C instead of the real field R. To this end, let us define a complex-valued state $z(t) \triangleq x_1(t) + i x_2(t)$, where $i = \sqrt{-1}$, and $x_1$ and $x_2$ are as defined above. Then it follows that
$$\frac{dx_1}{dt} + i\,\frac{dx_2}{dt} = \Big(-\xi\omega_n x_1 + \sqrt{1-\xi^2}\,\omega_n x_2\Big) + i\Big(-\sqrt{1-\xi^2}\,\omega_n x_1 - \xi\omega_n x_2\Big)$$
which reduces to
$$\frac{d}{dt}[x_1 + i x_2] = -\Big(\xi + i\sqrt{1-\xi^2}\Big)\omega_n\, [x_1 + i x_2].$$
Consequently, the state-space representation of the above equation in the one-dimensional vector space $C^1$ over the complex field C is as given below.
$$\frac{d}{dt}[z] = -\Big(\xi + i\sqrt{1-\xi^2}\Big)\omega_n\, [z]$$
where the state transition matrix $-\big(\xi + i\sqrt{1-\xi^2}\big)\omega_n \in C^{1\times 1}$ maps $C^1$ into $C^1$. Note that, in the disciplines of Physics and Engineering, such a second-order underdamped system is often referred to as a single degree-of-freedom system. Here, we have shown that an underdamped system is represented on a vector space of dimension 2 over the real field R, or on a vector space of dimension 1 over the complex field C. Next we consider an overdamped system, i.e., the damping coefficient ξ > 1. A state-space representation of the overdamped system in the two-dimensional vector space $R^2$ over the real field R is given below.
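$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -\xi\omega_n & \sqrt{\xi^2-1}\,\omega_n \\ \sqrt{\xi^2-1}\,\omega_n & -\xi\omega_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
where a choice for the states is $x_1 = \xi\omega_n y + \frac{dy}{dt}$ and $x_2 = \sqrt{\xi^2-1}\,\omega_n y$, and the state transition matrix
$$\begin{bmatrix} -\xi\omega_n & \sqrt{\xi^2-1}\,\omega_n \\ \sqrt{\xi^2-1}\,\omega_n & -\xi\omega_n \end{bmatrix} \in R^{2\times 2}$$
maps $R^2$ into $R^2$.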
Similar to what was done for the underdamped system, the vector space is constructed over the complex field C instead of the real field R. The complex-valued state is defined as $z(t) \triangleq x_1(t) + i x_2(t)$, where $i = \sqrt{-1}$, and $x_1$ and $x_2$ are as defined above. Then it follows that
$$\frac{dx_1}{dt} + i\,\frac{dx_2}{dt} = \Big(-\xi\omega_n x_1 + \sqrt{\xi^2-1}\,\omega_n x_2\Big) + i\Big(\sqrt{\xi^2-1}\,\omega_n x_1 - \xi\omega_n x_2\Big)$$
which reduces to
$$\frac{d}{dt}[x_1 + i x_2] = \omega_n\Big[\big(-\xi x_1 + \sqrt{\xi^2-1}\,x_2\big) + i\big(\sqrt{\xi^2-1}\,x_1 - \xi x_2\big)\Big]$$
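Consequently, the state-space representation of the above equation in the one-dimensional vector space $C^1$ over the complex field C is as given below.
$$\frac{d}{dt}[z] = -\Big(\xi + i\sqrt{\xi^2-1}\Big)\omega_n\, [z] + i\sqrt{\xi^2-1}\,\omega_n\, (z + \bar{z})$$
where $\bar{z}$ is the complex conjugate of $z \in C^1$; since $z + \bar{z} = 2x_1$, the second term accounts for the sign of the $x_1$ contribution to $dx_2/dt$, which differs from the underdamped case. Note that the operation of complex conjugation is not linear over C because, given any $z, \tilde{z}, \gamma \in C$, it follows that $\overline{z + \gamma\tilde{z}} = \bar{z} + \bar{\gamma}\,\bar{\tilde{z}}$ instead of $\overline{z + \gamma\tilde{z}} = \bar{z} + \gamma\,\bar{\tilde{z}}$. For the critically damped system (i.e., ξ = 1), the system dynamics is given as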
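$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -\omega_n & \omega_n \\ 0 & -\omega_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
where a choice for the states is $x_1 = \omega_n y$ and $x_2 = \omega_n y + \frac{dy}{dt}$, and the state transition matrix
$$\begin{bmatrix} -\omega_n & \omega_n \\ 0 & -\omega_n \end{bmatrix} \in R^{2\times 2}$$
maps $R^2$ into $R^2$.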
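The real and complex state-space forms of the mass-spring-damper example can be compared numerically. The sketch below is an illustration with hypothetical function names and arbitrarily chosen parameter values. For the overdamped case it uses the form $dz/dt = -\xi\omega_n z + i\sqrt{\xi^2-1}\,\omega_n \bar{z}$, which follows directly from the real 2×2 overdamped equations with $z = x_1 + i x_2$.

```python
import math

def underdamped_rates(xi, wn, x1, x2):
    """Return (dx1/dt + i dx2/dt from the real form, dz/dt from the complex form)."""
    s = math.sqrt(1.0 - xi ** 2)
    dx1 = -xi * wn * x1 + s * wn * x2
    dx2 = -s * wn * x1 - xi * wn * x2
    z = complex(x1, x2)
    dz = -(xi + 1j * s) * wn * z          # one complex scalar multiplication
    return complex(dx1, dx2), dz

def overdamped_rates(xi, wn, x1, x2):
    """Same comparison for xi > 1; the complex form needs the conjugate of z."""
    s = math.sqrt(xi ** 2 - 1.0)
    dx1 = -xi * wn * x1 + s * wn * x2
    dx2 = s * wn * x1 - xi * wn * x2
    z = complex(x1, x2)
    dz = -xi * wn * z + 1j * s * wn * z.conjugate()
    return complex(dx1, dx2), dz

def conjugation_is_homogeneous(gamma, z):
    """Check whether conj(gamma * z) equals gamma * conj(z)."""
    return (gamma * z).conjugate() == gamma * z.conjugate()

real_form, complex_form = underdamped_rates(0.3, 2.0, 1.5, -0.7)
print(abs(real_form - complex_form) < 1e-12)
real_form, complex_form = overdamped_rates(1.5, 2.0, 1.5, -0.7)
print(abs(real_form - complex_form) < 1e-12)
print(conjugation_is_homogeneous(2.0, 1 + 2j))   # real scalar
print(conjugation_is_homogeneous(1j, 1 + 2j))    # complex scalar
```

The last two lines illustrate why the overdamped system is not a linear map on $C^1$ over the complex field: conjugation commutes with real scalars but not with general complex scalars.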
Linearity of h: Let α ∈ F and $a, b \in F^n$. Then, $\alpha a \oplus b \in F^n$ and $h(\alpha a \oplus b) = \bigotimes_{j=1}^{n} (\alpha a_j + b_j)\, v^j = (\alpha h(a)) \otimes h(b)$ ⇒ linearity is established.
Injectivity of h: Let h(a) = h(b) for $a, b \in F^n$. By linearity of h and using −1 ∈ F, it follows that $\mathbf{0} = h(a \oplus (-1)b) = \bigotimes_{j=1}^{n} \big(a_j + (-1) b_j\big)\, v^j$. Because of linear independence of the $v^j$'s, it follows that $a_j + (-1) b_j = 0 \Rightarrow a_j = b_j$ ∀j. Therefore, a = b ⇒ injectivity is established.
Surjectivity of h: Since the $v^j$'s form a basis of (V; ⊗), for every x ∈ V there exist unique scalars $c_j$ such that $x = \bigotimes_{j=1}^{n} c_j v^j = h(c)$, i.e., for every x ∈ V there exists $c \in F^n$ with h(c) = x ⇒ surjectivity is established.
Remark 3.11. Since a finite-dimensional (say n ∈ N) vector space V over a field F (e.g., the real field R or the complex field C) is isomorphic to $F^n$, it is natural to study the properties of $F^n$ in lieu of those of V. The vectors in $F^n$ can be interpreted as: (i) n-tuples of scalars in F, or (ii) linear mappings from F to $F^n$. Let us present a generalization of the second concept. Let $\mathcal{N} \triangleq \{1, 2, \cdots, n\}$ be the set of all integers from 1 to n. Then, a function $x : \mathcal{N} \to F$ is a rule that assigns a scalar x(j) to each $j \in \mathcal{N}$. Let the set of all such functions be denoted as $F^{\mathcal{N}}$, i.e., specifying a function $x \in F^{\mathcal{N}}$ amounts to presenting n consecutive scalars x(1), x(2), · · · , x(n), which could be arranged as a column vector or as a row vector in $F^n$. This provides a bijection between $F^{\mathcal{N}}$ and $F^n$. For all c ∈ F and $x, y \in F^{\mathcal{N}}$, it is meaningful to define the operations of scalar multiplication and vector addition in $F^{\mathcal{N}}$ respectively by
(cx)(j) = cx(j) and (x ⊕ y)(j) = x(j) + y(j)
These operations make $F^{\mathcal{N}}$ a vector space over the field F, which reveals that there is no essential difference between $F^{\mathcal{N}}$ and $F^n$. Now we generalize this concept to infinite-dimensional vector spaces. Let S be an arbitrary (i.e., finite, countable, or uncountable) nonempty set that serves as an index set and let $F^S$ denote the set of all functions f : S → F. If S = N, then $F^S$ is the vector space of all possible sequences s : N → F, which includes all (i.e., real or complex) scalar-valued sequences. Similarly, if S = R, then $F^S$ is the vector space of all possible (not necessarily linear) functions f : R → F, which includes all (i.e., real or complex) scalar-valued time-dependent signals; for example, f(t) could be the output voltage of an operational amplifier. This concept allows extension of the tools of linear algebra to function spaces. In general, if S is a countably infinite set, i.e., S ∼ N, the vector $x \in F^S$ is called a sequence; if S is an uncountably infinite set, i.e., S ∼ R, the vector $x \in F^S$ is called a function.
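The pointwise operations (cx)(j) = cx(j) and (x ⊕ y)(j) = x(j) + y(j) translate directly into code in which a vector is simply a function on an index set. This is a minimal sketch; the helper names `scale`, `add`, `from_tuple`, and `as_tuple` are hypothetical and only illustrate the idea for a finite index set and for the index set R.

```python
def scale(c, x):
    """Scalar multiplication: (c x)(j) = c * x(j)."""
    return lambda j: c * x(j)

def add(x, y):
    """Vector addition: (x + y)(j) = x(j) + y(j)."""
    return lambda j: x(j) + y(j)

def from_tuple(values):
    """View an n-tuple as a function on the index set {1, ..., n}."""
    return lambda j: values[j - 1]

def as_tuple(f, index_set):
    """Recover the n-tuple by listing the values of f over the index set."""
    return tuple(f(j) for j in index_set)

# Finite index set {1, 2, 3}: functions correspond to 3-tuples.
index_set = (1, 2, 3)
x = from_tuple((1, 2, 3))
y = from_tuple((10, 20, 30))
print(as_tuple(add(scale(2, x), y), index_set))   # (12, 24, 36)

# Index set R: the same two operations act on scalar-valued signals.
f = lambda t: t
g = lambda t: t * t
h = add(scale(3, f), g)                           # h(t) = 3 t + t^2
print(h(2.0))                                     # 10.0
```

The functions `scale` and `add` never inspect the index set, which is why the same definitions serve for tuples, sequences, and signals alike.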
This subsection introduces the rudimentary concepts of linear transformation of vector spaces. More advanced materials on transformation of vector spaces will be presented in later chapters.
Definition 3.10. (Linear Transformation) Let (V; ⊕) and (W; ⊗) be two vector spaces over the same field F. Then, L : V → W is called a linear transformation if the following condition holds for all $x^k \in V$ and all $\alpha_k \in F$:
$$L\Big(\bigoplus_{k=1}^{n} \alpha_k x^k\Big) = \bigotimes_{k=1}^{n} \alpha_k L(x^k) \quad \text{for any arbitrary } n \in \mathbb{N}$$
If W = V , then the transformation L : V → V is called an operator. If W = F, then the transformation L : V → F is called a functional. The space of all linear bounded functionals of a vector space V is called the dual space V ⋆.
Definition 3.11. (Null Space) The null space N(L) of a linear transformation L : V → W is the subspace of V that satisfies the following condition:
$$N(L) \triangleq \{x \in V : Lx = 0_W\}$$
Definition 3.12. (Range Space) The range space R(L) of a linear transformation L : V → W is the subspace of W that satisfies the following condition:
$$R(L) \triangleq \{y \in W : y = Lx \text{ for some } x \in V\}$$
Definition 3.13. (Sum, Direct Sum, and Algebraic Complement) Let V be a vector space, where the operation of vector addition is ⊛, and let U and W be two subspaces of V. Then, the sum of U and W is a subspace of V, denoted as Y = U + W, which is spanned by all vectors in U and W, i.e.,
$$U + W \triangleq \{x = u ⊛ w : u \in U \text{ and } w \in W\}$$
The sum Y is called the direct sum of U and W, denoted as Y = U ⊕ W, if ∀y ∈ Y ∃ u ∈ U and w ∈ W such that there is a unique representation y = u ⊛ w. In this setting, U is called the algebraic complement of W (alternatively, W is the algebraic complement of U) in Y.
Definition 3.14. (Projection) Let V be a vector space. Then, a linear transformation P : V → V is called a projection on V if $P^2 = P$, i.e., P(Px) = Px ∀x ∈ V.
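For a linear transformation on $R^2$ given by a matrix, the projection condition $P^2 = P$ can be tested by one matrix multiplication. The sketch below is illustrative; the names `matmul` and `is_projection` and the sample matrices are assumptions introduced here.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_projection(P):
    """A square matrix represents a projection iff P*P == P."""
    return matmul(P, P) == P

oblique = [[1, 1], [0, 0]]      # maps (x1, x2) to (x1 + x2, 0)
rotation = [[0, -1], [1, 0]]    # rotation by 90 degrees
scaling = [[2, 0], [0, 1]]      # stretches the first coordinate

print(is_projection(oblique))
print(is_projection(rotation))
print(is_projection(scaling))
```

The oblique example shows that a projection need not be orthogonal: applying it twice gives the same result as applying it once, which is all the definition requires.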
Let V and W be two finite-dimensional vector spaces over the same field F, where dim(V) = n and dim(W) = m; let A ∈ L(V, W) be a linear transformation from V to W. If $B \triangleq \{b^1, b^2, \cdots, b^n\}$ is a basis for V, then each vector x ∈ V can be uniquely represented as $x = \sum_{k=1}^{n} b^k \beta_k$, where $\beta_k \in F$. Let the vector of coordinates $\beta_k$ be denoted as $[x]_B \triangleq [\beta_1\ \beta_2\ \cdots\ \beta_n]^T$, which implies that $[x]_B \in F^n$. Similarly, if $C \triangleq \{c^1, c^2, \cdots, c^m\}$ is a basis for W, then the image of x ∈ V under A ∈ L(V, W), i.e., the vector Ax ∈ W, can be uniquely represented as $Ax = \sum_{k=1}^{m} c^k \gamma_k$, where $\gamma_k \in F$; thus, the vector of coordinates $\gamma_k$ is denoted as $[Ax]_C \triangleq [\gamma_1\ \gamma_2\ \cdots\ \gamma_m]^T$, which implies that $[Ax]_C \in F^m$.
Definition 3.15. (Matrix representation) The linear transformation from the coordinates, $[x]_B \in F^n$, of a vector x ∈ V to the coordinates, $[Ax]_C \in F^m$, of the vector Ax ∈ W is an (m × n) matrix. This matrix, denoted by $\mathbf{A} \in F^{m\times n}$, serves as the coordinates of the linear transformation A ∈ L(V, W) relative to the bases (B, C).
Theorem 3.4. (Matrix Representation Theorem) Let $B \triangleq \{b^1, b^2, \cdots, b^n\}$ and $C \triangleq \{c^1, c^2, \cdots, c^m\}$ be bases for the vector spaces V and W, respectively. Let the kth column of the matrix $\mathbf{A} \in F^{m\times n}$ be defined as $a^k \triangleq [A b^k]_C$ for 1 ≤ k ≤ n and k ∈ N. Then, L(V, W) is isomorphic to $F^{m\times n}$ and $[Ax]_C = \mathbf{A}[x]_B$ ∀x ∈ V.
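Proof. Let x ∈ V and $x = \sum_{k=1}^{n} b^k \beta_k$. Then, $[x]_B \triangleq [\beta_1\ \beta_2\ \cdots\ \beta_n]^T$. Since A is linear, it follows that
$$[Ax]_C = \Big[A \sum_{k=1}^{n} b^k \beta_k\Big]_C = \sum_{k=1}^{n} \big[A b^k\big]_C\, \beta_k = \sum_{k=1}^{n} a^k \beta_k = \mathbf{A}[x]_B$$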
Next, to show that the spaces L(V, W) and $F^{m\times n}$ are isomorphic, let us define a transformation $T : L(V, W) \to F^{m\times n}$ that assigns to each A ∈ L(V, W) the matrix $\mathbf{A} \in F^{m\times n}$ whose kth column is $a^k = [A b^k]_C$. What remains to be shown is that T is linear, injective, and surjective. This part of the proof is similar to that of Theorem 3.3.
In view of Theorem 3.4, every linear transformation between finite-dimensional vector spaces can be represented by a matrix of appropriate dimension, as seen in Figure 1, which pictorially illustrates the isomorphic relationship between the spaces L(V, W) and $F^{m\times n}$. The matrix $\mathbf{A} \in F^{m\times n}$ is the matrix representation of the linear transformation A induced by the bases B and C.
Example 3.1. (Integration of a Polynomial) Let the vector spaces V and W over the real field R be defined as $V = P^{n-1}((0, \infty))$ and $W = P^{n}((0, \infty))$, where $P^{n}((0, \infty))$ is the space of polynomials of degree n or less and the indeterminate is a real scalar in the range of (0, ∞). Let us consider the integral transform A ∈ L(V, W), i.e., for each x ∈ V and each t ∈ (0, ∞),
$$(Ax)(t) \triangleq \int_0^t d\tau\, x(\tau)$$
Let B be the basis of elementary polynomials, i.e., $B = \{1, t, t^2, \cdots\}$, t ∈ (0, ∞). Then, the matrix representation of A induced by the bases of elementary polynomials is computed as:
$$a^k = \big[A t^{k-1}\big]_B = \Big[\int_0^t d\tau\, \tau^{k-1}\Big]_B = \Big[\frac{t^k}{k}\Big]_B = \frac{1}{k}\, e^{k+1}$$
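where $e^k$ denotes the kth column of the identity matrix. Thus, for n = 3, it follows that the matrix representation $\mathbf{A}$ of the integral transform A, mapping the coordinates of a polynomial into the coordinates of its integral, is given as
$$\mathbf{A} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}$$
For example, if $x(t) = 4 - 3t + t^2$, then $[x]_B = [4\ \ {-3}\ \ 1]^T$ and the coordinates of its integral are $[Ax]_B = \mathbf{A}[x]_B = [0\ \ 4\ \ {-\tfrac{3}{2}}\ \ \tfrac{1}{3}]^T$, i.e., $(Ax)(t) = 4t - \tfrac{3}{2}t^2 + \tfrac{1}{3}t^3$.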
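The column formula $a^k = \frac{1}{k} e^{k+1}$ of Example 3.1 can be turned into a small routine that builds the matrix representation of the integral transform and applies it to a coordinate vector. This is a sketch only; the function names `integration_matrix` and `matvec` are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def integration_matrix(n):
    """(n+1) x n matrix of the integral transform in the bases {1, t, t^2, ...}.

    Column k (1-based) equals (1/k) e^{k+1}: t^{k-1} integrates to t^k / k.
    """
    A = [[Fraction(0)] * n for _ in range(n + 1)]
    for k in range(1, n + 1):
        A[k][k - 1] = Fraction(1, k)   # row index k is entry k+1 in 1-based terms
    return A

def matvec(A, v):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

A = integration_matrix(3)
coords = [4, -3, 1]                    # x(t) = 4 - 3 t + t^2
print(matvec(A, coords))               # coordinates of 4 t - (3/2) t^2 + (1/3) t^3
```

Multiplying the coordinates of $4 - 3t + t^2$ by this matrix yields the coordinates of $4t - \tfrac{3}{2}t^2 + \tfrac{1}{3}t^3$, so integration of polynomials reduces to a matrix-vector product.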
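[Figure 2: Relationship between a linear transformation and two of its matrix representations. The vector x is mapped to Ax; its coordinates $[x]_B$ and $[x]_C$ are mapped to $[Ax]_D$ and $[Ax]_E$, respectively.]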
This subsection introduces the basic concepts of coordinate transformation in finite-dimensional vector spaces through the usage of linear transformation. The concept of matrix representation in Theorem 3.4 leads to the fact that, by choosing bases for two finite-dimensional vector spaces $V \sim F^n$ and $W \sim F^m$ over the same field F, each vector in L(V, W) can be represented by its coordinates as a vector in $F^{m\times n}$. To develop a strategy for selecting bases of V and W, we encounter the problem of transforming from one matrix representation to another. That is, given the matrix representation of A ∈ L(V, W) relative to a given pair of bases, the problem is how to find its matrix representation relative to another pair of bases. Here each matrix represents the same linear transformation in a different coordinate system. Consequently, moving from one matrix representation to another is a coordinate transformation problem.
Theorem 3.5. (Coordinate Transformation Theorem) Let $V \sim F^n$ and $W \sim F^m$ be two finite-dimensional vector spaces over the same field F and let A ∈ L(V, W). Let $B = \{b^1, b^2, \cdots, b^n\}$ and $C = \{c^1, c^2, \cdots, c^n\}$ be two bases for V, and let $D = \{d^1, d^2, \cdots, d^m\}$ and $E = \{e^1, e^2, \cdots, e^m\}$ be two bases for W. Let $P \in F^{n\times n}$ be the matrix whose kth column is $[c^k]_B$, and let $Q \in F^{m\times m}$ be the matrix whose kth column is $[d^k]_E$. Then, it follows that
$$[A]_{C,E} = Q\, [A]_{B,D}\, P$$
Proof. First we show that the matrix P maps C-coordinates into B-coordinates. Let $([x]_C)_k$ denote the kth entry of $[x]_C$. For each x ∈ V, we have
$$[x]_B = \Big[\sum_{k=1}^{n} c^k\, ([x]_C)_k\Big]_B = \sum_{k=1}^{n} [c^k]_B\, ([x]_C)_k = \sum_{k=1}^{n} p^k\, ([x]_C)_k = P\,[x]_C$$
By using a similar argument we show that the matrix Q maps D-coordinates into E-coordinates. That is, for each y ∈ W, we have $[y]_E = Q\,[y]_D$. Therefore, for each x ∈ V, it follows that
$$Q\,[A]_{B,D}\,P\,[x]_C = Q\,[A]_{B,D}\,[x]_B = Q\,[Ax]_D = [Ax]_E = [A]_{C,E}\,[x]_C$$
Since x ∈ V is arbitrarily chosen, the proof follows by application of Theorem 3.4.
Pre-multiplication and post-multiplication by the respective transformation matrices are needed to obtain one matrix representation from another, as seen pictorially in Figure 2, which summarizes the relationship between a linear transformation and two of its matrix representations. A matrix itself is a linear transformation. As such, it is possible to find a matrix representation of a matrix.
Example 3.3. (Coordinate Transformation) This example deals with a linear transformation, namely, differentiation on the space of polynomials, and two of its matrix representations. Let $V = P^{n}((0, \infty))$ and let A be the derivative transformation on V. That is, for each x ∈ V and each t ∈ (0, ∞),
$$(Ax)(t) \triangleq \frac{d}{dt}\, x(t)$$
Then, A ∈ L(V, W), where $W = P^{n-1}((0, \infty))$. Let B be the standard basis of elementary polynomials, i.e., $B = \{1, t, t^2, \cdots\}$. If $\mathbf{B} \in F^{n\times(n+1)}$ is the matrix representation of A induced by the bases of elementary polynomials, then it follows from Example 3.2 for the case n = 3 that
$$\mathbf{B} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$
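Next, let us consider the following alternative basis for the space of polynomials: $D = \{1, 2t, 3t^2, \cdots\}$. If $\mathbf{D} \in F^{n\times(n+1)}$ is the matrix representation of A induced by the bases associated with D, then it follows from Theorem 3.5 that $\mathbf{D} = Q\,\mathbf{B}\,P$, where the coordinate transformation matrix P is computed as $p^k = [k t^{k-1}]_B = k e^k$ for 1 ≤ k ≤ (n + 1), and the coordinate transformation matrix Q is computed as $q^k = [t^{k-1}]_D = \frac{1}{k} e^k$ for 1 ≤ k ≤ n. For n = 3, it follows that
$$P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix} \quad \text{and} \quad Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}$$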
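The coordinate transformation of the differentiation example can be carried out numerically for n = 3. The sketch below is an illustration under stated assumptions: the function names are hypothetical, Q is built from the coordinates of the elementary polynomials $t^{k-1}$ in the basis {1, 2t, 3t², …} (as Theorem 3.5 prescribes), and the result is compared with the matrix obtained by differentiating the new basis polynomials directly.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def diff_matrix(n):
    """n x (n+1) matrix of d/dt in the elementary bases: t^j -> j t^(j-1)."""
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for j in range(1, n + 1):
        M[j - 1][j] = Fraction(j)
    return M

def diag(entries):
    """Square diagonal matrix with the given diagonal entries."""
    size = len(entries)
    return [[Fraction(entries[i]) if i == j else Fraction(0)
             for j in range(size)] for i in range(size)]

n = 3
B_mat = diff_matrix(n)
P = diag([k for k in range(1, n + 2)])             # p^k = k e^k
Q = diag([Fraction(1, k) for k in range(1, n + 1)])  # q^k = (1/k) e^k
D_mat = matmul(matmul(Q, B_mat), P)

# Direct check: d/dt (k t^(k-1)) = k * ((k-1) t^(k-2)), i.e., column k is k e^(k-1).
expected = [[0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
print(D_mat == expected)
```

Both routes give the same matrix, which is what Theorem 3.5 asserts: the two matrices represent the same derivative transformation in different coordinate systems.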