Linear Transformations and Functionals: Basic Concepts, Study notes of Engineering Systems Analysis

Review of the fundamentals of real analysis and point set topology. Concepts of finite-dimensional vector spaces from both algebraic and topological points of view. Introduction to infinite-dimensional vector spaces and function spaces along with the notion of completeness. Key points in this lecture handout are: Linear Transformations and Functionals, Linear Bounded Functionals and Dual Spaces, Linear Bounded Functionals, Hahn-Banach Theorem, Zorn's Lemma, Extension of Linear Functionals, Appli

Typology: Study notes

2012/2013

Uploaded on 10/02/2013

aanila


ME(EE) 550 Foundations of Engineering Systems Analysis

Chapter 05: Linear Transformations and Functionals

The concepts of normed vector spaces and inner product spaces, presented in Chapter 3 and Chapter 4, synergistically combine the topological structure of metric spaces and the algebraic structure of vector spaces. Now we present linear transformations between such spaces, where these linear transformations form a vector space in their own right. We also introduce the concept of a norm in the space of linear transformations. This chapter should be read along with Chapter 4 and Chapter 5 of Naylor & Sell. Specifically, some of the solved examples and exercises in Naylor & Sell would be very useful.

1 Basic concepts

Definition 1.1. (Transformations, Operators, and Functionals) Let V and W be two vector spaces (not necessarily of the same dimension) defined over the same field F (which is either R or C). Then,

(i) A mapping f : V → W is called a transformation from V into W.

(ii) A mapping f : V → V is called an operator from V into itself. Hence, an operator belongs to a specific class of transformations.

(iii) A mapping f : V → F is called a functional from V into its field. Hence, a functional belongs to a specific class of transformations.

Furthermore, if the mapping f is linear, i.e., if f (α x⊕V y) = α f (x)⊕W f (y) ∀x, y ∈ V and ∀α ∈ F, then these mappings are respectively called a linear transformation, a linear operator, or a linear functional. The collection of all linear transformations from V into W forms a vector space, denoted as L(V, W ), over the field F.

Example 1.1. Let $V = F^n$ and $W = F^m$ for some n, m ∈ N. Then, a linear transformation A : V → W is represented by an (m × n) matrix, i.e., $A \in F^{m \times n}$.
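As a minimal sketch of Example 1.1, the following Python snippet treats a 2 × 3 real matrix as a transformation from R^3 into R^2 and checks the linearity condition A(αx + y) = αAx + Ay for one choice of vectors. The matrix entries, the test vectors, and the helper name `apply_matrix` are illustrative assumptions and do not come from the notes.

```python
def apply_matrix(A, x):
    """Return A x for an m-by-n matrix A (a list of rows) and a vector x of length n."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical 2 x 3 matrix: it maps R^3 (n = 3) into R^2 (m = 2).
A = [[1, 2, 0],
     [0, -1, 3]]
x = [1, 0, 2]
y = [0, 1, 1]
alpha = 3

# Left side: A(alpha*x + y); right side: alpha*A(x) + A(y).
lhs = apply_matrix(A, [alpha * xi + yi for xi, yi in zip(x, y)])
rhs = [alpha * u + v for u, v in zip(apply_matrix(A, x), apply_matrix(A, y))]
print(lhs, rhs)
```

The two printed vectors agree, and each has m = 2 entries, which is why the matrix has m rows and n columns.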

Example 1.2. Let V = P(F), where P(F) denotes the space of polynomials of any degree with coefficients in F. Then, a linear mapping A : V → V is a linear operator on an infinite-dimensional space.

Example 1.3. Let V = P(R). Then, a mapping f : V → R is a functional. For example, the norm in a normed vector space is a functional; however, the norm is not a linear functional.

Definition 1.2. (Injectivity, surjectivity, and bijectivity of transformations) Let V and W be two vector spaces (not necessarily of the same dimension) defined over the same field F. Let T : V → W be a transformation. Then,

(i) T is called one-to-one or injective if

T(x) = T(y) ⇒ x = y ∀x, y ∈ V.

If T : V → W is injective, then its left inverse is S : W → V such that $S T = I_V$.

(ii) T is called onto or surjective if the range space of T is equal to W , i.e., if ∀z ∈ W ∃ x ∈ V such that T (x) = z. If T : V → W is surjective, then its right inverse is S : W → V such that T S = IW.

(iii) T is called bijective if T is both injective and surjective. In that case, there exists a unique inverse of T, denoted as $T^{-1} : W \to V$, that is also bijective, and $T^{-1} T = T T^{-1} = I$.

Definition 1.3. (Null space) Let L : V → W be a linear transformation. Then, the null space of L is defined as

$$N(L) \triangleq \{x \in V : Lx = 0_W\}$$

Proposition 1.1. (Injectivity of null spaces) Let A ∈ L(V, W). Then, A is injective if and only if $N(A) = \{0_V\}$.

Proof. To show the if part, let $N(A) = \{0_V\}$. If Ax = Ay, then linearity of A yields $A(x - y) = 0_W$, so that $(x - y) \in N(A)$, i.e., $(x - y) = 0_V$ or x = y. So, Ax = Ay ⇒ x = y, which implies A is injective. Next, to show the only if part, let $N(A) \neq \{0_V\}$. Then, ∃ $z \neq 0_V$ such that $Az = 0_W$. Now, writing z = (x − y) yields $A(x - y) = 0_W$ ⇒ Ax = Ay with x ≠ y, which implies A is not injective.
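The only if part of the proof can be illustrated on a small finite-dimensional case. The sketch below uses a hypothetical singular 2 × 2 matrix (an assumption chosen for illustration) whose null space contains a nonzero vector z, and shows that two distinct vectors x and y = x − z are mapped to the same image.

```python
def apply_matrix(A, x):
    """Return A x for a matrix A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Hypothetical singular matrix: the second row is twice the first.
A = [[1, 2],
     [2, 4]]
z = [2, -1]                       # nonzero vector with A z = 0
x = [1, 1]
y = [x[0] - z[0], x[1] - z[1]]    # y = x - z, so x - y = z

image_z = apply_matrix(A, z)
image_x = apply_matrix(A, x)
image_y = apply_matrix(A, y)
print(image_z, image_x, image_y)
```

Since x ≠ y while the images coincide, this particular A is not injective, in agreement with Proposition 1.1.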

Definition 1.4. (Boundedness of transformations) Let T : V → W be a transformation (not necessarily linear), where V and W are normed vector spaces (with possibly different norms) over the same field F. Then T is defined to be bounded if

$$\exists\, M \in (0, \infty) \ \text{such that} \ \|T(x)\|_W \le M \|x\|_V \quad \forall x \in V$$

Az = Ax − Ay. For continuity of A, we must show that ∀x ∈ V ∀ε > 0 ∃δ(ε, x) > 0 such that

$$\|x - y\|_V < \delta \ \Rightarrow\ \|Ax - Ay\|_W < \varepsilon$$

This is achieved by choosing $\delta = \frac{\varepsilon}{M}$. Next, to show the only if part, let A be continuous. Let us consider a Cauchy sequence $\{x^k\}$ of non-zero vectors in V converging to $0_V$. Let us define $z^k \triangleq \frac{x^k}{k\|x^k\|_V}$; then $\|z^k\|_V = \frac{1}{k}$, so that $z^k \to 0_V$, and since A is linear and continuous, $\{Az^k\}$ must converge to $0_W$ as k → ∞. Now, if A is unbounded, then ∀k ∈ N ∃ $x^k \in V$ such that $\|Ax^k\|_W > k\|x^k\|_V$, which implies that

$$\|Az^k\|_W = \left\| \frac{Ax^k}{k\|x^k\|_V} \right\|_W = \frac{\|Ax^k\|_W}{k\|x^k\|_V} > 1 \quad \forall k \in \mathbb{N}$$

This is a contradiction because $\{Az^k\}$ must converge to $0_W$ as k → ∞. Therefore, A must be bounded.

Corollary 1.1. (Boundedness of finite-dimensional linear transformation) If V is finite-dimensional, then every A ∈ L(V, W ) is continuous.

Proof. Let dim V = n for some n ∈ N and let $\{e^k : k = 1, \cdots, n\}$ be a basis of V, where $e^j$ consists of all zero elements except 1 being in the $j$th position. Since A is linear, it follows for $x = \sum_{k=1}^{n} \alpha_k e^k$ that

$$\|Ax\|_W = \left\| A \sum_{k=1}^{n} \alpha_k e^k \right\|_W \le \sum_{k=1}^{n} |\alpha_k| \, \|Ae^k\|_W \le \left( \sum_{k=1}^{n} |\alpha_k| \right) \max_k \|Ae^k\|_W$$

Since the vectors $e^k$'s are linearly independent, it follows from Lemma 1.1 in Chapter 3 (on linear combination in normed spaces) (see also Kreyszig pp. 72-73) that ∃c ∈ (0, ∞) such that

$$\left\| \sum_{k=1}^{n} \alpha_k e^k \right\|_V \ge c \sum_{k=1}^{n} |\alpha_k| \quad \text{for every choice of } \alpha_k\text{'s}$$

Therefore, $\|Ax\|_W \le \frac{1}{c} \max_k \|Ae^k\|_W \, \|x\|_V$ ⇒ A is bounded. Then, by Theorem 1.1, A is continuous.

Corollary 1.2. (Continuity of a linear transformation at a point) Let V and W be normed vector spaces over the same field and let a linear transformation A ∈ L(V, W ) be continuous at a point y ∈ V. Then, A is bounded on V.

Proof. Let $\{x^k\}$ be a convergent sequence in V such that $x^k \to x \in V$. Then, $\|Ax^k - Ax\|_W = \|A(x^k - x)\|_W = \|A(x^k - x + y) - Ay\|_W$. Then, as k → ∞, it follows that $(x^k - x + y) \to y$. Since A is continuous at y ∈ V, it follows that $A(x^k - x + y) \to Ay$ as k → ∞. Therefore, $Ax^k \to Ax$. Hence, A is continuous on V, implying that A is bounded on V.

Corollary 1.3. Let A ∈ L(V, W) be bounded. Let $\{x^k\}$ be a sequence in V that converges to x ∈ V. Then

(i) The sequence {Axk} in W converges to Ax ∈ W.

(ii) The null space N (A) is closed in V.

Proof. Part (i) Since A is bounded, A is continuous by Theorem 1.1, and the image of a convergent sequence under a continuous mapping is also a convergent sequence by Theorem 3.7.1 in Naylor & Sell (see p. 74). This is also seen from the following.

$$\|Ax^k - Ax\|_W = \|A(x^k - x)\|_W \le \|A\|_{\mathrm{ind}} \, \|x^k - x\|_V$$

Part (ii) Let $\{x^k\}$ be a sequence in N(A) that converges to x ∈ V. Then, it follows from Part (i) that $Ax^k \to Ax$. Since $Ax^k = 0_W$ ∀k, we conclude that $Ax = 0_W$ ⇒ x ∈ N(A). Therefore, N(A) is closed in V.

Remark 1.3. Let V and W be two normed vector spaces (not necessarily of the same dimension) defined over the same field F. Then, every linear transformation in L(V, W) is bounded if V is finite-dimensional, regardless of whether W is finite-dimensional or not. However, if V is infinite-dimensional, then a linear transformation in L(V, W) may or may not be bounded, regardless of whether W is finite-dimensional or not.

Example 1.4. (Unbounded transformation) Let $P_\infty[0, 1]$ be the space of all real polynomials on [0, 1] with the $L_\infty$-norm as the metric. Let $D \triangleq \frac{d}{dt}$ be a transformation $D : P_\infty[0, 1] \to P_\infty[0, 1]$. It is concluded that D is a linear transformation based on the fact that

$$D(p_1 + \alpha p_2) = \frac{d\big(p_1(t) + \alpha p_2(t)\big)}{dt} = \frac{dp_1(t)}{dt} + \alpha \frac{dp_2(t)}{dt} = Dp_1 + \alpha Dp_2 \quad \forall p_1, p_2 \in P_\infty[0, 1] \ \forall \alpha \in \mathbb{R}$$

Now we show that D is an unbounded transformation. Let $x^k(t) = t^k$ ∀k ∈ N. Then,

$$Dx^k = \frac{d(t^k)}{dt} = k t^{k-1} = k x^{k-1}$$

Therefore, $\|Dx^k\|_{L_\infty} = \|k x^{k-1}\|_{L_\infty} = k\|x^{k-1}\|_{L_\infty}$. Since $\|x^k\|_{L_\infty} = 1$ ∀k ∈ N, it follows that $\|Dx^k\|_{L_\infty} = k\|x^k\|_{L_\infty}$ and there is no upper bound on k ∈ N. Therefore, D is unbounded. It is concluded from Theorem 1.1 that D is a discontinuous transformation. Discontinuity of the derivative operator has been demonstrated earlier from the ε − δ perspective.
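The growth of the ratio ‖Dx‖/‖x‖ in Example 1.4 can be checked numerically. The sketch below represents a polynomial by its coefficient list and approximates the sup norm on [0, 1] by sampling a uniform grid that includes both endpoints; the grid size and the helper names `sup_norm`, `derivative`, and `monomial` are assumptions made for illustration only.

```python
def sup_norm(coeffs, samples=1001):
    """Approximate max |p(t)| on [0, 1] by sampling a uniform grid (endpoints included)."""
    best = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        value = sum(c * t**j for j, c in enumerate(coeffs))
        best = max(best, abs(value))
    return best

def derivative(coeffs):
    """Coefficients of dp/dt for p(t) = sum_j coeffs[j] * t**j."""
    return [j * c for j, c in enumerate(coeffs)][1:]

def monomial(k):
    """Coefficients of the monomial t**k."""
    return [0.0] * k + [1.0]

# Ratio of the norm of the derivative of t**k to the norm of t**k, for k = 1, ..., 5.
ratios = [sup_norm(derivative(monomial(k))) / sup_norm(monomial(k)) for k in range(1, 6)]
print(ratios)
```

For these monomials the maximum is attained at t = 1, which lies on the grid, so the sampled ratios equal k and grow without bound as k increases.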

The notions of boundedness and norm of a functional follow those of a transformation.

Definition 2.1. (Norm of a bounded functional) Let V be a normed vector space over the field F. Then, a functional f : V → F is defined to be bounded if ∃M ∈ (0, ∞) such that $|f(x)| \le M \|x\|_V$ ∀x ∈ V. The norm of a bounded functional f : V → F is defined as: $\|f\| \triangleq \sup_{\|x\|_V = 1} |f(x)|$

Definition 2.2. (Dual space) Let V be a normed vector space over the field F. Then, the dual space of V, denoted as $V^\star$, is the vector space of all linear bounded functionals on V, i.e., $V^\star \triangleq \{f \in L(V, F) : \exists M \in (0, \infty) \text{ such that } |f(x)| \le M\|x\|_V \ \forall x \in V\}$.

Remark 2.1. Every bounded linear functional on a normed space (V, ‖ • ‖) is continuous by Theorem 1.1 based on the fact that every functional is a transformation. However, note that not all linear functionals are bounded, as seen below.

Example 2.3. (An example of a linear unbounded functional) Let us consider a subspace U of the space ℓ∞ over the real field R, in which each sequence has finitely many non-zero elements. Let us define a linear functional f : U → R such that $f(x) = \sum_{k=1}^{N} n_k \xi_{n_k}$, where N ∈ N and the (finitely many) non-zero elements of the sequence x ∈ U are $\xi_{n_1}, \xi_{n_2}, \cdots, \xi_{n_N}$. Although N ∈ N, there is no upper bound on N and hence the linear functional f is unbounded.
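One way to see the unboundedness in Example 2.3 is to evaluate f on the unit sequences that have a single entry equal to 1. The sketch below stores a finitely supported sequence as a dictionary from index to value; this representation and the helper names are assumptions made for illustration.

```python
def f(x):
    """f(x) = sum of (index * entry) over the finitely many nonzero entries of x."""
    return sum(k * xi for k, xi in x.items())

def sup_norm(x):
    """The l-infinity norm of a finitely supported sequence."""
    return max((abs(xi) for xi in x.values()), default=0.0)

def unit_sequence(n):
    """Sequence with 1 in position n and 0 elsewhere."""
    return {n: 1.0}

# |f(e_n)| / ||e_n|| for a few indices n.
ratios = [abs(f(unit_sequence(n))) / sup_norm(unit_sequence(n)) for n in (1, 10, 100, 1000)]
print(ratios)
```

Each unit sequence has norm 1 while f takes the value n on it, so no single constant M can satisfy |f(x)| ≤ M‖x‖ on all of U.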

Theorem 2.1. (Completeness of dual spaces) Let V be a normed space over a (complete) field F. Then, its dual space $V^\star$ is a Banach space.

Proof. Let $\{z^k\}$ be a Cauchy sequence in $V^\star$. For any x ∈ V, $\{z^k(x)\}$ is a Cauchy sequence of scalars because $|z^k(x) - z^\ell(x)| \le \|z^k - z^\ell\| \, \|x\|_V$ and $\|z^k - z^\ell\| \to 0$ as k, ℓ → ∞. Since the field F is complete, the Cauchy sequence $\{z^k(x)\}$ of scalars converges to a scalar z(x) ∈ F, that is, $z^k(x) \to z(x)$ ∀x ∈ V. We need to show that the functional z is linear and bounded. Linearity of z is established as follows.

$$z(\alpha x + \beta y) = \lim_{k \to \infty} z^k(\alpha x + \beta y) = \lim_{k \to \infty} \big( \alpha z^k(x) + \beta z^k(y) \big) = \alpha z(x) + \beta z(y)$$

Since a Cauchy sequence is bounded, ∃M ∈ (0, ∞) such that $\|z^k\| \le M$ ∀k, and hence $|z(x)| = \lim_{k \to \infty} |z^k(x)| \le M \|x\|_V$ ∀x ∈ V, i.e., z is bounded and therefore continuous. So, $z \in V^\star$. Furthermore, letting ℓ → ∞ in $|z^k(x) - z^\ell(x)| \le \|z^k - z^\ell\| \, \|x\|_V$ shows that $\|z^k - z\| \to 0$ as k → ∞. Hence, $V^\star$ is a Banach space.

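Theorem 2.2. (Dual space of $\mathbb{R}^n$) The dual space $(\mathbb{R}^n)^\star$ is isometrically isomorphic to $\mathbb{R}^n$ with Euclidean norm.

Proof. Let $x = [\xi_1 \cdots \xi_n]^T \in \mathbb{R}^n$, where n ∈ N, and let $\|x\| \triangleq \left( \sum_{k=1}^{n} |\xi_k|^2 \right)^{1/2}$. Let $f \in (\mathbb{R}^n)^\star$ be expressed as $f(x) = \sum_{k=1}^{n} \eta_k \xi_k$, where $\eta_k \in \mathbb{R}$, which is a linear combination of $\xi_k$'s. Therefore, f is linear. Furthermore, f is bounded because

$$|f(x)| = \left| \sum_{k=1}^{n} \eta_k \xi_k \right| \le \left( \sum_{k=1}^{n} |\eta_k|^2 \right)^{1/2} \left( \sum_{k=1}^{n} |\xi_k|^2 \right)^{1/2} = \left( \sum_{k=1}^{n} |\eta_k|^2 \right)^{1/2} \|x\| < \infty$$

If we choose $x = [\eta_1 \cdots \eta_n]^T$, then $|f(x)| = \sum_{k=1}^{n} |\eta_k|^2$ by equality in the Cauchy-Schwarz sense. That is, $\|f\| = \left( \sum_{k=1}^{n} |\eta_k|^2 \right)^{1/2}$ ⇒ $f(x) = y^T x$, where $y = [\eta_1 \cdots \eta_n]^T \in \mathbb{R}^n$.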
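The norm identity in the proof of Theorem 2.2 can be checked on a small case. The sketch below takes a hypothetical representer y = (3, 4) in R^2 (an assumption for illustration), evaluates f at the unit vector y/‖y‖, and compares the result with the values of |f| on a sample of other unit vectors.

```python
import math

def f(y, x):
    """Linear functional f(x) = y^T x."""
    return sum(a * b for a, b in zip(y, x))

def euclid(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))

y = [3.0, 4.0]                                # hypothetical representer, ||y|| = 5
x_star = [v / euclid(y) for v in y]           # unit vector aligned with y
attained = f(y, x_star)

# |f| on 360 unit vectors spread around the unit circle.
sampled = max(
    abs(f(y, [math.cos(2 * math.pi * i / 360), math.sin(2 * math.pi * i / 360)]))
    for i in range(360)
)
print(attained, sampled)
```

The value attained at y/‖y‖ matches ‖y‖, and none of the sampled unit vectors exceeds it, which is consistent with ‖f‖ = ‖y‖.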

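Theorem 2.3. (Dual Space of $\ell_p$) Let p ∈ (1, ∞) and q be its conjugate, i.e., $\frac{1}{p} + \frac{1}{q} = 1$. Then, the dual space $\ell_p^\star$ is isometrically isomorphic to $\ell_q$.

Proof. Let $\{e^k\}$ be a Schauder basis for $\ell_p$, where $e^k \triangleq \{\delta_{kj}\}$. Then, every $x \in \ell_p$ has a unique representation $x = \sum_{k=1}^{\infty} \xi_k e^k$, where $x \triangleq \{\xi_1 \ \xi_2 \ \xi_3 \cdots\}$. Let $f \in \ell_p^\star$. Since f is linear and bounded, it follows that

$$f(x) = f\left( \sum_{k=1}^{\infty} \xi_k e^k \right) = \sum_{k=1}^{\infty} \xi_k f(e^k) = \sum_{k=1}^{\infty} \xi_k \eta_k$$

by defining $\eta_k \triangleq f(e^k)$. Let us denote $y \triangleq \{\eta_1 \ \eta_2 \ \eta_3 \cdots\}$. Let a sequence $\{x^n\}$ in $\ell_p$ be defined as $x^n \triangleq \{\xi_k^n\}$, i.e., $\sum_{k=1}^{\infty} |\xi_k^n|^p < \infty$, such that

$$\xi_k^n = \begin{cases} \dfrac{|\eta_k|^q}{\eta_k} & \text{if } k \le n \text{ and } |\eta_k| > 0 \\ 0 & \text{if } k > n \text{ or } |\eta_k| = 0 \end{cases}$$

It follows from the constraint $\frac{1}{p} + \frac{1}{q} = 1$ that (q − 1)p = q. By substituting the expression for $\xi_k^n$ in f(x), it follows that

$$f(x^n) = \sum_{j=1}^{n} |\eta_j|^q$$

and, from the property of the induced norm ‖f‖,

$$f(x^n) \le \|f\| \, \|x^n\| = \|f\| \left( \sum_{k=1}^{\infty} |\xi_k^n|^p \right)^{1/p} = \|f\| \left( \sum_{k=1}^{n} |\eta_k|^{(q-1)p} \right)^{1/p} = \|f\| \left( \sum_{k=1}^{n} |\eta_k|^q \right)^{1/p}$$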
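The construction of the test sequence in this proof can be tried on a finite example. The sketch below assumes p = 3 (so q = 1.5) and a hypothetical finitely supported η; it builds ξ_k = |η_k|^q / η_k, evaluates f on it, and compares f(x)/‖x‖_p with ‖η‖_q. Dividing the last inequality above by $(\sum |\eta_k|^q)^{1/p}$ is what suggests this comparison, since 1 − 1/p = 1/q.

```python
def lp_norm(x, p):
    """The l_p norm of a finite list of reals."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)                  # conjugate exponent, 1/p + 1/q = 1

eta = [2.0, -1.0, 0.0, 0.5]        # hypothetical values eta_k = f(e^k)

# xi_k = |eta_k|^q / eta_k where eta_k is nonzero, and 0 otherwise.
xi = [abs(e) ** q / e if e != 0 else 0.0 for e in eta]

f_x = sum(a * b for a, b in zip(xi, eta))      # f(x) = sum of xi_k * eta_k
ratio = f_x / lp_norm(xi, p)
print(ratio, lp_norm(eta, q))
```

On this example the ratio f(x)/‖x‖_p agrees with ‖η‖_q up to rounding, so this choice of x attains the bound that the dual norm must respect.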

by defining $\eta_j \triangleq f(e^j)$, which are uniquely determined by f. Let us denote $y \triangleq \{\eta_1 \ \eta_2 \ \eta_3 \cdots\}$. Since $\|e^k\|_{\ell_1} = 1$, it follows that

$$|\eta_k| = |f(e^k)| \le \|f\| \ \Rightarrow\ \sup_k |\eta_k| \le \|f\| \ \text{and} \ y \in \ell_\infty$$

To establish the equality $\|y\|_{\ell_\infty} = \sup_k |\eta_k| = \|f\|$, it is necessary to show that $\sup_k |\eta_k| \ge \|f\|$. To this end,

$$|f(x)| = \left| \sum_{k=1}^{\infty} \xi_k \eta_k \right| \le \|x\|_{\ell_1} \sup_k |\eta_k|$$

Therefore, ∀x ≠ 0, $\frac{|f(x)|}{\|x\|_{\ell_1}} \le \sup_k |\eta_k| = \|y\|_{\ell_\infty}$, which implies $\|f\| \le \|y\|_{\ell_\infty}$. Hence, by combining the inequalities, it follows that $\|f\| = \|y\|_{\ell_\infty}$. The mapping $\ell_\infty \to \ell_1^\star$, defined by $y \mapsto f$, is linear and surjective, and the linear span of the vectors in the Schauder basis $\{e^k\}$ is dense in $\ell_1$; furthermore, this mapping is norm-preserving. Therefore, the dual space $\ell_1^\star$ is isometrically isomorphic to $\ell_\infty$.

Theorem 2.5. (Dual Space of $c_0$) The dual space $c_0^\star$ is isometrically isomorphic to $\ell_1$.

Proof. It is known that $c_0$ is a closed subspace of the complete space $\ell_\infty$ and $c_0$ is complete relative to the metric induced by the norm $\| \bullet \|_{\ell_\infty}$. Let $f \in c_0^\star$. Since f is linear and bounded, it follows that

$$f(x) = f\left( \sum_{k=1}^{\infty} \xi_k e^k \right) = \sum_{k=1}^{\infty} \xi_k f(e^k) = \sum_{k=1}^{\infty} \xi_k \eta_k$$

by defining $\eta_j \triangleq f(e^j)$, which are uniquely determined by f. Let us denote $y \triangleq \{\eta_1 \ \eta_2 \ \eta_3 \cdots\}$. Then,

$$\|f\| \triangleq \sup_{\|x\| = 1} \left| \sum_{k=1}^{\infty} \eta_k \xi_k \right| \le \sum_{k=1}^{\infty} |\eta_k| < \infty$$

Hence, $y = \{\eta_k\} \in \ell_1$ and $\|f\| \le \|y\|_{\ell_1}$. Next we establish the equality, which is trivial if $y = 0_{\ell_1}$, implying that $f = 0_{c_0^\star}$. So, we assume that $y \neq 0_{\ell_1}$.

Given ε > 0 ∃ n ∈ N such that

$$\|y\| - \frac{\epsilon}{2} < \sum_{k=1}^{n} |\eta_k| = f(z) \le \|f\|$$

where the vector $z \in c_0$ has all zero coordinates after the $n$th and, for j = 1, 2, ⋯, n, $z_j = \frac{|y_j|}{y_j}$ if $y_j \neq 0$ and $z_j = 0$ if $y_j = 0$. As ε → 0, n → ∞, and hence the equality $\|f\| = \|y\|_{\ell_1}$ is established. Bijectivity between $\ell_1$ and $c_0^\star$ is established in the same way as between $\ell_q$ and $\ell_p^\star$ in Theorem 2.3.

Remark 2.2. The dual space $\ell_\infty^\star$ of $\ell_\infty$ is a very abstract object that is not encountered in the engineering discipline; it may occasionally come up in analytic number theory. Note that $\ell_\infty^\star \neq \ell_1$.

Theorem 2.6. (Riesz-Fréchet Theorem, also called Riesz Representation Theorem) Let H be a Hilbert space over the (complete) field F and $H^\star$ be its dual space. Then, every vector in $H^\star$ uniquely identifies a vector in H, i.e.,

$\forall f \in H^\star$ ∃ a unique y ∈ H such that $f(x) = \langle x, y \rangle_H$ ∀x ∈ H, and $\|f\|_{\mathrm{ind}} = \|y\|_H$

(See Naylor & Sell, p. 345.)

Proof. If $f = 0_{H^\star}$, i.e., if $f(x) = \langle x, y \rangle_H = 0$ ∀x ∈ H, then $y = 0_H$. Therefore, we assume $f \neq 0_{H^\star}$, i.e., ∃ x ∈ H such that $f(x) = \langle x, y \rangle_H \neq 0$ for some $y \neq 0_H$. Then, the null space $N(f) \triangleq \{x \in H : f(x) = 0\}$ is a proper closed subspace of H, i.e., $N(f) \oplus N^\perp(f) = H$ and $\dim\big(N^\perp(f)\big) = 1$ because f : H → F. Now, the orthogonal projection of x ∈ H onto the one-dimensional space $N^\perp(f)$ is f(x)z for some $z \in N^\perp(f)$ where $z \neq 0_H$, which implies that

$$\big( x - f(x)z \big) \in N(f), \ \text{i.e.,} \ \big\langle x - f(x)z, \, z \big\rangle_H = 0 \ \Rightarrow\ f(x) = \Big\langle x, \frac{z}{\|z\|^2} \Big\rangle_H$$

In other words, the projection of x onto $N^\perp(f)$ is $f(x)z = \langle x, u \rangle u$ where $u \triangleq \frac{z}{\|z\|}$ is the unique unit vector in the one-dimensional space $N^\perp(f)$. Notice that the vector u that spans the space $N^\perp(f)$ is independent of the choice of x; however, u is dependent on the choice of f. By setting $y \triangleq \frac{z}{\|z\|^2}$, we have $f(x) = \langle x, y \rangle_H$ ∀x ∈ H. Thus, existence of y ∈ H such that $f(x) = \langle x, y \rangle_H$ ∀x ∈ H is established. To show uniqueness of y, let there exist $\tilde{y} \in H$ such that $f(x) = \langle x, \tilde{y} \rangle_H$ ∀x ∈ H. Then, $\langle x, y \rangle_H - \langle x, \tilde{y} \rangle_H = f(x) - f(x) = 0$ ⇒ $\langle x, (y - \tilde{y}) \rangle_H = 0$ ∀x ∈ H ⇒ $y = \tilde{y}$. Thus, uniqueness of y ∈ H is established.
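A finite-dimensional illustration of the representer y may help. The sketch below assumes the Hilbert space R^3 with a weighted inner product ⟨x, y⟩ = Σ w_k x_k y_k and a functional f(x) = Σ c_k x_k; the weights w, the coefficients c, and the helper names are hypothetical choices. In this setting the representer is y_k = c_k / w_k, and the induced norm of f equals ‖y‖_H.

```python
import math

w = [1.0, 2.0, 4.0]        # hypothetical positive weights defining the inner product
c = [3.0, -2.0, 8.0]       # hypothetical coefficients of the functional f

def inner(x, y):
    """Weighted inner product <x, y> = sum of w_k * x_k * y_k."""
    return sum(wk * xk * yk for wk, xk, yk in zip(w, x, y))

def f(x):
    """Bounded linear functional f(x) = sum of c_k * x_k."""
    return sum(ck * xk for ck, xk in zip(c, x))

# Representer: f(x) = <x, y> for all x requires w_k * y_k = c_k.
y = [ck / wk for ck, wk in zip(c, w)]
norm_y = math.sqrt(inner(y, y))

# f evaluated at the unit vector u = y / ||y|| attains ||y||.
u = [yk / norm_y for yk in y]
print(y, norm_y, f(u))
```

The value f(u) equals ‖y‖_H on this example, matching the statement ‖f‖_ind = ‖y‖_H of the theorem.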

Definition 3.3. A totally ordered (also called linearly ordered) set or a chain is a partially ordered set such that every pair of elements in the set is comparable. In other words, a chain is a partially ordered set having no incomparable elements.

Definition 3.4. Let (P, ≼) be a partially ordered set. Then, Q is a maximal totally ordered subset of P if (i) Q ⊆ P, (ii) (Q, ≼) is totally ordered, and (iii) if any member of P not in Q is adjoined to Q, then the resulting collection is no longer totally ordered by ≼.

Remark 3.1. Every subset of a nonempty partially ordered set that consists of a single element is totally ordered.

Definition 3.5. Let S be a partially ordered set. An upper bound of W ⊆ S is an element α ∈ S such that

θ ≼ α ∀ θ ∈ W (1)

A lower bound of W ⊆ S is an element β ∈ S such that

β ≼ θ ∀ θ ∈ W (2)

Depending on S and W, an upper bound or a lower bound of W may or may not exist.

Definition 3.6. Let (S, ≼) be a partially ordered set. An element α ∈ S is called a maximal element of S if θ ≼ α for every θ ∈ S which is comparable to α. In other words,

If θ ∈ S, then (α ≼ θ) ⇒ (α = θ) (3)

Similarly, a minimal element of S is an element β ∈ S such that

If θ ∈ S, then (θ ≼ β) ⇒ (β = θ) (4)

A partially ordered set S may or may not have a maximal element or a minimal element. Furthermore, a maximal element need not be an upper bound. Similarly, a minimal element need not be a lower bound.

Example 3.1. Let S = (0, 1) ⊂ R; then, (S, ≤) is a totally ordered set that has no maximal element and no minimal element. However, 1 ∈ R is an upper bound of S; similarly, 0 ∈ R is a lower bound of S. As a matter of fact, 1 is the least upper bound of S and 0 is the greatest lower bound of S.

Example 3.2. Let S be the set of all points (x, y) in the plane $\mathbb{R}^2$ with y ≤ 0. Let us define an ordering ≼ on S as

$$(x, y) \preceq (\tilde{x}, \tilde{y}) \ \Leftrightarrow\ (x = \tilde{x}) \ \text{and} \ (y \le \tilde{y})$$

Then, the partially ordered set (S, ≼) has infinitely many maximal elements.
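The distinction between maximal elements and upper bounds can be explored on a finite subset of the set in Example 3.2. The sketch below assumes the ordering relates two points only when their first coordinates agree, and restricts S to a hypothetical 3 × 3 grid of points; the helper names are illustrative.

```python
def precedes(a, b):
    """(x, y) precedes (x~, y~) when x = x~ and y <= y~ (assumed ordering)."""
    return a[0] == b[0] and a[1] <= b[1]

def maximal_elements(S, leq):
    """Elements a such that leq(a, b) forces b == a (Definition 3.6)."""
    return [a for a in S if all(not leq(a, b) or a == b for b in S)]

def upper_bounds(S, leq):
    """Elements of S that dominate every element of S (Definition 3.5 with W = S)."""
    return [a for a in S if all(leq(b, a) for b in S)]

# Hypothetical finite sample of the half-plane y <= 0.
S = [(x, y) for x in (0, 1, 2) for y in (-2, -1, 0)]

maxima = maximal_elements(S, precedes)
bounds = upper_bounds(S, precedes)
print(maxima, bounds)
```

Each vertical line contributes its own maximal element (x, 0), yet no element of S is an upper bound of S because points with different x are incomparable.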

Zorn’s lemma: Let S ≠ ∅ be a partially ordered set such that every chain T ⊆ S has an upper bound. Then, S has at least one maximal element.

Hausdorff Maximality Theorem: Every (nonempty) partially ordered set contains a maximal totally ordered subset. In other words, if S is a maximal totally ordered subset of a (nonempty) partially ordered set X and if T is a totally ordered subset of X, then

S ⊆ T ⊆ X ⇒ S = T

Axiom of Choice: Let S ≠ ∅ be a set, let I ≠ ∅ be an index set, and let $\{S_\alpha : \alpha \in I\}$ be a family of nonempty subsets $S_\alpha \subseteq S$. Then, there exists a mapping, called the choice function, f : I → S such that $f(\alpha) \in S_\alpha$ ∀α ∈ I. That is, for every family of nonempty sets, there exists a choice function. The axiom of choice can also be stated as: The product of a family of nonempty sets indexed by a nonempty set is nonempty.

Remark 3.2. Zorn’s Lemma and Hausdorff Maximality Theorem are equivalent and they are also equivalent to Axiom of Choice. For details, see Appendix, pp. 392-393, on Hausdorff Maximality Theorem in Real and Complex Analysis by Rudin and p. 13 in Algebra by Thomas Hungerford.

Let us illustrate a simple application of Zorn’s lemma. We first make the following assertions:

  • V is a vector space and A is a set of linearly independent vectors belonging to V.
  • X is the collection of all linearly independent sets of vectors in V such that A is a subset of each member in X.
  • ⊆ is a partial ordering on X.
  • H is a Hamel basis of V such that A ⊆ H.
  • I is a non-empty index set and Y = {Bi : i ∈ I} is a chain of X.
  • $B = \bigcup_{i \in I} B_i$

It follows that the sets in the chain Y can be ordered as: $B_{i_1} \subseteq B_{i_2} \subseteq \cdots \subseteq B_{i_n} \subseteq \cdots$ and Y has an upper bound B. Since Y can be arbitrarily chosen, X has a maximal element H by Zorn’s lemma.

For any chain H ⊆ E, let us define a linear functional g̃ ∈ E as:

$$D(\tilde{g}) = \bigcup_{g \in H} D(g) \ \text{and} \ \tilde{g}(x) = g(x) \ \text{if} \ x \in D(g) \qquad (5)$$

Note that, for an $x \in D(g_1) \cap D(g_2)$ with $g_1, g_2 \in H$, we have $g_1(x) = g_2(x)$ because H is a chain so that $g_1 \le g_2$ or $g_2 \le g_1$. Then, g ≤ g̃ for all g ∈ H. Hence, H has an upper bound. Since selection of H ⊆ E is arbitrary, Zorn’s lemma implies that E has a maximal element; let us call this maximal element fext. By definition, fext is a linear extension of f that satisfies the condition:

fext(x) ≤ p(x) ∀x ∈ D(fext) (6)

Step 2 : Now we prove, by contradiction, that D(fext) spans the entire vector space V. Let us assume that the assertion is false, i.e., D(fext) is a proper subset of V. Then, there exists z ∈ (V \ D(fext)) and z ≠ 0 because 0 ∈ D(fext). Let the subspace W be spanned by D(fext) and the vector z. Thus, any x ∈ W can be expressed as:

x = y + αz where y ∈ D(fext) and α is a scalar (7)

The above representation is unique because y ∈ D(fext) and z ∈ (V \ D(fext)).

A linear functional g on W is defined by

g(y + αz) = fext(y) + αc where g(z) = c ∈ R (8)

Note that g is a proper extension of fext, i.e., D(fext) is a proper subset of D(g), because if α = 0, then g(y) = fext(y) ∀y ∈ D(fext). Consequently, if it is proven that g ∈ E by showing that g(x) ≤ p(x) ∀x ∈ D(g), then this will contradict maximality of fext so that the assertion D(fext) ≠ V is false, i.e., the truth of the statement D(fext) = V is established.

Step 3 : We will show that g with a suitable real constant value of c in Eq. (8) satisfies the condition g(x) ≤ p(x) ∀x ∈ D(g). In this step, the fixed vector adjoined to D(fext) in Step 2 is denoted by w, i.e., w ∈ (V \ D(fext)), x = y + αw and g(w) = c, whereas y and z denote arbitrary vectors in D(fext). Since p is a subadditive functional and the linear functional fext ≤ p,

fext(y) − fext(z) = fext(y − z) ≤ p(y − z) = p(y + w − w − z) ≤ p(y + w) + p(−w − z) (9)

Taking the last term to the left and the term fext(y) to the right in Eq. (9), we have

−p(−w − z) − fext(z) ≤ p(y + w) − fext(y) (10)

Since y does not appear on the left and z does not appear on the right, the inequality in Eq. (10) continues to hold if the supremum, m, is taken over z ∈ D(fext) on the left and the infimum, M, over y ∈ D(fext) on the right. Therefore, with the constant c in Eq. (8) being in the closed interval [m, M], it follows from Eq. (10) that

−p(−w − z) − fext(z) ≤ c ∀z ∈ D(fext) (11)

c ≤ p(y + w) − fext(y) ∀y ∈ D(fext) (12)

For α = 0, we already have x ∈ D(fext). Let us first prove g(x) ≤ p(x) for α < 0 in Eq. (8). Replacing z in Eq. (11) by $\alpha^{-1} y$ and multiplying both sides by the positive quantity −α yields:

$\alpha \, p(-w - \alpha^{-1} y)$ + fext(y) ≤ −αc (13)

From Eqs. (8) and (13), using x = y + αw yields:

g(x) = fext(y) + αc ≤ $-\alpha \, p(-w - \alpha^{-1} y)$ = p(αw + y) = p(x) (14)

For α > 0, let us replace y in Eq. (12) by $\alpha^{-1} y$ to obtain:

c ≤ $p(\alpha^{-1} y + w)$ − fext($\alpha^{-1} y$) (15)

Multiplication of Eq. (15) by α yields

αc ≤ $\alpha \, p(\alpha^{-1} y + w)$ − α fext($\alpha^{-1} y$) = p(x) − fext(y) (16)

A combination of Eq. (16) with Eq. (8) yields:

g(x) = fext(y) + αc ≤ p(x) (17)
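The role of the interval [m, M] in the one-step extension can be seen on a small case. The sketch below assumes V = R^2, p equal to the ℓ1 norm, a functional defined on the subspace {(a, 0)} by (a, 0) ↦ a, and the adjoined vector w = (0, 1); all of these choices, the integer sampling grid, and the helper names are illustrative assumptions. The bounds m and M are computed as the sampled supremum and infimum of the two sides of Eq. (10), and the domination condition g ≤ p is then tested for several values of c.

```python
def p(v):
    """Sublinear functional: the l1 norm on R^2."""
    return abs(v[0]) + abs(v[1])

def f_ext(a):
    """Functional on the subspace {(a, 0)}: the point (a, 0) is mapped to a."""
    return a

w = (0, 1)                  # vector adjoined to the subspace
grid = range(-5, 6)         # integer samples of the coordinates

# m: sampled supremum of the left side of Eq. (10) over z = (a, 0);
# M: sampled infimum of the right side of Eq. (10) over y = (a, 0).
m = max(-p((-w[0] - a, -w[1])) - f_ext(a) for a in grid)
M = min(p((a + w[0], w[1])) - f_ext(a) for a in grid)

def dominated(c):
    """Check g(y + alpha*w) = f_ext(y) + alpha*c <= p(y + alpha*w) on the grid."""
    return all(a + alpha * c <= p((a, alpha)) for a in grid for alpha in grid)

print(m, M, dominated(0), dominated(1), dominated(1.5))
```

On this example, values of c inside [m, M] give an extension dominated by p at every sampled point, while a value outside the interval, such as c = 1.5, violates g ≤ p at the adjoined vector itself.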

Remark 3.4. In some cases (e.g., finite-dimensional and separable Hilbert spaces), it is possible to prove Hahn-Banach Theorem without using Zorn’s Lemma (see Chapter 5, p. 111 in Optimization by Vector Space Methods by Luenberger).

Task 1 holds from the fact that, for any complex scalar a + ib, the following relation holds based on Eq. (20):

$$\begin{aligned} f_{\mathrm{ext}}\big((a + ib)x\big) &= f_{\mathrm{ext}}^{\mathrm{real}}(ax + ibx) - i f_{\mathrm{ext}}^{\mathrm{real}}(iax - bx) \\ &= a f_{\mathrm{ext}}^{\mathrm{real}}(x) + b f_{\mathrm{ext}}^{\mathrm{real}}(ix) - i\big[a f_{\mathrm{ext}}^{\mathrm{real}}(ix) - b f_{\mathrm{ext}}^{\mathrm{real}}(x)\big] \\ &= (a + ib)\big[f_{\mathrm{ext}}^{\mathrm{real}}(x) - i f_{\mathrm{ext}}^{\mathrm{real}}(ix)\big] \\ &= (a + ib) f_{\mathrm{ext}}(x) \end{aligned}$$

Now we prove Task 2. If fext(x) = 0, then the inequality holds because p(x) ≥ 0 ∀x ∈ V. Let x ≠ 0 be such that fext(x) ≠ 0. Using the polar notation, fext(x) = |fext(x)| exp(iθ) ⇒ |fext(x)| = exp(−iθ)fext(x). Since |fext(x)| is real, the absolute homogeneity property of the sublinear functional p yields

$$|f_{\mathrm{ext}}(x)| = f_{\mathrm{ext}}^{\mathrm{real}}\big(\exp(-i\theta)x\big) \le p\big(\exp(-i\theta)x\big) = |\exp(-i\theta)| \, p(x) = p(x)$$

The proof is thus complete.

Further details are available in Real and Complex Analysis by Rudin (see Chapter 5, p. 105).

4 Applications of Hahn-Banach Theorem to Bounded Linear Functionals

Theorem 4.1. (Hahn-Banach Theorem: Normed Spaces) Let f be a bounded linear functional on a subspace U of a vector space V , defined on the real field R or the complex field C. Then, there exists a bounded linear functional fext on V , which is an extension of f to V having the same norm,

||fext|| = ||f || (21)

Proof. If U = {0}, then f = 0 and consequently fext = 0. Let f ≠ 0. Since we will use Theorem 3.2 to prove this theorem, we must first find an appropriate sublinear functional p. We have

|f(x)| ≤ ||f||U ||x|| ∀x ∈ U

where we select p(x) = ||f||U ||x|| (see Remark 3.3). Using Theorem 3.2, it follows that there exists a linear functional fext, which is an extension of f and satisfies the condition:

|fext(x)| ≤ p(x) = ||f||U ||x|| ∀x ∈ V

Taking the supremum over all unit-norm x ∈ V, we obtain the inequality:

||fext||V = sup||x||=1 |fext(x)| ≤ ||f ||U (22)

Since a norm cannot decrease under extension, we claim that

||fext||V ≥ ||f ||U (23)

A combination of Eqs. (22) and (23) proves the theorem.

Corollary 4.1. Let V be a normed space and let x^0 ≠ 0 be an arbitrary vector in V. Then, there exists a bounded linear functional g on V such that ||g|| = 1 and g(x^0) = ||x^0||V.

Proof. Let U be the subspace spanned by the vector x^0. Let us define a linear functional f on U as f (αx^0 ) = αf (x^0 ) = α||x^0 ||, where α is a scalar. Then, f is bounded and ||f || = 1 because if x = αx^0 , then

|f (x)| = |f (αx^0 )| = |α|||x^0 || = ||αx^0 || = ||x||

Then, Theorem 4.1 implies that f has a linear extension fext from U to V of norm ||fext|| = ||f|| = 1, and fext(x^0) = f(x^0) = ||x^0||. Setting g = fext completes the proof.
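In a concrete normed space the functional of Corollary 4.1 can often be written down directly. The sketch below assumes V = R^3 with the ℓ1 norm and a hypothetical vector x^0 = (3, −4, 0); the candidate functional g(x) = Σ sign(x^0_k) x_k is an illustrative choice, and the sample vectors used to probe |g(x)| ≤ ‖x‖ are arbitrary.

```python
def l1_norm(x):
    """The l1 norm on R^n."""
    return sum(abs(v) for v in x)

def sign(v):
    """Sign of a real number: -1, 0, or 1."""
    return (v > 0) - (v < 0)

x0 = [3.0, -4.0, 0.0]               # hypothetical nonzero vector
s = [sign(v) for v in x0]           # coefficients of the candidate functional

def g(x):
    """Candidate norming functional g(x) = sum of sign(x0_k) * x_k."""
    return sum(sk * xk for sk, xk in zip(s, x))

# Arbitrary sample vectors used to probe the bound |g(x)| <= ||x||.
samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
           [-2.0, 5.0, 1.0], [0.5, -0.5, 3.0]]
ratios = [abs(g(x)) / l1_norm(x) for x in samples]
print(g(x0), l1_norm(x0), max(ratios))
```

Here g(x^0) equals ‖x^0‖, and the sampled ratios |g(x)|/‖x‖ do not exceed 1, in line with ‖g‖ = 1.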

Corollary 4.2. Let V be a normed vector space and $V^\star$ be its dual space. Then, every x ∈ V has the following property:

$$\|x\|_V = \sup_{\|f\| = 1} |f(x)| \qquad (24)$$

and if x^0 ∈ V is such that f(x^0) = 0 for all $f \in V^\star$, then x^0 = 0.

Proof. By replacing x^0 by x in Corollary 4.1, it follows that

$$\sup_{f \in V^\star \setminus \{0\}} \frac{|f(x)|}{\|f\|} \ge \frac{|f_{\mathrm{ext}}(x)|}{\|f_{\mathrm{ext}}\|} = \|x\|$$

and the proof follows from the fact that |f(x)| ≤ ||f|| ||x||.