Matrix and Vector Review: Matrices, Vectors, Probability, and Linear Systems, Study notes of Microelectronic Circuits

A brief review of matrices and vectors, probability and random variables, and linear systems. It covers topics such as matrices and vectors, eigenvalues and eigenvectors, probability and random variables, and linear transformations of random vectors.

Typology: Study notes

Pre 2010

Uploaded on 07/31/2009

koofers-user-0eh


Digital Image Processing

Second Edition

Review Material

Rafael C. Gonzalez

Richard E. Woods

Prentice Hall

Upper Saddle River, NJ 07458

www.prenhall.com/gonzalezwoods

or

www.imageprocessingbook.com


Revision history 10 9 8 7 6 5 4 3 2 1 Copyright © 1992-2002 by Rafael C. Gonzalez and Richard E. Woods


1 A Brief Review of Matrices and Vectors

The purpose of this short document is to provide the reader with background sufficient to follow the discussions in Digital Image Processing, 2nd ed., by Gonzalez and Woods. The notation is the same as that used in the book.

1.1 Matrices

Introductory Definitions

We begin with the definition of a matrix. An $m \times n$ (read "m by n") matrix, denoted by $\mathbf{A}$, is a rectangular array of entries or elements (numbers, or symbols representing numbers), typically enclosed by square brackets. In this notation, $m$ is the number of horizontal rows and $n$ the number of vertical columns in the array. Sometimes $m$ and $n$ are referred to as the dimensions or order of the matrix, and we say that matrix $\mathbf{A}$ has dimensions $m$ by $n$ or is of order $m$ by $n$. We use the following notation to represent an $m \times n$ matrix $\mathbf{A}$:

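$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

where $a_{ij}$ represents the $(i, j)$-th entry.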

If $m = n$, then $\mathbf{A}$ is a square matrix. If $\mathbf{A}$ is square and $a_{ij} = 0$ for all $i \neq j$, and not all $a_{ii}$ are zero, the matrix is said to be diagonal. In other words, a diagonal matrix is a square matrix in which all elements not on the main diagonal are zero. A diagonal matrix in which all diagonal elements are equal to 1 is called the identity matrix, typically denoted by $\mathbf{I}$. A matrix in which all elements are 0 is called the zero or null matrix, typically denoted by $\mathbf{0}$. The trace of a matrix $\mathbf{A}$ (not necessarily diagonal), denoted $\mathrm{tr}(\mathbf{A})$, is the sum of the elements in the main diagonal of $\mathbf{A}$. Two matrices $\mathbf{A}$ and $\mathbf{B}$ are equal if and only if they have the same number of rows and columns, and $a_{ij} = b_{ij}$ for all $i$ and $j$.

The transpose of an $m \times n$ matrix $\mathbf{A}$, denoted $\mathbf{A}^T$, is an $n \times m$ matrix obtained by interchanging the rows and columns of $\mathbf{A}$. That is, the first row of $\mathbf{A}$ becomes the first column of $\mathbf{A}^T$, the second row of $\mathbf{A}$ becomes the second column of $\mathbf{A}^T$, and so on. A square matrix for which $\mathbf{A}^T = \mathbf{A}$ is said to be symmetric.
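The definitions of trace, transpose, and symmetry given so far can be sketched in a few lines of Python. This is only an illustration: matrices are represented as lists of rows, and the helper names `transpose`, `trace`, and `is_symmetric`, as well as the sample matrices, are assumptions made for this sketch rather than notation from the book.

```python
def transpose(matrix):
    """Interchange rows and columns: row i of the input becomes column i."""
    return [list(column) for column in zip(*matrix)]


def trace(matrix):
    """Sum of the main-diagonal elements of a square matrix."""
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("trace requires a square matrix")
    return sum(matrix[i][i] for i in range(len(matrix)))


def is_symmetric(matrix):
    """A square matrix is symmetric when it equals its own transpose."""
    return matrix == transpose(matrix)


# Hypothetical sample matrices, chosen only for illustration.
A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
S = [[1, 7],
     [7, 2]]             # 2 x 2, symmetric

print(transpose(A))      # a 3 x 2 matrix
print(trace(S))
print(is_symmetric(S))
```

Note that transposing twice returns the original matrix, which is a quick sanity check for any implementation of the transpose.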

Any matrix $\mathbf{X}$ for which $\mathbf{XA} = \mathbf{I}$ and $\mathbf{AX} = \mathbf{I}$ is called the inverse of $\mathbf{A}$. Usually, the inverse of $\mathbf{A}$ is denoted $\mathbf{A}^{-1}$. Although numerous procedures exist for computing the inverse of a matrix, in practice one usually uses a computer program for this purpose, so we will not dwell on this topic here. The interested reader can consult any book on matrix theory for extensive theoretical and practical discussions dealing with matrix inverses. A matrix that possesses an inverse in the sense just defined is called a nonsingular matrix.

Associated with matrix inverses is the computation of the determinant of a matrix. Although the determinant is a scalar, its definition is a little more complicated than those discussed in the previous paragraphs. Let $\mathbf{A}$ be an $m \times m$ (square) matrix. The $(i, j)$-minor of $\mathbf{A}$, denoted $M_{ij}$, is the determinant of the $(m-1) \times (m-1)$ matrix formed by deleting the $i$th row and the $j$th column of $\mathbf{A}$. The $(i, j)$-cofactor of $\mathbf{A}$, denoted $C_{ij}$, is $(-1)^{i+j} M_{ij}$. The determinant of a $1 \times 1$ matrix $[\alpha]$, denoted $\det([\alpha])$, is $\det([\alpha]) = \alpha$. Finally, we define the determinant of an $m \times m$ matrix $\mathbf{A}$ as

$$\det(\mathbf{A}) = \sum_{j=1}^{m} a_{1j} C_{1j}.$$

In other words, the determinant of a (square) matrix is the sum of the products of the elements in the first row of the matrix and the cofactors of the first row. As is true of inverses, determinants usually are obtained using a computer.
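The recursive definition above translates almost directly into code. The following is a minimal sketch of first-row cofactor expansion, written for clarity rather than speed (its cost grows factorially with $m$, which is one reason practical software uses other methods). The function names and the sample matrix are assumptions made for illustration; indices in the code are 0-based, so the sign $(-1)^{1+j}$ of the text becomes `(-1) ** j`.

```python
def minor_matrix(matrix, i, j):
    """Return the matrix with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    m = len(matrix)
    if m == 0 or any(len(row) != m for row in matrix):
        raise ValueError("determinant requires a nonempty square matrix")
    if m == 1:
        # Base case: det([alpha]) = alpha.
        return matrix[0][0]
    total = 0
    for j in range(m):
        # Cofactor C_1j = (-1)^(1+j) * M_1j in 1-based notation.
        cofactor = (-1) ** j * det(minor_matrix(matrix, 0, j))
        total += matrix[0][j] * cofactor
    return total


# Hypothetical 3 x 3 matrix used only to exercise the function.
A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
print(det(A))
```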

Basic Matrix Operations

Let $c$ be a real or complex number (often called a scalar). The scalar multiple of scalar $c$ and matrix $\mathbf{A}$, denoted $c\mathbf{A}$, is obtained by multiplying every element of $\mathbf{A}$ by $c$. If $c = -1$, the scalar multiple is called the negative of $\mathbf{A}$.

Assuming that they have the same number of rows and columns, the sum of two matrices A and B, denoted A + B, is the matrix obtained by adding the corresponding elements


$$\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

Keeping in mind the matrix dimensions required for matrix products defined above, the product of $\mathbf{a}$ and $\mathbf{b}$ is a $1 \times 1$ matrix, given by

$$\mathbf{a}^T \mathbf{b} = \mathbf{b}^T \mathbf{a} = a_1 b_1 + a_2 b_2 + \cdots + a_m b_m = \sum_{i=1}^{m} a_i b_i.$$

This particular product is often called the dot or inner product of two vectors. We have much more to say about this in the following section.
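As a small illustration of the inner product formula, the sketch below sums the element-wise products of two equal-length vectors. Vectors are represented as plain Python lists, and the name `inner` and the sample values are assumptions made for this example.

```python
def inner(a, b):
    """Inner (dot) product: sum of a_i * b_i over all components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


# Hypothetical sample vectors.
a = [1, 2, 3]
b = [4, 5, 6]

print(inner(a, b))
print(inner(a, b) == inner(b, a))   # the product is symmetric in a and b
```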

1.2 Vectors and Vector Spaces

Vectors

As introduced in the previous section, we refer to an $m \times 1$ column matrix as a column vector. Such a vector assumes geometric meaning when we associate geometrical properties with its elements. For example, consider the familiar two-dimensional (Euclidean) space in which a point is represented by its $(x, y)$ coordinates. These coordinates can be expressed in terms of a column vector as follows:

$$\mathbf{u} = \begin{bmatrix} x \\ y \end{bmatrix}$$

Then, for example, point $(1, 2)$ becomes the specific vector

$$\mathbf{u} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

Geometrically, we represent this vector as a directed line segment from the origin to point $(1, 2)$. In three-dimensional space the vector would have components $(x, y, z)$. In $m$-dimensional space we run out of letters and use the same symbol with subscripts to represent the elements of a vector. That is, an $m$-dimensional vector is represented as

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$$


When expressed in the form of these column matrices, arithmetic operations between vectors follow the same rules as they do for matrices. The product of a vector by a scalar is obtained simply by multiplying every element of the vector by the scalar. The sum of two vectors $\mathbf{x}$ and $\mathbf{y}$ is formed by the addition of corresponding elements ($x_1 + y_1$, $x_2 + y_2$, and so on), and similarly for subtraction. Multiplication of two vectors is as defined in Example 1. Division of one vector by another is not defined.

Vector Spaces

The definition of a vector space is both intuitive and straightforward. A vector space is defined as a nonempty set $V$ of entities called vectors, together with associated scalars, that satisfy the conditions outlined in A through C below. A vector space is real if the scalars are real numbers; it is complex if the scalars are complex numbers.

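Condition A: There is in $V$ an operation called vector addition, denoted $\mathbf{x} + \mathbf{y}$, that satisfies:

  1. $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$ for all vectors $\mathbf{x}$ and $\mathbf{y}$ in the space.
  2. $\mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}$ for all $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$.
  3. There exists in $V$ a unique vector, called the zero vector, and denoted $\mathbf{0}$, such that $\mathbf{x} + \mathbf{0} = \mathbf{x}$ and $\mathbf{0} + \mathbf{x} = \mathbf{x}$ for all vectors $\mathbf{x}$.
  4. For each vector $\mathbf{x}$ in $V$, there is a unique vector in $V$, called the negation of $\mathbf{x}$, and denoted $-\mathbf{x}$, such that $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$ and $(-\mathbf{x}) + \mathbf{x} = \mathbf{0}$.

Condition B: There is in $V$ an operation called multiplication by a scalar that associates with each scalar $c$ and each vector $\mathbf{x}$ in $V$ a unique vector called the product of $c$ and $\mathbf{x}$, denoted by $c\mathbf{x}$ and $\mathbf{x}c$, and which satisfies:

  1. $c(d\mathbf{x}) = (cd)\mathbf{x}$ for all scalars $c$ and $d$, and all vectors $\mathbf{x}$.
  2. $(c + d)\mathbf{x} = c\mathbf{x} + d\mathbf{x}$ for all scalars $c$ and $d$, and all vectors $\mathbf{x}$.
  3. $c(\mathbf{x} + \mathbf{y}) = c\mathbf{x} + c\mathbf{y}$ for all scalars $c$ and all vectors $\mathbf{x}$ and $\mathbf{y}$.

Condition C: $1\mathbf{x} = \mathbf{x}$ for all vectors $\mathbf{x}$.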

We are interested particularly in real vector spaces of real $m \times 1$ column matrices, with vector addition and multiplication by scalars being as defined earlier for matrices. We shall denote such spaces by $\Re^m$. Using the notation introduced previously, vectors (col-


Figure 1.

A set $S$ of vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ in $V$ is said to span some subspace $V_0$ of $V$ if and only if $S$ is a subset of $V_0$ and every vector $\mathbf{v}_0$ in $V_0$ is linearly dependent on the vectors in $S$. The set $S$ is said to be a spanning set for $V_0$. A basis for a vector space $V$ is a linearly independent spanning set for $V$. The number of vectors in the basis for a vector space is called the dimension of the vector space. If, for example, the number of vectors in the basis is $n$, we say that the vector space is $n$-dimensional.
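One way to test whether a finite set of vectors in $\Re^m$ is linearly independent, and hence whether it can serve as a basis, is to count the pivots found by Gaussian elimination. The sketch below assumes the standard definition of linear independence (no vector in the set is a linear combination of the others) and uses exact rational arithmetic to avoid rounding issues. The function names and the test vectors are assumptions made for illustration.

```python
from fractions import Fraction


def rank(vectors):
    """Number of linearly independent vectors, found by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # Look for a row at or below the current pivot row with a nonzero entry.
        pivot = next((i for i in range(pivots, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate the entries below the pivot.
        for i in range(pivots + 1, len(rows)):
            factor = rows[i][col] / rows[pivots][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivots])]
        pivots += 1
    return pivots


def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)


def is_basis(vectors, m):
    """True if the vectors form a basis for real m-dimensional space."""
    return len(vectors) == m and is_linearly_independent(vectors)


# Hypothetical test sets: the third vector of `dependent` is the sum of the first two.
independent = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
dependent = [[1, 1, 0], [0, 1, 1], [1, 2, 1]]

print(is_basis(independent, 3))
print(is_linearly_independent(dependent))
```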

An important aspect of the concepts just discussed lies in the representation of any vector in $\Re^m$ as a linear combination of the basis vectors. For example, any vector

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

in $\Re^3$ can be represented as a linear combination of the basis vectors

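As an illustration of such a representation, suppose the basis is the standard basis of $\Re^3$. This choice is an assumption made here for concreteness, since the specific basis vectors are not legible in this copy of the text. The expansion then reads:

```latex
\mathbf{x} =
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
```

With this basis the coefficients of the linear combination are simply the components of $\mathbf{x}$; with a different basis the coefficients would generally differ from the components.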

Vector Norms

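A vector norm on a vector space $V$ is a function that assigns to each vector $\mathbf{v}$ in $V$ a nonnegative real number, called the norm of $\mathbf{v}$, denoted by $\|\mathbf{v}\|$. By definition, the norm satisfies the following conditions:

  1. $\|\mathbf{v}\| > 0$ for $\mathbf{v} \neq \mathbf{0}$; $\|\mathbf{0}\| = 0$;
  2. $\|c\mathbf{v}\| = |c| \, \|\mathbf{v}\|$ for all scalars $c$ and vectors $\mathbf{v}$, and
  3. $\|\mathbf{u} + \mathbf{v}\| \leq \|\mathbf{u}\| + \|\mathbf{v}\|$.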

There are numerous norms that are used in practice. In our work, the norm most often used is the so-called 2-norm, which, for a vector $\mathbf{x}$ in real $\Re^m$ space, is defined as

$$\|\mathbf{x}\| = \left[ x_1^2 + x_2^2 + \cdots + x_m^2 \right]^{1/2}.$$

The reader will recognize this expression as the Euclidean distance from the origin to point $\mathbf{x}$, which gives this expression the familiar name of the Euclidean norm. The expression also is recognized as the length of a vector $\mathbf{x}$, with origin at point $\mathbf{0}$. Based on the multiplication of two column vectors discussed earlier, we see that the norm also can be written as

$$\|\mathbf{x}\| = \left[ \mathbf{x}^T \mathbf{x} \right]^{1/2}.$$

The well-known Cauchy-Schwarz inequality states that

$$\left| \mathbf{x}^T \mathbf{y} \right| \leq \|\mathbf{x}\| \, \|\mathbf{y}\|.$$

In words, this result states that the absolute value of the inner product of two vectors never exceeds the product of the norms of the vectors. This result is used in several places in the book. Another well-known result used in the book is the expression

$$\cos\theta = \frac{\mathbf{x}^T \mathbf{y}}{\|\mathbf{x}\| \, \|\mathbf{y}\|}$$

where $\theta$ is the angle between vectors $\mathbf{x}$ and $\mathbf{y}$, from which we have that the inner product of two vectors can be written as

$$\mathbf{x}^T \mathbf{y} = \|\mathbf{x}\| \, \|\mathbf{y}\| \cos\theta.$$

Thus, the inner product of two vectors can be expressed as a function of the norms of the vectors and the angle between the vectors.
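The norm, the Cauchy-Schwarz inequality, and the angle formula can be checked numerically with a short sketch. The helper names and the sample vectors below are assumptions made for illustration; the cosine is clamped to $[-1, 1]$ before calling `acos` only to guard against floating-point rounding.

```python
import math


def inner(x, y):
    """Inner product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean (2-) norm: square root of the inner product of x with itself."""
    return math.sqrt(inner(x, x))


def angle(x, y):
    """Angle in radians between two nonzero vectors."""
    cos_theta = inner(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, cos_theta)))


# Hypothetical sample vectors.
x = [3.0, 4.0]
y = [4.0, 3.0]

print(norm(x))
print(abs(inner(x, y)) <= norm(x) * norm(y))   # Cauchy-Schwarz
print(math.degrees(angle(x, y)))
print(math.degrees(angle([1.0, 0.0], [0.0, 1.0])))  # orthogonal vectors
```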

From the preceding results we have the definition that two vectors in $\Re^m$ are orthogonal if and only if their inner product is zero. Two vectors are orthonormal if, in addition to being orthogonal, the length of each vector is 1. From the concepts just discussed,


Example 1.3 Consider the matrix

M =

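It is easy to verify that $\mathbf{M}\mathbf{e}_1 = \lambda_1 \mathbf{e}_1$ and $\mathbf{M}\mathbf{e}_2 = \lambda_2 \mathbf{e}_2$ for $\lambda_1 = 1$, $\lambda_2 = 2$ and

$\mathbf{e}_1 =$

and $\mathbf{e}_2 =$

In other words, $\mathbf{e}_1$ is an eigenvector of $\mathbf{M}$ with associated eigenvalue $\lambda_1$, and similarly for $\mathbf{e}_2$ and $\lambda_2$.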
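The kind of check described in this example, that a vector and a scalar satisfy $\mathbf{M}\mathbf{e} = \lambda\mathbf{e}$, is easy to automate. The matrix used below is a hypothetical one chosen to have eigenvalues 1 and 2; it is not claimed to be the matrix of Example 1.3, whose entries are not legible in this copy. The function names are likewise assumptions.

```python
def matvec(M, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def is_eigenpair(M, lam, e, tol=1e-12):
    """True if e is a nonzero vector with M e = lam * e (within tol)."""
    if all(abs(x) <= tol for x in e):
        return False  # the zero vector is excluded by definition
    Me = matvec(M, e)
    return all(abs(a - lam * b) <= tol for a, b in zip(Me, e))


# Hypothetical matrix with eigenvalues 1 and 2 (assumed for illustration).
M = [[1.0, 1.0],
     [0.0, 2.0]]
e1 = [1.0, 0.0]   # candidate eigenvector for eigenvalue 1
e2 = [1.0, 1.0]   # candidate eigenvector for eigenvalue 2

print(is_eigenpair(M, 1.0, e1))
print(is_eigenpair(M, 2.0, e2))
print(is_eigenpair(M, 2.0, e1))   # e1 does not belong to eigenvalue 2
```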

The following properties, which we give without proof, are essential background in the use of vectors and matrices in digital image processing. In each case, we assume a real matrix of order $m \times m$ although, as stated earlier, these results are equally applicable to complex numbers. We focus on real quantities simply because they play the dominant role in our work.

  1. If $\{\lambda_1, \lambda_2, \ldots, \lambda_q\}$, $q \leq m$, is a set of distinct eigenvalues of $\mathbf{M}$, and $\mathbf{e}_i$ is an eigenvector of $\mathbf{M}$ with corresponding eigenvalue $\lambda_i$, $i = 1, 2, \ldots, q$, then $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_q\}$ is a linearly independent set of vectors. Note an important implication of this property: If an $m \times m$ matrix $\mathbf{M}$ has $m$ distinct eigenvalues, its eigenvectors will constitute a linearly independent set of $m$ vectors (a basis), which means that any $m$-dimensional vector can be expressed as a linear combination of the eigenvectors of $\mathbf{M}$.
  2. The numbers along the main diagonal of a diagonal matrix are equal to its eigenvalues. It is not difficult to show using the definition $\mathbf{M}\mathbf{e} = \lambda\mathbf{e}$ that the eigenvectors can be written by inspection when $\mathbf{M}$ is diagonal.
  3. A real, symmetric $m \times m$ matrix $\mathbf{M}$ has a set of $m$ linearly independent eigenvectors that may be chosen to form an orthonormal set. This property is of particular importance when dealing with covariance matrices (e.g., see Section 11.4 and our review of probability) which are real and symmetric.
  4. A corollary of Property 3 is that the eigenvalues of an $m \times m$ real symmetric matrix are real, and the associated eigenvectors may be chosen to form an orthonormal set of $m$ vectors.
  5. Suppose that $\mathbf{M}$ is a real, symmetric $m \times m$ matrix, and that we form a matrix $\mathbf{A}$ whose rows are the $m$ orthonormal eigenvectors of $\mathbf{M}$. Then, the product $\mathbf{A}\mathbf{A}^T = \mathbf{I}$ because the rows of $\mathbf{A}$ are orthonormal vectors. (Recall from the discussion on matrix multiplication that the product of two matrices is formed by the inner product of the rows of one matrix with the columns of the other. Since the rows of $\mathbf{A}$ and columns of $\mathbf{A}^T$ are orthonormal, their inner products are either 0 or 1.) Thus, we see that $\mathbf{A}^{-1} = \mathbf{A}^T$ when matrix $\mathbf{A}$ is formed as was just described.

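  6. Consider matrices $\mathbf{M}$ and $\mathbf{A}$ as defined in 5. Then, the product $\mathbf{D} = \mathbf{A}\mathbf{M}\mathbf{A}^{-1} = \mathbf{A}\mathbf{M}\mathbf{A}^T$ is a diagonal matrix whose elements along the main diagonal are the eigenvalues of $\mathbf{M}$. The eigenvalues of $\mathbf{D}$ are the same as those of $\mathbf{M}$, and its eigenvectors are the transformed vectors $\mathbf{A}\mathbf{e}_i$, where the $\mathbf{e}_i$ are the eigenvectors of $\mathbf{M}$.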
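Properties 5 and 6 can be verified numerically on a small case. The sketch below uses a hypothetical real symmetric $2 \times 2$ matrix whose orthonormal eigenvectors are known in closed form (eigenvalues 1 and 3); the matrix, the helper names, and the tolerance are all assumptions made for illustration.

```python
import math


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    """Matrix product: inner products of the rows of X with the columns of Y."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def is_close_matrix(X, Y, tol=1e-12):
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


# Hypothetical real symmetric matrix with eigenvalues 1 and 3.
M = [[2.0, 1.0],
     [1.0, 2.0]]

# Rows of A are the orthonormal eigenvectors of M:
# (1, -1)/sqrt(2) for eigenvalue 1 and (1, 1)/sqrt(2) for eigenvalue 3.
s = 1.0 / math.sqrt(2.0)
A = [[s, -s],
     [s, s]]

I = matmul(A, transpose(A))             # Property 5: A A^T = I
D = matmul(matmul(A, M), transpose(A))  # Property 6: D = A M A^T is diagonal

print(I)
print(D)
```

The diagonal of `D` holds the eigenvalues in the same order in which the eigenvectors were placed in the rows of `A`.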

Example 1.4 Suppose that we have a random population of vectors, denoted by $\{\mathbf{x}\}$, with covariance matrix (see the following chapter on a review of probability):

$$\mathbf{C}_x = E\{(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T\}$$

where $E$ is the expected value operator and $\mathbf{m}_x$ is the mean of the population. Covariance matrices are real, square, symmetric matrices which, from Property 3, are known to have a set of orthonormal eigenvectors.

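Suppose that we perform a transformation of the form $\mathbf{y} = \mathbf{A}\mathbf{x}$ on each vector $\mathbf{x}$, where the rows of $\mathbf{A}$ are the orthonormal eigenvectors of $\mathbf{C}_x$. The covariance matrix of the population $\{\mathbf{y}\}$ is

$$\begin{aligned}
\mathbf{C}_y &= E\{(\mathbf{y} - \mathbf{m}_y)(\mathbf{y} - \mathbf{m}_y)^T\} \\
&= E\{(\mathbf{A}\mathbf{x} - \mathbf{A}\mathbf{m}_x)(\mathbf{A}\mathbf{x} - \mathbf{A}\mathbf{m}_x)^T\} \\
&= E\{\mathbf{A}(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T \mathbf{A}^T\} \\
&= \mathbf{A} E\{(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T\} \mathbf{A}^T \\
&= \mathbf{A}\mathbf{C}_x\mathbf{A}^T
\end{aligned}$$

where $\mathbf{A}$ was factored out of the expectation operator because it is a constant matrix.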

From Property 6, we know that $\mathbf{C}_y = \mathbf{A}\mathbf{C}_x\mathbf{A}^T$ is a diagonal matrix with the eigenvalues of $\mathbf{C}_x$ along its main diagonal. Recall that the elements along the main diagonal of a covariance matrix are the variances of the components of the vectors in the population. Similarly, the off-diagonal elements are the covariances of the components of these vectors. The fact that the covariance matrix $\mathbf{C}_y$ is diagonal means that the elements of the vectors in the population $\{\mathbf{y}\}$ are uncorrelated (their covariances are 0). Thus, we see that application of the linear transformation $\mathbf{y} = \mathbf{A}\mathbf{x}$ involving the eigenvectors of $\mathbf{C}_x$ decorrelates the data, and the elements of $\mathbf{C}_y$ along its main diagonal give the variances of the components of the $\mathbf{y}$'s along the eigenvectors. Basically, what has been accomplished here is a coordinate transformation that aligns the data along the eigenvectors of the covariance matrix of the population.
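The decorrelation described in this example can be reproduced on a tiny two-dimensional data set. The following sketch makes several assumptions that are not in the text: the sample points are invented, the expectation is estimated by a plain average over the samples, and the orthonormal eigenvectors of the $2 \times 2$ covariance matrix are obtained from the closed-form rotation angle $\theta = \tfrac{1}{2}\,\mathrm{atan2}(2c_{12},\, c_{11} - c_{22})$ rather than from a general eigenvalue routine. The mean is subtracted before rotating, which shifts the data but leaves the covariance unchanged.

```python
import math


def mean_vector(xs):
    n = len(xs)
    return [sum(x[k] for x in xs) / n for k in range(2)]


def covariance(xs):
    """Sample estimate of E{(x - m)(x - m)^T} for 2-D vectors (divides by n)."""
    m = mean_vector(xs)
    n = len(xs)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / n
    return c


def eigenvector_rows(c):
    """Rows are orthonormal eigenvectors of the symmetric 2 x 2 matrix c."""
    theta = 0.5 * math.atan2(2.0 * c[0][1], c[0][0] - c[1][1])
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


# Hypothetical, visibly correlated 2-D samples.
xs = [[2.0, 1.0], [3.0, 3.0], [4.0, 3.5], [5.0, 6.0], [6.0, 6.5]]

cx = covariance(xs)
A = eigenvector_rows(cx)
m = mean_vector(xs)

# y = A (x - m_x) for every sample.
ys = [[sum(A[i][k] * (x[k] - m[k]) for k in range(2)) for i in range(2)] for x in xs]
cy = covariance(ys)

print(cx)   # nonzero off-diagonal terms: components are correlated
print(cy)   # off-diagonal terms are numerically zero after the transformation
```

The diagonal entries of `cy` are the variances along the two eigenvector directions, and their sum equals the sum of the diagonal entries of `cx`.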

The preceding concepts are illustrated in Fig. 1.2. Figure 1.2(a) shows a data population $\{\mathbf{x}\}$ in two dimensions, along with the eigenvectors of $\mathbf{C}_x$ (the black dot is the mean). The result of performing the transformation $\mathbf{y} = \mathbf{A}(\mathbf{x} - \mathbf{m}_x)$ on the $\mathbf{x}$'s is shown in

2 A Brief Review of Probability and Random Variables

The principal objective of the following material is to start with the basic principles of probability and to bring the reader to the level required to be able to follow all probability-based developments in the book.

2.1 Sets and Set Operations

Probability events are modeled as sets, so it is customary to begin a study of probability by defining sets and some simple operations among sets.

Sets

Informally, a set is a collection of objects, with each object in a set often referred to as an element or member of the set. Familiar examples include the set of all image processing books in the world, the set of prime numbers, and the set of planets circling the sun. Typically, sets are represented by uppercase letters, such as $A$, $B$, and $C$, and members of sets by lowercase letters, such as $a$, $b$, and $c$. We denote the fact that an element $a$ belongs to set $A$ by $a \in A$. If $a$ is not an element of $A$, then we write $a \notin A$.

A set can be specified by listing all of its elements, or by listing properties common to all elements. For example, suppose that $I$ is the set of all integers. A set $B$ consisting of the first five positive integers is specified using the notation $B = \{1, 2, 3, 4, 5\}$. The set of all integers less than 10 is specified using the notation $C = \{c \in I \mid c < 10\}$, which we read as "C is the set of integers such that each member of the set is less than 10." The "such that" condition is denoted by the symbol "|" and, as is shown in the previous two equations, the elements of the set are enclosed by curly brackets. The set with no elements is called the empty or null set, which we denote by $\emptyset$.

Two sets $A$ and $B$ are said to be equal if and only if they contain the same elements. Set equality is denoted by $A = B$. If the elements of two sets are not the same, we say that the sets are not equal, and denote this by $A \neq B$. If every element of $B$ is also an element of $A$, we say that $B$ is a subset of $A$: $B \subseteq A$, where the equality is included to account for the case in which $A$ and $B$ have the same elements. If $A$ contains more elements than $B$, then $B$ is said to be a proper subset of $A$, and we use the notation $B \subset A$.

Finally, we consider the concept of a universal set, which we denote by $U$ and define to be the set containing all elements of interest in a given situation. For example, in an experiment of tossing a coin, there are two possible (realistic) outcomes: heads or tails. If we denote heads by $H$ and tails by $T$, the universal set in this case is $\{H, T\}$. Similarly, the universal set for the experiment of throwing a single die has six possible outcomes, which normally are denoted by the face value of the die, so in this case $U = \{1, 2, 3, 4, 5, 6\}$. For obvious reasons, the universal set is frequently called the sample space, which we denote by $S$. It then follows that, for any set $A$, we assume that $\emptyset \subseteq A \subseteq S$, and for any element $a$, $a \in S$ and $a \notin \emptyset$.
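The set notions introduced so far (membership, equality, subset, proper subset, the empty set, and the sample space) map directly onto Python's built-in `set` type. The sketch below uses the single-die sample space from the text; the particular sets `A` and `B` are hypothetical choices made for illustration.

```python
# Sample space for throwing a single die, as in the text.
S = {1, 2, 3, 4, 5, 6}

# Hypothetical events chosen for illustration.
A = {2, 4, 6}      # "an even face"
B = {2, 4}
empty = set()      # the empty (null) set

print(3 in S)          # membership: 3 is an element of S
print(7 not in S)      # 7 is not an element of S
print(B <= A)          # B is a subset of A
print(B < A)           # B is a proper subset of A
print(A <= A, A < A)   # every set is a subset, but not a proper subset, of itself
print(empty <= A <= S) # the empty set is a subset of A, and A is a subset of S
print({2, 4, 6} == A)  # equality: same elements, order is irrelevant
```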

Some Basic Set Operations

The operations on sets associated with basic probability theory are straightforward. The union of two sets $A$ and $B$, denoted by $A \cup B$, is the set of elements that are either in $A$ or in $B$, or in both. In other words, $A \cup B = \{z \mid z \in A \text{ or } z \in B\}$. Similarly, the intersection of sets $A$ and $B$, denoted by $A \cap B$