



































A brief review of matrices and vectors, probability and random variables, and linear systems. It covers topics such as matrices and vectors, eigenvalues and eigenvectors, probability and random variables, and linear transformations of random vectors.




































Second Edition
Prentice Hall
Upper Saddle River, NJ 07458
www.prenhall.com/gonzalezwoods
or
www.imageprocessingbook.com
Revision history 10 9 8 7 6 5 4 3 2 1
Copyright © 1992-2002 by Rafael C. Gonzalez and Richard E. Woods
The purpose of this short document is to provide the reader with background sufficient to follow the discussions in Digital Image Processing, 2nd ed., by Gonzalez and Woods. The notation is the same as that used in the book.
We begin with the definition of a matrix. An $m \times n$ (read "m by n") matrix, denoted by A, is a rectangular array of entries or elements (numbers, or symbols representing numbers) enclosed typically by square brackets. In this notation, m is the number of horizontal rows and n the number of vertical columns in the array. Sometimes m and n are referred to as the dimensions or order of the matrix, and we say that matrix A has dimensions m by n or is of order m by n. We use the following notation to represent an $m \times n$ matrix A:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
where $a_{ij}$ represents the $(i, j)$-th entry.
If $m = n$, then A is a square matrix. If A is square and $a_{ij} = 0$ for all $i \neq j$, and not all $a_{ii}$ are zero, the matrix is said to be diagonal. In other words, a diagonal matrix is a square matrix in which all elements not on the main diagonal are zero. A diagonal matrix in which all diagonal elements are equal to 1 is called the identity matrix, typically denoted by I. A matrix in which all elements are 0 is called the zero or null matrix, typically denoted by 0. The trace of a matrix A (not necessarily diagonal),
2 Chapter 1 A Brief Review of Matrices and Vectors
denoted tr(A), is the sum of the elements in the main diagonal of A. Two matrices A and B are equal if and only if they have the same number of rows and columns, and $a_{ij} = b_{ij}$ for all i and j.
The transpose of an $m \times n$ matrix A, denoted $A^T$, is an $n \times m$ matrix obtained by interchanging the rows and columns of A. That is, the first row of A becomes the first column of $A^T$, the second row of A becomes the second column of $A^T$, and so on. A square matrix for which $A^T = A$ is said to be symmetric.
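The trace, transpose, and symmetry definitions above can be illustrated with a minimal sketch. It assumes a matrix is stored as a list of rows; the helper names `transpose`, `trace`, and `is_symmetric` are illustrative and not part of the original notes.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def trace(a):
    """Sum of the main-diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def is_symmetric(a):
    """A square matrix is symmetric when it equals its own transpose."""
    return a == transpose(a)


A = [[1, 2, 3],
     [4, 5, 6]]   # a 2 x 3 matrix
S = [[2, 7],
     [7, 5]]      # a symmetric 2 x 2 matrix

print(transpose(A))     # the rows of A become the columns of the result
print(trace(S))         # sum of the diagonal entries 2 and 5
print(is_symmetric(S))
```

The first row of `A` becomes the first column of the result, as described in the text.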
Any matrix X for which $XA = I$ and $AX = I$ is called the inverse of A. Usually, the inverse of A is denoted $A^{-1}$. Although numerous procedures exist for computing the inverse of a matrix, the usual practice is to use a computer program for this purpose, so we will not dwell on this topic here. The interested reader can consult any book on matrix theory for extensive theoretical and practical discussions dealing with matrix inverses. A matrix that possesses an inverse in the sense just defined is called a nonsingular matrix.
Associated with matrix inverses is the computation of the determinant of a matrix. Although the determinant is a scalar, its definition is a little more complicated than those discussed in the previous paragraphs. Let A be an $m \times m$ (square) matrix. The $(i, j)$-minor of A, denoted $M_{ij}$, is the determinant of the $(m-1) \times (m-1)$ matrix formed by deleting the ith row and the jth column of A. The $(i, j)$-cofactor of A, denoted $C_{ij}$, is $(-1)^{i+j} M_{ij}$. The determinant of a $1 \times 1$ matrix $[\alpha]$, denoted $\det([\alpha])$, is $\det([\alpha]) = \alpha$. Finally, we define the determinant of an $m \times m$ matrix A as
$$\det(A) = \sum_{j=1}^{m} a_{1j} C_{1j}.$$
In other words, the determinant of a (square) matrix is the sum of the products of the elements in the first row of the matrix and the cofactors of the first row. As is true of inverses, determinants usually are obtained using a computer.
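The recursive definition above translates almost directly into code. The following is a minimal sketch, assuming a square matrix stored as a list of rows; the function names `minor_matrix` and `det` are illustrative. It is meant to mirror the definition, not to be an efficient numerical method.

```python
def minor_matrix(a, i, j):
    """Delete row i and column j (0-based indices) from the square matrix a."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]              # det([alpha]) = alpha
    total = 0
    for j in range(len(a)):
        # With 0-based j, the sign (-1)**j equals (-1)^(1 + (j + 1))
        # in the 1-based notation used in the text.
        cofactor = (-1) ** j * det(minor_matrix(a, 0, j))
        total += a[0][j] * cofactor
    return total


print(det([[1, 2], [3, 4]]))
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

Cofactor expansion takes a number of operations that grows factorially with m, which is one reason determinants of larger matrices are normally computed with library routines instead.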
Let c be a real or complex number (often called a scalar). The scalar multiple of scalar c and matrix A, denoted cA, is obtained by multiplying every element of A by c. If $c = -1$, the scalar multiple is called the negative of A.
Assuming that they have the same number of rows and columns, the sum of two matrices A and B, denoted A + B, is the matrix obtained by adding the corresponding elements
$$b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
Keeping in mind the matrix dimensions required for matrix products defined above, the product of a and b is a $1 \times 1$ matrix, given by
$$a^T b = b^T a = a_1 b_1 + a_2 b_2 + \cdots + a_m b_m = \sum_{i=1}^{m} a_i b_i.$$
This particular product is often called the dot product or inner product of two vectors. We have much more to say about this in the following section. □
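As a small illustration of the inner product just defined, the sketch below assumes vectors are stored as plain Python lists of equal length; the name `inner` is illustrative.

```python
def inner(a, b):
    """Inner (dot) product: the sum of a_i * b_i over all components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    return sum(x * y for x, y in zip(a, b))


a = [1, 2, 3]
b = [4, 5, 6]

print(inner(a, b))                  # 1*4 + 2*5 + 3*6
print(inner(a, b) == inner(b, a))   # the product does not depend on the order
```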
As introduced in the previous section, we refer to an $m \times 1$ column matrix as a column vector. Such a vector assumes geometric meaning when we associate geometrical properties with its elements. For example, consider the familiar two-dimensional (Euclidean) space in which a point is represented by its $(x, y)$ coordinates. These coordinates can be expressed in terms of a column vector as follows:
$$u = \begin{bmatrix} x \\ y \end{bmatrix}$$
Then, for example, point $(1, 2)$ becomes the specific vector
$$u = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Geometrically, we represent this vector as a directed line segment from the origin to point $(1, 2)$. In three-dimensional space the vector would have components $(x, y, z)$. In m-dimensional space we run out of letters and use the same symbol with subscripts to represent the elements of a vector. That is, an m-dimensional vector is represented as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$$
1.2 Vectors and Vector Spaces 5
When expressed in the form of these column matrices, arithmetic operations between vectors follow the same rules as they do for matrices. The product of a vector by a scalar is obtained simply by multiplying every element of the vector by the scalar. The sum of two vectors x and y is formed by the addition of corresponding elements ($x_1 + y_1$, $x_2 + y_2$, and so on), and similarly for subtraction. Multiplication of two vectors is as defined in Example 1. Division of one vector by another is not defined.
The definition of a vector space is both intuitive and straightforward. A vector space is defined as a nonempty set V of entities called vectors and associated scalars that satisfy the conditions outlined in A through C below. A vector space is real if the scalars are real numbers; it is complex if the scalars are complex numbers.
Condition A: There is in V an operation called vector addition, denoted $x + y$, that satisfies:
Condition B: There is in V an operation called multiplication by a scalar that associates with each scalar c and each vector x in V a unique vector called the product of c and x, denoted by cx and xc, and which satisfies:
Condition C: $1x = x$ for all vectors x.
We are interested particularly in real vector spaces of real $m \times 1$ column matrices, with vector addition and multiplication by scalars being as defined earlier for matrices. We shall denote such spaces by $\mathbb{R}^m$. Using the notation introduced previously, vectors (col-
Figure 1.
A set S of vectors $v_1, v_2, \ldots, v_n$ in V is said to span some subspace $V_0$ of V if and only if S is a subset of $V_0$ and every vector $v_0$ in $V_0$ is linearly dependent on the vectors in S. The set S is said to be a spanning set for $V_0$. A basis for a vector space V is a linearly independent spanning set for V. The number of vectors in the basis for a vector space is called the dimension of the vector space. If, for example, the number of vectors in the basis is n, we say that the vector space is n-dimensional.
An important aspect of the concepts just discussed lies in the representation of any vector in $\mathbb{R}^m$ as a linear combination of the basis vectors. For example, any vector
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
in $\mathbb{R}^3$ can be represented as a linear combination of the basis vectors
A vector norm on a vector space V is a function that assigns to each vector v in V a nonnegative real number, called the norm of v, denoted by $\|v\|$. By definition, the norm satisfies the following conditions:
There are numerous norms that are used in practice. In our work, the norm most often used is the so-called 2-norm, which, for a vector x in real $\mathbb{R}^m$ space, is defined as
$$\|x\| = \left[ x_1^2 + x_2^2 + \cdots + x_m^2 \right]^{1/2}$$
The reader will recognize this expression as the Euclidean distance from the origin to point x, which gives this expression the familiar name of the Euclidean norm. The expression also is recognized as the length of a vector x, with origin at point 0. Based on the multiplication of two column vectors discussed earlier, we see that the norm also can be written as
$$\|x\| = \left[ x^T x \right]^{1/2}$$
The well-known Cauchy-Schwarz inequality states that
$$\left| x^T y \right| \le \|x\| \, \|y\|.$$
In words, this result states that the absolute value of the inner product of two vectors never exceeds the product of the norms of the vectors. This result is used in several places in the book. Another well-known result used in the book is the expression
$$\cos\theta = \frac{x^T y}{\|x\| \, \|y\|}$$
where $\theta$ is the angle between vectors x and y, from which we have that the inner product of two vectors can be written as
$$x^T y = \|x\| \, \|y\| \cos\theta.$$
Thus, the inner product of two vectors can be expressed as a function of the norms of the vectors and the angle between the vectors.
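The norm, the Cauchy-Schwarz inequality, and the angle formula can be checked numerically with a short sketch. It assumes real vectors stored as Python lists; the names `inner`, `norm`, and `angle` are illustrative, and the clamping of the cosine is an added safeguard against floating-point round-off rather than part of the definition.

```python
import math


def inner(x, y):
    """Inner product x^T y of two equal-length real vectors."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean (2-) norm, computed as the square root of x^T x."""
    return math.sqrt(inner(x, x))


def angle(x, y):
    """Angle between two nonzero vectors, from cos(theta) = x^T y / (||x|| ||y||)."""
    c = inner(x, y) / (norm(x) * norm(y))
    c = max(-1.0, min(1.0, c))   # guard against round-off just outside [-1, 1]
    return math.acos(c)


x = [3.0, 4.0]
y = [4.0, 3.0]

print(norm(x))                                 # length of x
print(abs(inner(x, y)) <= norm(x) * norm(y))   # Cauchy-Schwarz inequality
print(angle([1.0, 0.0], [0.0, 1.0]))           # perpendicular vectors
```

For the perpendicular pair in the last line the inner product is zero, so the computed angle is a right angle.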
From the preceding results we have the definition that two vectors in $\mathbb{R}^m$ are orthogonal if and only if their inner product is zero. Two vectors are orthonormal if, in addition to being orthogonal, the length of each vector is 1. From the concepts just discussed,
Example 1.3 Consider the matrix
M =
It is easy to verify that $M e_1 = \lambda_1 e_1$ and $M e_2 = \lambda_2 e_2$ for $\lambda_1 = 1$, $\lambda_2 = 2$ and
$e_1 =$
and $e_2 =$
In other words, $e_1$ is an eigenvector of M with associated eigenvalue $\lambda_1$, and similarly for $e_2$ and $\lambda_2$. □
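The kind of check described in this example can be sketched in a few lines. The entries of M, $e_1$, and $e_2$ are not reproduced in this text, so the values below are hypothetical stand-ins chosen only to be consistent with the stated eigenvalues 1 and 2; the helper names `matvec` and `is_eigenpair` are likewise illustrative.

```python
def matvec(m, v):
    """Product of a matrix (list of rows) and a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def is_eigenpair(m, lam, e, tol=1e-12):
    """True when e is nonzero and M e equals lam * e element by element."""
    if all(abs(c) <= tol for c in e):
        return False             # the zero vector is not an eigenvector
    return all(abs(p - lam * c) <= tol for p, c in zip(matvec(m, e), e))


# Hypothetical values (assumption): a diagonal matrix with eigenvalues 1 and 2.
M = [[1.0, 0.0],
     [0.0, 2.0]]
e1 = [1.0, 0.0]
e2 = [0.0, 1.0]

print(is_eigenpair(M, 1.0, e1))
print(is_eigenpair(M, 2.0, e2))
print(is_eigenpair(M, 2.0, e1))   # wrong eigenvalue for e1
```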
The following properties, which we give without proof, are essential background in the use of vectors and matrices in digital image processing. In each case, we assume a real matrix of order $m \times m$ although, as stated earlier, these results are equally applicable to complex numbers. We focus on real quantities simply because they play the dominant role in our work.
1.3 Eigenvalues and Eigenvectors 11
that $A^{-1} = A^T$ when matrix A is formed as was just described.
Example 1.4 Suppose that we have a random population of vectors, denoted by $\{x\}$, with covariance matrix (see the following chapter on a review of probability):
$$C_x = E\{(x - m_x)(x - m_x)^T\}$$
where E is the expected value operator and $m_x$ is the mean of the population. Covariance matrices are real, square, symmetric matrices which, from Property 3, are known to have a set of orthonormal eigenvectors.
Suppose that we perform a transformation of the form $y = Ax$ on each vector x, where the rows of A are the orthonormal eigenvectors of $C_x$. The covariance matrix of the population $\{y\}$ is
$$\begin{aligned} C_y &= E\{(y - m_y)(y - m_y)^T\} \\ &= E\{(Ax - Am_x)(Ax - Am_x)^T\} \\ &= E\{A(x - m_x)(x - m_x)^T A^T\} \\ &= A\,E\{(x - m_x)(x - m_x)^T\}\,A^T \\ &= A C_x A^T \end{aligned}$$
where A was factored out of the expectation operator because it is a constant matrix.
From Property 6, we know that $C_y = A C_x A^T$ is a diagonal matrix with the eigenvalues of $C_x$ along its main diagonal. Recall that the elements along the main diagonal of a covariance matrix are the variances of the components of the vectors in the population. Similarly, the off-diagonal elements are the covariances of the components of these vectors. The fact that the covariance $C_y$ is diagonal means that the elements of the vectors in the population $\{y\}$ are uncorrelated (their covariances are 0). Thus, we see that application of the linear transformation $y = Ax$ involving the eigenvectors of $C_x$ decorrelates the data, and the elements of $C_y$ along its main diagonal give the variances of the components of the y's along the eigenvectors. Basically, what has been accomplished here is a coordinate transformation that aligns the data along the eigenvectors of the covariance matrix of the population.
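The decorrelation argument can be reproduced numerically on a small data set. The sketch below is restricted to two dimensions so that the eigenvectors of the symmetric covariance matrix can be written in closed form; the sample values and all function names are assumptions made for illustration, and the covariance is estimated by dividing by the number of samples to mirror the expectation operator.

```python
import math


def covariance(samples):
    """Mean vector and covariance matrix of 2-D samples (divides by N)."""
    n = len(samples)
    mean = [sum(s[k] for s in samples) / n for k in range(2)]
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for s in samples:
        d = [s[0] - mean[0], s[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                cov[i][j] += d[i] * d[j] / n
    return mean, cov


def eigen_rows_2x2(c):
    """Orthonormal eigenvectors (as rows) and eigenvalues of a symmetric 2x2 matrix."""
    a, b, d = c[0][0], c[0][1], c[1][1]
    mid = (a + d) / 2.0
    root = math.hypot((a - d) / 2.0, b)
    lams = [mid + root, mid - root]
    if b == 0.0:
        # Already diagonal: order the axis vectors to match lams.
        rows = [[1.0, 0.0], [0.0, 1.0]] if a >= d else [[0.0, 1.0], [1.0, 0.0]]
    else:
        rows = []
        for lam in lams:
            v = [b, lam - a]                  # solves (C - lam I) v = 0
            length = math.hypot(v[0], v[1])
            rows.append([v[0] / length, v[1] / length])
    return rows, lams


def matmul(p, q):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(p):
    return [list(col) for col in zip(*p)]


# Illustrative data (assumption): five correlated 2-D points.
samples = [(2.0, 1.0), (3.0, 3.0), (4.0, 3.0), (5.0, 5.0), (6.0, 5.0)]

m_x, C_x = covariance(samples)
A, lams = eigen_rows_2x2(C_x)                  # rows of A are eigenvectors of C_x
C_y = matmul(matmul(A, C_x), transpose(A))     # C_y = A C_x A^T

# Cross-check: transform every sample with y = A x and re-estimate the covariance.
y_samples = [[sum(a * xk for a, xk in zip(row, x)) for row in A] for x in samples]
m_y, C_y_direct = covariance(y_samples)

print(C_y)    # off-diagonal entries are zero up to round-off
print(lams)   # eigenvalues of C_x, matching the diagonal of C_y
```

Estimating the covariance directly from the transformed samples gives the same matrix as $A C_x A^T$ up to round-off, which is the content of the derivation above.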
The preceding concepts are illustrated in Fig. 1.2. Figure 1.2(a) shows a data population $\{x\}$ in two dimensions, along with the eigenvectors of $C_x$ (the black dot is the mean). The result of performing the transformation $y = A(x - m_x)$ on the x's is shown in
The principal objective of the following material is to start with the basic principles of probability and to bring the reader to the level required to be able to follow all probability-based developments in the book.
Probability events are modeled as sets, so it is customary to begin a study of probability by defining sets and some simple operations among sets.
Informally, a set is a collection of objects, with each object in a set often referred to as an element or member of the set. Familiar examples include the set of all image processing books in the world, the set of prime numbers, and the set of planets circling the sun. Typically, sets are represented by uppercase letters, such as A, B, and C, and members of sets by lowercase letters, such as a, b, and c. We denote the fact that an element a belongs to set A by $a \in A$. If a is not an element of A, then we write $a \notin A$.
A set can be specified by listing all of its elements, or by listing properties common to all elements. For example, suppose that I is the set of all integers. A set B consisting of the first five positive integers is specified using the notation $B = \{1, 2, 3, 4, 5\}$. The set of all integers less than 10 is specified using the notation $C = \{c \in I \mid c < 10\}$
14 Chapter 2 A Brief Review of Probability and Random Variables
which we read as "C is the set of integers such that each member of the set is less than 10." The "such that" condition is denoted by the symbol "|" and, as is shown in the previous two equations, the elements of the set are enclosed by curly brackets. The set with no elements is called the empty or null set, which we denote by $\emptyset$.
Two sets A and B are said to be equal if and only if they contain the same elements. Set equality is denoted by $A = B$. If the elements of two sets are not the same, we say that the sets are not equal, and denote this by $A \neq B$. If every element of B is also an element of A, we say that B is a subset of A: $B \subseteq A$, where the equality is included to account for the case in which A and B have the same elements. If A contains more elements than B, then B is said to be a proper subset of A, and we use the notation $B \subset A$.
Finally, we consider the concept of a universal set, which we denote by U and define to be the set containing all elements of interest in a given situation. For example, in an experiment of tossing a coin, there are two possible (realistic) outcomes: heads or tails. If we denote heads by H and tails by T, the universal set in this case is $\{H, T\}$. Similarly, the universal set for the experiment of throwing a single die has six possible outcomes, which normally are denoted by the face value of the die, so in this case $U = \{1, 2, 3, 4, 5, 6\}$. For obvious reasons, the universal set is frequently called the sample space, which we denote by S. It then follows that, for any set A, we assume that $\emptyset \subseteq A \subseteq S$, and for any element a, $a \in S$ and $a \notin \emptyset$.
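The set notation introduced so far maps closely onto Python's built-in `set` type, as the following sketch suggests. Because the set of all integers cannot be enumerated, the range `I_sample` is an assumed finite stand-in, and the event `A` (an even face on the die) is an illustrative choice that does not appear in the original text.

```python
B = {1, 2, 3, 4, 5}                    # a set specified by listing its elements

I_sample = range(-20, 21)              # finite stand-in for the integers (assumption)
C = {c for c in I_sample if c < 10}    # set-builder form: {c in I | c < 10}

S = {1, 2, 3, 4, 5, 6}                 # sample space for one throw of a die
A = {2, 4, 6}                          # illustrative event: an even face
empty = set()                          # the empty (null) set

print(3 in B, 7 in B)                  # membership tests
print(B <= C, B < C)                   # subset and proper subset
print(B <= B, B < B)                   # every set is a subset of itself, but not a proper one
print(empty <= A <= S)                 # the empty set and A are both contained in S
```

The operators `<=` and `<` on Python sets correspond to the subset and proper subset relations, and `in` corresponds to set membership.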
The operations on sets associated with basic probability theory are straightforward. The union of two sets A and B, denoted by $A \cup B$, is the set of elements that are either in A or in B, or in both. In other words, $A \cup B = \{z \mid z \in A \text{ or } z \in B\}$. Similarly, the intersection of sets A and B, denoted by $A \cap B$