
















Material Type: Notes; Professor: Reichenbach; Class: Digital Image Processing; Subject: Computer Science and Engineering ; University: University of Nebraska - Lincoln; Term: Unknown 1989;
Typology: Study notes

















If a function or signal is highly variable over intervals smaller than the scale of interest, then it is common to smooth the function to more clearly illustrate the trend over larger intervals. For example, it is common in stock analysis to look at the moving-average price, where the moving average price on a given date is the average during a specified time period around that date.
[Illustration of a graph with variable and moving-average stock price.]
Similarly, we can compute a result in which each output value is the local mean in a small, two-dimensional neighborhood around each pixel of the input image.
Example. Consider this 4x4 image p and the result r of smoothing by averaging over the marked 3x3 neighborhood about pixel p [1, 1]:
[Matrices showing the image p with the marked 3x3 neighborhood and the result r [1, 1].]
The local mean of the indicated neighborhood is (2 + 2 + 2 + 1 + 1 + 1 + 2 + 2 + 2)/9 = 15/9. The local mean values at other pixels could be computed similarly.
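The local-mean computation can be sketched in a few lines of Python. This is only an illustration: the function name `local_mean` and the 4x4 image below are assumptions. The original image is not available here, so only the 3x3 block around [1, 1] is chosen to match the values quoted above, and border handling is ignored.

```python
from fractions import Fraction


def local_mean(p, m, n, radius=1):
    """Mean of the (2*radius+1)^2 neighborhood centered on p[m][n].

    Assumes the whole neighborhood lies inside the image (no border handling).
    """
    size = 2 * radius + 1
    total = sum(
        p[i][j]
        for i in range(m - radius, m + radius + 1)
        for j in range(n - radius, n + radius + 1)
    )
    return Fraction(total, size * size)


# Hypothetical 4x4 image: only the top-left 3x3 block is chosen to match the
# neighborhood values quoted in the text; the remaining pixels are made up.
p = [
    [2, 2, 2, 0],
    [1, 1, 1, 0],
    [2, 2, 2, 0],
    [0, 0, 0, 0],
]

print(local_mean(p, 1, 1))  # 15/9, printed in lowest terms as 5/3
```

Exact fractions are used so that the result can be compared directly with 15/9 without rounding.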
One of the motivations for local averaging is to suppress noise. Consider an image p with additive noise e corrupting a scene s:
p [m, n] = s [m, n] + e [m, n]
where the noise has the following properties:
E {e [m, n]} = 0
$$ E\{e[m,n]\,e[m',n']\} = \begin{cases} R_e[0,0] & \text{if } [m,n] = [m',n'] \\ 0 & \text{otherwise} \end{cases} $$
E {s [m, n] e [m′, n′]} = 0
As we have seen before, the expected mean-square error for this system model is just the mean-square of the noise:
$$
\begin{aligned}
E\{(p[m,n] - s[m,n])^2\} &= E\{(s[m,n] + e[m,n] - s[m,n])^2\} \\
&= E\{(e[m,n])^2\} \\
&= R_e[0,0]
\end{aligned}
$$
Consider the image resulting from the local average of a pixel with its left neighbor:
$$ r[m,n] = \frac{1}{2}\,(p[m,n] + p[m,n-1]) $$
We can compute the mean-square error of this image as:
$$
\begin{aligned}
E\{(r[m,n] - s[m,n])^2\}
&= E\left\{\left(\tfrac{1}{2}\,(p[m,n] + p[m,n-1]) - s[m,n]\right)^2\right\} \\
&= E\left\{\left(\tfrac{1}{2}\,(s[m,n] + e[m,n] + s[m,n-1] + e[m,n-1]) - s[m,n]\right)^2\right\} \\
&= \tfrac{1}{4}\,E\left\{(s[m,n-1] - s[m,n] + e[m,n] + e[m,n-1])^2\right\} \\
&= \tfrac{1}{4}\,E\{\, s[m,n-1]\,(s[m,n-1] - s[m,n] + e[m,n] + e[m,n-1]) \\
&\qquad\quad - s[m,n]\,(s[m,n-1] - s[m,n] + e[m,n] + e[m,n-1]) \\
&\qquad\quad + e[m,n]\,(s[m,n-1] - s[m,n] + e[m,n] + e[m,n-1]) \\
&\qquad\quad + e[m,n-1]\,(s[m,n-1] - s[m,n] + e[m,n] + e[m,n-1]) \,\}
\end{aligned}
$$
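Under the noise properties listed above (zero mean, uncorrelated between pixels and with the scene), the cross terms in this expansion have zero expectation, leaving (1/4) E{(s[m, n − 1] − s[m, n])^2} + (1/2) Re[0, 0]: half the noise power, plus a blur penalty wherever the scene changes. The Monte Carlo sketch below checks this numerically. It is not part of the original notes, and the step-edge scene, the Gaussian noise model, the wraparound at the border, and the function names are all illustrative assumptions.

```python
import random


def mse_of_pair_average(scene, sigma, trials, seed=0):
    """Empirical mean-square error of r[n] = (p[n] + p[n-1]) / 2 against the scene.

    Assumption: noise is i.i.d. Gaussian with standard deviation sigma, and the
    left neighbor of pixel 0 wraps around to the last pixel.
    """
    rng = random.Random(seed)
    n_pixels = len(scene)
    total = 0.0
    for _ in range(trials):
        p = [s + rng.gauss(0.0, sigma) for s in scene]
        for k in range(n_pixels):
            r = 0.5 * (p[k] + p[k - 1])  # p[-1] is the wraparound neighbor
            total += (r - scene[k]) ** 2
    return total / (trials * n_pixels)


def predicted_mse(scene, sigma):
    """(1/4) * mean squared scene difference + (1/2) * noise variance."""
    n_pixels = len(scene)
    blur = sum((scene[k - 1] - scene[k]) ** 2 for k in range(n_pixels)) / n_pixels
    return 0.25 * blur + 0.5 * sigma ** 2


scene = [0] * 8 + [4] * 8  # hypothetical one-dimensional scene with two edges
print(predicted_mse(scene, 1.0))
print(mse_of_pair_average(scene, 1.0, trials=4000))
```

For a constant scene the blur term vanishes and the averaged image has half the mean-square error of the unsmoothed image.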
Exercise. What is the simplified expression for the expected mean-square error in the previous example if the smoothed result at each pixel is the sum of one-half of the image pixel value plus one-quarter of the immediate left neighbor plus one-quarter of the immediate right neighbor?
$$ r[m,n] = \frac{1}{4}\,(p[m,n-1] + 2\,p[m,n] + p[m,n+1]) $$
Typically, there are a few non-zero weights centered about the central element, so it is convenient to write the weighting function with only these central, non-zero pixels, regardless of the image size. For example, the weights in the previous exercise (the sum of one-half of the image pixel value plus one-quarter of the immediate left neighbor plus one-quarter of the immediate right neighbor) can be written
$$ \frac{1}{4}\begin{bmatrix} 1 & \boxed{2} & 1 \end{bmatrix} $$
where the box indicates the [0, 0] element.
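This example is one-dimensional for simplicity of presentation. Even with a two-dimensional image, directional smoothing may be desired, but most image smoothing is done with a neighborhood extending in both dimensions. A variety of weighting functions are used for two-dimensional smoothing. For example, the weights for averaging a pixel equally with its nearest four neighbors are:
$$ \frac{1}{5}\begin{bmatrix} 0 & 1 & 0 \\ 1 & \boxed{1} & 1 \\ 0 & 1 & 0 \end{bmatrix} $$
Likewise, the weights for the average with the nearest eight neighbors are:
$$ \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & \boxed{1} & 1 \\ 1 & 1 & 1 \end{bmatrix} $$
A digital approximation of a two-dimensional Gaussian smoothing function rolls off with distance from the center. For example:
$$ \frac{1}{36}\begin{bmatrix} 1 & 4 & 1 \\ 4 & \boxed{16} & 4 \\ 1 & 4 & 1 \end{bmatrix} $$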
The degree of smoothing and the required computation both increase with the number of weighted neighbors.
As in the previous subsection, the result r of smoothing by summing one-half of the pixel value plus one-quarter of the immediate left neighbor plus one-quarter of the immediate right neighbor is:
$$ r[m,n] = \frac{1}{4}\,(p[m,n-1] + 2\,p[m,n] + p[m,n+1]) $$
This operation of weighting and summing pixels can be generalized to weight and sum any neighborhood of pixels (even all of the pixels).
The generalized operation of weighting and summing neighboring pixels is called convolution. The mathematical equation for convolution of image p with weights w is written r = p ⊗ w and defined as:
$$ r[m,n] = \frac{1}{MN}\sum_{m'}\sum_{n'} p[m-m',\,n-n']\;w[m',n'] $$
The weights w are applied to the (general) neighborhood around the pixel at [m, n] — the weight on the pixel p [m, n] is w [0, 0], the weight on the neighbor to the left p [m, n − 1] is w [0, 1], the weight on the neighbor on the right p [m, n − (−1)] is w [0, −1], the weight on the neighbor above p [m − 1 , n] is w [1, 0], and so on. The sum is normalized by the number of pixels. (Some alternate definitions of convolution do not normalize.)
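The definition can be sketched directly in Python. This is a minimal illustration rather than an optimized implementation: the function name `convolve` and the sparse dictionary representation of the mask (offsets [m′, n′] mapped to weights, with missing offsets treated as zero) are assumptions made here, and wraparound indexing is used at the borders.

```python
def convolve(p, w):
    """Circular convolution r = p (*) w, normalized by the pixel count M*N.

    p: list of M rows, each a list of N numbers.
    w: dict mapping offsets (m', n') to weights; offsets not listed are zero.
    """
    M, N = len(p), len(p[0])
    r = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for (dm, dn), weight in w.items():
                # Python's % always returns a non-negative index here,
                # which implements the wraparound.
                acc += p[(m - dm) % M][(n - dn) % N] * weight
            r[m][n] = acc / (M * N)
    return r


# One-row example: half of each pixel plus one-quarter of each horizontal
# neighbor. Because of the 1/(M*N) normalization, the mask carries a factor M*N.
p = [[0, 0, 0, 16, 16, 16, 16, 16]]
MN = len(p) * len(p[0])
w = {(0, -1): MN / 4, (0, 0): MN / 2, (0, 1): MN / 4}

print(convolve(p, w))
```

The first and last output pixels differ from their interior neighbors only because of the wraparound: each sees the opposite end of the row as its neighbor.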
The weighting function is referred to as a convolution mask. The weighting function often involves only a few pixels around its center. The neighborhood of non-zero elements is called
Exercise. Detail the terms of the sum for r [3, 0] in this example.
Note that convolution with a mask of uniform weights is the discrete analog of calculus integration over the range of the mask. As will be seen, convolution with a different mask is also the discrete analog of calculus differentiation. So, convolution is a powerful and fundamental operation.
In convolution, each neighbor pixel in the input image p is shifted according to the index of the weighting function. The shift operation is:
r = S_{m′,n′} {p}
where:
r [m, n] = p [(m − m′) %M, (n − n′) %N ]
with the modulus operator '%' implicit on array indices, even where not written. A positive row shift is down and a negative row shift is up. A positive column shift is right and a negative column shift is left.
Example. Consider the image p, the result of shifting +1 row, and the result of shifting −1 column:
[Matrices showing p, S_{1,0} {p}, and S_{0,−1} {p}.]
Shifting the image +1 row causes the rows to shift down and the bottom row to wrap around to the top. Shifting the image −1 column causes the columns to shift left and the left-most column to wrap around to the right.
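The wraparound shift is short enough to sketch in Python. The function name `shift` and the 3x3 image are hypothetical; the indexing follows the definition r [m, n] = p [(m − m′) %M, (n − n′) %N ] given above.

```python
def shift(p, dm, dn):
    """Wraparound shift: r[m][n] = p[(m - dm) % M][(n - dn) % N]."""
    M, N = len(p), len(p[0])
    return [[p[(m - dm) % M][(n - dn) % N] for n in range(N)] for m in range(M)]


# Hypothetical 3x3 image with distinct values so the movement is easy to follow.
p = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(shift(p, 1, 0))   # rows move down; the bottom row wraps to the top
print(shift(p, 0, -1))  # columns move left; the left-most column wraps to the right
```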
Local averaging (such as for noise suppression) causes a blurring of the image. This is most obvious along a sharp graylevel transition or edge. Although one might think that a blurry image is undesirable, it can be intended. For example, in motion pictures, shots of the “leading lady” often are blurred slightly to give a softer appearance (and attenuate blemishes and wrinkles).
Example. Consider the result r of convolving the image p with a kernel w that sums half of the pixel with one-quarter of its left and right neighbors:
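$$ p = \begin{bmatrix} \cdots & 0 & 0 & 0 & 16 & 16 & 16 & \cdots \end{bmatrix} $$
$$ w = \frac{MN}{4}\begin{bmatrix} 1 & \boxed{2} & 1 \end{bmatrix} $$
$$ r = \begin{bmatrix} \cdots & 0 & 0 & 4 & 12 & 16 & 16 & \cdots \end{bmatrix} $$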
The original image has a sharp graylevel transition that is blurred in the result.
Note that the two-dimensional digital-approximation of a Gaussian introduced as a blurring function in the previous subsection is separable with respect to convolution. That is, it is equivalent to two one-dimensional convolutions:
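$$ \frac{MN}{36}\begin{bmatrix} 1 & 4 & 1 \\ 4 & \boxed{16} & 4 \\ 1 & 4 & 1 \end{bmatrix} = \left(\frac{MN}{6}\begin{bmatrix} 1 & \boxed{4} & 1 \end{bmatrix}\right) \otimes \left(\frac{MN}{6}\begin{bmatrix} 1 \\ \boxed{4} \\ 1 \end{bmatrix}\right) $$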
Because convolution is associative, the 3×3 kernel and the sequence of the 1×3 and 3×1 kernels yield the same result when applied to an image. Note that not all convolution kernels are separable.
The computational complexity of convolution is the number of pixels times the number of non-zero kernel elements. The 3×3 kernel requires approximately 9M N multiplications and additions, but the sequence of the 1×3 and 3×1 kernels requires only 6M N multiplications and additions (3M N for each kernel). So, the sequence of these two smaller kernels can be applied with fewer operations than the larger kernel.
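The equivalence and the operation counts can be checked numerically. The sketch below is illustrative only: `convolve`, `scaled`, the sparse dictionary masks, and the small test image are assumptions introduced here, and the operation count simply multiplies the number of pixels by the number of non-zero kernel elements, as described above.

```python
def convolve(p, w):
    """Circular convolution normalized by M*N; w maps offsets (m', n') to weights."""
    M, N = len(p), len(p[0])
    return [
        [
            sum(p[(m - dm) % M][(n - dn) % N] * wt for (dm, dn), wt in w.items())
            / (M * N)
            for n in range(N)
        ]
        for m in range(M)
    ]


def scaled(weights, MN):
    """Attach the factor M*N that cancels the normalization in convolve()."""
    return {offset: MN * value for offset, value in weights.items()}


M, N = 4, 5
# Hypothetical test image with arbitrary integer values.
p = [[(3 * m + 7 * n) % 11 for n in range(N)] for m in range(M)]

a = {-1: 1 / 6, 0: 4 / 6, 1: 1 / 6}  # one-dimensional [1 4 1] / 6 profile
row = scaled({(0, j): a[j] for j in a}, M * N)                   # 1x3 kernel
col = scaled({(i, 0): a[i] for i in a}, M * N)                   # 3x1 kernel
full = scaled({(i, j): a[i] * a[j] for i in a for j in a}, M * N)  # 3x3 kernel

direct = convolve(p, full)
two_pass = convolve(convolve(p, row), col)
max_diff = max(
    abs(direct[m][n] - two_pass[m][n]) for m in range(M) for n in range(N)
)

ops_full = M * N * len(full)                    # 9 * M * N multiply-adds
ops_separable = M * N * (len(row) + len(col))   # 6 * M * N multiply-adds
print(max_diff, ops_full, ops_separable)
```

The two results agree up to floating-point rounding, which is why the comparison uses a maximum difference rather than exact equality.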
the convolution kernel for unsharp filtering is:
$$ r = p \otimes 2MN\delta - p \otimes w = p \otimes (2MN\delta - w) = p \otimes w' $$
where
$$ w' = \frac{MN}{4}\begin{bmatrix} -1 & \boxed{6} & -1 \end{bmatrix}. $$
Note that the unsharp subtraction with a blurring function is not the convolution inverse of blurring with that function. The problem of inverting convolution is presented after more mathematical tools are introduced.
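As a numerical illustration of the unsharp kernel, the sketch below applies w′ to a step edge and compares the result with twice the image minus the blurred image. It is a one-dimensional sketch under stated assumptions: the helper `convolve1d`, the dictionary masks, and the eight-pixel test row with wraparound are all hypothetical, and the factor N plays the role of MN for a single row.

```python
def convolve1d(p, w):
    """Circular 1-D convolution normalized by N; w maps offsets n' to weights."""
    N = len(p)
    return [
        sum(p[(n - k) % N] * wk for k, wk in w.items()) / N
        for n in range(N)
    ]


p = [0, 0, 0, 16, 16, 16, 16, 16]  # hypothetical row with a sharp edge
N = len(p)

blur = {-1: N / 4, 0: 2 * N / 4, 1: N / 4}        # w  = (N/4) [1 2 1]
unsharp = {-1: -N / 4, 0: 6 * N / 4, 1: -N / 4}   # w' = (N/4) [-1 6 -1]

blurred = convolve1d(p, blur)
sharpened = convolve1d(p, unsharp)

print(blurred)
print(sharpened)
```

The sharpened row undershoots just before the edge and overshoots just after it, which is the edge-emphasis effect of unsharp filtering; the values at the two ends come from the wraparound.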
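A two-dimensional unsharp filter based on the two-dimensional digital approximation of a Gaussian presented earlier would be:
$$ \frac{MN}{36}\begin{bmatrix} -1 & -4 & -1 \\ -4 & \boxed{56} & -4 \\ -1 & -4 & -1 \end{bmatrix} $$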
Convolution is also the operation by which discrete-difference analogs of directional derivatives are computed. Because digital images are discrete, they don’t have true derivatives, but the discrete differences between pixels indicate the rate of graylevel change in much the same way that derivatives do.
Example. Consider the result r of convolving image p and a kernel w that yields the difference between each pixel and its left neighbor:
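$$ p = \begin{bmatrix} \cdots & 0 & 0 & 8 & 8 & 8 & 0 & 0 & \cdots \end{bmatrix} $$
$$ w = MN\begin{bmatrix} \boxed{1} & -1 \end{bmatrix} $$
$$ r = \begin{bmatrix} \cdots & 0 & 0 & 8 & 0 & 0 & -8 & 0 & \cdots \end{bmatrix} $$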
This kernel w is not symmetric, and the asymmetry shifts the response to the rising and falling edges to the right of the transition. A similar kernel that computes the difference between the right neighbor and the pixel shifts the response to the left of the edge. The directional derivative is sometimes computed as the average of these two differences with adjacent neighbors, which eliminates the spatial shift and provides a smoother result:
$$ w = \frac{1}{2}\left( MN\begin{bmatrix} 0 & \boxed{1} & -1 \end{bmatrix} + MN\begin{bmatrix} 1 & \boxed{-1} & 0 \end{bmatrix} \right) = \frac{MN}{2}\begin{bmatrix} 1 & \boxed{0} & -1 \end{bmatrix} $$
This result is anti-symmetric. That is, there is symmetry in magnitude, but the value is negated:
w [m, n] = −w [−m, −n].
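The three difference kernels can be compared on the example row. The following is a sketch with assumed names (`convolve1d` and the dictionary masks keyed by offset n′) and assumed wraparound at the ends; the factor N stands in for MN on a single row.

```python
def convolve1d(p, w):
    """Circular 1-D convolution normalized by N; w maps offsets n' to weights."""
    N = len(p)
    return [
        sum(p[(n - k) % N] * wk for k, wk in w.items()) / N
        for n in range(N)
    ]


p = [0, 0, 8, 8, 8, 0, 0]
N = len(p)

left_diff = {0: N, 1: -N}           # p[n] - p[n-1]
right_diff = {-1: N, 0: -N}         # p[n+1] - p[n]
central = {-1: N / 2, 1: -N / 2}    # (p[n+1] - p[n-1]) / 2, average of the two

print(convolve1d(p, left_diff))
print(convolve1d(p, right_diff))
print(convolve1d(p, central))

# Anti-symmetry of the central-difference mask: w[n'] == -w[-n'].
anti_symmetric = all(central.get(k, 0) == -central.get(-k, 0) for k in (-1, 0, 1))
```

The central difference spreads each edge response over the two pixels on either side of the transition instead of placing it on one side.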
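For two-dimensional images, there are two directional, discrete-difference approximations of derivatives:
$$ \partial_x p = p \otimes \frac{MN}{2}\begin{bmatrix} 1 \\ \boxed{0} \\ -1 \end{bmatrix} $$
$$ \partial_y p = p \otimes \frac{MN}{2}\begin{bmatrix} 1 & \boxed{0} & -1 \end{bmatrix} $$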
The pair of these differences is called the gradient:
∇p = (∂xp, ∂yp)
The gradient can be written in Cartesian coordinates (as above) or in radial coordinates with magnitude and angle:
$$ |\nabla p| = \sqrt{(\partial_x p)^2 + (\partial_y p)^2} $$
$$ \nabla_\phi p = \tan^{-1}\left( \frac{\partial_y p}{\partial_x p} \right) $$
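A sketch of the gradient in magnitude-and-angle form follows. Several choices here are assumptions rather than the notes' method: central differences with wraparound are used for both directions, ∂x is taken along the row index m and ∂y along the column index n, and `math.atan2` replaces the plain arctangent of the ratio so that pixels with ∂x p = 0 do not cause a division by zero.

```python
import math


def central_difference(p, axis):
    """Wraparound central difference along rows (axis=0) or columns (axis=1)."""
    M, N = len(p), len(p[0])
    out = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            if axis == 0:
                out[m][n] = (p[(m + 1) % M][n] - p[(m - 1) % M][n]) / 2
            else:
                out[m][n] = (p[m][(n + 1) % N] - p[m][(n - 1) % N]) / 2
    return out


def gradient_polar(p):
    """Return (magnitude, angle) images from the two directional differences."""
    dx = central_difference(p, axis=0)  # assumed: x follows the row index m
    dy = central_difference(p, axis=1)  # assumed: y follows the column index n
    M, N = len(p), len(p[0])
    magnitude = [[math.hypot(dx[m][n], dy[m][n]) for n in range(N)] for m in range(M)]
    angle = [[math.atan2(dy[m][n], dx[m][n]) for n in range(N)] for m in range(M)]
    return magnitude, angle


# Hypothetical image: a ramp that increases by 3 per column.
p = [[3 * n for n in range(4)] for _ in range(4)]
magnitude, angle = gradient_polar(p)
print(magnitude[1][1], angle[1][1])
```

At interior pixels of this ramp the gradient magnitude equals the slope of the ramp; the first and last columns are affected by the wraparound.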
Without wraparound and ignoring the scale of the mask, serial multiplication yields:
Recognizing the product of the [0, 0] elements (noted with the square), implementing wraparound, and scaling yields:
Convolution is a special sort of matrix multiplication.
Example. Consider the one-dimensional convolution:
$$ \begin{bmatrix} 12 & 4 & 0 & 0 & 4 & 12 & 16 & 16 \end{bmatrix} \otimes \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} $$
Without wraparound, this operation can be written as matrix multiplication as:
With wraparound, this operation can be written as matrix multiplication as:
For convolution without wraparound, the convolution mask defines a matrix with identical values along each diagonal. A matrix with identical values along each diagonal is called a Toeplitz matrix. This matrix has many special properties and is well documented in the mathematical literature.
For convolution with wraparound, the convolution mask defines a Toeplitz matrix with the additional constraint that values that cycle off the end of the previous row are circulated back to the beginning of the new row. This type of matrix is called circulant. Circulant matrices have many important properties that are well documented in the mathematical literature. One of the most important properties is that the eigenvectors of all circulant matrices are the complex exponentials. This is the basis of discrete Fourier methods.
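The circulant structure can be made concrete with a short sketch. The helper names `circulant` and `matvec` are hypothetical, and the mask is assumed to carry the weights 1/4, 1/2, 1/4 (the scale of the mask in the example above is not stated), applied to the eight-element row from the example.

```python
def circulant(weights, N):
    """N x N matrix C with C[n][k] = w[(n - k) mod N], so that (C x)[n] is the
    wraparound convolution sum over k of w[n - k] * x[k]."""
    full = [0.0] * N
    for offset, weight in weights.items():
        full[offset % N] += weight
    return [[full[(n - k) % N] for k in range(N)] for n in range(N)]


def matvec(C, x):
    """Plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]


x = [12, 4, 0, 0, 4, 12, 16, 16]
weights = {-1: 0.25, 0: 0.5, 1: 0.25}  # assumed scale: weights sum to one
C = circulant(weights, len(x))

for row in C:
    print(row)
print(matvec(C, x))

# Toeplitz: every diagonal is constant. Circulant: each row is the previous
# row rotated one place to the right.
is_toeplitz = all(
    C[n][k] == C[n - 1][k - 1] for n in range(1, len(x)) for k in range(1, len(x))
)
is_circulant = all(C[n] == C[n - 1][-1:] + C[n - 1][:-1] for n in range(1, len(x)))
```

Dropping the two corner entries of C (the ones produced by the modulus) gives the banded Toeplitz matrix for convolution without wraparound.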
Expressing two-dimensional convolutions in matrix form is a more tedious process entailing vectorization (typically by raster ordering) of the two-dimensional image and the convolution mask.
Convolution is related to the cross-correlation function. The convolution of images p and q is:
$$ r[m,n] = \frac{1}{MN}\sum_{m'=0}^{M-1}\sum_{n'=0}^{N-1} p[m-m',\,n-n']\;q[m',n'] $$
The cross-correlation of images p and q is:
$$
\begin{aligned}
R_{p,q}[m,n] &= \frac{1}{MN}\sum_{m'=0}^{M-1}\sum_{n'=0}^{N-1} p[m',n']\;q[m'+m,\,n'+n] \\
&= \frac{1}{MN}\sum_{m'=0}^{M-1}\sum_{n'=0}^{N-1} p[m'-m,\,n'-n]\;q[m',n'].
\end{aligned}
$$
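The second form of the sum has the shape of a convolution in which p is indexed in reverse. The one-dimensional sketch below checks that numerically. The function names, the test sequences, and the use of the same 1/N normalization for both operations are assumptions made for this illustration.

```python
def convolve1d(p, q):
    """Circular convolution: (1/N) * sum over k of p[n - k] * q[k]."""
    N = len(p)
    return [sum(p[(n - k) % N] * q[k] for k in range(N)) / N for n in range(N)]


def cross_correlate1d(p, q):
    """Circular cross-correlation: (1/N) * sum over k of p[k] * q[k + n]."""
    N = len(p)
    return [sum(p[k] * q[(k + n) % N] for k in range(N)) / N for n in range(N)]


p = [1, 2, 3, 4]
q = [0, 1, 0, 2]
N = len(p)

# Index-reversed copy of p with wraparound: reversed_p[k] = p[-k mod N].
reversed_p = [p[(-k) % N] for k in range(N)]

print(cross_correlate1d(p, q))
print(convolve1d(reversed_p, q))
```

Both lines print the same sequence, so cross-correlation can be computed with a convolution routine by reversing one operand.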
Prove p ⊗ (q ⊗ r) = (p ⊗ q) ⊗ r.
The definition of convolution is:
$$ p \otimes (q \otimes r) = \frac{1}{N}\sum_{n'} p[n-n'] \left( \frac{1}{N}\sum_{n''} q[n'-n'']\;r[n''] \right) $$
With the distributive property of scalar multiplication over addition and the commutative property of scalar addition:
$$ = \frac{1}{N}\sum_{n''} \left( \frac{1}{N}\sum_{n'} p[n-n']\;q[n'-n''] \right) r[n'']. $$
Let n′′′ = n′ − n′′. With periodicity:
$$
\begin{aligned}
&= \frac{1}{N}\sum_{n''} \left( \frac{1}{N}\sum_{n'''} p[n-n''-n''']\;q[n'''] \right) r[n''] \\
&= (p \otimes q) \otimes r
\end{aligned}
$$
Prove p ⊗ (q + r) = p ⊗ q + p ⊗ r.
p ⊗ (q + r) = p ⊗ q + p ⊗ r
The identity operand for convolution is the discrete delta function:
$$ N\delta[n] = \begin{cases} N & n = 0 \\ 0 & \text{otherwise} \end{cases} $$
$$ p[n] = \frac{1}{N}\sum_{n'=0}^{N-1} p[n-n']\,(N\delta[n']) $$
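The associative, distributive, and identity properties above can be spot-checked numerically. A check on particular sequences is not a proof; it only illustrates the identities. The sketch assumes one-dimensional wraparound convolution normalized by N, hypothetical test sequences, and exact rational arithmetic via `fractions.Fraction` so that equality tests are not affected by rounding.

```python
from fractions import Fraction


def convolve1d(p, q):
    """Circular convolution normalized by N, computed with exact fractions."""
    N = len(p)
    return [
        Fraction(sum(p[(n - k) % N] * q[k] for k in range(N)), N)
        for n in range(N)
    ]


def add(p, q):
    """Pixel-by-pixel sum of two sequences."""
    return [a + b for a, b in zip(p, q)]


# Hypothetical test sequences.
p = [1, 4, 0, 2]
q = [3, 0, 1, 1]
r = [0, 2, 5, 1]
N = len(p)
identity = [N, 0, 0, 0]  # N * delta[n]

associative = convolve1d(p, convolve1d(q, r)) == convolve1d(convolve1d(p, q), r)
distributive = convolve1d(p, add(q, r)) == add(convolve1d(p, q), convolve1d(p, r))
has_identity = convolve1d(p, identity) == p

print(associative, distributive, has_identity)
```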
Linear, shift-invariant systems are fundamental to digital image processing. Cameras, display monitors, and most acquisition and display devices as well as many image processing algorithms are generally linear and shift-invariant.
Linearity consists of two properties — additivity and scaling.
Additivity means that the system gives the same result whether operands are added before or after the system. That is, for a system O and two images p and q:
O {p + q} = O {p} + O {q}.
Scaling means that the system gives the same result whether a multiplicative gain is applied before or after the system. That is, for a system O, image p, and scale factor α:
O {αp} = αO {p}
Example. A system that multiplies each pixel value by a constant factor β has both the additivity property and the scaling property. Defining:
r = O {p}
where
r [m, n] = βp [m, n].
Then, the system has the additivity property:
O {p + q} = O {p} + O {q}
because
β (p [m, n] + q [m, n]) = βp [m, n] + βq [m, n].
Exercise. Show that the shift operation has the additivity property and the scaling property. That is, for:
r = S_{m′,n′} {p}
where:
r [m, n] = p [(m − m′) %M, (n − n′) %N ]
that:
S_{m′,n′} {p + q} = S_{m′,n′} {p} + S_{m′,n′} {q}
and
αS_{m′,n′} {p} = S_{m′,n′} {αp}.
Exercise. Define an operation that is linear (i.e., has both additivity and scaling) and show that it is linear.
Define an operation that is not linear (i.e., either does not have additivity or does not have scaling) and show that it is not linear.
Linearity combines additivity and scaling, so a system O is linear if and only if:
O {αp + βq} = αO {p} + βO {q}.
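The combined condition lends itself to a numerical spot check. The sketch below is illustrative: `gain` follows the constant-factor example above, while `square`, the helper names, and the test images are hypothetical. Passing the check for one pair of images does not prove linearity, but a single failure is enough to show that a system is not linear.

```python
def gain(p, beta=3):
    """Multiply every pixel by a constant factor beta."""
    return [[beta * v for v in row] for row in p]


def square(p):
    """Square every pixel (a hypothetical non-linear system)."""
    return [[v * v for v in row] for row in p]


def combine(a, p, b, q):
    """Pixel-by-pixel a*p + b*q."""
    return [[a * x + b * y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]


def satisfies_linearity(op, p, q, a, b):
    """Check O{a p + b q} == a O{p} + b O{q} for one choice of p, q, a, b."""
    return op(combine(a, p, b, q)) == combine(a, op(p), b, op(q))


# Hypothetical integer test images, so the comparison is exact.
p = [[1, 2], [3, 4]]
q = [[0, 5], [2, 1]]

print(satisfies_linearity(gain, p, q, 2, -3))
print(satisfies_linearity(square, p, q, 2, -3))
```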
Linearity can be generalized by induction for any number of images. That is, a system O is linear if and only if:
$$ O\left\{ \sum_i \alpha_i p_i \right\} = \sum_i \alpha_i\, O\{p_i\} $$
A system is shift-invariant if the result is the same whether the shifting occurs before or after the system. That is, a system O is shift-invariant if and only if:
S_{m′,n′} {O {p}} = O {S_{m′,n′} {p}}.
Again, wraparound shift is assumed.
Example. A system that multiplies each pixel value by a constant factor β has the property of shift-invariance. Defining:
r = O {p}
where
r [m, n] = βp [m, n].
Then:
O {S_{m′,n′} {p}} = S_{m′,n′} {O {p}}
because
βp [m − m′, n − n′] = βp [m − m′, n − n′]
Example. A system that adds the value of the pixel at [0, 0] to each pixel does not have the property of shift-invariance. Defining:
r = O {p}
where
r [m, n] = p [m, n] + p [0, 0].
Then, there are images for which:
O {S_{m′,n′} {p}} ≠ S_{m′,n′} {O {p}}
because
p [m − m′, n − n′] + p [−m′, −n′] ≠ p [m − m′, n − n′] + p [0, 0]
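Both examples can be checked numerically by applying the system and the shift in either order. This is a sketch with assumed names (`shift`, `gain`, `add_origin`, `commutes_with_shift`) and a hypothetical 3x3 test image; wraparound shift is used, as in the definition above.

```python
def shift(p, dm, dn):
    """Wraparound shift: r[m][n] = p[(m - dm) % M][(n - dn) % N]."""
    M, N = len(p), len(p[0])
    return [[p[(m - dm) % M][(n - dn) % N] for n in range(N)] for m in range(M)]


def gain(p, beta=3):
    """Multiply every pixel by a constant factor beta."""
    return [[beta * v for v in row] for row in p]


def add_origin(p):
    """Add the value of the pixel at [0, 0] to every pixel."""
    return [[v + p[0][0] for v in row] for row in p]


def commutes_with_shift(op, p, dm, dn):
    """Check O{S{p}} == S{O{p}} for one image and one shift."""
    return op(shift(p, dm, dn)) == shift(op(p), dm, dn)


# Hypothetical test image with distinct values.
p = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(commutes_with_shift(gain, p, 1, 0))        # the gain system commutes
print(commutes_with_shift(add_origin, p, 1, 0))  # the add-origin system does not
```

For the second system, shifting first moves a different pixel into position [0, 0], so a different constant is added than when the system is applied before the shift.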