Linear Minimum Mean-Square Error Filtering - Lecture Notes | ECE 6010, Study notes of Stochastic Processes

Material Type: Notes; Class: Stochastic Processes in Electronic Systems; Subject: Electrical & Computer Engr; University: Utah State University; Term: Unknown 1989;


Uploaded on 07/30/2009



ECE 6010

Lecture 9 – Linear Minimum Mean-Square Error Filtering

Background

Recall that for random variables $X$ and $Y$ with finite variance, the MSE $E[(X - h(Y))^2]$ is minimized by $h(Y) = E[X|Y]$. That is, the best estimate of $X$ given a measured value of $Y$ is the conditional average of $X$. One aspect of this estimate is that:

The error is orthogonal to the data.

More precisely, the error $X - E[X|Y]$ is orthogonal to $Y$ and to every function of $Y$:

\[ E[(X - E[X|Y])\,g(Y)] = 0 \]

for all measurable functions $g$ with $E[g^2(Y)] < \infty$. We want to show that $h$ minimizes $E[(X - h(Y))^2]$ if and only if $E[(X - h(Y))\,g(Y)] = 0$ (orthogonality) for all measurable $g$ such that $E[g^2(Y)] < \infty$.

First, the conditional expectation satisfies the orthogonality condition:

\[ E[(X - E[X|Y])\,g(Y)] = E\big[E[(X - E[X|Y])\,|\,Y]\,g(Y)\big] = E[(E[X|Y] - E[X|Y])\,g(Y)] = 0. \]

Conversely, suppose that for some $g$, $E[(X - h(Y))\,g(Y)] \neq 0$. Consider the estimate

\[ \hat{h}(Y) = h(Y) + \alpha g(Y), \]

where

\[ \alpha = \frac{E[(X - h(Y))\,g(Y)]}{E[g^2(Y)]}. \]

Then

\[ E[(X - \hat{h}(Y))^2] = E[(X - h(Y))^2] - \frac{\big(E[(X - h(Y))\,g(Y)]\big)^2}{E[g^2(Y)]} < E[(X - h(Y))^2]. \]
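The improvement step in the converse can be checked numerically. The following is a minimal sketch and not part of the original notes: the toy model $X = Y^2 + \text{noise}$, the deliberately suboptimal estimate $h(Y) = 0.5Y$, and the test function $g(Y) = Y^2$ are all illustrative assumptions, and expectations are replaced by sample averages.

```python
import numpy as np

def improve_estimate(x, h_y, g_y):
    """Return (alpha, mse_before, mse_after) for the update h_hat = h + alpha * g.

    Expectations are approximated by sample means over the arrays.
    """
    err = x - h_y
    alpha = np.mean(err * g_y) / np.mean(g_y ** 2)
    mse_before = np.mean(err ** 2)
    mse_after = np.mean((x - (h_y + alpha * g_y)) ** 2)
    return alpha, mse_before, mse_after

rng = np.random.default_rng(0)
y = rng.normal(size=100_000)
x = y ** 2 + 0.1 * rng.normal(size=y.size)   # assumed toy model, not from the notes

h_y = 0.5 * y        # hypothetical suboptimal estimate h(Y)
g_y = y ** 2         # hypothetical test function g(Y), correlated with the error

alpha, mse_before, mse_after = improve_estimate(x, h_y, g_y)

# Drop in MSE predicted by the formula (E[(X - h) g])^2 / E[g^2].
predicted_drop = np.mean((x - h_y) * g_y) ** 2 / np.mean(g_y ** 2)
```

With sample means in place of expectations, the decrease `mse_before - mse_after` matches `predicted_drop` up to rounding, which mirrors the algebra in the proof.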

Suppose now we are given two random processes $\{X_t\}$ and $\{Y_t\}$ that are statistically related (that is, not independent). Suppose, to begin, that $T = \mathbb{R}$. Suppose we observe $Y$ over the interval $[a, b]$, and based on the information gained we want to estimate $X_t$ for some fixed $t$ as a function of $\{Y_\tau, a \le \tau \le b\}$. That is, we form

\[ \hat{X}_t = f(\{Y_\tau, a \le \tau \le b\}) \]

for some functional $f$ mapping the observed function to the real numbers.

  • If $t < b$: we say that the operation is smoothing.
  • If $t = b$: we say that the operation is filtering.
  • If $t > b$: we say that the operation is prediction.

The error in the estimate is $X_t - \hat{X}_t$. The mean-squared error is $E[(X_t - \hat{X}_t)^2]$.

Fact (built on our previous intuition): The MSE $E[(X_t - \hat{X}_t)^2]$ is minimized by the conditional expectation

\[ \hat{X}_t = E[X_t \,|\, Y_\tau, a \le \tau \le b]. \]

Furthermore, the orthogonality principle applies: $X_t - E[X_t \,|\, Y_\tau, a \le \tau \le b]$ is orthogonal to every function of $\{Y_\tau, a \le \tau \le b\}$. While we know the theoretical result, it is difficult in general to compute the desired conditional expectation.

Definition 1 Suppose $\{Y_t\}$ is second order. Let $H_Y$ be the set of all random variables of the form

\[ \sum_{i=1}^{n} a_i Y_{t_i} + c \]

for $n \in \mathbb{Z}^+$, $a_i, c \in \mathbb{R}$, and $t_i \in [a, b]$. □

Note that $H_Y$ may also include limits of infinite sequences of such sums, where the limits are taken in the mean-square sense. The set $H_Y$ therefore contains mean-square derivatives, mean-square integrals, and other linear transformations of $\{Y_t, t \in [a, b]\}$. (The set $H_Y$ is the Hilbert space generated by the linear span of the $Y_t$.)

Let's now solve

\[ \min_{\hat{X}_t \in H_Y} E[(X_t - \hat{X}_t)^2]. \tag{*} \]

A couple of important properties:

  • If $E[X_t^2] < \infty$, then $\hat{X}_t \in H_Y$ solves (*) if and only if $E[(X_t - \hat{X}_t)Z] = 0$ for all $Z \in H_Y$. That is, the error is orthogonal to all elements of $H_Y$.

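Proof “If”: Suppose $\hat{X}_t \in H_Y$ satisfies $E[(X_t - \hat{X}_t)Z] = 0$ for all $Z \in H_Y$. Let $X_t^*$ be any element of $H_Y$. Then

\[
\begin{aligned}
E[(X_t - X_t^*)^2] &= E[(X_t - \hat{X}_t + \hat{X}_t - X_t^*)^2] \\
&= E[(X_t - \hat{X}_t)^2] + 2E\big[(X_t - \hat{X}_t)\underbrace{(\hat{X}_t - X_t^*)}_{\in H_Y}\big] + E[(\hat{X}_t - X_t^*)^2] \\
&= E[(X_t - \hat{X}_t)^2] + E[(\hat{X}_t - X_t^*)^2] \\
&\ge E[(X_t - \hat{X}_t)^2].
\end{aligned}
\]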

So the orthogonality condition is sufficient for achieving the MMSE.

“Only if”: Suppose $\hat{X}_t \in H_Y$, and there is an element $Z \in H_Y$ such that $E[(X_t - \hat{X}_t)Z] \neq 0$. We will show that there would then be a better estimate. Let

\[ X_t^* = \hat{X}_t + \frac{E[(X_t - \hat{X}_t)Z]}{E[Z^2]}\, Z. \]

Then

\[ E[(X_t - X_t^*)^2] = E[(X_t - \hat{X}_t)^2] - \frac{\big(E[(X_t - \hat{X}_t)Z]\big)^2}{E[Z^2]} < E[(X_t - \hat{X}_t)^2]. \]

So $\hat{X}_t$ cannot be the MMSE estimator, which implies the necessity of the orthogonality condition. □
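For a finite set of observations, the orthogonality condition can be illustrated with ordinary least squares. This is a minimal sketch under assumptions that are not in the notes: three observations $Y_1, Y_2, Y_3$, a made-up nonlinear relation between $X$ and the data, and sample averages in place of expectations. The helper name `linear_mmse` is hypothetical.

```python
import numpy as np

def linear_mmse(y, x):
    """Fit x_hat = y @ a + c minimizing the sample mean-square error.

    y: (num_samples, n) observations; x: (num_samples,) target.
    """
    design = np.column_stack([y, np.ones(len(x))])  # the column of ones carries c
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef[:-1], coef[-1]

rng = np.random.default_rng(1)
y = rng.normal(size=(50_000, 3)) + 2.0            # assumed nonzero-mean observations
# Assumed toy relation: X is not linear in Y, so the linear estimate has real error.
x = y @ np.array([1.0, -0.5, 0.25]) + y[:, 0] ** 2 + rng.normal(size=50_000)

a, c = linear_mmse(y, x)
err = x - (y @ a + c)

mean_err = np.mean(err)                            # sample version of E[X - X_hat]
corr_with_data = y.T @ err / len(err)              # sample version of E[(X - X_hat) Y_i]
z = y @ np.array([3.0, 1.0, -2.0]) + 7.0           # an arbitrary element of the linear span
corr_with_z = np.mean(err * z)
```

The residual of the fitted linear estimate is orthogonal (in the sample sense) to the constant, to each observation, and hence to any affine combination `z` of them, even though the true relation is nonlinear.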

  • $E[(X_t - \hat{X}_t)Z] = 0$ for all $Z \in H_Y$ if and only if $E[\hat{X}_t] = E[X_t]$ and $E[(X_t - \hat{X}_t)Y_\tau] = 0$ for all $\tau \in [a, b]$. This is a restatement of orthogonality, but for a restricted space.

Proof “Only if” (necessity): We want to show that $E[(X_t - \hat{X}_t)Z] = 0$ for all $Z \in H_Y$ only if $E[X_t - \hat{X}_t] = 0$ and $E[(X_t - \hat{X}_t)Y_\tau] = 0$ for all $\tau \in [a, b]$. But this follows by definition, since $1 \in H_Y$ and $Y_\tau \in H_Y$ for each $\tau \in [a, b]$.

“If” (sufficiency): Suppose $Z \in H_Y$ and $E[X_t - \hat{X}_t] = 0$ and $E[(X_t - \hat{X}_t)Y_\tau] = 0$ for all $\tau \in [a, b]$. (That is, the error is orthogonal to each $Y_\tau$.) Then for

\[ Z = \lim_{n \to \infty} (\text{m.s.}) \Big( \sum_{i=1}^{n} a_i Y_{t_i} + c \Big) \]

we have

\[ E[(X_t - \hat{X}_t)Z] = \lim_{n \to \infty} E\Big[(X_t - \hat{X}_t)\Big( \sum_{i=1}^{n} a_i Y_{t_i} + c \Big)\Big], \]

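… for $-\infty < \tau < \infty$. Let $\nu = \sigma - \tau$. Then

\[ R_{XY}(t - \tau) = \int_{-\infty}^{\infty} h(t, \nu + \tau)\, R_Y(\nu)\, d\nu. \]

Let $s = t - \tau$:

\[ R_{XY}(s) = \int_{-\infty}^{\infty} h(t, \nu + t - s)\, R_Y(\nu)\, d\nu. \]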

Observe that the left-hand side is independent of $t$. Thus, if there is a solution, there must be a solution which is independent of $t$. This means that there is a time-invariant solution, $h(t, \sigma) = h_0(t - \sigma)$; we will call it $h_0$. With this form, $h(t, \nu + t - s) = h_0(s - \nu)$, and we can write

\[ R_{XY}(s) = \int_{-\infty}^{\infty} h_0(s - \nu)\, R_Y(\nu)\, d\nu. \]

That is, $R_{XY} = h_0 * R_Y$.

How to solve for $h_0$? The easiest way is to use Fourier transforms:

\[ S_{XY}(\omega) = H_0(\omega)\, S_Y(\omega) \]

so

\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)}. \]

The filter in this case is called a Non-Causal Wiener Filter.
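The spectral division can be tried on a discrete, periodic analogue of the convolution equation $R_{XY} = h_0 * R_Y$. This is a sketch under stated assumptions, not the continuous-time setting of the notes: correlations are sampled on a length-64 circular lag grid, the autocorrelation `r_y` and the two-sided filter `h_true` are made up for illustration, and the FFT stands in for the Fourier transform.

```python
import numpy as np

def wiener_from_correlations(r_xy, r_y):
    """Solve r_xy = h0 (*) r_y (circular convolution) for h0 by spectral division."""
    s_xy = np.fft.fft(r_xy)
    s_y = np.fft.fft(r_y)
    return np.real(np.fft.ifft(s_xy / s_y))

n = 64
lags = np.arange(n)
circ = np.minimum(lags, n - lags)            # circular lag distance |k|

# Assumed autocorrelation: exponentially decaying part plus a white-noise spike at lag 0.
r_y = 0.8 ** circ + 0.5 * (lags == 0)

# Hypothetical two-sided (non-causal) impulse response used to build r_xy.
h_true = 0.3 * 0.5 ** circ
r_xy = np.real(np.fft.ifft(np.fft.fft(h_true) * np.fft.fft(r_y)))

h_est = wiener_from_correlations(r_xy, r_y)
```

Because the assumed `r_y` includes a white-noise term, its spectrum stays away from zero and the division is well conditioned; the recovered `h_est` agrees with `h_true` up to rounding.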

Example 2 Suppose

\[ Y_t = S_t + N_t \]

where $S_t$ is some signal (random process) of interest and $N_t$ is some “noise” process. Assume that $S_t$ and $N_t$ are independent, and individually and jointly W.S.S. Also assume that they are zero mean. Let $X_t = S_{t+\lambda}$.

Given $\{Y_t\}$, we want to estimate $X_t = S_{t+\lambda}$. If $\lambda = 0$, this is a filtering problem. If $\lambda > 0$, this is a prediction problem. If $\lambda < 0$, this is a smoothing problem. We find

\[ S_Y(\omega) = S_S(\omega) + S_N(\omega) + 2\,\mathrm{Re}[S_{SN}(\omega)] \]

\[ S_{XY}(\omega) = e^{i\omega\lambda} S_{SY}(\omega) = e^{i\omega\lambda}\,[S_S(\omega) + S_{SN}(\omega)]. \]

We obtain the transfer function

\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)} = \frac{S_S(\omega) + S_{SN}(\omega)}{S_S(\omega) + S_N(\omega) + 2\,\mathrm{Re}[S_{SN}(\omega)]}\, e^{i\omega\lambda}. \]

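If signal and noise are orthogonal,

\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)} = \frac{S_S(\omega)}{S_S(\omega) + S_N(\omega)}\, e^{i\omega\lambda}. \]

Let us look at the amplitude gain part:

\[ \frac{S_S}{S_S + S_N} = \frac{S_S/S_N}{S_S/S_N + 1} \approx \begin{cases} 1 & S_S/S_N \gg 1 \\ 0 & S_S/S_N \ll 1 \end{cases} \]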
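The two limiting cases of the gain are easy to tabulate. A minimal sketch, where the function name `wiener_gain` and the sample SNR values are illustrative choices rather than anything from the notes:

```python
import numpy as np

def wiener_gain(snr):
    """Amplitude gain S_S / (S_S + S_N) written as a function of snr = S_S / S_N."""
    snr = np.asarray(snr, dtype=float)
    return snr / (snr + 1.0)

# Low, unit, and high signal-to-noise ratios at a given frequency.
gains = wiener_gain([1e-3, 1.0, 1e3])
```

At frequencies where the signal dominates, the filter passes the observation nearly unchanged; where the noise dominates, it attenuates the observation toward zero.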

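It can be shown that the residual error for the noncausal Wiener filter is

\[ \mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Big[ S_X(\omega) - \frac{|S_{XY}(\omega)|^2}{S_Y(\omega)} \Big]\, d\omega. \]

This can be seen as follows:

\[ E[(X_t - \hat{X}_t)^2] = E[(X_t - \hat{X}_t)X_t] - E[(X_t - \hat{X}_t)\hat{X}_t]. \]

By orthogonality, the last term is 0, which implies that $E[X_t \hat{X}_t] = E[\hat{X}_t^2]$. We thus obtain

\[ E[(X_t - \hat{X}_t)^2] = E[X_t^2] - E[\hat{X}_t X_t] = E[X_t^2] - E[\hat{X}_t^2] = \frac{1}{2\pi} \int_{-\infty}^{\infty} [S_X(\omega) - S_{\hat{X}}(\omega)]\, d\omega. \]

Since $\hat{X}_t$ is the output of the filter $H_0$ driven by $Y_t$, we have $S_{\hat{X}}(\omega) = |H_0(\omega)|^2 S_Y(\omega) = |S_{XY}(\omega)|^2 / S_Y(\omega)$, which gives the expression above.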

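The MMSE is sometimes written as

\[ \mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\,[1 - |\rho_{XY}(\omega)|^2]\, d\omega \]

where

\[ \rho_{XY}(\omega) = \frac{S_{XY}(\omega)}{\sqrt{S_X(\omega) S_Y(\omega)}}. \]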

Example 3 For the signal + noise problem, we have

\[ \mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{S_S(\omega) S_N(\omega)}{S_S(\omega) + S_N(\omega)}\, d\omega. \]

□
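The integrand follows from the general MMSE expression in two steps. As a sketch, assuming signal and noise are orthogonal ($S_{SN} = 0$) as in the simplified form of Example 2, so that $S_X = S_S$, $S_{XY} = e^{i\omega\lambda} S_S$, and $S_Y = S_S + S_N$:

```latex
\begin{aligned}
S_X(\omega) - \frac{|S_{XY}(\omega)|^2}{S_Y(\omega)}
  &= S_S(\omega) - \frac{S_S^2(\omega)}{S_S(\omega) + S_N(\omega)} \\
  &= \frac{S_S(\omega)\, S_N(\omega)}{S_S(\omega) + S_N(\omega)}.
\end{aligned}
```

The factor $e^{i\omega\lambda}$ has unit magnitude, so under these assumptions the MMSE of the non-causal filter does not depend on $\lambda$.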

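Example 4 Let us now do the signal + noise problem for a particular signal source. Suppose

\[ S_S(\omega) = \frac{A^2}{\alpha^2 + \omega^2} \]

and

\[ S_N(\omega) = \frac{N_0}{2} \]

(white noise). Then, writing $\beta^2 = \alpha^2 + 2A^2/N_0$,

\[
\begin{aligned}
H_0(\omega) &= \frac{S_S(\omega)}{S_S(\omega) + S_N(\omega)}\, e^{i\omega\lambda}
= \frac{A^2/(\alpha^2 + \omega^2)}{A^2/(\alpha^2 + \omega^2) + N_0/2}\, e^{i\omega\lambda} \\
&= \frac{2A^2}{2A^2 + N_0(\alpha^2 + \omega^2)}\, e^{i\omega\lambda}
= \frac{2A^2/N_0}{\omega^2 + \alpha^2 + 2A^2/N_0}\, e^{i\omega\lambda} \\
&= \frac{2A^2/N_0}{\omega^2 + \beta^2}\, e^{i\omega\lambda}
= \frac{A^2}{\beta N_0} \cdot \frac{2\beta}{\omega^2 + \beta^2}\, e^{i\omega\lambda}.
\end{aligned}
\]

Then

\[ h_0(t) = \frac{A^2}{\beta N_0}\, e^{-\beta |t + \lambda|}. \]

This is not a causal filter. Plot for various values of $\lambda$. □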
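The transform pair in Example 4 can be checked numerically before plotting. The following is a minimal sketch: the parameter values $A = \alpha = N_0 = 1$ and $\lambda = 0.7$ are arbitrary illustrative choices, the function names are hypothetical, and the Fourier integral $H_0(\omega) = \int h_0(t) e^{-i\omega t}\, dt$ is approximated by a Riemann sum on a truncated grid.

```python
import numpy as np

def wiener_impulse_response(t, lam, A=1.0, alpha=1.0, N0=1.0):
    """h_0(t) = A^2 / (beta * N0) * exp(-beta * |t + lam|), beta^2 = alpha^2 + 2 A^2 / N0."""
    beta = np.sqrt(alpha ** 2 + 2.0 * A ** 2 / N0)
    return A ** 2 / (beta * N0) * np.exp(-beta * np.abs(t + lam))

def wiener_transfer_function(omega, lam, A=1.0, alpha=1.0, N0=1.0):
    """H_0(w) = S_S / (S_S + S_N) * exp(i w lam) for the spectra of Example 4."""
    s_s = A ** 2 / (alpha ** 2 + omega ** 2)
    s_n = N0 / 2.0
    return s_s / (s_s + s_n) * np.exp(1j * omega * lam)

# Compare the closed form with a numerical Fourier integral at a few frequencies.
t = np.linspace(-40.0, 40.0, 400_001)       # truncated time grid (assumed wide enough)
dt = t[1] - t[0]
lam = 0.7                                   # arbitrary illustrative shift
omegas = np.array([0.0, 0.5, 2.0])
h = wiener_impulse_response(t, lam)
H_numeric = np.array([np.sum(h * np.exp(-1j * w * t)) * dt for w in omegas])
H_closed = wiener_transfer_function(omegas, lam)
```

Evaluating `wiener_impulse_response` on the same grid for several values of `lam` gives the curves to plot: the two-sided exponential keeps its shape and its peak moves to $t = -\lambda$, so the response is nonzero for $t < 0$ for every $\lambda$.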