



Material Type: Notes; Class: Stochastic Processes in Electronic Systems; Subject: Electrical & Computer Engr; University: Utah State University; Term: Unknown 1989;




Recall that for random variables X and Y with finite variance, the MSE E[(X − h(Y))^2] is minimized by h(Y) = E[X|Y]. That is, the best estimate of X given a measured value of Y is the conditional mean of X given Y. One aspect of this estimate is that:
The error is orthogonal to the data.
More precisely, the error X − E[X|Y ] is orthogonal to Y and to every function of Y :
E[(X − E[X|Y ])g(Y )] = 0
for all measurable functions g. We will assume that E[g^2 (Y )] < ∞. We want to show that h minimizes E[(X−h(Y ))^2 ] if and only if E[(X−h(Y ))g(Y )] = 0 (orthogonality), for all measurable g such that E[g^2 (Y )] < ∞.
E[(X − E[X|Y ])g(Y )] = E[E[(X − E[X|Y ])|Y ]g(Y )] = E[(E[X|Y ] − E[X|Y ])g(Y )] = 0.
Conversely, suppose that for some g, E[(X − h(Y))g(Y)] ≠ 0. Consider the estimate
\[ \hat{h}(Y) = h(Y) + \alpha g(Y), \]
where
\[ \alpha = \frac{E[(X - h(Y))g(Y)]}{E[g^2(Y)]}. \]
Then
\[ E[(X - \hat{h}(Y))^2] = E[(X - h(Y))^2] - \frac{\left(E[(X - h(Y))g(Y)]\right)^2}{E[g^2(Y)]} < E[(X - h(Y))^2]. \]
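Both halves of this argument can be checked exactly on a small discrete example. The sketch below is only an illustration: the joint pmf, the test functions g, and the helper names (`expect`, `cond_mean`, `mse`, `improve`) are all assumptions made up for the demonstration, not part of the notes.

```python
# Joint pmf of (X, Y) on a small grid; the probabilities are made up for illustration.
PMF = {(0, 0): 0.2, (1, 0): 0.1, (2, 0): 0.1,
       (0, 1): 0.1, (1, 1): 0.2, (2, 1): 0.3}

def expect(f):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in PMF.items())

def cond_mean(y0):
    """E[X | Y = y0]."""
    mass = sum(p for (_, y), p in PMF.items() if y == y0)
    return sum(p * x for (x, y), p in PMF.items() if y == y0) / mass

def mse(h):
    """E[(X - h(Y))^2] for an estimator h."""
    return expect(lambda x, y: (x - h(y)) ** 2)

def improve(h, g):
    """Return h + alpha*g and the predicted MSE drop (E[(X - h)g])^2 / E[g^2]."""
    corr = expect(lambda x, y: (x - h(y)) * g(y))
    energy = expect(lambda x, y: g(y) ** 2)
    alpha = corr / energy
    return (lambda y: h(y) + alpha * g(y)), corr ** 2 / energy

# The error of the conditional mean is orthogonal to every function of Y.
test_functions = [lambda y: 1.0, lambda y: y, lambda y: (y - 0.3) ** 3]
residuals = [expect(lambda x, y, g=g: (x - cond_mean(y)) * g(y))
             for g in test_functions]

# The constant guess h(Y) = 1 is not orthogonal to g(Y) = Y, so it can be improved.
h = lambda y: 1.0
h_better, drop = improve(h, lambda y: y)

print("orthogonality residuals:", residuals)
print("MSE of h, h + alpha*g, E[X|Y]:", mse(h), mse(h_better), mse(cond_mean))
print("predicted drop:", drop)
```

The improved estimator still has a larger MSE than the conditional mean here, because only one direction g was corrected.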
Suppose now we are given two random processes {Xt} and {Yt} that are statistically related (that is, not independent). Suppose, to begin, that T = R. Suppose we observe Y over the interval [a, b], and based on the information gained we want to estimate Xt for some fixed t as a function of {Yτ, a ≤ τ ≤ b}. That is, we form
\[ \hat{X}_t = f(\{Y_\tau,\ a \le \tau \le b\}) \]
for some functional f mapping the observed function to a real number.
If t < b: We say that the operation of the function is smoothing.
If t = b: We say that the operation of the function is filtering.
If t > b: We say that the operation of the function is prediction.
The error in the estimate is Xt − X̂t. The mean-squared error is E[(Xt − X̂t)^2].
Fact (built on our previous intuition): The MSE E[(Xt − X̂t)^2] is minimized by the conditional expectation
\[ \hat{X}_t = E[X_t \mid Y_\tau,\ a \le \tau \le b]. \]
Furthermore, the orthogonality principle applies: Xt − E[Xt|Yτ , a ≤ τ ≤ b] is orthogonal to every function of {Yτ , a ≤ τ ≤ b}. While we know the theoretical result, it is difficult in general to compute the desired conditional expectation.
Definition 1 Suppose {Yt} is second order. Let Hy be the set of all random variables of the form
\[ \sum_{i=1}^{n} a_i Y_{t_i} + c \]
for n ∈ Z, ai, c ∈ R, and ti ∈ [a, b]. □
Note that Hy may include infinite sequences, so we assume mean-square limits. The set Hy contains mean-square derivatives, mean-square integrals, and other linear transformations of {Yt, t ∈ [a, b]}. (The set Hy is the Hilbert space generated by the linear span of Yt.)
Let's now solve
\[ \min_{\hat{X}_t \in H_y} E[(X_t - \hat{X}_t)^2]. \qquad (*) \]
A couple important properties:
Proof “If”: Suppose X̂t ∈ Hy satisfies E[(Xt − X̂t)Z] = 0 for all Z ∈ Hy. Let X*t be any element of Hy. Then
\[
\begin{aligned}
E[(X_t - X_t^*)^2] &= E[(X_t - \hat{X}_t + \hat{X}_t - X_t^*)^2] \\
&= E[(X_t - \hat{X}_t)^2] + 2E[(X_t - \hat{X}_t)\underbrace{(\hat{X}_t - X_t^*)}_{\in H_y}] + E[(\hat{X}_t - X_t^*)^2] \\
&= E[(X_t - \hat{X}_t)^2] + E[(\hat{X}_t - X_t^*)^2] \ge E[(X_t - \hat{X}_t)^2].
\end{aligned}
\]
So the orthogonality condition is sufficient for achieving MMSE.
“Only if”: Suppose X̂t ∈ Hy, and there is an element Z ∈ Hy such that E[(Xt − X̂t)Z] ≠ 0. We will show that there would then be a better estimate: Let
\[ X_t^* = \hat{X}_t + \frac{E[(X_t - \hat{X}_t)Z]}{E[Z^2]}\, Z. \]
Then
\[ E[(X_t - X_t^*)^2] = E[(X_t - \hat{X}_t)^2] - \frac{\left(E[(X_t - \hat{X}_t)Z]\right)^2}{E[Z^2]} < E[(X_t - \hat{X}_t)^2]. \]
So X̂t cannot be the MMSE estimator, which implies the necessity of the orthogonality condition. □
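In a finite-dimensional setting the orthogonality condition reduces to a set of linear (normal) equations. The sketch below assumes, purely for illustration, that the estimate is built from two zero-mean observations Y1, Y2 (so the constant c is 0) and that the second moments are known. The numbers and the helper names (`solve_2x2`, `mse`, `error_correlations`) are hypothetical. It checks that the solution makes the error orthogonal to each observation, and that any other coefficient choice pays the extra term E[(X̂ − X*)^2] from the “If” part of the proof.

```python
# Assumed second moments of zero-mean variables (illustrative numbers only).
R = [[2.0, 1.0], [1.0, 2.0]]   # R[i][j] = E[Y_i Y_j]
r = [1.0, 0.5]                 # r[j]    = E[X Y_j]
EX2 = 1.0                      # E[X^2]

def solve_2x2(R, r):
    """Solve R a = r by Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    return [(r[0] * R[1][1] - R[0][1] * r[1]) / det,
            (R[0][0] * r[1] - R[1][0] * r[0]) / det]

def quad(a):
    """a^T R a, i.e. E[(a_1 Y_1 + a_2 Y_2)^2]."""
    return sum(a[i] * R[i][j] * a[j] for i in range(2) for j in range(2))

def mse(a):
    """E[(X - a_1 Y_1 - a_2 Y_2)^2] expressed through second moments."""
    return EX2 - 2.0 * sum(a[j] * r[j] for j in range(2)) + quad(a)

def error_correlations(a):
    """E[(X - a_1 Y_1 - a_2 Y_2) Y_j] for j = 1, 2."""
    return [r[j] - sum(a[i] * R[i][j] for i in range(2)) for j in range(2)]

a_opt = solve_2x2(R, r)          # coefficients satisfying orthogonality
a_other = [0.4, 0.1]             # any other element of the span
diff = [a_opt[i] - a_other[i] for i in range(2)]

print("optimal coefficients:", a_opt)
print("error correlations  :", error_correlations(a_opt))
print("MSE optimal / other :", mse(a_opt), mse(a_other))
print("excess MSE a^T R a  :", quad(diff))
```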
Proof “Only if” (necessity): We want to show that E[(Xt − X̂t)Z] = 0 for all Z ∈ Hy only if E[Xt − X̂t] = 0 and E[(Xt − X̂t)Yτ] = 0 for all τ ∈ [a, b]. But this holds by definition, since 1 ∈ Hy and Yτ ∈ Hy for each τ ∈ [a, b].
“If” (sufficiency): Suppose Z ∈ Hy and E[Xt − X̂t] = 0 and E[(Xt − X̂t)Yτ] = 0 for all τ ∈ [a, b]. (That is, the error is orthogonal to each Yτ.) Then for
\[ Z = \lim_{n\to\infty} \Big( \sum_{i=1}^{n} a_i Y_{t_i} + c \Big) \]
we have
\[ E[(X_t - \hat{X}_t)Z] = \lim_{n\to\infty} E\Big[(X_t - \hat{X}_t)\Big(\sum_{i=1}^{n} a_i Y_{t_i} + c\Big)\Big] = \lim_{n\to\infty} \Big( \sum_{i=1}^{n} a_i E[(X_t - \hat{X}_t)Y_{t_i}] + c\,E[X_t - \hat{X}_t] \Big) = 0. \]
for −∞ < τ < ∞. Let ν = σ − τ. Then
\[ R_{XY}(t - \tau) = \int_{-\infty}^{\infty} h(t, \nu + \tau) R_Y(\nu)\, d\nu. \]
Let s = t − τ:
\[ R_{XY}(s) = \int_{-\infty}^{\infty} h(t, \nu + t - s) R_Y(\nu)\, d\nu. \]
Observe that the left-hand side is independent of t. Thus, if there is a solution, there must be a solution which is independent of t. This means that there is a time-invariant solution; we will call it h0. Then, by a particular choice of the form of h0 we can write
\[ R_{XY}(s) = \int_{-\infty}^{\infty} h_0(s - \nu) R_Y(\nu)\, d\nu. \]
That is, RXY = h0 ∗ RY.
How to solve for h0? The easiest way is to use Fourier transforms:
\[ S_{XY}(\omega) = H_0(\omega) S_Y(\omega), \]
so
\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)}. \]
The filter in this case is called a noncausal Wiener filter.
Example 2 Suppose
\[ Y_t = S_t + N_t \]
where St is some signal (random process) of interest and Nt is some “noise” process. Assume that St and Nt are independent, and individually and jointly W.S.S. Also assume that they are zero mean. Let Xt = S_{t+λ}.
Given {Yt}, we want to estimate Xt = S_{t+λ}. If λ = 0, this is a filtering problem. If λ > 0, this is a prediction problem. If λ < 0, this is a smoothing problem. We find
\[ S_Y(\omega) = S_S(\omega) + S_N(\omega) + 2\,\mathrm{Re}[S_{SN}(\omega)], \]
\[ S_{XY}(\omega) = e^{i\omega\lambda} S_{SY}(\omega) = e^{i\omega\lambda}[S_S(\omega) + S_{SN}(\omega)]. \]
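We obtain the transfer function
\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)} = \frac{S_S(\omega) + S_{SN}(\omega)}{S_S(\omega) + S_N(\omega) + 2\,\mathrm{Re}[S_{SN}(\omega)]}\, e^{i\omega\lambda}. \]
If signal and noise are orthogonal,
\[ H_0(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)} = \frac{S_S(\omega)}{S_S(\omega) + S_N(\omega)}\, e^{i\omega\lambda}. \]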
Let us look at the amplitude gain part:
\[ \frac{S_S(\omega)}{S_S(\omega) + S_N(\omega)}. \]
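To get a feel for this gain, the following sketch tabulates it against the per-frequency signal-to-noise ratio SS(ω)/SN(ω). It assumes, for illustration only, a signal spectrum A^2/(α^2 + ω^2) in white noise of level N0/2; the parameter values and function names are hypothetical choices. The gain is close to 1 where the signal dominates and falls toward 0 where the noise dominates.

```python
# Illustrative parameter values (assumptions, chosen only to give round numbers).
A, ALPHA, N0 = 1.0, 1.0, 2.0 / 3.0

def s_signal(w):
    """Assumed signal spectrum S_S(w) = A^2 / (alpha^2 + w^2)."""
    return A ** 2 / (ALPHA ** 2 + w ** 2)

def s_noise(w):
    """Assumed white-noise spectrum S_N(w) = N0 / 2."""
    return N0 / 2.0

def gain(w):
    """Amplitude gain S_S / (S_S + S_N) of the noncausal Wiener filter."""
    return s_signal(w) / (s_signal(w) + s_noise(w))

for w in (0.0, 1.0, 2.0, 5.0, 20.0):
    snr = s_signal(w) / s_noise(w)
    print(f"omega = {w:5.1f}   SNR = {snr:8.4f}   gain = {gain(w):.4f}")
```

Written in terms of the ratio ρ(ω) = SS(ω)/SN(ω), the gain is ρ/(1 + ρ), so it always lies strictly between 0 and 1 when both spectra are positive.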
It can be shown that the residual error for the noncausal Wiener filter is
\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ S_X(\omega) - \frac{|S_{XY}(\omega)|^2}{S_Y(\omega)} \right] d\omega. \]
This can be seen as follows:
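\[ E[(X_t - \hat{X}_t)^2] = E[(X_t - \hat{X}_t)X_t] - E[(X_t - \hat{X}_t)\hat{X}_t]. \]
By orthogonality, the last term is 0, which implies that E[Xt X̂t] = E[X̂t^2]. We thus obtain
\[ E[(X_t - \hat{X}_t)^2] = E[X_t^2] - E[\hat{X}_t X_t] = E[X_t^2] - E[\hat{X}_t^2] = \frac{1}{2\pi} \int_{-\infty}^{\infty} [S_X(\omega) - S_{\hat{X}}(\omega)]\, d\omega. \]
Since X̂t is the output of the filter H0 driven by Yt, its spectrum is |H0(ω)|^2 SY(ω) = |SXY(ω)|^2 / SY(ω), which recovers the residual-error expression.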
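The MMSE is sometimes written as
\[ \mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\,[1 - |\rho_{XY}(\omega)|^2]\, d\omega \]
where
\[ \rho_{XY}(\omega) = \frac{S_{XY}(\omega)}{\sqrt{S_X(\omega) S_Y(\omega)}}. \]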
Example 3 For the signal + noise problem, we have
\[ \mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{S_S(\omega) S_N(\omega)}{S_S(\omega) + S_N(\omega)}\, d\omega. \]
□
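The integrand in Example 3 follows from substituting the spectra of Example 2 into the residual-error formula. The short derivation below is a sketch under the assumption that signal and noise are orthogonal (so SSN = 0). In that case SX = SS because Xt is a time shift of St, SXY = e^{iωλ} SS, and SY = SS + SN.

```latex
\begin{align*}
S_X(\omega) - \frac{|S_{XY}(\omega)|^2}{S_Y(\omega)}
  &= S_S(\omega) - \frac{|e^{i\omega\lambda} S_S(\omega)|^2}{S_S(\omega) + S_N(\omega)} \\
  &= \frac{S_S(\omega)\,[S_S(\omega) + S_N(\omega)] - S_S^2(\omega)}{S_S(\omega) + S_N(\omega)} \\
  &= \frac{S_S(\omega)\, S_N(\omega)}{S_S(\omega) + S_N(\omega)}.
\end{align*}
```

The factor e^{iωλ} has unit magnitude, so under these assumptions the noncausal MMSE does not depend on λ.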
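Example 4 Let us now do the signal + noise problem for a particular signal source. Suppose
\[ S_S(\omega) = \frac{A^2}{\alpha^2 + \omega^2} \]
and
\[ S_N(\omega) = \frac{N_0}{2} \]
(white noise). Then
\[
\begin{aligned}
H_0(\omega) &= \frac{S_S(\omega)}{S_S(\omega) + S_N(\omega)}\, e^{i\omega\lambda}
 = \frac{A^2/(\alpha^2 + \omega^2)}{A^2/(\alpha^2 + \omega^2) + N_0/2}\, e^{i\omega\lambda} \\
&= \frac{2A^2}{2A^2 + N_0(\alpha^2 + \omega^2)}\, e^{i\omega\lambda}
 = \frac{2A^2/N_0}{\omega^2 + \alpha^2 + 2A^2/N_0}\, e^{i\omega\lambda} \\
&= \frac{2A^2/N_0}{\omega^2 + \beta^2}\, e^{i\omega\lambda}
 = \frac{A^2}{\beta N_0} \cdot \frac{2\beta}{\omega^2 + \beta^2}\, e^{i\omega\lambda},
\end{aligned}
\]
where β^2 = α^2 + 2A^2/N0.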
Then
\[ h_0(t) = \frac{A^2}{\beta N_0}\, e^{-\beta |t + \lambda|}. \]
This is not a causal filter. Plot for various values of λ. □
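As a numerical sanity check on Example 4, the sketch below evaluates h0 for a few values of λ and verifies the relation RXY = h0 ∗ RY on a grid. It relies on several assumptions that are not stated in the notes: β^2 = α^2 + 2A^2/N0, the parameter values A = 1, α = 1, N0 = 2/3 (chosen so that β = 2), the signal autocorrelation RS(τ) = (A^2/2α) e^{−α|τ|} as the inverse transform of SS, and RXY(s) = RS(s + λ) for orthogonal signal and noise. The function names and the trapezoidal integration are illustrative choices.

```python
import math

# Illustrative parameter values (assumptions, chosen so that beta = 2).
A, ALPHA, N0 = 1.0, 1.0, 2.0 / 3.0
BETA = math.sqrt(ALPHA ** 2 + 2.0 * A ** 2 / N0)

def h0(t, lam):
    """Impulse response h_0(t) = A^2 / (beta N0) * exp(-beta |t + lam|)."""
    return A ** 2 / (BETA * N0) * math.exp(-BETA * abs(t + lam))

def r_s(tau):
    """Signal autocorrelation, the inverse transform of A^2 / (alpha^2 + w^2)."""
    return A ** 2 / (2.0 * ALPHA) * math.exp(-ALPHA * abs(tau))

def conv_h0_ry(s, lam, half_width=20.0, n=40000):
    """(h_0 * R_Y)(s) with R_Y(tau) = R_S(tau) + (N0/2) delta(tau).

    The smooth part is integrated with the trapezoidal rule on
    [-half_width, half_width]; the delta term contributes (N0/2) h_0(s).
    """
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        nu = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * h0(s - nu, lam) * r_s(nu)
    return total * step + 0.5 * N0 * h0(s, lam)

# h_0 peaks at t = -lam and is nonzero for t < 0, so it is not causal.
for lam in (-1.0, 0.0, 1.0):
    samples = ", ".join(f"{h0(t, lam):.4f}" for t in (-2.0, -1.0, 0.0, 1.0, 2.0))
    print(f"lambda = {lam:+.1f}: h0 at t = -2..2 -> {samples}")

# Check R_XY(s) = R_S(s + lam) against the convolution at a few points.
for s, lam in ((0.0, 0.0), (0.5, 1.0), (-1.0, -0.5)):
    print(f"s = {s:+.1f}, lambda = {lam:+.1f}: "
          f"conv = {conv_h0_ry(s, lam):.6f}, R_S(s + lambda) = {r_s(s + lam):.6f}")
```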