Conditioning on a Random Variable with Continuous Distribution: Formula and Applications, Study notes of Probability and Statistics

This chapter from a statistics textbook explains how to apply the conditioning formula to random variables with continuous distributions. The author uses the Poisson process to illustrate the concept and derives the convolution formula for densities. The chapter also includes an example of a queuing problem and its surprising solution.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

Statistics 241: 25 October 2005, © David Pollard


Chapter 10

Conditioning on a random variable with a continuous distribution

At this point in the course, I hope you understand the importance of the conditioning formula

$$E(Y \mid \text{info}) = \sum_i P(F_i \mid \text{info})\, E(Y \mid F_i, \text{info})$$

for finite or countably infinite collections of disjoint events $F_i$ for which the whole sample space equals $\cup_i F_i$. As a particular case, if $X$ is a random variable that takes only a discrete set of values $\{x_1, x_2, \dots\}$ then

$$E(Y \mid \text{info}) = \sum_i P\{X = x_i \mid \text{info}\}\, E(Y \mid X = x_i, \text{info}).$$

This formula can be simplified by the introduction of the function

$$h(x) = E(Y \mid X = x, \text{info}).$$

For then

$$E(Y \mid \text{info}) = \sum_i P\{X = x_i \mid \text{info}\}\, h(x_i) = E(h(X) \mid \text{info}).$$
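The discrete conditioning formula can be checked exactly on a small example. The sketch below is illustrative only and does not come from the notes: it assumes a hypothetical setup in which $X$ is a fair die roll and, given $X = x$, $Y$ counts the heads in $x$ fair coin tosses, so that $h(x) = x/2$. The names `p_x`, `h`, and `joint` are made up for the illustration.

```python
from fractions import Fraction
from math import comb

# Hypothetical example: X is a fair die roll; given X = x, Y counts heads in x fair tosses.
values = range(1, 7)
p_x = {x: Fraction(1, 6) for x in values}

def h(x):
    # h(x) = E(Y | X = x) for a Binomial(x, 1/2) count
    return Fraction(x, 2)

# Right-hand side of the conditioning formula: sum_i P{X = x_i} h(x_i)
via_conditioning = sum(p_x[x] * h(x) for x in values)

# Direct computation of EY from the joint distribution P{X = x, Y = y}
joint = {(x, y): p_x[x] * Fraction(comb(x, y), 2 ** x)
         for x in values for y in range(x + 1)}
direct = sum(y * p for (x, y), p in joint.items())

print(via_conditioning, direct)
```

Both quantities come out to 7/4, as the formula predicts. Exact rational arithmetic is used so that the agreement is not blurred by rounding.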

In this Chapter, I want to persuade you that a similar formula applies when $X$ has a continuous distribution, with density function $f$ (given the info):

$$(\ast)\qquad E(Y \mid \text{info}) = E(h(X) \mid \text{info}) = \int_{-\infty}^{\infty} h(x) f(x)\, dx.$$

As a special case, when $Y$ equals the indicator function of an event $B$, the formula reduces to

$$PB = \int_{-\infty}^{\infty} P(B \mid X = x) f(x)\, dx.$$

From now on, I will omit explicit mention of the conditioning information “info”, writing $h(x)$ for $E(Y \mid X = x)$.

There are several ways to arrive at formula (∗). The most direct relies on the plausible assertion that

$$E(Y \mid X \in J) \approx h(x) \qquad \text{if } J \text{ is a small interval with } x \in J.$$

The error of approximation should disappear as $J$ shrinks to the point $x$. Split $\mathbb{R}$ into a union of disjoint, small intervals $J_i = [x_i, x_{i+1})$, where $x_{i+1} = x_i + \delta$, then condition:

$$EY = \sum_i P\{X \in J_i\}\, E(Y \mid X \in J_i) \approx \sum_i \delta f(x_i) h(x_i) \approx \int_{-\infty}^{\infty} h(x) f(x)\, dx.$$

The combined errors of all the approximations should disappear in the limit as $\delta$ tends to zero.
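The shrinking error can be seen numerically. The sketch below is only an illustration under assumptions that are not in the notes: $X$ is taken to be exponential with rate 1, and $h(x) = x$ (for instance, $Y = X$ plus independent zero-mean noise), so the integral of $h(x) f(x)$ equals 1. The function name `conditioning_sum` and the truncation point `upper` are hypothetical choices.

```python
import math

def f(x):
    # Assumed density of X for this illustration: exponential with rate 1
    return math.exp(-x) if x >= 0 else 0.0

def h(x):
    # Assumed conditional mean E(Y | X = x) = x
    return x

def conditioning_sum(delta, upper=50.0):
    # sum_i delta * f(x_i) * h(x_i) over left endpoints x_i = i * delta in [0, upper)
    n = int(round(upper / delta))
    return sum(delta * f(i * delta) * h(i * delta) for i in range(n))

for delta in (1.0, 0.1, 0.01):
    print(delta, conditioning_sum(delta))
```

As $\delta$ decreases, the printed sums move toward 1, the exact value of the integral in this toy case.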

Alternatively, we could start from a slightly less intuitive assumption that $EY$ should be nonnegative if $E(Y \mid X = x) \ge 0$ for every $x$. If we replace $Y$ by $Y - h(X)$ then we have

$$E(Y - h(X) \mid X = x) = E(Y \mid X = x) - h(x) = 0,$$

which gives $E(Y - h(X)) \ge 0$. A similar argument applied to $h(X) - Y$ gives $E(h(X) - Y) \ge 0$. The two inequalities together force $EY = Eh(X)$, and equality (∗) follows.

Remark. Notice that formula (∗) also implies that

$$(\ast\ast)\qquad E(Y g(X)) = E(g(X) h(X)),$$

at least for bounded functions $g$, because $E(Y g(X) \mid X = x) = g(x) h(x)$. In advanced probability theory, the treatment of conditional expectations becomes more abstract. Formula (∗∗) is used to define the conditional expectation $h(x) = E(Y \mid X = x)$. One needs to show that there exists a random variable of the form $h(X)$, which is uniquely determined up to trivial changes on sets of zero probability, for which $E\, g(X)(Y - h(X)) = 0$ for every bounded $g$. Essentially $h(X)$ is the best approximation to $Y$ using only information given by $X$. With this abstract approach, one then needs to show that conditional expectations have the properties that I have taken as axiomatic for Stat 241.
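Formula (∗∗) lends itself to a quick Monte Carlo check. The following is a rough sketch under assumptions chosen purely for illustration, none of which appear in the notes: $X$ is uniform on $(0, 1)$, $Y = X^2$ plus independent standard normal noise (so that $h(x) = x^2$), and $g(x) = \cos(3x)$ is an arbitrary bounded function.

```python
import math
import random

random.seed(0)

def h(x):
    # Assumed conditional mean E(Y | X = x) in this toy model
    return x * x

def g(x):
    # An arbitrary bounded function of X
    return math.cos(3.0 * x)

n = 200_000
lhs = 0.0  # running sum for E(Y g(X))
rhs = 0.0  # running sum for E(g(X) h(X))
for _ in range(n):
    x = random.random()                 # X ~ Uniform(0, 1)
    y = h(x) + random.gauss(0.0, 1.0)   # Y = h(X) + independent zero-mean noise
    lhs += y * g(x)
    rhs += g(x) * h(x)
lhs /= n
rhs /= n
print(lhs, rhs)
```

The two sample averages should agree up to simulation error, which here is of order $1/\sqrt{n}$.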

Example <10.1>: The convolution formula for densities derived from (∗).

The Poisson process is often used to model the arrivals of customers in a waiting line, or the arrival of telephone calls at an exchange. The underlying idea is that of a large population of potential customers, each of whom acts independently of all the others.

Example <10.2>: A queuing problem with a surprising solution (can be skipped)

Examples for Chapter 10

<10.1> Example. Suppose $X$ and $Y$ are independent random variables with continuous distributions. If $X$ has density $f$ and $Y$ has density $g$ then (see Chapter 7) the random variable $Z = X + Y$ has density

$$h(z) = \int_{-\infty}^{\infty} g(z - x) f(x)\, dx.$$

The same formula can be derived from the formula

$$PB = \int_{-\infty}^{\infty} P(B \mid X = x) f(x)\, dx,$$

applied with $B = \{z \le Z \le z + \delta\}$ for a small, positive $\delta$. Note that

$$\begin{aligned}
P(z \le Z \le z + \delta \mid X = x) &= P(z - x \le Y \le z - x + \delta \mid X = x) \\
&= P(z - x \le Y \le z - x + \delta) \qquad \text{because } X, Y \text{ independent} \\
&\approx \delta g(z - x).
\end{aligned}$$

Invoke the conditioning formula:

$$P\{z \le Z \le z + \delta\} \approx \int_{-\infty}^{\infty} \delta g(z - x) f(x)\, dx,$$

which leads us back to the convolution formula. □
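As a numerical sanity check of the convolution formula, here is a minimal sketch under assumptions not made in the notes: both $X$ and $Y$ are taken to be exponential with rate 1, in which case the sum is known to have density $z e^{-z}$ for $z \ge 0$. The function name `convolution_density`, the integration range, and the midpoint rule are hypothetical choices for the illustration.

```python
import math

def f(x):
    # Assumed density of X: exponential with rate 1
    return math.exp(-x) if x >= 0 else 0.0

g = f  # assume Y has the same density as X

def convolution_density(z, lo=0.0, hi=30.0, n=30_000):
    # Midpoint rule for the integral of g(z - x) f(x) dx over [lo, hi];
    # this range suffices here because f vanishes below 0 and is tiny beyond 30.
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        total += g(z - x) * f(x)
    return total * step

for z in (0.5, 1.0, 2.0):
    print(z, convolution_density(z), z * math.exp(-z))
```

Each line prints the numerical integral next to $z e^{-z}$; the two columns should match closely.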

<10.2> Example. Suppose an office receives two different types of inquiry: persons who walk in off the street, and persons who call by telephone. Suppose the two types of arrival are described by independent Poisson processes, with rate $\lambda_w$ for the walk-ins, and rate $\lambda_c$ for

You should be able to write out the necessary conditioning argument for (2′).

The two-step mechanism explains the appearance of the geometric distribution in the problem posed at the start of the Example. The classification of each inquiry as either a walk-in or a call is effectively carried out by a sequence of independent coin tosses, with probability p of a head (= a walk-in). The problem asks for the distribution of the number of tails before the first head. The embedding of the inquiries into continuous time is irrelevant.
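The coin-tossing description can be compared with a direct simulation in continuous time. The sketch below rests on several assumptions that the visible text does not spell out: the rates `rate_walk` and `rate_call` are hypothetical values, the quantity simulated is the number of calls before the first walk-in (reading tails as calls), and $p$ is taken to be $\lambda_w / (\lambda_w + \lambda_c)$, the standard probability that the next arrival from two independent Poisson processes is a walk-in.

```python
import random

random.seed(1)

rate_walk, rate_call = 1.0, 2.0          # hypothetical rates for the two processes
p = rate_walk / (rate_walk + rate_call)  # assumed chance that an inquiry is a walk-in

def calls_before_first_walk_in():
    # Draw the first walk-in time, then count the call arrivals that occur before it.
    first_walk = random.expovariate(rate_walk)
    count, t = 0, random.expovariate(rate_call)
    while t < first_walk:
        count += 1
        t += random.expovariate(rate_call)
    return count

n = 100_000
samples = [calls_before_first_walk_in() for _ in range(n)]
for k in range(4):
    empirical = samples.count(k) / n
    geometric = (1 - p) ** k * p
    print(k, round(empirical, 4), round(geometric, 4))
```

The empirical frequencies should sit close to the geometric probabilities $(1 - p)^k p$, even though the simulation works with arrival times and never tosses a coin.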