






These notes cover joint distributions: joint probability mass and density functions, marginal distributions, and conditional distributions. They work through examples of calculating joint probabilities, marginal distributions, and conditional distributions for discrete and continuous random variables, and they discuss the relationship between the joint distribution and the marginal distributions.
Typology: Study notes







Two random variables X and Y have a joint distribution, which extends the single-variable definition and notation and which we generate from first principles:
$$F_{X,Y}(x, y) = P(X \le x,\ Y \le y) \tag{1}$$
Taking derivatives, $dF_{X,Y}(x, y)$, yields...
...in the discrete case:
$$P((X, Y) = (x, y)) = p(x, y) \tag{2}$$
the joint probability mass function.
...in the continuous case (heuristically, for small $\epsilon$ and up to factors of the interval width $2\epsilon$):
$$P((X, Y) \in (x \pm \epsilon,\ y \pm \epsilon)) \approx f(x, y) \tag{3}$$
the joint probability density function.
We generate the marginal distributions for X and Y alone just as we did for contingency tables by summing over all values of the other variable.
$$p_X(x) = \sum_{y} p(x, y); \qquad p_Y(y) = \sum_{x} p(x, y) \tag{4}$$
$$f_X(x) = \int_{\mathbb{R}} f(x, v)\,dv; \qquad f_Y(y) = \int_{\mathbb{R}} f(u, y)\,du \tag{5}$$
Recall our contingency table example(s), where we generated a marginal distribution by summing over the columns of the table to yield the distributions of the margins. The marginal cdf follows from the joint cdf in the same spirit.
$$F_X(x) = P(X \le x) = P(X \le x,\ Y \le \infty) = P\Big(\lim_{y \uparrow \infty} \{X \le x,\ Y \le y\}\Big)$$
$$= \lim_{y \uparrow \infty} P(\{X \le x,\ Y \le y\})$$
$$= \lim_{y \uparrow \infty} F_{X,Y}(x, y)$$
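The marginal sums in (4) and the limit argument for the marginal cdf can be checked on a small table. The sketch below uses a hypothetical joint pmf on a 2 x 3 grid (the values are made up for illustration and are not from these notes) and exact fractions, so the comparisons are exact.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); the values are illustrative only.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}

def marginal(joint_pmf, axis):
    """Sum the joint pmf over the other variable; axis=0 gives p_X, axis=1 gives p_Y."""
    out = {}
    for point, prob in joint_pmf.items():
        key = point[axis]
        out[key] = out.get(key, Fraction(0)) + prob
    return out

def joint_cdf(joint_pmf, x, y):
    """F_{X,Y}(x, y) = P(X <= x, Y <= y)."""
    return sum(p for (a, b), p in joint_pmf.items() if a <= x and b <= y)

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

# Marginal cdf of X at 0, obtained by pushing y past the largest value of Y.
largest_y = max(b for _, b in joint)
F_x_at_0 = joint_cdf(joint, 0, largest_y)

print(p_x, p_y, F_x_at_0)
```

For a discrete variable with finitely many values, "letting y go to infinity" simply means taking y at or beyond the largest value of Y, which is what `largest_y` does here.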
The joint survival distribution can be generated from first principles as well:
$$P(X > x,\ Y > y) = 1 - P(\{X > x,\ Y > y\}^c) = 1 - P(\{X > x\}^c \cup \{Y > y\}^c)$$
...by de Morgan's laws...
$$= 1 - P(\{X \le x\} \cup \{Y \le y\})$$
...by the inclusion-exclusion principle...
$$= 1 - [P(X \le x) + P(Y \le y) - P(X \le x,\ Y \le y)]$$
...by changing notation...
$$= 1 - F_X(x) - F_Y(y) + F_{X,Y}(x, y)$$
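The survival identity just derived can be verified numerically. The following is a minimal sketch on a hypothetical joint pmf (illustrative values only): it computes P(X > x, Y > y) directly and again from the three cdfs, and compares the two at several points.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); the values are illustrative only.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}

def prob(joint_pmf, event):
    """Probability of the set of points (a, b) for which event(a, b) is True."""
    return sum(p for (a, b), p in joint_pmf.items() if event(a, b))

def survival_direct(joint_pmf, x, y):
    """P(X > x, Y > y) computed by summing the pmf over the event itself."""
    return prob(joint_pmf, lambda a, b: a > x and b > y)

def survival_from_cdfs(joint_pmf, x, y):
    """1 - F_X(x) - F_Y(y) + F_{X,Y}(x, y)."""
    F_x = prob(joint_pmf, lambda a, b: a <= x)
    F_y = prob(joint_pmf, lambda a, b: b <= y)
    F_xy = prob(joint_pmf, lambda a, b: a <= x and b <= y)
    return 1 - F_x - F_y + F_xy

checks = [
    (x, y, survival_direct(joint, x, y), survival_from_cdfs(joint, x, y))
    for x in (-1, 0, 1)
    for y in (-1, 0, 1, 2)
]
for x, y, direct, via_cdfs in checks:
    print(x, y, direct, via_cdfs)
```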
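You can use the facts above to verify calculations such as the following. Assuming the joint density $f_{X,Y}(x, y) = 6x^2 y$ on $0 < x < 1$, $0 < y < 1$ (and zero elsewhere), the probability of the event $0 < X < 3/4$, $1/3 < Y < 2$ is
$$\int_{1/3}^{2}\int_{0}^{3/4} f_{X,Y}(x, y)\,dx\,dy = \int_{1/3}^{1}\int_{0}^{3/4} 6x^2 y\,dx\,dy + \int_{1}^{2}\int_{0}^{3/4} 0\,dx\,dy$$
$$= 3/8 + 0 = 3/8$$
...which is the volume under the surface $6x^2 y$ over that region.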
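The value 3/8 can be reproduced numerically under one reading of this calculation. The sketch below assumes the density is 6x²y on the unit square (zero elsewhere) and that the event is 0 < X < 3/4, 1/3 < Y < 2; both the support and the upper limits are assumptions chosen because they reproduce 3/8, and the midpoint rule is just one convenient way to approximate the double integral.

```python
def density(x, y):
    """Assumed joint density: 6 x^2 y on the unit square, zero elsewhere."""
    return 6.0 * x * x * y if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0

def midpoint_double_integral(f, x_lo, x_hi, y_lo, y_hi, n=300):
    """Approximate the integral of f over a rectangle with an n-by-n midpoint rule."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# Total mass over the unit square; should be close to 1 for a valid density.
total_mass = midpoint_double_integral(density, 0.0, 1.0, 0.0, 1.0)

# Probability of the assumed event; the part with y > 1 contributes zero.
event_prob = midpoint_double_integral(density, 0.0, 0.75, 1.0 / 3.0, 2.0)

print(total_mass, event_prob)
```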
Let $f_{X,Y}(x, y) = c\,\mathbf{1}_{\{x^2 + y^2 \le R^2\}}$, with $c$ and $R$ some constants. What is $c$? What are the marginal distributions? What is the distribution of the distance of any point $(X, Y)$ from the origin?
First:
$$c \iint_{x^2 + y^2 \le R^2} dx\,dy = 1 \;\rightarrow\; c = \Big(\iint_{x^2 + y^2 \le R^2} dx\,dy\Big)^{-1} = \frac{1}{\pi R^2}$$
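Second:
$$f_X(x) = \int f_{X,Y}(x, y)\,dy = \frac{1}{\pi R^2} \int_{x^2 + y^2 \le R^2} dy$$
Substitute $y = \sqrt{R^2 - x^2}$:
$$f_X(x) = \frac{1}{\pi R^2} \int_{-y}^{y} dt = \frac{1}{\pi R^2}\,2y = \frac{2}{\pi R^2}\sqrt{R^2 - x^2}, \qquad |x| \le R$$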
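Third:
Let $D = \sqrt{X^2 + Y^2}$. Then, from first principles, for $0 \le d \le R$,
$$F_D(d) = P\big(\sqrt{X^2 + Y^2} \le d\big) = P(X^2 + Y^2 \le d^2)$$
$$= \iint_{x^2 + y^2 \le d^2} \frac{1}{\pi R^2}\,dx\,dy = \frac{1}{\pi R^2} \iint_{x^2 + y^2 \le d^2} dx\,dy = \frac{\pi d^2}{\pi R^2} = \frac{d^2}{R^2}$$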
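The three answers for the uniform disk can be sanity-checked numerically. This is a rough sketch, not part of the original notes: the radius `R = 2.0`, the test distance `d = 1.0`, the sample size, and the seed are arbitrary choices, and rejection sampling from the bounding square is just one simple way to draw uniform points on a disk.

```python
import math
import random

R = 2.0                        # hypothetical radius
c = 1.0 / (math.pi * R ** 2)   # normalizing constant from the first step

def marginal_x(x, radius=R):
    """f_X(x) = 2 sqrt(R^2 - x^2) / (pi R^2) for |x| <= R, zero otherwise."""
    if abs(x) > radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - x ** 2) / (math.pi * radius ** 2)

def integrate(f, lo, hi, n=20000):
    """One-dimensional midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# The marginal density should integrate to (approximately) 1.
marginal_mass = integrate(marginal_x, -R, R)

def sample_disk(rng, radius=R):
    """Draw from the bounding square until the point lands inside the disk."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius ** 2:
            return x, y

rng = random.Random(0)
d = 1.0
n_samples = 100_000
hits = sum(1 for _ in range(n_samples) if math.hypot(*sample_disk(rng)) <= d)
empirical_cdf = hits / n_samples       # Monte Carlo estimate of F_D(d)
theoretical_cdf = d ** 2 / R ** 2      # d^2 / R^2 from the third step

print(c, marginal_mass, empirical_cdf, theoretical_cdf)
```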
The multinomial distribution is an extension of the binomial distribution.
In this model, there are $n$ independent, identical experiments, each with $k$ possible outcomes, and outcome $i$ has probability $p_i$, with $\sum_{i=1}^{k} p_i = 1$.
From first principles, the probability mass function is:
$$P(X_1 = n_1, \dots, X_k = n_k) = C^{n}_{n_1, \dots, n_k}\; p_1^{n_1} \cdots p_k^{n_k} \tag{8}$$
where $C^{n}_{n_1, \dots, n_k} = \frac{n!}{n_1! \cdots n_k!}$ is the multinomial coefficient, which you have seen before, and $\sum_{i=1}^{k} n_i = n$.
Example:
Roll an unfair die: let the probability of rolling a one be 1/2, let the probabilities of rolling a two and a three be equal, each twice that of rolling a four or a five, and let the probability of rolling a six be 1/4. What is the probability of rolling 3 ones, 2 twos, 2 threes, 1 four, 1 five, and no sixes?
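The die example can be worked through with the multinomial pmf (8). The sketch below reads the problem as p2 = p3 = 2q and p4 = p5 = q, so that 1/2 + 1/4 + 6q = 1; that reading of the wording is an assumption, and `multinomial_pmf` is a helper name introduced here for illustration.

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = n_1, ..., X_k = n_k) = n! / (n_1! ... n_k!) * prod(p_i ** n_i)."""
    n = sum(counts)
    coeff = factorial(n)
    for k in counts:
        coeff //= factorial(k)   # each partial quotient is an exact integer
    prob = Fraction(coeff)
    for k, p in zip(counts, probs):
        prob *= p ** k
    return prob

# Solve for q under the assumed reading: 1/2 + 2q + 2q + q + q + 1/4 = 1.
q = (1 - Fraction(1, 2) - Fraction(1, 4)) / 6
probs = [Fraction(1, 2), 2 * q, 2 * q, q, q, Fraction(1, 4)]   # faces 1..6
counts = [3, 2, 2, 1, 1, 0]                                    # nine rolls in total

answer = multinomial_pmf(counts, probs)
print(q, answer, float(answer))
```

Under this reading q = 1/24, the multinomial coefficient is 15120, and the probability works out to 35/221184, roughly 0.00016.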
3 The conditional distribution
Recall from first principles:
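$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} \approx \frac{P(X \in x \pm \epsilon,\ Y \in y \pm \epsilon)}{P(Y \in y \pm \epsilon)} = P(X \in x \pm \epsilon \mid Y \in y \pm \epsilon) \tag{9}$$
(heuristically, for small $\epsilon$ and up to factors of the interval width $2\epsilon$).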
This is the conditional probability density function for X given Y = y. We generate the conditional cumulative distribution function in the usual way:
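With conditional density $f_{X_1|X_2}(x_1 \mid x_2) = 1/x_2$ on $0 < x_1 < x_2$, the conditional expectation
$$E(X_1 \mid X_2 = x_2) = \int_{0}^{x_2} x_1 \Big(\frac{1}{x_2}\Big)\,dx_1 = \frac{x_2}{2}$$
in this instance is a function of $x_2$. The conditional variance
$$\operatorname{Var}(X_1 \mid X_2 = x_2) = \int_{0}^{x_2} \Big(x_1 - \frac{x_2}{2}\Big)^2 \Big(\frac{1}{x_2}\Big)\,dx_1 = \frac{x_2^2}{12}$$
is also, in this instance, a function of $x_2$.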
N.B. (nota bene): $E(X_1) = \int_{0}^{1} x_1 (2 - 2x_1)\,dx_1 = 1/3$, but $E(X_1 \mid X_2 = x_2) = x_2/2$. The expectation of $X_1$ is a constant, but the conditional expectation of $X_1$ given $X_2$ is a random variable.
Are X 1 and X 2 independent? Heuristically, by just looking at the pdf — with the indicator — we could conclude no.
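Does $P(0 < X_1 < \tfrac{1}{2} \mid X_2 = \tfrac{3}{4}) \overset{?}{=} P(0 < X_1 < \tfrac{1}{2})$
Well: on the one hand,
$$P(0 < X_1 < \tfrac{1}{2} \mid X_2 = \tfrac{3}{4}) = \int_{0}^{1/2} f_{X_1|X_2 = 3/4}(x_1 \mid x_2 = \tfrac{3}{4})\,dx_1 = \int_{0}^{1/2} \frac{4}{3}\,dx_1 = \frac{2}{3},$$
but on the other,
$$P(0 < X_1 < \tfrac{1}{2}) = \int_{0}^{1/2} f_{X_1}(x_1)\,dx_1 = \int_{0}^{1/2} 2(1 - x_1)\,dx_1 = \frac{3}{4}.$$
The two probabilities differ, so $X_1$ and $X_2$ are not independent.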
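The comparison above can be reproduced numerically. The sketch assumes the joint density is 2 on 0 < x1 < x2 < 1, which is the density consistent with the marginal 2(1 - x1) and the conditional density 4/3 at x2 = 3/4 used in this example; the function names and the midpoint rule are illustrative choices.

```python
def integrate(f, lo, hi, n=20000):
    """One-dimensional midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def joint(x1, x2):
    """Assumed joint density: 2 on the triangle 0 < x1 < x2 < 1."""
    return 2.0 if 0.0 < x1 < x2 < 1.0 else 0.0

def marginal_x1(x1):
    """f_{X1}(x1) = 2 (1 - x1), from integrating the joint over x2 in (x1, 1)."""
    return 2.0 * (1.0 - x1) if 0.0 < x1 < 1.0 else 0.0

def marginal_x2(x2):
    """f_{X2}(x2) = 2 x2, from integrating the joint over x1 in (0, x2)."""
    return 2.0 * x2 if 0.0 < x2 < 1.0 else 0.0

def conditional_x1_given_x2(x1, x2):
    """f_{X1|X2}(x1 | x2) = joint / marginal of X2."""
    return joint(x1, x2) / marginal_x2(x2)

p_conditional = integrate(lambda x1: conditional_x1_given_x2(x1, 0.75), 0.0, 0.5)
p_marginal = integrate(marginal_x1, 0.0, 0.5)
mean_x1 = integrate(lambda x1: x1 * marginal_x1(x1), 0.0, 1.0)

print(p_conditional, p_marginal, mean_x1)
```

The conditional probability comes out near 2/3 and the unconditional one near 3/4, so conditioning on X2 changes the distribution of X1.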
5 Example
Take two random variables $X_1, X_2 \sim f_{X_1,X_2}(x_1, x_2) = 6x_2 \cdot \mathbf{1}_{\{0 < x_2 < x_1 < 1\}}$.
The marginal pdf for $X_1$ is...
$$f_{X_1}(x_1) = \int_{0}^{x_1} 6x_2\,dx_2 = 3x_1^2 \cdot \mathbf{1}_{\{0 < x_1 < 1\}}$$
...the conditional pdf for $X_2 \mid X_1 = x_1$ is...
$$f_{X_2|X_1 = x_1}(x_2 \mid x_1) = \frac{f_{X_1,X_2}}{f_{X_1}} = \frac{6x_2}{3x_1^2} = \frac{2x_2}{x_1^2} \cdot \mathbf{1}_{\{0 < x_2 < x_1 < 1\}}$$
and the conditional expectation is...
$$E(X_2 \mid X_1 = x_1) = \int_{0}^{x_1} x_2 \Big(\frac{2x_2}{x_1^2}\Big)\,dx_2 = \frac{2}{3}x_1 \cdot \mathbf{1}_{\{0 < x_1 < 1\}}$$
...a random variable.
Let $Y = E(X_2 \mid X_1) = \tfrac{2}{3}X_1$. Then $Y$ is a random variable, dependent upon the value of $X_1$, where $0 < y < 2/3$ since $0 < x_1 < 1$.
The cdf for $Y$ is
$$F_Y(y) = P(Y \le y) = P\big(\tfrac{2}{3}X_1 \le y\big) = P\big(X_1 \le \tfrac{3y}{2}\big)$$
which can be computed using the pdf for $X_1$ as...
$$\int_{0}^{3y/2} 3x_1^2\,dx_1 = \frac{27y^3}{8}$$
The pdf for $Y$ is...
$$f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{81y^2}{8}, \qquad 0 < y < \tfrac{2}{3}$$
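the expectation for $Y$ is...
$$E(Y) = \int_{0}^{2/3} y \Big(\frac{81y^2}{8}\Big)\,dy = \frac{1}{2}$$
and the variance is...
$$\operatorname{Var}(Y) = \int_{0}^{2/3} y^2 \Big(\frac{81y^2}{8}\Big)\,dy - \Big(\frac{1}{2}\Big)^2 = \frac{4}{15} - \frac{1}{4} = \frac{1}{60}$$
So $Y = E(X_2 \mid X_1)$ is a random variable with $\mu_Y = \tfrac{1}{2}$, $\sigma_Y^2 = \tfrac{1}{60}$.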
N.B.: $f_{X_2}(x_2) = \int_{x_2}^{1} 6x_2\,dx_1 = 6x_2(1 - x_2) \cdot \mathbf{1}_{\{0 < x_2 < 1\}}$, which yields $E(X_2) = \tfrac{1}{2}$ and $\operatorname{Var}(X_2) = \tfrac{1}{20}$.
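The moments in this example are integrals of polynomials, so they can be checked exactly. The sketch below is an illustration, not part of the original notes: `integrate_poly` is a hypothetical helper that integrates a polynomial given by its coefficient list, applied to the pdf of Y, (81/8)y² on (0, 2/3), and to the pdf of X2, 6x2(1 - x2) on (0, 1).

```python
from fractions import Fraction

def integrate_poly(coeffs, lo, hi):
    """Exact integral over [lo, hi] of sum(coeffs[k] * t**k for k in range(len(coeffs)))."""
    lo, hi = Fraction(lo), Fraction(hi)
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        total += Fraction(c) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
    return total

def moments(pdf_coeffs, lo, hi):
    """Return (total mass, mean, variance) for a polynomial pdf on [lo, hi]."""
    mass = integrate_poly(pdf_coeffs, lo, hi)
    mean = integrate_poly([0] + pdf_coeffs, lo, hi)        # multiply the pdf by t
    second = integrate_poly([0, 0] + pdf_coeffs, lo, hi)   # multiply the pdf by t^2
    return mass, mean, second - mean ** 2

# Y = E(X2 | X1) = (2/3) X1 has pdf (81/8) y^2 on (0, 2/3).
mass_y, mean_y, var_y = moments([0, 0, Fraction(81, 8)], 0, Fraction(2, 3))

# X2 has pdf 6 x2 - 6 x2^2 on (0, 1).
mass_x2, mean_x2, var_x2 = moments([0, 6, -6], 0, 1)

print(mean_y, var_y, mean_x2, var_x2)
```

Both variables have mean 1/2, while the variance of Y (1/60) is smaller than the variance of X2 (1/20).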
6 $E(E(X_1 \mid X_2)) = E(X_1)$ and $\operatorname{Var}(E(X_1 \mid X_2)) \le \operatorname{Var}(X_1)$
The big result, implied by the last example, is
$$\operatorname{Var}(X_2) \ge \operatorname{Var}(E(X_2 \mid X_1))$$
The upshot is that variance can be reduced by conditioning, though conditioning does not change the expectation. That makes sense.
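Both statements can be checked exactly on a small discrete case. The sketch below uses a hypothetical joint pmf for (X1, X2) with made-up values; it computes the conditional mean of X2 for each value of X1, then compares the mean and variance of that conditional mean with the mean and variance of X2 itself.

```python
from fractions import Fraction

# Hypothetical joint pmf for (X1, X2); the values are illustrative only.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}

# Marginal pmf of X1, obtained by summing over the values of X2.
p_x1 = {}
for (a, _), p in joint.items():
    p_x1[a] = p_x1.get(a, Fraction(0)) + p

def cond_mean_x2(x1):
    """E(X2 | X1 = x1) = sum_b b * p(x1, b) / p_X1(x1)."""
    return sum(p * b for (a, b), p in joint.items() if a == x1) / p_x1[x1]

# Unconditional mean and variance of X2.
mean_x2 = sum(p * b for (_, b), p in joint.items())
var_x2 = sum(p * (b - mean_x2) ** 2 for (_, b), p in joint.items())

# Mean and variance of the random variable E(X2 | X1).
mean_of_cond = sum(p * cond_mean_x2(a) for a, p in p_x1.items())
var_of_cond = sum(p * (cond_mean_x2(a) - mean_of_cond) ** 2 for a, p in p_x1.items())

print(mean_x2, mean_of_cond, var_x2, var_of_cond)
```

In this table the two means agree exactly, and the variance of the conditional mean is far smaller than the variance of X2.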
These results may not be covered in order in your text — but you should know them now, nonetheless.
7 Exercises