






















Material Type: Notes; Professor: Zhang; Class: Statistical Theory I; Subject: Statistics; University: North Carolina State University; Term: Unknown 1989;























We study the joint distribution of two or more random variables, called a random vector, such as $(X, Y)$, $(X, Y, Z)$, or $(X_1, \ldots, X_n)$, and the distribution of functions of them, such as $X + Y$, $XYZ$, or $X_1 + X_2 + \cdots + X_n$.
Assume both X and Y are random. We treat (X, Y ) as a two-dimensional random vector and study their relationship.
Assume that both $X$ and $Y$ are discrete random variables, with sample spaces $\mathcal{X}$ and $\mathcal{Y}$ respectively.
Joint pmf:
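$$f_{X,Y}(x, y) = P(X = x, Y = y), \quad \forall x \in \mathcal{X},\ y \in \mathcal{Y}.$$
Properties: $f_{X,Y}(x, y) \ge 0$ for all $(x, y)$, and
$$\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} f_{X,Y}(x, y) = 1.$$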
The probability of a set $A$ is given by
$$P((X, Y) \in A) = \sum_{(x, y) \in A} f_{X,Y}(x, y).$$
Marginal pmf: If the joint distribution of $(X, Y)$ is known, the marginal pmfs are
$$f_X(x) = P(X = x) = \sum_{y \in \mathcal{Y}} f_{X,Y}(x, y),$$
$$f_Y(y) = P(Y = y) = \sum_{x \in \mathcal{X}} f_{X,Y}(x, y).$$
Example 1. Two fair dice are thrown. Let $X$ = maximum and $Y$ = sum. Possible values of $X$: 1, 2, 3, 4, 5, 6. Possible values of $Y$: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. The joint probabilities can be written in a table.
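The table for the dice example can be built by enumerating the 36 equally likely outcomes. The sketch below is one way to do it with exact fractions; the names `joint`, `f_X`, and `f_Y` are illustrative and not from the notes.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y) = (maximum, sum) for two fair dice.
# Each of the 36 ordered outcomes has probability 1/36.
joint = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(max(d1, d2), d1 + d2)] += Fraction(1, 36)

# Marginal pmfs: sum the joint pmf over the other variable.
f_X = defaultdict(Fraction)
f_Y = defaultdict(Fraction)
for (x, y), p in joint.items():
    f_X[x] += p
    f_Y[y] += p

# Print the table: rows are values of X, columns are values of Y.
print("x\\y " + " ".join(f"{y:>5}" for y in range(2, 13)))
for x in range(1, 7):
    row = " ".join(f"{str(joint.get((x, y), 0)):>5}" for y in range(2, 13))
    print(f"{x:>3} " + row)
```

Row sums of the printed table give the marginal pmf of $X$ and column sums give the marginal pmf of $Y$.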
Remark: The joint pmf determines the marginal pmfs, but the marginal pmfs do not determine the joint pmf.
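Example: Define the joint pmf by
$$f(0, 0) = f(0, 1) = \tfrac{1}{6}; \quad f(1, 0) = f(1, 1) = \tfrac{1}{3}; \quad f(x, y) = 0 \text{ otherwise}.$$
Consider another joint pmf given by
$$f(0, 0) = \tfrac{1}{12}; \quad f(1, 0) = \tfrac{5}{12}; \quad f(0, 1) = f(1, 1) = \tfrac{3}{12}; \quad f(x, y) = 0 \text{ otherwise}.$$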
They share the same marginal distributions, but not the same joint distribution!
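A quick check of this claim is sketched below. The specific probabilities used (1/6 and 1/3 for the first pmf; 1/12, 5/12, and 3/12 for the second) should be treated as an assumption of the sketch, chosen so that the two pmfs have matching marginals; `marginals`, `pmf_a`, and `pmf_b` are illustrative names.

```python
from fractions import Fraction as F

def marginals(joint):
    """Return (f_X, f_Y) as dicts, summing a joint pmf over the other variable."""
    f_X, f_Y = {}, {}
    for (x, y), p in joint.items():
        f_X[x] = f_X.get(x, 0) + p
        f_Y[y] = f_Y.get(y, 0) + p
    return f_X, f_Y

# Assumed values for the two joint pmfs on {0, 1} x {0, 1}.
pmf_a = {(0, 0): F(1, 6), (0, 1): F(1, 6), (1, 0): F(1, 3), (1, 1): F(1, 3)}
pmf_b = {(0, 0): F(1, 12), (1, 0): F(5, 12), (0, 1): F(3, 12), (1, 1): F(3, 12)}

same_marginals = marginals(pmf_a) == marginals(pmf_b)
same_joint = pmf_a == pmf_b
print("same marginals:", same_marginals)
print("same joint:", same_joint)
```

With these values both pmfs give $f_X(0) = 1/3$, $f_X(1) = 2/3$, and $f_Y(0) = f_Y(1) = 1/2$, while the joint pmfs differ.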
Example. Check whether the following function is a valid pdf:
$$f(x, y) = y e^{-(x+y)} I\{0 < x < y\}.$$
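Taking the formula exactly as printed, one way to check validity is to integrate over the region $0 < x < y$. The computation below is a sketch under that reading, not the notes' own solution.

```latex
\begin{align*}
\int_0^\infty \int_0^y y e^{-(x+y)} \, dx \, dy
  &= \int_0^\infty y e^{-y} \left(1 - e^{-y}\right) dy \\
  &= \int_0^\infty y e^{-y} \, dy - \int_0^\infty y e^{-2y} \, dy \\
  &= 1 - \tfrac{1}{4} = \tfrac{3}{4}.
\end{align*}
```

Under this reading the function is nonnegative but integrates to 3/4 rather than 1, so it is not a valid pdf as written; multiplying it by 4/3 would normalize it.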
Example. $f(x, y) = 2 I\{0 \le x \le y \le 1\}$.
Example. $f(x, y) = e^{-y} I\{0 < x < y\}$.
Ex. $f(x, y) = e^{-y} I\{0 < x < y\}$. Compute $E(X)$, $E(Y)$, $E(XY)$, and $M_{X,Y}(t, s)$.
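A possible worked solution for this exercise is sketched below. It goes through the marginal densities first, which is one of several routes; the restriction on $(t, s)$ is needed for the integrals to converge.

```latex
\begin{align*}
f_X(x) &= \int_x^\infty e^{-y} \, dy = e^{-x}, \quad x > 0,
  \quad\text{so } E(X) = 1, \\
f_Y(y) &= \int_0^y e^{-y} \, dx = y e^{-y}, \quad y > 0,
  \quad\text{so } E(Y) = \int_0^\infty y^2 e^{-y} \, dy = 2, \\
E(XY) &= \int_0^\infty \int_0^y x y e^{-y} \, dx \, dy
  = \int_0^\infty \tfrac{1}{2} y^3 e^{-y} \, dy = \tfrac{3!}{2} = 3, \\
M_{X,Y}(t, s) &= \int_0^\infty \int_x^\infty e^{tx + sy - y} \, dy \, dx
  = \int_0^\infty \frac{e^{-(1 - s - t)x}}{1 - s} \, dx
  = \frac{1}{(1 - s)(1 - s - t)},
  \quad s < 1,\ s + t < 1.
\end{align*}
```

As a consistency check, $M_{X,Y}(t, 0) = 1/(1 - t)$ is the MGF of an exponential(1) variable and $M_{X,Y}(0, s) = 1/(1 - s)^2$ is the MGF of a gamma(2, 1) variable, matching the two marginals.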
2 Conditional Distributions
Assume both $X$ and $Y$ are discrete. For any $x$ such that $P(X = x) > 0$, the conditional pmf of $Y$ given $X = x$ is defined as
$$f_{Y|X}(y|x) = P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)}, \quad \forall y \in \mathcal{Y}.$$
We can define fX|Y (x|y) similarly.
Remark: The function $f_{Y|X}(y|x)$ is indeed a pmf in $y$, since for any fixed $x$ it satisfies
$$\sum_{y} f_{Y|X}(y|x) = 1.$$
Example. The two dice example, $X$ = maximum, $Y$ = sum. Find $f_{Y|X}(y|3)$ and $f_{X|Y}(x|7)$.
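The two conditional pmfs in the dice example can be computed directly from the definition, dividing the joint pmf by the relevant marginal. The following is a minimal sketch; `cond_Y_given_X` and `cond_X_given_Y` are illustrative helper names.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y) = (maximum, sum) for two fair dice.
joint = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(max(d1, d2), d1 + d2)] += Fraction(1, 36)

def cond_Y_given_X(x):
    """Conditional pmf of Y given X = x: joint pmf divided by f_X(x)."""
    f_x = sum(p for (a, _), p in joint.items() if a == x)
    return {b: p / f_x for (a, b), p in joint.items() if a == x}

def cond_X_given_Y(y):
    """Conditional pmf of X given Y = y: joint pmf divided by f_Y(y)."""
    f_y = sum(p for (_, b), p in joint.items() if b == y)
    return {a: p / f_y for (a, b), p in joint.items() if b == y}

print(cond_Y_given_X(3))
print(cond_X_given_Y(7))
```

Given $X = 3$, the sum takes the values 4, 5, 6 with probabilities 2/5, 2/5, 1/5. Given $Y = 7$, the maximum takes the values 4, 5, 6 with probability 1/3 each.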
For discrete random variables:
$$E(Y \mid X = x) = \sum_{y} y f_{Y|X}(y|x),$$
$$\mathrm{Var}(Y \mid X = x) = \sum_{y} \{y - E(Y \mid X = x)\}^2 f_{Y|X}(y|x).$$
For continuous random variables:
$$E(Y \mid X = x) = \int y f_{Y|X}(y|x) \, dy,$$
$$\mathrm{Var}(Y \mid X = x) = \int \{y - E(Y \mid X = x)\}^2 f_{Y|X}(y|x) \, dy.$$
Remark: As before, we have
$$\mathrm{Var}(Y \mid X = x) = E(Y^2 \mid X = x) - \{E(Y \mid X = x)\}^2.$$
Example. Two dice example, X=max, Y =sum. Compute E(Y |X = 3).
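For the dice example, the conditional mean (and, as an extra, the conditional variance) can be computed from the conditional pmf. This is a sketch using exact fractions; `cond_moments` is an illustrative name.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y) = (maximum, sum) for two fair dice.
joint = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(max(d1, d2), d1 + d2)] += Fraction(1, 36)

def cond_moments(x):
    """Return (E(Y | X = x), Var(Y | X = x)) for the dice example."""
    f_x = sum(p for (a, _), p in joint.items() if a == x)
    cond = {b: p / f_x for (a, b), p in joint.items() if a == x}
    mean = sum(y * p for y, p in cond.items())
    second_moment = sum(y * y * p for y, p in cond.items())
    # Var(Y | X = x) = E(Y^2 | X = x) - {E(Y | X = x)}^2
    return mean, second_moment - mean ** 2

mean_3, var_3 = cond_moments(3)
print(mean_3, var_3)
```

Here $E(Y \mid X = 3) = 4 \cdot \tfrac{2}{5} + 5 \cdot \tfrac{2}{5} + 6 \cdot \tfrac{1}{5} = \tfrac{24}{5}$.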
Ex. $f(x, y) = e^{-y} I\{0 < x < y\}$. Find $E(Y \mid X = x)$ and $\mathrm{Var}(Y \mid X = x)$.
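One possible route for this exercise is sketched below. It uses the continuous analogue of the conditional pmf, $f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x)$, which is assumed here rather than quoted from the notes.

```latex
\begin{align*}
f_X(x) &= \int_x^\infty e^{-y} \, dy = e^{-x}, \quad x > 0, \\
f_{Y|X}(y|x) &= \frac{e^{-y}}{e^{-x}} = e^{-(y - x)}, \quad y > x, \\
E(Y \mid X = x) &= \int_x^\infty y e^{-(y - x)} \, dy = x + 1, \\
\mathrm{Var}(Y \mid X = x) &= \int_x^\infty (y - x - 1)^2 e^{-(y - x)} \, dy = 1.
\end{align*}
```

In words, given $X = x$ the variable $Y - x$ has an exponential(1) distribution, so the conditional mean shifts with $x$ while the conditional variance does not.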
Remark: Note E(Y |X = x) is a function of x. Therefore, E(Y |X) is a random variable as a function of X.
Theorem:
E(Y ) = E(E(Y |X)).
Var(Y ) = E(Var(Y |X)) + Var(E(Y |X)).
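Both identities can be checked numerically on the dice example ($X$ = maximum, $Y$ = sum). The sketch below uses the fact that, because all 36 outcomes are equally likely, the conditional distribution given $X = x$ is uniform over the outcomes with maximum $x$; variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes, recorded as (X, Y) = (maximum, sum).
outcomes = [(max(a, b), a + b) for a, b in product(range(1, 7), repeat=2)]
n = len(outcomes)

# Unconditional mean and variance of Y.
E_Y = Fraction(sum(y for _, y in outcomes), n)
Var_Y = Fraction(sum(y * y for _, y in outcomes), n) - E_Y ** 2

# Group the sums by the value of the maximum.
by_x = defaultdict(list)
for x, y in outcomes:
    by_x[x].append(y)

f_X = {x: Fraction(len(ys), n) for x, ys in by_x.items()}
cond_mean = {x: Fraction(sum(ys), len(ys)) for x, ys in by_x.items()}
cond_var = {
    x: Fraction(sum(y * y for y in ys), len(ys)) - cond_mean[x] ** 2
    for x, ys in by_x.items()
}

# E(E(Y|X)), E(Var(Y|X)), and Var(E(Y|X)), all taken over the distribution of X.
E_cond_mean = sum(f_X[x] * cond_mean[x] for x in f_X)
E_cond_var = sum(f_X[x] * cond_var[x] for x in f_X)
Var_cond_mean = sum(f_X[x] * cond_mean[x] ** 2 for x in f_X) - E_cond_mean ** 2

print(E_Y, E_cond_mean)
print(Var_Y, E_cond_var + Var_cond_mean)
```

Both printed pairs agree, as the theorem requires.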
Remark: For every function $g$,
$$E\{Y - E(Y|X)\}^2 \le E\{Y - g(X)\}^2.$$
So $E(Y|X)$ is “closest” to $Y$ (in the mean squared error sense above) among all functions of $X$.
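The inequality can be justified by adding and subtracting $E(Y|X)$. The sketch below assumes finite second moments, which the notes do not state explicitly.

```latex
\begin{align*}
E\{Y - g(X)\}^2
  &= E\{Y - E(Y|X)\}^2 + E\{E(Y|X) - g(X)\}^2 \\
  &\quad + 2 E\big[\{Y - E(Y|X)\}\{E(Y|X) - g(X)\}\big], \\
E\big[\{Y - E(Y|X)\}\{E(Y|X) - g(X)\}\big]
  &= E\big[\{E(Y|X) - g(X)\}\, E\{Y - E(Y|X) \mid X\}\big] = 0.
\end{align*}
```

The cross term vanishes because $E\{Y - E(Y|X) \mid X\} = 0$, and the remaining term $E\{E(Y|X) - g(X)\}^2$ is nonnegative, which gives the inequality.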
Theorem: If $X$ and $Y$ are independent, then
(i) $E(Y|X) = E(Y)$.
(ii) The events $\{X \in A\}$ and $\{Y \in B\}$ are independent:
$$P(X \in A, Y \in B) = P(X \in A) P(Y \in B), \quad \forall A \subset \mathbb{R},\ B \subset \mathbb{R}.$$
(iii) $E(g(X)h(Y)) = E(g(X)) E(h(Y))$. In particular, $E(XY) = E(X)E(Y)$.
(iv) In addition, $M_{X,Y}(t, s) = E(e^{tX + sY}) = M_X(t) M_Y(s)$, and
$$M_{X+Y}(t) = E(e^{t(X+Y)}) = M_X(t) M_Y(t).$$
If the right-hand side can be identified as the MGF of some standard distribution, then the distribution of the sum of two independent variables is easy to find.
Example 1. binomial+binomial.
Example 2. Poisson+Poisson.
Example 3. negative binomial+negative binomial.
Example 4. normal+normal.
Example 5. gamma+gamma.
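As an illustration of the MGF argument, here is a sketch of the Poisson case. The rate symbols $\lambda_1$ and $\lambda_2$ are notation introduced for this sketch.

```latex
\begin{align*}
&X \sim \mathrm{Poisson}(\lambda_1), \quad Y \sim \mathrm{Poisson}(\lambda_2), \quad X \text{ and } Y \text{ independent}, \\
&M_X(t) = \exp\{\lambda_1 (e^t - 1)\}, \qquad M_Y(t) = \exp\{\lambda_2 (e^t - 1)\}, \\
&M_{X+Y}(t) = M_X(t) M_Y(t) = \exp\{(\lambda_1 + \lambda_2)(e^t - 1)\}.
\end{align*}
```

The right-hand side is the MGF of a Poisson($\lambda_1 + \lambda_2$) distribution, so $X + Y \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)$. The other examples follow the same pattern, provided the shared parameter matches (for instance, the same success probability for two binomials).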
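Theorem. Let $U = g_1(X, Y)$, $V = g_2(X, Y)$ be a one-to-one transformation with inverse $x = h_1(u, v)$, $y = h_2(u, v)$, and let $J$ be the Jacobian matrix of $(h_1, h_2)$ with respect to $(u, v)$. If $f_{X,Y}(x, y)$ is the joint density of $(X, Y)$, then
$$f_{U,V}(u, v) = f_{X,Y}(h_1(u, v), h_2(u, v)) \, |\det(J)|.$$
The proof follows from the change-of-variables rule for integration and is omitted.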
Example. Sum and difference of independent normals.
Example. Polar transform of independent normals.
Example. Sum and ratio of independent gammas.
Example. Ratio of two standard normals.
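To show how the formula is applied, here is a sketch of the sum-and-difference example. It assumes $X$ and $Y$ are independent standard normals, which is a simplification; the notes only say "independent normals".

```latex
\begin{align*}
&U = X + Y, \quad V = X - Y
  \quad\Longleftrightarrow\quad
  x = h_1(u, v) = \tfrac{u + v}{2}, \quad y = h_2(u, v) = \tfrac{u - v}{2}, \\
&J = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix},
  \qquad |\det(J)| = \tfrac{1}{2}, \\
&f_{U,V}(u, v)
  = \frac{1}{2\pi} \exp\left\{ -\frac{1}{2} \left[ \left(\tfrac{u + v}{2}\right)^2 + \left(\tfrac{u - v}{2}\right)^2 \right] \right\} \cdot \frac{1}{2}
  = \frac{1}{4\pi} \exp\left\{ -\frac{u^2 + v^2}{4} \right\}.
\end{align*}
```

The joint density factors into a function of $u$ times a function of $v$, each an $N(0, 2)$ density, so under this assumption $U$ and $V$ are independent $N(0, 2)$ variables.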
Many-to-One Transformation: Assume $(X, Y)$ takes values in $A = A_0 \cup A_1 \cup \cdots \cup A_k$, where $P((X, Y) \in A_0) = 0$. Also assume that $U = g_{1i}(X, Y)$, $V = g_{2i}(X, Y)$ is a one-to-one transformation from $A_i$ onto $B$, for $i = 1, \ldots, k$, with inverse $x = h_{1i}(u, v)$, $y = h_{2i}(u, v)$ and Jacobian matrix $J_i$. Then
$$f_{U,V}(u, v) = \sum_{i=1}^{k} f_{X,Y}(h_{1i}(u, v), h_{2i}(u, v)) \, |\det(J_i)|.$$
Example Poisson-gamma.
Example chi square-gamma.
Example binomial-beta.
Example binomial-Poisson-gamma (optional).