




Material Type: Notes; Class: Mathematical Statistics; Subject: mathematics; University: Boston College; Term: Unknown 1989;
Typology: Study notes





Probability: the good parts version
I. Random variables and their distributions; continuous random variables.
A random variable (r.v.) X is continuous if its distribution is given by a probability density function (pdf) f(x) that is positive on an interval. For real numbers a < b,
$$P(a < X < b) = \int_a^b f(x)\,dx.$$
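As a quick numerical illustration, the sketch below approximates P(a < X < b) with a midpoint Riemann sum. The density f(x) = 2x on (0, 1) and the helper names `integrate`, `example_pdf` and `prob_between` are assumptions chosen for illustration; they do not come from the notes.

```python
def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x):
    """Hypothetical density: f(x) = 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def prob_between(pdf, a, b):
    """P(a < X < b) as the integral of the pdf from a to b."""
    return integrate(pdf, a, b)


total = prob_between(example_pdf, 0.0, 1.0)   # a density integrates to 1
p = prob_between(example_pdf, 0.25, 0.5)      # exact value: 0.5**2 - 0.25**2
```

For this density the exact answer is available by hand (the antiderivative is x^2), which makes it a convenient check on the numerical routine.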
Random variables X and Y are jointly continuous if there is a joint density function f(x, y) such that
$$P(a < X < b,\ c < Y < d) = \int_a^b \int_c^d f(x, y)\,dy\,dx.$$
The marginal density of X is found by integrating the joint density over y, and vice versa for Y. X and Y are called independent if their joint density is the product of their marginals. In this case, it follows that P(a < X < b, c < Y < d) = P(a < X < b) P(c < Y < d).
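The product rule for independent variables can be checked numerically. The sketch below assumes a hypothetical joint density f(x, y) = 4xy on the unit square (not taken from the notes), whose marginals are 2x and 2y, and compares a rectangle probability with the product of the two one-dimensional probabilities.

```python
def integrate(f, a, b, steps=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def joint_pdf(x, y):
    """Hypothetical joint density: f(x, y) = 4xy on the unit square."""
    return 4.0 * x * y if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0


def marginal_x(x):
    """Marginal density of X: integrate the joint density over y."""
    return integrate(lambda y: joint_pdf(x, y), 0.0, 1.0)


def marginal_y(y):
    """Marginal density of Y: integrate the joint density over x."""
    return integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)


def rect_prob(a, b, c, d):
    """P(a < X < b, c < Y < d): inner integral over y, outer over x."""
    return integrate(lambda x: integrate(lambda y: joint_pdf(x, y), c, d), a, b)


joint = rect_prob(0.0, 0.5, 0.0, 0.5)
product = integrate(marginal_x, 0.0, 0.5) * integrate(marginal_y, 0.0, 0.5)
```

Here both `joint` and `product` should come out at 1/16, since each one-dimensional probability is 0.5^2 = 1/4.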
The cumulative distribution function (cdf) of X is the function F,
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt.$$
Given a function g, the expected value of g(X) is
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$
In particular, the mean of X is $E(X) = \mu$; the variance of X is $V(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$; and the standard deviation is $\sigma = \sqrt{\sigma^2}$.
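The two expressions for the variance can be compared numerically. This is a minimal sketch assuming a hypothetical density f(x) = 2x on (0, 1), for which the mean is 2/3 and the variance is 1/18; the function names are illustrative only.

```python
import math


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x):
    """Hypothetical density: f(x) = 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def expectation(g, pdf, lo, hi):
    """E(g(X)) as the integral of g(x) * pdf(x) over the support [lo, hi]."""
    return integrate(lambda x: g(x) * pdf(x), lo, hi)


mu = expectation(lambda x: x, example_pdf, 0.0, 1.0)
# Variance from the definition E[(X - mu)^2] ...
var_def = expectation(lambda x: (x - mu) ** 2, example_pdf, 0.0, 1.0)
# ... and from the shortcut E(X^2) - mu^2.
var_short = expectation(lambda x: x * x, example_pdf, 0.0, 1.0) - mu ** 2
sigma = math.sqrt(var_def)
```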
The moment generating function of X is
$$M_X(t) = E(e^{Xt}) = \int_{-\infty}^{\infty} e^{xt} f(x)\,dx.$$
Properties of the moment generating function:
(1) $M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t)\Big|_{t=0} = E(X^n)$.
(2) If $X_1, X_2, \ldots, X_n$ are independent random variables, then
$$M_{X_1 + X_2 + \cdots + X_n}(t) = M_{X_1}(t)\, M_{X_2}(t) \cdots M_{X_n}(t).$$
(3) The m.g.f. specifies the distribution: if $M_X(t) = M_Y(t)$ for all t in an open interval containing 0, then X and Y have the same distribution.
(4) $M_{aX+b}(t) = e^{bt} M_X(at)$.
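Property (4) follows directly from the definition of the mgf and the fact that a constant factor can be pulled out of an expectation. A short derivation, for constants a and b:

```latex
\begin{align*}
M_{aX+b}(t) &= E\left(e^{(aX+b)t}\right) \\
            &= E\left(e^{bt}\, e^{X(at)}\right) \\
            &= e^{bt}\, E\left(e^{X(at)}\right) \\
            &= e^{bt}\, M_X(at).
\end{align*}
```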
The standard class of continuous distributions.
(1) X ∼ N(μ, σ^2)
density: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)}$, $-\infty < x < \infty$.
mean: E(X) = μ.
variance: V(X) = σ^2.
mgf: $M(t) = e^{\mu t + \sigma^2 t^2 / 2}$.
If X ∼ N (μ, σ^2 ), then (X − μ)/σ ∼ N (0, 1).
If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ are independent, then $aX + bY \sim N(a\mu_X + b\mu_Y,\ a^2\sigma_X^2 + b^2\sigma_Y^2)$.
(2) X ∼ Chi-Square with n degrees of freedom (X ∼ χ^2(n)) if X ∼ Gamma(n/2, 1/2).
mean: E(X) = n.
variance: V(X) = 2n.
mgf: $M(t) = \left(\frac{1}{1 - 2t}\right)^{n/2}$, t < 1/2.
If X ∼ Uniform(a, b) and [c, d] ⊂ [a, b], then P(c ≤ X ≤ d) = Length[c, d]/Length[a, b] = (d − c)/(b − a).
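Property (1) of the moment generating function can be tried out on the chi-square mgf from item (2). The sketch below approximates the first two derivatives at t = 0 with central differences and recovers the mean n and variance 2n; the step size h and the helper names are assumptions made for this illustration.

```python
def chi2_mgf(t, n):
    """Chi-square(n) mgf from the notes: (1 / (1 - 2t))**(n/2), for t < 1/2."""
    if t >= 0.5:
        raise ValueError("the mgf is only defined for t < 1/2")
    return (1.0 / (1.0 - 2.0 * t)) ** (n / 2.0)


def first_two_moments(mgf, h=1e-4):
    """Approximate E(X) = M'(0) and E(X^2) = M''(0) by central differences."""
    m1 = (mgf(h) - mgf(-h)) / (2.0 * h)
    m2 = (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / (h * h)
    return m1, m2


n = 4
m1, m2 = first_two_moments(lambda t: chi2_mgf(t, n))
variance = m2 - m1 ** 2   # V(X) = E(X^2) - E(X)^2
```

With n = 4 the estimates should be close to E(X) = 4 and V(X) = 8, matching the listed mean and variance up to finite-difference error.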
II. Random variables and their distributions; discrete random variables.
A discrete r.v. X takes on a finite or countable number of values. Probabilities are computed using a frequency function p(k) = P(X = k); this is also called a probability density function (pdf) or probability mass function.
$$P(a < X < b) = \sum_{k \in (a, b)} p(k).$$
Given a function g, the expected value of g(X) is
$$E(g(X)) = \sum_k g(k)\, p(k);$$
in particular, the moment generating function of X is
$$M_X(t) = E(e^{Xt}) = \sum_k e^{kt}\, p(k).$$
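The sums above translate almost directly into code. The following sketch uses a fair six-sided die as a hypothetical frequency function (it is not one of the distributions in these notes) and exact fractions so that the mean and variance come out without rounding.

```python
import math
from fractions import Fraction

# Hypothetical example: a fair six-sided die, p(k) = 1/6 for k = 1, ..., 6.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def expectation(g, pmf):
    """E(g(X)) = sum over k of g(k) * p(k)."""
    return sum(g(k) * p for k, p in pmf.items())


def mgf(t, pmf):
    """M_X(t) = E(e^{Xt}) = sum over k of e^{kt} * p(k), in floating point."""
    return sum(math.exp(k * t) * float(p) for k, p in pmf.items())


def prob_open_interval(a, b, pmf):
    """P(a < X < b): add p(k) over the values k strictly between a and b."""
    return sum(p for k, p in pmf.items() if a < k < b)


mean = expectation(lambda k: k, die_pmf)
second_moment = expectation(lambda k: k * k, die_pmf)
variance = second_moment - mean ** 2
```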
The standard class of discrete distributions.
(1) X ∼ Bernoulli(p)
frequency function: p(1) = p, p(0) = q = 1 − p.
mean: E(X) = p.
variance: V (X) = pq.
mgf: $M(t) = q + pe^t$, −∞ < t < ∞.
(2) X ∼ Binomial(n, p)
frequency function: $p(k) = \binom{n}{k} p^k q^{n-k}$, k = 0, 1, ..., n.
mean: E(X) = np.
variance: V (X) = npq.
mgf: $M(t) = (q + pe^t)^n$, −∞ < t < ∞.
If X is the number of successes in n independent Bernoulli trials, then X ∼ Binomial(n, p).
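The Binomial(n, p) entries can be confirmed by brute-force summation. The sketch below picks n = 10 and p = 3/10 as arbitrary illustrative values and uses exact fractions for the frequency function, so the checks of the total probability, np and npq are exact; the mgf comparison is done in floating point.

```python
from fractions import Fraction
from math import comb, exp


def binomial_pmf(k, n, p):
    """p(k) = C(n, k) * p**k * q**(n - k), with q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p ** k * q ** (n - k)


n, p = 10, Fraction(3, 10)   # illustrative parameter values
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

total = sum(pmf.values())                                   # should be 1
mean = sum(k * pk for k, pk in pmf.items())                 # should be n * p
variance = sum(k * k * pk for k, pk in pmf.items()) - mean ** 2   # n * p * q

# Compare the mgf as a sum with the closed form (q + p e^t)^n at one point t.
t = 0.1
mgf_sum = sum(exp(k * t) * float(pk) for k, pk in pmf.items())
mgf_closed = (float(1 - p) + float(p) * exp(t)) ** n
```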
(3) X ∼ Geometric(p). There are two definitions of the Geometric(p) distribution: (I) X is the number of failures before the first success in a sequence of Bernoulli trials, and (II) Y is the number of trials required to see the first success in a sequence of Bernoulli trials. If X and Y have these respective distributions, then Y = X + 1. We give results separately for the two definitions.
Case I.
frequency function: $p(k) = pq^k$, k = 0, 1, ..., where q = 1 − p.
mean: E(X) = q/p.
variance: V (X) = q/p^2.
mgf: $M(t) = \frac{p}{1 - qe^t}$, −∞ < t < ln(1/q).
Case II.
frequency function: $p(k) = pq^{k-1}$, k = 1, 2, ..., where q = 1 − p.
mean: E(X) = 1/p.
variance: V (X) = q/p^2.
mgf: $M(t) = \frac{pe^t}{1 - qe^t}$, −∞ < t < ln(1/q).
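The relationship between the two Geometric(p) conventions is easy to confuse, so here is a small numerical sketch. It uses p = 0.25 as an arbitrary example and truncates the infinite sums at an assumed cutoff of 400 terms, beyond which the remaining probability is negligible for this p.

```python
def geometric_failures_pmf(k, p):
    """Case I: P(X = k) = p * q**k for k = 0, 1, 2, ... (failures before success)."""
    return p * (1.0 - p) ** k


def geometric_trials_pmf(k, p):
    """Case II: P(Y = k) = p * q**(k - 1) for k = 1, 2, 3, ... (trials to success)."""
    return p * (1.0 - p) ** (k - 1)


p = 0.25          # illustrative success probability
q = 1.0 - p
CUTOFF = 400      # assumed truncation point for the infinite sums

mean_failures = sum(k * geometric_failures_pmf(k, p) for k in range(CUTOFF))
mean_trials = sum(k * geometric_trials_pmf(k, p) for k in range(1, CUTOFF + 1))
var_failures = (
    sum(k * k * geometric_failures_pmf(k, p) for k in range(CUTOFF))
    - mean_failures ** 2
)

# Y = X + 1 means P(Y = k + 1) = P(X = k) for every k >= 0.
shift_ok = all(
    abs(geometric_trials_pmf(k + 1, p) - geometric_failures_pmf(k, p)) < 1e-15
    for k in range(50)
)
```

The two means differ by exactly 1 (q/p versus 1/p), while the variance q/p^2 is the same in both cases because shifting by a constant does not change the variance.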
(4) X ∼ Negative Binomial(r, p). Again, there are two definitions for the Negative Binomial(r, p) distribution. (I) X is the number of failures before the rth success in a sequence of Bernoulli trials and (II) Y is the number of trials required to see the rth success in a sequence of Bernoulli trials. If X and Y have these respective distributions, then Y = X + r. We give results separately for the two definitions.
Case I.
V (X) = E[(X − μ)^2 ] = E(X^2 ) − μ^2.
V (aX + b) = a^2 V (X).
V (X + Y ) = V (X) + V (Y ) + 2Cov(X, Y ) for any random variables X and Y.
V (X + Y ) = V (X) + V (Y ) if X and Y are independent.
Cov(X, Y ) = E[(X − μX )(Y − μY )] = E(XY ) − E(X)E(Y ).
Cov(X, Y ) = 0 if X and Y are independent.
Cov(X, X) = V(X).
Cov(aX, bY ) = abCov(X, Y ).
Cov(X + Y, U + V ) = Cov(X, U ) + Cov(X, V ) + Cov(Y, U ) + Cov(Y, V ).
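These identities can be verified exactly on a small example. The joint frequency function below is hypothetical (chosen so that X and Y are dependent), and the helper names are illustrative; exact fractions avoid any rounding questions.

```python
from fractions import Fraction

# Hypothetical joint pmf on {0, 1} x {0, 1}; X and Y are dependent here.
joint = {
    (0, 0): Fraction(4, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(3, 10),
}


def expect(g):
    """E(g(X, Y)) = sum of g(x, y) * p(x, y) over all pairs."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


def cov(g, h):
    """Cov(g, h) = E(gh) - E(g) E(h) for functions g, h of (X, Y)."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)


def var(g):
    """V(g) = Cov(g, g)."""
    return cov(g, g)


def x_of(x, y):
    return x


def y_of(x, y):
    return y


def sum_of(x, y):
    return x + y


lhs = var(sum_of)                                   # V(X + Y)
rhs = var(x_of) + var(y_of) + 2 * cov(x_of, y_of)   # V(X) + V(Y) + 2 Cov(X, Y)
scaled = var(lambda x, y: 3 * x + 2)                # V(3X + 2)
```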
Sampling
If $\{X_i\}$ are n independent, identically distributed random variables with $E(X_i) = \mu$ and $V(X_i) = \sigma^2$, and $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean, then:
$E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$.
Central limit theorem
If $\{X_i\}$ are n independent, identically distributed random variables with $E(X_i) = \mu$ and $V(X_i) = \sigma^2$, then
$\sum_{i=1}^{n} X_i$ is approximately $N(n\mu, n\sigma^2)$ as n → ∞.
$\bar{X}$ is approximately $N(\mu, \sigma^2/n)$ as n → ∞.
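A small simulation gives a feel for both the sampling results and the central limit theorem. The sketch below assumes Uniform(0, 1) summands (mean 1/2, variance 1/12), n = 30 terms per sum, 20,000 repetitions and a fixed seed; all of these are illustrative choices, not values from the notes.

```python
import random
import statistics


def simulate_sums(n, reps, rng):
    """Draw `reps` independent sums of n Uniform(0, 1) random variables."""
    return [sum(rng.random() for _ in range(n)) for _ in range(reps)]


n, reps = 30, 20_000
mu, sigma2 = 0.5, 1.0 / 12.0     # mean and variance of Uniform(0, 1)
rng = random.Random(0)           # fixed seed so the run is repeatable
sums = simulate_sums(n, reps, rng)

sum_mean = statistics.fmean(sums)        # compare with n * mu
sum_var = statistics.variance(sums)      # compare with n * sigma2

means = [s / n for s in sums]            # sample means X-bar
mean_var = statistics.variance(means)    # compare with sigma2 / n

# For a normal distribution, about 68.3% of the mass lies within one
# standard deviation of the mean; the simulated sums should be close to that.
sd = (n * sigma2) ** 0.5
within_one_sd = sum(abs(s - n * mu) < sd for s in sums) / reps
```

The comparison values are approximate by nature: the simulated quantities fluctuate from seed to seed, and the normal approximation itself is only exact in the limit.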
Joint and conditional distributions
If X and Y have joint pdf $f_{X,Y}(x, y)$, then
$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$ or $\sum_{\text{all } y} f_{X,Y}(x, y)$;
$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$ or $\sum_{\text{all } x} f_{X,Y}(x, y)$;
$f_{X|Y=y}(x) = f_{X,Y}(x, y)/f_Y(y)$;
$f_{Y|X=x}(y) = f_{X,Y}(x, y)/f_X(x)$;
$f_{X,Y}(x, y) = f_X(x) f_Y(y)$ if and only if X and Y are independent.
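A worked example may help fix the definitions. Assume, purely for illustration, the joint density f(x, y) = x + y on the unit square 0 < x < 1, 0 < y < 1 (this density is not from the notes):

```latex
\begin{align*}
f_X(x) &= \int_0^1 (x + y)\,dy = x + \tfrac{1}{2}, \qquad 0 < x < 1, \\
f_Y(y) &= \int_0^1 (x + y)\,dx = y + \tfrac{1}{2}, \qquad 0 < y < 1, \\
f_{Y|X=x}(y) &= \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{x + y}{x + \tfrac{1}{2}}, \qquad 0 < y < 1, \\
f_X(x)\, f_Y(y) &= \left(x + \tfrac{1}{2}\right)\left(y + \tfrac{1}{2}\right) \neq x + y
\quad \text{in general.}
\end{align*}
```

Since the joint density is not the product of the marginals, X and Y are not independent in this example, and the conditional density of Y depends on x.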
Xmax, Xmin
If $\{X_i\}$ are n independent, identically distributed random variables with pdf $f_X(x)$ and cdf $F_X(x)$, then
$f_{X_{\max}}(x) = n f_X(x) [F_X(x)]^{n-1}$
$f_{X_{\min}}(x) = n f_X(x) [1 - F_X(x)]^{n-1}$
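As a sanity check on these two formulas, the sketch below applies them to n = 5 independent Uniform(0, 1) variables (an assumed example) and integrates numerically. Each density should integrate to 1, and the means should come out near n/(n + 1) for the maximum and 1/(n + 1) for the minimum.

```python
def integrate(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def max_pdf(x, n, pdf, cdf):
    """Density of the maximum: n * f(x) * F(x)**(n - 1)."""
    return n * pdf(x) * cdf(x) ** (n - 1)


def min_pdf(x, n, pdf, cdf):
    """Density of the minimum: n * f(x) * (1 - F(x))**(n - 1)."""
    return n * pdf(x) * (1.0 - cdf(x)) ** (n - 1)


def unif_pdf(x):
    """Uniform(0, 1) density (assumed example distribution)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def unif_cdf(x):
    """Uniform(0, 1) cdf: 0 below 0, x on [0, 1], 1 above 1."""
    return min(max(x, 0.0), 1.0)


n = 5
total_max = integrate(lambda x: max_pdf(x, n, unif_pdf, unif_cdf), 0.0, 1.0)
total_min = integrate(lambda x: min_pdf(x, n, unif_pdf, unif_cdf), 0.0, 1.0)
mean_max = integrate(lambda x: x * max_pdf(x, n, unif_pdf, unif_cdf), 0.0, 1.0)
mean_min = integrate(lambda x: x * min_pdf(x, n, unif_pdf, unif_cdf), 0.0, 1.0)
```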