Probability Distributions and Functions

1 Probability Concepts and Distributions

1.2 Set Notation

Sample Space (S): The set of all possible sample points (i.e., the collection of all possible outcomes of the experiment).
Complement (A^c): The set of all elements not in the event A; P(A^c) = 1 - P(A).
Union (A ∪ B): The set of all elements in A, in B, or in both.
Intersection (A ∩ B): The set of all elements in both A and B.
Empty Set (∅): The set containing no elements. If A ∩ B = ∅, then A and B are mutually exclusive (disjoint).

1.3 Additive Law

The probability that at least one of the two events A and B occurs is:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
If events A and B are mutually exclusive, then:
P(A ∪ B) = P(A) + P(B)

1.4 Conditional Probability

By the multiplicative law, the probability that both events A and B occur is:
P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B)
Two events are independent if any one of the following holds:
P(A ∩ B) = P(A)P(B)
P(A|B) = P(A)
P(B|A) = P(B)

1.6 Law of Total Probability and Bayes' Rule

Let B1, ..., Bk be a partition of the sample space S with P(Bi) > 0 for every i. Then, for any event A:
P(A) = Σi P(A|Bi)P(Bi)
Bayes' Rule (conditioning on a partition):
P(Bj|A) = P(A|Bj)P(Bj) / Σi P(A|Bi)P(Bi)

1.7 Random Sampling

Random Variable: A numeric variable whose value is determined by the outcome of a chance experiment; its domain is the sample space.
Statistics: The use of samples to make statements about populations (i.e., inference).
Simple Random Sample (SRS): A selection of n individuals from a population of size N such that every set of n individuals has the same probability of selection. The number of possible SRSs is (N choose n) = N!/(n!(N-n)!).

2 Discrete Distributions

2.1 Bernoulli(p)

Counts the success (1) or failure (0) of a single Boolean trial; the special case of the binomial distribution with n = 1.
P(X = x|p) = p^x(1 - p)^(1-x); x = 0, 1; 0 ≤ p ≤ 1
E[X] = p, Var[X] = p(1 - p)
Moment Generating Function (MGF): MX(t) = (1 - p) + pe^t

2.2 Binomial(n, p)

Counts successes in n independent Boolean trials (sampling with replacement). Related to the multinomial distribution, its multivariate generalization. Can be approximated by the Poisson when n is large and p is small (np moderate).
P(X = x|n, p) = (n choose x) p^x(1 - p)^(n-x); x = 0, 1, ..., n; 0 ≤ p ≤ 1
E[X] = np, Var[X] = np(1 - p)
MGF: MX(t) = [pe^t + (1 - p)]^n

2.3 Geometric(p)

Models the number of trials needed to obtain the first success. Y = X - 1 is negative binomial(1, p). The distribution is memoryless: P(X > s|X > t) = P(X > s - t) for s > t.
P(X = x|p) = p(1 - p)^(x-1); x = 1, 2, ...; 0 ≤ p ≤ 1
E[X] = 1/p, Var[X] = (1 - p)/p^2
MGF: MX(t) = pe^t/(1 - (1 - p)e^t), t < -log(1 - p)

2.4 Hypergeometric(N, M, K)

Counts successes in K Boolean trials (sampling without replacement) from a population of N items, M of which are successes. If K is small relative to both M and N - M, the full range x = 0, 1, ..., K applies.
P(X = x|N, M, K) = (M choose x)(N-M choose K-x)/(N choose K); x = 0, 1, ..., K; M - (N - K) ≤ x ≤ M; N, M, K ≥ 0
E[X] = KM/N, Var[X] = KM(N-M)(N-K)/(N^2(N-1))
The MGF exists (the support is finite) but has no simple closed form.

2.5 Negative Binomial(r, p)

Counts the number of failures before the r-th success in independent Boolean trials. When r = 1, this is the geometric distribution of 2.3 shifted to start at 0.
P(X = x|r, p) = (r+x-1 choose x) p^r(1 - p)^x; x = 0, 1, ...; 0 ≤ p ≤ 1
E[X] = r(1 - p)/p, Var[X] = r(1 - p)/p^2
MGF: MX(t) = [p/(1 - (1 - p)e^t)]^r, t < -log(1 - p)
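The tabulated means, variances, and MGFs above are easy to sanity-check numerically. Below is a minimal sketch for the binomial case, assuming a Python environment with NumPy and SciPy (the library choice is an assumption); the same pattern applies to any of the discrete distributions in this section.

import numpy as np
from scipy import stats

# Binomial(n, p): check E[X] = np, Var[X] = np(1 - p), and the MGF
# MX(t) = [pe^t + (1 - p)]^n against E[e^(tX)] computed from the pmf.
n, p = 10, 0.3
X = stats.binom(n, p)

assert np.isclose(X.mean(), n * p)               # E[X] = np
assert np.isclose(X.var(), n * p * (1 - p))      # Var[X] = np(1 - p)

t = 0.5
support = np.arange(n + 1)                       # x = 0, 1, ..., n
mgf_from_pmf = np.sum(np.exp(t * support) * X.pmf(support))
mgf_closed_form = (p * np.exp(t) + (1 - p)) ** n
assert np.isclose(mgf_from_pmf, mgf_closed_form)
print(mgf_from_pmf, mgf_closed_form)             # the two values agree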
3 Continuous Distributions

3.9 Normal(μ, σ^2)

f(x|μ, σ^2) = (1/√(2πσ^2)) e^(-(x-μ)^2/(2σ^2)), -∞ < x < ∞, -∞ < μ < ∞, σ > 0
E[X] = μ, Var[X] = σ^2
MGF: MX(t) = e^(μt + σ^2t^2/2)

3.10 t(ν)

Related to the F distribution: if X ~ t(ν), then X^2 ~ F(1, ν).
f(x|ν) = [Γ((ν+1)/2)/(√(νπ)Γ(ν/2))] (1 + x^2/ν)^(-(ν+1)/2), -∞ < x < ∞, ν = 1, 2, ...
E[X] = 0 (ν > 1), Var[X] = ν/(ν-2), ν > 2
Moments: E[X^n] = 0 if n is odd and n < ν, and E[X^n] = Γ((n+1)/2)Γ((ν-n)/2)ν^(n/2)/(√π Γ(ν/2)) if n is even and n < ν

3.11 Uniform(a, b)

If a = 0 and b = 1, this is a special case of the beta (α = β = 1).
f(x|a, b) = 1/(b-a), a ≤ x ≤ b
E[X] = (a+b)/2, Var[X] = (b-a)^2/12
MGF: MX(t) = (e^(bt) - e^(at))/(t(b-a)), t ≠ 0; MX(0) = 1

3.12 Weibull(γ, β)

The MGF exists only for γ ≥ 1, and its form is not very useful.
f(x|γ, β) = (γ/β)(x/β)^(γ-1) e^(-(x/β)^γ), 0 ≤ x < ∞, γ > 0, β > 0
E[X] = βΓ(1 + 1/γ), Var[X] = β^2[Γ(1 + 2/γ) - Γ(1 + 1/γ)^2]
Moments: E[X^n] = β^n Γ(1 + n/γ)

6 Functions of Random Variables

6.1 Method of Distribution Functions

To find the density of a function U of random variables Y1, ..., Yn with joint density f(y1, ..., yn):
1. Identify the region of the (y1, ..., yn) space over which U takes its values.
2. Determine the region in the (y1, ..., yn) space where U ≤ u.
3. Find the distribution function FU(u) = P(U ≤ u) by integrating f(y1, ..., yn) over the region where U ≤ u.
4. Obtain the density of U as the derivative of its distribution function: fU(u) = dFU(u)/du.

Example: Find the density of U = 4Y + 2, given
f(y) = 3y^2, 0 ≤ y ≤ 1; 0 otherwise
U ≤ u means 4Y + 2 ≤ u, so Y ≤ (u - 2)/4. Then
FU(u) = P(U ≤ u) = ∫_0^((u-2)/4) 3y^2 dy = (u-2)^3/64 for 2 ≤ u ≤ 6 (FU(u) = 0 for u < 2, FU(u) = 1 for u > 6)
fU(u) = dFU(u)/du = 3(u-2)^2/64, 2 ≤ u ≤ 6; 0 otherwise

6.2 Method of Transformations

Let Y be a continuous random variable and U = h(Y), where h is either strictly increasing or strictly decreasing for all y such that fY(y) > 0. To find the density of U:
1. Find the inverse function y = h^(-1)(u).
2. Evaluate the derivative dh^(-1)(u)/du of the inverse function.
3. Then fU(u) = fY(h^(-1)(u)) · |dh^(-1)(u)/du|.

Example: Find the density of U = 4Y + 2, given
f(y) = 3y^2, 0 ≤ y ≤ 1; 0 otherwise
h(Y) = 4Y + 2, so h^(-1)(u) = (u-2)/4 and dh^(-1)(u)/du = 1/4.
fU(u) = fY(h^(-1)(u)) · |dh^(-1)(u)/du| = 3((u-2)/4)^2 · (1/4) = 3(u-2)^2/64, 2 ≤ u ≤ 6; 0 otherwise

The method of distribution functions and the method of transformations produce the same result for the same density and transformation function, as this example shows.
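The worked example can also be confirmed by simulation. A minimal sketch in Python, assuming NumPy is available: since f(y) = 3y^2 on [0, 1] is the Beta(3, 1) density, simulated draws of U = 4Y + 2 should match FU(u) = (u - 2)^3/64.

import numpy as np

rng = np.random.default_rng(0)
y = rng.beta(3, 1, size=1_000_000)   # f(y) = 3y^2 on [0, 1] is Beta(3, 1)
u = 4 * y + 2                        # the transformation U = 4Y + 2

for u0 in (3.0, 4.0, 5.0):
    empirical = np.mean(u <= u0)     # Monte Carlo estimate of P(U <= u0)
    exact = (u0 - 2) ** 3 / 64       # FU(u0) from either method
    print(f"u = {u0}: simulated {empirical:.4f}, exact {exact:.4f}")

A histogram of the simulated u values compared against 3(u - 2)^2/64 gives the same confirmation for the density itself.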