Introduction to Basic Probability Concepts
Charlie Gibbons
Economics 140
September 3, 2009

Outline
1. Probability Basics
   - Joint Probabilities
   - Conditional Probabilities
   - Independence
2. Expectations
   - Definition and properties
   - Conditional Expectations
3. Dispersion
   - Variance
   - Covariance and Correlation

Preliminary definitions

Define the cumulative distribution function (CDF), F_X(x), as Pr(X ≤ x).

Ex: The CDF in the die-rolling example (X is the outcome of a fair six-sided die) gives the probability of rolling a number no greater than x:

  F_X(3) = Pr(X ≤ 3) = Pr(X = 1) + Pr(X = 2) + Pr(X = 3) = 1/2.

The CDF has three important properties:
- lim_{x→−∞} F_X(x) = 0 (you can't get anything less than −∞)
- lim_{x→∞} F_X(x) = 1 (everything is less than ∞)
- dF_X(x)/dx ≥ 0 (the CDF is non-decreasing)

We saw that the PMF of a discrete random variable is Pr(X = x); thus the CDF is

  F_X(x) = ∑_{y=−∞}^{x} Pr(X = y).

Joint Probabilities

Previously, we considered the distribution of a lone random variable. Now we will consider the joint distribution of two random variables.

The joint cumulative distribution function (joint CDF), F_{X,Y}(x, y), of the random variables X and Y is defined by

  F_{X,Y}(x, y) = Pr(X ≤ x and Y ≤ y),

which equals ∑_{s=−∞}^{x} ∑_{t=−∞}^{y} f_{X,Y}(s, t) for discrete random variables and ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(s, t) dt ds for continuous ones. As with any CDF, F_{X,Y}(x, y) must go to 1 as x and y go to infinity.

Consider the roll of two dice and let X and Y be the outcomes on each die. Then the 36 (equally likely) possibilities are:

  x \ y |  1    2    3    4    5    6
  ------+-----------------------------
    1   | 1,1  1,2  1,3  1,4  1,5  1,6
    2   | 2,1  2,2  2,3  2,4  2,5  2,6
    3   | 3,1  3,2  3,3  3,4  3,5  3,6
    4   | 4,1  4,2  4,3  4,4  4,5  4,6
    5   | 5,1  5,2  5,3  5,4  5,5  5,6
    6   | 6,1  6,2  6,3  6,4  6,5  6,6

The joint probability mass function (joint PMF), f_{X,Y}, is

  f_{X,Y}(x, y) = Pr(X = x and Y = y).

What is f_{X,Y}(6, 5)? The outcome (6, 5) is one of the 36 equally likely cells of the table, so

  f_{X,Y}(6, 5) = 1/36.

[Figure: joint PDF of two independent normal random variables, plotted over X and Y (each running from −4 to 4) with the density on the vertical axis.]

Let's take the joint CDF and let y go to infinity, i.e., allow any possible value of Y. We get

  F_X(x) = lim_{y→∞} F_{X,Y}(x, y),

where F_X(x) is the marginal cumulative distribution function (marginal CDF) of X.
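To make the dice example concrete, here is a minimal Python sketch (not part of the original notes; the names joint_pmf and joint_cdf are my own) that enumerates the 36 equally likely outcomes, stores the joint PMF in a dictionary, and evaluates f_{X,Y}(6, 5) and the joint CDF by summing over the relevant cells.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two fair dice: each of the 36 ordered pairs (x, y)
# occurs with probability 1/36.
joint_pmf = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

def joint_cdf(a, b):
    """F_{X,Y}(a, b) = Pr(X <= a and Y <= b): sum the joint PMF over
    all cells of the table with x <= a and y <= b."""
    return sum(p for (x, y), p in joint_pmf.items() if x <= a and y <= b)

print(joint_pmf[(6, 5)])  # f_{X,Y}(6, 5) = 1/36
print(joint_cdf(6, 5))    # 30/36 = 5/6
print(joint_cdf(6, 6))    # 1, as the joint CDF must reach 1
```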
What is F_Y(2) in the two-dice example? Counting the outcomes in the table with y ≤ 2 (x can take any value) gives 12 of the 36 cells, so

  F_Y(2) = lim_{x→∞} F_{X,Y}(x, 2) = 12/36 = 1/3.
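A short follow-up to the earlier sketch (again illustrative, not from the notes) recovers marginal quantities from the same joint PMF: the marginal CDF F_Y(2) by letting x take any value, and the marginal PMF f_Y(y) by summing the joint PMF over x.

```python
from fractions import Fraction
from itertools import product

# Same joint PMF as above: 36 equally likely ordered pairs of two fair dice.
joint_pmf = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

def marginal_cdf_Y(b):
    """F_Y(b): let x take any value and keep the cells with y <= b."""
    return sum(p for (x, y), p in joint_pmf.items() if y <= b)

def marginal_pmf_Y(b):
    """f_Y(b): sum the joint PMF over all values of x."""
    return sum(p for (x, y), p in joint_pmf.items() if y == b)

print(marginal_cdf_Y(2))  # F_Y(2) = 12/36 = 1/3
print(marginal_pmf_Y(2))  # f_Y(2) = 6/36 = 1/6
```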
The marginal PMF of X is

  f_X(x) = ∑_{y=−∞}^{∞} f_{X,Y}(x, y).

The marginal PDF of X is

  f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy.

Note that, while a marginal PDF (PMF) can be found from a joint PDF (PMF), the converse is not true; infinitely many joint PDFs (PMFs) are consistent with a given marginal PDF (PMF).

Conditional Probabilities

Suppose that the value of X in a joint distribution is known. What can we say about the distribution of Y given this knowledge? This is called the conditional distribution of Y given X = x.

The conditional PDF (PMF) of Y given X = x, f_{Y|X}(y | X = x), is defined by

  f_{Y|X}(y | X = x) = f_{X,Y}(x, y) / f_X(x).

As for any PDF (PMF), the conditional PDF (PMF) must integrate (sum) to 1 over the support of Y, and it must be non-negative for all real values. We divide by f_X(x) because we've changed the sample space from all values of X to just X = x.

Independence

X and Y are independent if and only if

  F_{X,Y}(x, y) = F_X(x) F_Y(y)  and  f_{X,Y}(x, y) = f_X(x) f_Y(y).

Equivalently, X and Y are independent if and only if

  f_{Y|X}(y | X = x) = f_Y(y)  for all x in the support of X.

This implies that knowing X gives you no additional ability to predict Y, the intuitive notion underlying independence.

Now let Z = X + Y be the sum of the two dice. As we would imagine, the result of X influences the value of Z, so they shouldn't be independent. Let's prove it. Each cell below shows the pair (x, z) when the first die is x and the second die is y:

  x \ y |  1    2    3    4     5     6
  ------+--------------------------------
    1   | 1,2  1,3  1,4  1,5   1,6   1,7
    2   | 2,3  2,4  2,5  2,6   2,7   2,8
    3   | 3,4  3,5  3,6  3,7   3,8   3,9
    4   | 4,5  4,6  4,7  4,8   4,9   4,10
    5   | 5,6  5,7  5,8  5,9   5,10  5,11
    6   | 6,7  6,8  6,9  6,10  6,11  6,12

What is F_{X,Z}(2, 5)? Seven cells have x ≤ 2 and z ≤ 5, so

  F_{X,Z}(2, 5) = 7/36 ≠ 5/54 = (2/6) × (10/36) = F_X(2) × F_Z(5),

and hence X and Z are not independent.

Expectations of Random Variables

The expectation, E(X) ≡ μ, of a random variable X is simply the average of the possible realizations of X, weighted by their probabilities. For discrete random variables, this can be written as

  E(X) = ∑_{x∈X} Pr(X = x) x = ∑_{x∈X} f(x) x.

In our die example,

  E(X) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 21/6 = 3.5.

For continuous random variables,

  E(X) = ∫_{−∞}^{∞} x f(x) dx.

Expectations of Functions of Random Variables

The definition of expectation can be generalized to functions of random variables, g(X).

Properties of Expectations

Expectations are linear operators, i.e.,

  E(a · g(X) + b · h(X) + c) = a · E[g(X)] + b · E[h(X)] + c.

Note that, in general, E[g(X)] ≠ g[E(X)].

In our die example,

  E(X²) = 1² × 1/6 + 2² × 1/6 + 3² × 1/6 + 4² × 1/6 + 5² × 1/6 + 6² × 1/6 = 91/6 ≈ 15.17,

which is not equal to [E(X)]² = 3.5² = 12.25, so E(X²) ≠ [E(X)]².

Conditional Expectations

The expectation of a random variable Y conditional on (or given) X is defined analogously to the preceding formulations, but uses the conditional distribution f_{Y|X}(y | X = x) rather than the unconditional f_Y(y). Conditional distributions are about changing the population that you are considering.

Recall that, for independent random variables X and Y,

  f_{Y|X}(y | X = x) = f_Y(y)  and  f_{X|Y}(x | Y = y) = f_X(x).

Hence, E(Y | X) = E(Y) and E(X | Y) = E(X).
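The numbers in the independence and expectation examples above can be checked with another short, illustrative Python sketch (the variable names are mine, not the lecture's): it verifies that F_{X,Z}(2, 5) ≠ F_X(2) F_Z(5) for Z = X + Y, and that E(X²) ≠ [E(X)]² for a single die.

```python
from fractions import Fraction
from itertools import product

# X is the first die, Z = X + Y is the sum of the two dice.
outcomes = [(x, x + y) for x, y in product(range(1, 7), repeat=2)]
p = Fraction(1, 36)  # probability of each of the 36 ordered outcomes

# Joint and marginal CDF values obtained by counting cells of the table.
F_XZ = sum(p for x, z in outcomes if x <= 2 and z <= 5)  # 7/36
F_X  = sum(p for x, z in outcomes if x <= 2)             # 12/36 = 1/3
F_Z  = sum(p for x, z in outcomes if z <= 5)             # 10/36
print(F_XZ, F_X * F_Z, F_XZ == F_X * F_Z)  # 7/36 5/54 False -> not independent

# Die expectations: E(X) = 7/2 and E(X^2) = 91/6, so E(X^2) != [E(X)]^2.
E_X  = sum(Fraction(1, 6) * x for x in range(1, 7))
E_X2 = sum(Fraction(1, 6) * x**2 for x in range(1, 7))
print(E_X, E_X2, E_X2 == E_X**2)  # 7/2 91/6 False
```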
Variance

The variance of a random variable is a measure of its dispersion around its mean. It is defined as the second central moment of X:

  Var(X) = E[(X − μ)²].

Multiplying this out yields

  Var(X) = E(X² − 2μX + μ²) = E(X²) − 2μE(X) + μ² = E(X²) − 2μ² + μ² = E(X²) − [E(X)]².

The standard deviation, σ, of a random variable is the square root of its variance; i.e., σ = √Var(X).

See that Var(aX + b) = a² Var(X).

Covariance and Correlation

The covariance of random variables X and Y is defined as

  Cov(X, Y) ≡ σ_XY = E[(X − E(X))(Y − E(Y))] = E(XY) − μ_X μ_Y.

We have

  Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).
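Finally, one more minimal Python sketch on the same dice example (illustrative only, not from the notes): it computes Var(X) = E(X²) − [E(X)]² for one die, Cov(X, Z) = E(XZ) − μ_X μ_Z for the sum Z = X + Y, and checks that Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) reduces to 2 Var(X) here because the two dice are independent.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice; Z = X + Y is their sum.
outcomes = [(x, y, x + y) for x, y in product(range(1, 7), repeat=2)]
p = Fraction(1, 36)

def E(values):
    """Expectation of one equally likely realization per outcome."""
    return sum(p * v for v in values)

mu_X   = E([x for x, y, z in outcomes])                    # 7/2
mu_Z   = E([z for x, y, z in outcomes])                    # 7
var_X  = E([x**2 for x, y, z in outcomes]) - mu_X**2       # 91/6 - 49/4 = 35/12
var_Z  = E([z**2 for x, y, z in outcomes]) - mu_Z**2       # 35/6
cov_XZ = E([x * z for x, y, z in outcomes]) - mu_X * mu_Z  # 35/12

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y); the two dice are independent,
# so Cov(X, Y) = 0 and Var(Z) is simply twice Var(X).
print(var_X, var_Z, cov_XZ)  # 35/12 35/6 35/12
print(var_Z == 2 * var_X)    # True
```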