



















Lecture notes on joint distributions, their properties, and the concept of continuous random variables. It covers the properties of joint discrete and continuous probability functions, the concept of marginal distributions, and the relationship between independent and dependent random variables. It also includes examples and calculations.




















Section 6. Joint Distributions (LECTURE NOTES 6) 101
Properties of the joint (bivariate) discrete probability mass function (pmf) $f(x, y) = P(X = x, Y = y)$ for random variables $X$ and $Y$ with ranges $R_X$ and $R_Y$, where $R = \{(x, y) \mid x \in R_X,\ y \in R_Y\}$, are:
\[ 0 \le f(x, y) \le 1, \]
\[ \sum_{(x,y) \in R} f(x, y) = 1, \]
\[ P[(X, Y) \in S] = \sum_{(x,y) \in S} f(x, y), \quad S \subseteq R, \]
with marginal pmfs of $X$ and of $Y$,
\[ f_X(x) = P(X = x) = \sum_{y \in R_Y} f(x, y), \qquad f_Y(y) = P(Y = y) = \sum_{x \in R_X} f(x, y). \]
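Properties of the joint (bivariate) continuous probability density function (pdf) $f(x, y)$ for continuous random variables $X$ and $Y$ are:
\[ f(x, y) \ge 0, \]
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = 1, \]
\[ P[(X, Y) \in S] = \iint_S f(x, y)\,dy\,dx, \]
with marginal pdfs of $X$ and of $Y$,
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx. \]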
Random variables (discrete or continuous) X and Y are independent if and only if
f (x, y) = fX (x) · fY (y).
A set of $n$ random variables is mutually independent if and only if
\[ f(x_1, x_2, \ldots, x_n) = f_{X_1}(x_1) \cdot f_{X_2}(x_2) \cdots f_{X_n}(x_n). \]
Exercise 3.6 (Joint Distributions)
102 Chapter 3. Continuous Random Variables (LECTURE NOTES 6)
Figure 3.14: Discrete bivariate function: marbles (plot of f(x, y) against x, first draw, and y, second draw, each coded 0 = blue, 1 = black)
(a) Chance of choosing two blue marbles is $f(x, y) = f(0, 0) =$
    (i) $\frac{8}{14} \cdot \frac{7}{13} = \frac{28}{91}$ (ii) $\frac{8}{14} \cdot \frac{6}{13} = \frac{24}{91}$ (iii) $\frac{6}{14} \cdot \frac{8}{13} = \frac{24}{91}$ (iv) $\frac{6}{14} \cdot \frac{5}{13} = \frac{15}{91}$.
(b) Chance of a blue marble then black marble is $f(x, y) = f(0, 1) =$
    (i) $\frac{8}{14} \cdot \frac{7}{13} = \frac{28}{91}$ (ii) $\frac{8}{14} \cdot \frac{6}{13} = \frac{24}{91}$ (iii) $\frac{6}{14} \cdot \frac{8}{13} = \frac{24}{91}$ (iv) $\frac{6}{14} \cdot \frac{5}{13} = \frac{15}{91}$.
(c) The joint density is

    first drawn, x    second drawn, y    f(x, y)
    blue, 0           blue, 0            8/14 · 7/13 = 28/91
    blue, 0           black, 1           8/14 · 6/13 = 24/91
    black, 1          blue, 0            6/14 · 8/13 = 24/91
    black, 1          black, 1           6/14 · 5/13 = 15/91

    (i) True (ii) False
(d) Chance of choosing a blue marble in first of two draws is $f_X(0) = P(X = 0) = f(0, 0) + f(0, 1) =$
    (i) $\frac{28}{91} + \frac{28}{91}$ (ii) $\frac{28}{91} + \frac{24}{91}$ (iii) $\frac{28}{91} + \frac{15}{91}$ (iv) $\frac{24}{91} + \frac{15}{91}$.
(e) Chance of choosing a black marble in first of two draws is $f_X(1) = P(X = 1) = f(1, 0) + f(1, 1) =$
    (i) $\frac{28}{91} + \frac{28}{91}$ (ii) $\frac{28}{91} + \frac{24}{91}$ (iii) $\frac{28}{91} + \frac{15}{91}$ (iv) $\frac{24}{91} + \frac{15}{91}$.
(f) $P(X + Y = 1) = f(0, 1) + f(1, 0) = \frac{24}{91} + \frac{24}{91} =$
    (i) $\frac{48}{91}$ (ii) $\frac{28}{91}$ (iii) $\frac{24}{91}$ (iv) $\frac{15}{91}$.
(g) The joint density, including the marginal probabilities, is

    f(x, y)              x = blue, 0    x = black, 1    f_Y(y) = P(Y = y)
    y = blue, 0          28/91          24/91           52/91
    y = black, 1         24/91          15/91           39/91
    f_X(x) = P(X = x)    52/91          39/91           1

    (i) True (ii) False
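The joint table and its marginals can be rebuilt in a few lines of R. This is only a sketch: the urn contents (8 blue and 6 black marbles, drawn without replacement) are an assumption inferred from the fractions 8/14 and 6/14 above, and the object names are illustrative.

```r
# Joint pmf of two draws without replacement; 0 = blue, 1 = black.
# Assumption: 8 blue and 6 black marbles, inferred from 8/14 and 6/14.
blue <- 8; black <- 6; total <- blue + black
joint <- matrix(c(blue * (blue - 1), blue * black,
                  black * blue,      black * (black - 1)) / (total * (total - 1)),
                nrow = 2, byrow = TRUE,
                dimnames = list(x = c("0", "1"), y = c("0", "1")))

fX <- rowSums(joint)   # marginal pmf of the first draw
fY <- colSums(joint)   # marginal pmf of the second draw

# Independence would require f(x, y) = fX(x) * fY(y) in every cell.
independent <- isTRUE(all.equal(joint, outer(fX, fY), check.attributes = FALSE))
```

Here `independent` comes out `FALSE`, since for example f(0, 0) = 28/91 differs from (52/91)(52/91); drawing without replacement makes the second draw depend on the first.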
(f) $P(X = Y) = f(1, 1) + f(2, 2) = \frac{1 \cdot 1}{18} + \frac{2 \cdot 2}{18} =$ (i) $\frac{3}{18}$ (ii) $\frac{4}{18}$ (iii) $\frac{5}{18}$ (iv) $\frac{6}{18}$.
Figure 3.15: Continuous joint bivariate function: potato chips (axes: x, weight of bag, from 49 to 51; y, amount of salt, from 2 to 8; height f(x, y))
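Consider the following function for a potato chip machine,
\[ f(x, y) = \begin{cases} \frac{1}{12}, & 49 \le x \le 51,\ 2 \le y \le 8 \\ 0, & \text{elsewhere} \end{cases} \]
(a) This function is a pdf because $f(x, y) \ge 0$ for $49 \le x \le 51$, $2 \le y \le 8$ and
\[ \int_2^8 \int_{49}^{51} \frac{1}{12}\,dx\,dy = \frac{1}{12} \int_2^8 \Big[ x \Big]_{x=49}^{x=51} dy = \frac{1}{12} \int_2^8 2\,dy = \frac{2}{12} \Big[ y \Big]_{y=2}^{y=8} = \]
(i) 0 (ii) 0.5 (iii) 0.75 (iv) 1.
(b) Marginal for $X$
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_2^8 \frac{1}{12}\,dy = \frac{1}{12} \Big[ y \Big]_{y=2}^{y=8} = \]
(i) $\frac{1}{2}$ (ii) $\frac{1}{3}$ (iii) $\frac{1}{4}$ (iv) $\frac{1}{5}$, where $49 \le x \le 51$.
(c) Marginal for $Y$
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{49}^{51} \frac{1}{12}\,dx = \frac{1}{12} \Big[ x \Big]_{x=49}^{x=51} = \]
(i) $\frac{1}{3}$ (ii) $\frac{1}{4}$ (iii) $\frac{1}{5}$ (iv) $\frac{1}{6}$, where $2 \le y \le 8$.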
(d) Independence? Since
\[ f(x, y) = \frac{1}{12} = f_X(x) f_Y(y) = \frac{1}{2} \cdot \frac{1}{6}, \]
random variables $X$ and $Y$ are (i) dependent (ii) independent
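The potato chip calculations can be checked numerically. The following is a minimal sketch using base R's `integrate()`; the function names `f`, `fX` and `fY` are illustrative and not part of the notes.

```r
# Joint pdf: 1/12 on the rectangle 49 <= x <= 51, 2 <= y <= 8, zero elsewhere.
f <- function(x, y) ifelse(x >= 49 & x <= 51 & y >= 2 & y <= 8, 1 / 12, 0)

# Marginal pdfs: integrate the joint pdf over the other variable.
fX <- function(x) sapply(x, function(xi) integrate(function(y) f(xi, y), 2, 8)$value)
fY <- function(y) sapply(y, function(yi) integrate(function(x) f(x, yi), 49, 51)$value)

total <- integrate(fX, 49, 51)$value   # total probability, should be 1

# Independence check at one interior point: joint pdf versus product of marginals.
product_at_point <- fX(50) * fY(5)
```

Comparing `f(50, 5)` with `product_at_point` at a few interior points gives a quick, non-rigorous check of the factorisation used in part (d).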
Figure 3.16: Continuous bivariate density f (x, y) = (2 − x)(1 − y) (panels: surface f(x, y) over 0 ≤ x ≤ 2, 0 ≤ y ≤ 1; region Y > X above the line y = x)
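(a) Determine $k$ so that
\[ f(x, y) = \begin{cases} k(2 - x)(1 - y), & 0 \le x \le 2,\ 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases} \]
is a joint probability density function. Since
\[ \begin{aligned}
\int_0^1 \int_0^2 k(2 - x)(1 - y)\,dx\,dy &= k \int_0^1 \int_0^2 (2 - x - 2y + xy)\,dx\,dy \\
&= k \int_0^1 \Big[ 2x - \tfrac{1}{2}x^2 - 2xy + \tfrac{1}{2}x^2 y \Big]_{x=0}^{x=2} dy \\
&= k \int_0^1 (2 - 2y)\,dy \\
&= k \Big[ 2y - y^2 \Big]_{y=0}^{y=1} = k(2 - 1) = 1,
\end{aligned} \]
so $k =$ (i) $\frac{1}{4}$ (ii) $\frac{2}{4}$ (iii) $\frac{3}{4}$ (iv) 1.
(b) Marginal $f_X(x)$.
\[ \int_0^1 (2 - x)(1 - y)\,dy = \int_0^1 (2 - x - 2y + xy)\,dy = \Big[ 2y - xy - y^2 + \tfrac{1}{2}xy^2 \Big]_{y=0}^{y=1} \]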
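The normalising constant and the marginal of X can also be found numerically. This is a sketch under the setup of this exercise, with illustrative names `g`, `inner` and `fX`; it is a check on the hand calculation, not a replacement for it.

```r
g <- function(x, y) (2 - x) * (1 - y)   # density without the constant k

# Integrate over 0 <= x <= 2 first, then over 0 <= y <= 1.
inner <- function(y) sapply(y, function(yi)
  integrate(function(x) g(x, yi), 0, 2)$value)
k <- 1 / integrate(inner, 0, 1)$value

# Marginal pdf of X at a point x: integrate the joint pdf over y.
fX <- function(x) sapply(x, function(xi)
  k * integrate(function(y) g(xi, y), 0, 1)$value)

fX(c(0, 0.5, 1))
```

Evaluating the bracketed antiderivative at y = 1 gives 1 − x/2, which is what `fX` should reproduce on 0 ≤ x ≤ 2.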
Section 7. Functions of Independent Random Variables (LECTURE NOTES 6) 107
\[ Var(X + Y) = Var(W) = \sigma_W^2 = \begin{cases} \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}, & \text{dependent} \\ \sigma_X^2 + \sigma_Y^2, & \text{independent} \end{cases} \]
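The dependent case of this formula also holds exactly for sample variances and covariances, which gives a quick sanity check. A minimal R sketch follows; the simulated vectors `x` and `y` are made-up illustrative data, not taken from the notes.

```r
# Illustrative data only: y depends on x, so the covariance term matters.
set.seed(1)
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)

lhs   <- var(x + y)                        # Var(X + Y)
rhs   <- var(x) + var(y) + 2 * cov(x, y)   # sigma_X^2 + sigma_Y^2 + 2 sigma_XY
naive <- var(x) + var(y)                   # omits the covariance term
```

`lhs` and `rhs` agree up to floating-point error, while `naive` is off by twice the sample covariance.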
Exercise 3.7 (Functions of Independent Random Variables)
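\[ f(x, y) = \begin{cases} \frac{1}{12}, & 49 \le x \le 51,\ 2 \le y \le 8 \\ 0, & \text{elsewhere} \end{cases} \]
(a) The expected value of $u(x, y) = xy$, $E[XY]$, is
\[ \begin{aligned}
\int_2^8 \int_{49}^{51} (xy) f(x, y)\,dx\,dy &= \int_2^8 \int_{49}^{51} (xy) \frac{1}{12}\,dx\,dy = \frac{1}{12} \int_2^8 y \left( \int_{49}^{51} x\,dx \right) dy \\
&= \frac{1}{12} \int_2^8 y \Big[ \tfrac{1}{2}x^2 \Big]_{x=49}^{x=51} dy = \frac{1}{12} \int_2^8 100\,y\,dy \\
&= \frac{100}{12} \int_2^8 y\,dy = \frac{100}{12} \Big[ \tfrac{1}{2}y^2 \Big]_{y=2}^{y=8} =
\end{aligned} \]
(i) 5 (ii) 50 (iii) 55 (iv) 250.
(b) The expected value of $u(x, y) = x$, $E[X]$, is
\[ \begin{aligned}
\int_2^8 \int_{49}^{51} x f(x, y)\,dx\,dy &= \int_2^8 \int_{49}^{51} x \frac{1}{12}\,dx\,dy = \frac{1}{12} \int_2^8 \left( \int_{49}^{51} x\,dx \right) dy \\
&= \frac{1}{12} \int_2^8 \Big[ \tfrac{1}{2}x^2 \Big]_{x=49}^{x=51} dy = \frac{1}{12} \int_2^8 100\,dy \\
&= \frac{100}{12} \int_2^8 1\,dy = \frac{100}{12} \Big[ y \Big]_{y=2}^{y=8} =
\end{aligned} \]
(i) 5 (ii) 50 (iii) 55 (iv) 250.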
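(c) The expected value of $u(x, y) = y$, $E[Y]$, is
\[ \begin{aligned}
\int_2^8 \int_{49}^{51} y f(x, y)\,dx\,dy &= \int_2^8 \int_{49}^{51} y \frac{1}{12}\,dx\,dy = \frac{1}{12} \int_2^8 y \left( \int_{49}^{51} 1\,dx \right) dy \\
&= \frac{1}{12} \int_2^8 y \Big[ x \Big]_{x=49}^{x=51} dy = \frac{1}{12} \int_2^8 2y\,dy \\
&= \frac{2}{12} \int_2^8 y\,dy = \frac{2}{12} \Big[ \tfrac{1}{2}y^2 \Big]_{y=2}^{y=8} =
\end{aligned} \]
(i) 5 (ii) 50 (iii) 55 (iv) 250.
(d) Since $E(XY) = 250 = 50 \cdot 5 = E(X) \cdot E(Y)$, random variables $X$ and $Y$ are (i) independent (ii) dependent. (In general, $E(XY) = E(X)E(Y)$ is necessary for independence but not sufficient; here independence also follows from $f(x, y) = f_X(x) f_Y(y)$.)
(e) Whether or not $X$ and $Y$ are independent, $E(X + Y) = E(X) + E(Y) = 50 + 5 =$ (i) 5 (ii) 50 (iii) 55 (iv) 250.
(f) Find covariance $\sigma_{XY}$.
$\sigma_{XY} = E(XY) - E(X)E(Y) = 250 - 50 \cdot 5 =$ (i) 0 (ii) 28 (iii) 315 (iv) $\frac{90012}{36}$.
(g) Find $E(X + 3Y + XY)$.
$E(X + 3Y + XY) = E(X) + 3E(Y) + E(XY) = 50 + 3(5) + 250 =$ (i) 0 (ii) 28 (iii) 315 (iv) $\frac{90012}{36}$.
(h) Find $E(X^2)$.
\[ \begin{aligned}
\int_2^8 \int_{49}^{51} x^2 f(x, y)\,dx\,dy &= \int_2^8 \int_{49}^{51} x^2 \frac{1}{12}\,dx\,dy = \frac{1}{12} \int_2^8 \left( \int_{49}^{51} x^2\,dx \right) dy \\
&= \frac{1}{12} \int_2^8 \Big[ \tfrac{1}{3}x^3 \Big]_{x=49}^{x=51} dy = \frac{1}{12} \int_2^8 \frac{15002}{3}\,dy \\
&= \frac{15002}{36} \int_2^8 1\,dy = \frac{15002}{36} \Big[ y \Big]_{y=2}^{y=8} =
\end{aligned} \]
(i) 5 (ii) 28 (iii) 315 (iv) $\frac{90012}{36}$.
(i) Find $Var(X) = \sigma_X^2$.
\[ Var(X) = E(X^2) - [E(X)]^2 = E(X^2) - \mu_X^2 = \]
(i) $\frac{1}{3}$ (ii) 28 (iii) 315 (iv) $\frac{90012}{36}$.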
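The expectations in this exercise can be verified by iterated numerical integration. The sketch below uses base R's `integrate()`; the helper `expect()` is a hypothetical name introduced only for this illustration.

```r
# Joint pdf from the exercise: 1/12 on 49 <= x <= 51, 2 <= y <= 8.
f <- function(x, y) 1 / 12 + 0 * x * y   # constant on the support, vectorised in x

# E[u(X, Y)] as an iterated integral: inner over x, outer over y.
expect <- function(u) {
  inner <- function(y) sapply(y, function(yi)
    integrate(function(x) u(x, yi) * f(x, yi), 49, 51)$value)
  integrate(inner, 2, 8)$value
}

EX  <- expect(function(x, y) x)
EY  <- expect(function(x, y) y + 0 * x)
EXY <- expect(function(x, y) x * y)
EX2 <- expect(function(x, y) x^2)

cov_XY <- EXY - EX * EY   # covariance sigma_XY
var_X  <- EX2 - EX^2      # variance of X
```

The same helper handles part (g) by linearity, for example `expect(function(x, y) x + 3 * y + x * y)`.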
(a) Since each person chooses any of the ten tickets with equal chance, $E[X_i] = 1 \times \frac{1}{10} + 0 \times \frac{9}{10} =$ (i) $\frac{1}{10}$ (ii) $\frac{2}{10}$ (iii) $\frac{3}{10}$.
(b) So the expected number of the ten individuals who choose their own ticket is $E(X) = E(X_1) + \cdots + E(X_{10}) = 10 \times \frac{1}{10} =$ (i) $\frac{8}{10}$ (ii) $\frac{9}{10}$ (iii) $\frac{10}{10}$. We would expect one of ten individuals to choose their own ticket.
(c) If $n$ individuals played this game, then we would expect $E(X) = E(X_1) + \cdots + E(X_n) = n \cdot \frac{1}{n} =$ (i) $\frac{n-1}{n}$ (ii) $\frac{n}{n}$ (iii) $\frac{n+1}{n}$. Again, we would expect one of $n$ individuals to choose their own ticket.
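A short simulation illustrates the result. This sketch assumes the tickets are handed back as a uniformly random permutation (so each person gets their own ticket with probability 1/n); that mechanism, the seed and the number of replications are illustrative choices, not stated in the notes.

```r
# Assumption: tickets are returned as a uniformly random permutation,
# so each person gets their own ticket with probability 1/n.
set.seed(42)
n <- 10
matches <- replicate(10000, sum(sample(n) == seq_len(n)))
mean_matches <- mean(matches)   # should be close to n * (1/n) = 1
```

Changing `n` leaves the average number of matches near 1, in line with part (c).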
A population is a set of measurements or observations of a collection of objects. A sample is a selected subset of a population. A parameter is a numerical quantity calculated from a population, whereas a statistic is a numerical quantity calculated from a sample. The population is assumed to be modelled by some random variable $X$ with a probability distribution, for example, the normal distribution, $N(\mu, \sigma^2)$, with population parameter mean $\mu$ and population parameter variance $\sigma^2$. A typical example of a sample statistic is the sample mean of $n$ of the $X$ random variables,
\[ \bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}. \]
If $X_1, X_2, \ldots, X_n$ are mutually independent random variables where each is $N(\mu, \sigma^2)$, then $\bar{X}_n$ is
\[ N(\mu_{\bar{X}_n}, \sigma^2_{\bar{X}_n}) = N\left( \mu, \frac{\sigma^2}{n} \right). \]
In fact, the central limit theorem (CLT) says that if $X_1, X_2, \ldots, X_n$ are mutually independent random variables, each with common mean $\mu$ and variance $\sigma^2$, then as $n \to \infty$,
\[ \bar{X}_n \to N\left( \mu, \frac{\sigma^2}{n} \right) \]
no matter what the distribution of the population. Often $n \ge 30$ is “large enough” for the CLT to apply.
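A small simulation shows the CLT at work for a clearly non-normal population. In this sketch the exponential population with rate 1 (so μ = 1 and σ = 1), the seed and the number of replications are all illustrative assumptions.

```r
# Assumption for illustration: exponential population with rate 1,
# so mu = 1 and sigma = 1; this skewed choice is not from the notes.
set.seed(123)
n <- 30
xbar <- replicate(5000, mean(rexp(n, rate = 1)))

# Compare simulated mean and sd of X-bar with mu and sigma / sqrt(n).
c(mean = mean(xbar), sd = sd(xbar), clt_sd = 1 / sqrt(n))
```

A histogram of `xbar` (for example `hist(xbar)`) should look roughly bell-shaped even though the exponential population is strongly skewed.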
Exercise 3.8 (Central Limit Theorem)
Section 8. Central Limit Theorem (LECTURE NOTES 6) 111
i. $\mu_{\bar{X}} = \mu_X =$ (i) 2.7 (ii) 2.8 (iii) 2.9.
ii. $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{0.64}{\sqrt{35}} =$ (i) 0.10817975 (ii) 0.1110032 (iii) 0.13099923.
iii. $P(\bar{X} > 2.75) \approx$ (i) 0.30 (ii) 0.32 (iii) 0.35.
    1 - pnorm(2.75,2.7,0.64/sqrt(35)) # normal P(X-bar > 2.75) = 1 - P(X-bar < 2.75)
    [1] 0.
iv. $P(2.65 < \bar{X} < 2.75) \approx$ (i) 0.36 (ii) 0.39 (iii) 0.45.
    pnorm(2.75,2.7,0.64/sqrt(35)) - pnorm(2.65,2.7,0.64/sqrt(35))
    [1] 0.
(b) Temperatures. Temperature, $X$, on any given day during winter in Laporte averages $\mu_X = 0$ degrees with standard deviation of $\sigma_X = 1$ degree. Consider average temperature over random $n = 40$ days during winter.
i. $\mu_{\bar{X}} = \mu_X =$ (i) 0 (ii) 1 (iii) 2.
ii. $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{1}{\sqrt{40}} =$ (i) 0.0900234 (ii) 0.15811388 (iii) 0.23198455.
iii. $P(\bar{X} > 0.2) \approx$ (i) 0.03 (ii) 0.10 (iii) 0.15.
    1 - pnorm(0.2,0,1/sqrt(40)) # normal P(X-bar > 0.2) = 1 - P(X-bar < 0.2)
    [1] 0.
iv. $P(\bar{X} > 0.3) \approx$ (i) 0.03 (ii) 0.10 (iii) 0.15.
    1 - pnorm(0.3,0,1/sqrt(40)) # normal P(X-bar > 0.3) = 1 - P(X-bar < 0.3)
    [1] 0.
Since $P(\bar{X} > 0.3) \approx 0.03 < 0.05$, an average of 0.3° (i) is (ii) is not unusual.
(c) Another example. Suppose $X$ has distribution where $\mu_X = 1.7$ and $\sigma_X = 1.5$.
i. $\mu_{\bar{X}} = \mu_X =$ (i) 2.3 (ii) 1.7 (iii) 2.4.
ii. $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{1.5}{\sqrt{49}} =$ (i) 0.0243892 (ii) 0.14444398 (iii) 0.21428572.
iii. If $n = 49$, $P(-2 < \bar{X} < 2.75) \approx$ (i) 0.58 (ii) 0.86 (iii) 0.999.
    pnorm(2.75,1.7,1.5/sqrt(49)) - pnorm(-2,1.7,1.5/sqrt(49)) # P(X-bar < 2.75) - P(X-bar < -2)
    [1] 0.
iv. (i) True (ii) False. If $n = 15$, $P(-2 < \bar{X} < 2.75)$ cannot be calculated since $n = 15 < 30$.
v. $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{1.5}{\sqrt{15}} =$ (i) 0.0243892 (ii) 0.14444398 (iii) 0.38729835.
vi. If $n = 15$ and normal, $P(-2 < \bar{X} < 2.75) \approx$ (i) 0.75 (ii) 0.78 (iii) 0.997.
    pnorm(2.75,1.7,1.5/sqrt(15)) - pnorm(-2,1.7,1.5/sqrt(15)) # P(X-bar < 2.75) - P(X-bar < -2)
    [1] 0.
(d) Dice average. What is the chance, in $n = 30$ rolls of a fair die, that the average is between 3. and 3.7? What if $n = 3$?
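The dice question can be explored in R. In the sketch below, the lower bound 3.3 is a hypothetical value chosen only for illustration (the bound in the notes is not legible); the die's mean 3.5 and variance 35/12 follow from its pmf. For n = 3 the exact probability is found by enumerating all 216 outcomes, which allows a comparison with the normal approximation.

```r
# Fair die: exact mean and standard deviation from the pmf.
faces <- 1:6
mu <- mean(faces)                       # 3.5
sigma <- sqrt(mean((faces - mu)^2))     # sqrt(35/12)

# Hypothetical bounds, chosen only for illustration (not from the notes).
lo <- 3.3; hi <- 3.7

# CLT (normal) approximation for the mean of n rolls.
clt_prob <- function(n) {
  se <- sigma / sqrt(n)
  pnorm(hi, mu, se) - pnorm(lo, mu, se)
}

# Exact answer for n = 3 by enumerating all 6^3 equally likely outcomes.
rolls <- expand.grid(faces, faces, faces)
xbar3 <- rowMeans(rolls)
exact3 <- mean(xbar3 > lo & xbar3 < hi)

c(clt_n30 = clt_prob(30), clt_n3 = clt_prob(3), exact_n3 = exact3)
```

With these bounds the exact n = 3 probability is 54/216 = 0.25 (totals of 10 or 11), and it differs noticeably from the normal approximation, which is one reason a small n such as 3 is treated with caution.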
Figure 3.17: Comparing sampling distributions of sample mean (panels for $X_1$, $(X_1 + X_2)/2$ and $(X_1 + X_2 + X_3)/3$, each taking values between 1 and 3)
    pnorm(1.95,1.8,0.75/sqrt(30)) # normal P(X-bar < 1.95)
    [1] 0.
(d) After $n = 35$ trips to lake, sampling distribution in average number of fish caught is essentially normal where $\mu_{\bar{X}} = \mu_X =$ (i) 1.2 (ii) 1.5 (iii) 1.8, $\sigma_{\bar{X}} = \frac{0.75}{\sqrt{35}} \approx$ (i) 0.12677313 (ii) 0.13693064 (iii) 0.2449987, and chance average number of fish is less than 1.95 is $P(\bar{X} < 1.95) \approx$ (i) 0.73 (ii) 0.88 (iii) 0.94.
    pnorm(1.95,1.8,0.75/sqrt(35)) # normal P(X-bar < 1.95)
    [1] 0.
(e) Chance average number of fish is less than 1.95 after 30 trips, $P(\bar{X} < 1.95) \approx 0.86$, is smaller than / larger than chance average number of fish is less than 1.95 after 35 trips, $P(\bar{X} < 1.95) \approx 0.88$.
(f) The CLT is useful because (circle one or more):
Figure 3.18: Chance when n = 30 compared to chance when n = 35 (panels: average number of fish caught for n = 30 trips and for n = 35 trips, each with mean 1.8)
i. No matter what the original parent distribution is, as long as a large enough random sample is taken, the average of this sample follows a normal distribution.
ii. In practical situations where it is not known what parent probability distribution to use, as long as a large enough random sample is taken, the average of this sample follows a normal distribution.
iii. Rather than having to deal with many different probability distributions, as long as a large enough random sample is taken, the average of this sample follows one distribution, the normal distribution.
iv. Many distributions in statistics rely in one way or another on the normal distribution because of the CLT.
(g) (i) True (ii) False. The central limit theorem requires not only that $n \ge 30$, but also that a random sample of size $n \ge 30$ is used.
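The comparison between n = 30 and n = 35 trips in the fish exercise can be computed in one step, because `pnorm()` is vectorised over the standard deviation. A minimal sketch using the values given in the exercise (mean 1.8, standard deviation 0.75, threshold 1.95); the variable names are illustrative.

```r
# Values from the exercise: mu = 1.8, sigma = 0.75, threshold 1.95.
mu <- 1.8; sigma <- 0.75
n <- c(30, 35)
se <- sigma / sqrt(n)              # standard error of the sample mean
p_below <- pnorm(1.95, mu, se)     # P(X-bar < 1.95) for each n
round(data.frame(n, se, p_below), 4)
```

A larger n gives a smaller standard error, so more of the sampling distribution falls below a threshold that lies above the mean.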
Four related distributions which are important for statistics are discussed, including the gamma, chi-square, Student-t and F distributions.
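◦ critical t-value is a number $t_p(n)$ where
\[ P(T \ge t_p(n)) = p, \quad 0 \le p \le 1. \]
◦ $F$ distribution pdf,
\[ f(x) = \frac{\Gamma\left[ \frac{n+d}{2} \right] n^{n/2} d^{d/2} x^{n/2 - 1}}{\Gamma\left( \frac{n}{2} \right) \Gamma\left( \frac{d}{2} \right) (d + nx)^{(n+d)/2}}, \quad x > 0. \]
◦ $\mu = \frac{d}{d-2}$, $d > 2$, $\sigma^2 = \frac{2d^2(n + d - 2)}{n(d - 2)^2(d - 4)}$, $d > 4$, $M(t)$ is undefined
◦ if $U$ and $V$ are independent where $U$ is $\chi^2(n)$ and $V$ is $\chi^2(d)$, then
\[ F = \frac{U/n}{V/d} = \frac{dU}{nV} \]
has $F$ distribution with $n$ and $d$ degrees of freedom.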
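The construction of F from two independent chi-square variables can be illustrated by simulation. In this sketch the degrees of freedom (n = 5, d = 10), the seed and the sample size are arbitrary illustrative choices; `rchisq()` and `qf()` are base R functions.

```r
# Degrees of freedom chosen only for illustration.
set.seed(2024)
n <- 5; d <- 10
U <- rchisq(1e5, df = n)
V <- rchisq(1e5, df = d)
Fsim <- (U / n) / (V / d)

sim_mean <- mean(Fsim)                      # compare with d / (d - 2)
sim_cdf  <- mean(Fsim <= qf(0.90, n, d))    # compare with 0.90
```

If the ratio really has an F distribution with n and d degrees of freedom, `sim_mean` should be near d/(d − 2) = 1.25 and `sim_cdf` near 0.90.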
Exercise 3.9 (The Gamma and Related Distributions)
\[ \Gamma(n) = \int_0^{\infty} y^{n-1} e^{-y}\,dy = (n - 1)! \]
(i) True (ii) False
(b) Graphs of gamma density.
Section 9. The Gamma and Related Distributions (LECTURE NOTES 6) 117
Figure 3.19: Gamma densities (top panel: gamma(shape r = 1,2,3; rate lambda = 3); bottom panel: gamma(shape r = 1,2,3; rate lambda = 10); horizontal axis x from 0 to 3; curves labelled (1) to (6))
i. Match gamma density, $(r, \lambda)$, to graph, (1) to (6).

    (r, λ)    (1, 3)    (2, 3)    (3, 3)    (1, 10)    (2, 10)    (3, 10)
    graph     (1)

    library(graphics)
    par(mfrow = c(2,1))   # two panels, one per rate
    plot(function(x) dgamma(x,1,3), 0, 3, main = "gamma(shape r = 1,2,3; rate lambda = 3)")
    curve(dgamma(x,2,3), add = TRUE, col = "red", lwd = 2)
    curve(dgamma(x,3,3), add = TRUE, col = "blue", lwd = 2)
    plot(function(x) dgamma(x,1,10), 0, 3, main = "gamma(shape r = 1,2,3; rate lambda = 10)")
    curve(dgamma(x,2,10), add = TRUE, col = "red", lwd = 2)
    curve(dgamma(x,3,10), add = TRUE, col = "blue", lwd = 2)
    par(mfrow = c(1,1))   # restore single-panel layout

ii. As $r$ increases, “center” (mean, $\mu = \frac{r}{\lambda}$) of gamma density (i) decreases. (ii) remains the same. (iii) increases.
iii. As $\lambda$ increases, “dispersion” (variance, $\sigma^2 = \frac{r}{\lambda^2}$) of gamma density (i) decreases. (ii) remains the same.
(ii)
\[ f(x) = \begin{cases} \frac{x e^{-x}}{\Gamma(3)}, & x > 0, \\ 0, & \text{elsewhere}, \end{cases} \]
(iii)
\[ f(x) = \begin{cases} \frac{x^2 e^{-x/2}}{2^2\,\Gamma(1)}, & x > 0, \\ 0, & \text{elsewhere}. \end{cases} \]
(c) What is the chance of waiting at most 4.5 hours? Since $(r, \lambda) = (1, 2)$, $P(X < 4.5) = F(4.5) \approx$ (i) 0.002 (ii) 1.000 (iii) 0.870 (iv) 1.151.
    pgamma(4.5,1,2) # gamma, P(X < 4.5), r = 1, lambda = 2
    [1] 0.
(d) $P(X > 3.1) = 1 - P(X \le 3.1) = 1 - F(3.1) \approx$ (i) 0.002 (ii) 1.000 (iii) 0.870 (iv) 1.151.
    1 - pgamma(3.1,1,2) # gamma, P(X > 3.1), r = 1, lambda = 2
    [1] 0.
(e) What is the 90th percentile waiting time; in other words, what is that time such that 90% of waiting times are less than this time? If $P(X < x) = 0.90$, then $x \approx$ (i) 0.002 (ii) 1.000 (iii) 0.870 (iv) 1.151.
    qgamma(0.90,1,2) # 90th percentile, r = 1, lambda = 2
    [1] 1.
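With shape r = 1 the gamma distribution reduces to an exponential distribution with rate λ, so the three answers above also have closed forms. The sketch below compares R's gamma functions with those formulas; the vector names are illustrative.

```r
# With shape r = 1, gamma(r, lambda) is exponential with rate lambda:
# F(x) = 1 - exp(-lambda * x), and the p-th quantile is -log(1 - p) / lambda.
lambda <- 2
p_at_most <- c(pgamma = pgamma(4.5, 1, lambda), closed = 1 - exp(-lambda * 4.5))
p_more    <- c(pgamma = 1 - pgamma(3.1, 1, lambda), closed = exp(-lambda * 3.1))
q90       <- c(qgamma = qgamma(0.90, 1, lambda), closed = -log(1 - 0.90) / lambda)
```

Each pair of entries should agree up to floating-point error, which gives an independent check on the `pgamma()` and `qgamma()` calls in parts (c) to (e).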
Figure 3.20: Chi-square probabilities (panels: (a) Chi-Square with 4 degrees of freedom, $P(\chi^2 < 3.9) = ?$; (b) Chi-Square with 10 degrees of freedom, $P(3.6 < \chi^2 < 7.0) = ?$)
i. If $n = 4$, the probability of waiting less than 3.9 minutes is $P(X < 3.9) = F(3.9) \approx$ (i) 0.35 (ii) 0.45 (iii) 0.58 (iv) 0.66.
    pchisq(3.9,4) # chi-square, n = 4
    [1] 0.
ii. If $n = 10$, $P(3.6 < X < 7.0) \approx$ (i) 0.24 (ii) 0.33 (iii) 0.42 (iv) 0.56.
    pchisq(7,10) - pchisq(3.6,10) # chi-square, n = 10
    [1] 0.
iii. Chance of waiting time exactly 3 minutes, say, is zero, $P(X = 3) = 0$. (i) True (ii) False
(b) Critical value $\chi^2_p(n)$ (percentile).
Figure 3.21: Chi-square percentiles (panels: (a) Chi-square with 4 df, (b) Chi-square with 10 df; in each panel the 72nd percentile equals the 0.28 critical value $\chi^2_{0.28}(n)$, with area 0.72 to its left and 0.28 to its right)
i. If $n = 4$ and $P(X > \chi^2_{0.28}(4)) = 0.28$, then 0.28 critical value $\chi^2_{0.28}(4) \approx$ (i) 3.1 (ii) 5.1 (iii) 8.3 (iv) 9.1.
    qchisq(0.28,4,lower.tail=FALSE) # chi-square, n = 4, 0.28 critical value
    [1] 5.
ii. If $n = 4$ and $P(X < \chi^2_{0.28}(4)) = 0.72$, then 72nd percentile $\chi^2_{0.28}(4) \approx$ (i) 3.1 (ii) 5.1 (iii) 8.3 (iv) 9.1.
    qchisq(0.72,4,lower.tail=TRUE) # chi-square, n = 4, 72nd percentile
    [1] 5.
iii. If $n = 10$ and $P(X > \chi^2_{0.28}(10)) = 0.28$, then $\chi^2_{0.28}(10) \approx$ (i) 2.5 (ii) 10.5 (iii) 12.1 (iv) 20.4.
    qchisq(0.28,10,lower.tail=FALSE) # chi-square, n = 10, 0.28 critical value
    [1] 12.
iv. The 0.05 critical value for a chi-square with $n = 18$ df is $\chi^2_{0.05}(18) \approx$ (i) 2.5 (ii) 10.5 (iii) 28.870 (iv) 28.869.
    qchisq(0.05,18,lower.tail=FALSE) # chi-square, n = 18, 0.05 critical value, 95th percentile
    [1] 28.
or, equivalently, using Table C.4, $\chi^2_{0.05}(18) \approx$ (i) 2.5 (ii) 10.5 (iii) 28.870 (iv) 28.869. Table C.4 can only be used for a restricted set of $(n, p)$.
v. The 0.05 critical value (95th percentile) is that waiting time such that 95% of the waiting times are less than this waiting time and 5% are more than this time. (i) True (ii) False
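The link between critical values and percentiles used in this exercise can be checked directly in R: the p critical value is the (1 − p) quantile. The sketch below also uses the standard fact that a chi-square distribution with n degrees of freedom is a gamma distribution with shape n/2 and rate 1/2; the variable names are illustrative.

```r
n <- 4; p <- 0.28
crit <- qchisq(p, n, lower.tail = FALSE)            # 0.28 critical value
pct  <- qchisq(1 - p, n)                            # 72nd percentile
upper_area <- pchisq(crit, n, lower.tail = FALSE)   # recovers p

# Chi-square with n df is gamma with shape n/2 and rate 1/2.
same_cdf <- all.equal(pchisq(3.9, n), pgamma(3.9, shape = n / 2, rate = 1 / 2))
```

`crit` and `pct` are the same number, which is why parts i and ii of this exercise share one answer.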