




























































































Notes on probability, covering topics such as the probability generating function, negative binomial distribution, gambler's ruin, random walk, the Cauchy distribution, the projection theorem, least squares, and conditional expectation. The notes include examples and exercises and are likely intended for a university-level course on probability or statistics.

Hence $P(A^c) = 1 - P(A)$ and (1.1) holds. The axioms of probability guarantee that $0 \le P(A) \le 1$ for any event $A$. By axiom (i), we have $P(B) \ge 0$ for any event $B$. Since $A^c$ is an event, $P(A^c) \ge 0$. This readily implies that $0 \le P(A^c) = 1 - P(A)$. Hence $1 \ge P(A)$. We claim that
$$P(AB^c) = P(A) - P(AB). \tag{1.2}$$
Recall that for any sets $F$, $G$, and $H$, we have
$$F(G \cup H) = F \cap (G \cup H) = FG \cup FH.$$
Using this we obtain
$$A = A\Omega = A(B \cup B^c) = AB \cup AB^c.$$
Since $AB \cap AB^c = \emptyset$, axiom (iii) yields
$$P(A) = P(AB \cup AB^c) = P(AB) + P(AB^c).$$
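Hence $P(AB^c) = P(A) - P(AB)$ and (1.2) holds. For example, consider a deck of 52 cards. Then the probability of drawing an ace other than the ace of spades is $3/52$. To see this, let $A$ be the event of drawing an ace and $\spadesuit$ the event of drawing a spade. Then we have
$$P(A\spadesuit^c) = P(A) - P(A\spadesuit) = \frac{4}{52} - \frac{1}{52} = \frac{3}{52}.$$
Another method is to use axiom (iii′), that is,
$$P(A\spadesuit^c) = P(A\clubsuit \cup A\diamondsuit \cup A\heartsuit) = P(A\clubsuit) + P(A\diamondsuit) + P(A\heartsuit) = \frac{1}{52} + \frac{1}{52} + \frac{1}{52} = \frac{3}{52}.$$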
Here $\clubsuit$ is the event of drawing a club, $\diamondsuit$ is the event of drawing a diamond, and $\heartsuit$ is the event of drawing a heart. The following result is useful:
$$P(A \cup B) = P(A) + P(B) - P(AB). \tag{1.3}$$
To see this, first observe that
$$A \cup B = AB^c \cup AB \cup A^cB. \tag{1.4}$$
One can see this by drawing a Venn diagram. To verify this mathematically, observe that
$$AB^c \cup AB \cup A^cB = (A \cap (B^c \cup B)) \cup A^cB = A \cup A^cB = (A \cup A^c) \cap (A \cup B) = A \cup B.$$
Thus (1.4) holds. Recall that $P(EF^c) = P(E) - P(EF)$. Since $AB^c$, $AB$ and $A^cB$ are pairwise disjoint, axiom (iii′) yields
$$\begin{aligned}
P(A \cup B) &= P(AB^c) + P(AB) + P(A^cB) \\
&= P(A) - P(AB) + P(AB) + P(B) - P(AB) \\
&= P(A) + P(B) - P(AB).
\end{aligned}$$
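Therefore (1.3) holds. For example, the probability of drawing an ace or a spade in a deck of 52 cards is $4/13$. By consulting (1.3), we have
$$P(A \cup \spadesuit) = P(A) + P(\spadesuit) - P(A\spadesuit) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}.$$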
Another useful formula is
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC). \tag{1.5}$$
To verify this, recall that $P(F \cup G) = P(F) + P(G) - P(FG)$. Using this we obtain
$$\begin{aligned}
P(A \cup B \cup C) &= P(A \cup (B \cup C)) = P(A) + P(B \cup C) - P(A \cap (B \cup C)) \\
&= P(A) + P(B) + P(C) - P(BC) - P((A \cap B) \cup (A \cap C)) \\
&= P(A) + P(B) + P(C) - P(BC) - P(AB) - P(AC) + P(ABAC) \\
&= P(A) + P(B) + P(C) - P(AB) - P(AC) - P(BC) + P(ABC).
\end{aligned}$$
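Therefore equation (1.5) holds. For example, find the probability of drawing an ace (denoted by $A$) or a queen (denoted by $Q$) or a spade (denoted by $\spadesuit$) in a deck of 52 cards. By employing (1.5), we obtain
$$\begin{aligned}
P(A \cup Q \cup \spadesuit) &= P(A) + P(Q) + P(\spadesuit) - P(AQ) - P(A\spadesuit) - P(Q\spadesuit) + P(AQ\spadesuit) \\
&= \frac{4}{52} + \frac{4}{52} + \frac{13}{52} - 0 - \frac{1}{52} - \frac{1}{52} + 0 = \frac{19}{52}.
\end{aligned}$$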
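The card examples for (1.2), (1.3) and (1.5) can be checked by brute-force counting. The following is a minimal sketch in Python, assuming a uniform draw from a standard 52-card deck; the names `DECK`, `prob`, `ace`, `queen` and `spade` are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "clubs", "diamonds", "hearts"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely (rank, suit) outcomes


def prob(event):
    """P(event) for a single uniform draw; `event` is a predicate on a card."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))


def ace(card):
    return card[0] == "A"


def queen(card):
    return card[0] == "Q"


def spade(card):
    return card[1] == "spades"


# (1.2): an ace that is not a spade, by counting and by the formula.
lhs_12 = prob(lambda c: ace(c) and not spade(c))
rhs_12 = prob(ace) - prob(lambda c: ace(c) and spade(c))

# (1.3): an ace or a spade.
lhs_13 = prob(lambda c: ace(c) or spade(c))
rhs_13 = prob(ace) + prob(spade) - prob(lambda c: ace(c) and spade(c))

# (1.5): an ace or a queen or a spade, by inclusion-exclusion.
lhs_15 = prob(lambda c: ace(c) or queen(c) or spade(c))
rhs_15 = (
    prob(ace) + prob(queen) + prob(spade)
    - prob(lambda c: ace(c) and queen(c))
    - prob(lambda c: ace(c) and spade(c))
    - prob(lambda c: queen(c) and spade(c))
    + prob(lambda c: ace(c) and queen(c) and spade(c))
)

print(lhs_12, lhs_13, lhs_15)
```

Exact fractions are used so that the two sides of each identity can be compared with `==` rather than with a floating-point tolerance.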
The probability space $(\Omega, \mathcal{A}, P)$. A mathematically precise definition of a probability space involves the notion of a σ-algebra. Since these notes are an introduction to the basic concepts of probability, we will not use measure theoretic techniques. However, for completeness we will define a probability space. A σ-algebra $(\Omega, \mathcal{A})$ is a pair consisting of a universal set $\Omega$ and a collection $\mathcal{A}$ of subsets of $\Omega$ such that the empty set $\emptyset$ is in $\mathcal{A}$. Moreover, if $A \in \mathcal{A}$, then its complement $A^c \in \mathcal{A}$. In particular, $\Omega$ is in $\mathcal{A}$. Finally, if a countable collection of sets $\{A_j\}_1^\infty$ is in $\mathcal{A}$, then its union $\cup_1^\infty A_j$ is also in $\mathcal{A}$. By De Morgan's laws, if a countable collection of sets $\{A_j\}_1^\infty$ is in $\mathcal{A}$, then its intersection $\cap_1^\infty A_j$ is also in $\mathcal{A}$. A probability measure space $(\Omega, \mathcal{A}, P)$ is a σ-algebra $(\Omega, \mathcal{A})$ along with a measure $P$ mapping $\mathcal{A}$ into $[0, 1]$ such that
(i) If $A \in \mathcal{A}$, then $P(A) \ge 0$.
(ii) $P(\Omega) = 1$.
(iii) If the sets $\{A_j\}_1^\infty$ in $\mathcal{A}$ are pairwise disjoint, that is, $A_j \cap A_k = \emptyset$ when $j \ne k$, then
$$P\Big(\bigcup_{j=1}^{\infty} A_j\Big) = \sum_{j=1}^{\infty} P(A_j).$$
Since $P(\cdot \mid B)$ is a probability measure, all of our previous results concerning probability measures hold for $P(\cdot \mid B)$. In particular, the following results hold:
$$\begin{aligned}
P(A^c \mid B) &= 1 - P(A \mid B) \\
P(AC^c \mid B) &= P(A \mid B) - P(AC \mid B) \\
P(A \cup C \mid B) &= P(A \mid B) + P(C \mid B) - P(AC \mid B)
\end{aligned}$$
The following result is useful in applications:
$$P(A) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c). \tag{2.2}$$
To see this, recall that $P(FG) = P(F \mid G)P(G)$. Using this with the third axiom of probability, we obtain
$$\begin{aligned}
P(A) &= P(A\Omega) = P(A \cap (B \cup B^c)) = P(AB \cup AB^c) \\
&= P(AB) + P(AB^c) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c).
\end{aligned}$$
Therefore (2.2) holds. We say that $\{B_j\}_1^n$ is a partition of $\Omega$ if the following two conditions hold:
$$B_j \cap B_k = \emptyset \ \text{ for all } j \ne k \quad \text{and} \quad \bigcup_{j=1}^{n} B_j = \Omega. \tag{2.3}$$
If $\{B_j\}_1^n$ is a partition of $\Omega$, then we have the following useful result:
$$P(A) = \sum_{j=1}^{n} P(A \mid B_j)P(B_j). \tag{2.4}$$
To prove this simply observe that
$$\begin{aligned}
P(A) &= P(A\Omega) = P\big(A \cap (\cup_{j=1}^{n} B_j)\big) = P\big(\cup_{j=1}^{n} AB_j\big) \\
&= \sum_{j=1}^{n} P(AB_j) = \sum_{j=1}^{n} P(A \mid B_j)P(B_j).
\end{aligned}$$
Therefore (2.4) holds.
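As a small illustration of (2.4), consider again a 52-card deck partitioned by suit, with $A$ the event of drawing an ace. The sketch below assumes this setup; the helper `total_probability` and the dictionary names are illustrative only.

```python
from fractions import Fraction

SUITS = ("spades", "clubs", "diamonds", "hearts")

# The four suits form a partition B_1, ..., B_4 of the deck, each with P(B_j) = 13/52.
p_suit = {s: Fraction(13, 52) for s in SUITS}

# Given any suit, exactly one of its 13 cards is an ace, so P(A | B_j) = 1/13.
p_ace_given_suit = {s: Fraction(1, 13) for s in SUITS}


def total_probability(p_a_given_b, p_b):
    """Evaluate the right-hand side of (2.4) for a finite partition."""
    if sum(p_b.values()) != 1:
        raise ValueError("the P(B_j) must sum to 1 for a partition")
    return sum(p_a_given_b[b] * p_b[b] for b in p_b)


p_ace = total_probability(p_ace_given_suit, p_suit)
print(p_ace)
```

The result agrees with the direct count of 4 aces among 52 cards.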
The following example is taken from Rozanov [48]. Consider a hiker who is lost in the woods. Starting from a fixed position, the hiker chooses among the available paths with equal probability and does not backtrack; see Figure 1.1. (GeoGebra was used to draw Figure 1.1.) The hiker chooses a path at random and either ends up at home or gets lost. Let $A$ be the event that the hiker makes it home. The hiker must pass through one of the intermediate points $\{B_j\}_1^4$.
Figure 1.1: A lost hiker
Moreover, $\{B_j\}_1^4$ form a partition of $\Omega$. By consulting equation (2.4) with Figure 1.1, we see that
$$\begin{aligned}
P(A) &= \sum_{j=1}^{4} P(A \mid B_j)P(B_j) \\
&= P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + P(A \mid B_3)P(B_3) + P(A \mid B_4)P(B_4) \\
&= \frac{23}{48}.
\end{aligned}$$
Therefore the probability that the hiker makes it home is $P(A) = 23/48$. The probability the hiker gets lost in the woods is $P(A^c) = 25/48$. Finally, one might naively think that the probability of arriving home is the number of paths leaving $\{B_j\}_1^4$ and arriving at home, which is 5, divided by the total number of paths leaving $\{B_j\}_1^4$, which is 14. However, $5/14$ is not the correct answer.
1.3 The birthday problem
Assume that there are $n$ people in the room. The birthday problem is to find the probability that at least two of the $n$ people have the same birthday. Here we exclude leap years and assume that there are no twins in the room. Moreover, we assume that a birthday is equally likely to occur on any day of the year, and $n \le 365$. Let $S_n$ be the event that at least two people out of $n$ have the same birthday. We claim that
$$P(S_n) = 1 - \prod_{j=1}^{n-1}\Big(1 - \frac{j}{365}\Big). \tag{3.1}$$
To derive the formula for $P(S_n)$ in (3.1), let $D_n$ be the event that all $n$ people have different birthdays. Moreover, $D_n$ means that we have chosen the $n$-th person at random with
Then $P(S_n) = p(n)$. One can find an exponential approximation for $P(S_n)$. To this end, recall that the Taylor series for $e^z$ is given by $e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$. In particular, $e^{-a} \approx 1 - a$ when $|a|$ is "close" to zero. (In fact, $1 - a < e^{-a}$ for $0 < a \le 1$.) Using this approximation, we have
$$P(S_n) = 1 - \prod_{j=1}^{n-1}\Big(1 - \frac{j}{365}\Big) \approx 1 - \prod_{j=1}^{n-1} e^{-\frac{j}{365}} = 1 - e^{-\frac{1}{365}\sum_{j=1}^{n-1} j}.$$
Now recall that the sum of the first $n - 1$ integers is given by
$$\sum_{j=1}^{n-1} j = \frac{n(n-1)}{2}.$$
The derivation of this formula is left to the reader as a simple exercise. To prove it, one simply rearranges the integers into pairs with equal sums. For example,
$$\sum_{j=1}^{9} j = 9 + (1 + 8) + (2 + 7) + (3 + 6) + (4 + 5) = 5 \cdot 9 = 45.$$
Using $\sum_{j=1}^{n-1} j = \frac{n(n-1)}{2}$, we see that
$$P(S_n) = 1 - \prod_{j=1}^{n-1}\Big(1 - \frac{j}{365}\Big) \approx 1 - e^{-\frac{n(n-1)}{730}} \approx 1 - e^{-\frac{n^2}{730}}. \tag{3.2}$$
Let $g$ and $h$ be the functions defined by
$$\begin{aligned}
g(n) &= 1 - e^{-\frac{n(n-1)}{730}} \quad \text{for } 1 \le n \le 365 \\
h(n) &= 1 - e^{-\frac{n^2}{730}} \quad \text{for } 1 \le n \le 365.
\end{aligned}$$
The graph of $n$ vs $p(n) = P(S_n)$ is given in Figure 1.2 in blue, the graph of $g$ is in red, while the graph of $h$ is in green. The approximations of $P(S_n)$ by $g$ and $h$ are quite good. Since $P(S_{100}) \approx 1$, we only plotted this graph for $1 \le n \le 100$. In fact, using Matlab, we found that
$$\begin{aligned}
\|p - g\| &= \sqrt{\sum_{j=1}^{365} |p(j) - g(j)|^2} = 0.0500 \\
\|p - h\| &= \sqrt{\sum_{j=1}^{365} |p(j) - h(j)|^2} = 0.0505 \\
\|p - g\|_\infty &= \max\{|p(j) - g(j)| : 1 \le j \le 365\} = 0.0103 \\
\|p - h\|_\infty &= \max\{|p(j) - h(j)| : 1 \le j \le 365\} = 0.0124.
\end{aligned}$$
As expected, the approximation of $p(n) = P(S_n)$ by $g(n)$ is better than the approximation of $P(S_n)$ by $h(n)$. Finally, it is noted that even in this simple problem, one sees an exponential function of the form $e^{-x^2/\gamma}$ where $\gamma$ is a constant. These exponential forms will be seen throughout probability theory. For some further comments on the birthday problem see Wikipedia.
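The comparison between $p$, $g$ and $h$ can be reproduced without Matlab. The following Python sketch assumes the norms are taken over $1 \le n \le 365$ and that $\|\cdot\|$ denotes the Euclidean norm (square root of the sum of squares); the function and variable names are illustrative.

```python
import math

DAYS = 365


def birthday_exact(n_max):
    """Return [p(1), ..., p(n_max)] with p(n) = P(S_n) as in (3.1)."""
    values, x = [], 1.0
    for n in range(1, n_max + 1):
        if n > 1:
            x *= 1.0 - (n - 1) / DAYS  # running product x_n = P(D_n)
        values.append(1.0 - x)
    return values


def g(n):
    """First approximation in (3.2)."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * DAYS))


def h(n):
    """Second, cruder approximation in (3.2)."""
    return 1.0 - math.exp(-n * n / (2 * DAYS))


p = birthday_exact(DAYS)
err_g = [abs(p[n - 1] - g(n)) for n in range(1, DAYS + 1)]
err_h = [abs(p[n - 1] - h(n)) for n in range(1, DAYS + 1)]

norm2_g = math.sqrt(sum(e * e for e in err_g))
norm2_h = math.sqrt(sum(e * e for e in err_h))
sup_g = max(err_g)
sup_h = max(err_h)

print(f"2-norm:   |p-g| = {norm2_g:.4f}, |p-h| = {norm2_h:.4f}")
print(f"sup-norm: |p-g| = {sup_g:.4f}, |p-h| = {sup_h:.4f}")
```

The printed values can be compared with the figures quoted above; small differences in the last digit may arise from rounding.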
Figure 1.2: The birthday problem. Probability of at least 2 with the same birthday in a group of n, plotted for 0 ≤ n ≤ 100.
A lower bound for $P(S_n)$
Following some ideas of P.R. Halmos, observe that $1 - a < e^{-a}$ for $0 < a \le 1$. Recall that $x_n = P(D_n)$, and thus,
$$x_n = \prod_{j=1}^{n-1}\Big(1 - \frac{j}{365}\Big) < \prod_{j=1}^{n-1} e^{-\frac{j}{365}} = e^{-\frac{n(n-1)}{730}}.$$
Hence
$$P(S_n) = 1 - x_n > 1 - e^{-\frac{n(n-1)}{730}} = g(n) \quad (\text{for } 1 < n < 365).$$
Therefore the red graph corresponding to g is always below the blue graph corresponding to P (Sn) in Figure 1.2. (However, h is both above and below P (Sn).)
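The two claims about $g$ and $h$ can be checked numerically. The sketch below compares the complements $x_n$, $1 - g(n)$ and $1 - h(n)$ rather than the probabilities themselves, since for large $n$ all three probabilities round to 1 in floating point. The variable names are illustrative.

```python
import math

DAYS = 365

x = 1.0                  # x_1 = P(D_1) = 1
g_is_lower_bound = True
h_above_p = []           # values of n with h(n) > P(S_n)
h_below_p = []           # values of n with h(n) < P(S_n)

for n in range(2, DAYS + 1):
    x *= 1.0 - (n - 1) / DAYS                          # x_n = P(D_n)
    one_minus_g = math.exp(-n * (n - 1) / (2 * DAYS))  # 1 - g(n)
    one_minus_h = math.exp(-n * n / (2 * DAYS))        # 1 - h(n)
    # P(S_n) > g(n) is the same statement as x_n < 1 - g(n).
    g_is_lower_bound = g_is_lower_bound and x < one_minus_g
    # h(n) > P(S_n) is the same statement as 1 - h(n) < x_n.
    if one_minus_h < x:
        h_above_p.append(n)
    elif one_minus_h > x:
        h_below_p.append(n)

print(g_is_lower_bound, len(h_above_p), len(h_below_p))
```

Both lists come out non-empty, which is the numerical counterpart of the remark that $h$ lies on both sides of $P(S_n)$.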
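Recall that $x_n < e^{-\frac{n(n-1)}{730}}$ for $1 < n \le 365$. Let us find the minimum value of $n$ over the interval $[1, 365]$ such that
$$e^{-\frac{n(n-1)}{730}} < \frac{1}{2}.$$
Consider the equation
$$e^{-\frac{n(n-1)}{730}} = \frac{1}{2}, \quad \text{or equivalently} \quad 2 = e^{\frac{n(n-1)}{730}}.$$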
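Taking logarithms in $2 = e^{n(n-1)/730}$ gives the quadratic $n^2 - n - 730 \ln 2 = 0$. A minimal Python sketch, assuming only the bound $x_n < e^{-n(n-1)/730}$ and formula (3.1), finds the smallest admissible integer both by direct search and from the positive root; the variable names are illustrative.

```python
import math

DAYS = 365


def upper_bound(n):
    """Upper bound e^{-n(n-1)/730} for x_n = P(D_n)."""
    return math.exp(-n * (n - 1) / (2 * DAYS))


# Smallest integer n in [1, 365] with e^{-n(n-1)/730} < 1/2, by direct search.
n_min = next(n for n in range(1, DAYS + 1) if upper_bound(n) < 0.5)

# Positive root of n^2 - n - 730 ln 2 = 0.
n_root = (1.0 + math.sqrt(1.0 + 8.0 * DAYS * math.log(2.0))) / 2.0

# Exact probability P(S_n) from (3.1) at n = n_min, for comparison.
x = 1.0
for j in range(1, n_min):
    x *= 1.0 - j / DAYS
p_at_n_min = 1.0 - x

print(n_min, n_root, p_at_n_min)
```

Since $P(S_n) > g(n)$, any $n$ found this way also has $P(S_n) > 1/2$.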
Here $\lambda_1$ and $\lambda_2$ are scalars determined by the characteristic equation, while $\alpha$ and $\beta$ are constants determined by the initial or final conditions. To find $\lambda_1$ and $\lambda_2$, substitute $\gamma\lambda^n$ into the difference equation in (4.2), which yields
$$\gamma\lambda^{n} = \gamma\lambda^{n+1}p + \gamma\lambda^{n-1}q.$$
Dividing by $p\gamma\lambda^{n-1}$, we arrive at the following quadratic equation
$$\lambda^2 - \frac{\lambda}{p} + \frac{q}{p} = 0. \tag{4.5}$$
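Now set $r = q/p$. Then using $p + q = 1$, so that $1 + r = (p + q)/p = 1/p$, observe that
$$(\lambda - 1)(\lambda - r) = \lambda^2 - (1 + r)\lambda + r = \lambda^2 - \frac{\lambda}{p} + \frac{q}{p}.$$
So the roots of the quadratic equation in (4.5) are given by
$$\lambda_1 = 1 \quad \text{and} \quad \lambda_2 = r = \frac{q}{p}.$$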
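The two roots can be confirmed by substituting them back into the quadratic $\lambda^2 - \lambda/p + q/p$. The following sketch uses exact rational arithmetic for a few sample values of $p$; the function name `characteristic` and the chosen values of $p$ are illustrative.

```python
from fractions import Fraction


def characteristic(lam, p):
    """Left-hand side of the quadratic: lam^2 - lam/p + q/p with q = 1 - p."""
    q = 1 - p
    return lam**2 - lam / p + q / p


residuals = {}
for p in (Fraction(9, 20), Fraction(49, 100), Fraction(51, 100)):
    r = (1 - p) / p  # r = q/p
    residuals[p] = (characteristic(Fraction(1), p), characteristic(r, p))

# For p = 1/2 the two roots coincide, since r = q/p = 1.
r_fair = (1 - Fraction(1, 2)) / Fraction(1, 2)

print(residuals, r_fair)
```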
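Using the fact that $p + q = 1$, we see that there is a repeated root if and only if $p = q = 1/2$. In this case, $\lambda_1 = 1$ is the repeated root. Now assume that $p \ne 1/2$, and thus there is no repeated root of (4.5). In this case, the solution to the difference equation in (4.2) is of the form
$$y_n = \alpha + \beta r^n.$$
By employing the initial condition $y_0 = 0$ and the final condition $y_m = 1$, we arrive at the following matrix equation:
$$\begin{bmatrix} 1 & 1 \\ 1 & r^m \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
Since $r \ne 1$, the determinant of the previous $2 \times 2$ matrix is nonzero. By inverting this matrix, we arrive at
$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \frac{1}{r^m - 1} \begin{bmatrix} r^m & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{r^m - 1} \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$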
So the solution to the difference equation in (4.2) is given by
$$y_n = \frac{1 - r^n}{1 - r^m} \quad (\text{if } p \ne 1/2).$$
For $p = 1/2$ an application of L'Hôpital's rule yields
$$y_n = \frac{n}{m} \quad (\text{if } p = 1/2).$$
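For reference, the L'Hôpital step can be written out as follows. As $p \to 1/2$ we have $r = q/p \to 1$, so both $1 - r^n$ and $1 - r^m$ tend to zero; differentiating numerator and denominator with respect to $r$ gives:

```latex
\lim_{r \to 1} \frac{1 - r^{n}}{1 - r^{m}}
  = \lim_{r \to 1} \frac{-n\, r^{n-1}}{-m\, r^{m-1}}
  = \frac{n}{m}.
```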
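To obtain the case when $p = 1/2$ directly, recall that for repeated roots the solution is given by $y_n = \alpha\lambda_1^n + \beta n\lambda_1^n$. Since $\lambda_1 = 1$, in this case $y_n = \alpha + \beta n$. By employing the initial condition $y_0 = 0$ and the final condition $y_m = 1$, we arrive at the following matrix equation:
$$\begin{bmatrix} 1 & 0 \\ 1 & m \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
Thus $\alpha = 0$ and $\beta = 1/m$. In other words, $y_n = n/m$. Summing up our previous analysis yields the following solution to the gambler's ruin problem:
$$P(W_{n,m}) = \begin{cases} \dfrac{1 - r^n}{1 - r^m} & \text{if } p \ne 1/2 \\[2ex] \dfrac{n}{m} & \text{if } p = 1/2. \end{cases} \tag{4.7}$$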
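The closed form (4.7) can be checked against the difference equation it was derived from. The sketch below assumes the recurrence has the form $y_n = p\,y_{n+1} + q\,y_{n-1}$ with $y_0 = 0$ and $y_m = 1$, which is what the substitution of $\gamma\lambda^n$ above suggests; the function name `win_probability` and the test values are illustrative.

```python
from fractions import Fraction


def win_probability(n, m, p):
    """P(W_{n,m}) from (4.7): reach m dollars before 0, starting with n."""
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(n, m)
    r = (1 - p) / p  # r = q/p
    return (1 - r**n) / (1 - r**m)


def satisfies_recurrence(m, p):
    """Check y_0 = 0, y_m = 1 and y_n = p*y_{n+1} + q*y_{n-1} for 0 < n < m."""
    y = [win_probability(n, m, p) for n in range(m + 1)]
    q = 1 - p
    interior_ok = all(y[n] == p * y[n + 1] + q * y[n - 1] for n in range(1, m))
    return y[0] == 0 and y[m] == 1 and interior_ok


checks = {p: satisfies_recurrence(10, p)
          for p in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5))}
print(checks)
```

Rational arithmetic is used here so that the recurrence can be tested with exact equality.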
The solution to the gambler's ruin problem shows that the probability of doubling one's money ($m = 2n$) in a fair game ($p = 1/2$) is 50%. This result is not surprising. However, the solution to the gambler's ruin problem also shows that if $p < 1/2$, then it is better to bet all your $n$ dollars in one game rather than play one game at a time. On the other hand, if $p > 1/2$, then it is better to bet one dollar on each game rather than bet all your money in one game. To be more explicit, for the moment assume that $p > 1/2$. In this case, $r = q/p < 1$. In particular, $r^m$ converges to zero as $m$ tends to infinity. So according to (4.7), the probability of making an infinite amount of money ($m = \infty$) starting with $n$ dollars is given by
$$P(W_{n,\infty}) = 1 - r^n \quad (p > 1/2).$$
For example, if $p = 0.51$ and $n = 100$, then the probability that a gambler will make an infinite amount of money is $1 - (49/51)^{100} \approx 0.98$. This is why a casino will not let the players count cards or use a computer. In this case, $p > 1/2$ and a player can bankrupt the casino. This is also why casinos make a tremendous amount of money. The $p$ for a casino for many games is greater than or equal to 0.55. Now assume that $p < 1/2$. In this case, $r = q/p > 1$. So if $n$ and $m$ are large, then
$$P(W_{n,m}) = \frac{1 - r^n}{1 - r^m} \approx \frac{r^n}{r^m} = \Big(\frac{p}{q}\Big)^{m-n}.$$
If $p = 1/2$, then there is a 50% probability of achieving $m = 200$ dollars starting with $n = 100$ dollars. Now assume that $p = 0.49$ and the gambler starts out with $n = 100$ dollars and aims for $m = 200$ dollars. Then $P(W_{n,m}) = 0.018$. So in this case, it is better to bet the one hundred dollars in the first game, which yields a 49% chance of achieving 200 dollars, rather than playing one game at a time, which has only a 1.8% chance of doubling the original 100 dollars. In fact, there is only a 36.4% chance of achieving $m = 125$ dollars starting with 100 dollars playing one game at a time. The situation is even worse as $p$ becomes smaller. For example, if $p = 0.45$ and $n = 100$, then there is only a 37% chance of achieving $m = 105$ dollars, and a 0.66% chance of achieving $m = 125$ dollars.
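The numerical values quoted in this discussion follow directly from (4.7). A minimal floating-point sketch is given below; the function name `win_probability` and the variable names are illustrative.

```python
def win_probability(n, m, p):
    """P(W_{n,m}) from (4.7), evaluated in floating point."""
    if p == 0.5:
        return n / m
    r = (1.0 - p) / p  # r = q/p
    return (1.0 - r**n) / (1.0 - r**m)


fair_double = win_probability(100, 200, 0.5)      # fair game, 100 -> 200
unfair_double = win_probability(100, 200, 0.49)   # p = 0.49, 100 -> 200
unfair_125 = win_probability(100, 125, 0.49)      # p = 0.49, 100 -> 125
worse_105 = win_probability(100, 105, 0.45)       # p = 0.45, 100 -> 105
worse_125 = win_probability(100, 125, 0.45)       # p = 0.45, 100 -> 125
favorable_infinite = 1.0 - (0.49 / 0.51) ** 100   # p = 0.51, m = infinity

for name, value in [
    ("fair_double", fair_double),
    ("unfair_double", unfair_double),
    ("unfair_125", unfair_125),
    ("worse_105", worse_105),
    ("worse_125", worse_125),
    ("favorable_infinite", favorable_infinite),
]:
    print(f"{name}: {value:.4f}")
```

For a single all-in bet at $p = 0.49$ the chance of doubling is simply 0.49, which is the comparison made in the text.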