

An introduction to probability theory by D. Joyce of Clark University, focusing on the concept of a sample space: a set of outcomes, a collection of subsets called events, and a probability function. The notes outline the axioms of probability theory, the properties of probability functions that follow from them, and the principle of inclusion and exclusion. They also introduce random variables and their relation to events and probability functions.


Sample space. A sample space consists of an underlying set S, whose elements are called outcomes, a collection of subsets of S called events, and a function P on the set of events, called a probability function, satisfying the following axioms.
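1. 0 ≤ P(E) ≤ 1 for every event E.
2. P(S) = 1.
3. If the events E_1, E_2, … are pairwise disjoint, then
$$P\Big(\bigcup_i E_i\Big) = \sum_i P(E_i).$$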
From these axioms a number of other properties can be derived including these.
Writing E^c for the complement of E in S:
$$P(E^c) = 1 - P(E).$$
$$P(E \cup F) = P(E) + P(F) - P(E \cap F),$$
therefore
$$P(E \cup F) \le P(E) + P(F).$$
$$P(E) = P(E \cap F) + P(E \cap F^c).$$
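These identities are easy to check numerically on a small example. The sketch below assumes a fair six-sided die with uniform probabilities; the events `E` and `F` and the helper `prob` are illustrative choices, not part of the notes.

```python
from fractions import Fraction

# Assumed example: a fair die, so each outcome has probability 1/6.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of S) under the uniform assumption."""
    return Fraction(len(event & S), len(S))

E = frozenset({1, 2, 3})   # illustrative event: the roll is at most 3
F = frozenset({2, 4, 6})   # illustrative event: the roll is even

# S - E and S - F play the role of the complements of E and F.
complement_ok = prob(S - E) == 1 - prob(E)
union_ok = prob(E | F) == prob(E) + prob(F) - prob(E & F)
subadditive_ok = prob(E | F) <= prob(E) + prob(F)
split_ok = prob(E) == prob(E & F) + prob(E & (S - F))

print(complement_ok, union_ok, subadditive_ok, split_ok)
```

Exact fractions are used instead of floats so that the equalities hold without rounding error.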
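Principle of inclusion and exclusion:
$$P\Big(\bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n} P(E_i) - \sum_{i<j} P(E_i \cap E_j) + \sum_{i<j<k} P(E_i \cap E_j \cap E_k) - \cdots + (-1)^{n+1} P(E_1 \cap E_2 \cap \cdots \cap E_n).$$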
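As a quick check of the alternating sum, the following sketch compares it with a direct computation of the probability of a union. It assumes one roll of a fair die; the three events and the function name `union_by_inclusion_exclusion` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: one roll of a fair die with uniform probabilities.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of a subset of S under the uniform assumption."""
    return Fraction(len(event), len(S))

def union_by_inclusion_exclusion(events):
    """Alternating sum over all k-way intersections, for k = 1, ..., n."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1          # add odd k, subtract even k
        for group in combinations(events, k):
            total += sign * prob(frozenset.intersection(*group))
    return total

events = [frozenset({1, 2, 3}), frozenset({2, 4, 6}), frozenset({3, 6})]
direct = prob(frozenset.union(*events))
via_formula = union_by_inclusion_exclusion(events)
print(direct, via_formula)
```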
In words, to find the probability of a union of n events, first sum their individual probabilities, then subtract the sum of the probabilities of all their pairwise intersections, then add back the sum of the probabilities of all their 3-way intersections, then subtract the 4-way intersections, and continue adding and subtracting k-way intersections until you finally stop with the probability of the n-way intersection.

Random variables notation. In order to describe a sample space, we frequently introduce a symbol X called a random variable for the sample space. With this notation, we can replace the probability of an event, P(E), by the notation P(X ∈ E), which, by itself, doesn’t do much. But many events are built from the set operations of complement, union, and intersection, and with the random variable notation, we can replace those by the logical operations ‘not’, ‘or’, and ‘and’. For instance, the probability P(E ∩ F^c), where F^c denotes the complement of F, can be written as P(X ∈ E but X ∉ F).

Also, probabilities of finite events can be written in terms of equality. For instance, the probability of a singleton, P({a}), can be written as P(X=a), and that for a doubleton, P({a, b}) = P(X=a or X=b).

The random variable notation is especially useful when the same sample space is used more than once. For instance, if you have a fair die, the sample space is S = {1, 2, 3, 4, 5, 6}, where the probability of any singleton is 1/6. If you have two fair dice, you can use two random variables, X and Y, to refer to the two dice, but each has the same sample space. (Soon, we’ll look at the joint distribution of (X, Y), which has a sample space defined on S × S.)

Random variables and cumulative distribution functions. A sample space can have any set as its underlying set, but usually the outcomes are related to numbers. Often the sample space is the set of real numbers R, and sometimes a power of the real numbers R^n.

The most common sample space has only two elements, that is, there are only two outcomes. For instance, flipping a coin has two outcomes, Heads and Tails; many experiments have two outcomes, Success and Failure; and polls often have two outcomes, For and Against. Even though these outcomes aren’t numbers, it’s useful to replace them by numbers, namely 0 and 1, so that Heads, Success, and For are identified with 1, and Tails, Failure, and Against are identified with 0. Then the sample space can have R as its underlying set.

When the sample space does have R as its underlying set, the random variable X is called a real random variable. With it, the probability of an interval like [a, b], which is P([a, b]), can then be described as P(a ≤ X ≤ b). Unions of intervals can also be described, for instance P((−∞, 3) ∪ [4, 5]) can be written as P(X < 3 or 4 ≤ X ≤ 5).

When the sample space is R, the probability function P is determined by a cumulative distribution function (c.d.f.) F as follows. The function F : R → R is defined by
$$F(x) = P(X \le x) = P((-\infty, x]).$$
Then, from F, the probability of a half-open interval can be found as
$$P((a, b]) = F(b) - F(a).$$
Also, the probability of a singleton {b} can be found as a limit as a approaches b from the left:
$$P(\{b\}) = \lim_{a \to b^-} \big(F(b) - F(a)\big).$$
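To make these c.d.f. formulas concrete, here is a minimal sketch for one assumed distribution, a fair die roll viewed as a real random variable. The helper names `prob_half_open` and `prob_singleton` and the step size `eps` are hypothetical; the one-step left limit works only because this particular F is a step function with jumps at the integers 1 through 6.

```python
from fractions import Fraction
import math

def F(x):
    """c.d.f. of a fair die roll (assumed example): F(x) = P(X <= x)."""
    k = min(max(math.floor(x), 0), 6)   # number of faces that are <= x
    return Fraction(k, 6)

def prob_half_open(a, b):
    """P((a, b]) = F(b) - F(a), for a <= b."""
    return F(b) - F(a)

def prob_singleton(b, eps=Fraction(1, 1000)):
    """P({b}) as the jump of F at b.

    For this step function, F is constant just to the left of b, so
    F(b - eps) already equals the left limit when eps is smaller than
    the distance from b to the previous jump.
    """
    return F(b) - F(b - eps)

print(prob_half_open(2, 5))             # P(2 < X <= 5)
print(prob_singleton(3))                # P(X = 3)
print(prob_singleton(Fraction(7, 2)))   # P(X = 3.5)
```

For a c.d.f. that is not piecewise constant, the left limit would have to be taken analytically or approximated with a shrinking `eps`.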
From these, probabilities of unions of intervals can be computed. Sometimes, the c.d.f. is simply called the distribution, and the sample space is identified with this distribution.

Discrete distributions. Many distributions are determined entirely by the probabilities of their outcomes, that is, the probability of an event E is
$$P(E) = \sum_{x \in E} P(X = x) = \sum_{x \in E} P(\{x\}).$$
The sum here, of course, is either a finite or countably infinite sum. Such a distribution is called a discrete distribution, and when there are only finitely many outcomes x with nonzero probabilities, it is called a finite distribution.

A discrete distribution is usually described in terms of a probability mass function (p.m.f.) f defined by
$$f(x) = P(X = x) = P(\{x\}).$$
This p.m.f. is enough to determine this distribution since, by the definition of a discrete distribution, the probability of an event E is
$$P(E) = \sum_{x \in E} f(x).$$
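As an illustration of computing event probabilities from a p.m.f., the sketch below assumes X is the sum of two fair dice, a finite distribution that is not uniform. The dictionary `pmf` and the helper `prob` are hypothetical names introduced for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed example: X is the sum of two fair dice (36 equally likely pairs).
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {x: Fraction(c, 36) for x, c in counts.items()}

def f(x):
    """p.m.f.: f(x) = P(X = x), which is zero outside the values 2..12."""
    return pmf.get(x, Fraction(0))

def prob(event):
    """P(E) as the sum of f(x) over the outcomes x in E."""
    return sum((f(x) for x in event), Fraction(0))

print(prob({7, 11}))         # P(X = 7 or X = 11)
print(prob(range(2, 13)))    # the whole sample space
```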
In many applications, a finite distribution is uniform, that is, the probabilities of its outcomes are all the same, 1/n, where n is the number of outcomes with nonzero probabilities. When that is the case, the field of combinatorics is useful in finding probabilities of events. Combinatorics includes various principles of counting such as the multiplication principle, permutations, and combinations.
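For a uniform distribution, a probability reduces to a ratio of counts. The following sketch assumes five flips of a fair coin, coded as Heads = 1 and Tails = 0, and asks for exactly two Heads; the numbers 5 and 2 are illustrative. It computes the answer by counting and then confirms it by brute-force enumeration.

```python
from fractions import Fraction
from itertools import product
from math import comb

n_flips = 5   # illustrative choice
n_heads = 2   # illustrative choice

# Multiplication principle: 2 * 2 * ... * 2 equally likely sequences.
total_outcomes = 2 ** n_flips

# Combinations: choose which of the flips come up Heads.
favorable = comb(n_flips, n_heads)
by_counting = Fraction(favorable, total_outcomes)

# Brute-force check: enumerate every sequence with Heads = 1, Tails = 0.
by_enumeration = Fraction(
    sum(1 for seq in product((0, 1), repeat=n_flips) if sum(seq) == n_heads),
    total_outcomes,
)

print(by_counting, by_enumeration)
```

Enumeration is feasible only for small sample spaces; the counting formula is what scales.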