






Discrete probability models, focusing on permutations and combinations. It covers the calculation of the probability of guessing a 4-digit pin, the concept of permutations and combinations, and their applications in counting the number of ways to select objects from a list. The document also touches upon the importance of these concepts in understanding probability distributions and statistical inference.
Typology: Study notes







The link between probability and statistics arises because, in order to see, for example, how strong the evidence is in some data, we need to consider the probabilities concerned with how we came to observe the data. In this chapter, we describe some standard probability models which are often used with data from various sources such as market research. However, before we describe these in detail, we need to establish some ground rules for counting.
Imagine that your cash point card has just been stolen. What is the probability of the thief guessing your 4 digit PIN in one go? To answer this question, we need to know how many different 4 digit PINs there are. We are also assuming that the thief chooses in such a way that all possibilities are equally likely. With this assumption the probability of a correct guess (in one go) is
$$P(\text{Guess correctly}) = \frac{\text{number of correct PINs}}{\text{number of possible 4 digit PINs}} = \frac{1}{\text{number of possible 4 digit PINs}}.$$
There is, of course, only one correct PIN. The number of possible 4 digit PINs is calculated as follows. There are 10 choices for the first digit, another 10 choices for the second digit, and so on. Therefore the number of possible choices is
$$10 \times 10 \times 10 \times 10 = 10{,}000.$$
So the probability of a correct guess is
$$P(\text{Guess correctly}) = \frac{1}{10 \times 10 \times 10 \times 10} = \frac{1}{10{,}000} = 0.0001.$$
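As a quick check of this arithmetic, the count and the probability can be reproduced in a few lines of Python. This is only an illustrative sketch: the names `n_digits`, `choices_per_digit`, `n_pins` and `p_guess` are ours, not part of the notes, and the calculation relies on the assumption stated above that every PIN is equally likely to be guessed.

```python
from fractions import Fraction

# Assumption (from the text): every 4 digit PIN is equally likely to be guessed.
n_digits = 4
choices_per_digit = 10  # the digits 0 to 9

n_pins = choices_per_digit ** n_digits  # 10 x 10 x 10 x 10
p_guess = Fraction(1, n_pins)           # one correct PIN among n_pins

print(n_pins)          # 10000
print(float(p_guess))  # 0.0001
```

Using `Fraction` keeps the probability exact until it is converted to a decimal for display.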
The probability that the card thief guesses the PIN correctly changes if the thief knows that your PIN uses 4 different digits. Now the number of possible PINs is smaller. To find this number we need to work out how many ways there are to arrange 4 digits out of a list of 10.
In more general terms, we need to know how many different ways there are of arranging r objects from a list of n objects. The best way of thinking about this is to consider the choice of each item as a different experiment. The first experiment has n possible outcomes. The second experiment only has n − 1 possible outcomes, as one object has already been selected. The third experiment has n − 2 outcomes and so on until the rth experiment, which has n − r + 1 possible outcomes. Therefore the number of possible selections is
$$n \times (n-1) \times (n-2) \times \cdots \times (n-r+1) = \frac{n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1}{(n-r) \times (n-r-1) \times \cdots \times 3 \times 2 \times 1} = \frac{n!}{(n-r)!}.$$
Here
$$n! = n(n-1)(n-2)(n-3) \times \cdots \times 3 \times 2 \times 1$$
and is called n factorial; it can be found on many calculators. The formula
$$\frac{n!}{(n-r)!}$$
is a commonly encountered expression in counting calculations (combinatorics) and has its own notation. The number of ordered ways of selecting r objects from n is denoted nPr, where
$${}^{n}P_{r} = \frac{n!}{(n-r)!}.$$
We refer to nPr as the number of permutations of r out of n objects.
If we are interested solely in the number of ways of arranging n objects, then this is clearly just
$${}^{n}P_{n} = n!.$$
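The product form and the factorial form of nPr can be compared directly. The sketch below uses a hypothetical helper, `n_permutations`, written for illustration only; it is checked against Python's built-in `math.perm` (available from Python 3.8).

```python
import math

def n_permutations(n: int, r: int) -> int:
    """Number of ordered selections of r objects from n, i.e. n!/(n-r)!."""
    if not 0 <= r <= n:
        return 0
    count = 1
    for k in range(n, n - r, -1):  # n, n-1, ..., n-r+1
        count *= k
    return count

# The product form agrees with the factorial form and with math.perm.
assert n_permutations(10, 4) == math.factorial(10) // math.factorial(6)
assert n_permutations(10, 4) == math.perm(10, 4)

print(n_permutations(10, 4))  # 5040
print(n_permutations(5, 5))   # 120, which is 5!
```

Multiplying the r factors directly avoids computing two large factorials only to cancel most of their terms.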
Returning to the example in which the thief is trying to guess your 4-digit PIN, if the thief knows that the PIN contains no repeated digits then the number of possible PINs is
$${}^{10}P_{4} = \frac{10!}{6!} = 10 \times 9 \times 8 \times 7 = 5040$$
so, assuming that each is equally likely to be guessed, the probability of a correct guess is
$$P(\text{Guess correctly}) = \frac{1}{5040} \approx 0.0002.$$
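For a case this small, the count of PINs with no repeated digits can also be confirmed by listing them all. The following is a minimal sketch using `itertools.permutations` from the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

digits = "0123456789"
# Every ordered choice of 4 different digits is one candidate PIN.
distinct_pins = ["".join(p) for p in permutations(digits, 4)]

n_distinct = len(distinct_pins)
p_guess_distinct = Fraction(1, n_distinct)

print(n_distinct)        # 5040
print(p_guess_distinct)  # 1/5040
```

Knowing that the digits are all different roughly doubles the thief's chance of a correct guess, from 1 in 10,000 to 1 in 5040.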
This calculation is very similar to that of permutations except that the ordering of objects no longer matters. For example, if we select two objects from three objects A, B and C, there are 3P2 = 6 ways of doing this:
(A, B)  (A, C)  (B, A)  (B, C)  (C, A)  (C, B).
However, if we are not interested in the ordering, just in whether A, B or C are chosen, then (A, B) is the same as (B, A) etc., and so the number of selections is just 3:
(A, B)  (A, C)  (B, C).
Ignoring the ordering reduces the number of permutations by a factor of 2P2 = 2. In general, the number of combinations of r objects from n objects is
$$\frac{\text{number of ordered samples of size } r}{\text{number of orderings of samples of size } r} = \frac{{}^{n}P_{r}}{{}^{r}P_{r}} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{r!(n-r)!}.$$
Again, this is a very commonly found expression in combinatorics, so it has its own notation:
$${}^{n}C_{r} = \frac{n!}{r!(n-r)!}.$$
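The relationship between ordered and unordered selections can be seen by enumerating both for the objects A, B and C. In this sketch `n_combinations` is a hypothetical helper written for illustration; `math.perm`, `math.comb` and the `itertools` functions are standard-library calls.

```python
import math
from itertools import combinations, permutations

def n_combinations(n: int, r: int) -> int:
    """Unordered selections of r objects from n: (ordered selections) / r!."""
    return math.perm(n, r) // math.factorial(r)

objects = ["A", "B", "C"]
ordered = list(permutations(objects, 2))    # (A, B) and (B, A) both counted
unordered = list(combinations(objects, 2))  # each pair counted once

print(len(ordered), len(unordered))  # 6 3
print(unordered)                     # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(n_combinations(3, 2) == math.comb(3, 2))  # True
```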
There are other commonly used notations for this quantity: $C^{n}_{r}$ and $\binom{n}{r}$. These numbers are known as the binomial coefficients.
Now we can see that the number of ways to select 4 retail outlets out of 20 is
$${}^{20}C_{4} = \frac{20!}{4! \, 16!} = 4845.$$
An easy way to calculate binomial coefficients (at least small ones) is to use the fact that
$${}^{n}C_{r} = \frac{n}{r} \times \frac{n-1}{r-1} \times \frac{n-2}{r-2} \times \cdots \times \frac{n-r+1}{1}.$$
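For example,
$${}^{20}C_{4} = \frac{20}{4} \times \frac{19}{3} \times \frac{18}{2} \times \frac{17}{1} = 4845.$$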
To see how combinations can be used to calculate probabilities, we will look at the UK National Lottery. In this lottery, there are 49 numbered balls, and six of these are selected at random. A seventh ball is also selected, but this is only relevant if you get exactly five numbers correct. The player selects six numbers before the draw is made, and after the draw, counts how many numbers are in common with those drawn. Players win a prize if they select at least three of the balls drawn. The order in which the balls are drawn is irrelevant.
To begin with, let’s calculate the probability that exactly 3 of the 6 numbers we select are drawn. First we need to count the number of possible draws (the number of different sets of 6 numbers), and then how many of those draws correspond to getting exactly three numbers correct. The number of possible draws is the number of ways of choosing 6 objects from 49. This is
$${}^{49}C_{6} = 13{,}983{,}816.$$
The number of drawings corresponding to getting exactly three right is calculated as follows. Of the 49 balls from which the draw is made, 6 correspond to your selected numbers, and 43 correspond to other numbers. We want to know how many ways there are of choosing 3 of your selected numbers and 3 other numbers. This is the number of ways of choosing 3 from 6, multiplied by the number of ways of choosing 3 from 43. That is, there are
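$${}^{6}C_{3} \times {}^{43}C_{3} = 20 \times 12{,}341 = 246{,}820$$
ways of choosing exactly 3 of your selected numbers. So, the probability of matching exactly 3 numbers is
$$\frac{{}^{6}C_{3} \times {}^{43}C_{3}}{{}^{49}C_{6}} = \frac{246{,}820}{13{,}983{,}816} \approx 0.0177.$$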
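The same counting argument works for any number of matches, so it is convenient to write it once as a function. This is a sketch rather than a definitive implementation: `ways_to_match`, `BALLS` and `DRAWN` are names chosen here for illustration, and the values 49 and 6 are those of the lottery described above.

```python
import math
from fractions import Fraction

BALLS, DRAWN = 49, 6  # values taken from the lottery described in the text

def ways_to_match(k: int) -> int:
    """Number of draws containing exactly k of the player's 6 numbers."""
    return math.comb(DRAWN, k) * math.comb(BALLS - DRAWN, DRAWN - k)

total_draws = math.comb(BALLS, DRAWN)
p_match_3 = Fraction(ways_to_match(3), total_draws)

print(total_draws)                 # 13983816
print(ways_to_match(3))            # 246820
print(round(float(p_match_3), 4))  # 0.0177
```

On these figures a player matches exactly three numbers a little less than once in every 56 draws.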
Similarly, we can calculate the probability of getting other prize-winning outcomes:
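$$P(\text{match exactly 6 correct numbers}) = \frac{{}^{6}C_{6} \times {}^{43}C_{0}}{{}^{49}C_{6}} = \frac{1}{13{,}983{,}816} \approx 7.15 \times 10^{-8}$$
$$P(\text{match exactly 5 correct numbers plus bonus ball}) = \frac{{}^{6}C_{5} \times {}^{43}C_{1}}{{}^{49}C_{6}} \times \frac{1}{43} = \frac{6}{13{,}983{,}816} \approx 4.29 \times 10^{-7}$$
$$P(\text{match exactly 5 correct numbers}) = \frac{{}^{6}C_{5} \times {}^{43}C_{1}}{{}^{49}C_{6}} \times \frac{42}{43} = \frac{252}{13{,}983{,}816} \approx 1.80 \times 10^{-5}$$
$$P(\text{match exactly 4 correct numbers}) = \frac{{}^{6}C_{4} \times {}^{43}C_{2}}{{}^{49}C_{6}} = \frac{13{,}545}{13{,}983{,}816} \approx 9.69 \times 10^{-4}$$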
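The prize-winning probabilities can be computed together. The sketch below assumes that the bonus ball is drawn from the 43 balls left after the six main balls, so that a player with five matches has exactly one remaining number that could be the bonus ball; the function name `p_exact` is ours.

```python
import math
from fractions import Fraction

total = math.comb(49, 6)

def p_exact(k: int) -> Fraction:
    """P(exactly k of the player's 6 numbers are among the 6 main balls)."""
    return Fraction(math.comb(6, k) * math.comb(43, 6 - k), total)

# Assumption: with 5 main matches, the bonus ball is one of the 43 undrawn
# balls, and exactly one of those is the player's remaining number.
p_5_bonus = p_exact(5) * Fraction(1, 43)
p_5_only = p_exact(5) * Fraction(42, 43)

for label, p in [("6", p_exact(6)), ("5 + bonus", p_5_bonus),
                 ("5", p_5_only), ("4", p_exact(4)), ("3", p_exact(3))]:
    print(f"{label:>9}: {float(p):.3e}")

# Sanity check: the probabilities of matching 0, 1, ..., 6 numbers sum to 1.
assert sum(p_exact(k) for k in range(7)) == 1
```

The final check confirms that the seven possible match counts account for every draw.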
These outcomes are not very likely and so the prizes are chosen to reflect how likely you are to win. For example, in a recent lottery draw, the prizes were
Number of balls matched    Prize
6                          £2.4M
5 plus bonus               £240K
5                          £3K
4                          £100
3                          £10
< 3                        £0
This information allows us to calculate a fair price for such a bet. The expected monetary value of
x      P(X = x)
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6
sum    1
Just as with sample data, it is useful to have some summary information about probability distributions. For example, what is the average value of the random variable? How much variation is there in this distribution?
The mean of a quantitative random variable is a weighted sum of its possible values, where each weight is the probability of the value occurring. This is known as the expected value of the random variable or the population mean of the random variable and is usually written as E(X) or μ. Therefore, for a discrete random variable,
$$E(X) = \mu = \sum x \, P(X = x).$$
Previously we have seen a similar calculation when determining the expected monetary value
$$EMV = \sum P(\text{Event}) \times \text{Monetary value of Event}.$$
The expected value is the average value which we would get in an infinitely long sequence of identical experiments.
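As an illustration of the EMV formula, it can be applied to the lottery prizes quoted earlier in these notes. The sketch below is ours, not part of the original calculation: the winning-draw counts come from the combination formula, and the split between "5 plus bonus" and "5" assumes the bonus ball is drawn from the 43 balls not selected in the main draw.

```python
from fractions import Fraction

TOTAL = 13_983_816  # number of possible draws, 49C6

# tier: (prize in pounds, equivalent number of winning draws out of TOTAL).
# Prizes are those quoted in the text; counts are derived, see the lead-in.
tiers = {
    "6":         (2_400_000, 1),
    "5 + bonus": (240_000, 6),
    "5":         (3_000, 252),
    "4":         (100, 13_545),
    "3":         (10, 246_820),
}

# EMV = sum over events of P(Event) x monetary value of Event.
emv = sum(Fraction(count, TOTAL) * prize for prize, count in tiers.values())
print(round(float(emv), 2))  # 0.6
```

On these particular prizes the expected winnings per ticket are about 60p, which is the price at which the bet would be fair.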
For example, suppose that the population of interest is this class and that it contains n students. Suppose that we are interested in the number of times that students have bought a particular product (e.g. a cinema ticket) in the last month. Clearly the population mean is just the average of this variable in the class:
$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$
where xi is the number of times student i has bought the product. We can also write this as
$$\mu = \frac{1}{n} \sum_{j} j f_j = \sum_{j} j \, \frac{f_j}{n}$$
where fj is the frequency of x = j in the population and fj /n is the relative frequency. If we choose a student at random from the class then the probability that we choose a student with x = j is
$$P(X = j) = \frac{f_j}{n},$$
the relative frequency, and so
$$\mu = \sum_{j} j \, P(X = j).$$
It is also clear that this is the average which we would get if we kept on sampling, with replacement, for a very long time.
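The equivalence between the ordinary average and the probability-weighted sum can be checked numerically. The data in this sketch are hypothetical, invented purely for illustration: a class of 8 students and the number of purchases each made last month.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical data: purchases last month for a class of n = 8 students.
x = [0, 2, 1, 0, 3, 1, 1, 0]
n = len(x)

mean_direct = Fraction(sum(x), n)  # (1/n) * sum of x_i

freq = Counter(x)                                     # f_j for each value j
prob = {j: Fraction(f, n) for j, f in freq.items()}   # P(X = j) = f_j / n
mean_weighted = sum(j * p for j, p in prob.items())   # sum of j * P(X = j)

print(mean_direct, mean_weighted)  # 1 1
```

Both routes give the same population mean because grouping equal values together only reorders the terms of the sum.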
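For the die-rolling experiment, the average number of spots we would get if we repeated the experiment an “infinite” number of times is
$$E(X) = \sum x \, P(X = x) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + \cdots + 6 \times \frac{1}{6} = \frac{21}{6} = 3.5.$$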
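The expected number of spots can be computed from the probability distribution and compared with the average of a long run of simulated rolls. This is a minimal sketch: `expected_value` is a hypothetical helper, and the seed and the 100,000 rolls are arbitrary choices made so that the example is repeatable.

```python
import random
from fractions import Fraction

# Probability distribution of a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(dist):
    """E(X) = sum of x * P(X = x) over the possible values x."""
    return sum(x * p for x, p in dist.items())

print(expected_value(die))         # 7/2
print(float(expected_value(die)))  # 3.5

# A long (but finite) run of simulated rolls gives an average close to 3.5.
random.seed(0)  # arbitrary seed, for repeatability only
rolls = [random.randint(1, 6) for _ in range(100_000)]
simulated_mean = sum(rolls) / len(rolls)
print(simulated_mean)
```

The simulated average will not equal 3.5 exactly; it only approaches it as the number of rolls grows.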
This concept can be generalised to calculate the expected value of any function of X. For instance, in the lottery example discussed previously, the prize was determined by the number of matches. In the die-rolling experiment, we could consider a prize worth the square of the number showing: £ 1 for a 1, £ 4 for a 2, £ 9 for a 3, and so on. In this case the expected prize money is
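$$E(X^2) = \sum x^2 \, P(X = x) = 1 \times \frac{1}{6} + 4 \times \frac{1}{6} + 9 \times \frac{1}{6} + 16 \times \frac{1}{6} + 25 \times \frac{1}{6} + 36 \times \frac{1}{6} = \frac{91}{6} \approx 15.17,$$
that is, about £15.17.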
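The expected value of a function of X follows the same pattern as E(X), with each value replaced by the function applied to it. A short sketch for the squared-prize example, using a hypothetical helper `expected_value_of`:

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value_of(g, dist):
    """E[g(X)] = sum of g(x) * P(X = x) over the possible values x."""
    return sum(g(x) * p for x, p in dist.items())

expected_prize = expected_value_of(lambda x: x ** 2, die)
print(expected_prize)                   # 91/6
print(round(float(expected_prize), 2))  # 15.17
```

Note that this is not the same as squaring the expected number of spots, which would give 3.5 × 3.5 = 12.25.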
In addition to having the population mean as a measure of location, it is also useful to know about the spread of the random variable about this value. The variance of a random variable is denoted Var(X) or sometimes σ^2 and is determined by
$$\text{Var}(X) = \sigma^2 = E\left[(X - \mu)^2\right].$$
It is simply the average squared deviation from the mean. Note that this is the same sort of calculation as with sample variances. The larger the value for the variance, the larger the spread.
Referring back to the die-rolling experiment, if X is the number of spots, we can calculate the variance (using μ = 3.5):
x      P(X = x)    (x − μ)^2    (x − μ)^2 P(X = x)
1      1/6         6.25         1.0417
2      1/6         2.25         0.3750
3      1/6         0.25         0.0417
4      1/6         0.25         0.0417
5      1/6         2.25         0.3750
6      1/6         6.25         1.0417
sum    1                        2.9167
Hence Var(X) = 2.9167.
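The variance calculation for the die can be reproduced exactly with fractions. This is an illustrative sketch; the standard deviation, which is the square root of the variance, is included because it is asked for in the exercises.

```python
import math
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in die.items())               # 7/2
var = sum((x - mu) ** 2 * p for x, p in die.items())  # E[(X - mu)^2]
sd = math.sqrt(var)                                   # standard deviation

print(var)                   # 35/12
print(round(float(var), 4))  # 2.9167
print(round(sd, 4))          # 1.7078
```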
7.4 Exercises 7
(a) Consider the sequence of 4 items. In how many different sequences can we get
    i. no unsatisfactory items?
    ii. exactly 1 unsatisfactory item?
    iii. exactly 2 unsatisfactory items?
    iv. exactly 3 unsatisfactory items?
    v. four unsatisfactory items?
(b) Find the probability of a particular sequence containing
    i. no unsatisfactory items.
    ii. exactly 1 unsatisfactory item.
    iii. exactly 2 unsatisfactory items.
    iv. exactly 3 unsatisfactory items.
    v. four unsatisfactory items.
(c) Hence find the probability that we get
    i. no unsatisfactory items.
    ii. exactly 1 unsatisfactory item.
    iii. exactly 2 unsatisfactory items.
    iv. exactly 3 unsatisfactory items.
    v. four unsatisfactory items.
(d) Find the mean number of unsatisfactory items.
(e) Find the variance and standard deviation of the number of unsatisfactory items.