Probability Models and Calculations: Discrete Distributions and Permutations, Study notes of Business Administration

These study notes introduce discrete probability models, focusing on permutations and combinations. They cover the probability of guessing a 4-digit PIN, the concepts of permutations and combinations, and their application to counting the number of ways to select objects from a list. They also touch on the importance of these concepts in understanding probability distributions and statistical inference.

Typology: Study notes

2012/2013

Uploaded on 07/26/2013

dilip

Chapter 7

Discrete Probability Models

7.1 Introduction

The link between probability and statistics arises because, in order to see, for example, how strong the evidence is in some data, we need to consider the probabilities concerned with how we came to observe the data. In this chapter, we describe some standard probability models which are often used with data from various sources such as market research. However, before we describe these in detail, we need to establish some ground rules for counting.

7.2 Permutations and Combinations

7.2.1 Numbers of sequences

Imagine that your cash point card has just been stolen. What is the probability of the thief guessing your 4 digit PIN in one go? To answer this question, we need to know how many different 4 digit PINs there are. We are also assuming that the thief chooses in such a way that all possibilities are equally likely. With this assumption the probability of a correct guess (in one go) is

$$P(\text{Guess correctly}) = \frac{\text{number of correct PINs}}{\text{number of possible 4-digit PINs}} = \frac{1}{\text{number of possible 4-digit PINs}}.$$

There is, of course, only one correct PIN. The number of possible 4 digit PINs is calculated as follows. There are 10 choices for the first digit, another 10 choices for the second digit, and so on. Therefore the number of possible choices is

$$10 \times 10 \times 10 \times 10 = 10{,}000.$$

So the probability of a correct guess is

$$P(\text{Guess correctly}) = \frac{1}{10 \times 10 \times 10 \times 10} = \frac{1}{10{,}000} = 0.0001.$$
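
The counting argument can be checked by brute force. The following Python sketch is not part of the original notes; it enumerates every 4-digit string and assumes, as the text does, that each one is equally likely to be guessed. The names `DIGITS`, `all_pins` and `p_guess` are illustrative only.

```python
from itertools import product

DIGITS = "0123456789"

# Enumerate every 4-digit PIN, allowing repeated digits.
all_pins = ["".join(p) for p in product(DIGITS, repeat=4)]

number_of_pins = len(all_pins)   # 10 * 10 * 10 * 10
p_guess = 1 / number_of_pins     # one correct PIN, all equally likely

print(number_of_pins)  # 10000
print(p_guess)         # 0.0001
```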

7.2.2 Permutations

The probability of the card thief guessing your PIN correctly changes if the thief knows that your PIN uses 4 different digits. Now the number of possible PINs is smaller. To find this number we need to work out how many ways there are to arrange 4 digits out of a list of 10.

In more general terms, we need to know how many different ways there are of arranging r objects from a list of n objects. The best way of thinking about this is to consider the choice of each item as a different experiment. The first experiment has n possible outcomes. The second experiment only has n − 1 possible outcomes, as one object has already been selected. The third experiment has n − 2 outcomes and so on until the rth experiment, which has n − r + 1 possible outcomes. Therefore the number of possible selections is

$$n \times (n-1) \times (n-2) \times \cdots \times (n-r+1) = \frac{n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1}{(n-r) \times (n-r-1) \times \cdots \times 3 \times 2 \times 1} = \frac{n!}{(n-r)!}.$$

Here

$$n! = n(n-1)(n-2)(n-3) \times \cdots \times 3 \times 2 \times 1$$

and is called n factorial; it can be found on many calculators. The formula

$$\frac{n!}{(n-r)!}$$

is a commonly encountered expression in counting calculations (combinatorics) and has its own notation. The number of ordered ways of selecting r objects from n is denoted ${}^nP_r$, where

$${}^nP_r = \frac{n!}{(n-r)!}.$$

We refer to ${}^nP_r$ as the number of permutations of r out of n objects.

If we are interested solely in the number of ways of arranging n objects, then this is clearly just

$${}^nP_n = n!$$

Returning to the example in which the thief is trying to guess your 4-digit PIN, if the thief knows that the PIN contains no repeated digits then the number of possible PINs is

$${}^{10}P_4 = \frac{10!}{6!} = 10 \times 9 \times 8 \times 7 = 5040$$

so, assuming that each is equally likely to be guessed, the probability of a correct guess is

$$P(\text{Guess correctly}) = \frac{1}{5040} \approx 0.0002.$$
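
The permutation count can be sketched in Python as a check on the formula. This is an illustration rather than part of the original notes: the helper name `n_p_r` is hypothetical, and `math.perm` assumes Python 3.8 or later.

```python
from itertools import permutations
from math import factorial, perm

def n_p_r(n: int, r: int) -> int:
    """Number of ordered selections of r objects from n: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

# PINs made of 4 different digits: ordered selections of 4 digits from 10.
pins_without_repeats = n_p_r(10, 4)

# Cross-checks: the standard-library helper and a brute-force enumeration.
assert pins_without_repeats == perm(10, 4)
assert pins_without_repeats == sum(1 for _ in permutations("0123456789", 4))

print(pins_without_repeats)       # 5040
print(1 / pins_without_repeats)   # probability of a correct first guess
print(n_p_r(4, 4))                # 24 = 4!, the arrangements of 4 objects
```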

Counting selections in which the order is ignored (combinations) is very similar to counting permutations, except that the ordering of objects no longer matters. For example, if we select two objects from three objects A, B and C, there are ${}^3P_2 = 6$ ways of doing this:

(A, B), (A, C), (B, A), (B, C), (C, A), (C, B).

However, if we are not interested in the ordering, just in whether A, B or C are chosen, then (A, B) is the same as (B, A) etc. and so the number of selections is just 3:

(A, B), (A, C), (B, C).

Ignoring the ordering reduces the number of permutations by a factor of ${}^2P_2 = 2$. In general, the number of combinations of r objects from n objects is

$$\frac{\text{number of ordered samples of size } r}{\text{number of orderings of samples of size } r} = \frac{{}^nP_r}{{}^rP_r} = \frac{{}^nP_r}{r!} = \frac{n!}{r!\,(n-r)!}.$$

Again, this is a very commonly found expression in combinatorics, so it has its own notation:

$${}^nC_r = \frac{n!}{r!\,(n-r)!}.$$

There are other commonly used notations for this quantity: $C^n_r$ and $\binom{n}{r}$. These numbers are known as the binomial coefficients.
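
The relationship between ordered and unordered selections can be illustrated with a short Python sketch. It is an addition to the notes, not the author's own material; `n_c_r` is a hypothetical helper name, and `math.comb` assumes Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, factorial

def n_c_r(n: int, r: int) -> int:
    """Number of unordered selections of r objects from n: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

objects = ["A", "B", "C"]
ordered = list(permutations(objects, 2))     # order matters
unordered = list(combinations(objects, 2))   # order ignored

print(len(ordered))   # 6
print(unordered)      # [('A', 'B'), ('A', 'C'), ('B', 'C')]

# Each unordered pair corresponds to 2! = 2 ordered pairs.
assert len(ordered) == len(unordered) * factorial(2)
assert n_c_r(3, 2) == comb(3, 2) == len(unordered)
```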

Now we can see that the number of ways to select 4 retail outlets out of 20 is

$${}^{20}C_4 = \frac{20!}{4! \times 16!} = 4845.$$

An easy way to calculate binomial coefficients (at least small ones) is to use the fact that

$${}^nC_r = \frac{n}{r} \times \frac{n-1}{r-1} \times \frac{n-2}{r-2} \times \cdots \times \frac{n-r+1}{1}.$$

For example,

$${}^{20}C_4 = \frac{20}{4} \times \frac{19}{3} \times \frac{18}{2} \times \frac{17}{1} = 4845.$$
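
The product form of the binomial coefficient can be sketched in Python as follows. This is an illustrative addition: the function name `binomial_by_product` is hypothetical, and exact fractions are used so that the intermediate factors (which are not always whole numbers) do not introduce rounding error.

```python
from fractions import Fraction
from math import comb

def binomial_by_product(n: int, r: int) -> int:
    """Compute nCr as (n/r) * ((n-1)/(r-1)) * ... * ((n-r+1)/1)."""
    result = Fraction(1)
    for k in range(r):
        # k-th factor: numerator counts down from n, denominator from r.
        result *= Fraction(n - k, r - k)
    return int(result)   # the full product is always a whole number

print(binomial_by_product(20, 4))   # 4845
assert binomial_by_product(20, 4) == comb(20, 4)
assert binomial_by_product(49, 6) == comb(49, 6)
```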

To see how combinations can be used to calculate probabilities, we will look at the UK National Lottery. In this lottery, there are 49 numbered balls, and six of these are selected at random. A seventh ball is also selected, but this is only relevant if you get exactly five numbers correct. The player selects six numbers before the draw is made, and after the draw, counts how many numbers are in common with those drawn. Players win a prize if they select at least three of the balls drawn. The order in which the balls are drawn is irrelevant.

To begin with, let’s calculate the probability that exactly 3 of the 6 numbers we select are drawn. First we need to count the number of possible draws (the number of different sets of 6 numbers), and then how many of those draws correspond to getting exactly three numbers correct. The number of possible draws is the number of ways of choosing 6 objects from 49. This is

$${}^{49}C_6 = 13{,}983{,}816.$$

The number of drawings corresponding to getting exactly three right is calculated as follows. Of the 49 balls from which the draw is made, 6 correspond to your selected numbers, and 43 correspond to other numbers. We want to know how many ways there are of choosing 3 of your selected numbers and 3 other numbers. This is the number of ways of choosing 3 from 6, multiplied by the number of ways of choosing 3 from 43. That is, there are

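$${}^6C_3 \times {}^{43}C_3 = 20 \times 12{,}341 = 246{,}820$$

ways of choosing exactly 3 of your selected numbers. So, the probability of matching exactly 3 numbers is

$$\frac{{}^6C_3 \times {}^{43}C_3}{{}^{49}C_6} = \frac{246{,}820}{13{,}983{,}816} \approx 0.0177.$$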

Similarly, we can calculate the probability of getting other prize-winning outcomes:

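$$P(\text{match exactly 6 correct numbers}) = \frac{{}^6C_6}{{}^{49}C_6} = \frac{1}{13{,}983{,}816} \simeq 7 \times 10^{-8}$$

$$P(\text{match exactly 5 correct numbers plus bonus ball}) = \frac{{}^6C_5 \times {}^1C_1}{{}^{49}C_6} = \frac{6}{13{,}983{,}816} \simeq 4 \times 10^{-7}$$

$$P(\text{match exactly 5 correct numbers}) = \frac{{}^6C_5 \times {}^{43}C_1}{{}^{49}C_6} = \frac{258}{13{,}983{,}816} \simeq 2 \times 10^{-5}$$

$$P(\text{match exactly 4 correct numbers}) = \frac{{}^6C_4 \times {}^{43}C_2}{{}^{49}C_6} = \frac{13{,}545}{13{,}983{,}816} \simeq 1 \times 10^{-3}.$$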
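
The same counting pattern applies to any number of matches, which the following Python sketch makes explicit. It is an illustration added to the notes: the names `BALLS`, `PICKED` and `p_match` are hypothetical, and the bonus ball is ignored, so `p_match(5)` counts every draw with five matching main numbers.

```python
from math import comb

BALLS = 49    # balls in the draw
PICKED = 6    # numbers chosen by the player (and main balls drawn)

def p_match(k: int) -> float:
    """P(exactly k of the player's 6 numbers are among the 6 main balls)."""
    ways = comb(PICKED, k) * comb(BALLS - PICKED, PICKED - k)
    return ways / comb(BALLS, PICKED)

for k in (6, 5, 4, 3):
    print(k, f"{p_match(k):.2e}")

# The probabilities over k = 0, ..., 6 must add up to 1.
assert abs(sum(p_match(k) for k in range(7)) - 1) < 1e-12
```

Choosing k of your own 6 numbers and the remaining 6 − k from the other 43 is the same two-stage count used in the text for three matches.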

These outcomes are not very likely and so the prizes are chosen to reflect how likely you are to win. For example, in a recent lottery draw, the prizes were

| Number of balls matched | Prize |
|---|---|
| 6 | £2.4M |
| 5 plus bonus | £240K |
| 5 | £3K |
| 4 | £100 |
| 3 | £10 |
| < 3 | £0 |

This information allows us to calculate a fair price for such a bet. The expected monetary value of

| x | P(X = x) |
|---|---|
| 1 | 1/6 |
| 2 | 1/6 |
| 3 | 1/6 |
| 4 | 1/6 |
| 5 | 1/6 |
| 6 | 1/6 |
| sum | 1 |

Just as with sample data, it is useful to have some summary information about probability distributions. For example, what is the average value of the random variable? How much variation is there in this distribution?

7.3.2 Expectation and the population mean

The mean of a quantitative random variable is a weighted sum of its possible values, where each weight is the probability of the value occurring. This is known as the expected value of the random variable or the population mean of the random variable and is usually written as E(X) or μ. Therefore, for a discrete random variable,

$$E(X) = \mu = \sum_x x \, P(X = x).$$

Previously we have seen a similar calculation when determining the expected monetary value

$$EMV = \sum P(\text{Event}) \times \text{Monetary value of Event}.$$

The expected value is the average value which we would get in an infinitely long sequence of identical experiments.
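
As an illustration of the EMV formula, the sketch below applies it to the lottery prizes and combination counts quoted earlier in the chapter. It is not the author's own calculation and rests on stated assumptions: the prizes are those of the single draw listed in the prize table, and the five-number prize is taken to be paid only when the bonus ball is not matched, which leaves 258 − 6 = 252 of the 258 five-number cases. The names `TOTAL_DRAWS`, `outcomes` and `emv` are hypothetical.

```python
from fractions import Fraction
from math import comb

TOTAL_DRAWS = comb(49, 6)   # 13,983,816 equally likely sets of six main balls

# outcome: (weight out of TOTAL_DRAWS, prize in pounds)
outcomes = {
    "6":            (comb(6, 6),                 2_400_000),
    "5 plus bonus": (comb(6, 5) * comb(1, 1),      240_000),
    "5":            (comb(6, 5) * comb(42, 1),       3_000),  # assumption: bonus not matched
    "4":            (comb(6, 4) * comb(43, 2),         100),
    "3":            (comb(6, 3) * comb(43, 3),          10),
}

# EMV = sum of P(Event) * Monetary value of Event
emv = sum(Fraction(ways, TOTAL_DRAWS) * prize for ways, prize in outcomes.values())
print(f"EMV of one ticket: £{float(emv):.2f}")   # EMV of one ticket: £0.60
```

Under these assumptions a ticket is worth about 60p on average, so any ticket price above that favours the organiser.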

For example, suppose that the population of interest is this class and that it contains n students. Suppose that we are interested in the number of times that students have bought a particular product (e.g. a cinema ticket) in the last month. Clearly the population mean is just the average of this variable in the class:

$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where xi is the number of times student i has bought the product. We can also write this as

$$\mu = \frac{1}{n} \sum_{j=0}^{\infty} j f_j = \sum_{j=0}^{\infty} j \, \frac{f_j}{n}$$

where $f_j$ is the frequency of x = j in the population and $f_j/n$ is the relative frequency. If we choose a student at random from the class then the probability that we choose a student with x = j is

$$P(X = j) = \frac{f_j}{n},$$

the relative frequency, and so

$$\mu = \sum_{j=0}^{\infty} j \, P(X = j).$$

It is also clear that this is the average which we would get if we kept on sampling, with replacement, for a very long time.
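
The equivalence between the simple average and the probability-weighted sum can be checked on a small example. The data below are hypothetical (a made-up class of 10 students) and the variable names are illustrative; the sketch only demonstrates that the two formulas for μ agree.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical purchase counts for a class of n = 10 students.
purchases = [0, 2, 1, 0, 3, 1, 1, 0, 2, 1]
n = len(purchases)

# Direct average: (1/n) * sum of x_i.
mean_direct = Fraction(sum(purchases), n)

# Via frequencies: sum over j of j * f_j / n, where f_j / n = P(X = j).
frequencies = Counter(purchases)
pmf = {j: Fraction(f_j, n) for j, f_j in frequencies.items()}
mean_from_pmf = sum(j * p for j, p in pmf.items())

assert mean_direct == mean_from_pmf
assert sum(pmf.values()) == 1
print(mean_direct)   # 11/10
```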

For the die-rolling experiment, the average number of spots we would get if we repeated the experiment an “infinite” number of times is

$$E(X) = \sum_x x \, P(X = x) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + \cdots + 6 \times \frac{1}{6} = 3.5.$$
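
A short Python sketch can compare the exact expectation with a long-run average of simulated rolls. This is an illustration only: `die_pmf` and `expectation` are hypothetical names, and the random seed is fixed purely so that the run is repeatable.

```python
import random
from fractions import Fraction

# Probability distribution of a fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf):
    """E(X) = sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

mu = expectation(die_pmf)
print(mu, float(mu))   # 7/2 3.5

# Long-run average of simulated rolls (seed fixed only for repeatability).
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)
print(sample_mean)     # close to 3.5, but not exactly equal
```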

This concept can be generalised to calculate the expected value of any function of X. For instance, in the lottery example discussed previously, the prize was determined by the number of matches. In the die-rolling experiment, we could consider a prize worth the square of the number showing: £ 1 for a 1, £ 4 for a 2, £ 9 for a 3, and so on. In this case the expected prize money is

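$$E\left(X^2\right) = \sum_x x^2 \, P(X = x) = 1 \times \frac{1}{6} + 4 \times \frac{1}{6} + \cdots + 36 \times \frac{1}{6} = \frac{91}{6} \approx 15.17.$$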
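
The expected value of a function of X can be computed with the same weighted sum, as in this Python sketch. It is an added illustration; `expectation_of` is a hypothetical helper that takes the function g as an argument.

```python
from fractions import Fraction

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation_of(g, pmf):
    """E[g(X)] = sum of g(x) * P(X = x) over all possible values x."""
    return sum(g(x) * p for x, p in pmf.items())

expected_prize = expectation_of(lambda x: x ** 2, die_pmf)
print(expected_prize)                    # 91/6
print(round(float(expected_prize), 2))   # 15.17

# E(X^2) is not the same as (E(X))^2 = 12.25.
assert expected_prize != expectation_of(lambda x: x, die_pmf) ** 2
```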

7.3.3 Population variance and standard deviation

In addition to having the population mean as a measure of location, it is also useful to know about the spread of the random variable about this value. The variance of a random variable is denoted Var(X) or sometimes σ^2 and is determined by

$$\text{Var}(X) = \sigma^2 = E\left[(X - \mu)^2\right].$$

It is simply the average squared deviation from the mean. Note that this is the same sort of calculation as with sample variances. The larger the value for the variance, the larger the spread.

Referring back to the die-rolling experiment, if X is the number of spots, we can calculate the variance (using μ = 3.5):

| x | P(X = x) | (x − μ)² | (x − μ)² P(X = x) |
|---|---|---|---|
| 1 | 1/6 | 6.25 | 1.0417 |
| 2 | 1/6 | 2.25 | 0.3750 |
| 3 | 1/6 | 0.25 | 0.0417 |
| 4 | 1/6 | 0.25 | 0.0417 |
| 5 | 1/6 | 2.25 | 0.3750 |
| 6 | 1/6 | 6.25 | 1.0417 |
| sum | 1 | | 2.9167 |

Hence Var(X) = 2.9167.
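
The variance calculation can be reproduced exactly in Python. This sketch is an addition to the notes with illustrative variable names; it also prints the standard deviation, which is the square root of the variance.

```python
from fractions import Fraction
from math import sqrt

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in die_pmf.items())                    # 7/2
variance = sum((x - mu) ** 2 * p for x, p in die_pmf.items())  # E[(X - mu)^2]

print(variance)                    # 35/12
print(round(float(variance), 4))   # 2.9167
print(sqrt(variance))              # standard deviation
```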

7.4 Exercises 7

  1. Consider a lottery that is slightly different to the National Lottery in that there are 48 balls instead of 49. What is the probability of winning the jackpot in this lottery? (That is, you choose six balls and exactly these six are drawn).
  2. A market survey has identified 10 desirable features for a new product. However, due to cost constraints, only four of these features can be included. If the features are selected randomly, what is the probability that your four favourites are chosen in your preferred ordering?
  3. If you dial 7 digits at random on a (non-mobile) telephone in Newcastle, what is the probability you dial Dr. Farrow’s office number (which has 7 digits)?
  4. A sample of four mass-produced items is examined for quality control purposes. Each item can be either satisfactory (S) or unsatisfactory (U). Each item has a probability of 0.2 of being unsatisfactory and each item is independent of every other item.

     (a) Consider the sequence of 4 items. In how many different sequences can we get
         i. no unsatisfactory items?
         ii. exactly 1 unsatisfactory item?
         iii. exactly 2 unsatisfactory items?
         iv. exactly 3 unsatisfactory items?
         v. four unsatisfactory items?
     (b) Find the probability of a particular sequence containing
         i. no unsatisfactory items.
         ii. exactly 1 unsatisfactory item.
         iii. exactly 2 unsatisfactory items.
         iv. exactly 3 unsatisfactory items.
         v. four unsatisfactory items.
     (c) Hence find the probability that we get
         i. no unsatisfactory items.
         ii. exactly 1 unsatisfactory item.
         iii. exactly 2 unsatisfactory items.
         iv. exactly 3 unsatisfactory items.
         v. four unsatisfactory items.
     (d) Find the mean number of unsatisfactory items.
     (e) Find the variance and standard deviation of the number of unsatisfactory items.