Understanding Discrete Probabilities: Scilab & Random Variables, Manuals, Projects, Research in Electronics

An introduction to discrete probabilities using Scilab. It covers the concepts of discrete random variables, probability calculations using disjoint sets and complementary events, and combinatorics. The document also includes examples and explanations of the gamma function and its relation to factorials, as well as the use of Scilab functions for computing permutations and factorials.

Type: Manuals, Projects, Research

2013

Shared on 06/03/2013

jacquelline-macena-8

INTRODUCTION TO
DISCRETE PROBABILITIES WITH SCILAB

4 Acknowledgments

5 References and notes

Bibliography

Index

1 Discrete random variables

In this section, we present discrete random variables. The first section presents general definitions for sets, including union and intersection. Then we present the definition of the discrete distribution function and the probability of an event. In the third section, we give properties of probabilities, such as, for example, the probability of the union of two disjoint events. The fourth section is devoted to the very common discrete uniform distribution function. Then we present the definition of conditional probability. This leads to Bayes formula which allows us to compute the posterior

1.1 Sets

A set is a collection of elements. In this document, we consider sets of elements in a fixed nonempty set Ω, to be called a space. Assume that A is a set of elements. If x is a point in that set, we write x ∈ A. If there is no point in A, we write A = ∅. If the number of elements in A is finite, let us denote by #(A) the number of elements in the set A. If the number of elements in A is infinite, the cardinality cannot be computed (for example A = ℕ). The set A^c is the set of all points in Ω which are not in A:

A^c = {x ∈ Ω / x ∉ A}. (1)

The set A^c is called the complementary set of A. The set B is a subset of A if any point in B is also in A, and we then write B ⊂ A. The two sets A and B are equal if A ⊂ B and B ⊂ A. The difference set A − B is the set of all points of A which are not in B:

A − B = {x ∈ A / x ∉ B}. (2)

The intersection A ∩ B of two sets A and B is the set of points common to A and B:

A ∩ B = {x ∈ A and x ∈ B}. (3)

The union A ∪ B of two sets A and B is the set of points which belong to at least one of the sets A or B:

A ∪ B = {x ∈ A or x ∈ B}. (4)

The operations that we defined are presented in figure 1. These figures are often referred to as Venn diagrams.

Figure 1: Operations on sets. Upper left, union: A ∪ B. Upper right, intersection: A ∩ B. Lower left, complement: A^c. Lower right, difference: A − B.

Two sets A and B are disjoint, or mutually exclusive, if their intersection is empty, i.e. A ∩ B = ∅. In the following, we will use the fact that we can always decompose the union of two sets as the union of three disjoint subsets. Indeed, assume that A, B ⊂ Ω. We have

A ∪ B = (A − B) ∪ (A ∩ B) ∪ (B − A), (5)

where the sets A − B, A ∩ B and B − A are disjoint. This decomposition will be used several times in this chapter. The cross product of two sets is the set

A × B = {(x, y) /x ∈ A, y ∈ B}. (6)

Assume that n is a positive integer. The power set A^n is the set

A^n = {(x_1, ..., x_n) / x_1, ..., x_n ∈ A}. (7)

Example (Die with 6 faces) Assume that a 6-face die is rolled once. The sample space for this experiment is

Ω = {1, 2, 3, 4, 5, 6}. (8)

The set of even numbers is A = {2, 4, 6} and the set of odd numbers is B = {1, 3, 5}. Their intersection is empty, i.e. A ∩ B = ∅, which proves that A and B are disjoint. Since their union is the whole sample space, i.e. A ∪ B = Ω, these two sets are complements of each other, i.e. A^c = B and B^c = A.
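The set operations of this section can be tried directly on the die example. The following is a minimal sketch, assuming that sets are stored as row vectors of distinct numbers and using the Scilab functions union, intersect and setdiff; the variable names are only illustrative.

```scilab
// Sample space and events for one roll of a 6-face die
Omega = 1:6;
A = [2 4 6];   // even numbers
B = [1 3 5];   // odd numbers

AuB = union(A, B);          // union of A and B
AnB = intersect(A, B);      // intersection of A and B
Ac  = setdiff(Omega, A);    // complement of A in Omega
AmB = setdiff(A, B);        // difference A - B

disp(isempty(AnB));         // true when A and B are disjoint
disp(isequal(AuB, Omega));  // true when the union is the whole sample space
disp(isequal(Ac, B));       // true when the complement of A is B
```

Storing a set as a plain vector is a simplification: nothing prevents duplicated entries, so the vectors must be built with care.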

1.3 Properties of discrete probabilities

In this section, we present the properties that the probability P(A) satisfies. We also derive some results for the probabilities of other events, such as unions of disjoint events. The following proposition gives some elementary properties satisfied by a probability P.

Proposition 1.3. (Probability) Assume that Ω is a sample space and that f is a distribution function on Ω. Assume that P is the probability associated with f. The probability of the event Ω is one, i.e.

P (Ω) = 1. (16)

The probability of the empty set is zero, i.e.

P (∅) = 0. (17)

Assume that A and B are two subsets of Ω. If A ⊂ B, then

P (A) ≤ P (B). (18)

For any event A ⊂ Ω, we have

0 ≤ P (A) ≤ 1. (19)

Proof. The equality 16 derives directly from the definition 10 of a distribution function, i.e. $P(\Omega) = \sum_{x \in \Omega} f(x) = 1$. The equality 17 derives directly from 10. Assume that A and B are two subsets of Ω so that A ⊂ B. Since a probability is the sum of nonnegative terms, we have

$P(A) = \sum_{x \in A} f(x) \le \sum_{x \in B} f(x) = P(B)$ (20)

which proves the inequality 18. The inequalities 19 derive directly from the definition of a probability 12. First, the probability P is nonnegative since 9 states that f is nonnegative. Second, the probability P of an event A is lower than or equal to 1, since $P(A) = \sum_{x \in A} f(x) \le \sum_{x \in \Omega} f(x) = P(\Omega) = 1$, which concludes the proof.

Proposition 1.4. (Probability of two disjoint subsets) Assume that Ω is a sample space and that f is a distribution function on Ω. Assume that P is the probability associated with f. If A and B are two disjoint subsets of Ω, then

P (A ∪ B) = P (A) + P (B). (21)

Figure 2 presents the situation of two disjoint sets A and B. Since the two sets have no intersection, it suffices to add the probabilities associated with each event.

Proof. Assume that A and B are two disjoint subsets of Ω. We can decompose A ∪ B as A ∪ B = (A − B) ∪ (A ∩ B) ∪ (B − A), so that

$P(A \cup B) = \sum_{x \in A \cup B} f(x)$ (22)

$= \sum_{x \in A - B} f(x) + \sum_{x \in A \cap B} f(x) + \sum_{x \in B - A} f(x).$ (23)

Figure 2: Two disjoint sets.

But A and B are disjoint, so that A − B = A, A ∩ B = ∅ and B − A = B. Therefore,

$P(A \cup B) = \sum_{x \in A} f(x) + \sum_{x \in B} f(x)$ (24)

$= P(A) + P(B),$ (25)

which concludes the proof.

Notice that the equality 21 can be generalized immediately to a sequence of disjoint events.

Proposition 1.5. (Probability of disjoint subsets) Assume that Ω is a sample space and that f is a distribution function on Ω. Assume that P is the probability associated with f. For any disjoint events A_1, A_2, ..., A_k ⊂ Ω with k ≥ 0, we have

P(A_1 ∪ A_2 ∪ ... ∪ A_k) = P(A_1) + P(A_2) + ... + P(A_k). (26)

Proof. For example, we can use proposition 1.4 to carry out the proof by induction on the number of events.

Example (Die with 6 faces) Assume that a 6-face die is rolled once so that the sample space for this experiment is Ω = {1, 2, 3, 4, 5, 6}. Assume that the distribution function is f(x) = 1/6 for x ∈ Ω. The event A = {1, 2, 3} corresponds to the numbers lower than or equal to 3. The probability of this event is P(A) = 1/2. The event B = {5, 6} corresponds to the numbers greater than or equal to 5. The probability of this event is P(B) = 1/3. The two events are disjoint, so that proposition 1.4 can be applied, which proves that P(A ∪ B) = 1/2 + 1/3 = 5/6.

Proposition 1.6. (Probability of the complementary event) Assume that Ω is a sample space and that f is a distribution function on Ω. Assume that P is the probability associated with f. For any subset A of Ω,

P(A) + P(A^c) = 1. (27)

Proof. We have Ω = A ∪ A^c, where the sets A and A^c are disjoint. Therefore, from proposition 1.4, we have

P(Ω) = P(A) + P(A^c), (28)

where P (Ω) = 1, which concludes the proof.
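The die example and the propositions 1.4 and 1.6 can be checked numerically. The sketch below assumes that the distribution function is stored as a vector f indexed by the outcomes; the helper prob is hypothetical and is not a Scilab built-in.

```scilab
// Uniform distribution function on the die: f(x) = 1/6 for x in Omega
Omega = 1:6;
f = ones(1, 6) / 6;

// Hypothetical helper: P(E) is the sum of f(x) over the outcomes x in E.
// Outcomes are used directly as indices, which works because Omega = 1:6.
function p = prob(E, f)
    p = sum(f(E));
endfunction

A = [1 2 3];
B = [5 6];
pA   = prob(A, f);
pB   = prob(B, f);
pAuB = prob(union(A, B), f);
pAc  = prob(setdiff(Omega, A), f);

// Proposition 1.4: for disjoint events, the probabilities add up
mprintf("P(A) + P(B) = %f, P(A u B) = %f\n", pA + pB, pAuB);
// Proposition 1.6: P(A) + P(A^c) = 1
mprintf("P(A) + P(Ac) = %f\n", pA + pAc);
```

Using the outcomes as indices is a shortcut tied to this sample space; for a general Ω, a lookup from outcome to index would be needed.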

which leads to P (A) = P (A − B) + P (A ∩ B), which can be written as

P (A − B) = P (A) − P (A ∩ B). (33)

Similarly, we can prove that

P (B − A) = P (B) − P (B ∩ A). (34)

By plugging the two equalities 33 and 34 into 31, we find

P (A ∪ B) = P (A) − P (A ∩ B) + P (A ∩ B) + P (B) − P (B ∩ A), (35)

which simplifies into

P (A ∪ B) = P (A) + P (B) − P (B ∩ A), (36)

and concludes the proof.

Example (Disease) Assume that infections can be bacterial (B), viral (V) or both (B ∩ V). This implies that B ∪ V = Ω, but the two events are not disjoint, i.e. B ∩ V ≠ ∅. Assume that P(B) = 0.7 and P(V) = 0.4. What is the probability of having both types of infections? The probability of having both infections is P(B ∩ V). From proposition 1.7, we have P(B ∪ V) = P(B) + P(V) − P(B ∩ V), which leads to P(B ∩ V) = P(B) + P(V) − P(B ∪ V). Since B ∪ V = Ω, we have P(B ∪ V) = 1, and we finally get P(B ∩ V) = 0.7 + 0.4 − 1 = 0.1. This example is presented in [10].
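The equality 36 can be checked on two events which are not disjoint. The following sketch assumes a fair 6-face die, so that each outcome has probability 1/6; the events and the variable names are chosen only for illustration. The last lines redo the arithmetic of the disease example.

```scilab
Omega = 1:6;
A = [1 2 3];      // numbers lower than or equal to 3
B = [2 4 6];      // even numbers; A and B share the outcome 2

// Fair die: each outcome has probability 1/6, so P(E) = #(E)/6
pA   = length(A) / length(Omega);
pB   = length(B) / length(Omega);
pAnB = length(intersect(A, B)) / length(Omega);
pAuB = length(union(A, B)) / length(Omega);

// Equation 36: P(A u B) = P(A) + P(B) - P(A n B)
mprintf("%f = %f\n", pAuB, pA + pB - pAnB);

// Disease example: the union of the two events is Omega, so P(B u V) = 1
pBact = 0.7;
pVir  = 0.4;
pBoth = pBact + pVir - 1;
mprintf("P(B n V) = %.1f\n", pBoth);
```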

1.4 Uniform distribution

In this section, we describe the particular situation where the distribution function is uniform.

Definition 1.8. (Uniform distribution) Assume that Ω is a finite sample space. The uniform distribution function is

$f(x) = \frac{1}{\#(\Omega)}$

for all x ∈ Ω.

Proposition 1.9. (Probability with uniform distribution) Assume that Ω is a finite sample space and that f is a uniform distribution function. Then the probability of the event A ⊂ Ω is

$P(A) = \frac{\#(A)}{\#(\Omega)}.$

Proof. When the distribution function is uniform, the definition 1.2 implies that

$P(A) = \sum_{x \in A} f(x) = \sum_{x \in A} \frac{1}{\#(\Omega)} = \frac{\#(A)}{\#(\Omega)},$

which concludes the proof.

Figure 4: A set A, subset of the sample space Ω.

Example (Die with 6 faces) Assume that a 6-face die is rolled once so that the sample space for this experiment is Ω = {1, 2, 3, 4, 5, 6}. In the previous analysis of this example, we have assumed that the distribution function is f(x) = 1/6 for x ∈ Ω. This is consistent with definition 1.8, since #(Ω) = 6. Such a die is a fair die, meaning that all faces have the same probability. The event A = {2, 4, 6} corresponds to the statement that the result of the roll is an even number. The number of outcomes in this event is #(A) = 3. From proposition 1.9, the probability of this event is P(A) = 3/6 = 1/2.

1.5 Conditional probability

In this section, we define the conditional distribution function and the conditional probability. We analyze this definition in the particular situation of the uniform distribution. In some situations, we want to consider the probability of an event A given that an event B has occurred. In this case, we consider the set B as a new sample space, and update the definition of the distribution function accordingly.

Definition 1.10. (Conditional distribution function) Assume that Ω is a sample space and that f is a distribution function on Ω. Assume that A is a subset of Ω with $P(A) = \sum_{x \in A} f(x) > 0$. The function f(x|A) defined by

$f(x|A) = \begin{cases} \dfrac{f(x)}{\sum_{x \in A} f(x)}, & \text{if } x \in A, \\ 0, & \text{if } x \notin A, \end{cases}$

is the conditional distribution function of x given A.

Figure 4 presents the situation where an event A is considered for a conditional distribution. The distribution function f(x) is with respect to the sample space Ω while the conditional distribution function f(x|A) is with respect to the set A.

Proof. What is to be proved in this proposition is that the function f(x|A) is a distribution function. Let us prove that the function f(x|A) satisfies the equality

$\sum_{x \in \Omega} f(x|A) = 1.$ (42)

since f(x|B) = 0 if x ∉ B. Hence,

$P(A|B) = \sum_{x \in A \cap B} \frac{f(x)}{\sum_{x \in B} f(x)} = \frac{\sum_{x \in A \cap B} f(x)}{\sum_{x \in B} f(x)} = \frac{P(A \cap B)}{P(B)}.$

The previous equality is well defined since P (B) > 0.

This definition can be analyzed in the particular case where the distribution function is uniform. Assume that #(Ω) is the size of the sample space and #(A) (resp. #(B) and #(A ∩ B)) is the number of elements of A (resp. of B and A ∩ B). The conditional probability P(A|B) is

$P(A|B) = \frac{\#(A \cap B)}{\#(B)}.$

We notice that

$\frac{\#(B)}{\#(\Omega)} \cdot \frac{\#(A \cap B)}{\#(B)} = \frac{\#(A \cap B)}{\#(\Omega)}$

for all A, B ⊂ Ω. This leads to the equality

P (B)P (A|B) = P (A ∩ B), (55)

for all A, B ⊂ Ω. The previous equation could have been directly found based on the equation

The following example is given in [4], in section 4.1, "Discrete Conditional Probability".

Example Grinstead and Snell [4] present a table which gives the number of survivors at single years of age. This table gathers data compiled in the USA in 1990. The first line counts 100,000 persons born alive, with decreasing values when the age is increasing. This table shows that 89.8 % in a population of 100,000 females can expect to live to age 60, while 57.0 % can expect to live to age 80. Given that a woman is 60, what is the probability that she lives to age 80? Let us denote by A = {a ≥ 60} the event that a woman lives to age 60, and let us denote by B = {a ≥ 80} the event that a woman lives to age 80. We want to compute the conditional probability P({a ≥ 80}|{a ≥ 60}). Since {a ≥ 60} ∩ {a ≥ 80} = {a ≥ 80}, the proposition 1.11 gives

$P(\{a \ge 80\} | \{a \ge 60\}) = \frac{P(\{a \ge 60\} \cap \{a \ge 80\})}{P(\{a \ge 60\})} = \frac{P(\{a \ge 80\})}{P(\{a \ge 60\})} = \frac{0.570}{0.898} = 0.635,$

with 3 significant digits. In other words, a woman who is already 60 has a 63.5 % chance to live to 80.
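The conditional probability of the previous example, and the formula for the uniform case, can be reproduced with a few lines of Scilab. This is a sketch only: the die events in the second part are assumptions chosen for illustration and do not come from [4].

```scilab
// Life-table example: P(a >= 80 | a >= 60) = P(a >= 80) / P(a >= 60)
p60 = 0.898;
p80 = 0.570;
p80given60 = p80 / p60;
mprintf("P(a>=80 | a>=60) = %.3f\n", p80given60);

// Uniform case on a fair die: P(A|B) = #(A n B) / #(B)
Omega = 1:6;
A = [2 4 6];        // even result
B = [4 5 6];        // result greater than or equal to 4
pAgivenB = length(intersect(A, B)) / length(B);
pB   = length(B) / length(Omega);
pAnB = length(intersect(A, B)) / length(Omega);

// Equation 55: P(B) P(A|B) = P(A n B)
mprintf("%f = %f\n", pB * pAgivenB, pAnB);
```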

Figure 6: Tree diagram - The task is made of 3 steps. There are 2 choices for step #1, 3 choices for step #2 and 2 choices for step #3. The total number of ways to perform the full sequence of steps is n = 2 · 3 · 2 = 12.

2 Combinatorics

In this section, we present several tools which allow us to compute probabilities of discrete events. One powerful analysis tool is the tree diagram, which is presented in the first part of this section. Then, we detail permutation and combination numbers, which allow us to solve many probability problems.

2.1 Tree diagrams

In this section, we present the general method which allows us to count the total number of ways that a task can be performed. We illustrate that method with tree diagrams. Assume that a task is carried out in a sequence of n steps. The first step can be performed by making one choice among m_1 possible choices. Similarly, there are m_2 possible ways to perform the second step, and so forth. The complete sequence can therefore be performed in N = m_1 m_2 ... m_n different ways.

To illustrate the sequence of steps, the associated tree can be drawn. An example of such a tree diagram is given in figure 6. Each node in the tree corresponds to one step in the sequence. The number of children of a parent node is equal to the number of possible choices for the step. At the bottom of the tree, there are N leaves, where each path, i.e. each sequence of nodes from the root to a leaf, corresponds to a particular sequence of choices. We can think of the tree as representing a random experiment, where the final state is the outcome of the experiment. In this context, each choice is performed at random, depending on the probability associated with each branch. We will review tree diagrams throughout this section and especially in the section devoted to Bernoulli trials.
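The counting rule above can be illustrated with the values of figure 6, i.e. 2, 3 and 2 choices for the three steps. The sketch below computes the product of the numbers of choices and, as a cross-check, enumerates every path of the tree; the variable names are illustrative.

```scilab
// Number of choices at each step (the values of figure 6)
m = [2 3 2];
N = prod(m);            // total number of leaves: 2 * 3 * 2
mprintf("N = %d\n", N);

// Enumerate every path from the root to a leaf, one row per path
paths = [];
for i1 = 1:m(1)
    for i2 = 1:m(2)
        for i3 = 1:m(3)
            paths = [paths; i1 i2 i3];
        end
    end
end
disp(size(paths, "r"));   // number of enumerated paths, to compare with N
```

The explicit enumeration is only practical for small trees, since the number of paths grows as the product of the m_i.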

2.2 Permutations

In this section, we present permutations, which are ordered subsets of a given set.

Definition 2.1. ( Permutation) Assume that A is a finite set. A permutation of A is a one-to-one mapping of A onto itself.

Proof. #1 Let us pick an element to place at index 1. There are n elements in the set, leading to n possible choices. For the element at index 2, there are n − 1 elements left in the set. For the element at index n, there is only 1 element left. The total number of permutations is therefore n · (n − 1) · ... · 2 · 1, which concludes the proof.

Proof. #2 The element at index 1 can be located at indexes 1, 2, ..., n so that there are n ways to set the element at index 1. Once the element at index 1 is placed, there are n − 1 ways to set the element at index 2. The last element at index n can only be set at the remaining index. The total number of permutations is therefore n · (n − 1) · ... · 2 · 1, which concludes the proof.

Example Let us compute the number of permutations of the set A = {1, 2, 3}. By the equation 62, we have 3! = 3 · 2 · 1 = 6 permutations of the set A. These permutations are:

(1 2 3) (1 3 2) (2 1 3) (2 3 1) (3 1 2) (3 2 1)

The previous permutations can also be directly read from the tree diagram of figure 7, from the root of the tree to each of the 6 leaves.
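The permutations of the previous example can be generated with the Scilab function perms, which returns one permutation per row. The following sketch only compares the number of rows with 3!; the order in which perms lists the permutations may differ from the order used in the text.

```scilab
A = [1 2 3];
P = perms(A);             // matrix with one permutation of A per row
disp(P);
nperm = size(P, "r");     // number of permutations found
disp(nperm);
disp(factorial(3));       // expected count: 3!
```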

In some situations, not all the elements in the set A are involved in the permutation. Assume that A has n elements and that j is an integer such that 0 ≤ j ≤ n. A j-permutation is a permutation of a subset of j elements in A. The general counting method used for the previous proposition allows us to count the total number of j-permutations of a given set A.

Proposition 2.3. (Permutation number) Assume that j is a positive integer. The number of j-permutations of a set A of n elements is

(n)_j = n · (n − 1) · ... · (n − j + 1). (64)

Proof. The element at index 1 can be located at indexes 1, 2, ..., n so that there are n ways to set the element at index 1. Once the element at index 1 is placed, there are n − 1 ways to set the element at index 2. The element at index j can only be set at one of the remaining n − j + 1 indexes. The total number of j-permutations is therefore n · (n − 1) · ... · (n − j + 1), which concludes the proof.

Example Let us compute the number of 2-permutations of the set A = {1, 2, 3, 4}. By the equation 64, we have (4)_2 = 4 · 3 = 12 2-permutations of the set A. These permutations are:

(1 2) (1 3) (1 4)
(2 1) (2 3) (2 4)
(3 1) (3 2) (3 4)
(4 1) (4 2) (4 3)

We can check that the number of 2-permutations in a set of 4 elements is (4)_2 = 12, which is strictly lower than the number of permutations 4! = 24.
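The equation 64 translates into a direct product. The function permnumber_sketch below is a hypothetical helper written for this illustration; it is not a Scilab built-in and it is not claimed to be the function provided later in the document. The loop cross-checks the count by enumerating the ordered pairs of distinct elements.

```scilab
// Hypothetical helper (not a Scilab built-in): number of j-permutations
// of a set of n elements, (n)_j = n (n-1) ... (n-j+1), for 1 <= j <= n.
function p = permnumber_sketch(n, j)
    p = prod((n - j + 1):n);
endfunction

p42 = permnumber_sketch(4, 2);   // 4 * 3
p44 = permnumber_sketch(4, 4);   // same product as 4!
disp([p42 p44]);

// Enumerate the 2-permutations of {1,2,3,4}: ordered pairs of distinct elements
pairs = [];
for a = 1:4
    for b = 1:4
        if a <> b then
            pairs = [pairs; a b];
        end
    end
end
disp(size(pairs, "r"));          // number of enumerated 2-permutations
```

The direct product is exact as long as the result fits in a double, but it overflows for large n and j.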

2.3 The gamma function

In this section, we present the gamma function, which is closely related to the factorial function. The gamma function was first introduced by the Swiss mathematician Leonhard Euler with the goal of generalizing the factorial to non-integer values [13]. Efficient implementations of the factorial function are based on the gamma function, and this is why this function is analyzed in detail. The practical computation of the factorial function will be analyzed in the next section.

Definition 2.4. (Gamma function) Let x be a real with x > 0. The gamma function is defined by

$\Gamma(x) = \int_0^1 (-\log(t))^{x-1} dt.$ (66)

The previous definition is not the usual form of the gamma function, but the following proposition allows us to get it.

Proposition 2.5. (Gamma function) Let x be a real with x > 0. The gamma function satisfies

$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t} dt.$ (67)

Proof. Let us consider the change of variable u = − log(t). Therefore, $t = e^{-u}$, which leads, by differentiation, to $dt = -e^{-u} du$. We get $(-\log(t))^{x-1} dt = -u^{x-1} e^{-u} du$. Moreover, if t = 0, then u = ∞ and if t = 1, then u = 0. This leads to

$\Gamma(x) = -\int_\infty^0 u^{x-1} e^{-u} du.$ (68)

For any continuously differentiable function f and any real numbers a and b, we have

$\int_a^b f(x) dx = -\int_b^a f(x) dx.$ (69)

We reverse the bounds of the integral in the equality 68 and get the result.

The gamma function satisfies

$\Gamma(1) = \int_0^\infty e^{-t} dt = \left[ -e^{-t} \right]_0^\infty = (0 + e^0) = 1.$

The following proposition makes the link between the gamma and the factorial functions.

Proposition 2.6. ( Gamma and factorial) Let x be a real with x > 0. The gamma function satisfies

Γ(x + 1) = xΓ(x) (71)

and

Γ(n + 1) = n! (72)

for any integer n ≥ 0.
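The two equalities of the proposition 2.6 can be observed numerically with the Scilab functions gamma and factorial. This is a quick check in floating point arithmetic, not a proof; the test point x = 2.5 is an arbitrary choice.

```scilab
// Equation 72: gamma(n+1) compared with n! for n = 0, 1, ..., 10
n = (0:10)';
g = gamma(n + 1);
fact = factorial(n);
disp([n g fact]);

// Equation 71 at a non-integer point: gamma(x+1) = x * gamma(x)
x = 2.5;
lhs = gamma(x + 1);
rhs = x * gamma(x);
mprintf("%.12f = %.12f\n", lhs, rhs);
```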

Proposition 2.7. (Gamma function for negative arguments) For any positive integer n and any real x such that x + n > 0,

$\Gamma(x) = \frac{\Gamma(x + n)}{x(x + 1) \cdots (x + n - 1)}.$ (80)

Proof. The proof is by induction on n. The equation 77 proves that the equality is true for n = 1. Assume that the equality 80 is true for n and let us prove that it also holds for n + 1. By the equation 77 applied to x + n, we have

$\Gamma(x + n) = \frac{\Gamma(x + n + 1)}{x + n}.$

Therefore, we have

$\Gamma(x) = \frac{\Gamma(x + n + 1)}{x(x + 1) \cdots (x + n - 1)(x + n)},$

which proves that the statement holds for n + 1 and concludes the proof.
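The formula of the proposition 2.7 can be checked at a negative non-integer point. In the sketch below, the values x = −2.5 and n = 3 are arbitrary choices satisfying x + n > 0; the left-hand side relies on the fact that the Scilab gamma function accepts negative non-integer arguments.

```scilab
x = -2.5;
n = 3;                                   // x + n = 0.5 > 0
lhs = gamma(x);
// Denominator of the formula: x (x+1) ... (x+n-1)
rhs = gamma(x + n) / prod(x + (0:n-1));
mprintf("%.10f = %.10f\n", lhs, rhs);
```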

The gamma function is singular for negative integer values of its argument, as stated in the following proposition.

Proposition 2.8. (Gamma function for integer negative arguments) For any non negative integer n,

$\Gamma(-n + h) \sim \frac{(-1)^n}{n! \, h}$

when h is small.

Proof. Consider the equation 80 with x = −n + h. We have

$\Gamma(-n + h) = \frac{\Gamma(h)}{(h - n)(h - n + 1) \cdots (h - 1)}.$

But $\Gamma(h) = \frac{\Gamma(h + 1)}{h}$, which leads to

$\Gamma(-n + h) = \frac{\Gamma(h + 1)}{(h - n)(h - n + 1) \cdots (h - 1) h}.$

When h is small, the expression Γ(h + 1) converges to Γ(1) = 1. On the other hand, the expression (h − n)(h − n + 1) ... (h − 1)h is equivalent to (−n)(−n + 1) ... (−1)h = (−1)^n n! h, which leads to the term (−1)^n and concludes the proof.
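The equivalent of the proposition 2.8 can be observed numerically by letting h decrease. The sketch below uses n = 2 and a few arbitrary values of h; the ratio between the gamma function and its equivalent is expected to get closer to 1 as h decreases.

```scilab
n = 2;
hs = [1e-2 1e-3 1e-4];
ratios = [];
for h = hs
    exact  = gamma(-n + h);
    approx = (-1)^n / (factorial(n) * h);
    ratios = [ratios exact / approx];
    mprintf("h = %g: gamma = %g, equivalent = %g\n", h, exact, approx);
end
disp(ratios);   // ratios exact/approx, one per value of h
```

Values of h much smaller than these bring the argument so close to a negative integer that the floating point evaluation of gamma itself loses accuracy, which is why the sketch stops at 1e-4.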

We have reviewed the main properties of the gamma function. In practical situations, we use the gamma function in order to compute the factorial number, as we are going to see in the next sections. The main advantage of the gamma function over the factorial is that it avoids forming the product n! = n · (n − 1) · ... · 1, which saves a significant amount of CPU time and computer memory.

factorial    returns n!
gamma        returns Γ(x)
gammaln      returns ln(Γ(x))

Figure 8: Scilab commands for permutations.

2.4 Overview of functions in Scilab

Figure 8 presents the functions provided by Scilab to compute permutations. Notice that there is no function to compute the number of permutations (n)_j = n · (n − 1) · ... · (n − j + 1). This is why, in the next sections, we provide a Scilab function to compute (n)_j.

In the next sections, we analyze each function in Scilab. We especially consider their numerical behavior and provide accurate and efficient Scilab functions to manage permutations. We emphasize the need for accuracy and robustness. For this purpose, we use the logarithmic scale to provide intermediate results which stay within the limited bounds of double precision floating point arithmetic.
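To illustrate why the logarithmic scale helps, here is a minimal sketch which computes (n)_j from gammaln, using the identity (n)_j = n!/(n − j)! = Γ(n + 1)/Γ(n − j + 1). The function permnumber_log_sketch is a hypothetical helper written for this illustration; it is not a Scilab built-in and may differ from the function that the document provides in its later sections.

```scilab
// Hypothetical helper (not a Scilab built-in): (n)_j computed in log scale,
// using (n)_j = n! / (n-j)! = Gamma(n+1) / Gamma(n-j+1).
function p = permnumber_log_sketch(n, j)
    logp = gammaln(n + 1) - gammaln(n - j + 1);
    // Rounding recovers the integer value while p is of moderate size
    p = round(exp(logp));
endfunction

small = permnumber_log_sketch(4, 2);     // to compare with 4 * 3
large = permnumber_log_sketch(200, 2);   // to compare with 200 * 199
disp([small large]);

// The naive ratio of factorials fails here: both factorials overflow
naive = factorial(200) / factorial(198);
disp(naive);
```

The design choice is to subtract the two logarithms before exponentiating, so that the huge intermediate values n! and (n − j)! are never formed.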

2.5 The gamma function in Scilab

The gamma function allows us to compute Γ(x) for a real input argument. The mathematical function Γ(x) can be extended to complex arguments, but this has not been implemented in Scilab. The following script plots the gamma function for x ∈ [−4, 4].

    x = linspace(-4, 4, 1001);
    y = gamma(x);
    plot(x, y);
    h = gcf();
    // Axis bounds [xmin ymin; xmax ymax]. Only the upper y bound (6) survives
    // in the source; xmin, xmax and ymin below are assumed.
    h.children.data_bounds = [
        -4 -6
         4  6
    ];

The previous script produces the figure 9. The following session presents various values of the gamma function.

    -->x = [-2 -1 -0 +0 1 2 3 4 5 6]';
    -->[x gamma(x)]
     ans  =
      - 2.    Nan
      - 1.    Nan
        0.  - Inf
        0.    Inf
        1.    1.
        2.    1.
        3.    2.
        4.    6.
        5.    24.
        6.    120.

Notice that the two floating point signed zeros -0 and +0 are associated with the function values −∞ and +∞ respectively. This is consistent with the value of the limit of the function from either side of the singular point. This contrasts with the value of the gamma function on negative