




























































































CourseNotesEE501.pdf statistical inference





























































































Copyright © 2006 by G. Drolet. All rights reserved. Permission is granted to make and distribute verbatim copies of these notes provided the copyright notice and this permission notice are preserved on all copies.
PREFACE
Chapter 5: presents the foundation of Coding Theory. The material is taken from Dr. Séguin's lecture notes for a course taught in 1984 at the Université Laval, Ste-Foy, Québec, QC, CANADA. Some of the results are contained in W&J Chapter 5. This chapter is self-contained and no references are made to the textbook by W&J.
Suggestions and comments to improve the notes are welcome.
August 2006
Germain Drolet Department of Electrical & Computer Engineering, Royal Military College of Canada, P.O. Box 17000, Station Forces, Kingston, Ontario, CANADA K7K 7B
Tel: (613) 541-6000, extension: 6192 Fax: (613) 544- Email: [email protected]
CONTENTS
B.3 Effect of sampling frequency
B.4 Continuous time signals
2.1 Fundamental Definitions
Refer to Wozencraft & Jacobs, pp. 13-20, for the details. The following definitions are intuitive and serve as a basis for the formal axiomatic definition of a probability system that will follow.
Definition 2.1.1. A random experiment is a procedure producing an outcome that cannot be predicted accurately and with certainty at the same time. We have:
$$f_\infty(A) \triangleq \lim_{N \to \infty} f_N(\text{seq}, A) = \lim_{N \to \infty} \frac{N(\text{seq}, A)}{N}$$
Remarks.
Example 2.1.1.
random experiment: tossing of a die,
set of all outcomes: {1, 2, 3, 4, 5, 6},
result: {1} and {3, 6} ≡ {multiples of 3} are two examples of results,
relative frequency of occurrence:
$$f_\infty(\{1\}) = \lim_{N \to \infty} \frac{N(\text{seq}, \{1\})}{N}$$
$$\begin{aligned}
f_\infty(\{1, 3\}) &= \lim_{N \to \infty} \frac{N(\text{seq}, \{1, 3\})}{N} \\
&= \lim_{N \to \infty} \frac{N(\text{seq}, \{1\}) + N(\text{seq}, \{3\})}{N} \\
&= \lim_{N \to \infty} \frac{N(\text{seq}, \{1\})}{N} + \lim_{N \to \infty} \frac{N(\text{seq}, \{3\})}{N} \\
&= f_\infty(\{1\}) + f_\infty(\{3\}) = 1/3
\end{aligned}$$
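The relative frequencies of Example 2.1.1 can be illustrated numerically. The following is a minimal sketch, assuming a fair die simulated with Python's pseudo-random generator; the function name `relative_frequency`, the seed and the number of trials are illustrative choices, not part of the notes.

```python
import random


def relative_frequency(seq, result):
    """f_N(seq, A): fraction of the outcomes in seq that belong to the result A."""
    return sum(1 for outcome in seq if outcome in result) / len(seq)


# Assumption: a fair die, simulated with an arbitrarily chosen fixed seed.
rng = random.Random(501)
seq = [rng.randint(1, 6) for _ in range(200_000)]

f_1 = relative_frequency(seq, {1})
f_3 = relative_frequency(seq, {3})
f_13 = relative_frequency(seq, {1, 3})

# For disjoint results the counts add, so f_13 equals f_1 + f_3 up to rounding.
print(f_1, f_3, f_13)
```

For a finite N the values only approximate 1/6 and 1/3, while the additivity over the disjoint results {1} and {3} holds for every N.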
The above definitions are adequate to describe the physical concept but too loose and imprecise to describe a mathematical concept. We next give the formal axiomatic definition of a probability system. Under certain physical conditions a random experiment can be modeled as a probability system and obeys the same laws. After giving the definition and its immediate consequences, we illustrate the relationship between “probability system” and “random experiment”. This abstract definition should be well understood; this will make it easier to grasp the concepts of random variables and random processes later.
Definition 2.1.2. (Axiomatic definition) A probability system consists of a non-empty collection of objects Ω, a non-empty collection F of subsets of Ω and a function P : F → R satisfying the following (axioms):
Ω is called the sample space and its elements are called sample points. The elements of F are called events (F is called the class of events). A probability system will be denoted by (Ω, F, P : F → [0, 1] ⊂ R) or simply (Ω, F, P).
Remarks.
The correspondence between “probability system” and “random experiment” is summarized in the following table:

Probability System      Random Experiment
Sample space            Set of all outcomes
Sample point            Outcome
Event                   Result
Probability measure     Relative frequency of occurrence
Remarks.
(a) It is very precise and formal and uses objects/concepts suitable to the development of a strong theory.
(b) It does not rely on the existence of a physical random experiment.^3 We can construct many examples of probability systems more freely.
We will often refer to simple real world random experiments for example/illustration purposes. Conversely, for every random experiment we can construct a corresponding probability system to be used as an idealized mathematical model. This is done as follows:
Relation of the Model to the Real World: Wozencraft & Jacobs pages 24-29. This is a difficult paragraph and may be viewed as comprising two interconnected parts:
Definition 2.1.3. A compound experiment is a random experiment which consists of a sequence of M independent (defined later) trials of a simpler experiment.
If A is an event of the simpler experiment with $P(A) = p \neq 0$, $P(\bar{A}) = 1 - p$, then the following set is a sample space of the compound experiment (M trials):
$$\Omega_M = \{(A, A, \ldots, A),\ (\bar{A}, A, A, \ldots, A),\ (A, \bar{A}, A, A, \ldots, A),\ (\bar{A}, \bar{A}, A, A, \ldots, A),\ \ldots,\ (\bar{A}, \bar{A}, \ldots, \bar{A})\}$$
Let $F_M$ be the set of all subsets of $\Omega_M$, i.e. $F_M = 2^{\Omega_M}$. We also define the mapping:
$$P_M : F_M \to [0, 1], \qquad P_M : \{x\} \mapsto p^{\#\text{ of } A \text{ in } x}\,(1 - p)^{\#\text{ of } \bar{A} \text{ in } x}$$
for any sequence $x \in \Omega_M$, and this together with the axioms of a probability measure completely determines $P_M$ for every event $E \in F_M$:
$$P_M(E) = \sum_{x \in E} p^{\#\text{ of } A \text{ in } x}\,(1 - p)^{\#\text{ of } \bar{A} \text{ in } x}$$
One can easily verify that $(\Omega_M, F_M, P_M)$ is a probability system, i.e. $F_M$ is a valid class of events and $P_M$ is a valid probability measure. The following M + 1 events are of special interest:
$$A_i = \{x \in \Omega_M : x \text{ has } i \text{ occurrences of } A \text{ and } M - i \text{ occurrences of } \bar{A}\} \in F_M, \qquad A_i \subset \Omega_M,$$
for i = 0, 1, ..., M. It is then seen (bottom of page 26) that
$$P_M(A_i) = \binom{M}{i} p^i (1 - p)^{M - i}, \qquad i = 0, 1, \ldots, M,$$
where we recall that $\binom{M}{i} = \frac{M!}{i!\,(M - i)!}$. This is called the binomial distribution (other distributions will be defined later). We see that
$$A_i \cap A_j = \emptyset \text{ whenever } i \neq j, \qquad \bigcup_{i=0}^{M} A_i = \Omega_M,$$
so we expect [from axioms (4) and (5)] $\sum_{i=0}^{M} P_M(A_i) = P_M(\Omega_M) = 1$. This is verified by the binomial theorem (top of page 27).
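Example 2.1.2. The probability of obtaining five heads when throwing a coin 15 times is
$$\binom{15}{5} \left(\frac{1}{2}\right)^{5} \left(\frac{1}{2}\right)^{10} = \frac{3003}{32768} \approx 0.0916$$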
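The binomial distribution can be checked by brute force on a small compound experiment. The sketch below, in which the function names are illustrative assumptions, enumerates every sequence of the sample space of M trials, accumulates the probability of each sequence into the event with the matching number of occurrences of A, and compares the result with the closed form. The values M = 15 and p = 1/2 correspond to the coin example, assuming a fair coin.

```python
from itertools import product
from math import comb, isclose


def binomial_by_enumeration(M, p):
    """Return [P_M(A_0), ..., P_M(A_M)] by summing P_M({x}) over all x in Omega_M."""
    probs = [0.0] * (M + 1)
    # True stands for an occurrence of A, False for an occurrence of its complement.
    for x in product((True, False), repeat=M):
        i = sum(x)  # number of occurrences of A in the sequence x
        probs[i] += p**i * (1 - p) ** (M - i)
    return probs


def binomial_closed_form(M, p):
    """Return [C(M, i) p^i (1 - p)^(M - i) for i = 0, ..., M]."""
    return [comb(M, i) * p**i * (1 - p) ** (M - i) for i in range(M + 1)]


M, p = 15, 0.5
enumerated = binomial_by_enumeration(M, p)
closed = binomial_closed_form(M, p)

p_five_heads = closed[5]  # probability of five heads in 15 tosses
print(p_five_heads, sum(closed))
```

The enumeration visits 2^M sequences, so it is only practical for small M; the closed form is what one would use in practice.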
Theorem 2.
$$\sum_{i \geq 1} P(A_i \mid B).$$
If moreover $A_1 \cup A_2 \cup \cdots \supset B$ then $\sum_{i \geq 1} P(A_i \mid B) = 1$; this is known as the theorem of total probability.^4
(a) $P(B) = \sum_{j=1}^{n} P(A_j) P(B \mid A_j)$.
(b) $P(A_i \mid B) = \dfrac{P(A_i) P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j) P(B \mid A_j)}$, for any i = 1, 2, ..., n; this is known as Bayes theorem.^5
Proof. We prove the second part of the theorem only; the proof of the first part is left as an exercise.
$B = B \cap \left( \bigcup_{j=1}^{n} A_j \right) = \bigcup_{j=1}^{n} (B \cap A_j)$ and all the $B \cap A_j$ are pairwise disjoint. It follows that $P(B) = \sum_{j=1}^{n} P(B \cap A_j)$, from which the result follows.
Bayes theorem is useful in situations where $P(B \mid A_j)$ is given for every j but $P(A_j \mid B)$ is not directly known. The following example illustrates this.
Example 2.1.3. (Refer to figure 2.1) We are given three boxes $A_1$, $A_2$, $A_3$ containing coloured balls as follows:
$A_1$: 2 red balls and 3 black balls,
$A_2$: 3 red balls and 5 black balls,
$A_3$: 4 red balls and 4 black balls.
The (random) experiment consists of first choosing a box at random among the three boxes (equiprobably, i.e. each has a probability 1/3 of being chosen) and then drawing a ball at random from the chosen box. Let B denote the event “a red ball has been drawn”. Calculate the probability that the ball was drawn from box number 2 given that it is red, i.e. $P(A_2 \mid B)$.
^4 cf. Wozencraft & Jacobs, page 31
^5 cf. Wozencraft & Jacobs, problem 2.
Solution: From the data given in the problem we have that $P(A_i) = 1/3$, i = 1, 2, 3, and clearly $B \subset \bigcup_{j=1}^{3} A_j$. It follows from the theorem that:
$$P(A_2 \mid B) = \frac{P(A_2) P(B \mid A_2)}{\sum_{j=1}^{3} P(A_j) P(B \mid A_j)}$$
Figure 2.1:
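The numerical evaluation of Example 2.1.3 can be sketched with exact rational arithmetic. The dictionary names below are illustrative assumptions; the priors and ball counts are those stated in the example, and the conditional probability of a red ball is taken as the fraction of red balls in the chosen box.

```python
from fractions import Fraction

# Prior probability of choosing each box, and (red, black) ball counts per box.
prior = {"A1": Fraction(1, 3), "A2": Fraction(1, 3), "A3": Fraction(1, 3)}
balls = {"A1": (2, 3), "A2": (3, 5), "A3": (4, 4)}

# P(B | A_j): probability of drawing a red ball from box A_j.
p_red_given_box = {
    box: Fraction(red, red + black) for box, (red, black) in balls.items()
}

# Part (a) of the theorem: P(B) = sum over j of P(A_j) P(B | A_j).
p_red = sum(prior[box] * p_red_given_box[box] for box in prior)

# Part (b), Bayes theorem: P(A_i | B) = P(A_i) P(B | A_i) / P(B).
posterior = {box: prior[box] * p_red_given_box[box] / p_red for box in prior}

print(p_red, posterior["A2"])
```

Under these assumptions the sketch gives P(B) = 17/40 and P(A_2 | B) = 5/17, and the three posterior probabilities sum to 1 as the theorem of total probability requires.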
Example 2.1.4. A binary source transmits at random one of two messages $m_0$ or $m_1$, with probabilities $P(m_0) = 1/3$, $P(m_1) = 2/3$. The message is fed through a random (noisy) channel whose input and output are respectively denoted TX and RX, and with the following transition probabilities:
P(RX = 0 | TX = m_0) = 0.99
P(RX = 1 | TX = m_0) = 0.01
P(RX = 0 | TX = m_1) = 0.01
P(RX = 1 | TX = m_1) = 0.99
Calculate
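Quantities of this kind follow from the theorem above applied to the data of Example 2.1.4. The sketch below is a hedged illustration: the two quantities it computes, the probability of each received symbol and the probability of each message given the received symbol, are chosen here as assumptions, and the names `p_rx` and `posterior` are hypothetical.

```python
from fractions import Fraction

# A priori message probabilities from the example.
prior = {"m0": Fraction(1, 3), "m1": Fraction(2, 3)}

# transition[(rx, tx)] = P(RX = rx | TX = tx), from the example.
transition = {
    (0, "m0"): Fraction(99, 100),
    (1, "m0"): Fraction(1, 100),
    (0, "m1"): Fraction(1, 100),
    (1, "m1"): Fraction(99, 100),
}


def p_rx(rx):
    """P(RX = rx) = sum over tx of P(tx) P(RX = rx | TX = tx)."""
    return sum(prior[tx] * transition[(rx, tx)] for tx in prior)


def posterior(tx, rx):
    """Bayes theorem: P(TX = tx | RX = rx)."""
    return prior[tx] * transition[(rx, tx)] / p_rx(rx)


print(p_rx(0), p_rx(1), posterior("m0", 0), posterior("m1", 1))
```

With these numbers P(RX = 0) = 101/300 and P(TX = m_0 | RX = 0) = 99/101.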
Solution: we first notice that $A = (A \cap B) \cup (A \cap \bar{B})$. Next,
$$\begin{aligned}
P(A \cap \bar{B}) &= P(A) - P(A \cap B) \\
&= P(A) - P(A) P(B) \\
&= P(A)(1 - P(B)) \\
&= P(A) P(\bar{B})
\end{aligned}$$
2.2 Communication problem
Refer to Wozencraft & Jacobs, pages 33 to 37. In this example, a probability system is constructed by combining a random experiment (message source), a random transformation (digital communication channel) and a deterministic transformation (decision element).
Sample space: $\Omega_{\text{source}} = \{m_0, m_1\}$
Class of events: $2^{\Omega_{\text{source}}} = \{\emptyset, \{m_0\}, \{m_1\}, \{m_0, m_1\}\}$
Probability function:
$$P_S(\emptyset) = 0, \quad P_S(\{m_0\}) = P_{m_0}, \quad P_S(\{m_1\}) = P_{m_1}, \quad P_S(\Omega_{\text{source}}) = P_S(\{m_0, m_1\}) = P_{m_0} + P_{m_1} = 1.$$
$P_{m_0}$, $P_{m_1}$ are called a priori message probabilities. $(\Omega_{\text{source}}, 2^{\Omega_{\text{source}}}, P_S)$ satisfies all the axioms of a probability system.
$$\sum_{\text{all } j} P[r_j \mid m_i] = 1$$
for all i. Combining the source and the discrete communication channel we define:
Sample space:
$$\Omega_{DCC} = \{(m_0, r_0), (m_0, r_1), (m_0, r_2), (m_1, r_0), (m_1, r_1), (m_1, r_2)\}$$
Class of events: $2^{\Omega_{DCC}} = \{\emptyset, \{(m_0, r_0)\}, \{(m_0, r_1)\}, \ldots, \Omega_{DCC}\}$
Probability function:
$$P_{DCC}(\{(m_i, r_j)\}) = P_S(\{m_i\})\, P[r_j \mid m_i],$$
for every i and every j. One easily verifies that (you should verify this):
$$P_{DCC}(\Omega_{DCC}) = \sum_{\text{all } i} \sum_{\text{all } j} P_{DCC}(\{(m_i, r_j)\}) = 1.$$
$(\Omega_{DCC}, 2^{\Omega_{DCC}}, P_{DCC})$ forms a probability system.
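The construction of the combined source and channel system can be sketched as follows. All numerical values are hypothetical and are not taken from the notes or from W&J; the names `P_S`, `P_r_given_m`, `P_DCC` and `prob` are illustrative. The sketch builds the probability of each sample point as the product of the a priori probability and the transition probability, then checks that the total probability is 1.

```python
from fractions import Fraction

# Hypothetical a priori message probabilities (assumption, for illustration only).
P_S = {"m0": Fraction(2, 5), "m1": Fraction(3, 5)}

# Hypothetical transition probabilities P[r_j | m_i]; each row must sum to 1.
P_r_given_m = {
    "m0": {"r0": Fraction(7, 10), "r1": Fraction(2, 10), "r2": Fraction(1, 10)},
    "m1": {"r0": Fraction(1, 10), "r1": Fraction(3, 10), "r2": Fraction(6, 10)},
}
assert all(sum(row.values()) == 1 for row in P_r_given_m.values())

# P_DCC({(m_i, r_j)}) = P_S({m_i}) P[r_j | m_i] for every sample point.
P_DCC = {
    (m, r): P_S[m] * P_r_given_m[m][r] for m in P_S for r in P_r_given_m[m]
}


def prob(event):
    """Probability of an event, i.e. of any set of sample points of Omega_DCC."""
    return sum(P_DCC[point] for point in event)


total = prob(P_DCC.keys())  # P_DCC(Omega_DCC)
print(total)
```

The total equals 1 for any choice of numbers whose rows and priors each sum to 1, which is the verification suggested in the text.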
$$\hat{m} : \{r_0, r_1, r_2\} \to \{m_0, m_1\}, \qquad \hat{m} : r_0 \mapsto m_0, \quad \hat{m} : r_1 \mapsto m_1, \quad \hat{m} : r_2 \mapsto m_1$$
This is just one example; there are $2^3 = 8$ possible mappings!
This mapping $\hat{m}$ can be used together with the probability system $(\Omega_{DCC}, 2^{\Omega_{DCC}}, P_{DCC})$ to define the probability system $(\Omega_D, 2^{\Omega_D}, P_D)$ described below:
Sample space: $\Omega_D = \{(m_0, m_0), (m_0, m_1), (m_1, m_0), (m_1, m_1)\}$ where the first component of each pair denotes the message transmitted and the second component denotes the decision made.
Events: $2^{\Omega_D} = \{\emptyset, \{(m_0, m_0)\}, \{(m_0, m_1)\}, \ldots, \Omega_D\}$
Probability function: completely specified by the probability function $P_{DCC}(\cdot)$, the decision mapping $\hat{m}(\cdot)$, the axioms and the following:
$$\begin{aligned}
P_D(\{(m_0, m_0)\}) &= P_{DCC}(\{(m_0, r_0)\}) \\
P_D(\{(m_0, m_1)\}) &= P_{DCC}(\{(m_0, r_1), (m_0, r_2)\}) \\
P_D(\{(m_1, m_0)\}) &= P_{DCC}(\{(m_1, r_0)\}) \\
P_D(\{(m_1, m_1)\}) &= P_{DCC}(\{(m_1, r_1), (m_1, r_2)\})
\end{aligned}$$
(this corresponds to the above example of mapping $\hat{m}(\cdot)$).
The probability system $(\Omega_D, 2^{\Omega_D}, P_D)$ describes the overall operation of the discrete communication channel. Its performance is measured by its probability of correct decision $P(C)$ (page 35, W&J). We call $C = \{(m_0, m_0), (m_1, m_1)\} \subset \Omega_D$ the correct decision event. For the above example of mapping $\hat{m}(\cdot)$, the correct decision event $C \subset \Omega_D$ corresponds to $\tilde{C} \subset \Omega_{DCC}$ given by:
$$\tilde{C} = \{(m_0, r_0), (m_1, r_1), (m_1, r_2)\}$$
In general, $\tilde{C}$ is given by:
$$\tilde{C} = \{(m_i, r_j) : \hat{m}(r_j) = m_i,\ i \in \{0, \ldots, M - 1\},\ j \in \{0, \ldots, J - 1\}\} = \{(\hat{m}(r_j), r_j) : j \in \{0, \ldots, J - 1\}\}$$
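The probability of correct decision can be evaluated for a given decision mapping by summing the probabilities of the sample points (m̂(r_j), r_j). The sketch below uses hypothetical a priori and transition probabilities (assumptions, not values from the notes or from W&J) and hypothetical names. It evaluates P(C) for the example mapping and, as an illustration, for each of the 8 possible mappings.

```python
from fractions import Fraction
from itertools import product

# Hypothetical numbers (assumptions, for illustration only).
P_S = {"m0": Fraction(2, 5), "m1": Fraction(3, 5)}
P_r_given_m = {
    "m0": {"r0": Fraction(7, 10), "r1": Fraction(2, 10), "r2": Fraction(1, 10)},
    "m1": {"r0": Fraction(1, 10), "r1": Fraction(3, 10), "r2": Fraction(6, 10)},
}
messages = ["m0", "m1"]
outputs = ["r0", "r1", "r2"]

# P_DCC({(m_i, r_j)}) = P_S({m_i}) P[r_j | m_i].
P_DCC = {(m, r): P_S[m] * P_r_given_m[m][r] for m in messages for r in outputs}


def prob_correct(m_hat):
    """P(C): sum of P_DCC over the sample points (m_hat(r_j), r_j)."""
    return sum(P_DCC[(m_hat[r], r)] for r in outputs)


# The example mapping of the text: r0 -> m0, r1 -> m1, r2 -> m1.
example = {"r0": "m0", "r1": "m1", "r2": "m1"}
p_c_example = prob_correct(example)

# All 2^3 = 8 decision mappings from {r0, r1, r2} to {m0, m1}.
all_mappings = [
    dict(zip(outputs, choice)) for choice in product(messages, repeat=len(outputs))
]
best = max(all_mappings, key=prob_correct)

print(p_c_example, best, prob_correct(best))
```

With these hypothetical numbers the example mapping yields P(C) = 41/50, and no other mapping among the 8 does better; a different choice of probabilities would in general favour a different mapping.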