Formula Sheet for Probability, Study Guides, Projects, Research of Probability and Stochastic Processes

2019/2020
Uploaded on 03/09/2020

moin-q
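For mutually exclusive events A1, A2:
$$\Pr\{A_1 \cup A_2\} = \Pr\{A_1\} + \Pr\{A_2\}$$
In general, we have:
$$\Pr\{A_1 \cup A_2\} = \Pr\{A_1\} + \Pr\{A_2\} - \Pr\{A_1 \cap A_2\}$$
Conditional probability:
$$\Pr\{A \mid B\} = \frac{\Pr\{A \cap B\}}{\Pr\{B\}}, \quad \Pr\{B\} > 0$$
Two events A and B are independent if and only if
$$\Pr\{A \mid B\} = \Pr\{A\}$$
For independent events:
$$\Pr\{A \cap B\} = \Pr\{A\}\,\Pr\{B\}$$
The Law of Total Probability
For events B1, B2, ..., BN that partition the sample space:
$$\Pr\{A\} = \Pr\{A \mid B_1\}\Pr\{B_1\} + \Pr\{A \mid B_2\}\Pr\{B_2\} + \dots + \Pr\{A \mid B_N\}\Pr\{B_N\}$$
Bayes Theorem
$$\Pr\{B_i \mid A\} = \frac{\Pr\{A \mid B_i\}\Pr\{B_i\}}{\sum_{j=1}^{N} \Pr\{A \mid B_j\}\Pr\{B_j\}} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$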
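The law of total probability and Bayes' theorem can be checked with exact fractions. The sketch below is only an illustration: the three-machine scenario, the numbers, and the function names `total_probability` and `bayes_posterior` are assumptions, not part of the sheet.

```python
from fractions import Fraction

def total_probability(likelihoods, priors):
    """Pr{A} = sum_j Pr{A|B_j} Pr{B_j} over a partition B_1..B_N."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes_posterior(likelihoods, priors):
    """Pr{B_i|A} for every i: likelihood * prior / evidence."""
    evidence = total_probability(likelihoods, priors)
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Hypothetical numbers: machines B1..B3 make 50%, 30%, 20% of the parts,
# with defect rates Pr{A|B_j} of 1%, 2%, 5%.
priors = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

evidence = total_probability(likelihoods, priors)   # Pr{A}
posterior = bayes_posterior(likelihoods, priors)    # Pr{B_i|A}
print(evidence, posterior)
```

The posterior probabilities sum to 1 because the evidence term in the denominator is exactly the sum of the numerators.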
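Properties of a PMF
$$P1:\ 0 \le p_X(a_k) \le 1$$
$$P2:\ \sum_k p_X(a_k) = 1$$
Properties of a CDF
P1, P2, P3 are the same as F1, F2, F3 listed under "Properties of CDF" below **
$$P4:\ F_X(x_{i+1}) = F_X(x_i) + p_X(x_{i+1})$$
Expected Value of a Discrete RV
$$E[X] = \mu_X = \sum_{x \in S_X} x\, p_X(x)$$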
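Binomial Random Variable
$$\Pr\{X = k\} = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n$$
$$\mu = np, \quad \sigma^2 = np(1-p)$$
Geometric Random Variable
$$\Pr\{Z = k\} = p(1-p)^{k-1}, \quad k = 1, 2, \dots$$
$$\mu = 1/p, \quad \sigma^2 = (1-p)/p^2$$
Memoryless Property
$$\Pr\{Z = j + k \mid Z > k\} = \Pr\{Z = j\}$$
Poisson Random Variable:
Counts the total number of arrivals in a unit time interval
$$\Pr\{N = k\} = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \dots$$
$$\mu = \lambda, \quad \sigma^2 = \lambda$$
λ is the average number of arrivals per unit time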
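As a sanity check on the means and variances quoted for these three discrete RVs, the pmfs can be summed numerically. This is a minimal sketch with assumed parameter values; the geometric RV uses the support k = 1, 2, ... and both infinite sums are truncated where the tail is negligible.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def geometric_pmf(k, p):
    # Support k = 1, 2, ... (number of trials up to and including the first success)
    return p * (1 - p)**(k - 1)

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

def mean_and_variance(pmf, support):
    """Mean and variance via E[X] and E[X^2] - E[X]^2 over a (truncated) support."""
    mean = sum(k * pmf(k) for k in support)
    second_moment = sum(k * k * pmf(k) for k in support)
    return mean, second_moment - mean**2

n, p, lam = 10, 0.3, 4.0   # illustrative values only
binom_stats = mean_and_variance(lambda k: binomial_pmf(k, n, p), range(n + 1))
geom_stats = mean_and_variance(lambda k: geometric_pmf(k, p), range(1, 400))
pois_stats = mean_and_variance(lambda k: poisson_pmf(k, lam), range(100))
print(binom_stats, geom_stats, pois_stats)
```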
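Properties of continuous RV PDF:
$$F1:\ f_X(x) \ge 0$$
$$F2:\ \int_{-\infty}^{\infty} f_X(x)\,dx = 1$$
$f_X(x)$ could be > 1, but $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ must still hold
$$F3:\ \Pr\{a \le X \le b\} = \int_{a}^{b} f_X(x)\,dx$$
Continuous RV CDF:
$$F_X(t) = \Pr\{X \le t\} = \int_{-\infty}^{t} f_X(x)\,dx$$
$$f_X(x) = \frac{d}{dx} F_X(x)$$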
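Properties of CDF: **
$$F1:\ 0 \le F_X(x) \le 1$$
$$F2:\ a \le b \Rightarrow F_X(a) \le F_X(b)$$
$$F3:\ \lim_{x \to -\infty} F_X(x) = 0, \quad \lim_{x \to \infty} F_X(x) = 1$$
$$F4:\ \Pr\{X = a\} = \int_{a}^{a} f_X(x)\,dx = 0 \quad \text{(continuous RV)}$$
For a very small value of ε:
$$\Pr\left\{a - \tfrac{\epsilon}{2} \le X \le a + \tfrac{\epsilon}{2}\right\} = \int_{a - \epsilon/2}^{a + \epsilon/2} f_X(x)\,dx \approx \epsilon\, f_X(a)$$
$$F5:\ \Pr\{a < X \le b\} = F_X(b) - F_X(a)$$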
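Properties of Expected Value/Variance
$$E(X + Y) = E(X) + E(Y)$$
$$E(cX) = c\,E(X)$$
Expected number of successes in n Bernoulli trials:
$$E(S_n) = np$$
If X, Y are independent random variables,
$$E(XY) = E(X)\,E(Y)$$
$$\operatorname{Var}(cX) = c^2 \operatorname{Var}(X)$$
$$\operatorname{Var}(X + c) = \operatorname{Var}(X)$$
Expected Value of continuous RV:
$$E_X[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$
$$\mu_X = \int_{-\infty}^{\infty} x f_X(x)\,dx$$
Variance of continuous RV:
$$\operatorname{Var}\{X\} = \sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx = E[(X - \mu_X)^2]$$
$$= E[X^2] - (E[X])^2 \quad \text{(2nd moment} - \text{(1st moment)}^2\text{)}$$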
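Uniform Distribution:
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
$$F_X(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$
Also called the Rectangular distribution
$$E\{X\} = (b + a)/2, \quad \operatorname{Var}\{X\} = (b - a)^2/12$$
Exponential Random Variable:
Measures the inter-arrival time between events
$$E[X] = \mu = 1/\lambda, \quad \sigma^2 = 1/\lambda^2$$
Derived from the Poisson RV:
$$F_X(t) = 1 - e^{-\lambda t}, \quad t \ge 0$$
$$f_X(t) = \lambda e^{-\lambda t}, \quad t \ge 0$$
The exponential RV is a limiting case of a geometric RV.
Like the geometric RV, the exponential RV also possesses
the memoryless property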
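The memoryless property shared by the exponential and geometric RVs can be verified directly from their tail probabilities. A small sketch, assuming illustrative parameter values and the geometric support k = 1, 2, ...:

```python
import math

def exp_tail(t, lam):
    """Pr{X > t} = 1 - F_X(t) = exp(-lam * t) for t >= 0."""
    return math.exp(-lam * t)

def geom_tail(k, p):
    """Pr{Z > k} = (1 - p)**k for k = 0, 1, 2, ..."""
    return (1 - p)**k

lam, s, t = 0.5, 2.0, 3.0
cond_exp = exp_tail(s + t, lam) / exp_tail(s, lam)    # Pr{X > s+t | X > s}

p, k, j = 0.2, 4, 3
cond_geom = geom_tail(k + j, p) / geom_tail(k, p)     # Pr{Z > k+j | Z > k}

print(cond_exp, exp_tail(t, lam))    # the two values agree
print(cond_geom, geom_tail(j, p))    # the two values agree
```

In both cases the conditional tail equals the unconditional tail: having already waited does not change the distribution of the remaining wait.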
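Laplacian Random Variable:
$$f_X(x) = \frac{\alpha}{2} e^{-\alpha |x|}$$
$$\mu = 0, \quad \sigma^2 = 2/\alpha^2$$
Erlang Distribution:
For systems with r stages, the PDF and CDF are:
$$f_R(t) = \frac{\lambda^r t^{r-1} e^{-\lambda t}}{(r - 1)!}, \quad t \ge 0$$
$$F_R(t) = 1 - \sum_{k=0}^{r-1} \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$
Gamma Distribution:
Replacing r by α (non-integer):
$$f_\Gamma(t) = \frac{\lambda^\alpha t^{\alpha - 1} e^{-\lambda t}}{\Gamma(\alpha)}, \quad t \ge 0$$
$$\text{where: } \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\,dx$$
α is the shape parameter and λ is the rate (inverse scale) parameter
$$\mu = \alpha/\lambda, \quad \sigma^2 = \alpha/\lambda^2$$
Properties:
$$\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$$
$$\Gamma(\alpha) = (\alpha - 1)! \quad \text{for positive integer } \alpha$$
$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$
$$\int_{0}^{\infty} x^{\alpha - 1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^\alpha}$$
At α = 1 the Gamma pdf reduces to the exponential pdf:
$$f_\Gamma(t) = \lambda e^{-\lambda t}$$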
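Gaussian or Normal Distribution:
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$
Standard normal distribution (μ = 0, σ² = 1):
$$N(\mu, \sigma^2) = N(0, 1)$$
Gaussian Distribution & The Error Function (erf):
Here erf(x) denotes the standard normal CDF $\Phi(x)$, not the conventional error function $\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$:
$$\operatorname{erf}(x) = \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$$
Properties of erf:
$$\Phi(0) = \tfrac{1}{2}$$
$$\Phi(-x) = 1 - \Phi(x)$$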
Normal Approximation of a Binomial RV:
For large n, the binomial pmf is approximated by a normal density:
$$b(k; n, p) = \Pr\{X = k\} \approx \frac{1}{\sqrt{2\pi n p (1 - p)}} \exp\left\{-\frac{(k - np)^2}{2 n p (1 - p)}\right\}$$
This approximation is also called the Laplace approximation.
It holds provided p
is not too close to 0 or 1.
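To get a feel for how good the normal approximation is, the exact binomial pmf can be compared with the approximating density over all k. A minimal sketch; n = 100 and p = 0.5 are assumed values chosen so that p is well away from 0 and 1.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def laplace_approx(k, n, p):
    """Normal density with mean n*p and variance n*p*(1-p), evaluated at k."""
    var = n * p * (1 - p)
    return math.exp(-(k - n * p)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n, p = 100, 0.5   # illustrative values only
max_abs_error = max(
    abs(binomial_pmf(k, n, p) - laplace_approx(k, n, p)) for k in range(n + 1)
)
print(binomial_pmf(50, n, p), laplace_approx(50, n, p), max_abs_error)
```

Repeating the comparison with p close to 0 or 1 (for example p = 0.01) gives a noticeably larger error relative to the pmf values, which is the reason for the caveat on p.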
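Moments of a Random Variable:
For the power function $Y = g(X) = X^n$:
$$E\{Y\} = E\{X^n\} = \begin{cases} \sum_i x_i^n\, p_X(x_i), & \text{discrete} \\ \int_{-\infty}^{\infty} x^n f_X(x)\,dx, & \text{continuous} \end{cases}$$
$E\{X^n\}$ is called the nth moment of X.
For two RVs X and Y, if $E\{X^n\} = E\{Y^n\}$ for all n (and the moments determine the distribution uniquely), then X and Y have the same distribution.
Nth Central Moments of a RV X:
$$E\{Y\} = E\{(X - \mu_X)^n\} = \begin{cases} \sum_i (x_i - \mu_X)^n\, p_X(x_i), & \text{discrete} \\ \int_{-\infty}^{\infty} (x - \mu_X)^n f_X(x)\,dx, & \text{continuous} \end{cases}$$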
Transform Domain Methods:
$$\Phi_X(\omega) = E\{e^{j\omega X}\}$$
Characteristic Function of a Continuous RV:
$$\Phi_X(\omega) = \int_{-\infty}^{\infty} f_X(x)\, e^{j\omega x}\,dx$$
The CF is the Fourier transform of the pdf of X (with a sign reversal in the exponent).
Characteristic Function of a Discrete RV:
$$\Phi_X(\omega) = \sum_k p_X(x_k)\, e^{j\omega x_k}$$
If the discrete RV takes integer values then:
$$\Phi_X(\omega) = \sum_k p_X(k)\, e^{j\omega k}$$
The CF of an integer-valued discrete RV is a periodic function of ω, with period 2π.
The PMF of a discrete RV can be derived from its characteristic function as:
$$p_X(k) = \frac{1}{2\pi} \int_{0}^{2\pi} \Phi_X(\omega)\, e^{-j\omega k}\,d\omega$$
Characteristic functions have the useful property that the moments of a RV X can be computed by differentiating the function with respect to ω and evaluating at ω = 0:
$$E\{X^n\} = \frac{1}{j^n} \frac{d^n}{d\omega^n} \Phi_X(\omega)\Big|_{\omega = 0}$$
For this reason it is also called the Moment Generating Function.
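Probability Generating Function:
For a non-negative, integer-valued discrete RV N, the PGF is defined as:
$$G_N(z) = E[z^N] = \sum_{k=0}^{\infty} z^k\, p_N(k)$$
The PGF is the z-transform of the PMF with a sign reversal in the exponent.
The PGF can be used to generate the probabilities of the integer RV as:
$$p_N(k) = \frac{1}{k!} \frac{d^k}{dz^k} G_N(z)\Big|_{z = 0}$$
The PGF can also be used to evaluate the moments of the integer RV.
$$\frac{d}{dz} G_N(z) = \sum_{k=0}^{\infty} k z^{k-1}\, p_N(k)$$
Evaluating the 1st derivative at z = 1 gives the 1st moment.
$$\frac{d}{dz} G_N(z)\Big|_{z = 1} = G_N'(1) = \sum_{k=0}^{\infty} k\, p_N(k) = E\{N\}$$
Evaluating the 2nd derivative at z = 1 gives the 2nd factorial moment, from which the 2nd moment follows.
$$\frac{d^2}{dz^2} G_N(z)\Big|_{z = 1} = G_N''(1) = E[N(N - 1)] = E[N^2] - E[N] = E[N^2] - G_N'(1)$$
We can compute the variance as well:
$$\operatorname{var}(N) = E[N^2] - E[N]^2 = G_N''(1) + G_N'(1) - (G_N'(1))^2$$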
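The PGF route to the mean and variance can be tried on a pmf whose moments are already known. The sketch below evaluates the derivative sums G'(1) and G''(1) term by term for a binomial pmf; the helper name `pgf_derivatives_at_one` and the parameter values are assumptions made for illustration.

```python
import math

def pgf_derivatives_at_one(pmf):
    """Return G'(1) and G''(1) for a pmf given as a list p[0], p[1], ..."""
    g1 = sum(k * pk for k, pk in enumerate(pmf))             # sum of k p(k)
    g2 = sum(k * (k - 1) * pk for k, pk in enumerate(pmf))   # sum of k(k-1) p(k)
    return g1, g2

n, p = 12, 0.25   # illustrative binomial parameters
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

g1, g2 = pgf_derivatives_at_one(pmf)
mean = g1                      # E{N} = G'(1)
variance = g2 + g1 - g1**2     # var(N) = G''(1) + G'(1) - (G'(1))^2
print(mean, variance)          # compare with n*p and n*p*(1-p)
```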

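Markov Inequality
For a non-negative RV X and t > 0:
$$\Pr\{X \ge t\} \le \frac{E[X]}{t}$$
Chebyshev Inequality
For a RV X with mean μ and finite variance σ², and t > 0:
$$\Pr\{|X - \mu| \ge t\} \le \frac{\sigma^2}{t^2}$$
The Markov and Chebyshev inequalities apply to all RVs that meet these conditions, whatever their distribution.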
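Both inequalities are loose but distribution-free bounds, which is easy to see by comparing them with exact tail probabilities. The sketch below uses an exponential RV with rate 1 as an assumed test case; any non-negative RV with finite variance would do.

```python
import math

lam = 1.0                          # assumed exponential rate
mean, var = 1 / lam, 1 / lam**2    # mean 1/lam, variance 1/lam^2

def exact_tail(t):
    """Pr{X >= t} for the exponential RV."""
    return math.exp(-lam * t)

def markov_bound(t):
    return mean / t

def exact_deviation(t):
    """Pr{|X - mean| >= t} for the exponential RV."""
    upper = math.exp(-lam * (mean + t))                              # Pr{X >= mean + t}
    lower = 1 - math.exp(-lam * (mean - t)) if t < mean else 0.0     # Pr{X <= mean - t}
    return upper + lower

def chebyshev_bound(t):
    return var / t**2

for t in (0.5, 1.0, 2.0, 3.0, 5.0):
    print(t, exact_tail(t), markov_bound(t), exact_deviation(t), chebyshev_bound(t))
```

A bound larger than 1 (as happens for small t) is still valid, just uninformative.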

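Vector Random Variables
The joint cdf of a pair of RVs X and Y is defined as:
$$F_{X,Y}(x, y) = \Pr\{X \le x,\ Y \le y\} = \Pr\{\{X \le x\} \cap \{Y \le y\}\}$$
Joint distributions are also called compound distributions.
Properties of Joint CDFs
$$F1:\ 0 \le F_{X,Y}(a, b) \le 1$$
$$F2:\ F_{X,Y}(x_1, y_1) \le F_{X,Y}(x_2, y_2) \quad \text{if } x_1 \le x_2 \text{ and } y_1 \le y_2$$
$$F3:\ \lim_{a,\, b \to \infty} F_{X,Y}(a, b) = 1$$
$$F4:\ \lim_{a \text{ or } b \to -\infty} F_{X,Y}(a, b) = 0$$
$$F5:\ \lim_{a \to \infty} F_{X,Y}(a, b) = F_Y(b), \quad \lim_{b \to \infty} F_{X,Y}(a, b) = F_X(a)$$
$$F6:\ \Pr\{a < X \le b\ \&\ c < Y \le d\} = F_{X,Y}(b, d) - F_{X,Y}(a, d) - F_{X,Y}(b, c) + F_{X,Y}(a, c)$$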

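Joint PDFs
$$f_{X,Y}(x, y) = \frac{\partial^2}{\partial x\, \partial y} F_{X,Y}(x, y)$$
$$F_{X,Y}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v)\,du\,dv$$
$$\Pr\{a < X \le b,\ c < Y \le d\} = \int_{c}^{d} \int_{a}^{b} f_{X,Y}(x, y)\,dx\,dy$$
If a point is selected uniformly from an area, then X and Y have the joint pdf
$$f_{X,Y}(x, y) = \frac{1}{\text{Area}} \quad \text{inside the region, and 0 outside}$$
A joint pdf must satisfy the following property:
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx\,dy = 1$$
The marginal pdfs are:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$$
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$$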

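Independent Random Variables
Two random variables X and Y are independent if:
$$F_{X,Y}(x, y) = F_X(x)\,F_Y(y)$$
$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y) \quad \text{(continuous RVs)}$$
$$p_{X,Y}(x, y) = p_X(x)\,p_Y(y) \quad \text{(discrete RVs)}$$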

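Conditional Probability for Random Variables
$$F_Y(y \mid x) = \frac{\int_{-\infty}^{y} f_{X,Y}(x, v)\,dv}{f_X(x)}$$
$$f_Y(y \mid x) = \frac{d}{dy} F_Y(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
$$f_{X,Y}(x, y) = f_Y(y \mid x)\, f_X(x)$$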

𝑍|𝑋

𝑌

𝑦=𝑔(𝑦)

− 1 , 𝑓 𝑍|𝑌

𝑋

𝑥=𝑔(𝑥)

− 1

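For a discrete RV, the same relation can be stated as:
$$p_Y(y \mid x) = \Pr\{Y = y \mid X = x\} = \frac{p_{X,Y}(x, y)}{p_X(x)}$$
The conditional expectation of Y given X = x is:
$$E\{Y \mid X = x\} = \int_{-\infty}^{\infty} y\, f_Y(y \mid x)\,dy \quad \text{(continuous)}, \qquad E\{Y \mid X = x\} = \sum_{y} y\, p_Y(y \mid x) \quad \text{(discrete)}$$
Properties of Conditional Expectation
$$E\{Y\} = E\{E\{Y \mid X\}\}$$
$$E\{Y^k\} = E\{E\{Y^k \mid X\}\}$$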

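Multiple Random Variables
Joint pmf of n discrete random variables is:
$$p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \Pr\{X_1 = x_1, \dots, X_n = x_n\}$$
Conditional pmfs are obtained as
$$p_{X_n}(x_n \mid x_1, \dots, x_{n-1}) = \frac{p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_{n-1}, x_n)}{p_{X_1, X_2, \dots, X_{n-1}}(x_1, x_2, \dots, x_{n-1})}$$
Conditional pdf from joint pdfs is:
$$f_{X_n}(x_n \mid x_1, \dots, x_{n-1}) = \frac{f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_{n-1}, x_n)}{f_{X_1, X_2, \dots, X_{n-1}}(x_1, x_2, \dots, x_{n-1})}$$
Repeatedly applying this expression gives:
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_{n-1}, x_n) = f_{X_n}(x_n \mid x_1, \dots, x_{n-1})\, f_{X_{n-1}}(x_{n-1} \mid x_1, \dots, x_{n-2}) \cdots f_{X_2}(x_2 \mid x_1)\, f_{X_1}(x_1)$$
Marginal pmf of a RV is obtained by summing over the images of all other RVs
$$p_{X_1}(x_1) = \Pr\{X_1 = x_1\} = \sum_{x_2} \cdots \sum_{x_n} p_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n)$$
Joint CDF of n continuous RVs is:
$$F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f_{X_1, X_2, \dots, X_n}(u_1, u_2, \dots, u_n)\,du_1\,du_2 \cdots du_n$$
Conversely, joint pdf is then obtained as
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \frac{\partial^n}{\partial x_1\, \partial x_2 \cdots \partial x_n} F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n)$$
A single marginal pdf can be obtained as:
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n)\,dx_2 \cdots dx_n$$
A marginal pdf for a sub-vector RV can be obtained as:
$$f_{X_1, X_2, \dots, X_{n-1}}(x_1, x_2, \dots, x_{n-1}) = \int_{-\infty}^{\infty} f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_{n-1}, x_n)\,dx_n$$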

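Moments of Functions of Multiple RVs
$$E\{g(X, Y)\} = \begin{cases} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f_{X,Y}(x, y)\,dx\,dy, & \text{continuous} \\ \sum_i \sum_n g(x_i, y_n)\, p_{X,Y}(x_i, y_n), & \text{discrete} \end{cases}$$
The jk-th joint moment of two RVs, X and Y, is given as
$$E\{X^j Y^k\} = \begin{cases} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^j y^k\, f_{X,Y}(x, y)\,dx\,dy, & \text{continuous} \\ \sum_i \sum_n x_i^j\, y_n^k\, p_{X,Y}(x_i, y_n), & \text{discrete} \end{cases}$$
Setting j = 0 gives the moments of Y.
Similarly, k = 0 yields the moments of X.
The (j = 1, k = 1) moment, E{XY}, is generally called the correlation of X and Y.
If E{XY} = 0, then X and Y are orthogonal; if COV{X, Y} = 0, they are uncorrelated.
The jk-th central moment of two RVs, X and Y, is:
$$E\{(X - E\{X\})^j\, (Y - E\{Y\})^k\}$$
Setting j = 0, k = 2 gives the variance of Y.
Similarly, j = 2, k = 0 gives the variance of X.
The (j = 1, k = 1) central moment is generally called the covariance of X and Y.
$$\operatorname{COV}\{X, Y\} = E\{(X - E\{X\})(Y - E\{Y\})\}$$
$$\operatorname{COV}\{X, Y\} = E\{XY\} - E\{X\}\,E\{Y\}$$
Correlation Coefficient
$$\rho_{X,Y} = \frac{\operatorname{COV}\{X, Y\}}{\sigma_X \sigma_Y} = \frac{E\{XY\} - E\{X\}E\{Y\}}{\sigma_X \sigma_Y}, \quad -1 \le \rho_{X,Y} \le 1$$
The correlation coefficient is a normalized measure that quantifies the amount of linear dependence between two RVs.
The correlation coefficient is a measure of the degree to which
a linear relationship exists between two RVs.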
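The correlation, covariance, and correlation coefficient can all be computed from a joint pmf with one generic expectation helper. In this sketch the joint pmf values and the helper name `expectation` are assumptions chosen for illustration.

```python
import math

# Hypothetical joint pmf p_{X,Y}(x, y) on a 2x2 grid (values chosen for illustration).
joint_pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expectation(g):
    """E{g(X, Y)} = sum over (x, y) of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * prob for (x, y), prob in joint_pmf.items())

ex = expectation(lambda x, y: x)
ey = expectation(lambda x, y: y)
correlation = expectation(lambda x, y: x * y)        # E{XY}, the (1, 1) joint moment
covariance = correlation - ex * ey                   # COV{X, Y} = E{XY} - E{X}E{Y}
var_x = expectation(lambda x, y: (x - ex) ** 2)      # (2, 0) central moment
var_y = expectation(lambda x, y: (y - ey) ** 2)      # (0, 2) central moment
rho = covariance / math.sqrt(var_x * var_y)          # correlation coefficient
print(correlation, covariance, rho)
```

Here E{XY} is nonzero and so is the covariance, so X and Y are neither orthogonal nor uncorrelated.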

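Properties of Moments of Multiple RVs
$$E\{X_1 + X_2 + \dots + X_n\} = E\{X_1\} + E\{X_2\} + \dots + E\{X_n\}$$
If all $X_i$s are independent:
$$E\{X_1 X_2 \cdots X_n\} = E\{X_1\}\, E\{X_2\} \cdots E\{X_n\}$$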

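Jointly Normal (Gaussian) Random Variables
$$f_{X_1, \dots, X_N}(x_1, x_2, \dots, x_N) = \frac{1}{(2\pi)^{N/2}\, |\Sigma|^{1/2}} \exp\left\{-\frac{1}{2} (\bar{x} - \bar{m})^T\, \Sigma^{-1}\, (\bar{x} - \bar{m})\right\}$$
Where:
$$\bar{m} = \begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_N \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1} & \sigma_{N2} & \cdots & \sigma_N^2 \end{bmatrix}$$
For the Bivariate case:
$$\bar{m} = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{bmatrix}$$
$$\rho_{X,Y}\, \sigma_X \sigma_Y = \sigma_{12}, \quad \text{i.e. } \rho_{X_1, X_2} = \rho_{1,2} \text{ and } \sigma_1 \sigma_2\, \rho_{1,2} = \sigma_{1,2} = \sigma_{2,1}$$
$$\Sigma = \begin{bmatrix} \sigma_1^2 & \rho_{1,2}\, \sigma_1 \sigma_2 \\ \rho_{1,2}\, \sigma_1 \sigma_2 & \sigma_2^2 \end{bmatrix}$$
$$\Sigma^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 (1 - \rho_{1,2}^2)} \begin{bmatrix} \sigma_2^2 & -\rho_{1,2}\, \sigma_1 \sigma_2 \\ -\rho_{1,2}\, \sigma_1 \sigma_2 & \sigma_1^2 \end{bmatrix}$$
$$f_{X,Y}(x, y) = \frac{\exp\left\{\dfrac{-1}{2(1 - \rho_{X,Y}^2)} \left[\left(\dfrac{x - m_x}{\sigma_X}\right)^2 - 2\rho_{X,Y} \left(\dfrac{x - m_x}{\sigma_X}\right)\left(\dfrac{y - m_y}{\sigma_Y}\right) + \left(\dfrac{y - m_y}{\sigma_Y}\right)^2\right]\right\}}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho_{X,Y}^2}}$$
If we set the bracketed quadratic form in x and y in the exponent of the above expression to a constant K, we obtain the equation for an ellipse, along which the pdf is constant:
$$f_{X,Y}(x, y) = \frac{\exp\left\{\dfrac{-1}{2(1 - \rho_{X,Y}^2)}\, K\right\}}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho_{X,Y}^2}}$$
n Jointly Normal Random Variables
RVs X1, X2, …, Xn are jointly normal if their pdf has the following form:
$$f_{\bar{X}}(\bar{x}) = f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\left\{-\frac{1}{2} (\bar{x} - \bar{m})^T\, \Sigma^{-1}\, (\bar{x} - \bar{m})\right\}$$
Where:
$$\bar{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \bar{m} = \begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{bmatrix}$$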

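pdf of General Transformations
For a one-to-one transformation of (X, Y) into (V, W) with inverse $x = h_1(v, w)$, $y = h_2(v, w)$:
$$f_{V,W}(v, w) = \frac{f_{X,Y}(h_1(v, w), h_2(v, w))}{|J(x, y)|}$$
$$f_{V,W}(v, w) = f_{X,Y}(h_1(v, w), h_2(v, w))\, |J(v, w)|$$
Where:
$$J(x, y) = \det \begin{bmatrix} \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \\ \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} \end{bmatrix}, \quad J(v, w) = \det \begin{bmatrix} \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\ \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \end{bmatrix}$$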

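Bernoulli's Theorem
$$\lim_{n \to \infty} \Pr\left\{\left|\frac{S_n}{n} - p\right| > \epsilon\right\} = 0$$
If we run a large number of Bernoulli trials (n), then the probability that the proportion of successes in the n trials differs from p by more than any fixed ε > 0 is arbitrarily small.
Weak Law of Large Numbers
$$\lim_{n \to \infty} \Pr\{|\bar{X}_n - \mu| > \delta\} = 0$$
If you conduct a large number of trials (n), then the probability that the sample mean ($\bar{X}_n$) deviates from the true mean (μ) by more than a small value (δ) approaches zero.
Strong Law of Large Numbers
$$\Pr\left\{\lim_{n \to \infty} \bar{X}_n = \mu\right\} = 1$$
If you conduct a large number of trials (n), then the probability that the sample mean ($\bar{X}_n$) converges to the true mean (μ) is 1.
Central Limit Theorem
The Sample Mean Version
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$$
The Sample Sum Version
$$\frac{S_n - n\mu}{\sigma \sqrt{n}} \xrightarrow{d} N(0, 1)$$
Let X1, X2, …, Xn be mutually independent RVs with finite means E{X_i} = μ_i and finite variances Var{X_i} = σ_i². Then, under mild regularity conditions on the variances:
$$\frac{\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}} \xrightarrow{d} N(0, 1)$$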
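The sample-mean version of the central limit theorem can be illustrated by simulation. This is a rough sketch, not a proof: the choice of uniform(0, 1) samples, the sample size n = 50, the number of trials, and the fixed seed are all assumptions made so that the run is repeatable.

```python
import math
import random
import statistics

rng = random.Random(0)               # fixed seed so the run is repeatable
n, trials = 50, 4000                 # assumed sample size and number of repetitions
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and standard deviation of uniform(0, 1)

def standardized_sample_mean():
    """(sample mean - mu) / (sigma / sqrt(n)) for n IID uniform(0, 1) draws."""
    xbar = sum(rng.random() for _ in range(n)) / n
    return (xbar - mu) / (sigma / math.sqrt(n))

z = [standardized_sample_mean() for _ in range(trials)]
z_mean = statistics.fmean(z)
z_std = statistics.pstdev(z)
frac_within_1 = sum(abs(v) <= 1 for v in z) / trials

# For N(0, 1) these should be near 0, 1, and about 0.683 respectively.
print(z_mean, z_std, frac_within_1)
```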

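The Random Walk Process:
Since the random walk process is an IID Sum RP (steps of +1 with probability p and −1 with probability 1 − p):
$$E\{D_n\} = n(2p - 1)$$
$$\operatorname{Var}\{D_n\} = 4np(1 - p)$$
$$C_D(n_1, n_2) = \min(n_1, n_2) \times 4p(1 - p)$$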

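Continuous-Time RP Wiener Process:
The Wiener Process is the sum of a very large number of IID RVs.
Therefore, according to the central limit theorem, the Wiener Process has a Gaussian PDF:
$$f_{X(t)}(x) = \frac{1}{\sqrt{2\pi\alpha t}} \exp\left\{-\frac{x^2}{2\alpha t}\right\}$$
Any RP having a (jointly) Gaussian PDF is called a Gaussian Random Process.
The Wiener Process has independent and stationary increments.
$$f_{X(t_1), \dots, X(t_k)}(x_1, \dots, x_k) = f_{X(t_1)}(x_1)\, f_{X(t_2 - t_1)}(x_2 - x_1) \cdots f_{X(t_k - t_{k-1})}(x_k - x_{k-1})$$
$$= \frac{\exp\left\{-\dfrac{1}{2}\left[\dfrac{x_1^2}{\alpha t_1} + \dfrac{(x_2 - x_1)^2}{\alpha (t_2 - t_1)} + \dots + \dfrac{(x_k - x_{k-1})^2}{\alpha (t_k - t_{k-1})}\right]\right\}}{\sqrt{(2\pi\alpha)^k\, t_1 (t_2 - t_1) \cdots (t_k - t_{k-1})}}$$
Since the Wiener Process is a zero-mean IID Sum RP, its covariance is:
$$C_X(t_1, t_2) = R_X(t_1, t_2) = \alpha \min(t_1, t_2)$$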

Poisson Random Process:
A Poisson Random Process, N(t), is the number of occurrences or arrivals of an event A in the [0, t] time interval.
Like the Poisson RV, the Poisson RP assumes that:
→ The average number of arrivals per unit time (e.g., per second), λ, is known.
→ The arrivals are independent of each other.
If we observe λ arrivals per second on average, then we should expect λt arrivals on average in the [0, t] time interval (i.e. t seconds).
Thus the Poisson RP has a Poisson pmf with parameter λt:
$$\Pr\{N(t) = k\} = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$
Recall that the Poisson distribution is derived from a Binomial distribution by dividing the [0, 1] interval into very (infinitely) small sub-intervals.
Then an arrival either takes place in a sub-interval or it does not
→ which can be treated as a Bernoulli random variable.
The Poisson RP is then counting the number of successes (arrivals).
Thus the Poisson RP is the continuous-time counterpart of the Binomial RP.
Like the Binomial RP, the Poisson RP also has independent and stationary increments. For $t_1 < t_2$:
$$\Pr\{N(t_1) = n_1,\ N(t_2) = n_2\} = \Pr\{N(t_1) = n_1\}\, \Pr\{N(t_2 - t_1) = n_2 - n_1\}$$
Since the Poisson RP is an IID Sum RP, its covariance is given by:
$$C_N(t_1, t_2) = \lambda \min(t_1, t_2)$$
Recall that the inter-arrival times of Poisson arrivals are exponential random variables.
Therefore, the sum of the inter-arrival times of the Poisson RP is a sum of independent exponential RVs.
We know that the sum of independent exponential RVs with a common rate has an Erlang PDF.
Therefore, the pdf of the sum of inter-arrival times $S_n = T_1 + T_2 + \dots + T_n$ is given as:
$$f_{S_n}(y) = \frac{\lambda^n y^{n-1} e^{-\lambda y}}{(n - 1)!}, \quad y \ge 0$$
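The link between the Poisson pmf and the Erlang pdf can be checked numerically: the n-th arrival happens by time t exactly when at least n arrivals fall in [0, t]. The sketch below integrates the Erlang pdf with a simple midpoint rule and compares it with the Poisson tail; the parameter values, function names, and step count are assumptions.

```python
import math

def poisson_process_pmf(k, lam, t):
    """Pr{N(t) = k} for a Poisson process with rate lam."""
    return (lam * t)**k * math.exp(-lam * t) / math.factorial(k)

def erlang_pdf(y, n, lam):
    """pdf of S_n, the sum of n IID exponential inter-arrival times."""
    return lam**n * y**(n - 1) * math.exp(-lam * y) / math.factorial(n - 1)

def erlang_cdf_numeric(t, n, lam, steps=20000):
    """Midpoint-rule integral of the Erlang pdf over [0, t]."""
    h = t / steps
    return h * sum(erlang_pdf((i + 0.5) * h, n, lam) for i in range(steps))

lam, t, n = 2.0, 1.5, 3   # illustrative values only
prob_nth_arrival_by_t = erlang_cdf_numeric(t, n, lam)                       # Pr{S_n <= t}
prob_at_least_n_arrivals = 1 - sum(poisson_process_pmf(k, lam, t) for k in range(n))
print(prob_nth_arrival_by_t, prob_at_least_n_arrivals)   # the two values agree
```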

Markov Chain:
Homogeneous Markov Chains:
A (time) homogeneous Markov Chain is one in which the state transition probabilities are independent of time, i.e.
Homogeneous
$$p_{ij} = \Pr\{X_{k+1} = j \mid X_k = i\} = \Pr\{X_{k+n+1} = j \mid X_{k+n} = i\}$$
Non-homogeneous
$$\Pr\{X_{k+1} = j \mid X_k = i\} \ne \Pr\{X_{k+n+1} = j \mid X_{k+n} = i\}$$
Absorbing States
An absorbing state of a Markov chain is a state which cannot be transitioned out of.
If $p_{ii} = 1$ (and $p_{ij} = 0$ for $i \ne j$), then $i$ is an absorbing state.
If every state in a Markov chain can reach an absorbing state, the Markov chain is called an absorbing Markov chain.
Recurrent and Transient States
A state $i$ of a Markov chain is a recurrent state if,
$$\sum_{n=1}^{\infty} p_{ii}^{(n)} = \infty$$
In words, if you start from a recurrent state you are guaranteed to revisit it eventually.
A state of a Markov chain is a transient state if,
$$\sum_{n=1}^{\infty} p_{ii}^{(n)} < \infty$$
In words, a transient state is the opposite of a recurrent state: there is a nonzero probability of never returning to it.
State Distribution at Arbitrary Times
Given the state transition probability matrix ($P$) and an arbitrary starting pmf at time $t$, is it possible to compute the pmf of states at any arbitrary time $t + n$?
Given the starting pmf at time $t = 0$ (as a row vector):
$$\bar{p}^{(n)} = \bar{p}^{(0)} P^n$$
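Propagating a starting pmf through the transition matrix is just repeated row-vector times matrix multiplication. The following sketch uses a hypothetical two-state chain; the matrix entries and the function names `step` and `state_pmf` are assumptions for illustration.

```python
def step(pmf, P):
    """One time step: p(n+1)_j = sum_i p(n)_i * P[i][j] (row vector times matrix)."""
    size = len(P)
    return [sum(pmf[i] * P[i][j] for i in range(size)) for j in range(size)]

def state_pmf(p0, P, n):
    """State pmf after n steps, i.e. p(0) P^n."""
    pmf = list(p0)
    for _ in range(n):
        pmf = step(pmf, P)
    return pmf

# Hypothetical two-state chain; each row of P sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]
p0 = [1.0, 0.0]           # start in state 0 with probability 1

p2 = state_pmf(p0, P, 2)
p50 = state_pmf(p0, P, 50)
print(p2, p50)
```

For this particular chain the pmf settles near [5/6, 1/6] after many steps, which is the vector π satisfying π_j = Σ_i π_i p_ij.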

Periodicity
A state $i$ has period $k$ if any return to state $i$ must occur in multiples of $k$ time steps. Formally, the period of a state is defined as:
$$k = \gcd\{n : \Pr\{X_n = i \mid X_0 = i\} > 0\}$$
Even though a state has period $k$, it may not be possible to reach the state in $k$ steps.
If $k = 1$, then the state is said to be aperiodic: returns to state $i$ can occur at irregular times.
A Markov chain is aperiodic if all its states are aperiodic.
Ergodic Markov Chain
A state $i$ is said to be ergodic if it is aperiodic and positive recurrent.
That is, state $i$ is ergodic if it is recurrent, has a period of 1, and has a finite mean recurrence time.
If all states in an irreducible Markov chain are ergodic, then the Markov chain is ergodic.
Reducibility
A Markov Chain is said to be irreducible if its state space is a single communicating class.
Time Homogeneous Markov Chain
If the Markov chain is a time-homogeneous Markov chain, so that the process is described by a single, time-independent matrix $P = [p_{ij}]$, then the vector $\pi$ is called a stationary distribution (or invariant measure) if its entries are non-negative and sum to 1 and if it satisfies:
$$\pi_j = \sum_{i} \pi_i\, p_{ij}, \quad \text{where } \pi = [\pi_1, \pi_2, \dots, \pi_N]^T$$
Or in matrix-vector form:
$$\pi^T = \pi^T P$$

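2-Step Transition Probability
$$p_{ij}^{(2)} = \sum_{k=1}^{S} p_{ik}^{(1)}\, p_{kj}^{(1)}$$
In matrix form:
$$P^{(2)} = P^2$$
Can we generalize this expression for n steps, to compute $p_{ij}^{(n)}$?
$$p_{ij}^{(n)} = \sum_{k=1}^{S} p_{ik}^{(n-m)}\, p_{kj}^{(m)}$$
Or, in a more friendly form:
$$p_{ij}^{(n+m)} = \sum_{k=1}^{S} p_{ik}^{(n)}\, p_{kj}^{(m)}$$
And in matrix form:
$$P^{(n+m)} = P^{(n)} P^{(m)} = P^n P^m = P^{n+m}$$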
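The n-step relation above (the Chapman-Kolmogorov equation) can be confirmed with plain list-of-rows matrix arithmetic. In this sketch the 3-state transition matrix and the helper names `matmul` and `matpow` are assumptions for illustration.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def matpow(P, n):
    """n-step transition matrix P^(n) = P**n for n >= 0."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hypothetical 3-state transition matrix; each row sums to 1.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

n, m = 2, 3
lhs = matpow(P, n + m)                       # P^(n+m)
rhs = matmul(matpow(P, n), matpow(P, m))     # P^(n) P^(m)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
two_step = matpow(P, 2)
print(two_step[0][2], max_diff)
```

The entry `two_step[0][2]` is the 2-step probability of moving from the first state to the third, obtained by summing over the intermediate state k.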

That’s all for Stochastic Systems folks!
