Probability Cheatsheet, Cheat Sheet of Mathematics

Cheatsheet for Probability Distributions

Typology: Cheat Sheet

2024/2025

Uploaded on 03/19/2025

nasir-hussain-7


Probability Cheat Sheet

Distributions

Uniform Distribution

notation $U[a, b]$

cdf $\frac{x - a}{b - a}$ for $x \in [a, b]$

pdf $\frac{1}{b - a}$ for $x \in [a, b]$

expectation $\frac{1}{2}(a + b)$

variance $\frac{1}{12}(b - a)^2$

mgf $\frac{e^{tb} - e^{ta}}{t(b - a)}$

story: all intervals of the same length on the distribution’s support are equally probable.

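Gamma Distribution

notation $\mathrm{Gamma}(k, \theta)$, with shape $k$ and scale $\theta$

pdf $\frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^{k}} I_{x>0}$, where $\Gamma(k) = \int_{0}^{\infty} x^{k-1} e^{-x}\,dx$

expectation $k\theta$

variance $k\theta^2$

mgf $(1 - \theta t)^{-k}$ for $t < \frac{1}{\theta}$

ind. sum $\sum_{i=1}^{n} X_i \sim \mathrm{Gamma}\left(\sum_{i=1}^{n} k_i, \theta\right)$

story: the sum of k independent exponentially distributed random variables, each of which has a mean of θ (which is equivalent to a rate parameter of $\theta^{-1}$).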
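The story above can be checked by simulation. The following is a minimal sketch, assuming the scale convention (each exponential has mean θ); the values of `k` and `theta`, the seed, and the helper name `gamma_as_sum_of_exponentials` are illustrative choices, not part of the cheat sheet.

```python
import random

def gamma_as_sum_of_exponentials(k, theta, rng):
    """Draw one Gamma(k, theta) variate as a sum of k exponentials with mean theta."""
    # random.expovariate takes the rate, i.e. 1 / mean.
    return sum(rng.expovariate(1.0 / theta) for _ in range(k))

rng = random.Random(0)   # fixed seed so the run is reproducible
k, theta = 3, 2.0        # illustrative values (assumption)
n = 100_000
samples = [gamma_as_sum_of_exponentials(k, theta, rng) for _ in range(n)]

sample_mean = sum(samples) / n
sample_var = sum((x - sample_mean) ** 2 for x in samples) / n

print("sample mean:", sample_mean, "expected k*theta:", k * theta)
print("sample var: ", sample_var, "expected k*theta^2:", k * theta ** 2)
```

The sample mean and variance should land close to $k\theta$ and $k\theta^2$, up to Monte Carlo noise.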

Geometric Distribution

notation $G(p)$

cdf $1 - (1 - p)^{k}$ for $k \in \mathbb{N}$

pmf $(1 - p)^{k-1} p$ for $k \in \mathbb{N}$

expectation $\frac{1}{p}$

variance $\frac{1 - p}{p^2}$

mgf $\frac{p e^{t}}{1 - (1 - p) e^{t}}$

story: the number X of Bernoulli trials needed to get one success. Memoryless.
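Memorylessness means $P(X > m + n \mid X > m) = P(X > n)$. A small sketch that checks this from the cdf, with arbitrary illustrative values for `p`, `m` and `n`:

```python
def geometric_tail(p, k):
    """P(X > k) for X ~ G(p) on {1, 2, ...}: the first k trials all fail."""
    return (1.0 - p) ** k

p, m, n = 0.3, 4, 6  # illustrative values (assumption)

# P(X > m + n | X > m) = P(X > m + n) / P(X > m)
conditional = geometric_tail(p, m + n) / geometric_tail(p, m)
unconditional = geometric_tail(p, n)

print(conditional, unconditional)
```

Both numbers agree up to floating-point rounding, because the ratio of tails collapses to $(1 - p)^n$.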

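Poisson Distribution

notation $\mathrm{Poisson}(\lambda)$

cdf $e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^{i}}{i!}$

pmf $\frac{\lambda^{k}}{k!} \cdot e^{-\lambda}$ for $k \in \{0, 1, 2, \dots\}$

expectation $\lambda$

variance $\lambda$

mgf $\exp\left(\lambda \left(e^{t} - 1\right)\right)$

ind. sum $\sum_{i=1}^{n} X_i \sim \mathrm{Poisson}\left(\sum_{i=1}^{n} \lambda_i\right)$

story: the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.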
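The independent-sum rule can be verified directly: convolving two Poisson pmfs should reproduce the pmf with the summed rate. A minimal sketch, with illustrative rates `lam1` and `lam2`:

```python
import math

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam), k = 0, 1, 2, ..."""
    return lam ** k / math.factorial(k) * math.exp(-lam)

def pmf_of_sum(lam1, lam2, k):
    """P(X1 + X2 = k) for independent Poissons, by direct convolution of the pmfs."""
    return sum(poisson_pmf(lam1, i) * poisson_pmf(lam2, k - i) for i in range(k + 1))

lam1, lam2 = 1.5, 2.5  # illustrative rates (assumption)
for k in range(6):
    print(k, pmf_of_sum(lam1, lam2, k), poisson_pmf(lam1 + lam2, k))
```

Each printed row shows the convolution next to the $\mathrm{Poisson}(\lambda_1 + \lambda_2)$ pmf; the two columns match up to rounding.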

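Normal Distribution

notation $N\left(\mu, \sigma^2\right)$

pdf $\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$

expectation $\mu$

variance $\sigma^2$

mgf $\exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$

ind. sum $\sum_{i=1}^{n} X_i \sim N\left(\sum_{i=1}^{n} \mu_i, \sum_{i=1}^{n} \sigma_i^2\right)$

story: describes data that cluster around the mean.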

Standard Normal Distribution

notation $N(0, 1)$

cdf $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$

pdf $\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

expectation $0$

variance $1$

mgf $\exp\left(\frac{t^2}{2}\right)$

story: normal distribution with μ = 0 and σ = 1.
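$\Phi$ has no elementary closed form, but it can be written with the error function as $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. A short sketch using Python's standard `math.erf`; the function names are illustrative:

```python
import math

def std_normal_pdf(x):
    """Density of N(0, 1)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    """Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), obtained by standardizing to N(0, 1)."""
    return std_normal_cdf((x - mu) / sigma)

print(std_normal_cdf(0.0))          # symmetric about 0
print(std_normal_cdf(1.96))         # close to 0.975
print(normal_cdf(12.0, 10.0, 2.0))  # same as Phi(1.0)
```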

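Exponential Distribution

notation $\exp(\lambda)$

cdf $1 - e^{-\lambda x}$ for $x \ge 0$

pdf $\lambda e^{-\lambda x}$ for $x \ge 0$

expectation $\frac{1}{\lambda}$

variance $\frac{1}{\lambda^2}$

mgf $\frac{\lambda}{\lambda - t}$ for $t < \lambda$

ind. sum $\sum_{i=1}^{k} X_i \sim \mathrm{Gamma}\left(k, \frac{1}{\lambda}\right)$ for $X_i \sim \exp(\lambda)$ with a common rate (the second argument is the scale $\theta = 1/\lambda$)

minimum $\min_i X_i \sim \exp\left(\sum_{i=1}^{k} \lambda_i\right)$ for independent $X_i \sim \exp(\lambda_i)$

story: the amount of time until some specific event occurs, starting from now, being memoryless.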
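The minimum rule says that the first of several independent exponential clocks to ring is itself exponential, with the rates added. A simulation sketch, assuming illustrative rates and a fixed seed:

```python
import random

rng = random.Random(1)    # fixed seed for reproducibility
rates = [1.0, 2.0, 3.0]   # illustrative rates lambda_i (assumption)
n = 100_000

# Each trial: draw one exponential per rate and keep the smallest.
minima = [min(rng.expovariate(r) for r in rates) for _ in range(n)]
mean_min = sum(minima) / n

print("sample mean of minimum:", mean_min)
print("1 / sum(rates):        ", 1.0 / sum(rates))
```

The sample mean should sit near $1 / \sum_i \lambda_i$, the expectation of $\exp\left(\sum_i \lambda_i\right)$.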

Binomial Distribution

notation $\mathrm{Bin}(n, p)$

cdf $\sum_{i=0}^{k} \binom{n}{i} p^{i} (1 - p)^{n-i}$

pmf $\binom{n}{i} p^{i} (1 - p)^{n-i}$

expectation $np$

variance $np(1 - p)$

mgf $\left(1 - p + p e^{t}\right)^{n}$

story: the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.
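The expectation and variance rows can be recovered from the pmf by direct summation. A minimal sketch with illustrative `n` and `p`:

```python
import math

def binomial_pmf(n, p, i):
    """P(X = i) for X ~ Bin(n, p)."""
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)

n, p = 10, 0.3  # illustrative parameters (assumption)
pmf = [binomial_pmf(n, p, i) for i in range(n + 1)]

total = sum(pmf)                                           # should be 1
mean = sum(i * q for i, q in enumerate(pmf))               # should be n * p
var = sum((i - mean) ** 2 * q for i, q in enumerate(pmf))  # should be n * p * (1 - p)

print(total, mean, var)
```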

Basics

Cumulative Distribution Function

$F_X(x) = P(X \le x)$

Probability Density Function

$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$

$\int_{-\infty}^{\infty} f_X(t)\,dt = 1$

$f_X(x) = \frac{d}{dx} F_X(x)$

Quantile Function

The function $X^{*} : [0, 1] \to \mathbb{R}$ for which for any $p \in [0, 1]$, $F_X\left(X^{*}(p)^{-}\right) \le p \le F_X\left(X^{*}(p)\right)$

$F_{X^{*}} = F_X$

$E(X^{*}) = E(X)$

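Expectation

$E(X) = \int_{0}^{1} X^{*}(p)\,dp$

$E(X) = -\int_{-\infty}^{0} F_X(t)\,dt + \int_{0}^{\infty} \left(1 - F_X(t)\right) dt$

$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$

$E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$

$E(aX + b) = aE(X) + b$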

Variance

$\mathrm{Var}(X) = E\left(X^2\right) - (E(X))^2$

$\mathrm{Var}(X) = E\left((X - E(X))^2\right)$

$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

Standard Deviation

$\sigma(X) = \sqrt{\mathrm{Var}(X)}$

Covariance

$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$

$\mathrm{Cov}(X, Y) = E\left((X - E(X))(Y - E(Y))\right)$

$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$

Correlation Coefficient

$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
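The covariance and correlation identities also hold for sample moments, which makes them easy to check numerically. A sketch under stated assumptions: `Y` is built as `X` plus independent standard normal noise, so the true correlation is $1/\sqrt{2}$; the helper names and seed are illustrative.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Population-style sample covariance: average of (u - mean(u)) * (v - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

rng = random.Random(2)
n = 20_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]  # correlated with xs by construction
zs = [x + y for x, y in zip(xs, ys)]

lhs = cov(zs, zs)                                       # Var(X + Y)
rhs = cov(xs, xs) + cov(ys, ys) + 2.0 * cov(xs, ys)     # Var(X) + Var(Y) + 2 Cov(X, Y)
rho = cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

print(lhs, rhs)
print("rho:", rho, "target:", 1.0 / math.sqrt(2.0))
```

Note that $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$, which is why the same helper serves for both.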

Moment Generating Function

$M_X(t) = E\left(e^{tX}\right)$

$E(X^n) = M_X^{(n)}(0)$

$M_{aX+b}(t) = e^{tb} M_{aX}(t) = e^{tb} M_X(at)$
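As a worked instance of the moment rule $E(X^n) = M_X^{(n)}(0)$, take $X \sim \exp(\lambda)$ with the mgf listed in the Exponential Distribution section. Differentiating twice recovers the expectation and variance given there:

```latex
\begin{align*}
M_X(t) &= \frac{\lambda}{\lambda - t}, \qquad t < \lambda \\
M_X'(t) &= \frac{\lambda}{(\lambda - t)^2}
  &&\Rightarrow\quad E(X) = M_X'(0) = \frac{1}{\lambda} \\
M_X''(t) &= \frac{2\lambda}{(\lambda - t)^3}
  &&\Rightarrow\quad E(X^2) = M_X''(0) = \frac{2}{\lambda^2} \\
\mathrm{Var}(X) &= E(X^2) - (E(X))^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
\end{align*}
```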

Joint Distribution

$P_{X,Y}(B) = P((X, Y) \in B)$

$F_{X,Y}(x, y) = P(X \le x, Y \le y)$

Joint Density

$P_{X,Y}(B) = \iint_{B} f_{X,Y}(s, t)\,ds\,dt$

$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(s, t)\,dt\,ds$

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds\,dt = 1$

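Marginal Distributions

$P_X(B) = P_{X,Y}(B \times \mathbb{R})$
$P_Y(B) = P_{X,Y}(\mathbb{R} \times B)$

$F_X(a) = \int_{-\infty}^{a} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,dt\,ds$

$F_Y(b) = \int_{-\infty}^{b} \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds\,dt$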

Marginal Densities

$f_X(s) = \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,dt$

$f_Y(t) = \int_{-\infty}^{\infty} f_{X,Y}(s, t)\,ds$

Joint Expectation

$E(\phi(X, Y)) = \iint_{\mathbb{R}^2} \phi(x, y) f_{X,Y}(x, y)\,dx\,dy$

Independent r.v.

$P(X \le x, Y \le y) = P(X \le x)\,P(Y \le y)$

$F_{X,Y}(x, y) = F_X(x) F_Y(y)$

$f_{X,Y}(s, t) = f_X(s) f_Y(t)$

$E(XY) = E(X)E(Y)$

$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$

Independent events: $P(A \cap B) = P(A)P(B)$

Conditional Probability

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

bayes $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
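A small numeric sketch of Bayes' rule. The denominator $P(B)$ is expanded with the law of total probability, $P(B) = P(B \mid A)P(A) + P(B \mid A^c)P(A^c)$, which is a standard fact not listed above; the function name and the numbers (a 1% prior, 95% hit rate, 5% false-alarm rate) are hypothetical.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A | B) from the prior P(A) and the two conditional probabilities of B."""
    # Law of total probability for the denominator P(B).
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b

# Hypothetical inputs: P(A) = 0.01, P(B | A) = 0.95, P(B | not A) = 0.05.
posterior = bayes_posterior(0.01, 0.95, 0.05)
print(posterior)
```

Even with a high hit rate the posterior stays well below one half here, because the prior is small.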

Conditional Density

$f_{X \mid Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$

$f_{X \mid Y=n}(x) = \frac{f_X(x)\,P(Y = n \mid X = x)}{P(Y = n)}$

$F_{X \mid Y=y}(x) = \int_{-\infty}^{x} f_{X \mid Y=y}(t)\,dt$

Conditional Expectation

$E(X \mid Y = y) = \int_{-\infty}^{\infty} x f_{X \mid Y=y}(x)\,dx$

$E(E(X \mid Y)) = E(X)$

$P(Y = n) = E(I_{Y=n}) = E(E(I_{Y=n} \mid X))$

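Sequences and Limits

$\limsup A_n = \{A_n \text{ i.o.}\} = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n$

$\liminf A_n = \{A_n \text{ eventually}\} = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} A_n$

$\liminf A_n \subseteq \limsup A_n$

$(\limsup A_n)^c = \liminf A_n^c$

$(\liminf A_n)^c = \limsup A_n^c$

$P(\limsup A_n) = \lim_{m \to \infty} P\left(\bigcup_{n=m}^{\infty} A_n\right)$

$P(\liminf A_n) = \lim_{m \to \infty} P\left(\bigcap_{n=m}^{\infty} A_n\right)$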

Borel-Cantelli Lemma

$\sum_{n=1}^{\infty} P(A_n) < \infty \Rightarrow P(\limsup A_n) = 0$

And if $A_n$ are independent: $\sum_{n=1}^{\infty} P(A_n) = \infty \Rightarrow P(\limsup A_n) = 1$

Convergence

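Convergence in Probability

notation $X_n \xrightarrow{p} X$

meaning $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$ for every $\varepsilon > 0$

Convergence in Distribution

notation $X_n \xrightarrow{D} X$

meaning $\lim_{n \to \infty} F_n(x) = F(x)$ at every continuity point $x$ of $F$

Almost Sure Convergence

notation $X_n \xrightarrow{a.s.} X$

meaning $P\left(\lim_{n \to \infty} X_n = X\right) = 1$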

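Criteria for a.s. Convergence

  • $\forall \varepsilon\, \exists N : P\left(\forall n > N : |X_n - X| < \varepsilon\right) > 1 - \varepsilon$
  • $\forall \varepsilon : P\left(\limsup \{|X_n - X| > \varepsilon\}\right) = 0$
  • $\forall \varepsilon : \sum_{n=1}^{\infty} P(|X_n - X| > \varepsilon) < \infty$ (sufficient, by B.C.)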

Convergence in $L_p$

notation $X_n \xrightarrow{L_p} X$

meaning $\lim_{n \to \infty} E\left(|X_n - X|^p\right) = 0$

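Relationships

$\xrightarrow{L_q} \;\Rightarrow\; \xrightarrow{L_p}$ for $q > p \ge 1$

$\xrightarrow{a.s.} \;\Rightarrow\; \xrightarrow{p} \;\Rightarrow\; \xrightarrow{D}$

If $X_n \xrightarrow{D} c$ then $X_n \xrightarrow{p} c$

If $X_n \xrightarrow{p} X$ then there exists a subsequence $n_k$ s.t. $X_{n_k} \xrightarrow{a.s.} X$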

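Laws of Large Numbers

If $X_i$ are i.i.d. r.v. with finite expectation and $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$,

weak law $\bar{X}_n \xrightarrow{p} E(X_1)$

strong law $\bar{X}_n \xrightarrow{a.s.} E(X_1)$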

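Central Limit Theorem

For $S_n = \sum_{i=1}^{n} X_i$ with $X_i$ i.i.d., $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2 < \infty$:

$\frac{S_n - n\mu}{\sigma \sqrt{n}} \xrightarrow{D} N(0, 1)$

If $t_n \to t$, then

$P\left(\frac{S_n - n\mu}{\sigma \sqrt{n}} \le t_n\right) \to \Phi(t)$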
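The theorem can be illustrated by simulation. The sketch below assumes $X_i \sim U[0, 1]$ (so $\mu = 1/2$, $\sigma^2 = 1/12$) and compares the empirical probability that the standardized sum is at most $t$ with $\Phi(t)$; `n`, `trials`, `t` and the seed are illustrative choices.

```python
import math
import random

def std_normal_cdf(x):
    """Phi(x) via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def standardized_sum(n, rng):
    """(S_n - n*mu) / (sigma * sqrt(n)) for S_n a sum of n U[0, 1] draws."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)  # mean and std of U[0, 1]
    s_n = sum(rng.random() for _ in range(n))
    return (s_n - n * mu) / (sigma * math.sqrt(n))

rng = random.Random(4)           # fixed seed for reproducibility
n, trials, t = 30, 20_000, 1.0   # illustrative values (assumption)

hits = sum(standardized_sum(n, rng) <= t for _ in range(trials))
empirical = hits / trials

print("empirical P(Z_n <= t):", empirical)
print("Phi(t):               ", std_normal_cdf(t))
```

The two printed values should agree to about two decimal places; the gap shrinks as `n` and `trials` grow.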

Inequalities

Markov’s inequality

$P(|X| \ge t) \le \frac{E(|X|)}{t}$

Chebyshev’s inequality

$P(|X - E(X)| \ge \varepsilon) \le \frac{\mathrm{Var}(X)}{\varepsilon^2}$
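Both bounds can be compared with exact probabilities when the distribution is known. A sketch for $X \sim \exp(\lambda)$, using the cdf from the Exponential Distribution section; the thresholds and helper names are illustrative.

```python
import math

def exp_tail(t, lam=1.0):
    """P(X >= t) for X ~ exp(lam), t >= 0."""
    return math.exp(-lam * t)

def exp_two_sided(eps, lam=1.0):
    """P(|X - 1/lam| >= eps) for X ~ exp(lam)."""
    mean = 1.0 / lam
    upper = exp_tail(mean + eps, lam)
    # The lower tail P(X <= mean - eps) is empty once eps exceeds the mean.
    lower = 1.0 - exp_tail(mean - eps, lam) if eps <= mean else 0.0
    return upper + lower

lam = 1.0
rows = []
for t in [0.5, 1.0, 2.0, 4.0]:
    markov = (1.0 / lam) / t                  # E|X| / t
    chebyshev = (1.0 / lam ** 2) / t ** 2     # Var(X) / eps^2 with eps = t
    rows.append((t, exp_tail(t, lam), markov, exp_two_sided(t, lam), chebyshev))
    print(rows[-1])
```

Each row lists the threshold, the exact tail, the Markov bound, the exact two-sided deviation probability, and the Chebyshev bound. The bounds hold in every row but are often loose.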

Chernoff’s inequality

Let $X \sim \mathrm{Bin}(n, p)$; then: $P(X - E(X) > t\sigma(X)) < e^{-t^2/2}$

Simpler result; for every $X$ and every $t > 0$: $P(X \ge a) \le M_X(t)\,e^{-ta}$
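The simpler bound holds for every $t > 0$, so in practice one minimizes it over $t$. A sketch for $X \sim \mathrm{Bin}(n, p)$ using the mgf from the Binomial Distribution section; the grid search over $t$ is a crude stand-in for exact minimization, and all parameter values are illustrative.

```python
import math

def binomial_tail(n, p, a):
    """Exact P(X >= a) for X ~ Bin(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(a, n + 1))

def binomial_mgf(n, p, t):
    """M_X(t) = (1 - p + p e^t)^n."""
    return (1.0 - p + p * math.exp(t)) ** n

def chernoff_bound(n, p, a, grid):
    """Smallest value of M_X(t) e^{-t a} over the supplied grid of t > 0."""
    return min(binomial_mgf(n, p, t) * math.exp(-t * a) for t in grid)

n, p, a = 100, 0.5, 70                      # illustrative values (assumption)
grid = [i / 100.0 for i in range(1, 301)]   # t = 0.01, 0.02, ..., 3.00

bound = chernoff_bound(n, p, a, grid)
exact = binomial_tail(n, p, a)
print("exact tail:", exact)
print("Chernoff bound:", bound)
```

The bound is valid for each grid point separately, so the minimum over the grid is still an upper bound on the exact tail.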

Jensen’s inequality

for $\phi$ a convex function, $\phi(E(X)) \le E(\phi(X))$

Miscellaneous

$E(Y) < \infty \iff \sum_{n=0}^{\infty} P(Y > n) < \infty$ $(Y \ge 0)$

$E(X) = \sum_{n=0}^{\infty} P(X > n)$ $(X \in \mathbb{N})$

$X \sim U(0, 1) \iff -\ln X \sim \exp(1)$
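The last identity is the basis of inverse-transform sampling: a uniform draw pushed through $-\ln$ yields an $\exp(1)$ draw. A minimal sketch with an illustrative seed and sample size:

```python
import math
import random

def exp1_from_uniform(u):
    """Map u in (0, 1] to an exp(1) variate via -ln(u)."""
    return -math.log(u)

rng = random.Random(3)  # fixed seed for reproducibility
n = 100_000

# rng.random() lies in [0, 1); 1 - rng.random() lies in (0, 1], which avoids log(0).
samples = [exp1_from_uniform(1.0 - rng.random()) for _ in range(n)]

sample_mean = sum(samples) / n
frac_le_1 = sum(x <= 1.0 for x in samples) / n

print("sample mean:", sample_mean, "expected:", 1.0)
print("P(X <= 1):  ", frac_le_1, "expected:", 1.0 - math.exp(-1.0))
```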

Convolution

For ind. $X$, $Y$, $Z = X + Y$:

$f_Z(z) = \int_{-\infty}^{\infty} f_X(s) f_Y(z - s)\,ds$
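A numerical sketch of the convolution integral, assuming $X, Y \sim U[0, 1]$ so that $Z = X + Y$ has the triangular density on $[0, 2]$. The midpoint rule and the helper names are illustrative choices.

```python
def uniform_pdf(x):
    """Density of U[0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_at(f_x, f_y, z, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f_x(s) * f_y(z - s) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        s = lo + (i + 0.5) * h
        total += f_x(s) * f_y(z - s)
    return total * h

# f_X vanishes outside [0, 1], so integrating s over [0, 1] is enough.
for z in [0.5, 1.0, 1.5]:
    print(z, convolve_at(uniform_pdf, uniform_pdf, z, 0.0, 1.0))
```

The values rise linearly to a peak at $z = 1$ and fall symmetrically, as expected for the sum of two uniforms.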

Kolmogorov’s 0-1 Law

If $A$ is in the tail σ-algebra $\mathcal{F}_t$, then $P(A) = 0$ or $P(A) = 1$

Ugly Stuff

cdf of Gamma distribution (integer $k$, scale $\theta$): $\int_{0}^{t} \frac{x^{k-1} e^{-x/\theta}}{\theta^{k}\,(k - 1)!}\,dx$
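For integer $k$ the Gamma cdf has a finite-sum closed form (the Erlang case): $1 - e^{-t/\theta} \sum_{i=0}^{k-1} \frac{(t/\theta)^i}{i!}$. The sketch below assumes the scale convention (mean $k\theta$, as in the Gamma section's expectation row) and checks the closed form against numerical integration of the density; function names and parameter values are illustrative.

```python
import math

def gamma_pdf(x, k, theta):
    """Gamma density with shape k and scale theta (mean k * theta)."""
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def erlang_cdf(t, k, theta):
    """Closed form for integer k: 1 - e^{-t/theta} * sum_{i<k} (t/theta)^i / i!."""
    r = t / theta
    return 1.0 - math.exp(-r) * sum(r ** i / math.factorial(i) for i in range(k))

def gamma_cdf_numeric(t, k, theta, steps=20_000):
    """Midpoint-rule integral of the density over [0, t]."""
    h = t / steps
    return sum(gamma_pdf((i + 0.5) * h, k, theta) for i in range(steps)) * h

t, k, theta = 4.0, 3, 2.0  # illustrative values (assumption)
print(erlang_cdf(t, k, theta), gamma_cdf_numeric(t, k, theta))
```

For $k = 1$ the closed form reduces to the exponential cdf $1 - e^{-t/\theta}$.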

This cheatsheet was made by Peleg Michaeli in January 2010, using LaTeX. version: 1. comments: [email protected]