CS70 Cheatsheet, Cheat Sheet of Probability and Statistics

A cheatsheet for CS70, covering topics such as propositional logic, modular arithmetic, proofs, sets, countability and computability, graph theory, polynomials, and error-correcting codes. It includes formulas, theorems, and algorithms related to each topic. It is useful for students studying CS70 and preparing for exams or assignments on the covered topics.

Typology: Cheat Sheet

2021/2022

Uploaded on 05/11/2023

karthur
CS 70 Cheatsheet, Alec Li


Note 1 (Propositional Logic)

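  • $P \implies Q \equiv \neg P \lor Q$
  • $P \implies Q \equiv \neg Q \implies \neg P$
  • $\neg(P \land Q) \equiv \neg P \lor \neg Q$
  • $\neg(P \lor Q) \equiv \neg P \land \neg Q$
  • $P \land (Q \lor R) \equiv (P \land Q) \lor (P \land R)$
  • $P \lor (Q \land R) \equiv (P \lor Q) \land (P \lor R)$
  • $\neg(\forall x\, P(x)) \equiv \exists x\, \neg P(x)$
  • $\neg(\exists x\, P(x)) \equiv \forall x\, \neg P(x)$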
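
The propositional equivalences above can be checked mechanically by enumerating every truth assignment. The following is a minimal sketch; the helper names `implies` and `equivalent` are illustrative and not part of the course notes.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: P => Q is false only when P is true and Q is false."""
    return (not p) or q

def equivalent(f, g, num_vars: int) -> bool:
    """True if f and g agree on every truth assignment of num_vars variables."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=num_vars))

# Contrapositive: P => Q is equivalent to (not Q) => (not P).
contrapositive_ok = equivalent(lambda p, q: implies(p, q),
                               lambda p, q: implies(not q, not p), 2)
# De Morgan: not (P and Q) is equivalent to (not P) or (not Q).
de_morgan_ok = equivalent(lambda p, q: not (p and q),
                          lambda p, q: (not p) or (not q), 2)
# Distributivity of "and" over "or".
distributive_ok = equivalent(lambda p, q, r: p and (q or r),
                             lambda p, q, r: (p and q) or (p and r), 3)
# The converse Q => P is NOT equivalent to P => Q.
converse_ok = equivalent(lambda p, q: implies(p, q),
                         lambda p, q: implies(q, p), 2)
```

The quantifier rules cannot be verified this way over infinite domains, so they are left out of the sketch.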

Note 2/3 (Proofs)

  • Direct proof
  • Proof by contraposition
  • Proof by cases
  • Proof by induction
    - Base case (prove the smallest case is true)
    - Inductive hypothesis (assume $n = k$ is true for weak induction; assume all $n \le k$ are true for strong induction)
    - Inductive step (prove $n = k + 1$ is true)
  • Pigeonhole principle
    - Putting $n + m$ balls ($m \ge 1$) in $n$ bins $\implies$ at least 1 bin has at least 2 balls

Note 4 (Sets)

  • $\mathcal{P}(S)$ = powerset (set of all subsets of $S$); if $|S| = k$, then $|\mathcal{P}(S)| = 2^k$
  • One-to-one (injection): $f(x) = f(y) \implies x = y$
  • Onto (surjection): $(\forall y)(\exists x)(f(x) = y)$; "hits" all of the range
  • Bijection: both injective and surjective

Note 5 (Countability & Computability)

  • Countable if there is a bijection to $\mathbb{N}$
  • Cantor-Schröder-Bernstein Theorem: there is a bijection between $A$ and $B$ if there exist injections $f : A \to B$ and $g : B \to A$
  • Cantor diagonalization: to prove uncountability, list out the possibilities, then construct a new possibility that differs from each listed one in at least one place (e.g. reals in $(0, 1)$, infinite binary strings)
  • $A \subseteq B$ and $B$ is countable $\implies$ $A$ is countable
  • $A \subseteq B$ and $A$ is uncountable $\implies$ $B$ is uncountable
  • Infinite cartesian product is sometimes countable ($\emptyset \times \emptyset \times \cdots$), sometimes uncountable ($\{0, 1\}^\infty$)
  • Halting Problem: can't determine for every program whether it halts (uncomputable)
  • Reduction of TestHalt(P, x) to some task (here, TestTask):
    - define an inner function that does the task if and only if P(x) halts
    - call TestTask on the inner function and return the result in TestHalt
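
The diagonalization step from this note can be illustrated on a finite prefix of a listing. This is only a sketch of the construction, since the real argument works on infinite strings; the function name `diagonal_flip` and the sample listing are made up for the example.

```python
def diagonal_flip(listing):
    """Build a binary string whose i-th bit differs from the i-th bit of listing[i].

    Each entry of listing must be a '0'/'1' string at least len(listing) long.
    """
    return "".join("1" if row[i] == "0" else "0" for i, row in enumerate(listing))

# A hypothetical (finite) attempt at listing binary strings.
listing = ["0000", "0101", "1111", "1010"]
new_string = diagonal_flip(listing)
```

Because `new_string` disagrees with the i-th listed string at position i, it cannot appear anywhere in the listing, which is the contradiction used to prove uncountability.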

Note 6 (Graph Theory)

  • $K_n$ has $\frac{n(n-1)}{2}$ edges
  • Handshaking lemma: total degree $= 2e$
  • Trees (equivalent characterizations, so all are true of any tree):
    - connected & no cycles
    - connected & has $n - 1$ edges ($n = |V|$)
    - connected & removing any edge disconnects the graph
    - acyclic & adding any edge creates a cycle
  • Hypercubes:
    - vertices are $n$-length bit strings, connected by an edge if they differ in exactly 1 bit
    - the $n$-dimensional hypercube has $n 2^{n-1}$ edges and is bipartite (even- vs. odd-parity bit strings)
  • Eulerian walk: visits each edge exactly once; only possible if connected and all vertices have even degree or exactly 2 have odd degree
  • Eulerian tour: Eulerian walk that starts & ends at the same vertex; only possible if connected and all vertices have even degree
  • Planar graphs:
    - $v + f = e + 2$
    - $\sum_{i=1}^{f} s_i = 2e$, where $s_i$ = number of sides of face $i$
    - $e \le 3v - 6$ if planar (because $s_i \ge 3$)
    - $e \le 2v - 4$ if planar and bipartite (because $s_i \ge 4$)
    - nonplanar if and only if the graph contains $K_5$ or $K_{3,3}$
    - all planar graphs can be colored with $\le 4$ colors
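
The hypercube facts and the handshaking lemma above can be sanity-checked by building a small hypercube explicitly. A minimal sketch, with the function name `hypercube` and the choice $n = 4$ picked purely for illustration:

```python
from itertools import product

def hypercube(n: int):
    """Return (vertices, edges) of the n-dimensional hypercube on n-bit strings."""
    vertices = ["".join(bits) for bits in product("01", repeat=n)]
    edges = set()
    for v in vertices:
        for i in range(n):
            flipped = "1" if v[i] == "0" else "0"
            u = v[:i] + flipped + v[i + 1:]      # neighbor differing in bit i only
            edges.add(frozenset((u, v)))         # undirected edge, stored once
    return vertices, edges

n = 4
vertices, edges = hypercube(n)

# Handshaking lemma: the degrees sum to twice the number of edges.
degree_sum = sum(sum(1 for e in edges if v in e) for v in vertices)

# Bipartite check: every edge joins an even-parity and an odd-parity string.
def parity(v: str) -> int:
    return v.count("1") % 2

is_bipartite_by_parity = all(len({parity(x) for x in e}) == 2 for e in edges)
```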

Note 7 (Modular Arithmetic)

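  • $x^{-1}$ (modular inverse) exists mod $m$ if and only if $\gcd(x, m) = 1$
  • Extended Euclidean Algorithm (finds $a$, $b$ with $ax + by = \gcd(x, y)$):

    | $x$ | $y$ | $\lfloor x/y \rfloor$ | $a$ | $b$ | |
    |---|---|---|---|---|---|
    | 35 | 12 | 2 | $-1$ | 3 | answer |
    | 12 | 11 | 1 | 1 | $-1$ | |
    | 11 | 1 | 11 | 0 | 1 | |
    | 1 (gcd) | 0 | | 1 | 0 | start |

    - work upward from the last row: new $a$ = old $b$
    - new $b$ = old $a$ $-$ old $b \cdot \lfloor x/y \rfloor$
    - if $\gcd(x, y) = 1$, then $a = x^{-1} \bmod y$ and $b = y^{-1} \bmod x$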
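
The update rules above translate directly into a short recursive routine. This is a sketch under the assumption that the base row is $a = 1$, $b = 0$ when $y = 0$; the function name `egcd` is illustrative.

```python
def egcd(x: int, y: int):
    """Return (g, a, b) such that a*x + b*y == g == gcd(x, y)."""
    if y == 0:
        return x, 1, 0                      # base row: gcd = x, a = 1, b = 0
    g, a_old, b_old = egcd(y, x % y)        # solve the smaller problem first
    # new a = old b; new b = old a - old b * floor(x / y)
    return g, b_old, a_old - b_old * (x // y)

g, a, b = egcd(35, 12)
inverse_of_35_mod_12 = a % 12               # valid because gcd(35, 12) == 1
```

For the worked table, this reproduces $a = -1$, $b = 3$, so $35^{-1} \equiv -1 \equiv 11 \pmod{12}$.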

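  • Chinese Remainder Theorem (to solve $x \equiv a_i \pmod{m_i}$ for all $i$):
    - find bases $b_i$ that are $\equiv 1 \pmod{m_i}$ and $\equiv 0 \pmod{m_j}$ for $j \ne i$
    - $b_i = c_i (c_i^{-1} \bmod m_i)$, where $c_i = \prod_{j \ne i} m_j$
    - $x \equiv \sum_i a_i b_i \pmod{\prod_i m_i}$
    - the solution is unique mod $\prod_i m_i$
    - the $m_i$ must be pairwise relatively prime in order to use CRT

Note 8 (RSA)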

  • Scheme: for primes $p$, $q$, find $e$ coprime to $(p-1)(q-1)$
    - public key: $N = pq$ and $e$
    - private key: $d = e^{-1} \bmod (p-1)(q-1)$
    - encryption of message $m$: $m^e \bmod N = y$
    - decryption of encrypted message $y$: $y^d \bmod N = m$
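
A toy run of the scheme makes the roles of $N$, $e$, and $d$ concrete. The primes, exponent, and message below are assumptions chosen only to keep the arithmetic small; they are far too small to be secure.

```python
from math import gcd

# Toy parameters for illustration only.
p, q = 5, 11
N = p * q                       # public modulus
phi = (p - 1) * (q - 1)         # (p - 1)(q - 1)
e = 3                           # public exponent, must be coprime to phi
e_is_valid = gcd(e, phi) == 1

# Private key: modular inverse of e (three-argument pow with -1 needs Python 3.8+).
d = pow(e, -1, phi)

m = 8                           # message, with 0 <= m < N
y = pow(m, e, N)                # encryption: m^e mod N
recovered = pow(y, d, N)        # decryption: y^d mod N
```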
  • Fermat's Little Theorem (FLT): $x^p \equiv x \pmod p$, or $x^{p-1} \equiv 1 \pmod p$ if $x$ is coprime to $p$
  • Prime Number Theorem: $\pi(n) \ge \frac{n}{\ln n}$ for $n \ge 17$, where $\pi(n)$ = # of primes $\le n$
  • Breaking RSA if we know $d$:
    - we know $de - 1 = k(p-1)(q-1)$, where $k \le e$ because $d < (p-1)(q-1)$
    - so $\frac{de - 1}{k} = pq - p - q + 1$; $pq = N$, so we can find $p$, $q$ because we know $d$, $e$, $k$

Note 9 (Polynomials)
  • Property 1: a nonzero polynomial of degree $d$ has at most $d$ roots
  • Property 2: $d + 1$ points $(x_i, y_i)$ with distinct $x_i$ uniquely define a polynomial of degree at most $d$
  • Lagrange Interpolation:
    - $\Delta_i(x) = \prod_{j \ne i} \frac{x - x_j}{x_i - x_j}$
    - $P(x) = \sum_i y_i \Delta_i(x)$
  • Secret Sharing (normally under $GF(p)$):
    - $P(0)$ = secret; $P(1), \ldots, P(n)$ are given out, one per person
    - $P(x)$ = polynomial of degree $k - 1$, where $k$ people are needed to recover the secret
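
Lagrange interpolation and secret sharing fit together as follows. This is a minimal sketch over $GF(7)$ with a made-up polynomial $P(x) = 3 + 2x + x^2$ (so the secret is 3 and $k = 3$); the names `lagrange_eval` and `secret_poly` are illustrative.

```python
def lagrange_eval(points, x, p):
    """Evaluate at x the unique polynomial of degree < len(points)
    through the given (x_i, y_i) points, working over GF(p)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % p      # numerator of Delta_i(x)
                den = den * (xi - xj) % p     # denominator of Delta_i(x)
        # Division in GF(p) is multiplication by the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

PRIME = 7

def secret_poly(x):
    """Hypothetical degree-2 polynomial; P(0) = 3 is the secret."""
    return (3 + 2 * x + x * x) % PRIME

shares = [(i, secret_poly(i)) for i in range(1, 6)]      # n = 5 people
recovered = lagrange_eval(shares[:3], 0, PRIME)           # any k = 3 shares work
recovered_other = lagrange_eval([shares[0], shares[2], shares[4]], 0, PRIME)
```

Any two shares alone are consistent with every possible secret, which is why the degree is chosen as $k - 1$.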
  • Rational Root Theorem: for $P(x) = a_n x^n + \cdots + a_0$, the roots of $P(x)$ of the form $\frac{p}{q}$ (in lowest terms) must have $p \mid a_0$, $q \mid a_n$

Note 10 (Error Correcting Codes)
  • Erasure Errors: $k$ packets lost, message length $n$; need to send $n + k$ packets because $P(x)$ of degree $n - 1$ needs $n$ points to define it
  • General Errors: $k$ packets corrupted, message length $n$; send $n + 2k$ packets
  • Berlekamp-Welch:
    - $P(x)$ encodes the message (degree $n - 1$)
    - $E(x)$ is constructed so that its roots are where the errors are (degree $k$); coefficients unknown
    - $Q(x) = P(x)E(x)$ (degree $n + k - 1$)
    - substitute all $(x_i, r_i)$ into $Q(x_i) = r_i E(x_i)$ to make a system of equations
    - solve for the coefficients; $P(x) = \frac{Q(x)}{E(x)}$
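
The Berlekamp-Welch steps above can be sketched end to end over a small field. The code below assumes a prime modulus, a monic $E(x)$, and coefficient lists stored lowest degree first; `solve_mod`, `poly_div`, and `berlekamp_welch` are illustrative helper names, and the example message is made up.

```python
def solve_mod(A, b, p):
    """Solve A z = b over GF(p) by Gauss-Jordan elimination (free variables = 0)."""
    rows, cols = len(A), len(A[0])
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] % p), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][c], -1, p)
        M[r] = [v * inv % p for v in M[r]]            # scale pivot row to 1
        for i in range(rows):
            if i != r and M[i][c] % p:
                f = M[i][c]
                M[i] = [(vi - f * vr) % p for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    z = [0] * cols
    for i, c in enumerate(pivots):
        z[c] = M[i][cols]
    return z

def poly_div(num, den, p):
    """Quotient of num / den over GF(p), assuming the division is exact."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    inv = pow(den[-1], -1, p)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1] * inv % p
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - quot[i] * d) % p
    return quot

def berlekamp_welch(received, n, k, p):
    """Recover the coefficients of P from n + 2k received points (x_i, r_i)."""
    A, b = [], []
    for x, r in received:
        row = [pow(x, j, p) for j in range(n + k)]             # unknowns of Q
        row += [(-r * pow(x, j, p)) % p for j in range(k)]     # unknowns of E
        A.append(row)
        b.append(r * pow(x, k, p) % p)     # leading term of monic E moves right
    z = solve_mod(A, b, p)
    Q, E = z[:n + k], z[n + k:] + [1]
    return poly_div(Q, E, p)

# P(x) = 3 + 2x + x^2 over GF(7) sent at x = 1..5; the value at x = 2 is corrupted.
received = [(1, 6), (2, 5), (3, 4), (4, 6), (5, 3)]
decoded = berlekamp_welch(received, n=3, k=1, p=7)
```

Here the error locator comes out as $E(x) = x - 2$, and dividing $Q$ by $E$ returns the original coefficients.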

Note 11 (Counting)

  • 1st rule of counting: multiply the # of ways for each choice
  • 2nd rule of counting: count ordered arrangements, then divide by the # of ways to order them to get unordered ones
  • $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ = # of ways to select $k$ from $n$
  • Stars and bars: $n$ objects, $k$ groups → $n$ stars, $k - 1$ bars → $\binom{n+k-1}{k-1} = \binom{n+k-1}{n}$
  • Zeroth rule of counting: if there is a bijection between $A$ and $B$, then $|A| = |B|$
  • Binomial theorem: $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$
  • Hockey-stick identity: $\binom{n}{k+1} = \binom{n-1}{k} + \binom{n-2}{k} + \cdots + \binom{k}{k}$
  • Derangements: $D_n = (n-1)(D_{n-1} + D_{n-2}) = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}$
  • Principle of Inclusion-Exclusion: $|A_1 \cup A_2| = |A_1| + |A_2| - |A_1 \cap A_2|$. More generally, alternate adding and subtracting the sizes of all combinations of intersections
  • Stirling's approximation: $n! \approx \sqrt{2\pi n} \left(\frac{n}{e}\right)^n$
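
Several of the counting identities above are easy to cross-check numerically. The sketch below compares the closed forms against brute force or a second formula; the helper names and the small test sizes are arbitrary choices.

```python
from itertools import product
from math import comb, factorial

def derangements(n: int) -> int:
    """D_n via the recurrence D_n = (n - 1)(D_{n-1} + D_{n-2}), D_0 = 1, D_1 = 0."""
    if n == 0:
        return 1
    d_prev2, d_prev1 = 1, 0
    for m in range(2, n + 1):
        d_prev2, d_prev1 = d_prev1, (m - 1) * (d_prev1 + d_prev2)
    return d_prev1

def derangements_sum(n: int) -> int:
    """D_n via n! * sum_{k=0}^{n} (-1)^k / k!, kept in integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

def stars_and_bars_brute(n: int, k: int) -> int:
    """Count k-tuples of nonnegative integers summing to n by enumeration."""
    return sum(1 for t in product(range(n + 1), repeat=k) if sum(t) == n)

stars_bars_count = stars_and_bars_brute(5, 3)
# Hockey-stick: C(n, k+1) = C(n-1, k) + C(n-2, k) + ... + C(k, k), here n = 10, k = 3.
hockey_lhs = comb(10, 4)
hockey_rhs = sum(comb(i, 3) for i in range(3, 10))
```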

Note 12 (Probability Theory)

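  • Sample points = outcomes
  • Sample space = $\Omega$ = all possible outcomes
  • Probability space: $(\Omega, \mathbb{P}(\omega))$; (sample space, probability function)
  • $0 \le \mathbb{P}(\omega) \le 1$, $\forall \omega \in \Omega$; $\sum_{\omega \in \Omega} \mathbb{P}(\omega) = 1$
  • Uniform probability: $\mathbb{P}(\omega) = \frac{1}{|\Omega|}$, $\forall \omega \in \Omega$
  • $\mathbb{P}(A) = \sum_{\omega \in A} \mathbb{P}(\omega)$, where $A$ is an event
  • If uniform: $\mathbb{P}(A) = \frac{\text{\# sample points in } A}{\text{\# sample points in } \Omega} = \frac{|A|}{|\Omega|}$
  • $\mathbb{P}(\overline{A}) = 1 - \mathbb{P}(A)$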

Note 13 (Conditional Probability)

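  • $\mathbb{P}(\omega \mid B) = \frac{\mathbb{P}(\omega)}{\mathbb{P}(B)}$ for $\omega \in B$
  • $\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$ → $\mathbb{P}(A \cap B) = \mathbb{P}(A \mid B)\,\mathbb{P}(B)$
  • Bayes' Rule: $\mathbb{P}(A \mid B) = \frac{\mathbb{P}(B \mid A)\,\mathbb{P}(A)}{\mathbb{P}(B)} = \frac{\mathbb{P}(B \mid A)\,\mathbb{P}(A)}{\mathbb{P}(B \mid A)\,\mathbb{P}(A) + \mathbb{P}(B \mid \overline{A})\,\mathbb{P}(\overline{A})}$
  • Total Probability Rule (denominator of Bayes' Rule): $\mathbb{P}(B) = \sum_{i=1}^{n} \mathbb{P}(B \cap A_i) = \sum_{i=1}^{n} \mathbb{P}(B \mid A_i)\,\mathbb{P}(A_i)$ for $A_i$ partitioning $\Omega$
  • Independence: $\mathbb{P}(A \cap B) = \mathbb{P}(A)\,\mathbb{P}(B)$, or equivalently $\mathbb{P}(A \mid B) = \mathbb{P}(A)$
  • Union bound: $\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} \mathbb{P}(A_i)$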
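
A small numeric case shows how Bayes' Rule and the total probability rule combine. The numbers below (a 1% prior, 90% true-positive rate, 10% false-positive rate) are hypothetical, and `bayes` is an illustrative helper name.

```python
from fractions import Fraction

def bayes(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A), and P(B | not A)."""
    # Denominator: total probability of B over the partition {A, not A}.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical test for a rare condition.
posterior = bayes(prior=Fraction(1, 100),
                  likelihood=Fraction(9, 10),
                  false_positive_rate=Fraction(1, 10))
```

Even with a fairly accurate test, the posterior is only $\frac{1}{12}$ because the prior is so small.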

Note 14 (Random Variables)

  • Bernoulli distribution: used as an indicator RV
  • Binomial distribution: $\mathbb{P}(X = i)$ = probability of $i$ successes in $n$ trials, each with success probability $p$
    - If $X \sim \text{Bin}(n, p)$, $Y \sim \text{Bin}(m, p)$ independent, then $X + Y \sim \text{Bin}(n + m, p)$
  • Hypergeometric distribution: $\mathbb{P}(X = k)$ = probability of $k$ successes in $n$ draws w/o replacement from a size-$N$ population with $K$ objects (as successes)
  • Joint distribution: $\mathbb{P}(X = a, Y = b)$
  • Marginal distribution: $\mathbb{P}(X = a) = \sum_{b \in B} \mathbb{P}(X = a, Y = b)$
  • Independence: $\mathbb{P}(X = a, Y = b) = \mathbb{P}(X = a)\,\mathbb{P}(Y = b)$ for all $a$, $b$
  • Expectation: $\mathbb{E}[X] = \sum_{x \in \mathcal{X}} x\,\mathbb{P}(X = x)$
  • LOTUS: $\mathbb{E}[g(X)] = \sum_{x \in \mathcal{X}} g(x)\,\mathbb{P}(X = x)$
  • Linearity of expectation: $\mathbb{E}[aX + bY] = a\,\mathbb{E}[X] + b\,\mathbb{E}[Y]$
  • $X$, $Y$ independent: $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$
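
The expectation and LOTUS formulas above can be evaluated exactly for a small binomial. A minimal sketch, assuming $n = 5$ and $p = \frac{1}{3}$ as arbitrary example parameters; `binomial_pmf` and `expectation` are illustrative names.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Map each i in 0..n to P(X = i) for X ~ Bin(n, p)."""
    return {i: comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * P(X = x) (LOTUS; g defaults to the identity)."""
    return sum(g(x) * prob for x, prob in pmf.items())

n, p = 5, Fraction(1, 3)
pmf = binomial_pmf(n, p)
mean = expectation(pmf)                            # should equal n * p
second_moment = expectation(pmf, lambda x: x * x)  # E[X^2] via LOTUS
```

The mean agrees with what linearity of expectation gives for a sum of $n$ indicator variables, namely $np$.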

Note 15 (Variance/Covariance)

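  • Variance: $\text{Var}(X) = \mathbb{E}[(X - \mu)^2] = \mathbb{E}[X^2] - \mathbb{E}[X]^2$
    - $\text{Var}(cX) = c^2\,\text{Var}(X)$, $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X, Y)$
    - if independent: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$
  • Covariance: $\text{Cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]$
  • Correlation: $\text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$, always in $[-1, 1]$
  • Independence implies uncorrelated (Cov = 0), but not the other way around, e.g.

    $X = \begin{cases} 1 & \text{w.p. } 0.5 \\ -1 & \text{w.p. } 0.5 \end{cases}$, $\quad Y = \begin{cases} 1 & \text{if } X = -1, \text{ w.p. } 0.5 \\ -1 & \text{if } X = -1, \text{ w.p. } 0.5 \\ 0 & \text{if } X = 1 \end{cases}$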
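
The uncorrelated-but-dependent example can be verified from its joint distribution. This sketch assumes every probability in the example is 0.5, which gives the joint PMF written out below.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)
# Joint PMF of (X, Y), assuming each listed probability in the example is 0.5.
joint = {(1, 0): half, (-1, 1): quarter, (-1, -1): quarter}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * pr for (x, y), pr in joint.items())

# Cov(X, Y) = E[XY] - E[X] E[Y]
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would need P(X = 1, Y = 0) = P(X = 1) P(Y = 0).
p_x1 = sum(pr for (x, _), pr in joint.items() if x == 1)
p_y0 = sum(pr for (_, y), pr in joint.items() if y == 0)
independent_at_1_0 = joint[(1, 0)] == p_x1 * p_y0
```

The covariance is zero, yet knowing $X = 1$ forces $Y = 0$, so the variables are clearly dependent.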

Note 16 (Geometric/Poisson Distributions)

  • Geometric distribution: $\mathbb{P}(X = i)$ = probability of needing exactly $i$ trials until the first success, with success probability $p$; use $X - 1$ for failures until success
    - Memoryless property: $\mathbb{P}(X > a + b \mid X > a) = \mathbb{P}(X > b)$; i.e. waiting $> b$ more units has the same probability no matter where we start
  • Poisson distribution: $\lambda$ = average # of successes in a unit of time
    - $X \sim \text{Pois}(\lambda)$, $Y \sim \text{Pois}(\mu)$ independent: $X + Y \sim \text{Pois}(\lambda + \mu)$
    - $X \sim \text{Bin}(n, \frac{\lambda}{n})$ where $\lambda > 0$ is constant: as $n \to \infty$, $X \to \text{Pois}(\lambda)$

Note 20 (Continuous Distributions)
  • Probability density function:
    - $f_X(x) \ge 0$ for $x \in \mathbb{R}$
    - $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
  • Cumulative distribution function: $F_X(x) = \mathbb{P}(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$, $f_X(x) = \frac{d}{dx} F_X(x)$
  • Expectation: $\mathbb{E}[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$
  • LOTUS: $\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$
  • Joint distribution: $\mathbb{P}(a \le X \le b, c \le Y \le d)$
    - $f_{XY}(x, y) \ge 0$, $\forall x, y \in \mathbb{R}$
    - $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
  • Marginal distribution: $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy$; integrate over all $y$
  • Independence: $f_{XY}(x, y) = f_X(x) f_Y(y)$
  • Conditional probability: $f_{X \mid A}(x) = \frac{f_X(x)}{\mathbb{P}(A)}$, $f_{X \mid Y}(x \mid y) = \frac{f_{XY}(x, y)}{f_Y(y)}$
  • Exponential distribution: continuous analog of the geometric distribution
    - Memoryless property: $\mathbb{P}(X > t + s \mid X > t) = \mathbb{P}(X > s)$
    - Additionally, $\mathbb{P}(X < Y \mid \min(X, Y) > t) = \mathbb{P}(X < Y)$
    - If $X \sim \text{Exp}(\lambda_X)$, $Y \sim \text{Exp}(\lambda_Y)$ independent, then $\min(X, Y) \sim \text{Exp}(\lambda_X + \lambda_Y)$ and $\mathbb{P}(X \le Y) = \frac{\lambda_X}{\lambda_X + \lambda_Y}$
  • Normal distribution (Gaussian distribution)
    - If $X \sim N(\mu_X, \sigma_X^2)$, $Y \sim N(\mu_Y, \sigma_Y^2)$ independent: $Z = aX + bY \sim N(a\mu_X + b\mu_Y, a^2\sigma_X^2 + b^2\sigma_Y^2)$
  • Central Limit Theorem: if $S_n = \sum_{i=1}^{n} X_i$, all $X_i$ iid with mean $\mu$ and variance $\sigma^2$, then $\frac{S_n}{n} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$ and $\frac{S_n - n\mu}{\sigma\sqrt{n}} \approx N(0, 1)$.

Note 17 (Concentration Inequalities, LLN)

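  • Markov's Inequality: $\mathbb{P}(X \ge c) \le \frac{\mathbb{E}[X]}{c}$, if $X$ is nonnegative, $c > 0$
  • Generalized Markov: $\mathbb{P}(|Y| \ge c) \le \frac{\mathbb{E}[|Y|^r]}{c^r}$ for $c, r > 0$
  • Chebyshev's Inequality: $\mathbb{P}(|X - \mu| \ge c) \le \frac{\text{Var}(X)}{c^2}$ for $\mu = \mathbb{E}[X]$, $c > 0$
    - Corollary: $\mathbb{P}(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$ for $\sigma = \sqrt{\text{Var}(X)}$, $k > 0$
  • Confidence intervals:
    - For proportions, $\mathbb{P}(|\hat{p} - p| \ge \varepsilon) \le \frac{\text{Var}(\hat{p})}{\varepsilon^2} \le \delta$, where $1 - \delta$ is the confidence level (95% interval → $\delta = 0.05$)
      - $\hat{p}$ = proportion of successes in $n$ trials, $\text{Var}(\hat{p}) = \frac{p(1-p)}{n} \le \frac{1}{4n}$
      - $\implies n \ge \frac{1}{4\varepsilon^2\delta}$
    - For means, $\mathbb{P}\left(\left|\frac{1}{n}S_n - \mu\right| \ge \varepsilon\right) \le \frac{\sigma^2}{n\varepsilon^2} = \delta$
      - $S_n = \sum_{i=1}^{n} X_i$, all $X_i$'s iid with mean $\mu$, variance $\sigma^2$
      - $\implies \varepsilon = \frac{\sigma}{\sqrt{n\delta}}$, interval = $\frac{1}{n}S_n \pm \frac{\sigma}{\sqrt{n\delta}}$
    - With CLT, $\mathbb{P}(|A_n - \mu| \le \varepsilon) = \mathbb{P}\left(\left|\frac{(A_n - \mu)\sqrt{n}}{\sigma}\right| \le \frac{\varepsilon\sqrt{n}}{\sigma}\right) \approx 2\Phi\left(\frac{\varepsilon\sqrt{n}}{\sigma}\right) - 1 = 1 - \delta$, where $\Phi$ is the standard normal cdf. Here, $A_n = \frac{1}{n}S_n$ and CLT gives $A_n \approx N\left(\mu, \frac{\sigma^2}{n}\right)$; use the inverse cdf to get $\varepsilon$
  • Law of large numbers: as $n \to \infty$, the sample average of iid $X_1, \ldots, X_n$ tends to the population mean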
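
The two confidence-interval approaches above give very different sample sizes for proportions. The sketch below assumes the worst case $\sigma \le \frac{1}{2}$ for a proportion and uses the standard library's `statistics.NormalDist` for the inverse cdf; the function names and the choice $\varepsilon = \delta = 0.05$ are illustrative.

```python
from fractions import Fraction
from math import ceil
from statistics import NormalDist

def chebyshev_sample_size(eps, delta):
    """Smallest n with n >= 1 / (4 * eps^2 * delta)."""
    return ceil(1 / (4 * eps**2 * delta))

def clt_sample_size(eps, delta, sigma=0.5):
    """Smallest n with 2 * Phi(eps * sqrt(n) / sigma) - 1 >= 1 - delta."""
    z = NormalDist().inv_cdf(1 - delta / 2)     # two-sided critical value
    return ceil((z * sigma / eps) ** 2)

# 95% confidence (delta = 0.05) with error tolerance eps = 0.05.
chebyshev_n = chebyshev_sample_size(Fraction(1, 20), Fraction(1, 20))
clt_n = clt_sample_size(0.05, 0.05)
```

Chebyshev is a guaranteed bound and therefore conservative, while the CLT estimate is an approximation that asks for far fewer samples.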

Table 1: Common Discrete Distributions

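| Distribution | Parameters | PMF ($\mathbb{P}(X = k)$) | CMF ($\mathbb{P}(X \le k)$) | Expectation ($\mathbb{E}[X]$) | Variance ($\text{Var}(X)$) | Support |
|---|---|---|---|---|---|---|
| Uniform | $\text{Uniform}(a, b)$ | $\frac{1}{b - a + 1}$ | $\frac{k - a + 1}{b - a + 1}$ | $\frac{a + b}{2}$ | $\frac{(b - a + 1)^2 - 1}{12}$ | $X \in [a, b]$ |
| Bernoulli | $\text{Bernoulli}(p)$ | $p$ if $k = 1$; $1 - p$ if $k = 0$ | | $p$ | $p(1 - p)$ | $X \in \{0, 1\}$ |
| Binomial | $\text{Bin}(n, p)$ | $\binom{n}{k} p^k (1 - p)^{n - k}$ | | $np$ | $np(1 - p)$ | $X \in \{0, 1, \ldots, n\}$ |
| Geometric | $\text{Geom}(p)$ | $p(1 - p)^{k - 1}$ | $1 - (1 - p)^k$ | $\frac{1}{p}$ | $\frac{1 - p}{p^2}$ | $X \in \{1, 2, \ldots\}$ |
| Poisson | $\text{Pois}(\lambda)$ | $\frac{\lambda^k e^{-\lambda}}{k!}$ | | $\lambda$ | $\lambda$ | $X \in \{0, 1, 2, \ldots\}$ |
| Hypergeometric | $\text{Hypergeometric}(N, K, n)$ | $\frac{\binom{K}{k}\binom{N - K}{n - k}}{\binom{N}{n}}$ | | $n\frac{K}{N}$ | $n\frac{K(N - K)(N - n)}{N^2(N - 1)}$ | $X \in \{\max(0, n + K - N), \ldots, \min(n, K)\}$ |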

Table 2: Common Continuous Distributions

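| Distribution | Parameters | PDF ($f_X(x)$) | CDF ($F_X(x) = \mathbb{P}(X \le x)$) | Expectation ($\mathbb{E}[X]$) | Variance ($\text{Var}(X)$) | Support |
|---|---|---|---|---|---|---|
| Uniform | $\text{Uniform}(a, b)$ | $\frac{1}{b - a}$ | $\frac{x - a}{b - a}$ | $\frac{a + b}{2}$ | $\frac{(b - a)^2}{12}$ | $X \in [a, b]$ |
| Exponential | $\text{Exp}(\lambda)$ | $\lambda e^{-\lambda x}$ | $1 - e^{-\lambda x}$ | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ | $X \in [0, \infty)$ |
| Normal/Gaussian | $N(\mu, \sigma^2)$ | $\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$ | $\Phi\left(\frac{x - \mu}{\sigma}\right)$, with $\Phi$ the standard normal cdf | $\mu$ | $\sigma^2$ | $X \in \mathbb{R}$ |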