Statistics mathematics formulas, Cheat Sheet of Statistics

Statistics formulas covering sample statistics, simple linear regression, the normal distribution, probability, random variables, and the central limit theorem.

Typology: Cheat Sheet

2021/2022

Uploaded on 02/07/2022

danmarino
Stat 101 Formulas
Brian Powers, Summer 2014
Sample Statistics
Sample mean
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Sample variance
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n-1}$
Sample standard deviation
$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
5-Number summary
$Q_0 = \text{minimum}$
$Q_1 = \text{1st quartile}$
$Q_2 = \text{median}$
$Q_3 = \text{3rd quartile}$
$Q_4 = \text{maximum}$
Range
$Range = \text{maximum} - \text{minimum} = Q_4 - Q_0$
Inter-Quartile Range
$IQR = Q_3 - Q_1$
Fences for Outliers
$Q_1 - 1.5 \cdot IQR, \quad Q_3 + 1.5 \cdot IQR$
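The sample statistics above can be checked numerically. The following is a minimal Python sketch, not part of the original sheet: the function name `sample_summary` and the data are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is only one of several quartile conventions and may differ from the one used in a given course.

```python
import math
import statistics


def sample_summary(xs):
    """Compute the sample statistics listed above for a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    # Definitional form of the sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    # Shortcut form: (sum of squares - n * mean^2) / (n - 1).
    var_shortcut = (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)
    # Assumption: "inclusive" quartiles; other conventions give other values.
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "mean": mean,
        "variance": var,
        "variance_shortcut": var_shortcut,
        "sd": math.sqrt(var),
        "five_number": (min(xs), q1, q2, q3, max(xs)),
        "range": max(xs) - min(xs),
        "iqr": iqr,
        "fences": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
    }


# Hypothetical data with one unusually large value.
data = [2, 4, 4, 5, 7, 9, 30]
summary = sample_summary(data)
lower_fence, upper_fence = summary["fences"]
# Values outside the fences are flagged as potential outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```

With this data the fences are -2 and 14, so only the value 30 is flagged. The two variance forms agree up to floating-point rounding.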
Simple Linear Regression
Sample Covariance
$Cov(x,y) = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{n-1}$
Sample Correlation
$r = \frac{Cov(x,y)}{s_x s_y} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{(n-1) s_x s_y}$
Regression Model
$\hat{y} = a + bx$
Slope $b = r \frac{s_y}{s_x}$
Intercept $a = \bar{y} - b\bar{x}$
Residual
$resid_i = y_i - \hat{y}_i = y_i - (a + bx_i)$
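As an illustration of the regression formulas, the sketch below computes the covariance, correlation, slope, intercept, and residuals step by step. The function name `fit_line` and the data points are hypothetical and chosen only to keep the arithmetic simple.

```python
import math


def fit_line(xs, ys):
    """Return (a, b, r) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sample covariance via the shortcut formula.
    cov = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (n - 1)
    r = cov / (s_x * s_y)
    b = r * s_y / s_x          # slope
    a = y_bar - b * x_bar      # intercept
    return a, b, r


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, r = fit_line(xs, ys)
# Residual for each point: observed y minus fitted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
```

For this data the fitted line is approximately y = 2.2 + 0.6x, and the residuals sum to zero up to rounding, as they do for any least-squares line with an intercept.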
Normal Distribution
Standardize $z = \frac{x - \mu}{\sigma}$
Un-Standardize $x = \mu + z\sigma$
68/95/99.7 Rule $P(-1 < Z < 1) \approx .68$
$P(-2 < Z < 2) \approx .95$
$P(-3 < Z < 3) \approx .997$
kth Percentile $x$ such that $P(X < x) = k\%$
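The normal-distribution formulas can be tried out with `statistics.NormalDist` from the Python standard library. This is a sketch under assumed values: the mean 100 and standard deviation 15 are hypothetical, and the helper names are not from the sheet.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """z = (x - mu) / sigma"""
    return (x - mu) / sigma


def unstandardize(z, mu, sigma):
    """x = mu + z * sigma"""
    return mu + z * sigma


Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def within(k):
    """P(-k < Z < k) for the standard normal."""
    return Z.cdf(k) - Z.cdf(-k)


# The 68/95/99.7 rule, computed rather than memorized.
rule = [within(k) for k in (1, 2, 3)]

# 90th percentile of a hypothetical normal population.
mu, sigma = 100, 15
x90 = NormalDist(mu, sigma).inv_cdf(0.90)
z90 = standardize(x90, mu, sigma)
```

Here `rule` comes out close to 0.68, 0.95, and 0.997, and `unstandardize(z90, mu, sigma)` recovers `x90`, which shows that the two conversions are inverses of each other.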

Probability

Complement Rule
$P(A^C) = 1 - P(A)$
General Addition Rule
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication Rule for Independent Events
$P(A \cap B) = P(A) \cdot P(B)$
General Multiplication Rule
$P(A \cap B) = P(A) \cdot P(B|A) = P(B) \cdot P(A|B)$
Conditional Probability
$P(A|B) = \frac{P(A \cap B)}{P(B)}$
A and B are Independent if:
$P(A \cap B) = P(A) \cdot P(B)$, or equivalently $P(A|B) = P(A)$
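The probability rules can be verified on a small equally likely sample space. The sketch below uses one roll of a fair six-sided die; the events `A` and `B` are hypothetical examples, and exact fractions are used so the identities hold without rounding.

```python
from fractions import Fraction

# Sample space: one roll of a fair die, all outcomes equally likely.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event (a subset of omega) under equal likelihood."""
    return Fraction(len(event & omega), len(omega))


A = {2, 4, 6}      # the roll is even
B = {1, 2, 3, 4}   # the roll is at most 4

p_complement = prob(omega - A)        # complement rule: 1 - P(A)
p_union = prob(A | B)                 # P(A or B)
p_intersection = prob(A & B)          # P(A and B)
p_a_given_b = p_intersection / prob(B)  # conditional probability P(A|B)

# Independence check: P(A and B) equals P(A) * P(B).
independent = p_intersection == prob(A) * prob(B)
```

For these two events P(A|B) equals P(A) = 1/2, so knowing that the roll is at most 4 does not change the chance that it is even, and `independent` is `True`.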

Random Variables

Expected Value
$\mu = E(X) = \sum_{i=1}^{k} x_i \cdot P(X = x_i) = \sum_{i=1}^{k} x_i p_i$, where $p_i = P(X = x_i)$
Variance
$\sigma^2 = Var(X) = E((X - \mu)^2) = E(X^2) - \mu^2 = \sum_{i=1}^{k} (x_i - \mu)^2 \cdot p_i$
Linearity of Expected Value
$E(aX) = aE(X)$
$E(X + c) = E(X) + c$
$E(X + Y) = E(X) + E(Y)$
Variance of a Linear Combination
$Var(aX) = a^2 Var(X)$
$Var(X + c) = Var(X)$
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y)$
Variance of Linear Combination of Independent X,Y
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)$
$SD(X_1 + X_2 + \cdots + X_n) = \sqrt{n}\,SD(X)$ (for independent $X_1, \ldots, X_n$, each distributed like $X$)
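The expected value and variance rules can be checked exactly for a discrete random variable. In this sketch a distribution is represented as a dictionary mapping each value to its probability; that representation, the helper names, and the fair-die example are assumptions made for illustration.

```python
import math
from fractions import Fraction
from itertools import product


def expected_value(pmf):
    """E(X) = sum of x_i * p_i over all values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = sum of (x_i - mu)^2 * p_i over all values."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mu = expected_value(die)
var = variance(die)

# Shortcut form: E(X^2) - mu^2.
var_shortcut = expected_value({x * x: p for x, p in die.items()}) - mu ** 2

# Distribution of 2X + 3, to check E(aX + c) and Var(aX + c).
scaled = {2 * x + 3: p for x, p in die.items()}

# Distribution of the sum of two independent dice.
two_dice = {}
for (x1, p1), (x2, p2) in product(die.items(), repeat=2):
    two_dice[x1 + x2] = two_dice.get(x1 + x2, 0) + p1 * p2

sd_sum = math.sqrt(variance(two_dice))
sd_rule = math.sqrt(2) * math.sqrt(var)  # sqrt(n) * SD(X) with n = 2
```

For the die, E(X) = 7/2 and Var(X) = 35/12. Doubling and shifting multiplies the variance by 4 and leaves it unaffected by the shift, and the variance of the two-dice sum is twice the variance of one die.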

Population mean μ (n<30, σ known) $\bar{x} \pm t_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

Population mean μ (σ unknown) $\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
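Both interval formulas have the same shape: a centre plus or minus a critical value times a standard error. The sketch below is a hypothetical helper that takes the critical value as an argument, because the Python standard library has no t distribution. The sample numbers are invented, and the value 2.131 is the usual t-table entry for a 95% interval with 15 degrees of freedom, assuming the common convention of n - 1 degrees of freedom.

```python
import math


def mean_interval(x_bar, spread, n, critical_value):
    """Return (lower, upper) for x_bar +/- critical_value * spread / sqrt(n).

    `spread` is sigma when it is known and the sample value s otherwise.
    """
    margin = critical_value * spread / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: n = 16, x_bar = 50, s = 8, 95% confidence.
# Assumption: t_{alpha/2} = 2.131, read from a t-table with 15 degrees of freedom.
lower, upper = mean_interval(50, 8, 16, 2.131)
```

The margin here is 2.131 * 8 / 4 = 4.262, giving an interval of roughly 45.74 to 54.26.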

Hypothesis Testing

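Calculate the Test Statistic
Population proportion p (n large)
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$

Population difference $p_1 - p_2$ ($n_1$, $n_2$ large), $H_0: p_1 - p_2 = 0$
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_{pooled}\,\hat{q}_{pooled} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
and $\hat{p}_{pooled} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$, where $x_1$, $x_2$ are the numbers of successes in the two samples and $\hat{q}_{pooled} = 1 - \hat{p}_{pooled}$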

Population mean μ (n≥30, σ known) $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Population mean μ (n<30, σ known) $t = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Population mean μ (σ unknown) $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
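For the difference of two proportions, the sketch below follows the standard pooled two-proportion z statistic, which is assumed here to be the form intended by this section. The function name and the counts are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 - p2 = 0.

    x1, x2 are success counts; n1, n2 are sample sizes.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes over all trials.
    p_pooled = (x1 + x2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    standard_error = math.sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


# Hypothetical data: 60 of 100 successes versus 45 of 100.
z = two_proportion_z(60, 100, 45, 100)
```

With these counts the pooled proportion is 0.525 and z is about 2.12.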

Conclusion
Test Type     p-value formula       calculator
Upper-Tail    $P(Z > z)$            normalcdf(z,10)
Lower-Tail    $P(Z < z)$            normalcdf(-10,z)
Two-Tailed    $2 \cdot P(Z > |z|)$  2*normalcdf(|z|,10)
Reject $H_0$ if p-value < α
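The p-value rules for a z statistic can be reproduced without a calculator using `statistics.NormalDist`. This is a sketch with hypothetical numbers; the function names and the `tail` argument are assumptions. In the calculator column, the bounds 10 and -10 appear to stand in for plus and minus infinity, and the code uses the exact tails instead.

```python
import math
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n))"""
    return (x_bar - mu0) / (sigma / math.sqrt(n))


def p_value(z, tail):
    """p-value for a z statistic; tail is 'upper', 'lower', or 'two'."""
    Z = NormalDist()
    if tail == "upper":
        return 1 - Z.cdf(z)
    if tail == "lower":
        return Z.cdf(z)
    if tail == "two":
        return 2 * (1 - Z.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Hypothetical test: H0: mu = 50, with x_bar = 52, sigma = 6, n = 36.
z = z_statistic(52, 50, 6, 36)
alpha = 0.05
p_upper = p_value(z, "upper")
p_two = p_value(z, "two")
reject_upper = p_upper < alpha
reject_two = p_two < alpha
```

Here z = 2.0, the upper-tail p-value is about 0.023, and the two-tailed p-value is about 0.046, so H0 is rejected at α = 0.05 in both cases.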