






Complete Statistics cheat sheet for your final exam
Typology: Cheat Sheet







Population: the entire collection of objects or individuals about which information is desired.
➔ it is easier to take a sample than to study the whole population
◆ Sample: the part of the population that is selected for analysis
◆ Watch out for:
● A limited sample size that might not be representative of the population
◆ Simple Random Sampling: every possible sample of a certain size has the same chance of being selected
Observational Study: there can always be lurking variables affecting results
➔ e.g., a strong positive association between shoe size and intelligence for boys
➔ should never be used to show causation
Experimental Study: lurking variables can be controlled; can give good evidence for causation
Descriptive Statistics Part I: Summary Measures
➔ Mean: arithmetic average of data values
◆ Highly susceptible to extreme values (outliers); it is pulled toward them
◆ The mean can never be larger than the max or smaller than the min, but it can equal the max or min
➔ Median: in an ordered array, the median is the middle number
◆ Not affected by extreme values
➔ Quartiles: split the ranked data into 4 equal groups
◆ Box and Whisker Plot
➔ Range = largest value − smallest value
◆ Disadvantages: ignores the way in which data are distributed; sensitive to outliers
➔ Interquartile Range (IQR) = 3rd quartile − 1st quartile
◆ Not used that much
◆ Not affected by outliers
➔ Variance: the average squared distance from the mean
$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
◆ squaring gets rid of negative values
◆ units are squared
➔ Standard Deviation: shows variation about the mean
$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
◆ highly affected by outliers
◆ has same units as original data
◆ in finance, a horrible measure of risk (trampoline example)
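The summary measures above can be checked with Python's standard `statistics` module. This is a minimal sketch on a made-up data set (the values, including the outlier 60, are assumptions for illustration). Note that quartile conventions differ between textbooks and software, so `q1` and `q3` may not match a hand calculation exactly.

```python
import statistics

# Hypothetical data set; 60 plays the role of an outlier.
data = [2, 4, 4, 5, 7, 9, 60]

mean = statistics.mean(data)          # pulled toward the outlier
median = statistics.median(data)      # middle value of the ordered data
variance = statistics.variance(data)  # sample variance, divides by n - 1
std_dev = statistics.stdev(data)      # square root of the sample variance

# Quartiles using the module's default ("exclusive") method.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Dropping the outlier moves the mean a lot and the median only a little.
trimmed = data[:-1]
mean_without_outlier = statistics.mean(trimmed)
median_without_outlier = statistics.median(trimmed)
```

Comparing `mean` with `mean_without_outlier` shows how strongly a single extreme value shifts the mean, while the median barely moves.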
Descriptive Statistics Part II: Linear Transformations
➔ Linear transformations change the center and spread of data
➔ Average(a + bX) = a + b[Average(X)]
➔ Effects of Linear Transformations:
◆ $\text{mean}_{new} = a + b \cdot \text{mean}$
◆ $\text{median}_{new} = a + b \cdot \text{median}$
◆ $\text{stdev}_{new} = |b| \cdot \text{stdev}$
◆ $\text{IQR}_{new} = |b| \cdot \text{IQR}$
➔ Z score: $z = \frac{X - \bar{X}}{s_X}$
◆ the new data set will have mean 0 and variance 1
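A short numerical check of the transformation rules, as a sketch only. The data and the constants `a` and `b` are hypothetical; a negative `b` is chosen to show that the spread scales by |b|, not b.

```python
import statistics

x = [10.0, 12.0, 15.0, 19.0, 24.0]  # hypothetical values
a, b = 32.0, -1.8                   # hypothetical transformation a + b*x

y = [a + b * xi for xi in x]

mean_x, sd_x = statistics.mean(x), statistics.stdev(x)
mean_y, sd_y = statistics.mean(y), statistics.stdev(y)

# Z scores: subtract the mean, divide by the standard deviation.
z = [(xi - mean_x) / sd_x for xi in x]
mean_z = statistics.mean(z)
var_z = statistics.variance(z)
```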
Empirical Rule
➔ Only for mound-shaped data
➔ Approx. 95% of data is in the interval $(\bar{x} - 2s_x, \bar{x} + 2s_x) = \bar{x} \pm 2s_x$
➔ only use if you just have mean and std. dev.
Chebyshev's Rule
➔ Use for any set of data and for any number k greater than 1 (1.2, 1.3, etc.)
➔ At least $1 - \frac{1}{k^2}$ of the data falls within k standard deviations of the mean
➔ (Ex) for k = 2 (2 standard deviations), at least 75% of data falls within 2 standard deviations
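Chebyshev's bound is a one-line calculation. The helper below is a hypothetical name introduced for illustration; it returns the guaranteed minimum fraction of data within k standard deviations, 1 − 1/k².

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's rule is only informative for k > 1")
    return 1 - 1 / k ** 2

bound_for_two = chebyshev_lower_bound(2)      # the k = 2 case from the text
bound_for_one_half = chebyshev_lower_bound(1.5)
```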
Detecting Outliers
➔ Classic Outlier Detection
◆ doesn't always work
◆ $|z| = \left| \frac{X - \bar{X}}{s_X} \right| \geq 2$
➔ The Boxplot Rule
◆ Value X is an outlier if: $X < Q_1 - 1.5(Q_3 - Q_1)$ or $X > Q_3 + 1.5(Q_3 - Q_1)$
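Both outlier rules can be sketched in a few lines. The function names and the data are assumptions for illustration, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from a hand-computed Q1 and Q3.

```python
import statistics

def z_score_outliers(values, threshold=2.0):
    """Classic rule: flag values whose |z| is at least the threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs((v - mean) / sd) >= threshold]

def boxplot_outliers(values):
    """Boxplot rule: flag values beyond 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

data = [2, 4, 4, 5, 7, 9, 60]  # hypothetical data with one extreme value
flagged_by_z = z_score_outliers(data)
flagged_by_boxplot = boxplot_outliers(data)
```

The classic rule can miss outliers because an extreme value inflates the standard deviation used to judge it, which is one reason it "doesn't always work".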
Skewness
➔ measures the degree of asymmetry exhibited by data
◆ negative values = skewed left
◆ positive values = skewed right
◆ if |skewness| < 0.8, don't need to transform data
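Measurements of Association
➔ Covariance
◆ Covariance > 0: larger x, larger y
◆ Covariance < 0: larger x, smaller y
◆ $s_{xy} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
◆ Units = units of x × units of y
◆ Only the sign of the covariance (+, −, or 0) is meaningful; its magnitude can be any number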
➔ Correlation: measures strength of a linear relationship between two variables
◆ $r_{xy} = \frac{\text{covariance}_{xy}}{(\text{std. dev.}_x)(\text{std. dev.}_y)}$
◆ correlation is between −1 and 1
◆ Sign: direction of relationship
◆ Absolute value: strength of relationship (−0.6 is a stronger relationship than +0.4)
◆ Correlation doesn't imply causation
◆ The correlation of a variable with itself is one
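A minimal sketch of sample covariance and correlation, assuming the usual n − 1 divisor. The function names and the data are hypothetical.

```python
import statistics

def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
r_xy = correlation(x, y)
r_xx = correlation(x, x)  # a variable correlated with itself
```

Unlike the covariance, `r_xy` is unit-free, which is why it can be compared across different pairs of variables.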
Combining Data Sets
➔ Z = aX + bY
➔ Mean(Z) = $\bar{Z} = a\bar{X} + b\bar{Y}$
➔ Var(Z) = $s_z^2 = a^2 \text{Var}(X) + b^2 \text{Var}(Y) + 2ab\,\text{Cov}(X, Y)$
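Portfolios
➔ Return on a portfolio (for two stocks A and B): $R_p = w_A R_A + w_B R_B$
◆ weights add up to 1
◆ return = mean
◆ risk = std. deviation
➔ Variance of return of portfolio: $\text{Var}(R_p) = w_A^2 \text{Var}(R_A) + w_B^2 \text{Var}(R_B) + 2 w_A w_B \text{Cov}(R_A, R_B)$
◆ Risk (variance) is reduced when stocks are negatively correlated (when there is a negative covariance)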
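The effect of negative correlation on portfolio risk can be seen numerically. This sketch assumes a two-stock portfolio and the usual variance formula w_A²·Var(A) + w_B²·Var(B) + 2·w_A·w_B·Cov(A, B); the function name and all inputs are hypothetical.

```python
def portfolio_variance(w_a, sd_a, sd_b, corr_ab):
    """Variance of a two-stock portfolio; the weights w_a and 1 - w_a sum to 1."""
    w_b = 1 - w_a
    cov_ab = corr_ab * sd_a * sd_b  # covariance recovered from the correlation
    return w_a ** 2 * sd_a ** 2 + w_b ** 2 * sd_b ** 2 + 2 * w_a * w_b * cov_ab

# Same two stocks, three hypothetical correlation scenarios.
var_positive = portfolio_variance(0.5, 0.20, 0.20, 0.5)
var_zero = portfolio_variance(0.5, 0.20, 0.20, 0.0)
var_negative = portfolio_variance(0.5, 0.20, 0.20, -0.5)
```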
Probability
➔ measure of uncertainty
➔ outcomes have to be exhaustive (all possible options are covered) and mutually exclusive (no 2 outcomes can occur at the same time)
➔ Combining Random Variables
◆ If X and Y are independent:
$E(X + Y) = E(X) + E(Y)$
$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$
◆ If X and Y are dependent:
$E(X + Y) = E(X) + E(Y)$
$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X, Y)$
➔ Covariance: $\text{Cov}(X, Y) = E(XY) - E(X)E(Y)$
➔ If X and Y are independent, Cov(X, Y) = 0
Binomial Distribution
➔ doing something n times
➔ only 2 outcomes: success or failure
➔ trials are independent of each other
➔ probability remains constant
1.) All Failures: $P(\text{all failures}) = (1 - p)^n$
2.) All Successes: $P(\text{all successes}) = p^n$
3.) At least one success: $P(\text{at least 1 success}) = 1 - (1 - p)^n$
4.) At least one failure: $P(\text{at least 1 failure}) = 1 - p^n$
5.) Binomial Distribution Formula for x = exact value: $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$
6.) Mean (Expectation): $E(X) = np$
7.) Variance and Standard Dev.: $\text{Var}(X) = np(1 - p)$, $\text{SD}(X) = \sqrt{np(1 - p)}$
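The binomial quantities listed above can be computed directly. This is a sketch using the standard binomial probability formula; `n = 10` and `p = 0.3` are hypothetical values, and `binomial_pmf` is a name introduced here for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical number of trials and success probability

p_all_failures = (1 - p) ** n
p_all_successes = p ** n
p_at_least_one_success = 1 - p_all_failures
p_at_least_one_failure = 1 - p_all_successes

mean = n * p
variance = n * p * (1 - p)
std_dev = math.sqrt(variance)

# The probabilities over every possible x should add up to 1.
total_probability = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```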
Binomial Example
Continuous Probability Distributions
➔ the probability that a continuous random variable X will assume any particular value is 0
➔ Density Curves
◆ Area under the curve is the probability that any range of values will occur
◆ Total area = 1
Uniform Distribution
◆ X ~ Unif(a, b)
Uniform Example
➔ Mean for uniform distribution: $\frac{a + b}{2}$
➔ Variance for unif. distribution: $\frac{(b - a)^2}{12}$
Normal Distribution
➔ governed by 2 parameters: μ (the mean) and σ (the standard deviation)
Standardize Normal Distribution:
$Z = \frac{X - \mu}{\sigma}$
➔ Z score is the number of standard deviations the related X is from its mean
➔ P(Z < some value) is the probability found on the table
➔ P(Z > some value) is (1 − probability) found on the table
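In place of a printed Z table, the standard library's `statistics.NormalDist` gives the same probabilities. A minimal sketch, with μ = 100, σ = 15 and x = 130 as hypothetical values:

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # hypothetical mean and standard deviation
x = 130.0                # hypothetical observed value

z = (x - mu) / sigma     # number of standard deviations x is from the mean

standard_normal = NormalDist()        # mean 0, standard deviation 1
p_below = standard_normal.cdf(z)      # P(Z < z), the table lookup
p_above = 1 - p_below                 # P(Z > z)

# The same probability without standardizing first.
p_below_direct = NormalDist(mu, sigma).cdf(x)
```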
Normal Distribution Example
Sums of Normals
Sums of Normals Example:
➔ Cov(X, Y) = 0 because X and Y are independent
Central Limit Theorem
➔ as n increases, $\bar{x}$ should get closer to μ (population mean)
➔ mean($\bar{x}$) = μ
➔ variance($\bar{x}$) = $\sigma^2 / n$
➔ $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
◆ if population is normally distributed, n can be any value
◆ for any population, n needs to be ≥ 30
➔ $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
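A small simulation illustrates the two facts above: the sample mean centers on μ and its variance is close to σ²/n. This is a sketch under assumed settings: a Uniform(0, 1) population (μ = 0.5, σ² = 1/12), samples of size n = 30, and a fixed seed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

n = 30               # size of each sample
num_samples = 5000   # number of repeated samples

# Population: Uniform(0, 1), which is clearly not normal.
population_mean = 0.5
population_variance = 1 / 12

sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_sample_means = statistics.fmean(sample_means)
variance_of_sample_means = statistics.variance(sample_means)
theoretical_variance = population_variance / n
```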
Confidence Intervals = tell us how good our estimate is
➔ Want high confidence, narrow interval
➔ As confidence increases, interval width also increases
A. One Sample Proportion
➔ $\hat{p} = \frac{\text{number of successes in sample}}{n}$
➔ We are thus 95% confident that the true population proportion is in the interval…
➔ We are assuming that n is large, $n\hat{p} > 5$, and our sample size is less than 10% of the population size.
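A sketch of a one-sample proportion interval, assuming the usual large-sample form p̂ ± z·√(p̂(1 − p̂)/n) with z = 1.96 for 95% confidence. The function name and the counts (120 successes out of 400) are hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 120 successes out of 400.
low_95, high_95 = proportion_confidence_interval(120, 400)

# Higher confidence (z = 2.576 for about 99%) gives a wider interval.
low_99, high_99 = proportion_confidence_interval(120, 400, z=2.576)
```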
One Sample Hypothesis Tests
2. Test Statistic Approach (Population Mean)
3. Test Statistic Approach (Population Proportion)
4. P Values
➔ a number between 0 and 1
➔ the larger the p value, the more consistent the data is with the null
➔ the smaller the p value, the more consistent the data is with the alternative
➔ If P is low (less than 0.05), H0 must go: reject the null hypothesis
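A p value for a test about a population mean can be sketched by reusing the Z statistic (X̄ − μ)/(σ/√n). The example assumes a two-sided test with a known σ; the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, mu_0, sigma, n):
    """Two-sided p value for H0: mu = mu_0, using the standard normal."""
    z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical test: H0 says mu = 50; the sample of 64 has mean 52, sigma is 8.
p_value = two_sided_p_value(sample_mean=52, mu_0=50, sigma=8, n=64)
reject_null = p_value < 0.05  # "if P is low, H0 must go"
```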
Two Sample Hypothesis Tests
➔ Test Statistic for Two Proportions
2. Comparing Two Means (large independent samples, n > 30)
➔ Calculating Confidence Interval
➔ Test Statistic for Two Means
Matched Pairs ➔ Two samples are DEPENDENT Example:
Assumptions of Simple Linear Regression 1. We model the AVERAGE of something rather than something itself
2.
◆ As ε (noise) gets bigger, it’s harder to find the line
➔ $S_e^2 = \frac{SSE}{n - 2}$
➔ $S_e^2$ is our estimate of $\sigma^2$
➔ $S_e = \sqrt{S_e^2}$ is our estimate of σ
➔ 95% of the Y values should lie within about $\pm 2 S_e$ of the regression line
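The noise estimate S_e can be computed from a fitted line. This sketch assumes ordinary least squares for the slope and intercept, which this sheet does not spell out; the function name and the data are hypothetical.

```python
import math

def fit_simple_regression(xs, ys):
    """Least-squares line y = b0 + b1*x, plus SSE and S_e = sqrt(SSE / (n - 2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))
    return b0, b1, sse, s_e

# Hypothetical data scattered around the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5, 6]
ys = [3.2, 4.8, 7.1, 9.0, 10.7, 13.2]
b0, b1, sse, s_e = fit_simple_regression(xs, ys)
```

A larger `s_e` means more noise around the line, which makes the line harder to pin down.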
Example of Prediction Intervals:
Standard Errors for b1 and b0
➔ standard errors increase when noise increases
➔ $s_{b_0}$: amount of uncertainty in our estimate of β0 (small s good, large s bad)
➔ $s_{b_1}$: amount of uncertainty in our estimate of β1
Confidence Intervals for b1 and b0
➔ n small → bad
➔ $S_e$ big → bad
➔ $s_x^2$ small → bad (want x's spread out for a better guess)
Regression Hypothesis Testing (always a two-sided test)
➔ want to test whether slope (β1) is needed in our model
➔ H0: β1 = 0 (don't need x)
➔ Ha: β1 ≠ 0 (need x)
➔ Need X in the model if:
a. 0 isn't in the confidence interval
b. |t| > 1.96
c. P value < 0.05
Test Statistic for Slope/Y intercept
➔ can only be used if n > 30
➔ if n < 30, use p values
Multiple Regression
➔ Variable Importance:
◆ higher t value, lower p value = variable is more important
◆ lower t value, higher p value = variable is less important (or not needed)
Adjusted R squared
➔ k = # of X's
➔ Adj. R squared will decrease as you add junk x variables
➔ Adj. R squared will only increase if the x you add in is very useful
➔ want Adj. R squared to go up and Se to be low for a better model
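The behavior of adjusted R squared can be checked numerically. This sketch assumes the common definition 1 − (SSE/(n − k − 1)) / (SST/(n − 1)), which this sheet does not show; the function name and the SSE/SST figures are hypothetical.

```python
def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R squared for a model with k X variables fit on n observations."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

n, sst = 50, 100.0  # hypothetical sample size and total sum of squares

# Baseline model with 2 X variables.
adj_base = adjusted_r_squared(sse=20.0, sst=sst, n=n, k=2)

# Adding a junk variable barely lowers SSE, so the penalty for k wins.
adj_with_junk = adjusted_r_squared(sse=19.9, sst=sst, n=n, k=3)

# Adding a very useful variable lowers SSE a lot, so adjusted R squared rises.
adj_with_useful = adjusted_r_squared(sse=12.0, sst=sst, n=n, k=3)
```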
The Overall F Test
➔ Always want to reject F test (reject null hypothesis)
➔ Look at p value (if < 0.05, reject null)
➔ H0: β1 = β2 = β3 = ... = βk = 0 (don't need any X's)
➔ Ha: at least one βj ≠ 0 (need at least 1 X)
➔ If no x variables are needed, then SSR = 0 and SST = SSE
Modeling Regression
➔ Backward Stepwise Regression
Dummy Variables ➔ An indicator variable that takes on a value of 0 or 1 , allow intercepts to change
Interaction Terms ➔ allow the slopes to change ➔ interaction between 2 or more x variables that will affect the Y variable
How to Create Dummy Variables (Nominal Variables)
➔ If C is the number of categories, create (C − 1) dummy variables for describing the variable
➔ One category is always the “baseline”, which is included in the intercept
Recoding Dummy Variables
Example: How many hockey sticks sold in the summer (original equation, summer is the baseline)
hockey = 100 + 10 Wtr − 20 Spr + 30 Fall
Write equation for how many hockey sticks sold in the winter (winter is the baseline)
hockey = 110 + 20 Fall − 30 Spr − 10 Summer
➔ always need to get the same exact values from the original equation
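The recoding can be verified by evaluating both equations for every season. In this sketch the coefficients come from the hockey example; the function names are introduced here for illustration.

```python
SEASONS = ["Summer", "Winter", "Spring", "Fall"]

def predict_summer_baseline(season):
    """Original equation: summer is the baseline absorbed by the intercept."""
    coefficients = {"Winter": 10, "Spring": -20, "Fall": 30}
    return 100 + coefficients.get(season, 0)

def predict_winter_baseline(season):
    """Recoded equation: winter is the baseline absorbed by the intercept."""
    coefficients = {"Fall": 20, "Spring": -30, "Summer": -10}
    return 110 + coefficients.get(season, 0)

# Both codings must give the same prediction for every season.
predictions = {
    season: (predict_summer_baseline(season), predict_winter_baseline(season))
    for season in SEASONS
}
```

Each new coefficient is the old season value minus the new baseline value, for example Fall: 130 − 110 = 20.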