Review of Basic Statistics, Lecture notes of Social Statistics and Data Analysis

Here is a review of the basics of statistics. It covers topics ranging from basic probability distributions to regression analysis.

Typology: Lecture notes

2017/2018

Uploaded on 04/06/2018

NarutoLLN

Economics 20 - Prof. Anderson

Review of Probability and Statistics

(i.e. things you learned in Ec 10 and need to remember to do well in this class!)

Random Variables

X is a random variable if it represents a random draw from some population. A discrete random variable can take on only selected values. A continuous random variable can take on any value in a real interval. Associated with each random variable is a probability distribution.

Expected Value of X – E(X)

The expected value is really just a probability-weighted average of X. E(X) is the mean of the distribution of X, denoted by $\mu_X$. Let $f(x_i)$ be the probability that $X = x_i$; then

$$\mu_X = E(X) = \sum_{i=1}^{n} x_i f(x_i)$$
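
As a minimal sketch of the probability-weighted average, assume X is the outcome of one roll of a fair six-sided die. The die is an illustrative example that is not in the notes, and `expected_value` is a hypothetical helper name.

```python
from fractions import Fraction


def expected_value(pmf):
    """Return E(X) = sum of x_i * f(x_i) for a discrete pmf given as {x_i: f(x_i)}."""
    return sum(x * p for x, p in pmf.items())


# Assumed example: a fair six-sided die, so f(x_i) = 1/6 for every face.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mu_x = expected_value(die_pmf)
print(mu_x)  # 7/2
```

Exact fractions are used so that the result is not blurred by floating-point rounding.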

Variance of X – Var(X)

The variance of X is a measure of the dispersion of the distribution. Var(X) is the expected value of the squared deviations from the mean, so

$$\sigma_X^2 = Var(X) = E\left[(X - \mu_X)^2\right]$$

Covariance – Cov(X,Y)

Covariance between X and Y is a measure of the association between two random variables, X and Y. If positive, then both move up or down together. If negative, then when X is high, Y tends to be low, and vice versa.

$$\sigma_{XY} = Cov(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right]$$

Correlation Between X and Y

Covariance depends on the units of X and Y: $Cov(aX, bY) = ab\,Cov(X, Y)$. Correlation, Corr(X,Y), scales covariance by the standard deviations of X and Y so that it lies between -1 and 1.

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{Cov(X, Y)}{\left[Var(X)\,Var(Y)\right]^{1/2}}$$
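
The unit dependence of covariance, and the unit invariance of correlation, can be checked on a small example. The sketch below assumes a made-up joint distribution for two 0/1 variables and made-up scaling constants `a` and `b`; `moments` is a hypothetical helper, not something from the notes.

```python
from fractions import Fraction
from math import sqrt


def moments(joint_pmf):
    """Return (Cov(X,Y), Corr(X,Y)) for a joint pmf given as {(x, y): probability}."""
    mu_x = sum(x * p for (x, y), p in joint_pmf.items())
    mu_y = sum(y * p for (x, y), p in joint_pmf.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint_pmf.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint_pmf.items())
    var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint_pmf.items())
    corr = float(cov) / sqrt(var_x * var_y)
    return cov, corr


# Assumed joint distribution: X and Y tend to take the same value.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

# Change of units: measure X in units 100 times smaller, Y in units 2 times smaller.
a, b = 100, 2
scaled = {(a * x, b * y): p for (x, y), p in joint.items()}

cov, corr = moments(joint)
cov_scaled, corr_scaled = moments(scaled)

print(cov, cov_scaled)    # covariance is multiplied by a * b
print(corr, corr_scaled)  # correlation is unchanged, up to float rounding
```

Here the covariance goes from 3/20 to 30 under the change of units, while the correlation stays at about 0.6.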

Properties of Expectations

$E(a) = a$, $Var(a) = 0$
$E(\mu_X) = \mu_X$, i.e. $E(E(X)) = E(X)$
$E(aX + b) = aE(X) + b$
$E(X + Y) = E(X) + E(Y)$
$E(X - Y) = E(X) - E(Y)$
$E(X - \mu_X) = 0$, or $E(X - E(X)) = 0$
$E((aX)^2) = a^2 E(X^2)$

More Properties

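$Var(X) = E(X^2) - \mu_X^2$
$Var(aX + b) = a^2 Var(X)$
$Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$
$Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)$
$Cov(X, Y) = E(XY) - \mu_X \mu_Y$
If X and Y are independent, then $Var(X + Y) = Var(X) + Var(Y)$ and $E(XY) = E(X)E(Y)$. The converse does not hold in general: both equalities only require $Cov(X, Y) = 0$, and zero covariance does not imply independence.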
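
Two of these variance properties can be verified exactly on a small discrete distribution. The sketch below assumes a fair six-sided die and arbitrary constants `a` and `b`, none of which come from the notes; `expect` and `variance` are hypothetical helper names.

```python
from fractions import Fraction


def expect(g, pmf):
    """Return E[g(X)] for a discrete pmf given as {x_i: f(x_i)}."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """Return Var(X) = E[(X - mu_X)^2], computed straight from the definition."""
    mu = expect(lambda x: x, pmf)
    return expect(lambda x: (x - mu) ** 2, pmf)


# Assumed example: a fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mu_x = expect(lambda x: x, die_pmf)
var_x = variance(die_pmf)

# Shortcut formula: E(X^2) minus the squared mean.
shortcut = expect(lambda x: x ** 2, die_pmf) - mu_x ** 2

# Distribution of aX + b for arbitrary constants a and b.
a, b = 3, 10
shifted_pmf = {a * x + b: p for x, p in die_pmf.items()}
var_shifted = variance(shifted_pmf)

print(var_x, shortcut)  # both 35/12
print(var_shifted)      # a^2 times Var(X); the shift b has no effect
```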

The Standard Normal

Any random variable can be “standardized” by subtracting the mean, $\mu$, and dividing by the standard deviation, $\sigma$: $Z = (X - \mu)/\sigma$, so E(Z) = 0, Var(Z) = 1. Thus, the standard normal, N(0,1), has pdf

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$
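
A short sketch of standardizing a value and evaluating the standard normal pdf. The values mu = 100, sigma = 15 and x = 130 are assumptions chosen for illustration; the hand-written pdf is compared against `statistics.NormalDist` from the Python standard library.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Return z = (x - mu) / sigma."""
    return (x - mu) / sigma


def std_normal_pdf(z):
    """Density of N(0, 1) at z: exp(-z^2 / 2) / sqrt(2 * pi)."""
    return exp(-z ** 2 / 2) / sqrt(2 * pi)


# Assumed example: X has mean 100 and standard deviation 15; we observe x = 130.
z = standardize(130, mu=100, sigma=15)

print(z)                    # 2.0
print(std_normal_pdf(z))    # density from the formula above
print(NormalDist().pdf(z))  # same density from the standard library
```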

Properties of the Normal

If $X \sim N(\mu, \sigma^2)$, then $aX + b \sim N(a\mu + b, a^2\sigma^2)$. A linear combination of independent, identically distributed (iid) normal random variables will also be normally distributed. If $Y_1, Y_2, \ldots, Y_n$ are iid and $\sim N(\mu, \sigma^2)$, then

$$\bar{Y} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
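
The distribution of the sample mean can be illustrated by simulation. This is only a sketch: the parameter values, seed, and number of replications are assumptions, and a simulation can suggest but not prove the result.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Assumed parameters for illustration.
mu, sigma, n = 5.0, 2.0, 25
reps = 20000

# Each replication draws n iid N(mu, sigma^2) values and records their average.
sample_means = []
for _ in range(reps):
    draws = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(sum(draws) / n)

mean_of_means = sum(sample_means) / reps
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / (reps - 1)

print(mean_of_means)  # should be close to mu = 5.0
print(var_of_means)   # should be close to sigma^2 / n = 0.16
```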

The Chi-Square Distribution

Suppose that $Z_i$, $i = 1, \ldots, n$, are iid $\sim N(0,1)$, and $X = \sum_{i=1}^{n} Z_i^2$; then X has a chi-square distribution with n degrees of freedom (df), that is, $X \sim \chi^2_n$. If $X \sim \chi^2_n$, then E(X) = n and Var(X) = 2n.
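
The construction above translates directly into a simulation: sum n squared standard normal draws and look at the mean and variance of the result. The choices n = 5, the seed, and the number of replications are assumptions made for this sketch.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

n = 5          # degrees of freedom (assumed for illustration)
reps = 20000   # number of simulated chi-square draws

# Each draw is a sum of n squared independent N(0, 1) variables.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(reps)]

mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / (reps - 1)

print(mean)  # should be close to n = 5
print(var)   # should be close to 2n = 10
```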

The t distribution

If a random variable, T, has a t distribution with n degrees of freedom, then it is denoted as $T \sim t_n$. E(T) = 0 (for n > 1) and Var(T) = n/(n-2) (for n > 2). T is a function of $Z \sim N(0,1)$ and $X \sim \chi^2_n$, with Z and X independent, as follows:

$$T = \frac{Z}{\sqrt{X/n}}$$
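
A simulation sketch of this construction, building T from an independent standard normal draw and a chi-square draw. The degrees of freedom n = 10, the seed, and the number of replications are assumptions; `draw_t` is a hypothetical helper name.

```python
import random
from math import sqrt

random.seed(2)  # fixed seed so the run is reproducible

n = 10         # degrees of freedom (assumed for illustration)
reps = 50000   # number of simulated t draws


def draw_t(df):
    """Draw T = Z / sqrt(X / df) with Z ~ N(0, 1) and X ~ chi-square(df), independent."""
    z = random.gauss(0, 1)
    x = sum(random.gauss(0, 1) ** 2 for _ in range(df))
    return z / sqrt(x / df)


t_draws = [draw_t(n) for _ in range(reps)]

mean = sum(t_draws) / reps
var = sum((t - mean) ** 2 for t in t_draws) / (reps - 1)

print(mean)  # should be close to 0
print(var)   # should be close to n / (n - 2) = 1.25
```

The simulated variance exceeds 1, reflecting the heavier tails of the t distribution relative to the standard normal.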

Random Samples and Sampling

For a random variable Y, repeated draws from the same population can be labeled as $Y_1, Y_2, \ldots, Y_n$. If every combination of n sample points has an equal chance of being selected, this is a random sample. A random sample is a set of independent, identically distributed (i.i.d.) random variables.

Estimators and Estimates

Typically, we can’t observe the full population, so we must make inferences based on estimates from a random sample. An estimator is just a mathematical formula for estimating a population parameter from sample data. An estimate is the actual number the formula produces from the sample data.
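
In code, the distinction maps onto a function versus its return value. In the sketch below the function `sample_mean` plays the role of an estimator of the population mean, and the number it returns for one particular sample is the estimate. The sample values are made up for illustration.

```python
from statistics import fmean


def sample_mean(sample):
    """Estimator: a rule that maps any sample to a guess of the population mean."""
    return fmean(sample)


# Hypothetical sample of n = 5 draws (made-up numbers, not real data).
sample = [2, 4, 4, 5, 5]

# Estimate: the number the estimator produces for this particular sample.
estimate = sample_mean(sample)
print(estimate)  # 4.0
```

A different random sample from the same population would generally give a different estimate, even though the estimator itself is unchanged.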