

Material Type: Notes; Class: Introduction to Probability and Statistics; Subject: Statistics; University: University of California - Berkeley; Term: Unknown 1989;
Typology: Study notes


Outline:
1 Least-Squares Regression
Regression describes the relationship between two variables in the situation where one variable can be used to explain or predict the other. The regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
2 Fitting the Regression Line to Data
Since we intend to predict y from x, the errors of interest are mispredictions of y for a fixed x.
The least-squares regression line of y on x is the line that minimizes the sum of the squared errors, that is, the squared vertical differences between the observed and predicted values of y. This is the least-squares criterion.
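Written out, the criterion might look as follows. This is a sketch in notation assumed here: a candidate line is written as a + bx, and (x_i, y_i), i = 1, ..., n, are the observed pairs.

```latex
\min_{a,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (a + b x_i) \bigr)^2
```

Each term in the sum is the squared vertical error of the line at one data point, so the minimizing pair (a, b) gives the line with the smallest total squared misprediction of y.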
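Given pairs of observations $(x_1, y_1), \ldots, (x_n, y_n)$, the regression line is given by
$\hat{y} = a + bx$
where $b = r \, s_y / s_x$ and $a = \bar{y} - b\bar{x}$. Here $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and $r$ is the correlation between x and y.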
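The slope and intercept formulas can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `fit_line` is hypothetical, the standard deviations are assumed to use the n - 1 divisor, and the data are made up so that the points lie exactly on y = 1 + 2x.

```python
import math

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (assumed divisor: n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sample correlation coefficient r.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Illustrative data (not from the notes): points exactly on y = 1 + 2x.
a, b = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Because the illustrative points are perfectly linear, r equals 1 and the fitted line should recover an intercept close to 1 and a slope close to 2, up to floating-point rounding.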
3 Interpreting the Regression Model
4 Facts about Least-Squares Regression
These properties depend on the least-squares fitting criterion and are one reason why that criterion is used.
5 Residuals
Residuals are the signed vertical deviations of the data points from the corresponding predicted values on the regression line.
$r_i = \text{observed } y - \text{predicted } y = y_i - \hat{y}_i = y_i - (a + b x_i)$
For a least squares regression, the residuals always have mean zero.
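The zero-mean property can be checked numerically. The sketch below uses made-up data and a hypothetical helper `fit_line`; it computes the slope as S_xy / S_xx, which is algebraically the same as r s_y / s_x.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    # b = r * s_y / s_x simplifies to S_xy / S_xx.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Illustrative data (not from the notes).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.5]

a, b = fit_line(xs, ys)
# Residual = observed y minus predicted y at the same x.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)
```

Here `mean_residual` should be zero up to floating-point rounding. The reason is that a = ȳ − b x̄ forces the line through the point (x̄, ȳ), so the positive and negative residuals cancel.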
6 Residual Plots
A residual plot is a scatterplot of the residuals against the explanatory variable. It can be used to assess the fit of the regression line.
Patterns to look for:
Influential observations are individuals with extreme x values that exert a strong influence on the position of the regression line. Removing them would significantly change the regression line.
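The effect of an influential observation can be illustrated by fitting the line with and without one extreme-x point. The data below are invented for illustration and `slope` is a hypothetical helper; the sketch only shows the idea of comparing the two fits.

```python
def slope(xs, ys):
    """Least-squares slope b = S_xy / S_xx (equivalent to r * s_y / s_x)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

# Illustrative data (not from the notes): five points near y = x,
# plus one point with an extreme x value that does not follow the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.0]

b_with = slope(xs, ys)               # fit using all six points
b_without = slope(xs[:-1], ys[:-1])  # fit after dropping the extreme-x point
```

With these numbers the slope without the extreme point is about 1.0, while including it pulls the slope down to roughly 0.15, which is the kind of large change that marks an observation as influential.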
A Regression Example
Consider the following data on unemployment rate and unemployment expenditure for several countries:

Country   Unemp. Rate   Unemp. Exp.
swz        0.5          0.
lux        1.4          0.
swd        1.6          0.
jap        2.1          0.
aut        3.3          0.
fin        3.4          0.
por        4.6          0.
ger        4.7          1.
nor        5.2          1.
us         5.4          0.
uk         6.8          0.
gr         7.0          0.
aus        7.0          1.
bel        7.6          1.
nl         7.8          2.
nz         7.9          1.
can        8.1          1.
fr         8.9          1.
den        9.7          3.
it        10.3          0.
ir        13.8          2.
sp        15.9          2.
Summary Statistics
$\bar{x} = 6.5$, $\bar{y} = 1.20$, $s_x = 3.87$, $s_y = 0.89$, $r = 0.73$
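As a worked illustration, the summary statistics can be plugged into the slope and intercept formulas, taking x as the unemployment rate and y as the unemployment expenditure. Because the inputs are already rounded, the results below are approximate.

```latex
b = r \, \frac{s_y}{s_x} = 0.73 \times \frac{0.89}{3.87} \approx 0.168
\qquad
a = \bar{y} - b\,\bar{x} \approx 1.20 - 0.168 \times 6.5 \approx 0.109
\qquad
\hat{y} \approx 0.109 + 0.168\,x
```

Under this fit, a one-point increase in the unemployment rate is associated with a predicted increase of about 0.17 in unemployment expenditure, in the units of the table.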