Multiple Regression Excel Lab - Econometrics - Lecture Notes, Study notes of Econometrics and Mathematical Economics

Multiple Regression Excel Lab, Excel Setup, Multiple Regression, Analysis, Regression Results, Estimate of the Intercept, Estimate of the Slopes, Standard Error, Regression Statistics, Summary Output are some points from this lecture, Econometrics handout.

Typology: Study notes

2011/2012

Uploaded on 12/18/2012

dipen
dipen 🇮🇳

4.4

(76)

140 documents

1 / 6

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Multiple Regression Excel Lab
Part I. Excel Setup
A. Be sure you’ve got the data analysis tool pack added in
Under “tools” in the main tool bar at the top of the screen see if the option “data analysis” is available (it should be at the
bottom of the list). If data analysis is not there do the following:
1. Under “tools” select “add-ins”
2. Click on the box for “analysis toolpak”
3. Click “OK”
4. Go back and check to see that “data analysis” is now an option under tools.
B. Purpose of Today’s lab
In today’s lab we’re going to learn how to get Excel to perform multiple regression, examine dummy variables, and take a
“model-building” approach to regression.
Part II. Multiple Regression
Here we’re going to estimate the linear relationship between ATTENDANCE (Y) and POP (X1) and CURNTWIN (X2).
A. Sheet Set-up
1. Open the data set called “baseball attendance and 12 factors” (the file is on my webpage).
2. Insert 1 new worksheet and re-save the file as an Excel workbook with a new name.
3. Label the new sheet “mult reg”.
B. Analysis
1. Copy the ATTENDANCE, POP and CURNTWIN variables into the new sheet.
2. Choose “Tools”, “data analysis”, and “regression”.
3. In the “Input Y range” cell, highlight the ATTENDANCE column, including the top cell as a label.
4. In the “Input X range” cell, highlight both the POP and CURNTWIN columns, including the top cells as labels.
5. Check the box for “labels”. Do not check the box for “constant is zero” .
6. With the cursor blinking in the “output range” cell, select a cell somewhere to the right of your data.
7. Click “OK”
Examine the regression results. Locate each of the following:
a. The estimate of the intercept (
β
ˆ
0)
b. The 2 estimate of the slopes (
β
ˆ1
) and (
β
ˆ2
)
c. The standard error of the estimate (SER)
d. The standard error of the intercept ( S
β
ˆ0)
e. The standard errors of the slopes ( S
β
ˆ1) and (
S
β
ˆ
2
)
f. The 95% CI’s for β0 , β1 and β2
g. The t-statistics for Ho(1): β0 = 0 , Ho(2):β1 = 0 , and Ho(3): β2 = 0
h. The value of R-square (recall that it shows the percentage of the variation in Y that we explained with our
regression).
Docsity.com
pf3
pf4
pf5

Partial preview of the text

Download Multiple Regression Excel Lab - Econometrics - Lecture Notes and more Study notes Econometrics and Mathematical Economics in PDF only on Docsity!

Multiple Regression Excel Lab

Part I. Excel Setup A. Be sure you’ve got the data analysis tool pack added in Under “tools” in the main tool bar at the top of the screen see if the option “data analysis” is available (it should be at the bottom of the list). If data analysis is not there do the following:

  1. Under “tools” select “add-ins”
  2. Click on the box for “analysis toolpak”
  3. Click “OK”
  4. Go back and check to see that “data analysis” is now an option under tools.

B. Purpose of Today’s lab In today’s lab we’re going to learn how to get Excel to perform multiple regression, examine dummy variables, and take a “model-building” approach to regression.

Part II. Multiple Regression Here we’re going to estimate the linear relationship between ATTENDANCE (Y) and POP (X 1 ) and CURNTWIN (X 2 ).

A. Sheet Set-up

  1. Open the data set called “baseball attendance and 12 factors” (the file is on my webpage).
  2. Insert 1 new worksheet and re-save the file as an Excel workbook with a new name.
  3. Label the new sheet “mult reg”.

B. Analysis

  1. Copy the ATTENDANCE, POP and CURNTWIN variables into the new sheet.
  2. Choose “Tools”, “data analysis”, and “regression”.
  3. In the “Input Y range” cell, highlight the ATTENDANCE column, including the top cell as a label.
  4. In the “Input X range” cell, highlight both the POP and CURNTWIN columns, including the top cells as labels.
  5. Check the box for “labels”. Do not check the box for “constant is zero”.
  6. With the cursor blinking in the “output range” cell, select a cell somewhere to the right of your data.
  7. Click “OK”

Examine the regression results. Locate each of the following:

a. The estimate of the intercept ( βˆ

0

b. The 2 estimate of the slopes ( βˆ^

1

) and ( βˆ^

2

c. The standard error of the estimate (SER)

d. The standard error of the intercept ( S βˆ

0

e. The standard errors of the slopes ( S βˆ

1

) and ( S βˆ

2

f. The 95% CI’s for β 0 , β 1 and β 2

g. The t-statistics for Ho(1): β 0 = 0 , Ho(2):β 1 = 0 , and Ho(3): β 2 = 0

h. The value of R-square (recall that it shows the percentage of the variation in Y that we explained with our regression).

Questions:

**1. Write out and interpret the regression equation you’ve estimated.

  1. Compare the results of this multiple to the results of the two corresponding simple regressions that we did in the simple regression lab (pasted below). Are the regression coefficients different? If so, why?**

SUMMARY OUTPUT (Y = attendance, X = POP)

Regression Statistics Multiple R 0. R Square 0. Adjusted R Square 0. Standard Error 542. Observations 78

ANOVA

df SS MS F Significance F Regression 1 5180060.737 5180061 17.61062 7.28E- Residual 76 22354952.16 294144. Total 77 27535012.

Coefficients Standard Error t Stat P-value Lower 95%

Upper 95% Intercept 1393.433079 111.2778705 12.5221 3.7E-20 1171.804 1615. POP 0.109927776 0.026195102 4.196501 7.28E-05 0.057756 0.

SUMMARY OUTPUT (Y = attendance, X = CURNTWIN)

Regression Statistics Multiple R 0. R Square 0. Adjusted R Square 0. Standard Error 440. Observations 78

ANOVA

df SS MS F Significance F Regression 1 12819510.34 12819510 66.20792 6.02E- 12 Residual 76 14715502.55 193625 Total 77 27535012.

Coefficients Standard Error t Stat P-value Lower 95%

Upper 95% Intercept -1340.9213 387.1268058 -3.46378 0.000878 -2111.95 -569. CURNTWIN 38.6325231 4.74786222 8.136825 6.02E-12 29.17632 48.

Part III. Multiple regression with dummy variables: C. Import the data for this section The data set for this section is titled “salary, age, Ph.D, evaluation, pubs, and gender for 100 university professors”, and is on my webpage (“professor data”).

1 st, we’re going calculate the correlation coefficients for all combinations of these 6 variables.

  1. Copy and paste all data into a worksheet labeled “correlation”.
  2. Click “Tools” , then “Data Analysis” then “Correlation”, then “OK”
  3. In the input range box, highlight all 6 columns of data, including the labels
  4. Check the circle for “Grouped by: Columns”
  5. Check the “labels in first row” box
  6. Select an output range and click “OK”

Your output should consist of a 6 x 6 matrix. The correlation coefficient between two variables is found in the corresponding off-diagonal element of the matrix.

Rank age, Ph.D, evaluation, pubs, and gender in order of their highest correlation with salary.

Questions:Given that “gender” and “PhD” are dummy variables, how would you interpret the mean of these variables?

Given that “gender” and “PhD” are dummy variables, how do you interpret correlation coefficients between these variables and other (quantitative) variables?

  1. In separate worksheets, do 5 regressions using Y = salary, X1 = highest correlation with salary, X2 = second highest correlation with salary, etc… as follows:

Regressions:

  1. Y = f(X1)
  2. Y = f(X1, X2)
  3. Y = f(X1, X2, X3)
  4. Y = f(X1, X2, X3, X4)
  5. Y = f(X1, X2, X3, X4, X5)

What is the interpretation of the coefficient on “gender”?

What is the interpretation of the coefficient on “Ph.D”?

What would you conclude about the following null hypotheses in the final regression?

1. Ho(1): β **0 = 0

  1. Ho(2):** β **1 = 0
  2. Ho(3):** β **2 = 0
  3. Ho(4):** β **3 = 0
  4. Ho(4):** β **4 = 0
  5. Ho(4):** β **5 = 0
  6. Ho(5):** β 1 = β 2 = β 3 = β 4 = β 5 = 0

Based on the above, what can you therefore conclude about the significance of the true regression coefficients in the context of the regression? Explain.

2. How has the value of the coefficient of determination (R^2 ) and adjusted coefficient of determination changed as you add variables to the model? (fill-in the table below)

Regression R-squared Adjusted R-squared Y = f(X 1 ) Y = f(X 1 , X 2 ) Y = f(X 1 , X 2 , X 3 ) Y = f(X 1 , X 2 , X 3 , X 4 ) Y = f(X 1 , X 2 , X 3 , X 4 , X 5 )

Based on the changes in R^2 and adjusted R^2 , what can you say about the contribution of each of these X variables towards explaining the variation in Y?