Understanding R-squared Value and Adjusted R-squared in Regression Analysis, Lecture notes of Business Statistics

Hanoi University of Science (HUS), Business Statistics

An in-depth explanation of the R-squared value and its limitations in evaluating the fit of a regression model. It also introduces the concept of adjusted R-squared and discusses how it differs from R-squared, the importance of considering the number of independent variables in a model, and the relationship between R-squared and adjusted R-squared.

Typology: Lecture notes

2021/2022

Uploaded on 08/05/2022

nguyen_99 🇻🇳



R-Squared Notes:

So far, we have not focused on the R-squared value to evaluate how “well” our model fits the data. Why? Because too much emphasis can be placed on this particular measure, and if you go on to study “time-series” data, you will see that the R-squared value can be extremely misleading.

Things to note:

• There is no particular value that R-squared must reach for you to claim that your model does a good job at explaining the variation in the dependent variable. It is simply an estimate of how much variation can be explained.

• A small R-squared value implies that the error variance is large relative to the variance of y, which means that we may have a hard time precisely estimating the $\beta$ coefficients. BUT, this can be offset by a large sample size. This is true even if we have not controlled for many unobserved factors, which is what leads to the large error variance. EXAMPLE: suppose that some incoming students at a large university are RANDOMLY given grants to buy computer equipment. If the amount of the grant is truly randomly determined, we can estimate the ceteris paribus effect of the grant amount on subsequent college grade point average by using simple regression analysis. Because of the random assignment, all of the other factors affecting GPA would be UNCORRELATED with the grant size. Now, it seems pretty unlikely that grant size would explain very much of the variation in GPA, so the R-squared from this simple regression would probably be pretty low, BUT we might still (with a large enough N) get a reasonably precise estimate of the effect of the grant. (NOTE: we don’t need to worry about omitted variable bias since all the omitted variables would be uncorrelated with the grant size!)

• The relative CHANGE in the R-squared value when variables are added to an equation provides A LOT OF USEFUL INFORMATION. This is related to the joint F-tests that we talked about earlier in testing joint restrictions.
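
The grant example above can be illustrated with a small simulation. This is a minimal sketch under assumed numbers that are not from the notes: a hypothetical true effect of 0.05 GPA points per unit of grant, grant amounts drawn uniformly between 0 and 2, and normally distributed noise with standard deviation 0.5. The helper names `simple_ols` and `simulate` are made up for illustration.

```python
import math
import random


def simple_ols(x, y):
    """Return (slope, standard error of slope, R-squared) for y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual and total sums of squares
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    # Unbiased error-variance estimate uses N - k - 1 = N - 2 here
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, se_slope, 1.0 - rss / tss


def simulate(n, rng, effect=0.05):
    """Randomly assigned grants, so grant size is uncorrelated with the noise."""
    grant = [rng.uniform(0.0, 2.0) for _ in range(n)]
    gpa = [2.8 + effect * g + rng.gauss(0.0, 0.5) for g in grant]
    return simple_ols(grant, gpa)


rng = random.Random(0)  # fixed seed so the run is repeatable
slope_small, se_small, r2_small = simulate(100, rng)
slope_large, se_large, r2_large = simulate(20000, rng)

print(f"N=100:   slope={slope_small:.4f}  se={se_small:.4f}  R2={r2_small:.4f}")
print(f"N=20000: slope={slope_large:.4f}  se={se_large:.4f}  R2={r2_large:.4f}")
```

With these assumed numbers the R-squared stays small at both sample sizes, while the standard error of the slope shrinks roughly in proportion to the square root of N.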

R-squared and Adjusted R-squared Value: what happens when we add regressors to our equation.

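• Recall that R-squared is the ratio of the explained sum of squares (ESS) to the total sum of squares (TSS). Writing RSS for the residual sum of squares:

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} = 1 - \frac{RSS/N}{TSS/N}$$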

Now, why is it helpful to write R-squared in this fashion? Think about the following: let $\sigma_y^2$ be the population variance of y (unobserved by us) and $\sigma_\varepsilon^2$ be the population variance of the random disturbance term (again, unobserved by us). Define the POPULATION R-squared to be:

$$\rho^2 = 1 - \frac{\sigma_\varepsilon^2}{\sigma_y^2}$$

which tells us the proportion of the variation of y in the population explained by the independent variables. But we don’t observe the population variances. So, we can use estimators for them:

$$R^2 = 1 - \frac{RSS/N}{TSS/N}$$

Okay: so RSS/N is our ESTIMATOR for $\sigma_\varepsilon^2$ and TSS/N is our estimator for $\sigma_y^2$ in the “usual” R-squared. That is, the usual R-squared is an estimator for the POPULATION R-squared. BUT WE KNOW THAT BOTH OF THESE ESTIMATORS ARE BIASED (numerator and denominator). We can instead use unbiased estimators for $\sigma_\varepsilon^2$ and $\sigma_y^2$. In particular, we could use:

RSS/(N-k-1) and TSS/(N-1).

If we do this, we can get an ADJUSTED R-squared value that is given by:

$$\bar{R}^2 = 1 - \frac{RSS/(N-k-1)}{TSS/(N-1)}$$

BUT: something to keep in mind is that the ratio of unbiased estimators DOES NOT LEAD TO AN UNBIASED ESTIMATOR. And, in fact, the adjusted R-squared is not generally thought to be a better estimator of the population R-squared than the usual R-squared value.

(Recall that our UNBIASED estimator for the variance of the error term is RSS/(N-k-1).)
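
As a worked illustration of the two measures, the sketch below computes both from the sums of squares. The function names and the input numbers (RSS = 40, TSS = 100, N = 50, k = 4) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def r_squared(rss, tss):
    """Usual R-squared: 1 - (RSS/N)/(TSS/N); the N cancels."""
    return 1.0 - rss / tss


def adjusted_r_squared(rss, tss, n, k):
    """Adjusted R-squared: 1 - [RSS/(N-k-1)] / [TSS/(N-1)]."""
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))


# Hypothetical inputs: N observations, k independent variables
rss, tss, n, k = 40.0, 100.0, 50, 4

r2 = r_squared(rss, tss)
adj_r2 = adjusted_r_squared(rss, tss, n, k)

print(f"R-squared          = {r2:.4f}")
print(f"Adjusted R-squared = {adj_r2:.4f}")
```

Here R-squared is 0.60 and the adjusted value is about 0.564. Whenever k is at least one and R-squared is below one, the adjusted value is the smaller of the two, because (N-1)/(N-k-1) is greater than one.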

So, how do the adjusted and regular R-squared differ?

The adjusted R-squared value takes into account the number of INDEPENDENT variables in the model, whereas the regular R-squared does not.
In fact, if we add a new independent variable to our model, the adjusted R-squared value will go up ONLY if the t-statistic on the coefficient estimate of the new variable is GREATER THAN ONE in absolute value. (If you add MORE THAN ONE independent variable, the adjusted R-squared will only go up if the F-statistic for the JOINT SIGNIFICANCE of all the new variables is greater than one.) SO: this is a little bit different than if you were to look at the individual t-stat or the F-stat alone, since at the usual levels of significance we would reject the null only if the test statistic is considerably LARGER than one.
The relationship between the adjusted and regular R-squared values is:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{N-1}{N-k-1}$$
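
The rule that the adjusted R-squared rises only when the new variable's t-statistic exceeds one in absolute value can be checked numerically. The sketch below relies on the standard identity for a single added regressor, t^2 = (RSS_before - RSS_after) / (RSS_after / df_after), to back out the new RSS from a chosen t-statistic. The sums of squares, sample size, and function names are hypothetical.

```python
def adj_r2(rss, tss, n, n_regressors):
    """Adjusted R-squared: 1 - [RSS/(N-k-1)] / [TSS/(N-1)]."""
    return 1.0 - (rss / (n - n_regressors - 1)) / (tss / (n - 1))


def rss_after_adding(rss_before, t_stat, df_after):
    """RSS of the larger model implied by the new variable's t-statistic.

    Solves t^2 = (RSS_before - RSS_after) / (RSS_after / df_after)
    for RSS_after.
    """
    return rss_before * df_after / (df_after + t_stat ** 2)


# Hypothetical starting model: N = 50 observations, 3 regressors
n, k_before = 50, 3
tss, rss_before = 100.0, 40.0
df_after = n - (k_before + 1) - 1  # residual degrees of freedom after adding one variable

results = {}
for t_stat in (0.5, 0.9, 1.1, 2.0):
    rss_after = rss_after_adding(rss_before, t_stat, df_after)
    before = adj_r2(rss_before, tss, n, k_before)
    after = adj_r2(rss_after, tss, n, k_before + 1)
    results[t_stat] = after > before
    print(f"|t| = {t_stat}: adjusted R-squared {before:.4f} -> {after:.4f}")
```

The adjusted R-squared falls for the two t-statistics below one and rises for the two above one, even though the RSS falls in every case.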

Okay: so, now why would we ever look at the adjusted R-squared value and not the R-squared value?

Using the Adjusted R-squared to Choose Between Non-nested Models.

R-squared will NEVER go down (and will almost always go up) if you add RHS variables. Why? Because the RSS can never go up when you add additional variables to your equation. And, if that’s so, looking at the R-squared alone and whether it goes up doesn’t tell you if you’ve got a “better” model.
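
A small simulation makes the point concrete. This is a sketch under assumed settings, not part of the original notes: y is pure noise, the starting model contains only an intercept (so both R-squared and adjusted R-squared equal zero), and the added regressor is an unrelated random variable. The sample size, number of draws, and names are hypothetical.

```python
import random


def r2_simple(x, y):
    """R-squared from regressing y on an intercept and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = tss - sxy ** 2 / sxx  # adding x can only lower the RSS
    return 1.0 - rss / tss


rng = random.Random(1)  # fixed seed so the run is repeatable
n, draws = 30, 2000

r2_never_fell = True
adj_rose = 0
for _ in range(draws):
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unrelated to y
    r2 = r2_simple(noise, y)
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # k = 1 after adding the regressor
    r2_never_fell = r2_never_fell and r2 >= 0.0  # the intercept-only R-squared is 0
    adj_rose += adj > 0.0  # the intercept-only adjusted R-squared is 0

share_adj_rose = adj_rose / draws
print(f"R-squared never fell: {r2_never_fell}")
print(f"Share of draws where adjusted R-squared rose: {share_adj_rose:.3f}")
```

In this setup the usual R-squared never falls below its starting value, while the adjusted R-squared rises in only a minority of draws, namely the ones where the noise regressor happens to have a t-statistic above one in absolute value.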

trying to show how much of the variation in the LHS variable is explained by the data. But Var(y) and Var(ln y) are going to be DIFFERENT. So, this just doesn’t make sense.
