Regression Analysis in R: Fitted Values, Residuals, and Model Significance

This R Markdown document walks through a regression analysis in R, focusing on calculating fitted values and residuals and on assessing model significance. It shows how to compute and interpret key statistical measures such as R-squared, adjusted R-squared, and VIF, and how to identify significant variables, with practical examples aimed at students and researchers.

2023/2024

Uploaded on 01/31/2025

francisco-sanchez-8ib

6203hw2p1
2025-01-19
##2. Calculate Fitted Values, Residuals, and Check Results
# Calculate fitted values and residuals
fitted_values <- fitted(model) # Predicted values (y_hat)
residuals <- resid(model) # Residuals (y - y_hat)
# Combine original y, fitted y_hat, and residuals
results <- data.frame(
Original_Y = data$Price[1:10], # Actual Price for the first 10 observations
Fitted_Y = fitted_values[1:10], # Fitted values for the first 10 observations
Residuals = residuals[1:10] # Residuals for the first 10 observations
)
# Display the results
print(results)
## Original_Y Fitted_Y Residuals
## 1 21500 19186.61 2313.3889
## 2 20950 19902.38 1047.6167
## 3 19950 20114.27 -164.2689
## 4 19600 19891.72 -291.7201
## 5 21500 19223.83 2276.1696
## 6 22500 18930.85 3569.1535
## 7 22000 19698.89 2301.1095
## 8 22750 18564.51 4185.4929
## 9 17950 16239.52 1710.4842
## 10 16750 15604.50 1145.5044
# Check if Residuals = Original_Y - Fitted_Y
check <- all.equal(results$Residuals, results$Original_Y - results$Fitted_Y)
print(paste("Residuals equal Original_Y - Fitted_Y: ", check))
## [1] "Residuals equal Original_Y - Fitted_Y: TRUE"
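
The check above can be tried without the homework data. The following is a minimal sketch that assumes the built-in `mtcars` dataset as a stand-in (the names `demo_model` and `residuals_match` are illustrative, not from the original). It also wraps `all.equal()` in `isTRUE()`, because `all.equal()` returns a character description rather than `FALSE` when the values differ.

```r
# Illustrative only: mtcars stands in for the homework data, which is not shown here
demo_model <- lm(mpg ~ wt + hp, data = mtcars)

y     <- mtcars$mpg                  # observed response
y_hat <- unname(fitted(demo_model))  # fitted values, names dropped for comparison
res   <- unname(resid(demo_model))   # residuals, names dropped for comparison

# all.equal() compares with a numeric tolerance; isTRUE() turns its result
# into a strict TRUE/FALSE
residuals_match <- isTRUE(all.equal(res, y - y_hat))
print(residuals_match)
```

Dropping the names with `unname()` keeps the comparison about the numbers only, since `all.equal()` also compares attributes.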

# Extract coefficients and standard errors
coefficients <- coef(model) # Estimated coefficients (β)
std_errors <- summary(model)$coefficients[, 2] # Standard errors (se(β))
# Calculate t-statistics manually
calculated_t_stats <- coefficients / std_errors
# Compare with t-statistics from summary()
summary_t_stats <- summary(model)$coefficients[, 3]
comparison <- data.frame(
Variable = names(coefficients),
Calculated_T_Stats = calculated_t_stats,
Summary_T_Stats = summary_t_stats
)
print(comparison)
##                Variable Calculated_T_Stats Summary_T_Stats
## (Intercept) (Intercept)        -1.91555684 -1.
## Age                 Age       -49.78213520 -49.
## KM                   KM       -10.16168388 -10.
## HP                   HP         7.52407315 7.
## Metallic       Metallic        -0.30678348 -0.
## Automatic     Automatic         3.33400809 3.
## CC                   CC        -0.06188146 -0.
## Doors             Doors         0.32182598 0.
## Gears             Gears         3.25899198 3.
## Weight           Weight        11.56676201 11.
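
The same manual approach extends from t-statistics to p-values. The sketch below is illustrative and assumes the built-in `mtcars` data rather than the homework dataset; `demo_model`, `t_stats`, and `p_values` are hypothetical names. It computes two-sided p-values from the t distribution and compares them with the column reported by `summary()`.

```r
# Illustrative only: mtcars stands in for the homework data
demo_model <- lm(mpg ~ wt + hp, data = mtcars)
coef_table <- summary(demo_model)$coefficients

# t-statistic = estimate / standard error
t_stats <- coef_table[, "Estimate"] / coef_table[, "Std. Error"]

# Two-sided p-value: probability of a |t| at least this large under H0
p_values <- 2 * pt(abs(t_stats), df = demo_model$df.residual, lower.tail = FALSE)

# Compare with the p-values reported by summary()
p_match <- isTRUE(all.equal(p_values, coef_table[, "Pr(>|t|)"]))
print(p_match)
```

Referring to columns by name (`"Estimate"`, `"Std. Error"`) instead of position makes the intent easier to read than `[, 2]` and `[, 3]`.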

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

# Degrees of freedom
df <- model$df.residual
# Critical value at 95% confidence level
critical_value <- qt(0.975, df)
print(paste("Critical value at 95% confidence level: ", critical_value))
## [1] "Critical value at 95% confidence level: 1.96185754407177"
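
Once the critical value is known, each coefficient can be flagged by comparing |t| with it. The sketch below assumes the built-in `mtcars` data and hypothetical names (`demo_model`, `significant_by_t`, `significant_by_p`); it also checks that the |t| rule and the p < 0.05 rule agree for this model, as they should for a two-sided test at the 5% level.

```r
# Illustrative only: mtcars stands in for the homework data
demo_model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
coef_table <- summary(demo_model)$coefficients

# Two-sided critical value at the 5% significance level
critical_value <- qt(0.975, df = demo_model$df.residual)

# Rule 1: |t| exceeds the critical value
significant_by_t <- abs(coef_table[, "t value"]) > critical_value
# Rule 2: p-value below 0.05
significant_by_p <- coef_table[, "Pr(>|t|)"] < 0.05

print(data.frame(
  t_value     = round(coef_table[, "t value"], 3),
  significant = significant_by_t
))
```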

Including Plots

You can also embed plots, for example:

# Calculate SSE and SST
SSE <- sum(residuals(model)^2) # Sum of Squared Errors
SST <- sum((data$Price - mean(data$Price))^2) # Total Sum of Squares
# Calculate R-squared
calculated_r_squared <- 1 - (SSE / SST)
# Compare with R-squared from summary()
routine_r_squared <- summary(model)$r.squared
# Display comparison
print(paste("Calculated R-squared: ", calculated_r_squared))
## [1] "Calculated R-squared: 0.864867726797135"
print(paste("Routine R-squared: ", routine_r_squared))
## [1] "Routine R-squared: 0.864867726797135"
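
Adjusted R-squared can be reproduced in the same way. The following sketch assumes the built-in `mtcars` data and illustrative names (`demo_model`, `adj_r_squared`); it applies the usual adjustment 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors excluding the intercept.

```r
# Illustrative only: mtcars stands in for the homework data
demo_model <- lm(mpg ~ wt + hp, data = mtcars)

n <- nrow(mtcars)                  # number of observations
p <- length(coef(demo_model)) - 1  # number of predictors, intercept excluded

SSE <- sum(resid(demo_model)^2)                   # Sum of Squared Errors
SST <- sum((mtcars$mpg - mean(mtcars$mpg))^2)     # Total Sum of Squares
r_squared <- 1 - SSE / SST

# Penalize R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

adj_match <- isTRUE(all.equal(adj_r_squared, summary(demo_model)$adj.r.squared))
print(adj_match)
```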

# Install and load the "car" package
if (!require(car)) install.packages("car", dependencies = TRUE)
## Loading required package: car
## Warning: package 'car' was built under R version 4.4.
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.
library(car)
# Calculate VIF for the model
vif_values <- vif(model)
# Display VIF values
print(vif_values)
##      Age       KM       HP Metallic Automatic       CC    Doors    Gears
## 1.914772 1.586299 1.548521 1.017014  1.101810 1.104307 1.269749 1.
##   Weight
## 2.
# Check for multicollinearity (VIF > 5 or 10 indicates potential multicollinearity)
multicollinearity_check <- vif_values > 5
print(multicollinearity_check)
##   Age    KM    HP Metallic Automatic    CC Doors Gears
## FALSE FALSE FALSE    FALSE     FALSE FALSE FALSE FALSE
## Weight
##  FALSE

  1. Reproduce VIF for Weight

#i Regress Weight on all other predictors
weight_model <- lm(Weight ~ Age + KM + HP + Metallic + Automatic + CC + Doors + Gears, data = data)
# Calculate R-squared for this regression
r_squared_weight <- summary(weight_model)$r.squared
print(paste("R-squared for Weight regression: ", r_squared_weight))
## [1] "R-squared for Weight regression: 0.523023869724915"
#ii Calculate VIF for Weight
vif_weight <- 1 / (1 - r_squared_weight)
print(paste("VIF for Weight: ", vif_weight))
## [1] "VIF for Weight: 2.09654097244504"
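
The auxiliary-regression recipe used for Weight can be repeated for every predictor without the `car` package. The sketch below assumes the built-in `mtcars` data and three arbitrarily chosen numeric predictors; `predictors` and `manual_vif` are illustrative names. As a cross-check it uses the identity that, for numeric predictors, the VIFs equal the diagonal of the inverse of the predictors' correlation matrix.

```r
# Illustrative only: mtcars stands in for the homework data
predictors <- c("wt", "hp", "disp")

# For each predictor, regress it on the others and apply VIF = 1 / (1 - R^2)
manual_vif <- sapply(predictors, function(target) {
  others <- setdiff(predictors, target)
  aux_model <- lm(reformulate(others, response = target), data = mtcars)
  1 / (1 - summary(aux_model)$r.squared)
})
print(manual_vif)

# Cross-check: diagonal of the inverse correlation matrix of the predictors
vif_from_cor <- diag(solve(cor(mtcars[, predictors])))
vif_match <- isTRUE(all.equal(unname(manual_vif), unname(vif_from_cor)))
print(vif_match)
```

`reformulate()` builds the formula from character vectors, which avoids typing out one auxiliary regression per predictor.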

  1. Remove Non-Significant Variables and Run a New Regression

First, identify the significant variables based on Step 6 (variables with p < 0.05 or |t| > critical value) and remove the non-significant ones.

# New model with only significant variables
significant_model <- lm(Price ~ Age + KM + HP + Automatic + Gears + Weight, data = data) # Replace with significant variables identified in Step 6
# Summary of the new model
summary(significant_model)
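
Picking the significant variables can also be done programmatically instead of by hand. This is a minimal sketch under stated assumptions: it uses the built-in `mtcars` data, numeric predictors only (factor predictors produce coefficient names that differ from the variable names), and hypothetical names such as `full_model`, `keep`, and `reduced_model`.

```r
# Illustrative only: mtcars stands in for the homework data
full_model <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# p-values for the predictors; the first row (intercept) is dropped
p_values <- summary(full_model)$coefficients[-1, "Pr(>|t|)"]

# Keep predictors significant at the 5% level
keep <- names(p_values)[p_values < 0.05]

# Fall back to an intercept-only model if nothing is significant
rhs <- if (length(keep) > 0) keep else "1"
reduced_model <- lm(reformulate(rhs, response = "mpg"), data = mtcars)

print(formula(reduced_model))
```

Dropping several variables at once based on individual t-tests is a simplification; the p-values of the remaining variables can change once the model is refitted, so the reduced model's summary is worth inspecting again.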

Example Calculations

Effect of Age: The coefficient of Age tells how much the price decreases for each additional month. Multiply by 12 to calculate the effect for one year.
Effect of KM: The coefficient of KM tells how much the price decreases for every additional kilometer. Multiply by 10,000 for the impact of 10,000 km.

# Retrieve coefficients
coefficients_reduced <- coef(significant_model)
# Effects
effect_of_age <- coefficients_reduced["Age"] * 12 # Effect of 1 year older
effect_of_km <- coefficients_reduced["KM"] * 10000 # Effect of 10,000 more kilometers
print(paste("Price decreases by", round(effect_of_age, 2), "Euros for 1 year older."))
## [1] "Price decreases by -1558.46 Euros for 1 year older."
print(paste("Price decreases by", round(effect_of_km, 2), "Euros for 10,000 more kilometers."))
## [1] "Price decreases by -146.31 Euros for 10,000 more kilometers."

The printed values are negative because the coefficients themselves are negative: the price falls by about 1,558.46 Euros for each additional year of age and by about 146.31 Euros for every additional 10,000 km.
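
Pairing the word "decreases" with a negative number reads as a double negative. One way to avoid that is to choose the verb from the sign of the effect and print its absolute value. The sketch below assumes the built-in `mtcars` data, and `demo_model`, `effect_of_hp`, and `message_text` are illustrative names.

```r
# Illustrative only: mtcars stands in for the homework data
demo_model <- lm(mpg ~ wt + hp, data = mtcars)

# Scale the per-unit coefficient to a more readable step: 100 more horsepower
effect_of_hp <- unname(coef(demo_model)["hp"]) * 100

# Choose the verb from the sign, then report the magnitude
direction <- if (effect_of_hp < 0) "decreases" else "increases"
message_text <- paste(
  "Fuel economy", direction, "by", round(abs(effect_of_hp), 2),
  "mpg for 100 more horsepower."
)
print(message_text)
```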