ISLR Chapter 6 Linear Model Selection and Regularization Lab Manual, Exercises of Statistics

Introduction to Statistical Learning (James/Witten/Hastie/Tibshirani)

Typology: Exercises

2020/2021

Uploaded on 05/26/2021

ekanga
ekanga 🇺🇸

4.9

(16)

263 documents

1 / 5

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Chapter 6 - Linear Model Selection and Regularization
Lab Solution
1 Problem 9
library(ISLR)
set.seed(1)
(a). Split the data set into a training set and a test set.
train.size =dim(College)[1]/2
train =sample(1:dim(College)[1], train.size)
test =-train
College.train =College[train, ]
College.test =College[test, ]
(b). Fit a linear model using least squares on the training set, and report the test error obtained
lm.fit =lm(Apps~., data=College.train)
lm.pred =predict(lm.fit, College.test)
mean((College.test[, "Apps"]-lm.pred)^2)
## [1] 1108531
The test MSE is 1108531.2630464
(c). Fit a ridge regression model on the training set, with λchosen by cross validation. Report the test error
obtained.
library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-16
train.mat =model.matrix(Apps~., data=College.train)
test.mat =model.matrix(Apps~., data=College.test)
grid =10 ^seq(4,-2,length=100)
mod.ridge =cv.glmnet(train.mat, College.train[, "Apps"], alpha=0,lambda=grid, thresh=1e-12)
lambda.best =mod.ridge$lambda.min
The best λfor ridge regression is 0.0114976
ridge.pred =predict(mod.ridge, newx=test.mat, s=lambda.best)
mean((College.test[, "Apps"]-ridge.pred)^2)
## [1] 1108512
The test MSE for ridge regression is 1108511.6962057, which is slightly smaller than linear regression
1
pf3
pf4
pf5

Partial preview of the text

Download ISLR Chapter 6 Linear Model Selection and Regularization Lab Manual and more Exercises Statistics in PDF only on Docsity!

Chapter 6 - Linear Model Selection and Regularization

Lab Solution

1 Problem 9

library (ISLR) set.seed (1)

(a). Split the data set into a training set and a test set.

train.size = dim (College)[1] / 2 train = sample (1: dim (College)[1], train.size) test = -train College.train = College[train, ] College.test = College[test, ]

(b). Fit a linear model using least squares on the training set, and report the test error obtained

lm.fit = lm (Apps~., data=College.train) lm.pred = predict (lm.fit, College.test) mean ((College.test[, "Apps"] - lm.pred)^2)

## [1] 1108531

The test MSE is 1108531.

(c). Fit a ridge regression model on the training set, with λ chosen by cross validation. Report the test error obtained.

library (glmnet)

_## Loading required package: Matrix

Loading required package: foreach

Loaded glmnet 2.0-_

train.mat = model.matrix (Apps~., data=College.train) test.mat = model.matrix (Apps~., data=College.test) grid = 10 ^ seq (4, -2, length=100) mod.ridge = cv.glmnet (train.mat, College.train[, "Apps"], alpha=0, lambda=grid, thresh=1e-12) lambda.best = mod.ridge$lambda.min

The best λ for ridge regression is 0.

ridge.pred = predict (mod.ridge, newx=test.mat, s=lambda.best) mean ((College.test[, "Apps"] - ridge.pred)^2)

## [1] 1108512

The test MSE for ridge regression is 1108511.6962057, which is slightly smaller than linear regression

(d). Fit a lasso model on the training set, with λ chosen by crossvalidation. Report the test error obtained, along with the number of non-zero coefficient estimates.

mod.lasso = cv.glmnet (train.mat, College.train[, "Apps"], alpha=1, lambda=grid, thresh=1e-12) lambda.best = mod.lasso$lambda.min

The best λ for the lasso is 28.

lasso.pred = predict (mod.lasso, newx=test.mat, s=lambda.best) mean ((College.test[, "Apps"] - lasso.pred)^2)

## [1] 1028718

The test MSE of 1028717.7713552 is lower than ridge regression and linear regression.

mod.lasso = glmnet ( model.matrix (Apps~., data=College), College[, "Apps"], alpha=1) predict (mod.lasso, s=lambda.best, type="coefficients")

19 x 1 sparse Matrix of class "dgCMatrix"

1

(Intercept) -649.

(Intercept).

PrivateYes -391.

Accept 1.

Enroll -0.

Top10perc 29.

Top25perc -0.

F.Undergrad.

P.Undergrad 0.

Outstate -0.

Room.Board 0.

Books.

Personal.

PhD -4.

Terminal -3.

S.F.Ratio 0.

perc.alumni -1.

Expend 0.

Grad.Rate 4.

The coefficients for several variables are pushed to zero.

(e). Fit a PCR model on the training set, with M chosen by crossvalidation. Report the test error obtained, along with the value of M selected by cross-validation.

library (pls)

_## Attaching package: ’pls’

The following object is masked from ’package:stats’:

loadings_

pcr.fit = pcr (Apps~., data=College.train, scale=T, validation="CV") validationplot (pcr.fit, val.type="MSEP")

Apps

number of components

MSEP

Here, 6 components seems best.

pls.pred = predict (pls.fit, College.test, ncomp=6) mean ((College.test[, "Apps"] - data.frame (pls.pred))^2)

Warning in mean.default((College.test[, "Apps"] - data.frame(pls.pred))^2): argument is not numeric

or logical: returning NA

[1] NA

The test MSE for PLS is NA

(g). Comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these five approaches?

test.avg = mean (College.test[, "Apps"]) lm.test.r2=1- mean ((College.test[,"Apps"]-lm.pred)^2)/ mean ((College.test[,"Apps"]-test.avg)^2) ridge.test.r2=1- mean ((College.test[,"Apps"]-ridge.pred)^2)/ mean ((College.test[,"Apps"]-test.avg)^2)

lasso.test.r2=1- mean ((College.test[,"Apps"]-lasso.pred)^2)/ mean ((College.test[,"Apps"]-test.avg)^2) pcr.test.r2=1- mean ((College.test[,"Apps"]- data.frame (pcr.pred))^2)/ mean ((College.test[,"Apps"]-test.avg)^2)

Warning in mean.default((College.test[, "Apps"] - data.frame(pcr.pred))^2): argument is not numeric

or logical: returning NA

pls.test.r2=1- mean ((College.test[,"Apps"]- data.frame (pls.pred))^2)/ mean ((College.test[,"Apps"]-test.avg)^2)

Warning in mean.default((College.test[, "Apps"] - data.frame(pls.pred))^2): argument is not numeric

or logical: returning NA

rbind ( c ("OLS", "Ridge", "Lasso", "PCR", "PLS"), c (lm.test.r2, ridge.test.r2, lasso.test.r2, pcr.test.r2, pls.test.r2))

## [,1] [,2] [,3] [,4] [,5]

[1,] "OLS" "Ridge" "Lasso" "PCR" "PLS"

[2,] "0.901068223618923" "0.901069969877579" "0.908191243758643" NA NA

Each methodology has a reasonably large R^2 and they are all fairly close together. There may not be much of a difference between them.