



A concise overview of statistical learning techniques, focusing on model selection, shrinkage methods, and spline-based regression. It covers best subset selection, stepwise selection, AIC, BIC, adjusted R-squared, lasso and ridge regression, polynomial regression, piecewise functions, regression splines, GAMs, smoothing splines, decision trees, gradient boosting, and random forests. Key concepts include regularization, model complexity, and techniques to prevent overfitting.
Typology: Exams




Best Subset Selection
Solution: Fits every combination of predictors, keeps the best-fitting model of each size, then chooses among sizes using a criterion such as AIC, BIC, or validation error.

Forward Stepwise Selection
Solution: Starts with no predictors and adds them one at a time, each time adding the predictor that improves model fit the most.

Backward Stepwise Selection
Solution: Starts with all predictors and removes the least useful one at each step.

AIC
Solution: A model selection metric that balances model fit and complexity. Lower is better. Penalizes the number of parameters k with a 2k term.

BIC
Solution: Similar to AIC but penalizes complexity with k ln(n), where n is the number of observations. This penalty is heavier than AIC's 2k once n is at least 8.

Adjusted R-squared
Solution: Modified R-squared that adjusts for the number of predictors, discouraging overfitting by penalizing extra variables.

Lasso Regression
Solution: Linear regression with an L1 penalty that can shrink some coefficients exactly to zero, which performs variable selection.

Lasso: Model Shape
Solution: Linear or curved depending on the terms used (for example, polynomial terms); sparse because some coefficients are zero.

Lasso: Lambda
Solution: Controls penalty strength; larger values increase regularization and shrink more coefficients to zero.

Lasso: Choosing Lambda
Solution: Commonly chosen via cross-validation to minimize validation error.

Ridge Regression
Solution: Linear regression with an L2 penalty that shrinks all coefficients toward zero but generally keeps them nonzero.

Ridge vs. Lasso
Solution: Ridge keeps all predictors; Lasso drops unimportant ones. Ridge tends to be more stable when predictors are highly correlated (multicollinearity).

Polynomial Regression
Solution: Models curves by including powers of predictors (e.g., x, x^2, x^3).

Piecewise Constant Function
Solution: Splits the input space into bins and assigns a constant value to each.

Piecewise Polynomial Function
Solution: Fits a separate polynomial to each section of the input range, possibly with discontinuities at the boundaries.

Regression Splines
Solution: Piecewise polynomials, expressed through basis functions, that are constrained to be continuous at the knots.

Natural Splines
Solution:
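To make the Ridge vs. Lasso card above concrete, here is a minimal sketch of how the two penalties act on a set of least-squares coefficients. It assumes the simplified textbook case of orthonormal predictors, with the lasso objective written as (1/2)·RSS + λ·Σ|β| and the ridge objective as (1/2)·RSS + (λ/2)·Σβ². Under those assumptions each coefficient can be updated on its own. The starting coefficients and the value of `lam` are made up for illustration.

```python
def soft_threshold(z, lam):
    """Lasso under the orthonormal-design assumption:
    move z toward zero by lam, and set it to exactly zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Ridge under the same assumption: scale z by 1 / (1 + lam).
    A nonzero coefficient stays nonzero."""
    return z / (1.0 + lam)


# Hypothetical least-squares coefficients (illustrative values only).
ols_coefs = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

lasso_coefs = [soft_threshold(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]

print("lasso:", lasso_coefs)  # lasso: [2.5, -1.0, 0.0, 0.0]
print("ridge:", ridge_coefs)
```

The two small coefficients are dropped by the lasso, while ridge shrinks all four proportionally and keeps every predictor in the model. With correlated predictors the updates are no longer this simple, and a library solver would normally be used instead.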
Solution: More splits produce a deeper tree and higher complexity. Deep trees may overfit.

Gradient Boosting
Solution: Builds many small trees sequentially; each new tree corrects the errors of the model built so far.

Random Forests
Solution: Ensemble of trees trained on bootstrap samples with random feature selection at each split.

Bagging
Solution: Training each tree on a different bootstrap sample (a random sample of the data drawn with replacement) and combining the trees' predictions.

Random Feature Selection
Solution: At each split, only a random subset of predictors is considered, which reduces correlation between trees.

High dimensions in best subset selection
Solution: Best subset selection evaluates all possible predictor combinations (2^p models for p predictors), which becomes computationally infeasible as the number of predictors increases.

L1 penalty in Lasso
Solution: Penalizes the absolute value of coefficients, encouraging some to shrink exactly to zero.

Sparse Solution in Lasso
Solution: Indicates the model is selecting only the most important predictors and zeroing out the others.

Selection of λ in Lasso
Solution: Chosen via cross-validation to minimize validation RMSE or MSE.
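The Bagging and Random Feature Selection cards above describe the two sources of randomness in a random forest. The sketch below shows only those two sampling steps, not tree fitting. The function names, the data sizes, and the choice of about sqrt(p) candidate features per split are assumptions made for illustration; sqrt(p) is a common default, not something stated in these notes.

```python
import math
import random


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, rng):
    """Pick roughly sqrt(p) distinct candidate features for a single split."""
    m = max(1, int(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), m))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_rows, n_features = 100, 16  # hypothetical data set size

# Rows used to grow one tree; some rows repeat, others are left out.
sample = bootstrap_indices(n_rows, rng)
out_of_bag = set(range(n_rows)) - set(sample)

# Features this particular split is allowed to consider.
candidates = random_feature_subset(n_features, rng)

print("distinct rows in sample:", len(set(sample)))
print("out-of-bag rows:", len(out_of_bag))
print("candidate features for this split:", candidates)
```

In a full random forest, a fresh bootstrap sample would be drawn for every tree and a fresh feature subset for every split. The out-of-bag rows are the ones a tree never saw during training.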
Preference for Ridge regression
Solution: When predictors are highly correlated and you want to retain all of them without variable selection.

Role of λ in smoothing splines
Solution: Controls the trade-off between curve smoothness and data fit.

Low λ in smoothing splines
Solution: Can lead to overfitting by allowing the curve to wiggle too much.

Decision trees modeling f(x)
Solution: Model f(x) as a step function with constant values within rectangular regions defined by the splits.

Additive structure in GAMs
Solution: Each predictor enters through its own function, so it influences the response separately from the other predictors.

Limitation of basic GAMs
Solution: They do not automatically model interactions between predictors.

Regression splines vs polynomial regression
Solution: Regression splines fit lower-degree polynomials within segments and ensure smooth transitions at the knots, instead of fitting one high-degree polynomial over the whole range.

Lower AIC model complexity
Solution: AIC has a weaker penalty for complexity than BIC, potentially allowing models that slightly overfit.

Low BIC value
Solution: Indicates a model that balances good fit with strong penalization for complexity.

Gradient boosting error reduction
Solution:
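The AIC and BIC cards above can be illustrated with a small worked comparison. This sketch assumes Gaussian linear models, for which both criteria can be written, up to a shared additive constant, as n·ln(RSS/n) plus the penalty (2k for AIC, k·ln(n) for BIC). The two candidate models and their RSS values are hypothetical numbers chosen to show the criteria disagreeing.

```python
import math


def gaussian_aic(rss, n, k):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def gaussian_bic(rss, n, k):
    """BIC for a Gaussian linear model, up to the same additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


n = 100  # number of observations (hypothetical)

# Hypothetical fits: the larger model lowers RSS a little, using 2 more parameters.
small = {"k": 3, "rss": 50.0}
large = {"k": 5, "rss": 47.0}

aic_small = gaussian_aic(small["rss"], n, small["k"])
aic_large = gaussian_aic(large["rss"], n, large["k"])
bic_small = gaussian_bic(small["rss"], n, small["k"])
bic_large = gaussian_bic(large["rss"], n, large["k"])

print("AIC prefers:", "large" if aic_large < aic_small else "small")
print("BIC prefers:", "large" if bic_large < bic_small else "small")
```

With these numbers the small gain in fit is enough to offset AIC's 2k penalty but not BIC's k·ln(n) penalty, so AIC picks the larger model and BIC picks the smaller one.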
Cross-validation improvement
Solution: Assesses how well the model performs on held-out validation folds that were not used for fitting.

Linear model vs GAM performance
Solution: The two may perform similarly if the true relationship is approximately linear or the nonlinearity is weak.
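As a rough illustration of the cross-validation card above, the sketch below runs k-fold cross-validation by hand and compares two simple models by their average validation MSE. Everything in it is an assumption made for the example: the synthetic data (a straight line plus a small alternating offset), the two models (a least-squares line and an intercept-only baseline), and the fold assignment by index. A GAM is not included because it would need a third-party library.

```python
def linear_model(xs, ys):
    """Fit y = intercept + slope * x by least squares; return a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_model(xs, ys):
    """Baseline that ignores x and always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def k_fold_cv_mse(xs, ys, k, make_model):
    """Average validation MSE over k folds (row i goes to fold i % k)."""
    fold_mses = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        valid = [i for i in range(len(xs)) if i % k == fold]
        predict = make_model([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - predict(xs[i])) ** 2 for i in valid]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k


# Synthetic data: y = 2x + 1 with a small alternating offset standing in for noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

cv_linear = k_fold_cv_mse(xs, ys, 5, linear_model)
cv_mean = k_fold_cv_mse(xs, ys, 5, mean_model)

print("CV MSE, linear model:", cv_linear)
print("CV MSE, mean-only model:", cv_mean)
```

Each model is refit on the training folds and scored only on the fold it did not see, so the averaged MSE estimates performance on new data. On real data the rows would usually be shuffled before being assigned to folds.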