Machine Learning: Q&A on LDA, QDA, and Cross-Validation

Machine Learning Questions and answers

If the Bayes decision boundary is linear, do we expect LDA or QDA to perform better on the training set? On the test set? - ✅✅Even if the Bayes decision boundary is linear, we expect QDA to perform better on the training set because its higher flexibility yields a closer fit (a more flexible model will generally fit the training data better). On the test set, we expect LDA to perform better than QDA because QDA's extra flexibility lets it fit noise in the training sample rather than the truly linear boundary, which adds variance without reducing bias.

If the Bayes decision boundary is non-linear, do we expect LDA or QDA to perform better on the training set? On the test set? - ✅✅If the Bayes decision boundary is non-linear, we expect QDA to perform better on both the training and test sets, since LDA can only produce a linear decision boundary and so cannot approximate a non-linear one.
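The difference between the two boundaries can be seen in a minimal one-dimensional sketch. It assumes Gaussian class densities and plug-in estimates; the function names `fit` and `predict` and the toy data are illustrative assumptions, not part of the original notes. With a shared (pooled) variance the quadratic term cancels between classes, so the rule is linear in x; with per-class variances it does not cancel, and the high-variance class can win on both sides of the low-variance one.

```python
import math

def fit(xs, ys, shared_variance):
    """Estimate priors, means, and variances for 1-D data (hypothetical helper).

    shared_variance=True gives an LDA-style model (one pooled variance),
    shared_variance=False gives a QDA-style model (one variance per class).
    """
    classes = sorted(set(ys))
    n = len(xs)
    groups = {k: [x for x, y in zip(xs, ys) if y == k] for k in classes}
    means = {k: sum(g) / len(g) for k, g in groups.items()}
    priors = {k: len(g) / n for k, g in groups.items()}
    # Within-class sums of squared deviations.
    ss = {k: sum((x - means[k]) ** 2 for x in g) for k, g in groups.items()}
    if shared_variance:
        pooled = sum(ss.values()) / (n - len(classes))
        variances = {k: pooled for k in classes}
    else:
        variances = {k: ss[k] / (len(groups[k]) - 1) for k in classes}
    return priors, means, variances

def predict(model, x):
    """Assign x to the class with the largest Gaussian discriminant score."""
    priors, means, variances = model

    def score(k):
        return (math.log(priors[k])
                - 0.5 * math.log(variances[k])
                - (x - means[k]) ** 2 / (2 * variances[k]))

    return max(priors, key=score)

# Toy data: class 0 is tightly clustered, class 1 is widely spread.
xs = [-1, 0, 1, -4, 5, 14]
ys = [0, 0, 0, 1, 1, 1]
lda = fit(xs, ys, shared_variance=True)
qda = fit(xs, ys, shared_variance=False)
lda_labels = [predict(lda, x) for x in (-20, 0, 20)]
qda_labels = [predict(qda, x) for x in (-20, 0, 20)]
```

On this toy data the pooled-variance rule has a single cut point, while the per-class-variance rule assigns both far tails to the widely spread class, which is a boundary a linear rule cannot express.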

In general, as the sample size n increases, do we expect the test prediction accuracy of QDA relative to LDA to improve, decline, or be unchanged? Why? - ✅✅Since QDA does not assume a common covariance matrix for the different classes, it has more parameters to estimate and hence requires a larger dataset than LDA. As n grows, the variance cost of those extra parameters shrinks, so we expect the test prediction accuracy of QDA relative to LDA to improve.
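To make "more parameters" concrete, here is a rough count under the usual Gaussian setup, assuming p predictors and K classes. The function names are hypothetical, and the count includes K - 1 free prior probabilities, which some textbooks leave out.

```python
def lda_param_count(p, K):
    """K mean vectors, one shared symmetric p x p covariance, K - 1 priors."""
    return K * p + p * (p + 1) // 2 + (K - 1)

def qda_param_count(p, K):
    """K mean vectors, K separate symmetric p x p covariances, K - 1 priors."""
    return K * p + K * p * (p + 1) // 2 + (K - 1)

# Example: 50 predictors, 2 classes.
lda_n = lda_param_count(50, 2)  # 100 + 1275 + 1 = 1376
qda_n = qda_param_count(50, 2)  # 100 + 2550 + 1 = 2651
```

The covariance term is what grows: QDA estimates K covariance matrices where LDA estimates one, so the gap widens quickly with p.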

True or False: Even if the Bayes decision boundary for a given problem is linear, we will probably achieve a superior test error rate using QDA rather than LDA because QDA is flexible enough to model a linear decision boundary. Justify your answer. - ✅✅False. Even if the true decision boundary is linear, the boundary suggested by a given finite sample will not look exactly linear. QDA will model this "false" non-linearity and hence get a better fit on the training data but worse predictions on the test (validation) data.

Explain how k-fold cross-validation is implemented. - ✅✅k-fold cross-validation is implemented by taking the full set of n observations and randomly splitting it into k non-overlapping groups (folds) of about equal size. Each group in turn acts as the validation set, while the model is fit on the remaining k - 1 groups. The test error is estimated by averaging the k resulting MSE estimates from the validation sets.
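The procedure described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the names `k_fold_indices`, `cross_val_mse`, `fit_mean`, and `predict_mean` are hypothetical, and the "model" is just the training-set mean so that the example stays self-contained.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k non-overlapping folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_mse(xs, ys, k, fit, predict, seed=0):
    """Average the k validation-fold MSEs; each fold is held out exactly once."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_mses = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - predict(model, xs[i])) ** 2 for i in held_out]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k

# Toy model (assumption for illustration): predict the training mean of y.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

folds = k_fold_indices(10, 3)
cv_constant = cross_val_mse(list(range(10)), [3.0] * 10, 3, fit_mean, predict_mean)
cv_two_points = cross_val_mse([0, 1], [0.0, 1.0], 2, fit_mean, predict_mean)
```

Any pair of `fit` and `predict` functions with the same shape can be passed in. Setting k equal to n gives leave-one-out cross-validation as a special case.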

Which estimation method would you use in the following cases? You can only pick from OLS, Ridge Regression, and LASSO. Briefly motivate your choice.

Your aim is to test an economic theory; you have many observations (N is large) but only a few potential explanatory (predictor) variables. - ✅✅OLS. Both Ridge and Lasso have biased coefficients, which makes the coefficients hard to interpret. (Ridge and Lasso also do not come with the conventional standard errors that hypothesis tests on the coefficients rely on.)


Which estimation method would you use in the following cases? You can only pick from OLS, Ridge Regression, and LASSO. Briefly motivate your choice. You are trying to estimate a model to predict whether a company will go into bankruptcy within the next year. You have a rich dataset with several hundred potential explanatory (predictor) variables for 100 companies, but you think only a few of them actually have any predictive power. - ✅✅With many predictor variables and prediction as the aim, you should pick a regularized method, so Ridge or Lasso. (With several hundred predictors and only 100 companies there are more predictors than observations, so OLS does not even have a unique solution.) Since you think most of the explanatory variables are useless (their population coefficients are zero), Lasso is better than Ridge: Lasso sets some coefficients exactly to zero, whereas Ridge only shrinks them toward zero.
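Why Lasso produces exact zeros while Ridge does not is easiest to see in a simplified textbook special case, assumed here purely for illustration: as many observations as predictors, an identity design matrix, and no intercept, so each coefficient is estimated from a single response value y_j. Under those assumptions, minimizing (y_j - b)^2 plus the penalty lam * b^2 (Ridge) or lam * |b| (Lasso) has a closed form. The function names are hypothetical.

```python
def ridge_estimate(y_j, lam):
    """Ridge in the identity-design case: proportional shrinkage toward zero."""
    return y_j / (1 + lam)

def lasso_estimate(y_j, lam):
    """Lasso in the identity-design case: soft-thresholding at lam / 2."""
    if y_j > lam / 2:
        return y_j - lam / 2
    if y_j < -lam / 2:
        return y_j + lam / 2
    return 0.0

# A weak signal (0.3) and a strong signal (2.0), both with lam = 1.
weak_ridge, weak_lasso = ridge_estimate(0.3, 1.0), lasso_estimate(0.3, 1.0)
strong_ridge, strong_lasso = ridge_estimate(2.0, 1.0), lasso_estimate(2.0, 1.0)
```

In this special case the weak signal is set exactly to zero by Lasso but only halved by Ridge, which is the variable-selection behavior the answer above relies on. General design matrices have no such closed form for Lasso, but the qualitative behavior is the same.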
