A Note on Machine Learning Methods

Ying Nian Wu, UCLA Statistics, Based on M231B lectures

Updated March 2020

Contents

1 Ptolemy’s Epicycle and Gauss Paradigm

1.1 The model

1.2 Boosting machine

1.3 Lasso, ridge and kernel machine

1.4 Neural network

1.5 Model complexity and regularization

1.6 Ptolemy or Newton?

1.7 Euler’s linear model

1.8 Laplace’s estimating equation

1.9 Gauss paradigm

1.10 Continuing Gauss paradigm

1.11 Three modes of learning

1.12 Bayesian, Frequentist, variational

2 Basics: Linear Models

2.1 Linear regression

2.2 Ridge regression and shrinkage estimator

2.3 Linear spline

2.4 Lasso regression

2.5 Logistic regression

2.6 Classification and perceptron

2.7 Loss functions

2.8 Regularization

2.9 Gradient descent: learning from errors

2.10 Iterated reweighted least squares

2.11 Multivariate or multinomial response

2.12 Non-linear and non-parametric functions

3 Model Complexity and Overfitting

3.1 Gauss’ analysis of least squares

3.2 Bias and variance tradeoff

3.3 Stein’s estimator

3.4 Stein’s estimator as empirical Bayes

3.5 Model bias

3.6 Training and testing errors, overfitting

Preface: “no time to be brief”

1 Ptolemy’s Epicycle and Gauss Paradigm

1.1 The model

1.6 Ptolemy or Newton?

1.7 Euler’s linear model

1.8 Laplace’s estimating equation

1.9 Gauss paradigm

1.12 Bayesian, Frequentist, variational

2.2 Ridge regression and shrinkage estimator

2.3 Linear spline

2.5 Logistic regression

2.6 Classification and perceptron