Machine Learning: Classification Models & Rosenblatt's Perceptron Algorithm - Prof. Greg Grudic, Study Notes in Computer Science

These lecture notes introduce classification models in machine learning, covering generative models such as Fisher's linear discriminant analysis and Gaussian mixture models, as well as discriminative models such as Rosenblatt's perceptron learning algorithm. The notes also discuss binary classification and nonlinear extensions.


Slide 1: Introduction to Classification
Greg Grudic

Slide 2: Today's Lecture Goals
• Introduction to classification
• Generative Models
  – Fisher (Linear Discriminant Analysis)
  – Gaussian Mixture Models
• Discriminative Models
  – Rosenblatt's Perceptron Learning Algorithm
• Nonlinear Extensions

Slide 3: Last Week: Learning Regression Models
• Collect training data
• Build Model: stock value = F(feature space)
• Make a prediction
[Figure: stock value plotted as scattered points over the feature (input) space]

Slide 4: This Class: Learning Classification Models
• Collect training data
• Build Model: happy = F(feature space)
• Make a prediction
[Figure: high-dimensional feature (input) space]

Slide 5: Binary Classification
• A binary classifier is a mapping from a set of d inputs to a single output which can take on one of TWO values
• In the most general setting:

    inputs: x ∈ ℝ^d
    output: y ∈ {−1, +1}

• Specifying the output classes as −1 and +1 is arbitrary!
  – Often done as a mathematical convenience

Slide 6: A Binary Classifier

    x → Classification Model → ŷ ∈ {−1, +1}

Given learning data (x₁, y₁), …, (x_N, y_N), a model M(x) is constructed.

Slide 7: The Learning Data
• Learning algorithms don't care where the data comes from!
• Here is a toy example from robotics…
  – Inputs from two sonar sensors: x₁ ∈ ℝ (sensor 1), x₂ ∈ ℝ (sensor 2)
  – Classification output:
    • Robot in Greg's office: y = +1
    • Robot NOT in Greg's office: y = −1

Slide 8: Classification Learning Data

    Example    x₁         x₂         y
    1          0.95013    0.58279    +1
    2          0.23114    0.4235     −1
    3          0.8913     0.43291    +1
    4          0.018504   0.76037    −1
    …          …          …          …

(Slides 9–16 are not included in this preview.)

Slide 17: Rosenblatt's Minimization Function
• This is classic Machine Learning!
• First define a cost function in model parameter space, where M is the set of misclassified examples:

    D(β̂₀, β̂₁, …, β̂_d) = −Σ_{i∈M} yᵢ ( β̂₀ + Σ_{k=1}^{d} β̂_k x_{ik} )

• Then find an algorithm that modifies (β̂₀, β̂₁, …, β̂_d) such that this cost function is minimized
• One such algorithm is Gradient Descent

Slide 18: Gradient Descent
[Figure: error surface E[w] plotted over parameters w₀ and w₁]

Slide 19: The Gradient Descent Algorithm

    β̂ᵢ ← β̂ᵢ − ρ ∂D(β̂₀, β̂₁, …, β̂_d) / ∂β̂ᵢ

where the learning rate is defined by ρ > 0.

Slide 20: The Gradient Descent Algorithm for the Perceptron
• The partial derivatives of the cost function:

    ∂D(β̂₀, β̂₁, …, β̂_d) / ∂β̂₀ = −Σ_{i∈M} yᵢ
    ∂D(β̂₀, β̂₁, …, β̂_d) / ∂β̂ⱼ = −Σ_{i∈M} yᵢ x_{ij},   j = 1, …, d

• Giving the update, applied for each misclassified example i:

    (β̂₀, β̂₁, …, β̂_d)ᵀ ← (β̂₀, β̂₁, …, β̂_d)ᵀ + ρ (yᵢ, yᵢ x_{i1}, …, yᵢ x_{id})ᵀ

Slide 21: The Good Theoretical Properties of the Perceptron Algorithm
• If a solution exists, the algorithm will always converge in a finite number of steps!
• Question: Does a solution always exist?
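To make the update rule on Slides 17–21 concrete, here is a minimal Python sketch of the perceptron trained by gradient descent on the misclassification cost, run on the toy sonar data from Slide 8. The function name train_perceptron, the learning rate rho, the epoch cap, and the column order of the Slide 8 table are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def train_perceptron(X, y, rho=1.0, max_epochs=100):
    """Rosenblatt's perceptron via the per-example gradient descent update.

    X: (N, d) array of inputs; y: (N,) array of labels in {-1, +1}.
    Returns (beta0, beta). If the data is not linearly separable the loop
    hits max_epochs and the last iterate is returned (no convergence).
    """
    N, d = X.shape
    beta0, beta = 0.0, np.zeros(d)
    for _ in range(max_epochs):
        n_misclassified = 0
        for i in range(N):
            # Example i is misclassified when y_i * (beta0 + beta . x_i) <= 0
            if y[i] * (beta0 + beta @ X[i]) <= 0:
                # Slide 20 update: beta0 += rho*y_i, beta_j += rho*y_i*x_ij
                beta0 += rho * y[i]
                beta += rho * y[i] * X[i]
                n_misclassified += 1
        if n_misclassified == 0:   # every point correct: converged
            return beta0, beta
    return beta0, beta

# Toy sonar data from Slide 8 (column assignment assumed from the table):
X = np.array([[0.95013, 0.58279],
              [0.23114, 0.4235],
              [0.8913, 0.43291],
              [0.018504, 0.76037]])
y = np.array([1, -1, 1, -1])
beta0, beta = train_perceptron(X, y)
print("boundary: %.3f + %.3f*x1 + %.3f*x2 = 0" % (beta0, beta[0], beta[1]))
```

On this small dataset the two classes are separable by a vertical line near x₁ ≈ 0.5, so the algorithm converges after a handful of updates, consistent with the finite-step convergence guarantee on Slide 21.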
Slide 22: Linearly Separable Data
• Which of these datasets are separable by a linear boundary?
[Figure: two 2-D datasets, a) and b), each showing + and − points]

Slide 23: Linearly Separable Data
• Which of these datasets are separable by a linear boundary?
[Figure: the same two datasets, with the non-separable one labeled "Not Linearly Separable!"]

Slide 24: Bad Theoretical Properties of the Perceptron Algorithm
• If the data is not linearly separable, the algorithm cycles forever!
  – It cannot converge!
  – This property stopped research in this area between 1968 and 1984…
    • Perceptrons, Minsky and Papert, 1969
• There are infinitely many solutions
• When the data is linearly separable, the number of steps to converge can be very large (it depends on the size of the gap between the classes)

Slide 25: What about Nonlinear Data?
• Data that is not linearly separable is called nonlinear data
• Nonlinear data can often be mapped, via nonlinear basis functions, into a new space where it is linearly separable

Slide 26: Nonlinear Models
• The Linear Model:

    ŷ = M(x) = sgn( β̂₀ + Σ_{i=1}^{d} β̂ᵢ xᵢ )

• The Nonlinear (basis function) Model:

    ŷ = M(x) = sgn( β̂₀ + Σ_{i=1}^{k} β̂ᵢ φᵢ(x) )

• Examples of Nonlinear Basis Functions:

    φ₁(x) = x₁²,   φ₂(x) = x₂²,   φ₃(x) = x₁x₂,   φ₄(x) = sin(5x₅)

Slide 27: Linear Separating Hyper-Planes in Nonlinear Basis Function Space
• In the basis function space (φ₁, …, φ_k), the separating boundary is the hyperplane

    β̂₀ + Σ_{i=1}^{k} β̂ᵢ φᵢ = 0

• Points with β̂₀ + Σ_{i=1}^{k} β̂ᵢ φᵢ > 0 are assigned y = +1; points with β̂₀ + Σ_{i=1}^{k} β̂ᵢ φᵢ ≤ 0 are assigned y = −1

Slide 28: An Example
[Figure: data that is not linearly separable in the original (x₁, x₂) space becomes linearly separable after the mapping Φ with φ₁ = x₁² and φ₂ = x₂²]
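Here is a short sketch of the basis-function idea from Slides 25–28: points inside a circle cannot be separated from points outside it by a line in (x₁, x₂), but under the Slide 28 mapping φ₁ = x₁², φ₂ = x₂² the boundary becomes a straight line. The synthetic dataset and the reuse of train_perceptron from the earlier sketch are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data: label +1 inside the circle x1^2 + x2^2 = 0.5,
# label -1 outside it. The circular boundary is nonlinear in (x1, x2).
X = rng.uniform(-1, 1, size=(300, 2))
r2 = X[:, 0]**2 + X[:, 1]**2
keep = np.abs(r2 - 0.5) > 0.1     # leave a gap so a separating margin exists
X, r2 = X[keep], r2[keep]
y = np.where(r2 < 0.5, 1, -1)

# Basis-function mapping from Slide 28: phi1 = x1^2, phi2 = x2^2.
# In (phi1, phi2) space the class boundary is the line phi1 + phi2 = 0.5,
# so the perceptron from the earlier sketch can now find it.
Phi = X**2
beta0, beta = train_perceptron(Phi, y, max_epochs=1000)

y_hat = np.sign(beta0 + Phi @ beta)
print("training accuracy:", np.mean(y_hat == y))
```

Note the connection to Slide 24: the deliberate gap around the circle gives the mapped classes a positive margin, which is what keeps the number of perceptron updates finite.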