An overview of support vector machines (SVMs) in machine learning, focusing on their classification and regression methods. SVMs aim to find optimal linear or non-linear hyperplanes in feature space for separating data, with the goal of maximizing the margin between classes. The document also covers the concept of support vectors and the use of Mercer kernels for non-linear data.
Greg Grudic — Machine Learning
(Notes borrowed from Bernhard Schölkopf)
A good text on SVMs: Bernhard Schölkopf and Alex Smola, Learning with Kernels, MIT Press, Cambridge, MA, 2002.
Linearly separable data can be split by many different lines — so which line should we use? This question motivates the optimal (canonical) linear hyperplane.
A separating hyperplane in feature space $(x_1, x_2)$:
- $\mathbf{w}^T \mathbf{x} + b > 0$ on the side labeled $y = +1$
- $\mathbf{w}^T \mathbf{x} + b < 0$ on the side labeled $y = -1$
- $\mathbf{w}^T \mathbf{x} + b = 0$ on the hyperplane itself
This is a constraint satisfaction problem: $\forall i \in \{1, \dots, N\}$, find $\mathbf{w}, b$ such that
$\mathbf{w}^T \mathbf{x}_i + b \ge +1 \quad \text{if } y_i = +1$
$\mathbf{w}^T \mathbf{x}_i + b \le -1 \quad \text{if } y_i = -1$
Given training data $(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)$, the two constraints combine into
$y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \ge 0, \quad \forall i$
Notation: the inner product is $\langle \mathbf{w}, \mathbf{x} \rangle = \mathbf{w}^T \mathbf{x} = \sum_{i=1}^{d} w_i x_i$, where $\mathbf{x} = (x_1, \dots, x_d)$.
Calculating the margin:
$\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}$
Among all separating hyperplanes, find the one with the maximum margin!
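The margin formula above is a one-liner in code; a minimal sketch (the function name is my own):

```python
import numpy as np

# Margin of a canonical separating hyperplane w^T x + b = 0.
# For a canonical hyperplane (|w^T x + b| = 1 on the closest points),
# the margin between the two classes is 2 / ||w||.
def margin(w):
    """Margin 2/||w|| of the canonical hyperplane defined by w."""
    return 2.0 / np.linalg.norm(w)

# Example: w = (3, 4) has norm 5, so the margin is 2/5 = 0.4.
print(margin(np.array([3.0, 4.0])))  # → 0.4
```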
The maximum-margin problem: given training data $(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)$, maximize the margin $\frac{2}{\lVert \mathbf{w} \rVert}$ — equivalently, minimize $\frac{1}{2}\lVert \mathbf{w} \rVert^2$ — subject to
$y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \ge 0, \quad \forall i = 1, \dots, N$
Setting the derivatives of the Lagrangian $L(\mathbf{w}, b, \boldsymbol{\alpha})$ to zero,
$\frac{\partial L}{\partial \mathbf{w}} = 0, \qquad \frac{\partial L}{\partial b} = 0$
gives
$\mathbf{w} = \sum_{i=1}^{N} \alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0$
Substituting back yields the dual problem: maximize
$W(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle \mathbf{x}_i, \mathbf{x}_j \rangle$
subject to
$\alpha_i \ge 0, \quad i = 1, \dots, N, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0$
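The dual objective is easy to evaluate numerically; a minimal sketch (toy data and function names are my own, not from the slides):

```python
import numpy as np

# Sketch: evaluate the dual objective W(alpha) for dual variables alpha,
# labels y, and training points X (one row per point).
def dual_objective(alpha, y, X):
    """W(alpha) = sum_i alpha_i - 0.5 * sum_ij alpha_i alpha_j y_i y_j <x_i, x_j>."""
    v = alpha * y        # element-wise products alpha_i * y_i
    G = X @ X.T          # matrix of dot products <x_i, x_j>
    return alpha.sum() - 0.5 * v @ G @ v

# Toy 1-D data: points -1 and +1 with opposite labels; alpha = (0.5, 0.5)
# satisfies the constraints alpha_i >= 0 and sum_i alpha_i y_i = 0.
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])
print(dual_objective(alpha, y, X))  # → 0.5
```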
The solution is $\mathbf{w} = \sum_{i=1}^{N} \alpha_i y_i \mathbf{x}_i$. For each training point, either
$y_i(\langle \mathbf{w}, \mathbf{x}_i \rangle + b) > 1$ and $\alpha_i = 0$: $\mathbf{x}_i$ is irrelevant,
OR
$y_i(\langle \mathbf{w}, \mathbf{x}_i \rangle + b) = 1$ (on the margin): $\mathbf{x}_i$ is a support vector.
The decision function is
$f(\mathbf{x}) = \operatorname{sgn}\left(\langle \mathbf{w}, \mathbf{x} \rangle + b\right)$
OR, substituting for $\mathbf{w}$,
$f(\mathbf{x}) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i \langle \mathbf{x}_i, \mathbf{x} \rangle + b\right)$
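The substituted form of the decision function can be sketched directly in code (toy data and the function name are my own illustration):

```python
import numpy as np

# Sketch: evaluate the SVM decision function
#   f(x) = sgn( sum_i alpha_i * y_i * <x_i, x> + b )
# given dual coefficients alpha, labels y, training points X, and bias b.
def svm_decision(x, X, y, alpha, b):
    """Sign of the linear SVM decision function at point x."""
    return np.sign((alpha * y) @ (X @ x) + b)

# Toy 1-D example: support vectors at -1 and +1 with alpha = 0.5 each,
# which gives w = 0.5*(-1)*(-1) + 0.5*(+1)*(+1) = 1 and b = 0.
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])
print(svm_decision(np.array([2.0]), X, y, alpha, b=0.0))   # → 1.0
print(svm_decision(np.array([-0.3]), X, y, alpha, b=0.0))  # → -1.0
```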
Training on the support vectors alone gives the same maximized-margin solution!
Let #SV(N) be the number of SVs obtained by training on N examples randomly drawn from P(X, Y), and let E be an expectation. Then
$E[\text{Prob(test error)}] \le \frac{E[\#\mathrm{SV}(N)]}{N}$
Non-separable data: add a slack variable $\xi_i$ for each point $\mathbf{x}_i$:
$\xi_i = 0$ if $\mathbf{x}_i$ is correctly classified; otherwise $\xi_i$ is the distance to the margin.
Given $(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)$, the constraints become
$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - \xi_i, \quad \forall i = 1, \dots, N$
and the objective becomes: minimize
$\frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{N} \xi_i$
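One common formalization of the slack above is the hinge form $\xi_i = \max(0,\, 1 - y_i(\mathbf{w}^T\mathbf{x}_i + b))$; a minimal sketch under that assumption (toy data is my own):

```python
import numpy as np

# Sketch (assumed hinge form): xi_i = max(0, 1 - y_i * (w^T x_i + b)).
# xi_i = 0 when the point is correctly classified outside the margin;
# otherwise it measures how far the point falls inside (or past) the margin.
def slacks(X, y, w, b):
    """Hinge slack for each row of X under hyperplane (w, b)."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))

# Toy example with w = [1], b = 0: the middle point sits inside the margin.
X = np.array([[2.0], [0.5], [-1.5]])
y = np.array([1.0, 1.0, -1.0])
print(slacks(X, y, np.array([1.0]), 0.0))  # → [0.  0.5 0. ]
```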
The dual is unchanged except for a box constraint on $\boldsymbol{\alpha}$: maximize
$W(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle \mathbf{x}_i, \mathbf{x}_j \rangle$
subject to
$0 \le \alpha_i \le C, \quad i = 1, \dots, N, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0$
with the same decision function
$f(\mathbf{x}) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i \langle \mathbf{x}_i, \mathbf{x} \rangle + b\right)$
Key observation: the decision function and the dual optimization formulation rely on dot products in input space only!
$f(\mathbf{x}) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i \langle \mathbf{x}_i, \mathbf{x} \rangle + b\right)$
$W(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle \mathbf{x}_i, \mathbf{x}_j \rangle$
The data enter only through $\langle \mathbf{x}_i, \mathbf{x} \rangle$ and $\langle \mathbf{x}_i, \mathbf{x}_j \rangle$.
The kernel trick: replace $\langle \mathbf{x}_i, \mathbf{x}_j \rangle$ with $K(\mathbf{x}_i, \mathbf{x}_j)$. We can use the same algorithms in a nonlinear kernel space!
Boundary:
$f(\mathbf{x}) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b\right)$
Maximize:
$W(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$
A Mercer kernel is an inner product in some feature space $\Phi$, and is therefore symmetric:
$K(\mathbf{x}_i, \mathbf{x}_j) = \langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_j) \rangle = \langle \Phi(\mathbf{x}_j), \Phi(\mathbf{x}_i) \rangle = K(\mathbf{x}_j, \mathbf{x}_i)$
Training data: $(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)$. The kernel (Gram) matrix is
$\mathbf{K} = \begin{pmatrix} K(\mathbf{x}_1, \mathbf{x}_1) & \cdots & K(\mathbf{x}_1, \mathbf{x}_N) \\ \vdots & \ddots & \vdots \\ K(\mathbf{x}_N, \mathbf{x}_1) & \cdots & K(\mathbf{x}_N, \mathbf{x}_N) \end{pmatrix}$
Properties:
• Positive (semi-)definite matrix
• Symmetric
• Positive on the diagonal
• N by N
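These properties are easy to check numerically; a minimal sketch using a Gaussian RBF kernel (my choice of kernel and toy data for illustration):

```python
import numpy as np

# Sketch: build the N-by-N Gram matrix K[i, j] = K(x_i, x_j) with a
# Gaussian RBF kernel, then verify the properties listed above.
def gram_matrix(X, sigma=1.0):
    """RBF Gram matrix: exp(-||x_i - x_j||^2 / (2 sigma^2)) for all pairs."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = gram_matrix(X)
print(np.allclose(K, K.T))                      # symmetric → True
print(bool(np.all(np.diag(K) > 0)))             # positive on the diagonal → True
print(bool(np.all(np.linalg.eigvalsh(K) > -1e-12)))  # positive (semi-)definite → True
```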
Some common Mercer kernels:
Polynomial: $K(\mathbf{x}_i, \mathbf{x}_j) = (\langle \mathbf{x}_i, \mathbf{x}_j \rangle + c)^d$
Sigmoid: $K(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\kappa \langle \mathbf{x}_i, \mathbf{x}_j \rangle + \theta)$
Gaussian RBF: $K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2}{2\sigma^2}\right)$
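The three kernels above, written as plain functions (parameter defaults are my own illustrative choices):

```python
import numpy as np

def poly_kernel(xi, xj, c=1.0, d=2):
    """Polynomial kernel (<xi, xj> + c)^d."""
    return (xi @ xj + c) ** d

def sigmoid_kernel(xi, xj, kappa=1.0, theta=0.0):
    """Sigmoid kernel tanh(kappa * <xi, xj> + theta)."""
    return np.tanh(kappa * (xi @ xj) + theta)

def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian RBF kernel exp(-||xi - xj||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

# Orthogonal unit vectors: dot product 0, squared distance 2.
xi, xj = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(poly_kernel(xi, xj))   # (0 + 1)^2 → 1.0
print(rbf_kernel(xi, xj))    # exp(-2/2) = exp(-1) ≈ 0.3679
```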
Handwritten character benchmark: 10,000 test images of 28 × 28 pixels. The SVM used a polynomial kernel of degree 9.
(Figure: SVM regression example in feature space, predicting stock value.)
Results on test data, measured by Mean Squared Error (MSE): given test data $(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_K, y_K)$,
$\mathrm{MSE} = \frac{1}{K} \sum_{i=1}^{K} \left(y_i - f(\mathbf{x}_i)\right)^2$
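The MSE formula in code, as a minimal sketch (toy numbers are my own):

```python
import numpy as np

# Sketch: mean squared error of a regression model's predictions on test data.
def mse(y_true, y_pred):
    """MSE = (1/K) * sum_i (y_i - f(x_i))^2."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Errors are 0, -0.5, 1, so MSE = (0 + 0.25 + 1) / 3.
print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))  # ≈ 0.4167
```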