Support Vector Machines (SVMs) for Classification: Wide Margin and Misclassification, Slides of Semantics of Programming Languages

Alliance University Semantics of Programming Languages

An overview of support vector machines (SVMs) for classification, focusing on maximizing the margin while minimizing misclassifications. It covers the optimization problem, Lagrangian relaxation, and the kernel trick. SVMs are a powerful scheme for classifying data with wide margins and few misclassifications, and many standard kernels are available (linear, polynomial, RBF, string).

Typology: Slides

2012/2013

Uploaded on 04/24/2013

baishali 🇮🇳


Sp’10

Classification

(SVMs / Kernel method)


LP versus Quadratic programming

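Linear program (LP):

$$\min_x \; c^T x \quad \text{s.t.} \quad Ax \le b, \; x \ge 0$$

Quadratic program (QP):

$$\min_x \; x^T Q x + c^T x \quad \text{s.t.} \quad Ax \le b, \; x \ge 0$$

LP: linear constraints, linear objective function.
- An LP can be solved in polynomial time.
QP: linear constraints; the objective function contains a quadratic form.
- For positive semidefinite Q, the QP can be solved in polynomial time.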

Separating by a wider margin

Solutions with a wider margin are better.



Maximize $\frac{2}{\|\beta\|_2}$, or equivalently minimize $\frac{\|\beta\|_2^2}{2}$.

Separating via misclassification

In general, the data is not linearly separable.
What if we also want to minimize the number of misclassified points?
Recall that each sample $x_i$ in our training set has a label $y_i \in \{-1, 1\}$.
For each point $i$, $y_i(\beta^T x_i - \beta_0)$ should be positive.
Define $\xi_i \ge \max\{0, \; 1 - y_i(\beta^T x_i - \beta_0)\}$.
If $i$ is correctly classified with margin, i.e. $y_i(\beta^T x_i - \beta_0) \ge 1$, we can take $\xi_i = 0$.
If $i$ is incorrectly classified, or close to the boundary, then $\xi_i > 0$.
We must minimize $\sum_i \xi_i$.
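As a rough illustration of the slack definition above, the following Python sketch computes the smallest feasible slack $\max\{0, 1 - y_i(\beta^T x_i - \beta_0)\}$ for a few points. The hyperplane (`beta`, `beta0`) and the toy samples are made-up values chosen for illustration; they do not come from the slides.

```python
def slack(beta, beta0, x, y):
    """Smallest feasible slack for one sample: max(0, 1 - y * (beta.x - beta0))."""
    score = sum(b * xi for b, xi in zip(beta, x)) - beta0
    return max(0.0, 1.0 - y * score)

# Hypothetical hyperplane and toy data (assumed values, not from the slides).
beta, beta0 = [1.0, 0.0], 0.0
samples = [
    ([2.0, 1.0], +1),   # correctly classified, outside the margin
    ([0.5, 0.0], +1),   # correctly classified, but inside the margin
    ([1.0, 3.0], -1),   # misclassified
]

slacks = [slack(beta, beta0, x, y) for x, y in samples]
total_slack = sum(slacks)   # the quantity we want to keep small
print(slacks, total_slack)
```

Only the first point gets zero slack; the point inside the margin and the misclassified point both contribute to the sum.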

Reformulating the optimization

$$\min_{\beta, \beta_0, \xi} \; \frac{\|\beta\|^2}{2} + C \sum_i \xi_i$$

subject to

$$\xi_i \ge 0, \qquad \xi_i \ge 1 - y_i(\beta^T x_i - \beta_0)$$
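To make the trade-off in this objective concrete, here is a small Python sketch that evaluates $\frac{\|\beta\|^2}{2} + C \sum_i \xi_i$ for three candidate hyperplanes on a one-dimensional toy data set. The data, the candidates, and `C = 1.0` are assumptions chosen for illustration; each slack is set to its smallest feasible value.

```python
import math

def primal_objective(beta, beta0, X, y, C):
    """0.5 * ||beta||^2 + C * sum_i xi_i, with each xi_i at its smallest feasible value."""
    norm_sq = sum(b * b for b in beta)
    slack_sum = 0.0
    for x, label in zip(X, y):
        score = sum(b * xi for b, xi in zip(beta, x)) - beta0
        slack_sum += max(0.0, 1.0 - label * score)
    return 0.5 * norm_sq + C * slack_sum

def margin_width(beta):
    """Width of the margin, 2 / ||beta||."""
    return 2.0 / math.sqrt(sum(b * b for b in beta))

# Toy 1-D data (assumed values).
X = [[2.0], [1.0], [-1.0], [-2.0]]
y = [+1, +1, -1, -1]
C = 1.0

wide = primal_objective([1.0], 0.0, X, y, C)      # margin 2, no slack needed
narrow = primal_objective([2.0], 0.0, X, y, C)    # margin 1, no slack needed
too_wide = primal_objective([0.5], 0.0, X, y, C)  # margin 4, two points need slack
print(wide, narrow, too_wide)
```

Among these three candidates, the widest margin that needs no slack gives the smallest objective; widening further is penalized through the slack term.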

Lagrangian relaxation

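Goal:

$$\min_{\beta, \beta_0, \xi} \; \frac{\|\beta\|^2}{2} + C \sum_i \xi_i$$

S.t.

$$\xi_i \ge 0, \qquad \xi_i \ge 1 - y_i(\beta^T x_i - \beta_0)$$

We minimize, with multipliers $\alpha_i \ge 0$ and $\mu_i \ge 0$,

$$L = \frac{\|\beta\|^2}{2} + C \sum_i \xi_i - \sum_i \alpha_i \left( \xi_i - 1 + y_i(\beta^T x_i - \beta_0) \right) - \sum_i \mu_i \xi_i$$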

Substituting

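Setting $\partial L / \partial \beta = 0$ gives

$$\beta = \sum_i \alpha_i y_i x_i \qquad (1)$$

Substituting (1) into

$$L = \frac{\|\beta\|^2}{2} + C \sum_i \xi_i - \sum_i \alpha_i \left( \xi_i - 1 + y_i(\beta^T x_i - \beta_0) \right) - \sum_i \mu_i \xi_i$$

and using $\beta^T \beta = \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j$, we get

$$L = \frac{\beta^T \beta}{2} - \beta^T \beta + \sum_i (C - \alpha_i - \mu_i) \xi_i + \sum_i \alpha_i y_i \beta_0 + \sum_i \alpha_i$$

$$L = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j + \sum_i (C - \alpha_i - \mu_i) \xi_i + \sum_i \alpha_i y_i \beta_0 + \sum_i \alpha_i$$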

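Setting $\partial L / \partial \beta_0 = 0$ and $\partial L / \partial \xi_i = 0$ gives

$$\sum_i \alpha_i y_i = 0 \quad (2), \qquad C - \alpha_i - \mu_i = 0 \quad (3)$$

Substituting (2,3) into

$$L = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j + \sum_i (C - \alpha_i - \mu_i) \xi_i + \sum_i \alpha_i y_i \beta_0 + \sum_i \alpha_i$$

leaves $L = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j + \sum_i \alpha_i$. Maximizing this over $\alpha$ is the same as minimizing its negative, and (3) with $\mu_i \ge 0$ gives $\alpha_i \le C$, so we have the minimization problem

$$\min_{\alpha} \; \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j - \sum_i \alpha_i$$

$$\text{s.t.} \quad \sum_i y_i \alpha_i = 0, \qquad 0 \le \alpha_i \le C$$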
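The dual problem can be checked by hand on a tiny case. The sketch below is not a QP solver: it takes a two-point, one-dimensional toy problem (assumed data) whose optimal multipliers are known to be $\alpha = (0.5, 0.5)$, evaluates the dual objective, checks the constraints, and recovers $\beta = \sum_i \alpha_i y_i x_i$. The value `C = 10.0` is an arbitrary choice large enough not to bind, and $\beta_0$ is read off from a point on the margin, where $y_i(\beta^T x_i - \beta_0) = 1$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_objective(alpha, X, y):
    """0.5 * sum_ij a_i a_j y_i y_j x_i.x_j - sum_i a_i (the quantity being minimized)."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)

def recover_beta(alpha, X, y):
    """beta = sum_i alpha_i y_i x_i."""
    dim = len(X[0])
    return [sum(alpha[i] * y[i] * X[i][d] for i in range(len(X))) for d in range(dim)]

def is_feasible(alpha, y, C, tol=1e-12):
    """Check sum_i y_i alpha_i = 0 and 0 <= alpha_i <= C."""
    balanced = abs(sum(a * yi for a, yi in zip(alpha, y))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alpha)
    return balanced and boxed

# Toy problem (assumed): one positive point at +1, one negative point at -1.
X = [[1.0], [-1.0]]
y = [+1, -1]
C = 10.0
alpha = [0.5, 0.5]                    # known optimum for this toy case

beta = recover_beta(alpha, X, y)
beta0 = dot(beta, X[0]) - y[0]        # from y_0 * (beta.x_0 - beta0) = 1, using y_0 = +1
dual_value = dual_objective(alpha, X, y)
primal_value = 0.5 * dot(beta, beta)  # no slack is needed here
print(beta, beta0, dual_value, primal_value)
```

On this toy case the primal value equals the negative of the dual minimum, as expected when the two problems agree at the optimum.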

The kernel method

The SVM formulation can be solved using QP on dot-products.
As these are wide-margin classifiers, they provide a more robust solution.
However, the true power of SVMs comes from using ‘the kernel method’, which allows us to go to higher-dimensional (and nonlinear) spaces.

kernel

Let X be the set of objects.
- Ex: X = the set of samples in microarrays.
- Each object $x \in X$ is a vector of gene expression values.
$k: X \times X \to \mathbb{R}$ is a positive semidefinite kernel if
- k is symmetric:

$$k(x, x') = k(x', x)$$

- k is positive semidefinite: for any $p$ objects $x_1, \dots, x_p \in X$, the matrix $K$ with $K_{ij} = k(x_i, x_j)$ satisfies

$$c^T K c \ge 0 \quad \forall c \in \mathbb{R}^p$$

Linear kernel is +ve semidefinite

Recall X as a matrix, such that each column is a sample
- $X = [x_1 \; x_2 \; \dots]$
By definition, the linear kernel is $k_L = X^T X$.
For any c,

$$c^T k_L c = c^T X^T X c = \|Xc\|^2 \ge 0$$
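The identity $c^T k_L c = \|Xc\|^2$ can be checked numerically. The following Python sketch builds the linear kernel matrix for a handful of random samples and compares both sides for random vectors `c`. The sample count, dimension, and random seed are arbitrary assumptions.

```python
import random

def linear_kernel_matrix(samples):
    """Gram matrix K[i][j] = x_i . x_j, i.e. K = X^T X with the samples as columns of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in samples] for xi in samples]

def quadratic_form(K, c):
    """c^T K c."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))

def norm_sq_Xc(samples, c):
    """||X c||^2, where X c = sum_i c_i x_i."""
    dim = len(samples[0])
    Xc = [sum(c[i] * samples[i][d] for i in range(len(samples))) for d in range(dim)]
    return sum(v * v for v in Xc)

random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
K = linear_kernel_matrix(samples)

checks = []
for _ in range(100):
    c = [random.uniform(-1, 1) for _ in range(5)]
    checks.append((quadratic_form(K, c), norm_sq_Xc(samples, c)))

print(all(abs(q - n) < 1e-9 for q, n in checks))
```

Both sides agree up to rounding, and the right-hand side is a sum of squares, so it can never be negative.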

Generalizing kernels

Any object can be represented by a feature vector in real space.

$$\phi : X \to \mathbb{R}^p$$

$$k(x, x') = \phi(x)^T \phi(x')$$

The kernel trick

If an algorithm for vectorial data is expressed exclusively in terms of dot-products, it can be changed to an algorithm on an arbitrary kernel.
- Simply replace the dot-product by the kernel.
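One way to picture the kernel trick is an SVM-style decision function written only in terms of dot-products, $f(x) = \sum_i \alpha_i y_i \, x_i^T x - \beta_0$, with the dot-product passed in as a function. The Python sketch below assumes toy multipliers and data; the polynomial and RBF kernels use common textbook forms, with `degree` and `gamma` as illustrative parameters. The multipliers are not re-optimized for each kernel, so this only shows the mechanical substitution.

```python
import math

def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(u, v, degree=2):
    """Polynomial kernel (u.v + 1)^degree."""
    return (linear_kernel(u, v) + 1.0) ** degree

def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, alpha, X, y, beta0, kernel=linear_kernel):
    """f(x) = sum_i alpha_i y_i k(x_i, x) - beta0; only the kernel touches the data."""
    return sum(a * yi * kernel(xi, x) for a, xi, yi in zip(alpha, X, y)) - beta0

# Toy values (assumed): two training points with equal multipliers.
X = [[1.0], [-1.0]]
y = [+1, -1]
alpha = [0.5, 0.5]
beta0 = 0.0

f_linear = decision([2.0], alpha, X, y, beta0)                   # plain dot-product
f_poly = decision([2.0], alpha, X, y, beta0, kernel=poly_kernel)
f_rbf = decision([2.0], alpha, X, y, beta0, kernel=rbf_kernel)
print(f_linear, f_poly, f_rbf)
```

The body of `decision` never changes; swapping the `kernel` argument is the whole substitution.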

Kernel trick example

Consider a kernel $k$ defined on a mapping $\phi$
- $k(x, x') = \phi(x)^T \phi(x')$
It could be that $\phi$ is very difficult to compute explicitly, but $k$ is easy to compute.
Suppose we define a distance function between two objects as

$$d(x, x') = \|\phi(x) - \phi(x')\|$$

How do we compute this distance?

$$d(x, x') = \|\phi(x) - \phi(x')\|_2 = \sqrt{\phi(x)^T \phi(x) + \phi(x')^T \phi(x') - 2 \phi(x)^T \phi(x')} = \sqrt{k(x, x) + k(x', x') - 2 k(x, x')}$$
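The distance formula can be verified on a kernel whose feature map is small enough to write down. The Python sketch below assumes the homogeneous degree-2 polynomial kernel $k(x, x') = (x^T x')^2$ on 2-D inputs, whose explicit map is $\phi(x) = (x_1^2, x_2^2, \sqrt{2} x_1 x_2)$. The kernel choice and the two input points are illustrative assumptions, not taken from the slides.

```python
import math

def k_poly2(u, v):
    """Homogeneous degree-2 polynomial kernel: (u . v)^2."""
    return sum(a * b for a, b in zip(u, v)) ** 2

def phi_poly2(u):
    """Explicit feature map for k_poly2 on 2-D inputs."""
    return [u[0] ** 2, u[1] ** 2, math.sqrt(2.0) * u[0] * u[1]]

def kernel_distance(k, u, v):
    """sqrt(k(u,u) + k(v,v) - 2 k(u,v)); uses kernel evaluations only."""
    # max(0, ...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, k(u, u) + k(v, v) - 2.0 * k(u, v)))

def explicit_distance(phi, u, v):
    """||phi(u) - phi(v)|| computed in feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(phi(u), phi(v))))

x, x_prime = [1.0, 2.0], [3.0, -1.0]
d_kernel = kernel_distance(k_poly2, x, x_prime)
d_explicit = explicit_distance(phi_poly2, x, x_prime)
print(d_kernel, d_explicit)
```

Both routes give the same distance, but `kernel_distance` never constructs $\phi$, which is what matters when the feature space is very large or only implicitly defined.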
