SVM Optimization-Machine Learning-Lecture Handout, Exercises of Machine Learning

Pakistan Institute of Engineering and Applied Sciences, Islamabad (PIEAS), Machine Learning

These lecture notes were distributed for the Machine Learning course at the Pakistan Institute of Engineering and Applied Sciences, Islamabad (PIEAS). Main points: Algorithms, SVM, Optimization, KKT Conditions, Shrinking, Chunking, Iterative Solvers, Decomposition, Criterion

Typology: Exercises

2011/2012

Uploaded on 07/19/2012

zaraa 🇵🇰


Algorithms for solving the SVM Optimization

1 Introduction

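In previous lectures we derived the following form for the SVM optimization problem (L1 soft margin):

$$\max_{\alpha} \; \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_i \alpha_i y_i = 0$$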

Basic points:

• Using the KKT conditions, we can check whether a proposed solution is optimal.

• The objective function is quadratic with a global maximum, so an uphill search that maintains the constraints will find the maximum. We do not necessarily need to perform gradient ascent: any step that is guaranteed to go up is useful. The difference is in how quickly we converge.

• There are general solvers for quadratic programming (QP) problems, which include the SVM problem, but the SVM problem can be simpler than the general problem. Some of the solution methods discussed in the lecture do use such "off the shelf" solvers. However, these are only run on small subproblems.

• The main advantage of developing algorithms specific to SVM is that SVM solutions are sparse, that is, they do not have many support vectors (examples where $\alpha_i \neq 0$). The fast solvers exploit this.
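The first basic point can be made concrete with a small sketch. It evaluates the dual objective and lists the examples that violate the standard soft-margin KKT conditions ($\alpha_i = 0 \Rightarrow y_i f(x_i) \ge 1$, $0 < \alpha_i < C \Rightarrow y_i f(x_i) = 1$, $\alpha_i = C \Rightarrow y_i f(x_i) \le 1$). The function names, the bias argument `b`, and the tolerance `tol` are assumptions made for illustration; they do not come from the handout.

```python
def dual_objective(alpha, y, K):
    """Value of sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K_ij."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def kkt_violators(alpha, y, K, b, C, tol=1e-6):
    """Indices whose KKT condition fails by more than tol.

    b is the bias of the decision function f(x_i) = sum_j alpha_j y_j K_ij + b
    (an assumed input; how b is obtained is outside this sketch).
    """
    n = len(alpha)
    bad = []
    for i in range(n):
        f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n)) + b
        margin = y[i] * f_i
        if alpha[i] <= tol:            # alpha_i = 0: need margin >= 1
            ok = margin >= 1 - tol
        elif alpha[i] >= C - tol:      # alpha_i = C: need margin <= 1
            ok = margin <= 1 + tol
        else:                          # 0 < alpha_i < C: need margin = 1
            ok = abs(margin - 1) <= tol
        if not ok:
            bad.append(i)
    return bad


# Toy problem: points x = -1 (label -1) and x = +1 (label +1), linear kernel.
K = [[1.0, -1.0], [-1.0, 1.0]]
y = [-1, 1]
print(kkt_violators([0.5, 0.5], y, K, b=0.0, C=10.0))  # optimal point
print(kkt_violators([0.0, 0.0], y, K, b=0.0, C=10.0))  # starting point
```

On this toy problem the point $\alpha = (0.5, 0.5)$ satisfies every condition, while the all-zero starting point violates both.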

2 Techniques to speed up solver

• Shrinking: Guess examples $i$ s.t. $\alpha_i = 0$ and examples s.t. $\alpha_i = C$. Fix these $\alpha$'s and solve the subproblem on the remaining examples. Check using the KKT conditions; if optimal then we're done. If not, guess again.

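• Chunking: pick a subset $I$ to optimize over. Fix $\alpha_i$ for $i \notin I$ and solve the problem for the remaining indices.

$$
\begin{aligned}
L &= \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K_{i,j} \\
  &= \sum_{i \in I} \alpha_i + \text{Const} - \frac{1}{2} \sum_{i \in I} \sum_{j \in I} \alpha_i \alpha_j y_i y_j K_{i,j} + \text{Const} - \sum_{i \in I} \alpha_i y_i \sum_{j \notin I} y_j \alpha_j K_{i,j}
\end{aligned}
$$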


Let $W_i = y_i \sum_{j \notin I} y_j \alpha_j K_{i,j}$. Then $L = \text{Const} + L_I$ where

$$L_I = \sum_{i \in I} \alpha_i (1 - W_i) - \frac{1}{2} \sum_{i,j \in I} \alpha_i \alpha_j y_i y_j K_{i,j}$$

The reduced optimization problem is to maximize $L_I$ subject to $0 \le \alpha_i \le C$ for $i \in I$ and $\sum_{i \in I} \alpha_i y_i = -\sum_{i \notin I} \alpha_i y_i$. Therefore, like shrinking, chunking gives a subproblem of the same type. We can use this to develop an iterative algorithm as follows:

Iterative Chunking:

    Init some $I$ and $\alpha$
    Repeat:
        solve $L_I$ and check if optimal
        if not, add to $I$ any example violating the KKT conditions

Issue 1: We want to make sure $L_I$ goes up with every iteration.

Issue 2: We hope to add to $I$ only elements that are support vectors in the final solution. Notice that even if this happens, the last problem we solve includes all the support vectors, so it may still be large.
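The claim that the full objective and the reduced objective differ only by a constant can be checked numerically. The sketch below computes $W_i = y_i \sum_{j \notin I} y_j \alpha_j K_{i,j}$ and $L_I$ (with the factor $\frac{1}{2}$ on the quadratic term), then compares $L - L_I$ for two $\alpha$ vectors that agree outside $I$. The function names and the toy numbers are illustrative assumptions.

```python
def dual_objective(alpha, y, K):
    """Full dual objective L."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def reduced_objective(alpha, y, K, I):
    """Reduced objective L_I over the working set I (other alphas held fixed)."""
    inside = set(I)
    outside = [j for j in range(len(alpha)) if j not in inside]
    # W_i collects the interaction of example i with the fixed examples.
    W = {i: y[i] * sum(y[j] * alpha[j] * K[i][j] for j in outside) for i in I}
    linear = sum(alpha[i] * (1 - W[i]) for i in I)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in I for j in I)
    return linear - 0.5 * quad


# Toy symmetric kernel matrix and labels (made-up values).
K = [[2.0, 0.5, -0.3],
     [0.5, 1.0, 0.2],
     [-0.3, 0.2, 1.5]]
y = [1, -1, 1]
I = [0, 1]

# Two alpha vectors that differ only on I; index 2 is fixed at 0.3.
a = [0.2, 0.5, 0.3]
b = [0.7, 0.1, 0.3]

const_a = dual_objective(a, y, K) - reduced_objective(a, y, K, I)
const_b = dual_objective(b, y, K) - reduced_objective(b, y, K, I)
print(abs(const_a - const_b) < 1e-12)
```

Because the difference is the same for both vectors, any increase in $L_I$ from solving the subproblem is an equal increase in $L$.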

3 More Efficient Solvers

Decomposition methods avoid this by keeping $|I|$ small. One such method is to fix the size of $I$ over all iterations. The main question is how to pick $I$ s.t. $L_I$ always increases, because if we can do that then we will reach convergence.

Fact: It is sufficient to include one KKT violator in $I$ to guarantee that $L$ goes up.

Why? When we solve for $L_I$, we know that the current setting of the $\alpha_i$ for $i \in I$ is not optimal, so $L_I$ goes up. Therefore $L$ goes up as well (the difference between them is constant since the other parameters are not changed).

So the main question is which $I$ to choose at each step. Notice that a small $I$ will probably give a small improvement in $L$, therefore requiring many iterations, but the QP will be solved easily so each iteration will be fast. On the other hand, a large $I$ may need fewer iterations but each iteration will be slower. SVMlight picks $|I| = q$ small but $> 2$. SMO takes the tradeoff to the extreme using $|I| = 2$. The advantage of SMO is that the QP can be solved analytically: there is no need for a QP engine for the subproblem, and we get a very fast run time per iteration.

libSVM implements SMO; it differs from the original formulation of SMO in its stopping criterion and in its choice of $I$. It also avoids the backtracking search for an $I$ that makes progress, which the original formulation could require. In the following we provide some of the details of libSVM. We first provide the following three ingredients:

• a new criterion to check optimality
• from this, a method to choose $I$ of size 2
• an analytic solution of the size 2 problem
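To give a feel for why a working set of size 2 can be solved without a QP engine, here is a minimal sketch of the standard textbook two-variable update. It is not necessarily the exact rule used by libSVM, whose details are not reproduced here. It moves $\alpha_j$ by $t$ and $\alpha_i$ by $-y_i y_j t$, which keeps $\sum_k \alpha_k y_k$ unchanged, picks the unconstrained maximizer $t = (g_j - y_i y_j g_i)/\eta$ with $\eta = K_{i,i} + K_{j,j} - 2K_{i,j}$ and $g_k = 1 - y_k \sum_l \alpha_l y_l K_{k,l}$, and then clips to the box $[0, C]$. The function names and the handling of $\eta \le 0$ are assumptions.

```python
def gradient(alpha, y, K):
    """g_k = 1 - y_k * sum_l alpha_l y_l K_kl for every k."""
    n = len(alpha)
    return [1.0 - y[k] * sum(alpha[l] * y[l] * K[k][l] for l in range(n))
            for k in range(n)]


def two_variable_step(alpha, y, K, C, i, j):
    """Maximize the dual over (alpha_i, alpha_j) only; returns a new list."""
    s = y[i] * y[j]
    g = gradient(alpha, y, K)
    eta = K[i][i] + K[j][j] - 2.0 * K[i][j]   # curvature along the step
    if eta <= 0:
        return list(alpha)  # degenerate direction: skipped in this sketch

    # Feasible range for alpha_j so that both variables stay in [0, C].
    if s < 0:
        lo = max(0.0, alpha[j] - alpha[i])
        hi = min(C, C + alpha[j] - alpha[i])
    else:
        lo = max(0.0, alpha[i] + alpha[j] - C)
        hi = min(C, alpha[i] + alpha[j])

    a_j = alpha[j] + (g[j] - s * g[i]) / eta  # unconstrained maximizer
    a_j = min(hi, max(lo, a_j))               # clip to the box
    new = list(alpha)
    new[i] = alpha[i] + s * (alpha[j] - a_j)  # keeps sum_k alpha_k y_k fixed
    new[j] = a_j
    return new


# Toy problem: x = -1 (label -1), x = +1 (label +1), linear kernel.
K = [[1.0, -1.0], [-1.0, 1.0]]
y = [-1, 1]
print(two_variable_step([0.0, 0.0], y, K, C=10.0, i=0, j=1))
print(two_variable_step([0.0, 0.0], y, K, C=0.3, i=0, j=1))
```

With a large $C$ a single step reaches the optimum of this two-point problem; with $C = 0.3$ the step is clipped at the box boundary.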

Stopping Criterion and Choice of $I$: Define

$$g_i = \frac{\partial L}{\partial \alpha_i} = 1 - y_i \sum_j \alpha_j y_j K_{i,j} \quad \text{and} \quad g_i^* = g_i \big|_{\alpha = \alpha^*}$$

where $\alpha^*$ is the optimum $\alpha$ vector.
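As a sanity check on the gradient formula, the following sketch compares $g_i = 1 - y_i \sum_j \alpha_j y_j K_{i,j}$ with a central finite difference of the dual objective. It assumes a symmetric kernel matrix; the function names, the step size `h`, and the toy numbers are illustrative choices.

```python
def dual_objective(alpha, y, K):
    """L = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K_ij."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def gradient(alpha, y, K):
    """Closed-form g_i; valid when K is symmetric."""
    n = len(alpha)
    return [1.0 - y[i] * sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            for i in range(n)]


def numeric_gradient(alpha, y, K, h=1e-5):
    """Central finite differences of the dual objective."""
    grads = []
    for i in range(len(alpha)):
        up = list(alpha)
        up[i] += h
        down = list(alpha)
        down[i] -= h
        grads.append((dual_objective(up, y, K) - dual_objective(down, y, K))
                     / (2 * h))
    return grads


# Toy symmetric kernel matrix and labels (made-up values).
K = [[2.0, 0.5, -0.3],
     [0.5, 1.0, 0.2],
     [-0.3, 0.2, 1.5]]
y = [1, -1, 1]
alpha = [0.2, 0.5, 0.3]

g = gradient(alpha, y, K)
g_num = numeric_gradient(alpha, y, K)
print(all(abs(a - b) < 1e-6 for a, b in zip(g, g_num)))
```

The two agree up to rounding because the objective is quadratic, so the central difference has no truncation error.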

We must always satisfy $\sum_i \alpha_i y_i = 0$, so if we increase one $\alpha_i y_i$ in SMO, we must decrease the other. We know $\alpha_i y_i \in [0, C]$ if $y_i = 1$ and $\alpha_i y_i \in [-C, 0]$ if $y_i = -1$. We will use the notation
