Minimum Volume Ellipsoid, Stochastic Subgradient, Exercises of Convex Optimization

Aliah University Convex Optimization

Prof. Chhaayank Buhpathi assigned this take-home exercise set for the Convex Optimization course at Aliah University. It covers: Vector, Machine, Stochastic, Subgradient, Linear, SVM, Error, Probability

Typology: Exercises

2011/2012

Uploaded on 07/15/2012

saeeda 🇮🇳

EE364b Prof. S. Boyd

EE364b Homework 4

1. Support vector machine training via stochastic subgradient. We suppose that feature-label pairs, $(x, y) \in \mathbf{R}^n \times \{-1, 1\}$, are generated from some distribution. We seek a linear classifier or predictor, of the form $\hat{y} = \operatorname{sign}(w^T x)$, where $w \in \mathbf{R}^n$ is the weight vector. (We can add an entry to $x$ that is always 1 to get an affine classifier.) Our classifier is correct when $y w^T x > 0$; since this expression is homogeneous in $w$, we can write this as $y w^T x \geq 1$. Thus, our goal is to choose $w$ so that $1 - y w^T x \leq 0$ with high probability.

A support vector machine (SVM) chooses $w^{\mathrm{svm}}$ as the minimizer of

$$f(w) = \mathbf{E}\,(1 - y w^T x)_+ + (\rho/2)\|w\|_2^2,$$

where $\rho > 0$ is a parameter. The first term is the average loss, and the second term is a quadratic regularizer. Finding $w^{\mathrm{svm}}$ involves solving a stochastic optimization problem. Explain how to (approximately) solve this stochastic optimization problem using the stochastic subgradient method, with one sample per subgradient step. In this context, the samples from the distribution are called data or examples, and the collection of these is called the training data. Since this method only processes one data sample in each step, it is called a streaming algorithm (since it does not have to store more than one data sample in each step).
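As a hedged sketch of one way to approach this (not necessarily the intended solution): for a single sample $(x^{(k)}, y^{(k)})$, a subgradient of the sampled objective $(1 - y w^T x)_+ + (\rho/2)\|w\|_2^2$ combines a subgradient of the hinge term with the gradient of the regularizer. Its expectation is a subgradient of $f$, so it can serve as the noisy subgradient. The iteration index $k$ and the step sizes $\alpha_k$ are notation introduced here; any diminishing, nonsummable rule such as $\alpha_k = 1/(\rho k)$ is an assumption, not something the problem prescribes.

```latex
g^{(k)} =
\begin{cases}
  -y^{(k)} x^{(k)} + \rho\, w^{(k)}, & 1 - y^{(k)} w^{(k)T} x^{(k)} > 0, \\
  \rho\, w^{(k)}, & \text{otherwise},
\end{cases}
\qquad
w^{(k+1)} = w^{(k)} - \alpha_k\, g^{(k)} .
```

Each step touches only the current sample, which is what makes the method streaming.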

Implement the stochastic subgradient method for a problem with $n = 20$, and $(x, y)$ samples generated using

    randn('state',0)
    w_true = randn(n,1); % 'true' weight vector
    % to get each data sample use snippet below
    x = randn(n,1);
    y = sign(w_true'*x+0.1*randn(1));

Experiment with the choice of $\rho$, the step size rule, and the number of iterations to run (but don't be afraid to run the algorithm for 10000 steps).
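A minimal sketch of such an implementation follows, assuming the sampling snippet from the problem statement. The values of `rho` and `num_steps`, the zero starting point, and the step size rule `1/(rho*k)` are illustrative assumptions to experiment with, not values given in the assignment.

```matlab
% Sketch: stochastic subgradient method for
%   f(w) = E (1 - y*w'*x)_+ + (rho/2)*||w||_2^2,
% using one fresh sample per step.
randn('state', 0);
n = 20;
w_true = randn(n, 1);          % 'true' weight vector, as in the problem

rho = 0.1;                     % assumed regularization parameter
num_steps = 10000;             % assumed number of iterations
w = zeros(n, 1);               % assumed starting point

for k = 1:num_steps
    % draw one data sample; it is discarded after this step (streaming)
    x = randn(n, 1);
    y = sign(w_true' * x + 0.1 * randn(1));

    % subgradient of the sampled objective at the current w
    if 1 - y * (w' * x) > 0
        g = -y * x + rho * w;  % hinge term is active
    else
        g = rho * w;           % only the regularizer contributes
    end

    alpha = 1 / (rho * k);     % assumed diminishing step size
    w = w - alpha * g;
end
```

Storing `w` at selected iterations (rather than at every step) keeps the later convergence plots affordable.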

To view the convergence, you can plot two quantities at each step: the optimality gap $f(w) - f^\star$ and the classifier error probability $\mathbf{Prob}(y w^T x \leq 0)$. To (approximately) compute these quantities, use a Monte Carlo method, using, say, 10000 samples. (You'll want to compute these 10000 samples, and evaluate the Monte Carlo estimates of the two quantities above, without using Matlab for loops. Also note that evaluation of these two quantities will be far more costly than each step of the stochastic subgradient method.) You can use CVX to estimate $f^\star$.
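The loop-free Monte Carlo evaluation could look like the sketch below. The matrix `W` of candidate weight vectors (one per column) and the value of `rho` are hypothetical placeholders; in practice `W` would hold the iterates you want to plot.

```matlab
% Sketch: vectorized Monte Carlo estimates of f(w) and Prob(y*w'*x <= 0).
randn('state', 0);
n = 20;
N = 10000;                                   % number of Monte Carlo samples
w_true = randn(n, 1);
rho = 0.1;                                   % assumed regularization parameter

X = randn(n, N);                             % each column is one sample x
Y = sign(w_true' * X + 0.1 * randn(1, N));   % 1-by-N row of labels

% hypothetical iterates to evaluate, one per column
W = [zeros(n, 1), w_true / norm(w_true), w_true];

M = bsxfun(@times, W' * X, Y);               % K-by-N margins y*w'*x
f_hat = mean(max(0, 1 - M), 2) + (rho / 2) * sum(W .^ 2, 1)';  % K-by-1
err_hat = mean(M <= 0, 2);                   % K-by-1 error estimates
```

The margin matrix `M` has one row per evaluated iterate, so evaluating every one of 10000 iterates at once would need a 10000-by-10000 array; evaluating a subsample of iterates, or processing them in batches, is one way around that. Minimizing the same sample average over $w$ in CVX gives an estimate of $f^\star$.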

2. Minimum volume ellipsoid covering a half-ellipsoid. In this problem we derive the update formulas used in the ellipsoid method, i.e., we will determine the minimum volume ellipsoid that contains the intersection of the ellipsoid

$$E = \{x \in \mathbf{R}^n \mid (x - x_c)^T P^{-1} (x - x_c) \leq 1\}$$

and the halfspace $H = \{x \mid g^T (x - x_c) \leq 0\}$. We'll assume that $n > 1$, since for $n = 1$ the problem is easy.

(a) We first consider a special case: $E$ is the unit ball centered at the origin ($P = I$, $x_c = 0$), and $g = -e_1$ ($e_1$ is the first unit vector), so $E \cap H = \{x \mid x^T x \leq 1,\ x_1 \geq 0\}$. Let $\tilde{E} = \{x \mid (x - \tilde{x}_c)^T \tilde{P}^{-1} (x - \tilde{x}_c) \leq 1\}$ denote the minimum volume ellipsoid containing $E \cap H$. Since $E \cap H$ is symmetric about the line through the first unit vector $e_1$, it is clear (and not too hard to show) that $\tilde{E}$ will have the same symmetry. This means that the matrix $\tilde{P}$ is diagonal, of the form $\tilde{P} = \operatorname{diag}(\alpha, \beta, \beta, \ldots, \beta)$, and that $\tilde{x}_c = \gamma e_1$ (where $\alpha, \beta > 0$ and $\gamma \geq 0$). So now we have only three variables to determine: $\alpha$, $\beta$, and $\gamma$. Express the volume of $\tilde{E}$ in terms of these variables, and also the constraint that $\tilde{E} \supseteq E \cap H$. Then solve the optimization problem directly, to show that

$$\alpha = \frac{n^2}{(n+1)^2}, \qquad \beta = \frac{n^2}{n^2 - 1}, \qquad \gamma = \frac{1}{n+1}$$

(which agrees with the formulas we gave, for this special case). Hint. To express $E \cap H \subseteq \tilde{E}$ in terms of the variables, it is necessary and sufficient for the conditions on $\alpha$, $\beta$, and $\gamma$ to hold on the boundary of $E \cap H$, i.e., at the points $x_1 = 0$, $x_2^2 + \cdots + x_n^2 \leq 1$, or the points $x_1 \geq 0$, $x_1^2 + x_2^2 + \cdots + x_n^2 = 1$.
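One possible way to set up the three-variable problem is sketched here, under the symmetry assumption stated above. The scalar $t$ (the value of $x_1$ on the spherical part of the boundary) is notation introduced for this sketch. The volume of $\tilde{E}$ is proportional to $\sqrt{\det \tilde{P}}$, and the condition on the flat face $x_1 = 0$ is tightest on its rim, which is the $t = 0$ case of the spherical condition, so both boundary conditions in the hint collapse to a single inequality in $t$:

```latex
\operatorname{vol}(\tilde{E}) \propto \sqrt{\det \tilde{P}} = \sqrt{\alpha\,\beta^{\,n-1}},
\qquad
\frac{(t-\gamma)^2}{\alpha} + \frac{1-t^2}{\beta} \le 1
\quad \text{for all } t \in [0,1].
```

As a consistency check, at the stated values of $\alpha$, $\beta$, and $\gamma$ the left-hand side equals $1/n^2 + (n^2-1)/n^2 = 1$ at $t = 0$ and $(n/(n+1))^2 (n+1)^2/n^2 = 1$ at $t = 1$. Since it is a convex quadratic in $t$ (its leading coefficient $1/\alpha - 1/\beta = (2n+2)/n^2$ is positive), it stays at or below 1 in between.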

(b) Now consider the general case, stated at the beginning of this problem. Show how to reduce the general case to the special case solved in part (a). Hint. Find an affine transformation that maps the original ellipsoid to the unit ball, and $g$ to $-e_1$. Explain why minimizing the volume in these transformed coordinates also minimizes the volume in the original coordinates.
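One candidate transformation, offered as a sketch rather than the intended answer: take $P^{1/2}$ to be the symmetric square root of $P$ and let $Q$ be any orthogonal matrix (for example a Householder reflection) that rotates the normalized direction $P^{1/2} g$ onto $-e_1$. The symbols $z$ and $Q$ are introduced here for illustration.

```latex
z = Q P^{-1/2}(x - x_c),
\qquad
Q^T Q = I,
\qquad
Q\,\frac{P^{1/2} g}{\sqrt{g^T P g}} = -e_1 .
```

Under this map $(x - x_c)^T P^{-1} (x - x_c) = z^T z$, so $E$ becomes the unit ball, and $g^T (x - x_c) = -\sqrt{g^T P g}\; z_1$, so $H$ becomes $\{z \mid z_1 \geq 0\}$. An invertible affine map sends ellipsoids to ellipsoids and multiplies every volume by the same constant, here $|\det(Q P^{-1/2})| = (\det P)^{-1/2}$, so the ordering of volumes is preserved in both coordinate systems.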
