

Prof. Chhaayank Buhpathi assigned this task to do at home for Convex Optimization course at Aliah University. It includes: Matrix, Norm, Approximation, Linear, Combination, Random, Instance, Fejer, Monotone, LP, Warning
Typology: Exercises


EE364b Prof. S. Boyd
minimize $\|x_1 A_1 + \cdots + x_n A_n - B\|$.
(a) Explain how to find a subgradient of the objective function at $x$.
(b) Generate a random instance of the problem with $n = 5$, $p = 3$, $q = 6$. Use CVX to find the optimal value $f^\star$ of the problem. Use a subgradient method to solve the problem, starting from $x = 0$. Plot $f - f^\star$ versus iteration. Experiment with several step size sequences.
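As a hedged sketch for part (a): the excerpt above does not say which matrix norm is meant, so the first two lines below hold for any norm (with $\|\cdot\|_*$ its dual norm), and the last line assumes the spectral norm (maximum singular value). The symbols $R(x)$, $G$, $u_1$, $v_1$ are introduced here for illustration and do not appear in the problem statement.

```latex
\begin{align*}
R(x) &= x_1 A_1 + \cdots + x_n A_n - B, \qquad f(x) = \|R(x)\|, \\
G \in \partial \|R(x)\| &\iff \|G\|_* \le 1 \ \text{ and } \ \mathbf{tr}\bigl(G^T R(x)\bigr) = \|R(x)\|, \\
g_i &= \mathbf{tr}\bigl(A_i^T G\bigr), \quad i = 1, \ldots, n
  \qquad \text{(chain rule for the affine map } x \mapsto R(x)\text{)}, \\
\text{spectral norm (assumed):} \quad G &= u_1 v_1^T
  \ \Longrightarrow \ g_i = u_1^T A_i v_1 .
\end{align*}
```

Here $u_1$ and $v_1$ denote left and right singular vectors of $R(x)$ associated with its largest singular value, and the vector $g = (g_1, \ldots, g_n)$ is then a subgradient of $f$ at $x$.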
$\|x^+ - x^\star\|_2 < \|x - x^\star\|_2$,
for any optimal point $x^\star$. This implies that $\mathbf{dist}(x^+, X^\star) < \mathbf{dist}(x, X^\star)$. (Methods in which successive iterates move closer to the optimal set are called Fejér monotone. Thus, the subgradient method, with Polyak's optimal step size, is Fejér monotone.)
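The Fejér monotone property can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: the test function $f(x) = \|x - c\|_1$ (so $f^\star = 0$ and $x^\star = c$), the function names, and the data are all hypothetical choices, not part of the assignment. It runs the subgradient method with the Polyak step size $\alpha = (f(x) - f^\star)/\|g\|_2^2$ and records the distance to $x^\star$.

```python
import math

def f(x, c):
    # Toy objective (an assumption for illustration): f(x) = ||x - c||_1.
    return sum(abs(xi - ci) for xi, ci in zip(x, c))

def subgradient(x, c):
    # sign(x - c) componentwise, with 0 where x_i == c_i, is a subgradient of f.
    return [(xi > ci) - (xi < ci) for xi, ci in zip(x, c)]

def distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def polyak_subgradient(x0, c, f_star=0.0, iters=20):
    """Subgradient method with Polyak's step size; returns the last iterate
    and the list of distances ||x - x_star||_2, where x_star = c."""
    x = list(x0)
    dists = [distance(x, c)]
    for _ in range(iters):
        gap = f(x, c) - f_star
        if gap <= 0.0:  # already optimal, no step to take
            break
        g = subgradient(x, c)
        alpha = gap / sum(gi * gi for gi in g)  # Polyak step size
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
        dists.append(distance(x, c))
    return x, dists

x, dists = polyak_subgradient([0.0, 0.0, 0.0], [1.0, -2.0, 0.5])
print(dists)
```

Each entry of `dists` should be strictly smaller than the previous one, which is the inequality above specialized to this example.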
(a) Work out alternating projections for this problem. (In other words, explain how to compute (Euclidean) projections onto $\{x \mid Ax = b\}$ and $\mathbf{R}^n_+$.)
(b) Implement your method, and try it on one or more problem instances with $m = 500$, $n = 2000$. With $x^{(k)}$ denoting the iterate after projection onto $\mathbf{R}^n_+$, plot $\|Ax^{(k)} - b\|_2$, the residual of the equality constraint. (This should converge to zero; you can terminate when this norm is smaller than $10^{-5}$.)
Here is a simple way to generate data which is feasible. First generate a random $A$, and a random $z$ with positive entries. Then set $b = Az$. (Of course, you cannot use $z$ in your alternating projections method; you must start from some obvious point, such as 0.)
Warning. When A is a fat matrix, the Matlab command A\b does not do what you might expect, i.e., compute a least-norm solution of Ax = b.
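For reference, the two projections in part (a) have closed forms. The block below is a sketch that assumes $A$ has full row rank (so $AA^T$ is invertible), which the problem does not state explicitly; the names $P_{\mathcal A}$ and $P_+$ are introduced here for illustration.

```latex
\begin{align*}
P_{\mathcal A}(x) &= x - A^T (A A^T)^{-1} (A x - b)
  && \text{(projection onto } \{x \mid Ax = b\}\text{)}, \\
\bigl(P_+(x)\bigr)_i &= \max\{x_i, 0\}, \quad i = 1, \ldots, n
  && \text{(projection onto } \mathbf{R}^n_+\text{)}, \\
x^{(k+1)} &= P_+\bigl(P_{\mathcal A}(x^{(k)})\bigr)
  && \text{(one round of alternating projections)}.
\end{align*}
```

In particular, $P_{\mathcal A}(0) = A^T (A A^T)^{-1} b$ is the least-norm solution of $Ax = b$ referred to in the warning.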
(c) Factorization caching is a general technique for speeding up some repeated calculations, such as projection onto an affine set. Assuming $A$ is dense, the cost of computing the projection of a point onto the affine set $\{x \mid Ax = b\}$ is $O(m^2 n)$ flops. (See Appendix C in Convex Optimization.) By saving some of the matrices involved in this computation, such as a Cholesky factorization (or even more directly, the inverse) of $AA^T$, subsequent projections can be carried out at a cost of $O(mn)$ flops, i.e., $m$ times faster. (There are several other ways to get this speedup, by saving other matrices.) Effectively, this makes each subgradient step (after the first one) a factor $m$ times cheaper. Explain how to do this, and implement a caching scheme in your code. Verify that you obtain a speedup. (You may have to try your code on a larger problem instance.)
(d) Over-projection. A general method that can speed up alternating projections is to over-project, which means replacing the simple projection $x^+ = P(x)$ with $x^+ = x + \gamma(P(x) - x)$, where $\gamma \in [1, 2)$. (When $\gamma = 1$, this reduces to standard projection.) It is not hard to show that alternating projections, with over-projection, converges to a point in the intersection of the sets.
Implement over-projection and experiment with the over-projection factor $\gamma$, observing the effect on the number of iterations required for convergence.
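Parts (c) and (d) can be sketched together on a small instance. The code below is a minimal pure-Python illustration, not a reference solution: the helper names (`cholesky`, `chol_solve`, `feasible_instance`, `alternating_projections`), the tiny problem size, and the choice to apply the same factor `gamma` to both projections are all assumptions made here. It factors $AA^T$ once, reuses the factor in every affine projection, and over-projects with factor `gamma`.

```python
import math
import random

def cholesky(M):
    """Lower-triangular L with L L^T = M, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, r):
    """Solve L L^T y = r by forward, then backward, substitution."""
    n = len(L)
    w = [0.0] * n
    for i in range(n):
        w[i] = (r[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (w[i] - sum(L[k][i] * y[k] for k in range(i + 1, n))) / L[i][i]
    return y

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def feasible_instance(m, n, seed=0):
    """Random A and b = A z with z > 0, as suggested in the problem."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    z = [rng.uniform(0.5, 1.5) for _ in range(n)]
    return A, matvec(A, z)

def alternating_projections(A, b, gamma=1.0, tol=1e-5, max_iter=50000):
    m, n = len(A), len(A[0])
    # Cached factorization: A A^T is formed and factored once, outside the loop.
    AAT = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
           for i in range(m)]
    L = cholesky(AAT)
    x = [0.0] * n  # start from the obvious point 0
    residuals = []
    for _ in range(max_iter):
        # Affine step: x - P(x) = A^T (A A^T)^{-1} (A x - b), over-projected.
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        y = chol_solve(L, r)
        step = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x = [xj - gamma * sj for xj, sj in zip(x, step)]
        # Orthant step: P(x) = max(x, 0), over-projected with the same gamma.
        x = [xj + gamma * (max(xj, 0.0) - xj) for xj in x]
        res = math.sqrt(sum((ax - bi) ** 2 for ax, bi in zip(matvec(A, x), b)))
        residuals.append(res)
        if res < tol:
            break
    return x, residuals

A, b = feasible_instance(3, 10)
for gamma in (1.0, 1.5):
    x, residuals = alternating_projections(A, b, gamma=gamma)
    print(gamma, len(residuals), residuals[-1])
```

With the factor cached, each affine projection costs two matrix-vector products plus two triangular solves. Comparing `len(residuals)` across values of `gamma` gives the iteration counts asked for in part (d); a timing comparison against refactoring $AA^T$ at every step is left out of this sketch.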
$\hat f^{(k)}(x) = \max_{i=1,\ldots,k} \left( f(x^{(i)}) + g^{(i)T} (x - x^{(i)}) \right)$,
which satisfies $\hat f^{(k)}(x) \leq f(x)$ for all $x$. It follows that
$L^{(k)} = \inf_{x \in C} \hat f^{(k)}(x) \leq p^\star$,
where $p^\star = \inf_{x \in C} f(x)$. Since $\hat f^{(k+1)}(x) \geq \hat f^{(k)}(x)$ for all $x$, we have $L^{(k+1)} \geq L^{(k)}$. In Kelley's cutting-plane algorithm, we set $x^{(k+1)}$ to be any point that minimizes $\hat f^{(k)}$ over $x \in C$. The algorithm can be terminated when $U^{(k)} - L^{(k)} \leq \epsilon$, where $U^{(k)} = \min_{i=1,\ldots,k} f(x^{(i)})$.
Use Kelley's cutting-plane algorithm to minimize the piecewise-linear function $f(x) = \max_{i=1,\ldots,m} (a_i^T x + b_i)$ that we have used for other numerical examples, with $C$ the unit cube, i.e., $C = \{x \mid \|x\|_\infty \leq 1\}$. The data that defines the particular function can be found in the Matlab directory of the subgradient notes on the course web site. You can start with $x^{(1)} = 0$ and run the algorithm for 40 iterations. Plot $f(x^{(k)})$, $U^{(k)}$, $L^{(k)}$ and the constant $p^\star$ (on the same plot) versus $k$.
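To make the bookkeeping of $U^{(k)}$ and $L^{(k)}$ concrete, here is a hedged one-dimensional sketch. It is a toy, not the assigned problem: the data `SLOPES` and `OFFSETS` are made up, $C$ is the interval $[-1, 1]$, and the model $\hat f^{(k)}$ is minimized by checking the endpoints and the pairwise intersections of the cuts rather than by solving an LP, which is what a multi-dimensional implementation would need.

```python
# Hypothetical scalar data for f(x) = max_i (a_i x + b_i); not the course data.
SLOPES = [-2.0, -0.5, 1.0, 3.0]
OFFSETS = [-1.0, 0.0, -0.5, -2.0]

def f(x):
    return max(a * x + b for a, b in zip(SLOPES, OFFSETS))

def subgrad(x):
    # The slope of any active affine piece is a subgradient of f at x.
    return max(zip(SLOPES, OFFSETS), key=lambda ab: ab[0] * x + ab[1])[0]

def kelley_1d(f, subgrad, lo=-1.0, hi=1.0, x1=0.0, eps=1e-6, max_iter=40):
    """Kelley's cutting-plane method on the interval [lo, hi].
    Returns a list of (k, f(x_k), U_k, L_k) tuples."""
    cuts = []  # (slope, intercept) of each cut f(x_i) + g_i (x - x_i)
    x, U = x1, float("inf")
    history = []
    for k in range(1, max_iter + 1):
        fx, g = f(x), subgrad(x)
        U = min(U, fx)                 # best objective value found so far
        cuts.append((g, fx - g * x))   # add the new cutting plane

        def model(t):
            return max(s * t + c for s, c in cuts)

        # A convex piecewise-linear function on an interval attains its
        # minimum at an endpoint or at an intersection of two pieces.
        candidates = [lo, hi]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                (s1, c1), (s2, c2) = cuts[i], cuts[j]
                if s1 != s2:
                    t = (c2 - c1) / (s1 - s2)
                    if lo <= t <= hi:
                        candidates.append(t)
        x_next = min(candidates, key=model)
        L = model(x_next)              # lower bound on p_star
        history.append((k, fx, U, L))
        if U - L <= eps:
            break
        x = x_next
    return history

for k, fx, U, L in kelley_1d(f, subgrad):
    print(k, fx, U, L)
```

The printed columns correspond to the quantities to be plotted: $f(x^{(k)})$, $U^{(k)}$, and $L^{(k)}$, with the lower bound nondecreasing in $k$.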