

























































































































































































Carnegie Mellon University, Pittsburgh, PA 15213 USA
Foreword
Optimization models play an increasingly important role in financial decisions. Many computational finance problems, ranging from asset allocation to risk management and from option pricing to model calibration, can be solved efficiently using modern optimization techniques. This course discusses several classes of optimization problems (including linear, quadratic, integer, dynamic, stochastic, conic, and robust programming) encountered in financial models. For each problem class, after introducing the relevant theory (optimality conditions, duality, etc.) and efficient solution methods, we discuss several problems of mathematical finance that can be modeled within this problem class. In addition to classical and well-known models such as Markowitz' mean-variance optimization model, we present some newer optimization models for a variety of financial problems.
Acknowledgements
This book has its origins in courses taught at Carnegie Mellon University in the Masters program in Computational Finance and in the MBA program at the Tepper School of Business (Gérard Cornuéjols), and at the Tokyo Institute of Technology, Japan, and the University of Coimbra, Portugal (Reha Tütüncü). We thank the attendants of these courses for their feedback and for many stimulating discussions. We would also like to thank the colleagues who provided the initial impetus for this project, especially Michael Trick, John Hooker, Sanjay Srivastava, Rick Green, Yanjun Li, Luís Vicente and Masakazu Kojima. Various drafts of this book were experimented with in class by Javier Peña, François Margot, Miroslav Karamanov and Kathie Cameron, and we thank them for their comments.
that satisfies

$$f(x^*) \le f(x), \quad \forall x \in S.$$

Such an $x^*$ is called a global minimizer of the problem (OP). If

$$f(x^*) < f(x), \quad \forall x \in S,\ x \ne x^*,$$

then $x^*$ is a strict global minimizer. In other instances, we may only find an $x^* \in S$ that satisfies

$$f(x^*) \le f(x), \quad \forall x \in S \cap B_{x^*}(\varepsilon)$$

for some $\varepsilon > 0$, where $B_{x^*}(\varepsilon)$ is the open ball with radius $\varepsilon$ centered at $x^*$, i.e.,

$$B_{x^*}(\varepsilon) = \{x : \|x - x^*\| < \varepsilon\}.$$

Such an $x^*$ is called a local minimizer of the problem (OP). A strict local minimizer is defined similarly.

In most cases, the feasible set $S$ is described explicitly using functional constraints (equalities and inequalities). For example, $S$ may be given as

$$S := \{x : g_i(x) = 0,\ i \in E \ \text{and} \ g_i(x) \ge 0,\ i \in I\},$$

where $E$ and $I$ are the index sets for equality and inequality constraints. Then, our generic optimization problem takes the following form:

$$\text{(OP)} \quad \begin{array}{ll} \min_x & f(x) \\ & g_i(x) = 0, \quad i \in E \\ & g_i(x) \ge 0, \quad i \in I. \end{array}$$
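The distinction between local and global minimizers can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the quartic `f`, the feasible interval S = [-2, 2], and the grid resolution are all hypothetical choices, and comparing a grid point with its two neighbours is a discrete stand-in for the ball $B_{x^*}(\varepsilon)$.

```python
# Illustrative sketch: find grid points that behave like local and global
# minimizers of a 1-D problem. The function, interval and grid are assumptions.

def f(x):
    # Hypothetical quartic with two local minima of different depth.
    return x**4 - 2 * x**2 + 0.5 * x

def make_grid(lo, hi, n):
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]

def local_minimizers(values):
    """Indices whose value is <= both grid neighbours (discrete analogue of
    f(x*) <= f(x) for all feasible x in a small ball around x*)."""
    idx = []
    last = len(values) - 1
    for k in range(len(values)):
        left_ok = k == 0 or values[k] <= values[k - 1]
        right_ok = k == last or values[k] <= values[k + 1]
        if left_ok and right_ok:
            idx.append(k)
    return idx

xs = make_grid(-2.0, 2.0, 4001)          # feasible set S = [-2, 2], step 0.001
vals = [f(x) for x in xs]
local_idx = local_minimizers(vals)       # candidates for local minimizers
global_idx = min(range(len(xs)), key=lambda k: vals[k])  # best over all of S

for k in local_idx:
    kind = "global" if k == global_idx else "local only"
    print(f"x = {xs[k]:.3f}, f(x) = {vals[k]:.4f} ({kind})")
```

On this example the grid search reports two local minimizers, only one of which is also the global minimizer.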
Many factors affect whether optimization problems can be solved efficiently. For example, the number $n$ of decision variables, and the total number of constraints $|E| + |I|$, are generally good predictors of how difficult it will be to solve a given optimization problem. Other factors are related to the properties of the functions $f$ and $g_i$ that define the problem. Problems with a linear objective function and linear constraints are easier, as are problems with convex objective functions and convex feasible sets. For this reason, instead of general purpose optimization algorithms, researchers have developed different algorithms for problems with special characteristics. We list the main types of optimization problems we will encounter. A more complete list can be found, for example, on the Optimization Tree available from http://www-fp.mcs.anl.gov/otc/Guide/OptWeb/.
One of the most common and easiest optimization problems is linear optimization or linear programming (LP). It is the problem of optimizing a linear objective function subject to linear equality and inequality constraints. This corresponds to the case in (OP) where the functions $f$ and $g_i$ are all linear. If either $f$ or one of the functions $g_i$ is not linear, then the resulting problem is a nonlinear programming (NLP) problem.
The standard form of the LP is given below:

$$\text{(LP)} \quad \begin{array}{ll} \min_x & c^T x \\ & Ax = b \\ & x \ge 0, \end{array} \tag{1.3}$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$ are given, and $x \in \mathbb{R}^n$ is the variable vector to be determined. In this book, a $k$-vector is also viewed as a $k \times 1$ matrix. For an $m \times n$ matrix $M$, the notation $M^T$ denotes the transpose matrix, namely the $n \times m$ matrix with entries $M^T_{ij} = M_{ji}$. As an example, in the above formulation $c^T$ is a $1 \times n$ matrix and $c^T x$ is the $1 \times 1$ matrix with entry $\sum_{j=1}^n c_j x_j$. The objective in (1.3) is to minimize the linear function $\sum_{j=1}^n c_j x_j$.

As with OP, the problem LP is said to be feasible if its constraints are consistent and it is called unbounded if there exists a sequence of feasible vectors $\{x^k\}$ such that $c^T x^k \to -\infty$. When LP is feasible but not unbounded it has an optimal solution, i.e., a vector $x$ that satisfies the constraints and minimizes the objective value among all feasible vectors. The best known (and most successful) methods for solving LPs are the interior-point and simplex methods.
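To make the standard form concrete, here is a minimal sketch that solves a tiny LP by brute force. It is not the simplex or an interior-point method: it simply enumerates every basic solution of $Ax = b$ (choosing two of the columns, since this hypothetical instance has $m = 2$ constraints), discards those violating $x \ge 0$, and keeps the one with the smallest $c^T x$. This relies on the LP being feasible and not unbounded, and it is only practical for very small problems. The data `A`, `b`, `c` are made up: they encode "maximize $3x_1 + 2x_2$ subject to $x_1 + x_2 \le 4$, $x_1 + 3x_2 \le 6$" after adding two slack variables.

```python
from fractions import Fraction
from itertools import combinations

def solve_2x2(M, rhs):
    """Solve the 2x2 system M y = rhs by Cramer's rule; None if M is singular."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        return None
    y0 = Fraction(rhs[0] * M[1][1] - M[0][1] * rhs[1], det)
    y1 = Fraction(M[0][0] * rhs[1] - rhs[0] * M[1][0], det)
    return [y0, y1]

def best_basic_feasible_solution(A, b, c):
    """Enumerate basic solutions of Ax = b (m = 2) and return the feasible one
    with the smallest objective value c^T x."""
    n = len(c)
    best_x, best_val = None, None
    for basis in combinations(range(n), 2):
        M = [[A[i][j] for j in basis] for i in range(2)]
        y = solve_2x2(M, b)
        if y is None or min(y) < 0:       # singular basis or violates x >= 0
            continue
        x = [Fraction(0)] * n
        for j, v in zip(basis, y):
            x[j] = v
        val = sum(cj * xj for cj, xj in zip(c, x))
        if best_val is None or val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical data in standard form: min c^T x, Ax = b, x >= 0.
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
b = [4, 6]
c = [-3, -2, 0, 0]

best_x, best_val = best_basic_feasible_solution(A, b, c)
print(best_x, best_val)
```

For realistic problem sizes one would call an LP solver instead of enumerating bases, whose number grows combinatorially with $n$ and $m$.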
A more general optimization problem is the quadratic optimization or the quadratic programming (QP) problem, where the objective function is now a quadratic function of the variables. The standard form QP is defined as follows:

$$\text{(QP)} \quad \begin{array}{ll} \min_x & \frac{1}{2} x^T Q x + c^T x \\ & Ax = b \\ & x \ge 0, \end{array}$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, $Q \in \mathbb{R}^{n \times n}$ are given, and $x \in \mathbb{R}^n$. Since $x^T Q x = \frac{1}{2} x^T (Q + Q^T) x$, one can assume without loss of generality that $Q$ is symmetric, i.e. $Q_{ij} = Q_{ji}$. The objective function of the problem QP is a convex function of $x$ when $Q$ is a positive semidefinite matrix, i.e., when $y^T Q y \ge 0$ for all $y$ (see the Appendix for a discussion on convex functions). This condition is equivalent to $Q$ having only nonnegative eigenvalues. When this condition is satisfied, the QP problem is a convex optimization problem and can be solved in polynomial time using interior-point methods. Here we are referring to a classical notion used to measure computational complexity. Polynomial time algorithms are efficient in the sense that they always find an optimal solution in an amount of time that is guaranteed to be at most a polynomial function of the input size.
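The symmetrization identity and the eigenvalue test for convexity can be checked on a small example. The sketch below assumes a hypothetical nonsymmetric 2x2 matrix `Q` and test vector `x`; it uses the closed-form eigenvalues of a symmetric 2x2 matrix, so it does not generalize beyond that size without a numerical eigenvalue routine.

```python
import math

def quad_form(Q, x):
    """Return x^T Q x."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

def symmetrize(Q):
    """Return (Q + Q^T) / 2, which defines the same quadratic form as Q."""
    n = len(Q)
    return [[0.5 * (Q[i][j] + Q[j][i]) for j in range(n)] for i in range(n)]

def eigenvalues_sym_2x2(S):
    """Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    return mean - radius, mean + radius

Q = [[2.0, 3.0], [-1.0, 2.0]]   # hypothetical, deliberately not symmetric
S = symmetrize(Q)               # symmetric matrix with the same quadratic form
x = [1.5, -2.0]                 # arbitrary test vector

lam_min, lam_max = eigenvalues_sym_2x2(S)
is_psd = lam_min >= 0           # nonnegative eigenvalues <=> convex objective

print(quad_form(Q, x), quad_form(S, x), (lam_min, lam_max), is_psd)
```

Both quadratic forms agree on the test vector, and the smallest eigenvalue of the symmetrized matrix is positive, so a QP with this `Q` would have a convex objective.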
Another generalization of (LP) is obtained when the nonnegativity constraints $x \ge 0$ are replaced by general conic inclusion constraints. This is
where A, b, c are given data and the integer p (with 1 ≤ p < n) is also part of the input.
Dynamic programming refers to a computational method involving recurrence relations. This technique was developed by Richard Bellman in the early 1950's. It arose from studying programming problems in which changes over time were important, thus the name "dynamic programming". However, the technique can also be applied when time is not a relevant factor in the problem. The idea is to divide the problem into "stages" in order to perform the optimization recursively. It is possible to incorporate stochastic elements into the recursion.
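As a small illustration of dividing a problem into stages, consider allocating a few indivisible units of capital among several projects, where time plays no role. The sketch below treats each project as a stage and applies a recurrence on (stage, remaining budget). The payoff table and the budget are hypothetical, and memoization via `lru_cache` is just one convenient way to implement the recursion.

```python
from functools import lru_cache

# payoff[t][k]: hypothetical payoff from giving k units of capital to project t.
payoff = [
    [0, 5, 8, 9],
    [0, 4, 9, 11],
    [0, 6, 7, 12],
]
BUDGET = 3  # total units available (assumption)

@lru_cache(maxsize=None)
def value(stage, remaining):
    """Best total payoff obtainable from projects stage, stage+1, ...
    when `remaining` units are still unallocated."""
    if stage == len(payoff):
        return 0  # no projects left
    # Recurrence: try every allocation k to this stage, then recurse.
    return max(
        payoff[stage][k] + value(stage + 1, remaining - k)
        for k in range(remaining + 1)
    )

best = value(0, BUDGET)
print(best)
```

Each subproblem `value(stage, remaining)` is solved once and reused, which is what makes the recursion cheaper than enumerating all allocations.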
1.2 Optimization with Data Uncertainty
In all the problem classes we discussed so far (except dynamic programming), we made the implicit assumption that the data of the problem, namely the parameters such as Q, A, b and c in QP, are all known. This is not always the case. Often, the problem parameters correspond to quantities that will only be realized in the future, or cannot be known exactly at the time the problem must be formulated and solved. Such situations are especially common in models involving financial quantities such as returns on investments, risks, etc. We will discuss two fundamentally different approaches that address optimization with data uncertainty. Stochastic programming is an approach used when the data uncertainty is random and can be explained by some probability distribution. Robust optimization is used when one wants a solution that behaves well in all possible realizations of the uncertain data. These two alternative approaches are not problem classes (as in LP, QP, etc.) but rather modeling techniques for addressing data uncertainty.
The term stochastic programming refers to an optimization problem in which some problem data are random. The underlying optimization problem might be a linear program, an integer program, or a nonlinear program. An important case is that of stochastic linear programs.

A stochastic program with recourse arises when some of the decisions (recourse actions) can be taken after the outcomes of some (or all) random events have become known. For example, a two-stage stochastic linear program with recourse can be written as follows:

$$\begin{array}{ll} \max_x & a^T x + E\left[\max_{y(\omega)} c(\omega)^T y(\omega)\right] \\ & Ax = b \\ & B(\omega) x + C(\omega) y(\omega) = d(\omega) \\ & x \ge 0, \quad y(\omega) \ge 0, \end{array}$$
where the first-stage decisions are represented by vector $x$ and the second-stage decisions by vector $y(\omega)$, which depend on the realization of a random event $\omega$. $A$ and $b$ define deterministic constraints on the first-stage decisions $x$, whereas $B(\omega)$, $C(\omega)$, and $d(\omega)$ define stochastic linear constraints linking the recourse decisions $y(\omega)$ to the first-stage decisions. The objective function contains a deterministic term $a^T x$ and the expectation of the second-stage objective $c(\omega)^T y(\omega)$ taken over all realizations of the random event $\omega$.

Note that, once the first-stage decisions $x$ have been made and the random event $\omega$ has been realized, one can compute the optimal second-stage decisions by solving the following linear program:

$$\begin{array}{rll} f(x, \omega) = & \max & c(\omega)^T y(\omega) \\ & & C(\omega) y(\omega) = d(\omega) - B(\omega) x \\ & & y(\omega) \ge 0. \end{array}$$

Let $f(x) = E[f(x, \omega)]$ denote the expected value of the optimal value of this problem. Then, the two-stage stochastic linear program becomes

$$\begin{array}{ll} \max & a^T x + f(x) \\ & Ax = b \\ & x \ge 0. \end{array}$$

Thus, if the (possibly nonlinear) function $f(x)$ is known, the problem reduces to a nonlinear programming problem. When the data $c(\omega)$, $B(\omega)$, $C(\omega)$, and $d(\omega)$ are described by finite distributions, one can show that $f$ is piecewise linear and concave. When the data are described by probability densities that are absolutely continuous and have finite second moments, one can show that $f$ is differentiable and concave. In both cases, we have a convex optimization problem with linear constraints for which specialized algorithms are available.
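A toy instance may help to see why $f$ is piecewise linear and concave under a finite distribution. The sketch below is a hypothetical example, not a model from the text: a single first-stage decision $x$ (units ordered at a unit cost, so $a$ is the negative of that cost), a single recourse decision $y(\omega)$ (units sold, limited by both $x$ and a random demand), three demand scenarios, and no first-stage equality constraints. For this instance the second-stage LP has the closed-form optimal value $c \cdot \min(x, d(\omega))$, so no LP solver is needed.

```python
from fractions import Fraction

# Hypothetical finite distribution: (probability, demand) pairs.
scenarios = [(Fraction(3, 10), 10), (Fraction(1, 2), 20), (Fraction(1, 5), 30)]
UNIT_COST = 2    # first-stage objective coefficient is a = -UNIT_COST
UNIT_PRICE = 3   # second-stage objective coefficient c

def second_stage_value(x, demand):
    """f(x, omega): max c*y subject to y <= x, y <= demand, y >= 0."""
    return UNIT_PRICE * min(x, demand)

def expected_recourse(x):
    """f(x) = E[f(x, omega)] under the finite scenario distribution."""
    return sum(p * second_stage_value(x, d) for p, d in scenarios)

def total_objective(x):
    """a^T x + f(x) for this one-dimensional instance."""
    return -UNIT_COST * x + expected_recourse(x)

# f is piecewise linear with breakpoints at the scenario demands, so the
# concave objective attains its maximum over x >= 0 at x = 0 or a breakpoint.
candidates = [0] + [d for _, d in scenarios]
best_x = max(candidates, key=total_objective)
best_value = total_objective(best_x)

# Slopes of f on consecutive pieces; they decrease, which reflects concavity.
slopes = [
    (expected_recourse(hi) - expected_recourse(lo)) / (hi - lo)
    for lo, hi in zip(candidates, candidates[1:])
]
print(best_x, best_value, slopes)
```

In larger models the second-stage value has no closed form and each evaluation of $f(x, \omega)$ requires solving an LP per scenario.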
Robust optimization refers to the modeling of optimization problems with data uncertainty to obtain a solution that is guaranteed to be "good" for all possible realizations of the uncertain parameters. In this sense, this approach departs from the randomness assumption used in stochastic optimization for uncertain parameters and gives the same importance to all possible realizations. Uncertainty in the parameters is described through uncertainty sets that contain all (or most) possible values that can be realized by the uncertain parameters.

There are different definitions and interpretations of robustness and the resulting models differ accordingly. One important concept is constraint robustness, often called model robustness in the literature. This refers to solutions that remain feasible for all possible values of the uncertain inputs. This type of solution is required in several engineering applications. Here is an example adapted from Ben-Tal and Nemirovski. Consider a multiphase engineering process (a chemical distillation process, for example) and a related process optimization problem that includes balance constraints (materials entering a phase of the process cannot exceed what is used in
roots of this trend in the portfolio selection models and methods described by Markowitz in the 1950's and the option pricing formulas developed by Black, Scholes, and Merton in the late 1960's. For the enormous effect these works produced on modern financial practice, Markowitz was awarded the Nobel prize in Economics in 1990, while Scholes and Merton won the Nobel prize in Economics in 1997.

Below, we introduce topics in finance that are especially suited for mathematical analysis and involve sophisticated tools from mathematical sciences.
The theory of optimal selection of portfolios was developed by Harry Markowitz in the 1950's. His work formalized the diversification principle in portfolio selection and, as mentioned above, earned him the 1990 Nobel prize for Economics. Here we give a brief description of the model and relate it to QPs.

Consider an investor who has a certain amount of money to be invested in a number of different securities (stocks, bonds, etc.) with random returns. For each security $i = 1, \ldots, n$, estimates of its expected return $\mu_i$ and variance $\sigma_i^2$ are given. Furthermore, for any two securities $i$ and $j$, their correlation coefficient $\rho_{ij}$ is also assumed to be known. If we represent the proportion of the total funds invested in security $i$ by $x_i$, one can compute the expected return and the variance of the resulting portfolio $x = (x_1, \ldots, x_n)$ as follows:

$$E[x] = x_1 \mu_1 + \ldots + x_n \mu_n = \mu^T x,$$

and

$$\mathrm{Var}[x] = \sum_{i,j} \rho_{ij} \sigma_i \sigma_j x_i x_j = x^T Q x,$$

where $\rho_{ii} \equiv 1$, $Q_{ij} = \rho_{ij} \sigma_i \sigma_j$, and $\mu = (\mu_1, \ldots, \mu_n)$. The portfolio vector $x$ must satisfy $\sum_i x_i = 1$ and there may or may not be additional feasibility constraints.

A feasible portfolio $x$ is called efficient if it has the maximal expected return among all portfolios with the same variance, or alternatively, if it has the minimum variance among all portfolios that have at least a certain expected return. The collection of efficient portfolios forms the efficient frontier of the portfolio universe.

Markowitz' portfolio optimization problem, also called the mean-variance optimization (MVO) problem, can be formulated in three different but equivalent ways. One formulation results in the problem of finding a minimum variance portfolio of the securities 1 to $n$ that yields at least a target value $R$ of expected return. Mathematically, this formulation produces a convex quadratic programming problem:
$$\begin{array}{ll} \min_x & x^T Q x \\ & e^T x = 1 \\ & \mu^T x \ge R \\ & x \ge 0, \end{array} \tag{1.15}$$

where $e$ is the $n$-dimensional vector all of whose components are equal to 1.
As an alternative to problem (1.15), we may choose to maximize the expected return of a portfolio while limiting the variance of its return. Or, we can maximize a risk-adjusted expected return which is defined as the expected return minus a multiple of the variance. These two formulations are essentially equivalent to (1.15) as we will see in Chapter 8.
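The minimum-variance formulation can be tried out on a two-security example. The sketch below is a rough illustration under stated assumptions: the expected returns, standard deviations, correlation matrix, and target $R$ are hypothetical, and the QP is "solved" by a crude grid search over $x_1 \in [0, 1]$ (with $x_2 = 1 - x_1$) rather than by a QP solver. Exact rational arithmetic is used so that the constraint $\mu^T x \ge R$ is tested without rounding error.

```python
from fractions import Fraction as F

# Hypothetical inputs for two securities.
mu = [F(10, 100), F(5, 100)]        # expected returns
sigma = [F(20, 100), F(10, 100)]    # standard deviations
rho = [[F(1), F(0)], [F(0), F(1)]]  # correlation matrix, rho_ii = 1
R = F(7, 100)                       # target expected return

n = len(mu)
# Covariance matrix with entries Q_ij = rho_ij * sigma_i * sigma_j.
Q = [[rho[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]

def expected_return(x):
    """mu^T x"""
    return sum(m * xi for m, xi in zip(mu, x))

def variance(x):
    """x^T Q x"""
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

# Grid search over portfolios with e^T x = 1 and x >= 0.
GRID = 1000
best_x, best_var = None, None
for k in range(GRID + 1):
    x = [F(k, GRID), 1 - F(k, GRID)]
    if expected_return(x) < R:      # violates mu^T x >= R
        continue
    v = variance(x)
    if best_var is None or v < best_var:
        best_x, best_var = x, v

print(best_x, best_var)
```

With these numbers the return constraint is binding: the unconstrained minimum-variance mix would put less weight on the first security but would fall short of the target return.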
The model (1.15) is rather versatile. For example, if short sales are permitted on some or all of the securities, then this can be incorporated into the model simply by removing the nonnegativity constraint on the corresponding variables. If regulations or investor preferences limit the amount of investment in a subset of the securities, the model can be augmented with a linear constraint to reflect such a limit. In principle, any linear constraint can be added to the model without making it significantly harder to solve.
Asset allocation problems have the same mathematical structure as portfolio selection problems. In these problems the objective is not to choose a portfolio of stocks (or other securities) but to determine the optimal investment among a set of asset classes. Examples of asset classes are large capitalization stocks, small capitalization stocks, foreign stocks, government bonds, corporate bonds, etc. There are many mutual funds focusing on specific asset classes and one can therefore conveniently invest in these asset classes by purchasing the relevant mutual funds. After estimating the expected returns, variances, and covariances for different asset classes, one can formulate a QP identical to (1.15) and obtain efficient portfolios of these asset classes.
A different strategy for portfolio selection is to try to mirror the movements of a broad market population using a significantly smaller number of securities. Such a portfolio is called an index fund. No effort is made to identify mispriced securities. The assumption is that the market is efficient and therefore no superior risk-adjusted returns can be achieved by stock picking strategies since the stock prices reflect all the information available in the marketplace. Whereas actively managed funds incur transaction costs which reduce their overall performance, index funds are not actively traded and incur low management fees. They are typical of a passive management strategy.

How do investment companies construct index funds? There are numerous ways of doing this. One way is to solve a clustering problem where similar stocks have one representative in the index fund. This naturally leads to an integer programming formulation.
and cash (borrowed or lent) today, such that the payoff from the portfolio at the expiration date of the option will match the payoff of the option? Note that the option payoff will be \$30 if the price of the stock goes up and \$0 if it goes down. Assume this portfolio has $\Delta$ shares of XYZ and \$$B$ cash. This portfolio would be worth $40\Delta + B$ today. Next month, payoffs for this portfolio will be:

$$\begin{array}{ll} \text{up:} & P_1(u) = 80\Delta + B \\ \text{down:} & P_1(d) = 20\Delta + B \end{array}$$

Let us choose $\Delta$ and $B$ such that

$$\begin{array}{l} 80\Delta + B = 30 \\ 20\Delta + B = 0, \end{array}$$

so that the portfolio replicates the payoff of the option at the expiration date. This gives $\Delta = \frac{1}{2}$ and $B = -10$, which is the hedge we were looking for. This portfolio is worth $P_0 = 40\Delta + B = 10$ dollars today; therefore, the fair price of the option must also be \$10.
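The replication argument amounts to solving two linear equations in two unknowns. The sketch below does this with exact fractions; the function name `replicate` and its parameter names are hypothetical, while the numbers (stock prices 80 and 20 next month, option payoffs 30 and 0, stock price 40 today) are those of the example.

```python
from fractions import Fraction

def replicate(s_up, s_down, payoff_up, payoff_down):
    """Solve  s_up * delta + cash = payoff_up
              s_down * delta + cash = payoff_down
    for the number of shares (delta) and the cash position."""
    delta = Fraction(payoff_up - payoff_down, s_up - s_down)
    cash = payoff_up - s_up * delta
    return delta, cash

delta, cash = replicate(s_up=80, s_down=20, payoff_up=30, payoff_down=0)
price_today = 40 * delta + cash   # value of the replicating portfolio today

print(delta, cash, price_today)
```

Subtracting the second equation from the first eliminates the cash term, which is why `delta` is the ratio of the payoff spread to the stock-price spread.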
Risk is inherent in most economic activities. This is especially true of financial activities where results of decisions made today may have many possible different outcomes depending on future events. Since companies cannot usually insure themselves completely against risk, they have to manage it. This is a hard task even with the support of advanced mathematical techniques. Poor risk management led to several spectacular failures in the financial industry during the 1990's (e.g., Barings Bank, Long Term Capital Management, Orange County).

A coherent approach to risk management requires quantitative risk measures that adequately reflect the vulnerabilities of a company. Examples of risk measures include portfolio variance as in the Markowitz MVO model, the Value-at-Risk (VaR) and the expected shortfall (also known as conditional Value-at-Risk, or CVaR). Furthermore, risk control techniques need to be developed and implemented to adapt to rapid changes in the values of these risk measures. Government regulators already mandate that financial institutions control their holdings in certain ways and place margin requirements for "risky" positions.

Optimization problems encountered in financial risk management often take the following form. Optimize a performance measure (such as expected investment return) subject to the usual operating constraints and the constraint that a particular risk measure for the company's financial holdings does not exceed a prescribed amount. Mathematically, we may have the following problem:

$$\begin{array}{ll} \max_x & \mu^T x \\ & \mathrm{RM}[x] \le \gamma \\ & e^T x = 1 \\ & x \ge 0. \end{array} \tag{1.16}$$
As in the Markowitz MVO model, $x_i$ represents the proportion of the total funds invested in security $i$. The objective is the expected portfolio return and $\mu$ is the expected return vector for the different securities. $\mathrm{RM}[x]$ denotes the value of a particular risk measure for portfolio $x$ and $\gamma$ is the prescribed upper limit on this measure. Since $\mathrm{RM}[x]$ is generally a nonlinear function of $x$, (1.16) is a nonlinear programming problem. Alternatively, we can minimize the risk measure while constraining the expected return of the portfolio to achieve or exceed a given target value $R$. This would produce a problem very similar to (1.15).
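One way to see the structure of the risk-constrained problem is to pass the risk measure in as a function. The sketch below is an assumption-laden illustration: it takes portfolio variance as the risk measure (one of the examples named above), uses made-up data for two securities and a made-up limit `gamma`, and replaces a nonlinear programming solver by a plain grid search. The helper names `variance_risk` and `solve_risk_constrained` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical inputs for two securities.
mu = [F(12, 100), F(6, 100)]        # expected returns
sigma = [F(30, 100), F(10, 100)]    # standard deviations
rho = [[F(1), F(0)], [F(0), F(1)]]  # correlation matrix
gamma = F(18, 1000)                 # prescribed upper limit on the risk measure

def variance_risk(x):
    """Portfolio variance, used here as the risk measure RM[x]."""
    n = len(x)
    return sum(rho[i][j] * sigma[i] * sigma[j] * x[i] * x[j]
               for i in range(n) for j in range(n))

def solve_risk_constrained(risk_measure, limit, grid=1000):
    """Grid search for: max mu^T x  s.t.  RM[x] <= limit, e^T x = 1, x >= 0."""
    best_x, best_ret = None, None
    for k in range(grid + 1):
        x = [F(k, grid), 1 - F(k, grid)]
        if risk_measure(x) > limit:     # violates the risk limit
            continue
        ret = sum(m * xi for m, xi in zip(mu, x))
        if best_ret is None or ret > best_ret:
            best_x, best_ret = x, ret
    return best_x, best_ret

best_x, best_ret = solve_risk_constrained(variance_risk, gamma)
print(best_x, best_ret)
```

Swapping in a different `risk_measure` function changes the feasible set but not the overall shape of the search, which mirrors how the formulation treats the risk measure as a pluggable constraint.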
How should a financial institution manage its assets and liabilities? A static mean-variance optimizing model, such as the one we discussed for asset allocation, fails to incorporate the multiple liabilities faced by financial institutions. Furthermore, it penalizes returns both above and below the mean. A multi-period model that emphasizes the need to meet liabilities in each period for a finite (or possibly infinite) horizon is often required. Since liabilities and asset returns usually have random components, their optimal management requires tools of "Optimization under Uncertainty" and most notably, stochastic programming approaches.

Let $L_t$ be the liability of the company in period $t$ for $t = 1, \ldots, T$. Here, we assume that the liabilities $L_t$ are random with known distributions. A typical problem to solve in asset/liability management is to determine which assets (and in what quantities) the company should hold in each period to maximize its expected wealth at the end of period $T$. We can further assume that the asset classes the company can choose from have random returns (again, with known distributions) denoted by $R_{it}$ for asset class $i$ in period $t$. Since the company can make the holding decisions for each period after observing the asset returns and liabilities in the previous periods, the resulting problem can be cast as a stochastic program with recourse:
$$\begin{array}{ll} \max_x & E\left[\sum_i x_{i,T}\right] \\ & \sum_i (1 + R_{it})\, x_{i,t-1} - \sum_i x_{i,t} = L_t, \quad t = 1, \ldots, T \\ & x_{i,t} \ge 0 \quad \forall i, t. \end{array}$$

The objective function represents the expected total wealth at the end of the last period. The constraints indicate that the surplus left after liability $L_t$ is covered will be invested as follows: $x_{i,t}$ invested in asset class $i$. In this formulation, $x_{i,0}$ are the fixed, and possibly nonzero initial positions in the different asset classes.
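The balance constraints can be traced through a small simulation. The sketch below does not solve the stochastic program: it evaluates one simple feasible policy, a fixed-mix rule that reinvests the surplus in constant proportions, on two equally likely scenarios. The initial positions, returns, liabilities, weights, and the function name `simulate_fixed_mix` are all hypothetical; an optimal recourse policy could adapt its holdings to the observed history and would do at least as well.

```python
def simulate_fixed_mix(x0, returns, liabilities, weights):
    """Roll holdings forward so that, in each period t,
    sum_i (1 + R_it) * x_{i,t-1} - sum_i x_{i,t} = L_t,
    with the surplus split across asset classes by `weights`."""
    holdings = list(x0)
    for R_t, L_t in zip(returns, liabilities):
        wealth = sum((1 + r) * x for r, x in zip(R_t, holdings))
        surplus = wealth - L_t            # what is left after paying L_t
        if surplus < 0:
            raise ValueError("liability cannot be met in this scenario")
        holdings = [w * surplus for w in weights]  # x_{i,t} >= 0
    return holdings

x0 = [60.0, 40.0]          # initial positions x_{i,0} (assumption)
liabilities = [10.0, 10.0] # L_1, L_2, taken as deterministic here
weights = [0.5, 0.5]       # fixed-mix policy (assumption)

# Two equally likely return scenarios, each a list of [R_1t, R_2t] per period.
scenario_up = [[0.10, 0.02], [0.10, 0.02]]
scenario_down = [[-0.10, 0.02], [-0.10, 0.02]]

terminal_up = sum(simulate_fixed_mix(x0, scenario_up, liabilities, weights))
terminal_down = sum(simulate_fixed_mix(x0, scenario_down, liabilities, weights))
expected_terminal = 0.5 * terminal_up + 0.5 * terminal_down

print(terminal_up, terminal_down, expected_terminal)
```

The printed expected terminal wealth is the objective value of this particular policy, which gives a lower bound on the optimal value of the recourse problem for the same scenarios.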