Optimization Problem: Maximizing and Minimizing Functions with Constraints, Study notes of Calculus

Notes on optimization problems, which involve maximizing or minimizing a function subject to certain constraints. Optimization problems can be categorized by distinguishing features, such as being descriptive or prescriptive, linear or nonlinear, convex or nonconvex, and can be analyzed with techniques like differential calculus or subdifferential calculus. The notes include an example of an optimization problem in engineering design, where the goal is to find the proportions of a can that meet certain performance specifications while minimizing the cost. The document also mentions other applications of optimization, such as inventory management and approximation problems.

Typology: Study notes

2021/2022

Uploaded on 07/05/2022

allan.dev
1. WHAT IS OPTIMIZATION?
Optimization problem: Maximizing or minimizing some function relative to some set,
often representing a range of choices available in a certain situation. The function
allows comparison of the different choices for determining which might be “best.”
Common applications: Minimal cost, maximal profit, minimal error, optimal design,
optimal management, variational principles.
Goals of the subject: The understanding of
Modeling issues—
What to look for in setting up an optimization problem?
What features are advantageous or disadvantageous?
What devices/tricks of formulation are available?
How can problems usefully be categorized?
Analysis of solutions—
What is meant by a “solution?”
When do solutions exist, and when are they unique?
How can solutions be recognized and characterized?
What happens to solutions under perturbations?
Numerical methods—
How can solutions be determined by iterative schemes of computation?
What modes of local simplification of a problem are convenient/appropriate?
How can different solution techniques be compared and evaluated?
Distinguishing features of optimization as a mathematical discipline:
descriptive → prescriptive
equations → inequalities
linear/nonlinear → convex/nonconvex
differential calculus → subdifferential calculus

Finite-dimensional optimization: The case where a choice corresponds to selecting the values of a finite number of real variables, called decision variables. For general purposes the decision variables may be denoted by $x_1, \dots, x_n$ and each possible choice therefore identified with a point $x = (x_1, \dots, x_n)$ in the space $\mathbb{R}^n$. This is what we'll be focusing on in this course.

Feasible set: The subset $C$ of $\mathbb{R}^n$ representing the allowable choices $x = (x_1, \dots, x_n)$.

Objective function: The function $f_0(x) = f_0(x_1, \dots, x_n)$ that is to be maximized or minimized over $C$.

Constraints: Side conditions that are used to specify the feasible set $C$ within $\mathbb{R}^n$.

Equality constraints: Conditions of the form $f_i(x) = c_i$ for certain functions $f_i$ on $\mathbb{R}^n$ and constants $c_i$ in $\mathbb{R}$.

Inequality constraints: Conditions of the form $f_i(x) \le c_i$ or $f_i(x) \ge c_i$ for certain functions $f_i$ on $\mathbb{R}^n$ and constants $c_i$ in $\mathbb{R}$.

Range constraints: Conditions restricting the values of some decision variables to lie within certain closed intervals of $\mathbb{R}$. Very important in many situations, for instance, are nonnegativity constraints: some variables $x_j$ may only be allowed to take values $\ge 0$; the interval then is $[0, \infty)$. Range constraints can also arise from the desire to keep a variable between certain upper and lower bounds.

Linear constraints: Range constraints or conditions of the form $f_i(x) = c_i$, $f_i(x) \le c_i$, or $f_i(x) \ge c_i$, in which the function is linear in the standard sense of being expressible as a sum of constant coefficients times the variables $x_1, \dots, x_n$.

Data parameters: General problem statements usually involve not only decision variables but symbols designating known coefficients, constants, or other data elements. Conditions on such elements, such as the nonnegativity of a particular coefficient, are not among the “constraints” in a problem of optimization, since the numbers in question are supposed to be given and aren’t subject to choice.
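The constraint types above can be sketched in code as a membership test for the feasible set $C$. This is only an illustration: the function name `is_feasible`, the tolerance `tol`, and the two-variable example problem are assumptions introduced here, not part of the notes.

```python
import math

def is_feasible(x, equalities=(), inequalities=(), ranges=None, tol=1e-9):
    """Return True if the point x lies in the feasible set C.

    equalities:   iterable of (f, c) meaning f(x) = c
    inequalities: iterable of (f, c) meaning f(x) <= c
    ranges:       dict {index j: (lo, hi)} meaning lo <= x[j] <= hi
    """
    for f, c in equalities:
        if abs(f(x) - c) > tol:
            return False
    for f, c in inequalities:
        if f(x) > c + tol:
            return False
    for j, (lo, hi) in (ranges or {}).items():
        if not (lo - tol <= x[j] <= hi + tol):
            return False
    return True

# Hypothetical problem data: x1 + x2 = 1, x1 - x2 <= 0.5, x1 >= 0, x2 >= 0.
equalities = [(lambda x: x[0] + x[1], 1.0)]
inequalities = [(lambda x: x[0] - x[1], 0.5)]
ranges = {0: (0.0, math.inf), 1: (0.0, math.inf)}   # nonnegativity constraints

print(is_feasible((0.5, 0.5), equalities, inequalities, ranges))    # True
print(is_feasible((0.9, 0.1), equalities, inequalities, ranges))    # False (0.8 > 0.5)
print(is_feasible((-0.2, 1.2), equalities, inequalities, ranges))   # False (x1 < 0)
```

A constraint of the form $f_i(x) \ge c_i$ fits the same scheme by passing $-f_i$ and $-c_i$ as an inequality.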

Mathematical programming: A traditional synonym for finite-dimensional optimization. This usage predates “computer programming,” which actually arose from early attempts at solving optimization problems on computers. “Programming,” with the meaning of optimization, survives in problem classifications such as linear programming, quadratic programming, convex programming, integer programming, etc.

Comments. This example illustrates several features that are quite typically found in problems of optimization.

Redundant constraints: It is obvious that the condition $6r \le D_0$ is implied by the other constraints and therefore could be dropped without affecting the problem. But in problems with many variables and constraints such redundancy may be hard to recognize. From a practical point of view, the elimination of redundant constraints could pose a challenge as serious as that of solving the optimization problem itself.

Inactive constraints: It could well be true that the optimal pair $(r, h)$ (unique??) is such that either the condition $8r \le D_0$ or the condition $h \le D_0$ is satisfied as a strict inequality, or both. In that case the constraints in question are inactive in the local characterization of the optimal point, although they do affect the shape of the set $C$. Again, however, there is little hope, in a problem with many variables and constraints, of determining by some preliminary procedure just which constraints will be active and which will not. This is the crux of the difficulty in many numerical approaches.

Redundant variables: It would be possible to solve the equation $\pi r^2 h = V_0$ for $h$ in terms of $r$ and thereby reduce the given problem to one in terms of just $r$, rather than $(r, h)$. Fine, but besides being a technique that is usable only in special circumstances, the elimination of variables from (generally nonlinear) systems of equations is not necessarily helpful. There may be a trade-off between the lower dimensionality achieved in this way and other properties.

Inequalities versus equations: The constraint $\pi r^2 h = V_0$ could be written in the form $\pi r^2 h \ge V_0$ without affecting anything about the solution. This is because of the nature of the cost function; no pair $(r, h)$ in the larger set $C'$, obtained by substituting this weaker condition for the equation, can minimize $f_0$ unless actually $(r, h) \in C$. While it may seem instinctive to prefer the equation to the inequality in the formulation, the inequality turns out to be superior in the present case because the set $C'$ happens to be “convex,” whereas $C$ isn’t.

Convexity: This problem is not fully of “convex” type in itself, despite the preceding remark. Nonetheless, it can be made convex by a certain change of variables, as will be seen later. The lesson is that the formulation of a problem of optimization can be quite subtle, when it comes to bringing out crucial features like convexity.
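The remark on redundant variables can be tried out numerically. The sketch below eliminates $h$ through $\pi r^2 h = V_0$ and searches over $r$ alone. The cost function of the example is not reproduced in this part of the notes, so the code ASSUMES a stand-in cost equal to the total surface area of the can; the values of `V0` and `D0` are likewise hypothetical.

```python
import math

V0 = 1.0    # required volume (hypothetical value)
D0 = 10.0   # size limit in the constraints 8r <= D0 and h <= D0 (hypothetical value)

def height(r):
    """Height determined by the volume equation pi * r^2 * h = V0."""
    return V0 / (math.pi * r * r)

def cost(r):
    """ASSUMED cost: surface area 2*pi*r^2 + 2*pi*r*h, with h eliminated."""
    return 2.0 * math.pi * r * r + 2.0 * math.pi * r * height(r)

# The remaining constraints become an interval for r:
#   8r <= D0  ->  r <= D0 / 8
#   h  <= D0  ->  r >= sqrt(V0 / (pi * D0))
lo = math.sqrt(V0 / (math.pi * D0))
hi = D0 / 8.0

# Ternary search; valid here because the assumed cost is convex in r for r > 0.
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if cost(m1) < cost(m2):
        hi = m2
    else:
        lo = m1

r_opt = 0.5 * (lo + hi)
h_opt = height(r_opt)
print(r_opt, h_opt)
```

For this assumed cost and these data the minimizer satisfies $h = 2r$, and both $8r \le D_0$ and $h \le D_0$ hold strictly there, so the two size constraints are inactive in the sense described above.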

EXAMPLE 2: Management of Systems

General description. A sequence of decisions must be made in discrete time which will affect the operation of some kind of “system,” often of an economic nature. The decisions, each in terms of choosing the values of a number of variables, have to respect various limitations in resources. Typically the desire is to minimize cost, or maximize profit or efficiency, say, over a certain time horizon.

Particular case: an inventory model. A warehouse with total capacity $a$ (in units of volume) is to be operated over time periods $t = 1, \dots, T$ as the sole facility for the supply of a number of different commodities (or medicines, or equipment parts, etc.), indexed by $j = 1, \dots, n$. The demand for commodity $j$ during period $t$ is the known amount $d_{tj} \ge 0$ (in volume units); this is a deterministic approach to modeling the situation. In each period $t$ it is possible not only to fill demands but to acquire additional supplies up to certain limits, so as to maintain stocks. The problem is to plan the pattern of acquiring supplies in such a way as to maximize the net profit over the $T$ periods, relative to the original inventory amounts and the desired terminal inventory amounts.

inventory variables: $x_{tj}$ units of $j$ at the end of period $t$
inventory constraints: $x_{tj} \ge 0$, $\sum_{j=1}^{n} x_{tj} \le a$ for $t = 1, \dots, T$
initial inventory: $x_{0j}$ units of $j$ given at the beginning
terminal constraints: $x_{Tj} = b_j$ (given amounts) for $j = 1, \dots, n$
inventory costs: $s_{tj}$ dollars per unit of $j$ held from $t$ to $t + 1$
supply variables: $u_{tj}$ units of $j$ acquired during period $t$
supply constraints: $0 \le u_{tj} \le a_{tj}$ (given availabilities)
supply costs: $c_{tj}$ dollars per unit of $j$ acquired during $t$
dynamical constraints: $x_{tj} = \max\{0,\ x_{t-1,j} + u_{tj} - d_{tj}\}$
rewards: $p_{tj}$ dollars per unit of filled demand
filled demand: $\min\{d_{tj},\ x_{t-1,j} + u_{tj}\}$ units of $j$ during period $t$
net profit:

$$\sum_{t=1}^{T} \sum_{j=1}^{n} \Big[\, p_{tj} \min\{d_{tj},\ x_{t-1,j} + u_{tj}\} - s_{tj} x_{tj} - c_{tj} u_{tj} \Big]$$

Summary. The latter expression as a function of all the variables $x_{tj}$ and $u_{tj}$ for $t = 1, \dots, T$ and $j = 1, \dots, n$ is to be maximized subject to the inventory constraints, terminal constraints, supply constraints and the dynamical constraints. (These constraints can be viewed as determining a certain subset of $\mathbb{R}^{2Tn}$.)
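To make the model concrete, the following sketch evaluates the net profit of a given supply plan by running the dynamical constraints forward in time. It is not a solver. The function names `net_profit` and `satisfies_constraints`, the 0-based indexing, and all numerical data are assumptions made for illustration.

```python
def net_profit(x0, u, d, p, s, c):
    """Simulate the dynamics and return (x, profit).

    u, d, p, s, c are lists indexed [t][j] with t = 0..T-1 (periods 1..T in
    the text) and j = 0..n-1; x0[j] is the initial inventory of commodity j.
    """
    prev = list(x0)
    x, profit = [], 0.0
    for t in range(len(u)):
        row = []
        for j in range(len(x0)):
            available = prev[j] + u[t][j]
            filled = min(d[t][j], available)        # filled demand
            stock = max(0.0, available - d[t][j])   # dynamical constraint
            profit += p[t][j] * filled - s[t][j] * stock - c[t][j] * u[t][j]
            row.append(stock)
        x.append(row)
        prev = row
    return x, profit

def satisfies_constraints(x, u, a, avail, b, tol=1e-9):
    """Check capacity, supply and terminal constraints (x >= 0 holds by construction)."""
    for t in range(len(u)):
        if sum(x[t]) > a + tol:
            return False
        if any(not (-tol <= u[t][j] <= avail[t][j] + tol) for j in range(len(b))):
            return False
    return all(abs(x[-1][j] - b[j]) <= tol for j in range(len(b)))

# Hypothetical data: T = 2 periods, n = 2 commodities.
x0 = [1.0, 0.0]
u = [[1.0, 2.0], [0.0, 1.0]]
d = [[1.0, 1.0], [1.0, 3.0]]
p = [[3.0, 2.0], [3.0, 2.0]]
s = [[0.5, 0.5], [0.5, 0.5]]
c = [[1.0, 1.0], [1.0, 1.0]]

x, profit = net_profit(x0, u, d, p, s, c)
print(x, profit)   # [[1.0, 1.0], [0.0, 0.0]] 7.0
print(satisfies_constraints(x, u, a=3.0, avail=[[2.0, 2.0], [2.0, 2.0]], b=[0.0, 0.0]))   # True
```

Because the inventories $x_{tj}$ are determined by the supplies $u_{tj}$ through the dynamics, the sketch treats only `u` as the free input; an optimization method would search over such plans.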

still more variables: $v_{tj}$ as the amount of good $j$ used to meet demands at time $t$. In terms of these variables, constrained by $0 \le v_{tj} \le d_{tj}$, the dynamics would take the linear form

$$x_{tj} = x_{t-1,j} + u_{tj} - v_{tj}$$

and the profit expression would likewise be linear:

$$\sum_{t=1}^{T} \sum_{j=1}^{n} \Big[\, p_{tj} v_{tj} - s_{tj} x_{tj} - c_{tj} u_{tj} \Big]$$

Hidden assumptions: The alternative model just described with variables vtj is better in other ways too. The original model had the hidden assumption that demands in any period should always be met as far as possible from the stocks on hand. But this might be disadvantageous if rewards will soon be higher, and inventory can only be built up slowly due to the constraints on availability. The alternative model allows sales to be held off in such circumstances.
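A tiny numerical case shows the effect of the hidden assumption. The sketch compares the original model, which always fills demand as far as possible, with the alternative model, in which the sales $v_t$ are chosen freely. It uses a single commodity, and the function names and all data are hypothetical.

```python
def profit_original(x0, u, d, p, s, c):
    """Original model, one commodity: demand is always filled as far as possible."""
    x_prev, total = x0, 0.0
    for t in range(len(u)):
        available = x_prev + u[t]
        filled = min(d[t], available)
        x_t = max(0.0, available - d[t])
        total += p[t] * filled - s[t] * x_t - c[t] * u[t]
        x_prev = x_t
    return total

def profit_alternative(x0, u, v, d, p, s, c):
    """Alternative model, one commodity: explicit sales v[t] with 0 <= v[t] <= d[t]."""
    x_prev, total = x0, 0.0
    for t in range(len(u)):
        assert 0.0 <= v[t] <= d[t], "sales must satisfy 0 <= v <= d"
        x_t = x_prev + u[t] - v[t]            # linear dynamics
        assert x_t >= 0.0, "inventory must stay nonnegative"
        total += p[t] * v[t] - s[t] * x_t - c[t] * u[t]
        x_prev = x_t
    return total

# Hypothetical data: one unit in stock, nothing can be acquired,
# and the reward jumps from 1 to 5 in the second period.
x0 = 1.0
u = [0.0, 0.0]
d = [1.0, 1.0]
p = [1.0, 5.0]
s = [0.5, 0.5]
c = [0.0, 0.0]

sell_now = profit_original(x0, u, d, p, s, c)
sell_later = profit_alternative(x0, u, [0.0, 1.0], d, p, s, c)
print(sell_now, sell_later)   # 1.0 4.5
```

Both plans end with zero inventory, but holding the unit back for one period, at a holding cost of 0.5, earns the higher reward.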

EXAMPLE 3: Identification of Parameters

General description. A mathematical model has been formulated for a given situation, but to implement it the values of a number of parameters must be specified. A body of data is known through experiment or observation. The task is to determine the parameter values that best fit the data. Here, in speaking of the “best” fit, reference is evidently being made to some criterion for optimization, but there isn’t always just one interpretation of which criterion to use. (Note a linguistic pitfall: “the” best fit suggests uniqueness of the answer being sought, but even relative to a single criterion there could be more than one choice of the parameters that is optimal.) Applications are found in statistics (regression, maximum likelihood), econometrics, and virtually every area of science.

Particular case: “least squares” estimates. Starting out very simply, suppose that two variables $x$ and $y$ are being modeled as related by a linear law $y = ax + b$, either for inherent theoretical reasons or as a first-level approximation. The values of $a$ and $b$ are not known a priori but must be determined from the data, consisting of a large collection of pairs $(x_k, y_k) \in \mathbb{R}^2$ for $k = 1, \dots, N$. These pairs have been gleaned from experiments (where random errors of measurement could arise along with other discrepancies due to oversimplifications in the model). The error expression

$$E(a, b) = \sum_{k=1}^{N} \big| y_k - (a x_k + b) \big|^2$$

is often taken as representing the goodness of the fit of the parameter pair $(a, b)$. The problem is to minimize this over all $(a, b) \in \mathbb{R}^2$. More generally, instead of a real variable $x$ and a real variable $y$ one could be dealing with a vector $x \in \mathbb{R}^n$ and a vector $y \in \mathbb{R}^m$, which are supposed to be related by a formula $y = Ax + b$ for a matrix $A \in \mathbb{R}^{m \times n}$ and a vector $b \in \mathbb{R}^m$. Then the error expression $E(A, b)$ would depend on the $m \times (n + 1)$ components of $A$ and $b$.
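For the scalar law $y = ax + b$, setting the two partial derivatives of $E$ to zero gives a pair of linear equations in $(a, b)$ that can be solved in closed form. The sketch below does this in plain Python; the function names `fit_line` and `error` and the four data points are hypothetical.

```python
def fit_line(points):
    """Return (a, b) minimizing E(a, b) = sum_k (y_k - (a*x_k + b))**2."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    det = n * sxx - sx * sx      # zero exactly when all x_k coincide
    if det == 0:
        raise ValueError("all x values coincide; the minimizer is not unique")
    a = (n * sxy - sx * sy) / det
    b = (sy - a * sx) / n
    return a, b

def error(a, b, points):
    """The least squares error expression E(a, b)."""
    return sum((y - (a * x + b)) ** 2 for x, y in points)

# Hypothetical measurements scattered around a line.
points = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
a, b = fit_line(points)
print(a, b, error(a, b, points))
```

The degenerate case in which every $x_k$ is the same illustrates the linguistic pitfall noted above: infinitely many pairs $(a, b)$ then give the same minimal error, so there is no single best fit.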

Comments. This kind of optimization is entirely technical: the introduction of something to be optimized is just a mathematical construct. Still, in analysis and computation of solutions the same challenges arise as in other settings.

Constraints: The problem, as stated so far, concerns the unconstrained minimization of a certain quadratic function in the parameters, but it is easy to imagine situations where the parameters may be subject to various side conditions. In the case of $y = ax + b$, for instance, it may be known on the basis of theory for the variables in question that $1/2 \le a \le 3/2$, while $b \ge -1$. In the multidimensional case of $y = Ax + b$ there may be the requirement of $A$ being symmetric (with $m = n$), which would entail the imposition of $n(n - 1)/2$ linear constraints of the form $a_{ji} - a_{ij} = 0$. Perhaps for some reason one also needs to have $a_{11} \ge a_{22} \ge \cdots \ge a_{nn}$, and so forth. (In the applications that are made of least squares estimation, such conditions are often neglected, and the numerical answer obtained is simply “fixed up” if it doesn’t have the right form. But this is clearly not good methodology.)

Nonlinear version: A so-called problem of linear least squares has been presented, but the same ideas can be used when the underlying relation between $x$ and $y$ is supposed to be nonlinear. For instance, a law of the form $y = e^{ax} - e^{bx}$ would lead to an error expression

$$E(a, b) = \sum_{k=1}^{N} \big| y_k - (e^{a x_k} - e^{b x_k}) \big|^2$$

In minimizing this with respect to $(a, b) \in \mathbb{R}^2$, we would not be dealing with a quadratic function, but something much more complicated. The graph of $E$ in a problem of nonlinear least squares could have lots of “bumps” and “dips,” which could make it hard to find the minimum computationally.
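The warning about “fixing up” an unconstrained answer can be checked on the linear model $y = ax + b$ with the side conditions $1/2 \le a \le 3/2$ and $b \ge -1$ mentioned above. The sketch uses projected gradient descent, which is one simple method among many and is not prescribed by the notes; the step size, iteration count, starting point and data are assumptions.

```python
def clip(value, lo, hi):
    return max(lo, min(hi, value))

def error(a, b, points):
    return sum((y - (a * x + b)) ** 2 for x, y in points)

def constrained_fit(points, a_bounds=(0.5, 1.5), b_min=-1.0, iterations=5000):
    """Projected gradient descent for min E(a, b) s.t. a in a_bounds, b >= b_min."""
    n = len(points)
    sxx = sum(x * x for x, _ in points)
    # 1 / (trace of the Hessian of E); the trace bounds its largest eigenvalue,
    # so this step size is safe for the convex quadratic E.
    step = 1.0 / (2.0 * (sxx + n))
    a, b = clip(0.0, *a_bounds), max(0.0, b_min)
    for _ in range(iterations):
        ga = -2.0 * sum(x * (y - (a * x + b)) for x, y in points)   # dE/da
        gb = -2.0 * sum(y - (a * x + b) for x, y in points)         # dE/db
        a = clip(a - step * ga, *a_bounds)    # project back onto the constraints
        b = max(b - step * gb, b_min)
    return a, b

# Hypothetical data lying exactly on y = 2x + 1, so the unconstrained fit is (2, 1).
points = [(float(x), 2.0 * x + 1.0) for x in range(5)]

fixed_up = (clip(2.0, 0.5, 1.5), max(1.0, -1.0))   # clip the unconstrained answer
constrained = constrained_fit(points)

print(fixed_up, error(*fixed_up, points))
print(constrained, error(*constrained, points))
```

On these data the clipped pair $(1.5, 1)$ has error 7.5, while the constrained minimizer is $(1.5, 2)$ with error 2.5: once $a$ is held at its bound, the best $b$ shifts, which clipping alone cannot capture.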

EXAMPLE 4: Variational Principles

General description. The linear and nonlinear equations that are the focus of much of numerical analysis are often associated in hidden ways with problems of optimization. For an equation of the form $F(x) = 0$, involving a mapping $F : \mathbb{R}^n \to \mathbb{R}^n$, a variational principle is an expression of $F$ as the gradient mapping $\nabla f$ associated with some function $f : \mathbb{R}^n \to \mathbb{R}$. Such an expression leads to the interpretation that the desired $x$ satisfies a first-order optimality condition with respect to $f$. Under certain additional conditions on $F$, it may even be concluded that $x$ minimizes $f$, at least “locally.” A route to solving $F(x) = 0$ is thereby opened up in terms of minimizing $f$.

Quite similar in concept are numerous examples where instead of solving an equation $F(x) = 0$ with $x \in \mathbb{R}^n$ one is interested in solving $A(u) = 0$ where $u$ is some unknown function, and $A$ is a mapping from a function space (e.g. a certain Hilbert space) into itself. In particular, $A$ might be a differential operator, so that an ordinary or partial differential equation is at issue. A variational principle then characterizes the desired $u$ as providing the minimum, say, of some functional on the space. In fact, many of the most famous differential equations of physics have such an interpretation, including Newton’s laws of motion (the local variational principle of “least action”).

On a different front, one can think of conditions of price equilibrium in economics that can be characterized as stemming from the actions of a multitude of “economic agents,” like producers and consumers, all optimizing from their individual perspectives. Yet again, the equilibrium state following the reactions which take place in a complicated chemical brew may be characterized through a variational principle as the configuration of substances that minimizes a certain energy function.

Particular case: the Dirichlet problem.
A classical problem in PDE’s, posed in its most elementary form, concerns an unknown function $u(y_1, y_2)$ on a closed, bounded region $\Omega \subset \mathbb{R}^2$ with boundary curve $\Gamma$. The function, assumed to be continuous on $\Omega$ and twice differentiable on the interior of $\Omega$, is required to satisfy, in terms of a given function $\varphi$ on $\Omega$, the partial differential equation

$$\frac{\partial^2 u}{\partial y_1^2}(y_1, y_2) + \frac{\partial^2 u}{\partial y_2^2}(y_1, y_2) = \varphi(y_1, y_2) \quad \text{inside } \Omega$$

as well as the boundary condition

$$u(y_1, y_2) \equiv 0 \quad \text{on } \Gamma.$$

It turns out that the solution to this problem is the unique function that minimizes the expression

$$J(u) = \iint_{\Omega} \left[ \varphi(y_1, y_2)\, u(y_1, y_2) + \frac{1}{2} \left( \frac{\partial u}{\partial y_1}(y_1, y_2) \right)^{2} + \frac{1}{2} \left( \frac{\partial u}{\partial y_2}(y_1, y_2) \right)^{2} \right] dy_1\, dy_2$$

over all functions u satisfying the boundary condition. Although this is not a problem of finite-dimensional optimization, because u ranges over a space with “infinitely many degrees of freedom,” one does get such a problem in passing to a discretized version or an approximate problem in which u is restricted to be a linear combination of a certain collection of basic functions (some kind of truncated series), as must inevitably be done in bringing numerical methods to bear. It is obvious that this is another way that optimization problems of very high dimension could arise. A major branch of theory has to address the question of how an infinite-dimensional problem can be approximated better and better by a sequence of finite-dimensional problems, what kind of convergence can be obtained from the respective solutions, and so on.
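The passage to a discretized, finite-dimensional problem can be illustrated on a one-dimensional analogue, chosen here only to keep the code short: $u''(y) = \varphi(y)$ on $(0, 1)$ with $u(0) = u(1) = 0$. The grid, the discretized functional $J_h$, the use of cyclic coordinate descent, and the name `solve_dirichlet_1d` are all assumptions of this sketch rather than anything specified in the notes.

```python
def solve_dirichlet_1d(phi, n, sweeps=2000):
    """Minimize the discretized functional

        J_h(u) = sum_i h * phi(y_i) * u_i
                 + sum_i (h / 2) * ((u_{i+1} - u_i) / h)**2

    over the grid values u_1..u_n, with u_0 = u_{n+1} = 0 fixed by the
    boundary condition, using cyclic coordinate descent.
    """
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)          # includes both boundary values, kept at 0
    for _ in range(sweeps):
        for i in range(1, n + 1):
            # Exact minimizer of J_h in the single variable u_i,
            # obtained from dJ_h/du_i = 0.
            u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * phi(i * h))
    return u

# Hypothetical right-hand side phi = 2, for which the exact solution is y**2 - y.
n = 9
u = solve_dirichlet_1d(lambda y: 2.0, n)
h = 1.0 / (n + 1)
max_err = max(abs(u[i] - ((i * h) ** 2 - i * h)) for i in range(n + 2))
print(max_err)
```

Here the unknowns are the $n$ interior grid values, so a finer grid gives an optimization problem of higher dimension, in line with the remark above about how very high-dimensional problems arise.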

Comments. In the study of variational principles, optimization theory can provide interesting insights quite independently of whether a numerical solution to a particular case is sought or not.

Classical roots: Optimization problems over function spaces have been studied since the 17th century, and they have been very influential not only in the discovery of variational principles but in the development of tools of analysis, especially functional analysis and topology. This branch of the subject has traditionally been referred to by the quaint title of the calculus of variations. A closely related modern counterpart, to be discussed in the next example, is the theory of optimal control.

Unilateral constraints: While the side conditions considered with classical PDE’s are typically equations of some kind, many problems handled nowadays involve inequalities. For instance, it may be required above that

$$a(y_1, y_2) \le u(y_1, y_2) \le b(y_1, y_2)$$

for certain functions $a$ and $b$ given on $\Omega$. From the standpoint of the PDE, it’s not completely clear what this is supposed to mean, but in the context of the optimization of $J(u)$ the meaning is evident: this condition, in addition to the boundary condition already imposed, is to restrict further the set $C$ of

specifies the positions of all the levers. Once a function $u : [0, T] \to U$ has been chosen, the trajectory $x$ followed by the space ship will be completely determined over the time interval $[0, T]$ as the solution to an ODE $\dot{x}(t) = f(t, x(t), u(t))$ with $x(0) = x_0$. Restrict attention now to the class of control functions $u$ such that the final state $x(T)$ has its position coordinates in the targeted area of the moon, its velocity coordinates all 0, and so forth; this will be a constraint of the form $x(T) \in E$. Further make restrictions like $x(t) \in X(t) \subset \mathbb{R}^n$, for instance to ensure that the trajectory of the ship does not penetrate the earth or the moon. Over the control functions so described, the problem is to minimize some expression like

$$J(u) = \int_0^T g(t, x(t), u(t))\, dt$$

giving the cost of the control, say. Other possibilities include looking for a control function that gets the ship to its destination in the least time, for instance.
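The way a control function determines both the trajectory and the cost can be sketched numerically. The code below integrates the ODE by the forward Euler method and accumulates the cost integral as a Riemann sum, for a scalar state only. The toy system $\dot{x} = u$ with running cost $g = u^2$, the two candidate controls, and the name `simulate` are hypothetical choices, not taken from the notes.

```python
def simulate(f, g, x0, u, T, steps=1000):
    """Forward Euler for x'(t) = f(t, x(t), u(t)), x(0) = x0, together with a
    left-endpoint Riemann sum for J(u) = integral over [0, T] of g(t, x, u).
    Scalar state only, to keep the sketch short."""
    dt = T / steps
    x, cost = x0, 0.0
    for k in range(steps):
        t = k * dt
        control = u(t)
        cost += g(t, x, control) * dt
        x += f(t, x, control) * dt
    return x, cost

# Hypothetical scalar system x' = u with running cost g = u**2.
f = lambda t, x, u: u
g = lambda t, x, u: u * u

# Two controls that both steer x from 0 to (approximately) 2 at time T = 2.
x_const, J_const = simulate(f, g, 0.0, lambda t: 1.0, 2.0)
x_ramp, J_ramp = simulate(f, g, 0.0, lambda t: t, 2.0)

print(x_const, J_const)   # close to 2 and 2
print(x_ramp, J_ramp)     # close to 2 and 8/3
```

Both controls meet the same terminal condition up to discretization error, but the constant control has the smaller cost; minimizing $J$ over all admissible controls is the optimization problem described above.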

Comments.

Control in discrete time: Very similar problems can be set up in discrete rather than continuous time, not just as an expedient for numerical purposes, but as appropriate models in themselves. Such problems are finite-dimensional. A case in point is the inventory problem in Example 2, where the $x_{tj}$’s are state variables and the $u_{tj}$’s are control variables.

Stochastic version: The system being guided may be subject to random disturbances, which the controller must react to. Further, there may be difficulty in knowing exactly what the state is at any time $t$, due to the shortcomings of sensors and measurement errors (another random effect). Control must then be framed in terms of mappings which give the response at time $t$ that is most appropriate to the particular information available right then about $x(t)$. This is the formidable subject of stochastic optimal control, which at present is only able to cope with rather special cases.

Adaptive version: Also very interesting as a mathematical challenge, but largely out of reach of current concepts and techniques, is adaptive control, where the controller has not only to react to unexpected events but learn the basics of the system being controlled as time goes on. This is a bit like getting behind the wheel of a car in bad weather when the roads are icy. In choosing the control function, the desire to arrive at the destination in the quickest manner compatible with the configuration of the roads and hills may have to be compromised with time spent on “test skids” to see what the tires can take. A major difficulty in this area is the clear formulation of the objective in the optimization, as well as the identification of what can or can’t be assumed about the imperfectly known system.

Control of PDE’s: The state of a system may be given by an element of a function space rather than a point in $\mathbb{R}^n$, as when the problem revolves around the temperature distribution at time $t$ over a solid body represented by a closed, bounded region $\Omega \subset \mathbb{R}^3$. The temperature can be influenced by heating or cooling elements arrayed on the surface of the body. How should these elements be operated in order to bring the temperature of the body uniformly within a certain range, in the shortest possible time or with the least expenditure of energy?

EXAMPLE 6: Optimal Scheduling

General description. Choices must be made about the order in which certain actions ought to be taken, as well as the scope of the actions. Decisions may concern not only the values of continuous variables but discrete variables, which can take on only integer values, or even logical variables, which are limited to 0 and 1. There may thus be a mixture of finite-dimensional optimization and combinatorial optimization. Many such problems are almost intractable, even when posed in just a halfway realistic form, but there are notable exceptions.

Particular case: flight scheduling. An airline must set up its weekly schedule of flights. This involves specifying not only the departure and arrival times but the numbers of flights between various destinations (these numbers have to be treated as integer variables). Constraints involve, among other things, the availability of aircraft and crew and are greatly complicated by the need to follow what happens to each individual plane and crew member. A particular plane, having flown from Seattle to New York, must next take off from New York, and it can’t do so without a certified pilot, who in the meantime has arrived from Atlanta and gotten the right amount of rest, and so on. Aircraft maintenance requirements are another serious issue along with the working requirements of personnel based in different locations and having to return home at specified intervals. The flight schedule must obviously take into account the passenger demand for various routes and times, and whether they are nonstop. To the important extent that random variables are involved, not only in the demands but in the possibility of mechanical breakdowns, sick crew members and weather