VARIATIONAL ANALYSIS
R. Tyrrell Rockafellar
Roger J-B Wets
with figures drawn by Maria Wets
1997, 2nd printing 2004, 3rd printing 2009

PREFACE

In this book we aim to present, in a unified framework, a broad spectrum of mathematical theory that has grown in connection with the study of problems of optimization, equilibrium, control, and stability of linear and nonlinear systems. The title Variational Analysis reflects this breadth.

For a long time, ‘variational’ problems have been identified mostly with the ‘calculus of variations’. In that venerable subject, built around the minimization of integral functionals, constraints were relatively simple and much of the focus was on infinite-dimensional function spaces. A major theme was the exploration of variations around a point, within the bounds imposed by the constraints, in order to help characterize solutions and portray them in terms of ‘variational principles’. Notions of perturbation, approximation and even generalized differentiability were extensively investigated. Variational theory progressed also to the study of so-called stationary points, critical points, and other indications of singularity that a point might have relative to its neighbors, especially in association with existence theorems for differential equations.

With the advent of computers, there has been a tremendous expansion of interest in new problem formulations that similarly demand such modes of analysis but are far from being covered by classical concepts, not to speak of classical results. For those problems, finite-dimensional spaces of arbitrary dimensionality are important alongside of function spaces, and theoretical concerns go hand in hand with the practical ones of mathematical modeling and the design of numerical procedures. It is time to free the term ‘variational’ from the limitations of its past and to use it to encompass this now much larger area of modern mathematics.

We see ‘variations’ as referring not only to movement away from a given point along rays or curves, and to the geometry of tangent and normal cones associated with that, but also to the forms of perturbation and approximation that are describable by set convergence, set-valued mappings and the like. Subgradients and subderivatives of functions, convex and nonconvex, are crucial in analyzing such ‘variations’, as are the manifestations of Lipschitzian continuity that serve to quantify rates of change.

Our goal is to provide a systematic exposition of this broader subject as a coherent branch of analysis that, in addition to being powerful for the problems that have motivated it so far, can take its place now as a mathematical discipline ready for new applications. Rather than detailing all the different approaches that researchers have been occupied with over the years in the search for the right ideas, we seek to reduce the general theory to its key ingredients as now understood, so as to make it accessible to a much wider circle of potential users. But within that consolidation, we furnish a thorough and tightly coordinated exposition of facts and concepts.

Several books have already dealt with major components of the subject. Some have concentrated on convexity and kindred developments in realms of nonconvexity. Others have concentrated on tangent vectors and subderivatives more or less to the exclusion of normal vectors and subgradients, or vice versa, or have focused on topological questions without getting into generalized differentiability. Here, by contrast, we cover set convergence and set-valued mappings to a degree previously unavailable and integrate those notions with both sides of variational geometry and subdifferential calculus. We furnish a needed update in a field that has undergone many changes, even in outlook.
In addition, we include topics such as maximal monotone mappings, generalized second derivatives, and measurable selections and integrands, which have not

of Gül Gürkan, Douglas Lepro, Yonca Özge, and Stephen Robinson. Conversations we had over the years with our students and colleagues contributed significantly to the final form of the book as well. Grants from the National Science Foundation were essential in sustaining the long effort. The changes in this third printing mainly concern various typographical corrections and reference omissions, which came to light in the first and second printings. Many of these reached our notice through our own re-reading and that of our students, as well as the individuals already mentioned. Really major input, however, arrived from Shu Lu and Michel Valadier, and above all from Lionel Thibault. He carefully went through almost every detail, detecting numerous places where adjustments were needed or desirable. We are extremely indebted for all these valuable contributions.

CONTENTS

  • Chapter 1. Max and Min
    • A. Penalties and Constraints
    • B. Epigraphs and Semicontinuity
    • C. Attainment of a Minimum
    • D. Continuity, Closure and Growth
    • E. Extended Arithmetic
    • F. Parametric Dependence
    • G. Moreau Envelopes
    • H. Epi-Addition and Epi-Multiplication
    • I∗. Auxiliary Facts and Principles
    • Commentary
  • Chapter 2. Convexity
    • A. Convex Sets and Functions
    • B. Level Sets and Intersections
    • C. Derivative Tests
    • D. Convexity in Operations
    • E. Convex Hulls
    • F. Closures and Continuity
    • G∗. Separation
    • H∗. Relative Interiors
    • I∗. Piecewise Linear Functions
    • J∗. Other Examples
    • Commentary
  • Chapter 3. Cones and Cosmic Closure
    • A. Direction Points
    • B. Horizon Cones
    • C. Horizon Functions
    • D. Coercivity Properties
    • E∗. Cones and Orderings
    • F∗. Cosmic Convexity
    • G∗. Positive Hulls
    • Commentary
  • Chapter 4. Set Convergence
    • A. Inner and Outer Limits
    • B. Painlevé-Kuratowski Convergence
    • C. Pompeiu-Hausdorff Distance
    • D. Cones and Convex Sets
    • E. Compactness Properties
    • F. Horizon Limits
    • G∗. Continuity of Operations
    • H∗. Quantification of Convergence
    • I∗. Hyperspace Metrics
    • Commentary
  • Chapter 5. Set-Valued Mappings
    • A. Domains, Ranges and Inverses
    • B. Continuity and Semicontinuity
    • C. Local Boundedness
    • D. Total Continuity
    • E. Pointwise and Graphical Convergence
    • F. Equicontinuity of Sequences
    • G. Continuous and Uniform Convergence
    • H∗. Metric Descriptions of Convergence
    • I∗. Operations on Mappings
    • J∗. Generic Continuity and Selections
    • Commentary
  • Chapter 6. Variational Geometry
    • A. Tangent Cones
    • B. Normal Cones and Clarke Regularity
    • C. Smooth Manifolds and Convex Sets
    • D. Optimality and Lagrange Multipliers
    • E. Proximal Normals and Polarity
    • F. Tangent-Normal Relations
    • G∗. Recession Properties
    • H∗. Irregularity and Convexification
    • I∗. Other Formulas
    • Commentary
  • Chapter 7. Epigraphical Limits
    • A. Pointwise Convergence
    • B. Epi-Convergence
    • C. Continuous and Uniform Convergence
    • D. Generalized Differentiability
    • E. Convergence in Minimization
    • F. Epi-Continuity of Function-Valued Mappings
    • G∗. Continuity of Operations
    • H∗. Total Epi-Convergence
    • I∗. Epi-Distances
    • J∗. Solution Estimates
    • Commentary
  • Chapter 8. Subderivatives and Subgradients
    • A. Subderivatives of Functions
    • B. Subgradients of Functions
    • C. Convexity and Optimality
    • D. Regular Subderivatives
    • E. Support Functions and Subdifferential Duality
    • F. Calmness
    • G. Graphical Differentiation of Mappings
    • H∗. Proto-Differentiability and Graphical Regularity
    • I∗. Proximal Subgradients
    • J∗. Other Results
    • Commentary
  • Chapter 9. Lipschitzian Properties
    • A. Single-Valued Mappings
    • B. Estimates of the Lipschitz Modulus
    • C. Subdifferential Characterizations
    • D. Derivative Mappings and Their Norms
    • E. Lipschitzian Concepts for Set-Valued Mappings
    • F. Aubin Property and Mordukhovich Criterion
    • G. Metric Regularity and Openness
    • H∗. Semiderivatives and Strict Graphical Derivatives
    • I∗. Other Properties
    • J∗. Rademacher’s Theorem and Consequences
    • K∗. Mollifiers and Extremals
    • Commentary
  • Chapter 10. Subdifferential Calculus
    • A. Optimality and Normals to Level Sets
    • B. Basic Chain Rule and Consequences
    • C. Parametric Optimality
    • D. Rescaling
    • E. Piecewise Linear-Quadratic Functions
    • F. Amenable Sets and Functions
    • G. Semiderivatives and Subsmoothness
    • H∗. Coderivative Calculus
    • I∗. Extensions
    • Commentary
  • Chapter 11. Dualization
    • A. Legendre-Fenchel Transform
    • B. Special Cases of Conjugacy
    • C. The Role of Differentiability
    • D. Piecewise Linear-Quadratic Functions
    • E. Polar Sets and Gauges
    • F. Dual Operations
    • G. Duality in Convergence
    • H. Dual Problems of Optimization
    • I. Lagrangian Functions
    • J∗. Minimax Problems
    • K∗. Augmented Lagrangians and Nonconvex Duality
    • L∗. Generalized Conjugacy
    • Commentary
  • Chapter 12. Monotone Mappings
    • A. Monotonicity Tests and Maximality
    • B. Minty Parameterization
    • C. Connections with Convex Functions
    • D. Graphical Convergence
    • E. Domains and Ranges
    • F∗. Preservation of Maximality
    • G∗. Monotone Variational Inequalities
    • H∗. Strong Monotonicity and Strong Convexity
    • I∗. Continuity and Differentiability
    • Commentary
  • Chapter 13. Second-Order Theory
    • A. Second-Order Differentiability
    • B. Second Subderivatives
    • C. Calculus Rules
    • D. Convex Functions and Duality
    • E. Second-Order Optimality
    • F. Prox-Regularity
    • G. Subgradient Proto-Differentiability
    • H. Subgradient Coderivatives and Perturbation
    • I∗. Further Derivative Properties
    • J∗. Parabolic Subderivatives
    • Commentary
  • Chapter 14. Measurability
    • A. Measurable Mappings and Selections
    • B. Preservation of Measurability
    • C. Limit Operations
    • D. Normal Integrands
    • E. Operations on Integrands
    • F. Integral Functionals
    • Commentary
  • References
  • Index of Statements
  • Index of Notation
  • Index of Topics

1. Max and Min

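\[
\operatorname{argmin}_C f := \operatorname{argmin}_{x \in C} f(x) :=
\begin{cases}
\{\, x \in C \mid f(x) = \inf_C f \,\} & \text{if } \inf_C f \neq \infty, \\
\emptyset & \text{if } \inf_C f = \infty,
\end{cases}
\]
\[
\operatorname{argmax}_C f := \operatorname{argmax}_{x \in C} f(x) :=
\begin{cases}
\{\, x \in C \mid f(x) = \sup_C f \,\} & \text{if } \sup_C f \neq -\infty, \\
\emptyset & \text{if } \sup_C f = -\infty.
\end{cases}
\]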

Note that we don’t regard the minimum as being attained at any $x \in C$ when $f \equiv \infty$ on $C$, even though we may write $\min_C f = \infty$ in that case, nor do we regard the maximum as being attained at any $x \in C$ when $f \equiv -\infty$ on $C$. The reasons for these exceptions will be explained shortly. Quite apart from whether $\inf_C f < \infty$ or $\sup_C f > -\infty$, the sets $\operatorname{argmin}_C f$ and $\operatorname{argmax}_C f$ could be empty in the absence of appropriate conditions of continuity, boundedness or growth. A simple and versatile statement of such conditions will be devised in this chapter.

The roles of $\infty$ and $-\infty$ deserve close attention here. Let’s look specifically at minimizing $f$ over $C$. If there is a point $x \in C$ where $f(x) = -\infty$, we know at once that $x$ furnishes the minimum. Points $x \in C$ where $f(x) = \infty$, on the other hand, have virtually the opposite significance. They aren’t even worth contemplating as candidates for furnishing the minimum, unless $f$ has $\infty$ as its value everywhere on $C$, a case that can be set aside as expressing a form of degeneracy, which we underline by defining $\operatorname{argmin}_C f$ to be empty then. In effect, the side condition $f(x) < \infty$ is considered to be implicit in minimizing $f(x)$ over $x \in C$. Everything of interest is the same as if we were minimizing over $C' := \{\, x \in C \mid f(x) < \infty \,\}$ instead of $C$.

A. Penalties and Constraints

This gives birth to an important idea in the context of $C$ being a subset of $\mathbb{R}^n$. Perhaps $f$ is merely real-valued on $C$, but whether this is true or not, we can transform the problem of minimizing $f$ over $C$ into one of minimizing $f$ over all of $\mathbb{R}^n$ just by defining (or as the case may be, redefining) $f(x)$ to be $\infty$ for all the points $x \in \mathbb{R}^n$ such that $x \notin C$. This helps in thinking abstractly about minimization and in achieving a single framework for the development of properties and results.

1.1 Example (equality and inequality constraints). A set $C \subset \mathbb{R}^n$ may be specified as consisting of the vectors $x = (x_1, \dots, x_n)$ such that

\[
x \in X \quad\text{and}\quad
\begin{cases}
f_i(x) \le 0 & \text{for } i \in I_1, \\
f_i(x) = 0 & \text{for } i \in I_2,
\end{cases}
\]

where $X$ is some subset of $\mathbb{R}^n$ and $I_1$ and $I_2$ are index sets for families of functions $f_i : \mathbb{R}^n \to \mathbb{R}$ called constraint functions. The conditions $f_i(x) \le 0$ are inequality constraints on $x$, while those of form $f_i(x) = 0$ are equality constraints; the condition $x \in X$ (where in particular $X$ could be all of $\mathbb{R}^n$) is an abstract or geometric constraint.

A problem of minimizing a function $f_0 : \mathbb{R}^n \to \mathbb{R}$ subject to all of these constraints can be identified with the problem of minimizing the function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ defined by taking $f(x) = f_0(x)$ when $x$ satisfies the constraints but $f(x) = \infty$ otherwise. The possibility of having $\inf f = \infty$ corresponds then to the possibility that $C = \emptyset$, i.e., that the constraints may be inconsistent.
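The identification in Example 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `extended_objective`, the tolerance argument `tol`, and the sample data (a unit disk, a pair of contradictory bounds) are hypothetical and not taken from the text.

```python
import math

def extended_objective(f0, inequality, equality, in_X=lambda x: True, tol=0.0):
    """Build f with f(x) = f0(x) when x satisfies the constraints, +inf otherwise.

    inequality: functions f_i with constraint f_i(x) <= 0 (i in I_1)
    equality:   functions f_i with constraint f_i(x) == 0 (i in I_2)
    in_X:       membership test for the abstract constraint x in X
    """
    def f(x):
        feasible = (
            in_X(x)
            and all(g(x) <= tol for g in inequality)
            and all(abs(h(x)) <= tol for h in equality)
        )
        return f0(x) if feasible else math.inf
    return f

# Hypothetical data: f0(x) = x1 + x2 over the closed unit disk.
f = extended_objective(
    f0=lambda x: x[0] + x[1],
    inequality=[lambda x: x[0] ** 2 + x[1] ** 2 - 1.0],
    equality=[],
)

# Inconsistent constraints (x1 <= -1 and x1 >= 1): C is empty, so g is +inf everywhere.
g = extended_objective(
    f0=lambda x: x[0] + x[1],
    inequality=[lambda x: x[0] + 1.0, lambda x: 1.0 - x[0]],
    equality=[],
)

samples = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
all_infinite = all(g(x) == math.inf for x in samples)
```

On these samples `f` is finite exactly at the points of the disk, while `g` never takes a finite value, matching the remark that inconsistent constraints correspond to $\inf f = \infty$.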

Fig. 1–1. A set defined by inequality constraints.

Constraints can also have the form $f_i(x) \le c_i$, $f_i(x) = c_i$ or $f_i(x) \ge c_i$ for values $c_i \in \mathbb{R}$, but this doesn’t add real generality because $f_i$ can always be replaced by $f_i - c_i$ or $c_i - f_i$. Strict inequalities are rarely seen in constraints, however, since they could threaten the attainment of a maximum or minimum. An abstract constraint $x \in X$ is often convenient in representing conditions of a more complicated or open-ended nature, to be worked out later, but also for conditions too simple to be worth introducing constraint functions for, such as upper or lower bounds on the variables $x_j$ as components of $x$.

1.2 Example (box constraints). A set $X \subset \mathbb{R}^n$ is called a box if it is a product $X_1 \times \cdots \times X_n$ of closed intervals $X_j$ of $\mathbb{R}$, not necessarily bounded. The condition $x \in X$, a box constraint on $x = (x_1, \dots, x_n)$, then restricts each variable $x_j$ to $X_j$. For instance, the nonnegative orthant

\[
\mathbb{R}^n_+ := \{\, x = (x_1, \dots, x_n) \mid x_j \ge 0 \text{ for all } j \,\} = [0, \infty)^n
\]

is a box in $\mathbb{R}^n$; the constraint $x \in \mathbb{R}^n_+$ restricts all variables to be nonnegative. With $X = \mathbb{R}^s_+ \times \mathbb{R}^{n-s} = [0, \infty)^s \times (-\infty, \infty)^{n-s}$, only the first $s$ variables $x_j$ would have to be nonnegative. In other cases, useful for technical reasons, some intervals $X_j$ could have the degenerate form $[c_j, c_j]$, which would force $x_j = c_j$.

Constraints refer to the structure of the set over which the minimization or maximization should effectively take place, and in the approach of identifying a problem with a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ they enter the specification of $f$. But the structure of the function being minimized or maximized can be affected by constraint representations in other ways as well.

Everything said about minimization can be translated into the language of maximization, with $-\infty$ taking the part of $\infty$. Such symmetry is reassuring, but it must be understood that a basic asymmetry is implicit too in the approach we’re taking. In passing from the minimization of a given function over $C$ to the minimization of a corresponding function over $\mathbb{R}^n$, we’ve resorted to an extension by the value $\infty$, but in the case of maximization it would be $-\infty$. The extended function would then be different, and so would be the properties we’d like it to have. In effect we’re abandoning any predisposition toward having a theory that treats maximization and minimization together on an equal footing. In the assumptions eventually imposed to identify the classes of functions most suitable for applying these operations, we mark out separate territories for each.

In actual practice there’s rarely a need to consider both minimization and maximization simultaneously for a single combination of a function $f$ and a set $C$, so this approach causes no discomfort. Rather than spend too many words on parallel statements, we adopt minimization as the vehicle of exposition and mention maximization only from time to time, taking for granted that the reader will generally understand the accommodations needed in that direction. We thereby enter a pattern of working mainly with extended-real-valued functions on $\mathbb{R}^n$ and treating them in a one-sided manner where $\infty$ has a qualitatively different role from that of $-\infty$ in our formulas, and where the terminology and notation reflect this bias.

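Starting off now on this path, we introduce for $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ the set

\[
\operatorname{dom} f := \{\, x \in \mathbb{R}^n \mid f(x) < \infty \,\},
\]

called the effective domain of $f$, and write

\[
\inf f := \inf_x f(x) := \inf_{x \in \mathbb{R}^n} f(x) = \inf_{x \in \operatorname{dom} f} f(x),
\]
\[
\operatorname{argmin} f := \operatorname{argmin}_x f(x) := \operatorname{argmin}_{x \in \mathbb{R}^n} f(x) = \operatorname{argmin}_{x \in \operatorname{dom} f} f(x).
\]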

We call $f$ a proper function if $f(x) < \infty$ for at least one $x \in \mathbb{R}^n$, and $f(x) > -\infty$ for all $x \in \mathbb{R}^n$, or in other words, if $\operatorname{dom} f$ is a nonempty set on which $f$ is finite; otherwise it is improper. The proper functions $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ are thus the ones obtained by taking a nonempty set $C \subset \mathbb{R}^n$ and a function $f : C \to \mathbb{R}$, and putting $f(x) = \infty$ for all $x \notin C$. All other kinds of functions $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ are termed improper in this context. While proper functions are our central concern, improper functions may arise indirectly and can’t always be excluded from consideration.

The developments so far can be summarized as follows in the language of

optimization.

1.4 Example (principle of abstract minimization). Problems of minimizing a finite function over some subset of $\mathbb{R}^n$ correspond one-to-one with problems of minimizing over all of $\mathbb{R}^n$ a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, under the identifications:

    dom f = set of feasible solutions,
    argmin f = set of optimal solutions,
    inf f = optimal value.

The convention that argmin f = ∅ when f ≡ ∞ ensures that a problem

is not regarded as having an optimal solution if it doesn’t even have a feasible

solution. A lack of feasible solutions is signaled by the optimal value being ∞.
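The three identifications in Example 1.4 can be made concrete on a finite list of candidate points. The sketch below is illustrative only: the function `abstract_minimum`, the candidate list, and the sample objective are assumptions introduced here, and a finite list merely stands in for $\mathbb{R}^n$.

```python
import math

def abstract_minimum(f, candidates):
    """Return (dom f, argmin f, inf f), all restricted to a finite candidate list."""
    dom = [x for x in candidates if f(x) < math.inf]
    if not dom:
        # f is identically +inf: no feasible solution, hence no optimal solution.
        return dom, [], math.inf
    value = min(f(x) for x in dom)
    argmin = [x for x in dom if f(x) == value]
    return dom, argmin, value

points = [-2, -1, 0, 1, 2, 3]

# Hypothetical problem: minimize (x - 1)^2 subject to 0 <= x <= 2.
f = lambda x: (x - 1) ** 2 if 0 <= x <= 2 else math.inf
feasible, optimal, value = abstract_minimum(f, points)

# A problem without feasible solutions is signaled by the optimal value +inf.
no_dom, no_argmin, no_value = abstract_minimum(lambda x: math.inf, points)
```

Here `feasible` plays the role of dom f, `optimal` of argmin f, and `value` of inf f; for the infeasible problem the convention argmin f = ∅ shows up as an empty list.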

Fig. 1–3. Local and global optimality in a difficult yet classical case.

It should be emphasized here that the notation $\operatorname{argmin} f$ refers to points $\bar x$ giving a global minimum of $f$. A local minimum occurs at $\bar x$ if $f(\bar x) < \infty$ and $f(x) \ge f(\bar x)$ for all $x \in V$, where

\[
V \in \mathcal{N}(\bar x) := \text{the collection of all neighborhoods of } \bar x.
\]

Then $\bar x$ is a locally optimal solution to the problem of minimizing $f$. By a neighborhood of $x$ one means any set having $x$ in its interior, for example a closed ball

\[
\mathbb{B}(x, \lambda) := \{\, x' \mid d(x, x') \le \lambda \,\},
\]

where we use the notation

\[
d(x, x') := |x - x'| \ \text{(Euclidean distance)}, \quad\text{with}\quad
|x| := |(x_1, \dots, x_n)| = \sqrt{x_1^2 + \cdots + x_n^2} \ \text{(Euclidean norm)}.
\]

A point $\bar x$ giving a local minimum of $f$ can also be viewed as giving the global minimum in an auxiliary problem in which the function agrees with $f$ on some neighborhood of $\bar x$ but takes the value $\infty$ elsewhere, so the study of local optimality can to a large extent be subsumed into the study of global optimality.

An extremely useful type of function in the framework we’re adopting is the indicator function $\delta_C$ of a set $C \subset \mathbb{R}^n$, which is defined by

\[
\delta_C(x) = 0 \ \text{if } x \in C, \qquad \delta_C(x) = \infty \ \text{if } x \notin C.
\]

The indicator functions on $\mathbb{R}^n$ are characterized as a class by taking on no value other than $0$ or $\infty$. The constant function $0$ is the indicator of $C = \mathbb{R}^n$, while

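Every property of $f$ has its counterpart in a property of $\operatorname{epi} f$, because the correspondence between functions and epigraphs is one-to-one. Many properties also relate very naturally to the various level sets of $f$. In general, we’ll find it useful to have the notation

\[
\begin{aligned}
\operatorname{lev}_{\le\alpha} f &:= \{\, x \in \mathbb{R}^n \mid f(x) \le \alpha \,\}, \\
\operatorname{lev}_{<\alpha} f &:= \{\, x \in \mathbb{R}^n \mid f(x) < \alpha \,\}, \\
\operatorname{lev}_{=\alpha} f &:= \{\, x \in \mathbb{R}^n \mid f(x) = \alpha \,\}, \\
\operatorname{lev}_{>\alpha} f &:= \{\, x \in \mathbb{R}^n \mid f(x) > \alpha \,\}, \\
\operatorname{lev}_{\ge\alpha} f &:= \{\, x \in \mathbb{R}^n \mid f(x) \ge \alpha \,\}.
\end{aligned}
\]

The most important of these in the context of minimization are the lower level sets $\operatorname{lev}_{\le\alpha} f$. For $\alpha$ finite, they correspond to the ‘horizontal cross sections’ of $\operatorname{epi} f$. For $\alpha = \inf f$, one has $\operatorname{lev}_{\le\alpha} f = \operatorname{lev}_{=\alpha} f = \operatorname{argmin} f$.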
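As a small check of the level-set notation, the following sketch evaluates lower level sets of a sample function on a finite grid. The helper names `lev_le` and `lev_eq`, the grid, and the function are hypothetical choices made for illustration; a grid can only approximate the sets defined above.

```python
import math

def lev_le(f, alpha, points):
    """Sampled lower level set: grid points x with f(x) <= alpha."""
    return [x for x in points if f(x) <= alpha]

def lev_eq(f, alpha, points):
    """Sampled level set: grid points x with f(x) == alpha."""
    return [x for x in points if f(x) == alpha]

# Grid on [-3, 3] with step 0.25 (exactly representable in floating point).
points = [i / 4 for i in range(-12, 13)]

# Hypothetical extended-real-valued function: x^2 on [-2, 2], +inf elsewhere.
f = lambda x: x * x if abs(x) <= 2 else math.inf

inf_f = min(f(x) for x in points)          # infimum over the grid
lower_1 = lev_le(f, 1.0, points)           # grid points in [-1, 1]
at_inf_le = lev_le(f, inf_f, points)       # lev_{<= inf f} f
at_inf_eq = lev_eq(f, inf_f, points)       # lev_{= inf f} f
```

For this function the two sampled sets at the level $\alpha = \inf f$ coincide and consist of the single minimizer, in line with the closing remark of the paragraph.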

Fig. 1–4. Epigraph and effective domain of an extended-real-valued function.

We’re ready now to answer a basic question about a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$. What property of $f$ translates into the sets $\operatorname{lev}_{\le\alpha} f$ all being closed? The answer depends on a one-sided concept of limit.

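1.5 Definition (lower limits and lower semicontinuity). The lower limit of a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ at $\bar x$ is the value in $\overline{\mathbb{R}}$ defined by

\[
\liminf_{x \to \bar x} f(x) := \lim_{\delta \searrow 0} \Big[ \inf_{x \in \mathbb{B}(\bar x, \delta)} f(x) \Big]
= \sup_{\delta > 0} \Big[ \inf_{x \in \mathbb{B}(\bar x, \delta)} f(x) \Big]
= \sup_{V \in \mathcal{N}(\bar x)} \Big[ \inf_{x \in V} f(x) \Big]. \tag{1(1)}
\]

The function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is lower semicontinuous (lsc) at $\bar x$ if

\[
\liminf_{x \to \bar x} f(x) \ge f(\bar x), \quad\text{or equivalently}\quad \liminf_{x \to \bar x} f(x) = f(\bar x), \tag{1(2)}
\]

and lower semicontinuous on $\mathbb{R}^n$ if this holds for every $\bar x \in \mathbb{R}^n$.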

The two versions in 1(2) agree because $\inf \{\, f(x) \mid x \in \mathbb{B}(\bar x, \delta) \,\} \le f(\bar x)$ for all $\delta > 0$. For this reason too,

\[
\liminf_{x \to \bar x} f(x) \le f(\bar x) \ \text{always}. \tag{1(3)}
\]

In replacing the limit as $\delta \searrow 0$ by the supremum over $\delta > 0$ in 1(1) we appeal to the general fact that

\[
\inf_{x \in X_1} f(x) \le \inf_{x \in X_2} f(x) \quad\text{when } X_1 \supset X_2.
\]
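The formula in 1(1) suggests a rough numerical experiment in one dimension: sample the ball $\mathbb{B}(\bar x, \delta)$, take the smallest sampled value, and shrink $\delta$. The sketch below is only a sampled approximation under stated assumptions; the name `lower_limit_estimate`, the sampling scheme, and the two step functions are hypothetical, and a finite sample can miss the true infimum.

```python
def lower_limit_estimate(f, xbar, radii, samples=2001):
    """Approximate inf of f over the ball IB(xbar, delta) for each delta in radii.

    The sample grid is symmetric and always contains xbar itself, so every
    estimate is <= f(xbar), in agreement with 1(3).
    """
    estimates = []
    for delta in radii:
        grid = [xbar + delta * (2 * k / (samples - 1) - 1) for k in range(samples)]
        estimates.append(min(f(x) for x in grid))
    return estimates

radii = [1.0, 0.1, 0.01]

# Value 0 at the jump point: lower semicontinuous at 0.
step_lsc = lambda x: 0.0 if x <= 0 else 1.0
# Value 1 at the jump point: not lower semicontinuous at 0.
step_not_lsc = lambda x: 1.0 if x <= 0 else 0.0

est_lsc = lower_limit_estimate(step_lsc, 0.0, radii)
est_not_lsc = lower_limit_estimate(step_not_lsc, 0.0, radii)
est_linear = lower_limit_estimate(lambda x: x, 0.0, radii)
```

For the linear function the estimates increase as $\delta$ shrinks, which mirrors the passage from the limit to the supremum in 1(1). For `step_not_lsc` the estimates stay at 0, strictly below the function value 1 at the point, so the inequality in 1(2) fails there.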

1.6 Theorem (characterization of lower semicontinuity). The following properties of a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ are equivalent:

(a) $f$ is lower semicontinuous on $\mathbb{R}^n$;
(b) the epigraph set $\operatorname{epi} f$ is closed in $\mathbb{R}^n \times \mathbb{R}$;
(c) the level sets of type $\operatorname{lev}_{\le\alpha} f$ are all closed in $\mathbb{R}^n$.

These equivalences will be established after some preliminaries. An example of a function on $\mathbb{R}$ that happens to be lower semicontinuous at every point but two is displayed in Figure 1–5. Notice how the defect is associated with the failure of the epigraph to include all of its boundary.

Fig. 1–5. An example where lower semicontinuity fails.

In the proof of Theorem 1.6 and throughout the book, we use sequence notation in which the running index is always superscript $\nu$ (Greek ‘nu’). We symbolize the natural numbers by $\mathbb{N}$, so that $\nu \in \mathbb{N}$ means $\nu = 1, 2, \dots$. The notation $x^\nu \to x$, or $x = \lim_\nu x^\nu$, refers then to a sequence $\{x^\nu\}_{\nu \in \mathbb{N}}$ in $\mathbb{R}^n$ that converges to $x$, i.e., has $|x^\nu - x| \to 0$ as $\nu \to \infty$. We speak of $x$ as a cluster point of $x^\nu$ as $\nu \to \infty$ if, instead of necessarily claiming $x^\nu \to x$, we wish merely to assert that some subsequence converges to $x$. (Every bounded sequence in $\mathbb{R}^n$ has at least one cluster point. A sequence in $\mathbb{R}^n$ converges to $x$ if and only if it is bounded and has $x$ as its only cluster point.)

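1.7 Lemma (characterization of lower limits).

\[
\liminf_{x \to \bar x} f(x) = \min \{\, \alpha \in \overline{\mathbb{R}} \mid \exists\, x^\nu \to \bar x \text{ with } f(x^\nu) \to \alpha \,\}.
\]

(Here the constant sequence $x^\nu \equiv \bar x$ is admitted and yields $\alpha = f(\bar x)$.)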

be true that $f(x^\nu) \le \alpha$, or in other words, that $x^\nu$ belongs to $\operatorname{lev}_{\le\alpha} f$. Since $x^\nu \to \bar x$, this level set, which by assumption is closed, must contain $\bar x$. Thus we have $f(\bar x) \le \alpha$ for every $\alpha > \bar\alpha$. Obviously, then, $f(\bar x) \le \bar\alpha$.

When Theorem 1.6 is applied to indicator functions, it reduces to the fact that $\delta_C$ is lsc if and only if the set $C$ is closed. The lower semicontinuity of a general function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ doesn’t require $\operatorname{dom} f$ to be closed, however, even when $\operatorname{dom} f$ happens to be bounded. Figure 1–6 illustrates this.

C. Attainment of a Minimum

Another question can now be addressed. What conditions on a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ ensure that $f$ attains its minimum over $\mathbb{R}^n$ at some $x$, i.e., that the set $\operatorname{argmin} f$ is nonempty? The issue is central because of the wide spectrum of minimization problems that can be put into this simple-looking form.

A fact customarily cited is this: a continuous function on a compact set attains its minimum. It also, of course, attains its maximum; this assertion is symmetric with respect to max and min. A more flexible approach is desirable, however. We don’t always wish to single out a compact set, and constraints might not even be present. The very distinction between constrained and unconstrained minimization is suppressed in working with the principle of abstract minimization in 1.4, not to mention problem formulations involving penalty expressions as in 1.3. It’s all just a matter of whether the function $f$ being minimized takes on the value $\infty$ in some regions or not. Another feature is that the functions we want to deal with may be far from continuous. The one in Figure 1–6 is a case in point, but that function $f$ does attain its minimum.

A property that’s crucial in this regard is the following.

1.8 Definition (level boundedness). A function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is (lower) level-bounded if for every $\alpha \in \mathbb{R}$ the set $\operatorname{lev}_{\le\alpha} f$ is bounded (possibly empty).

Note that only finite values of α are considered in this definition. The level

boundedness property corresponds to having f (x) → ∞ as |x| → ∞.

1.9 Theorem (attainment of a minimum). Suppose $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is lower semicontinuous, level-bounded and proper. Then the value $\inf f$ is finite and the set $\operatorname{argmin} f$ is nonempty and compact.

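Proof. Let $\bar\alpha = \inf f$; because $f$ is proper, $\bar\alpha < \infty$. For $\alpha \in (\bar\alpha, \infty)$, the set $\operatorname{lev}_{\le\alpha} f$ is nonempty; it’s closed because $f$ is lsc (cf. 1.6) and bounded because $f$ is level-bounded. The sets $\operatorname{lev}_{\le\alpha} f$ for $\alpha \in (\bar\alpha, \infty)$ are therefore compact and nested: $\operatorname{lev}_{\le\alpha} f \subset \operatorname{lev}_{\le\beta} f$ when $\alpha < \beta$. The intersection of this family of sets, which is $\operatorname{lev}_{\le\bar\alpha} f = \operatorname{argmin} f$, is therefore nonempty and compact. Since $f$ doesn’t have the value $-\infty$ anywhere, we conclude also that $\bar\alpha$ is finite. Under these circumstances, $\inf f$ can be written as $\min f$.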
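The role of level boundedness in Theorem 1.9 can be probed numerically by watching how far a sampled level set reaches as the sampling window grows. This is a heuristic sketch, not a test of the hypotheses: the helper `level_set_extent`, the windows, and the two sample functions are assumptions introduced here, and the windows are kept at 500 or below so that `math.exp` does not overflow.

```python
import math

def level_set_extent(f, alpha, window, steps=4001):
    """Largest |x| among grid points of [-window, window] with f(x) <= alpha.

    Returns None when no sampled point lies in the level set.
    """
    pts = [window * (2 * k / (steps - 1) - 1) for k in range(steps)]
    inside = [abs(x) for x in pts if f(x) <= alpha]
    return max(inside) if inside else None

quadratic = lambda x: x * x   # lsc, proper, level-bounded: minimum attained at 0
exponential = math.exp        # lsc, proper, not level-bounded: inf = 0 is not attained

windows = (10, 100, 500)
# lev_{<=4} of x^2 is [-2, 2]: the extent stays bounded however wide the window.
quad_extents = [level_set_extent(quadratic, 4.0, w) for w in windows]
# lev_{<=1} of exp is (-inf, 0]: the extent keeps up with the window.
exp_extents = [level_set_extent(exponential, 1.0, w) for w in windows]
```

The sampled extent for the quadratic never exceeds 2, while for the exponential it equals the window each time, which is the numerical footprint of an unbounded level set. The exponential has infimum 0 but no minimizer, so the level-boundedness assumption cannot simply be dropped.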

1.10 Corollary (lower bounds). If $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is lsc and proper, then it is bounded from below (finitely) on each bounded subset of $\mathbb{R}^n$ and in fact attains a minimum relative to any compact subset of $\mathbb{R}^n$ that meets $\operatorname{dom} f$.

Proof. For any bounded set $B \subset \mathbb{R}^n$ apply the theorem to the function $g$ defined by $g(x) = f(x)$ when $x \in \operatorname{cl} B$ but $g(x) = \infty$ when $x \notin \operatorname{cl} B$. The case where $g \equiv \infty$ can be dealt with as a triviality, while in all other cases $g$ is lsc, level-bounded and proper.

The conclusion of Theorem 1.9 would hold with level boundedness replaced

by the weaker assumption that, for some α ∈ IR, the set lev≤α f is bounded

and nonempty; this is easily gleaned from the proof. But level boundedness is

more convenient to work with in applications, and it’s typically present anyway

in situations where the attainment of a minimum is sought.

The crucial ingredient in Theorem 1.9 is the fact that when f is both

lsc and level-bounded it is inf-compact, which means that the sets lev≤α f for

α ∈ IR are all compact. This property is very flexible in providing a criterion

for the existence of optimal solutions, and it can be applied to a variety of

problems, with or without constraints.

1.11 Example (level boundedness relative to constraints). For a problem of minimizing a continuous function $f_0 : \mathbb{R}^n \to \mathbb{R}$ over a nonempty, closed set $C \subset \mathbb{R}^n$, if all sets of the form

\[
\{\, x \in C \mid f_0(x) \le \alpha \,\} \quad\text{for } \alpha \in \mathbb{R}
\]

are bounded, then the minimum of $f_0$ over $C$ is finite and attained on a nonempty, compact subset of $C$.

This criterion is fulfilled in particular if $C$ is bounded or if $f_0$ is level-bounded, with the latter condition covering even the case of unconstrained minimization, where $C = \mathbb{R}^n$.

Detail. The problem corresponds to minimizing $f = f_0 + \delta_C$ over $\mathbb{R}^n$. Here $f$ is proper because $C \neq \emptyset$, and it’s lsc by 1.6 because its level sets of the form $C \cap \{\, x \mid f_0(x) \le \alpha \,\}$ for $\alpha < \infty$ are closed, by virtue of the closedness of $C$ and the continuity of $f_0$. In assuming these sets are also bounded, we get the desired conclusions from 1.9.

An illustration of existence in the pattern of Example 1.11 with $C$ not necessarily bounded but $f_0$ inf-compact is furnished by $f_0(x) = |x|$. The minimization problem consists then of finding the point or points of $C$ nearest to the origin of $\mathbb{R}^n$. Theorem 1.9 is also applicable, of course, to minimization problems that do not fit the pattern of 1.11 at all. For instance, in minimizing the function in Figure 1–6 one isn’t simply minimizing a continuous function relative to a closed set, but the conditions in 1.9 are satisfied and a minimizing point exists. This is the kind of situation encountered in general when dealing with barrier functions, for instance.
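The nearest-point illustration can be tried out on a grid. The sketch below minimizes $f = f_0 + \delta_C$ with $f_0(x) = |x|$ over sampled points of the plane; the particular set $C$ (a closed half-plane), the grid, and the helper `indicator` are hypothetical choices, and a grid search only approximates minimization over $\mathbb{R}^2$.

```python
import math

def indicator(in_C):
    """Indicator function delta_C: 0 on C, +inf off C."""
    return lambda x: 0.0 if in_C(x) else math.inf

# f0 is the Euclidean norm; C is the closed, unbounded half-plane x1 + x2 >= 1.
f0 = lambda x: math.hypot(x[0], x[1])
delta_C = indicator(lambda x: x[0] + x[1] >= 1.0)
f = lambda x: f0(x) + delta_C(x)

# Grid on [-2, 2]^2 with step 0.25 (exactly representable in floating point).
grid = [(i / 4, j / 4) for i in range(-8, 9) for j in range(-8, 9)]

value = min(f(x) for x in grid)
minimizers = [x for x in grid if f(x) == value]
```

Although $C$ is unbounded, the sets $\{x \in C \mid |x| \le \alpha\}$ are bounded, so a nearest point exists; on this grid the search returns the single point $(0.5, 0.5)$, the projection of the origin onto the half-plane.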