Subgradients and Optimality Conditions, Summaries of Mathematical finance

An in-depth summary of subgradients, a fundamental concept in convex optimization. It covers the definition and properties of subgradients, strong and weak subgradient calculus, optimality conditions via subgradients, and the relationship between subgradients and directional derivatives. Examples and applications include piecewise-linear minimization, constrained optimization, and the connection between subgradients and descent directions, with an emphasis on nondifferentiable functions. The content is drawn from the EE364b course at Stanford University.

Typology: Summaries

2022/2023

Uploaded on 05/07/2024

yong-zhu

Subgradients

• subgradients
• strong and weak subgradient calculus
• optimality conditions via subgradients
• directional derivatives

EE364b, Stanford University

Basic inequality

recall basic inequality for convex differentiable $f$:

$$f(y) \ge f(x) + \nabla f(x)^T (y - x)$$

• first-order approximation of $f$ at $x$ is global underestimator
• $(\nabla f(x), -1)$ supports $\mathbf{epi}\, f$ at $(x, f(x))$

what if $f$ is not differentiable?

• $g$ is a subgradient of $f$ at $x$ iff $(g, -1)$ supports $\mathbf{epi}\, f$ at $(x, f(x))$
• $g$ is a subgradient iff $f(x) + g^T (y - x)$ is a global (affine) underestimator of $f$
• if $f$ is convex and differentiable, $\nabla f(x)$ is a subgradient of $f$ at $x$

subgradients come up in several contexts:

• algorithms for nondifferentiable convex optimization
• convex analysis, e.g., optimality conditions, duality for nondifferentiable problems

(if $f(y) \le f(x) + g^T (y - x)$ for all $y$, then $g$ is a supergradient)

Example

$f = \max\{f_1, f_2\}$, with $f_1$, $f_2$ convex and differentiable

[Figure: $f_1(x)$, $f_2(x)$, and $f(x)$ plotted against $x$, with the point $x_0$ marked]

• $f_1(x_0) > f_2(x_0)$: unique subgradient $g = \nabla f_1(x_0)$
• $f_2(x_0) > f_1(x_0)$: unique subgradient $g = \nabla f_2(x_0)$
• $f_1(x_0) = f_2(x_0)$: subgradients form a line segment $[\nabla f_1(x_0), \nabla f_2(x_0)]$
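The three cases above can be sketched for scalar $x$ as follows. This is a minimal illustration, not part of the original slides: the function name `subdifferential_of_max` and the example functions $f_1(x) = (x+1)^2$, $f_2(x) = (x-1)^2$ are assumptions chosen so that the two functions tie at $x_0 = 0$.

```python
def subdifferential_of_max(f1, df1, f2, df2, x0, tol=1e-12):
    """Return (lo, hi), the interval of subgradients of max{f1, f2} at scalar x0.

    f1, f2 are assumed convex and differentiable; df1, df2 are their derivatives.
    """
    v1, v2 = f1(x0), f2(x0)
    if v1 > v2 + tol:          # only f1 is active: unique subgradient
        g = df1(x0)
        return (g, g)
    if v2 > v1 + tol:          # only f2 is active: unique subgradient
        g = df2(x0)
        return (g, g)
    a, b = df1(x0), df2(x0)    # tie: the whole segment between the two derivatives
    return (min(a, b), max(a, b))


# Hypothetical example functions (not from the slides).
f1 = lambda x: (x + 1.0) ** 2
df1 = lambda x: 2.0 * (x + 1.0)
f2 = lambda x: (x - 1.0) ** 2
df2 = lambda x: 2.0 * (x - 1.0)

print(subdifferential_of_max(f1, df1, f2, df2, 0.0))   # (-2.0, 2.0)
print(subdifferential_of_max(f1, df1, f2, df2, 1.0))   # (4.0, 4.0)
```

The tolerance `tol` decides when the two values count as equal; in floating point an exact tie is rare, so the choice of tolerance is a design decision rather than part of the mathematics.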

Example

$f(x) = |x|$

[Figure: left plot shows $f(x) = |x|$; right plot shows $\partial f(x)$ against $x$, with ticks at $-1$ and $1$]

righthand plot shows $\{(x, g) \mid x \in \mathbf{R},\ g \in \partial f(x)\}$
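As a small sketch of the set shown in the righthand plot, the subdifferential of $|x|$ can be returned as an interval. The function names here are illustrative assumptions; the helper `is_subgradient` only spot-checks the defining inequality $f(y) \ge f(x) + g (y - x)$ on a few test points, so it is a sanity check, not a proof.

```python
def abs_subdifferential(x):
    """Interval (lo, hi) of subgradients of f(x) = |x| at x."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)          # at the kink every slope in [-1, 1] works


def is_subgradient(g, x, test_points, tol=1e-12):
    """Spot-check |y| >= |x| + g * (y - x) on the given test points."""
    return all(abs(y) >= abs(x) + g * (y - x) - tol for y in test_points)


points = [-2.0, -0.5, 0.0, 0.3, 1.0, 4.0]
print(abs_subdifferential(0.0))              # (-1.0, 1.0)
print(is_subgradient(0.5, 0.0, points))      # True
print(is_subgradient(1.5, 0.0, points))      # False
```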

Subgradient calculus

• weak subgradient calculus: formulas for finding one subgradient $g \in \partial f(x)$
• strong subgradient calculus: formulas for finding the whole subdifferential $\partial f(x)$, i.e., all subgradients of $f$ at $x$
• many algorithms for nondifferentiable convex optimization require only one subgradient at each step, so weak calculus suffices
• some algorithms, optimality conditions, etc., need whole subdifferential
• roughly speaking: if you can compute $f(x)$, you can usually compute a $g \in \partial f(x)$
• we'll assume that $f$ is convex, and $x \in \mathbf{relint}\, \mathbf{dom}\, f$
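To illustrate the remark that computing $f(x)$ usually yields a subgradient as a by-product, here is a hedged sketch of weak calculus for a piecewise-linear function $f(x) = \max_i (a_i^T x + b_i)$. The function name and the data `A`, `b` are assumptions made for this example: evaluating $f$ identifies an active index $j$, and $a_j$ is then one subgradient.

```python
def pwl_value_and_subgradient(A, b, x):
    """Evaluate f(x) = max_i (a_i^T x + b_i) and return (f(x), one subgradient).

    A is a list of rows a_i, b a list of offsets b_i, x a list of floats.
    """
    values = [sum(aij * xj for aij, xj in zip(ai, x)) + bi
              for ai, bi in zip(A, b)]
    j = max(range(len(values)), key=values.__getitem__)  # first active index
    return values[j], list(A[j])


# Hypothetical data for illustration.
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 1.0]

value, g = pwl_value_and_subgradient(A, b, [2.0, 1.0])
print(value, g)   # 2.0 [1.0, 0.0]
```

Only one active row is returned, which is all a weak rule needs; collecting every active row and taking the convex hull would give the full subdifferential.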

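$f(x) = \max\{f_1(x), \ldots, f_m(x)\}$, with $f_1, \ldots, f_m$ differentiable

$$\partial f(x) = \mathbf{Co}\{\nabla f_i(x) \mid f_i(x) = f(x)\}$$

example: $f(x) = \|x\|_1 = \max\{s^T x \mid s_i \in \{-1, 1\}\}$

[Figure: three plots of $\partial f(x)$ with axis ticks at $-1$ and $1$: at $x = (0, 0)$, at $x = (1, 0)$, and at $x = (1, 1)$, where the plot labels the single point $(1, 1)$]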
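For the $\ell_1$-norm example, the convex hull of the active sign vectors is a product of intervals: coordinate $i$ contributes $\{\mathrm{sign}(x_i)\}$ when $x_i \neq 0$ and $[-1, 1]$ when $x_i = 0$. The sketch below returns that box and one particular subgradient; the function names are illustrative assumptions.

```python
def l1_subdifferential(x):
    """Subdifferential of ||x||_1 at x, as a list of per-coordinate intervals (lo, hi)."""
    box = []
    for xi in x:
        if xi > 0:
            box.append((1.0, 1.0))
        elif xi < 0:
            box.append((-1.0, -1.0))
        else:
            box.append((-1.0, 1.0))   # any value in [-1, 1] is allowed here
    return box


def l1_subgradient(x):
    """One subgradient: the midpoint of each interval, i.e. sign(x_i) with sign(0) = 0."""
    return [(lo + hi) / 2.0 for lo, hi in l1_subdifferential(x)]


print(l1_subdifferential([0.0, 0.0]))   # [(-1.0, 1.0), (-1.0, 1.0)]
print(l1_subdifferential([1.0, 0.0]))   # [(1.0, 1.0), (-1.0, 1.0)]
print(l1_subdifferential([1.0, 1.0]))   # [(1.0, 1.0), (1.0, 1.0)]
```

The three printed cases correspond to the three points in the figure: a square, a vertical segment, and a single point.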

Pointwise supremum

if $f = \sup_{\alpha \in \mathcal{A}} f_\alpha$,

$$\mathbf{cl}\, \mathbf{Co} \bigcup \{\partial f_\beta(x) \mid f_\beta(x) = f(x)\} \subseteq \partial f(x)$$

(usually get equality, but requires some technical conditions to hold, e.g., $\mathcal{A}$ compact, $f_\alpha$ continuous in $x$ and $\alpha$)

roughly speaking, $\partial f(x)$ is closure of convex hull of union of subdifferentials of active functions

example

$$f(x) = \lambda_{\max}(A(x)) = \sup_{\|y\|_2 = 1} y^T A(x) y$$

where $A(x) = A_0 + x_1 A_1 + \cdots + x_n A_n$, $A_i \in \mathbf{S}^k$

• $f$ is pointwise supremum of $g_y(x) = y^T A(x) y$ over $\|y\|_2 = 1$
• $g_y$ is affine in $x$, with $\nabla g_y(x) = (y^T A_1 y, \ldots, y^T A_n y)$
• hence, $\partial f(x) \supseteq \mathbf{Co}\{\nabla g_y \mid A(x) y = \lambda_{\max}(A(x)) y,\ \|y\|_2 = 1\}$ (in fact equality holds here)

to find one subgradient at $x$, can choose any unit eigenvector $y$ associated with $\lambda_{\max}(A(x))$; then

$$(y^T A_1 y, \ldots, y^T A_n y) \in \partial f(x)$$
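The recipe above (pick a unit eigenvector $y$ for $\lambda_{\max}(A(x))$, then form $y^T A_i y$) can be sketched in plain Python. Everything named here is an assumption for illustration: the eigenvector is found with a shifted power iteration, which is one simple choice and not the method of the slides, and it assumes the fixed starting vector is not orthogonal to the top eigenspace. The test matrices are chosen so that $\lambda_{\max}(A(x)) = \|x\|_2$.

```python
import math


def mat_vec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def top_eigenvector(M, iters=500):
    """Unit eigenvector for the largest eigenvalue of symmetric M (shifted power iteration)."""
    k = len(M)
    # Gershgorin bound: M + shift * I is positive semidefinite, so its dominant
    # eigenvalue corresponds to lambda_max(M).
    shift = max(sum(abs(e) for e in row) for row in M)
    y = [1.0 + i / k for i in range(k)]   # arbitrary start (assumed not orthogonal to the target)
    norm = math.sqrt(sum(yi * yi for yi in y))
    y = [yi / norm for yi in y]
    for _ in range(iters):
        z = [zi + shift * yi for zi, yi in zip(mat_vec(M, y), y)]
        norm = math.sqrt(sum(zi * zi for zi in z))
        if norm == 0.0:                   # M is the zero matrix: any unit vector works
            break
        y = [zi / norm for zi in z]
    return y


def lambda_max_subgradient(A_list, x):
    """A_list = [A0, A1, ..., An]. Returns (approx lambda_max(A(x)), one subgradient)."""
    A0, rest = A_list[0], A_list[1:]
    k = len(A0)
    Ax = [[A0[r][c] + sum(xi * Ai[r][c] for xi, Ai in zip(x, rest))
           for c in range(k)] for r in range(k)]
    y = top_eigenvector(Ax)
    quad = lambda M: sum(yi * wi for yi, wi in zip(y, mat_vec(M, y)))
    return quad(Ax), [quad(Ai) for Ai in rest]


# Hypothetical data: with these matrices, lambda_max(A(x)) equals the Euclidean norm of x.
A0 = [[0.0, 0.0], [0.0, 0.0]]
A1 = [[1.0, 0.0], [0.0, -1.0]]
A2 = [[0.0, 1.0], [1.0, 0.0]]

val, g = lambda_max_subgradient([A0, A1, A2], [3.0, 4.0])
print(val, g)   # approximately 5.0 and [0.6, 0.8]
```

In practice a library eigenvalue routine would replace `top_eigenvector`; the power iteration is used here only to keep the sketch dependency-free.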

Expectation

• $f(x) = \mathbf{E}\, f(x, \omega)$, with $f$ convex in $x$ for each $\omega$, $\omega$ a random variable
• for each $\omega$, choose any $g_\omega \in \partial_x f(x, \omega)$ (so $\omega \mapsto g_\omega$ is a function)
• then, $g = \mathbf{E}\, g_\omega \in \partial f(x)$

Monte Carlo method for (approximately) computing $f(x)$ and a $g \in \partial f(x)$:

• generate independent samples $\omega_1, \ldots, \omega_K$ from distribution of $\omega$
• $f(x) \approx (1/K) \sum_{i=1}^{K} f(x, \omega_i)$
• for each $i$ choose $g_i \in \partial_x f(x, \omega_i)$
• $g = (1/K) \sum_{i=1}^{K} g_i$ is an (approximate) subgradient (more on this later)
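The Monte Carlo steps above can be sketched as follows. The integrand $f(x, \omega) = |x - \omega|$ with $\omega$ uniform on $[-1, 1]$ is an assumption picked for illustration, as are the function names and the fixed seed. The averaged $g$ is an exact subgradient of the sample-average function and only an approximate subgradient of the true expectation.

```python
import random


def f_sample(x, w):
    """Hypothetical integrand f(x, w) = |x - w|, convex in x for each w."""
    return abs(x - w)


def subgrad_sample(x, w):
    """One subgradient of f(., w) at x."""
    if x > w:
        return 1.0
    if x < w:
        return -1.0
    return 0.0


def monte_carlo(x, K, seed=0):
    """Return (estimate of f(x), averaged subgradient, samples used)."""
    rng = random.Random(seed)
    samples = [rng.uniform(-1.0, 1.0) for _ in range(K)]
    f_hat = sum(f_sample(x, w) for w in samples) / K
    g_hat = sum(subgrad_sample(x, w) for w in samples) / K
    return f_hat, g_hat, samples


f_hat, g_hat, samples = monte_carlo(0.3, 1000)
print(f_hat, g_hat)
```

For this particular choice the expectation has the closed form $(1 + x^2)/2$ on $|x| \le 1$, with derivative $x$, so the printed values can be compared against roughly $0.545$ and $0.3$; how close they land depends on $K$ and the seed.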

Composition

• $f(x) = h(f_1(x), \ldots, f_k(x))$, with $h$ convex nondecreasing, $f_i$ convex
• find $q \in \partial h(f_1(x), \ldots, f_k(x))$, $g_i \in \partial f_i(x)$
• then, $g = q_1 g_1 + \cdots + q_k g_k \in \partial f(x)$
• reduces to standard formula for differentiable $h$, $f_i$

proof:

$$
\begin{aligned}
f(y) &= h(f_1(y), \ldots, f_k(y)) \\
&\ge h(f_1(x) + g_1^T (y - x), \ldots, f_k(x) + g_k^T (y - x)) \\
&\ge h(f_1(x), \ldots, f_k(x)) + q^T (g_1^T (y - x), \ldots, g_k^T (y - x)) \\
&= f(x) + g^T (y - x)
\end{aligned}
$$
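A minimal sketch of the composition rule for scalar $x$, under assumptions chosen for illustration: $h(z) = \log(e^{z_1} + e^{z_2})$ (convex and nondecreasing in each argument, with gradient equal to the softmax of $z$), $f_1(x) = |x|$, and $f_2(x) = (x - 1)^2$. None of these choices come from the slides.

```python
import math


def composite(x):
    """f(x) = h(f1(x), f2(x)) with h = log-sum-exp, f1 = |x|, f2 = (x - 1)^2."""
    return math.log(math.exp(abs(x)) + math.exp((x - 1.0) ** 2))


def composite_subgradient(x):
    """One subgradient g = q1 * g1 + q2 * g2 of the composite function at x."""
    z = [abs(x), (x - 1.0) ** 2]
    # g_i: one subgradient of each inner function at x.
    g1 = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)
    g2 = 2.0 * (x - 1.0)
    # q: gradient of log-sum-exp at z, i.e. the softmax (computed stably).
    m = max(z)
    e = [math.exp(zi - m) for zi in z]
    s = sum(e)
    q = [ei / s for ei in e]
    return q[0] * g1 + q[1] * g2


x0 = 0.5
g = composite_subgradient(x0)
grid = [-3.0 + 0.25 * i for i in range(25)]   # points from -3 to 3
ok = all(composite(y) >= composite(x0) + g * (y - x0) - 1e-9 for y in grid)
print(g, ok)
```

The grid check only spot-checks the inequality from the proof; the monotonicity of $h$ is what makes the first inequality in the proof valid, so a nonmonotone outer function would break this rule.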

Subgradients and sublevel sets

$g$ is a subgradient at $x$ means $f(y) \ge f(x) + g^T (y - x)$

hence $f(y) \le f(x) \implies g^T (y - x) \le 0$

[Figure: sublevel set $\{x \mid f(x) \le f(x_0)\}$, with $g \in \partial f(x_0)$ drawn at $x_0$ and $\nabla f(x_1)$ drawn at $x_1$]

Quasigradients

$g \ne 0$ is a quasigradient of $f$ at $x$ if

$$g^T (y - x) \ge 0 \implies f(y) \ge f(x)$$

holds for all $y$

[Figure: $g$ drawn at $x$, with the region $f(y) \le f(x)$ marked]

quasigradients at $x$ form a cone

example:

$$f(x) = \frac{a^T x + b}{c^T x + d}, \qquad \mathbf{dom}\, f = \{x \mid c^T x + d > 0\}$$

$g = a - f(x_0) c$ is a quasigradient at $x_0$

proof: for $c^T x + d > 0$:

$$a^T (x - x_0) \ge f(x_0) c^T (x - x_0) \implies f(x) \ge f(x_0)$$
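The quasigradient formula for the linear-fractional example can be spot-checked numerically. The sketch below uses made-up data (`a`, `b`, `c`, `d`, `x0`) and hypothetical function names; `implication_holds` tests the defining implication at a single point $y$, so running it over a grid is a sanity check rather than a proof.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def linear_fractional(a, b, c, d, x):
    """f(x) = (a^T x + b) / (c^T x + d) on the domain c^T x + d > 0."""
    den = dot(c, x) + d
    if den <= 0:
        raise ValueError("x is outside dom f")
    return (dot(a, x) + b) / den


def quasigradient(a, b, c, d, x0):
    """g = a - f(x0) * c, as in the example above."""
    fx0 = linear_fractional(a, b, c, d, x0)
    return [ai - fx0 * ci for ai, ci in zip(a, c)]


def implication_holds(a, b, c, d, x0, y, tol=1e-9):
    """Check g^T (y - x0) >= 0  =>  f(y) >= f(x0) at one point y in dom f."""
    if dot(c, y) + d <= 0:
        return True                      # y is outside the domain: nothing to check
    g = quasigradient(a, b, c, d, x0)
    if dot(g, [yi - xi for yi, xi in zip(y, x0)]) < 0:
        return True                      # premise fails: implication is vacuous
    return linear_fractional(a, b, c, d, y) >= linear_fractional(a, b, c, d, x0) - tol


# Hypothetical data for illustration.
a, b, c, d = [1.0, 2.0], 0.5, [0.5, -1.0], 3.0
x0 = [1.0, 1.0]

grid = [-2.0 + 0.5 * i for i in range(9)]
print(quasigradient(a, b, c, d, x0))     # approximately [0.3, 3.4]
print(all(implication_holds(a, b, c, d, x0, [y1, y2]) for y1 in grid for y2 in grid))
```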