Optimal Methods for Unconstrained Minimization: Convergence Analysis - Prof. Sergiy Butenko, Study notes of Systems Engineering

An analysis of the convergence properties of optimal methods for unconstrained minimization of functions from the class $S^{1,1}_{\mu,L}(\mathbb{R}^n)$. The notes cover the scheme of optimal methods, the convergence theorem, and the rate of convergence, and discuss the relationship between the lower complexity bound and the proposed method.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

ISEN 629: Engineering Optimization
Lecture 9
Sergiy Butenko
Industrial and Systems Engineering
Texas A&M University
Fall 2007

Optimal methods

Next, we need to make sure that the condition of Lemma (2.2.1) is satisfied. Assume that we already have $x_k$ such that

$$\phi_k^* \ge f(x_k).$$

Then by the above lemma, we have

$$\phi_{k+1}^* \ge (1-\alpha_k) f(x_k) + \alpha_k f(y_k) - \frac{\alpha_k^2}{2\gamma_{k+1}} \|f'(y_k)\|^2 + \frac{\alpha_k(1-\alpha_k)\gamma_k}{\gamma_{k+1}} f'(y_k)^T (v_k - y_k).$$

Since $f(x_k) \ge f(y_k) + f'(y_k)^T (x_k - y_k)$, we get

$$\phi_{k+1}^* \ge f(y_k) - \frac{\alpha_k^2}{2\gamma_{k+1}} \|f'(y_k)\|^2 + (1-\alpha_k) f'(y_k)^T \left( \frac{\alpha_k\gamma_k}{\gamma_{k+1}} (v_k - y_k) + x_k - y_k \right).$$

We want to have $\phi_{k+1}^* \ge f(x_{k+1})$...

From $f(x) - f(y) \le f'(y)^T (x - y) + \frac{L}{2}\|x - y\|^2$, we can ensure the inequality

$$f(y_k) - \frac{1}{2L}\|f'(y_k)\|^2 \ge f(x_{k+1})$$

by, e.g., taking the gradient step $x_{k+1} = y_k - h_k f'(y_k)$ with $h_k = \frac{1}{L}$ and using the inequality $f(y) - f(x) - f'(x)^T (y - x) \le \frac{L}{2}\|x - y\|^2$ with $y = x_{k+1}$ and $x = y_k$. Let us define $\alpha_k$ as follows:

$$L\alpha_k^2 = \gamma_{k+1} = (1-\alpha_k)\gamma_k + \alpha_k\mu.$$

Then $\frac{\alpha_k^2}{2\gamma_{k+1}} = \frac{1}{2L}$ and we have

$$\phi_{k+1}^* \ge f(x_{k+1}) + (1-\alpha_k) f'(y_k)^T \left( \frac{\alpha_k\gamma_k}{\gamma_{k+1}} (v_k - y_k) + x_k - y_k \right).$$

Since we are free to choose $y_k$, we can find it from the equation

$$\frac{\alpha_k\gamma_k}{\gamma_{k+1}} (v_k - y_k) + x_k - y_k = 0,$$

yielding

$$y_k = \frac{\alpha_k\gamma_k v_k + \gamma_{k+1} x_k}{\gamma_k + \alpha_k\mu}.$$
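For completeness, the algebra behind this formula can be written out as follows; it uses only the definition $\gamma_{k+1} = (1-\alpha_k)\gamma_k + \alpha_k\mu$. Multiplying the equation for $y_k$ by $\gamma_{k+1}$ and collecting the terms in $y_k$ gives

$$\begin{aligned}
\alpha_k\gamma_k (v_k - y_k) + \gamma_{k+1}(x_k - y_k) = 0
\;&\Longrightarrow\; (\alpha_k\gamma_k + \gamma_{k+1})\, y_k = \alpha_k\gamma_k v_k + \gamma_{k+1} x_k, \\
\alpha_k\gamma_k + \gamma_{k+1} &= \alpha_k\gamma_k + (1-\alpha_k)\gamma_k + \alpha_k\mu = \gamma_k + \alpha_k\mu .
\end{aligned}$$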

General scheme of optimal method

  0. Choose $x_0 \in \mathbb{R}^n$ and $\gamma_0 > 0$. Set $v_0 = x_0$.
  1. k-th iteration ($k \ge 0$):
     a) Compute $\alpha_k \in (0, 1)$ from the equation

        $$L\alpha_k^2 = (1-\alpha_k)\gamma_k + \alpha_k\mu.$$

        Set $\gamma_{k+1} = (1-\alpha_k)\gamma_k + \alpha_k\mu$.
     b) Choose

        $$y_k = \frac{\alpha_k\gamma_k v_k + \gamma_{k+1} x_k}{\gamma_k + \alpha_k\mu}$$

        and compute $f(y_k)$ and $f'(y_k)$.
     c) Find $x_{k+1}$ such that

        $$f(x_{k+1}) \le f(y_k) - \frac{1}{2L}\|f'(y_k)\|^2.$$

     d) Set

        $$v_{k+1} = \frac{(1-\alpha_k)\gamma_k v_k + \alpha_k\mu y_k - \alpha_k f'(y_k)}{\gamma_{k+1}}.$$
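The scheme above can be sketched in plain Python. This is a minimal illustration, not the lecture's own code: the names `solve_alpha` and `optimal_method` are hypothetical, step c) is realized by the gradient step $x_{k+1} = y_k - \frac{1}{L} f'(y_k)$ suggested earlier, and the test function in the usage lines (a diagonal quadratic with $\mu = 1$, $L = 100$) is an assumption chosen only for the demonstration.

```python
import math


def solve_alpha(L, gamma, mu):
    """Positive root of L*a**2 = (1 - a)*gamma + a*mu (step a)."""
    # Rearranged as a quadratic in a: L*a**2 + (gamma - mu)*a - gamma = 0.
    b = gamma - mu
    return (-b + math.sqrt(b * b + 4.0 * L * gamma)) / (2.0 * L)


def optimal_method(grad, x0, L, mu, gamma0, iters):
    """Run `iters` iterations of the general scheme; vectors are lists of floats."""
    x, v, gamma = list(x0), list(x0), gamma0
    for _ in range(iters):
        # a) alpha_k and gamma_{k+1}
        a = solve_alpha(L, gamma, mu)
        gamma_next = (1.0 - a) * gamma + a * mu
        # b) y_k as a weighted combination of v_k and x_k
        denom = gamma + a * mu
        y = [(a * gamma * vi + gamma_next * xi) / denom for vi, xi in zip(v, x)]
        g = grad(y)
        # c) gradient step with h_k = 1/L (one admissible choice of x_{k+1})
        x = [yi - gi / L for yi, gi in zip(y, g)]
        # d) update v_{k+1}
        v = [((1.0 - a) * gamma * vi + a * mu * yi - a * gi) / gamma_next
             for vi, yi, gi in zip(v, y, g)]
        gamma = gamma_next
    return x


# Usage on an assumed test problem: f(x) = 0.5 * sum(d_i * x_i**2), minimizer x* = 0.
d = [1.0, 10.0, 100.0]  # eigenvalues, so mu = 1 and L = 100
f = lambda x: 0.5 * sum(di * xi * xi for di, xi in zip(d, x))
grad = lambda x: [di * xi for di, xi in zip(d, x)]
x_final = optimal_method(grad, [1.0, 1.0, 1.0], L=100.0, mu=1.0, gamma0=100.0, iters=100)
print(f(x_final))
```

Plain lists keep the sketch free of dependencies; for larger problems an array library would be the natural substitute.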

Theorem (2.2.1)

The above scheme generates a sequence $\{x_k : k \ge 0\}$ such that

$$f(x_k) - f^* \le \lambda_k \left[ f(x_0) - f^* + \frac{\gamma_0}{2}\|x_0 - x^*\|^2 \right],$$

where $\lambda_0 = 1$ and $\lambda_k = \prod_{i=0}^{k-1} (1 - \alpha_i)$.

Proof: Choose $\phi_0(x) = f(x_0) + \frac{\gamma_0}{2}\|x - v_0\|^2$. Then $f(x_0) = \phi_0^*$ and we get $f(x_k) \le \phi_k^*$ for any $k$. Therefore, we can apply Lemma (2.2.1). □

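To estimate the rate of convergence of $\{f(x_k) : k \ge 0\}$, we can use the rate of convergence of $\{\lambda_k\}$.

Lemma (2.2.4)

If $\gamma_0 \ge \mu$ then

$$\lambda_k \le \min\left\{ \left(1 - \sqrt{\frac{\mu}{L}}\right)^k, \; \frac{4L}{(2\sqrt{L} + k\sqrt{\gamma_0})^2} \right\}.$$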
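As a numerical sanity check of Lemma (2.2.4), the recursion for $\alpha_k$ and $\gamma_k$ from the general scheme can be run directly and compared with the bound $\min\{(1 - \sqrt{\mu/L})^k,\ 4L/(2\sqrt{L} + k\sqrt{\gamma_0})^2\}$. The sketch below is illustrative only; the function names and the parameter values ($L = 100$, $\mu = 1$, $\gamma_0 = L$) are assumptions.

```python
import math


def lambda_sequence(L, mu, gamma0, iters):
    """Return [lambda_0, ..., lambda_iters], where lambda_k = prod_{i<k} (1 - alpha_i)."""
    lams, gamma = [1.0], gamma0
    for _ in range(iters):
        # alpha_k is the positive root of L*a**2 + (gamma_k - mu)*a - gamma_k = 0.
        b = gamma - mu
        alpha = (-b + math.sqrt(b * b + 4.0 * L * gamma)) / (2.0 * L)
        gamma = (1.0 - alpha) * gamma + alpha * mu
        lams.append(lams[-1] * (1.0 - alpha))
    return lams


def lemma_bound(L, mu, gamma0, k):
    """Right-hand side of the lemma: the smaller of the linear and sublinear rates."""
    linear = (1.0 - math.sqrt(mu / L)) ** k
    sublinear = 4.0 * L / (2.0 * math.sqrt(L) + k * math.sqrt(gamma0)) ** 2
    return min(linear, sublinear)


L, mu, gamma0 = 100.0, 1.0, 100.0
lams = lambda_sequence(L, mu, gamma0, 50)
for k in (0, 10, 50):
    print(k, lams[k], lemma_bound(L, mu, gamma0, k))
```

For small $k$ the sublinear term is the smaller of the two; the linear term takes over once $k$ is large relative to $\sqrt{L/\mu}$.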

Theorem (2.2.2)

Choose $\gamma_0 = L$. Then our scheme generates a sequence $\{x_k : k \ge 0\}$ such that

$$f(x_k) - f^* \le L \min\left\{ \left(1 - \sqrt{\frac{\mu}{L}}\right)^k, \; \frac{4}{(k+2)^2} \right\} \|x_0 - x^*\|^2.$$

This means that our scheme is optimal for unconstrained minimization of functions from $S^{1,1}_{\mu,L}(\mathbb{R}^n)$, $\mu \ge 0$.

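Proof: We will use a property of $f \in F^{1,1}_L(\mathbb{R}^n)$:

$$f(y) - f(x) - f'(x)^T (y - x) \le \frac{L}{2}\|x - y\|^2, \quad \forall x, y \in \mathbb{R}^n.$$

Taking $x = x^*$ and $y = x_0$, and using $f'(x^*) = 0$, we have

$$f(x_0) - f^* \le \frac{L}{2}\|x_0 - x^*\|^2.$$

Hence, from the previous theorem and lemma with $\gamma_0 = L$,

$$\begin{aligned}
f(x_k) - f^* &\le \lambda_k \left[ f(x_0) - f^* + \frac{\gamma_0}{2}\|x_0 - x^*\|^2 \right] \\
&\le \min\left\{ \left(1 - \sqrt{\frac{\mu}{L}}\right)^k, \; \frac{4L}{(2\sqrt{L} + k\sqrt{\gamma_0})^2} \right\} L\|x_0 - x^*\|^2 \\
&= L \min\left\{ \left(1 - \sqrt{\frac{\mu}{L}}\right)^k, \; \frac{4}{(k+2)^2} \right\} \|x_0 - x^*\|^2.
\end{aligned}$$

Next, we will show that our scheme is indeed optimal for $S^{1,1}_{\mu,L}(\mathbb{R}^n)$.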

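Since $\alpha_k^2 L = (1-\alpha_k)\gamma_k + \mu\alpha_k = \gamma_{k+1}$, we have

$$\begin{aligned}
\beta_k &= \frac{\alpha_{k+1}\gamma_{k+1}(1-\alpha_k)}{\alpha_k(\gamma_{k+1} + \alpha_{k+1}\mu)}
= \frac{\alpha_{k+1}\gamma_{k+1}(1-\alpha_k)}{\alpha_k\left(\gamma_{k+1} + \alpha_{k+1}^2 L - (1-\alpha_{k+1})\gamma_{k+1}\right)} \\
&= \frac{\gamma_{k+1}(1-\alpha_k)}{\alpha_k(\gamma_{k+1} + \alpha_{k+1}L)}
= \frac{\alpha_k(1-\alpha_k)}{\alpha_k^2 + \alpha_{k+1}}.
\end{aligned}$$

Note that

$$\alpha_{k+1}^2 = (1-\alpha_{k+1})\alpha_k^2 + \frac{\mu}{L}\alpha_{k+1},$$

so we can completely eliminate the sequence $\{\gamma_k : k \ge 0\}$.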

We obtain the following scheme.

  0. Choose $x_0 \in \mathbb{R}^n$ and $\alpha_0 \in (0, 1)$. Set $y_0 = x_0$ and $q = \mu/L$.
  1. k-th iteration ($k \ge 0$):
     a) Compute $f(y_k)$ and $f'(y_k)$. Set

        $$x_{k+1} = y_k - \frac{1}{L} f'(y_k).$$

     b) Compute $\alpha_{k+1} \in (0, 1)$ from the equation

        $$\alpha_{k+1}^2 = (1-\alpha_{k+1})\alpha_k^2 + q\alpha_{k+1}.$$

        Set

        $$\beta_k = \frac{\alpha_k(1-\alpha_k)}{\alpha_k^2 + \alpha_{k+1}}, \qquad y_{k+1} = x_{k+1} + \beta_k (x_{k+1} - x_k).$$
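A possible plain-Python rendering of this scheme is sketched below. The names `next_alpha` and `accelerated_gradient` are hypothetical, and the starting value $\alpha_0 = 0.5$ and the diagonal quadratic test function (with $\mu = 1$, $L = 100$) are assumptions made only for the demonstration.

```python
import math


def next_alpha(alpha, q):
    """Positive root of a**2 = (1 - a)*alpha**2 + q*a (step b)."""
    # Rearranged as a quadratic in a: a**2 + (alpha**2 - q)*a - alpha**2 = 0.
    b = alpha * alpha - q
    return (-b + math.sqrt(b * b + 4.0 * alpha * alpha)) / 2.0


def accelerated_gradient(grad, x0, L, mu, alpha0, iters):
    """Run `iters` iterations of the gamma-free scheme; vectors are lists of floats."""
    q = mu / L
    x, y, alpha = list(x0), list(x0), alpha0
    for _ in range(iters):
        # a) gradient step from y_k
        g = grad(y)
        x_next = [yi - gi / L for yi, gi in zip(y, g)]
        # b) new alpha, momentum coefficient beta_k, and extrapolated point y_{k+1}
        alpha_next = next_alpha(alpha, q)
        beta = alpha * (1.0 - alpha) / (alpha * alpha + alpha_next)
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]
        x, alpha = x_next, alpha_next
    return x


# Usage on an assumed test problem: f(x) = 0.5 * sum(d_i * x_i**2), minimizer x* = 0.
d = [1.0, 10.0, 100.0]  # eigenvalues, so mu = 1 and L = 100
f = lambda x: 0.5 * sum(di * xi * xi for di, xi in zip(d, x))
grad = lambda x: [di * xi for di, xi in zip(d, x)]
x_final = accelerated_gradient(grad, [1.0, 1.0, 1.0], L=100.0, mu=1.0, alpha0=0.5, iters=100)
print(f(x_final))
```

Compared with the general scheme, only $x_k$, $y_k$ and the scalar $\alpha_k$ need to be stored, which is the practical benefit of eliminating $\{\gamma_k\}$ and $\{v_k\}$.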

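Theorem (2.2.3)

If in the above scheme $\alpha_0 \ge \sqrt{\frac{\mu}{L}}$, then

$$f(x_k) - f^* \le \min\left\{ \left(1 - \sqrt{\frac{\mu}{L}}\right)^k, \; \frac{4L}{(2\sqrt{L} + k\sqrt{\gamma_0})^2} \right\} \times \left[ f(x_0) - f^* + \frac{\gamma_0}{2}\|x_0 - x^*\|^2 \right],$$

where $\gamma_0 = \frac{\alpha_0(\alpha_0 L - \mu)}{1 - \alpha_0}$.

Proof: The condition $\alpha_0 \ge \sqrt{\frac{\mu}{L}}$ is equivalent to $\gamma_0 \ge \mu$. Therefore the statement of this theorem follows from Theorem (2.2.1) and Lemma (2.2.4). □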
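The equivalence used in the proof can be verified directly. The expression for $\gamma_0$ comes from solving the defining equation $L\alpha_0^2 = (1-\alpha_0)\gamma_0 + \alpha_0\mu$ of the general scheme for $\gamma_0$. Since $0 < \alpha_0 < 1$,

$$\gamma_0 \ge \mu
\iff \alpha_0(\alpha_0 L - \mu) \ge \mu(1 - \alpha_0)
\iff \alpha_0^2 L \ge \mu
\iff \alpha_0 \ge \sqrt{\frac{\mu}{L}}.$$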

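If we choose $\alpha_0 = \sqrt{\mu/L}$, then $\gamma_0 = \mu$, $\alpha_k = \sqrt{\mu/L}$, $\beta_k = \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}}$, and we obtain the following scheme.

  0. Choose $y_0 = x_0 \in \mathbb{R}^n$.
  1. k-th iteration ($k \ge 0$):

     $$x_{k+1} = y_k - \frac{1}{L} f'(y_k),$$

     $$y_{k+1} = x_{k+1} + \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}} (x_{k+1} - x_k).$$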
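The constant-momentum scheme is short enough to sketch in a few lines of plain Python. The function names and the diagonal quadratic test problem (with $\mu = 1$, $L = 100$) are assumptions for illustration; the sketch also assumes $\mu > 0$, since the choice $\alpha_0 = \sqrt{\mu/L}$ must lie in $(0, 1)$.

```python
import math


def momentum_coefficient(L, mu):
    """beta = (sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu))."""
    return (math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))


def constant_momentum_gradient(grad, x0, L, mu, iters):
    """Gradient step from y_k followed by a fixed-coefficient extrapolation."""
    beta = momentum_coefficient(L, mu)
    x, y = list(x0), list(x0)
    for _ in range(iters):
        g = grad(y)
        x_next = [yi - gi / L for yi, gi in zip(y, g)]
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]
        x = x_next
    return x


# Usage on an assumed test problem: f(x) = 0.5 * sum(d_i * x_i**2), minimizer x* = 0.
d = [1.0, 10.0, 100.0]  # eigenvalues, so mu = 1 and L = 100
f = lambda x: 0.5 * sum(di * xi * xi for di, xi in zip(d, x))
grad = lambda x: [di * xi for di, xi in zip(d, x)]
x_final = constant_momentum_gradient(grad, [1.0, 1.0, 1.0], L=100.0, mu=1.0, iters=100)
print(f(x_final))
```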