Eigenvalues of matrix, Exams of Mathematics

eigenvalues of matrix algebra best

Typology: Exams

2017/2018

Uploaded on 09/16/2018

Waqas712
Waqas712 🇵🇰

2 documents

1 / 81

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
A Calculus Approach to
Matrix Eigenvalue Algorithms
Habilitationsschrift
der Fakult¨at ur Mathematik und Informatik
der Bayerischen Julius-Maximilians-Universit¨at urzburg
ur das Fach Mathematik vorgelegt von
Knut uper
urzburg im Juli 2002
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4b
pf4c
pf4d
pf4e
pf4f
pf50
pf51

Partial preview of the text

Download Eigenvalues of matrix and more Exams Mathematics in PDF only on Docsity!

A Calculus Approach to

Matrix Eigenvalue Algorithms

Habilitationsschrift

der Fakult¨at f¨ur Mathematik und Informatik

der Bayerischen Julius-Maximilians-Universit¨at W¨urzburg

f¨ur das Fach Mathematik vorgelegt von

Knut H¨uper

W¨urzburg im Juli 2002

Meiner Frau Barbara

und unseren Kindern Lea, Juval und Noa gewidmet

  • 1 Introduction
  • 2 Jacobi-type Algorithms and Cyclic Coordinate Descent
    • 2.1 Algorithms
      • 2.1.1 Jacobi and Cyclic Coordinate Descent
        • Descent 2.1.2 Block Jacobi and Grouped Variable Cyclic Coordinate
        • tion 2.1.3 Applications and Examples for 1-dimensional Optimiza-
      • 2.1.4 Applications and Examples for Block Jacobi
    • 2.2 Local Convergence Analysis
    • 2.3 Discussion
  • 3 Refining Estimates of Invariant Subspaces
    • 3.1 Lower Unipotent Block Triangular Transformations
    • 3.2 Algorithms
      • 3.2.1 Main Ideas
      • 3.2.2 Formulation of the Algorithm
      • 3.2.3 Local Convergence Analysis
      • 3.2.4 Further Insight to Orderings
    • 3.3 Orthogonal Transformations
      • 3.3.1 The Algorithm
      • 3.3.2 Local Convergence Analysis
      • 3.3.3 Discussion and Outlook
    • eralizations 4 Rayleigh Quotient Iteration, QR-Algorithm, and Some Gen-
    • 4.1 Local Cubic Convergence of RQI
  • CONTENTS - QR-Algorithms 4.2 Parallel Rayleigh Quotient Iteration or Matrix-valued Shifted - 4.2.1 Discussion
    • 4.3 Local Convergence Properties of the Shifted QR-Algorithm

Chapter 1

Introduction

The interaction between numerical linear algebra and control theory has cru- cially influenced the development of numerical algorithms for linear systems in the past. Since the performance of a control system can often be mea- sured in terms of eigenvalues or singular values, matrix eigenvalue methods have become an important tool for the implementation of control algorithms. Standard numerical methods for eigenvalue or singular value computations are based on the QR-algorithm. However, there are a number of compu- tational problems in control and signal processing that are not amenable to standard numerical theory or cannot be easily solved using current numerical software packages. Various examples can be found in the digital filter design area. For instance, the task of finding sensitivity optimal realizations for finite word length implementations requires the solution of highly nonlinear optimization problems for which no standard numerical solution algorithms exist. There is thus the need for a new approach to the design of numerical algorithms that is flexible enough to be applicable to a wide range of com- putational problems as well as has the potential of leading to efficient and reliable solution methods. In fact, various tasks in linear algebra and system theory can be treated in a unified way as optimization problems of smooth functions on Lie groups and homogeneous spaces. In this way the powerful tools of differential geometry and Lie group theory become available to study such problems. Higher order local convergence properties of iterative matrix algorithms are in many instances proven by means of tricky estimates. E.g., the Jacobi method, essentially, is an optimization procedure. The idea behind the proof

At third, we will present a new shifted QR-type algorithm, which is some- how the true generalization of the Rayleigh Quotien Iteration (RQI) to a full symmetric matrix, in the sense, that not only one column (row) of the matrix converges cubically in norm, but the off-diagonal part as a whole. Rather than being a scalar, our shift is matrix valued. A prerequisite for studying this algorithm, called Parallel RQI, is a detailed local analysis of the classi- cal RQI itself. In addition, at the end of that chapter we discuss the local convergence properties of the shifted QR-algorithm. Our main result for this topic is that there cannot exist a smooth shift strategy ensuring quadratic convergence. In this thesis we do not answer questions on global convergence. The algorithms which are presented here are all locally smooth self mappings of manifolds with vanishing first derivative at a fixed point. A standard argu- ment using the mean value theorem then ensures that there exists an open neighborhood of that fixed point which is invariant under the iteration of the algorithm. Applying then the contraction theorem on the closed neigh- borhood ensures convergence to that fixed point and moreover that the fixed point is isolated. Most of the algorithms turn out to be discontinous far away from their fixed points but we will not go into this. I wish to thank my colleagues in W¨urzburg, Gunther Dirr, Martin Kleins- teuber, Jochen Trumpf, and Piere-Antoine Absil for many fruitful discussions we had. I am grateful to Paul Van Dooren, for his support and the discus- sions we had during my visits to Louvain. Particularly, I am grateful to Uwe Helmke. Our collaboration on many different areas of applied mathematics is still broadening.

Chapter 2

Jacobi-type Algorithms and

Cyclic Coordinate Descent

In this chapter we will discuss generalizations of the Jacobi algorithm well known from numerical linear algebra text books for the diagonalization of real symmetric matrices. We will relate this algorithm to socalled cyclic coordinate descent methods known to the optimization community. Under reasonable assumptions on the objective function to be minimized and on the step size selection rule to be considered, we will prove local quadratic convergence.

2.1 Algorithms

Suppose in an optimization problem we want to compute a local minimum of a smooth function

f : M → R, (2.1)

defined on a smooth n-dimensional manifold M. Let denote for each x ∈ M

{γ( 1 x ),... , γ n(x )} (2.2)

a family of mappings,

γ( i x): R → M,

γ( i x)(0) = x,

2.1 Algorithms 10

Thus x( ki )is recursively defined as the minimum of the smooth cost function f : M → R when restricted to the i-th curve

{Gi(t, x( ki −1)) | t ∈ R} ⊂ M.

The algorithm then consists of the iteration of sweeps.

Algorithm 2.2 (Jacobi-type Algorithm on n-dimensional Manifold).

  • Let x 0 ,... , xk ∈ M be given for k ∈ N 0.
  • Define the recursive sequence x(1) k ,... , x( kn ) as above (sweep).
  • Set xk+1 := x( kn ). Proceed with the next sweep.

2.1.2 Block Jacobi and Grouped Variable Cyclic Co-

ordinate Descent

A quite natural generalization of the Jacobi method is the following. In- stead of minimizing along predetermined curves, one might minimize over the manifold using more than just one parameter at each algorithmic step. Let denote

TxM = V 1 ( x)⊕ · · · ⊕ V (^) m(x ) (2.5)

a direct sum decomposition of the tangent space TxM at x ∈ M. We will not require the subspaces V (^) i( x), dim V (^) i( x)= li, to have equal dimension. Let denote

{γ( 1 x ),... , γ m(x )} (2.6)

a family of smooth mappings smoothly parameterized by x,

γ( i x): Rli^ → M,

γ i( x)(0) = x,

2.1 Algorithms 11

such that for all i = 1,... , m, for the image of the derivative

im D γ i( x)(0) = V (^) i( x) (2.8)

holds. Again we refer to

Gi : Rli^ × M → M,

Gi(t, x) := γ i( x)(t)

as the basic transformations. Analogously, to the one-dimensional case above, the proposed algorithm for minimizing a smooth function f : M → R then consists of a recursive application of socalled grouped variable sweep opera- tions. The algorithm is termed a Block Jacobi Algorithm.

Algorithm 2.3 (Block Jacobi Sweep).

Given an xk ∈ M. Define

x(1) k := G 1 (t(1) ∗ , xk)

x(2) k := G 2 (t(2) ∗ , x(1) k ) .. . x( km ):= Gm(t( ∗m ), x( km −1))

where for i = 1,... , m

t( ∗i ):= arg min t∈Rli (f (Gi(t, x( ki −1)))) if f (Gi(t, x( ki −1))) 6 ≡ f (x( ki −1))

and

t( ∗i ):= 0 otherwise.

Thus x( ki )is recursively defined as the minimum of the smooth cost function f : M → R when restricted to the i-th li-dimensional subset

2.1 Algorithms 13

Lue84, LT92]. Various tasks in linear algebra and system theory can be treated in a unified way as optimization problems of smooth functions on Lie groups and homogeneous spaces. In this way the powerful tools of differential geometry and Lie group theory become available to study such problems. With Brock- ett’s paper [Bro88] as the starting point there has been ongoing success in tackling difficult computational problems by geometric optimization meth- ods. We refer to [HM94] and the PhD theses [Smi93, Mah94, Deh95, H¨up96] for more systematic and comprehensive state of the art descriptions. Some of the further application areas where our methods are potentially useful include diverse topics such as frequency estimation, principal component analysis, perspective motion problems in computer vision, pose estimation, system approximation, model reduction, computation of canonical forms and feedback controllers, balanced realizations, Riccati equations, and structured eigenvalue problems. In the survey paper [HH97] a generalization of the classical Jacobi method for symmetric matrix diagonalization, see Jacobi [Jac46], is considered that is applicable to a wide range of computational problems. Jacobi-type methods have gained increasing interest, due to superior accuracy properties, [DV92], and inherent parallelism, [BL85, G¨ot94, Sam71], as compared to QR-based methods. The classical Jacobi method successively decreases the sum of squares of the off-diagonal elements of a given symmetric matrix to compute the eigenvalues. Similar extensions exist to compute eigenvalues or singular values of arbitrary matrices. Instead of using a special cost function such as the off-diagonal norm in Jacobi’s method, other classes of cost functions are feasible as well. In [HH97] a class of perfect Morse-Bott functions on homogeneous spaces is considered that are defined by unitarily invariant norm functions or by linear trace functions. In addition to gaining further generality this choice of functions leads to an elegant theory as well as yielding improved convergence properties for the resulting algorithms. Rather than trying to develop the Jacobi method in full generality on arbitrary homogeneous spaces in [HH97] its applicability by means of exam- ples from linear algebra and system theory is demonstrated. New classes of Jacobi-type methods for symmetric matrix diagonalization, balanced realiza- tion, and sensitivity optimization are obtained. In comparison with standard numerical methods for matrix diagonalization the new Jacobi-method has the advantage of achieving automatic sorting of the eigenvalues. This sorting

2.1 Algorithms 14

property is particularly important towards applications in signal processing; i.e., frequency estimation, estimation of dominant subspaces, independant component analysis, etc. Let G be a real reductive Lie group and K ⊂ G a maximal compact subgroup. Let α : G × V → V, (g, x) 7 → g · x (2.10)

be a linear algebraic action of G on a finite dimensional vector space V. Each orbit G·x of such a real algebraic group action then is a smooth submanifold of V that is diffeomorphic to the homogeneous space G/H, with H := {g ∈ G|g · x = x} the stabilizer subgroup. In [HH97] we are interested in understanding the structure of critical points of a smooth proper function f : G · x → R+ defined on orbits G · x. Some of the interesting cases actually arise when f is defined by a norm function on V. Thus given a positive definite inner product 〈 , 〉 on V let ‖x‖^2 = 〈x, x〉 denote the associated Hermitian norm. An Hermitian norm on V is called K−invariant if

〈k · x, k · y〉 = 〈x, y〉 (2.11)

holds for all x, y ∈ V and all k ∈ K, for K a maximal compact subgroup of G. Fix any such K−invariant Hermitian norm on V. For any x ∈ V we consider the smooth distance function on G · x defined as

φ : G·x → R+, φ(g·x) = ‖g·x‖^2. (2.12)

We then have the following result due to Kempf and Ness [KN79]. For an important generalization to plurisubharmonic functions on complex homoge- neous spaces, see Azad and Loeb [AL90].

Theorem 2.1. 1. The norm function φ : G ·x → R+, φ(g·x) = ‖g·x‖^2 , has a critical point if and only if the orbit G ·x is a closed subset of V.

  1. Let G · x be closed. Every critical point of φ : G · x → R+^ is a global minimum and the set of global minima is a single uniquely determined K−orbit.
  2. If G · x is closed, then φ : G · x → R+^ is a perfect Morse-Bott function. The set of global minima is connected. §

Theorem 2.1 completely characterizes the critical points of K−invariant Hermitian norm functions on G−orbits G·x of a reductive Lie group G. Similar

2.1 Algorithms 16

[GvL89, SHS72]. If the real symmetric matrix to be diagonalized has dis- tinct eigenvalues then the isospectral manifold of this matrix is diffeomorphic to the orthogonal group itself. Some advantages of the Jacobi-type method as compared to other optimization procedures one might see from the fol- lowing example. The symmetric eigenvalue problem might be considered as a constrained optimization task in a Euclidian vector space embedding the orthogonal group, cf. [Chu88, Chu91, Chu96, CD90], implying rela- tively complicated lifting and projection computations in each algorithmic step. Intrinsic gradient and Newton-type methods for the symmetric eigen- value problem were first and independently published in the Ph.D. theses [Smi93, Mah94]. The Jacobi approach, in contrast to the above- mentioned ones, uses predetermined directions to compute geodesics instead of direc- tions determined by the gradient of the function or by calculations of second derivatives. One should emphasize the simple calculability of such directions: the optimization is performed only along closed curves. The bottleneck of the gradient-based or Newton-type methods with their seemingly good conver- gence properties is generally caused by the explicit calculation of directions, the related geodesics, and possibly step size selections. The time required for these computations may amount to the same order of magnitude as the whole of the problem. For instance, the computation of the exponential of a dense skew-symmetric matrix is comparable to the effort of determining its eigenvalues. The advantage of optimizing along circles will become evident by the fact that the complete analysis of the restriction of the function to that closed curve is a problem of considerably smaller dimension and sometimes can be solved in closed form. For instance, for the real symmetric eigenvalue problem one has to solve only a quadratic. A whole class of further examples are developed in [Kle00] generalizing earlier results from [H¨up96]. There, generalizations of the conventional Ja- cobi algorithm to the problem of computing diagonalizations in compact Lie algebras are presented. We would like two mention two additional applications, namely, (i) the computation of signature symmetric balancing transformations, being an im- portant problem in systems and circuit theory, and (ii), the stereo matching problem without correspondence, having important applications in computer vision. The results referred to here are developed more detailed in [HHM02], respectively [HH98].

2.1 Algorithms 17

Signature Symmetric Balancing

From control theory it is well konwn that balanced realizations of symmet- ric transfer functions are signature symmetric. Wellknown algorithms, e.g., [LHPW87, SC89], however, do not preserve the signature symmetry and they may be sensible to numerical perturbations from the signature symmetric class. In recent years there is a tremendous interest in structure preserv- ing (matrix) algorithms. The main motivation for this is twofold. If such a method can be constructed it usually (i) leads to reduction in complexity and (ii) often coincidently avoids that in finite arithmetic physically mean- ingless results are obtained. Translated to our case that means that (i) as the appropriate state space transformation group the Lie group O pq+ of spe- cial pseudo-orthogonal transformations is used instead of GLn. Furthermore, (ii) at any stage of an algorithm the computed transformation should corre- spond to a signature symmetric realization if one would have started with one. Put into other words, the result of each iteration step should have some physical meaning. Let us very briefly review notions and results on balancing and signature symmetric realizations. Given any asymptotically stable linear system (A, B, C), the continuous-time controllability Gramian Wc and the observability Gramian Wo are defined, respectively, by

Wc =

∫^ ∞

0

etA^ BB′^ etA ′ d t,

Wo =

∫^ ∞

0

etA′ C′C etA^ d t.

Thus, assuming controllability and observability, the Gramians Wc, Wo are symmetric positive definite matrices. Moreover, a linear change of variables in the state space by an invertible state space coordinate transformation T leads to the co- and contravariant transformation law of Gramians as

(Wc, Wo) 7 →

T WcT ′, (T ′)−^1 WoT −^1

Let p, q ∈ N 0 be integers with p + q = n, Ipq := diag(Ip, −Iq). A realization (A, B, C) ∈ Rn×n^ × Rn×m^ × Rm×n^ is called signature symmetric if

2.1 Algorithms 19

This choice is motivated by our previous work on balanced realizations [HH00], where we studied the smooth function tr (N (Wc +Wo)) with diagonal positive definite N having distinct eigenvalues. Now

tr (N (Wc + Wo)) = tr (N (Wc + IpqWcIpq)) = 2tr (N Wc)

by the above choice of a diagonal N. The following result summarizes the basic properties of the cost function fN.

Theorem 2.3. Let N := diag (μ 1 ,... , μp, ν 1 ,... , νq) with 0 < μ 1 < · · · < μp and 0 < ν 1 < · · · < νq. For the smooth cost function fN : M (Σ) → R, defined by fN (W ) := tr (N W ), the following holds true.

  1. fN : M (Σ) → R has compact sublevel sets and a minimum of fN exists.
  2. X ∈ M (Σ) is a critical point for fN : M (Σ) → R if and only if X is diagonal.
  3. The global minimum is unique and it is characterized by X = diag (σ 1 , .. ., σn), where σ 1 > · · · > σp and σp+1 > · · · > σn holds.
  4. The Hessian of the function fN at a critical point is nondegenerate. §

The constraint set for our cost function fN : M (Σ) → R is the Lie group O p,q+ with Lie algebra opq. We choose a basis of opq as

Ωij := ej e′ i − eie′ j (2.23)

where 1 ≤ i < j ≤ p or p + 1 ≤ i < j ≤ n holds and

Ωkl := ele′ k + eke′ l (2.24)

where 1 ≤ k ≤ p < l ≤ n holds. These basis elements are defined via the standard basis vectors e 1 ,... , en of Rn. Thus exp(tΩij ) is an orthogonal rotation with (i, j)−th sub matrix

[ cos t − sin t sin t cos t

]

2.1 Algorithms 20

and exp(tΩkl) is a hyperbolic rotation with (k, l)−th sub matrix

[ cosh t sinh t sinh t cosh t

]

Let N as in Theorem 2.3 above and let W be symmetric positive definite. Consider the smooth function

φ : R → R,

φ(t) := tr

N etΩ^ W etΩ

′ )^ (2.27)

where Ω denotes a fixed element of the above basis of opq. We have

Lemma 2.1. 1. For Ω = Ωkl = (Ωkl)′^ as in (2.24) the function φ : R → R defined by (2.27) is proper and bounded from below.

  1. A minimum tΩ := arg min t∈R φ(t) ∈ R (2.28)

exists for all Ω = Ωij = −(Ωij )′^ where 1 ≤ i < j ≤ p or p + 1 ≤ i < j ≤ n holds, and exists as well for all Ω = Ωkl = (Ωkl)′^ where 1 ≤ k ≤ p < l ≤ n holds. §

In [HHM02] the details are figured out. Moreover, a Jacobi method is presented for which local quadratic convergence is shown.

A Problem From Computer Vision

The Lie group G under consideration is the semidirect product G = R n R^2. Here G acts linearly on the projective space RP 2. A Jacobi-type method is formulated to minimize a smooth cost function f : M → R. Consider the Lie algebra

g :=

B =

b 1 b 2 b 3 0 0 0 0 0 0

 (^) ; b 1 , b 2 , b 3 ∈ R