Numerical Linear Algebra: Tensor Decomposition (CSE 494, CBS 598, Fall 2007), Study notes of Algorithms and Programming

These notes introduce tensors and tensor decomposition in the context of numerical linear algebra for data exploration. Tensors are multi-dimensional arrays that can represent data organized according to more than two categories. The notes focus on three-dimensional arrays and discuss basic tensor concepts, the higher-order singular value decomposition (HOSVD), and rank-(R1, …, RN) tensor factorization. They also cover the unfolding and folding operations and their relationship to matrix multiplication.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

CSE 494, CSE/CBS 598 (Fall 2007): Numerical Linear Algebra for Data Exploration - Tensor Decomposition

Instructor: Jieping Ye

1 Introduction

  • So far we have focused on vectors and matrices, which can be thought of as one-dimensional and two-dimensional arrays of data, respectively. For instance, in a term-document matrix, each element is associated with one term and one document.
  • In many applications, data are organized according to more than two categories. The corresponding mathematical objects are usually referred to as tensors, and the area of mathematics dealing with tensors is multilinear algebra.
    • Lieven De Lathauwer, Bart De Moor, and Joos Vandewalle. A Multilinear Singular Value Decomposition. SIAM Journal on Matrix Analysis and Applications, 2000.
  • In this course, we focus on three-dimensional arrays (third-order tensors).

1.1 Examples

  • In the classification of handwritten digits, the training set is a collection of images, manually classified into 10 classes. Each class is a set of digits of one kind, which can be considered as a tensor. If each digit is represented as a 16-by-16 matrix of grey-scale values, then a set of n digits can be organized as a tensor $A \in \mathbb{R}^{16 \times 16 \times n}$.
  • A video can be represented as a tensor $A \in \mathbb{R}^{r \times c \times n}$, where r and c are the numbers of rows and columns of each image in the video, respectively, and n is the length of the image sequence in the video.
  • In computer graphics, the appearance of rendered surfaces is determined by a complex interaction of multiple factors related to scene geometry, illumination, and imaging.
    • M.A.O. Vasilescu and D. Terzopoulos. TensorTextures: Multilinear Image-Based Rendering. SIGGRAPH 2004.
    • H. Wang et al. Out-of-Core Tensor Approximation of Multi-Dimensional Matrices of Visual Data. SIGGRAPH 2005.
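As a small illustration of the first two examples, the sketch below stacks a handful of equally sized images into a third-order array with NumPy. The helper name `stack_images` and the random stand-in images are assumptions made for illustration; they are not part of the course material.

```python
import numpy as np

def stack_images(images):
    """Stack n equally sized r x c images into an r x c x n array."""
    # axis=2 makes the image index the third mode, as in R^{16 x 16 x n}.
    return np.stack(images, axis=2)

# Hypothetical data: five random 16 x 16 "digits" standing in for real images.
rng = np.random.default_rng(0)
digits = [rng.random((16, 16)) for _ in range(5)]

A = stack_images(digits)
print(A.shape)                               # (16, 16, 5)
print(np.array_equal(A[:, :, 3], digits[3])) # True: slice k is image k
```

The same call would organize the frames of a video into an r × c × n array, with n the number of frames.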

2 Basic Tensor Concepts

  • We refer to a tensor $A \in \mathbb{R}^{l \times m \times n}$ as a 3-mode array, i.e., the different dimensions of the array are called modes. The dimensions of a tensor $A \in \mathbb{R}^{l \times m \times n}$ are l, m, and n. In this terminology, a matrix is a 2-mode array.
  • The inner product of two tensors A and B is defined as

$$\langle A, B \rangle = \sum_{i,j,k} a_{ijk}\, b_{ijk}.$$

  • The corresponding norm is

$$\|A\|_F = \langle A, A \rangle^{1/2} = \Big( \sum_{i,j,k} a_{ijk}^2 \Big)^{1/2}.$$
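The inner product and norm defined above can be sketched in a few lines of NumPy. The function names `inner` and `fro_norm` are illustrative choices, not names used in the notes.

```python
import numpy as np

def inner(A, B):
    """Tensor inner product: sum over all entries of a_ijk * b_ijk."""
    return float(np.sum(A * B))

def fro_norm(A):
    """Norm induced by the inner product: <A, A>^(1/2)."""
    return float(np.sqrt(inner(A, A)))

A = np.arange(24, dtype=float).reshape(2, 3, 4)  # entries 0, 1, ..., 23
B = np.ones((2, 3, 4))

print(inner(A, B))                                  # 276.0 = 0 + 1 + ... + 23
print(np.isclose(fro_norm(A), np.linalg.norm(A)))   # True
```

For an array of any order, `np.linalg.norm` with default arguments computes the same quantity, so the helper is mainly there to mirror the definition.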

  • Define i-mode multiplication of a tensor by a matrix.
    • The 1-mode product of a tensor $A \in \mathbb{R}^{l \times m \times n}$ by a matrix $U \in \mathbb{R}^{l_0 \times l}$, denoted by $A \times_1 U$, is an $l_0 \times m \times n$ tensor whose entries are given by

$$(A \times_1 U)(j, i_2, i_3) = \sum_{k=1}^{l} u_{j,k}\, a_{k,i_2,i_3}.$$

  • For comparison, consider the matrix multiplication $A \times_1 U = UA$, where

$$(UA)(i, j) = \sum_{k=1}^{l} u_{i,k}\, a_{k,j}.$$

  • Recall that matrix multiplication is equivalent to multiplying each column of A by the matrix U. The same is true for tensor-matrix multiplication: in the 1-mode product, all column vectors in the 3-mode array are multiplied by the matrix U.
  • Similarly, the 2-mode multiplication of a tensor by a matrix V is given by

$$(A \times_2 V)(i_1, j, i_3) = \sum_{k=1}^{m} v_{j,k}\, a_{i_1,k,i_3}.$$

Note that the 2-mode multiplication of a matrix by V is equivalent to matrix multiplication by $V^T$ from the right: $A \times_2 V = A V^T$.
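The mode products above can be sketched with `np.tensordot`. The function name `mode_product` and its 1-based `mode` argument are assumptions chosen to match the notation in the notes. The last two lines check the matrix special cases $A \times_1 U = UA$ and $A \times_2 V = AV^T$.

```python
import numpy as np

def mode_product(A, U, mode):
    """Return A x_mode U, where `mode` is 1-based as in the notes."""
    axis = mode - 1
    # Sum over k: U[j, k] * A[..., k, ...], with k running along `axis`.
    out = np.tensordot(U, A, axes=(1, axis))
    # tensordot places the new index j first; move it back to `axis`.
    return np.moveaxis(out, 0, axis)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))   # l = 3, m = 4, n = 5
U = rng.standard_normal((2, 3))      # l0 = 2
V = rng.standard_normal((6, 4))

P1 = mode_product(A, U, 1)           # shape (2, 4, 5)
P2 = mode_product(A, V, 2)           # shape (3, 6, 5)

# Matrix special cases from the notes.
M = rng.standard_normal((3, 4))
print(np.allclose(mode_product(M, U, 1), U @ M))     # True
print(np.allclose(mode_product(M, V, 2), M @ V.T))   # True
```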

  • It is common to unfold a tensor into a matrix.
    • Unfolding along mode i produces a matrix, denoted $A_{(i)}$, whose first mode (the row index) is mode i of the tensor; the other modes are handled cyclically.
    • For instance, row i of $A_{(j)}$ contains all the elements of A whose j-th index is equal to i.
    • Example: Let $B \in \mathbb{R}^{3 \times 3 \times 3}$ be a tensor, defined in MATLAB by its three slices B(:, :, 1), B(:, :, 2), and B(:, :, 3). Unfolding along the third mode then gives the matrix $B_{(3)}$. [The numerical entries of the slices and of $B_{(3)}$ are missing from this copy.]

4 Rank-$(R_1, \dots, R_N)$ Tensor Factorization

  • Given an Nth-order tensor $A \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, the rank-$(R_1, \dots, R_N)$ factorization of A is formulated as finding a lower-rank tensor $\tilde{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with $\operatorname{rank}_n(\tilde{A}) = R_n \le \operatorname{rank}_n(A)$ for all n, such that the following least-squares cost function is minimized:

$$\tilde{A} = \arg\min_{\hat{A}} \|A - \hat{A}\|. \tag{1}$$

More specifically, $\tilde{A}$ can be expressed as follows:

$$\tilde{A} = C \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)}, \tag{2}$$

where $U^{(n)} \in \mathbb{R}^{I_n \times R_n}$ has orthonormal columns for $n = 1, \dots, N$.

  • When $R_n$ is much smaller than $I_n$ for all n, the core tensor C and the basis matrices $\{U^{(n)}\}_{n=1}^{N}$ give a compact representation of the original tensor A, resulting in data compression.
  • Given the basis matrices $\{U^{(n)}\}_{n=1}^{N}$, the core tensor C can be readily computed as $C = A \times_1 (U^{(1)})^T \cdots \times_N (U^{(N)})^T$. Thus, the optimization problem focuses on the computation of the basis matrices only.
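To make the compression claim concrete, the sketch below computes a core tensor from given basis matrices and counts stored entries. The tensor size, the ranks, and the random orthonormal bases (obtained from a QR factorization) are arbitrary assumptions for illustration, as are the helper names; nothing here optimizes the bases.

```python
import numpy as np

def mode_product(A, U, mode):
    """A x_mode U with a 1-based mode index."""
    out = np.tensordot(U, A, axes=(1, mode - 1))
    return np.moveaxis(out, 0, mode - 1)

def core_tensor(A, bases):
    """C = A x_1 (U1)^T x_2 (U2)^T ... x_N (UN)^T."""
    C = A
    for n, U in enumerate(bases, start=1):
        C = mode_product(C, U.T, n)
    return C

def storage(shape, ranks):
    """Entries stored by the full tensor vs. core plus basis matrices."""
    full = int(np.prod(shape))
    compact = int(np.prod(ranks)) + sum(I * R for I, R in zip(shape, ranks))
    return full, compact

shape, ranks = (20, 30, 40), (2, 3, 4)          # hypothetical sizes
rng = np.random.default_rng(0)
A = rng.standard_normal(shape)
# Arbitrary bases with orthonormal columns (Q factor of a random matrix).
bases = [np.linalg.qr(rng.standard_normal((I, R)))[0]
         for I, R in zip(shape, ranks)]

C = core_tensor(A, bases)
print(C.shape)                 # (2, 3, 4)
print(storage(shape, ranks))   # (24000, 314)
```

With these sizes the core and the three basis matrices hold 314 numbers in place of 24000, which is the sense in which the representation is compact.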

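  • Expanding the squared cost gives

$$\|A - \tilde{A}\|^2 = \|A\|^2 - 2\langle A, \tilde{A} \rangle + \|\tilde{A}\|^2.$$

Based on the definition of the inner product, we have

$$\langle A, \tilde{A} \rangle = \langle A,\; C \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)} \rangle = \langle A \times_1 (U^{(1)})^T \times_2 (U^{(2)})^T \cdots \times_N (U^{(N)})^T,\; C \rangle = \langle C, C \rangle = \|C\|^2.$$

  • Since $U^{(n)}$ has orthonormal columns for all n, the basis matrices do not affect the norm: $\|\tilde{A}\|^2 = \|C\|^2$. Thus,

$$\|A - \tilde{A}\|^2 = \|A\|^2 - 2\|C\|^2 + \|C\|^2 = \|A\|^2 - \|C\|^2.$$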
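The error identity above holds for any basis matrices with orthonormal columns, provided the core is computed as $C = A \times_1 (U^{(1)})^T \cdots \times_N (U^{(N)})^T$. The sketch below checks it numerically on a random tensor; the sizes, the random bases, and the helper names are assumptions for illustration only.

```python
import numpy as np

def mode_product(A, U, mode):
    """A x_mode U with a 1-based mode index."""
    out = np.tensordot(U, A, axes=(1, mode - 1))
    return np.moveaxis(out, 0, mode - 1)

def multi_mode_product(T, matrices):
    """T x_1 M1 x_2 M2 ... x_N MN."""
    for n, M in enumerate(matrices, start=1):
        T = mode_product(T, M, n)
    return T

shape, ranks = (6, 7, 8), (2, 3, 4)             # hypothetical sizes
rng = np.random.default_rng(1)
A = rng.standard_normal(shape)
bases = [np.linalg.qr(rng.standard_normal((I, R)))[0]
         for I, R in zip(shape, ranks)]

C = multi_mode_product(A, [U.T for U in bases])  # core tensor
A_tilde = multi_mode_product(C, bases)           # approximation, Eq. (2)

lhs = np.linalg.norm(A - A_tilde) ** 2
rhs = np.linalg.norm(A) ** 2 - np.linalg.norm(C) ** 2
print(np.isclose(lhs, rhs))                                         # True
print(np.isclose(np.linalg.norm(A_tilde), np.linalg.norm(C)))       # True
```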

  • The best rank-$(R_1, \dots, R_N)$ tensor approximation can therefore be computed by maximizing $\|A \times_1 (U^{(1)})^T \cdots \times_N (U^{(N)})^T\|$ over basis matrices with orthonormal columns.
  • An iterative approach can be applied for the computation. Each iterative step optimizes only one of the basis matrices, while keeping the other N − 1 basis matrices fixed. With $U^{(1)}, \dots, U^{(n-1)}, U^{(n+1)}, \dots, U^{(N)}$ fixed, we first project A onto the $(R_1, \dots, R_{n-1}, R_{n+1}, \dots, R_N)$-dimensional space as follows:

$$V^n = A \times_1 (U^{(1)})^T \cdots \times_{n-1} (U^{(n-1)})^T \times_{n+1} (U^{(n+1)})^T \cdots \times_N (U^{(N)})^T.$$

  • Then, $U^{(n)}$ is given by the first $R_n$ columns of the left singular matrix of $V^n_{(n)}$, the mode-n unfolding of $V^n$, which consists of all n-th mode vectors of $V^n$. The least-squares cost in Eq. (1) decreases monotonically during the iteration.
    • In HOSVD, $U^{(n)}$ is given by the first $R_n$ columns of the left singular matrix of $A_{(n)}$.
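The alternating scheme above can be sketched as follows. Starting the iteration from the HOSVD bases is an assumption; the notes do not specify an initialization. The function names, the fixed number of sweeps, and the test sizes are likewise illustrative. The unfolding here uses a simple axis move, since the column order does not affect the left singular vectors.

```python
import numpy as np

def mode_product(A, U, mode):
    """A x_mode U with a 1-based mode index."""
    out = np.tensordot(U, A, axes=(1, mode - 1))
    return np.moveaxis(out, 0, mode - 1)

def unfold(A, mode):
    """Mode-`mode` unfolding (column order is irrelevant for the SVD)."""
    return np.moveaxis(A, mode - 1, 0).reshape(A.shape[mode - 1], -1)

def leading_left_singular_vectors(M, r):
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def project(A, bases, skip=None):
    """Multiply A by (U^(n))^T in every mode except `skip`."""
    T = A
    for n, U in enumerate(bases, start=1):
        if n != skip:
            T = mode_product(T, U.T, n)
    return T

def hosvd_bases(A, ranks):
    """HOSVD: U^(n) from the leading left singular vectors of A_(n)."""
    return [leading_left_singular_vectors(unfold(A, n), r)
            for n, r in enumerate(ranks, start=1)]

def alternating_iteration(A, ranks, sweeps=10):
    bases = hosvd_bases(A, ranks)          # assumed initialization
    errors = []
    for _ in range(sweeps):
        for n, r in enumerate(ranks, start=1):
            V = project(A, bases, skip=n)  # V^n in the notes
            bases[n - 1] = leading_left_singular_vectors(unfold(V, n), r)
        C = project(A, bases)              # core tensor for current bases
        # Squared cost via ||A - A~||^2 = ||A||^2 - ||C||^2.
        errors.append(np.linalg.norm(A) ** 2 - np.linalg.norm(C) ** 2)
    return bases, errors

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 7, 8))         # hypothetical data
ranks = (2, 3, 4)

C0 = project(A, hosvd_bases(A, ranks))
hosvd_error = np.linalg.norm(A) ** 2 - np.linalg.norm(C0) ** 2
bases, errors = alternating_iteration(A, ranks)
print(errors[-1] <= hosvd_error + 1e-9)    # True: no worse than HOSVD
```

Each update maximizes the norm of the core with the other bases held fixed, so the recorded squared errors should not increase from one sweep to the next.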