Matrix Algebra and Calculus: A Comprehensive Study, Study notes of Civil Engineering

Evangelische Theologische Faculteit, Leuven Civil Engineering

An introduction to matrix algebra and calculus, including definitions, notations, and properties of matrices, vector spaces, and matrix operations. It covers topics such as matrix multiplication, transpose, inverse, determinant, rank, and partitioned matrices. The document also includes propositions and proofs related to matrix differentiation and the relationship between the transpose and inverse of a matrix.

Typology: Study notes

2021/2022

Uploaded on 08/05/2022

jacqueline_nel 🇧🇪

Matrix Differentiation

( and some other stuff )

Randal J. Barnes

Department of Civil Engineering, University of Minnesota

Minneapolis, Minnesota, USA

1 Introduction

Throughout this presentation I have chosen to use a symbolic matrix notation. This choice was not made lightly. I am a strong advocate of index notation, when appropriate. For example, index notation greatly simplifies the presentation and manipulation of differential geometry. As a rule of thumb, if your work is going to primarily involve differentiation with respect to the spatial coordinates, then index notation is almost surely the appropriate choice.

In the present case, however, I will be manipulating large systems of equations in which the matrix calculus is relatively simple while the matrix algebra and matrix arithmetic are messy and more involved. Thus, I have chosen to use symbolic notation.

2 Notation and Nomenclature

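Definition 1 Let $a_{ij} \in \mathbb{R}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Then the ordered rectangular array

$$
\mathbf{A} =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\tag{1}
$$

is said to be a real matrix of dimension $m \times n$.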

When writing a matrix I will occasionally write down its typical element as well as its dimension. Thus,

$$
\mathbf{A} = [a_{ij}], \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n,
\tag{2}
$$

denotes a matrix with $m$ rows and $n$ columns, whose typical element is $a_{ij}$. Note, the first subscript locates the row in which the typical element lies while the second subscript locates the column. For example, $a_{jk}$ denotes the element lying in the $j$th row and $k$th column of the matrix $\mathbf{A}$.

Definition 2 A vector is a matrix with only one column. Thus, all vectors are inherently column vectors.

Convention 1 Multi-column matrices are denoted by boldface uppercase letters: for example, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{X}$. Vectors (single-column matrices) are denoted by boldface lowercase letters: for example, $\mathbf{a}$, $\mathbf{b}$, $\mathbf{x}$. I will attempt to use letters from the beginning of the alphabet to designate known matrices, and letters from the end of the alphabet for unknown or variable matrices.


Convention 2 When it is useful to explicitly attach the matrix dimensions to the symbolic notation, I will use an underscript. For example, $\underset{m \times n}{\mathbf{A}}$ indicates a known, multi-column matrix with $m$ rows and $n$ columns.

A superscript $T$ denotes the matrix transpose operation; for example, $\mathbf{A}^T$ denotes the transpose of $\mathbf{A}$. Similarly, if $\mathbf{A}$ has an inverse it will be denoted by $\mathbf{A}^{-1}$. The determinant of $\mathbf{A}$ will be denoted by either $|\mathbf{A}|$ or $\det(\mathbf{A})$. Similarly, the rank of a matrix $\mathbf{A}$ is denoted by $\operatorname{rank}(\mathbf{A})$. An identity matrix will be denoted by $\mathbf{I}$, and $\mathbf{0}$ will denote a null matrix.

3 Matrix Multiplication

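Definition 3 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{B}$ be $n \times p$, and let the product $\mathbf{AB}$ be

$$
\mathbf{C} = \mathbf{A}\mathbf{B}
\tag{3}
$$

then $\mathbf{C}$ is an $m \times p$ matrix, with element $(i, j)$ given by

$$
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\tag{4}
$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, p$.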
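As an illustration that is not part of the original notes, Definition 3 can be transcribed almost literally into code. The sketch below assumes matrices are stored as plain Python lists of rows; the function name `matmul` and the sample matrices are hypothetical choices made only for this example.

```python
def matmul(a, b):
    """Return C = AB for matrices stored as lists of rows."""
    m, n, p = len(a), len(b), len(b[0])
    if any(len(row) != n for row in a):
        raise ValueError("inner dimensions must agree: A is m x n, B is n x p")
    # c[i][j] is the sum over k of a[i][k] * b[k][j], as in the definition
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]             # 3 x 2
C = matmul(A, B)         # 2 x 2
print(C)
```

The dimension check mirrors the requirement that the number of columns of the first factor equal the number of rows of the second.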

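Proposition 1 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{x}$ be $n \times 1$, then the typical element of the product

$$
\mathbf{z} = \mathbf{A}\mathbf{x}
\tag{5}
$$

is given by

$$
z_i = \sum_{k=1}^{n} a_{ik} x_k
\tag{6}
$$

for all $i = 1, 2, \ldots, m$. Similarly, let $\mathbf{y}$ be $m \times 1$, then the typical element of the product

$$
\mathbf{z}^T = \mathbf{y}^T \mathbf{A}
\tag{7}
$$

is given by

$$
z_i = \sum_{k=1}^{m} a_{ki} y_k
\tag{8}
$$

for all $i = 1, 2, \ldots, n$. Finally, the scalar resulting from the product

$$
\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x}
\tag{9}
$$

is given by

$$
\alpha = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk} y_j x_k
\tag{10}
$$

Proof: These are merely direct applications of Definition 3. q.e.d.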
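A small numerical sketch, added here for illustration only, shows the three sums of Proposition 1 side by side. The matrix and vectors are hypothetical sample values; the point is the index ranges, with the sum for $\mathbf{y}^T \mathbf{A}$ running over the $m$ rows of $\mathbf{A}$.

```python
A = [[1, 2, 3],
     [4, 5, 6]]    # m = 2 rows, n = 3 columns
x = [1, 0, -1]     # n-element vector
y = [2, 1]         # m-element vector
m, n = len(A), len(A[0])

# z = A x has m elements; each is a sum over the n columns of A
Ax = [sum(A[i][k] * x[k] for k in range(n)) for i in range(m)]

# z^T = y^T A has n elements; each is a sum over the m rows of A
yTA = [sum(A[k][i] * y[k] for k in range(m)) for i in range(n)]

# The scalar y^T A x written as a double sum over rows j and columns k
alpha = sum(A[j][k] * y[j] * x[k] for j in range(m) for k in range(n))

print(Ax, yTA, alpha)
```

The same scalar is obtained from either intermediate product, that is, from $\mathbf{y}^T(\mathbf{A}\mathbf{x})$ or from $(\mathbf{y}^T\mathbf{A})\mathbf{x}$.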

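Proposition 4 Let $\mathbf{A}$ be a square, nonsingular matrix of order $m$. Partition $\mathbf{A}$ as

$$
\mathbf{A} =
\begin{bmatrix}
\mathbf{A}_{11} & \mathbf{A}_{12} \\
\mathbf{A}_{21} & \mathbf{A}_{22}
\end{bmatrix}
\tag{20}
$$

so that $\mathbf{A}_{11}$ is a nonsingular matrix of order $m_1$, $\mathbf{A}_{22}$ is a nonsingular matrix of order $m_2$, and $m_1 + m_2 = m$. Then

$$
\mathbf{A}^{-1} =
\begin{bmatrix}
\left( \mathbf{A}_{11} - \mathbf{A}_{12} \mathbf{A}_{22}^{-1} \mathbf{A}_{21} \right)^{-1}
& -\mathbf{A}_{11}^{-1} \mathbf{A}_{12} \left( \mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \right)^{-1} \\
-\mathbf{A}_{22}^{-1} \mathbf{A}_{21} \left( \mathbf{A}_{11} - \mathbf{A}_{12} \mathbf{A}_{22}^{-1} \mathbf{A}_{21} \right)^{-1}
& \left( \mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \right)^{-1}
\end{bmatrix}
\tag{21}
$$

Proof: Direct multiplication of the proposed $\mathbf{A}^{-1}$ and $\mathbf{A}$ yields

$$
\mathbf{A}^{-1} \mathbf{A} = \mathbf{I}
\tag{22}
$$

q.e.d.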
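As a quick sanity check that is not part of the original notes, Proposition 4 can be worked by hand for the smallest possible case, $m_1 = m_2 = 1$, where every block is a scalar. The $2 \times 2$ matrix below is a hypothetical example chosen for easy arithmetic.

```latex
\begin{align*}
\mathbf{A} &= \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix},
\qquad A_{11} = 2,\; A_{12} = 1,\; A_{21} = 1,\; A_{22} = 3 \\[4pt]
A_{11} - A_{12} A_{22}^{-1} A_{21} &= 2 - \tfrac{1}{3} = \tfrac{5}{3},
\qquad
A_{22} - A_{21} A_{11}^{-1} A_{12} = 3 - \tfrac{1}{2} = \tfrac{5}{2} \\[4pt]
\mathbf{A}^{-1} &=
\begin{bmatrix}
\tfrac{3}{5} & -\tfrac{1}{2} \cdot \tfrac{2}{5} \\[2pt]
-\tfrac{1}{3} \cdot \tfrac{3}{5} & \tfrac{2}{5}
\end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 3 & -1 \\ -1 & 2 \end{bmatrix}
\end{align*}
```

This agrees with the familiar adjugate formula for a $2 \times 2$ inverse, since $\det(\mathbf{A}) = 5$.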

5 Matrix Differentiation

In the following discussion I will differentiate matrix quantities with respect to the elements of the referenced matrices. Although no new concept is required to carry out such operations, the element-by-element calculations involve cumbersome manipulations and, thus, it is useful to derive the necessary results and have them readily available.²

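Convention 3 Let

$$
\mathbf{y} = \psi(\mathbf{x}),
\tag{23}
$$

where $\mathbf{y}$ is an $m$-element vector, and $\mathbf{x}$ is an $n$-element vector. The symbol

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} =
\begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\
\dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} & \cdots & \dfrac{\partial y_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial y_m}{\partial x_1} & \dfrac{\partial y_m}{\partial x_2} & \cdots & \dfrac{\partial y_m}{\partial x_n}
\end{bmatrix}
\tag{24}
$$

will denote the $m \times n$ matrix of first-order partial derivatives of the transformation from $\mathbf{x}$ to $\mathbf{y}$. Such a matrix is called the Jacobian matrix of the transformation $\psi()$.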

Notice that if $\mathbf{x}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is an $m \times 1$ matrix; that is, a single column (a vector). On the other hand, if $\mathbf{y}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is a $1 \times n$ matrix; that is, a single row (the transpose of a vector).
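To make the layout of Convention 3 concrete, the following sketch (an illustration, not part of the original notes) approximates a Jacobian by central differences. The helper `jacobian`, the step size `h`, and the example map `psi` are all assumptions introduced for this example; the result has one row per element of $\mathbf{y}$ and one column per element of $\mathbf{x}$.

```python
import math


def jacobian(psi, x, h=1e-6):
    """Approximate the m x n matrix [dy_i/dx_j] by central differences."""
    n = len(x)
    columns = []
    for j in range(n):
        x_plus = list(x)
        x_plus[j] += h
        x_minus = list(x)
        x_minus[j] -= h
        y_plus, y_minus = psi(x_plus), psi(x_minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(y_plus, y_minus)])
    m = len(columns[0])
    # columns[j][i] holds dy_i/dx_j; transpose so that rows index y
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def psi(x):
    """Hypothetical example map from R^2 to R^3."""
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]


J = jacobian(psi, [1.0, 2.0])   # 3 rows (one per y_i), 2 columns (one per x_j)
for row in J:
    print(row)
```

For this particular `psi` the exact Jacobian at $(1, 2)$ has rows $(2, 1)$, $(\cos 1, 0)$ and $(0, 4)$, which the finite-difference values should match to within the approximation error.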

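² Much of the material in this section is extracted directly from Dhrymes (1978, Section 4.3). The interested reader is directed to this worthy reference to find additional results.

Proposition 5 Let

$$
\mathbf{y} = \mathbf{A}\mathbf{x}
\tag{25}
$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A}
\tag{26}
$$

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$
y_i = \sum_{k=1}^{n} a_{ik} x_k
\tag{27}
$$

it follows that

$$
\frac{\partial y_i}{\partial x_j} = a_{ij}
\tag{28}
$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Hence

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A}
\tag{29}
$$

q.e.d.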

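Proposition 6 Let

$$
\mathbf{y} = \mathbf{A}\mathbf{x}
\tag{30}
$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, as in Proposition 5. Suppose that $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ is independent of $\mathbf{z}$. Then

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{31}
$$

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$
y_i = \sum_{k=1}^{n} a_{ik} x_k
\tag{32}
$$

for all $i = 1, 2, \ldots, m$, it follows that

$$
\frac{\partial y_i}{\partial z_j} = \sum_{k=1}^{n} a_{ik} \frac{\partial x_k}{\partial z_j}
\tag{33}
$$

but the right hand side of the above is simply element $(i, j)$ of $\mathbf{A} \dfrac{\partial \mathbf{x}}{\partial \mathbf{z}}$. Hence

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{z}}
= \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
= \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{34}
$$

q.e.d.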
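The chain rule of Proposition 6 can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not part of the original notes: the helpers `jacobian` and `matmul`, the matrix `A`, and the map `x_of_z` are hypothetical. It compares a finite-difference estimate of $\partial \mathbf{y} / \partial \mathbf{z}$ with the product $\mathbf{A} \, \partial \mathbf{x} / \partial \mathbf{z}$.

```python
def jacobian(f, z, h=1e-6):
    """Central-difference approximation of the matrix [df_i/dz_j]."""
    cols = []
    for j in range(len(z)):
        z_plus = list(z)
        z_plus[j] += h
        z_minus = list(z)
        z_minus[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(f(z_plus), f(z_minus))])
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]                 # 2 x 3, independent of z


def x_of_z(z):
    """Hypothetical x(z) mapping R^2 to R^3."""
    return [z[0] ** 2, z[0] * z[1], z[1]]


def y_of_z(z):
    """y = A x(z)."""
    x = x_of_z(z)
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in A]


z0 = [0.5, -1.5]
lhs = jacobian(y_of_z, z0)                 # dy/dz, 2 x 2
rhs = matmul(A, jacobian(x_of_z, z0))      # A (dx/dz), 2 x 2
max_gap = max(abs(l - r) for lr, rr in zip(lhs, rhs) for l, r in zip(lr, rr))
print(max_gap)
```

The two sides should agree up to finite-difference and rounding error, so `max_gap` is expected to be very small rather than exactly zero.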

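Proposition 7 Let the scalar $\alpha$ be defined by

$$
\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x}
\tag{35}
$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ is independent of $\mathbf{x}$ and $\mathbf{y}$, then

$$
\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{y}^T \mathbf{A}
\tag{36}
$$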

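Proposition 9 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$
\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x}
\tag{48}
$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$
\frac{\partial \alpha}{\partial \mathbf{x}} = 2 \mathbf{x}^T \mathbf{A}
\tag{49}
$$

Proof: This is an obvious application of Proposition 8. q.e.d.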
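A hand-worked case, added for illustration and not taken from the original notes, shows Proposition 9 at work. The symmetric $2 \times 2$ matrix is a hypothetical example; expanding the quadratic form and differentiating term by term gives the same row vector as $2 \mathbf{x}^T \mathbf{A}$.

```latex
\begin{align*}
\mathbf{A} &= \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix},
\qquad
\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x} = 2 x_1^2 + 2 x_1 x_2 + 3 x_2^2 \\[4pt]
\frac{\partial \alpha}{\partial \mathbf{x}}
&= \begin{bmatrix} 4 x_1 + 2 x_2 & 2 x_1 + 6 x_2 \end{bmatrix}
 = 2 \begin{bmatrix} x_1 & x_2 \end{bmatrix}
   \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}
 = 2 \mathbf{x}^T \mathbf{A}
\end{align*}
```

Note that the result is a $1 \times n$ row, consistent with the Jacobian layout of Convention 3 for a scalar function of a vector.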

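Proposition 10 Let the scalar $\alpha$ be defined by

$$
\alpha = \mathbf{y}^T \mathbf{x}
\tag{50}
$$

where $\mathbf{y}$ is $n \times 1$, $\mathbf{x}$ is $n \times 1$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$. Then

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{51}
$$

Proof: We have

$$
\alpha = \sum_{j=1}^{n} x_j y_j
\tag{52}
$$

Differentiating with respect to the $k$th element of $\mathbf{z}$ we have

$$
\frac{\partial \alpha}{\partial z_k}
= \sum_{j=1}^{n} \left( x_j \frac{\partial y_j}{\partial z_k} + y_j \frac{\partial x_j}{\partial z_k} \right)
\tag{53}
$$

for every element $z_k$ of $\mathbf{z}$, and consequently,

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
= \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{54}
$$

q.e.d.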
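A minimal worked example, offered as an illustration rather than as part of the original notes, confirms the product rule of Proposition 10. It assumes the simplest setting in which $\mathbf{z}$ has a single element $z$; the particular choices of $\mathbf{x}(z)$ and $\mathbf{y}(z)$ are hypothetical.

```latex
\begin{align*}
\mathbf{x} &= \begin{bmatrix} z \\ z^2 \end{bmatrix},
\qquad
\mathbf{y} = \begin{bmatrix} 1 \\ z \end{bmatrix},
\qquad
\alpha = \mathbf{y}^T \mathbf{x} = z + z^3,
\qquad
\frac{d\alpha}{dz} = 1 + 3 z^2 \\[4pt]
\mathbf{x}^T \frac{\partial \mathbf{y}}{\partial z}
+ \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial z}
&= \begin{bmatrix} z & z^2 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix}
 + \begin{bmatrix} 1 & z \end{bmatrix} \begin{bmatrix} 1 \\ 2z \end{bmatrix}
 = z^2 + \left( 1 + 2 z^2 \right)
 = 1 + 3 z^2
\end{align*}
```

Both routes give the same derivative, as the proposition requires.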

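Proposition 11 Let the scalar $\alpha$ be defined by

$$
\alpha = \mathbf{x}^T \mathbf{x}
\tag{55}
$$

where $\mathbf{x}$ is $n \times 1$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$. Then

$$
\frac{\partial \alpha}{\partial \mathbf{z}} = 2 \mathbf{x}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{56}
$$

Proof: This is an obvious application of Proposition 10. q.e.d.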

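Proposition 12 Let the scalar $\alpha$ be defined by

$$
\alpha = \mathbf{y}^T \mathbf{A} \mathbf{x}
\tag{57}
$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{58}
$$

Proof: Define

$$
\mathbf{w}^T = \mathbf{y}^T \mathbf{A}
\tag{59}
$$

and note that

$$
\alpha = \mathbf{w}^T \mathbf{x}
\tag{60}
$$

Applying Proposition 10 we have

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \mathbf{x}^T \frac{\partial \mathbf{w}}{\partial \mathbf{z}}
+ \mathbf{w}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{61}
$$

Substituting back in for $\mathbf{w}$ we arrive at

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
= \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}}
+ \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{62}
$$

q.e.d.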

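Proposition 13 Let the scalar $\alpha$ be defined by the quadratic form

$$
\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x}
\tag{63}
$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$
\frac{\partial \alpha}{\partial \mathbf{z}}
= \mathbf{x}^T \left( \mathbf{A} + \mathbf{A}^T \right) \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{64}
$$

Proof: This is an obvious application of Proposition 12. q.e.d.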

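Proposition 14 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$
\alpha = \mathbf{x}^T \mathbf{A} \mathbf{x}
\tag{65}
$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$
\frac{\partial \alpha}{\partial \mathbf{z}} = 2 \mathbf{x}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}
\tag{66}
$$

Proof: This is an obvious application of Proposition 13. q.e.d.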
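The quadratic-form result of Proposition 13 can also be checked numerically. The sketch below is illustrative only and not part of the original notes: the helper `jacobian`, the deliberately non-symmetric matrix `A`, and the map `x_of_z` are hypothetical choices. It compares a finite-difference derivative of $\alpha$ with the row vector $\mathbf{x}^T (\mathbf{A} + \mathbf{A}^T) \, \partial \mathbf{x} / \partial \mathbf{z}$.

```python
def jacobian(f, z, h=1e-6):
    """Central-difference approximation of the matrix [df_i/dz_j]."""
    cols = []
    for j in range(len(z)):
        z_plus = list(z)
        z_plus[j] += h
        z_minus = list(z)
        z_minus[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(f(z_plus), f(z_minus))])
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


A = [[1.0, 2.0],
     [0.0, 3.0]]          # deliberately not symmetric


def x_of_z(z):
    """Hypothetical x(z) mapping R^2 to R^2."""
    return [z[0] * z[1], z[0] + z[1] ** 2]


def alpha_of_z(z):
    """The quadratic form x^T A x, returned as a one-element list."""
    x = x_of_z(z)
    return [sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))]


z0 = [1.0, 2.0]
x0 = x_of_z(z0)
dx_dz = jacobian(x_of_z, z0)                       # 2 x 2

# Row vector x^T (A + A^T)
row = [sum(x0[i] * (A[i][j] + A[j][i]) for i in range(2)) for j in range(2)]
# Right-hand side of Proposition 13: x^T (A + A^T) dx/dz, a 1 x 2 row
rhs = [sum(row[k] * dx_dz[k][j] for k in range(2)) for j in range(2)]
# Left-hand side: d(alpha)/dz obtained by differencing alpha directly
lhs = jacobian(alpha_of_z, z0)[0]
print(lhs, rhs)
```

When `A` is replaced by a symmetric matrix, `A[i][j] + A[j][i]` reduces to `2 * A[i][j]`, which is the special case stated in Proposition 14.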

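Definition 5 Let $\mathbf{A}$ be an $m \times n$ matrix whose elements are functions of the scalar parameter $\alpha$. Then the derivative of the matrix $\mathbf{A}$ with respect to the scalar parameter $\alpha$ is the $m \times n$ matrix of element-by-element derivatives:

$$
\frac{\partial \mathbf{A}}{\partial \alpha} =
\begin{bmatrix}
\dfrac{\partial a_{11}}{\partial \alpha} & \dfrac{\partial a_{12}}{\partial \alpha} & \cdots & \dfrac{\partial a_{1n}}{\partial \alpha} \\
\dfrac{\partial a_{21}}{\partial \alpha} & \dfrac{\partial a_{22}}{\partial \alpha} & \cdots & \dfrac{\partial a_{2n}}{\partial \alpha} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial a_{m1}}{\partial \alpha} & \dfrac{\partial a_{m2}}{\partial \alpha} & \cdots & \dfrac{\partial a_{mn}}{\partial \alpha}
\end{bmatrix}
\tag{67}
$$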

Proposition 15 Let $\mathbf{A}$ be a nonsingular, $m \times m$ matrix whose elements are functions of the scalar parameter $\alpha$. Then

$$
\frac{\partial \mathbf{A}^{-1}}{\partial \alpha}
= -\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} \mathbf{A}^{-1}
\tag{68}
$$
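As a closing illustration that is not part of the original notes, Proposition 15 can be checked for a small matrix whose entries depend on a scalar parameter. Everything in the sketch is an assumption made for the example: the parameter is called `t` (standing in for $\alpha$), and `A`, `dA`, `inv2` and `matmul` are hypothetical helpers. The finite-difference derivative of the inverse is compared with $-\mathbf{A}^{-1} (\partial \mathbf{A} / \partial \alpha) \mathbf{A}^{-1}$.

```python
def inv2(a):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def A(t):
    """Hypothetical matrix whose elements depend on the scalar t."""
    return [[2.0 + t, t],
            [t * t, 3.0]]


def dA(t):
    """Element-by-element derivative of A with respect to t."""
    return [[1.0, 1.0],
            [2.0 * t, 0.0]]


t0, h = 0.5, 1e-6
A_inv = inv2(A(t0))

# Right-hand side of Proposition 15: -A^{-1} (dA/dt) A^{-1}
rhs = [[-v for v in row] for row in matmul(matmul(A_inv, dA(t0)), A_inv)]

# Left-hand side: central difference of the inverse itself
inv_plus, inv_minus = inv2(A(t0 + h)), inv2(A(t0 - h))
lhs = [[(inv_plus[i][j] - inv_minus[i][j]) / (2 * h) for j in range(2)]
       for i in range(2)]

max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The order of the factors matters: the derivative of $\mathbf{A}$ sits between two copies of $\mathbf{A}^{-1}$, and the sketch multiplies them in that order.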