Matrix Differential Calculus for Optimization: Notes from 10-725 Course, Fall 2012, Exams of Calculus

Hogeschool voor Wetenschap & Kunst Calculus

These notes cover matrix differential calculus, including matrix differentials, the chain rule, the product rule, and useful identities. They also show how to find a maximum or minimum of a scalar or matrix function by setting the coefficient of dX to zero. Examples are provided for Infomax Independent Component Analysis (ICA) and Newton's method.

Typology: Exams

2021/2022

Uploaded on 08/05/2022

dirk88 🇧🇪


Matrix differential calculus

10-725 Optimization

Geoff Gordon

Ryan Tibshirani


Geoff Gordon—10-725 Optimization—Fall 2012

Review

Matrix differentials: sol’n to matrix calculus pain

‣ compact way of writing Taylor expansions, or …

‣ definition:

‣ df = a(x; dx) [+ r(dx)]

‣ a(x; .) linear in 2nd arg

‣ r(dx)/||dx|| → 0 as dx → 0

d(…) is linear: passes thru +, scalar *

Generalizes Jacobian, Hessian, gradient, velocity
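
The definition above can be checked numerically. The sketch below is an illustration, not part of the original slides: it assumes the example function f(X) = ln |det X|, whose linear part is the standard identity a(X; dX) = tr(X^{-1} dX), and verifies that the remainder r(dX) shrinks faster than ||dX||. The matrices X and D are arbitrary values chosen for the demonstration.

```python
import numpy as np

def f(X):
    # Scalar function of a matrix: f(X) = ln |det X|
    return np.linalg.slogdet(X)[1]

def linear_part(X, dX):
    # a(X; dX) = tr(X^{-1} dX), linear in its second argument
    return np.trace(np.linalg.solve(X, dX))

# Arbitrary well-conditioned matrix and unit-norm direction (illustrative values)
X = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
D = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0],
              [3.0, 0.0, 1.0]])
D /= np.linalg.norm(D)

ratios = []
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    dX = eps * D
    r = f(X + dX) - f(X) - linear_part(X, dX)   # remainder r(dX)
    ratios.append(abs(r) / np.linalg.norm(dX))  # should tend to 0
    print(f"eps={eps:g}  |r|/||dX|| = {ratios[-1]:.3e}")
```

The printed ratio should fall roughly in proportion to eps, which is what r(dx)/||dx|| → 0 requires.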


Finding a maximum

or minimum, or saddle point


ID for df(x) | scalar x  | vector x    | matrix X
scalar f     | df = a dx | df = a^T dx | df = tr(A^T dX)
vector f     | df = a dx | df = A dx   |
matrix F     | dF = A dx |             |
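
As an illustration of the scalar f, matrix X entry, the sketch below uses an assumed example that is not from the slides: f(X) = ½ ||XB − C||_F², for which df = tr(A^T dX) with A = (XB − C) B^T. Setting the coefficient A to zero gives the stationary point X = C B^T (B B^T)^{-1}. The matrices B and C are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 8))   # hypothetical data
C = rng.standard_normal((2, 8))   # hypothetical targets

def f(X):
    # f(X) = 0.5 * ||X B - C||_F^2
    R = X @ B - C
    return 0.5 * np.sum(R * R)

def A_of(X):
    # Coefficient matrix identified from df = tr(A^T dX)
    return (X @ B - C) @ B.T

# Check the identification against a small finite perturbation
X = rng.standard_normal((2, 3))
dX = 1e-6 * rng.standard_normal((2, 3))
predicted = np.trace(A_of(X).T @ dX)
actual = f(X + dX) - f(X)
print(predicted, actual)

# Stationary point: A = 0  <=>  X (B B^T) = C B^T
X_star = np.linalg.solve(B @ B.T, B @ C.T).T
print(np.abs(A_of(X_star)).max())   # close to zero
```

Because this f is convex, the stationary point found this way is a minimum; in general the same step only locates a maximum, minimum, or saddle point.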


Ex: Infomax ICA

Training examples x_i ∈ ℝ^d, i = 1:n

Transformation y_i = g(W x_i)

‣ W ∈ ℝ^{d×d}

‣ g(z) =

Want:

[Figure: panels labeled x_i, W x_i, and y_i]


Volume rule


Gradient

L = ∑_i ln |det J_i|

y_i = g(W x_i)

dy_i = J_i dx_i


Gradient

J_i = diag(u_i) W

dJ_i = diag(u_i) dW + diag(v_i) diag(dW x_i) W

dL =
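
The differential of J_i can be sanity-checked numerically. The sketch below rests on assumptions that the extracted slide text does not state: u_i = g'(W x_i), v_i = g''(W x_i), and g taken to be the logistic sigmoid. It also uses the standard identity d ln |det J| = tr(J^{-1} dJ), which under these assumptions gives dL = tr(G^T dW) with G = n W^{-T} + ∑_i (1 − 2 y_i) x_i^T. All data are random placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def jacobian(W, x):
    # J = diag(u) W with u = g'(W x) (assumed definition of u)
    y = sigmoid(W @ x)
    return np.diag(y * (1.0 - y)) @ W

def dJ_formula(W, x, dW):
    # dJ = diag(u) dW + diag(v) diag(dW x) W with v = g''(W x) (assumed)
    y = sigmoid(W @ x)
    u = y * (1.0 - y)
    v = u * (1.0 - 2.0 * y)
    return np.diag(u) @ dW + np.diag(v) @ np.diag(dW @ x) @ W

def L(W, xs):
    # L = sum_i ln |det J_i|
    return sum(np.linalg.slogdet(jacobian(W, x))[1] for x in xs)

def grad_L(W, xs):
    # G such that dL = tr(G^T dW); v/u = 1 - 2y for the logistic g
    G = len(xs) * np.linalg.inv(W).T
    for x in xs:
        G += np.outer(1.0 - 2.0 * sigmoid(W @ x), x)
    return G

rng = np.random.default_rng(1)
d, n = 3, 5
W = np.eye(d) + 0.1 * rng.standard_normal((d, d))
xs = rng.standard_normal((n, d))
dW = 1e-6 * rng.standard_normal((d, d))

# Largest entrywise gap between the true change in J and the formula
dJ_err = max(
    np.abs(jacobian(W + dW, x) - jacobian(W, x) - dJ_formula(W, x, dW)).max()
    for x in xs
)
dL_pred = np.trace(grad_L(W, xs).T @ dW)
dL_actual = L(W + dW, xs) - L(W, xs)
print(dJ_err, dL_pred, dL_actual)
```

Both gaps should be second order in dW, so they are far smaller than the first-order changes themselves.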


ICA natural gradient

[W^{-T} + C] W^T W =

start with W_0 = I

[Figure: panels labeled W x_i and y_i]
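
A minimal sketch of an update in the direction [W^{-T} + C] W^T W, starting from W_0 = I, is given below. Several pieces are assumptions, not taken from the slides: g is the logistic sigmoid, C = (1/n) ∑_i (1 − 2 y_i) x_i^T, the objective is averaged over the n examples, and the step size, iteration count, mixing matrix, and Laplace-distributed sources are arbitrary illustrative choices. Since W^{-T} W^T W = W, the direction is computed as W + C W^T W, which avoids a matrix inverse.

```python
import numpy as np

def objective(W, X):
    # (1/n) sum_i ln |det J_i|, J_i = diag(g'(W x_i)) W, logistic g (assumed)
    Z = X @ W.T
    # ln g'(z) = -|z| - 2 ln(1 + exp(-|z|)), written to avoid overflow
    log_u = -np.abs(Z) - 2.0 * np.log1p(np.exp(-np.abs(Z)))
    return np.linalg.slogdet(W)[1] + log_u.sum(axis=1).mean()

def natural_gradient_step(W, X, lr):
    n = X.shape[0]
    Z = X @ W.T                      # rows are W x_i
    Y = 1.0 / (1.0 + np.exp(-Z))     # rows are y_i
    # C W^T = (1/n) sum_i (1 - 2 y_i) (W x_i)^T, so direction = W + (C W^T) W
    direction = W + ((1.0 - 2.0 * Y).T @ Z / n) @ W
    return W + lr * direction

# Synthetic data: two independent Laplace sources, linearly mixed (illustrative)
rng = np.random.default_rng(0)
S = rng.laplace(size=(2000, 2))
A_mix = np.array([[1.0, 0.5],
                  [0.3, 1.0]])      # hypothetical mixing matrix
X = S @ A_mix.T

W = np.eye(2)                        # start with W_0 = I
before = objective(W, X)
for _ in range(300):
    W = natural_gradient_step(W, X, lr=0.05)
after = objective(W, X)
print(before, after)
print(W @ A_mix)   # compare against a scaled permutation matrix
```

The direction is an ascent direction for the objective because the inner product of the gradient G with G W^T W equals ||G W^T||_F², which is nonnegative.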


ICA on natural image patches


More info

Minka's cheat sheet:

‣ http://research.microsoft.com/en-us/um/people/minka/papers/matrix/

Magnus & Neudecker. Matrix Differential Calculus. Wiley, 1999. 2nd ed.

‣ http://www.amazon.com/Differential-Calculus-Applications-Statistics-Econometrics/dp/047198633X

Bell & Sejnowski. An information-maximization approach to blind separation and blind deconvolution. Neural Computation, v7, 1995.


Nonlinear equations

x ∈ ℝ^d

f: ℝ^d → ℝ^d, diff'ble

‣ solve:

Taylor:

‣ J:

Newton:
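
The formulas on this slide did not survive extraction, so the sketch below uses the standard textbook form of Newton's method as an assumption: solve f(x) = 0 by linearizing f(x + dx) ≈ f(x) + J dx and stepping x ← x − J^{-1} f(x), where J is the Jacobian of f. The test system, starting point, tolerance, and function names are illustrative choices.

```python
import numpy as np

def newton(f, jac, x0, tol=1e-12, max_iter=50):
    """Newton's method for f(x) = 0 with f: R^d -> R^d (standard form)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        # Solve J dx = f(x) instead of forming the inverse explicitly
        x = x - np.linalg.solve(jac(x), fx)
    return x

# Illustrative system: circle of radius 2 intersected with the line x0 = x1
def f(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])

def jac(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

root = newton(f, jac, x0=[2.0, 1.0])
print(root)   # approaches (sqrt(2), sqrt(2)) from this starting point
```

The first step from (2, 1) lands on (1.5, 1.5), and later iterates stay on the line x0 = x1 while closing in on the root.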


Error analysis