Introduction to Deep Learning

Jesús Fernández-Villaverde¹ and Galo Nuño²

September 1, 2022

¹ University of Pennsylvania

² Banco de España


The problem

We want to approximate (“learn”) an unknown function:

$$y = f(x)$$

where $y$ is a scalar and $x = \{x_0 = 1, x_1, x_2, \ldots, x_N\}$ is a vector (including a constant).

We care about the case when $N$ is large (possibly in the thousands!).
Easy to extend to the case where $y$ is a vector (e.g., a probability distribution), but notation becomes cumbersome.
In economics, $f(x)$ can be a value function, a policy function, a pricing kernel, a conditional expectation, a classifier, ...

Flow representation

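[Figure: a single neuron. Inputs $x_0, x_1, x_2, \ldots, x_n$ are multiplied by weights $\theta_0, \theta_1, \theta_2, \ldots, \theta_n$ and combined in a linear transformation $\sum_{i=0}^{n} \theta_i x_i$, which passes through an activation function to produce the output.]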
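The flow above can be sketched in a few lines of NumPy. This is only an illustration: the function name `neuron`, the choice of `tanh` as the activation, and the numerical values are assumptions, not part of the original slides.

```python
import numpy as np

def neuron(x, theta, activation=np.tanh):
    """One artificial neuron: a weighted sum of the inputs, then an activation."""
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    z = theta @ x              # linear transformation: sum_i theta_i * x_i
    return activation(z)       # activation applied to the weighted sum

# x[0] = 1 plays the role of the constant, so theta[0] acts as an intercept.
x = np.array([1.0, 0.5, -2.0])          # illustrative inputs (assumed values)
theta = np.array([0.1, 0.4, 0.2])       # illustrative weights (assumed values)
output = neuron(x, theta)
```

Here the weighted sum is 0.1 + 0.2 - 0.4 = -0.1, so `output` equals `tanh(-0.1)`.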

Intuition

Intuition 1: A biological interpretation, but I do not find it too useful. Closer to econometrics (e.g., NOLS, semiparametric regression, and sieves) and differential geometry.
Intuition 2: We look for representations of the features of the data that are informationally efficient.
Intuition 3 (more advanced): We look for translations and rotations of the data that deliver a more convenient geometry by moving from a parent space to a simpler one.

Comparison with other approximations

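Compare:

$$f(x) \cong g^{NN}(x; \theta) = \theta_0 + \sum_{m=1}^{M} \theta_m \phi\left(\sum_{n=0}^{N} \theta_{n,m} x_n\right)$$

with a standard projection:

$$f(x) \cong g^{CP}(x; \theta) = \theta_0 + \sum_{m=1}^{M} \theta_m \phi_m(x)$$

where $\phi_m$ is, for example, a Chebyshev polynomial.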

We exchange the rich parameterization of coefficients for the parsimony of basis functions.
How we determine the coefficients will also be different, but this is somewhat less important.
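To make the comparison concrete, the following sketch evaluates both forms and counts their coefficients. It assumes `tanh` as the activation, a scalar input for the Chebyshev projection, and arbitrary random weights; the names `g_nn`, `g_cp`, `theta_in`, and `theta_out` are illustrative and do not come from the slides.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

def g_nn(x, theta0, theta_out, theta_in, phi=np.tanh):
    """theta0 + sum_m theta_out[m] * phi(sum_n theta_in[n, m] * x[n])."""
    hidden = phi(x @ theta_in)   # M basis functions, each with its own inner weights
    return theta0 + hidden @ theta_out

def g_cp(x, theta0, theta):
    """theta0 + sum_{m=1..M} theta[m-1] * T_m(x), with T_m the Chebyshev polynomials."""
    coefficients = np.concatenate(([theta0], theta))  # T_0(x) = 1 carries theta0
    return cheb.chebval(x, coefficients)

N, M = 3, 5                                   # assumed sizes, for illustration only
rng = np.random.default_rng(0)
x = np.concatenate(([1.0], rng.uniform(-1, 1, size=N)))  # x_0 = 1 is the constant
theta_in = rng.normal(size=(N + 1, M))
theta_out = rng.normal(size=M)

y_nn = g_nn(x, 0.0, theta_out, theta_in)
y_cp = g_cp(0.5, 1.0, np.array([2.0, 3.0]))   # 1 + 2*T_1(0.5) + 3*T_2(0.5), here M = 2

n_params_nn = 1 + M + (N + 1) * M   # outer intercept, outer weights, inner weights
n_params_cp = 1 + M                 # one coefficient per fixed basis function
```

With these sizes the network carries 26 coefficients against 6 for the projection: the basis function is the same simple $\phi$ everywhere, and the flexibility comes from the inner weights.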

Why do neural networks “work”?

Neural networks consist entirely of chains of tensor operations: we take x, we perform affine transformations, and apply an activation function.
Thus, these tensor operations are geometric transformations of x. In fact, a better name for neural networks could be chained geometric transformations.
In other words: a neural network is a complex geometric transformation in a high-dimensional space.
Deep neural networks look for convenient geometrical representations of high-dimensional manifolds.
The key to success in any function approximation problem is finding the right geometric space in which to perform it, not finding a “better” basis function.
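A small numerical check can illustrate why the activation function matters in this chain of transformations. The sketch below, with made-up weights and `tanh` as an assumed activation, shows that two chained affine maps collapse into a single affine map, whereas inserting an activation between them yields a map that is no longer affine.

```python
import numpy as np

# Illustrative (assumed) weights for two affine maps: R^2 -> R^3 -> R^2.
W1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.1, -0.2, 0.3])
W2 = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])
b2 = np.array([0.5, -1.0])

def affine(W, b, x):
    return W @ x + b

def chain_linear(x):
    """Two affine maps with no activation in between."""
    return affine(W2, b2, affine(W1, b1, x))

def chain_nonlinear(x):
    """The same two maps with a tanh activation in between."""
    return affine(W2, b2, np.tanh(affine(W1, b1, x)))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
zero = np.zeros(2)

# Without an activation, the chain equals one affine map with W = W2 W1, b = W2 b1 + b2.
collapses = np.allclose(chain_linear(x), affine(W2 @ W1, W2 @ b1 + b2, x))

# Any affine map f satisfies f(x + y) + f(0) = f(x) + f(y); the tanh chain does not.
still_affine = np.allclose(
    chain_nonlinear(x + y) + chain_nonlinear(zero),
    chain_nonlinear(x) + chain_nonlinear(y),
)
```

Here `collapses` is `True` and `still_affine` is `False`: the nonlinearity is what lets the chain bend the space rather than merely shift, rotate, and scale it.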

Deep learning, I

A deep learning network is an acyclic multilayer composition of J > 1 neural networks:

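$$z_m^0 = \theta_{0,m}^0 + \sum_{n=1}^{N} \theta_{n,m}^0 x_n$$

and

$$z_m^1 = \theta_{0,m}^1 + \sum_{m=1}^{M^{(1)}} \theta_m^1 \phi^1\left(z_m^0\right)$$

$$\vdots$$

$$y \cong g^{DL}(x; \theta) = \theta_0^J + \sum_{m=1}^{M^{(J)}} \theta_m^J \phi^J\left(z_m^{J-1}\right)$$

where the $M^{(1)}, M^{(2)}, \ldots$ and $\phi^1(\cdot), \phi^2(\cdot), \ldots$ are possibly different across each layer of the network.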

A deep network creates new features by composing older features.

[Figure: a deep network. Input values $x_0, x_1, x_2$ enter the input layer and pass through hidden layer 1 and hidden layer 2 to the output layer.]
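The layered composition can be sketched as a short forward pass. This is a minimal illustration under stated assumptions: the same `tanh` activation in every hidden layer, randomly drawn weights, intercepts stored separately from the weights, and hypothetical helper names `init_layers` and `forward`. It is not the authors' implementation.

```python
import numpy as np

def init_layers(sizes, rng):
    """Random weights and zero intercepts for consecutive layer sizes, e.g. [3, 4, 4, 1]."""
    return [
        (rng.normal(scale=0.5, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(x, layers, phi=np.tanh):
    """Compose the layers: each hidden layer builds new features from the previous ones."""
    h = np.asarray(x, dtype=float)
    for W, b in layers[:-1]:
        h = phi(W @ h + b)       # affine map z, then activation phi(z)
    W, b = layers[-1]
    return W @ h + b             # output: intercept plus weighted sum of phi(z^{J-1})

rng = np.random.default_rng(0)
layers = init_layers([3, 4, 4, 1], rng)   # 3 inputs, two hidden layers of 4, scalar output
y = forward(np.array([0.2, -1.0, 0.7]), layers)
```

Changing the list passed to `init_layers` changes the depth and the width of each layer, which mirrors the remark that the number of nodes can differ across layers.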

Why do deep neural networks “work” better?

Why do we want to introduce hidden layers?
1. It works! Evolution of ImageNet winners.
2. The number of representations increases exponentially with the number of hidden layers while computational cost grows linearly.
3. Intuition: hidden layers induce highly nonlinear behavior in the joint creation of representations without the need to have domain knowledge (used, in other algorithms, in some form of greedy pre-processing).
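The linear growth in cost can be illustrated by counting coefficients. The sketch below uses the parameter count of a fully connected network with a fixed width as a rough proxy for computational cost, which is an assumption, and it does not attempt to demonstrate the exponential growth in representations. The function name and the sizes are illustrative.

```python
def count_parameters(n_inputs, width, n_hidden_layers):
    """Weights and intercepts of a fully connected network with a scalar output."""
    first = (n_inputs + 1) * width                          # inputs -> first hidden layer
    between = (n_hidden_layers - 1) * (width + 1) * width   # hidden -> hidden
    last = width + 1                                        # last hidden layer -> output
    return first + between + last

# Assumed sizes: 10 inputs, 32 nodes per hidden layer, 1 to 5 hidden layers.
counts = [count_parameters(n_inputs=10, width=32, n_hidden_layers=k) for k in range(1, 6)]
increments = [b - a for a, b in zip(counts, counts[1:])]
```

Each additional hidden layer adds the same number of coefficients, (32 + 1) × 32 = 1056 in this example, so the count grows linearly with depth.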

Further advantages

Neural networks and deep learning often require less “inside knowledge” from experts in the area.
While results can be highly counter-intuitive, deep neural networks deliver excellent performance.
Outstanding open-source libraries (TensorFlow, Keras, PyTorch, Flux) that integrate well with easy scripting languages (Python).
Newer algorithms: batch normalization, residual connections, and depthwise separable convolutions.
More recently, the development of dedicated hardware (TPUs, AI accelerators, FPGAs) is likely to maintain an edge for the area.
The richness of an ecosystem is key for its long-run success.
