Deep Learning and Reinforcement Learning, Lecture notes of Algorithms and Programming

University of Pennsylvania (UPenn), Algorithms and Programming

An introduction to deep learning and reinforcement learning. The notes compare the approximation of a standard projection with that of a deep learning network, explain why neural networks are a good solution method in economics and how they can efficiently approximate complex functions, and discuss AlphaGo, its surprising strategies, and its neural network training pipeline and architecture.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023

Deep learning and reinforcement learning

Jesús Fernández-Villaverde¹ and Galo Nuño²

October 15, 2021

¹University of Pennsylvania

²Banco de España

A short introduction

A neural network

An artificial neural network (a.k.a. ANN or connectionist system) is an approximation to $f(x)$ built as a linear combination of $M$ generalized linear models of $x$ of the form:

$$y \cong g^{NN}(x; \theta) = \theta_0 + \sum_{m=1}^{M} \theta_m \phi(z_m)$$

where $\phi(\cdot)$ is an arbitrary activation function and:

$$z_m = \theta_{0,m} + \sum_{n=1}^{N} \theta_{n,m} x_n$$

$M$ is known as the width of the model.
We can select $\theta$ such that $g^{NN}(x; \theta)$ is as close to $f(x)$ as possible given some relevant metric (e.g., the $L^2$ norm).
This is known as “training” the network.
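
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `neural_network` and `squared_error`, the argument layout (`theta_in[m][n]` holding $\theta_{n,m}$), and the choice of `tanh` as the default activation are all assumptions made for the example.

```python
import math
from typing import Callable, Sequence


def neural_network(
    x: Sequence[float],
    theta0: float,
    theta_out: Sequence[float],
    theta_bias: Sequence[float],
    theta_in: Sequence[Sequence[float]],
    phi: Callable[[float], float] = math.tanh,  # assumed activation
) -> float:
    """Evaluate g^NN(x; theta) = theta0 + sum_m theta_out[m] * phi(z_m)."""
    total = theta0
    for m in range(len(theta_out)):  # len(theta_out) is the width M
        # z_m = theta_{0,m} + sum_n theta_{n,m} * x_n
        z_m = theta_bias[m] + sum(w * x_n for w, x_n in zip(theta_in[m], x))
        total += theta_out[m] * phi(z_m)
    return total


def squared_error(
    points: Sequence[Sequence[float]],
    targets: Sequence[float],
    theta0: float,
    theta_out: Sequence[float],
    theta_bias: Sequence[float],
    theta_in: Sequence[Sequence[float]],
) -> float:
    """Sum of squared errors between the network and the targets."""
    return sum(
        (neural_network(x, theta0, theta_out, theta_bias, theta_in) - y) ** 2
        for x, y in zip(points, targets)
    )
```

Training, in this sketch, would mean searching over `theta0`, `theta_out`, `theta_bias`, and `theta_in` to make `squared_error` small on sample points of $f$.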

Comparison with other approximations

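Compare:

$$y \cong g^{NN}(x; \theta) = \theta_0 + \sum_{m=1}^{M} \theta_m \phi\left(\theta_{0,m} + \sum_{n=1}^{N} \theta_{n,m} x_n\right)$$

with a standard projection:

$$y \cong g^{CP}(x; \theta) = \theta_0 + \sum_{m=1}^{M} \theta_m \phi_m(x)$$

where $\phi_m$ is, for example, a Chebyshev polynomial.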

The neural network uses a single, simple activation function but parameterizes its argument richly: we accept many more coefficients in exchange for parsimony in the basis functions.
Later, we will explain why this is often a good idea.
How we determine the coefficients will also be different, but this is somewhat less important.
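
For contrast with the network form, the standard projection can be sketched as follows. This is an illustrative sketch that assumes a scalar $x$ in $[-1, 1]$ and Chebyshev polynomials of the first kind, generated by the recurrence $T_{m+1}(x) = 2x\,T_m(x) - T_{m-1}(x)$; the names `chebyshev_basis` and `chebyshev_projection` are hypothetical.

```python
from typing import List, Sequence


def chebyshev_basis(x: float, order: int) -> List[float]:
    """Return [T_0(x), ..., T_order(x)] using T_{m+1} = 2x T_m - T_{m-1}."""
    values = [1.0, x]
    for _ in range(2, order + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: order + 1]


def chebyshev_projection(x: float, theta: Sequence[float]) -> float:
    """Evaluate g^CP(x; theta) = theta[0] + sum_{m=1}^{M} theta[m] * T_m(x)."""
    basis = chebyshev_basis(x, len(theta) - 1)
    # theta[0] is the constant; the remaining coefficients weight T_1, ..., T_M.
    return theta[0] + sum(t * b for t, b in zip(theta[1:], basis[1:]))
```

Here every basis function $\phi_m$ is different and fixed in advance, and only the $M + 1$ outer coefficients are free. In the network form the same $\phi$ is reused $M$ times, and the inner coefficients $\theta_{n,m}$ are free as well.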

Why are neural networks a good solution method in economics?

From now on, I will use “neural networks” to refer to both single-layer and multilayer networks.
With suitable choices of activation functions, neural networks can efficiently approximate extremely complex functions.
In particular, under certain (relatively weak) conditions:
1. Neural networks are universal approximators.
2. Neural networks break the “curse of dimensionality.”
Furthermore, neural networks are easy to code, stable, and scalable for multiprocessing.
Thus, neural networks have considerable option value as solution methods in economics.

Current interest

Currently, neural networks are among the most active areas of research in computer science and applied math.
While the original idea goes back to the 1940s, neural networks were rediscovered in the second half of the 2000s.
Why?
1. Suddenly, the large computational power and data sets required to train the networks efficiently became available at a reasonable cost.
2. New algorithms such as backpropagation through gradient descent became popular.
There have also been some well-known successes and industrial applications.

AlphaGo

Big splash: AlphaGo vs. Lee Sedol in March 2016.
Silver et al. (2018): now applied to chess, shogi, Go, and StarCraft II.
Check also:
1. https://deepmind.com/research/alphago/
2. https://www.alphagomovie.com/
3. https://deepmind.com/blog/article/alphastar-mastering-real-time-strategy-game-starcraft-ii
Very different from Deep Blue against Kasparov.
New and surprising strategies.
However, you need to keep this accomplishment in perspective.

Figure 1 | Neural network training pipeline and architecture. a, A fast rollout policy $p_\pi$ and supervised learning (SL) policy network $p_\sigma$ are trained to predict human expert moves in a data set of positions. A reinforcement learning (RL) policy network $p_\rho$ is initialized to the SL policy network, and is then improved by policy gradient learning to maximize the outcome (that is, winning more games) against previous versions of the policy network. A new data set is generated by playing games of self-play with the RL policy network. Finally, a value network $v_\theta$ is trained by regression to predict the expected outcome (that is, whether the current player wins) in positions from the self-play data set. b, Schematic representation of the neural network architecture used in AlphaGo. The policy network takes a representation of the board position $s$ as its input, passes it through many convolutional layers with parameters $\sigma$ (SL policy network) or $\rho$ (RL policy network), and outputs a probability distribution $p_\sigma(a \mid s)$ or $p_\rho(a \mid s)$ over legal moves $a$, represented by a probability map over the board. The value network similarly uses many convolutional layers with parameters $\theta$, but outputs a scalar value $v_\theta(s')$ that predicts the expected outcome in position $s'$.

[Figure panels: (a) training pipeline running from human expert positions to self-play positions, with the rollout policy, SL policy network, RL policy network, and value network linked by classification, policy gradient, self-play, and regression steps; (b) architecture of the policy network and the value network.]
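
To make "a probability distribution over legal moves" concrete, the following sketch turns raw per-move scores into probabilities while assigning zero probability to illegal moves. It is only an illustration: the softmax form, the masking approach, and the name `masked_softmax` are assumptions for this example and are not taken from the notes or from AlphaGo's implementation.

```python
import math
from typing import List, Sequence


def masked_softmax(scores: Sequence[float], legal: Sequence[bool]) -> List[float]:
    """Turn raw scores into probabilities, giving illegal moves probability 0."""
    legal_scores = [s for s, ok in zip(scores, legal) if ok]
    if not legal_scores:
        raise ValueError("at least one move must be legal")
    shift = max(legal_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) if ok else 0.0 for s, ok in zip(scores, legal)]
    total = sum(weights)
    return [w / total for w in weights]
```

In this sketch, a hypothetical policy network would supply `scores` (one per board point) and the game rules would supply `legal`.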

Further advantages

Neural networks and deep learning often require less “inside knowledge” from experts in the area.
Results can be highly counter-intuitive and yet deliver excellent performance.
Outstanding open-source libraries: TensorFlow, PyTorch, Flux.
More recently, the development of dedicated hardware (TPUs, AI accelerators, FPGAs) is likely to maintain an edge for the area.
The breadth of an ecosystem is key for its long-run success.

Digging deeper

A neuron

$N$ observables: $x_1, x_2, \ldots, x_N$. We stack them in $x$.
Coefficients (or weights): $\theta_0$ (a constant), $\theta_1, \theta_2, \ldots, \theta_N$. We stack them in $\theta$.
We build a linear combination of observations:

$$z = \theta_0 + \sum_{n=1}^{N} \theta_n x_n$$

Theoretically, we could build non-linear combinations, but this is unlikely to be a fruitful idea in general.

We transform this linear combination with an activation function:

$$y = g(x; \theta) = \phi(z)$$

The activation function might have some coefficients $\gamma$ of its own.
Why do we need an activation function?
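
A single neuron can be sketched as follows. The sketch also illustrates one standard observation relevant to the question above, offered as an illustration rather than as the notes' own answer: with an identity activation, stacking neurons only ever produces another affine function of the input. The names `neuron`, `identity`, and `stacked_identity`, the numeric weights, and the default `tanh` activation are all hypothetical choices for this example.

```python
import math
from typing import Callable, Sequence


def neuron(
    x: Sequence[float],
    theta0: float,
    theta: Sequence[float],
    phi: Callable[[float], float] = math.tanh,  # assumed activation
) -> float:
    """Compute y = phi(z) with z = theta0 + sum_n theta[n] * x[n]."""
    z = theta0 + sum(t * x_n for t, x_n in zip(theta, x))
    return phi(z)


def identity(z: float) -> float:
    """A 'no-op' activation, used to show what happens without nonlinearity."""
    return z


def stacked_identity(x: float) -> float:
    """Two neurons with identity activation, one feeding the other."""
    hidden = neuron([x], 1.0, [2.0], identity)      # 1 + 2x
    return neuron([hidden], 3.0, [0.5], identity)   # 3 + 0.5 * (1 + 2x) = 3.5 + x
```

The composed function `stacked_identity` equals `3.5 + x` for every input, so the second neuron adds no expressive power in this case; a nonlinear `phi` is what prevents this collapse.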

Flow representation

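[Figure: flow diagram of a perceptron. The inputs $x_1, x_2, x_3, \ldots, x_n$ are multiplied by the weights $\theta_1, \theta_2, \theta_3, \ldots, \theta_n$ and summed into the net input $\sum_{i=1}^{n} \theta_i x_i$, which passes through the activation to produce the perceptron's classification output.]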
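
The flow in the diagram can be sketched in code. This is a minimal illustration under stated assumptions: the diagram does not specify the activation, so a threshold (step) activation is assumed here, and the names `net_input`, `perceptron`, and `threshold` are hypothetical.

```python
from typing import Sequence


def net_input(x: Sequence[float], theta: Sequence[float]) -> float:
    """Weighted sum of the inputs: sum_i theta[i] * x[i]."""
    return sum(t * x_i for t, x_i in zip(theta, x))


def perceptron(x: Sequence[float], theta: Sequence[float], threshold: float = 0.0) -> int:
    """Classify x as 1 if the net input reaches the threshold, else 0.

    The step activation is an assumption for illustration.
    """
    return 1 if net_input(x, theta) >= threshold else 0
```

For example, with weights `[1.0, 1.0]` and a threshold of `1.5`, this sketch outputs class 1 only when both binary inputs are switched on.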