
Deep Residual Learning for Image Recognition

Authors:

Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun

Presenter: Masoud Hoveidar

Facilitators: Amber Ma and Ramya Balasubramaniam

12th August 2019


How DEEP should we make our Neural Networks?

● It depends on:

○ The complexity of the task at hand
○ The computational capacity available at training time
○ The computational capacity available at inference time (e.g. on edge devices)

● If the task needs a lot of parameters:

○ Can we train very deep networks efficiently using current optimization solvers?
○ Is training a better model as simple as adding more and more layers?


Why is it not OK to just add more layers?

● Because it introduces problems during training, such as:

○ Vanishing/exploding gradients
■ Can be addressed by normalized initialization and intermediate normalization layers
○ The degradation problem
■ What should we do about it?
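To make the vanishing/exploding behaviour concrete, here is a minimal sketch under a strong simplifying assumption: the network is a chain of scalar linear layers that all share one weight `w` (the function name `chain_gradient` is hypothetical, chosen for this illustration). By the chain rule the gradient through the chain is `w ** depth`, so it shrinks or grows geometrically unless the per-layer scale stays near 1. Roughly speaking, that is what normalized initialization and normalization layers try to achieve.

```python
# Toy model (an assumption for illustration, not the networks from the slides):
# a chain of `depth` scalar linear layers, each multiplying its input by `w`.
# By the chain rule, d(output)/d(input) = w ** depth.

def chain_gradient(w: float, depth: int) -> float:
    """Gradient magnitude through `depth` scalar layers with weight `w`."""
    grad = 1.0
    for _ in range(depth):
        grad *= w  # each layer contributes one factor of w
    return abs(grad)


depth = 50
vanishing = chain_gradient(0.5, depth)  # per-layer scale below 1
exploding = chain_gradient(1.5, depth)  # per-layer scale above 1
stable = chain_gradient(1.0, depth)     # per-layer scale of exactly 1

print(f"w=0.5: {vanishing:.3e}  w=1.5: {exploding:.3e}  w=1.0: {stable:.3e}")
```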

Degradation problem … (continued)

[Figure: a shallow network (conv, conv, conv, fc, softmax) with accuracy X%, shown next to a deeper network built from it by adding identity layers, which also has accuracy X%.]

Degradation problem … (continued)

● Current optimization solvers have difficulty making a stack of added non-linear layers approximate identity mappings

● Otherwise, the accuracy of a deeper network would be at least as good as that of a shallower one

● NOTE: This should not be confused with “overfitting”: the deeper network also shows higher training error
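The construction argument behind these points can be sketched in a few lines. The example below is an illustration under assumptions: small fully connected layers stand in for the conv layers in the slides, and all names (`linear`, `identity`, `run`) are hypothetical. It builds a deeper network by inserting identity layers into a shallow one and checks that both produce the same output, so a solution at least as good as the shallow network exists for the deeper one.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def linear(weights: List[Vector], bias: Vector) -> Layer:
    """Return a layer computing weights @ x + bias."""
    def apply(x: Vector) -> Vector:
        return [sum(w * a for w, a in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def relu(x: Vector) -> Vector:
    return [max(0.0, a) for a in x]


def identity(x: Vector) -> Vector:
    return list(x)  # passes its input through unchanged


def run(layers: List[Layer], x: Vector) -> Vector:
    for layer in layers:
        x = layer(x)
    return x


# Shallow network: feature layers followed by a classifier head.
body = [linear([[1.0, -2.0], [0.5, 0.5]], [0.0, 1.0]), relu]
head = [linear([[1.0, 1.0]], [0.0])]
shallow = body + head

# Deeper network by construction: same layers plus inserted identity layers.
deeper = body + [identity, identity] + head

x = [3.0, 1.0]
shallow_out = run(shallow, x)
deeper_out = run(deeper, x)
print(shallow_out, deeper_out)
```

The degradation problem is that training a deeper network in practice does not reliably find such a solution, even though this construction shows it exists.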


Residual block

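● The residual architecture adds explicit identity connections throughout the network to make the required identity mappings easier to learn

[Figure: residual block. The input x passes through weight layer, ReLU, weight layer; the identity shortcut adds x to the result, followed by a ReLU.]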
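A residual block of this shape can be sketched as follows. This is a minimal illustration, assuming fully connected weight layers without biases and plain Python lists instead of a deep learning framework; the helper names (`matvec`, `residual_block`) are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product, standing in for a weight layer."""
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    return [max(0.0, a) for a in x]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute relu(F(x) + x) with F(x) = w2 @ relu(w1 @ x)."""
    hidden = relu(matvec(w1, x))      # weight layer, then ReLU
    residual = matvec(w2, hidden)     # second weight layer gives F(x)
    summed = [r + a for r, a in zip(residual, x)]  # identity shortcut adds x
    return relu(summed)               # ReLU after the addition


x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, -1.0]]
w2 = [[0.5, 0.0], [0.0, 0.5]]
y = residual_block(x, w1, w2)
print(y)
```

The block only has to learn the residual F(x); the input itself always reaches the output through the shortcut.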

Residual block (continued)

● With this approach, the network can effectively decide how deep it needs to be: a block whose residual branch outputs zero reduces to an identity mapping

● These identity connections add no new parameters to the network, and their only extra computation is an element-wise addition

● This method allows us to design deeper networks to deal with much more complicated problems and tasks
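The first two points can be checked on a toy example. The sketch below makes the same assumptions as a simple fully connected block without biases, and its function names are hypothetical. It compares a plain two-layer block with a residual one when all weights are zero: the plain block loses its input, while the residual block passes a non-negative input through unchanged, and both have the same number of parameters.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(weights: Matrix, x: Vector) -> Vector:
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    return [max(0.0, a) for a in x]


def plain_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Two stacked weight layers with no shortcut."""
    return relu(matvec(w2, relu(matvec(w1, x))))


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """The same two layers plus an identity shortcut."""
    residual = matvec(w2, relu(matvec(w1, x)))
    return relu([r + a for r, a in zip(residual, x)])


def count_params(*matrices: Matrix) -> int:
    return sum(len(row) for m in matrices for row in m)


dim = 3
zeros = [[0.0] * dim for _ in range(dim)]
x = [1.0, 2.0, 3.0]  # non-negative, as it would be after a preceding ReLU

plain_out = plain_block(x, zeros, zeros)        # input information is lost
residual_out = residual_block(x, zeros, zeros)  # block acts as identity

# The shortcut has no weights, so both blocks hold the same parameters.
plain_params = count_params(zeros, zeros)
residual_params = count_params(zeros, zeros)
print(plain_out, residual_out, plain_params, residual_params)
```

Driving weights toward zero is an easy target for an optimizer, which is one way to read the claim that the network decides how deep it needs to be.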

ResNet architecture

$y = F(x, \{W_i\}) + W_s x$

where $W_s$ is a linear projection on the shortcut, used for dimension matching
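When the residual branch changes the dimension, the shortcut needs a projection so the two terms can be added. The sketch below evaluates the equation directly, assuming a two-layer fully connected F without biases and omitting any activation after the addition; the names `projection_block`, `w1`, `w2` and `w_s` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(weights: Matrix, x: Vector) -> Vector:
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    return [max(0.0, a) for a in x]


def projection_block(x: Vector, w1: Matrix, w2: Matrix, w_s: Matrix) -> Vector:
    """Compute y = F(x) + w_s @ x with F(x) = w2 @ relu(w1 @ x)."""
    residual = matvec(w2, relu(matvec(w1, x)))  # F(x), output dimension 3
    shortcut = matvec(w_s, x)                   # projected input, dimension 3
    return [r + s for r, s in zip(residual, shortcut)]


x = [1.0, 2.0]                                   # input dimension 2
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # 3 x 2
w2 = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]  # 3 x 3
w_s = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]       # 3 x 2 shortcut projection

y = projection_block(x, w1, w2, w_s)
print(y)
```

Unlike the identity shortcut, the projection does add parameters (here the 3 x 2 matrix `w_s`), so it is only needed where dimensions change.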

5 min Break

ResNet architectures for the ImageNet dataset

“18 layers vs 34 layers” on the ImageNet dataset
