Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Deep Residual Learning for Image Recognition: ResNet Architecture Analysis, Schemes and Mind Maps of Design

Design

Is learning better networks as simple as stacking more layers? Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image ...

Typology: Schemes and Mind Maps

2022/2023

Uploaded on 03/01/2023

anshula 🇺🇸

4.4

(12)

243 documents

1 / 27

This page cannot be seen from the preview

Don't miss anything!

Deep$Residual$Learning

for$Image$Recognition

Kaiming'He,'Xiangyu'Zhang,'Shaoqing'Ren,'Jian'Sun

work'done'at

Microsoft'Research'Asia

1x1#conv,#64

3x3#conv,#64

1x1#conv,#256

1x1#conv,#64

3x3#conv,#64

1x1#conv,#256

1x1#conv,#64

3x3#conv,#64

1x1#conv,#256

1x1#conv,#128,#/2

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#128

3x3#conv,#128

1x1#conv,#512

1x1#conv,#256,#/2

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#256

3x3#conv,#256

1x1#conv,#1024

1x1#conv,#512,#/2

3x3#conv,#512

1x1#conv,#2048

1x1#conv,#512

3x3#conv,#512

1x1#conv,#2048

1x1#conv,#512

3x3#conv,#512

1x1#conv,#2048

ave#pool,#fc#1000

7x7#conv,#64,#/2,# pool/2

Partial preview of the text

Download Deep Residual Learning for Image Recognition: ResNet Architecture Analysis and more Schemes and Mind Maps Design in PDF only on Docsity!

Deep Residual Learning

for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

work done at

Microsoft Research Asia

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

/^2

3 x^3 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^128

3 x^3 conv

,^128

1 x^1 conv

,^512

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^128

3 x^3 conv

,^128

1 x^1 conv

,^512

1 x^1 conv

/^2

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

/^2

3 x^3 conv

1 x^1 conv

,^2048

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^2048

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^2048

ave pool

,^ fc^1000

7 x^7 conv

, 64 ,^ /^2

,^ pool^ /

ResNet @ ILSVRC & COCO 2015 Competitions 1st places in all five main tracks

ImageNet Classification: “ Ultra-deep ” 152 - layer nets
ImageNet Detection: 16% better than 2nd
ImageNet Localization: 27% better than 2nd
COCO Detection: 11% better than 2nd
COCO Segmentation: 12% better than 2nd *improvements are relative numbers

Revolution of Depth

HOG, DPM AlexNet

(RCNN)

VGG

(RCNN)

ResNet

(Faster RCNN)*

PASCAL VOC 2007 Object Detection mAP (%)

shallow 8 layers 16 layers

101 layers

*w/ other improvements & more data Engines of visual recognition

Revolution of Depth 11 x 11 conv, 96 , / 4 , pool/ 2

5 x 5 conv, 256 , pool/ 2

3 x 3 conv, 384

3 x 3 conv, 256 , pool/ 2

fc, 4096

fc, 1000

AlexNet, 8 layers

(ILSVRC 2012)

AlexNet, 8 layers

(ILSVRC 2012)

Revolution of Depth

ResNet, 152 layers

VGG, 19 layers

Is learning better networks

as simple as stacking more layers?

Simply stacking layers? 0 1 2 3 4 5 6 0 5 10 20 iter. (1e4) error (%) plain- 20 plain- 32 plain- 44 plain- 56 CIFAR- 10 20 - layer 32 - layer 44 - layer 56 - layer 0 10 20 30 40 50 20 30 40 50 60 iter. (1e4) error (%) plain- 18 plain- 34 ImageNet- 1000 34 - layer 18 - layer

“Overly deep” plain nets have higher training error
A general phenomenon, observed in many datasets

solid: test/val

dashed: train

7 x 7 conv, 64 , / 2 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 128 , / 2 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 256 , / 2 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 512 , / 2 3 x 3 conv, 512 3 x 3 conv, 512 3 x 3 conv, 512 fc 1000

a shallower

model

(18 layers)

a deeper

counterpart

(34 layers)

7 x 7 conv, 64 , / 2 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 64 3 x 3 conv, 128 , / 2 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 128 3 x 3 conv, 256 , / 2 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 256 3 x 3 conv, 512 , / 2 3 x 3 conv, 512 3 x 3 conv, 512 3 x 3 conv, 512 3 x 3 conv, 512 3 x 3 conv, 512 fc 1000 “extra” layers

Richer solution space
A deeper model should not have higher

training error

A solution by construction :
- original layers: copied from a

learned shallower model

extra layers: set as identity
at least the same training error
Optimization difficulties: solvers cannot

find the solution when going deeper…

Deep Residual Learning

Residual net 𝐻 𝑥 is any desired mapping, hope the 2 weight layers fit 𝐻(𝑥) hope the 2 weight layers fit 𝐹(𝑥) let 𝐻 𝑥 = 𝐹 𝑥 + 𝑥

weight layer

relu relu 𝑥 𝐻 𝑥 = 𝐹 𝑥 + 𝑥

identity

𝑥 𝐹(𝑥)

Deep Residual Learning

𝐹 𝑥 is a residual mapping w.r.t. identity
- If identity were optimal,

easy to set weights as 0

If optimal mapping is closer to identity,

easier to find small fluctuations

weight layer

relu relu 𝑥 𝐻 𝑥 = 𝐹 𝑥 + 𝑥

identity

𝑥 𝐹(𝑥)

CIFAR-10 experiments 0 1 2 3 4 5 6 0 5 10 20 iter. (1e4) error (%) plain- 20 plain- 32 plain- 44 plain- 56 20 - layer 32 - layer 44 - layer 56 - layer CIFAR-10 plain nets (^00 1 2 3 4 5 ) 5 10 20 iter. (1e4) error (%) ResNet- 20 ResNet- 32 ResNet- 44 ResNet- 56 ResNet- 110 CIFAR- 10 ResNets 56 - layer 44 - layer 32 - layer 20 - layer 110 - layer

Deep ResNets can be trained without difficulties
Deeper ResNets have lower training error , and also lower test error solid: test dashed: train

ImageNet experiments 0 10 20 30 40 50 20 30 40 50 60 iter. (1e4) error (%) ResNet- 18 ResNet- 34 0 10 20 30 40 50 20 30 40 50 60 iter. (1e4) error (%) plain- 18 plain- 34 ImageNet plain nets ImageNet ResNets solid: test dashed: train 34 - layer 18 - layer 18 - layer 34 - layer

Deep ResNets can be trained without difficulties
Deeper ResNets have lower training error , and also lower test error

Beyond classification

A treasure from ImageNet is on learning features.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. arXiv 2015.

“ Features matter. ” (quote [Girshick et al. 2014], the R-CNN paper) task

2nd-place

winner

ResNets

margin

(relative) ImageNet Localization (top-5 error) 12.0^ 9.0^ 27% ImageNet Detection ([email protected]) 53.6^ 62.1^ 16% COCO Detection ([email protected]:.95) 33.5^ 37.3^ 11% COCO Segmentation ([email protected]:.95) 25.1^ 28.2^ 12%

Our results are all based on ResNet- 101
Our features are well transferrable

Deep Residual Learning for Image Recognition: ResNet Architecture Analysis, Schemes and Mind Maps of Design

Related documents

Partial preview of the text

Download Deep Residual Learning for Image Recognition: ResNet Architecture Analysis and more Schemes and Mind Maps Design in PDF only on Docsity!

Deep Residual Learning

for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

work done at

Microsoft Research Asia

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

1 x^1 conv

,^64

3 x^3 conv

,^64

1 x^1 conv

1 x^1 conv

/^2

3 x^3 conv

1 x^1 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

1 x^1 conv

3 x^3 conv

1 x^1 conv

1 x^1 conv

,^128

3 x^3 conv

,^128

1 x^1 conv

,^512

1 x^1 conv

3 x^3 conv

1 x^1 conv

1 x^1 conv

,^128

3 x^3 conv

,^128

1 x^1 conv

,^512

1 x^1 conv

/^2

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024

1 x^1 conv

3 x^3 conv

1 x^1 conv

,^1024