



Deep Learning Midterm Exam 2017 2018




Welcome to the CS231N Midterm Exam!
I understand and agree to uphold the Stanford Honor Code during this exam.
Signature: Date:
1 Multiple Choice (20 points)
Circle the letters of your choice.
Each question is worth 2 points. Each one of the four individual choices is 0.5 points for a correct answer, or 0 points otherwise.
(a) The learning rate could be too low
(b) The regularization strength could be too high
(c) The class distribution could be very uneven in the dataset
(d) The weight initialization scale could be incorrectly set

(a) The code would crash on the very first CONV layer because 3x3 filters with stride 1 pad 1 wouldn't "fit" across 32x32 input
(b) The amount of memory needed to store the forward activations in the first CONV layer would be reduced by a factor of 7 (since 224/32 = 7)
(c) The network would run fine until the very first Fully Connected layer, where it would crash
(d) The network would run forward just fine but its predictions would, of course, be ImageNet class predictions

(a) Is approximately as fast to compute in both forward and backward pass as a CONV layer (with the same filter size and strides).
(b) Is similar to batch normalization in that it will keep all of your neuron activities in a similar range.
(c) Could contribute to difficulties during gradient checking.
(d) Could contribute to the vanishing gradient problem (recall: this is a problem where by the end of a backward pass the gradients are very small).
3 Short Answer (60 points)
Answer each question in provided space.
Fill in the missing gradients underneath the forward pass activations in each circuit diagram. The gradient of the output with respect to the loss is one (1.00) for every circuit, and has already been filled in.
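As a reminder of how gradients flow backward through a circuit, here is a minimal sketch on a hypothetical circuit f = (x + y) * z. This circuit, the function name `backprop_add_mul`, and the input values are illustrative assumptions and are not taken from the exam's diagrams.

```python
def backprop_add_mul(x, y, z, upstream=1.0):
    """Forward and backward pass for the hypothetical circuit f = (x + y) * z."""
    # Forward pass: an add gate followed by a multiply gate.
    q = x + y
    f = q * z

    # Backward pass: the multiply gate swaps its inputs and scales by upstream.
    df_dq = z * upstream
    df_dz = q * upstream

    # The add gate passes its incoming gradient unchanged to both inputs.
    df_dx = df_dq
    df_dy = df_dq
    return f, {"x": df_dx, "y": df_dy, "z": df_dz}


# Illustrative inputs (assumed, not from the exam).
f, grads = backprop_add_mul(-2.0, 5.0, -4.0)
print(f, grads)
```

The `upstream` argument plays the role of the 1.00 already filled in at the output of each circuit.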
Consider the convolutional network defined by the layers in the left column below. Fill in the size of the activation volumes at each layer, and the number of parameters at each layer. You can write your answer as a multiplication (e.g. 128x128x3).
Layer   | Activation Volume Dimensions (memory) | Number of parameters
INPUT   | 32x32x1                               | 0
CONV5-  |                                       |
POOL    |                                       |
CONV5-  |                                       |
POOL    |                                       |
FC-     |                                       |
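The bookkeeping for a table like this can be sketched with the standard output-size and parameter-count formulas. The layer settings in the example (8 filters, pad 2, 2x2 pooling with stride 2, 10 output units) are hypothetical, because the layer names above are truncated; the helper names are also assumptions.

```python
def conv_output_size(in_size, filter_size, stride=1, pad=0):
    """Spatial output size of a conv or pool layer: (W - F + 2P) / S + 1."""
    span = in_size - filter_size + 2 * pad
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


def conv_param_count(filter_size, in_depth, num_filters):
    """Weights (F * F * depth per filter) plus one bias per filter."""
    return filter_size * filter_size * in_depth * num_filters + num_filters


def fc_param_count(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


# Hypothetical settings, NOT the exam's (its layer specs are truncated).
side = conv_output_size(32, 5, stride=1, pad=2)   # 5x5 conv on a 32x32x1 input
params = conv_param_count(5, 1, 8)                # 8 filters over depth 1
pooled = conv_output_size(side, 2, stride=2)      # 2x2 pool, stride 2
fc_params = fc_param_count(pooled * pooled * 8, 10)
print(side, params, pooled, fc_params)
```

Pooling layers have no parameters, so only their activation volume changes.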
Consider the following 1-dimensional ConvNet, where all variables are scalars:
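[Diagram: the inputs x_1, ..., x_5 pass through a conv layer (filter k, bias b) to give z_1, z_2, z_3; a max pool with ReLU gives v_1, v_2; a fully connected layer (weights w, bias a) gives \hat{y}, which the loss compares with the target y.]

\[ L = (y - \hat{y})^2 \]

\[ \hat{y} = \begin{bmatrix} w_1 & w_2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} + a \]

\[ \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} \max\{z_1, z_2, 0\} \\ \max\{z_2, z_3, 0\} \end{bmatrix} \]

\[ \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} = \begin{bmatrix} k_1 & k_2 & k_3 & 0 & 0 \\ 0 & k_1 & k_2 & k_3 & 0 \\ 0 & 0 & k_1 & k_2 & k_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} + \begin{bmatrix} b \\ b \\ b \end{bmatrix} \]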
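The forward pass of this network can be sketched in a few lines. The sketch assumes the fully connected layer adds the bias a to w_1 v_1 + w_2 v_2, as the diagram's "w, a" label suggests; the function names and all numeric values are illustrative assumptions.

```python
def forward(x, k, b, w, a):
    """Forward pass of the 1-D ConvNet: conv -> max pool + ReLU -> fully connected."""
    # Conv layer: slide the 3-tap filter k over the 5 inputs, adding the shared bias b.
    z = [sum(k[j] * x[i + j] for j in range(3)) + b for i in range(3)]
    # Max pool over overlapping pairs, fused with ReLU via the extra 0.
    v = [max(z[0], z[1], 0.0), max(z[1], z[2], 0.0)]
    # Fully connected layer (bias a assumed to be additive).
    y_hat = w[0] * v[0] + w[1] * v[1] + a
    return z, v, y_hat


def squared_loss(y, y_hat):
    """Loss L = (y - y_hat)^2."""
    return (y - y_hat) ** 2


# Illustrative values (assumed, not from the exam).
x = [1.0, 2.0, -1.0, 0.0, 3.0]
k = [1.0, -1.0, 2.0]
z, v, y_hat = forward(x, k, b=0.5, w=[2.0, -1.0], a=1.0)
loss = squared_loss(3.0, y_hat)
print(z, v, y_hat, loss)
```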
(a) (1 point) List the parameters in this network.
(b) (3 points) Determine the following:

\[ \frac{\partial L}{\partial w_1}, \quad \frac{\partial L}{\partial w_2}, \quad \frac{\partial L}{\partial a} \]
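(c) (3 points) Given the gradients of the loss L with respect to the second layer activations v, derive the gradient of the loss with respect to the first layer activations z. More precisely, given

\[ \frac{\partial L}{\partial v_1} = \delta_1, \quad \frac{\partial L}{\partial v_2} = \delta_2 \]

determine the following:

\[ \frac{\partial L}{\partial z_1}, \quad \frac{\partial L}{\partial z_2}, \quad \frac{\partial L}{\partial z_3} \]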
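Hand-derived answers to (b) and (c) can be sanity-checked numerically with central finite differences. The sketch below assumes \hat{y} = w_1 v_1 + w_2 v_2 + a; the helper names and all numeric values are illustrative assumptions. Finite differences become unreliable when two arguments of a max are nearly tied, so the test point is chosen away from such kinks.

```python
def numerical_grad(f, point, h=1e-5):
    """Central-difference estimate of the gradient of scalar f at point."""
    grads = []
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        grads.append((f(plus) - f(minus)) / (2 * h))
    return grads


# Illustrative constants (assumed, not from the exam).
Y = 3.0                 # target
W = (2.0, -1.0)         # fully connected weights
A = 1.0                 # fully connected bias
V = (3.5, 5.5)          # second layer activations


def loss_from_fc_params(p):
    """Loss as a function of [w1, w2, a], with v held fixed."""
    w1, w2, a = p
    y_hat = w1 * V[0] + w2 * V[1] + a
    return (Y - y_hat) ** 2


def loss_from_z(z):
    """Loss as a function of [z1, z2, z3], with w and a held fixed."""
    v1 = max(z[0], z[1], 0.0)
    v2 = max(z[1], z[2], 0.0)
    y_hat = W[0] * v1 + W[1] * v2 + A
    return (Y - y_hat) ** 2


grad_fc = numerical_grad(loss_from_fc_params, [W[0], W[1], A])
grad_z = numerical_grad(loss_from_z, [-2.5, 3.5, 5.5])
print(grad_fc)
print(grad_z)
```

Comparing these numbers against the analytic expressions evaluated at the same point is the usual gradient check.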