7643 Quiz 2 | Actual Questions and Answers Latest Updated 2025/2026 (Graded A+) Georgia Institute of Technology

Typology: Exams

2025/2026

Available from 09/09/2025

Qualityexam


1. Which of the following are common issues while optimizing the weights of a deep neural network? (Select all that apply)

A. Existence of local minima
B. Ill-conditioned loss surface
C. Noisy gradient estimates
D. Saddle points
Correct Answer: B, C, D

2. Which of the following is the advantage of Leaky ReLU compared to ReLU?

A. There’s no saturation on the positive end
B. Its output is always positive
C. It is cheap to compute
D. There’s no “dead” neuron when computing gradients
Correct Answer: D
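
To illustrate answer D, here is a minimal sketch comparing the two derivatives for a negative input. The function names and the slope `alpha=0.01` are assumptions chosen for illustration, not values from the quiz.

```python
def relu_grad(x):
    # Derivative of max(0, x): exactly zero for every negative input,
    # so a unit stuck in the negative region receives no gradient.
    return 1.0 if x > 0 else 0.0

def leaky_relu_grad(x, alpha=0.01):
    # Derivative of max(alpha * x, x): a small nonzero slope (assumed 0.01)
    # keeps some gradient flowing for negative inputs.
    return 1.0 if x > 0 else alpha

relu_g = relu_grad(-2.0)
leaky_g = leaky_relu_grad(-2.0)
print(relu_g, leaky_g)
```

For positive inputs both functions have slope 1, which is why option A does not distinguish them.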

3. The vanishing gradient problem occurs primarily with:

A. ReLU activations
B. Tanh and Sigmoid activations
C. Linear transformations
D. Max pooling layers
Correct Answer: B
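
A rough sketch of why saturating activations are the usual culprit: the sigmoid derivative never exceeds 0.25, so a chain of such factors shrinks geometrically. The depth of 20 and the helper names are assumptions, and weights are ignored to keep the arithmetic visible.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # maximum value 0.25, reached at z = 0

def chained_grad(local_grad, depth):
    # Product of `depth` identical local derivatives (weights ignored).
    return local_grad ** depth

sig_chain = chained_grad(sigmoid_grad(0.0), 20)  # best case for sigmoid
relu_chain = chained_grad(1.0, 20)               # slope of ReLU on active units
print(sig_chain, relu_chain)
```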

4. Which of the following best describes batch normalization?

A. Adds noise to gradients during backpropagation
B. Normalizes inputs within a mini-batch to stabilize training
C. Increases the learning rate dynamically
D. Removes neurons to prevent overfitting
Correct Answer: B
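
The normalization in answer B can be sketched for a single feature as follows. This covers only the training-time computation; the running statistics used at inference are left out, and `gamma`, `beta`, and `eps` are conventional names assumed here.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize one feature across the mini-batch to zero mean and
    # (approximately) unit variance, then apply a learnable scale and shift.
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
norm_mean = sum(normalized) / len(normalized)
norm_var = sum((x - norm_mean) ** 2 for x in normalized) / len(normalized)
print(norm_mean, norm_var)
```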

5. Why does stochastic gradient descent (SGD) often converge better than full batch gradient descent?

A. It guarantees global minima
B. It uses second-order derivatives
C. The noise helps escape saddle points and local minima
D. It requires fewer epochs
Correct Answer: C

6. What is the role of momentum in gradient descent?

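[The preview omits the options for question 6 and the stem of the question to which the following options belong.]

A. Ensure weights are all positive
B. Keep the variance of activations consistent across layers
C. Reduce training time by skipping normalization
D. Initialize all weights at zero
Correct Answer: B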

10. Why are residual connections (ResNets) effective in deep architectures?

A. They prevent underfitting
B. They allow gradients to flow directly, mitigating vanishing gradients
C. They reduce the number of parameters
D. They remove the need for backpropagation
Correct Answer: B
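
One way to see the direct gradient path: for a block computing x + F(x), the derivative is 1 + F'(x), so it stays near 1 even when F'(x) is tiny. The sketch below checks this numerically with a hypothetical `weak_layer` whose slope is 0.001; all names are illustrative.

```python
def numerical_grad(f, x, h=1e-6):
    # Central finite-difference approximation of df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

def weak_layer(x):
    # Hypothetical layer F whose local derivative is tiny (0.001).
    return 0.001 * x

def residual_block(x):
    # Skip connection: output = x + F(x), so d(output)/dx = 1 + F'(x).
    return x + weak_layer(x)

plain_grad = numerical_grad(weak_layer, 1.0)
residual_grad = numerical_grad(residual_block, 1.0)
print(plain_grad, residual_grad)
```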

11. Which of the following is a drawback of using very deep networks?

A. Higher capacity for feature learning
B. More prone to vanishing/exploding gradients
C. Faster training time
D. Less risk of overfitting
Correct Answer: B

12. In dropout regularization, neurons are:

A. Permanently removed from the network
B. Randomly deactivated during training to prevent co-adaptation
C. Replaced with noise during forward propagation
D. Normalized across mini-batches
Correct Answer: B
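
A minimal sketch of the behavior in answer B, using the common "inverted dropout" variant (an assumption; the quiz does not name a variant). The drop probability of 0.5 and the function signature are illustrative.

```python
import random

def dropout(activations, p_drop=0.5, training=True, rng=random):
    # Inverted dropout: during training, zero each unit with probability
    # p_drop and rescale survivors so the expected activation is unchanged.
    if not training:
        return list(activations)  # every unit stays active at inference
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded generator for a repeatable mask
train_out = dropout([1.0] * 8, p_drop=0.5, rng=rng)
eval_out = dropout([1.0] * 8, training=False)
print(train_out, eval_out)
```

The units are only masked per forward pass, never removed, which is what separates option B from option A.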

13. Which learning rate schedule gradually decreases the learning rate during training?

A. Step decay
B. Exponential decay
C. Cosine annealing
D. All of the above
Correct Answer: D
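
The three schedules can be sketched side by side to show that each one lowers the learning rate over time. The decay constants (`drop`, `every`, `k`), the initial rate of 0.1, and the 30-epoch horizon are assumed values for illustration.

```python
import math

def step_decay(lr0, epoch, drop=0.5, every=10):
    # Multiply the rate by `drop` once every `every` epochs.
    return lr0 * drop ** (epoch // every)

def exponential_decay(lr0, epoch, k=0.1):
    # Smooth exponential decay with rate constant k.
    return lr0 * math.exp(-k * epoch)

def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    # Half a cosine wave from lr0 down to lr_min over total_epochs.
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * cos_term

epochs = range(31)
schedules = {
    "step": [step_decay(0.1, e) for e in epochs],
    "exponential": [exponential_decay(0.1, e) for e in epochs],
    "cosine": [cosine_annealing(0.1, e, total_epochs=30) for e in epochs],
}
for name, values in schedules.items():
    print(name, values[0], values[-1])
```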

14. A flat loss surface near the optimum generally indicates:

A. Poor generalization
B. Better generalization
C. Higher overfitting
D. Lower variance
Correct Answer: B

18. Which of the following problems do ReLU activations help mitigate?

A. Vanishing gradient
B. Exploding gradient
C. Overfitting
D. Saddle points
Correct Answer: A

19. The Hessian matrix is useful in optimization because it:

A. Measures gradient variance
B. Provides second-order curvature information
C. Guarantees global minima
D. Eliminates saddle points
Correct Answer: B
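
As a small illustration of "curvature information", the sketch below estimates the Hessian of an assumed test function, f(x, y) = x² − y², by finite differences. The function, the step size `h`, and the helper name are all illustrative choices.

```python
def hessian_2d(f, x, y, h=1e-4):
    # Central finite-difference estimate of the 2x2 Hessian of f at (x, y).
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

def saddle(x, y):
    return x**2 - y**2

H = hessian_2d(saddle, 0.0, 0.0)
print(H)
```

Here the Hessian is diagonal, so its diagonal entries are its eigenvalues; one positive and one negative value mark the origin as a saddle point rather than a minimum.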

20. Why is the Adam optimizer widely preferred in practice?

A. It always finds global optima
B. It requires no hyperparameters
C. It adapts learning rates individually per parameter using moment estimates
D. It is slower but more stable
Correct Answer: C
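
A single-parameter sketch of the Adam update may help make answer C concrete. The default hyperparameters below are the commonly cited ones; the function name and the two example gradients are assumptions for illustration.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moment estimates.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Dividing by sqrt(v_hat) scales the step separately for each parameter.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Two parameters whose gradients differ by four orders of magnitude.
p_big, _, _ = adam_step(1.0, 100.0, m=0.0, v=0.0, t=1)
p_small, _, _ = adam_step(1.0, 0.01, m=0.0, v=0.0, t=1)
print(1.0 - p_big, 1.0 - p_small)
```

Both parameters move by roughly `lr` on the first step despite the very different gradient magnitudes, which is the per-parameter adaptation the answer refers to.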

21. Which initialization is typically best for ReLU activations?

A. Xavier (Glorot)
B. He initialization
C. Zero initialization
D. Random small constants
Correct Answer: B
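
He initialization draws weights with variance 2 / fan_in. The sketch below samples such a matrix and checks the empirical variance; the layer sizes (256 inputs, 64 outputs), the seed, and the helper name are assumptions.

```python
import math
import random

def he_normal(fan_in, fan_out, rng):
    # Zero-mean Gaussian weights with variance 2 / fan_in.
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

rng = random.Random(0)
weights = he_normal(fan_in=256, fan_out=64, rng=rng)

flat = [w for row in weights for w in row]
sample_mean = sum(flat) / len(flat)
sample_var = sum((w - sample_mean) ** 2 for w in flat) / len(flat)
target_var = 2.0 / 256
print(sample_var, target_var)
```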

22. Which of the following issues are caused by saddle points in deep networks?

A. Training stops prematurely
B. Extremely slow convergence
C. Oscillations in weight updates
D. Poor initialization
Correct Answer: B
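
The slowdown can be sketched with plain gradient descent on an assumed toy function, f(x, y) = x² − y², started almost exactly on the ridge that leads into the saddle at the origin. The starting point, learning rate, and step counts are illustrative.

```python
def f(x, y):
    return x**2 - y**2

def grad(x, y):
    # Gradient of f; it vanishes at the saddle point (0, 0).
    return 2 * x, -2 * y

def descend(x, y, steps, lr=0.1):
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

start = (1.0, 1e-6)
near_saddle = descend(*start, steps=20)  # stalls close to the origin
escaped = descend(*start, steps=100)     # eventually leaves along y
print(f(*near_saddle), f(*escaped))
```

After 20 steps the iterate sits near the saddle, where the gradient is tiny and the loss barely changes; only many steps later does the small y component grow enough for the loss to fall again.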

23. Which technique injects noise during training to improve generalization?

A. Batch Normalization
B. Weight Decay
C. Dropout
D. Xavier Initialization
Correct Answer: C