CS 7643 Quiz 2 | Actual Questions and Answers Latest Updated 2025/2026 (Graded A+) Georgia Institute of Technology

Typology: Exams

2025/2026

Available from 09/02/2025

Favorgrades 🇺🇸

1. Which of the following are common issues while optimizing the weights of a deep neural network? (Select all that apply)

A. Existence of local minima

B. Ill-conditioned loss surface

C. Noisy gradient estimates

D. Saddle points

Correct Answer: B, C, D
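
To illustrate the ill-conditioned loss surface option, here is a minimal sketch that assumes a toy quadratic loss with curvatures 1 and 100 (a condition number of 100). The function name and all constants are hypothetical and not part of the quiz. The learning rate has to stay small enough for the steep axis, so the flat axis makes little progress.

```python
def gradient_descent(curvatures, start, lr, steps):
    """Minimize f(w) = 0.5 * sum(c_i * w_i**2) with plain gradient descent."""
    w = list(start)
    for _ in range(steps):
        # The gradient of 0.5 * c * w**2 with respect to w is c * w.
        w = [wi - lr * c * wi for wi, c in zip(w, curvatures)]
    return w


# Assumed toy setup: the second axis is 100x steeper than the first.
curvatures = [1.0, 100.0]
w = gradient_descent(curvatures, start=[1.0, 1.0], lr=0.015, steps=100)

# The steep coordinate is essentially zero after 100 steps,
# while the flat coordinate is still far from the minimum at 0.
print("flat axis:", w[0], "steep axis:", w[1])
```

Raising the learning rate to speed up the flat axis would make the steep axis diverge once lr exceeds 2 / 100, which is the trade-off an ill-conditioned surface forces.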

2. Which of the following is the advantage of Leaky ReLU compared to ReLU?

A. There’s no saturation on the positive end

B. Its output is always positive

C. It is cheap to compute

D. There’s no “dead” neuron when computing gradients

Correct Answer: D
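
The "dead" neuron option can be seen by comparing the two derivatives directly. This is a minimal sketch; the negative slope of 0.01 is an assumed, commonly used default rather than a value given in the quiz.

```python
def relu(x):
    return x if x > 0 else 0.0


def relu_grad(x):
    # Zero gradient for every negative input: the unit can stop learning.
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x, negative_slope=0.01):
    # A small but nonzero gradient survives for negative inputs.
    return 1.0 if x > 0 else negative_slope


for x in (-2.0, 3.0):
    print(x, relu_grad(x), leaky_relu_grad(x))
```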

3. The vanishing gradient problem occurs primarily with:


A. ReLU activations

B. Tanh and Sigmoid activations

C. Linear transformations

D. Max pooling layers

Correct Answer: B
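
A rough way to see why saturating activations are the main culprit: the sigmoid derivative never exceeds 0.25, so a chain of such derivatives shrinks geometrically with depth. The sketch below is a simplification that assumes every layer contributes only its activation derivative and ignores the weight matrices; the helper names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def chained_gradient(local_grad, depth):
    # Backpropagation multiplies one local derivative per layer.
    return local_grad ** depth


best_case_sigmoid = sigmoid_grad(0.0)  # the maximum of the derivative
print(chained_gradient(best_case_sigmoid, 10))  # shrinks geometrically
print(chained_gradient(1.0, 10))  # an active ReLU passes the gradient unchanged
```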

4. Which of the following best describes batch normalization?

A. Adds noise to gradients during backpropagation

B. Normalizes inputs within a mini-batch to stabilize training

C. Increases the learning rate dynamically

D. Removes neurons to prevent overfitting

Correct Answer: B

5. Why does stochastic gradient descent (SGD) often converge better than full-batch gradient descent?

A. It guarantees global minima

B. It uses second-order derivatives

C. The noise helps escape saddle points and local minima

D. It requires fewer epochs

Correct Answer: C

6. What is the role of momentum in gradient descent?

[The answer options for question 6 and the stem of the question that the following options belong to are not shown in the preview.]

A. Ensure weights are all positive

B. Keep the variance of activations consistent across layers

C. Reduce training time by skipping normalization

D. Initialize all weights at zero

Correct Answer: B
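
The batch normalization answer above (normalizing inputs within a mini-batch) can be sketched for a single feature as follows. This assumes training-mode statistics only, with no running averages, and the names gamma, beta, and eps follow common convention rather than anything stated in the quiz.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
print(normalized)  # roughly zero mean and unit variance
```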

10. Why are residual connections (ResNets) effective in deep architectures?

A. They prevent underfitting

B. They allow gradients to flow directly, mitigating vanishing gradients

C. They reduce the number of parameters

D. They remove the need for backpropagation

Correct Answer: B

11. Which of the following is a drawback of using very deep networks?

A. Higher capacity for feature learning

B. More prone to vanishing/exploding gradients

C. Faster training time

D. Less risk of overfitting

Correct Answer: B
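
The direct gradient path of a residual connection can be illustrated with a scalar toy model. For a block y = x + f(x), the derivative is 1 + f'(x), so the identity term keeps the product from collapsing. The sketch assumes scalar layers with a fixed local derivative of 0.1, which is a hypothetical value chosen only to make the contrast visible.

```python
def plain_chain_grad(local_grads):
    """Gradient through stacked layers y = f(x): product of f'(x)."""
    grad = 1.0
    for g in local_grads:
        grad *= g
    return grad


def residual_chain_grad(local_grads):
    """Gradient through stacked blocks y = x + f(x): product of (1 + f'(x))."""
    grad = 1.0
    for g in local_grads:
        grad *= 1.0 + g
    return grad


local_grads = [0.1] * 20  # assumed small derivative in each of 20 layers
print(plain_chain_grad(local_grads))     # vanishes
print(residual_chain_grad(local_grads))  # stays at or above 1
```

With non-negative local derivatives the residual product never drops below 1, although it can grow, which is one reason very deep networks still need care with exploding gradients.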

12. In dropout regularization, neurons are:

A. Permanently removed from the network

B. Randomly deactivated during training to prevent co-adaptation

C. Replaced with noise during forward propagation

D. Normalized across mini-batches

Correct Answer: B

13. Which learning rate schedule gradually decreases the learning rate during training?

A. Step decay

B. Exponential decay

C. Cosine annealing

D. All of the above

Correct Answer: D

14. A flat loss surface near the optimum generally indicates:

A. Poor generalization

B. Better generalization

C. Higher overfitting

D. Lower variance

Correct Answer: B
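
The three schedules named above (step decay, exponential decay, cosine annealing) can be written as small functions of the epoch. This is a sketch using commonly seen forms; the parameter names and default values (drop, every, k, lr_min) are assumptions, not values from the quiz.

```python
import math


def step_decay(lr0, epoch, drop=0.5, every=10):
    # Multiply the rate by `drop` once every `every` epochs.
    return lr0 * drop ** (epoch // every)


def exponential_decay(lr0, epoch, k=0.05):
    return lr0 * math.exp(-k * epoch)


def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    # Half a cosine period from lr0 down to lr_min over total_epochs.
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * cos_term


for epoch in (0, 10, 20, 30):
    print(
        epoch,
        step_decay(0.1, epoch),
        exponential_decay(0.1, epoch),
        cosine_annealing(0.1, epoch, total_epochs=30),
    )
```

All three are non-increasing in the epoch, which is what makes "All of the above" the expected answer.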

18. Which of the following problems do ReLU activations help mitigate?

A. Vanishing gradient

B. Exploding gradient

C. Overfitting

D. Saddle points

Correct Answer: A

19. The Hessian matrix is useful in optimization because it:

A. Measures gradient variance

B. Provides second-order curvature information

C. Guarantees global minima

D. Eliminates saddle points

Correct Answer: B

20. Why is the Adam optimizer widely preferred in practice?

A. It always finds global optima

B. It requires no hyperparameters

C. It adapts learning rates individually per parameter using moment estimates

D. It is slower but more stable

Correct Answer: C
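
The per-parameter adaptation in the Adam answer can be sketched for a single scalar parameter. The update below follows the standard Adam rule with bias correction; the function name adam_step and the default hyperparameters are assumptions chosen to match commonly used values.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Two parameters whose gradients differ by four orders of magnitude.
p_big, _, _ = adam_step(0.0, grad=100.0, m=0.0, v=0.0, t=1)
p_small, _, _ = adam_step(0.0, grad=0.01, m=0.0, v=0.0, t=1)
print(p_big, p_small)  # both move by roughly lr on the first step
```

Dividing by the square root of the second-moment estimate rescales each parameter's step, so the large-gradient and small-gradient parameters take steps of similar size.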

21. Which initialization is typically best for ReLU activations?

A. Xavier (Glorot)

B. He initialization

C. Zero initialization

D. Random small constants

Correct Answer: B

22. Which of the following issues are caused by saddle points in deep networks?

A. Training stops prematurely

B. Extremely slow convergence

C. Oscillations in weight updates

D. Poor initialization

Correct Answer: B

23. Which technique injects noise during training to improve generalization?

A. Batch Normalization

B. Weight Decay

C. Dropout

D. Xavier Initialization

Correct Answer: C
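
The noise that dropout injects is a random mask applied only during training. The sketch below uses the "inverted dropout" variant, which rescales the kept activations so that nothing changes at inference time. That variant, the function signature, and the seeded generator are assumptions made for illustration and are not specified in the quiz.

```python
import random


def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop while training."""
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    # Kept units are scaled by 1 / keep so the expected activation is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the example repeatable
activations = [1.0] * 1000
train_out = dropout(activations, p_drop=0.5, rng=rng)
eval_out = dropout(activations, p_drop=0.5, rng=rng, training=False)
print(train_out.count(0.0), "units dropped during training")
```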
