Learning Path
Question & Answer
Choose the Best Answer
Which of the following is a common cause of the vanishing gradient problem in deep neural networks?
Weight initialization that leads to small gradients
Overfitting due to excessive training
Using too many hidden layers without activation functions
Insufficient data for training
Understanding the Answer
Let's break down why this is correct
When weights are initialized too small, each layer multiplies the gradient by a factor whose magnitude is less than one during backpropagation, so after many layers the gradient shrinks toward zero and the earliest layers barely learn. The other options are incorrect: overfitting means the model learns the training data too well, but it does not stop gradients from flowing; stacking hidden layers without activation functions simply makes the network linear and does not by itself cause vanishing gradients; and insufficient training data hurts generalization, not gradient flow.
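To make the multiplication argument concrete, here is a minimal NumPy sketch (an illustration added for this explanation, not part of the original question) that backpropagates through a stack of tanh layers whose weights are drawn with a deliberately small scale of 0.01; the gradient norm collapses toward zero as depth grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, width = 30, 128

# Forward pass through tanh layers whose weights use a deliberately
# small scale (0.01) -- an illustrative choice, not a recommendation.
weights = [rng.standard_normal((width, width)) * 0.01 for _ in range(n_layers)]
a = rng.standard_normal(width)
activations = []
for W in weights:
    a = np.tanh(W @ a)
    activations.append(a)

# Backward pass: each layer multiplies the upstream gradient by the
# tanh derivative (1 - a**2) and by W.T. With tiny weights both
# factors have magnitude below one, so the norm decays layer by layer.
grad = np.ones(width)
for W, a in zip(reversed(weights), reversed(activations)):
    grad = W.T @ (grad * (1 - a ** 2))

print(f"Gradient norm reaching the first layer: {np.linalg.norm(grad):.2e}")
```

Running the same loop with a much larger weight scale (for example 2.0) shows the opposite failure mode: the gradient norm grows without bound, which is the exploding-gradient side of the problem.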
Key Concepts
Vanishing/Exploding Gradients Problem
Difficulty: Medium | Cognitive level: Understand
Deep Dive: Vanishing/Exploding Gradients Problem
Master the fundamentals
Definition
The vanishing/exploding gradients problem poses a challenge in training deep neural networks, hindering convergence during optimization. Techniques such as normalized initialization and intermediate normalization layers have been developed to mitigate this issue and enable the training of deep networks with improved convergence rates.
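As a rough illustration of those mitigations (a sketch, not a prescribed recipe), the PyTorch code below combines normalized (Glorot/Xavier) weight initialization with intermediate batch-normalization layers; both are aimed at keeping activation and gradient magnitudes roughly constant across depth.

```python
import torch
from torch import nn

def block(width: int) -> nn.Sequential:
    """One hidden block: linear layer, intermediate normalization, tanh."""
    linear = nn.Linear(width, width)
    # Normalized (Glorot/Xavier) initialization scales weights by the
    # layer's fan-in and fan-out to preserve signal variance.
    nn.init.xavier_uniform_(linear.weight)
    nn.init.zeros_(linear.bias)
    return nn.Sequential(linear, nn.BatchNorm1d(width), nn.Tanh())

width, depth = 128, 20
model = nn.Sequential(*[block(width) for _ in range(depth)])

# Sanity check: even 20 layers deep, the gradient reaching the first
# layer's weights keeps a healthy, non-vanishing magnitude.
x = torch.randn(64, width)
model(x).sum().backward()
print(model[0][0].weight.grad.norm())
```

Batch normalization stands in here for the "intermediate normalization layers" mentioned in the definition; layer normalization would fill the same role.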