Choose the Best Answer
A. True
B. False
Understanding the Answer
Let's break down why this is correct
Answer
The statement is false: the vanishing gradient problem hurts deep networks far more than shallow ones. During back-propagation, the gradient passes through every layer in turn, and each layer multiplies it by the local activation derivative and weight terms, factors that are often less than one in magnitude. After dozens of layers these repeated multiplications can drive the gradient to nearly zero, so the earliest layers stop learning. A shallow network involves only a few such multiplications, so its gradients stay large enough for useful updates. For example, in a 10‑layer network with sigmoid activations, where each layer scales the gradient by at most 0.25, the gradient reaching the first layer can shrink to roughly 10⁻⁶ of its original size, while in a 3‑layer network it keeps a usable fraction of its magnitude and training proceeds. Thus, deep networks are the main victims of vanishing gradients.
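A quick back-of-the-envelope calculation makes the depth effect concrete. The snippet below is a minimal sketch that assumes sigmoid activations (whose derivative never exceeds 0.25) and ignores the weight terms entirely, so it only bounds how much the activation derivatives alone can shrink the gradient.

```python
# Minimal sketch: upper bound on how much sigmoid derivatives (<= 0.25 each)
# shrink the gradient on its way back to the first layer.
for depth in (3, 10):
    scale = 0.25 ** depth
    print(f"{depth:2d} layers -> gradient scaled by at most {scale:.1e}")

# Output:
#  3 layers -> gradient scaled by at most 1.6e-02
# 10 layers -> gradient scaled by at most 9.5e-07
```

The 3‑layer factor is still large enough for meaningful weight updates, while the 10‑layer factor is effectively zero, which is exactly the contrast described above.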
Detailed Explanation
The vanishing gradient problem occurs when gradients shrink as they are propagated back through many layers, so the more layers the gradient must pass through, the smaller it becomes. The other option rests on the misconception that fewer layers mean more gradient loss; in fact the opposite holds, because every additional layer adds another multiplication that can shrink the gradient further.
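The same effect can be observed directly with an automatic-differentiation library; the sketch below assumes PyTorch is available and uses a hypothetical helper, first_layer_grad_norm, that builds a sigmoid MLP of a given depth, runs one backward pass on random data, and reports the gradient norm at the first layer.

```python
import torch
import torch.nn as nn

def first_layer_grad_norm(depth, width=32, seed=0):
    """Build a `depth`-layer sigmoid MLP and return the gradient norm
    of its first layer's weights after one backward pass."""
    torch.manual_seed(seed)
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(width, width), nn.Sigmoid()]
    model = nn.Sequential(*layers, nn.Linear(width, 1))
    x = torch.randn(16, width)
    loss = model(x).pow(2).mean()   # dummy squared loss on random inputs
    loss.backward()
    return model[0].weight.grad.norm().item()

print("3-layer :", first_layer_grad_norm(3))
print("10-layer:", first_layer_grad_norm(10))
# The 10-layer gradient is typically orders of magnitude smaller,
# illustrating why early layers in deep sigmoid networks learn so slowly.
```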
Key Concepts
Vanishing Gradients Problem
Deep Neural Networks
Gradient Descent Optimization
Topic
Vanishing/Exploding Gradients Problem
Difficulty
Medium
Cognitive Level
Understand