📚 Learning Guide
Vanishing/Exploding Gradients Problem
medium

The vanishing gradient problem primarily affects the learning capabilities of shallow networks more than deep networks.

Master this concept with our detailed explanation and step-by-step learning approach

Learning Path

Question & Answer

1. Understand Question
2. Review Options
3. Learn Explanation
4. Explore Topic

Choose the Best Answer

A. True
B. False

Understanding the Answer

Let's break down why this is correct

Answer

The statement is false: the vanishing gradient problem hurts deep networks far more than shallow ones. During back‑propagation, each layer multiplies the incoming gradient by that layer's activation derivative and weights, a factor that is typically less than one (a sigmoid's derivative never exceeds 0.25). After dozens of layers these factors compound, so the gradient reaching the earliest layers can become almost zero and those layers stop learning. A shallow network involves only a few such multiplications, so the gradient remains large enough for useful updates. For example, in a 10‑layer sigmoid network the upstream gradient can shrink by roughly 0.25¹⁰ ≈ 10⁻⁶, while in a 3‑layer network it stays close to its original magnitude, allowing training to proceed. Thus, deep networks are the main victims of vanishing gradients.
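To make the arithmetic concrete, here is a minimal sketch (not part of the original question) of how repeated multiplication by a per‑layer factor shrinks the gradient. It assumes sigmoid‑style activations whose derivative is at most 0.25; the function name and factor are illustrative.

```python
# Minimal sketch: gradient magnitude after backpropagating through N layers,
# assuming each layer scales the gradient by at most 0.25 (sigmoid derivative cap).

def upstream_gradient(num_layers: int, per_layer_factor: float = 0.25) -> float:
    """Gradient magnitude reaching the first layer after num_layers of backprop."""
    grad = 1.0  # gradient at the output layer
    for _ in range(num_layers):
        grad *= per_layer_factor  # each layer multiplies the gradient during backprop
    return grad

print(f"3-layer network:  {upstream_gradient(3):.2e}")   # ~1.6e-02, still usable
print(f"10-layer network: {upstream_gradient(10):.2e}")  # ~9.5e-07, effectively vanished
```

The same compounding explains why the 3‑layer case keeps a usable gradient while the 10‑layer case does not.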

Detailed Explanation

The vanishing gradient problem arises when gradients shrink as they are propagated backward through many layers. Answering "True" rests on the misconception that fewer layers mean more gradient loss; in fact, the more layers a gradient must pass through, the more it shrinks, so deep networks are affected far more than shallow ones.
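The sketch below illustrates this with an explicit forward and backward pass through a fully connected sigmoid network, comparing how much gradient signal reaches the input of a shallow versus a deep network. The width, seed, initialization scale, and the all‑ones output gradient are illustrative assumptions, not part of the original question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient_norm(num_layers: int, width: int = 32, seed: int = 0) -> float:
    """Norm of the gradient w.r.t. the input after backprop through num_layers sigmoid layers."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(width, 1))
    weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
               for _ in range(num_layers)]

    # Forward pass, caching each layer's activation for the backward pass.
    activations = []
    for W in weights:
        a = sigmoid(W @ a)
        activations.append(a)

    # Backward pass with a dummy all-ones gradient at the output.
    grad = np.ones((width, 1))
    for W, a in zip(reversed(weights), reversed(activations)):
        grad = W.T @ (grad * a * (1.0 - a))  # chain rule through one sigmoid layer
    return float(np.linalg.norm(grad))

print("3-layer network: ", input_gradient_norm(3))   # gradient signal survives
print("10-layer network:", input_gradient_norm(10))  # orders of magnitude smaller
```

Running it shows the deep network's first‑layer gradient collapsing toward zero while the shallow network's stays at a trainable scale.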

Key Concepts

Vanishing Gradients Problem
Deep Neural Networks
Gradient Descent Optimization
Topic

Vanishing/Exploding Gradients Problem

Difficulty

Medium

Cognitive Level

understand
