Vanishing/Exploding Gradients Problem

A team of researchers is developing a deep neural network for image recognition, but they notice that the network struggles to learn effectively as they increase the number of layers. Which of the following strategies would best address the vanishing/exploding gradients problem they are facing?


Choose the Best Answer

A. Implement batch normalization layers to stabilize the learning process
B. Increase the learning rate to speed up convergence
C. Reduce the number of training samples to avoid overfitting
D. Use a simpler model architecture with fewer parameters

Understanding the Answer

Let's break down why this is correct

The correct answer is A. Batch normalization normalizes the inputs to each layer, keeping activations in a stable range so that gradients can flow through many layers without shrinking toward zero or blowing up. The other options do not address the problem: increasing the learning rate (B) does not fix gradient flow and can make exploding gradients worse; reducing the number of training samples (C) relates to overfitting, not to how gradients propagate through layers; and switching to a simpler architecture with fewer parameters (D) sidesteps the difficulty of training a deep network rather than solving it.
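To make option A concrete, here is a minimal sketch, assuming PyTorch, of how batch normalization layers can be inserted between the linear layers of a deep network; the depth, width, and dummy loss are illustrative assumptions, not part of the question.

```python
# A minimal sketch (assuming PyTorch) of a deep network with batch
# normalization inserted after each linear layer. Sizes are illustrative.
import torch
import torch.nn as nn

def make_block(in_features, out_features, use_batchnorm=True):
    layers = [nn.Linear(in_features, out_features)]
    if use_batchnorm:
        # Normalizes each feature over the mini-batch, keeping activations
        # (and therefore gradients) in a stable range layer after layer.
        layers.append(nn.BatchNorm1d(out_features))
    layers.append(nn.ReLU())
    return layers

def make_deep_net(depth=20, width=256, in_dim=784, out_dim=10, use_batchnorm=True):
    layers = make_block(in_dim, width, use_batchnorm)
    for _ in range(depth - 2):
        layers += make_block(width, width, use_batchnorm)
    layers.append(nn.Linear(width, out_dim))
    return nn.Sequential(*layers)

net = make_deep_net()
x = torch.randn(32, 784)          # dummy mini-batch
loss = net(x).square().mean()     # placeholder loss for illustration
loss.backward()

# Inspect the gradient reaching the first layer; with batch normalization
# it should remain a reasonable magnitude even for a deep stack.
print(net[0].weight.grad.norm())
```

Toggling use_batchnorm=False in this sketch is a quick way to compare how much smaller the first-layer gradient becomes without the normalization layers.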

Key Concepts

Vanishing/Exploding Gradients Problem
Deep Neural Networks
Batch Normalization
Topic

Vanishing/Exploding Gradients Problem

Difficulty

Medium

Cognitive Level

understand

Deep Dive: Vanishing/Exploding Gradients Problem


Definition

The vanishing/exploding gradients problem poses a challenge in training deep neural networks, hindering convergence during optimization. Techniques such as normalized initialization and intermediate normalization layers have been developed to mitigate this issue and enable the training of deep networks with improved convergence rates.
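To make the role of normalized initialization concrete, here is a small NumPy sketch; the 50-layer ReLU stack, width 512, and the specific weight scales are illustrative assumptions. With a poorly scaled initialization the forward signal collapses toward zero after many layers, which is the same mechanism that causes gradients to vanish, while a He-style normalized initialization keeps the signal at a stable magnitude.

```python
# A small NumPy sketch comparing a poorly scaled initialization with
# He-style normalized initialization in a deep ReLU stack.
# Depth, width, and scales are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
depth, width = 50, 512
x = rng.standard_normal((64, width))   # dummy mini-batch of inputs

for name, scale in [("naive init (std=0.01)", 0.01),
                    ("He init (std=sqrt(2/fan_in))", np.sqrt(2.0 / width))]:
    h = x
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * scale
        h = np.maximum(h @ W, 0.0)      # ReLU activation
    # With the naive scale the activations shrink toward zero after many
    # layers; with He initialization their magnitude stays roughly stable.
    print(f"{name}: mean |activation| after {depth} layers = {np.abs(h).mean():.3e}")
```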

