Choose the Best Answer
A. True
B. False
Understanding the Answer
Let's break down why this is correct.
Answer
Transformers use attention to let every word look at every other word at once, so they do not need to read the sentence word by word the way RNNs do. Because attention connects any two positions directly, the model can process the whole sentence in parallel, which speeds up training. Since attention alone has no sense of order, Transformers add positional encodings so the model knows which word comes first or last. For example, in the sentence “The cat sat,” the attention mechanism computes relationships between “The,” “cat,” and “sat,” while positional encodings tell it that “The” comes before “cat.” This combination lets Transformers handle long sequences efficiently without sequential, token-by-token processing.
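To make this concrete, here is a minimal sketch of scaled dot-product self-attention over a three-token sentence. It is not a full Transformer layer (there are no learned query/key/value projections), and the embeddings are hypothetical random vectors, but it shows that every token's output is computed from all tokens in one matrix operation, with no loop over positions.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention: every row of X attends to every row at once."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise similarity between all tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ X                               # each output mixes information from all tokens

# Hypothetical 4-dimensional embeddings for "The", "cat", "sat"
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = self_attention(X)
print(out.shape)  # (3, 4): all three tokens are updated in one parallel step
```

Because the attention weights are produced by a single matrix product, "The" can relate to "sat" just as directly as to "cat"; no position has to wait for an earlier one to be processed first.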
Detailed Explanation
Transformers use self‑attention so every token can attend to every other token within the same layer. The opposite answer reflects a common misconception: because Transformers use positional encodings, people assume some hidden sequential processing remains. In fact, the encodings only add order information to the token embeddings; the computation itself stays parallel.
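As a rough illustration of why positional encoding does not reintroduce sequential processing, the sketch below (assuming the sinusoidal scheme from the original Transformer paper and toy dimensions chosen for illustration) simply adds a fixed, position-dependent vector to each embedding in one elementwise step.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Fixed sinusoidal encodings (Vaswani et al., 2017)."""
    positions = np.arange(seq_len)[:, None]            # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                 # (1, d_model)
    angles = positions / np.power(10000, (2 * (dims // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])             # even dimensions use sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])             # odd dimensions use cosine
    return enc

# Toy embeddings for a 3-token sentence; adding the encoding is a single
# elementwise operation, so no token waits on a previous one.
embeddings = np.ones((3, 8))
inputs = embeddings + sinusoidal_positional_encoding(3, 8)
print(inputs.shape)  # (3, 8)
```

The encoding is computed once from the position index alone, so it marks where each token sits in the sentence without making the model read tokens in order.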
Key Concepts
Transformer architecture
Attention mechanisms
Machine translation
Topic
Transformer Architecture
Difficulty
Easy
Cognitive Level
understand