Choose the Best Answer
A. True
B. False
Understanding the Answer
Let's break down why this is correct
Answer
The Transformer uses attention to let every word look at every other word at once, so it doesn’t need the step-by-step loop that RNNs require. Because it processes all positions together, it can be parallelized and handles long texts faster. To still know which word comes where, it adds positional encodings that give each word a sense of its place in the sentence. The model therefore does not rely on recurrent, sequential processing, yet it still respects word order. For example, in the sentence “The cat sat,” attention lets the word “sat” see both “The” and “cat” simultaneously, while the positional encodings tell the model that “cat” is the second word.
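The sketch below illustrates this idea in plain NumPy: a toy single-head self-attention step (without learned projections) plus sinusoidal positional encodings, applied to made-up embeddings for “The cat sat.” The embeddings and dimensions are assumptions for illustration only, not values from a real model.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encodings: each position gets a unique
    # pattern, so the model can recover word order without a loop.
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model)[None, :]            # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def self_attention(x):
    # Toy single-head self-attention: queries, keys, and values are the
    # inputs themselves (no learned projection matrices in this sketch).
    d_k = x.shape[-1]
    scores = x @ x.T / np.sqrt(d_k)                       # every token scores every other token at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over all positions
    return weights @ x                                    # weighted mix of all tokens, no sequential loop

# Hypothetical 4-dimensional embeddings for "The cat sat" (illustration only).
tokens = ["The", "cat", "sat"]
x = np.random.default_rng(0).normal(size=(3, 4))
x = x + positional_encoding(3, 4)   # inject word-order information
out = self_attention(x)
print(out.shape)                    # (3, 4): all three positions updated in one parallel step
```

Note that nothing in `self_attention` iterates over time steps: the whole sentence is transformed in one matrix operation, which is exactly why Transformers parallelize so well compared with RNNs.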
Detailed Explanation
Transformers use attention to look at all words at once. The other option is incorrect because it rests on the common misconception that Transformers still read words one after another, the way RNNs do.
Key Concepts
Transformer architecture
Attention mechanisms
Machine translation
Topic
Transformer Architecture
Difficulty
Easy
Cognitive Level
Understand