Top 20 Tricky MCQs on Retrieval-Augmented Generation (RAG) with Answers
Retrieval-Augmented Generation (RAG) is a powerful technique that enhances large language models by combining information retrieval with text generation. It helps reduce hallucinations by grounding responses in external knowledge sources.
In this post, you will find 20 carefully designed MCQs on RAG with answers and explanations. These questions are useful for GATE, placements, NLP interviews, and data science exams.
Explanation:
The retriever fetches relevant external documents to provide context for generation.
Explanation:
Embedding models transform text into vector representations for similarity search.
An embedding model is a machine learning model that converts text (words, sentences, or documents) into numerical vectors in a high-dimensional space to enable semantic similarity search.
In the context of Retrieval-Augmented Generation (RAG):
- The query is converted into a vector
- Documents are also stored as vectors
- Then similarity (e.g., cosine similarity) is computed to find relevant documents
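The retrieval step above can be sketched with plain Python. This is a toy example with made-up 3-dimensional vectors (real embedding models produce hundreds of dimensions); the cosine-similarity formula itself is standard.

```python
import math

def cosine_similarity(a, b):
    """Angular similarity between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values for illustration only).
query = [0.9, 0.1, 0.3]
doc_a = [0.8, 0.2, 0.4]   # semantically close to the query
doc_b = [0.1, 0.9, 0.1]   # unrelated topic

# The document whose vector points in a similar direction wins.
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```

Because cosine similarity compares direction rather than magnitude, a short query and a long document can still match if they point the same way in the embedding space.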
Explanation:
Cosine similarity measures angular similarity and is widely used for embeddings.
Explanation:
Large chunks reduce retrieval granularity, making precise matching harder.
More info on chunk size and granularity:
In Retrieval-Augmented Generation, documents are split into chunks before being embedded and stored. Large chunk sizes reduce retrieval granularity, making it harder to extract precise and relevant information.
What does “granularity” mean?
High granularity → fine, precise pieces of information
Low granularity → large, coarse blocks of text
What if the chunk size is too large?
If each chunk is very big, then:
- Each chunk contains too much mixed information
- Embeddings become less specific
- Retrieval returns broad but less relevant context
- The system may be unable to pinpoint the exact relevant passage.
That is exactly what is meant by reduced granularity of retrieved information.
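The chunking trade-off can be seen directly in code. The sketch below is a minimal character-based splitter with overlap (assumed for illustration; real pipelines often split on tokens or sentences): small chunks give many precise pieces, large chunks give few coarse ones.

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size character chunks with overlap,
    so content near a boundary appears in both neighbouring chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "RAG splits documents into chunks before embedding. " * 5

small = chunk_text(doc, chunk_size=80, overlap=20)   # high granularity
large = chunk_text(doc, chunk_size=400, overlap=20)  # low granularity
print(len(small), len(large))  # many small chunks vs. one big chunk
```

Note how the overlap makes the tail of each chunk reappear at the head of the next, which is exactly why important context is not lost at chunk boundaries.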
Explanation:
RAG reduces hallucinations by grounding outputs in retrieved external knowledge.
Why do LLMs hallucinate?
Standard language models generate answers based on internal (parametric) knowledge. If they don’t know the answer, or have outdated/incomplete knowledge, they may confidently generate incorrect information (hallucination).
How does Retrieval-Augmented Generation reduce hallucinations?
RAG changes the process from: “Generate from memory” to “Retrieve → then generate based on evidence”
Steps in RAG to avoid hallucination
1. Query comes in.
2. Retrieval: RAG searches documents/databases, fetches relevant, real information. This is called grounding.
3. Context injection: The retrieved content is added to the prompt.
4. Controlled generation: The model now relies on actual retrieved facts, not just its internal memory.
Note: RAG does NOT eliminate hallucination completely. It only reduces hallucination and improves factual accuracy.
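The four steps above can be sketched as a minimal pipeline. Everything here is a stand-in: the retriever scores documents by simple word overlap (a real system would use embedding similarity over a vector database), and the final prompt would be handed to an LLM rather than printed.

```python
def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by word overlap with the query.
    Stands in for embedding-based similarity search."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query, context_docs):
    """Context injection: prepend the retrieved evidence to the prompt,
    instructing the model to ground its answer in it."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "RAG grounds generation in retrieved documents.",
    "Transformers use self-attention layers.",
    "Cosine similarity compares embedding vectors.",
]
query = "How does RAG ground its answers?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)  # this prompt would then be sent to the generator LLM
```

The key structural point is that the generator never sees the raw query alone: it always receives retrieved evidence alongside it, which is what "grounding" means in practice.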
Explanation:
Top-k retrieval returns the most relevant documents based on similarity.
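Given similarity scores for every candidate document (hypothetical values below), top-k selection is just "keep the k highest-scoring entries", for which Python's `heapq.nlargest` is a natural fit:

```python
import heapq

# Hypothetical (doc_id, similarity) pairs from a similarity search.
scores = [("d1", 0.42), ("d2", 0.91), ("d3", 0.77), ("d4", 0.15)]

# Top-k retrieval: keep only the k most similar documents.
top_2 = heapq.nlargest(2, scores, key=lambda pair: pair[1])
print(top_2)  # [('d2', 0.91), ('d3', 0.77)]
```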
Explanation:
Vector databases efficiently store and search embeddings.
Explanation:
The generator (LLM) produces the final response using retrieved context.
Explanation:
High-quality embeddings improve semantic matching and retrieval performance.
Explanation:
Chunk overlap ensures that important context is not lost between adjacent chunks.
Explanation:
Modern language models used in RAG are based on the Transformer architecture.
Explanation:
Re-ranking refines the initially retrieved documents to improve relevance.
Explanation:
Irrelevant context can mislead the generator and increase hallucinated outputs.
Explanation:
Caching avoids recomputation and speeds up repeated queries.
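A simple way to get this behaviour in Python is `functools.lru_cache`. The `embed` function below is a dummy stand-in for an expensive embedding call; only the caching pattern is the point.

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def embed(text):
    """Stand-in for an expensive embedding call, cached per unique input.
    A real implementation would call an embedding model here."""
    return tuple(float(ord(c)) for c in text)  # dummy deterministic vector

embed("what is rag?")            # computed
embed("what is rag?")            # served from the cache, no recomputation
print(embed.cache_info().hits)   # 1
```

Repeated queries (common in production RAG systems) then skip the embedding call entirely, reducing both latency and cost.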
Explanation:
Dense retrieval uses continuous vector embeddings for semantic search.
Explanation:
Discriminators are used in GANs, not in RAG architectures.
Explanation:
Normalization ensures consistent and meaningful similarity calculations.
Explanation:
Increasing retrieval depth improves accuracy but also increases latency.
Explanation:
Hybrid retrieval combines semantic (dense) and keyword (sparse) search.
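One common way to combine the two signals is a weighted score blend; the sketch below assumes both scores are already normalised to [0, 1], and all per-document numbers are hypothetical.

```python
def hybrid_score(dense, sparse, alpha=0.5):
    """Weighted blend of dense (semantic) and sparse (keyword) scores.
    alpha=1.0 is pure dense retrieval; alpha=0.0 is pure keyword search."""
    return alpha * dense + (1 - alpha) * sparse

# Hypothetical per-document scores, each normalised to [0, 1].
docs = {
    "d1": {"dense": 0.90, "sparse": 0.10},  # semantically close, few exact terms
    "d2": {"dense": 0.40, "sparse": 0.95},  # strong exact keyword match
}
ranked = sorted(docs,
                key=lambda d: hybrid_score(docs[d]["dense"], docs[d]["sparse"]),
                reverse=True)
print(ranked)  # ['d2', 'd1'] at alpha=0.5
```

Tuning `alpha` lets the system favour semantic recall (paraphrases, synonyms) or exact-term precision (IDs, rare keywords) depending on the workload.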