Top 20 Tricky MCQs on Retrieval-Augmented Generation (RAG) with Answers

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances large language models by combining information retrieval with text generation. It helps reduce hallucinations by grounding responses in external knowledge sources.

In this post, you will find 20 carefully designed MCQs on RAG with answers and explanations. These questions are useful for GATE, placements, NLP interviews, and data science exams.

1.
In a RAG system, what is the primary purpose of the retriever?






Correct Answer: C

Explanation:

The retriever fetches relevant external documents to provide context for generation.

2.
Which component converts text into vectors in RAG?






Correct Answer: C

Explanation:

Embedding models transform text into vector representations for similarity search.

An embedding model is a machine learning model that converts text (words, sentences, or documents) into numerical vectors in a high-dimensional space to enable semantic similarity search.

In the context of Retrieval-Augmented Generation (RAG):

  • The query is converted into a vector
  • Documents are also stored as vectors
  • Then similarity (e.g., cosine similarity) is computed to find relevant documents
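
The query/document matching above can be sketched with toy vectors. This is a minimal sketch: a real system would call an embedding model, while here the 3-dimensional vectors and document names are invented stand-ins.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: a·b / (|a| |b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" standing in for a real embedding model's output.
doc_vectors = {
    "doc_refunds":  np.array([0.9, 0.1, 0.0]),
    "doc_shipping": np.array([0.1, 0.9, 0.2]),
    "doc_privacy":  np.array([0.0, 0.2, 0.9]),
}

query_vector = np.array([0.8, 0.2, 0.1])  # embedding of the user query

# Rank documents by similarity to the query, most similar first.
ranked = sorted(doc_vectors.items(),
                key=lambda kv: cosine_similarity(query_vector, kv[1]),
                reverse=True)
print(ranked[0][0])  # -> doc_refunds
```

Because both queries and documents live in the same vector space, "relevance" reduces to a geometric comparison.
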
3.
Which similarity metric is most commonly used in vector search?






Correct Answer: C

Explanation:

Cosine similarity measures angular similarity and is widely used for embeddings.

4.
What happens if chunk size is too large in RAG?






Correct Answer: B

Explanation:

Large chunks reduce retrieval granularity, making precise matching harder.

More info on chunk size and granularity:
In Retrieval-Augmented Generation, documents are split into chunks before being embedded and stored. Large chunk sizes reduce retrieval granularity, making it harder to extract precise and relevant information.

What does “granularity” mean?
  • High granularity → fine, precise pieces of information
  • Low granularity → large, coarse blocks of text

What if each chunk size is too large?
If each chunk is very big, then:

  • Each chunk contains too much mixed information
  • Embeddings become less specific
  • Retrieval returns broad but less relevant context
  • As a result, the system cannot pinpoint the exact relevant passage.

That is exactly: Reduced granularity of retrieved information.
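
The effect of chunk size can be seen with a minimal character-based chunker (a sketch; production systems typically split on tokens or sentences, and the sample `document` text is invented):

```python
def split_into_chunks(text, chunk_size):
    """Split text into fixed-size character chunks (no overlap)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

document = ("RAG splits documents into chunks. Each chunk is embedded "
            "separately. Small chunks give fine-grained retrieval.")

coarse = split_into_chunks(document, 120)  # one big chunk: low granularity
fine = split_into_chunks(document, 40)     # several small chunks: high granularity

print(len(coarse), len(fine))
```

With `chunk_size=120` the whole document becomes a single embedding that mixes three topics; with `chunk_size=40` each embedding captures roughly one statement, so similarity search can pinpoint the relevant passage.
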

5.
Which problem does RAG primarily aim to mitigate?






Correct Answer: B

Explanation:

RAG reduces hallucinations by grounding outputs in retrieved external knowledge.

Why do LLMs hallucinate?

Standard language models generate answers based on internal (parametric) knowledge. If they don’t know the answer, or have outdated/incomplete knowledge, they may confidently generate incorrect information (hallucination).

How does Retrieval-Augmented Generation reduce hallucinations?

RAG changes the process from "generate from memory" to "retrieve, then generate based on evidence."

Steps in RAG to avoid hallucination


1. Query comes in.
2. Retrieval: RAG searches documents/databases, fetches relevant, real information. This is called grounding.
3. Context injection: The retrieved content is added to the prompt.
4. Controlled generation: The model now relies on actual retrieved facts, not just its internal memory.

Note: RAG does NOT eliminate hallucination completely. It only reduces hallucination and improves factual accuracy.
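
The retrieve → inject → generate flow can be sketched as prompt construction. This is only the context-injection step (the final LLM call is omitted), and the query and retrieved snippets are invented for illustration:

```python
def build_rag_prompt(query, retrieved_chunks):
    """Inject retrieved evidence into the prompt so generation is grounded."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    "When was the warranty policy last updated?",
    ["The warranty policy was last updated in March 2024.",
     "Warranty claims must be filed within 30 days."],
)
print(prompt)
```

The instruction to answer only from the context, plus the permission to say "I don't know," is what steers the model away from filling gaps with invented facts.
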
6.
In RAG, what is “top-k retrieval”?






Correct Answer: B

Explanation:

Top-k retrieval returns the most relevant documents based on similarity.
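
Selecting the top-k documents from an array of similarity scores is a small NumPy idiom (a sketch; the similarity values below are made up):

```python
import numpy as np

def top_k_indices(scores, k):
    """Return indices of the k highest similarity scores, best first."""
    k = min(k, len(scores))
    idx = np.argpartition(scores, -k)[-k:]     # the top k, in arbitrary order
    return idx[np.argsort(scores[idx])[::-1]]  # sort those k descending

similarities = np.array([0.12, 0.87, 0.45, 0.91, 0.33])
print(top_k_indices(similarities, 3))  # -> document indices 3, 1, 2
```

`argpartition` avoids fully sorting every score, which matters when the corpus holds millions of vectors and only a handful are needed.
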

7.
Which database is typically used in RAG systems?






Correct Answer: C

Explanation:

Vector databases efficiently store and search embeddings.
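
A vector database's core interface (add vectors, search by similarity) can be mimicked with a minimal in-memory class. This is a toy stand-in, not how real systems like FAISS or Chroma are implemented internally, and the document ids and vectors are invented:

```python
import numpy as np

class TinyVectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self):
        self.ids, self.vectors = [], []

    def add(self, doc_id, vector):
        """Store a document id together with its embedding."""
        self.ids.append(doc_id)
        self.vectors.append(np.asarray(vector, dtype=float))

    def search(self, query_vector, k=1):
        """Return ids of the k most similar vectors by cosine similarity."""
        m = np.stack(self.vectors)
        q = np.asarray(query_vector, dtype=float)
        sims = (m @ q) / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))
        order = np.argsort(sims)[::-1][:k]
        return [self.ids[i] for i in order]

store = TinyVectorStore()
store.add("faq_returns", [0.9, 0.1])
store.add("faq_billing", [0.1, 0.9])
print(store.search([0.8, 0.3], k=1))  # -> ['faq_returns']
```

Real vector databases add what this toy lacks: approximate-nearest-neighbor indexes for speed at scale, persistence, and metadata filtering.
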

8.
What is the role of the generator in RAG?






Correct Answer: C

Explanation:

The generator (LLM) produces the final response using retrieved context.

9.
Which factor most affects retrieval accuracy?






Correct Answer: B

Explanation:

High-quality embeddings improve semantic matching and retrieval performance.

10.
What is “chunk overlap”?






Correct Answer: B

Explanation:

Chunk overlap ensures that important context is not lost between adjacent chunks.
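
Chunk overlap is a sliding window: each new chunk starts before the previous one ends, so a sentence cut at a boundary still appears whole in one of the chunks. A minimal sketch over a toy string:

```python
def chunk_with_overlap(text, chunk_size, overlap):
    """Sliding-window chunking: consecutive chunks share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

print(chunk_with_overlap("abcdefghij", 4, 2))
# -> ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Note how characters `c`–`d`, `e`–`f`, and `g`–`h` each appear in two adjacent chunks: that repetition is the safety margin that keeps boundary-spanning context retrievable.
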

11.
Which architecture is commonly used in RAG generators?






Correct Answer: C

Explanation:

Modern language models used in RAG are based on the Transformer architecture.

12.
Why is re-ranking used in RAG?






Correct Answer: B

Explanation:

Re-ranking refines the initially retrieved documents to improve relevance.
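
Re-ranking is a second, more careful scoring pass over the retriever's candidate list. As a sketch, the code below uses simple keyword overlap as a stand-in for the cross-encoder models typically used in practice; the candidate sentences are invented:

```python
def keyword_overlap(query, doc):
    """Stand-in for a cross-encoder: fraction of query words found in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words)

def rerank(query, candidates, top_n=2):
    """Second-stage ranking over the first-stage retriever's candidates."""
    return sorted(candidates,
                  key=lambda d: keyword_overlap(query, d),
                  reverse=True)[:top_n]

candidates = [  # pretend these came back from first-stage vector search
    "Shipping takes five business days.",
    "Refunds are issued within ten business days.",
    "Our refund policy covers all items.",
]
print(rerank("refund policy details", candidates, top_n=1))
```

The pattern is cheap-but-broad first (vector search over millions of chunks), expensive-but-precise second (re-scoring only the few dozen survivors).
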

13.
What happens if irrelevant documents are retrieved in RAG?






Correct Answer: B

Explanation:

Irrelevant context can mislead the generator and increase hallucinated outputs.

14.
Which technique helps reduce latency in RAG?






Correct Answer: B

Explanation:

Caching avoids recomputation and speeds up repeated queries.
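
Embedding calls are a natural caching target, since identical queries recur often. A minimal sketch using Python's `functools.lru_cache`; the `embed` function is a made-up placeholder for a real model or API call:

```python
from functools import lru_cache

CALLS = {"count": 0}  # track how often the "expensive" call actually runs

@lru_cache(maxsize=1024)
def embed(text):
    """Pretend embedding call; in production this would hit a model or API."""
    CALLS["count"] += 1
    return tuple(float(ord(c)) for c in text[:4])  # toy deterministic vector

embed("what is RAG?")
embed("what is RAG?")  # identical query: served from cache, no recomputation
print(CALLS["count"])  # -> 1
```

Production systems apply the same idea at more layers: caching retrieved document lists per query, and sometimes whole generated answers for frequent questions.
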

15.
What is “dense retrieval”?






Correct Answer: B

Explanation:

Dense retrieval uses continuous vector embeddings for semantic search.

16.
Which is NOT a component of a standard RAG system?






Correct Answer: D

Explanation:

Discriminators are used in GANs, not in RAG architectures.

17.
Why is normalization applied to embeddings?






Correct Answer: B

Explanation:

Normalization ensures consistent and meaningful similarity calculations.
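
Concretely, L2-normalizing embeddings to unit length makes a plain dot product equal cosine similarity, so vector magnitude no longer distorts the comparison. A small sketch with made-up vectors:

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    return v / np.linalg.norm(v)

a = l2_normalize(np.array([3.0, 4.0]))  # norm 5 -> [0.6, 0.8]
b = l2_normalize(np.array([6.0, 8.0]))  # same direction, twice the magnitude

print(np.dot(a, b))  # ≈ 1.0: identical direction scores as identical
```

Without normalization, `[6, 8]` would produce a dot product four times larger than `[3, 4]` against the same query despite pointing the same way. Many vector databases assume normalized vectors precisely so they can use the cheaper dot product internally.
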

18.
Which trade-off is critical in RAG systems?






Correct Answer: B

Explanation:

Increasing retrieval depth improves accuracy but also increases latency.

19.
What is hybrid retrieval?






Correct Answer: B

Explanation:

Hybrid retrieval combines semantic (dense) and keyword (sparse) search.
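
One common way to combine the two signals is a weighted blend of scores (another is reciprocal rank fusion). This sketch invents the documents, the query, and the dense similarity values; the keyword scorer is a crude stand-in for BM25:

```python
import numpy as np

def keyword_score(query, doc):
    """Sparse signal: fraction of query terms that appear in the document."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q)

def hybrid_score(dense_sim, kw, alpha=0.5):
    """Weighted blend of dense (semantic) and sparse (keyword) scores."""
    return alpha * dense_sim + (1 - alpha) * kw

docs = ["error code E42 on startup", "the app fails to launch"]
dense_sims = [0.55, 0.80]  # pretend cosine similarities from a dense retriever
query = "error code E42"

scores = [hybrid_score(s, keyword_score(query, d))
          for s, d in zip(dense_sims, docs)]
best = docs[int(np.argmax(scores))]
print(best)  # -> error code E42 on startup
```

Note the dense retriever alone would have preferred the second document, but the exact-match term "E42" only exists as a keyword signal; this is exactly the case (rare identifiers, product codes, names) where hybrid retrieval beats pure semantic search.
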

20.
Which step must occur before similarity search in a RAG pipeline?