
Thursday, February 19, 2026

RAG vs Fine-Tuning MCQs (Top 10 with Detailed Explanations) – Generative AI Guide

RAG vs Fine-Tuning: Top 10 MCQs with Detailed Explanations

As large language models (LLMs) become central to modern AI applications, two key techniques are widely used to adapt them for real-world tasks: Retrieval-Augmented Generation (RAG) and Fine-tuning. While both approaches enhance model performance, they serve different purposes and are suitable for different scenarios.

RAG improves factual accuracy by retrieving relevant external information at query time, making it ideal for dynamic and frequently updated knowledge. Fine-tuning, on the other hand, modifies the model’s internal parameters to align its behavior, tone, and task-specific capabilities.

Understanding the differences between these two techniques is essential for students, researchers, and AI practitioners working with generative AI systems. This MCQ set presents carefully designed questions that test conceptual understanding, practical use cases, scalability considerations, and real-world trade-offs between RAG and Fine-tuning.

Each question includes a detailed explanation to help you build strong conceptual clarity and prepare for exams, interviews, and advanced study in Generative AI.


RAG vs Fine-Tuning: A Simple Comparison

Retrieval-Augmented Generation (RAG) and Fine-tuning are two widely used techniques for improving the performance of large language models (LLMs). Although both methods enhance model usefulness, they work in fundamentally different ways and are suitable for different types of problems.

Aspect | RAG | Fine-Tuning
How it works | Retrieves relevant external documents at query time and uses them as context | Updates the model's internal weights using additional training data
Knowledge updates | Easy – just update the document database | Difficult – requires retraining the model
Best for | Frequently changing or large knowledge bases | Consistent tone, style, or task-specific behavior
Infrastructure | Requires embeddings and a vector database | Requires training data and computational resources
Knowledge storage | External (documents, databases) | Internal (model parameters)
Use cases | Chatbots with company knowledge, website assistants, enterprise search | Structured outputs, domain-specific writing style, instruction alignment

In practice, many real-world systems combine both approaches. RAG provides up-to-date and factual information, while fine-tuning ensures consistent response quality, tone, and task alignment.


Practice Questions on RAG vs Fine-Tuning

1.
What is the primary purpose of Retrieval-Augmented Generation (RAG)?






Correct Answer: B

Explanation:

Retrieval-Augmented Generation (RAG) is designed to improve the factual accuracy and relevance of large language model outputs by providing external knowledge at inference time. Instead of modifying the model’s internal weights, RAG retrieves semantically relevant documents (using embeddings and vector search) based on the user’s query and includes this information in the prompt.

This approach is particularly useful when:

  • The knowledge base is large or frequently updated
  • The information is domain-specific or private
  • Retraining the model is expensive or impractical

Unlike fine-tuning, RAG keeps the model unchanged and separates knowledge storage from model learning, making it scalable and flexible for real-world applications such as enterprise search, website assistants, and documentation chatbots.
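A minimal sketch of this retrieval step is shown below. The embedding model name, the toy documents, and the final llm.generate call are illustrative assumptions, not part of any specific system described in this post.

from sentence_transformers import SentenceTransformer
import numpy as np

# Toy knowledge base; in practice these documents would live in a vector database.
documents = [
    "Refunds are processed within 7 business days.",
    "Support is available Monday to Friday, 9am to 5pm.",
    "Premium plans include priority support and extended storage.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model
doc_vectors = encoder.encode(documents, normalize_embeddings=True)

def retrieve(query, k=2):
    # Embed the query and return the k most similar documents.
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q            # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "How long do refunds take?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# The prompt is then sent to an unchanged LLM, e.g. llm.generate(prompt)  (hypothetical call).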

2.
Which component of a large language model is modified during fine-tuning?






Correct Answer: C

Explanation:

Fine-tuning involves updating the internal weights and parameters of a pretrained large language model using additional domain-specific or task-specific training data. This process adjusts how the model represents language patterns internally, allowing it to better perform a targeted task or adopt a specific behavioral style.

During fine-tuning:
  • The model undergoes additional gradient updates using supervised training data
  • Parameters are modified to reflect domain knowledge or output preferences
  • The learned changes become permanently embedded in the model

It is important to note that fine-tuning does not modify the context window size, the external document store, or the prompt template.

Unlike RAG, which retrieves knowledge dynamically at inference time, fine-tuning encodes knowledge directly into model weights. This makes it suitable for:

  • Style control (formal, academic, conversational)
  • Structured output formatting
  • Task-specific behavior alignment

However, incorporating new factual knowledge through fine-tuning requires retraining, which can be computationally expensive and time-consuming.
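As a rough illustration of "gradient updates to model weights", the sketch below runs a couple of supervised updates on a small causal language model. The model name (gpt2) and the two training strings are placeholders; real fine-tuning needs a proper dataset, batching, and evaluation.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")        # assumed small base model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

examples = [
    "Q: Define tort. A: A tort is a civil wrong that causes harm to another party.",
    "Q: Define consideration. A: Consideration is the value exchanged in a contract.",
]

model.train()
for text in examples:
    batch = tokenizer(text, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])  # standard causal LM loss
    outputs.loss.backward()      # gradients with respect to the model's parameters
    optimizer.step()             # weights change: style/knowledge is baked into the model
    optimizer.zero_grad()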

3.
Which approach is most suitable for handling knowledge that changes frequently, such as company policies or product catalogs?






Correct Answer: B

Explanation:

RAG is specifically designed to retrieve external knowledge dynamically at inference time. This makes it highly suitable for domains where information changes frequently, such as policy updates, pricing data, inventory details, or regulatory documents.

In contrast, fine-tuning embeds knowledge into the model’s weights. If the knowledge changes, the model must be retrained, which is computationally expensive and operationally inefficient.

RAG allows organizations to:

  • Update documents in the knowledge base without retraining
  • Maintain separation between knowledge storage and model reasoning
  • Scale easily as data grows

Therefore, for dynamic and evolving knowledge environments, RAG is the preferred and scalable solution.

4.
A company wants its chatbot to consistently respond in a formal legal tone with structured output formatting. Which method is most appropriate?






Correct Answer: B

Explanation:

Fine-tuning modifies the model’s internal parameters to align its behavior with specific stylistic, structural, or task requirements. If a chatbot must consistently generate responses in a formal legal tone with defined output formatting, fine-tuning provides long-term behavioral alignment.

RAG, on the other hand, focuses on retrieving factual information. While it can improve knowledge accuracy, it does not guarantee stylistic consistency across responses.

Fine-tuning is ideal when:

  • Output style must remain consistent
  • Responses follow a predefined template
  • Task-specific reasoning behavior is required

Thus, for tone control and structural alignment, fine-tuning is the most appropriate method.

5.
What is a major limitation of fine-tuning when compared to RAG?






Correct Answer: B

Explanation:

Fine-tuning embeds knowledge directly into the model’s parameters. While this can improve task performance and stylistic alignment, updating knowledge requires retraining the model with new data.

Retraining is:

  • Computationally expensive
  • Time-consuming
  • Operationally complex

In contrast, RAG allows immediate knowledge updates by simply modifying the external document store. No retraining is required. This makes RAG significantly more flexible in rapidly evolving domains.

6.
In a RAG pipeline, what is the primary function of embeddings?






Correct Answer: C

Explanation:

Embeddings transform textual data into high-dimensional numerical vectors that capture semantic meaning. In a RAG system, both user queries and documents are converted into embeddings.

The system then performs similarity search to identify documents whose embeddings are closest to the query embedding. This enables contextually relevant retrieval beyond simple keyword matching.

Thus, embeddings are the core mechanism enabling semantic retrieval in RAG architectures.
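The scoring step can be illustrated with plain cosine similarity over toy vectors (the numbers below are made up; real systems use learned embeddings with hundreds of dimensions):

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = np.array([0.9, 0.1, 0.3])                    # embedding of the user query
doc_vecs = {
    "refund policy": np.array([0.8, 0.2, 0.4]),          # semantically close to the query
    "office hours":  np.array([0.1, 0.9, 0.2]),          # semantically distant
}

best = max(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]))
print(best)   # the document whose embedding is closest to the query is retrieved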

7.
Why do many production systems combine RAG and fine-tuning?






Correct Answer: B

Explanation:

RAG provides up-to-date factual knowledge by retrieving external documents, while fine-tuning aligns the model’s internal reasoning style and output structure.

Combining both allows systems to:

  • Deliver accurate, grounded responses
  • Maintain stylistic consistency
  • Align with domain-specific requirements

This hybrid approach is increasingly common in enterprise AI deployments.

8.
Why is RAG considered more scalable for enterprise knowledge management?






Correct Answer: C

Explanation:

RAG decouples knowledge from the model itself. Documents are stored externally in databases or vector stores, allowing independent updates without retraining the model.

This separation:

  • Improves scalability
  • Reduces maintenance cost
  • Supports large and evolving knowledge bases

For enterprises managing thousands of documents, this architecture is significantly more efficient than embedding knowledge into model weights.

9.
Which statement best describes the cost trade-off between RAG and fine-tuning?






Correct Answer: C

Explanation:

Fine-tuning requires computational resources for training, dataset preparation, and validation. These costs occur upfront.

RAG avoids retraining but introduces ongoing infrastructure requirements such as:

  • Embedding generation
  • Vector database maintenance
  • Retrieval computation during inference

Therefore, each method has different cost dynamics depending on system scale and usage patterns.

10.
How does RAG help reduce hallucinations compared to fine-tuning?






Correct Answer: B

Explanation:

Hallucination occurs when a model generates plausible but incorrect information. RAG mitigates this by supplying retrieved documents as grounding evidence.

Because the model generates responses conditioned on real retrieved content, factual reliability improves. However, RAG does not eliminate hallucinations completely — it reduces them by anchoring responses in external knowledge.

Fine-tuning improves behavior and task alignment but does not inherently guarantee grounding in external evidence.

Tuesday, February 17, 2026

Top 10 Dependency Parsing MCQs (With Detailed Explanations) | NLP Practice Quiz Questions and Answers


Top Graph-Based Dependency Parsing MCQs - Quiz Questions and Detailed Explanations

Graph-based dependency parsing is a powerful approach in Natural Language Processing (NLP) that models sentence structure as a graph and selects the highest-scoring dependency tree using global optimization techniques. Unlike transition-based methods, graph-based parsers evaluate relationships across the entire sentence, making them effective for handling complex linguistic structures and long-distance dependencies.

This MCQ set focuses on key concepts related to graph-based dependency parsing, including projectivity, non-projective structures, Maximum Spanning Tree (MST) algorithms, scoring mechanisms, and global inference strategies. Each question is accompanied by a clear explanation to support learning, exam preparation, and interview readiness.

These questions are especially useful for students, researchers, and professionals working in NLP, computational linguistics, and machine learning who want to strengthen their understanding of modern dependency parsing techniques.

1.
In dependency parsing, what does a dependency relation represent?






Correct Answer: B

Dependency parsing models syntactic structure using directed relations between a head word and its dependent.

A dependency relation shows how one word (dependent) is grammatically connected to another word (head). Each relation answers the questions (a) Which word depends on which? and (b) What is their grammatical role? (subject, object, modifier, etc.)

Example

Sentence:
“She drives a Mercedes-Benz C-Class.”

Dependency Relations:

  • drives → root (main verb)
  • She → dependent of drives (subject → nsubj)
  • C-Class → dependent of drives (object → obj)
  • a → determiner of C-Class (det)
  • Mercedes-Benz → compound modifier of C-Class (compound)

So the structure is:

  • drives → She (nsubj)
  • drives → C-Class (obj)
  • C-Class → a (det)
  • C-Class → Mercedes-Benz (compound)

Each arrow represents a dependency relation between a head word and its dependent.
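The same relations can be inspected programmatically. The sketch below uses spaCy with its small English model; the exact labels it prints may differ slightly from the hand-worked example above.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She drives a Mercedes-Benz C-Class.")

for token in doc:
    # each line: dependent  <-relation-  head
    print(f"{token.text:15} <-{token.dep_:10}- {token.head.text}")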


Why other options are INCORRECT?

Option A: Relationship between phrases. This is constituency parsing, not dependency parsing.
Option C: Relationship between sentences. Dependency parsing works within a sentence.
Option D: Relationship between characters. Parsing operates at the word level, not characters.

2.
Which property must every valid dependency tree satisfy?






Correct Answer: C

A dependency structure forms a tree: one root, no cycles, and every word has exactly one head except the root (root has no head).

Key Property: Single-Head Constraint

In a valid dependency tree:

  • Every word must have exactly one head, except the root word, which has no head.
  • This is called the single-head property.
  • One word (usually the main verb) is the root.
  • Every other word is connected to only one parent (head).

This constraint ensures that the structure forms a tree, not a graph with multiple parents.


Why other options are INCORRECT?

Option A: Multiple roots. A valid dependency tree must have exactly one root, not multiple.
Option B: Cycles allowed. Dependency trees must be acyclic. Cycles (A → B → A) are not allowed.
Option D: All dependencies must be projective. Projectivity is desirable but not required. Some valid trees are non-projective, especially in free-word-order languages.

3.
Which transition-based action adds a dependency and removes the dependent from the stack?






Correct Answer: A

In transition-based dependency parsing, a parser builds a dependency tree step-by-step using:

  • a stack
  • a buffer
  • a set of dependency arcs

Each action changes these structures.

LEFT-ARC creates a dependency where the top stack element becomes dependent and is removed.

What does LEFT-ARC do?

The LEFT-ARC action:

  • Creates a dependency.
  • The word at the top of the stack becomes the dependent.
  • The word at the front of the buffer becomes its head.
  • Removes the dependent from the stack.

So the LEFT-ARC operation adds a dependency and removes the dependent from the stack.

Example application of LEFT-ARC operation:

Stack: [She]
Buffer: [drives, a, car]

Apply LEFT-ARC:

  • Create: drives → She (nsubj)
  • Remove She from the stack

New state:

Stack: []
Buffer: [drives, a, car]

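A toy version of this single step, with the stack, buffer, and arc set represented as plain Python lists (purely illustrative, not a full parser):

stack = ["She"]
buffer = ["drives", "a", "car"]
arcs = []

def left_arc(label):
    dependent = stack.pop()      # the stack top becomes the dependent and is removed
    head = buffer[0]             # the buffer front becomes its head
    arcs.append((head, label, dependent))

left_arc("nsubj")
print(stack)    # []
print(buffer)   # ['drives', 'a', 'car']
print(arcs)     # [('drives', 'nsubj', 'She')]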

Why other options are INCORRECT?

Option B: SHIFT. Moves a word from buffer to stack. No dependency created.
Option C: REDUCE without arc. Removes a word from the stack but does not create a dependency.
Option D: SWAP. Used for handling non-projective structures. Does not directly add a dependency.

4.
What is the main advantage of graph-based dependency parsing?






Correct Answer: B

Graph-based parsers score possible edges and select the highest-scoring tree globally, reducing greedy errors.

Main Advantage of Graph-Based Parsing

In graph-based dependency parsing:

  • The sentence is viewed as a graph.
  • Words are treated as nodes.
  • Possible dependencies between words are treated as edges with scores.

The parser:

  • Assigns a score to each possible dependency.
  • Searches for the entire dependency tree with the highest total score.

This means:
The parser makes global decisions, considering the whole sentence at once. This process is called global optimization.

Why this is an advantage

  • Avoids errors caused by early local decisions.
  • Finds the best overall tree for the sentence.
  • Provides better accuracy for complex sentence structures.

Why other options are INCORRECT?

Option A: Linear-time parsing. Transition-based parsers are typically faster and closer to linear time.
Option C: No training required. Graph-based parsers are machine learning models and require training.
Option D: Works only for short sentences. They work for sentences of any length (though computation increases).

5.
Which condition indicates a non-projective dependency tree?






Correct Answer: C

A projective dependency tree is one where the dependency structure can be drawn above the sentence without any crossing lines. If any dependency arcs cross each other, the tree is called non-projective.

What is Projectivity?

In a projective dependency tree:

  • Dependencies follow the word order of the sentence.
  • If a head is connected to a dependent, all words between them must also belong to that head’s subtree.
  • No dependency arcs cross when the tree is drawn above the sentence.

When is a tree non-projective?

A dependency tree is non-projective when:

  • At least two dependency arcs cross each other.

This usually happens in:

  • Free word-order languages
  • Long-distance dependencies
  • Certain constructions such as topicalization or scrambling

Why other options are INCORRECT?

Options A and D violate the tree property (one root, no cycles). Option B is incorrect because labels are optional for the structure.
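A compact way to test this is to check whether any two arcs cross. The sketch below encodes a sentence as a list of head positions (0 marks the root) and flags crossing arcs; the example head arrays are invented for illustration.

def is_projective(heads):
    # heads[i] is the head position of word i+1; 0 denotes the artificial ROOT.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for i, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[i + 1:]:
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:   # spans interleave: arcs cross
                return False
    return True

print(is_projective([2, 0, 4, 2]))   # True  — no arcs cross (projective)
print(is_projective([3, 4, 0, 3]))   # False — arcs (1,3) and (2,4) cross (non-projective)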

6.
Which algorithm is commonly used to find the highest-scoring tree in graph-based dependency parsing?






Correct Answer: D

Graph-based parsers often use MST algorithms (e.g., Chu–Liu/Edmonds) to find the optimal dependency tree.

In graph-based dependency parsing, the goal is to find the best dependency tree for a sentence. This problem is naturally solved using a Maximum Spanning Tree (MST) algorithm.

Why Maximum Spanning Tree (MST) is suitable?

  • Ensures a valid tree structure - MST guarantees one root, each word has exactly one head, the structure is connected, and no cycles. These are the required properties of a dependency tree.
  • Global optimization - Instead of making local decisions, MST considers all possible dependencies and finds the globally best tree with the highest total score
  • Efficient algorithms exist (e.g., Chu–Liu/Edmonds)

Why other options are INCORRECT?

Option A (Viterbi) is used for sequence labeling problems, Option B (CKY) is used for constituency parsing, and Option C (beam search) approximates the best structure but does not guarantee the globally optimal tree.
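The decoding step can be sketched with networkx, whose maximum_spanning_arborescence function implements a Chu–Liu/Edmonds-style algorithm. The arc scores below are invented; in a real parser they would come from a trained scoring model.

import networkx as nx

words = ["ROOT", "She", "drives", "a", "car"]
G = nx.DiGraph()
scores = {                       # (head index, dependent index): arc score
    (0, 2): 10, (2, 1): 9, (2, 4): 8, (4, 3): 7,   # arcs of the intended tree
    (0, 1): 2,  (1, 2): 1, (3, 4): 3, (4, 2): 1,   # lower-scoring competing arcs
}
for (h, d), s in scores.items():
    G.add_edge(h, d, weight=s)

tree = nx.maximum_spanning_arborescence(G, attr="weight")
for h, d in sorted(tree.edges()):
    print(f"{words[h]} -> {words[d]}")   # the highest-scoring dependency tree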

7.
What is the time complexity of standard transition-based dependency parsing?






Correct Answer: A

Standard transition-based dependency parsing runs in O(n) time because each word is handled by a constant number of transitions (shift and arc operations), so the total number of steps grows linearly with sentence length.

More explanation:

Transition-based dependency parsing builds a dependency tree by processing a sentence word by word using a sequence of simple actions (called transitions). Common transitions are: SHIFT (move a word from buffer to stack), LEFT-ARC (create a dependency and remove one word), and RIGHT-ARC (create a dependency and remove one word).

Why the Complexity is O(n)

Let n be the number of words in the sentence.

  • Each word is shifted once from the buffer to the stack → n operations.

Each word also participates in:

  • At most one Left-Arc
  • At most one Right-Arc

Therefore, the total number of transitions is proportional to the sentence length:

Total transitions ≤ 2n (or a small constant × n)

Since:

  • Each transition takes constant time: O(1)
  • Total number of transitions = O(n)

Therefore,

Total time complexity = O(n)

8.
What does Unlabeled Attachment Score (UAS) measure?






Correct Answer: B

Unlabeled Attachment Score (UAS) measures the percentage of words that are assigned the correct head in the dependency tree, regardless of the dependency label.

More explanation:

In dependency parsing, each word in a sentence is assigned a head (the word it depends on), and a dependency label (the type of relationship, e.g., subject, object, modifier). To evaluate a parser, we check how many of these predictions are correct.

What is Unlabeled Attachment Score (UAS)?

Unlabeled Attachment Score (UAS) measures the percentage of words for which a dependency parser predicts the correct head, while ignoring the dependency label.

It evaluates only the correctness of the syntactic structure (i.e., who depends on whom), using the formula below.

UAS = (Number of words with correct head / Total number of words) × 100

Key Points:

  • Only the head prediction is evaluated.
  • The dependency relation type (label) is not considered.
  • UAS measures how accurately the parser captures the sentence structure.
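
A direct implementation of this formula, using invented gold and predicted head arrays:

def uas(gold_heads, pred_heads):
    # fraction of words whose predicted head matches the gold head (labels ignored)
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return 100 * correct / len(gold_heads)

# Gold tree: She->drives, drives->ROOT, a->car, car->drives
# Predicted tree: the parser attaches "a" to the wrong head
print(uas([2, 0, 4, 2], [2, 0, 2, 2]))   # 75.0
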
9.
Which action is required in transition systems to handle non-projective structures?






Correct Answer: C

The SWAP operation allows reordering elements to handle crossing dependencies in non-projective parsing.

10.
Why are contextual embeddings (e.g., BERT) useful in modern dependency parsers?






Correct Answer: D

Traditional word embeddings (like Word2Vec or GloVe) assign one fixed vector per word, regardless of context. However, in real language, the meaning and syntactic role of a word often depend on the surrounding words.

Contextual embeddings like BERT improve dependency parsing by providing context-aware word representations that capture syntactic roles and relationships within a sentence.

What are Contextual Embeddings?

Contextual embeddings are word representations that change depending on the surrounding context. Models like BERT generate different vectors for the same word based on how it is used in a sentence.

Example:

  • “She sat on the bank of the river.” (bank = river side)
  • “He went to the bank to withdraw money.” (bank = financial institution)

BERT produces different representations for the word bank in each of the above sentences using the other words present in the context (sentence).
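This can be verified directly. The sketch below extracts BERT's vector for "bank" in both sentences and compares them; bert-base-uncased is an assumed model choice, and any BERT-style encoder behaves similarly.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index("bank")                       # position of "bank" in this sentence
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[idx]

v1 = bank_vector("She sat on the bank of the river.")
v2 = bank_vector("He went to the bank to withdraw money.")
print(torch.cosine_similarity(v1, v2, dim=0))        # well below 1: context changes the vector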


Why This Helps Dependency Parsing?

Dependency parsing requires understanding the grammatical roles (subject, object, modifier), long-distance relationships between words, and ambiguity resolution

Contextual embeddings help because they:

  • Capture syntactic role based on context – “book” is a verb in “book a ticket” but a noun in “read a book”
  • Provide information about surrounding words - This helps the parser predict the correct head and dependency relation.
  • Handle ambiguity and long-range dependencies, which is important for complex sentences.

Why other options are INCORRECT?

Option A: They eliminate the need for tree structures. Parsers still produce dependency trees.
Option B: They reduce sentence length. Embeddings don’t change input size.
Option C: They replace dependency labels. Labels are still predicted by the parser.

Monday, February 9, 2026

Top 10 Syntactic Analysis MCQs in NLP with Answers | Dependency, Parsing & Neural NLP


Top 10 Syntactic Analysis MCQs in NLP (With Detailed Explanations)

Syntactic analysis is a core component of Natural Language Processing (NLP), enabling machines to understand grammatical structure and word relationships. This post presents 10 carefully selected multiple-choice questions (MCQs) on syntactic parsing, dependency structures, neural parsing, and modern NLP concepts. Each question includes a clear explanation to help students prepare for exams, interviews, and competitive tests.

1.
Which parsing strategy is most suitable for handling multiple valid parse trees for a sentence?






Correct Answer: C

Many sentences in natural language are ambiguous, meaning they can have multiple valid parse trees. A classic case is structural ambiguity: the sentence "I saw a man with a telescope" can be interpreted in two ways:

  • I used a telescope to see the man.
  • The man I saw had a telescope.

So the parser must choose the most likely structure, not just any valid one. Probabilistic parsers assign probabilities and choose the most likely structure.

Why probabilistic parsers?

They assign probabilities to grammar rules or parse trees, evaluate all possible parses, and select the most probable one.

Example probabilistic parsers:

  • PCFG (Probabilistic Context-Free Grammar)
  • Neural dependency parsers with scoring

Because ambiguity results in multiple possible parses, we need a ranking mechanism and probabilities provide that.
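A toy PCFG for exactly this sentence can be built with NLTK's ViterbiParser; the grammar rules and probabilities below are invented purely to show how probabilities rank competing parses.

import nltk

grammar = nltk.PCFG.fromstring("""
  S   -> NP VP         [1.0]
  VP  -> V NP          [0.4]
  VP  -> V NP PP       [0.6]
  NP  -> NP PP         [0.2]
  NP  -> 'I'           [0.3]
  NP  -> Det N         [0.5]
  PP  -> P NP          [1.0]
  Det -> 'a'           [1.0]
  N   -> 'man'         [0.6]
  N   -> 'telescope'   [0.4]
  V   -> 'saw'         [1.0]
  P   -> 'with'        [1.0]
""")

parser = nltk.ViterbiParser(grammar)
for tree in parser.parse("I saw a man with a telescope".split()):
    print(tree)   # prints the single most probable parse under this toy grammar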

2.
What is the main advantage of dependency parsing over constituency parsing for modern NLP systems?






Correct Answer: B

Dependency trees directly model head–dependent relations, making them simpler, more efficient, and more useful for downstream tasks like translation and information extraction. Dependency parsing is preferred in modern NLP because it yields simple, efficient structures with direct word-to-word relationships.

What is Constituency Parsing?

Constituency parsing divides a sentence into phrases (constituents) based on grammar rules.

Example:

"The boy ate an apple".

Structure:

[Sentence [Noun Phrase: The boy] [Verb Phrase: ate [Noun Phrase: an apple]]]

The main focus of constituency parsing is on phrase structure (NP, VP, PP, etc.)

What is Dependency Parsing?

Dependency parsing shows direct relationships between words.

Example: ate → root; boy → subject of ate (nsubj); apple → object of ate (obj); The → modifier of boy.

The main focus of dependency parsing is to capture who depends on whom (word-to-word relations).

Main advantage of dependency parsing over constituency parsing for modern NLP systems

Modern NLP tasks (machine translation, information extraction, question answering, etc.) mainly need direct word relationships, simpler structures, and computational efficiency.

Why dependency parsing is better?

It produces fewer nodes (only words, no extra phrase nodes), simpler trees, and direct relations such as subject, object, and modifier, and it is faster and easier for machine learning models to work with.

Why other options are INCORRECT?

Option A: Captures phrase boundaries more precisely. INCORRECT. That is the strength of constituency parsing, not dependency.

Option C: Requires no training data. INCORRECT. Modern parsers require training.

Option D: Works only for English. INCORRECT. Dependency parsing works for many languages.

3.
A dependency parser must support non-projective parsing when:






Correct Answer: B

Non-projective structures occur when dependency arcs cross. Crossing arcs indicate that the sentence structure cannot be represented under projective constraints. Some languages require parsers that can handle such structures, as explained below.

What is a dependency parser?

A dependency parser analyzes the grammatical structure of a sentence by identifying head–dependent relationships between words. Each word (except the root) depends on another word (its head), and the structure forms a dependency tree. Example: “Ram wrote a letter”. Here, "wrote" is the root, "Ram" is the subject of wrote, and "letter" is the object of wrote.

Dependency parsing focuses on word-to-word relations, which is very useful for modern NLP tasks.

What are Projective and Non-Projective Dependency parsing?

  • Projective dependency: A dependency tree is projective if, when dependencies are drawn as arcs above the sentence, no arcs cross each other. All relations can be drawn without crossing. Projective structures are common in fixed word-order languages like English.
  • Non-projective dependency: A dependency tree is non-projective if some dependency arcs cross when drawn over the sentence. This often happens due to free word order, long-distance dependencies, and scrambling (common in languages like Hindi, German, Czech, Tamil, etc.). Non-projective parsing is needed to correctly represent such structures.

Why other options are INCORRECT?

Sentence length (Option A) does not cause non-projectivity.

Unknown words (Option C) relate to vocabulary issues, not structure.

Grammar ambiguity (Option D) affects interpretation but does not necessarily create crossing dependencies.

4.
What is a key limitation of greedy transition-based parsers?






Correct Answer: C

Transition-based parsers make local decisions. Early mistakes cannot be corrected later (the parser cannot go back and fix them), leading to error propagation.

What are Greedy Transition-Based Parsers?

Transition-based dependency parsers build a dependency tree step-by-step using a sequence of actions (such as, SHIFT, LEFT-ARC, RIGHT-ARC, REDUCE). A greedy parser chooses the best action at each step based only on current information. It does not reconsider previous decisions. They are very fast and memory-efficient.

Why other options are INCORRECT?

High memory usage (Option A) Greedy parsers use low memory.

Cannot handle short sentences (Option B) Works for any sentence length.

Cannot produce dependency trees (Option D) Produces trees efficiently.

5.
Graph-based dependency parsing differs from transition-based parsing because it:






Correct Answer: B

Graph-based parsers evaluate possible trees globally and select the highest-scoring structure, reducing greedy errors.

Transition-Based vs Graph-Based dependency parsing

  • Transition-Based Parsing: Builds the dependency tree step-by-step. Uses a stack, buffer, and actions (SHIFT, LEFT-ARC, RIGHT-ARC). Decisions are local and incremental.
  • Graph-Based Parsing: Treats parsing as a global optimization problem. Considers all possible head–dependent arcs. Assigns a score to the entire tree. Selects the highest-scoring valid tree (often using algorithms like MST – Maximum Spanning Tree)

Why other options are INCORRECT?

Builds the tree incrementally using a stack (Option A) This describes transition-based parsing, not graph-based.

Uses no machine learning (Option C) Modern graph-based parsers heavily use machine learning (neural networks).

Works only for projective trees (Option D) Many graph-based methods can handle non-projective trees (e.g., MST parser).

6.
In modern neural parsers, contextual embeddings like BERT help because they:






Correct Answer: C

Contextual embeddings capture agreement, phrase boundaries, and long-distance dependencies, improving parsing accuracy. BERT helps parsers by understanding each word based on its context, which improves detection of grammatical relationships.

Why embeddings are important in modern neural parsers?

Neural parsers (dependency or constituency) work with numbers, not words. So each word must be converted into a vector representation — this is called an embedding.

Embeddings are essential in modern neural parsers because they:

  • Convert words into numerical input
  • Capture semantic and syntactic information
  • Provide contextual understanding (with BERT)
  • Significantly improve parsing performance.

7.
In dependency parsing, the head selection problem refers to:






Correct Answer: C

Dependency parsing determines which word acts as the head and which is the dependent for each relationship.

What is the head selection problem?

In dependency parsing, for each word, the parser must decide "Which word is its head?". This decision is called head selection. For example, given a sentence "She saw a dog", for the word "dog", the parser must decide does "dog" depend on "saw"? Or on "a"?. So the head selection task is about choosing the governing word for each word.

Head selection = deciding which word is the head (governor) for each word in the sentence.

8.
Why do traditional syntactic parsers struggle with very long sentences?






Correct Answer: B

They struggle because the number of possible parses and computational cost grow rapidly with sentence length, making long-distance dependencies hard to handle.

Traditional parsers struggle with long sentences because:

  • Too many possible structures
  • High computational cost
  • Difficulty handling long-distance relationships
  • Error propagation

9.
What is the main purpose of the Universal Dependencies (UD) framework?






Correct Answer: B

The Universal Dependencies (UD) framework is designed to create a consistent and language-independent way to represent grammatical structure (syntactic annotation) across many languages.

Different languages have different grammar. UD provides a common set of rules and labels so that dependency structures look similar across languages and the same annotation scheme is used worldwide.

10.
How does syntactic information help large language models during training?






Correct Answer: B

Even without explicit parsing, LLMs learn syntax implicitly, helping capture agreement, clause structure, and long-distance dependencies.

More explanation:

Large Language Models (LLMs) need to understand how words in a sentence are related to each other. In many sentences, important grammatical relationships occur between words that are far apart.

Example:

The book that the student bought yesterday is interesting.

The verb “is” agrees with “book”, not with “student” or “yesterday”. This is called a long-range dependency.

Syntactic signals (such as dependency relations or structural patterns) help the model:

  • Identify subject–verb and modifier relationships
  • Understand sentence structure
  • Maintain grammatical consistency
  • Handle complex and long sentences

Without syntactic information, the model may rely only on nearby words and miss these long-distance relationships.
