
# Probabilistic Context Free Grammar Formal Definition and Examples

## Probabilistic Context Free Grammar (PCFG)

Probabilistic Context Free Grammar (PCFG) is an extension of Context Free Grammar (CFG) in which each production rule is assigned a probability. Ambiguity is the main motivation for using a probabilistic version of CFG. Some sentences have more than one underlying derivation; that is, the sentence can be parsed in more than one way, so the parse of the sentence is ambiguous. To resolve this ambiguity, we can use a PCFG to compute the probability of each parse of the given sentence.
A PCFG is made up of a CFG together with a set of probabilities, one for each production rule of the CFG. A PCFG can be formally defined as follows:
A probabilistic context free grammar G is a quintuple G = (N, T, S, R, P) where

• (N, T, S, R) is a context free grammar, where N is the set of non-terminal (variable) symbols, T is the set of terminal symbols, S is the start symbol and R is the set of production rules, where each rule is of the form A → s [Refer for more here – Context Free Grammar Formal Definition].
• A probability P(A → s) for each rule in R. The properties governing these probabilities are as follows:
• P(A → s) is the conditional probability of choosing the rule A → s in a left-most derivation, given that A is the non-terminal being expanded.
• The value of each probability lies between 0 and 1.
• The sum of the probabilities of all rules with A as the left-hand side non-terminal must be equal to 1.
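The two probability properties above can be checked mechanically. A minimal sketch in Python, assuming the grammar is stored as a plain dictionary mapping each left-hand-side non-terminal to its (right-hand side, probability) pairs (the names `validate_pcfg` and `toy` are illustrative, not from any library):

```python
from math import isclose

def validate_pcfg(rules):
    """Check the two probability properties of a PCFG:
    each P(A -> s) lies in [0, 1], and for every non-terminal A
    the probabilities of all A-rules sum to 1."""
    for lhs, productions in rules.items():
        total = 0.0
        for rhs, p in productions:
            # Each rule probability must be a valid probability.
            assert 0.0 <= p <= 1.0, f"P({lhs} -> {rhs}) out of range"
            total += p
        # All rules sharing the same LHS must sum to 1.
        assert isclose(total, 1.0), f"rules for {lhs} sum to {total}, not 1"
    return True

# Toy grammar: a probabilistic choice between two NP expansions.
toy = {
    "NP": [(("Det", "Noun"), 0.4), (("NP", "PP"), 0.6)],
}
print(validate_pcfg(toy))  # True
```

`isclose` is used for the sum test because floating-point addition of values such as 0.3 and 0.4 may not produce exactly 1.0.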

### Example PCFG:

Probabilistic Context Free Grammar G = (N, T, S, R, P)
• N = {S, NP, VP, PP, Det, Noun, Verb, Pre}
• T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}
• S = S
• R = { S → NP VP
NP → Det Noun | NP PP
PP → Pre NP
VP → Verb NP
Det → ‘a’ | ‘the’
Noun → ‘cake’ | ‘child’ | ‘fork’
Pre → ‘with’
Verb → ‘ate’ }
• P = R with associated probabilities as in the table below:

| Rule | Probability |
| --- | --- |
| S → NP VP | 1.0 |
| NP → NP PP | 0.6 |
| NP → Det Noun | 0.4 |
| PP → Pre NP | 1.0 |
| VP → Verb NP | 1.0 |
| Det → ‘a’ | 0.5 |
| Det → ‘the’ | 0.5 |
| Noun → ‘cake’ | 0.4 |
| Noun → ‘child’ | 0.3 |
| Noun → ‘fork’ | 0.3 |
| Pre → ‘with’ | 1.0 |
| Verb → ‘ate’ | 1.0 |

Please observe from the table that the sum of the probability values of all rules with the same left-hand side is 1. For example, the two NP rules give 0.6 + 0.4 = 1.0, and the three Noun rules give 0.4 + 0.3 + 0.3 = 1.0.
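The example grammar can be written out and this property verified programmatically. A sketch in Python (the dictionary layout is an assumption for illustration, not a standard grammar format):

```python
from math import isclose

# The example grammar G, with each rule's probability from the table.
rules = {
    "S":    [(("NP", "VP"), 1.0)],
    "NP":   [(("NP", "PP"), 0.6), (("Det", "Noun"), 0.4)],
    "PP":   [(("Pre", "NP"), 1.0)],
    "VP":   [(("Verb", "NP"), 1.0)],
    "Det":  [(("a",), 0.5), (("the",), 0.5)],
    "Noun": [(("cake",), 0.4), (("child",), 0.3), (("fork",), 0.3)],
    "Pre":  [(("with",), 1.0)],
    "Verb": [(("ate",), 1.0)],
}

# For every non-terminal, the probabilities of its rules sum to 1.
for lhs, productions in rules.items():
    total = sum(p for _, p in productions)
    assert isclose(total, 1.0), (lhs, total)
    print(lhs, "sums to", total)
```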

On the next page, let us discuss the question “How to use PCFG to calculate the probability of a parse tree?”

********************
