Wednesday, May 6, 2020

Probabilistic Context Free Grammar (PCFG)

Probabilistic Context Free Grammar Formal Definition and Examples


Probabilistic Context Free Grammar (PCFG)


Page 1    Page 2    Page 3

Probabilistic Context Free Grammar (PCFG) is an extension of Context Free Grammar (CFG) with a probability for each production rule. Ambiguity is the reason why we are using probabilistic version of CFG. For instance, some sentences may have more than one underlying derivation. That is, the sentence can be parsed in more than one ways. In this case, the parse of the sentence become ambiguous. To eliminate this ambiguity, we can use PCFG to find the probability of each parse of the given sentence.
A PCFG is made up of a CFG and a set of probabilities for each production rule of CFG. A PCFG can be formally defined as follows;
A probabilistic context free grammar G is a quintuple G = (N, T, S, R, P) where

  • (N, T, S, R) is a context free grammar where N is set of non-terminal (variable) symbols, T is set of terminal symbols, S is the start symbol and R is the set of production rules where each rule of the form A → s [Refer for more here – Context Free Grammar Formal Definition].
  • A probability P(A → s) for each rule in R. The properties governing the probability are as follows;
    • P(A → s) is a conditional probability of choosing a rule A → s in a left-most derivation, given that A is the non-terminal that is expanded.
    • The value for each probability lies between 0 and 1.
    • The sum of all probabilities of rules with A as the left hand side non-terminal should be equal to 1.

Example PCFG:

Probabilistic Context Free Grammar G = (N, T, S, R, P)
  • N = {S, NP, VP, PP, Det, Noun, Verb, Pre}
  • T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}
  • S = S
  • R = { S → NP VP
NP → Det Noun | NP PP
PP → Pre NP
VP → Verb NP
Det → ‘a’ | ‘the’
Noun → ‘cake’ | ‘child’ | ‘fork’
Pre → ‘with’
Verb → ‘ate’ }
  • P = R with associated probability as in the table below;
Rule
Probability
Rule
Probability
S → NP VP
1.0
Det → ‘a’
Det → ‘the’
0.5
0.5
NP → NP PP
NP → Det Noun
0.6
0.4
Noun → ‘cake’
Noun → ‘child’
Noun → ‘fork’
0.4
0.3
0.3
PP → Pre NP
1.0
Pre → ‘with’
1.0
VP → Verb NP
1.0
Verb → ‘ate’
1.0

Please observe from the table, the sum of probability values for all rules that have same left hand side is 1. For example,



In the next page, let us discuss the question “How to use PCFG to calculate the probability of a parse tree?”

Page 1    Page 2    Page 3


********************

Related links:

How to derive probabilities for production rules from Treebank using maximum likelihood estimate

PCFG probabilities estimation

How to calculated production rule probability in PCFG using tree banks

Probabilistic context free grammar rule probability estimation using tree banks


No comments:

Post a Comment

Featured Content

Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

All time most popular contents

data recovery