I am trying to learn the CYK parsing algorithm.
For this set of grammar rules, are the resulting tables correct for the two given sentences?
S -> NP VP
VP -> VB NP
NP -> DT NN
PP -> IN NP
NP -> NP PP
NP -> NN
VP -> VP PP
IN -> with
NN -> dog
NN -> cat
VB -> ate
NN -> mouse
DT -> the
['S']
[None, None]
[None, None, 'VP']
['NP', None, None, 'NP']
['DT', 'NN', 'VB', 'DT', 'NN']
['the', 'cat', 'ate', 'the', 'dog']
['S']
['NP', None]
['NP', None, 'VP']
['NP', None, None, 'NP']
[None, None, 'VP', None, None]
[None, None, 'VP', None, None, 'PP']
['NP', None, None, 'NP', None, None, 'NP']
['DT', 'NN', 'VB', 'DT', 'NN', 'IN', 'DT', 'NN']
['the', 'cat', 'ate', 'the', 'dog', 'with', 'the', 'cat']
You can try to minimize your grammar first, because there are some unnecessary rules and furthermore that's why it is
not in CNF
.Looking at it more concisely, you happen to have
None
on the first example, second row, second column. There it is actually possible to have aS
, but since the logic in CYK cannot do further optimizations such asNP->NN
. From thereS -> NP VP
for the mentionedNone
cell goes missing. Because of CYK's inability to perform those, the grammar must be in CNF. So, basically it is roughly like you are trying to apply a C-compiler on a C++ programm (with no C++ libraries). And you got lucky to even get the right result at the top.With that being said, I am not going to indulge in the second example of yours.
Just to clarify, a grammar is in
CNF
if it has rules only of these two forms:so clearly something like
NP -> NN
is not in CNF.