I am using Parsey McParseface and SyntaxNet to parse some text. I want to extract the position of each word in the sentence along with the parse tree.
This is the current output:
echo 'Alice brought the pizza to Alice.' | syntaxnet/demo.sh
Input: Alice brought the pizza to Alice .
Parse:
brought VBD ROOT
+-- Alice NNP nsubj
+-- pizza NN dobj
| +-- the DT det
+-- to IN prep
| +-- Alice NNP pobj
+-- . . punct
This is the output I need:
Input: Alice brought the pizza to Alice .
Parse:
brought VBD ROOT 2
+-- Alice NNP nsubj 1
+-- pizza NN dobj 4
| +-- the DT det 3
+-- to IN prep 5
| +-- Alice NNP pobj 6
+-- . . punct 7
or something similar. (This will be particularly useful when the same word occurs many times in a sentence.)
Thank you
You can edit conll2tree.py: https://github.com/tensorflow/models/blob/master/syntaxnet/syntaxnet/conll2tree.py
Changing how token_str is built there, so that each token's string also includes its position in the sentence, should do it.
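As a rough, standalone sketch of that kind of change (this is not the actual conll2tree.py code): the Token tuple and the token_strings and render_tree helpers below are hypothetical names made up for illustration. It assumes each token carries a word, a tag, a label, and the 0-based index of its head (-1 for the root), and it appends the 1-based position to each token's string before the tree is drawn.

```python
from collections import namedtuple

# Hypothetical stand-in for a parsed token (an assumption for this sketch).
# head is the 0-based index of the parent token, or -1 for the root.
Token = namedtuple("Token", ["word", "tag", "label", "head"])


def token_strings(tokens):
    """Return one display string per token, ending with its 1-based position."""
    return ["%s %s %s %d" % (t.word, t.tag, t.label, i + 1)
            for i, t in enumerate(tokens)]


def render_tree(tokens):
    """Draw the dependency tree as text, one token per line."""
    labels = token_strings(tokens)
    children = [[] for _ in tokens]
    root = -1
    for i, t in enumerate(tokens):
        if t.head == -1:
            root = i
        else:
            children[t.head].append(i)

    lines = [labels[root]]

    def walk(node, prefix):
        # Children are visited in sentence order; each level adds one "| ".
        for child in children[node]:
            lines.append(prefix + "+-- " + labels[child])
            walk(child, prefix + "| ")

    walk(root, "")
    return "\n".join(lines)


tokens = [
    Token("Alice", "NNP", "nsubj", 1),
    Token("brought", "VBD", "ROOT", -1),
    Token("the", "DT", "det", 3),
    Token("pizza", "NN", "dobj", 1),
    Token("to", "IN", "prep", 1),
    Token("Alice", "NNP", "pobj", 4),
    Token(".", ".", "punct", 1),
]
print(render_tree(tokens))
```

Because the position is appended when the per-token string is built, the two occurrences of "Alice" come out as "Alice NNP nsubj 1" and "Alice NNP pobj 6", so they can be told apart in the tree.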