I am using the coreferee plugin for spaCy. Running it is quite simple:
import coreferee, spacy
nlp = spacy.load('en_core_web_trf')
nlp.add_pipe('coreferee')
doc = nlp("Although he was very busy with his work, Peter had had enough of it. He and his wife decided they needed a holiday. They travelled to Spain because they loved the country very much.")
doc._.coref_chains.print()
This prints:
0: he(1), his(6), Peter(9), He(16), his(18)
1: work(7), it(14)
2: [He(16); wife(19)], they(21), They(26), they(31)
3: Spain(29), country(34)
My problem is how to map the coreference clusters back onto the text and return the text with the coreferences resolved.
I guess I would somehow need to iterate over all tokens in the doc and check whether each one can be resolved through the coreference clusters. I have little experience with spaCy, so I don't really know the best route to achieve this.
The solution is the following:
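As a minimal sketch of one way to do this: walk the tokens once and replace every token that coreferee can resolve with the mentions it refers to. The helper name `resolve_coreferences` is hypothetical, and the sketch assumes that the `resolve` callable behaves like coreferee's `doc._.coref_chains.resolve`, i.e. it returns a list of tokens for a resolvable token and `None` otherwise.

```python
from typing import Callable, Iterable, Optional, Sequence


def resolve_coreferences(
    tokens: Iterable,
    resolve: Callable[[object], Optional[Sequence]],
) -> str:
    """Rebuild the text, swapping resolvable tokens for their referents.

    `tokens` is expected to yield spaCy-like tokens (with `.text` and
    `.whitespace_`); `resolve` is assumed to work like coreferee's
    `doc._.coref_chains.resolve` (hypothetical wiring, see the lead-in).
    """
    pieces = []
    for token in tokens:
        referents = resolve(token)
        if referents:
            # Several referents (e.g. "they") are joined with "and".
            replacement = " and ".join(t.text for t in referents)
        else:
            # Not resolvable: keep the token unchanged.
            replacement = token.text
        # Re-attach the original trailing whitespace to preserve spacing.
        pieces.append(replacement + token.whitespace_)
    return "".join(pieces)
```

With the pipeline from the question, this would be called as `print(resolve_coreferences(doc, doc._.coref_chains.resolve))`. The replacement is purely token-level, so a possessive such as "his" is swapped for the bare referent text rather than a possessive form.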
which returns