I've been using pytextrank (https://github.com/DerwenAI/pytextrank/) with spacy and English models for keywords extraction - it works great!
Now I need to process non-English texts and I found udpipe (https://github.com/TakeLab/spacy-udpipe) but it doesn't work out of the box ... after
nlp = spacy_udpipe.load("sk")
tr = pytextrank.TextRank()
nlp.add_pipe(tr.PipelineComponent, name="textrank", last=True)
doc = nlp(text)
I get tokens with POS and DEP tags, but there is nothing in doc._.phrases
(doc.noun_chunks
is also empty) and in nlp.pipe_names
is just ['textrank']
What should I add to the spacy's pipeline to get it working? I assume pytextrank needs noun_chunks...
Any tip or suggestion where to look will help me - thanks!
would you mind starting an issue about this on the PyTextRank repo? https://github.com/DerwenAI/pytextrank/issues
Also, if you could please provide example text to use (in the language requested)
We'll try to debug this integration.
Thanks for pointing it out!
Paco