I am adding a custom component to spaCy but it never gets called:
    @Language.component("custom_sentence_boundaries")
    def custom_sentence_boundaries(doc):
        print(".")
        for token in doc[:-1]:
            if token.text == "\n":
                doc[token.i + 1].is_sent_start = True
        return doc

    nlp = spacy.load("de_core_web_sm")
    nlp.add_pipe("custom_sentence_boundaries", after="parser")
    nlp.analyze_pipes(pretty=True)
    doc = nlp(text)
    sentences = [sent.text for sent in doc.sents]
I get a result in sentences, and the analyzer does list my component, but my custom component seems to have no effect and I never see the dots from the print appearing...
Any ideas?
In the code which you have pasted:
You are doing:
However, it should be:
I tried to reproduce your code and I got the result
#Output
(please see at the bottom: ...$... is printed, and custom_sentence_boundaries is printed after parser, as we have stated after="parser" in the keyword argument)
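If you want to check independently whether the component runs at all, a minimal sketch along these lines may help. It is not the exact setup from the question: it assumes spaCy v3 and uses a blank German pipeline (spacy.blank("de")) so that no trained model is needed, and it records calls in a hypothetical `calls` list instead of relying on print output being visible.

```python
import spacy
from spacy.language import Language

calls = []  # hypothetical helper: records every invocation of the component


@Language.component("custom_sentence_boundaries")
def custom_sentence_boundaries(doc):
    calls.append(len(doc))  # proof that the component was actually called
    for token in doc[:-1]:
        if token.text == "\n":
            doc[token.i + 1].is_sent_start = True
    return doc


# Assumption: a blank pipeline (tokenizer only), so there is no parser here
# and the component is simply appended at the end.
nlp = spacy.blank("de")
nlp.add_pipe("custom_sentence_boundaries")

doc = nlp("Erste Zeile\nZweite Zeile")
print(nlp.pipe_names)  # the component should be listed here
print(len(calls))      # number of times the component ran
```

If the component shows up in nlp.pipe_names and `calls` is non-empty here but not in your real script, the difference is most likely in how the pipeline is loaded or in which nlp object is actually being called.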