I am trying to add spaCy's pretrained parser for Norwegian Bokmål to a blank spaCy pipeline. I get no error message when I add the components, but whatever the input, the pipeline tags every token as a noun. What am I missing here?
import spacy
from spacy import displacy
nlp = spacy.blank("nb")
wanted_pipes = ["morphologizer", "parser"]
for pipe_name in wanted_pipes:
    if pipe_name not in nlp.pipe_names:
        nlp.add_pipe(pipe_name, source = spacy.load("nb_core_news_sm"))
nlp.initialize()
doc = nlp("Katten heter Petrus.") # a random Norwegian sentence
There are a couple of problems with the way you're loading the pipeline here. One is that you need a tok2vec for the morphologizer and parser to get meaningful input, but another is that calling initialize wipes their weights.
A better way to load the pipeline is to use disable to exclude just the things you don't want.
I would recommend leaving the attribute_ruler in because it's fast and often works with the morphologizer.
Also, it should be easier to use enable rather than disable to list what you want to keep, but there's currently a bit of an issue with that. We're working on a fix, see here for details.
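For reference, a minimal sketch of the disable approach described above might look like the following. Which components to disable is an assumption here (ner and lemmatizer are picked purely for illustration), and the helper names load_pipeline and describe are hypothetical. The importlib check only skips the demo when the model package isn't installed.

```python
import importlib.util

MODEL = "nb_core_news_sm"

# Assumption for illustration: these are the components you don't need.
# tok2vec, morphologizer, parser and attribute_ruler are left enabled.
UNWANTED = ["ner", "lemmatizer"]


def load_pipeline(model=MODEL, unwanted=UNWANTED):
    """Load the trained pipeline with the unwanted components disabled."""
    import spacy  # imported lazily so this file can be read without spaCy

    # No nlp.initialize() call here: the trained weights are kept as loaded.
    return spacy.load(model, disable=list(unwanted))


def describe(text, nlp):
    """Return (text, part-of-speech, dependency label) for each token."""
    return [(token.text, token.pos_, token.dep_) for token in nlp(text)]


# Run the demo only if the model package is actually installed.
if importlib.util.find_spec(MODEL) is not None:
    nlp = load_pipeline()
    print(nlp.pipe_names)  # the components that remain active
    for row in describe("Katten heter Petrus.", nlp):
        print(row)
```

Because the whole trained pipeline is loaded and nothing is re-initialized, the morphologizer and parser keep their weights and still receive input from the tok2vec they were trained with.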