Why does a pretrained spacy pipe not work when added to a spacy.blank pipe?

Question

29 views · Asked by Sofie Jægtvik Benjaminsen on 20 September 2022 at 09:56

I am trying to add spacy's already trained parser for Norwegian Bokmål to a blank spacy pipe. I get no error message when I add the pipe, but whatever the input, the pipe categorizes all tokens as nouns. What am I missing here?

import spacy
from spacy import displacy

nlp = spacy.blank("nb")
wanted_pipes = ["morphologizer", "parser"] 

for pipe_name in wanted_pipes:
  if pipe_name not in nlp.pipe_names:
    nlp.add_pipe(pipe_name, source = spacy.load("nb_core_news_sm"))
nlp.initialize()
doc = nlp("Katten heter Petrus.") # a random Norwegian sentence

Original Q&A

There is 1 answer

**polm23** · Answer 1 · 2022-09-27T06:33:44+00:00

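There are two problems with the way you're building the pipeline here. The first is that the morphologizer and parser need the pretrained pipeline's tok2vec component to get meaningful input, and you aren't adding it. The second is that calling nlp.initialize() wipes the weights of the components you just copied over, so they behave as if they were untrained.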
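If you do want to copy components by hand, a rough sketch of addressing both problems could look like the following. It assumes spaCy v3 and that nb_core_news_sm is installed. The names copy_trained_pipes and demo are made up for illustration and are not part of spaCy. The points that matter are that tok2vec is sourced along with the components that depend on it, and that nlp.initialize() is never called afterwards.

```python
import spacy


def copy_trained_pipes(source_nlp, names, lang="nb"):
    """Copy already-trained components from source_nlp into a blank pipeline.

    Hypothetical helper for illustration, not part of spaCy.
    """
    nlp = spacy.blank(lang)
    for name in names:
        # source= reuses the trained component instead of creating a fresh one
        nlp.add_pipe(name, source=source_nlp)
    # Deliberately no nlp.initialize() here: it would reset the trained weights.
    return nlp


def demo():
    # Assumption: the nb_core_news_sm package is installed.
    source_nlp = spacy.load("nb_core_news_sm")  # load the trained pipeline once
    nlp = copy_trained_pipes(source_nlp, ["tok2vec", "morphologizer", "parser"])
    doc = nlp("Katten heter Petrus.")
    for token in doc:
        print(token.text, token.pos_, token.dep_)
```

Calling demo() prints one line per token with its part-of-speech tag and dependency label. Loading the source pipeline once, rather than inside the loop as in the question, also avoids reading the model from disk for every component.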

A better way is to load the whole pretrained pipeline and use the disable argument to switch off the components you don't want, like this:

nlp = spacy.load("nb_core_news_sm", disable=["lemmatizer", "ner"])

I would recommend leaving the attribute_ruler in because it's fast and often works with the morphologizer.

Also, it should be easier to use enable rather than disable to list what you want to keep, but there's currently a bit of an issue with that. We're working on a fix, see here for details.
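In the meantime, one possible way to work from a list of components to keep is to load the full pipeline and then call nlp.select_pipes(enable=...), which disables everything that is not listed. This is only a sketch, assuming spaCy v3. The helper name keep_only is hypothetical. Note that tok2vec has to be in the list, since the morphologizer and parser depend on it.

```python
import spacy


def keep_only(nlp, wanted):
    """Disable every component of nlp not named in wanted (hypothetical helper)."""
    # Called outside a `with` block, so the components stay disabled.
    nlp.select_pipes(enable=list(wanted))
    return nlp


def demo():
    # Assumption: the nb_core_news_sm package is installed.
    nlp = spacy.load("nb_core_news_sm")
    keep_only(nlp, ["tok2vec", "morphologizer", "parser", "attribute_ruler"])
    print(nlp.pipe_names)  # components that will run
    print(nlp.disabled)    # components that are loaded but switched off
```

Because the components are only disabled, not removed, they can be switched back on later with nlp.enable_pipe if needed.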
