doc.vector not working after loading from stored model in spacy


I have trained the model following https://github.com/explosion/spaCy/blob/master/examples/training/train_new_entity_type.py

I save it to a directory, then load it and use it again. But when I try to access doc.vector after loading, it throws the following error.

 Traceback (most recent call last):
  File "C:/Users/ankita.a.rath/Desktop/my_codes/Rasa_nlu/rasa_nlu-master/train_spacy_ner.py", line 248, in <module>
    main("en", "new_model")
  File "C:/Users/ankita.a.rath/Desktop/my_codes/Rasa_nlu/rasa_nlu-master/train_spacy_ner.py", line 238, in main
    print (doc2.vector)
  File "spacy/tokens/doc.pyx", line 275, in spacy.tokens.doc.Doc.vector.__get__ (spacy/tokens/doc.cpp:7291)
    self._vector = sum(t.vector for t in self) / len(self)
  File "spacy/tokens/doc.pyx", line 275, in genexpr (spacy/tokens/doc.cpp:7114)
    self._vector = sum(t.vector for t in self) / len(self)
  File "spacy/tokens/token.pyx", line 240, in spacy.tokens.token.Token.vector.__get__ (spacy/tokens/token.cpp:7249)
    raise ValueError(
ValueError: Word vectors set to length 0. This may be because you don't have a model installed or loaded, or because your model doesn't include word vectors. For more info, see the documentation: 
https://spacy.io/docs/usage

Info about my environment.

Python version: 2.7.13

Platform: Windows-10

spaCy version: 1.9.0

Installed models: en

Please suggest some solution.


There are 2 answers

1
Caleb Keller On

Sorry, this may not be the most helpful answer if you are using spaCy NER for a specific reason. However, the spaCy NER component in Rasa is meant to be used with the built-in entities. See the Rasa docs on ner_spacy here. Specifically this comment:

As of now, this component can only use the spacy builtin entity extraction models and can not be retrained.

Training entities in Rasa is done with either the ner_mitie or ner_crf pipeline components.

Rasa has a full Getting started guide.

For example, to get started with ner_crf, your best option is the pre-built spacy_sklearn pipeline.

git clone https://github.com/RasaHQ/rasa_nlu.git
cd rasa_nlu
pip install -r requirements.txt
python setup.py install
pip install -U spacy
python -m spacy download en
conda install scikit-learn
pip install -U sklearn-crfsuite

Once all that is done you can start the Rasa server:

python -m rasa_nlu.server -c sample_configs/config_spacy.json

and use the HTTP API to train and parse data.

curl -XPOST localhost:5000/train?name=my_project -d @data/examples/rasa/demo-rasa.json
curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_project"}'
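If you would rather call the parse endpoint from Python than from curl, something like the following might work. It is only a sketch: it assumes Python 3 (the question uses 2.7), the helper name build_parse_request is made up for illustration, and the URL and JSON body are simply copied from the curl command above. The request is built but not sent, so it can be inspected without a running server.

```python
import json
from urllib.request import Request


def build_parse_request(text, project, base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /parse endpoint.

    The body mirrors the curl example: {"q": <text>, "project": <project>}.
    build_parse_request is a hypothetical helper, not part of Rasa.
    """
    payload = json.dumps({"q": text, "project": project}).encode("utf-8")
    return Request(base_url + "/parse", data=payload, method="POST")


if __name__ == "__main__":
    req = build_parse_request("hello there", "my_project")
    print(req.get_method(), req.full_url)
    # To actually send it (the Rasa server must be running):
    # from urllib.request import urlopen
    # print(urlopen(req).read().decode("utf-8"))
```

Uncomment the last two lines once the server is up to see the parsed result.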

If you need any further help, create an issue on GitHub or join us on Gitter. I will mention that Windows complicates things, and you may be better off trying Rasa in Docker or on a Unix VM.

0
user2550098 On

I think the vectors are not saved when the model is saved. I could not find a direct solution, so I now store the vectors separately and load them after loading the model.
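Roughly, the idea looks like the sketch below. This is not my exact code and it does not use any spaCy API: the vectors are kept in a plain {word: list of floats} dict and written to a JSON file, and the function names (save_vectors, load_vectors, average_vector) are made up for illustration. The averaging follows the line shown in the traceback, sum(t.vector for t in self) / len(self); treating unknown words as zero vectors is my own assumption.

```python
import json
import os
import tempfile


def save_vectors(vectors, path):
    """Write a {word: [float, ...]} mapping to a JSON file."""
    with open(path, "w") as f:
        json.dump(vectors, f)


def load_vectors(path):
    """Read back the mapping written by save_vectors."""
    with open(path) as f:
        return json.load(f)


def average_vector(words, vectors, dim):
    """Mean of the word vectors, like doc.vector in the traceback.

    Assumption: words missing from the mapping count as zero vectors.
    """
    if not words:
        return [0.0] * dim
    total = [0.0] * dim
    for word in words:
        vec = vectors.get(word, [0.0] * dim)
        for i, value in enumerate(vec):
            total[i] += value
    return [value / len(words) for value in total]


if __name__ == "__main__":
    # Store the vectors next to the model, then restore them after loading.
    path = os.path.join(tempfile.mkdtemp(), "vectors.json")
    save_vectors({"hello": [1.0, 3.0], "there": [3.0, 5.0]}, path)
    restored = load_vectors(path)
    print(average_vector(["hello", "there"], restored, dim=2))  # [2.0, 4.0]
```

In practice the dict would be filled from the original model's token vectors before saving, and the file would be read right after the trained model is loaded.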

It resolved my issue. Closing this one.