I used the available pipeline en_core_web_lg and added concise_concepts to it, with some dummy NER data. If I run NER immediately after that, it works. But when I save the whole pipeline to disk and try to load it later, I can't get it working: one way of loading raises an error, and the other recognizes no entities.
What I did:
import spacy
import concise_concepts
pipeline = spacy.load("en_core_web_lg") # available trained pipeline
data = {
"FRUIT": ["apple", "orange", "pear"],
"VEGETABLE": ["potato", "spinach", "tomato"]
}
pipeline.add_pipe("concise_concepts", config={"data": data}) # adding my own component to pipeline
doc = pipeline("apple vs potato")
print(doc.ents) # prints: (apple, potato) # here it is recognizing entities
pipeline.to_disk("./saved") # saving to disk
# I tried multiple ways to load:
# 1:
pipeline = spacy.load("./saved")
# This gives the following error:
# AssertionError: Choose a spaCy model with internal embeddings, e.g. md or lg.
# and I don't understand how I can specify "lg" when loading from disk
# 2:
pipeline = spacy.blank("en").from_disk("./saved")
doc = pipeline("apple vs potato")
print(doc.ents) # prints: () # recognizes no entity
What is actually happening here? What am I doing wrong?
The full output and traceback for the AssertionError from attempt 1 are below:
entity_ruler already exists in the pipeline. Removing old rulers
Traceback (most recent call last):
File "test2.py", line 7, in <module>
pipeline = spacy.load("./saved")
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\__init__.py", line 51, in load
return util.load_model(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\util.py", line 422, in load_model
return load_model_from_path(Path(name), **kwargs) # type: ignore[arg-type]
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\util.py", line 488, in load_model_from_path
nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\util.py", line 525, in load_model_from_config
nlp = lang_cls.from_config(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\language.py", line 1782, in from_config
nlp.add_pipe(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\language.py", line 792, in add_pipe
pipe_component = self.create_pipe(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\spacy\language.py", line 674, in create_pipe
resolved = registry.resolve(cfg, validate=validate)
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\thinc\config.py", line 746, in resolve
resolved, _ = cls._make(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\thinc\config.py", line 795, in _make
filled, _, resolved = cls._fill(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\thinc\config.py", line 867, in _fill
getter_result = getter(*args, **kwargs)
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\concise_concepts\__init__.py", line 55, in make_concise_concepts
return Conceptualizer(
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\concise_concepts\conceptualizer\Conceptualizer.py", line 122, in __init__
self.run()
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\concise_concepts\conceptualizer\Conceptualizer.py", line 152, in run
self.set_gensim_model()
File "C:\Users\...\miniconda3\envs\testEnv\lib\site-packages\concise_concepts\conceptualizer\Conceptualizer.py", line 245, in set_gensim_model
assert len(
AssertionError: Choose a spaCy model with internal embeddings, e.g. md or lg.
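A check that might help narrow this down, since attempt 2 starts from a blank pipeline: the sketch below inspects what spacy.blank("en") contains before anything is loaded into it. It rests on an assumption, namely that the failing assertion in set_gensim_model tests len(nlp.vocab.vectors); the traceback only shows "assert len(", so this is a guess. The variable names are illustrative only.

```python
import spacy

# A blank English pipeline: tokenizer only, nothing trained.
blank = spacy.blank("en")

# Names of the components currently in the pipeline.
blank_pipe_names = blank.pipe_names

# Number of word-vector rows in the vocab. ASSUMPTION: this is the
# quantity the concise_concepts assertion checks.
blank_vector_count = len(blank.vocab.vectors)

print(blank_pipe_names)
print(blank_vector_count)
```

Running the same two checks on the pipeline right after spacy.load("en_core_web_lg"), and again on the object returned by spacy.blank("en").from_disk("./saved"), should show whether the components and vectors are actually present at each stage.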