Error 'NoneType' object has no attribute 'dumps' when loading a model in Haystack


I am trying to load 'bert-base-multilingual-uncased' in Haystack's FARMReader and get the following error:

(huyenv) PS D:\study\DUANCNTT2\HAYSTACK\haystack_demo> & d:/study/DUANCNTT2/HAYSTACK/haystack_demo/huyenv/Scripts/python.exe d:/study/DUANCNTT2/HAYSTACK/haystack_demo/main.py
05/21/2021 00:12:58 - INFO - faiss.loader - Loading faiss.
05/21/2021 00:12:58 - INFO - faiss.loader - Loading faiss.
05/21/2021 00:12:59 - INFO - farm.modeling.prediction_head - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
05/21/2021 00:13:00 - INFO - faiss.loader - Loading faiss.
05/21/2021 00:13:00 - INFO - faiss.loader - Loading faiss.
05/21/2021 00:13:01 - INFO - elasticsearch - HEAD http://localhost:9200/ [status:200 request:0.018s]
05/21/2021 00:13:01 - INFO - elasticsearch - HEAD http://localhost:9200/cv [status:200 request:0.005s]
05/21/2021 00:13:01 - INFO - elasticsearch - GET http://localhost:9200/cv [status:200 request:0.009s]
05/21/2021 00:13:01 - INFO - elasticsearch - PUT http://localhost:9200/cv/_mapping [status:200 request:0.041s]
05/21/2021 00:13:01 - INFO - elasticsearch - HEAD http://localhost:9200/label [status:200 request:0.008s]
05/21/2021 00:13:01 - INFO - farm.utils - Using device: CPU
05/21/2021 00:13:01 - INFO - farm.utils - Number of GPUs: 0
05/21/2021 00:13:01 - INFO - farm.utils - Distributed Training: False
05/21/2021 00:13:01 - INFO - farm.utils - Automatic Mixed Precision: None
Some weights of the model checkpoint at bert-base-multilingual-uncased were not used when initializing BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-multilingual-uncased and are newly initialized: ['qa_outputs.weight', 'qa_outputs.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
05/21/2021 00:13:21 - WARNING - farm.utils - ML Logging is turned off. No parameters, metrics or artifacts will be logged to MLFlow.
05/21/2021 00:13:21 - INFO - farm.utils - Using device: CPU
05/21/2021 00:13:21 - INFO - farm.utils - Number of GPUs: 0
05/21/2021 00:13:21 - INFO - farm.utils - Distributed Training: False
05/21/2021 00:13:21 - INFO - farm.utils - Automatic Mixed Precision: None
05/21/2021 00:13:21 - INFO - farm.infer - Got ya 3 parallel workers to do inference ...
05/21/2021 00:13:21 - INFO - farm.infer - 0 0 0
05/21/2021 00:13:21 - INFO - farm.infer - /w\ /w\ /w\
05/21/2021 00:13:21 - INFO - farm.infer - /'\ / \ /'\
05/21/2021 00:13:21 - INFO - farm.infer -
Exception ignored in: <function Pool.__del__ at 0x000001BBA1DC9C10>
Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 268, in __del__
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python38\lib\multiprocessing\queues.py", line 362, in put
AttributeError: 'NoneType' object has no attribute 'dumps'

This is my main.py file:


from haystack.reader.farm import FARMReader
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.sparse import ElasticsearchRetriever

document_store = ElasticsearchDocumentStore(
    host="localhost",
    username="",
    password="",
    index="cv",
    embedding_dim=768,
    embedding_field="embedding")
retriever = ElasticsearchRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path='bert-base-multilingual-uncased')

Note: my Elasticsearch server started successfully.


There is 1 answer

Malte

Seems like an issue with multiprocessing on Windows. You can disable multiprocessing for the FARMReader like this:

...
reader = FARMReader(model_name_or_path='bert-base-multilingual-uncased', num_processes=0)

See also the docs for more details.
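If the same script also has to run on Linux or macOS, where multiprocessing may work fine, the setting could be made platform-dependent. The following is only a minimal sketch: `reader_kwargs` and `build_reader` are hypothetical helper names, not part of Haystack, and it assumes that `num_processes=None` leaves the number of worker processes up to the library.

```python
import sys
from typing import Optional


def reader_kwargs(model_name: str, platform: str = sys.platform) -> dict:
    """Hypothetical helper: build keyword arguments for FARMReader."""
    # On Windows, num_processes=0 disables multiprocessing (the fix from this answer).
    # On other platforms, None is assumed to let the library choose.
    num_processes: Optional[int] = 0 if platform.startswith("win") else None
    return {"model_name_or_path": model_name, "num_processes": num_processes}


def build_reader(model_name: str = "bert-base-multilingual-uncased"):
    """Hypothetical helper: create the reader with platform-dependent settings."""
    # Imported lazily so reader_kwargs can be checked without Haystack installed.
    from haystack.reader.farm import FARMReader

    return FARMReader(**reader_kwargs(model_name))
```

In main.py, `reader = build_reader()` would then replace the direct `FARMReader(...)` call.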