spaCy contextualSpellCheck: recurring issue with HF timeout & "local variable 'model' referenced before assignment"


I'm running spaCy 3.7.2 on Databricks on AWS in a network-restricted environment, and I get an error when trying to initialise or use contextualSpellCheck. To get around what looks like a network issue, I've installed en_core_web_sm-3.7.1-py3-none-any.whl in the cluster environment, but I'm not sure how to call that local version in the code (I'm new to spaCy).

I also get a "local variable 'model' referenced before assignment" error.
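
For reference, a spaCy model installed from a wheel is an ordinary Python package, so it can be imported and loaded without any network access. The sketch below shows one way to confirm that and load it. The helper names `is_installed` and `load_local_model` are made up for illustration; only the package-level `load()` function comes from spaCy's model packages.

```python
import importlib
import importlib.util


def is_installed(package_name):
    """Return True if the package can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


def load_local_model(package_name="en_core_web_sm"):
    """Load a spaCy model that was installed as a package (e.g. from a .whl)."""
    if not is_installed(package_name):
        raise RuntimeError(f"{package_name} is not installed on this cluster")
    # Importing the model package and calling its load() reads only local files.
    model_package = importlib.import_module(package_name)
    return model_package.load()
```

Calling `load_local_model()` should give the same pipeline as `spacy.load("en_core_web_sm")`, which also resolves to the installed package, so the spaCy model itself is probably not what triggers the download.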

CODE

import contextualSpellCheck
import spacy

nlp = spacy.load("en_core_web_sm") 
print(f"Model Name: {nlp.meta['name']}")
print(f"Model Lang: {nlp.meta['lang']}")
print(f"Model Version: {nlp.meta['version']}")

nlp.pipe_names

contextualSpellCheck.add_to_pipe(nlp)

nlp.pipe_names
doc = nlp('Income was $9.4 milion compared to the prior year of $2.7 milion.')
doc._.outcome_spellCheck

ERROR:
local variable 'model' referenced before assignment
ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 3a762e01-c83f-4766-a16f-5fa3dcfe7d52)')
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File /databricks/python/lib/python3.10/site-packages/urllib3/connectionpool.py:386, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    385 try:
--> 386     self._validate_conn(conn)
    387 except (SocketTimeout, BaseSSLError) as e:
    388     # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.

File /databricks/python/lib/python3.10/site-packages/urllib3/connectionpool.py:1042, in HTTPSConnectionPool._validate_conn(self, conn)
   1041 if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
-> 1042     conn.connect()
   1044 if not conn.is_verified:

File /databricks/python/lib/python3.10/site-packages/urllib3/connection.py:414, in HTTPSConnection.connect(self)
    412     context.load_default_certs()
--> 414 self.sock = ssl_wrap_socket(
    415     sock=conn,
    416     keyfile=self.key_file,
    417     certfile=self.cert_file,
    418     key_password=self.key_password,
    419     ca_certs=self.ca_certs,


I installed the model's .whl file as a cluster library, then split the code up and ran each line in a separate cell to debug.

nlp.pipe_names runs fine.

The problem seems to start with:

contextualSpellCheck.add_to_pipe(nlp)

Any and all thoughts gratefully received.
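
One possible direction, given that the timeout is against cdn-lfs.huggingface.co: the failing step looks like the transformer model that contextualSpellCheck fetches from Hugging Face, not the spaCy model. The sketch below is untested on Databricks and rests on several assumptions: that a copy of the transformer model (the default appears to be `bert-base-cased`) has already been saved to a directory the cluster can read, that the `model_name` config value is handed to the transformers `from_pretrained` call so a local directory is accepted, and that `/dbfs/models/bert-base-cased` is only a placeholder path. The helper names are hypothetical.

```python
import os
from pathlib import Path


def enable_hf_offline_mode():
    """Ask the Hugging Face libraries not to attempt any downloads."""
    # Set these before transformers is imported so they are picked up.
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def resolve_model_source(local_dir, hub_name="bert-base-cased"):
    """Prefer a local model directory; fall back to the hub name."""
    path = Path(local_dir)
    # A directory written by save_pretrained() contains a config.json file.
    if (path / "config.json").is_file():
        return str(path)
    return hub_name


def build_spellcheck_pipeline(model_source):
    """Build the spaCy pipeline with the spell checker using model_source."""
    # Imported lazily so the helpers above work without these packages.
    import spacy
    import contextualSpellCheck  # noqa: F401  (registers the pipeline factory)

    nlp = spacy.load("en_core_web_sm")
    # Assumption: model_name may be a local directory as well as a hub name.
    nlp.add_pipe("contextual spellchecker", config={"model_name": model_source})
    return nlp
```

A call might look like `enable_hf_offline_mode()` followed by `nlp = build_spellcheck_pipeline(resolve_model_source("/dbfs/models/bert-base-cased"))`. If the local directory is missing, `resolve_model_source` falls back to the hub name, which would hit the same timeout in a restricted network.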
