How to achieve faster TfidfVectorizer loading times from within a Django view?


I have a fitted TfidfVectorizer with ~120,000 features, which I save to a file using joblib.dump. I later load that model inside a Django view using joblib.load, but loading is too slow (it takes ~2 seconds). What is the best way to improve the loading speed? Should I cache the model using Django's caching framework? Should I compress the model when serializing with joblib.dump? Is there a way to load the model into memory once and keep it there, rather than reloading it each time the view is called?


There are 2 answers

jkarimi On BEST ANSWER

The model does not change between requests, so we want to load it into memory once and keep it there. This can be achieved in views.py by loading the model at module level and assigning it to a global variable.
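A minimal sketch of this idea, assuming a hypothetical file path (MODEL_PATH) and a hypothetical helper name (get_vectorizer). The simplest form is a single module-level line such as VECTORIZER = joblib.load(MODEL_PATH); the variant below fills the global on first use instead, so importing views.py stays cheap:

```python
# views.py (sketch): keep the fitted vectorizer in a module-level global.

MODEL_PATH = "tfidf_vectorizer.joblib"  # hypothetical path, adjust to your project

_VECTORIZER = None  # filled in on first use, then reused by every request


def get_vectorizer():
    """Return the fitted vectorizer, loading it from disk at most once per process."""
    global _VECTORIZER
    if _VECTORIZER is None:
        import joblib  # imported here so that importing this module stays cheap

        _VECTORIZER = joblib.load(MODEL_PATH)
    return _VECTORIZER
```

A view would then call get_vectorizer().transform([text]) rather than calling joblib.load itself. Each worker process keeps its own copy, so the load cost is paid once per process rather than once per request.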

hobs On

Load your model in the apps.py file and then import it from apps in your views.py. Otherwise, if the load call sits inside the view function, the model is loaded again on every request. You should also serialize your model to disk using joblib rather than the built-in pickle library.