What is the fastest way to get TensorFlow Universal Sentence Encoder embeddings on a large corpus?


I have a corpus of 100k rows, each containing on average 20 sentences, stored in a pandas column. What is the fastest way to get a TensorFlow Universal Sentence Encoder embedding for each row separately?

Please note: loading the entire corpus at once takes forever and raises memory errors even on a 30 GB machine. Splitting it into chunks still means iterating in a for loop, which is time-consuming.

Are any fast in-memory operations feasible with a Python / TensorFlow / TensorFlow Serving combination, similar to how a Stanford NLP backend server sharply speeds up POS tagging when run as a Java server in the background compared to invoking it from the front end, or how the H2O ML libraries work?


1 Answer

Andrey Khorlin:

This tutorial on using Pandas with tf.data might be useful.
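As a minimal sketch of that idea: load the encoder once so it stays resident in memory (the same effect as keeping a backend server warm), then stream the pandas column through `tf.data` in batches instead of feeding the whole corpus at once. The column name `text` and the batch size are assumptions to adjust for your data and machine.

```python
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub

# Hypothetical DataFrame standing in for the 100k-row corpus.
df = pd.DataFrame({"text": ["first document ...", "second document ..."]})

# Load the Universal Sentence Encoder once; keeping the model in memory
# avoids paying the load cost on every call.
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

# Stream the column through tf.data in batches rather than embedding the
# whole corpus in one call, which avoids the memory errors.
BATCH_SIZE = 256  # assumption: tune to fit your machine
ds = tf.data.Dataset.from_tensor_slices(df["text"].to_numpy()).batch(BATCH_SIZE)

# Each batch yields a (batch_size, 512) float32 tensor of row embeddings.
embeddings = np.vstack([embed(batch).numpy() for batch in ds])
print(embeddings.shape)  # (num_rows, 512)
```

The batched `tf.data` pipeline keeps peak memory bounded by the batch size while the encoder itself runs on the full batch at once, which is typically much faster than embedding rows one at a time in a Python loop.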