I have a large number of strings in a list. A small example of the list contents is:
["machine learning","Apple","Finance","AI","Funding"]
I wish to convert these into vectors and use them for clustering. Is the context of these strings within sentences taken into account when their vectors are computed?
How should I go about getting the vectors for these strings if I have just this list containing the strings?
This is the code I have so far:
from gensim.models import Word2Vec
vec = Word2Vec(mylist)
P.S. Also, can I get a good reference/tutorial on Word2Vec?
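Word2Vec is a neural network method that creates word embeddings, i.e. vectors that reflect the relationships among words. It learns these from the contexts in which words occur, so it needs a corpus of tokenized sentences rather than a bare list of strings. The links below will help you get the complete code to implement Word2Vec.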
Some good links are this and this. For the second link, try his GitHub repo for the detailed code; the blog post explains only the main parts. The main article is this.
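To give a rough idea of the input shape Word2Vec expects, here is a minimal sketch. The three-sentence corpus, the tokenize helper and the train_vectors wrapper are all made up for illustration, and the training call assumes gensim 4.x (older releases name the vector_size parameter size). It is a sketch, not a tuned setup.

```python
import re

# Hypothetical toy corpus: Word2Vec learns from running text,
# not from isolated keywords, so some real sentences are needed.
corpus = [
    "Apple is investing in machine learning and AI",
    "The startup raised funding for its finance platform",
    "Machine learning models need funding and data",
]


def tokenize(text):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


# One list of tokens per sentence.
sentences = [tokenize(line) for line in corpus]

# A plain string iterates character by character, so a flat list of
# strings is read as "sentences" whose words are single characters.
chars_seen = list("AI")


def train_vectors(token_lists, dim=50):
    """Train a small Word2Vec model (assumes gensim 4.x is installed)."""
    from gensim.models import Word2Vec  # third-party, imported lazily

    # min_count=1 keeps every word; only sensible for a toy corpus.
    return Word2Vec(sentences=token_lists, vector_size=dim,
                    window=5, min_count=1, seed=1)
```

With gensim installed, model = train_vectors(sentences) followed by model.wv["finance"] should return a 50-dimensional NumPy array that can be fed to a clustering algorithm. A multi-word entry such as "machine learning" is split into two tokens here; averaging the two word vectors is one simple option for getting a single vector for it.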
You can use the following code to convert words to their corresponding numerical values.