I am relatively new to Gensim and to LDA in general. When I run LDA on my corpus, almost every token weight in every topic is printed as 0.000:
2015-06-15 12:21:12,439 : INFO : topic diff=0.082235, rho=0.250000
2015-06-15 12:21:12,454 : INFO : topic #0 (0.100): 0.000*sundayes + 0.000*nowe + 0.000*easter + 0.000*iniunctions + 0.000*eyther + 0.000*christ, + 0.000*authoritie + 0.000*sir + 0.000*saint + 0.000*thinge
2015-06-15 12:21:12,468 : INFO : topic #1 (0.100): 0.000*eu'n + 0.000*ioseph + 0.000*pharohs + 0.000*pharoh + 0.000*iosephs + 0.000*lo! + 0.000*egypts + 0.000*iacob + 0.000*ioseph, + 0.000*beniamin
2015-06-15 12:21:12,482 : INFO : topic #2 (0.100): 0.000*agreeable + 0.000*creede, + 0.000*fourme + 0.000*conteined + 0.000*apostolike, + 0.000*vicars, + 0.000*sacrament + 0.000*contrarywise + 0.000*parsons, + 0.000*propitiatorie
2015-06-15 12:21:12,495 : INFO : topic #3 (0.100): 0.000*yf + 0.000*suche + 0.000*lyke + 0.000*shoulde + 0.000*moste + 0.000*youre + 0.000*oure + 0.000*lyfe, + 0.000*anye + 0.000*thinges
2015-06-15 12:21:12,507 : INFO : topic #4 (0.100): 0.000*heau'nly + 0.000*eu'n + 0.000*heau'n + 0.000*sweet + 0.000*peace + 0.000*eu'ry + 0.000*constance + 0.000*constant + 0.000*doth + 0.000*oh
2015-06-15 12:21:12,521 : INFO : topic #5 (0.100): 0.000*eu'n + 0.000*ioseph + 0.000*pharohs + 0.000*pharoh + 0.000*vel + 0.000*iosephs + 0.000*heau'n + 0.000*lo! + 0.000*ac + 0.000*seu'n
2015-06-15 12:21:12,534 : INFO : topic #6 (0.100): 0.000*thou + 0.000*would + 0.000*love + 0.000*king + 0.000*sir, + 0.000*doe + 0.000*thee + 0.000*1. + 0.000*never + 0.000*2.
2015-06-15 12:21:12,546 : INFO : topic #7 (0.100): 0.000*quae + 0.000*vt + 0.000*qui + 0.000*ij + 0.000*non + 0.000*ad + 0.000*si + 0.000*vel + 0.000*atque + 0.000*cum
2015-06-15 12:21:12,558 : INFO : topic #8 (0.100): 0.000*suspected + 0.000*supersticious + 0.000*squire + 0.000*parsons + 0.000*ordinarie + 0.000*vsed, + 0.000*english, + 0.000*fortnight + 0.000*squire, + 0.000*offenders
2015-06-15 12:21:12,572 : INFO : topic #9 (0.100): 0.001*/ + 0.001*ile + 0.000*y^e + 0.000*che + 0.000*much + 0.000*tis + 0.000*could + 0.000*oh + 0.000*neuer + 0.000*heart
I have 307 documents and I'm running my LDA with the following code after removing the stopwords:
from gensim import corpora, models

# `frequency` maps each token to its corpus-wide count (computed earlier, not shown)
texts = [[token for token in text if frequency[token] > 3] for text in texts]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
tfidf = models.TfidfModel(corpus)
tfidf_corpus = tfidf[corpus]
lda = models.LdaModel(tfidf_corpus, id2word=dictionary, update_every=1, chunksize=20, num_topics=10, passes=1)
lda[tfidf_corpus]
lda.print_topics(10)
I am not sure what is wrong, but every time I run this, the token weights come out as 0. What might be causing this, and how can I correct it?
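
One thing that may be worth ruling out is plain rounding: the log prints each weight with three decimals, so any weight below 0.0005 shows up as 0.000 even if it is not actually zero. The sketch below is only an illustration of that effect with made-up numbers (the vocabulary size and the weights are assumptions, not values from my model):

```python
# Hypothetical numbers for illustration only; not taken from the actual model.
vocab_size = 20000  # assumed vocabulary size

# A topic is a probability distribution over the vocabulary, so its weights sum to 1.
# Give ten "top" tokens eight times the weight of every other token.
raw = [8.0] * 10 + [1.0] * (vocab_size - 10)
total = sum(raw)
weights = [w / total for w in raw]

top = max(weights)
print("top weight, 3 decimals: %.3f" % top)  # what a three-decimal log would show
print("top weight, 6 decimals: %.6f" % top)  # the same weight with more precision
```

If that is what is happening here, the weights would be small rather than zero, and the real question becomes whether such a flat distribution is expected for a corpus of this size.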