Gensim perplexity score increases

Asked by blackmamba on 21 September 2020 at 18:46 (666 views)

I am trying to calculate the perplexity score in Spyder for different numbers of topics in order to find the best model parameters with gensim.

However, the perplexity score increases with the number of topics instead of decreasing as it is supposed to [1]. Other people seem to have run into this exact issue, but as far as I know no solution is available.

Does anyone have an idea how to solve this issue?

Code:

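```python
import gensim
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

# `corpus`, `dictionary`, and `output_directory` are defined earlier (not shown).
X_train, X_test = train_test_split(corpus, train_size=0.9, test_size=0.1, random_state=1)

topic_range = [10, 20, 25, 30, 40, 50, 60, 70, 75, 90, 100, 150, 200]

def lda_function(X_train, X_test, dictionary, nr_topics):
    # Train one LDA model on the training split for the given number of topics.
    ldamodel2 = gensim.models.LdaModel(X_train,
                                       id2word=dictionary,
                                       num_topics=nr_topics,
                                       alpha='auto',
                                       eta=0.01,
                                       passes=10,
                                       iterations=500,
                                       random_state=42)
    # log_perplexity returns a per-word likelihood bound on the held-out split;
    # 2**(-bound) turns it into a perplexity value.
    return 2**(-1*ldamodel2.log_perplexity(X_test))

# Despite the name, this list holds perplexity values (already exponentiated),
# one per entry in topic_range.
log_perplecs = [lda_function(X_train, X_test, dictionary, nr_topics=topic) for topic in topic_range]

print("\n", log_perplecs)

# Scatter plot of perplexity against the number of topics.
fig1, ax1 = plt.subplots(figsize=(7, 5))
ax1.scatter(x=topic_range, y=log_perplecs)
fig1.tight_layout()

fig1.savefig(output_directory + "Optimal Number of Topics (Perplexity Score).pdf", bbox_inches='tight')
```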
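To make clear what is being plotted, the conversion on the `return` line can be isolated. This is only a minimal sketch: `perplexity_from_bound` is a hypothetical helper name, and the bound values are made up for illustration, not taken from my run. It assumes, as the gensim documentation describes, that `log_perplexity` returns a per-word likelihood bound and that perplexity is `2^(-bound)`.

```python
def perplexity_from_bound(per_word_bound: float) -> float:
    """Convert a per-word likelihood bound (as returned by gensim's
    LdaModel.log_perplexity) into a perplexity value via 2**(-bound)."""
    return 2.0 ** (-per_word_bound)


# Illustrative bounds only (assumed values, not measured results).
for bound in (-6.0, -7.0, -8.0):
    print(f"bound={bound:.1f} -> perplexity={perplexity_from_bound(bound):.1f}")
```

So a more negative bound on the test set shows up as a higher perplexity in the plot, which is the pattern I see as the number of topics grows.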




  [1]: https://i.stack.imgur.com/jFiF1.png
