BERTopic: Probabilities are NoneType when doing supervised learning


I'm following the guide here on how to use BERTopic for supervised learning on the 20newsgroups dataset. I want to see all calculated probabilities, so I added calculate_probabilities=True to the BERTopic instance (see below). However, probs comes back as None. Is there a dependency I am missing, or is something wrong with my implementation?

probs does return an array of probabilities when doing unsupervised learning, so I wonder if this issue is unique to supervised learning.

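from bertopic import BERTopic
from bertopic.dimensionality import BaseDimensionalityReduction
from bertopic.vectorizers import ClassTfidfTransformer
from sklearn.linear_model import LogisticRegression

# Skip over dimensionality reduction, replace cluster model with classifier,
# and reduce frequent words while we are at it.
empty_dimensionality_model = BaseDimensionalityReduction()
clf = LogisticRegression()
ctfidf_model = ClassTfidfTransformer(reduce_frequent_words=True)

# Create a fully supervised BERTopic instance
topic_model = BERTopic(
    umap_model=empty_dimensionality_model,
    hdbscan_model=clf,
    ctfidf_model=ctfidf_model,
    calculate_probabilities=True,
)

# docs and y are the 20newsgroups documents and their integer labels
topics, probs = topic_model.fit_transform(docs, y=y)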
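
As a sanity check, the classifier on its own does expose per-class probabilities through predict_proba. Here is a minimal sketch, assuming scikit-learn and NumPy are installed; the random arrays are made-up stand-ins for document embeddings and labels (not 20newsgroups), and the toy_* names are only for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical stand-ins for document embeddings and class labels
toy_embeddings = rng.normal(size=(60, 8))
toy_labels = np.repeat(np.arange(3), 20)

# Same kind of classifier that is passed to BERTopic as hdbscan_model
toy_clf = LogisticRegression(max_iter=1000)
toy_clf.fit(toy_embeddings, toy_labels)

# One row per document, one column per class (ordered as in toy_clf.classes_)
toy_probs = toy_clf.predict_proba(toy_embeddings)
print(toy_probs.shape)  # (60, 3)
```

If BERTopic fits the clf object I pass in (I have not verified this), calling clf.predict_proba on the document embeddings after fit_transform might work as a stopgap, although the columns would follow the original labels in y rather than BERTopic's topic IDs.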
