Hello all!
As part of a project, I need to build a text classifier from the labeled data I have. Each data point is a single sentence paired with one of 3 categories. I have extracted 5 topics from this dataset with LDA.
What I want to try is using these topics to determine which class an unseen sentence belongs to. I am thinking about training a supervised model on 5 features that give the topic distribution of a sentence over those 5 topics.
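To make the idea concrete, here is a minimal sketch of that setup. It assumes scikit-learn (the library is not named above), and the sentences, labels, and the choice of LogisticRegression as the supervised model are made up purely for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: one sentence per data point, one of 3 categories each.
sentences = [
    "the team won the football match",
    "the striker scored two goals in the match",
    "the new phone has a faster processor",
    "the laptop ships with a larger battery",
    "the recipe needs butter and flour",
    "bake the bread with flour and yeast",
]
labels = ["sports", "sports", "tech", "tech", "food", "food"]

# Word counts -> 5-topic distribution per sentence -> supervised classifier.
clf = make_pipeline(
    CountVectorizer(),
    LatentDirichletAllocation(n_components=5, random_state=0),
    LogisticRegression(max_iter=1000),
)
clf.fit(sentences, labels)

# The 5 topic features the classifier actually sees, one row per sentence.
topic_features = clf[:-1].transform(sentences)

# Classify an unseen sentence from its topic distribution alone.
predicted = clf.predict(["the match ended with a late goal"])
```

With only 5 topic features the classifier sees a heavily compressed view of each sentence, so it may be worth comparing against a plain bag-of-words baseline.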
The problem is that I cannot get a separate likelihood for each topic given a sentence. I am confused about what the perplexity and score of an LDA model indicate. They each seem to return a single float value.
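For what it's worth, if the library is scikit-learn (an assumption; it is not named above), `score` and `perplexity` on `LatentDirichletAllocation` are summaries over the whole set of documents passed in, which is why each returns one number, while `transform` returns one row of topic proportions per sentence. A minimal sketch with made-up sentences:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sentences, only here to make the sketch runnable.
sentences = [
    "the team won the football match",
    "the new phone has a faster processor",
    "the recipe needs butter and flour",
    "the striker scored two goals",
]

counts = CountVectorizer().fit_transform(sentences)
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(counts)

# Both are single numbers summarising how well the model fits `counts`.
approx_log_likelihood = lda.score(counts)
perplexity = lda.perplexity(counts)

# One row per sentence, one column per topic; each row sums to 1.
topic_dist = lda.transform(counts)
print(topic_dist.shape)
```

So the per-sentence values would come from `transform`, not from `score` or `perplexity`.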
Also, I am aware of supervised versions of LDA. I want to know if my approach makes sense at all.
Thanks in advance!