How can I build a model in R that scores new text against predefined topics, each defined by a list of words?

Asked by Mathijs van den Broek on 03 May 2021 at 07:30

I'm trying to build a model that determines how related a string of text is to a predefined topic. I have tried several methods (mainly LDA with seed words and Naive Bayes) but can't get the desired results.

I have two topics, "inside" and "outside", each with a list of related words:

Inside	Outside
Production	Clients
Marketing	Suppliers
Finance	Banks
etc.	etc.

The text I want to analyze is stored in columns, each holding a string such as: banks_production_clients

I have about 1115 documents, each with about 200 of these columns.

I want my model to recognize that the string banks_production_clients contains two words that belong to the topic "outside" and one that belongs to the topic "inside", which makes it roughly 0.67 related to outside and 0.33 related to inside. In the end, I want to see how much each document (with its 200 or so columns) relates to either topic.
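To make the desired output concrete, here is a minimal base R sketch of the word-counting rule described above. It is not a fitted model. The names `topics`, `score_text` and `score_document` are hypothetical, the word lists contain only the three example words per topic from the table, and the underscore is assumed to be the only separator.

```r
# Hypothetical dictionary built from the example table in the question
topics <- list(
  inside  = c("production", "marketing", "finance"),
  outside = c("clients", "suppliers", "banks")
)

# Share of matched dictionary words that fall under each topic.
# Words that are in neither list are ignored.
score_text <- function(text, topics) {
  tokens <- tolower(strsplit(text, "_", fixed = TRUE)[[1]])
  counts <- vapply(
    topics,
    function(words) as.numeric(sum(tokens %in% words)),
    numeric(1)
  )
  if (sum(counts) == 0) {
    counts[] <- NA_real_  # no dictionary word found in this string
    return(counts)
  }
  counts / sum(counts)
}

# Document-level score: pool the words of all columns, then score once
score_document <- function(texts, topics) {
  score_text(paste(texts, collapse = "_"), topics)
}

# inside is 1/3 and outside is 2/3 for the example string
example_score <- score_text("banks_production_clients", topics)
print(example_score)

# A made-up document with three columns
doc_score <- score_document(
  c("banks_production_clients", "marketing_finance", "suppliers_other"),
  topics
)
print(doc_score)
```

Pooling all columns before scoring means a column with many words weighs more than a short one; whether that is desirable depends on the data.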

Word frequencies vary widely, so when I ran LDA the most frequent words ended up grouped in the same topic, because they also co-occur far more often.
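A fixed word list does not depend on co-occurrence, but very frequent words can still dominate a document score if all counts are pooled. One possible way to dampen that, sketched below under the same assumptions (hypothetical names, underscore-separated strings, only the example words from the table), is to score each column separately and average the per-column shares, so every column counts equally.

```r
# Hypothetical dictionary built from the example table in the question
topics <- list(
  inside  = c("production", "marketing", "finance"),
  outside = c("clients", "suppliers", "banks")
)

# One row per string, one column per topic; rows sum to 1,
# or are NA when the string contains no dictionary word
share_matrix <- function(texts, topics) {
  template <- setNames(numeric(length(topics)), names(topics))
  shares <- vapply(texts, function(text) {
    tokens <- tolower(strsplit(text, "_", fixed = TRUE)[[1]])
    counts <- vapply(
      topics,
      function(words) as.numeric(sum(tokens %in% words)),
      numeric(1)
    )
    if (sum(counts) == 0) {
      counts[] <- NA_real_
    } else {
      counts <- counts / sum(counts)
    }
    counts
  }, template)
  t(shares)
}

# A made-up document in which "banks" is repeated many times in one column
doc <- c("banks_banks_banks_clients", "production", "marketing_finance")

per_column <- share_matrix(doc, topics)

# Average of per-column shares: each column has the same weight
averaged <- colMeans(per_column, na.rm = TRUE)
print(averaged)
```

With pooled counts this made-up document would lean towards "outside" (4 of 7 matched words), while averaging per column gives "inside" 2/3 and "outside" 1/3, because the repeated word only affects its own column.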
