I am trying to extract keywords (as features) for each document in a collection of txt files, all in one folder. I have been trying different solutions without success. What I want is to extract keywords for each document and then connect these keywords to labels. Finally, I am going to cross-validate the supervised labels I created with SVM, Logistic Regression and Random Forest classification.
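To make the goal concrete, here is a rough Python sketch of what I mean by "keywords for each document". It assumes TF-IDF scoring, which is only one possible choice and not necessarily what the widgets compute. The function names `tokenize` and `top_keywords`, the file names, and the sample texts are all hypothetical stand-ins for my real files.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return alphabetic tokens of 3+ letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def top_keywords(docs, k=3):
    """Return the k highest-scoring TF-IDF terms for each document.

    docs maps a document name to its raw text.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for c in counts.values() for term in c)
    keywords = {}
    for name, c in counts.items():
        total = sum(c.values())
        # TF-IDF: relative term frequency times log inverse document frequency.
        scores = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        }
        # Sort by score (descending), then alphabetically to break ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[name] = ranked[:k]
    return keywords


# Hypothetical in-memory stand-ins for the txt files in the folder.
docs = {
    "a.txt": "The cat chased the mouse. The cat slept.",
    "b.txt": "The dog chased the ball. The dog barked.",
    "c.txt": "The stock market fell. The market recovered.",
}
keywords = top_keywords(docs, k=1)
print(keywords)
```

Words that occur in every document (like "the") get a score of zero, so they never rank as keywords. The resulting keyword lists per document are what I would then like to connect to my labels.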
I’ve been using the widgets Import Documents, Corpus, Preprocess Text, Bag of Words, Distances and Hierarchical Clustering. I also tried …preprocess, Word Cloud, Corpus, Document Embedding, Distances and Hierarchical Clustering. The documents are shown as separate entries in Corpus Viewer. I assume (and have tried this) that you need to use Document Embedding to create a vector space representation of the documents as words, but I haven't gotten it to work. Thanks