Document Classification


Please suggest a classifier that can classify documents under the requirements described below.

I have a set of documents to classify. For each class label, I have a set of terms that are specific to that label.


There are 2 answers

0
NilsHaldenwang On

Well, if you already have the terms for your classes, you can use several different kinds of classifiers, e.g. an SVM, a Naive Bayes classifier or even a neural network.
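To make the idea of "terms per class" concrete, here is a minimal sketch of how such term lists could be turned into per-class counts, which can then feed any of the classifiers above. The term lists and the function names (`CLASS_TERMS`, `term_features`, `classify_by_terms`) are made up for illustration. The last function is only a simple keyword-count baseline, not an SVM, Naive Bayes or neural network.

```python
import re
from collections import Counter

# Hypothetical class-specific term lists; replace them with your own.
CLASS_TERMS = {
    "sports": {"match", "team", "goal", "league"},
    "finance": {"stock", "market", "bank", "interest"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_features(text):
    """Count, for each class, how many tokens belong to its term list."""
    counts = Counter(tokenize(text))
    return {
        label: sum(counts[term] for term in terms)
        for label, terms in CLASS_TERMS.items()
    }


def classify_by_terms(text):
    """Baseline: pick the class with the most matching terms (None if no match)."""
    features = term_features(text)
    best_label = max(features, key=features.get)
    return best_label if features[best_label] > 0 else None
```

The dictionary returned by `term_features` could also serve as the input vector for a trained classifier, once you have some labelled documents to learn from.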

Several libraries include these classifiers, for example Weka or Mahout.

Recently I wrote an example of how to do this with a Naive Bayes classifier: Naive Bayes Example. It is more an explanation of the concept than a tool you could use in the real world, though.
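For readers who just want to see the concept in code, here is a small multinomial Naive Bayes sketch in plain Python. It is not the linked example, only an illustration under some assumptions: documents are already split into tokens, add-one (Laplace) smoothing is used, and the training sentences are invented toy data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Collect counts from a list of (tokens, label) pairs."""
    class_doc_counts = Counter()        # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for tokens, label in labeled_docs:
        class_doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_doc_counts, word_counts, vocabulary


def predict_naive_bayes(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in class_doc_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data, for illustration only.
training = [
    ("the team won the match".split(), "sports"),
    ("a late goal decided the league".split(), "sports"),
    ("the bank cut its interest rate".split(), "finance"),
    ("stock market prices fell".split(), "finance"),
]
model = train_naive_bayes(training)
```

Calling `predict_naive_bayes(model, "the team scored a goal".split())` then picks the label whose word statistics best explain the tokens. For real data, a library implementation is usually the better choice.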

0
Rajkumar On

As you have labels attached to the documents, this falls under supervised learning. You can use any of the classifiers below for document classification:

1. Naive Bayes classifier
2. Nearest-neighbour classifier
3. Decision trees
4. Subspace method
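As an illustration of one of these, here is a rough sketch of a nearest-neighbour classifier. It assumes whitespace tokenization, bag-of-words counts and cosine similarity, which are choices made for this example rather than part of any particular library. The names `cosine_similarity` and `knn_classify` and the sample documents are invented.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(labeled_docs, text, k=3):
    """Majority vote among the k labelled documents most similar to text."""
    query = Counter(text.lower().split())
    scored = sorted(
        (
            (cosine_similarity(query, Counter(doc.lower().split())), label)
            for doc, label in labeled_docs
        ),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Invented toy data, for illustration only.
docs = [
    ("the team won the match", "sports"),
    ("a late goal decided the league", "sports"),
    ("the bank cut its interest rate", "finance"),
    ("stock market prices fell", "finance"),
]
```

With `k=1` the new document simply takes the label of its single most similar neighbour. Larger values of `k` smooth out noisy labels but need more training documents per class.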

Most ML libraries have implementations of the techniques above. If you want to choose an ML library based on the programming language you are comfortable with, you can refer to this link: http://daoudclarke.github.io/machine%20learning%20in%20practice/2013/10/08/machine-learning-libraries/