Eliminating low-confidence predictions with Naive Bayes


I have been trying the Naive Bayes implementation in Spark's MLlib. During the testing phase, I want to discard predictions made with low confidence.

My data set consists primarily of form-based documents such as reports and application forms. They contain text in key-value pairs, so I assume the independence condition holds better than it does for natural language.

As for the quality of the priors, I am not doing anything special. I train on a roughly equal number of samples for each class and leave the heavy lifting to MLlib.

Given these facts, does it make sense to define a confidence threshold for each category, above which I can expect consistently correct results?
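To make the idea concrete, here is a minimal sketch of the kind of filtering I have in mind. It is plain Python and does not call MLlib: it assumes the per-class posterior probabilities for a document are already available as a dictionary. The function name `filter_prediction`, the class names, and the threshold values are all made up for illustration.

```python
from typing import Dict, Optional


def filter_prediction(posteriors: Dict[str, float],
                      thresholds: Dict[str, float],
                      default_threshold: float = 0.5) -> Optional[str]:
    """Return the most probable class, or None if the prediction is rejected.

    posteriors: class label -> posterior probability for one document
                (assumed to come from the trained model).
    thresholds: class label -> minimum posterior required to accept
                a prediction of that class (hypothetical values).
    """
    if not posteriors:
        return None
    # Pick the class with the highest posterior probability.
    label = max(posteriors, key=posteriors.get)
    # Use the class-specific cutoff, falling back to a default.
    cutoff = thresholds.get(label, default_threshold)
    return label if posteriors[label] >= cutoff else None


# Hypothetical per-class thresholds, e.g. tuned on a held-out set.
thresholds = {"report": 0.90, "application_form": 0.80}

# Accepted: the top class clears its threshold.
print(filter_prediction({"report": 0.95, "application_form": 0.05}, thresholds))
# Rejected: the top class is below its threshold.
print(filter_prediction({"report": 0.60, "application_form": 0.40}, thresholds))
```

Documents for which this returns `None` would be set aside instead of being assigned a class. The part I am unsure about is whether such per-class cutoffs can be chosen reliably.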

Thanks


There are 0 answers