What does the useLaplace parameter do in the WEKA j48 algorithm?


I am mining on a dataset using the j48 tree algorithm.

I have been trying to understand what the useLaplace parameter does. The only thing I have to go by is this:

Whether counts at leaves are smoothed based on LapLace

which is just the documentation which WEKA has provided. I have some questions about this though:

  1. What are counts at leaves?
  2. What is smoothing?
  3. What is LapLace? Is it an algorithm used for smoothing?

Nothing I have found online goes into detail about what this parameter actually does; it all just says that it "turns on Laplace smoothing."

There is 1 answer

lelabo_m (best answer)

Provost and Domingos found that frequency smoothing of the leaf probability estimates, such as the Laplace correction, significantly enhances the performance of the decision tree. From what I have read, the counts at leaves (the basis of the leaf probability estimates mentioned above) are used to compute a probability estimate, which can be defined as:

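P(class A | attribute x) = TP / (TP + FP)

where TP (true positives) is the number of training instances of class A at the leaf and FP (false positives) is the number of instances of other classes at that leaf.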

Smoothing adjusts these raw frequency estimates to reduce the effect of noise and error at the leaves, in order to produce more accurate probability estimates.

The Laplace correction is a frequency smoothing formula:

P_Laplace(class A | attribute x) = (TP + 1) / (TP + FP + C)

where C is the number of classes in the dataset.
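
To make the difference concrete, here is a minimal Python sketch of the two formulas above. It is not WEKA code; the function names and the example leaf (3 instances of class A, none of any other class, in a 2-class dataset) are made up for illustration.

```python
def raw_estimate(tp, fp):
    """Unsmoothed leaf frequency: TP / (TP + FP)."""
    return tp / (tp + fp)


def laplace_estimate(tp, fp, num_classes):
    """Laplace-corrected estimate: (TP + 1) / (TP + FP + C)."""
    return (tp + 1) / (tp + fp + num_classes)


# Hypothetical leaf: 3 training instances of class A, 0 of other classes,
# in a dataset with C = 2 classes.
tp, fp, num_classes = 3, 0, 2

print(raw_estimate(tp, fp))                   # 1.0
print(laplace_estimate(tp, fp, num_classes))  # 0.8
```

Without smoothing, this small leaf claims to be 100% certain of class A on the strength of only three instances. With the Laplace correction the estimate is pulled back to 0.8, and the other class gets (0 + 1) / (3 + 2) = 0.2 instead of 0.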