For a classification task I want to fit a gamma distribution to two sets of data: the within-class distances and the between-class distances. The goal is to determine the theoretical False Accept Rate and False Reject Rate.
The fit SciPy returns puzzles me, though. A plot of the data is below: circles denote within-class distances and crosses denote between-class distances; the solid line is the gamma fitted to the within-class distances, and the dotted line is the gamma fitted to the between-class distances.
What I expected is that the gamma curves would peak at around 10 and 30 respectively, not at 0 for both. Does anyone see what's going wrong here?
This is my code:
import scipy.stats as ss
import matplotlib.pyplot as plt
pos = [7.4237931034482765, 70.522068965517235, 9.1634482758620681, 22.594137931034485, 7.3003448275862075, 6.3841379310344841, 10.693448275862071, 7.5237931034482761, 7.4079310344827594, 7.2696551724137928, 8.5551724137931036, 17.647241379310344, 7.8475862068965521, 14.397586206896554, 32.278965517241382]
neg = [32.951724137931038, 234.65724137931034, 25.530000000000001, 33.236551724137932, 258.49965517241378, 33.881724137931037, 18.853448275862071, 33.703103448275861, 33.655172413793103, 33.536551724137929, 37.950344827586207, 34.32586206896552, 42.997241379310346, 100.71379310344828, 32.875172413793102, 30.59344827586207, 19.857241379310345, 35.232758620689658, 30.822758620689655, 34.92896551724138, 29.619310344827586, 29.236551724137932, 32.668620689655171, 30.943448275862071, 30.80344827586207, 88.638965517241374, 25.518620689655172, 38.350689655172417, 27.378275862068971, 37.138620689655177, 215.63379310344828, 344.93896551724134, 225.93413793103446, 103.66758620689654, 81.92896551724138, 59.159999999999997, 463.89379310344827, 63.86827586206897, 50.453103448275861, 236.4603448275862, 273.53137931034485, 236.26103448275862, 216.26758620689654, 170.3003448275862, 340.60034482758618]
alpha1, loc1, beta1=ss.gamma.fit(pos, floc=0)
alpha2, loc2, beta2=ss.gamma.fit(neg, floc=0)
plt.plot(pos,[0.06]*len(pos),'ko')
plt.plot(neg,[0.04]*len(neg),'kx')
x = range(200)
plt.plot(x,ss.gamma.pdf(x, alpha1, scale=beta1), '-k')
plt.plot(x,ss.gamma.pdf(x, alpha2, scale=beta2), ':k')
plt.xlim((0,200))
I got the floc=0 trick from this question: "Why does the Gamma distribution in SciPy have three parameters?" However, it does not always seem to force loc1 and loc2 to be 0 :/
(This is really a comment, but I want to show the plot that I get.)
Are you sure you used floc=0 in the fit method when you made the plot? If I leave it out (or if I make the mistake, as I frequently do, of using loc=0 instead of floc=0), I get a plot that looks like the one you included.
Which versions of scipy and numpy are you using?
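To check which situation applies, it may help to print the fitted parameters for both spellings of the keyword. The sketch below is only an illustration: the helper gamma_mode and the variable names fixed and guessed are mine, not from the question. It relies on two things I am fairly sure of: floc=0 holds the location fixed, while loc=0 is only a starting guess for the optimizer, and a gamma density peaks at loc + (alpha - 1) * scale when alpha >= 1 and at loc otherwise.

```python
import numpy as np
from scipy import stats as ss

# Within-class distances copied from the question.
pos = np.array([
    7.4237931034482765, 70.522068965517235, 9.1634482758620681,
    22.594137931034485, 7.3003448275862075, 6.3841379310344841,
    10.693448275862071, 7.5237931034482761, 7.4079310344827594,
    7.2696551724137928, 8.5551724137931036, 17.647241379310344,
    7.8475862068965521, 14.397586206896554, 32.278965517241382,
])


def gamma_mode(alpha, loc, scale):
    """Peak of the gamma pdf (hypothetical helper, not part of SciPy).

    For alpha >= 1 the peak is at loc + (alpha - 1) * scale;
    for alpha < 1 the density is largest at loc itself.
    """
    return loc + max(alpha - 1.0, 0.0) * scale


# floc=0 holds the location fixed at 0 during the fit.
fixed = ss.gamma.fit(pos, floc=0)
# loc=0 only supplies a starting value; the fitted loc may move away from 0.
guessed = ss.gamma.fit(pos, loc=0)

for label, (alpha, loc, scale) in (("floc=0", fixed), ("loc=0", guessed)):
    print(f"{label}: alpha={alpha:.4f} loc={loc:.4f} scale={scale:.4f} "
          f"mode={gamma_mode(alpha, loc, scale):.4f}")
```

If the loc printed for your fit is not 0, then plotting with only alpha and scale (as the question's code does) draws a different curve from the one that was fitted, which could explain a peak at 0.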
With scipy 0.12.0 and numpy 1.7.1, your code works for me. I added a couple of print statements, and I get:
along with the plot:
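Once both fits have loc fixed at 0, the theoretical rates mentioned in the question can be read off the fitted distributions. This is a rough sketch under an assumption of mine that the question does not state: a pair is accepted when its distance is at or below a threshold. The function error_rates and the example thresholds are made up for illustration.

```python
from scipy import stats as ss

# Data copied from the question.
pos = [7.4237931034482765, 70.522068965517235, 9.1634482758620681,
       22.594137931034485, 7.3003448275862075, 6.3841379310344841,
       10.693448275862071, 7.5237931034482761, 7.4079310344827594,
       7.2696551724137928, 8.5551724137931036, 17.647241379310344,
       7.8475862068965521, 14.397586206896554, 32.278965517241382]
neg = [32.951724137931038, 234.65724137931034, 25.530000000000001,
       33.236551724137932, 258.49965517241378, 33.881724137931037,
       18.853448275862071, 33.703103448275861, 33.655172413793103,
       33.536551724137929, 37.950344827586207, 34.32586206896552,
       42.997241379310346, 100.71379310344828, 32.875172413793102,
       30.59344827586207, 19.857241379310345, 35.232758620689658,
       30.822758620689655, 34.92896551724138, 29.619310344827586,
       29.236551724137932, 32.668620689655171, 30.943448275862071,
       30.80344827586207, 88.638965517241374, 25.518620689655172,
       38.350689655172417, 27.378275862068971, 37.138620689655177,
       215.63379310344828, 344.93896551724134, 225.93413793103446,
       103.66758620689654, 81.92896551724138, 59.159999999999997,
       463.89379310344827, 63.86827586206897, 50.453103448275861,
       236.4603448275862, 273.53137931034485, 236.26103448275862,
       216.26758620689654, 170.3003448275862, 340.60034482758618]

# Fit both populations with the location held at 0, as in the question.
alpha1, loc1, beta1 = ss.gamma.fit(pos, floc=0)
alpha2, loc2, beta2 = ss.gamma.fit(neg, floc=0)


def error_rates(threshold):
    """Return (FAR, FRR), assuming 'accept' means distance <= threshold."""
    # False accept: a between-class distance falls at or below the threshold.
    far = ss.gamma.cdf(threshold, alpha2, loc=loc2, scale=beta2)
    # False reject: a within-class distance falls above the threshold.
    frr = ss.gamma.sf(threshold, alpha1, loc=loc1, scale=beta1)
    return far, frr


for t in (10.0, 20.0, 30.0):  # example thresholds, chosen arbitrarily
    far, frr = error_rates(t)
    print(f"threshold={t:5.1f}  FAR={far:.4f}  FRR={frr:.4f}")
```

Passing loc explicitly to cdf and sf keeps the evaluation consistent with the fit even if loc were ever nonzero.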