Histogram overlay plot with lognormal distribution

Question


172 views Asked by Djanger At 20 October 2024 at 18:03

I want to check the fit of my data, which I suspect is lognormally distributed, by plotting a histogram and overlaying the lognormal PDF as a line. I estimate the lognormal parameters from the data and evaluate the PDF at n=1000 points (the same number as the data). data_list is a list containing my 1000 data points, which are integers.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm

...
data = np.array(data_list)

plt.hist(data, bins=32, density=True, alpha=0.6, color='g', label='Data')

sigma, _, mu = lognorm.fit(np.log(data), floc=0)
x = np.linspace(min(data), max(data), 1000)
lognormal_data = lognorm.pdf(x, sigma, scale=np.exp(mu))


plt.plot(x, lognormal_data, 'r-', lw=2, label='Lognormal Distribution')
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.legend()
plt.title('Histogram Overlay with Lognormal Distribution')
plt.grid(True)

plt.show()

However, the resulting plot is this:

It seems like the initial parameters for the lognormal distribution are off, as the curve does not coincide with the data. Furthermore, the curve looks more normal than lognormal. Does anybody see what I'm doing wrong here?

Original Q&A

There is 1 answer

**Tranbi** · Accepted Answer · 2023-10-26 13:20:50

I'm no statistician, but if you suspect that data has a lognormal distribution, shouldn't you try to fit data instead of np.log(data)?

The documentation of the fit method states that it returns the following:

Estimates for any shape parameters (if applicable), followed by those for location and scale.

The same documentation states that lognorm.pdf has the following signature: pdf(x, s, loc=0, scale=1).

I would therefore try the following:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm

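# Synthetic lognormal sample standing in for the real data
data = np.random.lognormal(mean=1, sigma=0.2, size=1000)
plt.hist(data, bins=50, density=True, alpha=0.6, color='g', label='Data')

# Fit the raw data (not its logarithm); fit returns (shape, loc, scale)
s, loc, scale = lognorm.fit(data)
x = np.linspace(min(data), max(data), 1000)
# Evaluate the fitted PDF with the same three parameters
lognormal_data = lognorm.pdf(x, s, loc=loc, scale=scale)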

plt.plot(x, lognormal_data, 'r-', lw=2, label='Lognormal Distribution')
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.legend()
plt.title('Histogram Overlay with Lognormal Distribution')
plt.grid(True)

plt.show()

Output: [plot: histogram of the data with the fitted lognormal PDF overlaid]
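As a follow-up sketch (not part of the original answer): in scipy's parameterization, the shape s corresponds to the sigma of the underlying normal and scale corresponds to exp(mu), provided loc is 0. The code in the question passed np.log(data) to lognorm.fit and unpacked the result as sigma, _, mu, so the value it called mu was really the fitted scale of the logged data. The example below assumes strictly positive data and uses synthetic values with a fixed seed purely for illustration; the names mu_fit, sigma_fit, mu_hat and sigma_hat are hypothetical. It compares fitting the raw data with floc=0 against estimating mu and sigma directly from the logs.

```python
import numpy as np
from scipy.stats import lognorm

# Synthetic stand-in for the real data (assumption: all values > 0,
# otherwise np.log would produce -inf or nan).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.0, sigma=0.2, size=1000)

# Route 1: fit the raw data with the location pinned at zero.
# With loc = 0, s plays the role of sigma and scale equals exp(mu).
s, loc, scale = lognorm.fit(data, floc=0)
mu_fit, sigma_fit = np.log(scale), s

# Route 2: estimate mu and sigma directly from the log-transformed data.
log_data = np.log(data)
mu_hat, sigma_hat = log_data.mean(), log_data.std()

# Both routes should describe (nearly) the same density.
x = np.linspace(data.min(), data.max(), 1000)
pdf_fit = lognorm.pdf(x, s, loc=loc, scale=scale)
pdf_direct = lognorm.pdf(x, sigma_hat, scale=np.exp(mu_hat))

print(f"fit:    mu={mu_fit:.3f}, sigma={sigma_fit:.3f}")
print(f"direct: mu={mu_hat:.3f}, sigma={sigma_hat:.3f}")
```

If the two sets of estimates agree, either curve can be drawn with plt.plot(x, ...) over the density-normalized histogram. Pinning floc=0 is a modeling choice: leaving loc free, as in the answer above, may be preferable if a shifted lognormal is plausible.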
