Asymmetrical bins for intervals

I'm trying to plot, in Python, the percentage of time (logarithmic y axis) during which the data is above a given value (x axis). My data is a simple two-column table: time (daily steps) and daily mean water discharge (m³/s).

To do this, I thought of building a frequency table with the number of discharge values that fall in each interval, and then converting each count to a percentage of time, given that every discharge value is a 24 h mean.
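A minimal sketch of that frequency-table idea, for reference. The lognormal data and the name `discharge` are assumptions standing in for the real table, and the reverse cumulative sum is one possible way to turn per-bin counts into "days at or above this bin".

```python
import numpy as np

# Synthetic right-skewed daily mean discharge (m3/s); a stand-in for the real table.
rng = np.random.default_rng(0)
discharge = rng.lognormal(mean=2.0, sigma=1.0, size=3650)

# Frequency table: number of days falling in each of 30 equal-width bins.
counts, edges = np.histogram(discharge, bins=30)

# Days at or above each bin's lower edge: reverse cumulative sum of the counts.
days_above = counts[::-1].cumsum()[::-1]

# Each value is a 24 h mean, so days / total days is the fraction of time.
pct_time = 100.0 * days_above / discharge.size
```

Here `edges[:-1]` would go on the x axis and `pct_time` on the logarithmic y axis.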

To build the frequency table, I manually set the number of bins for the discharge values (e.g. 30 bins, resulting in symmetrical (equal-width) intervals (0.151, 7.011], (7.011, 13.873], ..., (199.129, 205.99]). My data is right-skewed, so most of the counts fall in the first intervals, while the last intervals have only 1 or even 0 counts.

I wanted asymmetrical intervals, with more bins in the first part of the data: narrower intervals on the left and wider ones on the right, avoiding intervals with 0 counts. First, I tried to do this manually by "cutting" the data and giving a different number of bins to each cut, but this is highly inefficient because I would need to edit the process every time for a lot of different tables. Also, these bins wouldn't follow any mathematical logic, and this would force God to punish me.
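A minimal sketch of two rule-based ways to get bins of that shape, assuming strictly positive discharge values. The synthetic data and the names `log_edges` and `quantile_edges` are illustrative only. Log-spaced edges widen geometrically to the right but can still leave empty bins. Quantile edges put roughly the same number of days in every bin, so no bin is empty when the values are distinct.

```python
import numpy as np

# Synthetic right-skewed daily mean discharge (m3/s); a stand-in for the real table.
rng = np.random.default_rng(0)
discharge = rng.lognormal(mean=2.0, sigma=1.0, size=3650)

n_bins = 30

# Option A: log-spaced edges (needs strictly positive values).
log_edges = np.geomspace(discharge.min(), discharge.max(), n_bins + 1)

# Option B: quantile edges, so each bin holds roughly the same number of days.
quantile_edges = np.quantile(discharge, np.linspace(0.0, 1.0, n_bins + 1))

# np.histogram accepts an explicit array of edges, equal-width or not.
log_counts, _ = np.histogram(discharge, bins=log_edges)
quantile_counts, _ = np.histogram(discharge, bins=quantile_edges)
```

Both sets of edges come from a single rule applied to the data, so the same code can be reused for every table without manual tuning.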

I tried numpy.histogram_bin_edges() with different arguments for the bins parameter, but in all cases the resulting intervals are symmetrical (equal-width).
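A short check that reproduces this behaviour on synthetic data (the lognormal sample is an assumption, not the real table). Whether `bins` is an integer or an estimator name, the returned edges are evenly spaced, so only the number of bins changes.

```python
import numpy as np

# Synthetic right-skewed daily mean discharge (m3/s); a stand-in for the real table.
rng = np.random.default_rng(0)
discharge = rng.lognormal(mean=2.0, sigma=1.0, size=3650)

results = {}
for bins in (30, "auto", "fd", "sturges"):
    edges = np.histogram_bin_edges(discharge, bins=bins)
    widths = np.diff(edges)
    # Record the number of bins and whether all widths are equal.
    results[bins] = (len(widths), bool(np.allclose(widths, widths[0])))
    print(bins, results[bins])
```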

I'm not sure this process even makes sense, so I'm open to other procedures.
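One binning-free alternative, sketched under assumptions: sort the values from highest to lowest and use each value's rank as the number of days at or above it. The synthetic data and the plain rank / n formula are illustrative choices, and other plotting-position formulas exist.

```python
import numpy as np

# Synthetic right-skewed daily mean discharge (m3/s); a stand-in for the real table.
rng = np.random.default_rng(0)
discharge = rng.lognormal(mean=2.0, sigma=1.0, size=3650)

# Sort from highest to lowest discharge.
sorted_q = np.sort(discharge)[::-1]

# With 1-based rank r, r days have a discharge >= sorted_q[r - 1].
rank = np.arange(1, sorted_q.size + 1)

# Each value is a 24 h mean, so rank / n is the fraction of time at or above it.
pct_exceeded = 100.0 * rank / sorted_q.size
```

Plotting `sorted_q` on the x axis against `pct_exceeded` on a logarithmic y axis (for example with matplotlib's `plt.semilogy`) gives the curve without choosing any bins.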
