How to bin a numerical pandas Series into n groups of approximately the same size without qcut?

219 views. Asked by aurorca on 11 October 2020 at 15:17

I would like to split my series into exactly n groups (assuming there are at least n distinct values in the series), where the group sizes are approximately equal.

The code needs to be generic, so I cannot know the distribution of the data in advance, hence using pd.cut with pre-defined bins is not an option for me.

I tried using pd.qcut, and pd.cut with edges taken from pd.Series.quantile, but both fall short when a single value is repeated very often in the series, because several quantile edges then coincide.

For instance, if I want exactly 3 groups:

import pandas as pd

series = pd.Series([1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 4, 4, 4, 4])
pd.qcut(series, q=3, duplicates="drop")

This creates only 2 categories: Categories (2, interval[float64]): [(0.999, 3.0] < (3.0, 4.0]], whereas I would like to get 3 categories, something like [(0.999, 1.0] < (1.0, 3.0] < (3.0, 4.0]].
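To make the cause visible, here is a minimal sketch that prints the quantile edges for the series above. The variable names `edges` and `binned` are only for illustration. Because more than a third of the values equal 1, the 0 and 1/3 quantiles are the same number, and duplicates="drop" merges them into a single edge.

```python
import pandas as pd

series = pd.Series([1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 4, 4, 4, 4])

# The edges for q=3 are the 0, 1/3, 2/3 and 1 quantiles of the data.
edges = series.quantile([0, 1 / 3, 2 / 3, 1])
print(edges.tolist())

# Two of the four edges coincide, so dropping duplicates leaves three
# edges and therefore only two bins.
binned = pd.qcut(series, q=3, duplicates="drop")
print(binned.cat.categories)
```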

Is there any way to do this easily with pandas' built-in methods?


There are 0 answers
