Sample to Create Uniform Distribution from Non-Uniform Data


Given a dataset with a highly peaked, non-uniform distribution, I want to resample it to create a new dataset with an approximately uniform distribution. My approach:

  1. Divide the data into bins.
  2. Set the target bin level to the smallest number of samples in any bin.
  3. Randomly delete samples from each bin until its count equals the target bin level.
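
The three steps above can be sketched with NumPy roughly as follows. This is only a minimal illustration, not a definitive implementation: the function name `downsample_to_uniform`, the default of 10 equal-width bins, and the choice to skip empty bins when computing the target level are all assumptions added here.

```python
import numpy as np

def downsample_to_uniform(data, n_bins=10, seed=None):
    """Randomly drop samples so every non-empty bin keeps the same count."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)

    # Step 1: divide the data into equal-width bins.
    edges = np.histogram_bin_edges(data, bins=n_bins)
    # Use interior edges only, so the maximum value falls in the last bin.
    bin_ids = np.digitize(data, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=n_bins)

    # Step 2: target level = smallest count among bins.
    # Assumption: empty bins are ignored, otherwise the target would be zero.
    target = counts[counts > 0].min()

    # Step 3: keep a random subset of `target` samples from each non-empty bin.
    keep = []
    for b in np.flatnonzero(counts):
        members = np.flatnonzero(bin_ids == b)
        keep.append(rng.choice(members, size=target, replace=False))
    return data[np.concatenate(keep)]
```

With highly peaked data the tail bins tend to be sparse, so the target level can be very small and most samples get discarded. Using fewer bins, or restricting the range before binning, is one way to keep more data.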

Is there a better technique?


There is 1 answer

digestivee (best answer)

For a uniform distribution on the interval [a, b] we have

mean = (a+b) / 2

variance = (b-a)^2 / 12

So you could estimate a and b from your data and then sample from a uniform distribution with those parameters. For example, you could set a = min(data) and b = max(data), or a = mean(lowest_bin) and b = mean(highest_bin). How you set a and b depends on your data and what you want to accomplish.
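
As a rough sketch of this idea in NumPy: solving the mean and variance formulas above for the endpoints gives a = mean - sqrt(3 * variance) and b = mean + sqrt(3 * variance), which is one more way to pick a and b besides min/max. The helper names `uniform_bounds_from_moments` and `sample_uniform` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def uniform_bounds_from_moments(data):
    """Solve mean = (a+b)/2 and variance = (b-a)^2/12 for a and b."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()
    # (b - a)^2 / 12 = variance  =>  (b - a) / 2 = sqrt(3 * variance)
    half_width = np.sqrt(3.0 * data.var())
    return mean - half_width, mean + half_width

def sample_uniform(a, b, size, seed=None):
    """Draw `size` new values uniformly from [a, b)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(a, b, size=size)

# Example usage with the simpler min/max choice of endpoints:
# a, b = data.min(), data.max()
# new_data = sample_uniform(a, b, size=len(data))
```

Note that this draws new synthetic values rather than selecting a subset of the original samples, so it only fits cases where the individual records do not need to be preserved.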