I have an array of values that I'm clipping to lie within a certain range. I don't want large numbers of the values to end up identical, though, so I'm adding a small amount of random noise after the clipping. I think I need to account for the floating-point resolution for this to work.
Right now I've got code something like this:
import numpy as np

# Clip x[:, 0:3] to the bounding box in place, then dither with Gaussian noise.
np.minimum(x[:, 0:3], topRtBk, out=x[:, 0:3])
np.maximum(x[:, 0:3], botLftFrnt, out=x[:, 0:3])
np.add(x[:, 0:3], np.random.randn(x.shape[0], 3).astype(real_t) * 5e-5, out=x[:, 0:3])
where topRtBk and botLftFrnt are the 3D bounding limits (there's another version of this for spheres). real_t is configurable to np.float32 or np.float64 (other parts of the code are GPU accelerated, and this may eventually be as well). The 5e-5 is a magic number, which is twice np.finfo(np.float32).resolution, and it's the crux of my question: what's the right value to use here?
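To make the dtype dependence explicit, I could derive the scale instead of hard-coding it. A sketch, where the multiplier k stands in for the magic number I'm asking about:

import numpy as np

real_t = np.float32  # configurable, as above

# Derive the dither scale from the dtype rather than hard-coding a constant;
# k is still a magic multiplier, which is exactly what I'm trying to pin down.
k = real_t(2)
dither_scale = k * np.finfo(real_t).resolution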
I'd like to dither the values by the smallest possible amount while retaining sufficient variation, and I admit that "sufficient" is rather ill-defined. I'm trying to minimize duplicate values, but having some won't kill me.
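For a sense of the floor here, np.spacing(v) gives the size of one ULP at v, i.e. the gap to the next representable float, so dither much smaller than a few ULPs of the clipped values would mostly round away. A sketch (the factor of 4 and the stand-in x are arbitrary placeholders, not my real data):

import numpy as np

real_t = np.float32
x = np.full((4, 3), 1.0, dtype=real_t)  # stand-in for already-clipped values

# np.spacing gives the per-element ULP, so this dithers by a few ULPs
# at whatever magnitude the clipped values actually have.
ulp = np.spacing(x)
x += np.random.randn(*x.shape).astype(real_t) * real_t(4) * ulp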
I guess my question is twofold: is this the right approach to use, and what's a reasonable scale factor for the random numbers?