I have a starting coordinate (x1,y1) and an ending coordinate (x2, y2). I want to generate 'n' random points between start and end coordinates without any duplicates. How to do this with python?
I know that a simple way would be to generate 'n' x values and 'n' y values. So we get n*n pairs and I choose 'n' among them with no duplicates. This way I mayn't get a uniform distribution of random points. Any other way to do this?
Edit: I require floating point coordinates in the rectangle formed by the start and end coordinates as opposite corners.
TL;DR:
Arguably, practically:
Considering that the odds of collisions are tiny.
Assuming that "between start and end coordinates" means in a rectangular section between these two corners in a Cartesian coordinate system (i.e. flat, 2D).
And assuming that a "uniform distribution" is achieved sufficiently ignoring the non-uniform distribution of floating point values. (i.e. not the exact same number of floating point values on any interval of equal length, nor a constant distance between floating point values in a continuum)
There's basically three ways of ensuring the randomly generated points are not duplicated:
The first option can be a good choice if the space from which values are picked is of similar size to the target set size. However, when picking points with random floating point coordinates in some space, this is unlikely.
The second choice is the most straightforward, but can be expensive to compute if the target set size is large, as every new pick causes more comparisons.
The third choice is a bit more involved, but avoids comparisons until a candidate target set has been completed and certainly the best choice if the odds of collisions are small.
As a variant of the second choice, you could pick a target data structure that simply avoids the addition of duplicates altogether, relying on the language / interpreter to perform the checking more efficiently than any algorithm written in the language would be able to.
In Python, this means using a
set
instead of alist
, which is the fastest way to achieve the result and would likely be the way you'd check for duplicates in the third option anyway - so you may as well use it right away and go with the variant of the second option.Note that both the 2nd and 3rd option have a major flaw in case you're trying to create a set in the range of the selection function that's larger than the domain of the selection function. But for the given problem that's unlikely except for extremely large 'n'.
A solution (pitting the second option against the third):
Result (on my hardware):
I get consistently, but marginally better results for the
pick_coords_checked
function - I would favour the clarity of the first implementation.