Checking if a specific float value is in list/array in Python/numpy


Care needs to be taken when checking for equality between floating point numbers; such checks should usually be done with a tolerance in mind, using e.g. numpy.allclose.

Question 1: Is it safe to check for the occurrence of a specific floating point number using the "in" keyword (or are there similar keywords/functions for this purpose)? Example:

if myFloatNumber in myListOfFloats:
  print('Found it!')
else:
  print('Sorry, no luck.')

Question 2: If not, what would be a neat and tidy solution?


There are 2 answers

cglacet (best answer)

If you don't compute your floats in the same place or with the exact same equation, then you might have false negatives with this code (because of rounding errors). For example:

>>> 0.1 + 0.2 in [0.6/2, 0.3]  # We may want this to be True
False
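
To make the rounding error visible, here is a small sketch (not part of the original answer) that uses the standard-library decimal module to print the exact values the two floats hold. The variable names are only for illustration.

```python
from decimal import Decimal

lhs = 0.1 + 0.2   # value produced by a computation
rhs = 0.3         # value written as a literal

# Decimal(float) shows the exact binary value stored in each float.
print(Decimal(lhs))
print(Decimal(rhs))

# The two values differ by a tiny but nonzero amount,
# so == (and therefore "in") treats them as different numbers.
gap = abs(lhs - rhs)
print(lhs == rhs, gap)
```

The gap is far below any tolerance you would normally care about, which is why a tolerance-based comparison is the usual fix.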

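In this case, we can write a custom "in" function that makes this true. Here numpy.isclose is a better fit than numpy.allclose: isclose compares element-wise, so np.any can then report whether at least one element matches, whereas allclose only reports whether every element matches.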

import numpy as np 

def close_to_any(a, floats, **kwargs):
  return np.any(np.isclose(a, floats, **kwargs))

There is an important note in the documentation:

Warning The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes). [...] if the expected values are significantly smaller than one, it can result in false positives.

The note adds that atol is not zero, contrary to math.isclose's abs_tol (NumPy's defaults are rtol=1e-05 and atol=1e-08). If you need a custom tolerance when using close_to_any, use the kwargs to pass rtol and/or atol down to numpy. In the end, your existing code would translate to this:

if close_to_any(myFloatNumber, myListOfFloats):
  print('Found it!')
else:
  print('Sorry, no luck.')

You can also pass tolerance options, e.g. close_to_any(myFloatNumber, myListOfFloats, atol=1e-12). Note that 1e-12 is arbitrary; don't use this value unless you have a good reason to.
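
As a rough illustration of the warning about small numbers, the sketch below repeats close_to_any and compares the default tolerance with atol=0.0. The sample values in tiny_values are made up for the example.

```python
import numpy as np

def close_to_any(a, floats, **kwargs):
    # Same helper as above; bool() just converts NumPy's bool to a plain one.
    return bool(np.any(np.isclose(a, floats, **kwargs)))

# Hypothetical data: every value is far smaller than one.
tiny_values = [2e-10, 5e-10]

# With the default atol (1e-8), 1e-10 counts as "close" to 2e-10,
# even though the two numbers differ by a factor of two.
default_hit = close_to_any(1e-10, tiny_values)

# With atol=0.0 only the relative tolerance applies, so nothing matches.
strict_hit = close_to_any(1e-10, tiny_values, atol=0.0)

print(default_hit, strict_hit)
```

If your data lives at this scale, choosing atol (or setting it to zero and relying on rtol) deliberately avoids such false positives.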

Coming back to the rounding error we observed in the first example, this would give:

>>> close_to_any(0.1 + 0.2, [0.6/2, 0.3])
True
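
If the data is a plain Python list and you would rather not depend on NumPy, a similar helper can be sketched with math.isclose from the standard library. The name close_to_any_stdlib is hypothetical; the defaults shown are math.isclose's own (rel_tol=1e-09, abs_tol=0.0).

```python
import math

def close_to_any_stdlib(a, floats, rel_tol=1e-9, abs_tol=0.0):
    # True if a is within tolerance of at least one element of floats.
    return any(
        math.isclose(a, x, rel_tol=rel_tol, abs_tol=abs_tol)
        for x in floats
    )

found = close_to_any_stdlib(0.1 + 0.2, [0.6 / 2, 0.3])
print(found)
```

Because abs_tol defaults to zero here, comparisons against exactly 0.0 only succeed on an exact match unless you pass an abs_tol yourself.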
Ali Nuri Şeker

Q1: It depends on how you are going to implement this. But as others mentioned, with floats it's not such a good idea to use the in operator.

Q2: Do you have any restrictions performance-wise? Will myListOfFloats be sorted?

If it is a sorted list of float values and if you need to do it as fast as you possibly can, you can implement a binary search algorithm.

If the data is not sorted, then depending on the ratio between the number of queries you will be making and the size of the data, you might want to sort the data once and keep it sorted.
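
One possible way to combine binary search with a tolerance is sketched below using the standard-library bisect module. The function name sorted_contains and the absolute tolerance tol are assumptions for illustration, not something from the original answer.

```python
import bisect

def sorted_contains(sorted_floats, target, tol=1e-9):
    # Index of the first element that is >= target - tol.
    i = bisect.bisect_left(sorted_floats, target - tol)
    # If that element is also <= target + tol, it lies inside the
    # tolerance window; otherwise no element does.
    return i < len(sorted_floats) and sorted_floats[i] <= target + tol

data = sorted([2.25, 0.6 / 2, 1.5])   # sort once, then query many times
print(sorted_contains(data, 0.1 + 0.2))
print(sorted_contains(data, 1.0))
```

Each lookup takes O(log n) comparisons, so the one-time cost of sorting pays off when there are many queries against the same data.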

If you don't have any requirements on performance and speed, you can use the following example as a basis:

def inrng(number1, number2, prec):
    # True when the two numbers differ by less than prec (absolute tolerance)
    return abs(number1 - number2) < prec


precision = 0.001
for i in myListOfFloats:
    if inrng(i, myInputNumber, precision):
        pass  # do stuff