Efficiently detect that rational numbers are equal

394 views Asked by At

I have a collection of many rational numbers, with the numerator and denominator of each stored as a large (hundreds or thousands of bits) unsigned integer. I'd like to be able to efficiently test whether any given rational number a/b in the set is equal to any other rational number c/d in the set.

The most straightforward method would be to test whether a*d == b*c, of course, but I'd like something more efficient than computing the full products.

Some notes on my particular use-case:

  • The pairs I'll be testing have a high likelihood of actually being equal (because I'm already precomputing and comparing them by their floating point approximations first), so early-outing if they're unequal won't save me much time.
  • I'm fine with precomputing extra data for each of the numbers, but each number will only be used in a handful of comparisons, so expensive precomputation (such as prime factorization) probably wouldn't be worthwhile.
  • Occasional false negatives would be fine, but false positives are not.

I think this may be theoretically impossible, but throwing it out to the hive mind just in case.

3

There are 3 answers

2
clemens On

You can filter out many not equal pairs of fractions by comparing the bit-lengths. Let l(a) = floor(log2(a)) + 1 the bit-length of a. If a/b = c/d than l(a) + l(d) = l(c) + l(b).

You can use this for a speedup when you are first comparing the lengths and comparing the products, only if the sum of the lengths are equal.

2
clemens On

Second try ;) If you must check new numbers repeatedly for set containment, you should store the relatively prime fractions in an ordered set. The comparison function of the set should compare first the counters, and than the denominators if the counters are equal. The comparison can be done in linear time, and thus finding an element in the ordered set with M items needs O(N log M) steps. Reducing a fraction costs O(N ²). Thus, testing a number for containment requires O(N ² + N log M) steps, and computing the set O(MN ²).

Edit: Instead of using a sorted or a tree set, you can use a hash set which reduces the required number of steps for searching to O(N ² + N) = O(N ²).

0
LSerni On

This is not going to be very helpful in your case, if you already precompute the floating point approximation; it might still save some time in the pipeline (or some approximations).

You examine the integer values of a, b, c and d.

The rational numbers being the same means they describe the same line through the origin.

If c > a, then it must also be that d > b, otherwise we would be in the gray area in the bottom right; if c < a, conversely, it must be that d < b, or we would be in the upper left gray corner. The gray areas have no possibilities of equality, and if the numbers were random (i.e. not float-approximation-filtered), we would have excluded N/2 of them with N2 bigint comparisons.

Of the remaining 50%, we can exclude half by noticing that if a > b, then the black line is below the bisector of the first and third quadrants, and it must be that c > d, otherwise C/D would be on the other side of the bisector; we would be in the top orange sector with no equality possible. The same or the a < b case. So other two bigint comparisons reduce the numbers to be checked to one quarter of the original; those can be approximated in floating point, and if they're "almost equal", we are in the tiny red zone where other techniques are needed.

You can also extend this method by observing that for any k, the relation of a to kb must be the same as the one between c and kd for a/b and c/d to be equal; and if k is an integer power of 2, that allows several possible optimizations.

(At some point, of course, the cost of this will exceed that of a*d==b*c test).

numbers on the plane